Channel: Zenoss Community : Popular Discussions - All Communities

ERROR_SEM_TIMEOUT on Windows

Greetings, all.

I just started monitoring three Windows servers and one Linux server with Zenoss 2.0.3. Overall, I'm very impressed.

I'm sure my network topology is somewhat unique. My Windows servers are at a third-party hosting provider and sitting behind a firewall. So I installed Hamachi VPN services on all my boxes. If you're not familiar, Hamachi is a pretty cool free VPN/Virtual LAN tool (www.hamachi.cc). It allows me to use SNMP and WMI monitoring securely and without poking holes in firewalls or dealing with overly complex VPN solutions.

As I've seen one or two other people report, I've been having a recurring issue with zenwin and zenwinmodeler. When polling my Windows boxes, I'll occasionally get the following error:
ERROR_SEM_TIMEOUT: The semaphore timeout period has expired. (121)


This is a Windows error, I believe. It's probably due to network slowness or maybe some vagary of Hamachi; I don't know. The main issue is that it seems to cause zenwin and zenwinmodeler to up and die. Then I receive heartbeat failure emails until they restart themselves.

Here's an excerpt from my zenwinmodeler.log right when it happens:
2007-07-13 17:51:28 ERROR zen.zenwinmodeler: ERROR_SEM_TIMEOUT: The semaphore timeout period has expired. (121)
Traceback (most recent call last):
  File "/usr/local/zenoss/Products/ZenWin/zenwinmodeler.py", line 63, in processLoop
    svcs = self.getServices(name, user, passwd)
  File "/usr/local/zenoss/Products/ZenWin/zenwinmodeler.py", line 97, in getServices
    dev.connect()
  File "/usr/local/zenoss/Products/ZenWin/wmiclient.py", line 51, in connect
    self.flags,self.valueset)
  File "usr/local/zenoss/lib/python/win32com/client.py", line 33, in ConnectServer
    services = pywmi.WBEM_ConnectServer(name, namespace, user, passwd, locale, flags, authority, valueset)
com_error: com_error(121): DOS code 0x00000079
2007-07-13 17:51:28 INFO zen.zenwinmodeler: collecting from my1.server.com using user .\Administrator
2007-07-13 17:53:28 ERROR zen.zenwinmodeler: ERROR_SEM_TIMEOUT: The semaphore timeout period has expired. (121)
Traceback (most recent call last):
  File "/usr/local/zenoss/Products/ZenWin/zenwinmodeler.py", line 63, in processLoop
    svcs = self.getServices(name, user, passwd)
  File "/usr/local/zenoss/Products/ZenWin/zenwinmodeler.py", line 97, in getServices
    dev.connect()
  File "/usr/local/zenoss/Products/ZenWin/wmiclient.py", line 51, in connect
    self.flags,self.valueset)
  File "usr/local/zenoss/lib/python/win32com/client.py", line 33, in ConnectServer
    services = pywmi.WBEM_ConnectServer(name, namespace, user, passwd, locale, flags, authority, valueset)
com_error: com_error(121): DOS code 0x00000079
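
For anyone comparing the two numbers in the excerpt: the (121) in the error message and the "DOS code 0x00000079" in the com_error line are the same value, one written in decimal and one in hex. A minimal Python check follows; the helper name dos_code_to_int is made up for illustration and is not part of Zenoss or the Win32 libraries.

```python
def dos_code_to_int(text):
    """Parse a hex string such as '0x00000079' into an integer.

    Hypothetical helper for illustration only.
    """
    return int(text, 16)


# The hex code from the com_error line in the log excerpt.
code = dos_code_to_int("0x00000079")
print(code)  # prints 121, matching ERROR_SEM_TIMEOUT (121)
```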


Also sprinkled throughout the logs are:
2007-07-13 17:54:28 WARNING zen.zenwinmodeler: skipping my1.server.com has bad wmi state
2007-07-13 17:55:28 WARNING zen.zenwinmodeler: skipping my1.server.com has bad wmi state
2007-07-13 17:56:28 WARNING zen.zenwinmodeler: skipping my1.server.com has bad wmi state


I do seem to be tracking performance metrics on these machines, so I'm not sure what the "bad wmi state" thing is about.

Anyway, I don't know what the solution is. Maybe someone does. Perhaps the Win32 Python libraries could be more forgiving with respect to their semaphore timeouts? Is that configurable somewhere?
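
To make "more forgiving" concrete, here is a rough sketch of what I have in mind: retry the connection a few times with a growing delay before giving up. This is not how zenwinmodeler works today, and I don't know whether the underlying timeout is configurable. The function name call_with_retries, its parameters, and the flaky_connect stand-in are all assumptions for illustration; real code would wrap the actual WMI connect call and catch the specific com_error rather than a bare Exception.

```python
import time


def call_with_retries(connect, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call connect(); on failure wait base_delay * 2**n seconds and retry.

    Hypothetical helper: `connect` is any zero-argument callable.
    Re-raises the last error once all attempts have failed.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return connect()
        except Exception as exc:  # assumption: real code would catch com_error only
            last_error = exc
            if attempt < attempts - 1:
                sleep(base_delay * (2 ** attempt))
    raise last_error


# Demo with a stand-in that times out twice and then succeeds.
calls = {"count": 0}


def flaky_connect():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("ERROR_SEM_TIMEOUT (121)")
    return "connected"


delays = []  # record the delays instead of really sleeping in the demo
result = call_with_retries(flaky_connect, attempts=3, sleep=delays.append)
print(result, delays)
```

The point of the growing delay is that a brief hiccup on the VPN gets a quick second try, while a longer outage doesn't get hammered with reconnect attempts.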

Also, are there some settings that could possibly be tweaked in the registry on the servers to help in this situation?

Thanks
