All--

 

I occasionally have an issue where the LT - Offline Servers monitor does not work correctly.  Specifically, the server is still reporting back to the database, but the machine as a whole is hosed.  When I look in Automate, the agent is not checked in and shows as offline.  From another node that is online I can ping the host, but I cannot connect to it, so the server has to be rebooted before it comes back online.  The offline monitor should fire in this case, but because the server is still sending some heartbeat information to SQL, Automate is effectively being told that it is still on.

 

I now have a group of servers that I want to apply a new internal monitor to, but the logic is hard to come by.  Would anyone be able to help me out?

I have tried the query below, and it is not working:

SELECT * FROM computers
WHERE lastContact <= NOW() - INTERVAL 10 MINUTE

 

I also tried this one:

SELECT name, lastcontact FROM computers
WHERE DATE(lastcontact) > NOW() - INTERVAL 10 MINUTE

Within SQLyog, the second query returns 0 results.
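
A likely reason the second query comes back empty, sketched in Python rather than SQL so it can be run anywhere. It assumes MySQL treats DATE(lastcontact) as midnight of that day when comparing it against a DATETIME, which makes the left side older than "ten minutes ago" for almost the whole day. The timestamps are made up, and `mysql_date` is a hypothetical helper that only imitates DATE().

```python
from datetime import datetime, timedelta


def mysql_date(ts: datetime) -> datetime:
    """Imitate MySQL DATE() in a DATETIME comparison: keep the date, reset the time to 00:00:00."""
    return ts.replace(hour=0, minute=0, second=0, microsecond=0)


# Made-up values for illustration only.
now = datetime(2024, 1, 15, 14, 30, 0)           # stands in for NOW()
last_contact = datetime(2024, 1, 15, 14, 29, 0)  # agent checked in one minute ago
cutoff = now - timedelta(minutes=10)             # NOW() - INTERVAL 10 MINUTE

# Comparing the full timestamp: the agent counts as recently seen.
recent_without_date = last_contact > cutoff

# Comparing the truncated value: midnight is not later than 14:20, so no row matches.
recent_with_date = mysql_date(last_contact) > cutoff

print(recent_without_date, recent_with_date)
```

Separately, `>` selects machines that checked in recently. To find stale machines the comparison has to be `<` or `<=`, as in the first query.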

 

I would also like to understand the logic needed to actually set up the monitor, if someone has the time.

 

Thanks

Jason


Give this a try:

  • Take the existing LT - Offline Servers monitor and copy it.
  • Change the Check Condition to be "Anything".
  • Clear the "Result" value.

Then use the following in the "Additional Condition" field:

lastcontact < DATE_ADD(NOW(), INTERVAL -20 MINUTE)
OR (
  lastcontact < DATE_ADD(NOW(), INTERVAL -6 MINUTE)
  AND heartbeatcomputers.LastHeartbeatTime < DATE_ADD(NOW(), INTERVAL -(SELECT heartbeatdbupdateinterval * 6 FROM advancedconfig) SECOND)
  AND computers.computerid IN (SELECT computerid FROM computerconfig WHERE computerconfig.heartbeatenable = 1)
)

It's the same logic as before, but adds a second way to detect an offline agent: No contact for over 20 minutes regardless of heartbeat.
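
The condition can be read as two branches. Below is a minimal Python model of that logic, meant only as a reading aid and not something Automate runs. The function name, the argument names, and the 30-second heartbeat interval are assumptions for illustration. A missing heartbeat value is modeled as `None`, on the assumption that a NULL comparison in SQL is never true.

```python
from datetime import datetime, timedelta
from typing import Optional


def looks_offline(
    now: datetime,
    last_contact: datetime,
    last_heartbeat: Optional[datetime],
    heartbeat_enabled: bool,
    heartbeat_interval_s: int,
) -> bool:
    """Model of the Additional Condition; all names here are hypothetical."""
    # Branch 1 (the new part): no check-in for over 20 minutes, regardless of heartbeat.
    if last_contact < now - timedelta(minutes=20):
        return True

    # Branch 2 (the existing logic): no check-in for over 6 minutes AND the
    # heartbeat is older than 6 update intervals AND heartbeat is enabled.
    if last_heartbeat is None:
        return False  # a NULL comparison would not be true in SQL
    stale_contact = last_contact < now - timedelta(minutes=6)
    stale_heartbeat = last_heartbeat < now - timedelta(seconds=6 * heartbeat_interval_s)
    return stale_contact and stale_heartbeat and heartbeat_enabled


# Made-up example: a hung server that still sends heartbeats.
now = datetime(2024, 1, 15, 12, 0, 0)
hung_server = looks_offline(
    now,
    last_contact=now - timedelta(minutes=30),    # agent has not checked in for 30 minutes
    last_heartbeat=now - timedelta(seconds=10),  # heartbeat is still fresh
    heartbeat_enabled=True,
    heartbeat_interval_s=30,                     # assumed interval, not taken from a real system
)
print(hung_server)  # True, caught by branch 1
```

With only branch 2 in place, the same hung server would not be flagged, because its heartbeat is still fresh.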


Just confirming this works like a champ.  We have multiple systems that respond to ping but are not usable by a human.  For example, shares are inaccessible, or a user cannot log on remotely or at the console.  This monitor catches them.


Agreed, this is a very good monitor.  I ran it in tandem with the stock LT - Offline Servers monitor and noticed that this one worked better.  Thanks for the help, Darren.
