kgrube

LabTech Stuck Commands Fixes


There is a known issue where online agents keep checking in but never execute any commands. The script queue then fills up quickly because the script engine never clears out the scripts queued for these machines. Left long enough, no scripts can run until the script queue is cleared out. Machines exhibiting this behavior can be identified on the Commands tab, where every command is listed as executing.

 

  • First, clear the script queue so new scripts can run. Either restart the database agent to clear all scripts, or delete from runningscripts manually.
  • There are two routes to go down from here:
    1. Do a one-off fix through ScreenConnect
    2. Create autofixes

 

One-off through ScreenConnect:

  1. (Optional) Delete from pendingcommands to clear queued commands.
  2. Either run this on all your machines (restarting every agent) or create a new session group limited to machines identified as 'stuck'.
  3. Log into the ScreenConnect web interface
  4. Select the machines
  5. Right click -> Run command
  6. Paste in this one-liner:
    net stop ltsvcmon & taskkill /im ltsvcmon.exe /f & taskkill /im ltsvc.exe /f & taskkill /im lttray.exe /f & net start ltsvcmon & net start ltservice


    This will stop the service monitor, kill off LTSvcMon.exe, LTSvc.exe and LTTray.exe, then try to restart everything. The commands are joined with &, so each one runs even if the previous one reports an error.

 

Internal monitor/autofix overview:

  1. Create internal monitor
  2. Create Autofix, call from internal monitor

 

 

Create an internal RAWSQL monitor. Note: change the limit on the last line to whatever allows this query to run on your system.

 

SELECT DISTINCT
   computers.computerid AS `TestValue`
 , computers.ComputerID AS `IdentityField`
 , computers.computerid AS `ComputerID`
 , acd.NoAlerts
 , computers.domain
 , acd.UpTimeStart
 , acd.UpTimeEnd
FROM commands
 JOIN computers ON computers.ComputerID = commands.ComputerID
 LEFT JOIN AgentComputerData acd ON computers.computerid = acd.computerid
WHERE
 commands.Status = 2
 AND commands.DateUpdated < DATE_ADD(NOW(), INTERVAL -15 MINUTE)
 AND commands.DateUpdated > DATE_ADD(NOW(), INTERVAL -60 MINUTE)
 AND computers.ComputerID IN
     (
       SELECT
         ComputerID
       FROM commands
       WHERE status = 2
       GROUP BY commands.ComputerID
       HAVING COUNT(*) > 2)
 AND computers.LastContact > DATE_ADD(NOW(), INTERVAL -15 MINUTE)
 AND computers.os NOT IN ('Darwin', 'Linux')
LIMIT 0, 50

 

  • Set Table to Check and Field to Check to `RAWSQL`.
  • Set Check Condition to GreaterThan
  • Set Result to 0
  • Set Identity Field to `computers.computerid`
  • Paste the code block above into Additional Condition
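
To make the monitor's conditions easier to reason about, here is a rough Python sketch of the same filter applied to in-memory records. It only illustrates the query's logic and is not something LabTech runs. The `Command` and `Computer` classes and the `stuck_computer_ids` function are made-up names, and the fields mirror only the columns the query uses.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

EXECUTING = 2  # the commands.Status value the monitor query filters on


@dataclass
class Command:  # hypothetical stand-in for a row of `commands`
    computer_id: int
    status: int
    date_updated: datetime


@dataclass
class Computer:  # hypothetical stand-in for a row of `computers`
    computer_id: int
    os: str
    last_contact: datetime


def stuck_computer_ids(commands, computers, now):
    """Return the sorted IDs of computers the monitor query would flag."""
    # Subquery: count executing commands per computer (HAVING COUNT(*) > 2).
    executing = {}
    for cmd in commands:
        if cmd.status == EXECUTING:
            executing[cmd.computer_id] = executing.get(cmd.computer_id, 0) + 1

    by_id = {pc.computer_id: pc for pc in computers}
    flagged = set()
    for cmd in commands:
        pc = by_id.get(cmd.computer_id)
        if pc is None:  # inner JOIN: skip commands without a computer row
            continue
        age = now - cmd.date_updated
        if (
            cmd.status == EXECUTING
            # Updated between 15 and 60 minutes ago.
            and timedelta(minutes=15) < age < timedelta(minutes=60)
            and executing.get(cmd.computer_id, 0) > 2
            # The agent is still checking in.
            and now - pc.last_contact < timedelta(minutes=15)
            and pc.os not in ("Darwin", "Linux")
        ):
            flagged.add(cmd.computer_id)
    return sorted(flagged)
```

In other words, a machine is flagged only if it is still checking in, has more than two commands sitting in the executing state, and at least one of them has been untouched for 15 to 60 minutes.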

 

Create an autofix script

 

  1. Import the attached script files
  2. Put psexec.exe in L:\Transfer\Apps\toolkit
    (Or edit the script to where it exists on your L: drive already)
  3. Edit lines 5/6 for your info categories for the types of tickets this will make
  4. Edit line 22 with the script id of this script in your system.
  5. Make sure the last line in the script points to the CTS-Autoclose-Update-Generic function script
    (I don't know how these script exports work)
  6. Add the script to the internal monitor

 

Overview of how the autofix works:

  1. Check that the machine is checking in
  2. See if the machine's already been fixed by injecting a FasTalk command into `commands` table
    (Script execution will pause if you try to run an actual command and the script will never complete)
  3. Delete all non-complete commands for this machine from `commands`
  4. Delete all running scripts (except this one) for this machine from `runningscripts`
  5. Delete all pending scripts for this machine from `pendingscripts`
  6. Delete all remote monitors for this machine from `agents`
  7. Find another machine at this machine's location that is not having the problem
  8. Run PSExec commands from this other machine to restart the agent on the original machine
  9. Inject another FasTalk command into `commands` table
  10. Create/update/close ticket based on value of @status@ variable.
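
Step 7 is the part most likely to go wrong, so here is a minimal Python sketch of how picking the helper machine could work. This is not the logic of the attached script (which is only shown as screenshots). `Agent`, `pick_helper`, and the 15-minute cutoff are assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, Optional, Set


@dataclass
class Agent:  # hypothetical record; field names are illustrative only
    computer_id: int
    location_id: int
    last_contact: datetime


def pick_helper(
    target: Agent,
    agents: Iterable[Agent],
    stuck_ids: Set[int],
    now: datetime,
    max_silence: timedelta = timedelta(minutes=15),  # assumed cutoff
) -> Optional[Agent]:
    """Pick a machine at the target's location that is online and not stuck."""
    candidates = [
        a
        for a in agents
        if a.location_id == target.location_id
        and a.computer_id != target.computer_id
        and a.computer_id not in stuck_ids  # never use a stuck machine as helper
        and now - a.last_contact < max_silence
    ]
    # Prefer the agent that checked in most recently; None if nobody qualifies.
    return max(candidates, key=lambda a: a.last_contact, default=None)
```

Excluding every machine already flagged as stuck matters here: if the helper is itself stuck, the PSExec commands queue up on it and never run.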

 

As always, use this at your own risk. If you have any feedback for me, please let me know. I'm new at this 'exporting scripts' thing, so it's probably borked.

 

Script contents:

[Screenshots of the script: DXwSzzQ.png, TIUG9S5.png, 0HOQ6U8.png]

Attachment: lts.zip


The problem we ran into with this is that the commands to fix the first machine get stuck on the second machine, which then waits for a third machine to be dispatched to fix it. That fix deletes the commands on the second machine, leaving the first machine and possibly the second machine broken. It can become a never-ending cycle until you run out of machines at that site to fix the other ones.


We have been seeing a lot of stuck commands on our servers. For some reason monitors are getting stuck at removing and installing. So we may have to give this a try.


Been working with LT support on this issue for over a week. It is a known issue across multiple partners. Their workaround was the one-off through ScreenConnect as mentioned above. When you have 3700 agents, it's a pain.


I've been working with LT support for about a week too. Just found out today that it's a known issue. I thought it was just us having the issue. We have 5200 agents and we get about a dozen a day with stuck commands. It's getting old quick.


 

We have a similar number of agents and are seeing around 500 after a day or two of doing nothing, which is why I wrote this script. The script has a pretty decent success rate, but anything that blocks PSExec will cause it to fail.


First, thanks kgrube for the excellent write-up.

 

I made some modifications and also created a ScreenConnect plugin to assist with this and other possible tasks.

 

Create an autofix script

 

1. Upload the ScreenConnect plugin to the App_Extensions folder on your ScreenConnect server.

2. Enable the plugin in ScreenConnect administration.

3. Edit the plugin file service.aspx (the URLs below point at Service.ashx, so that is probably the actual file name) and change the value of the key, as this is the only security (like a password, in very loose terms).

4. Import the attached script files

5. Edit lines 5/6 for your info categories for the types of tickets this will make

6. Edit line 26 with the script id of this script in your system.

7. Edit line 39 with the ID of your LabTech server (usually 1, but it can be a different computerid)

8. Edit line 42 to match the key you set in the plugin's service.aspx file.

9. Make sure the last line in the script points to the CTS-Autoclose-Update-Generic function script (I don't know how these script exports work)

10. Add the script to the internal monitor

 

Autofix changes:

Run the restart command on the machine through ScreenConnect instead of using PSExec from another machine on the customer network.

 

 

The ScreenConnect Plugin:

The plugin currently contains two functions.

You can target these with HTTP GET requests.

 

Function ExecuteCommand():

This takes a URL-encoded data variable and sends it as a command to the ScreenConnect guest that matches sessionID.

 

To Use:

https://yourscreenconnecturl:port/App_Extensions/8e78224d-79db-4dbb-b62a-833276b46c6e/Service.ashx/ExecuteCommand?key=aljgdlkajglkjalksjgdl&sessionID=88b7b298-3664-4670-b1ff-bbb61843cc07&data=ipconfig

Function IsOnline():

This takes a sessionID and tells you whether the guest is connected to ScreenConnect. Returns 1 if connected, 0 if not.

 

To Use:

https://yourscreenconnecturl:port/App_Extensions/8e78224d-79db-4dbb-b62a-833276b46c6e/Service.ashx/isOnline?key=aljgdlkajglkjalksjgdl&sessionID=88b7b298-3664-4670-b1ff-bbb61843cc07
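
Since the data value has to be URL-encoded (the restart one-liner is full of spaces, slashes and & characters), here is a small Python sketch of how such a request URL could be assembled. The `plugin_url` helper is a made-up name, and the host, key and session ID are placeholders. Only the path layout and the key, sessionID and data parameters come from the example URLs above.

```python
from urllib.parse import quote, urlencode


def plugin_url(base_url, extension_id, function, key, session_id, data=None):
    """Build a GET URL for the plugin (hypothetical helper, not part of the plugin)."""
    params = {"key": key, "sessionID": session_id}
    if data is not None:
        params["data"] = data
    # quote_via=quote percent-encodes spaces as %20 and '&' as %26, so the
    # command cannot be mistaken for extra query parameters.
    query = urlencode(params, quote_via=quote)
    return f"{base_url.rstrip('/')}/App_Extensions/{extension_id}/Service.ashx/{function}?{query}"


if __name__ == "__main__":
    # Placeholder values; substitute your own server, key and session ID.
    print(
        plugin_url(
            "https://sc.example.com:8040",
            "8e78224d-79db-4dbb-b62a-833276b46c6e",
            "ExecuteCommand",
            "your-key-here",
            "88b7b298-3664-4670-b1ff-bbb61843cc07",
            "net stop ltsvcmon & net start ltsvcmon",
        )
    )
```

Keep in mind that the key travels in the query string, so it can end up in web server and proxy logs.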

Labtech Script.zip

Labtech Helper - ScreenConnect Plugin.zip


I've had the same issue working with LT support. It looks like it may be an issue with Webroot causing CMD commands to hang. The script utilizing ScreenConnect looks very interesting.


We are having trouble finding the service.aspx file in the plug-in. Can you provide us the path to where it is located? We looked in the zip file and do not see it in there.


I keep getting this error on our test systems. Any ideas? ERROR: The process "ltsvcmon.exe" not found. Also, what has everyone found to be a good number of commands a machine can handle before it locks up? We are seeing this happen on almost all servers and probably 25% of all workstations.


I reported this issue to LabTech on the 27th of October and to Webroot on the 1st of November. Webroot is actually suspending other processes as well. We think it has been causing issues with terminal servers and connecting to the server. LabTech tells me they are in contact with Webroot. Webroot tells me that only two clients have reported the issue. If you are having this issue, please report it to Webroot. If we do not all tell Webroot to work on the issue, it is going to continue. Doing manual processing of stuck commands is not what we are paying LabTech or Webroot for. I started to create a script to look for the stuck processes on systems, because some systems have processes stuck but are not backlogging anything currently.


I have reported it to LT support and not Webroot. I was told LT development is working on a fix. So LT is telling you Webroot is causing these issues? Are you running 10.5 or 11? We are on 10.5 patch 8.


LabTech is reporting on my ticket that they are working with Webroot, as it appears to only affect LabTech partners with Webroot. I have clients that are using their own AV (Symantec, Vipre) and we do not have the issue with them. Up until late today, when communicating with Webroot, they told me that only one other Webroot client has reported the issue to them. (We used Vipre before Webroot, and Webroot detected things Vipre never saw.) Our next step is to start looking at Bitdefender if Webroot cannot get this issue resolved soon.

 

From our findings, LabTech processes and some that have nothing to do with LabTech are getting suspended. Until you turn off Webroot or uninstall it, you cannot kill these suspended processes; the attempt comes back as Access denied. If you do kill the parent process, it orphans the suspended process, and those start to build up on the system. We are concerned that this is the reason we had 2 different Terminal Servers stop allowing connections until rebooted. We have not been able to isolate that issue, but once we started to clean up the suspended processes the issue started to diminish.

 

Late this afternoon I got an email from the escalation tech at Webroot saying that they are working on the issue with LabTech. The workaround is to uninstall Webroot, reboot, set up a policy to turn off automatic updates, and install a previous version with the no-update switch.

 

We are currently on 11 in a cloud instance with Labtech.

 

The issue we have currently is finding the systems that have suspended processes before they queue up a bunch of commands. Some of these commands are scheduled reboots that are getting missed.

 

I can appreciate the complexity of the issue and the time it takes to get a solution in place, but the lack of communication from both Webroot and LabTech is frustrating. I know that as an MSP, if we communicated with our clients in the same way, we would be losing clients. Unfortunately I have found that this is true of most RMM providers. We came from Kaseya and they were much worse than LabTech. LabTech does an excellent job on several layers, but these little nagging issues that require manual steps to get the product to work right are counterproductive, and LabTech management just does not care, at least in my experience of trying to get my concerns heard by high-level management.


Hi

 

Want to add our experience re Webroot over the last 2 months across ± 900 seats.

 

I can confirm that Webroot breaks the Terminal Server logon process on many of our TS servers. We went through many upgrades, policy changes etc. and it is still happening. You may have received the email this morning about setting the ‘Self Protection’ to low. This leads to another shortcoming in the portal – You cannot apply changes globally across sites – if you create a new policy you have to go to each site, search for servers and then apply the new policy. I feel really sorry for anyone with lots of sites and wanting to implement something new.

 

We had Webroot blue screen many workstations using wireless keyboards.

 

We had to re-register Revit after installing WR as it breaks the registration, so architects and related users cannot work after a WR install. Revit even has a KB about this issue.

 

If you use Xero Accounting's bank statement import, it will not work with FNB in South Africa. I closed the request at WR after weeks of no feedback.

 

We suspect that WR is also breaking some versions of Sage (Pastel Accounting) – we have numerous reports from clients pointing in this direction.

 

The worst thing is we had 2 ransomware infections this week where WR was installed, running, up to date, etc., and it still did not prevent it from happening. WR could not ‘roll back the changes’ as is promised everywhere, and the files were lost or had to be recovered from backup.

 

There are other less prominent problems with using this product.

 

1. Reporting is very limited

2. We had quite a few issues with websites being blocked and an inability to control this

3. The unclear policies regarding what data is being collected and stored in US data centers

4. Etc.

 

I wish we had the above information 2 months ago, perhaps our experience will help someone else regarding implementing WR or NOT.


Individually, as MSPs and large IT departments, we tend to get brushed off as if it is an isolated issue. We continue to find a "work around" for situations that are pushed upon us because there are not enough voices calling out to demand a fix. I would like to figure out a way we can help each other provide a unified voice to these providers to get action taken more quickly. I have a list of items that have been reported to LabTech in particular that still have not been fixed. For instance, has anyone noticed that the search for online systems in the advanced search returns systems that have not been online for more than a month?

 

This forum is a good place to share information, but I doubt it has any impact on LabTech or Webroot. Even the LabTech forum seems to have no impact on LabTech. I would still like to stress that the LabTech product is, in my opinion, better than the others we looked at, but their support and SLAs are right in line with the rest: poor and not adhered to.


Will have to give these work-around scripts a try - thanks to those who posted them. We're also seeing this issue and LT support confirmed it. I suspect a fix might be a while now that most of LT is at IT Nation. The reports in here of Webroot blocking a number of critical functions is concerning to me since we've been moving customers from SEP to Webroot. We had no issues with SEP but Webroot costs less and our own internal review of it prior to adopting it as an offering showed it worked well.


We were using the Vipre cloud product before switching to Webroot. We are much happier with Webroot than we were with Vipre. I still stand behind the switch to Webroot. With that said, we cannot just accept the fact that it breaks our environments and just do the workaround. We need to let Webroot and LabTech know that it is broken so they can see the full impact of the issue. The more we report it, the more data they can collect. Right now I have 9000+ commands that are stuck (look at "Dataviews>Commands>Executing commands" in the LabTech Control Center), and for the last two weeks I have spent my days releasing commands and rebooting systems.

 

You can find them in MySQL with "SELECT * FROM commands WHERE STATUS=2" and delete those rows (the SELECT on its own only lists them; the matching delete would be "DELETE FROM commands WHERE STATUS=2"). This removes them from the list to process, but it does not fix the suspended commands. You can restart the system, but the problem will return. You can stop Webroot, kill the processes, then restart Webroot. You can restart the LabTech services, but when you do this it leaves orphaned processes in suspended mode.

 

The "work arounds" are great to get the issue remediated temporarily but, as we all know, the more we have to do to get it to work correctly the less the product becomes automated.


Webroot support response this morning:

 

however unless the issue is readily reproducible this is unnecessary as we are already attempting to reproduce this issue within multiple environments on our side in addition to the environments Labtech Support is using with their attempts.

 

I have offered my servers with a reproducible issue. If anyone else has a server that can have the issue produced on demand please contact webroot.
