BlueToast

Connection failures spewing in all agent logs (including RMM server)


Is anyone else experiencing this issue (observable in LTErrors.txt agent log file)?

Quote

LTService  v190.58     - 3/22/2019 12:35:56 AM     - WebRqst: https://yourrmmserver.website.com/LabTech/agent.aspx?30034c5&10 : ConnectFailure : Unable to connect to the remote server :   inner: No connection could be made because the target machine actively refused it 123.123.123.123:443 : :::
LTService  v190.58     - 3/22/2019 12:36:07 AM     - WebRqst: https://123.123.123.123/LabTech/agent.aspx?30034c5&10 : ConnectFailure : Unable to connect to the remote server :   inner: No connection could be made because the target machine actively refused it 123.123.123.123:443 : :::
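For anyone who wants to gauge how widespread this is on a given agent, here is a rough Python sketch that tallies `WebRqst` failures in an LTErrors.txt file by status (`ConnectFailure`, `SecureChannelFailure`, and so on). The line layout is inferred only from the excerpts quoted in this thread, and the names `LINE_RE`, `WEBRQST_RE`, and `tally_failures` are made up for illustration, so treat it as a starting point rather than a parser for the real log format.

```python
import re
from collections import Counter

# Assumed layout, based only on the lines quoted in this thread:
#   LTService  v<version>  - <M/D/YYYY h:mm:ss AM|PM>  - <message>
LINE_RE = re.compile(
    r"^LTService\s+v(?P<version>[\d.]+)\s+-\s+"
    r"(?P<timestamp>\d{1,2}/\d{1,2}/\d{4} \d{1,2}:\d{2}:\d{2} [AP]M)\s+-\s+"
    r"(?P<message>.*)$"
)

# Web request failures look like: "WebRqst: <url> : <status> : <detail> ..."
WEBRQST_RE = re.compile(r"^WebRqst: (?P<url>\S+) : (?P<status>\w+) : ")


def tally_failures(lines):
    """Count log entries by WebRqst status; non-WebRqst entries go under 'other'."""
    counts = Counter()
    for line in lines:
        entry = LINE_RE.match(line.strip())
        if entry is None:
            continue  # blank line or a layout this sketch does not recognise
        web = WEBRQST_RE.match(entry["message"])
        counts[web["status"] if web else "other"] += 1
    return counts
```

To use it, open the log (for example `open(path, errors="replace")`, where `path` is wherever LTErrors.txt lives on the agent) and pass the file object to `tally_failures`. Comparing the counts across a few agents should show whether refused connections dominate or whether TLS errors are mixed in.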

In a client environment with a solid and stable network, I have personally observed the impact of this: the RMM server thought an agent was offline even though I was remoted into another agent on the same network. In communication with ConnectWise, they confirmed that this could result in missed alerts and data (the original focus of our issue).

We have also checked three different RMM systems (one of which is our own on-prem, another which is cloud-hosted by ConnectWise, and the third in Azure) and found the same issue to exist in those environments, including on the Automate application server. Two systems we checked had no AV whatsoever, and we did not find anything misconfigured with DNS or network ports.
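Since the log says "actively refused", it may help to separate a refused connection from a timeout or a DNS failure when testing from an affected machine. A minimal Python sketch follows; the function name `probe_tcp` and the 5-second timeout are illustrative choices, and it only tests a plain TCP connect, not the TLS handshake or the agent's own HTTP request.

```python
import socket


def probe_tcp(host, port, timeout=5.0):
    """Try one TCP connect and classify the outcome.

    Returns 'open', 'refused', 'timeout', or 'error: <details>'.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The remote host (or a device in front of it) rejected the connection.
        return "refused"
    except socket.timeout:
        # No answer at all within the timeout; packets were probably dropped.
        return "timeout"
    except OSError as exc:
        # Everything else, including name resolution failures.
        return f"error: {exc}"
```

For example, `probe_tcp("yourrmmserver.website.com", 443)` run in a loop from an agent machine should return "open" consistently on a healthy path. Intermittent "refused" results would point at the server or something in front of it rejecting connections, while "timeout" would point more toward packets being dropped along the way.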

Additional details:

  • Azure RMM has fewer than 10 agents
  • Cloud-hosted RMM has around 25 agents
  • Our on-prem RMM has 10k+
Edited by BlueToast
Added additional details (agent counts for each RMM instance)


You might check your CWA server's firewall/router.  Long ago when starting out we had a lower end router that could not handle "n" simultaneous connections. The sporadic connections got progressively worse as we added agents.  For instance does it get better at night when PCs turn off?  Sometimes it's more subtle, for instance with some advice from others I got AT&T to confirm the router they supply with their AT&T Business Fiber can't handle a high number of connections (forget the number exactly), so even if we have our own behind it, it would be a problem.


We saw that, too.  I had assumed it was from blocking TLS 1.0 server-side, now that I've finally killed 2003.


We have an open ticket with support for this exact issue and currently have 312 of our 915 total agents showing offline. Some of these are obviously legitimate, but it's safe to say the majority are caused by this error. A reinstall has been shown to fix the issue temporarily (tray icon comes up, connection is made, heartbeat is okay), but later that day or the next day the machine stops checking in and starts throwing those errors again.

Other errors in the log we have gathered:

LTService  v190.78     - 4/11/2019 1:30:26 PM     - Failed Signup, Will wait over 30 minutes to try again.:::

LTService  v190.78     - 4/11/2019 10:11:39 PM     - WebRqst: https:/ourfqdn.com/LabTech/agent.aspx?0c1&9 : SecureChannelFailure : The request was aborted: Could not create SSL/TLS secure channel. : :::

Other than updating to the 2019.3 release, nothing else has changed in our environment. We can find no rhyme or reason as to which agents are affected. Sometimes it's the whole client, other times it's a few agents, and some clients are connecting without issue.

We suspect it has something to do with the update announced last month. We are on the 2019.3 release and almost every agent is currently on version 190.78, but we are still having issues. I will update this with more info as our ticket gets worked on or resolved. It's been "escalated" for a few days and, AFAIK, no progress has been made on it yet.

On 3/28/2019 at 3:38 AM, SteveYates said:

You might check your CWA server's firewall/router.  Long ago when starting out we had a lower end router that could not handle "n" simultaneous connections. The sporadic connections got progressively worse as we added agents.  For instance does it get better at night when PCs turn off?  Sometimes it's more subtle, for instance with some advice from others I got AT&T to confirm the router they supply with their AT&T Business Fiber can't handle a high number of connections (forget the number exactly), so even if we have our own behind it, it would be a problem.

Our downstream providers use stateless equipment and our firewalls are rated for 500k+ concurrent connections. We are nowhere near that limit. Additionally, the cloud-hosted and Azure-hosted RMM instances are fairly small (around 25 agents or fewer) and are also experiencing the connection failures, so I don't think this is the root of the issue. I wish ConnectWise would troubleshoot the cloud-hosted RMM instance, since it's their own environment and setup.
