timwiser

New tool for monitoring Dell hardware - Feedback required :)


1 hour ago, LoneWolf said:

My one thought is how to ensure the same results for VMWare ESXi hosts.  We don't have very many, but it's not as easy there.  I also don't think LabTech has support for EsxCLI commands to query the arrays and report back.

If you get the Virtualization Manager set up, CWA can create tickets for disk issues on ESXi hosts (as long as the host is aware of its own hardware and you can see the disks on the host from inside VMware). You also have to turn ticketing on from inside the Virtualization Manager; simply having it recognize your virtual hosts is not enough.

Virtualization Manager
https://docs.connectwise.com/ConnectWise_Automate/ConnectWise_Automate_Documentation/080/060/010

Beyond the requirements in the documentation, it helps to understand what a "Probe" machine does: when it runs network scans, when it attempts to identify a device (switch, router, printer, hypervisor), how to force a scan or a rediscovery, and how to test credentials against a discovered ESXi host. You also need a little patience :) . I cannot confirm it, but I felt I had better luck when SNMP was enabled on the ESXi hosts and their SNMP community string was configured as one of the strings the "Probe" machine uses for scanning/discovery. Once the probe has discovered an ESXi host, it places it at the top of the "Network" list for that location, and if you have the correct credentials, it will show up in the Virtualization Manager.

I have a couple plugins from Plugins4LabTech and also considered purchasing their VMWare ESX Health Monitor plugin
https://www.plugins4labtech.com/products/vmware-esx-health-monitor

It looks much easier to deploy and set up, and may even be your only alternative if you've been demoing the new Network Probe. IIRC, the new Network Probe demo isn't compatible with Virtualization Manager.


Just as FYI if it's useful for people, here's my basic autojoin search:

image.png.b6faae1fa5541a6d46d64ad8a9415cb5.png

Here's the remote monitors on my group this search joins servers to:

image.thumb.png.0f38820bbb42e635adf1a416548300d3.png

This covers every OpenManage Disk Event ID I saw as important, taken directly from Dell's own documentation. Several, like Unexpected Sense, Command Timeout on Physical Disk, or the SMART categories, are canaries in the coal mine (possible issues but not yet failure), whereas others are quite obvious. We route these via different categories to ConnectWise, with the high-priority ones going to a Triage board our dispatcher can see.

I also have scripts that run CLI commands with omreport, Dell's OpenManage tool, to do things like run a consistency check on a RAID array, or query a server for its controllers, physical disks, and virtual disks if anyone is interested.
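For anyone wondering what such queries look like, here is a minimal sketch (not LoneWolf's actual scripts) that builds the OpenManage CLI calls mentioned above. It assumes `omreport` and `omconfig` are on the PATH and that controller and virtual disk IDs start at 0; the function names are made up for illustration.

```python
import shutil
import subprocess


def report_cmd(kind, controller=None):
    """Build an omreport storage query; kind is 'controller', 'pdisk' or 'vdisk'."""
    if kind not in ("controller", "pdisk", "vdisk"):
        raise ValueError(f"unsupported report kind: {kind}")
    cmd = ["omreport", "storage", kind]
    if controller is not None:
        # pdisk queries need a controller ID; the others accept it as a filter.
        cmd.append(f"controller={controller}")
    return cmd


def consistency_check_cmd(controller, vdisk):
    """Build the omconfig call that starts a consistency check on one virtual disk."""
    return ["omconfig", "storage", "vdisk", "action=checkconsistency",
            f"controller={controller}", f"vdisk={vdisk}"]


def run(cmd):
    """Run a command and return its text output, or None if the tool is not on PATH."""
    if shutil.which(cmd[0]) is None:
        return None
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return result.stdout
```

Calling `run(report_cmd("pdisk", 0))` would list the physical disks on controller 0, and returns `None` on a machine without OpenManage's command line tools.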

Edited by LoneWolf


I would be interested in those scripts for querying OpenManage. The monitors you have set up have given me a few ways to tackle the issue of monitoring our Dell hardware. Do you have a monitor to alert if a server is missing OpenManage?

 

On 12/28/2017 at 3:23 PM, LoneWolf said:

Just as FYI if it's useful for people, here's my basic autojoin search:

image.png.b6faae1fa5541a6d46d64ad8a9415cb5.png

Here's the remote monitors on my group this search joins servers to:

image.thumb.png.0f38820bbb42e635adf1a416548300d3.png

This provides every OpenManage Disk Event ID I saw as important, going direct from Dell's own documentation.  Several, like Unexpected Sense, Command Timeout on Physical Disk, or SMART categories are canaries in the coal mine (possible issues but not yet failure), whereas others are quite obvious.  We direct these via different categories to Connectwise, with the high priority ones going to a Triage board our dispatcher has view of.

I also have scripts that run CLI commands with omreport, Dell's OpenManage tool, to do things like run a consistency check on a RAID array, or query a server for its controllers, physical disks, and virtual disks if anyone is interested.

I got these remote monitors and the alerts set up, and they seem like they will work great.

I am reviewing the documentation for OpenManage and its command line. Can I ask how you are scheduling consistency checks? Are you manually editing the script for each server, since not all servers have the same VD setup, or do you have another way of doing it?


@abeauchamp I have a script for consistency checks, but have not been scheduling it.  I primarily use this script manually when an array has had a drive replaced and everything is good after an array is rebuilt.

I am looking into scheduling such scripts, but would probably need to do so on a per-client basis; not everyone can have their I/O ramped up on a weekend, and clients that run third shift can't always have that same ramp-up (and corresponding performance decrease) in the wee hours of the morning.
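A per-client schedule like that could be expressed as a small lookup of allowed windows. This is only a sketch under assumptions: the client names and windows below are invented, and the check would still have to be wired into whatever actually launches the script.

```python
from datetime import datetime, time

# Hypothetical per-client windows: (allowed weekdays, start, end). Monday is 0.
WINDOWS = {
    "office-client": ({5, 6}, time(1, 0), time(5, 0)),       # weekend, early morning
    "third-shift-client": ({6}, time(12, 0), time(16, 0)),   # Sunday afternoon
}


def in_window(now, days, start, end):
    """True if `now` falls on an allowed weekday between start and end."""
    if now.weekday() not in days:
        return False
    if start <= end:
        return start <= now.time() < end
    # Window wraps past midnight (e.g. 22:00 to 02:00); the weekday test
    # applies to the calendar day of `now`.
    return now.time() >= start or now.time() < end


def may_run_check(client, now):
    """Allow a consistency check only inside the client's configured window."""
    window = WINDOWS.get(client)
    return window is not None and in_window(now, *window)
```

Clients without an entry are never scheduled automatically, which matches the cautious default of running the check by hand.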

I'm also finishing up a script that checks all Dell servers for the presence of OpenManage, then checks that the version is at least x.x.x, and confirms that it isn't a 32-bit install of OpenManage running on a 64-bit server. If any of the checks fail, a ticket is opened. It's still in progress, but I probably only want to run it about every six months after running it manually. I just want to make sure any older servers are up to date and that no mistakes are made, since all of the checks rely on OpenManage being present.
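The three checks can be sketched roughly as follows. This is not the script described above, just an illustration of the logic; how the installed version and bitness are detected is left out, and the minimum version is whatever you pass in.

```python
def parse_version(text, width=3):
    """Turn '9.1.0' into (9, 1, 0), padding short versions so '9.2' equals '9.2.0'."""
    parts = [int(p) for p in text.strip().split(".")]
    return tuple(parts + [0] * (width - len(parts)))


def openmanage_problems(installed_version, omsa_is_64bit, os_is_64bit, minimum):
    """Return a list of failed checks; an empty list means everything passed."""
    if installed_version is None:
        # Nothing else can be checked without OpenManage present.
        return ["OpenManage is not installed"]
    problems = []
    if parse_version(installed_version) < parse_version(minimum):
        problems.append(f"OpenManage {installed_version} is older than {minimum}")
    if os_is_64bit and not omsa_is_64bit:
        problems.append("32-bit OpenManage on a 64-bit server")
    return problems
```

Comparing versions as tuples of integers avoids the string-comparison trap where "10.0.0" sorts before "9.0.1". A non-empty result is the point at which a ticket would be opened.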


@abeauchamp, et al. I'm going to create a separate thread for my methods, as they should probably all be linked. They'll be on this board; you should see them shortly.

 

Edited by LoneWolf

1 minute ago, abeauchamp said:

@LoneWolf , Thanks for the info and the scripts. I will review them this weekend. Will help with getting our Dell server management more automated.

Happy New Year!

My setup has been linked in a new thread, to keep from cluttering this one. You're very welcome. @dpltadmin, I thank you for your thread as well. I think your idea is great; my main reason for diverging is so that I don't rely too much on you for support of an .EXE file for who knows how long (we all know that relative at Thanksgiving who needs computer help; I don't want to be that with you :D).

Here's a link for my stuff, and thanks for being the one to get the wheels turning in my head.

 

10 hours ago, LoneWolf said:

@dpltadmin, I thank you for your thread as well. I think your idea is great, my main reason for diverging is so that I don't rely too much on you for support to an .EXE file for who knows how long (we all know that relative at Thanksgiving who needs computer help; I don't want to be that with you :D).

 

@LoneWolf, you are welcome. I'm assuming you're talking about my post on the Virtualization Manager for ESXi hosts. Just to be clear, that method has nothing to do with Tim's original .EXE. Everything I mentioned in that post relies on CWA built-in functionality.

In my post on page 1 of this thread (again, not my .EXE), I just laid out the path and the things I did to implement Tim's tool. I can't take credit for his handiwork.

I'm looking forward to reviewing the items you've laid out in this thread and possibly implementing them myself. Great work!

 

On 12/28/2017 at 3:23 PM, LoneWolf said:

Just as FYI if it's useful for people, here's my basic autojoin search:

image.png.b6faae1fa5541a6d46d64ad8a9415cb5.png

Here's the remote monitors on my group this search joins servers to:

image.thumb.png.0f38820bbb42e635adf1a416548300d3.png

This provides every OpenManage Disk Event ID I saw as important, going direct from Dell's own documentation.  Several, like Unexpected Sense, Command Timeout on Physical Disk, or SMART categories are canaries in the coal mine (possible issues but not yet failure), whereas others are quite obvious.  We direct these via different categories to Connectwise, with the high priority ones going to a Triage board our dispatcher has view of.

I also have scripts that run CLI commands with omreport, Dell's OpenManage tool, to do things like run a consistency check on a RAID array, or query a server for its controllers, physical disks, and virtual disks if anyone is interested.

I'd be very interested in these scripts if you have them and are willing to share them. Thanks in advance.


@timwiser, I'm having an issue with a non-RAID disk:

Monitor Log shows: Controller 0, physical disk 0:1:6
* indicates predictive failure
X indicates a non-genuine drive
Issues with disk(s) detected!
No issues with chassis.
No issues with battery.
Result code 2

 

OpenManage does not show any alerts for this disk, and I am using the /ignorenongenuine:yes switch.
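When comparing the tool's output against OpenManage, it can help to pull the flagged disk IDs and the result code out of the monitor log. The sketch below assumes the log always looks like the single sample above; the regular expressions and the function name are my own, not part of Tim's tool.

```python
import re

# Patterns derived from the one sample log in this post; other layouts are untested.
DISK_LINE = re.compile(r"Controller (\d+), physical disk ([\d:]+)")
RESULT_LINE = re.compile(r"Result code (\d+)")


def parse_monitor_log(text):
    """Extract flagged (controller, disk id) pairs and the result code from log text."""
    disks = [(int(m.group(1)), m.group(2)) for m in DISK_LINE.finditer(text)]
    code = RESULT_LINE.search(text)
    return {"disks": disks, "result_code": int(code.group(1)) if code else None}


SAMPLE = """Controller 0, physical disk 0:1:6
* indicates predictive failure
X indicates a non-genuine drive
Issues with disk(s) detected!
No issues with chassis.
No issues with battery.
Result code 2"""
```

The disk ID it returns (here `0:1:6` on controller 0) is the one to look up in OpenManage's own physical disk report when the two disagree.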


@LoneWolf You could look at the VMware ESXi plugin from Plugins4Labtech.com for hardware monitoring on your VMware hosts. We use it and it just works.

 

On 5/3/2018 at 10:14 AM, jshaw said:

I'm having an issue with a non-RAID disk:

Monitor Log shows: Controller 0, physical disk 0:1:6
* indicates predictive failure
X indicates a non-genuine drive
Issues with disk(s) detected!
No issues with chassis.
No issues with battery.
Result code 2

 

OpenManage does not show any alerts for this disk, and I am using the /ignorenongenuine:yes

Has anyone else run into this issue with Non-RAID disks?


I am running this as a remote monitor. When I test it, I get the following: Failed Result:Exe Monitor Missing

 

I verified the path, and running it manually with no parameters I get: image.png.15628a4b26b5219557ab66a8e11290f0.png

 

If I add parameters and run it manually, a cmd window opens and closes and I never see any results.

 

image.png.9f3badb52af902ac7fa13180d117fcb2.png


Heads up:

The OMSA VIB for VMware 6.7 is out. It requires OMSA 9.2 (or, if you have a PE R840 or R940, you can also use 9.1.1). However, if you do a clean install of OMSA 9.2, you will notice that dellmonitoring.exe claims OMSA isn't installed. This is because OMSA 9.2 and 9.1.0 do not install some additional features (namely the 'Command Line Interface' feature that installs OMREPORT.EXE).

 

To get this working, click the 'Older version' link in the Dell downloads, download and install OMSA 9.0.1, and then install the "DUP" (Dell Update Package) releases for 9.1.0 and then 9.2.

 

For whatever reason, when you install 9.1 or 9.2 and choose 'Custom', the 'Command Line Interface' feature is not offered. Without that feature selected you will not get OMREPORT.exe (so make sure it's selected when installing 9.0.1).
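A quick way to tell whether a server is affected is to check whether omreport exists at all. The sketch below is a guess at how one might do that: the two install paths are assumed OMSA defaults on Windows and may differ on your servers, and `find_omreport` is a made-up helper name.

```python
import os
import shutil

# Assumed default OMSA locations on Windows; adjust for the actual install path.
CANDIDATES = [
    r"C:\Program Files\Dell\SysMgt\oma\bin\omreport.exe",
    r"C:\Program Files (x86)\Dell\SysMgt\oma\bin\omreport.exe",
]


def find_omreport(candidates=CANDIDATES, which=shutil.which, exists=os.path.isfile):
    """Return the path to omreport if it can be found, else None."""
    on_path = which("omreport")
    if on_path:
        return on_path
    for path in candidates:
        if exists(path):
            return path
    return None
```

If this returns `None` on a server where OMSA 9.1 or 9.2 is installed, the 'Command Line Interface' feature is probably the missing piece.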

Edited by ohsnap
typos errors etc


Hi Guys,

This looks amazing! I have set up a test location and group, and put two test servers in the group (manually). One has a degraded RAID; the other is fine. I have created two Remote Monitors: one with the Condition Equals 0 and 2, and a second with results equal to 0, added for testing. The monitor is set to run hourly. The odd thing is that I am getting two tickets generated for each server, even though one of them has no failed disks or any other issues. Any ideas where I am going wrong?

image.thumb.png.e12c3363ec486fc5a65ec2f3442aeb7c.png

 

Also, in the remote monitor's data summary I get an error of "exe monitor missing".

Any help would be fab!

 

 

Edited by TechnicalHero

On 7/3/2017 at 10:16 AM, skyscan said:

@dpltadmin

 

Here is the script I have pieced together that transfers Tim's tool over and also downloads/installs Dell Open Manage.

 

I have two Dell Groups - both with auto join searches.

One group is looking for Dell servers that do not have open manage (The script I posted runs on that group every hour)

My other Dell group looks for servers that do have open manage and I have Tim's .exe monitor applied to that group as a remote monitor.

 

 

MX - Dell Monitor Install.zip

Hi

 

When the script downloads, it does not seem to extract. When you run the command on a desktop, it splashes a WinZip dialog; could that be causing issues with the script? Any thoughts would be appreciated.

Thanks

