HP Server Monitoring (SNMP, OID, Windows)

Hi All,

 

I just wanted to share my way of monitoring our clients who use HP server hardware.

 

I've found that this is the most reliable & functional for how we work & monitor our customers & thought I'd share it in case it helps anyone else.

 

Please note: I will not be responsible for any damage / downtime caused to your devices.

 

First of all, you need to have SNMP installed on your servers. You can do this by following the steps below:

 

Install SNMP feature on the OS that is directly on top of the hardware

 

Log into your dedicated server

Click on Windows Key > Administrative Tools > Server Manager.

Click Manage > Add Roles and Features.

Click Next until you reach Features > Select SNMP Service

 

SNMP Service Installed

 

Click on Windows Key > Administrative Tools > Services.

Right click on SNMP Service and click on Properties.

Click on the Security tab.

Add permissions as below:

public - READ ONLY

private - READ WRITE

Set "Allow SNMP Packets from these hosts"

Ensure "localhost" & "127.0.0.1" are in the field below.

 

One thing to note: if you are managing a Server Core installation over RDP & you cannot see the security settings for the SNMP service, you will need to install SNMP on the machine you are accessing the server from.

 

Secondly, you will need to ensure you have HP System Management Homepage installed on the physical server.

You will need to download the drivers from HP's website. The items below are required for this to work properly:

 

  • iLO Channel Interface Driver

  • iLO Management Controller Driver

  • HP Insight Management Agents for Windows x64 editions (NOT WBEM, these are the SNMP Agents)

  • HP System Management Homepage

 

Try to install them in the order above & you shouldn't have any problems. The management agents will not install if the SNMP feature is not installed.

 

Thirdly, you will need to create a group for this particular set of servers / monitors.

 

  • Create a group.

  • Add your searches to the group so you can auto-join devices.

  • I set the group as master to keep devices in the group & limit the search.

 

Once this has been completed, you will need to add your monitors.

 

  • Right click the group you created

  • Select "add remote monitors"

  • Select "Monitor the Network" and next

  • Select "SNMP"

  • Input the loopback address here "127.0.0.1" (as we added earlier in the service security) & Select next

  • Input the OID you want to monitor, ensure the community string is set to public & select next

 

Below are the OIDs I use to monitor critical hardware. You can find other items to monitor by using a MIB browser & searching the MIB file "cpqhlth.mib" (download from HP website).

 

  • HP - Memory Check
    .1.3.6.1.4.1.232.6.2.14.4.0

 

  • HP - PSU Condition
    .1.3.6.1.4.1.232.6.2.9.1.0

 

  • HP - RAID Status
    .1.3.6.1.4.1.232.3.1.3.0

 

  • HP - Overall Server Health Status
    .1.3.6.1.4.1.232.6.1.3.0

 

Now, here's the interesting part: you can choose different methods for the comparison.

I set mine as:

Less than/Equals - 2

 

1 = Other, 2 = Healthy, 3 = Degraded, 4 = Critical or error.

 

This sets the monitor up so that a returned status value of 1 or 2 counts as a success, and a returned value of 3 or 4 counts as a failure.
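The comparison described above can be sketched in a few lines of Python. This is only an illustration of the "Less than/Equals 2" logic, not anything Labtech runs internally. The names `STATUS_NAMES`, `MONITORS`, `monitor_passes` and `describe` are made up for this example; the OIDs and status codes are the ones listed in this post.

```python
# Status codes as listed in the post.
STATUS_NAMES = {1: "Other", 2: "Healthy", 3: "Degraded", 4: "Critical or error"}

# Monitor names and OIDs as listed in the post.
MONITORS = {
    "HP - Memory Check": ".1.3.6.1.4.1.232.6.2.14.4.0",
    "HP - PSU Condition": ".1.3.6.1.4.1.232.6.2.9.1.0",
    "HP - RAID Status": ".1.3.6.1.4.1.232.3.1.3.0",
    "HP - Overall Server Health Status": ".1.3.6.1.4.1.232.6.1.3.0",
}

# Mirrors the "Less than/Equals - 2" comparison chosen in the monitor wizard.
THRESHOLD = 2


def monitor_passes(value: int, threshold: int = THRESHOLD) -> bool:
    """Return True when the returned status counts as a success."""
    return value <= threshold


def describe(value: int) -> str:
    """Render a status value the way a human would read the monitor result."""
    name = STATUS_NAMES.get(value, "Unknown")
    state = "SUCCESS" if monitor_passes(value) else "FAIL"
    return f"{value} ({name}): {state}"


if __name__ == "__main__":
    for value in (1, 2, 3, 4):
        print(describe(value))
```

Note that with this comparison a value of 1 (Other) is treated as a success, which matches the behaviour described above.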

 

Select your frequency & alert template. I set mine to alert "once" so it doesn't spam the ticket board.

 

You do not need to add any alert messages.

 

Name the monitor & select finish, then wait while the monitor is applied to all the servers in your group. (I found that it took a good 30 minutes to apply to our servers, so be patient!)

 

Once the monitor is added you can run a test via one of the servers to see if it reports a status back.
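If you want to check a value outside of Labtech first, one option is to query the OID by hand on the server. The sketch below assumes the Net-SNMP `snmpget` command-line tool is installed and on the PATH, which is not part of Windows or of anything in this guide. The function names are made up for the example, and the parser only handles integer replies of the form `... = INTEGER: 2` or `... = INTEGER: ok(2)`.

```python
import re
import subprocess
from typing import List, Optional

# Matches "= INTEGER: 2" as well as "= INTEGER: ok(2)" (label form when MIBs are loaded).
_INTEGER_RE = re.compile(r"=\s*INTEGER:\s*(?:\w+\()?(\d+)\)?")


def build_snmpget_command(oid: str, host: str = "127.0.0.1",
                          community: str = "public") -> List[str]:
    """Build the argument list for an SNMP v2c GET against one OID."""
    return ["snmpget", "-v2c", "-c", community, host, oid]


def parse_integer_reply(line: str) -> Optional[int]:
    """Extract the integer status from snmpget output, or None if absent."""
    match = _INTEGER_RE.search(line)
    return int(match.group(1)) if match else None


def query_status(oid: str) -> Optional[int]:
    """Run snmpget (assumed to be installed) and return the parsed status."""
    result = subprocess.run(
        build_snmpget_command(oid),
        capture_output=True,
        text=True,
        timeout=10,
    )
    return parse_integer_reply(result.stdout)
```

Calling `query_status(".1.3.6.1.4.1.232.6.1.3.0")` on the server should then give you the same number the monitor will see, or `None` if the agent did not return an integer for that OID.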

 

 

  • This does not affect your servers if you are using / want to continue using WBEM for the HP system management homepage.
    Servers should not require a reboot after this either.

 

 

I've written this because I searched for up-to-date HP hardware monitoring via Labtech & didn't have much luck finding a way to actively monitor our clients' HP server hardware.

 

I am aware you can use WMI queries now & other methods, but I find SNMP is just so much easier with Labtech & how we use it.

 

Hope this helps someone!


Thank you for this but just curious where you got the OID values from? I have an HP server that is perfectly healthy (according to all aspects of the system management homepage) but is returning a value of 4.

 

I would like to read more info on this - or just curious if you have seen this before?

 

Any help would be appreciated.


Check your IML logs in iLO or System Management Homepage. Events in there don't auto-resolve, so if there were any left in the logs from a previous event it will cause the OID to pull a warning or error state.


You are correct, I had to clear the IML logs for things to be happy. Everything is working great, thanks again for the post!


Glad you managed to resolve your issue. Yes, if there are previous logs in IML or elsewhere, it can send an error over.

 

The OID values were gathered by using a MIB browser & searching the MIB file "cpqhlth.mib" (downloaded from HP website). You can do the same if you want to look for any other values.


Hi, I fortunately ran across this forum and clearing the IML log worked for me as well.  Have an older G6 ProLiant running iLO 2.  OID .1.3.6.1.4.1.232.6.1.3.0 was showing a value of 4 until I cleared the IML, then voilà, with a quick refresh it is now showing a value of 2.  Thanks!

Now on to correct my RAID Controller Status, with an OID of 1.3.6.1.4.1.232.3.2.2.1.1.6.0

Getting a returned value of NULL.  Wondering if I even have the right OID for this?

 

Best, John


Trying the RAID status documented above, using the listed OID:

HP - RAID Status
.1.3.6.1.4.1.232.3.1.3.0

Reporting in fine, now on to trigger an alert to test.


Just a small update.

 

If you are having issues with this on Gen10 HPE servers, just install the iLO Channel Interface Driver along with the Agentless Monitoring Service.

 

Enable the HP service that is set to disabled on the server where you've just installed the Agentless Monitoring Service, and the above guide will still work even without the HP System Management Homepage.
