Top Process PowerShell script

Task Manager output for 'Top Process PowerShell script management pack'
Task Manager output for ‘Top Process PowerShell script management pack’

 

Ever wish you had task manager output when you had a monitor go unhealthy?  Following Kevin Holman’s lead to ‘Monitor Processes‘, the idea landed to build out the ‘Top Process PowerShell script’.  This morphed into a management pack with Knowledge entries to better explain what is being done.  Integrating Top Process into Health Explorer output as a recovery task helped provide another step before alerting.    The idea started from the need to prove which Security tool(s) were causing the over-utilized compute spikes, causing non-responsive server(s).  Thinking back to my UNIX days, we simply used top, vmstat, iostat, and other commands to identify problematic processes.  Integrating PowerShell scripts into SCOM is part of the fun, then linking the obfuscated Security processes for the final output.  From there, extrapolate into Azure Functions or Azure Logic apps, for additional functionality for cloud native monitoring.

 

Quick Download: https://github.com/theKevinJustin/TopProcess

Tier1 separated monitoring (no AD) https://github.com/theKevinJustin/TopProcessTier1

Building out the ‘Top Process PowerShell script’

Kevin Holman built a ‘ Monitor.Performance.ConsecSamples.ThenScript.TwoState.mpx fragment, beginning the logical journey.   His fragment helped me start with a working model, taking processes and cores into consideration for true CPU usage on multi-core servers.

Kevin Holman Monitor performance then script fragment for PowerShell get-counter syntax
Kevin Holman Monitor performance then script fragment for PowerShell get-counter syntax

 

We need to see the processes, and their corresponding value, then build an output table (custom object).  After gathering the processes, feed the TopProcesses array, lastly sorting the array for CPUValue

Top Process memory usage snippet
Top Process memory usage snippet

Next, we’ll want to see what applications/tools might be involved, including Active Client, IIS, monitoring, and EndPoint Management tools (keep things honest!).

Added the Security Processes into the mix
Added the Security Processes into the mix

Then we build an output of the data so we can take the datasource (DS) or WriteAction (WA) into a scripted monitor/rule, or recovery tasks linked to various monitors.  Even built a forked version in case of SAW/Red Forest, separating Tier0 monitoring from Tier1 (snippet below is NOT that pack)

snippet of manual tasks and recoveries that link to multiple monitors
snippet of manual tasks and recoveries that link to multiple monitors

 

Useful links

Kevin Holman MP fragments blog and GitHub Fragment library/repository

Need to find the command UNIX pack runs for perf counter

Magnifying Glass

 

 

Have you ever needed to find the command UNIX pack runs for perf counter?   Say the processor time value doesn’t match what the Unix admin may be saying SCOM is showing.

 

Many times you can look at the SCOM management pack, and those commands trace back to the UNIX library.

 

Background:  The SCOM management server runs many of the cross-plat/xplat workflows to the UNIX agent through WinRM.

 

Agenda
  1. Unseal SCOM UNIX management pack to obtain URI
  2. Understand command line options from UNIX/Linux side, and how to view the output
  3. Enumerate command line
  4. Test Command line from SCOM MS

 

 

 

Unseal SCOM UNIX management pack

The screenshot below is unsealing the Solaris10 pack to XML, and then viewing/searching to show the processor reference.

Solaris 10 processor rules

NOTE that’s a URI, not a script

 

 

How UNIX admin may supply processor output

Example – Unix admin typically uses vmstat or iostat.

 

The screenshot uses ‘vmstat 2 10‘ – a snapshot every 2 second intervals, 10 times

vmstat output

 

We can discuss the vmstat output, but it shows way more than just processor (ready queue, swap, user, system, and cpu %) to help figure out which operating system component is the problem.

 

 

Enumerate command line test

How do we test the command line syntax, to verify what SCOM pulls when running the rule?

For example, we need to make the URI actionable from the management pack.  What is needed to make a usable command?

 

Grab the URI from the pack

http://schemas.microsoft.com/wbem/wscim/1/cim-schema/2/SCX_ProcessorStatisticalInformation?__cimnamespace=root/scx

 

Because we know the URI, we now build out the syntax with WinRM

winrm enumerate http://schemas.microsoft.com/wbem/wscim/1/cim-schema/2/SCX_ProcessorStatisticalInformation?__cimnamespace=root/scx -auth:basic -remote:https://<servername>:1270 -username:<scomID, not necessarily root> -skipCACheck -skipCNCheck -skiprevocationcheck –encoding:utf-8

 

 

Test WinRM command from SCOM MS

For instance, we want to test the WinRM command from the MS to the UNIX server

winrm enumerate http://schemas.microsoft.com/wbem/wscim/1/cim-schema/2/SCX_ProcessorStatisticalInformation?__cimnamespace=root/scx -auth:basic -remote:https://ubuntu:1270 -username:scom -skipCACheck -skipCNCheck -skiprevocationcheck –encoding:utf-8

 

Example output

SCX_ProcessorStatisticalInformation
InstanceID = null
Caption = Processor information
Description = CPU usage statistics
ElementName = null
Name = 0
IsAggregate = FALSE
PercentIdleTime = 99
PercentUserTime = 0
PercentNiceTime = 0
PercentPrivilegedTime = 0
PercentInterruptTime = 0
PercentDPCTime = 0
PercentProcessorTime = 1
PercentIOWaitTime = 0

SCX_ProcessorStatisticalInformation
InstanceID = null
Caption = Processor information
Description = CPU usage statistics
ElementName = null
Name = _Total
IsAggregate = TRUE
PercentIdleTime = 99
PercentUserTime = 0
PercentNiceTime = 0
PercentPrivilegedTime = 0
PercentInterruptTime = 0
PercentDPCTime = 0
PercentProcessorTime = 1
PercentIOWaitTime = 0

 

Additional references for WinRM syntax and troubleshooting

Warren’s blog

Docs site

Use Unix MP’s for shell commands