Find the command the UNIX pack runs for a perf counter


 

 

Have you ever needed to find the command the UNIX pack runs for a performance counter?   Say the processor time value in SCOM doesn’t match what the UNIX admin sees on the server.

 

Many times you can look at the SCOM management pack, and those commands trace back to the UNIX library.

 

Background:  The SCOM management server runs many of the cross-platform (xplat) workflows against the UNIX agent over WinRM.

 

Agenda
  1. Unseal SCOM UNIX management pack to obtain URI
  2. Understand command line options from UNIX/Linux side, and how to view the output
  3. Enumerate command line
  4. Test Command line from SCOM MS

 

 

 

Unseal SCOM UNIX management pack

The screenshot below shows the Solaris 10 pack unsealed to XML, then searched to show the processor reference.

Solaris 10 processor rules

NOTE: the reference is a URI, not a script

 

 

How UNIX admin may supply processor output

Example – Unix admin typically uses vmstat or iostat.

 

The screenshot uses ‘vmstat 2 10’: a snapshot every 2 seconds, 10 times

vmstat output

 

We can discuss the vmstat output, but note that it shows far more than just processor (ready queue, swap, user, system, and CPU %), which helps figure out which operating system component is the problem.

 

 

Enumerate command line test

How do we test the command line syntax, to verify what SCOM pulls when running the rule?

The URI in the management pack is not runnable by itself.  What is needed to turn it into a usable command?

 

Grab the URI from the pack

http://schemas.microsoft.com/wbem/wscim/1/cim-schema/2/SCX_ProcessorStatisticalInformation?__cimnamespace=root/scx

 

Because we know the URI, we now build out the syntax with WinRM

winrm enumerate http://schemas.microsoft.com/wbem/wscim/1/cim-schema/2/SCX_ProcessorStatisticalInformation?__cimnamespace=root/scx -auth:basic -remote:https://<servername>:1270 -username:<scomID, not necessarily root> -skipCACheck -skipCNCheck -skiprevocationcheck -encoding:utf-8
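
As a rough sketch of how the pieces fit together, the Python helper below assembles the same command line from the class name, server, and account. The function name build_winrm_enumerate and its parameters are hypothetical (they are not part of SCOM or WinRM). It uses only the switches shown above, and it builds the command without running it.

```python
SCX_BASE = "http://schemas.microsoft.com/wbem/wscim/1/cim-schema/2/"


def build_winrm_enumerate(cim_class, host, username, port=1270):
    """Return the winrm enumerate command line as a list of arguments.

    Hypothetical helper: it only assembles the switches shown in this post.
    """
    uri = f"{SCX_BASE}{cim_class}?__cimnamespace=root/scx"
    return [
        "winrm", "enumerate", uri,
        "-auth:basic",
        f"-remote:https://{host}:{port}",
        f"-username:{username}",
        "-skipCACheck", "-skipCNCheck", "-skiprevocationcheck",
        "-encoding:utf-8",
    ]


if __name__ == "__main__":
    # 'ubuntu' and 'scom' are lab values; substitute your own server and ID.
    print(" ".join(build_winrm_enumerate(
        "SCX_ProcessorStatisticalInformation", "ubuntu", "scom")))
```

Swapping the class name is all it takes to point the same command at a different URI pulled from the pack.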

 

 

Test WinRM command from SCOM MS

For instance, we want to test the WinRM command from the MS to the UNIX server

winrm enumerate http://schemas.microsoft.com/wbem/wscim/1/cim-schema/2/SCX_ProcessorStatisticalInformation?__cimnamespace=root/scx -auth:basic -remote:https://ubuntu:1270 -username:scom -skipCACheck -skipCNCheck -skiprevocationcheck -encoding:utf-8

 

Example output

SCX_ProcessorStatisticalInformation
InstanceID = null
Caption = Processor information
Description = CPU usage statistics
ElementName = null
Name = 0
IsAggregate = FALSE
PercentIdleTime = 99
PercentUserTime = 0
PercentNiceTime = 0
PercentPrivilegedTime = 0
PercentInterruptTime = 0
PercentDPCTime = 0
PercentProcessorTime = 1
PercentIOWaitTime = 0

SCX_ProcessorStatisticalInformation
InstanceID = null
Caption = Processor information
Description = CPU usage statistics
ElementName = null
Name = _Total
IsAggregate = TRUE
PercentIdleTime = 99
PercentUserTime = 0
PercentNiceTime = 0
PercentPrivilegedTime = 0
PercentInterruptTime = 0
PercentDPCTime = 0
PercentProcessorTime = 1
PercentIOWaitTime = 0
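
To compare this against what the UNIX admin sees, it can help to pull out just the aggregate value. The sketch below is one way to do that in Python, assuming the output was saved as plain text in the "Name = value" layout shown above. The function names are hypothetical, and SAMPLE is a trimmed copy of the output above.

```python
def parse_winrm_instances(text):
    """Split 'winrm enumerate' text output into one dict per instance."""
    instances = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if "=" in line:
            if current is not None:
                key, _, value = line.partition("=")
                current[key.strip()] = value.strip()
        else:
            # A line without '=' is the class name that starts an instance.
            current = {"Class": line}
            instances.append(current)
    return instances


def total_processor_time(text):
    """Return PercentProcessorTime of the aggregate (_Total) instance."""
    for inst in parse_winrm_instances(text):
        if inst.get("Name") == "_Total":
            return int(inst["PercentProcessorTime"])
    return None


SAMPLE = """\
SCX_ProcessorStatisticalInformation
    Name = 0
    IsAggregate = FALSE
    PercentIdleTime = 99
    PercentProcessorTime = 1

SCX_ProcessorStatisticalInformation
    Name = _Total
    IsAggregate = TRUE
    PercentIdleTime = 99
    PercentProcessorTime = 1
"""

if __name__ == "__main__":
    print(total_processor_time(SAMPLE))
```

In this sample, PercentIdleTime (99) plus PercentProcessorTime (1) adds up to 100, so the idle column from vmstat is the natural number to compare against.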

 

Additional references for WinRM syntax and troubleshooting

Warren’s blog

Docs site

Use Unix MP’s for shell commands

 

Build FluentD conf file

Build trust one block at a time

Ready to build out a FluentD conf file?

 

Let’s build a simple FluentD configuration file.  The docs site has another example.  Paste the configuration below, and save it as <yourlogfile>.conf

Create custom log file to test

cd /etc/opt/microsoft/omsagent/scom/conf/omsagent.d/
# vi <yourlogfile>.conf

vi mylog.conf

# Example conf file

<source>
# Specifies input plugin. Tail is a fluentd input plugin – http://docs.fluentd.org/v0.12/articles/in_tail
type tail
# Specify the log file path. Supports wild cards.
path /var/log/mylog
# Recommended so that Fluentd will record the position it last read into this file.
pos_file /home/omsagent/fluent-logging/mylog.pos

# Used to correlate the directives.
tag scom.log.mylog

format /(?<message>.*)/
</source>

<filter scom.log.mylog>
type filter_scom_simple_match
regexp1 message 911
event_id1 911
</filter>

<match scom.log.mylog>
#Disable mutual Auth
enable_server_auth false

# Output plugin to use
type out_scom
log_level trace
num_threads 5

# Size of the buffer chunk. If the top chunk exceeds this limit or the time limit flush_interval,
# a new empty chunk is pushed to the top of the queue and the bottom chunk is written out.
buffer_chunk_limit 5m
flush_interval 15s
# Specifies the buffer plugin to use.
buffer_type file
# Specifies the file path for buffer. Fluentd must have write access to this directory.
buffer_path /var/opt/microsoft/omsagent/scom/state/out_scom_common*.buffer
# If queue length exceeds the specified limit, events are rejected.
buffer_queue_limit 10
# Control the buffer behavior when the queue becomes full: exception, block, drop_oldest_chunk
buffer_queue_full_action drop_oldest_chunk
# Number of times Fluentd will attempt to write the chunk if it fails.
retry_limit 10
# If the bottom chunk fails to be written out, it will remain in the queue and Fluentd will retry after waiting retry_wait seconds
retry_wait 30s
# The retry wait time doubles each time until max_retry_wait.
max_retry_wait 9m
</match>

Save (:wq!)
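
To get a feel for what the retry settings add up to, here is a small Python sketch of the arithmetic, taking the comments in the conf file at face value: the wait doubles on each retry until it reaches max_retry_wait. The function name is hypothetical, and the sketch ignores any randomization FluentD may apply to the wait time.

```python
def retry_waits(retry_wait=30, max_retry_wait=540, retry_limit=10):
    """Return the wait in seconds before each retry, doubling up to the cap.

    Simplified: ignores any randomization Fluentd may apply to the wait.
    """
    waits = []
    wait = retry_wait
    for _ in range(retry_limit):
        waits.append(min(wait, max_retry_wait))
        wait *= 2
    return waits


if __name__ == "__main__":
    waits = retry_waits()
    print(waits)
    print(sum(waits), "seconds in total")
```

With retry_wait 30s, max_retry_wait 9m (540 seconds), and retry_limit 10, the waits hit the cap on the sixth retry.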

 

# Restart Agent

/opt/microsoft/omsagent/bin/service_control restart

# Check for errors – see blog

grep -i error /var/opt/microsoft/omsagent/scom/log/omsagent.log

# Test strings into your logfile

# Options

echo test >> /var/log/mylog

echo 911 error >> /var/log/mylog

# mimic syslog or messages syntax

echo $(date +"%b %e %H:%M:%S") MYLOG 911 test string. Call 911 >> /var/log/mylog
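
Before waiting on SCOM, you can sanity check which test strings should trip the filter. The Python sketch below assumes the filter treats regexp1 as a regular expression searched anywhere in the message field. That is my reading of the conf file, not something confirmed against the plugin source. The function name is hypothetical, and the timestamp prefix is left off the third test line.

```python
import re

# Pattern from 'regexp1 message 911' in the filter above.
PATTERN = re.compile(r"911")


def would_match(line):
    """True if the pattern is found anywhere in the log line (assumed semantics)."""
    return PATTERN.search(line) is not None


TEST_LINES = [
    "test",
    "911 error",
    "MYLOG 911 test string. Call 911",
]

if __name__ == "__main__":
    for line in TEST_LINES:
        print(f"{would_match(line)!s:5}  {line}")
```

Under that assumption, only the plain ‘test’ line should stay quiet.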

 

Please stay tuned for more management pack options to alert on the strings.  Refer to the part1/2 blogs for more details on unit testing for alerts.

Configure Linux FluentD


What are you Fluent in?

 

Join me as we configure FluentD on Linux, and continue to improve and document monitoring cross-platform (UNIX/Linux) servers.
Background:
Some of our previous topics included how the UNIX Logical Disk classes differ from Windows (here), and cross-platform agent setup.   Because we always ‘need more power!’, it’s time to configure Linux FluentD.  All this because the ‘Linux FluentD’ article was updated on docs.microsoft.com!

Don’t worry, if you’re on the edge, this may be a Scotty moment of “I’m giving her all she’s got” with your current monitoring environment.

 

 

Maybe you’re thinking ‘convince me’, so what does FluentD provide monitoring wise?

    • System Center Operations Manager (SCOM) 2016+ has enhanced log file monitoring capabilities for Linux servers.
    • Wild card characters in log file name and path.
    • New match patterns for customizable log search
    • Use community published plugins versus having to build from scratch

 

I wanted to take a moment to validate the steps provided, and not only because a large part of my career has been spent supporting cross-platform environments for large enterprise companies.  So, let’s get started!

 

 

Review the FluentD setup procedure 

Let’s speed up the ‘do’ part.  Review the procedure at docs.microsoft.com

FluentD basic operation here

Configuration Overview here

FluentD pre-reqs

Server types covered: Linux (RedHat/Ubuntu)

Management Packs required for SCOM
        • Load latest Linux Operating system management packs (2019).
          1. Find the pack download here
          2. Load the relevant Linux or Universal Linux packs
          3. Verify Microsoft.Linux.Log.Monitoring pack is loaded
Verify ID/password and sudoers capability for root

Use docs article if you need additional assistance to install agent on the Linux server.  Alternatively, use my blog posts for PowerShell or GUI install

 

Update agent to latest release

Grab the latest OMS Agent release from github

      • From server command line:

wget https://raw.githubusercontent.com/Microsoft/OMS-Agent-for-Linux/master/installer/scripts/onboard_agent.sh

sh onboard_agent.sh

            1. The onboard_agent.sh script downloaded above will install the OMSAgent and its pre-req packages
            2. This includes OMI, scx, OMSAgent, OMSConfig, auoms,  Apache, Docker, MySQL (if the last three apply to your Linux server)

Load Management packs from latest UNIX release.

Verify packs are version 10.19.1082.0

Navigation steps:

From SCOM Console > Administration Tab > Installed Management packs

List of Linux based SCOM management packs installed needed to configure Linux FluentD

 

After adding the updated Linux Management packs

Screenshot list of 10.19.1082.0 versioned Linux management packs required to 'configure Linux FluentD'

 

 

Linux server screenshots installing OMSAgent via wget command

Output from the wget command to install omsagent on Linux server

screenshot saving file from GitHub to Linux server

Install of oms agent components on Linux server

 

 

Stay tuned, I’m currently testing FluentD configuration on my 2019 UR1 lab environment on Ubuntu16.



SQL on Windows Addendum pack

It’s spring time; time to tune the SQL carb!

 

Carbs are way less easy to find these days, but I’ve been busy tuning the SQL agnostic pack (MSSQL on Windows).

 

Tuning the SQL Agnostic pack would be far less successful without expert help.  My thanks to Brandon Pires – MCS SQL Consultant who helped provide a SQL DBA perspective.   Brandon’s LinkedIn profile

 

Always grab an expert, and for SQL, it’s a DBA.  If you’re new to SCOM, most product teams provide their management packs.  SCOM PFEs build addendum packs to improve a pack (from our perspective).  Addendum packs make a pack stronger, for an improved customer experience.  I’m not complaining about what the pack delivers.  The SQL Team is awesome for taking user feedback and making improvements quarterly!

 

Background:

Initially this journey started out with Tim McFadden disabling the duplicate rules/monitors in the SQL MP’s (here).

After talking with Tim and Kevin H, I set out to clean up the SQL version specific packs to remove bloat by creating the version specific OFF packs.  The OFF packs disabled the plethora of SQL performance counters (see MP bloat blog here).

With the SQL Agnostic packs (thank God!), I wanted to deliver an addendum pack to tune the SQL alerts/health for what SQL PFE/Consultants recommended for an improved out of the box experience (OoBE).

 

 

MP Version history
v1.0.0.0 24 Feb 2020 Override to enable SQL Monitoring
v1.0.0.1 24 Feb 2020 Override pack cleanup to human readable format
v1.0.0.2  2 Mar 2020 Overrides for severities and SQL CPU samples
v1.0.0.3  2 Mar 2020 Overrides for SQL rules for warning
v1.0.0.4  4 Mar 2020 Completed overrides for SQL warning rules
v1.0.0.5  1 Apr 2020 Updated rules for backup failures when customer uses Netbackup vs. SQL agent/scheduled tasks
v1.0.0.6  9 Apr 2020 Created groups for seed discovery Test/Dev and Prod; excluded EXPRESS, disabled Securables monitor
v1.0.0.7 15 Apr 2020 Updated pack name to include ‘SQL Server’.
                     Updated AddendumGroupGUIDUpdate to include RegEx pattern replace.
                     AddendumGroupGUIDUpdate will version pack to v1.0.0.7 for group GUID and regex changes.

 

 

Please feel free to download the zip file, which includes the XLS for review of what was updated.

My website download

 

 

Additional References

The Agnostic OFF Pack to turn off the performance rules (found here)

The old SQL version specific OFF packs for the performance counters can be found here.

TechNet Gallery download here

 

UNIX Logical Disk classes

Time to talk about SCOM2019 UNIX classes!

 

 

Just came across an example where the UNIX Logical disk class was targeted.

 

Did you know: This class in the UNIX library is not like the Windows library, where Logical Disk has a matched discovery.

Logical Disk is broken out by UNIX flavor: each version of UNIX has its own class and discovery, and that class derives from the base class in the UNIX Library.

 

Let’s go through an example from the SCOM Console

Monitoring Tab > Discovered Inventory > Change Target Type

 

This lab example is an Ubuntu server (Universal Linux Library)

Targeting Logical Disk from the UNIX/Linux Core Library shows the same objects in SCOM as targeting the flavor-specific class (i.e. Logical Disk for the Universal Linux Operating System)

 

 

How’s that possible… ?

Let’s look at the examples for the various Logical Disk Classes.

Example

AIX 7 pack – AIX Logical disk discovery/class

<ClassType ID="Microsoft.AIX.LogicalDisk" Abstract="true" Accessibility="Public" Hosted="true" Singleton="false" Base="Unix!Microsoft.Unix.LogicalDisk" />

Universal Linux Monitoring Library

<ClassType ID="Microsoft.Linux.Universal.LogicalDisk" Accessibility="Public" Abstract="false" Base="Linux!Microsoft.Linux.LogicalDisk" Hosted="true" Singleton="false" Extension="false" />

Linux Operating System Library

<ClassType ID="Microsoft.Linux.LogicalDisk" Accessibility="Public" Abstract="true" Base="Unix!Microsoft.Unix.LogicalDisk" Hosted="true" Singleton="false" Extension="false" />

 

This makes sense, as Linux operating systems are SUSE, RHEL, Universal Debian and RPM.  Solaris and AIX are their own operating systems.  This helps describe the class hierarchy.

UNIX

Flavor of Unix (Linux, Solaris, or AIX)

Version or flavor of Linux, Solaris, or AIX

 

 

How did I get to this conclusion?

MPViewer will help view the classes and discoveries.

What does this mean to me: you can create a single view that shows ALL discovered UNIX ‘Logical Disk’ entries.  Because the UNIX flavors all use the UNIX Logical Disk class as their base class, targeting it displays instances of ALL the inherited classes.
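
If it helps to see the inheritance spelled out, the Python sketch below walks the Base attribute of the three ClassType definitions quoted earlier. It is only an illustration of the hierarchy, not how SCOM resolves classes internally. The function names are hypothetical, and the management pack aliases (Unix!, Linux!) are dropped for readability.

```python
# Class -> base class, taken from the ClassType definitions quoted above
# (management pack aliases such as 'Unix!' and 'Linux!' are dropped).
BASE_CLASS = {
    "Microsoft.AIX.LogicalDisk": "Microsoft.Unix.LogicalDisk",
    "Microsoft.Linux.LogicalDisk": "Microsoft.Unix.LogicalDisk",
    "Microsoft.Linux.Universal.LogicalDisk": "Microsoft.Linux.LogicalDisk",
}


def ancestry(class_id):
    """Return the class followed by each of its base classes, nearest first."""
    chain = [class_id]
    while chain[-1] in BASE_CLASS:
        chain.append(BASE_CLASS[chain[-1]])
    return chain


def derived_from(target):
    """Return every known class that inherits from the target class."""
    return sorted(c for c in BASE_CLASS if target in ancestry(c)[1:])


if __name__ == "__main__":
    print(" -> ".join(ancestry("Microsoft.Linux.Universal.LogicalDisk")))
    print(derived_from("Microsoft.Unix.LogicalDisk"))
```

Every class in the table leads back to Microsoft.Unix.LogicalDisk, which is why a view targeted at the base class picks up disks from every flavor.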

 

 

AIX Logical Disk Discovery

 

Universal Linux Discovery

Universal Linux Classes

 

Windows Server packs are very similar

Windows Logical Disk class

 

 

Using Unix MP’s for Shell commands and scripts

Ready to move out of the UI ?

Thanks to help from Saurav Babu and Tim Helton, I was able to push my MP authoring limits further.

The good thing with the Shell command template in SCOM is that your script is encoded.

Bad news

  1. If functionality doesn’t exist in the UI, you can’t easily pull the monitor and just add variables to get that functionality.
  2. Scripts and Shell commands are encoded (great news for security!)

Now to the use case – need Sample Count and Match Count to prevent false positive alerts

The UNIX Shell Command library allows us to use the following variables out of the box:

Interval, SyncTime, TargetSystem, UserName, Password, Script, ScriptArgs, TimeOut, TimeOutInMS, HealthyExpression, ErrorExpression

AND we can override Interval, Script, TimeOut, TimeOutInMS

If that’s not enough options, then read on!

When the built-in functionality doesn’t exist

For this UNIX shell command/script monitor, we required SampleCount and MatchCount

Variables explained

SampleCount is the number of failing samples needed for an alert.

If SampleCount = 4, this means 4 failing samples will generate an alert

MatchCount is the number of intervals evaluated before the monitor changes state

If Interval = 60 (s) and MatchCount = 10, then it will take 10 minutes (600 s) before we alert

Combining the two means 4 failing samples over 10 minutes will generate an alert.

Sometimes this is called alert suppression or counting failures before alerting
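
Here is a small Python sketch of that counting logic, as one reading of the description above: alert once SampleCount failing samples are seen within the last MatchCount intervals. The class name SampleWindow is hypothetical, and this is a model of the idea only, not the module code in the custom MP.

```python
from collections import deque


class SampleWindow:
    """Alert when sample_count failures occur within the last match_count samples.

    Hypothetical model of the description above, not the MP's module code.
    """

    def __init__(self, sample_count=4, match_count=10):
        self.sample_count = sample_count
        self.recent = deque(maxlen=match_count)  # oldest samples fall off

    def add(self, failed):
        """Record one interval's result; return True if the alert should fire."""
        self.recent.append(bool(failed))
        return sum(self.recent) >= self.sample_count


if __name__ == "__main__":
    window = SampleWindow(sample_count=4, match_count=10)
    for failed in [True, False, True, True, False, True]:
        print(failed, window.add(failed))
```

With Interval = 60, each call to add() stands for one minute, so a single failed sample never alerts on its own and old failures age out after 10 minutes.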

I built a custom DataSource, ProbeAction, and WriteAction, as the UNIX Shell Library MP did not include these additional variables.

Please review my updated MP Fragments TechNet Gallery for the custom MP and fragments!

https://gallery.technet.microsoft.com/Uncommon-Custom-MP-c5a12a86

Encoding the script or command to run

The other issue with UNIX scripts and commands, is the UI encodes the scripts.

How do we get around it you ask?

Since we are building an MP Fragment and MP outside the UI, we must encode the script ourselves.

The PowerShell below encodes the script to put into your SCOM monitor (and MP Fragment)

Example

$script = 'if [ `ps -ef | grep sleep | grep -v grep | wc -l` -eq "1" ]; then echo false; else echo true; fi'

# Verify script variable
$script

# Get $script bytes
$s = [System.Text.Encoding]::UTF8.GetBytes($script)

# Verify script bytes output (optional as bytes broken out by line)
$s

# Encode script to Base64
$encoded = [System.Convert]::ToBase64String($s)

# Verify $encoded
$encoded

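# Optional
# Verify the bytes convert back to the original string
[System.Text.Encoding]::UTF8.GetString($s)
# Verify the Base64 string decodes back to the original script
[System.Text.Encoding]::UTF8.GetString([System.Convert]::FromBase64String($encoded))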

$encoded output is what needs to be entered into the <script></script> variable in your monitor
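
If you author from a machine without PowerShell, the same two steps (UTF-8 bytes, then Base64) can be sketched in Python with the standard library. The function names are hypothetical. The script text is the same example as above.

```python
import base64

SCRIPT = ('if [ `ps -ef | grep sleep | grep -v grep | wc -l` -eq "1" ]; '
          'then echo false; else echo true; fi')


def encode_script(script):
    """UTF-8 encode the script text, then Base64 it (same steps as above)."""
    return base64.b64encode(script.encode("utf-8")).decode("ascii")


def decode_script(encoded):
    """Reverse of encode_script, to confirm the round trip."""
    return base64.b64decode(encoded).decode("utf-8")


if __name__ == "__main__":
    encoded = encode_script(SCRIPT)
    print(encoded)
    assert decode_script(encoded) == SCRIPT
```

Decoding the value before pasting it into the monitor is a cheap way to catch curly quotes or stray line breaks picked up from a copy and paste.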

Example Output

PS C:\Users\scomadmin\desktop> $script = 'if [ `ps -ef | grep sleep | grep -v grep | wc -l` -eq "1" ]; then echo false;
else echo true; fi'
PS C:\Users\scomadmin\desktop> $script
if [ `ps -ef | grep sleep | grep -v grep | wc -l` -eq "1" ]; then echo false; else echo true; fi
PS C:\Users\scomadmin\desktop> $s = [System.Text.Encoding]::UTF8.GetBytes($script)
PS C:\Users\scomadmin\desktop> $s
PS C:\Users\scomadmin\desktop> $s = [System.Text.Encoding]::UTF8.GetBytes($script)

PS C:\Users\scomadmin\desktop> $encoded = [System.Convert]::ToBase64String($s)
PS C:\Users\scomadmin\desktop> $encoded
aWYgWyBgcHMgLWVmIHwgZ3JlcCBzbGVlcCB8IGdyZXAgLXYgZ3JlcCB8IHdjIC1sYCAtZXEgIjEiIF07IHRoZW4gZWNobyBmYWxzZTsgZWxzZSBlY2hvIHRydWU7IGZp
PS C:\Users\scomadmin\desktop> [System.Text.Encoding]::UTF8.GetString($s)
if [ `ps -ef | grep sleep | grep -v grep | wc -l` -eq "1" ]; then echo false; else echo true; fi
PS C:\Users\scomadmin\desktop>

References

Jonathan Almquist’s blog post

Kevin Holman’s blog on service with Samples

SharePoint Management framework Private Preview

 

Do you have Enterprise SharePoint farms whose health and performance you manage via custom scripts?

Have you used SETH to manage SharePoint 2010 problems with the farm(s)?

 

Would you want a scalable tool where you can add your own scripts, enable the checks, and then alert on what you want?

 

 

Background

SharePoint Engineer Troubleshooting Helper (SETH) was a Microsoft tool for SharePoint 2010

Using SETH

Troubleshooting SETH

 

 

For SharePoint 2016 and 2019, the Customer Support team brought up the need to bring back a utility to help with common SharePoint scenarios.

On Premise Diagnostic (OPD) is the second generation of the project (for SharePoint 2016 and 2019).

 

My goal was to help the Escalation Engineers have a full platform that can be implemented and is scalable for the technical community to maintain and use.

 

BTW, the only thing preventing SharePoint 2013 support is the dependency on WMF v5.0 or later on the SharePoint servers.

 

 

SCOM management pack can be found here

 

Updated Skype for Business 2015 Addendum pack

Continuing work with Nick Wood on the Skype pack for additional operational features.

I previously blogged about this in July 2018, and continue to make improvements

The TechNet gallery bundle is updated with new functionality.

Skype KHI addendum

Pack gathers the Skype KHI performance counters

Packets * Discards performance rules for NICs where more than 100 discards are seen

Monitoring Tab folder/performance view

Skype Custom Overrides

Includes common overrides for noisy monitors/rules.

Install SCVMM management packs from VMM Server

Time for some automation

Ever have to upgrade SCVMM packs every time a new Update Rollup (UR) comes out?

Copy the files from the VMM server to your SCOM MS, then install.

How long does that take?

Try this script out – assuming you have a login on the VMM Server

TechNet Gallery post here

# Set up some variables

$UR = "UR5"

$VMMServer = "16VMM01"

# Date stamp (yyyy-MM-dd)

$date = Get-Date -UFormat "%Y-%m-%d"

# Set up backup path, this example is C:\monadmin\backup

$backupPath = "C:\monadmin\backup"

$backupDrive = "C:"

# Create some functions

Watch them roll, let PowerShell do your work!

UR6 packs

SCOM management packs backed up

Check out the SCOM Console Admin tab for updates!

Troubleshooting Service Map pack

 

 

 

Updated 14 Mar 2019

 

If you get these exceptions like I did, the issue has been raised, and a fix is targeted for SCOM 2019 UR1.

Disable the rule to reduce noise.

 

 

Are you using Service Map Management pack, and getting errors?

 

This alert is based on event IDs 46651/46652 in the Operations Manager event log

From SCOM Console > Authoring Tab > Management Pack Objects > Rules

In the ‘Look for:’ bar, search for GenericException (yes, no space in between)

 

Rule

 

 

Rule Details

 

To enable debug on the MS

 

For collecting logs, please do the following:

  • Create the folder “c:\Debug\ext\”
  • Wait an hour (the default interval set in the rule for running the Service Map API).
  • Log files will be created in the “ext” folder. Please share them by email.

 

The file showed up after the alerts, and listed debug INFO and WARN lines, and the time stamps match up to the generic exception rules.

 

Stay tuned for more information.  I have been trying to get more answers on the exception

{WARN} [12:35:20.966] [ScomUtils] failed to export XML for Management Pack: System.NullReferenceException: Object reference not set to an instance of an object.

   at ScomBridge.ScomUtils.WritePackXmlToFile(ManagementPack pack, String filename)