Need to find the command the UNIX pack runs for a perf counter

Have you ever needed to find the command the UNIX management pack runs for a performance counter? Say the processor time value that SCOM shows doesn’t match what the UNIX admin sees on the server.

 

Many times you can look at the SCOM management pack, and those commands trace back to the UNIX library.

 

Background: The SCOM management server runs many of the cross-platform (xplat) workflows against the UNIX agent through WinRM.

 

Agenda
  1. Unseal SCOM UNIX management pack to obtain URI
  2. Understand command line options from UNIX/Linux side, and how to view the output
  3. Enumerate command line
  4. Test Command line from SCOM MS

 

 

 

Unseal SCOM UNIX management pack

The screenshot below shows the Solaris 10 pack unsealed to XML, then searched to show the processor reference.

Solaris 10 processor rules

NOTE: the rule references a URI, not a script.

 

 

How UNIX admin may supply processor output

Example – Unix admin typically uses vmstat or iostat.

 

The screenshot uses 'vmstat 2 10': one sample every 2 seconds, 10 times

vmstat output

 

We can discuss the vmstat output in detail, but note that it shows far more than just the processor (ready queue, swap, user, system, and CPU %), which helps figure out which operating system component is the problem.
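To compare vmstat with the SCOM counter, the idle column can be turned into a busy percentage. The following is a minimal Python sketch, not something the management pack runs: the function name and the sample text are illustrative assumptions (the sample mimics a Linux vmstat layout; Solaris columns differ, but the sketch only relies on the "id" column header). Keep in mind that the first vmstat sample usually reports averages since boot.

```python
def cpu_busy_from_vmstat(text):
    """Return 100 - idle for every data row of vmstat output."""
    idle_col = None
    busy = []
    for line in text.splitlines():
        tokens = line.split()
        if "id" in tokens:
            # Column header row: remember where the idle column sits.
            idle_col = tokens.index("id")
            continue
        if idle_col is None or len(tokens) <= idle_col:
            continue
        if not all(t.isdigit() for t in tokens):
            continue  # skip banner rows and anything non-numeric
        busy.append(100 - int(tokens[idle_col]))
    return busy


# Illustrative sample only, not captured from a real server.
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0      0 811204  54120 623880    0    0    12     5   60  110  2  1 97  0  0
 0  0      0 811080  54120 623884    0    0     0     0   45   90  0  1 99  0  0
"""

if __name__ == "__main__":
    print(cpu_busy_from_vmstat(SAMPLE))
```

The busy values can then be lined up against PercentProcessorTime from the SCOM side for the same time window.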

 

 

Enumerate command line test

How do we test the command line syntax, to verify what SCOM pulls when running the rule?

For example, we need to make the URI actionable from the management pack.  What is needed to make a usable command?

 

Grab the URI from the pack

http://schemas.microsoft.com/wbem/wscim/1/cim-schema/2/SCX_ProcessorStatisticalInformation?__cimnamespace=root/scx

 

Because we know the URI, we now build out the syntax with WinRM

winrm enumerate http://schemas.microsoft.com/wbem/wscim/1/cim-schema/2/SCX_ProcessorStatisticalInformation?__cimnamespace=root/scx -auth:basic -remote:https://<servername>:1270 -username:<scomID, not necessarily root> -skipCACheck -skipCNCheck -skiprevocationcheck -encoding:utf-8
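If you test several URIs or servers, assembling the command by hand gets error-prone. Here is a small Python sketch that builds the same command line from its parts. The helper name is hypothetical, the switches are the ones shown above, and the host "ubuntu" and account "scom" are only example values. It emits plain ASCII hyphens, which matters because a pasted en dash in front of "encoding" is not a valid switch prefix.

```python
SCX_PROCESSOR_URI = (
    "http://schemas.microsoft.com/wbem/wscim/1/cim-schema/2/"
    "SCX_ProcessorStatisticalInformation?__cimnamespace=root/scx"
)


def build_winrm_enumerate(uri, host, username, port=1270):
    """Assemble a 'winrm enumerate' command line for a UNIX/Linux agent."""
    parts = [
        "winrm", "enumerate", uri,
        "-auth:basic",
        f"-remote:https://{host}:{port}",
        f"-username:{username}",
        "-skipCACheck", "-skipCNCheck", "-skiprevocationcheck",
        "-encoding:utf-8",
    ]
    return " ".join(parts)


if __name__ == "__main__":
    # Example values only; substitute your own server and SCOM account.
    print(build_winrm_enumerate(SCX_PROCESSOR_URI, "ubuntu", "scom"))
```

Paste the printed line into a command prompt on the management server; winrm then prompts for the account password.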

 

 

Test WinRM command from SCOM MS

For instance, we want to test the WinRM command from the MS to the UNIX server

winrm enumerate http://schemas.microsoft.com/wbem/wscim/1/cim-schema/2/SCX_ProcessorStatisticalInformation?__cimnamespace=root/scx -auth:basic -remote:https://ubuntu:1270 -username:scom -skipCACheck -skipCNCheck -skiprevocationcheck -encoding:utf-8

 

Example output

SCX_ProcessorStatisticalInformation
InstanceID = null
Caption = Processor information
Description = CPU usage statistics
ElementName = null
Name = 0
IsAggregate = FALSE
PercentIdleTime = 99
PercentUserTime = 0
PercentNiceTime = 0
PercentPrivilegedTime = 0
PercentInterruptTime = 0
PercentDPCTime = 0
PercentProcessorTime = 1
PercentIOWaitTime = 0

SCX_ProcessorStatisticalInformation
InstanceID = null
Caption = Processor information
Description = CPU usage statistics
ElementName = null
Name = _Total
IsAggregate = TRUE
PercentIdleTime = 99
PercentUserTime = 0
PercentNiceTime = 0
PercentPrivilegedTime = 0
PercentInterruptTime = 0
PercentDPCTime = 0
PercentProcessorTime = 1
PercentIOWaitTime = 0
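When you save this output to a file, a few lines of Python can pull out the value to compare with the UNIX admin's numbers. This is a rough sketch under assumptions: the parser name is hypothetical, it expects the "Property = value" layout shown above, and the sample is a trimmed copy of that output.

```python
def parse_winrm_instances(text):
    """Split 'winrm enumerate' text output into one dict per instance."""
    instances = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if " = " in line:
            if current is not None:
                key, value = line.split(" = ", 1)
                current[key] = value
        else:
            # A line without '=' is treated as the class name of a new instance.
            current = {"class": line}
            instances.append(current)
    return instances


# Trimmed copy of the example output above.
SAMPLE = """\
SCX_ProcessorStatisticalInformation
    Name = 0
    IsAggregate = FALSE
    PercentIdleTime = 99
    PercentProcessorTime = 1

SCX_ProcessorStatisticalInformation
    Name = _Total
    IsAggregate = TRUE
    PercentIdleTime = 99
    PercentProcessorTime = 1
"""

if __name__ == "__main__":
    for inst in parse_winrm_instances(SAMPLE):
        print(inst["Name"], inst["PercentProcessorTime"])
```

The _Total instance is the aggregate across processors, so that is usually the one to compare first.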

 

Additional references for WinRM syntax and troubleshooting

Warren’s blog

Docs site

Use Unix MP’s for shell commands

 

Possible SQL issues affecting SCOM performance

 

Good reasons for a Risk Assessment

 

SQL RAS runs 800+ queries to check on target SQL servers

Check Best Practice Recommendations (BPR)

 

This may be a good opportunity to audit the SQL build against BPR!

 

 

 

I ran across some good examples where SQL settings brought SCOM to a standstill.

One was Cardinality Estimation: basically, it predicts how many rows a query will return.

It has been part of SQL Server since version 7.0 in 1998.

 

Let’s figure out what SQL2016 runs OoB (out of box)

 

SQL 2016

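-- Confirm the SQL Server version
SELECT SERVERPROPERTY('ProductVersion');
GO

-- 0 = new cardinality estimator, 1 = legacy CE for the current database
SELECT name, value
FROM sys.database_scoped_configurations
WHERE name = 'LEGACY_CARDINALITY_ESTIMATION';
GO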

 

 

The other is CLR Strict Security

-- 'clr strict security' exists only on SQL Server 2017 and later
SELECT * FROM sys.configurations
WHERE name IN ('clr enabled', 'clr strict security');

 

 

Talking with Shawn Nakhostin – SQL PFE, we discussed opportunities and questions around SQL optimization and best practices.

Shawn gave me the following feedback on customer performance issues:

I’ve found some customers who have had performance issues with SQL based on organizational SQL settings:

  1. Trace flag 9481
  2. CLR Strict Security is enabled by default

 

Trace flag 9481

Enabling or disabling this TF is not a matter of best practice.

The customer should see what works for them.

Here is the explanation:

SQL Server started using a new cardinality estimator (CE) in SQL Server 2014.

The product team knew that the new CE improved some of the query plans, but not all of them. In other words, they knew that this would improve overall query performance in “some” environments but might have a different impact in other environments.

For this reason, they created TF 9481. Environments that see query performance degradation after upgrading from SQL Server 2012 or earlier can turn on this trace flag so that the query optimizer uses the old CE algorithm.

Note: Trace flag 9481 forces the query optimizer to use version 70 (the SQL Server 2012 version) of the cardinality estimator when creating the query plan.

https://blogs.technet.microsoft.com/dataplatform/2017/03/22/sql-server-2016-new-features-to-deal-with-the-new-ce/

https://support.microsoft.com/en-in/help/2801413/enable-plan-affecting-sql-server-query-optimizer-behavior-that-can-be

 

CLR Strict Security is enabled by default

This causes all assemblies to be treated as unsafe.

As a result, assemblies that are not signed or otherwise trusted will not load.

To get the assemblies to load, the customer can do one of the following:

  • Sign the assembly. This may work if you have a few assemblies but becomes a huge task if there are many assemblies to sign.
  • Set the TRUSTWORTHY database property to on.
    • This is not recommended because it partly defeats the purpose of using CLR Strict Security.
  • Add the assembly to the trusted assemblies list.
    • This is called whitelisting, which may be a better option than the previous two.
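For the trusted assemblies option, SQL Server identifies an assembly by its SHA-512 hash, which is passed to sys.sp_add_trusted_assembly. Below is a minimal Python sketch that computes the hash and prints the T-SQL to run. The helper name and the sample bytes are hypothetical, and it assumes you hash the exact same binary that is registered in SQL Server.

```python
import hashlib


def trusted_assembly_statement(assembly_bytes, description):
    """Build the T-SQL that adds an assembly to the trusted assemblies list."""
    # The trusted assemblies list is keyed on the SHA-512 hash of the binary.
    digest = hashlib.sha512(assembly_bytes).hexdigest().upper()
    # Double any single quotes so the description stays a valid T-SQL literal.
    safe_description = description.replace("'", "''")
    return (
        f"EXEC sys.sp_add_trusted_assembly @hash = 0x{digest}, "
        f"@description = N'{safe_description}';"
    )


if __name__ == "__main__":
    # Placeholder bytes; in practice read the real DLL, for example:
    # data = open(path_to_assembly, "rb").read()
    data = b"example assembly bytes"
    print(trusted_assembly_statement(data, "Example assembly"))
```

Review the generated statement before running it, and check sys.trusted_assemblies afterwards to confirm the entry.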

https://docs.microsoft.com/en-us/sql/database-engine/configure-windows/clr-strict-security?view=sql-server-2017

 

 

Skype for Business 2015 (on-premises) Addendum MP

Ever tried to figure out from a Skype alert which server in the pool(s) is failing?

 

While maybe not the clearest to find root cause, the Skype pack brings a bunch of functionality, including synthetic transactions.

 

I was lucky enough to collaborate with Nick Wood, Skype PFE, to provide more detail, troubleshooting steps, and the user impact behind what is critical versus warning.

 

What the addendum pack brings

Do you think 656 monitors can all be critical?

  • Sets up service restart recovery tasks for all Skype services
  • Company Knowledge tab for troubleshooting/user impact

 

Gallery Download

 

Here is a visual of our Skype efforts to integrate troubleshooting details into the SCOM console.

NOTE: The Company Knowledge tab is accessible from the alert as well

 

Company Knowledge

SCOM Console, Authoring tab, Dispatcher Queue monitor

Highlight monitor, right click, choose properties

Click on ‘Company Knowledge’ tab

We incorporated the XLS into SCOM under Company Knowledge for additional information on user impact, causes, and troubleshooting (under resolutions)

 

Welcome to my new blog

Hi all,

Welcome to my new blog. While I’m going to focus primarily on System Center Operations Manager, I will post about other System Center products from time to time.

Thanks for stopping by!