Improving SCOM Monitor reset logic

Faster - Improving SCOM Monitor reset logic
Faster – Improving SCOM Monitor reset logic

 

My thanks to Aris Somatis for his deep dive reviewing the packs with me, particularly new use cases.  The PowerShell below builds on Scott Murr’s initial TechNet published logic from years back.  Consequently, the reset logic provides a ‘manual intervention required’ alerting/monitoring system.

 

 

Improving SCOM monitor reset logic

Calling the reset method has been a game changer for my customers – including operators, system and application owners!

Background

Scott’s reset logic, from SCOM2012, helped administrators reset unhealthy monitors where alerts may have been closed.  Because Scott leveraged the ResetMonitoringState method, the community gained a way to keep true health.  Additionally, many administrators and engineers built custom management packs to provide solutions.  Second, the addendum packs blog brought in more options – best practices, lessons from the field (and customers), and health model accurate alerting for what was really broken in the environment.  Third,    addressing ‘gaps’ or ‘blind spots’ from product teams.   As a result of NEW monitoring, the packs may include: rules/monitors, datasource/writeAction (DS/WA) workflows, recovery tasks and automation, count logic monitors, overrides, discoveries, and groups.    Thirdly, to take monitoring to the next level.  To top that off, with very little/NO cost compared to competitors!

PowerShell code

Aris’s Age use case takes this even further.  Using monitor age allows further analysis to dial down ‘monitor reset’ to object is X days old.  Comparatively, the 24-72 hour setup default is used in the addendums, so Age provides a second option.  Third option can rely on SCOM’s built-in cleanup, but that’s typically 14-30 days.   Overall, flexibility is a good thing.

 

# Specify age variable for your environment

$Age = [DateTime](Get-Date).AddDays(-7)

 

PowerShell code snippet

First, the reset logic can pivot on the age requirement.  Then, adjust the Age variable per requirements.  Third, figure out which method applies to gather a unique list of classes, whether by partial string(s), or by management pack name(s).

 

Set age variable (how long ‘OLD’ monitors might be stale and need reset)

# Example sets $Age variable to 7 days ago (-7)

$Age = [DateTime](Get-Date).AddDays(-7)

 

 

Unpack two different ways to gather classes for monitors to reset

# When common string name exists in all classes

Example DFS/FileServices packs all have one of the three strings:

# DFS pack naming

$DFSClasses = @(Get-SCOMClass -Name “*FileServices*”; Get-SCOMClass -Name “*FileServer*”; Get-SCOMClass -Name “*DFS*” )

$DFSClass = $DFSClasses | sort -property Name -uniq

# Debug

$DFSClass.Count

 

# Get AD classes – Microsoft.Windows.Server.AD.2016.Discovery, Microsoft.Windows.Server.AD.Library

$ADLibrary = Get-SCOMManagementPack -name “Microsoft.Windows.Server.AD.Class.Library”

#get-scomclass -ManagementPack $ADLibrary

$ADMonitoring = Get-SCOMManagementPack -name “Microsoft.Windows.Server.AD.2016.Monitoring”

#get-scomclass -ManagementPack $ADMonitoring | fl DisplayName,Name,ID

$ADDiscovery = Get-SCOMManagementPack -name Microsoft.Windows.Server.AD.2016.Discovery

#get-scomclass -ManagementPack $ADDiscovery | fl DisplayName,Name,ID

# ADDS pack naming

$ADDSClasses = @(Get-SCOMClass -ManagementPack $ADLibrary; Get-SCOMClass -ManagementPack $ADDiscovery; )

# NOTE Excluded AD Monitoring pack as NO classes existed

$ADDSClass = $ADDSClasses | sort -property Name -uniq

 

# Debug count of unique classes

$ADDSClass.Count

 

 

Reset monitor PowerShell screenshot

Download from GitHub https://github.com/theKevinJustin/SCOMMonitorReset

SCOM Reset PowerShell
SCOM Reset PowerShell

 

Example PowerShell on HealthService resets

NOTE debug logic enabled

0
Found 0 unhealthy monitors for class Microsoft.SystemCenter.HealthServicesGroup
1
Found 1 unhealthy monitors for class Microsoft.SystemCenter.HealthServiceWatcher

Resetting Health State on ' + Microsoft.SystemCenter.HealthServiceWatcher:Microsoft.SystemCenter.AgentWatchersGroup;5e0
4f804-8b71-6eb6-0101-dcbb58022498 + '

Guid
----
0218d239-3d37-f9b1-75d2-6d52c2c7c0c1


Documentation/Sources

Building on Scott’s idea – (retired links)
Original post https://sc.scomurr.com/scom-2012-monitor-reset-cleaning-up-the-environment/
TechNet gallery download https://gallery.technet.microsoft.com/SCOM-2012-Batch-reset-63a17534
Alternate link https://gallery.technet.microsoft.com/scriptcenter/Auto-reset-script-for-d8b775ca

DNS2012R2 Addendum pack

Still running Server2012R2 servers with AD DCs with AD integrated DNS?
Still running Server2012R2 servers with AD DCs with AD integrated DNS?

In case you’re still running Windows Server 2012R2, here’s the ‘DNS2012R2 Addendum pack’ giving the same functionality as the version agnostic 2016+ addendum.  Why?  DNS is a translation method to convert names to IP’s.  Can you imagine if we wanted to connect to google via IP?  The number of workflows in the SCOM DNS pack (built by the DNS Product Group) makes for an astounding number of workflows running on your DC every minute.  Forward and reverse lookups are a good check, verifying DNS is functioning.  In a complex environment with 100’s of zones, SCOM becomes a utilization culprit for a DC’s primary missions – authenticate and resolve.  This article will help you understand how the pack will add new capabilities and tune DNS monitoring to best practice.

 

Quick Download HTTPS://GITHUB.COM/THEKEVINJUSTIN/DNSADDENDUM2012R2/

 

 

What capabilities does the ‘DNS Addendum pack’ provide?

Count logic monitors (i.e. x events in y time, and self heal)

Daily summary report of DNS alerts broken out

Daily alert closure workflow to close out DNS rules/monitor

DNS service(s) recovery automation

Synthetic internal/external nslookup monitor (scoped to PDC emulators versus ALL DNS servers

WMI validation alert recovery to prevent false positive alerts with weird one off scenarios – one example: Security tools randomly block WMI access.

 

Download the ‘DNS2012R2 Addendum pack’ on GitHub to improve AD Integrated (ADI) DNS monitoring on Windows Server 2016+ (version agnostic).

Save and Import pack, then update XML for group GUIDs

 

 

Update XML

First, update XML with the GUIDs from your management group.  Second, map the group DisplayName to find/replace the GUID for each group.

Get-SCOMClassInstance output for DNS2012R2 groups
Get-SCOMClassInstance output for DNS2012R2 groups

 

Third, using Notepad++ highlight the ContextInstance GUID and hit Control-H, and paste the group GUID then click Replace All.

Using Notepad++ highlight the ContextInstance GUID and hit Control-H, and paste the group GUID then click Replace All.
Using Notepad++ highlight the ContextInstance GUID and hit Control-H, and paste the group GUID then click Replace All.

Fourth – Rinse and repeat for the other three groups.

Lastly, save file, move to SCOM MS, and import!

 

Documentation and links

DNS Pack download

DNS2012R2 addendum blog including updates

GitHub Repository https://github.com/theKevinJustin/DNSAddendum2012R2/