My thanks to Aris Somatis for his deep dive reviewing the packs with me, particularly new use cases. The PowerShell below builds on Scott Murr’s initial TechNet published logic from years back. Consequently, the reset logic provides a ‘manual intervention required’ alerting/monitoring system.
Improving SCOM monitor reset logic
Calling the reset method has been a game changer for my customers – including operators, system and application owners!
Background
Scott’s reset logic, from SCOM2012, helped administrators reset unhealthy monitors where alerts may have been closed. Because Scott leveraged the ResetMonitoringState method, the community gained a way to keep true health. Additionally, many administrators and engineers built custom management packs to provide solutions. Second, the addendum packs blog brought in more options – best practices, lessons from the field (and customers), and health model accurate alerting for what was really broken in the environment. Third, addressing ‘gaps’ or ‘blind spots’ from product teams. As a result of NEW monitoring, the packs may include: rules/monitors, datasource/writeAction (DS/WA) workflows, recovery tasks and automation, count logic monitors, overrides, discoveries, and groups. Thirdly, to take monitoring to the next level. To top that off, with very little/NO cost compared to competitors!
PowerShell code
Aris’s Age use case takes this even further. Using monitor age allows further analysis to dial down ‘monitor reset’ to object is X days old. Comparatively, the 24-72 hour setup default is used in the addendums, so Age provides a second option. Third option can rely on SCOM’s built-in cleanup, but that’s typically 14-30 days. Overall, flexibility is a good thing.
# Specify age variable for your environment
$Age = [DateTime](Get-Date).AddDays(-7)
PowerShell code snippet
First, the reset logic can pivot on the age requirement. Then, adjust the Age variable per requirements. Third, figure out which method applies to gather a unique list of classes, whether by partial string(s), or by management pack name(s).
Set age variable (how long ‘OLD’ monitors might be stale and need reset)
# Example sets $Age variable to 7 days ago (-7)
$Age = [DateTime](Get-Date).AddDays(-7)
Unpack two different ways to gather classes for monitors to reset
# When common string name exists in all classes
Example DFS/FileServices packs all have one of the three strings:
# DFS pack naming
$DFSClasses = @(Get-SCOMClass -Name “*FileServices*”; Get-SCOMClass -Name “*FileServer*”; Get-SCOMClass -Name “*DFS*” )
$DFSClass = $DFSClasses | sort -property Name -uniq
# Debug
$DFSClass.Count
# Get AD classes – Microsoft.Windows.Server.AD.2016.Discovery, Microsoft.Windows.Server.AD.Library
$ADLibrary = Get-SCOMManagementPack -name “Microsoft.Windows.Server.AD.Class.Library”
#get-scomclass -ManagementPack $ADLibrary
$ADMonitoring = Get-SCOMManagementPack -name “Microsoft.Windows.Server.AD.2016.Monitoring”
#get-scomclass -ManagementPack $ADMonitoring | fl DisplayName,Name,ID
$ADDiscovery = Get-SCOMManagementPack -name Microsoft.Windows.Server.AD.2016.Discovery
#get-scomclass -ManagementPack $ADDiscovery | fl DisplayName,Name,ID
# ADDS pack naming
$ADDSClasses = @(Get-SCOMClass -ManagementPack $ADLibrary; Get-SCOMClass -ManagementPack $ADDiscovery; )
# NOTE Excluded AD Monitoring pack as NO classes existed
$ADDSClass = $ADDSClasses | sort -property Name -uniq
# Debug count of unique classes
$ADDSClass.Count
Reset monitor PowerShell screenshot
Download from GitHub https://github.com/theKevinJustin/SCOMMonitorReset
Example PowerShell on HealthService resets
NOTE debug logic enabled
0 Found 0 unhealthy monitors for class Microsoft.SystemCenter.HealthServicesGroup 1 Found 1 unhealthy monitors for class Microsoft.SystemCenter.HealthServiceWatcher Resetting Health State on ' + Microsoft.SystemCenter.HealthServiceWatcher:Microsoft.SystemCenter.AgentWatchersGroup;5e0 4f804-8b71-6eb6-0101-dcbb58022498 + ' Guid ---- 0218d239-3d37-f9b1-75d2-6d52c2c7c0c1