SCOM Agent Maintenance

Wrench for SCOM agent maintenance
Wrench for SCOM agent maintenance

When we talk about best practices for monitoring, this will typically include (SLA) Service Level Availability.  SLA is an important piece in your environment, as uptime and happy customers come with a high SLA.  There are some cases where IT Teams do work on demand.  On-demand work is outside of a standard change window, a scheduled change.  Typically this is outside configuration management tools, responsible to update software (applications/packages), machines, drivers, compliance settings, and more.  In the one-off, non-scheduled maintenance or recovery, try leveraging ‘SCOM Agent Maintenance’ PowerShell commands on SCOM agents.

 

SCOM Agent maintenance PowerShell commands

cd “C:\Program Files\Microsoft Monitoring Agent\Agent”

Import-module .\MaintenanceMode.dll

Start-SCOMAgentMaintenanceMode -Duration 10 -Reason PlannedOther

 

# Verify

# If messages show with current timestamp, Agent objects are in maintenance.

 

get-eventlog -LogName “Operations Manager” -newest 50 | ? { $_.Message -like “Suspending monitoring*”  } | ft TimeGenerated,Message -autosize

 

TimeGenerated        Message

————-        ——-

6/25/2020 8:37:57 AM Suspending monitoring for instance “modeldev” with id:”{F9E45AA4-7DF7-C1F1-70C9-5D76C9F2725C}” …

6/25/2020 8:37:57 AM Suspending monitoring for instance “C:” with id:”{ED00048A-7DDC-D4BE-901D-D64DA281B7C6}” as the…

6/25/2020 8:37:57 AM Suspending monitoring for instance “central_log” with id:”{EA619D69-D1CC-3B19-D93C-2E3FCD1409AE…

 

PS C:\Program Files\Microsoft Monitoring Agent\Agent> get-eventlog -LogName “Operations Manager” -newest 25 | ? { $_.Message -like “Resuming monitoring*”  } | ft TimeGenerated,Message -autosize

 

343998 Jun 25 08:50  Information HealthService          1073743040 Resuming monitoring for instance “modeldev” wit…

343997 Jun 25 08:50  Information HealthService          1073743040 Resuming monitoring for instance “C:” with id:”…

343996 Jun 25 08:50  Information HealthService          1073743040 Resuming monitoring for instance “central_log” …

343995 Jun 25 08:50  Information HealthService          1073743040 Resuming monitoring for instance “dnmll05s1.UNE…

SCOM maintenance schedules

SCOM maintenance schedules
SCOM maintenance schedules
Do your SCOM users need to know if a server is in scheduled maintenance?  This came about as Aris asked questions.
 First, let’s discuss specific maintenance mode and maintenance schedule scenarios users might ask.  Second, determining IF scheduled maintenance enabled, running, about to run.  Third, how does another user know when scheduled maintenance ends, allowing action and decision point to add/extend server maintenance.  Fourth, whenever scheduled maintenance entered by one user, is NOT automatically seen by other roles.  While product guidance states ‘maintenance schedules be added by someone in SCOM admin group’, self-service users still need visibility.  Lastly, can we figure out a way to answer these questions.  Given these points, users to be able to see server maintenance details.  Also, can solution adhere to best practice ‘no alerts during planned maintenance’.
From PowerShell on SCOM MS
Get-SCOMMaintenanceScheduleList
$ScheduleList = Get-SCOMMaintenanceScheduleList
$ScheduleList.ID
$ScheduleList.ScheduleID.Guid
foreach ( $ID in $ScheduleList.ScheduleID)
{
$Schedule = get-SCOMMaintenanceSchedule -ID $ID
# $Schedule.MonitoringObjects ;
(get-scomclassInstance -id $Schedule.MonitoringObjects.Guid).DisplayName
# Debug endtime
$Schedule | ft User,ActiveStartTime,ActiveEndDate,ScheduledEndTime
}
Example Output
SCOM Maintenance Schedule Output
SCOM Maintenance Schedule Output
Workflows:
Scheduled Maintenance report task
Maintenance mode report – what’s about to end maintenance mode.
Obviously, expect both workflows into the ‘Proactive NOSC DailyTasks’ pack.  GitHub repo  https://github.com/theKevinJustin/ProactiveNOSCDailyTasks

SCOM Snapshot Synchronization alerts

Synchronized Swimmers

Updated 4 Apr 2023 with Tyson’s feedback!

 

First, some background – the Snapshot Synchronization alert just tells you there was a SQL issue running the workflow.

Second, the Snapshot Synchronization alert from a health model perspective, is NOT a critical issue (outage).  Create override severity to 1 (warning) to prevent false wake-up calls.  I’ll get this to my GitHub repo shortly!

 

 

 

Let’s troubleshoot the alert

Start with Tyson’s SCOM Maintenance pack, and run the tasks

Tyson’s SCOM Maintenance pack tasks

 

Alternative long steps

Login to server with SSMS installed –

Open SSMS > Connect to the SCOM OpsMgr DB > Click on New Query

***NOTE verify database dropdown shows Operations Manager!

Paste SQL query into the query textbox

Select WorkItemName, b.WorkItemStateName, ServerName, StartedDateTimeUtc, CompletedDateTimeUtc, DurationSeconds, ERRORMESSAGE
from cs.WorkItem a , cs.WorkItemState b
where a.WorkItemStateId= b.WorkItemStateId
and WorkItemName = ‘SnapshotSynchronization’

Screenshot

Snapshot Synchronization alert query

 

 

 

Example SQL Output

SQLQueryExecution

WorkItemName WorkItemStateName ServerName StartedDateTimeUtc CompletedDateTimeUtc DurationSeconds ERRORMESSAGE
SnapshotSynchronization Succeeded SCOMV01 2023-01-24 00:26:23.427 2023-01-24 00:27:46.100 83 NULL
SnapshotSynchronization Failed SCOMV02 2023-01-25 00:27:52.363 2023-01-25 00:28:07.520 15
SnapshotSynchronization Succeeded SCOMV00 2023-01-25 21:43:36.540 2023-01-25 21:45:07.947 91 NULL
SnapshotSynchronization Running SCOMV00 2023-01-25 21:45:32.227 NULL NULL NULL

Solution:
The jobs may show Succeeded by the time you login to SQL = EOJ (end of job)

If Failed is latest date/timestamp, re-run the task “Request Snapshot Synchronization” which can be found when we select  “Management Configuration Service Group” in the below mentioned view.

View:

From Monitoring Tab > Click on Operations Manager folder > Click on Management Group Health widget > Highlight unhealthy state from Management Group Functions.

Click on the ‘Request Snapshot Synchronization’ task to execute the Stored Procedure “SnapshotSynchronizationForce” on the OpsMgr DB.

 

NOTE: There are two tasks with same name but with different targets i.e. ‘Management Configuration Service Group’ and ‘Management Configuration Services’

 

The other task can be found on below view after selecting the Management Server you want the Task to be executed on

View:

From Monitoring Tab > Expand Operations Manager folder > Expand Management Configuration Service folder > Click on Services State view

 

 

Create Override for the alert

To change snapshot monitor to warning

From SCOM Console > Authoring Tab

Expand ‘Management Pack Objects’ > Click on Monitors

In the ‘Look for:’ bar type Snapshot synchronization state and hit enter

Monitor name = Snapshot synchronization state

Right click on monitor > Overrides > Override the monitor > For all objects of class

Click checkbox for Severity > change Critical to Warning

Click Edit – add comment – i.e. date/time changing to warning

Select your override pack > Click OK

Click OK to execute change

Override snapshot monitor to warning

 

 

Resources

Tyson’s MonitoringGuys blog for SCOM Maintenance pack – download here

Link to TechNet article

 

SCOM Maintenance Mode PowerShell

My thanks to Matt Taylor and Kevin Holman, Ralph Kyttle, and John Kavanagh for their guidance!

Updated 24 Jun 2022

 

 

Read on if these apply
Trying to start, update, or end SCOM MM

Get alerts when MM is updated
PowerShell only in your shop!
SCORCH in play but need to convert runbooks to straight PowerShell

Ran into issues using Set-SCOMMaintenanceMode, as the cmdlet doesn’t put ALL the recursive classes under Windows Computer

 

 

Background

Set-SCOMMaintenanceMode cmdlet is actually “by design.”  ☹

 

Start-SCOMMaintenanceMode assumes you want recursive action when you start maintenance mode….

Pick a Windows Computer and it places the Windows Computer object (AND all contained objects) into MM.

 

Computer in MM

All contained objects in MM

 

 

However, the problem is that Set-SCOMMaintenancemode does not have an understanding of recursiveness.

Command changes the MM entry for the Windows Computer, but NOT all the contained objects.  So they retain the original setting.

 

Health explorer looks like this, resulting in unwanted alerts

 

 

 

Details

NOTE these $Time and DateTime Method are dependent on the delay between running the commands
If you start MM, and wait 5 minutes, then update, the total MM duration will be ~20 minutes

 

 

 

Maintenance Mode options and examples

# Setup variables for MM

# Example 1 Windows Computer

$server = “Servername.FQDN”

$instance = (get-scomclass -DisplayName “Windows Computer” |Get-SCOMClassInstance | where { $_.DisplayName -eq $server } )

# Set time for 6 minutes

$Time = (Get-Date).addMinutes(6)

Start-SCOMMaintenanceMode -Instance $Instance -EndTime $Time -Comment “Starting Maintenance Mode.” -Reason “PlannedOther”

 

# Example 2

# Business needs require Windows Operating System monitoring to occur while Application is in maintenance

# My Example is Defender, could be SQL, MSMQ, Lync, Skype, or your custom class created for your application

$Class = (get-scomclass)
$Class | ? { $_.Name -like “*Defender*” } | fl DisplayName,Name
$Class | ? { $_.Name -like “*Defender*” } | fl DisplayName,Name

DisplayName : Protected Endpoint
Name        : Microsoft.WindowsDefender.ProtectedServer

DisplayName : Protected Candidate
Name        : Microsoft.WindowsDefender.ProtectedServerCandidate

DisplayName : Unprotected Endpoint
Name        : Microsoft.WindowsDefender.UnprotectedServer

DisplayName : Microsoft Windows Defender Class
Name        : Microsoft.Windows.Defender.Class

# Choose the class needed

$server = “Servername.FQDN”

$instance = ( $Class | ? { $_.Name -like “Microsoft.Windows.Defender*” } |Get-SCOMClassInstance | ? { $_.DisplayName -eq $server } )

# Verify Instance variable

$instance

PS C:\Users\scomadmin> $instance

HealthState     InMaintenanceMode  DisplayName
———–     —————–  ———–
Success               False        WFM.testlab.net

 

# Don’t forget to add time variable

$Time = (Get-Date).addMinutes(6)

# Start maintenance mode

Start-SCOMMaintenanceMode -Instance $Instance -EndTime $Time -Comment “Starting Maintenance Mode.” -Reason “PlannedOther”

 

 

 

 

Start, Update, End and Verify Maintenance mode syntax

 

# Start MM via PoSH cmdlet

Start-SCOMMaintenanceMode -Instance $Instance -EndTime $Time -Comment “Starting Maintenance Mode.” -Reason “PlannedOther”

 

 

# Start MM using method vs. PowerShell cmdlet

Note Recursive in $WCobj.ScheduleMaintenanceMode

$windowsComment=”PlannedOther”
$windowReason=”PlannedOther”
$windowsComment=”Testing Maintenance Mode”
$windowDuration=15

$server= “wfm.testlab.net”
$instance = (get-scomclass -DisplayName “Windows Computer” |Get-SCOMClassInstance | ? { $_.DisplayName -eq $server } )
$instance.ScheduleMaintenanceMode([datetime]::Now.touniversaltime(),([datetime]::Now).addminutes($windowDuration).touniversaltime(), “$windowReason”, “$windowsComment” , “Recursive”)

# Drop Recursive if you don’t want it (but can’t imagine why you would!)

 

 

# Update MM

# Make sure you’ve put object in MM

$server= “wfm.testlab.net”
$instance = (get-scomclass -DisplayName “Windows Computer” |Get-SCOMClassInstance | ? { $_.DisplayName -eq $server } )

# 15 minutes in the future
$instance.UpdateMaintenanceMode([System.datetime]::Now.touniversaltime().addminutes(15),[Microsoft.EnterpriseManagement.Monitoring.MaintenanceModeReason]::PlannedOther,[System.string]::”Adding 15 minutes to the end time.”,[Microsoft.EnterpriseManagement.Common.TraversalDepth]::Recursive);

 

# Stop MM

# Make sure you’ve put object in MM

# Immediate
$instance.StopMaintenanceMode([System.DateTime]::Now.ToUniversalTime());

My thanks to Jan Nevaril

$server.StopMaintenanceMode([System.DateTime]::Now.ToUniversalTime(),“Recursive”)

 

 

 

Verification steps

 

# Verify MM

get-scommaintenancemode -ComputerName $instance.Name|fl MonitoringObjectId,StartTime,ScheduledEndTime

NOTE This will error if you’ve stopped maintenance

Example

PS C:\Users\scomadmin> get-scommaintenancemode -ComputerName $instance.Name
get-scommaintenancemode : The Data Access service is either not running or not yet initialized. Check the event log
for more information.
At line:1 char:1
+ get-scommaintenancemode -ComputerName $instance.Name
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo          : InvalidOperation: (Microsoft.Syste…anceModeCommand:GetSCMaintenanceModeCommand) [Get-S
COMMaintenanceMode], ServiceNotRunningException
+ FullyQualifiedErrorId : ExecutionError,Microsoft.SystemCenter.OperationsManagerV10.Commands.GetSCMaintenanceMode
Command

 

 

# Validate MM through Operations Manager Event ID’s 1215 and 1216 logged

get-eventlog -LogName “Operations Manager” | ? { $_.EventID -eq 1215 -OR $_.EventID -eq 1216 } |fl EventID,TimeGenerated,Message

# Alternate command to check latest 100 events

get-eventlog -LogName “Operations Manager” -newest 100 | ? { $_.EventID -eq 1215 -OR $_.EventID -eq 1216 } |fl EventID,TimeGenerated,Message

 

 

# Error if object NOT in MM

Cannot find an overload for “UpdateMaintenanceMode” and the argument count: “1”.

At line:1 char:1

+ $WCobj.UpdateMaintenanceMode(([System.datetime]::Now).addminutes(15). …

+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    + CategoryInfo          : NotSpecified: (:) [], MethodException

    + FullyQualifiedErrorId : MethodCountCouldNotFindBest

 

PS C:\Windows\system32>

 

Testing System datetime

PS C:\Windows\system32> [System.datetime]::Now.addminutes(15)

 

Thursday, August 24, 2017 9:18:04 AM

 

 

PS C:\Windows\system32> ([System.datetime]::Now.addminutes(15)).touniversaltime()

 

Thursday, August 24, 2017 2:18:16 PM

 

 

 

 

References

2012 PowerShell cmdlets https://docs.microsoft.com/en-us/previous-versions/system-center/hh920227(v=sc.20)

2016 PowerShell cmdlets https://docs.microsoft.com/en-us/powershell/module/operationsmanager/?view=systemcenter-ps-2016

2019 PowerShell cmdlets https://docs.microsoft.com/en-us/powershell/module/operationsmanager/?view=systemcenter-ps-2019

SDK

Ralph Kyttle Blog https://blogs.technet.microsoft.com/ralphkyttle/2014/11/10/scom-2012-r2-use-powershell-to-end-an-active-maintenance-mode/

DateTime Methods https://docs.microsoft.com/en-us/dotnet/api/system.datetime

SCOM 2019 Maintenance Mode
https://docs.microsoft.com/en-us/system-center/scom/manage-maintenance-mode-overview?view=sc-om-2019

MSDN MaintenanceModeReason Method https://docs.microsoft.com/en-us/previous-versions/system-center/developer/bb465591(v=msdn.10)

MSDN StopMaintenanceMode Method

UpdateMaintenanceMode Method https://docs.microsoft.com/en-us/previous-versions/system-center/developer/bb424495(v=msdn.10)

 

MM deluxe custom script https://gist.github.com/stegenfeldt/b3f044aa77894ed80d82f8849a48035b