OS Addendum updates

OS Addendum updates - Can I get your attention please!?
Can I get your attention please!?

 

More updates for your monitoring pleasure with OS addendum updates!

 

OS Addendum updates

Been busy in the monitoring ‘bat’ cave crafting up new ways to simplify things, automating recoveries, top process finds, STIG compliance, automatic services logic, and PowerShell transcription checks.

Download https://github.com/theKevinJustin/2016ServerAgnostic/

 

Automatic Services – monitors services with automatic service type, and includes built in recovery when monitor unhealthy

PowerShell transcription – be compliant for STIG V-257502 and V-257503, also includes Daily report (runs daily Monday-Friday)

Sets multiple overrides to reduce extraneous OS workflows that most customers never use.

 

Change/revision history

v1.0.7.6 13 Jan 2025 Updated EnableTranscription Report variables
v1.0.7.5 8 Nov 2024 Added additional services into excluded services, uncommented ServiceName variable
v1.0.7.4 7 Nov 2024 Updated ExcludedServices in AutomaticServices monitor DS
v1.0.7.3 21 Oct 2024 Logical Disk rollup disable, hourly monitor setting changes, event and perf collection rule disabled
v1.0.7.2 21 Oct 2024 Updated AutomaticServices DS/WA workflows
v1.0.7.1 18 Oct 2024 Updated Excluded Services array with additional services, updated AutomaticServices DS/WA workflows.
v1.0.7.0 17 Oct 2024 Updated AutomaticServices monitor timeout from 240 to 600 for OpsMgr event ID 22404
v1.0.6.8 16 Oct 2024 Updated Storport Timed Out Monitor from 5 to 10 events per hour
v1.0.6.8 11 Oct 2024 Additional EnableTranscription workflows – reports, report cleanup, transcription log cleanup
v1.0.6.7 11 Apr 2024 New weekly PowerShell Enable Transcription monitor for STIG V-257502 and V-257503, and new weekly report
v1.0.6.5 10 Apr 2024 Updated recovery logic for AutomaticServices DS/WA
v1.0.6.4 9 Apr 2024 Updated AutomaticServices DS/WA logic change, ServiceName variable change to Service

Updated NOSC Daily Tasks

Keep your head up

Updated NOSC Daily Tasks with more insights, whether NOC/NOSC, or SCOM Admin related, check out the GitHub for the pack and change/revision history.

Keep your head up!   I find this is always a positive message to look up, not down. Leverage new key insights and download the pack from my GitHub repo – Proactive NOSC Daily Tasks link

 

Updated NOSC Daily Tasks Summary

Latest round adds simplification of SCOM agent workflow errors, adding the offending computer with the SCOMAdmin DailySummary alert details.

Offending alert examples from multiple customers

MSSQL on Windows: SQL Server has failed to allocate sufficient memory to run

Alert generation was temporarily suspended due to too many alerts (event ID 5399)

PowerShell was dropped

Expression Filter Module Failed Initialization (group regex errors)

The November pack updates add TicketID field to the SCOMAdmin, Daily Summary, Logical Disk report, and Alert updates reports.  This is invaluable when integrating service management (ITSM) system events/alerts/incidents into your monitoring.  Lastly, visibility into created incidents is key to business issues (see the AlertUpdates workflows).

Details

NOSC Management pack provides summary report alerts of key insights including: Expiring certificates, Logical Disk alerts, Pending reboots, System Admin summary, and SCOM admin reports including long-running scripts, script errors, SCOM errors, and alert updates report.

Blog https://kevinjustin.com/blog/2023/08/15/proactive-daily-reports/

Change History

v1.0.5.7  13 Jan 2025 Updated SCOMAlerts report details with format-table properties from select
v1.0.5.6  15 Nov 2024 SCOMAdmin and Daily Summary, Logical disk report changes
v1.0.5.4  12 Nov 2024 AlertUpdates report and various logging changes
v1.0.5.3   5 Nov 2024 Enabled AlertUpdate rules
v1.0.5.2  30 Oct 2024 Daily Summary and SCOMAlerts report updates
v1.0.5.1  17 Oct 2024 Added Operations Manager Event ID's 22402, 22406
v1.0.5.0   4 Jan 2024 Resolution State logic improvements for large environments
v1.0.4.9  21 Dec 2023 WhiteSpace, newline, return updates, Expiring Certs report moved back 1 hour
v1.0.4.8  20 Dec 2023 Updated all Get-SCOMAlert queries to use -ResolutionState (0..254) for performance increase over where-object
v1.0.4.7  18 Dec 2023 Updated Expiring Certs DS/WA, whitespace code check
v1.0.4.6  30 Nov 2023 Removed debug detail from DS/WA which showed in Health Explorer pane

SQL STIGMonitor

End the STIG(ma)

Seriously, dream on!  End the STIGma is a good thing, but STIGs can be a burden.  Hit the easy button, if you’re not already using it.  Contact your SQL Data and AI Cloud Solutions Architect for the latest SQL STIG Monitor 2024 Q4 build!


 

Latest SQL STIG monitor 31 Oct 2024 release includes

DISA UPDATES – see link
MS SQL Server 2016 Instance STIG, V3R2:

(NOTE: DISA has been contacted to remove related CCI STIGID for AzureSQLDB that was overlooked: ASQL-00-010700)

POWERSHELL MODULE
Updated version to 1.23
Added STIGID parameter to Invoke-StigMonitor allowing granular control over STIGID scanning.

DATABASE CHANGES
Updated Checklist Templates for Q4 Revisions.
Updated Instance & Database STIG for Q4 benchmark date.
Script updates include:

CNTNMIXDB: Not A Finding if using Windows Auth
FORCENRYPT: NA if using Windows Auth
PWDCMPLX: Updated Finding to remove OS STIG reference
AZDBPERMISS: Revised script with new version.
DBPERMISS: Revised script with new version.
ENFCACCSS: Revised script with new version.
PSERRPERM: Revised script with new version.
UNQSVCACC: Removed code stripping out port number.
AZAUDITSTATE: Properly returns No Finding when audit setup is correct.
Fixed bug in vDocumentation view causing POAMs to not display custom comment in exported documentation.
Added usp_RemoveInstance stored procedure to easily clean up a specific Instance from StigMonitor that no longer exists.

DOCUMENTS
Updated checklist templates, Approvals scripts, and Documentation Templates for Q4 Revisions.
Removed Set-CEIPRegKeys.ps1, Set-FIPSCompliance.ps1, and Set-SqlRegKey.ps1 in favor of Module commands.
Updated InfoPage with new StigMonitor logo and text references.
Documentation updated with new examples of Invoke-StigMonitor STIGID parameter.
Updated documentation to add Azure DB Permission for MS_SecurityDefinitionReader.
Added DatabaseName to CSV Export of Export-StigDocumentation.

REPORTS
Updated Report banner to display new StigMonitor logo and latest report versions.
Removed Adhoc scanning to Policy Management Report in favor of Invoke-StigMonitor parameter.
Removed references to Sunset 2012 and 2014 STIGs.
Added AzureSQLMI for future use.
Combined NF and Approved in Total Findings summary
Reduced Recent Scans to latest 6.

Also please send us your feedback if you get a chance to check this out.
If you want to be added/removed from this, click here (Subscribe /Unsubscribe) or send us an email.

 

DNS Scavenging alerts

DNS Scavenging how it works

Need DNS Scavenging alerts, to see what’s cleaned up, or that scavenging failed?  Download the DNS Addendum pack from my GitHub repo https://github.com/theKevinJustin/DNSAddendumAgnostic

Latest revision first includes a EventID 2502 monitor for scavenging failed.  Second, the monitor has count logic (setup to alert with 2 events in 30 minutes).  Third, EventID 2501 rule details scavenging totals.  Lastly, built a weekly report to summarize the scavenging alerts (cliff notes!).

 

 

Some quick ‘how-to’ setup DNS scavenging

Example of RegKey showing that Scavenging is setup – note Scavenging Interval key

 

Example of AD integrated DNS setup with 21 day scavenging interval, and prompts to configure (click OK twice)

DNS Scavenging setup on AD integrated DNS server

 

Import management pack, and run DNS scavenging.

 

Verify scavenging alerts

SCOM Monitoring Tab > Active Alerts > ‘Look for:’ scavenging

Example output

 

Additional SCOM PowerShell commands

Run PowerShell commands from the SCOM management server (MS)

$DNSAlerts = get-scomalert -name "*Scavenging*"
$DNSAlerts
$DNSAlerts | format-table PrincipalName,TimeRaised,Description -auto -wrap

 

Example Output

PS C:\Users\scomadmin> $DNSAlerts = get-scomalert -name “*Scavenging*”

PS C:\Users\scomadmin> $DNSAlerts

 

Severity     Priority   Name                                                                        TimeRaised

——–     ——–   —-                                                                        ———-

Warning      Normal     Windows DNS Event 2502 Scavenging Failed monitor addendum alert             8/19/2024 2:02:3…

Warning      Normal     Windows DNS Event 2502 Scavenging Failed monitor addendum alert             8/19/2024 1:07:0…

Information  Normal     Proactive DailyTasks DNSAlerts Scavenging Summary Report Alert              8/19/2024 10:11:…

 

 

DNS alerts formatted

PS C:\Users\scomadmin> $DNSAlerts | format-table PrincipalName,TimeRaised,Description -auto -wrap

 

PrincipalName    TimeRaised            Description

————-    ———-            ———–

DC02.testlab.net 8/19/2024 2:02:32 PM  Windows DNS Event 2502 Scavenging Failed monitor alert 1 alert in 15 minutes

Event Description:

The DNS server has completed a scavenging cycle but no nodes were visited.

Possible causes of this condition include:

The next scavenging cycle is scheduled to run in 168 hours.

 

Learn articles for more details https://learn.microsoft.com/en-us/troubleshoot/windows-server/networking/dns-scavenging-setup

ServiceNow Event integration

ServiceNow Event integration
ServiceNow Event integration
Time to integrate your Monitoring tools to ITSM tool.  First, this blog post documents ‘ServiceNow Event integration’.  Second, let’s explain the common acronym in my experience is SNOW/SNow.  Third, some background – ServiceNow has been around for some time as an Information Technology Service Management (ITSM), and discovery tool.  As a SaaS solution, companies can purchase a subscription and integrate tools via RESTAPI to create/update/close events or incidents.
First, let’s begin to discuss SCOM notification methods.  SCOM2022 adds a new capability with Teams integration.  Second, most people are familiar with notification methods leveraging Email (html or not), perhaps SMS, but not so much command channel, calling some script in shell, PowerShell, etc.  Generally, the command channel is basically a post processing script capability to execute notifications.  Third, example tools where command channel might be used – BMC BEM (BMC Event Manager), BMC Remedy, xMatters, DerDack; SNOW integration within SCOM, using notification channels.  Lastly, SaaS solutions (vendors like xMatters, and ServiceNow) allow RESTAPI crafted requests to take actions.
SNOW prerequisites
1) ServiceNow User/Password (or API key)
2) SNOW RESTAPI PowerShell needs to securely access credentials
For the Incident PowerShell, we store Credentials within Windows Credential Manager
3) Network connectivity to SaaS provider (use PowerShell test-netconnection from SCOM MS to test connectivity over whatever port(s) vendor requires.
4) ServiceNow CallerID GUID
5) Production and Test URL’s (also required for network connectivity tests)
6) Access to SNOW UI to verify required fields and values for the script parameters.
Update incident script and begin testing.
Download script from GitHub repo https://github.com/theKevinJustin/New-SNowEvent/
Download script, and copy to monitoring repository
Copy to SCOM management servers (MS)
NOTE Path, to run from management server
Update script, with pre-reqs above –
Credential Manager stored ID
For more detail, look at parameter examples below to verify UI.
Update with customer/ServiceNow SNOW subscription specific values:
##CallerID##
##CUSTOMER##    (customize SNOW short_description)
##TEAM##    (customize SNOW short_description)$Channel = “Direct”
$ServiceNowURL=”https://##SERVICENOWURL##/api/now/table/em_event”
$CallerID = “##CallerID##”
# if proxy is used, uncomment and replace with Proxy URL
#$Proxy = “##Proxy##”
# Test New-SNOWEvent.ps1
# Depending on how you want to randomly choose an alert to create SNOW event
Lab example
$Alerts = get-scomalert -resolutionstate 0 | where { $_.Name -like “System Center*” }
Gather Critical, New alerts
$Alerts = get-scomalert -ResolutionState 0 -severity 2
Debug for warning alerts
$Alerts = get-scomalert -ResolutionState 0 -severity 1
# Debug
$Alerts[0] | fl ID,Name,Description,Severity,MonitoringObjectDisplayName
.\New-SNOWEvent.ps1 -AlertName $Alerts[0].Name -AlertID $Alerts[0].ID -Impact 4 -Urgency 4 -Priority 3 -AssignmentGroup “System Admin” -BusinessService “System Management” -Category Support -SubCategory Repair -Channel Direct
Example output
PS C:\Users\scomadmin\Desktop> .\New-SNOWIncident.ps1 -AlertName $Alert.Name -AlertID $Alert.ID -Impact 4 -Urgency 4 -Priority 3 -AssignmentGroup “System Admin” -BusinessService “System Management” -Category Support -SubCategory Repair -Channel Direct
TEST ServiceNow URL specified.
CredentialManager PoSH Module Installed, ModuleBase = C:\Program Files\WindowsPowerShell\Modules\CredentialManager\2.0
The System Center Management Health Service 5E04F804-8B71-6EB6-0101-DCBB58022498 running on host 16DB02.testlab.net and s
erving management group with id {E39F5F53-9FBB-9D7F-4BFE-5F0324630AE5} is not healthy. Some system rules failed to load.
16DB02
Warning
impact 4
urgency 4
priority 3
ServiceNow Credential NOT stored on server

SCOM maintenance schedules

SCOM maintenance schedules
SCOM maintenance schedules
Do your SCOM users need to know if a server is in scheduled maintenance?  This came about as Aris asked questions.
 First, let’s discuss specific maintenance mode and maintenance schedule scenarios users might ask.  Second, determining IF scheduled maintenance enabled, running, about to run.  Third, how does another user know when scheduled maintenance ends, allowing action and decision point to add/extend server maintenance.  Fourth, whenever scheduled maintenance entered by one user, is NOT automatically seen by other roles.  While product guidance states ‘maintenance schedules be added by someone in SCOM admin group’, self-service users still need visibility.  Lastly, can we figure out a way to answer these questions.  Given these points, users to be able to see server maintenance details.  Also, can solution adhere to best practice ‘no alerts during planned maintenance’.
From PowerShell on SCOM MS
Get-SCOMMaintenanceScheduleList
$ScheduleList = Get-SCOMMaintenanceScheduleList
$ScheduleList.ID
$ScheduleList.ScheduleID.Guid
foreach ( $ID in $ScheduleList.ScheduleID)
{
$Schedule = get-SCOMMaintenanceSchedule -ID $ID
# $Schedule.MonitoringObjects ;
(get-scomclassInstance -id $Schedule.MonitoringObjects.Guid).DisplayName
# Debug endtime
$Schedule | ft User,ActiveStartTime,ActiveEndDate,ScheduledEndTime
}
Example Output
SCOM Maintenance Schedule Output
SCOM Maintenance Schedule Output
Workflows:
Scheduled Maintenance report task
Maintenance mode report – what’s about to end maintenance mode.
Obviously, expect both workflows into the ‘Proactive NOSC DailyTasks’ pack.  GitHub repo  https://github.com/theKevinJustin/ProactiveNOSCDailyTasks

January addendum updates

Fast and Furious (sarcasm and humor)
Fast and Furious (sarcasm and humor)

January addendum updates for multiple management packs

First, the biggest change item for large enterprise environments included a change in syntax for get-SCOMAlert
Example
get-scomalert -ResolutionState (0..254) -Name “##stringhere##*”
get-scomalert -ResolutionState 255 -Name “##stringhere##*”
Second, another change with the repo’s was a ‘whitespace audit’ encoded characters, or ‘data concealment’.  See AT&T link CyberSecurity Link
Third, after whitespace we focused on script/workflow efficiencies seen in large enterprise environments.  While Efforts began in December, the workflow efficiencies sprint resulted in two sets of improvements.
Fast and Efficient
Fast and Efficient
1) Added ‘Reset Monitors Script base code’ $Age variable
What does this mean?
Simply put $Age allows admins to define monitor age before resetting.
The default is 1 (day), but can be specified in the script to tailor to requirements.
Example
$Age = [DateTime](Get-Date).AddDays(-1)
2) Beyond incorporating $Age into the reset monitor logic, the packs utilize logic for a much faster runtime (~90%+).
What does this mean?
Updated logic quickly gathers unhealthy monitor objects, by leveraging ‘Get-SCOMManagementPack‘ and then ‘Get-SCOMClass‘, before passing to ‘Get-SCOMClassInstance‘.
Example PowerShell
## Grab the MP, get the Monitors and Rules from the MP, then grab all alerts found inside the Monitors/Rules
$SCOMCoreMP = Get-SCOMManagementPack -DisplayName “Microsoft Windows Server DNS Monitoring”
# Get classes – Examples –
$Monitoring = $SCOMCoreMP
# DNS pack naming
$DNSClasses = @(Get-SCOMClass -ManagementPack $Monitoring; )
$DNSClass = $DNSClasses | sort -property Name -uniq

Repo’s updated in January

January addendum updates include:
ADCS, ADDS, DNS, DFS/File Services, IIS, SCCM pack for MECM/MEM/MCM monitoring, Operating Systems, Proactive NOSC Daily Tasks, and Tangible ProV application monitoring.

Links below to GitHub repositories (repo’s)

Tangible ProV application monitoring

Tangible ProV application monitoring - (touch)
Tangible ProV application monitoring – (touch)

 

Use the Tangible SCOM management pack to monitor logins and ProV application registration issues.  First, the management pack configures Seed class discovery.  Second, the pack includes rules/monitors for Tangible ProV software.  Third, rules and monitors for 2802 ‘Could not validate product key’ and 4402 ‘Could not validate the contents of user logon request context: AS-REQ contains an invalid or unknown username type’ events.  Fourth, the service monitor, which uses Kevin Holman’s fragment library for service recovery scripts/rules.  Fifth, scheduled and on-demand daily reports for audit and record keeping purposes.  Lastly, alert cleanup logic, to reduce admin burden and overhead.

 

Reference the Tangible vendor’s website – Tangible ProV application website

 

NOTE: This may not apply for everyone, as the ProV application ‘Auto-provisions Active Directory user accounts for visitors or new employees whenever they want to work from one of your PCs.’

 

The Daily report piece of the pack makes things easier answering ‘what happened in the last 24-72 hours’ question.  Gathers open/closed insights and organizes alerts.

Screenshot of the daily report

Zero Alert example of daily report
Zero Alert example of daily report

Report example of insights (in text)

Open ProV alerts = 13Since last report run:#———————–Total ProV alerts = 23Auto-closed monitors = 22Auto-closed rules = 0Total automation closures:#—————————Auto-closed monitors = 262Auto-closed rules = 0# Unhealthy Tangible ProV service alert details#==============================================NetbiosComputerName TimeRaised           RepeatCount Name                     ——————- ———-           ———– —-                     DC01        8/11/2023 5:18:14 AM           0     Tangible ProV ProVService…

 

All in all, the daily report utilizes get and set-SCOMAlert to accomodate large enterprise environments.

$OpenAlerts = get-scomalert -ResolutionState (0..254) -Name “Tangible ProV ProVService Service*”

$OpenAlerts = $OpenAlerts | ? { $_.TimeRaised -ge $Time }
# $OpenAlerts.count

# Closed alerts
$ClosedAlerts = get-scomalert -ResolutionState 255 -Name “Tangible ProV ProVService Service*” | ? { $_.TimeRaised -ge $Time }
# $ClosedAlerts.count

 

 

Tangible ProV application monitoring details and download

GitHub https://github.com/theKevinJustin/TangibleProV

Download here

 

Improving SCOM Monitor reset logic

Faster - Improving SCOM Monitor reset logic
Faster – Improving SCOM Monitor reset logic

 

My thanks to Aris Somatis for his deep dive reviewing the packs with me, particularly new use cases.  The PowerShell below builds on Scott Murr’s initial TechNet published logic from years back.  Consequently, the reset logic provides a ‘manual intervention required’ alerting/monitoring system.

 

 

Improving SCOM monitor reset logic

Calling the reset method has been a game changer for my customers – including operators, system and application owners!

Background

Scott’s reset logic, from SCOM2012, helped administrators reset unhealthy monitors where alerts may have been closed.  Because Scott leveraged the ResetMonitoringState method, the community gained a way to keep true health.  Additionally, many administrators and engineers built custom management packs to provide solutions.  Second, the addendum packs blog brought in more options – best practices, lessons from the field (and customers), and health model accurate alerting for what was really broken in the environment.  Third,    addressing ‘gaps’ or ‘blind spots’ from product teams.   As a result of NEW monitoring, the packs may include: rules/monitors, datasource/writeAction (DS/WA) workflows, recovery tasks and automation, count logic monitors, overrides, discoveries, and groups.    Thirdly, to take monitoring to the next level.  To top that off, with very little/NO cost compared to competitors!

PowerShell code

Aris’s Age use case takes this even further.  Using monitor age allows further analysis to dial down ‘monitor reset’ to object is X days old.  Comparatively, the 24-72 hour setup default is used in the addendums, so Age provides a second option.  Third option can rely on SCOM’s built-in cleanup, but that’s typically 14-30 days.   Overall, flexibility is a good thing.

 

# Specify age variable for your environment

$Age = [DateTime](Get-Date).AddDays(-7)

 

PowerShell code snippet

First, the reset logic can pivot on the age requirement.  Then, adjust the Age variable per requirements.  Third, figure out which method applies to gather a unique list of classes, whether by partial string(s), or by management pack name(s).

 

Set age variable (how long ‘OLD’ monitors might be stale and need reset)

# Example sets $Age variable to 7 days ago (-7)

$Age = [DateTime](Get-Date).AddDays(-7)

 

 

Unpack two different ways to gather classes for monitors to reset

# When common string name exists in all classes

Example DFS/FileServices packs all have one of the three strings:

# DFS pack naming

$DFSClasses = @(Get-SCOMClass -Name “*FileServices*”; Get-SCOMClass -Name “*FileServer*”; Get-SCOMClass -Name “*DFS*” )

$DFSClass = $DFSClasses | sort -property Name -uniq

# Debug

$DFSClass.Count

 

# Get AD classes – Microsoft.Windows.Server.AD.2016.Discovery, Microsoft.Windows.Server.AD.Library

$ADLibrary = Get-SCOMManagementPack -name “Microsoft.Windows.Server.AD.Class.Library”

#get-scomclass -ManagementPack $ADLibrary

$ADMonitoring = Get-SCOMManagementPack -name “Microsoft.Windows.Server.AD.2016.Monitoring”

#get-scomclass -ManagementPack $ADMonitoring | fl DisplayName,Name,ID

$ADDiscovery = Get-SCOMManagementPack -name Microsoft.Windows.Server.AD.2016.Discovery

#get-scomclass -ManagementPack $ADDiscovery | fl DisplayName,Name,ID

# ADDS pack naming

$ADDSClasses = @(Get-SCOMClass -ManagementPack $ADLibrary; Get-SCOMClass -ManagementPack $ADDiscovery; )

# NOTE Excluded AD Monitoring pack as NO classes existed

$ADDSClass = $ADDSClasses | sort -property Name -uniq

 

# Debug count of unique classes

$ADDSClass.Count

 

 

Reset monitor PowerShell screenshot

Download from GitHub https://github.com/theKevinJustin/SCOMMonitorReset

SCOM Reset PowerShell
SCOM Reset PowerShell

 

Example PowerShell on HealthService resets

NOTE debug logic enabled

0
Found 0 unhealthy monitors for class Microsoft.SystemCenter.HealthServicesGroup
1
Found 1 unhealthy monitors for class Microsoft.SystemCenter.HealthServiceWatcher

Resetting Health State on ' + Microsoft.SystemCenter.HealthServiceWatcher:Microsoft.SystemCenter.AgentWatchersGroup;5e0
4f804-8b71-6eb6-0101-dcbb58022498 + '

Guid
----
0218d239-3d37-f9b1-75d2-6d52c2c7c0c1


Documentation/Sources

Building on Scott’s idea – (retired links)
Original post https://sc.scomurr.com/scom-2012-monitor-reset-cleaning-up-the-environment/
TechNet gallery download https://gallery.technet.microsoft.com/SCOM-2012-Batch-reset-63a17534
Alternate link https://gallery.technet.microsoft.com/scriptcenter/Auto-reset-script-for-d8b775ca