SCOMCore Addendum – having a strong core makes bigger gains
The updated SCOMCore addendum pack now contains DWDataRP integration and additional overrides since the last pack posted in 2023. Many updates have been made since the last push to GitHub. GitHub link: https://github.com/theKevinJustin/SCOMCoreAddendum
More updates for your monitoring pleasure with OS addendum updates!
OS Addendum updates
Been busy in the monitoring ‘bat’ cave crafting up new ways to simplify things, automating recoveries, top process finds, STIG compliance, automatic services logic, and PowerShell transcription checks.
Updated NOSC Daily Tasks with more insights, whether NOC/NOSC, or SCOM Admin related, check out the GitHub for the pack and change/revision history.
Keep your head up! I find this is always a positive message to look up, not down. Leverage new key insights and download the pack from my GitHub repo – Proactive NOSC Daily Tasks link
Updated NOSC Daily Tasks Summary
The latest round simplifies SCOM agent workflow errors by adding the offending computer to the SCOMAdmin DailySummary alert details.
Offending alert examples from multiple customers
MSSQL on Windows: SQL Server has failed to allocate sufficient memory to run
Alert generation was temporarily suspended due to too many alerts (event ID 5399)
The November pack updates add a TicketID field to the SCOMAdmin, Daily Summary, Logical Disk, and Alert Updates reports. This is invaluable when integrating IT service management (ITSM) events, alerts, and incidents into your monitoring. Lastly, visibility into created incidents is key when tracking business issues (see the AlertUpdates workflows).
Details
NOSC Management pack provides summary report alerts of key insights including: Expiring certificates, Logical Disk alerts, Pending reboots, System Admin summary, and SCOM admin reports including long-running scripts, script errors, SCOM errors, and alert updates report.
v1.0.5.7 13 Jan 2025 Updated SCOMAlerts report details with format-table properties from select
v1.0.5.6 15 Nov 2024 SCOMAdmin and Daily Summary, Logical disk report changes
v1.0.5.4 12 Nov 2024 AlertUpdates report and various logging changes
v1.0.5.3 5 Nov 2024 Enabled AlertUpdate rules
v1.0.5.2 30 Oct 2024 Daily Summary and SCOMAlerts report updates
v1.0.5.1 17 Oct 2024 Added Operations Manager Event ID's 22402, 22406
v1.0.5.0 4 Jan 2024 Resolution State logic improvements for large environments
v1.0.4.9 21 Dec 2023 WhiteSpace, newline, return updates, Expiring Certs report moved back 1 hour
v1.0.4.8 20 Dec 2023 Updated all Get-SCOMAlert queries to use -ResolutionState (0..254) for performance increase over where-object
v1.0.4.7 18 Dec 2023 Updated Expiring Certs DS/WA, whitespace code check
v1.0.4.6 30 Nov 2023 Removed debug detail from DS/WA which showed in Health Explorer pane
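The v1.0.4.8 change swaps a client-side Where-Object filter for the -ResolutionState parameter of Get-SCOMAlert. A minimal sketch of the difference, assuming the OperationsManager module is loaded on a management server; the helper name Get-OpenScomAlert is hypothetical and not taken from the pack:

```powershell
# Resolution states 0..254 cover every state except Closed (255).
$OpenStates = 0..254

function Get-OpenScomAlert {
    # Hypothetical helper: lets the cmdlet filter on resolution state
    # instead of returning every alert and filtering afterwards.
    Get-SCOMAlert -ResolutionState $OpenStates
}

# Pattern the change replaced: all alerts come back, then get filtered client-side.
# Get-SCOMAlert | Where-Object { $_.ResolutionState -ne 255 }
```

In large environments the closed-alert backlog is usually the bulk of what Get-SCOMAlert returns, which is why skipping it up front helps.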
Seriously, dream on! End the STIGma is a good thing, but STIGs can be a burden. Hit the easy button, if you’re not already using it. Contact your SQL Data and AI Cloud Solutions Architect for the latest SQL STIG Monitor 2024 Q4 build!
Latest SQL STIG monitor 31 Oct 2024 release includes
DISA UPDATES – see link
MS SQL Server 2016 Instance STIG, V3R2:
(NOTE: DISA has been contacted to remove related CCI STIGID for AzureSQLDB that was overlooked: ASQL-00-010700)
POWERSHELL MODULE
Updated version to 1.23
Added STIGID parameter to Invoke-StigMonitor allowing granular control over STIGID scanning.
DATABASE CHANGES
Updated Checklist Templates for Q4 Revisions.
Updated Instance & Database STIG for Q4 benchmark date.
Script updates include:
CNTNMIXDB: Not A Finding if using Windows Auth
FORCENRYPT: NA if using Windows Auth
PWDCMPLX: Updated Finding to remove OS STIG reference
AZDBPERMISS: Revised script with new version.
DBPERMISS: Revised script with new version.
ENFCACCSS: Revised script with new version.
PSERRPERM: Revised script with new version.
UNQSVCACC: Removed code stripping out port number.
AZAUDITSTATE: Properly returns No Finding when audit setup is correct.
Fixed bug in vDocumentation view causing POAMs to not display custom comment in exported documentation.
Added usp_RemoveInstance stored procedure to easily clean up a specific Instance from StigMonitor that no longer exists.
DOCUMENTS
Updated checklist templates, Approvals scripts, and Documentation Templates for Q4 Revisions.
Removed Set-CEIPRegKeys.ps1, Set-FIPSCompliance.ps1, and Set-SqlRegKey.ps1 in favor of Module commands.
Updated InfoPage with new StigMonitor logo and text references.
Documentation updated with new examples of Invoke-StigMonitor STIGID parameter.
Updated documentation to add Azure DB Permission for MS_SecurityDefinitionReader.
Added DatabaseName to CSV Export of Export-StigDocumentation.
REPORTS
Updated Report banner to display new StigMonitor logo and latest report versions.
Removed ad hoc scanning from the Policy Management Report in favor of the Invoke-StigMonitor parameter.
Removed references to Sunset 2012 and 2014 STIGs.
Added AzureSQLMI for future use.
Combined NF and Approved in Total Findings summary
Reduced Recent Scans to latest 6.
Also please send us your feedback if you get a chance to check this out.
First, the latest revision includes an EventID 2502 monitor for failed scavenging. Second, the monitor uses count logic (set up to alert on 2 events in 30 minutes). Third, an EventID 2501 rule details scavenging totals. Lastly, a weekly report summarizes the scavenging alerts (cliff notes!).
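The count logic can be illustrated outside the management pack. This is only a sketch of the "2 events in 30 minutes" idea, not the pack's monitor implementation; the function name Test-ScavengeFailureThreshold and the 'DNS Server' log name in the usage comment are assumptions to verify in your environment:

```powershell
function Test-ScavengeFailureThreshold {
    # Hypothetical helper: returns $true when at least $Threshold event
    # timestamps fall inside the $WindowMinutes before $Now.
    param(
        [datetime[]]$EventTime = @(),
        [datetime]$Now = (Get-Date),
        [int]$WindowMinutes = 30,
        [int]$Threshold = 2
    )
    $windowStart = $Now.AddMinutes(-$WindowMinutes)
    $recent = @($EventTime | Where-Object { $_ -ge $windowStart -and $_ -le $Now })
    return ($recent.Count -ge $Threshold)
}

# Usage on a DNS server (log name is an assumption):
# $events = Get-WinEvent -FilterHashtable @{ LogName = 'DNS Server'; Id = 2502 } -ErrorAction SilentlyContinue
# Test-ScavengeFailureThreshold -EventTime $events.TimeCreated
```

Requiring two events inside the window filters out a single transient failure while still catching a repeating one.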
Some quick ‘how-to’ setup DNS scavenging
Example of RegKey showing that Scavenging is setup – note Scavenging Interval key
Example of AD integrated DNS setup with 21 day scavenging interval, and prompts to configure (click OK twice)
DNS Scavenging setup on AD integrated DNS server
Import management pack, and run DNS scavenging.
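For those who prefer PowerShell over the DNS console, the check-and-run steps might look like the sketch below. It assumes the DnsServer module is available on the AD integrated DNS server; the wrapper name Invoke-DnsScavengeCheck is hypothetical:

```powershell
function Invoke-DnsScavengeCheck {
    # Hypothetical wrapper; run on (or remote to) the DNS server.
    param([switch]$Run)

    # Shows scavenging state, refresh/no-refresh intervals,
    # the scavenging interval, and the last scavenge time.
    Get-DnsServerScavenging

    if ($Run) {
        # Starts a scavenge now; -Force skips the confirmation prompt.
        # The pack's EventID 2501/2502 workflows pick up the result.
        Start-DnsServerScavenging -Force
    }
}

# Usage:
# Invoke-DnsScavengeCheck          # view settings only
# Invoke-DnsScavengeCheck -Run     # view settings, then scavenge
```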
Verify scavenging alerts
SCOM Monitoring Tab > Active Alerts > ‘Look for:’ scavenging
Example output
Additional SCOM PowerShell commands
Run PowerShell commands from the SCOM management server (MS)
Time to integrate your monitoring tools with your ITSM tool. First, this blog post documents ‘ServiceNow Event integration’. Second, in my experience the common acronym for ServiceNow is SNOW/SNow. Third, some background: ServiceNow has been around for some time as an Information Technology Service Management (ITSM) and discovery tool. As a SaaS solution, companies can purchase a subscription and integrate tools via REST API to create/update/close events or incidents.
First, let’s discuss SCOM notification methods. SCOM 2022 adds a new capability with Teams integration. Second, most people are familiar with notifications over email (HTML or plain text) and perhaps SMS, but less so with the command channel, which calls a script in a shell, PowerShell, etc. In general, the command channel is a post-processing script capability for executing notifications. Third, example tools where the command channel might be used include BMC BEM (BMC Event Manager), BMC Remedy, xMatters, and DerDack, as well as SNOW integration within SCOM using notification channels. Lastly, SaaS solutions (vendors like xMatters and ServiceNow) accept crafted REST API requests to take actions.
SNOW prerequisites
1) ServiceNow User/Password (or API key)
2) SNOW RESTAPI PowerShell needs to securely access credentials
For the Incident PowerShell, we store Credentials within Windows Credential Manager
3) Network connectivity to the SaaS provider (use PowerShell Test-NetConnection from the SCOM MS to test connectivity over whatever port(s) the vendor requires).
4) ServiceNow CallerID GUID
5) Production and Test URL’s (also required for network connectivity tests)
6) Access to SNOW UI to verify required fields and values for the script parameters.
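To show how the prerequisites fit together, here is a minimal sketch of a connectivity test followed by an incident create against the ServiceNow Table API. The function names, the port 443 choice, and the example instance name are assumptions; the required fields for your instance should come from the SNOW UI, and the credential would be retrieved from wherever you store it securely:

```powershell
function New-SnowIncidentBody {
    # Hypothetical helper: builds the JSON payload for a new incident.
    param(
        [Parameter(Mandatory)][string]$CallerId,          # ServiceNow CallerID GUID
        [Parameter(Mandatory)][string]$ShortDescription,
        [string]$Description = ''
    )
    @{
        caller_id         = $CallerId
        short_description = $ShortDescription
        description       = $Description
    } | ConvertTo-Json
}

function Send-SnowIncident {
    # Hypothetical helper: checks connectivity, then posts the incident.
    param(
        [Parameter(Mandatory)][string]$Instance,          # e.g. 'example.service-now.com'
        [Parameter(Mandatory)][pscredential]$Credential,
        [Parameter(Mandatory)][string]$Body
    )
    # Port 443 is an assumption; use whatever port(s) the vendor requires.
    if (-not (Test-NetConnection -ComputerName $Instance -Port 443 -InformationLevel Quiet)) {
        throw "Cannot reach $Instance on port 443"
    }
    Invoke-RestMethod -Method Post -Uri "https://$Instance/api/now/table/incident" `
        -Credential $Credential -ContentType 'application/json' -Body $Body
}

# Usage (test URL first, then production):
# $body = New-SnowIncidentBody -CallerId '<CallerID GUID>' -ShortDescription 'SCOM alert: Logical disk free space is low'
# Send-SnowIncident -Instance 'example.service-now.com' -Credential $cred -Body $body
```

Keeping the payload builder separate from the send step makes it easy to inspect the JSON before anything reaches the SaaS provider.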
Do your SCOM users need to know if a server is in scheduled maintenance? This came about as Aris asked questions.
First, let’s discuss the specific maintenance mode and maintenance schedule scenarios users might ask about. Second, determining IF scheduled maintenance is enabled, running, or about to run. Third, how does another user know when scheduled maintenance ends, so they can decide whether to add or extend server maintenance? Fourth, scheduled maintenance entered by one user is NOT automatically seen by other roles. While product guidance states ‘maintenance schedules be added by someone in SCOM admin group’, self-service users still need visibility. Lastly, can we figure out a way to answer these questions? Given these points, users need to be able to see server maintenance details. Also, can the solution adhere to the best practice of ‘no alerts during planned maintenance’?
Second, another change across the repos was a ‘whitespace audit’ for encoded characters, or ‘data concealment’. See the AT&T Cybersecurity link.
Third, after whitespace we focused on script/workflow efficiencies seen in large enterprise environments. While efforts began in December, the workflow efficiencies sprint resulted in two sets of improvements.
Fast and Efficient
1) Added ‘Reset Monitors Script base code’ $Age variable
What does this mean?
Simply put $Age allows admins to define monitor age before resetting.
The default is 1 (day), but can be specified in the script to tailor to requirements.
Example
$Age = [DateTime](Get-Date).AddDays(-1)
2) Beyond incorporating $Age into the reset monitor logic, the packs utilize logic for a much faster runtime (~90%+).
What does this mean?
Updated logic quickly gathers unhealthy monitor objects by leveraging ‘Get-SCOMManagementPack’ and then ‘Get-SCOMClass’, before passing to ‘Get-SCOMClassInstance’.
Example PowerShell
## Grab the MP, get the Monitors and Rules from the MP, then grab all alerts found inside the Monitors/Rules
$SCOMCoreMP = Get-SCOMManagementPack -DisplayName "Microsoft Windows Server DNS Monitoring"
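The management pack, class, and class instance chain described above can be sketched as follows. This is an illustration under assumptions rather than the pack's actual script: the function names are hypothetical, and treating both Error and Warning as ‘unhealthy’ is my choice for the example:

```powershell
function Select-UnhealthyInstance {
    # Hypothetical helper: keeps objects in a Warning or Error health state.
    param([object[]]$Instance = @())
    $Instance | Where-Object { $_.HealthState -eq 'Error' -or $_.HealthState -eq 'Warning' }
}

function Get-UnhealthyInstanceByPack {
    # Hypothetical helper; assumes the OperationsManager module on a management server.
    param([Parameter(Mandatory)][string]$PackDisplayName)

    $pack = Get-SCOMManagementPack -DisplayName $PackDisplayName
    if (-not $pack) { throw "Management pack '$PackDisplayName' not found" }

    # Only classes defined in this pack are queried, not every class in the group.
    foreach ($class in (Get-SCOMClass -ManagementPack $pack)) {
        $unhealthy = @(Select-UnhealthyInstance -Instance @(Get-SCOMClassInstance -Class $class))
        Write-Host "Found $($unhealthy.Count) unhealthy monitors for class $($class.Name)"
        $unhealthy
    }
}

# Usage:
# Get-UnhealthyInstanceByPack -PackDisplayName 'Microsoft Windows Server DNS Monitoring'
```

Scoping the query to one pack's classes is what keeps the instance lookup small.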
Use the Tangible SCOM management pack to monitor logins and ProV application registration issues. First, the management pack configures Seed class discovery. Second, the pack includes rules/monitors for Tangible ProV software. Third, it adds rules and monitors for the 2802 ‘Could not validate product key’ and 4402 ‘Could not validate the contents of user logon request context: AS-REQ contains an invalid or unknown username type’ events. Fourth, the service monitor uses Kevin Holman’s fragment library for service recovery scripts/rules. Fifth, scheduled and on-demand daily reports support audit and record keeping. Lastly, alert cleanup logic reduces admin burden and overhead.
Reference the Tangible vendor’s website – Tangible ProV application website
NOTE: This may not apply for everyone, as the ProV application ‘Auto-provisions Active Directory user accounts for visitors or new employees whenever they want to work from one of your PCs.’
The daily report piece of the pack makes it easier to answer the ‘what happened in the last 24-72 hours’ question. It gathers open/closed insights and organizes alerts.
Screenshot of the daily report
Zero Alert example of daily report
Report example of insights (in text)
Open ProV alerts = 13
Since last report run:
#-----------------------
Total ProV alerts = 23
Auto-closed monitors = 22
Auto-closed rules = 0
My thanks to Aris Somatis for his deep dive reviewing the packs with me, particularly new use cases. The PowerShell below builds on Scott Murr’s initial TechNet published logic from years back. Consequently, the reset logic provides a ‘manual intervention required’ alerting/monitoring system.
Improving SCOM monitor reset logic
Calling the reset method has been a game changer for my customers – including operators, system and application owners!
Background
Scott’s reset logic, from SCOM 2012, helped administrators reset unhealthy monitors whose alerts may have been closed. Because Scott leveraged the ResetMonitoringState method, the community gained a way to keep health state true. Additionally, many administrators and engineers built custom management packs to provide solutions. Second, the addendum packs blog brought in more options: best practices, lessons from the field (and customers), and health-model-accurate alerting for what was really broken in the environment. Third, the packs address ‘gaps’ or ‘blind spots’ left by product teams. As a result of NEW monitoring, the packs may include: rules/monitors, datasource/writeAction (DS/WA) workflows, recovery tasks and automation, count logic monitors, overrides, discoveries, and groups. Together these take monitoring to the next level, at very little or NO cost compared to competitors!
PowerShell code
Aris’s Age use case takes this even further. Using monitor age allows further analysis, dialing ‘monitor reset’ down to objects that are X days old. By comparison, the addendums default to a 24-72 hour window, so Age provides a second option. A third option is to rely on SCOM’s built-in cleanup, but that typically takes 14-30 days. Overall, flexibility is a good thing.
# Specify age variable for your environment
$Age = [DateTime](Get-Date).AddDays(-7)
PowerShell code snippet
First, the reset logic can pivot on the age requirement. Then, adjust the Age variable per requirements. Third, figure out which method applies to gather a unique list of classes, whether by partial string(s), or by management pack name(s).
Set age variable (how long ‘OLD’ monitors might be stale and need reset)
# Example sets $Age variable to 7 days ago (-7)
$Age = [DateTime](Get-Date).AddDays(-7)
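One way the $Age pivot could feed the reset is sketched below. The function names are hypothetical, the StateLastModified comparison is my assumption about how age is measured, and the objects passed in are expected to be unhealthy class instances from Get-SCOMClassInstance:

```powershell
function Select-StaleInstance {
    # Hypothetical helper: keeps objects whose state last changed before $Age.
    param(
        [object[]]$Instance = @(),
        [Parameter(Mandatory)][datetime]$Age
    )
    # Objects with no timestamp are skipped rather than treated as old.
    $Instance | Where-Object { $_.StateLastModified -and $_.StateLastModified -lt $Age }
}

function Reset-StaleMonitor {
    # Hypothetical helper: resets health on objects older than $Age.
    param(
        [object[]]$Instance = @(),
        [Parameter(Mandatory)][datetime]$Age
    )
    foreach ($object in (Select-StaleInstance -Instance $Instance -Age $Age)) {
        Write-Host "Resetting Health State on '$($object.FullName)'"
        $object.ResetMonitoringState()
    }
}

# Usage:
# $Age = [DateTime](Get-Date).AddDays(-7)
# Reset-StaleMonitor -Instance $unhealthyObjects -Age $Age
```

If your management group reports StateLastModified in UTC, converting $Age with .ToUniversalTime() before the comparison keeps the cutoff accurate.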
Unpack two different ways to gather classes for monitors to reset
# When common string name exists in all classes
Example DFS/FileServices packs all have one of the three strings:
0
Found 0 unhealthy monitors for class Microsoft.SystemCenter.HealthServicesGroup
1
Found 1 unhealthy monitors for class Microsoft.SystemCenter.HealthServiceWatcher
Resetting Health State on ' + Microsoft.SystemCenter.HealthServiceWatcher:Microsoft.SystemCenter.AgentWatchersGroup;5e04f804-8b71-6eb6-0101-dcbb58022498 + '
Guid
----
0218d239-3d37-f9b1-75d2-6d52c2c7c0c1
Alternate link https://gallery.technet.microsoft.com/scriptcenter/Auto-reset-script-for-d8b775ca