Updated NOSC Daily Tasks

Keep your head up

Updated NOSC Daily Tasks with more insights, whether NOC/NOSC, or SCOM Admin related, check out the GitHub for the pack and change/revision history.

Keep your head up!   I find this is always a positive message to look up, not down. Leverage new key insights and download the pack from my GitHub repo – Proactive NOSC Daily Tasks link

 

Updated NOSC Daily Tasks Summary

Latest round adds simplification of SCOM agent workflow errors, adding the offending computer with the SCOMAdmin DailySummary alert details.

Offending alert examples from multiple customers

MSSQL on Windows: SQL Server has failed to allocate sufficient memory to run

Alert generation was temporarily suspended due to too many alerts (event ID 5399)

PowerShell was dropped

Expression Filter Module Failed Initialization (group regex errors)

The November pack updates add TicketID field to the SCOMAdmin, Daily Summary, Logical Disk report, and Alert updates reports.  This is invaluable when integrating service management (ITSM) system events/alerts/incidents into your monitoring.  Lastly, visibility into created incidents is key to business issues (see the AlertUpdates workflows).

Details

NOSC Management pack provides summary report alerts of key insights including: Expiring certificates, Logical Disk alerts, Pending reboots, System Admin summary, and SCOM admin reports including long-running scripts, script errors, SCOM errors, and alert updates report.

Blog https://kevinjustin.com/blog/2023/08/15/proactive-daily-reports/

Change History

v1.0.5.7  13 Jan 2025 Updated SCOMAlerts report details with format-table properties from select
v1.0.5.6  15 Nov 2024 SCOMAdmin and Daily Summary, Logical disk report changes
v1.0.5.4  12 Nov 2024 AlertUpdates report and various logging changes
v1.0.5.3   5 Nov 2024 Enabled AlertUpdate rules
v1.0.5.2  30 Oct 2024 Daily Summary and SCOMAlerts report updates
v1.0.5.1  17 Oct 2024 Added Operations Manager Event ID's 22402, 22406
v1.0.5.0   4 Jan 2024 Resolution State logic improvements for large environments
v1.0.4.9  21 Dec 2023 WhiteSpace, newline, return updates, Expiring Certs report moved back 1 hour
v1.0.4.8  20 Dec 2023 Updated all Get-SCOMAlert queries to use -ResolutionState (0..254) for performance increase over where-object
v1.0.4.7  18 Dec 2023 Updated Expiring Certs DS/WA, whitespace code check
v1.0.4.6  30 Nov 2023 Removed debug detail from DS/WA which showed in Health Explorer pane

Proactive Daily Reports

Proactive Analyst Reports as a new way to ingest key insights from SCOM
Proactive Analyst Reports as a new way to ingest key insights from SCOM

As a SME or team lead, ever need to know a key insight for the enclave?  Let’s talk about the ‘Proactive Daily Reports’ pack.  This provides you some built-in reports on what transpired in an enclave.  Building again on the Health pillar, we can simplify what owners need to see.  Creating a PowerShell script was a simpler alternative to a complex SSRS report that often broke due to patching, and not following best practices.  The pack shows a simpler way to bring key insights to owners for Pending Reboots, Expiring PKI certificates, Logical Disk alerts, System Admin summary, and SCOM admin reports including long-running scripts, script errors, SCOM errors, and alert updates report.

 

Quick Download: https://github.com/theKevinJustin/ProactiveNOSCDailyTasks

Testing the Proactive Daily reports

Let’s start with some example reports – examples for expiring certificates, Logical Disk, Pending Reboot, System Admin summary, and SCOM admin reports including long-running scripts, script errors, SCOM errors, and alert updates report.

 

Expiring Certs –

About to expire certificates

Expiring PKI certificates reports
Expiring PKI certificates reports

 

Logical disk alerts –

Shows Server, drive, and % full data

Logical disk alerts report, showing zero for the past 72 hours (over a weekend)
Logical disk alerts report, showing zero for the past 72 hours (over a weekend)

 

Pending Reboots

Alerts of servers pending restart, not patched, not rebooted

Pending reboot report lists servers pending restart, not patched, not rebooted alerts
Pending reboot report lists servers pending restart, not patched, not rebooted alerts

 

System Admin summary

This is really a consolidation of multiple insights:

Server performance issues
Open ITSM/Remedy tickets
Unhealthy Agents
Pending Reboot, Not Rebooted, Not patched
Disabled/Unhealthy/MaintenanceMode, Repeatedly down Agents
Logical Disk free space alerts
Expiring certificates
AD DC (ADDS) critical alerts
DNS alerts
Group Policy issues

SysAdmin daily summary report example alert
SysAdmin daily summary report example alert

 

SCOM admin reports

Admin reports have a few separate alert reports, including long-running scripts, script errors, SCOM errors, and alert updates report.

SCOM Admin alerts report example of common SCOM problems
SCOM Admin alerts report example of common SCOM problems

 

Long running scripts

SCOM Admin long running scripts alerts report example of longrunning report workflows to help tune run-time
SCOM Admin long running scripts alerts report example of long-running report workflows to help tune run-time

 

ScriptErrors showing key SCOM connectivity issuesSCOM Admin script errors to help diagnose report script syntax errors

SCOM Admin script errors to help diagnose report script syntax errors

Useful links

Other blog posts for addendum management packs and capabilities –

https://kevinjustin.com/blog/2023/08/15/proactive-patching-alerts/
https://kevinjustin.com/blog/2023/08/14/top-process-powershell-script/
https://kevinjustin.com/blog/2023/08/15/proactive-daily-reports/

https://kevinjustin.com/blog/2023/08/08/create-closed-alerts-view/