January addendum updates

Fast and Furious (sarcasm and humor)

January addendum updates for multiple management packs

First, the biggest change for large enterprise environments was a change in syntax for Get-SCOMAlert
Example
Get-SCOMAlert -ResolutionState (0..254) -Name "##stringhere##*"
Get-SCOMAlert -ResolutionState 255 -Name "##stringhere##*"
Second, the repos went through a ‘whitespace audit’ for encoded characters, also known as ‘data concealment’.  See the AT&T Cybersecurity link.
Third, after the whitespace audit we focused on script/workflow efficiencies seen in large enterprise environments.  While efforts began in December, the workflow efficiencies sprint resulted in two sets of improvements.
Fast and Efficient
1) Added an $Age variable to the ‘Reset Monitors Script’ base code
What does this mean?
Simply put, $Age lets admins define how old an unhealthy monitor must be before it is reset.
The default is 1 (day), but it can be changed in the script to fit requirements.
Example
$Age = [DateTime](Get-Date).AddDays(-1)
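To illustrate how an age cutoff like $Age might gate a reset, here is a minimal sketch. It is not the pack's actual script: the helper name Select-StaleMonitor, the sample objects, and the LastModified property name are assumptions made for illustration.

```powershell
# Hypothetical helper: keep only items whose last state change is older than the cutoff.
function Select-StaleMonitor {
    param(
        [Parameter(Mandatory)] [object[]] $Monitor,   # objects with a LastModified [DateTime] (assumed property name)
        [Parameter(Mandatory)] [DateTime] $Age        # cutoff, e.g. (Get-Date).AddDays(-1)
    )
    # "Older than the cutoff" means the item has been in its state longer than the allowed age
    $Monitor | Where-Object { $_.LastModified -lt $Age }
}

# Sample data standing in for unhealthy monitor states
$now = Get-Date
$sample = @(
    [PSCustomObject]@{ Name = 'DiskMonitor';    LastModified = $now.AddHours(-2) }
    [PSCustomObject]@{ Name = 'ServiceMonitor'; LastModified = $now.AddDays(-3) }
)

$Age   = [DateTime]$now.AddDays(-1)
$stale = @(Select-StaleMonitor -Monitor $sample -Age $Age)
$stale.Name
```

With a one day cutoff, only the item that changed three days ago qualifies for a reset. Changing AddDays(-1) to AddDays(-7) would leave both sample items alone.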
2) Beyond incorporating $Age into the reset monitor logic, the packs use updated logic for a much faster runtime (~90%+).
What does this mean?
The updated logic quickly gathers unhealthy monitor objects by leveraging ‘Get-SCOMManagementPack’ and then ‘Get-SCOMClass’, before passing the classes to ‘Get-SCOMClassInstance’.
Example PowerShell
## Grab the MP, get the Monitors and Rules from the MP, then grab all alerts found inside the Monitors/Rules
$SCOMCoreMP = Get-SCOMManagementPack -DisplayName "Microsoft Windows Server DNS Monitoring"
# Get classes (example uses the DNS pack)
$Monitoring = $SCOMCoreMP
# DNS pack naming
$DNSClasses = @(Get-SCOMClass -ManagementPack $Monitoring)
# Sort by class name and drop duplicates
$DNSClass = $DNSClasses | Sort-Object -Property Name -Unique
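The hand-off to Get-SCOMClassInstance described above might look like the following sketch. Select-UnhealthyInstance is a hypothetical helper, and treating HealthState values of Error and Warning as "unhealthy" is an assumption. The commented lines need the OperationsManager module and a management group connection, so stand-in objects are used to exercise the filter.

```powershell
# Hypothetical filter: keep instances whose HealthState is Error or Warning.
function Select-UnhealthyInstance {
    param([Parameter(Mandatory)] [object[]] $Instance)
    # Cast to string so the comparison works for enum values and plain strings alike
    $Instance | Where-Object { "$($_.HealthState)" -in 'Error', 'Warning' }
}

# On a SCOM management server (assumes the OperationsManager module is loaded):
#   $DNSClass  = Get-SCOMClass -ManagementPack $Monitoring | Sort-Object -Property Name -Unique
#   $Instances = $DNSClass | ForEach-Object { Get-SCOMClassInstance -Class $_ }
#   $Unhealthy = Select-UnhealthyInstance -Instance $Instances

# Stand-in objects so the filter can be tried without SCOM
$Instances = @(
    [PSCustomObject]@{ DisplayName = 'dns01'; HealthState = 'Success' }
    [PSCustomObject]@{ DisplayName = 'dns02'; HealthState = 'Error' }
    [PSCustomObject]@{ DisplayName = 'dns03'; HealthState = 'Warning' }
)
$Unhealthy = @(Select-UnhealthyInstance -Instance $Instances)
$Unhealthy.DisplayName
```

Scoping the query to the classes of one management pack, rather than walking every object in the management group, is the likely source of the runtime gain, though the pack's real script may differ.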

Repos updated in January

January addendum updates include:
ADCS, ADDS, DNS, DFS/File Services, IIS, SCCM pack for MECM/MEM/MCM monitoring, Operating Systems, Proactive NOSC Daily Tasks, and Tangible ProV application monitoring.

Links below to GitHub repositories (repos)

Hear ye hear ye

Hear ye hear ye - see the nice warm toasty updated packs

See the nice warm toasty updated packs

Fresh off the press, right to your door, just in time for that gift for your special someone!  Time for new updates to keep you ever-green’d, up to date, fixes, etc.  ;-P

 

Holman updated his SCOM.Management pack for SCOM2022 UR2

GitHub https://github.com/thekevinholman/SCOM.Management

 

Addendum packs updated

Multiple packs with multiple updates.  Removed debug detail from the DS/WA (Data Source/Write Action workflow) Health Explorer outputs, and simplified the management pack recovery tasks to a single WA script.

Active Directory Certificate Services (ADCS) version agnostic 2016+ addendum https://github.com/theKevinJustin/ADCS2016-Addendum 2012 here  See the blog post for capabilities here https://kevinjustin.com/blog/2023/08/18/adcs-addendum-packs/

Active Directory Domain Services addendum https://github.com/theKevinJustin/ADDSAddendumAgnostic

See the blog post for capabilities here https://kevinjustin.com/blog/2023/08/18/adds-addendum-pack/

Active Directory Federation Services addendum https://github.com/theKevinJustin/ADFSAddendum See the blog post for capabilities here https://kevinjustin.com/blog/2023/08/18/adfs-addendum-pack/

FileServices Agnostic addendum https://github.com/theKevinJustin/FileServicesAddendum See the blog post for capabilities here https://kevinjustin.com/blog/2023/08/31/file-services-addendum/

MCM/MEM/MECM/SCCM Configuration Manager addendum https://github.com/theKevinJustin/MECMSCCMAddendum See the blog post for capabilities here https://kevinjustin.com/blog/2023/08/30/mecm-sccm-addendum-pack/

PKI certificate monitoring addendum https://github.com/theKevinJustin/PKIAddendum See the blog post for capabilities here https://kevinjustin.com/blog/2023/08/24/pki-addendum-pack/

Proactive NOSC DailyTasks reports addendum https://github.com/theKevinJustin/ProactiveNOSCDailyTasks See the blog post for capabilities here https://kevinjustin.com/blog/2023/08/15/proactive-daily-reports/

SCOM Core addendum https://github.com/theKevinJustin/SCOMCoreAddendum See the blog post for capabilities here https://kevinjustin.com/blog/2023/08/30/scomcore-addendum-pack/

Top Process workflows tied to monitors in Tier1 https://github.com/theKevinJustin/TopProcessTier1 See the blog post for capabilities here https://kevinjustin.com/blog/2023/08/15/proactive-daily-reports/

Tier0 https://github.com/theKevinJustin/TopProcess

Windows Server 2012/2012R2 Operating System Addendum https://github.com/theKevinJustin/2012OSAddendum See the blog post for capabilities here https://kevinjustin.com/blog/2023/08/28/os-addendum-packs/

Windows Server 2016+ version agnostic Operating System Addendum https://github.com/theKevinJustin/2016ServerAgnostic See the blog post for capabilities here https://kevinjustin.com/blog/2023/08/28/os-addendum-packs/

 

Enjoy!

NiCE VMware addendum


‘NiCE VMware addendum’ enhances VMware monitoring, tuning alerts toward ‘manual intervention required’ alerting. The NiCE folks have been around for some time as a trusted Microsoft partner, creating additional monitoring functionality across Microsoft products.  Having completed a number of projects implementing the VMware pack, it’s time to share the configuration and alert report capabilities.

 

Quick Download HTTPS://GITHUB.COM/THEKEVINJUSTIN/NICEVMWAREADDENDUM/

Changes to NiCE VMware pack

Key breakdown of VMware ESX environment monitoring

NiCE VMware monitoring features for ESX, vSphere, vSAN environments

 

Adjustments to vendor pack to further the mantra ‘alert when manual intervention required’.

Set monitor alerts to multiple samples over an hour (i.e. compute and performance of ESX environment)

Reports by team (requires regular expression updates for environment servers owned by each team)

Monitor reset logic, and service monitorType (count logic for X failures over Y time, before alert)

Overrides to change vendor pack provided discoveries, rules, monitors

Remove alert noise for unmanaged objects in ESX environment

 

Customize pack for environment

Customize the ‘NiCE VMware addendum’ pack for specific environment. This means updating group discoveries, and GUIDs for group specific overrides.  Further updates are required to update server naming conventions for team virtualization reports.

Classes/groups created for pack

VMware classes included for additional customization.

Discoveries

Breakout of Discoveries that need pattern updates to match

Find/Replace ##ESXHostDataStoreNamingConventions## with names to exclude

Example of regular expressions for multiple customers
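Before pasting a pattern into the discovery, it can help to try it against a few names. The sketch below is only an illustration: the pattern and the sample names are made up, and the discovery in the pack may evaluate regular expressions with different case rules than PowerShell's -match.

```powershell
# Hypothetical naming conventions to exclude; each alternative is one convention
$ExcludePattern = '(?i)^(iso|template)-|-local$'

# Made-up object names standing in for what the discovery would see
$Names = 'ISO-Library01', 'prod-san-vol12', 'esx07-local', 'Template-Win2022'

# -match applies the regular expression to each name; matches would be excluded
$Excluded = @($Names | Where-Object { $_ -match $ExcludePattern })
$Kept     = @($Names | Where-Object { $_ -notmatch $ExcludePattern })

"Excluded: $($Excluded -join ', ')"
"Kept:     $($Kept -join ', ')"
```

Separating conventions with the | character keeps one pattern line per customer, which is easier to find/replace than several discoveries.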

VMware Group Seed Classes defined in the addendum.

 

Update disable guest machine alerts

Disable alerts for guest machines in the ESX environment.

Find ##ESXGuestServersDiskUsageNamingConventions##

Replace with relevant guest naming conventions

 

Example template/guest/virtual machine names typically disabled

Update discovery to disable alerts on object names of virtual machines in ESX environment.

 

Service MonitorType

Service MonitorType adds Samples and Intervals to alert after consecutive failures (x failures in y minutes, then alert).
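The count logic can be pictured with a small sketch. This is not the MonitorType XML from the addendum; Test-AlertCondition, its parameters, and the sample checks are hypothetical and only show the idea of requiring x consecutive failures within y minutes before alerting.

```powershell
# Hypothetical helper: alert only when the last $Samples checks all failed
# and all of them fall within the last $WithinMinutes minutes.
function Test-AlertCondition {
    param(
        [Parameter(Mandatory)] [object[]] $Check,   # objects with Time [DateTime] and Running [bool]
        [int] $Samples = 3,
        [int] $WithinMinutes = 15,
        [DateTime] $Now = (Get-Date)
    )
    $recent = @($Check | Sort-Object -Property Time | Select-Object -Last $Samples)
    if ($recent.Count -lt $Samples) { return $false }   # not enough history yet
    $windowStart = $Now.AddMinutes(-$WithinMinutes)
    foreach ($c in $recent) {
        # One good sample, or one sample outside the window, breaks the streak
        if ($c.Running -or $c.Time -lt $windowStart) { return $false }
    }
    return $true
}

$t = Get-Date
$flapping = @(
    [PSCustomObject]@{ Time = $t.AddMinutes(-10); Running = $false }
    [PSCustomObject]@{ Time = $t.AddMinutes(-5);  Running = $true }
    [PSCustomObject]@{ Time = $t;                 Running = $false }
)
$down = @(
    [PSCustomObject]@{ Time = $t.AddMinutes(-10); Running = $false }
    [PSCustomObject]@{ Time = $t.AddMinutes(-5);  Running = $false }
    [PSCustomObject]@{ Time = $t;                 Running = $false }
)

$flapAlert = Test-AlertCondition -Check $flapping -Now $t
$downAlert = Test-AlertCondition -Check $down -Now $t
```

A service that recovers between samples never raises an alert, while a service that stays down for all three samples does, which is the ‘manual intervention required’ behavior.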

VMware service MonitorType defined in the addendum.

Rules, Monitors, Recoveries

List of workflows used to troubleshoot/resolve problems

VMware addendum rules, VMTools monitor, and recovery components included.

 

 

Documentation

NiCE VMware management pack https://www.nice.de/nice-vmware-mp/

 

IIS addendum packs

‘IIS addendum packs’ to tune IIS from 2012 forward.  The GitHub repository has two packs: 2012 and 2016+ (version agnostic pack).  This includes an IIS enabled group, daily report and cleanup DataSource and WriteAction (tasks), as well as a regular expression to set up the IIS enabled group.  The IIS enabled group enables IIS monitoring only on the servers where IIS monitoring is needed.

 

 

Customize for environment

Update the addendums with the server naming conventions where IIS monitoring should be enabled.  Read below to better understand addendum functionality.

First, the addendums include a class/group, datasource and write action workflows for alert reports and automated alert closure, as well as an event count logic/reset monitorType.

 

Second, in the group discovery, find/replace the pattern with the application/web server naming conventions where IIS monitoring IS wanted.

Third, the version agnostic pack has overrides to disable most perf and rule alerts.  OFF packs can be provided to turn off performance counter collection rules, keeping both the OperationsManager and OperationsManagerDW databases cleaner, and therefore faster with less data.

IIS2012 overrides

Lastly, once the addendum is updated, save the file, copy it to a SCOM management server (MS), and import.

Enjoy the ‘IIS addendum packs’ for how few alerts, perhaps life changing?! (sarcasm)

 

 

Documentation

Download Addendum packs https://github.com/theKevinJustin/IISAddendums

IIS2012 SCOM Management pack download https://www.microsoft.com/en-us/download/details.aspx?id=34767

IIS2016+ SCOM management pack download https://www.microsoft.com/en-us/download/details.aspx?id=54445

SCOMCore Addendum pack

SCOMCore Addendum pack - having a strong core makes bigger gains

Time to configure the Microsoft System Center Core Monitoring pack per health model and best practice.  That’s where the SCOMCore Addendum pack comes in. The addendum adds a High Agent Handle count group, daily report and alert closure automation, and rule/monitor overrides.  Some assembly required – update the discovery pattern for servers with offending high handle counts, and update the high handle count group ContextInstance GUID after import.

 

Quick Download: https://github.com/theKevinJustin/SCOMCoreAddendum

 

 

Background:

High Agent Handle count was more of an issue before the x365 platform moved UC, SharePoint, and email (i.e. Lync/Skype, SharePoint, Exchange on prem) to the cloud.  It is still seen where cloud scalability options and virtualization/storage limitations exist.  A typical example is an over-utilized virtual machine in hybrid/IaaS/premise scenarios.  Kevin Holman caught this performance issue years back, creating a monitoring alerts pack and blog.  In case you’re on SCOM jeopardy, the LAW/OMS/Microsoft Monitoring Agent/SCOM agent has a built-in health check.  Per the SCOM PG, the built-in health check restarts the service when the Handle Count or memory of the HealthService (aka Microsoft Monitoring Agent service) runs too hot.  SCOM agent restarts cause config churn and high compute, as workflows re-run after the service restarts.

 

 

Assess agent restarts

Begin by verifying if you have Kevin Holman’s pack for SCOM agent restarts  downloaded and installed, which sets memory/handle count informational alerts https://github.com/thekevinholman/SCOM.AgentThresholds

Validate pack installed

Verify SCOM Agent Thresholds pack installed.

 

 

Configure addendum for environment

Download and Install ‘SCOMCore Addendum pack’ here

Open saved XML in notepad or Notepad++ (your favorite XML editor here!)

Update the regular expression pattern line for offending servers in the

Update the pattern for the high agent handle count group for any offenders.

 

Figure out the group GUID for the high agent handle count

From PowerShell on SCOM management server, run:

Get-SCOMClassInstance -DisplayName "Proactive High Agent Handle Count servers" | Format-List DisplayName,Id

 

Find/Replace GUID

PowerShell GUID check.
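The find/replace itself can be done in an editor, but a scripted version might look like the sketch below. The placeholder token, the GUID value, and the one-line sample file are all made up for illustration; the real addendum XML may carry a different token or an existing GUID to replace.

```powershell
# Hypothetical values: the placeholder token and GUID are illustrative only
$Placeholder = '##HighHandleCountGroupGUID##'
$GroupId     = '11111111-2222-3333-4444-555555555555'

# Temp file standing in for the saved addendum XML
$XmlPath = Join-Path ([System.IO.Path]::GetTempPath()) 'AddendumSample.xml'
Set-Content -Path $XmlPath -Value '<Override ContextInstance="##HighHandleCountGroupGUID##" />'

# Validate the GUID format before writing it into the pack
$parsed = [guid]::Empty
if (-not [guid]::TryParse($GroupId, [ref]$parsed)) { throw "Not a valid GUID: $GroupId" }

# -replace is regex based, so escape the placeholder to treat it literally
$updated = (Get-Content -Path $XmlPath -Raw) -replace [regex]::Escape($Placeholder), $parsed.ToString()
Set-Content -Path $XmlPath -Value $updated

$Result = Get-Content -Path $XmlPath -Raw
$Result
```

Validating the GUID first guards against pasting the DisplayName or a truncated ID into the ContextInstance attribute, which would fail at import.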

 

Save file and Import > enjoy fewer alerts!

 

 

Documentation:

Kevin Holman blog on SCOM agent restarts

Holman’s pack for SCOM agent restarts and setting memory/handle count alerts https://github.com/thekevinholman/SCOM.AgentThresholds

Addendum download https://github.com/theKevinJustin/SCOMCoreAddendum

MSSQL Addendum pack

 

Time to tune MSSQL alerts!

The ‘MSSQL Addendum pack’ wouldn’t be possible without Brandon Pires’ contributions.  Brandon dealt with my many questions to better alert!  If you need more background, check the ‘why addendum pack’ post.

Quick Download(s)

2012+ https://github.com/theKevinJustin/MSSQLAddendum

 

Capabilities

The pack is based on the SQL engineering blog, with the program team making multiple updates per year for SQL monitoring.  First, the addendum creates two groups for dev/test and notification/subscription modeling.  Second, the overrides (man, there are a bunch!) aid consumption of real issues.  Lastly, most environments should be SQL 2016+, as the 2012R2 EOL/EOSL is quickly approaching in October!

MSSQL groups defined in the Addendum pack

MSSQL group discoveries require updates to be applicable to environment

 

Tailor addendum

First, the MSSQL packs MUST be installed before the Addendum pack.  The addendum is based on the MSSQL 2016+ version agnostic pack, which is currently supported, as the 2012, 2012R2 products are near end of support.

Find/Replace the variables as needed:

Example    ##TESTSERVER##|##DEVSERVER##

Save file

 

Overrides

Addendum pack contains discovery, monitor, and rule overrides to tune MSSQL to CSA (formerly PFE/CE/CSAe, Microsoft field engineer) recommendations, matching the health model and reducing critical ‘wake me up in the middle of the night’ alerts.

Partial snapshot of MSSQL overrides in the pack

Import

Download pack, and save to your environment

Import into SCOM

Enjoy!

 

 

MSSQL Addendum references

MSSQL Engineering blog and old post here

SQL Releases TechCommunity here

Engineering team latest management pack, TechCommunity release v7.2.0.0

Import ‘gotcha’ importing new custom functionality blog

ADFS Addendum pack

Do you think of Star Trek when the word federation is used inside of federation services (ADFS)?

To begin, the ‘ADFS addendum pack’ needs acknowledgement of the contributors who dealt with my many questions to better alert on AD issues!  My thanks to Jason Windisch for his help and expertise with Active Directory Federation Services (ADFS).  If you need more background, check the ‘why addendum pack’ post.  BTW, what do you associate with the word – Federation?

Quick Download(s)

2016+ https://github.com/theKevinJustin/ADFSAddendum

 

Overview of capabilities

First, the Active Directory Federation Services ‘ADFS Addendum pack’ configures an ADFS group of related classes for notification/subscription modeling.  Second, the rules, service monitors, tasks, service recovery, alert cleanup, and summary reports aid consumption of real issues.  Third, if you have ADFS2012R2, I have an addendum pack, but coordination is necessary to get the ADFS management packs MSI (not currently available).  Lastly, most environments should be 2016+, as the EOL/EOSL is quickly approaching in October!

ADFS Addendum pack creates ADFS Group AND discovery requiring server names applicable to environment.

ADFS Group discovery requires server names applicable to environment

 

Tailoring the pack(s) to your environment

First, the Active Directory Federation Services management packs MUST be installed for the ‘ADFS Addendum pack’ to load.  2016+ agnostic is currently supported, as the 2012,2012R2 products are near end of support.

Find/Replace the variables as needed

##ADFSSERVERNAME1##|##ADFSSERVERNAME1##|##LAB##

Save file

 

Workflows

First, the DataSources (DS) and WriteActions (WA) clean up alerts and create daily reports, where the WA are the on-demand task versions.

Data source (DS) scheduled workflows run weekdays between 0600-0700, SCOM management server local time.  The summary and team reports (run during this time) summarize key insights.  NOTE: the Monday report gathers the last 72 hours, so administrators get a ‘what happened over the weekend’ view.  Tuesday-Friday reports are past 24 hours.  Lastly, the group policy report summarizes unique GPUpdate error output.
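The Monday versus Tuesday-Friday window can be sketched as a small function. Get-ReportLookbackHours is a hypothetical name, and returning 0 for weekends is an assumption based on the workflows running weekdays only; the pack's data source script may compute this differently.

```powershell
# Hypothetical helper: Monday looks back 72 hours, Tuesday-Friday 24 hours,
# and weekends return 0 because no report is scheduled (assumption).
function Get-ReportLookbackHours {
    param([DateTime] $Date = (Get-Date))
    switch ($Date.DayOfWeek) {
        'Monday'   { return 72 }
        'Saturday' { return 0 }
        'Sunday'   { return 0 }
        default    { return 24 }
    }
}

# 8 January 2024 was a Monday
$monday   = Get-Date -Year 2024 -Month 1 -Day 8  -Hour 6 -Minute 30 -Second 0
$tuesday  = Get-Date -Year 2024 -Month 1 -Day 9  -Hour 6 -Minute 30 -Second 0
$saturday = Get-Date -Year 2024 -Month 1 -Day 13 -Hour 6 -Minute 30 -Second 0

$mondayHours   = Get-ReportLookbackHours -Date $monday
$tuesdayHours  = Get-ReportLookbackHours -Date $tuesday
$saturdayHours = Get-ReportLookbackHours -Date $saturday

# Start of the reporting window for the Monday run
$windowStart = $monday.AddHours(-$mondayHours)
```

The window start can then be compared against alert timestamps to pick which alerts land in the report.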

 

Monitoring

ADFS Monitoring components screenshot from Notepad++

Addendum pack rules schedule data source execution and add on-demand tasks.  The service monitor and recovery tasks add service recovery automation to bring us to ‘manual intervention required’ alerting.  There are a few monitor/rule overrides to match the health model.

 

Import

Download updated ‘ADFS addendum pack’ and save to your environment

Import into SCOM

Enjoy!

 

Documentation

ADFS 2016+ management pack download

Why Addendum packs

IT Ninja required for improving monitoring hence 'Why addendum packs'

 

‘Why addendum packs’?  What value can they bring to my customer?  Kevin Holman started the Addendum thought process quite a while back: add functionality to a core application/program/product.  The first example of this pack naming convention is his SQL RunAs Addendum to simplify SQL monitoring.  Let’s break down a number of examples of how the SCOM community has built packs to improve monitoring, and how I believe the addendum packs bring IT Ninja lessons from Microsoft experts’ monitoring to your environment.

 

Why Addendum packs

Better monitoring from the experts, including customer examples for other ‘blind spots’ in monitoring.  Blind spots consist of ‘not monitored’ pieces of infrastructure, anything from a simple event, ping, service, TCP port check, process, web site, or scripted workflow, with the purpose of identifying a problem.

The goal of monitoring is to:

Identify, self-heal, automatically run recovery or diagnostic workflows, and alert when manual intervention is required.  Doesn’t matter what tool you use, they all do some portion of these steps.

 

The addendum packs do these things, adding a few differentiators.

Auto closure daily scripts (close rules/monitors)

Auto reports of problems (M-F 0600-0700 local, reflecting last 24-72 hours of open/closed alerts)

Employ count logic (x in y time)

Self-heal monitors with no new events

Adjust alert severities to health model

where critical (red) = outage, warning (yellow) = issue, informational = reports or FYIs

Capable of updating alerts (status, owner, ticketID+)

Tasks to run workflows on-demand

Recovery tasks – (i.e. service restart automation or TopProcess, Logical disk cleanup, MECM Client cache clean)

Integrate additional monitoring (like DFS replication queue script/alerts)

Synthetic checks for DNS and web applications

Web Availability and Transactional monitoring, ADFS, CRL, PowerShell Invoke-WebRequest, and more

Security and Compliance checks

 

Imagine I forgot something capability wise.

Stay tuned, as this builds into an even better outcome, quality data into ‘a single pane of glass’ of multiple tools within PowerBI.

Top Process PowerShell script

Task Manager output for 'Top Process PowerShell script management pack'

 

Ever wish you had task manager output when you had a monitor go unhealthy?  Following Kevin Holman’s lead to ‘Monitor Processes’, the idea landed to build out the ‘Top Process PowerShell script’.  This morphed into a management pack with Knowledge entries to better explain what is being done.  Integrating Top Process into Health Explorer output as a recovery task helped provide another step before alerting.  The idea started from the need to prove which security tool(s) were causing the over-utilized compute spikes that left server(s) non-responsive.  Thinking back to my UNIX days, we simply used top, vmstat, iostat, and other commands to identify problematic processes.  Integrating PowerShell scripts into SCOM is part of the fun, then linking the obfuscated security processes for the final output.  From there, extrapolate into Azure Functions or Azure Logic apps, for additional functionality for cloud native monitoring.

 

Quick Download: https://github.com/theKevinJustin/TopProcess

Tier1 separated monitoring (no AD) https://github.com/theKevinJustin/TopProcessTier1

Building out the ‘Top Process PowerShell script’

Kevin Holman built a ‘Monitor.Performance.ConsecSamples.ThenScript.TwoState.mpx’ fragment, beginning the logical journey.  His fragment helped me start with a working model, taking processes and cores into consideration for true CPU usage on multi-core servers.

Kevin Holman Monitor performance then script fragment for PowerShell get-counter syntax

 

We need to see the processes and their corresponding values, then build an output table (custom object).  After gathering the processes, feed the TopProcesses array, lastly sorting the array by CPUValue.
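As a rough sketch of that flow (not the script shipped in the pack), the function below builds custom objects from raw samples, divides by core count, and sorts by CPUValue. Get-TopProcess, its parameters, and the sample values are hypothetical; the InstanceName and CookedValue property names mirror what Get-Counter samples expose.

```powershell
# Hypothetical helper: turn raw per-process CPU samples into a sorted table.
function Get-TopProcess {
    param(
        [Parameter(Mandatory)] [object[]] $Sample,   # objects with InstanceName and CookedValue
        [int] $Cores = 1,
        [int] $Top = 5
    )
    $TopProcesses = @()
    foreach ($s in $Sample) {
        if ($s.InstanceName -in '_total', 'idle') { continue }   # skip aggregate rows
        $TopProcesses += [PSCustomObject]@{
            Process  = $s.InstanceName
            # The raw counter can exceed 100 on multi-core servers, so divide by core count
            CPUValue = [math]::Round($s.CookedValue / $Cores, 2)
        }
    }
    # Highest CPU first, trimmed to the requested number of rows
    $TopProcesses | Sort-Object -Property CPUValue -Descending | Select-Object -First $Top
}

# Made-up samples so the function can be tried anywhere
$Sample = @(
    [PSCustomObject]@{ InstanceName = '_total';         CookedValue = 400 }
    [PSCustomObject]@{ InstanceName = 'idle';           CookedValue = 250 }
    [PSCustomObject]@{ InstanceName = 'w3wp';           CookedValue = 40 }
    [PSCustomObject]@{ InstanceName = 'scanner';        CookedValue = 100 }
    [PSCustomObject]@{ InstanceName = 'monitoringhost'; CookedValue = 10 }
)
$Table = @(Get-TopProcess -Sample $Sample -Cores 4 -Top 2)
$Table | Format-Table -AutoSize
```

On a Windows server the samples could come from (Get-Counter '\Process(*)\% Processor Time').CounterSamples, with [Environment]::ProcessorCount as the core count, though the pack may gather them differently.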

Top Process memory usage snippet

Next, we’ll want to see what applications/tools might be involved, including Active Client, IIS, monitoring, and EndPoint Management tools (keep things honest!).

Added the Security Processes into the mix

Then we build an output of the data so we can take the datasource (DS) or WriteAction (WA) into a scripted monitor/rule, or recovery tasks linked to various monitors.  I even built a forked version in case of SAW/Red Forest, separating Tier0 monitoring from Tier1 (snippet below is NOT that pack).

snippet of manual tasks and recoveries that link to multiple monitors

 

Useful links

Kevin Holman MP fragments blog and GitHub Fragment library/repository

ConfigMgr SMS role alerts

Microsoft Endpoint Configuration Manager

It’s that time to figure out the ConfigMgr SMS role alerts – If you are monitoring your SCCM/MECM environment, then you get role failure alerts.  Many times, the Operations Helpdesk, NOSC, NOC, SOC, etc. will get alerts when various roles fail on the Configuration Manager platform.  The common ask is why, what do you see, etc.  Much like Exchange, ConfigMgr internalizes the checks that are seen in the console as registry keys or events documenting said degraded component/feature.  Helping the MECM administrator understand the failure is key to decoding how to notify the administrator, and when the helpdesk needs to act on ‘ConfigMgr SMS role alerts’.

 

Example – MECM/SCCM looks at replication probe action state $Config/RoleName$

Example MECM Service Monitor for role alerts

 

The role check is based on a variable of the RoleName in a registry key that the application updates.

 

MECM Monitor Config

 

This is the origin of ConfigMgr SMS role alerts

HKLM:SOFTWARE\Microsoft\SMS\Operations Management\SMS Server Role\$Config/RoleName$\Availability State

 

Decoder ring:

1 is critical state

2,3,4 are warning states
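The decoder ring can be written as a small lookup for scripts or reports. ConvertTo-RoleHealth is a hypothetical name; only the values 1 through 4 come from the list above, while treating 0 as healthy and anything else as unknown are assumptions.

```powershell
# Hypothetical helper: map the Availability State number to a health label.
function ConvertTo-RoleHealth {
    param([Parameter(Mandatory)] [int] $AvailabilityState)
    switch ($AvailabilityState) {
        1                  { 'Critical'; break }
        { $_ -in 2, 3, 4 } { 'Warning';  break }
        0                  { 'Healthy';  break }   # assumption: 0 is not documented above
        default            { 'Unknown' }           # assumption: anything else is unexpected
    }
}

$critical = ConvertTo-RoleHealth -AvailabilityState 1
$warning  = ConvertTo-RoleHealth -AvailabilityState 3
$healthy  = ConvertTo-RoleHealth -AvailabilityState 0
$unknown  = ConvertTo-RoleHealth -AvailabilityState 9
```

The number passed in would be whatever the registry location shown above holds for the role being checked.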

 

If more details are needed, download SCCM/MECM Management Pack for SCOM here

Use Tyson’s SCOM Helper pack to unseal, and inspect XML.

 

Once you know the origin of the ConfigMgr SMS role alerts, you can begin tuning the MECM alerts to your environment.  Understanding role alerts will help both teams understand MECM application health.  First, use MECM application health to trend alerts/outages.  Second, leverage maintenance mode schedules, or MM scripts to NOT monitor for common administration tasks.  From my experience, the alerts are the result of MECM Admins maintaining the application – common actions like building application/package lists, cleanup actions, site maintenance, backups, etc.  Lastly, set up a subscription to notify after the tuning discussion.  See my blog on building a subscription for more details.