It’s that time to figure out the ConfigMgr SMS role alerts – If you are monitoring your SCCM/MECM environment, then you get role failure alerts. Many times, the Operations Helpdesk, NOSC, NOC, SOC, etc. will get alerts when various roles fail on the Configuration Manager platform. The common ask is why, what do you see, etc. Much like Exchange, ConfigMgr internalizes the checks that are seen in the console as registry keys or events documenting said degraded component/feature. Helping the MECM administrator understand the failure is key to decoding how to notify administrator, and when the helpdesk needs to act on ‘ConfigMgr SMS role alerts’.
Example – MECM/SCCM looks at replication probe action state $Config/RoleName$
The role check is based on a variable of the RoleName in a registry key that the application updates.
This is the origin of ConfigMgr SMS role alerts
HKLM:SOFTWARE\Microsoft\SMS\Operations Management\SMS Server Role\$Config/RoleName$\Availability State
1 is critical state
2,3,4 are warning states
If more details are needed, download SCCM/MECM Management Pack for SCOM here
Use Tyson’s SCOM Helper pack to unseal, and inspect XML.
Once you know the origin of the ConfigMgr SMS role alerts, you can begin tuning the MECM alerts to your environment. Understanding role alerts will help both teams understand MECM application health. First, use MECM application health to trend alerts/outages. Second, leverage maintenance mode schedules, or MM scripts to NOT monitor for common administration tasks. From my experience, the alerts are the result of MECM Admins maintaining the application – common actions like building application/package lists, cleanup actions, site maintenance, backups, etc. Lastly, set up a subscription to notify after the tuning discussion. See my blog on building a subscription for more details.