So I spun up a new System Center Operations Manager 2016 environment. I have been building SCOM systems for some time, going back to 2007. I like the product, and as far as I am concerned there is nothing else for a primarily Windows environment that will give you that level of visibility, monitoring and reporting.
The environment is 3 servers running Windows Server 2016 Datacenter with the GUI: 1 SQL Server 2016 server and 2 SCOM 2016 servers.
That said, SCOM is a beast and it took me quite some time early on to get up to speed. It can do so much that it is easy to get overwhelmed.
The other problem I ran into was the alerts. SCOM is very chatty out of the box, and if you don’t do some tuning before you start sending alerts to your Admins\Engineers they will hate the product, and you, for waking them up for no reason. I will write another post concerning tuning; I have a few rules I follow before I turn alerting on in a new SCOM environment.
My latest problem was with the console. The SCOM console has always been a bit slow for me compared to some other solutions like Nagios. Those few seconds of lag always drove me a little nuts, especially when I was trying to drill down to a specific monitor.
When I first deployed 2016 it was painfully slow, so slow I found it unusable. I found a few posts online that suggested modifying the max degree of parallelism on the SQL server hosting the SCOM database, but that actually made it worse, so I undid that change.
After finding nothing in all my searches I decided to dig deeper into what I figured was the culprit, the SQL server. SCOM is just an application front end for data stored in a SQL database. All of the monitors\alerts\reports are data stored in SQL Server. When you import a management pack you are just importing data into SQL.
Troubleshooting the SQL server:
I tried giving it more memory, more CPU, even more disk space. I moved it to faster storage. Nothing helped.
I then deleted all of the VMs and started over from scratch, thinking the problem existed between the keyboard and chair.
I had the same issues as before, so I knew something was wrong with my configuration.
After some more digging I found that the SCOM management servers were timing out when they tried to access the SQL database. I directed all of my attention to troubleshooting the SQL server configuration, since the hardware was performing well.
As it turns out, my SQL server only had 4 tempdb data files. I added an additional 4 tempdb files and went back to the SCOM console. It was night and day: the SCOM console now responds immediately and is just as fast as Nagios.
- To verify this is your issue, go to the SQL server hosting the SCOM operational database and Data Warehouse.
- Open SQL Server Management Studio
- Expand Databases
- Expand System Databases
- Right-click tempdb and click Properties
- Click Files
- The number of tempdb data files should match the number of CPUs you have, up to a maximum of 8.
- If you have 8 CPUs and fewer than 8 tempdb files, you likely have this issue and should click Add
- Name the new file the next increment of temp1-8 and set the same initial size and autogrowth as the ones that are already there.
- Reboot your SQL server just for good measure, then reboot your SCOM servers to clear out any backlog you may have
- Now go test out your console access
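
If you would rather script the steps above than click through the GUI, here is a rough Python sketch of the same rule of thumb. It only works out how many tempdb files you should have (one per CPU, capped at 8) and prints the T-SQL to add the missing ones; it does not connect to SQL Server. The function names, the `T:\TempDB` folder, the `tempdb_mssql_N.ndf` file names and the 8 MB / 64 MB sizes are all placeholders I made up for illustration, so swap in whatever your existing tempdb files actually use.

```python
MAX_TEMPDB_FILES = 8  # cap from the rule of thumb above

# Query you can run in Management Studio to count the current tempdb data files.
COUNT_DATA_FILES_SQL = (
    "SELECT COUNT(*) FROM tempdb.sys.database_files WHERE type_desc = 'ROWS';"
)


def recommended_tempdb_files(cpu_count: int) -> int:
    """One tempdb data file per CPU, capped at MAX_TEMPDB_FILES."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return min(cpu_count, MAX_TEMPDB_FILES)


def add_file_statements(current_files, cpu_count, data_dir, size_mb, growth_mb):
    """Build ALTER DATABASE statements for the tempdb files that are missing.

    File names follow the 'next increment' idea (temp5, temp6, ...).
    data_dir, size_mb and growth_mb are placeholders: match them to the
    files that already exist on your server.
    """
    target = recommended_tempdb_files(cpu_count)
    statements = []
    for n in range(current_files + 1, target + 1):
        logical_name = f"temp{n}"
        physical_path = f"{data_dir}\\tempdb_mssql_{n}.ndf"  # hypothetical path
        statements.append(
            "ALTER DATABASE tempdb ADD FILE ("
            f"NAME = N'{logical_name}', "
            f"FILENAME = N'{physical_path}', "
            f"SIZE = {size_mb}MB, "
            f"FILEGROWTH = {growth_mb}MB);"
        )
    return statements


if __name__ == "__main__":
    # My situation: 8 CPUs but only 4 tempdb data files.
    print(COUNT_DATA_FILES_SQL)
    for stmt in add_file_statements(4, 8, "T:\\TempDB", 8, 64):
        print(stmt)
```

Review the generated statements before running them, and keep every tempdb data file the same size with the same autogrowth setting, same as you would when adding them through the Files page.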