This adventure came about from troubleshooting a problem with coworkers through Slack. While we had hubot commands available to look at CPU and Memory usage for a given point in time, we would go to vSphere’s UI to screen grab Memory usage over time. It quickly became apparent that we needed a command to generate these charts as we needed them in chat. So, our use case was:
Using historical VMware/vSphere data, generate a graph of CPU or Memory usage over time and upload it to Slack using a hubot command.
This is actually way harder than it sounds for one reason: software that generates charts & graphs is pretty complicated. So, breaking the use case down into smaller sub-problems, this is what it looks like:
- Build a Hubot (nodejs) command that will kick off the process and call into PowerShell to execute the operations necessary.
- Retrieve historical CPU/Memory data about a machine from our VMware/vSphere infrastructure using PowerCLI.
- Generate a graph/chart from the data using any graphing technology that will actually run (a) outside of a browser and (b) in our environment. This is much more difficult than it sounds. (A Future Post)
- Send the graph back into Slack so that it can be displayed in the channel where it’s needed. (Yet Another Future Post)
Build a Hubot Command
We currently run a hubot (nodejs) instance to handle ChatOps commands through Slack. Even though hubot is built on top of nodejs, we are a Microsoft shop and most of our scripting knowledge is in PowerShell. So, long ago, when we implemented hubot, we used PoshHubot to bridge the technology gap and allow the nodejs platform to call our PowerShell modules.
If we had to do it over again, with the technology that’s available today, we would probably use PoshBot instead.
Anyways, we’ve been doing this for a long time, so this part of things wasn’t too new or difficult.
Retrieve Historical CPU/Memory Data
Within the debugging procedures that started all of this, we were pulling in graphs using screen grabs of vSphere. So, the best place to get the data from would be vSphere, and VMware does a great job of making that information available through PowerCLI.
To do this, we used the Get-Stat command with either the cpu.usage.average or mem.usage.average stat being retrieved.
Quick note: I’m having a difficult time getting Connect-VIServer to work with a PSCredential object. I’m not sure what the issue is, but for now authentication to the server works because Connect-VIServer allows you to store credentials on a machine and reuse them. That was pretty nice of them.
The Get-Stat data is somewhat wonky at times. In the “1h” timeframe I’m using an IntervalSecs of 20 simply because that’s the only interval allowed. The data points are always 20 seconds apart, landing at 00, 20, and 40 seconds past the minute. If you use a Start and Finish range of over a week along with an IntervalSecs value, you could wait a really long time to get your data back; but you will get back all the data.
Because of how much data comes back when you specify IntervalSecs, once the time range gets longer than a few hours it’s best to let the Get-Stat command figure out the appropriate interval itself. That’s why on the “1d”, “1w”, “1m”, and “1y” timeframes I just use the Start and Finish parameters without an interval.
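To make the timeframe rule concrete, here is a minimal sketch of the selection logic, written in Python rather than the PowerShell the command itself uses. The function name, the dictionary keys, and the 30-day month and 365-day year lengths are all assumptions for illustration; only the 20 second interval on “1h” and the omitted interval on the longer timeframes come from the description above.

```python
from datetime import datetime, timedelta

# Hypothetical lookback windows for each timeframe label.
# The 30-day month and 365-day year are assumptions, not from the post.
LOOKBACK = {
    "1h": timedelta(hours=1),
    "1d": timedelta(days=1),
    "1w": timedelta(weeks=1),
    "1m": timedelta(days=30),
    "1y": timedelta(days=365),
}


def build_stat_query(timeframe, now=None):
    """Return the Start/Finish (and optional IntervalSecs) values for a query."""
    if timeframe not in LOOKBACK:
        raise ValueError(f"unknown timeframe: {timeframe!r}")
    finish = now or datetime.now()
    query = {"Start": finish - LOOKBACK[timeframe], "Finish": finish}
    # Only the 1h window pins the interval to 20 seconds; longer windows
    # leave it out so Get-Stat can choose an appropriate interval itself.
    if timeframe == "1h":
        query["IntervalSecs"] = 20
    return query
```

Calling build_stat_query("1w") would return only Start and Finish, which mirrors passing just those two parameters to Get-Stat.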
Both the cpu.usage.average and mem.usage.average stats return a percentage value. This is fine for the CPU data, because we normally think about CPU usage in percentages. But for memory, we normally think of usage in GBs. So, there’s a quick section which converts the Memory usage percentage into the actual number of GBs used.
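The conversion itself is simple arithmetic: multiply the machine’s total memory by the usage percentage. Here is a rough sketch in Python (not the actual PowerShell from the command). The function name and the 16 GB total are made up for illustration; in PowerCLI the total would most likely come from the VM object’s configured memory.

```python
def percent_to_gb(usage_percent, total_memory_gb):
    """Convert a memory usage percentage into gigabytes used."""
    if not 0 <= usage_percent <= 100:
        raise ValueError("usage_percent must be between 0 and 100")
    return round(total_memory_gb * usage_percent / 100, 2)


# Hypothetical samples for a VM assumed to have 16 GB of memory.
samples = [25.0, 50.0, 62.5]
used_gb = [percent_to_gb(p, 16) for p in samples]
```

For a 16 GB machine, a 25% reading works out to 4 GB used.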
Next time, I’ll dig into New-ChartJsUtilizationGraph.