Using PLA.dll to create alerts

In previous posts I’ve covered how to collect perf counters and ETW trace logs using PLA.dll. Today I will discuss the “A” of PLA — performance counter alerts. What are alerts good for? Well, quoting from MSDN, “You can create a custom Data Collector Set containing performance counters and configure alert activities based on the performance counters exceeding or dropping below limits you define.”

I had to add a few new types to the sample to support creating alerts. (As always, you can go to the PlaSample project on GitHub to get the code.)

First, we need the notion of an alert threshold based on a performance counter value, specifically whether a named counter has risen above or fallen below a target value. That is the job of CounterThreshold and ThresholdCondition. (CounterName was already conveniently present from the previous code to create perf counter logs.)

public enum ThresholdCondition
{
    Above = 0,
    Below = 1
}

public class CounterThreshold
{
    public CounterThreshold()
    {
    }

    public CounterName Name { get; set; }

    public ThresholdCondition Condition { get; set; }

    public double Value { get; set; }

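    public override string ToString()
    {
        string condition;
        switch (this.Condition)
        {
            case ThresholdCondition.Above:
                condition = ">";
                break;
            case ThresholdCondition.Below:
                condition = "<";
                break;
            default:
                // An undefined enum value would otherwise yield a threshold with no operator.
                throw new InvalidOperationException("Unknown threshold condition: " + this.Condition);
        }

        // Format the value with the invariant culture so the text does not vary with the current culture.
        return this.Name + condition + this.Value.ToString(System.Globalization.CultureInfo.InvariantCulture);
    }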
}

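PLA requires the threshold text in a very specific format: the counter path, then > or <, then the limit value, with nothing in between (for example, \Process(notepad)\% Processor Time<5). The ToString() method above produces that text, relying on CounterName.ToString() to supply the counter path.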
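To make the format concrete, here is a minimal, self-contained sketch that builds the same kind of threshold string from a plain counter path. ThresholdText and its Format method are hypothetical names used only for illustration; they are not part of PlaSample or PLA. The sketch assumes the counter path looks like the one shown in the event text later in this post.

```csharp
using System;
using System.Globalization;

// Hypothetical helper (not part of PlaSample): builds a threshold string
// of the form <counter path><operator><value>.
public static class ThresholdText
{
    public static string Format(string counterPath, bool above, double value)
    {
        if (string.IsNullOrEmpty(counterPath))
        {
            throw new ArgumentException("Counter path is required.", "counterPath");
        }

        // ">" means the counter has risen above the limit, "<" that it has fallen below it.
        string op = above ? ">" : "<";

        // Invariant culture keeps the decimal separator independent of the machine's locale.
        return counterPath + op + value.ToString(CultureInfo.InvariantCulture);
    }
}
```

For example, ThresholdText.Format(@"\Process(notepad)\% Processor Time", false, 5.0) returns \Process(notepad)\% Processor Time<5.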

Next is the CounterAlertInfo class, which defines a few properties for the alert in question:

public class CounterAlertInfo
{
    public CounterAlertInfo(string name)
    {
        this.Name = name;
        this.Thresholds = new List<CounterThreshold>();
    }

    public string Name { get; private set; }

    public IList<CounterThreshold> Thresholds { get; private set; }

    public TimeSpan? SampleInterval { get; set; }
}

Note that an alert fires for every threshold in the list that is reached. The main reason to group several thresholds under a single alert collector is so that you can execute the same action for all of them. In this sample, I simply set the alert to write an event log entry. Here is the Create method of CounterAlertInfo, which sets it all up:

public ICollectorSet Create()
{
    // Data collector set is the core abstraction for collecting diagnostic data.
    DataCollectorSet dcs = new DataCollectorSet();

    // Create a data collector for a perf counter alert.
    IAlertDataCollector dc = (IAlertDataCollector)dcs.DataCollectors.CreateDataCollector(DataCollectorType.plaAlert);
    dc.name = this.Name + "_DC";
    dcs.DataCollectors.Add(dc);

    // Set sample interval, if present.
    if (this.SampleInterval.HasValue)
    {
        dc.SampleInterval = (uint)this.SampleInterval.Value.TotalSeconds;
    }

    // Set collector to create an event log entry when threshold is reached.
    dc.EventLog = true;

    // Build up the list of alert thresholds.
    string[] alertThresholds = new string[this.Thresholds.Count];
    for (int i = 0; i < this.Thresholds.Count; ++i)
    {
        alertThresholds[i] = this.Thresholds[i].ToString();
    }

    dc.AlertThresholds = alertThresholds;

    // Now actually create (or modify existing) the set.
    dcs.Commit(this.Name, null, CommitMode.plaCreateOrModify);

    // Return an opaque wrapper with which the user can control the session.
    return new CollectorSetWrapper(dcs);
}
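One detail in Create is worth a second look: the cast (uint)this.SampleInterval.Value.TotalSeconds truncates, so a sub-second TimeSpan becomes 0. This post does not cover how PLA treats a zero interval, so the following is only a sketch of one way to be defensive about it: it rounds up to whole seconds and rejects non-positive values. SampleIntervals and ToWholeSeconds are hypothetical names, not part of PlaSample or PLA.

```csharp
using System;

// Hypothetical helper (not part of PlaSample): converts a TimeSpan to the
// whole number of seconds that a uint sample interval can hold.
public static class SampleIntervals
{
    public static uint ToWholeSeconds(TimeSpan interval)
    {
        if (interval <= TimeSpan.Zero)
        {
            throw new ArgumentOutOfRangeException(nameof(interval), "Sample interval must be positive.");
        }

        // Round up so that, for example, 500 ms becomes 1 second instead of 0.
        double seconds = Math.Ceiling(interval.TotalSeconds);
        if (seconds > uint.MaxValue)
        {
            throw new ArgumentOutOfRangeException(nameof(interval), "Sample interval is too large.");
        }

        return (uint)seconds;
    }
}
```

With this helper, the assignment in Create could read dc.SampleInterval = SampleIntervals.ToWholeSeconds(this.SampleInterval.Value). Rounding up rather than down is a design choice of this sketch, not something PLA mandates.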

Finally, some application code to create an alert that fires whenever the CPU utilization of Notepad, sampled every two seconds, falls below five percent:

CounterAlertInfo info = new CounterAlertInfo("MyAlert");

info.SampleInterval = TimeSpan.FromSeconds(2.0d);

CounterName counterName = new CounterName() { Category = "Process", Counter = "% Processor Time", Instance = "notepad" };
info.Thresholds.Add(new CounterThreshold() { Name = counterName, Condition = ThresholdCondition.Below, Value = 5.0d });

ICollectorSet collector = info.Create();
collector.Start();

Thread.Sleep(5000);

collector.Stop();

collector.Delete();

When the alert fires, it writes event 2031, containing the details, to the Microsoft-Windows-Diagnosis-PLA/Operational log. You can see the information in Event Viewer under Applications and Services Logs > Microsoft > Windows > Diagnosis-PLA > Operational. The above alert would produce event details similar to the following:

Performance counter \Process(notepad)\% Processor Time has tripped its alert threshold. The counter value of 0.000000 is under the limit value of 5.000000. 5.000000 is the alert threshold value.
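If you would rather check for the event from code than from Event Viewer, something along these lines should work. It is a rough sketch, not part of PlaSample: AlertEventReader and its members are hypothetical names, and the channel name Microsoft-Windows-Diagnosis-PLA/Operational is an assumption based on where the log appears in Event Viewer. It uses the standard EventLogReader API from System.Diagnostics.Eventing.Reader.

```csharp
using System;
using System.Diagnostics.Eventing.Reader;

// Hypothetical helper (not part of PlaSample): prints PLA alert events.
public static class AlertEventReader
{
    // Assumption: this is the channel that holds the alert events.
    public const string LogName = "Microsoft-Windows-Diagnosis-PLA/Operational";

    // Builds an XPath filter that selects events with the given ID.
    public static string BuildQuery(int eventId)
    {
        return "*[System[(EventID=" + eventId + ")]]";
    }

    // Windows only; reading the log may require suitable permissions.
    public static void PrintAlertEvents()
    {
        EventLogQuery query = new EventLogQuery(LogName, PathType.LogName, BuildQuery(2031));
        using (EventLogReader reader = new EventLogReader(query))
        {
            EventRecord record;
            while ((record = reader.ReadEvent()) != null)
            {
                using (record)
                {
                    // FormatDescription renders the event message text.
                    Console.WriteLine("{0}: {1}", record.TimeCreated, record.FormatDescription());
                }
            }
        }
    }
}
```

On .NET Framework these types live in System.Core; on newer .NET versions they come from the System.Diagnostics.EventLog package.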

In a later post, I will discuss some practical uses of performance counter alerts in automated testing.
