Friday, 12 July 2024

Alert processing rules

 


Alert processing rules allow you to apply processing on fired alerts. Alert processing rules are different from alert rules. Alert rules generate new alerts that notify you when something happens, while alert processing rules modify the fired alerts as they're being fired to change the usual alert behavior.

You can use alert processing rules to add action groups or remove (suppress) action groups from your fired alerts. You can apply alert processing rules to different resource scopes, from a single resource to an entire subscription, as long as they are within the same subscription as the alert processing rule. You can also apply various filters or have the rule work on a predefined schedule.

Some common use cases for alert processing rules are described here.

Suppress notifications during planned maintenance

Many customers set up a planned maintenance time for their resources, either on a one-time basis or on a regular schedule. The planned maintenance might cover a single resource, like a virtual machine, or multiple resources, like all virtual machines in a resource group. So, you might want to stop receiving alert notifications for those resources during the maintenance window. In other cases, you might prefer to not receive alert notifications outside of your business hours. Alert processing rules allow you to achieve that.

You could suppress alert notifications by disabling the alert rules themselves at the beginning of the maintenance window, and reenable them after the maintenance is over. In that case, the alerts won't fire in the first place. That approach has several limitations:

  • This approach is only practical if the scope of the alert rule is exactly the scope of the resources under maintenance. For example, a single alert rule might cover multiple resources, but only a few of those resources are going through maintenance. So, if you disable the alert rule, you won't be alerted when the remaining resources covered by that rule run into issues.
  • You might have many alert rules that cover the resource. Updating all of them is time consuming and error prone.
  • You might have some alerts that aren't created by an alert rule at all, like alerts from Azure Backup.

In all these cases, an alert processing rule provides an easy way to suppress notifications.

Management at scale

Most customers tend to define a few action groups that are used repeatedly in their alert rules. For example, they might want to call a specific action group whenever any high-severity alert is fired. As the number of alert rules grows, manually making sure that each alert rule has the right set of action groups becomes harder.

Alert processing rules allow you to specify that logic in a single rule, instead of having to set it consistently in all your alert rules. They also cover alert types that aren't generated by an alert rule.

Add action groups to all alert types

Azure Monitor alert rules let you select which action groups will be triggered when their alerts are fired. However, not all Azure alert sources let you specify action groups. Some examples of such alerts include Azure Backup alerts, VM Insights guest health alerts, Azure Stack Edge, and Azure Stack Hub.

For those alert types, you can use alert processing rules to add action groups.

Scope and filters for alert processing rules

This section describes the scope and filters for alert processing rules.

Each alert processing rule has a scope. A scope is a list of one or more specific Azure resources, a specific resource group, or an entire subscription. The alert processing rule applies to alerts that fired on resources within that scope. You cannot create an alert processing rule on a resource from a different subscription.
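As a rough mental model of how a scope covers alerts, the check can be sketched as a prefix match on resource IDs. This is only an illustration, not how the service is implemented: the function name `in_scope` is hypothetical, and the case-insensitive comparison is an assumption based on Azure resource IDs generally being treated as case-insensitive.

```python
def in_scope(alert_resource_id: str, scopes: list[str]) -> bool:
    """Return True if the alert's target resource falls under any scope entry.

    A scope entry can be a subscription, a resource group, or a single
    resource ID. Illustrative model only.
    """
    target = alert_resource_id.lower().rstrip("/")
    for scope in scopes:
        prefix = scope.lower().rstrip("/")
        # Match the scope itself, or anything nested below it.
        # The "/" guard stops ".../SUB1" from matching ".../SUB10".
        if target == prefix or target.startswith(prefix + "/"):
            return True
    return False


vm = "/subscriptions/SUB1/resourceGroups/RG1/providers/Microsoft.Compute/virtualMachines/vm1"
print(in_scope(vm, ["/subscriptions/SUB1/resourceGroups/RG1"]))
print(in_scope(vm, ["/subscriptions/SUB1/resourceGroups/RG2"]))
```

The first call covers the virtual machine because it lives in the scoped resource group; the second does not.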

You can also define filters to narrow down which subset of alerts within the scope is affected. The available filters are as follows.

  • Alert context (payload): The rule applies only to alerts that contain any of the filter's strings within the alert context section of the alert. This section includes fields specific to each alert type.
  • Alert rule ID: The rule applies only to alerts from a specific alert rule. The value should be the full resource ID, for example, /subscriptions/SUB1/resourceGroups/RG1/providers/microsoft.insights/metricalerts/MY-API-LATENCY. To locate the alert rule ID, open a specific alert rule in the portal, select Properties, and copy the Resource ID value. You can also locate it by listing your alert rules from PowerShell or the Azure CLI.
  • Alert rule name: The rule applies only to alerts with this alert rule name. It can also be useful with a Contains operator.
  • Description: The rule applies only to alerts that contain the specified string within the alert rule description field.
  • Monitor condition: The rule applies only to alerts with the specified monitor condition, either Fired or Resolved.
  • Monitor service: The rule applies only to alerts from any of the specified monitoring services that are sending the signal. Different services are available depending on the type of signal. For example:
    - Platform: For metric signals, the monitor service is the metric namespace. 'Platform' means the metrics are provided by the resource provider, namely 'Azure'.
    - Azure.ApplicationInsights: Customer-reported metrics, sent by the Application Insights SDK.
    - Azure.VM.Windows.GuestMetrics: VM guest metrics, collected by an extension running on the VM. Can include built-in operating system perf counters, and custom perf counters.
    - <Custom namespace>: A custom metric namespace, containing custom metrics sent with the Azure Monitor Metrics API.
    - Log Analytics: The service that provides the 'Custom log search' and 'Log (saved query)' signals.
    - Activity Log - Administrative: The service that provides the 'Administrative' activity log events.
    - Activity Log - Policy: The service that provides the 'Policy' activity log events.
    - Activity Log - Autoscale: The service that provides the 'Autoscale' activity log events.
    - Activity Log - Security: The service that provides the 'Security' activity log events.
    - Resource health: The service that provides the resource-level health status.
    - Service health: The service that provides the subscription-level health status.
  • Resource: The rule applies only to alerts from the specified Azure resource. For example, you can use this filter with Does not equal to exclude one or more resources when the rule's scope is a subscription.
  • Resource group: The rule applies only to alerts from the specified resource groups. For example, you can use this filter with Does not equal to exclude one or more resource groups when the rule's scope is a subscription.
  • Resource type: The rule applies only to alerts on resources from the specified resource types, such as virtual machines. You can use Equals to match one or more specific resources. You can also use Contains to match a resource type and all its child resources. For example, use resource type contains "MICROSOFT.SQL/SERVERS" to match both SQL servers and all their child resources, like databases.
  • Severity: The rule applies only to alerts with the selected severities.

Alert processing rule filters

  • If you define multiple filters in a rule, all the filters apply. There's a logical AND between all filters.
    For example, if you set both resource type = "Virtual Machines" and severity = "Sev0", then the rule applies only for Sev0 alerts on virtual machines in the scope.
  • Each filter can include up to five values. There's a logical OR between the values.
    For example, if you set description contains "this, that" (there is no need to type the quotation marks in the field), then the rule applies only to alerts whose description contains either "this" or "that".
  • Make sure there are no unintended spaces before, after, or within the string to be matched. Extra spaces affect how the filter matches.
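The AND/OR semantics above can be sketched in a few lines. This is a simplified model, not the service's implementation: the `Filter` class, the dictionary-based alert, and the field names are all hypothetical, only the Equals and Contains operators are modeled, and exact (case-sensitive, whitespace-sensitive) string comparison is an assumption.

```python
from dataclasses import dataclass

MAX_VALUES_PER_FILTER = 5  # each filter can include up to five values


@dataclass
class Filter:
    field: str          # hypothetical alert field name, e.g. "severity"
    operator: str       # "Equals" or "Contains" in this sketch
    values: list[str]   # OR between these values


def filter_matches(f: Filter, alert: dict) -> bool:
    """A single filter matches if ANY of its values matches (logical OR)."""
    if len(f.values) > MAX_VALUES_PER_FILTER:
        raise ValueError("a filter can include at most five values")
    actual = str(alert.get(f.field, ""))
    if f.operator == "Equals":
        return any(actual == value for value in f.values)
    if f.operator == "Contains":
        # Values are compared as typed, so stray spaces change the result.
        return any(value in actual for value in f.values)
    raise ValueError(f"operator not modeled in this sketch: {f.operator}")


def rule_applies(filters: list[Filter], alert: dict) -> bool:
    """A rule applies only if ALL of its filters match (logical AND)."""
    return all(filter_matches(f, alert) for f in filters)


alert = {"resource_type": "Virtual Machines", "severity": "Sev0",
         "description": "check that disk"}
filters = [
    Filter("resource_type", "Equals", ["Virtual Machines"]),
    Filter("severity", "Equals", ["Sev0"]),
]
print(rule_applies(filters, alert))
```

With both filters set, a Sev3 alert on the same virtual machine would not be affected, because the severity filter fails and the filters are combined with AND.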

What should this rule do?

Choose one of the following actions:

  • Suppression: This action removes all the action groups from the affected fired alerts. So, the fired alerts won't invoke any of their action groups, not even at the end of the maintenance window. Those fired alerts will still be visible when you list your alerts in the portal, Azure Resource Graph, API, or PowerShell. The Suppression action takes priority over the Apply action groups action. If a single fired alert is affected by different alert processing rules of both types, the action groups of that alert will be suppressed.
  • Apply action groups: This action adds one or more action groups to the affected fired alerts.
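The precedence between the two actions can be illustrated with a small sketch. The function name and the dictionary shape used for rules are hypothetical and exist only for this example; the logic follows the description above, where any matching suppression rule wins over any number of Apply action groups rules.

```python
def resolve_action_groups(alert_action_groups, matching_rules):
    """Return the action groups a fired alert ends up invoking.

    matching_rules is a list of dicts for the rules that affect this alert:
      {"action": "suppress"}
      {"action": "apply", "action_groups": ["ag-name", ...]}
    """
    # Suppression takes priority: one suppressing rule removes everything.
    if any(rule["action"] == "suppress" for rule in matching_rules):
        return []

    # Otherwise start from the alert's own groups and add the applied ones.
    result = list(alert_action_groups)
    for rule in matching_rules:
        if rule["action"] == "apply":
            for group in rule["action_groups"]:
                if group not in result:
                    result.append(group)
    return result


rules = [
    {"action": "apply", "action_groups": ["ag-oncall"]},
    {"action": "suppress"},
]
print(resolve_action_groups(["ag-email"], rules))
```

Because one of the two matching rules is a suppression rule, the alert invokes no action groups at all, although it still fires and remains visible in the alerts list.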

When should this rule apply?

You can control when the rule will apply. The rule is always active, by default. You can select a one-time window for this rule to apply, or you can have a recurring window, such as a weekly recurrence.
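To make the window types concrete, here is a minimal sketch of how a one-time window and a recurring schedule could be evaluated. It is an illustration under stated assumptions, not the service's logic: all function names are hypothetical, `now` is assumed to be already expressed in the rule's time zone, and the 17:00 to 09:00 hours are example values.

```python
from datetime import datetime, time

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def in_one_time_window(now: datetime, start: datetime, end: datetime) -> bool:
    """One-time window: active between a fixed start and end."""
    return start <= now < end


def in_daily_window(now: datetime, start: time, end: time) -> bool:
    """Daily recurrence; the window may cross midnight (start > end)."""
    current = now.time()
    if start <= end:
        return start <= current < end
    return current >= start or current < end


def in_weekly_days(now: datetime, days: set[str]) -> bool:
    """Weekly recurrence covering full days, e.g. {"Saturday", "Sunday"}."""
    return DAY_NAMES[now.weekday()] in days


def outside_business_hours(now: datetime) -> bool:
    """Two recurrences combined: the rule is active if either one matches."""
    return (in_daily_window(now, time(17, 0), time(9, 0))
            or in_weekly_days(now, {"Saturday", "Sunday"}))


print(outside_business_hours(datetime(2024, 7, 12, 22, 0)))  # Friday evening
print(outside_business_hours(datetime(2024, 7, 12, 12, 0)))  # Friday noon
```

The daily recurrence handles the overnight span by treating a start time later than the end time as a window that wraps past midnight.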

Configure an alert processing rule

You can access alert processing rules by going to the Alerts home page in Azure Monitor. Then you can select Alert processing rules to see and manage your existing rules. You can also select Create > Alert processing rules to open the new alert processing rule wizard.

Screenshot that shows how to access alert processing rules from the Azure Monitor landing page.

Let's review the new alert processing rule wizard.

  1. On the Scope tab, you select which fired alerts are covered by this rule. Pick the scope of resources whose alerts will be covered. You can choose multiple resources and resource groups, or an entire subscription. You can also optionally add filters, as previously described.

    Screenshot that shows the Scope tab of the alert processing rules wizard.

  2. On the Rule settings tab, you select which action to apply on the affected alerts. Choose between Suppress notifications or Apply action group. If you choose Apply action group, you can select existing action groups by selecting Add action groups. You can also create a new action group.

    Screenshot that shows the Rule settings tab of the alert processing rules wizard.

  3. On the Scheduling tab, you select an optional schedule for the rule. By default, the rule works all the time, unless you disable it. You can set it to work On a specific time, or you can set up a Recurring schedule.

    Let's see an example of a schedule for a one-time, overnight, planned maintenance. It starts in the evening and continues until the next morning, in a specific time zone.

    Screenshot that shows the Scheduling tab of the alert processing rules wizard with a one-time rule.

    An example of a more complex schedule covers an "outside of business hours" case. It has a recurring schedule with two recurrences. One recurrence is daily from the afternoon until the morning. The other recurrence is weekly and covers full days for Saturday and Sunday.

    Screenshot that shows the Scheduling tab of the alert processing rules wizard with a recurring rule.

  4. On the Details tab, you give this rule a name, pick where it will be stored, and optionally add a description for your reference.

  5. On the Tags tab, you can optionally add tags to the rule.

  6. On the Review + create tab, you can review and create the alert processing rule.
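The choices made in the wizard map onto a resource definition that can also be created programmatically. The sketch below only builds and prints such a definition as JSON; it does not call Azure. The property names (`scopes`, `conditions`, `schedule`, `actions`, `recurrenceType`, `RemoveAllActionGroups`) are my best understanding of the Microsoft.AlertsManagement/actionRules resource type and should be verified against the current REST API or ARM template reference before use. The function name, the subscription ID, the time zone, and the hours are placeholders.

```python
import json


def build_suppression_rule(scopes, severities, time_zone, description=""):
    """Build an outside-of-business-hours suppression rule definition.

    Property names are assumed from the actionRules resource type;
    check them against the official reference before relying on them.
    """
    if len(severities) > 5:
        raise ValueError("each filter can include up to five values")
    return {
        "location": "Global",
        "properties": {
            # Scope tab: resources, resource groups, or a subscription.
            "scopes": list(scopes),
            # Scope tab filters: AND between conditions, OR between values.
            "conditions": [
                {"field": "Severity", "operator": "Equals",
                 "values": list(severities)},
            ],
            # Rule settings tab: suppress notifications.
            "actions": [{"actionType": "RemoveAllActionGroups"}],
            # Scheduling tab: a recurring schedule with two recurrences.
            "schedule": {
                "timeZone": time_zone,
                "recurrences": [
                    {"recurrenceType": "Daily",
                     "startTime": "17:00:00", "endTime": "09:00:00"},
                    {"recurrenceType": "Weekly",
                     "daysOfWeek": ["Saturday", "Sunday"]},
                ],
            },
            # Details tab.
            "description": description,
            "enabled": True,
        },
    }


rule = build_suppression_rule(
    scopes=["/subscriptions/SUB1"],          # placeholder subscription ID
    severities=["Sev3", "Sev4"],
    time_zone="Pacific Standard Time",       # placeholder time zone
    description="Suppress low-severity notifications outside business hours",
)
print(json.dumps(rule, indent=2))
```

Keeping the definition in code or a template makes it easier to review a rule and to recreate it consistently across subscriptions.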

Manage alert processing rules

You can view and manage your alert processing rules from the list view:

Screenshot that shows the list view of alert processing rules.

From here, you can enable, disable, or delete alert processing rules at scale by selecting the checkboxes next to them. Selecting an alert processing rule opens it for editing. You can enable or disable the rule on the Details tab.
