Thursday, 11 July 2024

Azure Monitor overview

 

Azure Monitor is a comprehensive monitoring solution for collecting, analyzing, and responding to monitoring data from your cloud and on-premises environments. You can use Azure Monitor to maximize the availability and performance of your applications and services. It helps you understand how your applications are performing and allows you to manually and programmatically respond to system events.

Azure Monitor collects and aggregates the data from every layer and component of your system across multiple Azure and non-Azure subscriptions and tenants. It stores it in a common data platform for consumption by a common set of tools that can correlate, analyze, visualize, and/or respond to the data. You can also integrate other Microsoft and non-Microsoft tools.

Diagram that shows an abstracted view of what Azure Monitor does, as described in the previous paragraph.

The diagram above shows an abstracted view of the monitoring process. A more detailed breakdown of the Azure Monitor architecture is shown in the High level architecture section below.

High level architecture

Azure Monitor can monitor these types of resources in Azure, other clouds, or on-premises:

  • Applications
  • Virtual machines
  • Guest operating systems
  • Containers including Prometheus metrics
  • Databases
  • Security events in combination with Azure Sentinel
  • Networking events and health in combination with Network Watcher
  • Custom sources that use the APIs to get data into Azure Monitor

You can also export monitoring data from Azure Monitor into other systems so you can:

  • Integrate with other third-party and open-source monitoring and visualization tools
  • Integrate with ticketing and other ITSM systems

If you're a System Center Operations Manager (SCOM) user, Azure Monitor now includes Azure Monitor SCOM Managed Instance (SCOM MI). Operations Manager MI is a cloud-hosted version of Operations Manager and allows you to move your on-premises Operations Manager installation to Azure.

The following diagram shows a high-level architecture view of Azure Monitor.

Diagram that shows an overview of Azure Monitor with data sources on the left sending data to a central data platform and features of Azure Monitor on the right that use the collected data.

The diagram depicts the Azure Monitor system components:

  • Data sources are the types of resources being monitored.

  • The data is collected and routed to the data platform. The collection and routing options are described in detail later in this article.

  • The data platform stores the collected monitoring data. Azure Monitor's core data platform has stores for metrics, logs, traces, and changes. System Center Operations Manager MI uses its own database hosted in SQL Managed Instance.

  • The consumption section shows the components that use data from the data platform.

    • Azure Monitor's core consumption methods include tools to provide insights, visualize, and analyze data. The visualization tools build on the analysis tools, and the insights build on top of both the visualization and analysis tools.
    • There are additional mechanisms to help you respond to incoming monitoring data.
  • The SCOM MI path uses the traditional Operations Manager console that SCOM customers are already familiar with.

  • Interoperability options are shown in the integrate section. Not all services integrate at all levels. SCOM MI only integrates with Power BI.

Data sources

Azure Monitor can collect data from multiple sources.

The diagram below shows an expanded version of the data source types that Azure Monitor can gather monitoring data from.

Diagram that shows an overview of Azure Monitor data sources.

You can integrate monitoring data for applications, infrastructure, and custom data sources from outside Azure, including on-premises environments and non-Microsoft clouds.

Azure Monitor collects these types of data:

Data type - Description and subtypes

  • App/Workloads
    • App - Application performance, health, and activity data.
    • Workloads - IaaS workloads such as SQL Server, Oracle, or SAP running on a hosted virtual machine.
  • Infrastructure
    • Container - Data about containers, such as Azure Kubernetes Service, Prometheus, and the applications running inside containers.
    • Operating system - Data about the guest operating system on which your application is running.
  • Azure Platform
    • Azure resource - Data about the operation of an Azure resource from inside the resource, including changes. Resource logs are one example.
    • Azure subscription - The operation and management of an Azure subscription, and data about the health and operation of Azure itself. The activity log is one example.
    • Azure tenant - Data about the operation of tenant-level Azure services, such as Microsoft Entra ID.
  • Custom Sources - Data that gets into the system using the:
    • Azure Monitor REST API
    • Data Collection API

For detailed information about each of the data sources, see data sources.

SCOM MI (like on-premises SCOM) collects only IaaS Workload and Operating System sources.

Data collection and routing

Azure Monitor collects and routes monitoring data using a few different mechanisms depending on the data being routed and the destination. Much like a road system improved over the years, not all roads lead to all locations. Some are legacy, some new, and some are better to take than others given how Azure Monitor has evolved over time. For more information, see data sources.

Diagram that shows an overview of Azure Monitor data collection and routing.

Collection method - Description

  • Application instrumentation - Application Insights is enabled through either auto-instrumentation (agent) or by adding the Application Insights SDK to your application code. In addition, Application Insights is in the process of implementing OpenTelemetry. For more information, see How do I instrument an application?.
  • Agents - Agents can collect monitoring data from the guest operating system of Azure and hybrid virtual machines.
  • Data collection rules - Use data collection rules to specify what data should be collected, how to transform it, and where to send it.
  • Zero config - Data is automatically sent to a destination without user configuration. Platform metrics are the most common example.
  • Diagnostic settings - Use diagnostic settings to determine where to send resource log and activity log data on the data platform.
  • Azure Monitor REST API - The Logs Ingestion API in Azure Monitor lets you send data to a Log Analytics workspace in Azure Monitor Logs. You can also send metrics into the Azure Monitor Metrics store using the custom metrics API.
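
To make the Logs Ingestion API entry more concrete, the following is a minimal sketch of how a request to it could be assembled with the Python standard library. It only builds the request and never sends it. The endpoint host, the data collection rule ID, the stream name, and the token are hypothetical placeholders, and the `api-version` value is an assumption that should be checked against the current REST reference.

```python
import json
import urllib.request

# Assumption: an API version accepted by the Logs Ingestion API; verify before use.
API_VERSION = "2023-01-01"


def build_ingestion_request(endpoint, dcr_immutable_id, stream_name, records, bearer_token):
    """Build (but do not send) a POST request that uploads records to a stream."""
    url = (
        f"{endpoint.rstrip('/')}/dataCollectionRules/{dcr_immutable_id}"
        f"/streams/{stream_name}?api-version={API_VERSION}"
    )
    body = json.dumps(records).encode("utf-8")  # the body is a JSON array of records
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {bearer_token}",
            "Content-Type": "application/json",
        },
    )


# All values below are hypothetical placeholders.
request = build_ingestion_request(
    endpoint="https://example-dce.eastus-1.ingest.monitor.azure.com",
    dcr_immutable_id="dcr-00000000000000000000000000000000",
    stream_name="Custom-ExampleTable_CL",
    records=[{"TimeGenerated": "2024-07-11T00:00:00Z", "Message": "hello"}],
    bearer_token="<token>",
)
print(request.get_method(), request.full_url)
```

In practice the data collection rule named in the URL decides how the records are transformed and which table they land in, which is why the request itself carries only the raw records.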

A common way to route monitoring data to non-Microsoft tools is to use Azure Event Hubs. See more in the Integrate section below.

SCOM MI (like on-premises SCOM) uses an agent to collect data, which it sends to a management server running in a SCOM MI on Azure.

For detailed information about data collection, see data collection.

Data platform

Azure Monitor stores data in data stores for each of the three pillars of observability, plus an additional one:

  • metrics
  • logs
  • distributed traces
  • changes

Each store is optimized for specific types of data and monitoring scenarios.

Diagram that shows an overview of Azure Monitor data platform.

Pillar of observability/data store - Description

  • Azure Monitor Metrics - Metrics are numerical values that describe an aspect of a system at a particular point in time. Azure Monitor Metrics is a time-series database, optimized for analyzing time-stamped data. Azure Monitor collects metrics at regular intervals. Metrics are identified with a timestamp, a name, a value, and one or more defining labels. They can be aggregated using algorithms, compared to other metrics, and analyzed for trends over time. Azure Monitor Metrics supports native Azure Monitor metrics and Prometheus metrics.
  • Azure Monitor Logs - Logs are recorded system events. Logs can contain different types of data, be structured or free-form text, and they contain a timestamp. Azure Monitor stores structured and unstructured log data of all types in Azure Monitor Logs. You can route data to Log Analytics workspaces for querying and analysis.
  • Traces - Distributed tracing allows you to see the path of a request as it travels through different services and components. Azure Monitor gets distributed trace data from instrumented applications. The trace data is stored in a separate workspace in Azure Monitor Logs.
  • Changes - Changes are a series of events in your application and resources. They're tracked and stored when you use the Change Analysis service, which uses Azure Resource Graph as its store. Change Analysis helps you understand which changes, such as deploying updated code, may have caused issues in your systems.
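
As an illustration of the metric shape described above (a timestamp, a name, a value, and labels), here is a small sketch of how such samples could be aggregated over time. The `MetricSample` class, the `average_per_minute` function, and the sample values are hypothetical and are not part of any Azure SDK; the sketch only models the idea of a time-series aggregation.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class MetricSample:
    timestamp: datetime
    name: str
    value: float
    labels: tuple = ()  # (key, value) pairs, for example (("vm", "web-1"),)


def average_per_minute(samples, name):
    """Average the values of one named metric in one-minute buckets."""
    buckets = defaultdict(list)
    for sample in samples:
        if sample.name == name:
            minute = sample.timestamp.replace(second=0, microsecond=0)
            buckets[minute].append(sample.value)
    return {minute: sum(values) / len(values) for minute, values in sorted(buckets.items())}


def at(minute, second):
    """Helper that builds a UTC timestamp on a fixed, made-up day."""
    return datetime(2024, 7, 11, 12, minute, second, tzinfo=timezone.utc)


samples = [
    MetricSample(at(0, 5), "cpu_percent", 40.0, (("vm", "web-1"),)),
    MetricSample(at(0, 35), "cpu_percent", 60.0, (("vm", "web-1"),)),
    MetricSample(at(1, 5), "cpu_percent", 90.0, (("vm", "web-1"),)),
]
result = average_per_minute(samples, "cpu_percent")
for minute, average in result.items():
    print(minute.isoformat(), average)
```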

Distributed tracing is a technique used to trace requests as they travel through a distributed system. It allows you to see the path of a request as it travels through different services and components. It helps you to identify performance bottlenecks and troubleshoot issues in a distributed system.

For less expensive, long-term archival of monitoring data for auditing or compliance purposes, you can export to Azure Storage.

SCOM MI is similar to SCOM on-premises. It stores its information in an SQL Database, but uses SQL Managed Instance because it's in Azure.

Consumption

The following sections outline methods and services that consume monitoring data from the Azure Monitor data platform.

All areas in the consumption section of the diagram have a user interface that appears in the Azure portal.

The top part of the consumption section applies to Azure Monitor core only. SCOM MI uses the traditional Ops Console running in the cloud. It can also send monitoring data to Power BI for visualization.

The Azure portal

The Azure portal is a web-based, unified console that provides an alternative to command-line tools. With the Azure portal, you can manage your Azure subscription using a graphical user interface. You can build, manage, and monitor everything from simple web apps to complex cloud deployments in the portal. The Monitor section of the Azure portal provides a visual interface that gives you access to the data collected for Azure resources and an easy way to access the tools, insights, and visualizations in Azure Monitor.

Screenshot that shows the Monitor section of the Azure portal.

Insights

Some Azure resource providers have curated visualizations that provide a customized monitoring experience and require minimal configuration. Insights are large, scalable, curated visualizations.

Diagram that shows the Insights part of the Consumption section of the Azure Monitor system.

The following table describes some of the larger insights:

Insight - Description

  • Application Insights - Application Insights monitors the availability, performance, and usage of your web applications.
  • Container Insights - Container Insights gives you performance visibility into container workloads that are deployed to managed Kubernetes clusters hosted on Azure Kubernetes Service. Container Insights collects container logs and metrics from controllers, nodes, and containers that are available in Kubernetes through the Metrics API. After you enable monitoring from Kubernetes clusters, these metrics and logs are automatically collected for you through a containerized version of the Log Analytics agent for Linux.
  • VM Insights - VM Insights monitors your Azure VMs. It analyzes the performance and health of your Windows and Linux VMs and identifies their different processes and interconnected dependencies on external processes. The solution includes support for monitoring performance and application dependencies for VMs hosted on-premises or in another cloud provider.
  • Network Insights - Network Insights provides a comprehensive, visual representation, through topologies, of health and metrics for all deployed network resources, without requiring any configuration. It also provides access to network monitoring capabilities such as Connection Monitor, flow logging for network security groups (NSGs), Traffic Analytics, and other diagnostic features.

For more information, see the list of insights and curated visualizations in the Azure Monitor Insights overview.

Visualize

Diagram that shows the Visualize part of the Consumption section of the Azure Monitor system.

Visualizations such as charts and tables are effective tools for summarizing monitoring data and presenting it to different audiences. Azure Monitor has its own features for visualizing monitoring data and uses other Azure services for publishing it to different audiences. Power BI and Grafana are not officially part of the Azure Monitor product, but they're a core integration and part of the Azure Monitor story.

Visualization - Description

  • Dashboards - Azure dashboards allow you to combine different kinds of data into a single pane in the Azure portal. You can optionally share the dashboard with other Azure users. You can add the output of any log query or metrics chart to an Azure dashboard. For example, you could create a dashboard that combines tiles that show a graph of metrics, a table of activity logs, a usage chart from Application Insights, and the output of a log query.
  • Workbooks - Workbooks provide a flexible canvas for data analysis and the creation of rich visual reports in the Azure portal. You can use them to query data from multiple data sources. Workbooks can combine and correlate data from multiple data sets in one visualization, giving you an easy visual representation of your system. Workbooks are interactive and can be shared across teams with data updating in real time. Use workbooks provided with Insights, utilize the library of templates, or create your own.
  • Power BI - Power BI is a business analytics service that provides interactive visualizations across various data sources. It's an effective means of making data available to others within and outside your organization. You can configure Power BI to automatically import log data from Azure Monitor to take advantage of these visualizations.
  • Grafana - Grafana is an open platform that excels in operational dashboards. All versions of Grafana include the Azure Monitor data source plug-in to visualize your Azure Monitor metrics and logs. Azure Managed Grafana also optimizes this experience for Azure-native data stores such as Azure Monitor and Azure Data Explorer. In this way, you can easily connect to any resource in your subscription and view all resulting monitoring data in a familiar Grafana dashboard. It also supports pinning charts from Azure Monitor metrics and logs to Grafana dashboards.

Grafana has popular plug-ins and dashboard templates for non-Microsoft APM tools such as Dynatrace, New Relic, and AppDynamics as well. You can use these resources to visualize Azure platform data alongside other metrics from higher in the stack collected by these other tools. It also has AWS CloudWatch and GCP BigQuery plug-ins for multicloud monitoring in a single pane of glass.

Analyze

The Azure portal contains built-in tools that allow you to analyze monitoring data.

Diagram that shows the Analyze part of the Consumption section of the Azure Monitor system.

Tool - Description

  • Metrics explorer - Use the Azure Monitor metrics explorer user interface in the Azure portal to investigate the health and utilization of your resources. Metrics explorer helps you plot charts, visually correlate trends, and investigate spikes and dips in metric values. Metrics explorer contains features for applying dimensions and filtering, and for customizing charts. These features help you analyze exactly the data you need in a visually intuitive way.
  • Log Analytics - The Log Analytics user interface in the Azure portal helps you query the log data collected by Azure Monitor so that you can quickly retrieve, consolidate, and analyze collected data. After creating test queries, you can then directly analyze the data with Azure Monitor tools, or you can save the queries for use with visualizations or alert rules. Log Analytics workspaces are based on Azure Data Explorer, using a powerful analysis engine and the rich Kusto Query Language (KQL). Azure Monitor Logs uses a version of KQL suitable for simple log queries and for advanced functionality such as aggregations, joins, and smart analytics. You can get started with KQL quickly and easily. Note: The term "Log Analytics" is sometimes used to mean both the Azure Monitor Logs data platform store and the UI that accesses that store. Before 2019, the term "Log Analytics" did refer to both. It's still common to find content using that framing in various blogs and documentation on the internet.
  • Change Analysis - Change Analysis is a subscription-level Azure resource provider that checks resource changes in the subscription and provides data for diagnostic tools to help users understand what changes might have caused issues. The Change Analysis user interface in the Azure portal gives you insight into the cause of live site issues, outages, or component failures. Change Analysis uses the Azure Resource Graph to detect various types of changes, from the infrastructure layer through application deployment.

Respond

An effective monitoring solution proactively responds to critical events, without the need for an individual or team to notice the issue. The response could be a text or email to an administrator, or an automated process that attempts to correct an error condition.

Diagram that shows the Respond part of the Consumption section of the Azure Monitor system.

Artificial Intelligence for IT Operations (AIOps) can improve service quality and reliability by using machine learning to process and automatically act on data you collect from applications, services, and IT resources into Azure Monitor. It automates data-driven tasks, predicts capacity usage, identifies performance issues, and detects anomalies across applications, services, and IT resources. These features simplify IT monitoring and operations without requiring machine learning expertise.

Azure Monitor Alerts notify you of critical conditions and can take corrective action. Alert rules can be based on metric or log data.

  • Metric alert rules provide near-real-time alerts based on collected metrics.
  • Log search alert rules allow for complex logic across data from multiple sources.

Alert rules use action groups, which can perform actions such as sending email or SMS notifications. Action groups can send notifications using webhooks to trigger external processes or to integrate with your IT service management tools. Action groups, actions, and sets of recipients can be shared across multiple rules.
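
The relationship between alert rules and action groups can be sketched as follows. This is a simplified model for illustration only: the `ActionGroup` and `MetricAlertRule` classes, the rule names, and the thresholds are hypothetical, and the "average over a window exceeds a threshold" condition is just one assumed kind of metric condition.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ActionGroup:
    name: str
    actions: List[Callable[[str], None]]  # each action stands in for email, SMS, or a webhook

    def notify(self, message):
        for action in self.actions:
            action(message)


@dataclass
class MetricAlertRule:
    name: str
    threshold: float
    action_group: ActionGroup

    def evaluate(self, window_values):
        """Fire when the average of the evaluation window exceeds the threshold."""
        if not window_values:
            return False
        average = sum(window_values) / len(window_values)
        if average > self.threshold:
            self.action_group.notify(f"{self.name}: average {average:.1f} > {self.threshold}")
            return True
        return False


sent = []  # collects messages instead of really sending them
ops = ActionGroup("ops-team", [sent.append])
cpu_rule = MetricAlertRule("high-cpu", 80.0, ops)
disk_rule = MetricAlertRule("high-disk", 90.0, ops)  # the same action group is shared

cpu_fired = cpu_rule.evaluate([85.0, 90.0, 95.0])
disk_fired = disk_rule.evaluate([40.0, 50.0])
print(cpu_fired, disk_fired, sent)
```

Sharing one action group between both rules mirrors the point above that recipients and actions are defined once and reused.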

Screenshot that shows the Azure Monitor alerts UI in the Azure portal.

SCOM MI currently uses its own separate traditional SCOM alerting mechanism in the Ops Console.

Autoscale allows you to dynamically control the number of resources running to handle the load on your application. You can create rules that use Azure Monitor metrics to determine when to automatically add resources when the load increases or remove resources that are sitting idle. You can specify a minimum and maximum number of instances, and the logic for when to increase or decrease resources to save money and to increase performance.
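
The autoscale logic described above can be approximated by a short function. This is a sketch under assumed values, not the autoscale engine itself: the function name, the CPU thresholds, the step of one instance, and the instance limits are all hypothetical.

```python
def desired_instances(current, avg_cpu, minimum=2, maximum=10,
                      scale_out_above=70.0, scale_in_below=30.0):
    """Return the instance count after applying one scale-out or scale-in rule."""
    if avg_cpu > scale_out_above:
        target = current + 1  # add a resource when the load increases
    elif avg_cpu < scale_in_below:
        target = current - 1  # remove a resource that is sitting idle
    else:
        target = current
    # Never go below the minimum or above the maximum number of instances.
    return max(minimum, min(maximum, target))


for current, cpu in [(3, 85.0), (3, 20.0), (2, 20.0), (10, 95.0), (5, 50.0)]:
    print(current, cpu, "->", desired_instances(current, cpu))
```

Leaving a gap between the scale-out and scale-in thresholds keeps the instance count from oscillating when the load hovers around a single value.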

Conceptual diagram showing how autoscale grows

Azure Logic Apps is also an option. For more information, see the Integrate section below.

Integrate

You may need to integrate Azure Monitor with other systems or to build custom solutions that use your monitoring data. These Azure services work with Azure Monitor to provide integration capabilities. Below are only a few of the possible integrations.

Diagram that shows the Integrate part of the Consumption section of the Azure Monitor system.

Network Watcher in Azure Features

What is Azure Network Watcher

Azure Network Watcher provides tools to monitor, diagnose, view metrics, and enable or disable logs for resources in an Azure virtual network. It is designed to monitor and repair the network health of IaaS (Infrastructure-as-a-Service) products, such as virtual machines, virtual networks, application gateways, and load balancers.

The Connection troubleshoot capability enables you to test a connection between a VM and another VM, an FQDN, a URI, or an IPv4 address. The test returns information similar to what the connection monitor capability returns, but it tests the connection at a point in time rather than monitoring it over time.

By using Network Watcher, you can monitor communication between a virtual machine and an endpoint. Endpoints can be another virtual machine (VM), a fully qualified domain name (FQDN), a uniform resource identifier (URI), or an IPv4 address. The connection monitor capability monitors communication at a regular interval and informs you of reachability, latency, and network topology changes between the VM and the endpoint. For example, you might have a web server VM that communicates with a database server VM. Someone in your organization may, unknown to you, apply a custom route or network security rule to the web server or database server VM or subnet. Connection monitor would then inform you if the database server became unreachable from the web server.

Features of Network Watcher

Azure Network Watcher allows you to monitor, diagnose and gain insight into your network performance between various points in your network infrastructure. Here’s a breakdown of some of the elements:

1. The Monitoring Element – You can monitor from one endpoint to another with a connection monitor to ensure connectivity between two points, such as a web application and a database. You’ll be alerted to potential issues such as a disconnect between those two services.

2. The Network Performance Monitor – Allows monitoring between Azure and on-premises resources for hybrid scenarios using VPN or ExpressRoute. It also has advanced detection for traffic blackholing and routing errors; in other words, some built-in intelligence when it comes to these network issues.

Best of all, as you add more endpoints, the topology tool builds a visual diagram of your network that looks like a Visio diagram, showing IP addresses, hostnames, and so on.

3. Diagnostic Tools – From a diagnostic standpoint there are several diagnostic tools that give you better insight into your virtual network by diagnosing possible causes of traffic issues.

IP Flow – Tells you which security rule allowed or denied traffic to or from a virtual machine in your virtual network for further inspection or remediation.

4. Metrics Tools – There are some limitations as to how many resources you can deploy within an Azure network which can be based on subscriptions or regions. The Metric Tool gives you the visibility that you need to understand exactly where you are inside of those limitations. It shows you how many of those resources you’ve deployed and how many are still available that you can deploy – so it helps you set up plans for the future as you deploy more and more resources.

5. Logging – We’ve done some interesting things with log analytics. Log analytics provides the ability to capture data about many Azure networking components, such as network security groups, public IP addresses, load balancers, virtual networks, and application gateways, to name a few.

Configure Network Watcher

  1. Log in to the Azure portal.
  2. Search for Network Watcher in the global search.
  3. Click Topology under Monitoring.
  4. Select your resource group and your region.
  5. On Next hop under Network diagnostic tools, enter the following:
    • Source IP address – Enter 10.0.0.4
    • Destination IP address – Enter 10.0.0.5
    • Click Next hop
  6. You’ll see the next hop type as Virtual Network, because VM-1-Web in SUBNET1-WEB can reach VM-2-SQL in SUBNET2-SQL through system routes, so no router is involved in between.
  7. On Connection troubleshoot, enter the following, then click Run diagnostics:
    • Virtual machine: Select VM-1-Web as the source VM
    • Virtual machine: Select VM-2-SQL as the destination VM
    • Preferred IP: Select Both as the preferred IP version
    • Destination port: Enter 3389
  8. Review the results. You can also click See details for more detailed results.

Azure Network Watcher

What is Azure Network Watcher?

Azure Network Watcher provides a suite of tools to monitor, diagnose, view metrics, and enable or disable logs for Azure IaaS (Infrastructure-as-a-Service) resources. Network Watcher enables you to monitor and repair the network health of IaaS products like virtual machines (VMs), virtual networks (VNets), application gateways, load balancers, etc. Network Watcher isn't designed or intended for PaaS monitoring or Web analytics.

Network Watcher consists of three major sets of tools and capabilities:

Diagram showing Azure Network Watcher's capabilities.


Monitoring

Network Watcher offers two monitoring tools that help you view and monitor resources:

  • Topology
  • Connection monitor

Topology

Topology provides a visualization of the entire network for understanding network configuration. It provides an interactive interface to view resources and their relationships in Azure spanning across multiple subscriptions, resource groups, and locations. For more information, see View topology.

Connection monitor

Connection monitor provides end-to-end connection monitoring for Azure and hybrid endpoints. It helps you understand network performance between various endpoints in your network infrastructure. For more information, see Connection monitor overview and Monitor network communication between two virtual machines.

Network diagnostic tools

Network Watcher offers seven network diagnostic tools that help troubleshoot and diagnose network issues:

  • IP flow verify
  • NSG diagnostics
  • Next hop
  • Effective security rules
  • Connection troubleshoot
  • Packet capture
  • VPN troubleshoot

IP flow verify

IP flow verify allows you to detect traffic filtering issues at a virtual machine level. It checks if a packet is allowed or denied to or from an IP address (IPv4 or IPv6 address). It also tells you which security rule allowed or denied the traffic. For more information, see IP flow verify overview and Diagnose a virtual machine network traffic filter problem.
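
The idea behind IP flow verify can be illustrated with a small model of priority-ordered rule evaluation. This is a deliberately simplified sketch, not the service's implementation: the `SecurityRule` class and `verify_inbound_flow` function are hypothetical, only the source prefix and destination port are matched, and the rule set is made up apart from mirroring the default deny-all inbound rule.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class SecurityRule:
    name: str
    priority: int          # lower numbers are evaluated first
    source_prefix: str     # CIDR; "0.0.0.0/0" matches any IPv4 source
    destination_port: int  # 0 means "any port" in this sketch
    access: str            # "Allow" or "Deny"


def verify_inbound_flow(rules, source_ip, destination_port):
    """Return (access, rule name) for the first matching rule in priority order."""
    source = ipaddress.ip_address(source_ip)
    for rule in sorted(rules, key=lambda r: r.priority):
        in_prefix = source in ipaddress.ip_network(rule.source_prefix)
        port_matches = rule.destination_port in (0, destination_port)
        if in_prefix and port_matches:
            return rule.access, rule.name
    return "Deny", None  # nothing matched at all


rules = [
    SecurityRule("DenyAllInBound", 65500, "0.0.0.0/0", 0, "Deny"),
    SecurityRule("AllowRdpFromWebSubnet", 100, "10.0.0.0/24", 3389, "Allow"),
]
internal = verify_inbound_flow(rules, "10.0.0.4", 3389)
external = verify_inbound_flow(rules, "203.0.113.7", 3389)
print(internal, external)
```

Returning the rule name alongside the decision reflects what makes the tool useful for troubleshooting: you learn which rule to inspect or change.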

NSG diagnostics

NSG diagnostics allows you to detect traffic filtering issues at a virtual machine, virtual machine scale set, or application gateway level. It checks if a packet is allowed or denied to or from an IP address, IP prefix, or a service tag. It tells you which security rule allowed or denied the traffic. It also allows you to add a new security rule with a higher priority to allow or deny the traffic. For more information, see NSG diagnostics overview and Diagnose network security rules.

Next hop

Next hop allows you to detect routing issues. It checks if traffic is routed correctly to the intended destination. It provides you with information about the Next hop type, IP address, and Route table ID for a specific destination IP address. For more information, see Next hop overview and Diagnose a virtual machine network routing problem.
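
A rough way to picture what next hop reports is a longest-prefix-match lookup over a route table. The sketch below assumes a two-entry table with a virtual network route and a default route; the `next_hop` function and the address ranges are hypothetical, and tie-breaking between routes of equal prefix length is left out.

```python
import ipaddress


def next_hop(routes, destination_ip):
    """Return the next hop type of the most specific route that contains the destination."""
    destination = ipaddress.ip_address(destination_ip)
    matches = []
    for prefix, hop_type in routes:
        network = ipaddress.ip_network(prefix)
        if destination in network:
            matches.append((network.prefixlen, hop_type))
    if not matches:
        return "None"
    # The longest prefix (largest prefix length) is the most specific match.
    return max(matches)[1]


# Assumed route table: one route for the virtual network and one default route.
routes = [
    ("10.0.0.0/16", "VirtualNetwork"),
    ("0.0.0.0/0", "Internet"),
]
inside = next_hop(routes, "10.0.0.5")
outside = next_hop(routes, "8.8.8.8")
print(inside, outside)
```

With this table, traffic between two VMs in the same virtual network resolves to the virtual network route without passing through any other device.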

Effective security rules

Effective security rules allows you to view the effective security rules applied to a network interface. It shows you all security rules applied to the network interface, the subnet the network interface is in, and the aggregate of both. For more information, see Effective security rules overview and View details of a security rule.

Connection troubleshoot

Connection troubleshoot enables you to test a connection from a source (a virtual machine, a virtual machine scale set, an application gateway, or a Bastion host) to a destination (a virtual machine, an FQDN, a URI, or an IPv4 address). The test returns information similar to what connection monitor returns, but it tests the connection at a point in time instead of monitoring it over time. For more information, see Connection troubleshoot overview and Troubleshoot connections with Azure Network Watcher.
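
To show what a point-in-time connection test means in practice, here is a small local analogue written with Python sockets. It is not the Network Watcher tool and reports far less detail; the `check_tcp` function is hypothetical, and the demonstration uses a temporary listener on the local machine rather than real VMs.

```python
import socket
import time


def check_tcp(host, port, timeout=2.0):
    """Try a single TCP connection and report reachability and latency in milliseconds."""
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            latency_ms = (time.perf_counter() - start) * 1000
            return {"reachable": True, "latency_ms": latency_ms, "error": None}
    except OSError as exc:
        return {"reachable": False, "latency_ms": None, "error": str(exc)}


# Demonstration against a temporary local listener on a free port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
port = listener.getsockname()[1]

open_result = check_tcp("127.0.0.1", port)    # the listener is up
listener.close()
closed_result = check_tcp("127.0.0.1", port)  # nothing is listening any more
print(open_result["reachable"], closed_result["reachable"])
```

A single check like this answers "can I connect right now?", whereas a monitor would repeat the check on a schedule and track the results over time.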

Packet capture

Packet capture allows you to remotely create packet capture sessions to track traffic to and from a virtual machine (VM) or a virtual machine scale set. For more information, see packet capture and Manage packet captures for virtual machines.

VPN troubleshoot

VPN troubleshoot enables you to troubleshoot virtual network gateways and their connections. For more information, see VPN troubleshoot overview and Diagnose a communication problem between networks.

Traffic

Network Watcher offers two traffic tools that help you log and visualize network traffic:

  • Flow logs
  • Traffic analytics

Flow logs

Flow logs allow you to log information about your Azure IP traffic and store the data in Azure Storage. You can log IP traffic flowing through a network security group or an Azure virtual network.

Traffic analytics

Traffic analytics provides rich visualizations of flow logs data. For more information about traffic analytics, see traffic analytics and Manage traffic analytics using Azure Policy.

Screenshot showing Traffic analytics feature of Network Watcher.

Usage + quotas

The Usage + quotas capability of Network Watcher provides a summary of your deployed network resources within a subscription and region, including current usage and corresponding limits for each resource. For more information, see Networking limits to learn about the limits for each Azure network resource per region per subscription. This information is helpful when planning future resource deployments as you can't create more resources if you reach their limits within the subscription or region.
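
The kind of summary that Usage + quotas presents can be sketched as a simple calculation over current usage and limits. The resource names, counts, and limits below are made-up numbers for illustration, and the `quota_summary` function and the 80 percent warning level are hypothetical.

```python
def quota_summary(usage):
    """Turn {resource: (current, limit)} into rows with remaining capacity and percent used."""
    rows = []
    for name, (current, limit) in sorted(usage.items()):
        rows.append({
            "resource": name,
            "current": current,
            "limit": limit,
            "remaining": limit - current,
            "percent_used": round(100 * current / limit, 1),
        })
    return rows


# Hypothetical usage numbers for one subscription and region.
usage = {
    "Virtual networks": (12, 1000),
    "Public IP addresses": (45, 50),
}
rows = quota_summary(usage)
near_limit = [row["resource"] for row in rows if row["percent_used"] >= 80]
for row in rows:
    print(row)
print("Near limit:", near_limit)
```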

Screenshot showing Networking resources usage and limits per subscription in the Azure portal.