Thursday, 24 March 2022

Azure Blob Storage

 

  • Binary Large Object
  • Object storage solution for the cloud
  • Stores all types of files: images, video, audio, log files, backups, etc.

  • Storage Account
    • Unique namespace in Azure for your data
    • If your storage account name is tutorialsdojo, then the default endpoint for Blob storage is: https://tutorialsdojo.blob.core.windows.net
  • Container
    • Organizes a set of blobs, similar to a directory in a file system.
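
To make the account, container, and blob hierarchy concrete, here is a minimal sketch that splits a blob URL back into its three parts. The helper name `parse_blob_url`, the container `images`, and the blob path are hypothetical, and the default `blob.core.windows.net` endpoint form is assumed.

```python
from urllib.parse import urlsplit


def parse_blob_url(url: str) -> tuple[str, str, str]:
    """Split a default-endpoint blob URL into (account, container, blob)."""
    parts = urlsplit(url)
    # The storage account name is the first label of the host name.
    account = parts.hostname.split(".")[0]
    # The first path segment is the container; the rest is the blob name,
    # which may itself contain "/" to mimic sub-directories.
    container, _, blob = parts.path.lstrip("/").partition("/")
    return account, container, blob


print(parse_blob_url(
    "https://tutorialsdojo.blob.core.windows.net/images/2022/logo.png"
))
```

The blob name keeps its slashes, which is how a flat container can still look like a directory tree.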

Blob Types

  • Block 
    • Stores binary and text data up to about 4.75 TiB.
    • Larger block blobs of up to about 190.7 TiB are available in preview.
  • Append 
    • Ideal for logging data from virtual machines
  • Page
    • Store random-access files up to 8 TB in size
    • Store virtual hard drive (VHD) files
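
The block blob size limits follow from simple arithmetic: maximum number of blocks times maximum block size. The sketch below assumes a limit of 50,000 blocks per blob and block sizes of 100 MiB and 4000 MiB; treat these figures as assumptions to check against the current Azure scalability targets.

```python
MIB = 1024 ** 2
TIB = 1024 ** 4
MAX_BLOCKS = 50_000  # assumed maximum number of blocks in one block blob


def max_block_blob_tib(block_size_mib: int) -> float:
    """Maximum block blob size in TiB for a given block size in MiB."""
    return MAX_BLOCKS * block_size_mib * MIB / TIB


print(round(max_block_blob_tib(100), 2))   # 4.77 (commonly quoted as about 4.75 TiB)
print(round(max_block_blob_tib(4000), 1))  # 190.7
```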

Supported Access Tiers

  • Hot
    • Frequently accessed objects
    • Lowest access costs, but highest storage costs
    • Default in new Storage Accounts
  • Cool
    • Infrequently accessed data
    • Lower storage costs, but higher access costs than the hot tier
    • Data should remain in this tier for at least 30 days
  • Archive
    • Rarely accessed files.
    • Lowest cost for storing data but the highest access cost.
    • Data should remain in this tier for at least 180 days.
  • Lifecycle Management Policy
    • A lifecycle configuration is a set of rules that define actions applied to a group of objects.
    • Enables you to transition your data to the appropriate access tiers.
    • Deletes blobs at the end of their lifecycles.
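
As a rough sketch of how a lifecycle rule moves data through the tiers, the dictionary below follows the general shape of a lifecycle management policy, and `action_for` simulates locally which action would apply to a blob. The rule name, the `logs/` prefix, the day thresholds, and the `action_for` helper are all illustrative assumptions; the real evaluation is done by the service.

```python
# Policy shaped like an Azure Storage lifecycle management rule.
# The rule name, prefix, and thresholds are illustrative assumptions.
policy = {
    "rules": [{
        "name": "age-out-logs",
        "enabled": True,
        "type": "Lifecycle",
        "definition": {
            "filters": {"blobTypes": ["blockBlob"], "prefixMatch": ["logs/"]},
            "actions": {"baseBlob": {
                "tierToCool": {"daysAfterModificationGreaterThan": 30},
                "tierToArchive": {"daysAfterModificationGreaterThan": 180},
                "delete": {"daysAfterModificationGreaterThan": 365},
            }},
        },
    }]
}


def action_for(blob_path: str, days_since_modified: int) -> str:
    """Local simulation: return the strongest action whose threshold is passed."""
    for rule in policy["rules"]:
        if not rule["enabled"]:
            continue
        prefixes = rule["definition"]["filters"]["prefixMatch"]
        if not any(blob_path.startswith(p) for p in prefixes):
            continue
        actions = rule["definition"]["actions"]["baseBlob"]
        # Check from most to least drastic so the strongest match wins.
        for name in ("delete", "tierToArchive", "tierToCool"):
            if days_since_modified > actions[name]["daysAfterModificationGreaterThan"]:
                return name
    return "none"


for days in (10, 45, 200, 400):
    print(days, action_for("logs/app.txt", days))
```

The thresholds mirror the 30-day and 180-day minimums noted for the cool and archive tiers.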

Features

  • Versioning
    • Identified by a version ID
    • Enable versioning and restore an earlier version of a blob to recover your data.
    • Disabling blob versioning does not delete existing blobs, versions, or snapshots.
  • Snapshots
    • A read-only version of a blob that was taken at a given point in time.
    • The snapshots persist until they are explicitly deleted.
  • Object Replication
    • Copies block blobs asynchronously between a source Storage account and a destination account.
    • A source account can replicate to at most two destination accounts, and a destination account can receive from at most two source accounts.
  • Static Website
    • Serve your static website directly from a storage container named $web
    • CORS is not supported
    • You can grant read-only access to your resources by setting the public access level
    • Enable Azure Content Delivery Network (CDN) to cache content from a static website
    • You can use Azure CDN to configure a custom domain endpoint
  • AzCopy
    • AzCopy is a command-line utility that allows you to transfer blobs or files to or from a storage account.
    • You can use Azure AD and SAS tokens to provide authorization credentials.
    • These are the tasks that you can do using AzCopy:
      • Upload files
      • Download blobs and directories
      • Copy blobs, directories, and containers between accounts.
      • Synchronize local storage
    • You can run AzCopy on Windows, Linux, and macOS.
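
The AzCopy tasks listed above map onto the `azcopy copy` and `azcopy sync` subcommands. The sketch below only builds the argument lists and does not run anything; the helper names, the local `./reports` folder, the `backups` container, and the `<SAS-token>` placeholder are assumptions for illustration.

```python
import shlex


def azcopy_copy(source: str, destination: str, recursive: bool = False) -> list[str]:
    """Build an `azcopy copy` argument list (the command is not executed here)."""
    args = ["azcopy", "copy", source, destination]
    if recursive:
        # Needed when the source is a directory rather than a single file.
        args.append("--recursive")
    return args


def azcopy_sync(source: str, destination: str) -> list[str]:
    """Build an `azcopy sync` argument list for synchronizing two locations."""
    return ["azcopy", "sync", source, destination]


# "<SAS-token>" is a placeholder, not a real token.
dest = "https://tutorialsdojo.blob.core.windows.net/backups?<SAS-token>"
print(shlex.join(azcopy_copy("./reports", dest, recursive=True)))
print(shlex.join(azcopy_sync("./reports", dest)))
```

Passing the list to `subprocess.run` would execute it, provided AzCopy is installed and the token is valid.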

Security

  • AzCopy method of authorization
    • Blob storage – Azure Active Directory and Shared Access Signature
    • File storage – Shared Access Signature only
  • You can whitelist specific IP addresses or IP ranges to access your storage account.
  • Private endpoints allow your storage account and virtual network to have a secure connection over a private link, eliminating exposure from the public internet.
  • Azure Storage uses 256-bit AES encryption
  • Customer-managed key
    • Using Azure Key Vault, you can encrypt and decrypt data in Blob storage and in Azure Files.
  • Customer-provided key
    • A customer can include their own encryption key for granular control.

| Key management parameter | Microsoft-managed keys | Customer-managed keys | Customer-provided keys |
| --- | --- | --- | --- |
| Encryption/decryption operations | Azure | Azure | Azure |
| Azure Storage services supported | All | Blob storage, Azure Files | Blob storage |
| Key storage | Microsoft key store | Azure Key Vault | Customer’s own key store |
| Key rotation responsibility | Microsoft | Customer | Customer |
| Key control | Microsoft | Customer | Customer |
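
With customer-provided keys, the key travels with each Blob storage request. The sketch below builds the request headers as I understand the Blob REST API to expect them (`x-ms-encryption-key`, `x-ms-encryption-key-sha256`, `x-ms-encryption-algorithm`); the helper name is hypothetical, and the header names should be checked against the current REST reference before use.

```python
import base64
import hashlib
import os


def customer_provided_key_headers(key: bytes) -> dict[str, str]:
    """Build request headers carrying a customer-provided AES-256 key."""
    if len(key) != 32:
        raise ValueError("AES-256 requires a 32-byte key")
    return {
        # The key itself, Base64-encoded.
        "x-ms-encryption-key": base64.b64encode(key).decode("ascii"),
        # Base64-encoded SHA-256 hash of the raw key bytes.
        "x-ms-encryption-key-sha256": base64.b64encode(
            hashlib.sha256(key).digest()
        ).decode("ascii"),
        "x-ms-encryption-algorithm": "AES256",
    }


headers = customer_provided_key_headers(os.urandom(32))
print(sorted(headers))
```

Because the customer holds the key, losing it means the data encrypted with it can no longer be read.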

Azure Storage Overview

 

  • An Azure storage account contains blobs, files, queues, tables, and disks.
  • Types of Storage Accounts: General-purpose (v2 and v1), BlockBlobStorage, FileStorage, and BlobStorage
  • All storage accounts are encrypted using Storage Service Encryption (SSE) for data at rest
  • Storage accounts endpoints:
    • Blob storage: https://tutorialsdojo.blob.core.windows.net
    • Table storage: https://tutorialsdojo.table.core.windows.net
    • Queue storage: https://tutorialsdojo.queue.core.windows.net
    • Azure Files: https://tutorialsdojo.file.core.windows.net
    • Azure Data Lake Storage Gen2: https://tutorialsdojo.dfs.core.windows.net
  • Access tiers are: Hot, Cool, and Archive
    • Hot
      • Highest storage costs, but lowest access costs
      • Store data that is accessed frequently
      • By default, new storage accounts are created in the hot tier
    • Cool
      • Lower storage costs, but higher access costs
      • Store data that is infrequently accessed and kept for at least 30 days
      • You can use a cool access tier for short-term backup.
    • Archive
      • Lowest storage costs, but the highest retrieval costs
      • Store data that is rarely accessed and kept for at least 180 days
      • Data needs to be stored for a long time.
  • Storage redundancy includes: Locally redundant storage (LRS), Zone-redundant storage (ZRS), Geo-redundant storage (GRS), Geo-zone-redundant storage (GZRS)
    • Locally redundant storage (LRS) 
      • A low-cost redundancy strategy
      • Your data is copied synchronously three times within the primary region
    • Zone-redundant storage (ZRS)
      • Redundancy for high availability
      • The data is copied synchronously across three Azure availability zones in the primary region
    • Geo-redundant storage (GRS)
      • Cross-regional redundancy
      • In the primary region, data is synchronously copied three times, and then asynchronously copied to the secondary region.
      • Enable read-access geo-redundant storage (RA-GRS) to read data in the secondary region.
    • Geo-zone-redundant storage (GZRS)
      • Redundancy for both high availability and maximum durability
      • Data is copied synchronously across three Azure availability zones in the primary region, then copied asynchronously to the secondary region.
      • You can also enable RA-GZRS for read access to data in the secondary region
  • Moving data to a different storage account can be done automatically or manually
  • You can migrate data manually using:
    • AzCopy, a command-line utility
    • Data Movement Library is designed for high-performance, reliable, and easy data transfer operations similar to AzCopy
    • REST API or client library lets you create a custom application to migrate your data
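
The endpoint pattern listed above can be sketched as a small lookup. The `endpoint` helper is hypothetical, and the `-secondary` suffix for the read-only endpoint under RA-GRS or RA-GZRS is stated here as an assumption based on how Azure names secondary endpoints.

```python
# Sub-domain used by each service in the default endpoint.
SERVICE_SUBDOMAIN = {
    "blob": "blob",
    "table": "table",
    "queue": "queue",
    "file": "file",
    "datalake": "dfs",  # Azure Data Lake Storage Gen2
}


def endpoint(account: str, service: str, secondary: bool = False) -> str:
    """Return the default endpoint for a service in a storage account."""
    # Assumption: the read-only secondary endpoint appends "-secondary"
    # to the account name when read access to the secondary is enabled.
    name = f"{account}-secondary" if secondary else account
    return f"https://{name}.{SERVICE_SUBDOMAIN[service]}.core.windows.net"


print(endpoint("tutorialsdojo", "queue"))
print(endpoint("tutorialsdojo", "blob", secondary=True))
```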

Types of Storage Accounts

  • General-purpose v2 accounts
    • Supports Data Lake Gen2, Blobs, Files, Disks, Queues, and Tables
    • Delivers the lowest per-gigabyte capacity prices for Azure Storage
  • General-purpose v1 accounts
    • Supports Blobs, Files, Disks, Queues, Tables
    • You can upgrade a general-purpose v1 account to a general-purpose v2 account with no downtime and without copying the data.
    • Use general-purpose v1 accounts when you need the classic deployment model, since general-purpose v2 accounts and Blob storage accounts only support the Azure Resource Manager deployment model.
    • GPv1 is also a suitable choice for workloads that are transaction-intensive or use significant geo-replication bandwidth but don’t need large capacity.
  • BlockBlobStorage accounts
    • Provide low, consistent latency and higher transaction rates.
    • Do not support the hot, cool, and archive access tiers.
    • Use BlockBlobStorage to store unstructured object data as block blobs or append blobs.
  • FileStorage accounts
    • Only support file shares.
    • Offer IOPS bursting.
  • BlobStorage accounts
    • Only support block and append blobs.
    • Offer standard performance, while BlockBlobStorage accounts offer premium performance.
    • Upgrading a Blob storage account to a general-purpose v2 account has no downtime, and you don’t need to copy the data.

| Storage Account Type | Supported Services | Supported Performance Tiers | Supported Access Tiers | Replication Options | Deployment Model | Encryption |
| --- | --- | --- | --- | --- | --- | --- |
| General-purpose V2 | Blob, File, Queue, Table, Disk, and Data Lake Gen2 | Standard, Premium | Hot, Cool, Archive | LRS, GRS, RA-GRS, ZRS, GZRS (preview), RA-GZRS (preview) | Resource Manager | Encrypted |
| General-purpose V1 | Blob, File, Queue, Table, and Disk | Standard, Premium | N/A | LRS, GRS, RA-GRS | Resource Manager, Classic | Encrypted |
| BlockBlobStorage | Blob (block blobs and append blobs only) | Premium | N/A | LRS, ZRS | Resource Manager | Encrypted |
| FileStorage | File only | Premium | N/A | LRS, ZRS | Resource Manager | Encrypted |
| BlobStorage | Blob (block blobs and append blobs only) | Standard | Hot, Cool, Archive | LRS, GRS, RA-GRS | Resource Manager | Encrypted |
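
The comparison above can be encoded as a small lookup for shortlisting account types. This is only a sketch: the `candidates` helper and the lowercase service keys are hypothetical, and the data simply restates the services, performance tiers, and access-tier columns.

```python
# Each entry restates one row of the comparison (services, performance, tiers).
ACCOUNT_TYPES = {
    "General-purpose V2": {
        "services": {"blob", "file", "queue", "table", "disk", "datalake"},
        "performance": {"standard", "premium"},
        "access_tiers": True,
    },
    "General-purpose V1": {
        "services": {"blob", "file", "queue", "table", "disk"},
        "performance": {"standard", "premium"},
        "access_tiers": False,
    },
    "BlockBlobStorage": {
        "services": {"blob"}, "performance": {"premium"}, "access_tiers": False,
    },
    "FileStorage": {
        "services": {"file"}, "performance": {"premium"}, "access_tiers": False,
    },
    "BlobStorage": {
        "services": {"blob"}, "performance": {"standard"}, "access_tiers": True,
    },
}


def candidates(service: str, performance: str, needs_access_tiers: bool = False) -> list[str]:
    """List account types that offer the service at the given performance tier."""
    return [
        name for name, spec in ACCOUNT_TYPES.items()
        if service in spec["services"]
        and performance in spec["performance"]
        and (spec["access_tiers"] or not needs_access_tiers)
    ]


print(candidates("blob", "standard", needs_access_tiers=True))
```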

 

Security

  • To access data in your storage account, each request must include a valid Authorization header
  • If authentication of identity is successful, then Azure Active Directory returns a token to use in authorizing the request to Azure Storage Services.
  • You can use shared key authorization to construct a connection string
  • Shared access signature allows you to have granular control on who can access your data
  • When you copy a file without the metadata for encryption, the blob content cannot be retrieved again.
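
For Shared Key authorization, a connection string is a semicolon-separated list of key/value pairs. The sketch below assembles one; the helper name is hypothetical and `<account-key>` is a placeholder, not a real key.

```python
def connection_string(account: str, account_key: str) -> str:
    """Assemble a Shared Key connection string for a storage account."""
    parts = {
        "DefaultEndpointsProtocol": "https",
        "AccountName": account,
        "AccountKey": account_key,
        "EndpointSuffix": "core.windows.net",
    }
    return ";".join(f"{name}={value}" for name, value in parts.items())


# "<account-key>" stands in for the real key, which should never be hard-coded.
print(connection_string("tutorialsdojo", "<account-key>"))
```

Because the account key grants full access, a shared access signature is usually the better fit when access needs to be scoped or time-limited.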

Pricing

  • You are charged based on your Region, Account type, Access Tier, and Storage Capacity
  • Replication and read/write operations also incur costs.
  • If data is transferred out of its Azure region, you’re charged for data egress.

Azure Service Fabric

 

  • A distributed systems platform that helps package, deploy, and manage scalable and reliable microservices and containers.
  • Build microservices and container-based applications using the programming language of your choice, including .NET Core 2.0, C#, and Java. It supports two types of microservices:
    • Stateless – Does not maintain mutable state outside a request and its response. Examples include protocol gateways and web proxies.
    • Stateful – It maintains a mutable, authoritative state beyond the request and its response.
  • Enables low-touch workflows to provision, deploy, patch, and monitor applications with Service Fabric application lifecycle management.
  • Supports the deployment of multiple application instances.
  • A Service Fabric cluster is a set of virtual machines into which your microservices are deployed and managed.

Security

  • Create or import a certificate using Azure Key Vault.
  • Use Azure Firewall to complement your existing Network Security Group rules to control access to your cluster.

Pricing

  • You are charged based on the number of vCPUs and GBs of memory allocated to each VM.
  • You are charged based on the size and number of disks and the amount of outbound data transfer.

Azure CycleCloud

 

  • Orchestrate and manage high-performance computing (HPC) environments on Azure.
  • Enables you to provision infrastructure for HPC systems, deploy familiar HPC schedulers, and scale the infrastructure automatically to run jobs efficiently at any scale.

Features

  • Scheduler Agnostic – use standard HPC schedulers or extend CycleCloud autoscaling plugins to work with your own scheduler.
  • Manage Compute Resources – manage VMs and scale sets to provide a set of compute resources to meet your workload requirements.
  • Autoscale Resources – adjust cluster size and components automatically based on workload, availability, and time requirements.
  • Monitor and Analyze – collect node-level metrics and analyze the performance data using a visualization tool.
  • Template Clusters – share your cluster topologies.
  • CycleCloud agent (called Jetpack) – Installed by Azure CycleCloud on each virtual machine to provide the following functions:
    • Node Configuration
    • Distributed Synchronization
    • Health Check

Azure Batch

 

  • A service that runs large-scale parallel and high-performance computing (HPC) batch jobs in Azure.
  • Allows you to run jobs in a group of Linux or Windows virtual machines.

Components

  • A task represents a unit of computation, and a job is a collection of tasks.
  • Job priority values range from -1000 (lowest priority) to 1000 (highest priority).
  • To specify certain limits for your jobs, you can use job constraints:
    • Maximum wallclock time – tasks are terminated if the job runs longer than the specified time.
    • Maximum number of task retries – if the task fails, it will be requeued to run again.
  • A job manager task contains the information needed to create the tasks required for the job.
  • Scheduled jobs allow you to create recurring jobs.
  • A multi-instance task runs simultaneously on more than one compute node.
  • With task dependencies, a task runs only after the tasks it depends on have completed.
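
Task dependencies amount to running tasks in an order where every prerequisite finishes first. The sketch below is a local simulation with Python's standard `graphlib`, not the Batch SDK; the task names are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical job: each task maps to the set of tasks it depends on.
depends_on = {
    "merge": {"render-1", "render-2"},
    "render-1": {"prepare"},
    "render-2": {"prepare"},
    "prepare": set(),
}


def run_order(dependencies: dict[str, set[str]]) -> list[str]:
    """Return one valid execution order in which prerequisites come first."""
    return list(TopologicalSorter(dependencies).static_order())


order = run_order(depends_on)
print(order)
```

The two render tasks have no dependency on each other, so a scheduler is free to run them in parallel once `prepare` completes.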

Pricing

  • There is no additional charge for using Azure Batch. You are only charged for the underlying resources consumed.

Azure Container Registry

 

  • A service to manage your container images and related artifacts.
  • ACR is a regional service.

Features

  • Keep track of current valid container images.
  • Registries (SKUs) are available in three tiers: Basic, Standard, and Premium.
  • You can use the geo-replication feature of Premium registries for advanced replication and container image distribution scenarios.
  • Streamline building, testing, pushing, and deploying images to Azure with Azure Container Registry Tasks.
  • ACR Tasks supports quick tasks, automatically triggered tasks, and multi-step tasks.
  • Tag your containers using stable and unique tags.

Concepts

  • Registry
    • A registry is a collection of repositories to store and distribute container images.
    • You must be authenticated before you can pull and push images.
  • Artifact
    • The address of an artifact contains loginUrl, repository and tag
      • [loginUrl]/[repository:][tag]
  • Repository
    • A repository is a group of similar container images and other artifacts.
    • Identify similar repositories and artifacts with namespaces.
  • Image
    • Images are used in ACR tasks.
    • A container image consists of tags, layers, and a manifest.
    • Orphaned images are generated by repeated pushing of modified images with identical tags.
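
To illustrate the artifact address format, the sketch below splits an address into login URL, repository, and tag. The registry name `myregistry.azurecr.io` and the helper are hypothetical, defaulting to the `latest` tag is an assumption borrowed from common container tooling, and addresses that use a manifest digest (`@sha256:...`) are not handled.

```python
def parse_artifact(address: str) -> tuple[str, str, str]:
    """Split 'loginUrl/repository:tag' into its three parts."""
    # Everything before the first "/" is the registry login URL.
    login_url, _, remainder = address.partition("/")
    # The tag follows the last ":"; the repository may contain nested namespaces.
    repository, separator, tag = remainder.rpartition(":")
    if not separator:
        # No tag given: assume "latest" (an assumption, not an ACR rule).
        repository, tag = remainder, "latest"
    return login_url, repository, tag


print(parse_artifact("myregistry.azurecr.io/samples/nginx:v1"))
```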

Best Practices

  • If you place your registry near your container hosts, it will help reduce both latency and costs.
  • When you are deploying containers to multiple regions, you can use the geo-replication feature.
  • ACR supports nested namespaces that allow you to share a single registry across multiple groups.
  • There are two main situations when authenticating with an ACR:
    • Individual identity – allows you to pull or push images from the development machine.
    • Service/Headless identity – enables you to build and deploy pipelines where the user is not directly involved.
  • ACR allows you to delete images by tag, by manifest digest, and by repository.

Tasks

  • Quick Task
    • Verify your automated build definitions and catch potential problems prior to committing your code.
    • Build and push a single container image to a container registry on-demand, in Azure, without needing a local Docker Engine installation.
  • Trigger Task
    • You can create an image using one or more triggers on:
      • Source code update
      • Base image update
      • Schedule
  • Multi-step Task
    • Multi-container-based workflows
    • With multi-step tasks in ACR Tasks, you have more granular control over image building, testing, and OS and framework patching workflows.
  • Deleted registry resources such as repositories, images, and tags cannot be recovered after deletion.

Tagging

  • Use stable tags to maintain base images for your container builds.
  • If the updated image has a stable tag, the previously tagged image is untagged, resulting in an orphaned image.
  • You can use unique tags for deployments, particularly in an environment where multiple nodes can scale.

Network

  • You can connect to your ACR via public and private endpoints.
  • A private endpoint connection is only available for Premium SKU.

Security

  • Encrypts the registry content at rest with service-managed keys or customer-managed keys.
  • Customer-Managed Key is only available for Premium SKU.
  • You can enable a customer-managed key only when you create a registry.
  • Authenticate through Azure Active Directory user, service principal, admin login, or through Azure managed identity.

Pricing

  • You are charged per GiB per day for image storage.
  • When you change SKUs, you are charged the previous SKU price up to the point of change and the new SKU price afterward.
  • Standard networking fees apply to network egress.
  • If you replicate a registry to your desired regions, you are charged Premium registry fees for each region.

Azure Kubernetes Service (AKS)

 

  • Kubernetes is an open-source tool for orchestrating and managing many container images and applications.
  • AKS lets you deploy a managed Kubernetes cluster in Azure.

Features

  • Uses clusters and pods to scale and deploy applications.
  • Kubernetes can deploy more container instances as needed.
  • It supports horizontal scaling, self-healing, load balancing, and secret management.
  • Automatic monitoring of application load to determine when to scale the number of containers used.
  • Allows you to replicate container architectures.
  • Use Kubernetes with supported Azure regions and on-premises installations using Azure Stack.
  • AKS can pull the images it runs from Azure Container Registry.
  • Use Azure Advisor to optimize your Kubernetes deployments with real-time, personalized recommendations.

Components

  • The control plane is a managed Azure resource. It runs the core Kubernetes components, including the API server and the cluster database (etcd).
    • kube-apiserver – allows communication for management tools (kubectl).
    • etcd – a key-value store within Kubernetes.
    • kube-scheduler – determines which nodes should run the workload.
    • kube-controller-manager – it oversees the smaller controllers that handle node operations and replication of pods.
  • Kubernetes uses pods to run an instance of your application.
  • A node runs one or more pods, and a node pool is a group of nodes with the same configuration.
  • Use a node selector to control where a pod should be placed.
  • Run at least 2 nodes in the default node pool to ensure your cluster operates reliably.
  • Multi-container pods are placed on the same node and allow containers to share the related resources.
  • You can specify maximum resource limits that prevent a given pod from consuming too much compute resources from the underlying node.
  • A deployment determines the number of replicas (pods) to be created. It is typically defined in a manifest file in YAML format.
  • With StatefulSets, you can maintain the application’s state beyond a single pod life cycle.
  • The resources are logically grouped into a namespace, and a user may only interact with resources within their assigned namespaces.
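
Several of the components above (replicas, resource limits, node selectors) come together in a deployment manifest. The sketch below builds one as a Python dictionary and prints it as JSON, which `kubectl` accepts as well as YAML. The `deployment` helper, the app name, the image address, and the limit values are illustrative assumptions.

```python
import json


def deployment(name: str, image: str, replicas: int, node_selector: dict) -> dict:
    """Build a minimal Deployment manifest as a plain dictionary."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,  # number of pods to keep running
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    # Controls which nodes the pods may be placed on.
                    "nodeSelector": node_selector,
                    "containers": [{
                        "name": name,
                        "image": image,
                        # Upper bounds on what each container may consume.
                        "resources": {"limits": {"cpu": "500m", "memory": "256Mi"}},
                    }],
                },
            },
        },
    }


manifest = deployment(
    "web", "myregistry.azurecr.io/samples/nginx:v1", 3, {"kubernetes.io/os": "linux"}
)
print(json.dumps(manifest, indent=2))
```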

Storage

  • Persistent volumes are provided by Azure disk and file storage.
  • Create a Kubernetes DataDisk resource using Azure Disk.
  • Mount an SMB 3.0 share backed by an Azure Storage account to pods with Azure Files.
  • Volumes that are defined and created as part of the pod lifecycle only exist until the pod is deleted.
  • AKS has four initial storage classes:
    • default – uses Azure StandardSSD storage to create a Managed Disk.
    • managed-premium – uses Azure Premium storage to create Managed Disk.
    • azurefile – uses Azure Standard storage to create an Azure File Share.
    • azurefile-premium – uses Azure Premium storage to create an Azure File Share.
  • If no StorageClass is specified for a persistent volume, the default StorageClass is used.
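
A pod usually requests storage through a persistent volume claim that names one of these storage classes. The sketch below builds such a claim as a dictionary; the helper, the claim name, and the size are hypothetical. Leaving out `storageClassName` corresponds to falling back to the default class.

```python
from typing import Optional


def persistent_volume_claim(name: str, size: str, storage_class: Optional[str] = None) -> dict:
    """Build a minimal PersistentVolumeClaim manifest as a dictionary."""
    spec = {
        "accessModes": ["ReadWriteOnce"],
        "resources": {"requests": {"storage": size}},
    }
    if storage_class is not None:
        # Omitting this field lets the cluster use its default StorageClass.
        spec["storageClassName"] = storage_class
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": spec,
    }


claim = persistent_volume_claim("data-disk", "5Gi", "managed-premium")
print(claim["spec"])
```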

Security

  • With Kubernetes RBAC, you can create roles to define permissions and then assign those roles to users with role bindings.
  • You can limit network traffic between pods in your cluster with Kubernetes network policies.
  • Dynamic rules enforcement across multiple clusters with Azure Policy.
  • Azure AD-integrated AKS clusters can grant users or groups access to Kubernetes resources within a namespace or across the cluster.
  • Secure communication paths between namespaces and nodes with Azure Private Link.

Pricing

  • You only pay for virtual machines, associated storage, and networking resources.
  • There is no charge for cluster management.

Versions

  • Uses semantic versioning: [major].[minor].[patch].
  • A user has 30 days from the version removal to upgrade to a supported patch and continue receiving support.
  • Azure updates the cluster automatically if it has been out of support for more than 3 minor versions.
  • Downgrading a version is not supported.
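
The version rules can be checked with a little parsing. This sketch assumes the three-part form [major].[minor].[patch] that Kubernetes releases use; the helper names and the sample version numbers are hypothetical.

```python
def parse_version(version: str) -> tuple[int, int, int]:
    """Parse '[major].[minor].[patch]' into a tuple of integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def is_downgrade(current: str, target: str) -> bool:
    """True if moving from current to target would lower the version."""
    return parse_version(target) < parse_version(current)


def minor_gap(current: str, latest: str) -> int:
    """How many minor versions the current version is behind the latest."""
    return parse_version(latest)[1] - parse_version(current)[1]


print(is_downgrade("1.22.4", "1.21.7"))  # True
print(minor_gap("1.19.11", "1.22.4"))    # 3
```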