Monday 6 March 2023

Azure HPC Cache

Azure HPC Cache speeds access to your data for high-performance computing (HPC) tasks. By caching files in Azure, Azure HPC Cache brings the scalability of cloud computing to your existing workflow. This service can be used even for workflows where your data is stored across WAN links, such as in your local datacenter network-attached storage (NAS) environment.

Azure HPC Cache is easy to launch and monitor from the Azure portal. Existing NFS storage or new Blob containers can become part of its aggregated namespace, which makes client access simple even if you change the back-end storage target.
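For illustration, here is a minimal sketch of provisioning a cache with the Python management SDK (the azure-mgmt-storagecache package). The resource group, subnet ID, SKU, and size shown are placeholder assumptions, not a recommended configuration:

    # Minimal sketch: provisioning an Azure HPC Cache with the Python
    # management SDK. All names, the subnet ID, SKU, and size are
    # illustrative placeholders, not a recommended configuration.
    from azure.identity import DefaultAzureCredential
    from azure.mgmt.storagecache import StorageCacheManagementClient
    from azure.mgmt.storagecache.models import Cache, CacheSku

    client = StorageCacheManagementClient(
        credential=DefaultAzureCredential(),
        subscription_id="<subscription-id>",
    )

    cache = Cache(
        location="eastus",
        cache_size_gb=3072,                # a small size for this SKU
        sku=CacheSku(name="Standard_2G"),  # a throughput-based SKU
        subnet=(
            "/subscriptions/<subscription-id>/resourceGroups/<rg>"
            "/providers/Microsoft.Network/virtualNetworks/<vnet>"
            "/subnets/<subnet>"
        ),
    )

    # Long-running operation; result() blocks until the cache is ready.
    poller = client.caches.begin_create_or_update("<rg>", "my-hpc-cache", cache)
    print(poller.result().provisioning_state)

Once the cache is running, NFS storage targets and Blob containers can be attached to it and exposed through the aggregated namespace.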

Use cases

Azure HPC Cache enhances productivity best for workflows like these:

  • Read-heavy file access workflow
  • Data stored in NFS-accessible storage, Azure Blob, or both
  • Compute farms of up to 75,000 CPU cores

Azure HPC Cache can be added to a wide variety of workflows across many industries. Any system where a large number of machines need to access a set of files at scale and with low latency will benefit from this service. The sections below give specific examples.

Visual effects (VFX) rendering

In media and entertainment, Azure HPC Cache can speed up data access for time-critical rendering projects. VFX rendering workflows often require last-minute processing by large numbers of compute nodes. Data for these workflows is typically located in an on-premises NAS environment. Azure HPC Cache can cache that file data in the cloud to reduce latency and enhance flexibility for on-demand rendering.

Learn more about High-performance computing for rendering.

Life sciences

Many life sciences workflows can benefit from scale-out file caching.

A research institute that wants to port its genomic analysis workflows into Azure can easily shift them by using Azure HPC Cache. Because the cache provides POSIX file access, no client-side changes are needed to run their existing client workflow in the cloud.
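As an illustration of that POSIX access, the sketch below mounts a cache namespace path on a Linux client exactly the way any NFS export would be mounted. The mount address, namespace path, and mount options are placeholder values based on common NFS guidance, not a prescribed configuration:

    # Minimal sketch: mounting an HPC Cache namespace path on a Linux
    # client. The address and paths are placeholders; the mount options
    # (hard, proto=tcp, retry=30) follow common NFS guidance.
    import subprocess

    CACHE_MOUNT_ADDRESS = "10.0.0.28"     # one of the cache's mount addresses
    NAMESPACE_PATH = "/genomics-data"     # virtual path in the aggregated namespace
    LOCAL_MOUNT_POINT = "/mnt/hpccache"

    subprocess.run(["sudo", "mkdir", "-p", LOCAL_MOUNT_POINT], check=True)
    subprocess.run(
        [
            "sudo", "mount", "-t", "nfs",
            "-o", "hard,proto=tcp,mountproto=tcp,retry=30",
            f"{CACHE_MOUNT_ADDRESS}:{NAMESPACE_PATH}",
            LOCAL_MOUNT_POINT,
        ],
        check=True,
    )

Because the client only sees a standard NFS mount, the same job scripts that ran against the on-premises NAS can run unchanged in the cloud.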

Azure HPC Cache can also be used to improve efficiency in tasks like secondary analysis, pharmacological simulation, and AI-driven image analysis.

Learn more about High-performance computing for health and life sciences.

Silicon design verification

The silicon design industry’s design verification workloads rely on electronic design automation (EDA) tools: compute-intensive applications that can be run on large-scale virtual machine compute grids.

Azure HPC Cache can provide on-cloud caching of design data, libraries, binaries, and rule database files from on-premises storage systems. This provides local-like response times for directory listings, metadata, and data reads, and eliminates the need for complex data migration, syncing, and copying operations.

Azure HPC Cache can also be set up to cache output files being written by the compute jobs. This configuration gives immediate acknowledgement to the compute workflow and subsequently writes the changes back to the on-premises NAS.

HPC Cache lets chip designers scale EDA verification jobs to tens of thousands of cores with ease, while paying minimal attention to storage performance.

Learn more about High-performance computing for silicon.

Financial services analytics

An Azure HPC Cache deployment can help speed up quantitative analysis calculations, risk analysis workloads, and Monte Carlo simulations to give financial services companies better insight to make strategic decisions.

Learn more about High-performance computing for financial services.

Region availability

Visit the Azure Global Infrastructure products by region page to learn where Azure HPC Cache is available.

Azure HPC Cache resides in a single region. It can access data stored in other regions if you connect it to Blob containers located there. The cache does not permanently store customer data.

Azure Data Share

Azure Data Share enables organizations to securely share data with multiple customers and partners. Data providers are always in control of the data that they've shared, and Azure Data Share makes it simple to manage and monitor what data was shared, when, and by whom.

In today's world, data is viewed as a key strategic asset that many organizations need to simply and securely share with their customers and partners. There are many ways that customers do this today, including FTP, e-mail, and APIs, to name a few. Organizations can easily lose track of who they've shared their data with. Sharing data through FTP or through standing up their own API infrastructure is often expensive to provision and administer, and there's management overhead associated with using these methods of sharing at a large scale. In addition to accountability, many organizations would like to be able to control, manage, and monitor all of their data sharing in a simple way that stays up to date, so they can derive timely insights.

Using Data Share, a data provider can share data and manage their shares all in one place. They can stay in control of how their data is handled by specifying terms of use for their data share. The data consumer must accept these terms before being able to receive the data. Data providers can specify the frequency at which their data consumers receive updates. Access to new updates can be revoked at any time by the data provider.

Azure Data Share helps enhance insights by making it easy to combine data from third parties to enrich analytics and AI scenarios. Easily use the power of Azure analytics tools to prepare, process, and analyze data shared with Azure Data Share.

Both the data provider and data consumer must have an Azure subscription to share and receive data. If you don't have an Azure subscription, create a free account.

Scenarios for Azure Data Share

Azure Data Share can be used in many different industries. For example, a retailer may want to share recent point of sales data with their suppliers. Using Azure Data Share, a retailer can set up a data share containing point of sales data for all of their suppliers and share sales on an hourly or daily basis.

Azure Data Share can also be used to establish a data marketplace for a specific industry. For example, a government or a research institution might regularly share anonymized data about population growth with third parties.

Another use case for Azure Data Share is establishing a data consortium. For example, many different research institutions can share data with a single trusted body. Data is analyzed, aggregated or processed using Azure analytics tools and then shared with interested parties.

How it works

Azure Data Share currently offers snapshot-based sharing and in-place sharing.


Snapshot-based sharing

In snapshot-based sharing, data moves from the data provider's Azure subscription and lands in the data consumer's Azure subscription. As a data provider, you provision a data share and invite recipients to the data share. Data consumers receive an invitation to your data share via e-mail. Once a data consumer accepts the invitation, they can trigger a full snapshot of the data shared with them. This data is received into the data consumer's storage account. Data consumers can receive regular, incremental updates to the data shared with them so that they always have the latest version of the data.

Data providers can offer their data consumers incremental updates to the data shared with them through a snapshot schedule. Snapshot schedules are offered on an hourly or a daily basis. When a data consumer accepts and configures their data share, they can subscribe to a snapshot schedule. This is beneficial in scenarios where the shared data is updated regularly, and the data consumer needs the most up-to-date data.
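The sketch below shows roughly how a provider might set this up with the Python SDK (the azure-mgmt-datashare package): creating a snapshot-based ("CopyBased") share and attaching a daily snapshot schedule. The account, share, and schedule names are placeholders:

    # Minimal sketch: a provider creating a snapshot-based share with a
    # daily snapshot schedule. Assumes the azure-mgmt-datashare package;
    # account, share, and resource-group names are placeholders.
    from datetime import datetime
    from azure.identity import DefaultAzureCredential
    from azure.mgmt.datashare import DataShareManagementClient
    from azure.mgmt.datashare.models import Share, ScheduledSynchronizationSetting

    client = DataShareManagementClient(DefaultAzureCredential(), "<subscription-id>")

    # "CopyBased" is the snapshot-based kind; "InPlace" is in-place sharing.
    share = client.shares.create(
        "<rg>", "<datashare-account>", "sales-share",
        Share(share_kind="CopyBased", description="Daily point-of-sale data"),
    )

    # Attach a recurring snapshot schedule that consumers can subscribe to.
    schedule = ScheduledSynchronizationSetting(
        recurrence_interval="Day",                     # "Hour" or "Day"
        synchronization_time=datetime(2023, 3, 6, 2, 0),
    )
    client.synchronization_settings.create(
        "<rg>", "<datashare-account>", "sales-share", "daily-snapshot", schedule,
    )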

When a data consumer accepts a data share, they're able to receive the data in a data store of their choice. For example, if the data provider shares data using Azure Blob Storage, the data consumer can receive this data in Azure Data Lake Store. Similarly, if the data provider shares data from Azure Synapse Analytics, the data consumer can choose whether to receive the data into Azure Data Lake Store, an Azure SQL Database, or Azure Synapse Analytics. If sharing from SQL-based sources, the data consumer can also choose whether to receive the data in Parquet or CSV format.

In-place sharing

With in-place sharing, data providers can share data where it resides without copying the data. After a sharing relationship is established through the invitation flow, a symbolic link is created between the data provider's source data store and the data consumer's target data store. The data consumer can read and query the data in real time using their own data store. Changes to the source data store are available to the data consumer immediately. In-place sharing is currently available for Azure Data Explorer.

Key capabilities

Azure Data Share enables data providers to:

  • Share data from the list of supported data stores with customers and partners outside of your organization

  • Keep track of who you have shared your data with

  • Choose between snapshot-based and in-place sharing

  • Control how frequently your data consumers receive updates to your data

  • Allow your customers to pull the latest version of your data as needed, or allow them to automatically receive incremental changes to your data at an interval defined by you.

Azure Data Share enables data consumers to:

  • View a description of the type of data being shared

  • View terms of use for the data

  • Accept or reject an Azure Data Share invitation

  • Accept data shared with you into a supported data store.

  • Access data in place or trigger a full or incremental snapshot of shared data

All key capabilities listed above are supported through the Azure portal or via REST APIs. For more details on using Azure Data Share through REST APIs, check out our reference documentation.
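As a rough illustration of the REST route, the sketch below lists the shares in a Data Share account by calling Azure Resource Manager directly. The resource path follows the standard ARM pattern for the Microsoft.DataShare provider; the api-version shown is an assumption, so check the reference documentation for the current value:

    # Minimal sketch: listing shares through the Azure Data Share REST API.
    # The api-version below is an assumption -- consult the reference
    # documentation for the current one. Names in <> are placeholders.
    import requests
    from azure.identity import DefaultAzureCredential

    token = DefaultAzureCredential().get_token("https://management.azure.com/.default")

    url = (
        "https://management.azure.com/subscriptions/<subscription-id>"
        "/resourceGroups/<rg>/providers/Microsoft.DataShare"
        "/accounts/<datashare-account>/shares"
    )
    resp = requests.get(
        url,
        params={"api-version": "2020-09-01"},
        headers={"Authorization": f"Bearer {token.token}"},
    )
    resp.raise_for_status()
    for share in resp.json().get("value", []):
        print(share["name"], share["properties"].get("shareKind"))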

Supported regions

For a list of Azure regions that make Azure Data Share available, refer to the products available by region page and search for Azure Data Share.

Metadata stored by Azure Data Share stays within the region for Southeast Asia (Singapore); for all other supported regions, it's stored within the geography. Azure Data Share doesn't store a copy of the shared data itself. The data is stored in the underlying data store that is being shared. For example, if a data provider stores their data in an Azure Data Lake Storage account located in West US, that is where the data is stored. If they're sharing data with an Azure Storage account located in West Europe via snapshot, typically the data is transferred directly to the Azure Storage account located in West Europe.

The Azure Data Share service doesn't have to be available in your region to use the service. For example, if you have data stored in an Azure Storage account located in a region where Azure Data Share isn't yet available, you can still use the service to share your data.

Azure NetApp Files


What is Azure NetApp Files

Azure NetApp Files is an Azure native, first-party, enterprise-class, high-performance file storage service. It provides NAS volumes as a service: you create a NetApp account and capacity pools, select service and performance levels, create volumes, and manage data protection. It lets you create and manage high-performance, highly available, and scalable file shares, using the same protocols and tools that you're familiar with and that enterprise applications rely on on-premises. Azure NetApp Files supports SMB and NFS protocols and can be used for use cases such as file sharing, home directories, databases, high-performance computing, and more. It also provides built-in availability, data protection, and disaster recovery capabilities.
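The provisioning hierarchy just described (NetApp account, then capacity pool, then volume) can be sketched with the Python management SDK (the azure-mgmt-netapp package). All names, sizes, and the delegated subnet ID below are placeholders:

    # Minimal sketch of the Azure NetApp Files provisioning hierarchy:
    # NetApp account -> capacity pool -> volume. All names, sizes, and
    # the delegated subnet ID are illustrative placeholders.
    from azure.identity import DefaultAzureCredential
    from azure.mgmt.netapp import NetAppManagementClient
    from azure.mgmt.netapp.models import NetAppAccount, CapacityPool, Volume

    client = NetAppManagementClient(DefaultAzureCredential(), "<subscription-id>")
    RG, ACCOUNT = "<rg>", "anf-account"

    client.accounts.begin_create_or_update(
        RG, ACCOUNT, NetAppAccount(location="westeurope"),
    ).result()

    # Capacity pools start at 4 TiB and carry a service level
    # (Standard, Premium, or Ultra).
    client.pools.begin_create_or_update(
        RG, ACCOUNT, "pool1",
        CapacityPool(location="westeurope", service_level="Premium",
                     size=4 * 1024**4),
    ).result()

    # Volumes are carved out of a pool and exported into a delegated subnet.
    client.volumes.begin_create_or_update(
        RG, ACCOUNT, "pool1", "vol1",
        Volume(
            location="westeurope",
            creation_token="vol1",             # export path name
            usage_threshold=100 * 1024**3,     # 100 GiB quota
            protocol_types=["NFSv3"],
            subnet_id="/subscriptions/<subscription-id>/resourceGroups/<rg>"
                      "/providers/Microsoft.Network/virtualNetworks/<vnet>"
                      "/subnets/<anf-delegated-subnet>",
        ),
    ).result()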

High performance

Azure NetApp Files is designed to provide high-performance file storage for enterprise workloads. Key features that contribute to the high performance include:

  • High throughput:
    Azure NetApp Files supports high throughput for large file transfers and can handle many random read and write operations with high concurrency, over the Azure high-speed network. This functionality helps to ensure that your workloads aren't bottlenecked by VM disk storage performance. Azure NetApp Files supports multiple service levels, such that you can choose the optimal mix of capacity, performance and cost.
  • Low latency:
    Azure NetApp Files is built on top of an all-flash bare-metal fleet, which is optimized for low latency, high throughput, and random IO. This functionality helps to ensure that your workloads experience optimal (low) storage latency.
  • Protocols:
    Azure NetApp Files supports SMB, NFSv3/NFSv4.1, and dual-protocol volumes, covering the most common protocols used in enterprise environments. This functionality allows you to use the same protocols and tools that you use on-premises, which helps to ensure compatibility and ease of use. It supports NFS nconnect and SMB multichannel for increased network performance.
  • Scale:
    Azure NetApp Files can scale up or down to meet the performance and capacity needs of your workloads. You can increase or decrease the size of your volumes as needed, and the service automatically provisions the necessary throughput.
  • Changing of service levels:
    With Azure NetApp Files, you can dynamically change your volumes’ service levels online to tune your capacity and performance needs whenever you need to. This functionality can even be fully automated through APIs (see the sketch after this list).
  • Optimized for workloads:
    Azure NetApp Files is optimized for workloads like HPC, IO-intensive, and database scenarios. It provides high performance, high availability, and scalability for demanding workloads.
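To illustrate the online service-level change mentioned above, the sketch below moves a volume to a capacity pool with a different service level, which is how a service-level change is expressed in the management API. It assumes the azure-mgmt-netapp package and the placeholder resources from the earlier sketch; the target pool ("pool-ultra") is a placeholder that must already exist:

    # Minimal sketch: changing a volume's service level online by moving
    # it to a capacity pool with a different service level. Resource
    # names are the placeholders from the earlier sketch.
    from azure.identity import DefaultAzureCredential
    from azure.mgmt.netapp import NetAppManagementClient
    from azure.mgmt.netapp.models import PoolChangeRequest

    client = NetAppManagementClient(DefaultAzureCredential(), "<subscription-id>")

    target_pool_id = (
        "/subscriptions/<subscription-id>/resourceGroups/<rg>"
        "/providers/Microsoft.NetApp/netAppAccounts/anf-account"
        "/capacityPools/pool-ultra"
    )
    client.volumes.begin_pool_change(
        "<rg>", "anf-account", "pool1", "vol1",
        PoolChangeRequest(new_pool_resource_id=target_pool_id),
    ).result()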

All these features work together to provide a high-performance file storage solution that can handle the demands of enterprise workloads.

High availability

Azure NetApp Files is designed to provide high availability for your file storage needs. Key features that contribute to the high availability include:

  • Automatic failover:
    Azure NetApp Files supports automatic failover within the bare-metal fleet if there's a disruption or maintenance event. This functionality helps to ensure that your data is always available, even during a failure.
  • Multi-protocol access:
    Azure NetApp Files supports both SMB and NFS protocols, helping to ensure that your applications can access your data, regardless of the protocol they use.
  • Self-healing:
    Azure NetApp Files is built on top of a self-healing storage infrastructure, which helps to ensure that your data is always available and recoverable.
  • Support for Availability Zones:
    Volumes can be deployed in an Availability Zone of choice, enabling you to build HA application architectures for increased application availability.
  • Data replication:
    Azure NetApp Files supports data replication between different Azure regions and Availability Zones, which helps to ensure that your data is always available, even during an outage.
  • High availability SLA:
    Azure NetApp Files provides a high-availability service level agreement (SLA).

All these features work together to provide a high-availability file storage solution to ensure that your data is always available, recoverable, and accessible to your applications, even in an outage.

Data protection

Azure NetApp Files provides built-in data protection to help ensure the safe storage, availability, and recoverability of your data. Key features include:

  • Snapshot copies:
    Azure NetApp Files allows you to create point-in-time snapshots of your volumes, which can be restored or reverted to a previous state (see the sketch after this list). Snapshots are incremental: they capture only the block-level changes made since the last snapshot, which drastically reduces storage consumption.
  • Backup and restore:
    Azure NetApp Files provides integrated backup, which allows you to create backups of your volume snapshots to lower-cost Azure storage and restore them if data loss happens.
  • Data replication:
    Azure NetApp Files supports data replication between different Azure regions and Availability Zones, which helps to ensure high availability and disaster recovery. Replication can be done asynchronously, and the service can fail over to a secondary region or zone in an outage.
  • Security:
    Azure NetApp Files provides built-in security features such as RBAC/IAM, Active Directory Domain Services (AD DS), Azure Active Directory Domain Services (AADDS) and LDAP integration, and Azure Policy. This functionality helps to protect data from unauthorized access, breaches, and misconfigurations.
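The snapshot workflow from the first bullet can be sketched with the same Python SDK: take a point-in-time snapshot, then revert the volume to it if data loss happens. Resource names remain the placeholders used in the earlier sketches:

    # Minimal sketch: taking a point-in-time snapshot of a volume and
    # reverting the volume to it. Assumes the azure-mgmt-netapp package
    # and the placeholder resources from the earlier sketches.
    from azure.identity import DefaultAzureCredential
    from azure.mgmt.netapp import NetAppManagementClient
    from azure.mgmt.netapp.models import Snapshot, VolumeRevert

    client = NetAppManagementClient(DefaultAzureCredential(), "<subscription-id>")
    RG, ACCOUNT, POOL, VOL = "<rg>", "anf-account", "pool1", "vol1"

    # Snapshots are incremental at the block level, so they are cheap to keep.
    snap = client.snapshots.begin_create(
        RG, ACCOUNT, POOL, VOL, "daily-0200",
        Snapshot(location="westeurope"),
    ).result()

    # Revert the whole volume to that snapshot if data loss happens.
    client.volumes.begin_revert(
        RG, ACCOUNT, POOL, VOL, VolumeRevert(snapshot_id=snap.id),
    ).result()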

All these features work together to provide a comprehensive data protection solution that helps to ensure that your data is always available, recoverable, and secure.

Data Box Gateway

Azure offers a range of services for moving data into and out of its cloud platform. The Azure Data Box family lets users transfer huge quantities of data into and out of Azure through a reliable process. The maximum usable storage capacity of each Data Box device is 80 TB, and each device ships in a rugged case to secure the data while in transit. In this blog post we'll go over Azure Data Box pricing and features in detail.

The pricing structure for Data Box:

Moving stored or in-flight data into a hybrid cloud platform is challenging. The Azure Data Box family offers solutions for both situations. The devices in the family can be categorized as follows:

  • Azure Data Box
  • Azure Data Box Disk
  • Azure Data Box Heavy
  • Azure Data Box Gateway

The Data Box, Data Box Disk, and Data Box Heavy fall under the category of offline data transfer devices. These are shipped between the user organization's datacenter and the Azure datacenter, with orders placed and tracked through the Azure portal. The Data Boxes use standard NAS protocols along with AES 256-bit encryption to protect the data, and a sanitization process is performed after upload to ensure that all data is wiped from the device.

On the other hand, the Azure Data Box Gateway is an online data transfer device that moves data into and out of Azure cloud storage over the network.

The pricing structure of each of these devices can be enumerated as follows:

  1. Azure Data Box

     Service                    Unit                         Price
     Service fee                1 unit (10 days included)    $250
     Extra day fee              1 day                        $15
     Standard shipping fee      1 package                    $95

  2. Azure Data Box Disk

     Service                    Unit                         Price
     Order processing fee       1 unit                       $50
     Extra day fee              Per disk per day             $10
     Standard shipping fee      1 package                    $30

  3. Azure Data Box Heavy

     Service                    Unit                         Price
     Service fee                1 unit (20 days included)    $4,000
     Extra day fee              1 day                        $100
     Standard shipping fee      1 freight unit               Starting from $1,500

  4. Azure Data Box Gateway

     Service                    Unit                         Price
     Monthly subscription fee   1 unit                       $125.00/month

An explanation of the components of Data Box:

Data Box architecture

The components of Data Box can be explained as follows:

  • Data Box – The device has a 100-TB raw capacity (about 80 TB usable) and uses standard NAS protocols.
  • Data Box Disk – This device is an 8-TB SSD with a USB/SATA interface and 128-bit AES encryption. It can be customized to the needs of the user company and is available in packs of up to five.
  • Data Box Heavy – A self-contained device capable of moving up to 1 PB of data to the cloud.

Azure Data Box Gateway: Meaning and features

The Azure Data Box Gateway is a virtual appliance that is provisioned in the virtual environment of the user organization. The device stays within the organization's on-premises environment.

The features of this device include the following:

  • The device makes it easier to move data into and out of Azure storage accounts.
  • The device delivers data transfer rates comparable to high-performance transfer methods.
  • The device has a local cache, whose size is set when the device is first provisioned.
  • The device lets clients keep writing data even when network connectivity is limited, as sketched below.
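Because the gateway exposes ordinary SMB/NFS shares backed by its local cache, writing to it looks like writing to any mounted share. The sketch below copies files to a hypothetical locally mounted gateway share; the paths are placeholders:

    # Minimal sketch: writing files to a Data Box Gateway share mounted
    # locally (here at a hypothetical SMB/NFS mount path). The gateway
    # caches the writes locally and uploads them to the mapped Azure
    # Storage account in the background, even over a limited network link.
    import shutil
    from pathlib import Path

    GATEWAY_SHARE = Path("/mnt/databox-gateway/backups")  # mounted gateway share
    SOURCE_DIR = Path("/data/nightly-exports")

    GATEWAY_SHARE.mkdir(parents=True, exist_ok=True)
    for src in SOURCE_DIR.glob("*.tar.gz"):
        # The copy completes against the gateway's local cache; the
        # upload to Azure happens asynchronously.
        shutil.copy2(src, GATEWAY_SHARE / src.name)
        print(f"queued {src.name} for upload")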

Process of buying:

Azure Data Box is part of the cloud resources available on the Azure hybrid cloud platform. It enables the user organization to perform bulk data transfers easily and quickly through a reliable process. The user company's data is transferred to a storage device with up to 80 TB of capacity. After the secure data transfer is completed, the data is uploaded into Azure storage.

Use cases of Azure Data Box:

Under the Azure Data Box pricing structure, the service is best suited for data transfers larger than 40 TB in situations where network connectivity is limited. The movement of data can be categorized in the following manner:

One-time migration

This migration occurs when large quantities of on-premises data are moved into Azure. An example is moving a media library from offline tapes into cloud storage to create an online media library.

Initial bulk data transfer

This is the process of performing an initial bulk data transfer with Data Box, followed by incremental transfers over the network.

Periodic transfers

The process is undertaken when large amounts of data are generated and periodic uploads are required to move it to Azure.

There are also scenarios where the Data Box is useful for exporting data from Azure. These include the following:

Disaster Recovery

During disaster recovery scenarios, huge amounts of Azure data are exported to a physical device, a Data Box. Microsoft then ships the device, and the data is restored on-premises again, where it acts as backup infrastructure.


Security needs

If a user organization is required to export data from Azure because of government or security guidelines, Data Box makes this possible.

Migrating back to on-premises setup

If the user company wishes to migrate back to an on-premises setup or shift to a different cloud service provider, this can be arranged through Data Box.

Security capabilities:

The security protections under the Azure Data Box pricing structure can be enumerated in the following manner:

  • The Data Box consists of built-in security features to protect the data, device, and service.
  • The device comes in a rugged casing that is secured by tamper-resistant screws and tamper-evident stickers.
  • The data stored on the device is kept secure with AES 256-bit encryption at all times.
  • The devices can only be unlocked with specific device passwords provided by the Azure Portal for each of the devices.
  • The Data Box service is protected through the Azure security features.
  • In the case of an import order, the device disks are wiped clean in accordance with NIST 800-88r1 standards. For an export order, the disks are erased once the device is received at the Azure datacenter.