Sunday, 20 March 2022

Amazon EFS

 

  • A fully-managed file storage service that makes it easy to set up and scale file storage in the Amazon Cloud.

Features

  • The service manages all the file storage infrastructure for you, avoiding the complexity of deploying, patching, and maintaining complex file system configurations.
  • EFS supports the Network File System version 4 protocol.
  • You can mount EFS file systems onto EC2 instances running Linux or macOS Big Sur. Windows is not supported.
  • Aside from EC2 instances, you can also mount EFS filesystems on ECS tasks, EKS pods, and Lambda functions.
  • Multiple Amazon EC2 instances can access an EFS file system at the same time, providing a common data source for workloads and applications running on more than one instance or server.
  • EFS file systems store data and metadata across multiple Availability Zones in an AWS Region.
  • EFS file systems can grow to petabyte scale, drive high levels of throughput, and allow massively parallel access from EC2 instances to your data.
  • EFS provides file system access semantics, such as strong data consistency and file locking.
  • EFS enables you to control access to your file systems through Portable Operating System Interface (POSIX) permissions.
  • Moving your EFS file data can be managed simply with AWS DataSync – a managed data transfer service that makes it faster and simpler to move data between on-premises storage and Amazon EFS.
  • You can schedule automatic incremental backups of your EFS file system using the EFS-to-EFS Backup solution.
  • Amazon EFS Infrequent Access (EFS IA) is a storage class for Amazon EFS that is cost-optimized for files that are accessed less frequently. Customers can use EFS IA by creating a new file system and enabling Lifecycle Management. With Lifecycle Management enabled, EFS automatically moves files that have not been accessed for the configured period (for example, 30 days) from the Standard storage class to the Infrequent Access storage class. To lower your costs further in exchange for reduced durability, you can use the EFS One Zone-IA storage class.

Performance Modes

  • General purpose performance mode (default)
    • Ideal for latency-sensitive use cases.
  • Max I/O mode
    • Can scale to higher levels of aggregate throughput and operations per second with a tradeoff of slightly higher latencies for file operations.

Throughput Modes

  • Bursting Throughput mode (default)
    • Throughput scales as your file system grows.
  • Provisioned Throughput mode
    • You specify the throughput of your file system independent of the amount of data stored.

Mount Targets

  • To access your EFS file system in a VPC, you create one or more mount targets in the VPC. A mount target provides an IP address for an NFSv4 endpoint.
  • You can create one mount target in each Availability Zone in a region.
  • You mount your file system using its DNS name, which resolves to the IP address of the EFS mount target. The DNS name has the following format:
    file-system-id.efs.aws-region.amazonaws.com
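As a small illustration of the DNS format above, the sketch below assembles the name from a file system ID and a Region. The function name and the example ID fs-12345678 are hypothetical, chosen only for illustration.

```python
def efs_dns_name(file_system_id: str, region: str) -> str:
    """Build an EFS DNS name in the form file-system-id.efs.aws-region.amazonaws.com."""
    return f"{file_system_id}.efs.{region}.amazonaws.com"


# Hypothetical file system ID, used only to show the shape of the result.
print(efs_dns_name("fs-12345678", "us-east-1"))
```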

  • When using Amazon EFS with an on-premises server, the server must run a Linux-based operating system.

Access Points

  • EFS Access Points simplify how applications are provided access to shared data sets in an EFS file system. 
  • EFS Access Points work together with AWS IAM and enforce an operating system user and group, and a directory for every file system request made through the access point. 

Components of a File System

  • ID
  • creation token
  • creation time
  • file system size in bytes
  • number of mount targets created for the file system
  • file system state
  • mount target

Data Consistency in EFS

  • EFS provides the open-after-close consistency semantics that applications expect from NFS.
  • Write operations will be durably stored across Availability Zones.
  • Applications that perform synchronous data access and perform non-appending writes will have read-after-write consistency for data access.

Managing File Systems

  • You can create encrypted file systems. EFS supports encryption in transit and encryption at rest.
  • Managing file system network accessibility refers to managing the mount targets:
    • Creating and deleting mount targets in a VPC
    • Updating the mount target configuration
  • You can create new tags, update values of existing tags, or delete tags associated with a file system.
  • The following list explains the metered data size for different types of file system objects.
    • Regular files – the metered data size of a regular file is the logical size of the file rounded to the next 4-KiB increment, except that it may be less for sparse files.
      • A sparse file is a file in which data has not been written to every position before its logical size is reached. For a sparse file, if the actual storage used is less than the logical size rounded to the next 4-KiB increment, Amazon EFS reports actual storage used as the metered data size.
    • Directories – the metered data size of a directory is the actual storage used for the directory entries and the data structure that holds them, rounded to the next 4 KiB increment. The metered data size doesn’t include the actual storage used by the file data.
    • Symbolic links and special files – the metered data size for these objects is always 4 KiB.
  • File system deletion is a destructive action that you can’t undo. You lose the file system and any data you have in it, and you can’t restore the data. You should always unmount a file system before you delete it.
  • You can use AWS DataSync to automatically, efficiently, and securely copy files between two Amazon EFS resources, including file systems in different AWS Regions and ones owned by different AWS accounts.  Using DataSync to copy data between EFS file systems, you can perform one-time migrations, periodic ingest for distributed workloads, or automate replication for data protection and recovery.
  • File systems created using the Amazon EFS console are automatically backed up daily through AWS Backup with a retention of 35 days. You can also disable automatic backups for your file systems at any time.
  • Amazon CloudWatch metrics can monitor your EFS file system storage usage, including the size in each of the EFS storage classes.
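The metered data size rules listed above (regular files, sparse files, directories, symbolic links and special files) can be sketched roughly as follows. This is only a model of the rules as stated in these notes; the function names and the `kind` labels are hypothetical and not part of any EFS API.

```python
BLOCK = 4 * 1024  # 4 KiB metering increment, in bytes


def round_up_4k(size_bytes: int) -> int:
    """Round a byte count up to the next 4-KiB increment."""
    return -(-size_bytes // BLOCK) * BLOCK


def metered_size(kind: str, logical_size: int = 0, actual_storage=None) -> int:
    """Return the metered size in bytes for one file system object.

    kind is one of "file", "directory", "symlink", "special" (labels are
    assumptions made for this sketch).
    """
    if kind in ("symlink", "special"):
        return BLOCK  # always 4 KiB
    if kind == "directory":
        # Storage used by the directory entries, rounded up to 4 KiB.
        return round_up_4k(actual_storage or 0)
    if kind == "file":
        rounded = round_up_4k(logical_size)
        # Sparse file: report actual storage when it is below the rounded size.
        if actual_storage is not None and actual_storage < rounded:
            return actual_storage
        return rounded
    raise ValueError(f"unknown object kind: {kind}")


print(metered_size("file", logical_size=5000))  # a 5,000-byte regular file
```

A 5,000-byte regular file is metered at 8,192 bytes under these rules, while a sparse file with a 1-MiB logical size and 8 KiB of real data would be metered at 8 KiB.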

Mounting File Systems

  • To mount your EFS file system on your EC2 instance, use the mount helper in the amazon-efs-utils package.
  • You can mount your EFS file systems on your on-premises data center servers when connected to your Amazon VPC with AWS Direct Connect or VPN.
  • You can use fstab to automatically mount your file system with the mount helper whenever the EC2 instance reboots.
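As a rough sketch of what such an fstab entry can look like, the helper below formats one line for the efs mount type provided by amazon-efs-utils. The function name, the example file system ID, and the mount point are hypothetical, and the option set (_netdev, noresvport, tls) is an assumption based on commonly documented mount helper usage, so check it against the amazon-efs-utils documentation for your setup.

```python
def efs_fstab_entry(file_system_id: str, mount_point: str, use_tls: bool = True) -> str:
    """Format an /etc/fstab line that mounts an EFS file system via the mount helper."""
    # _netdev delays mounting until networking is up; the option list is an assumption.
    options = ["_netdev", "noresvport"]
    if use_tls:
        options.append("tls")  # encryption in transit through the mount helper
    return f"{file_system_id}:/ {mount_point} efs {','.join(options)} 0 0"


# Hypothetical ID and mount point, for illustration only.
print(efs_fstab_entry("fs-12345678", "/mnt/efs"))
```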

Lifecycle Management

  • You can choose from five EFS Lifecycle Management policies (7, 14, 30, 60, or 90 days) to automatically move files into the EFS Infrequent Access (EFS IA) storage class and save up to 85% in cost.
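To make the policy choices concrete, here is a minimal model of the eligibility check: a file becomes a candidate for EFS IA once it has gone unaccessed for the chosen number of days. The function and constant names are hypothetical, and the exact boundary behavior of the real service is not specified in these notes.

```python
from datetime import datetime, timedelta

ALLOWED_POLICY_DAYS = (7, 14, 30, 60, 90)  # the five policies listed above


def eligible_for_ia(last_access: datetime, now: datetime, policy_days: int) -> bool:
    """Return True if a file has not been accessed for at least policy_days."""
    if policy_days not in ALLOWED_POLICY_DAYS:
        raise ValueError(f"policy must be one of {ALLOWED_POLICY_DAYS}")
    return now - last_access >= timedelta(days=policy_days)


now = datetime(2022, 3, 20)
last_access = datetime(2022, 2, 1)  # 47 days before `now`
print(eligible_for_ia(last_access, now, 30), eligible_for_ia(last_access, now, 60))
```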

Monitoring File Systems

  • Amazon CloudWatch Alarms
  • Amazon CloudWatch Logs
  • Amazon CloudWatch Events
  • AWS CloudTrail Log Monitoring
  • Log files on your file system

Security

  • You must have valid credentials to make EFS API requests, such as creating a file system.
  • You must also have permissions to create or access resources.
  • When you first create the file system, there is only one root directory at /. By default, only the root user (UID 0) has read-write-execute permissions.
  • Specify EC2 security groups for your EC2 instances and security groups for the EFS mount targets associated with the file system.
  • You can use AWS IAM to manage Network File System (NFS) access for Amazon EFS. You can use IAM roles to identify NFS clients with cryptographic security and use IAM policies to manage client-specific permissions.

Pricing

  • You pay only for the storage used by your file system.
  • Costs related to Provisioned Throughput are determined by the throughput values you specify.

EFS vs EBS vs S3

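  • Performance Comparison

|                       | Amazon EFS               | Amazon EBS Provisioned IOPS |
| --------------------- | ------------------------ | --------------------------- |
| Per-operation latency | Low, consistent latency. | Lowest, consistent latency. |
| Throughput scale      | Multiple GBs per second  | Single GB per second        |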

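  • Performance Comparison

|                       | Amazon EFS               | Amazon S3                                                       |
| --------------------- | ------------------------ | --------------------------------------------------------------- |
| Per-operation latency | Low, consistent latency. | Low, for mixed request types, and integration with CloudFront. |
| Throughput scale      | Multiple GBs per second  | Multiple GBs per second                                         |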

  • Storage Comparison

|                             | Amazon EFS | Amazon EBS Provisioned IOPS |
| --------------------------- | ---------- | --------------------------- |
| Availability and durability | Data are stored redundantly across multiple AZs. | Data are stored redundantly in a single AZ. |
| Access                      | Up to thousands of EC2 instances from multiple AZs can connect concurrently to a file system. | A single EC2 instance in a single AZ can connect to a file system. |
| Use cases                   | Big data and analytics, media processing workflows, content management, web serving, and home directories. | Boot volumes, transactional and NoSQL databases, data warehousing, and ETL. |

|                             | Amazon EFS | Amazon S3 |
| --------------------------- | ---------- | --------- |
| Availability and durability | Data are stored redundantly across multiple AZs. | Stored redundantly across multiple AZs. |
| Access                      | Up to thousands of EC2 instances from multiple AZs can connect concurrently to a file system. | One to millions of connections over the web. |
| Use cases                   | Big data and analytics, media processing workflows, content management, web serving, and home directories. | Web serving and content management, media and entertainment, backups, big data analytics, data lake. |

Amazon EBS

 

  • Block level storage volumes for use with EC2 instances.
  • Well-suited for use as the primary storage for file systems, databases, or for any applications that require fine granular updates and access to raw, unformatted, block-level storage.
  • Well-suited to both database-style applications (random reads and writes), and to throughput-intensive applications (long, continuous reads and writes).
  • New EBS volumes receive their maximum performance the moment that they are available and do not require initialization (formerly known as pre-warming). However, storage blocks on volumes that were restored from snapshots must be initialized (pulled down from Amazon S3 and written to the volume) before you can access the block.
  • Termination protection is turned off by default and must be manually enabled (keeps the volume/data when the instance is terminated)
  • You can have up to 5,000 EBS volumes by default
  • You can have up to 10,000 snapshots by default

Features

  • Different types of storage options: General Purpose SSD (gp2, gp3), Provisioned IOPS SSD (io1, io2), Throughput Optimized HDD (st1), and Cold HDD (sc1) volumes, each up to 16 TiB in size, or 64 TiB for io2 Block Express.
  • You can mount multiple volumes on the same instance, and you can mount a Provisioned IOPS volume to multiple instances at a time using Amazon EBS Multi-Attach.
  • Enable Multi-Attach on EBS Provisioned IOPS io1 volumes to allow a single volume to be concurrently attached to up to sixteen AWS Nitro System-based Amazon EC2 instances within the same AZ.
  • You can create a file system on top of these volumes, or use them in any other way you would use a block device (like a hard drive).
  • You can use encrypted EBS volumes to meet data-at-rest encryption requirements for regulated/audited data and applications.
  • You can create point-in-time snapshots of EBS volumes, which are persisted to Amazon S3. Similar to AMIs. Snapshots can be copied across AWS regions.
  • Volumes are created in a specific AZ, and can then be attached to any instances in that same AZ. To make a volume available outside of the AZ, you can create a snapshot and restore that snapshot to a new volume anywhere in that region.
  • You can copy snapshots to other regions and then restore them to new volumes there, making it easier to leverage multiple AWS regions for geographical expansion, data center migration, and disaster recovery.
  • Performance metrics, such as bandwidth, throughput, latency, and average queue length, provided by Amazon CloudWatch, allow you to monitor the performance of your volumes to make sure that you are providing enough performance for your applications without paying for resources you don’t need.
  • You can detach an EBS volume from an instance explicitly or by terminating the instance. However, if the instance is running, you must first unmount the volume from the instance.
  • If an EBS volume is the root device of an instance, you must stop the instance before you can detach the volume.
  • You can use AWS Backup, an automated and centralized backup service, to protect EBS volumes and your other AWS resources. AWS Backup is integrated with Amazon DynamoDB, Amazon EBS, Amazon RDS, Amazon EFS, and AWS Storage Gateway to give you a fully managed AWS backup solution.
  • With AWS Backup, you can configure backups for EBS volumes, automate backup scheduling, set retention policies, and monitor backup and restore activity.
  • EBS fast snapshot restore allows you to create a volume from a snapshot that is fully initialized. This removes the latency of I/O operations on the block when accessed for the first time.

Types of EBS Volumes

  • General Purpose SSD (gp3)
    • Delivers a consistent baseline rate of 3,000 IOPS and 125 MiB/s. You can provision additional IOPS (up to 16,000) and throughput (up to 1,000 MiB/s) for an additional cost.
    • The maximum ratio of provisioned IOPS to provisioned volume size is 500 IOPS per GiB. The maximum ratio of provisioned throughput to provisioned IOPS is 0.25 MiB/s per IOPS.
  • General Purpose SSD (gp2)
    • Base performance of 3 IOPS/GiB, with the ability to burst to 3,000 IOPS for extended periods of time.
    • Support up to 16,000 IOPS and 250 MB/s of throughput.
    • The burst duration of a volume is dependent on the size of the volume, the burst IOPS required, and the credit balance when the burst begins. Burst IO duration is computed using the following formula:

Burst duration = (Credit balance) / [(Burst IOPS) - 3 × (Volume size in GiB)]

    • If your gp2 volume uses all of its I/O credit balance, the maximum IOPS performance of the volume remains at the baseline IOPS performance level and the volume’s maximum throughput is reduced to the baseline IOPS multiplied by the maximum I/O size.
    • Throughput for a gp2 volume can be calculated using the following formula, up to the throughput limit of 250 MiB/s:

Throughput in MiB/s = (Volume size in GiB) × (IOPS per GiB) × (I/O size in KiB) / 1024
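The two gp2 formulas above can be checked with a short calculation. This is a sketch under stated assumptions: the function names are hypothetical, the 5.4 million credit balance in the example is an assumed starting value that does not come from these notes, the default I/O size of 256 KiB is also an assumption, and the cap defaults to the 250 MB/s figure quoted for gp2 above. The product of size, IOPS per GiB, and I/O size in KiB is divided by 1024 to express the result in MiB/s.

```python
GP2_IOPS_PER_GIB = 3  # baseline rate from the notes above


def gp2_burst_duration_s(credit_balance: float, burst_iops: float, size_gib: float) -> float:
    """Seconds a gp2 volume can sustain burst_iops before its credits run out."""
    drain_rate = burst_iops - GP2_IOPS_PER_GIB * size_gib  # credits spent per second
    if drain_rate <= 0:
        return float("inf")  # baseline already covers the requested IOPS
    return credit_balance / drain_rate


def gp2_throughput_mib_s(size_gib: float, io_size_kib: float = 256, limit_mib_s: float = 250) -> float:
    """Throughput in MiB/s: size x IOPS per GiB x I/O size, capped at the limit."""
    raw = size_gib * GP2_IOPS_PER_GIB * io_size_kib / 1024  # KiB/s -> MiB/s
    return min(raw, limit_mib_s)


# Assumed example: 100 GiB volume, 5.4 million credits, bursting at 3,000 IOPS.
print(gp2_burst_duration_s(5_400_000, 3000, 100))  # 2000.0 seconds
print(gp2_throughput_mib_s(100))                   # 75.0 MiB/s
```

For a 1,000 GiB volume the uncapped value would be 750 MiB/s, so the function returns the cap instead.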

  • Provisioned IOPS SSD (io1/io2)
    • Designed for I/O-intensive workloads, particularly database workloads, which are sensitive to storage performance and consistency.
    • Allows you to specify a consistent IOPS rate when you create the volume
    • EBS Provisioned IOPS io2 features higher durability of 99.999%, and supports provisioning 500 IOPS for every provisioned GB. EBS io2 has 100x better volume durability and a 10x higher IOPS to storage ratio than io1, for the same price as io1.
  • Throughput Optimized HDD (st1)
    • Low-cost magnetic storage that focuses on throughput rather than IOPS.
    • Throughput of up to 500 MiB/s.
    • Subject to throughput and throughput-credit caps, the available throughput of an st1 volume is expressed by the following formula:

(Volume size) × (Credit accumulation rate per TiB) = Throughput

  • Cold HDD (sc1)
    • Low-cost magnetic storage that focuses on throughput rather than IOPS.
    • Throughput of up to 250 MiB/s.
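The HDD throughput formula given for st1 can be sketched as a one-line calculation with a cap. The per-TiB rates used in the example (40 MiB/s for st1 and 12 MiB/s for sc1) are assumptions that do not appear in these notes; the caps of 500 and 250 MiB/s are the figures listed above. The function name is hypothetical.

```python
def hdd_throughput_mib_s(size_tib: float, rate_per_tib: float, cap_mib_s: float) -> float:
    """Volume size x credit accumulation rate per TiB, limited by the volume type's cap."""
    return min(size_tib * rate_per_tib, cap_mib_s)


# Assumed baseline rates: 40 MiB/s per TiB (st1), 12 MiB/s per TiB (sc1).
print(hdd_throughput_mib_s(2, 40, 500))   # small st1 volume, below the cap
print(hdd_throughput_mib_s(16, 40, 500))  # large st1 volume, limited by the cap
print(hdd_throughput_mib_s(16, 12, 250))  # large sc1 volume
```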

| Volume name | General Purpose SSD | General Purpose SSD | Provisioned IOPS SSD | Provisioned IOPS SSD |
| --- | --- | --- | --- | --- |
| Volume type | gp3 | gp2 | io2 | io1 |
| Description | General Purpose SSD volume that balances price performance for a wide variety of transactional workloads | General Purpose SSD volume that balances price performance for a wide variety of transactional workloads | High performance SSD volume designed for business-critical latency-sensitive applications | High performance SSD volume designed for latency-sensitive transactional workloads |
| Use cases | Virtual desktops, medium sized single instance databases such as MSFT SQL Server and Oracle DB, low-latency interactive apps, dev & test, boot volumes | Boot volumes, low-latency interactive apps, dev & test | Workloads that require sub-millisecond latency, and sustained IOPS performance or more than 64,000 IOPS or 1,000 MiB/s of throughput | Workloads that require sustained IOPS performance or more than 16,000 IOPS and I/O-intensive database workloads |
| Volume size | 1 GB – 16 TB | 1 GB – 16 TB | 4 GB – 16 TB (64 TB for io2 Block Express) | 4 GB – 16 TB |
| Durability | 99.8% – 99.9% | 99.8% – 99.9% | 99.999% | 99.8% – 99.9% |
| Max IOPS / volume | 16,000 | 16,000 | 64,000 (256,000 for io2 Block Express) | 64,000 |
| Max throughput / volume | 1,000 MB/s | 250 MB/s | 1,000 MB/s (4,000 MiB/s for io2 Block Express) | 1,000 MB/s |
| Max IOPS / instance | 260,000 | 260,000 | 160,000 (260,000 for io2 Block Express) | 260,000 |
| Max IOPS / GB | N/A | N/A | 500 IOPS/GB (1,000 IOPS/GB for io2 Block Express) | 50 IOPS/GB |
| Max throughput / instance | 7,500 MB/s | 7,500 MB/s | 4,750 MB/s (7,500 MB/s for io2 Block Express) | 7,500 MB/s |
| Latency | Single-digit millisecond | Single-digit millisecond | Single-digit millisecond | Single-digit millisecond |
| Multi-Attach | No | No | Yes | Yes |
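The gp3 figures in these notes (3,000 IOPS and 125 MiB/s baseline, up to 16,000 IOPS and 1,000 MiB/s, at most 500 IOPS per GiB and 0.25 MiB/s per IOPS) can be turned into a simple configuration check. This is a sketch, not an AWS API: the function name is hypothetical, and treating the 3,000 IOPS baseline as always allowed regardless of volume size is an assumption.

```python
def validate_gp3(size_gib: int, iops: int = 3000, throughput_mib_s: int = 125) -> list:
    """Return a list of problems with a proposed gp3 configuration (empty if none)."""
    problems = []
    if not 3000 <= iops <= 16000:
        problems.append("IOPS must be between 3,000 and 16,000")
    if not 125 <= throughput_mib_s <= 1000:
        problems.append("throughput must be between 125 and 1,000 MiB/s")
    # Assumption: the 3,000 IOPS baseline is available even on very small volumes.
    if iops > max(3000, 500 * size_gib):
        problems.append("IOPS exceeds 500 IOPS per GiB")
    if throughput_mib_s > 0.25 * iops:
        problems.append("throughput exceeds 0.25 MiB/s per IOPS")
    return problems


print(validate_gp3(100))                   # baseline configuration
print(validate_gp3(8, iops=16000))         # too many IOPS for an 8 GiB volume
print(validate_gp3(100, 3000, 1000))       # 1,000 MiB/s needs at least 4,000 IOPS
```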

 

| Volume name | Throughput Optimized HDD | Cold HDD |
| --- | --- | --- |
| Volume type | st1 | sc1 |
| Description | Low cost HDD volume designed for frequently accessed, throughput-intensive workloads | Throughput-oriented storage for data that is infrequently accessed; scenarios where the lowest storage cost is important |
| Use cases | Big data, data warehouses, log processing | Colder data requiring fewer scans per day |
| Volume size | 125 GB – 16 TB | 125 GB – 16 TB |
| Durability | 99.8% – 99.9% | 99.8% – 99.9% |
| Max IOPS / volume | 500 | 250 |
| Max throughput / volume | 500 MB/s | 250 MB/s |
| Max IOPS / instance | 260,000 | 260,000 |
| Max IOPS / GB | N/A | N/A |
| Max throughput / instance | 7,500 MB/s | 7,500 MB/s |
| Multi-Attach | No | No |

 

Encryption

  • Data stored at rest on an encrypted volume, disk I/O, and snapshots created from it are all encrypted.
  • Also provides encryption for data in transit between EC2 and EBS, since encryption occurs on the servers that host EC2 instances.
  • The following types of data are encrypted:
    • Data at rest inside the volume
    • All data moving between the volume and the instance
    • All snapshots created from the volume
    • All volumes created from those snapshots
  • Uses AWS Key Management Service (AWS KMS) master keys when creating encrypted volumes and any snapshots created from your encrypted volumes.
  • Volumes restored from encrypted snapshots are automatically encrypted.
  • EBS encryption is only available on certain instance types.
  • There is no direct way to encrypt an existing unencrypted volume, or to remove encryption from an encrypted volume. However, you can migrate data between encrypted and unencrypted volumes.
  • You can now enable Amazon Elastic Block Store (EBS) Encryption by Default, ensuring that all new EBS volumes created in your account are encrypted.

Monitoring

  • CloudWatch monitoring comes in two types: basic and detailed.
  • Volume status checks provide the information that you need to determine whether your EBS volumes are impaired, and help you control how a potentially inconsistent volume is handled. Statuses include:
    • Ok – normal volume
    • Warning – degraded volume
    • Impaired – stalled volume
    • Insufficient-data –  insufficient data
  • Volume events include a start time that indicates the time at which an event occurred, and a duration that indicates how long I/O for the volume was disabled. The end time is added to the event when I/O for the volume is enabled.
  • Volume events are:
    • Awaiting Action: Enable IO
    • IO Enabled
    • IO Auto-Enabled
    • Normal
    • Degraded
    • Severely Degraded
    • Stalled

Modifying the Size, IOPS, or Type of an EBS Volume on Linux

  • If your current-generation EBS volume is attached to a current-generation EC2 instance type, you can increase its size, change its volume type, or (for an io1 volume) adjust its IOPS performance, all without detaching it.
  • EBS currently supports a maximum volume size of 16 TiB.
  • Two partitioning schemes are commonly used on Linux and Windows systems: master boot record (MBR) and GUID partition table (GPT).
  • An EBS volume being modified goes through a sequence of states: it first enters the Modifying state, then the Optimizing state, and finally the Complete state.
  • You can expand a partition to a new size. Expand by using parted or gdisk.
  • Use a file system–specific command to resize the file system to the larger size of the new volume. These commands work even if the volume to extend is the root volume. For ext2, ext3, and ext4 file systems, this command is resize2fs. For XFS file systems, this command is xfs_growfs.
  • Decreasing the size of an EBS volume is not supported.
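The file system resize step described above can be sketched as a small helper that picks the right command for the file system type. The function name and the example device and mount point are hypothetical; the helper only builds the argument list and does not run anything.

```python
def resize_command(fs_type: str, device: str, mount_point: str) -> list:
    """Return the command (as an argument list) that grows a file system after a volume resize."""
    if fs_type in ("ext2", "ext3", "ext4"):
        # resize2fs operates on the device or partition.
        return ["resize2fs", device]
    if fs_type == "xfs":
        # xfs_growfs operates on the mounted file system; -d grows the data section.
        return ["xfs_growfs", "-d", mount_point]
    raise ValueError(f"no resize command known for file system type: {fs_type}")


# Hypothetical device and mount point, for illustration only.
print(resize_command("ext4", "/dev/xvda1", "/"))
print(resize_command("xfs", "/dev/xvdf1", "/data"))
```

The returned list could be passed to subprocess.run with appropriate privileges once the partition itself has been extended.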

EBS Snapshots

  • Back up the data on your EBS volumes to S3 by taking point-in-time snapshots.
  • Snapshots are incremental backups, which means that only the blocks on the device that have changed after your most recent snapshot are saved. This minimizes the time required to create the snapshot and saves on storage costs by not duplicating data.
  • When you delete a snapshot, only the data unique to that snapshot is removed.

  • You can share a snapshot across AWS accounts by modifying its access permissions.
  • You can make copies of your own snapshots as well as snapshots that have been shared with you.
  • A snapshot is constrained to the Region where it was created.
  • EBS snapshots broadly support EBS encryption.
  • You can’t delete a snapshot of the root device of an EBS volume used by a registered AMI. You must first deregister the AMI before you can delete the snapshot.
  • Each account can have up to 5 concurrent snapshot copy requests to a single destination Region.
  • User-defined tags are not copied from the source snapshot to the new snapshot.
  • Snapshots are constrained to the Region in which they were created. To share a snapshot with another Region, copy the snapshot to that Region.
  • Snapshots encrypted with the default AWS managed key cannot be shared. Encrypted snapshots that you intend to share must instead be encrypted with a custom CMK.
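The incremental behavior described in this section (only changed blocks are saved, and deleting a snapshot removes only the data unique to it) can be illustrated with a toy model. This is a conceptual sketch, not how EBS is implemented; the class and method names are hypothetical.

```python
class SnapshotStore:
    """Toy model of incremental snapshots that share unchanged blocks."""

    def __init__(self):
        self.blocks = {}     # block id -> stored data
        self.snapshots = {}  # snapshot name -> {block index: block id}
        self._next_id = 0

    def create(self, name, volume, parent=None):
        """Snapshot a volume ({index: bytes}); return how many blocks were newly stored."""
        parent_map = self.snapshots.get(parent, {})
        mapping, stored = {}, 0
        for index, data in volume.items():
            old_id = parent_map.get(index)
            if old_id is not None and self.blocks[old_id] == data:
                mapping[index] = old_id  # unchanged block: reference the existing copy
            else:
                self.blocks[self._next_id] = data  # changed or new block: store it
                mapping[index] = self._next_id
                self._next_id += 1
                stored += 1
        self.snapshots[name] = mapping
        return stored

    def delete(self, name):
        """Delete a snapshot; return how many blocks were freed (unique to it)."""
        mapping = self.snapshots.pop(name)
        still_used = {b for m in self.snapshots.values() for b in m.values()}
        freed = [b for b in set(mapping.values()) if b not in still_used]
        for block_id in freed:
            del self.blocks[block_id]
        return len(freed)


store = SnapshotStore()
volume = {0: b"a", 1: b"b", 2: b"c"}
print(store.create("snap1", volume))                  # first snapshot stores all 3 blocks
volume[1] = b"B"
print(store.create("snap2", volume, parent="snap1"))  # only the changed block is stored
print(store.delete("snap1"))                          # only the old copy of block 1 is freed
```

After deleting the first snapshot, the second one still references every block it needs, which is why the remaining snapshot stays fully restorable.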

Amazon EBS–Optimized Instances

  • Provides the best performance for your EBS volumes by minimizing contention between EBS I/O and other traffic from your instance.
  • EBS–optimized instances deliver dedicated bandwidth between 500 Mbps and 60,000 Mbps to EBS.
  • For instance types that are EBS–optimized by default, there is no need to enable EBS optimization and no effect if you disable EBS optimization.

Pricing

  • You are charged by the amount you provision in GB per month until you release the storage.
  • Provisioned storage for gp2 volumes, provisioned storage and provisioned IOPS for io1 volumes, provisioned storage for st1 and sc1 volumes will be billed in per-second increments, with a 60 second minimum.
  • With Provisioned IOPS SSD (io1) volumes, you are also charged by the amount you provision in IOPS per month.
  • After you detach a volume, you are still charged for volume storage as long as the storage amount exceeds the limit of the AWS Free Tier. You must delete a volume to avoid incurring further charges.
  • Snapshot storage is based on the amount of space your data consumes in Amazon S3.
  • Copying a snapshot to a new Region does incur new storage costs.
  • When you enable EBS optimization for an instance that is not EBS-optimized by default, you pay an additional low, hourly fee for the dedicated capacity.
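The per-second billing rule with a 60-second minimum, mentioned above, can be sketched as follows. The function names are hypothetical, the price is a made-up placeholder rather than a real AWS rate, and the 30-day month used to prorate the monthly price is an assumption.

```python
import math

SECONDS_PER_MONTH = 30 * 24 * 3600  # assumption: a 30-day billing month


def billable_seconds(duration_seconds: float) -> int:
    """Round usage up to whole seconds and apply the 60-second minimum."""
    return max(60, math.ceil(duration_seconds))


def storage_cost(size_gb: float, duration_seconds: float, price_per_gb_month: float) -> float:
    """Prorated cost of provisioned storage for the given duration."""
    fraction_of_month = billable_seconds(duration_seconds) / SECONDS_PER_MONTH
    return size_gb * price_per_gb_month * fraction_of_month


print(billable_seconds(10))    # short-lived volume is billed for the minimum
print(billable_seconds(90.2))  # partial seconds round up
# Placeholder price of 0.10 per GB-month, for illustration only.
print(storage_cost(100, SECONDS_PER_MONTH, 0.10))
```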

Improving Performance

  • Use EBS-Optimized Instances
  • Understand How Performance is Calculated
  • Understand Your Workload
  • Be Aware of the Performance Penalty When Initializing Volumes from Snapshots
  • Factors That Can Degrade HDD Performance
  • Increase Read-Ahead for High-Throughput, Read-Heavy Workloads on st1 and sc1
  • Use a Modern Linux Kernel
  • Use RAID 0 (Redundant Array of Independent Disks) to Maximize Utilization of Instance Resources
  • Track Performance Using Amazon CloudWatch