Sunday, 20 March 2022

Amazon S3

 

  • S3 stores data as objects within buckets.
  • An object consists of a file and optionally any metadata that describes that file.
  • A key is the unique identifier for an object within a bucket.
  • Storage capacity is virtually unlimited.
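
A minimal boto3 sketch of the object model above; the bucket name, key, and metadata here are hypothetical placeholders:

```python
import boto3

s3 = boto3.client("s3")

# Upload a file as an object; the key uniquely identifies it within the bucket.
s3.put_object(
    Bucket="example-notes-bucket",        # hypothetical bucket name
    Key="reports/2022/march.csv",         # the object's unique key
    Body=b"col1,col2\n1,2\n",
    Metadata={"author": "me"},            # optional user-defined metadata
)

# Fetch it back by bucket + key.
obj = s3.get_object(Bucket="example-notes-bucket", Key="reports/2022/march.csv")
print(obj["Body"].read())
```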

Buckets

  • For each bucket, you can:
    • Control access to it (create, delete, and list objects in the bucket)
    • View access logs for it and its objects
    • Choose the geographical region where to store the bucket and its contents.
  • Bucket name must be a unique DNS-compliant name.
    • The name must be unique across all existing bucket names in Amazon S3.
    • After you create the bucket you cannot change the name.
    • The bucket name is visible in the URL that points to the objects that you’re going to put in your bucket.
  • By default, you can create up to 100 buckets in each of your AWS accounts.
  • You can’t change its Region after creation.
  • You can host static websites by configuring your bucket for website hosting.
  • You can’t delete an S3 bucket using the Amazon S3 console if the bucket contains 100,000 or more objects. You can’t delete an S3 bucket using the AWS CLI if versioning is enabled.
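
A sketch of creating a bucket in a chosen Region; the bucket name is a hypothetical placeholder and must be globally unique:

```python
import boto3

# The Region is fixed at creation and cannot be changed afterwards.
s3 = boto3.client("s3", region_name="eu-west-1")
s3.create_bucket(
    Bucket="example-unique-bucket-name-2022",  # hypothetical, DNS-compliant
    # Required for any Region other than us-east-1 (omit it there).
    CreateBucketConfiguration={"LocationConstraint": "eu-west-1"},
)
```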

Data Consistency Model

  • Amazon S3 delivers strong read-after-write consistency for PUTs and DELETEs of objects in all regions: after a successful write of a new object or an overwrite of an existing object, any subsequent GET or HEAD request returns the latest version.
  • List operations are strongly consistent as well: after a write, the change is immediately reflected in listings.
  • Bucket configurations, however, follow an eventual consistency model:
    • listing all buckets after deleting a bucket (the deleted bucket might still show up)
    • enabling versioning on a bucket for the first time (it might take a short time to fully propagate)
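
A small demonstration of the consistency model above; the bucket and key are hypothetical placeholders:

```python
import boto3

s3 = boto3.client("s3")
bucket, key = "example-unique-bucket-name-2022", "config/settings.json"  # hypothetical

# Strong read-after-write consistency: a GET issued right after a successful
# PUT (new object or overwrite) returns the data just written.
s3.put_object(Bucket=bucket, Key=key, Body=b'{"mode": "v2"}')
assert s3.get_object(Bucket=bucket, Key=key)["Body"].read() == b'{"mode": "v2"}'

# List operations reflect the write immediately as well.
keys = [o["Key"] for o in s3.list_objects_v2(Bucket=bucket, Prefix="config/")["Contents"]]
assert key in keys
```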

Storage Classes

  • Storage Classes for Frequently Accessed Objects
    • S3 STANDARD for general-purpose storage of frequently accessed data.
  • Storage Classes for Infrequently Accessed Objects
    • S3 STANDARD_IA for long-lived, but less frequently accessed data. It stores the object data redundantly across multiple geographically separated AZs.
    • S3 ONEZONE_IA stores the object data in only one AZ. Less expensive than STANDARD_IA, but data is not resilient to the physical loss of the AZ.
    • These two storage classes are suitable for objects larger than 128 KB that you plan to store for at least 30 days. If an object is less than 128 KB, Amazon S3 charges you for 128 KB. If you delete an object before the 30-day minimum, you are charged for 30 days.
  • Amazon S3 Intelligent Tiering
    • S3 Intelligent-Tiering is a storage class designed for customers who want to optimize storage costs automatically when data access patterns change, without performance impact or operational overhead. 
    • S3 Intelligent-Tiering is the first cloud object storage class that delivers automatic cost savings by moving data between two access tiers — frequent access and infrequent access — when access patterns change, and is ideal for data with unknown or changing access patterns.
    • S3 Intelligent-Tiering monitors access patterns and moves objects that have not been accessed for 30 consecutive days to the infrequent access tier. If an object in the infrequent access tier is accessed later, it is automatically moved back to the frequent access tier.
    • S3 Intelligent-Tiering also supports optional archive access tiers. Once activated, objects that haven’t been accessed for 90 consecutive days move to the archive access tier, and after 180 consecutive days of no access, to the deep archive access tier.
    • There are no retrieval fees in S3 Intelligent-Tiering.
  • S3 GLACIER
    • For long-term archive
    • S3 Glacier provides the following storage classes: S3 Glacier Instant Retrieval, S3 Glacier Flexible Retrieval, and S3 Glacier Deep Archive.
    • Archived objects are not available for real-time access. You must first restore the objects before you can access them.
    • You can specify a Glacier storage class when you upload an object directly, or transition existing objects to Glacier using lifecycle rules.
    • Objects archived to Glacier through S3 are accessible only through the S3 API, not through the standalone Glacier service.
    • Retrieval Options
      • Expedited – allows you to quickly access your data when occasional urgent requests for a subset of archives are required. For all but the largest archived objects, data accessed is typically made available within 1–5 minutes. There are two types of Expedited retrievals: On-Demand requests are similar to EC2 On-Demand instances and are available most of the time; Provisioned requests are guaranteed to be available when you need them.
      • Standard – allows you to access any of your archived objects within several hours. Standard retrievals typically complete within 3–5 hours. This is the default option for retrieval requests that do not specify the retrieval option.
      • Bulk – Glacier’s lowest-cost retrieval option, enabling you to retrieve large amounts, even petabytes, of data inexpensively in a day. Bulk retrievals typically complete within 5–12 hours.
    • For S3 Standard, S3 Standard-IA, and Glacier storage classes, your objects are automatically stored across multiple devices spanning a minimum of three Availability Zones.
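
A hedged boto3 sketch of choosing a storage class at upload time and restoring an archived object; the bucket, key, Days value, and chosen Tier are illustrative assumptions:

```python
import boto3

s3 = boto3.client("s3")
bucket = "example-unique-bucket-name-2022"  # hypothetical

# Store an object directly in a chosen storage class.
s3.put_object(Bucket=bucket, Key="archive/2021.tar", Body=b"...",
              StorageClass="GLACIER")  # or STANDARD_IA, ONEZONE_IA, INTELLIGENT_TIERING, DEEP_ARCHIVE

# Archived objects must be restored before they can be read. Days controls how
# long the temporary copy stays available; Tier is Expedited, Standard, or Bulk.
s3.restore_object(
    Bucket=bucket,
    Key="archive/2021.tar",
    RestoreRequest={"Days": 2, "GlacierJobParameters": {"Tier": "Standard"}},
)
```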

Storage class                      | S3 Standard            | S3 Intelligent-Tiering * | S3 Standard-IA
Designed for durability            | 99.999999999% (11 9's) | 99.999999999% (11 9's)   | 99.999999999% (11 9's)
Designed for availability          | 99.99%                 | 99.99%                   | 99.99%
Availability SLA                   | 99.9%                  | 99%                      | 99%
Availability Zones                 | >=3                    | >=3                      | >=3
Minimum capacity charge per object | N/A                    | N/A                      | 128 KB
Minimum storage duration charge    | N/A                    | 30 days                  | 30 days
Retrieval fee                      | N/A                    | N/A                      | per GB retrieved
First byte latency                 | milliseconds           | milliseconds             | milliseconds
Storage type                       | Object                 | Object                   | Object
Lifecycle transitions              | Yes                    | Yes                      | Yes

Storage class                      | S3 One Zone-IA **      | S3 Glacier Instant Retrieval | S3 Glacier Flexible Retrieval | S3 Glacier Deep Archive
Designed for durability            | 99.999999999% (11 9's) | 99.999999999% (11 9's)       | 99.999999999% (11 9's)        | 99.999999999% (11 9's)
Designed for availability          | 99.5%                  | 99.9%                        | 99.99%                        | 99.99%
Availability SLA                   | 99%                    | 99%                          | 99%                           | 99.9%
Availability Zones                 | 1                      | >=3                          | >=3                           | >=3
Minimum capacity charge per object | 128 KB                 | 128 KB                       | 40 KB                         | 40 KB
Minimum storage duration charge    | 30 days                | 90 days                      | 90 days                       | 180 days
Retrieval fee                      | per GB retrieved       | per GB retrieved             | per GB retrieved              | per GB retrieved
First byte latency                 | milliseconds           | milliseconds                 | minutes or hours              | hours
Storage type                       | Object                 | Object                       | Object                        | Object
Lifecycle transitions              | Yes                    | Yes                          | Yes                           | Yes

  • Amazon S3 Glacier Instant Retrieval
    • A storage class for long-lived data that is rarely accessed and must be retrieved in milliseconds.
    • When your data is accessed only once every quarter, you can save costs on storage compared to using S3 Standard-IA.
    • The data stored in S3 Glacier Instant Retrieval storage class is resilient in the event of the destruction of one entire Availability Zone.
  • Amazon S3 Glacier Flexible Retrieval
    • A storage class for storing archive data that is accessed once or twice per year.
    • S3 Glacier Flexible Retrieval provides the most cost-effective retrieval options, with access times ranging from minutes to hours and free bulk retrievals.
  • Amazon S3 Glacier Deep Archive
    • An Amazon S3 storage class that provides secure and durable object storage for long-term retention of data that is accessed rarely in a year. 
    • S3 Glacier Deep Archive offers the lowest cost storage in the cloud, at prices lower than storing and maintaining data in on-premises magnetic tape libraries or archiving data offsite. 
    • All objects stored in the S3 Glacier Deep Archive storage class are replicated and stored across at least three geographically-dispersed Availability Zones, protected by 99.999999999% durability, and can be restored within 12 hours or less. 
    • S3 Glacier Deep Archive also offers a bulk retrieval option, where you can retrieve petabytes of data within 48 hours.
  • Amazon S3 on Outposts
    • Amazon S3 on Outposts uses S3 APIs to deliver object storage to an on-premises AWS Outposts environment.
    • The data is encrypted with SSE-C and SSE-S3 and redundantly stored across Outposts servers.
    • With AWS DataSync, you can automate data transfer between Outposts and AWS Regions.

S3 API

  • REST – use standard HTTP requests to create, fetch, and delete buckets and objects. You can use S3 virtual hosting to address a bucket in a REST API call by using the HTTP Host header.
  • SOAP – support for SOAP over HTTP is deprecated, but it is still available over HTTPS. However, new Amazon S3 features will not be supported for SOAP. AWS recommends using either the REST API or the AWS SDKs.

Bucket Configurations

Subresource                          | Description
location                             | Specify the AWS Region where you want S3 to create the bucket.
policy and ACL (access control list) | All your resources are private by default. Use bucket policy and ACL options to grant and manage bucket-level permissions.
cors (cross-origin resource sharing) | You can configure your bucket to allow cross-origin requests. CORS defines a way for client web applications that are loaded in one domain to interact with resources in a different domain.
website                              | You can configure your bucket for static website hosting.
logging                              | Logging enables you to track requests for access to your bucket. Each access log record provides details about a single access request, such as the requester, bucket name, request time, request action, response status, and error code, if any.
event notification                   | You can enable your bucket to send you notifications of specified bucket events.
versioning                           | AWS recommends versioning as a best practice to recover objects from being deleted or overwritten by mistake.
lifecycle                            | You can define lifecycle rules for objects in your bucket that have a well-defined lifecycle.
cross-region replication             | Cross-region replication is the automatic, asynchronous copying of objects across buckets in different AWS Regions.
tagging                              | S3 provides the tagging subresource to store and manage tags on a bucket. AWS generates a cost allocation report with usage and costs aggregated by your tags.
requestPayment                       | By default, the AWS account that creates the bucket (the bucket owner) pays for downloads from the bucket. The bucket owner can specify that the person requesting the download will be charged for the download.
transfer acceleration                | Transfer Acceleration enables fast, easy, and secure transfers of files over long distances between your client and an S3 bucket. It takes advantage of Amazon CloudFront’s globally distributed edge locations.
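
As one example of setting a bucket subresource, here is a hedged sketch of a cors configuration; the bucket name and allowed origin are assumptions:

```python
import boto3

s3 = boto3.client("s3")

# Allow a web app served from another domain to GET objects from this bucket.
s3.put_bucket_cors(
    Bucket="example-unique-bucket-name-2022",   # hypothetical
    CORSConfiguration={
        "CORSRules": [{
            "AllowedOrigins": ["https://app.example.com"],  # hypothetical origin
            "AllowedMethods": ["GET", "HEAD"],
            "AllowedHeaders": ["*"],
            "MaxAgeSeconds": 3000,  # how long browsers may cache the preflight
        }]
    },
)
```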

Objects

  • Are private by default. Grant permissions to other users.
  • Each S3 object has data, a key, and metadata.
  • You cannot modify object metadata after the object is uploaded.
  • Two kinds of metadata
    • System metadata

Name                                            | Description                                                                                                                                                                                | Can User Modify the Value?
Date                                            | Current date and time.                                                                                                                                                                     | No
Content-Length                                  | Object size in bytes.                                                                                                                                                                      | No
Last-Modified                                   | Object creation date or the last modified date, whichever is the latest.                                                                                                                  | No
Content-MD5                                     | The base64-encoded 128-bit MD5 digest of the object.                                                                                                                                      | No
x-amz-server-side-encryption                    | Indicates whether server-side encryption is enabled for the object, and whether that encryption is from the AWS Key Management Service (SSE-KMS) or from AWS managed encryption (SSE-S3). | Yes
x-amz-version-id                                | Object version. When you enable versioning on a bucket, Amazon S3 assigns a version number to objects added to the bucket.                                                                | No
x-amz-delete-marker                             | In a bucket that has versioning enabled, this Boolean marker indicates whether the object is a delete marker.                                                                             | No
x-amz-storage-class                             | Storage class used for storing the object.                                                                                                                                                | Yes
x-amz-website-redirect-location                 | Redirects requests for the associated object to another object in the same bucket or an external URL.                                                                                     | Yes
x-amz-server-side-encryption-aws-kms-key-id     | If x-amz-server-side-encryption is present and has the value of aws:kms, this indicates the ID of the AWS Key Management Service (AWS KMS) master encryption key that was used for the object. | Yes
x-amz-server-side-encryption-customer-algorithm | Indicates whether server-side encryption with customer-provided encryption keys (SSE-C) is enabled.                                                                                       | Yes

    • User-defined metadata – key-value pair that you provide.
  • You can upload and copy objects of up to 5 GB in size in a single operation. For objects greater than 5 GB up to 5 TB, you must use the multipart upload API.
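
A sketch of multipart upload via boto3's managed transfer layer; the file name, bucket, and thresholds are illustrative assumptions:

```python
import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client("s3")

# upload_file switches to the multipart upload API automatically once the
# file exceeds multipart_threshold; parts are then uploaded in parallel.
config = TransferConfig(
    multipart_threshold=64 * 1024 * 1024,  # use multipart above 64 MB
    multipart_chunksize=16 * 1024 * 1024,  # 16 MB parts
    max_concurrency=8,
)
s3.upload_file("backup.tar", "example-unique-bucket-name-2022",
               "backups/backup.tar", Config=config)
```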
  • Tagging
    • You can associate up to 10 tags with an object. Tags associated with an object must have unique tag keys.
    • A tag key can be up to 128 Unicode characters in length and tag values can be up to 256 Unicode characters in length.
    • Key and values are case sensitive.
  • Object Delete
    • Deleting Objects from a Version-Enabled Bucket
      • Specify a non-versioned delete request – specify only the object’s key, and not the version ID.
      • Specify a versioned delete request – specify both the key and also a version ID.
    • Deleting Objects from an MFA-Enabled Bucket
      • If you provide an invalid MFA token, the request always fails.
      • If you are not deleting a versioned object, and you don’t provide an MFA token, the delete succeeds.
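
A sketch of the two delete request styles against a version-enabled bucket; the bucket and key are hypothetical:

```python
import boto3

s3 = boto3.client("s3")
bucket, key = "versioned-bucket-example", "docs/report.txt"  # hypothetical

# Non-versioned delete request: no VersionId, so S3 only inserts a delete marker.
s3.delete_object(Bucket=bucket, Key=key)

# Versioned delete request: supplying a VersionId permanently removes that version.
versions = s3.list_object_versions(Bucket=bucket, Prefix=key)
for v in versions.get("Versions", []):
    s3.delete_object(Bucket=bucket, Key=key, VersionId=v["VersionId"])
```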
  • Object Lock
    • Prevents objects from being deleted or overwritten for a fixed amount of time or indefinitely.
      • Object retention options:
      • Retention period – object remains locked until the retention period expires.
      • Legal hold – object remains locked until you explicitly remove it.
    • Object Lock works only in versioned buckets and only applies to individual versions of objects.
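
A hedged sketch of the two retention options, assuming a bucket that was created with Object Lock enabled; names and dates are placeholders:

```python
import boto3
from datetime import datetime, timezone

s3 = boto3.client("s3")
bucket, key = "locked-bucket-example", "audit/2022.log"  # hypothetical

# Retention period: the object version stays locked until RetainUntilDate.
s3.put_object_retention(
    Bucket=bucket, Key=key,
    Retention={"Mode": "COMPLIANCE",
               "RetainUntilDate": datetime(2023, 1, 1, tzinfo=timezone.utc)},
)

# Legal hold: the object version stays locked until the hold is removed.
s3.put_object_legal_hold(Bucket=bucket, Key=key, LegalHold={"Status": "ON"})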
  • Object Ownership
    • With the bucket-owner-full-control ACL, you can automatically assume ownership of objects that are uploaded to your buckets.
  • S3 Select
    • S3 Select is an Amazon S3 capability designed to pull out only the data you need from an object, which can dramatically improve the performance and reduce the cost of applications that need to access data in S3.
    • Amazon S3 Select works on objects stored in CSV and JSON format, Apache Parquet format, JSON Arrays, and BZIP2 compression for CSV and JSON objects.
    • CloudWatch Metrics for S3 Select let you monitor S3 Select usage for your applications. These metrics are available at 1-minute intervals and let you quickly identify and act on operational issues.
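
A sketch of S3 Select pulling only matching rows out of a CSV object instead of downloading the whole object; the bucket, key, and SQL expression are assumptions:

```python
import boto3

s3 = boto3.client("s3")

resp = s3.select_object_content(
    Bucket="example-unique-bucket-name-2022",   # hypothetical
    Key="reports/2022/march.csv",
    ExpressionType="SQL",
    Expression="SELECT s.col1 FROM S3Object s WHERE CAST(s.col2 AS INT) > 1",
    InputSerialization={"CSV": {"FileHeaderInfo": "USE"}},
    OutputSerialization={"JSON": {}},
)

# The response is an event stream; Records events carry the selected data.
for event in resp["Payload"]:
    if "Records" in event:
        print(event["Records"]["Payload"].decode())
```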
  • Lifecycle Management

    • A lifecycle configuration is a set of rules that define actions that S3 applies to a group of objects (see the sketch below).

      • Transition actions – define when objects transition to another storage class. Objects must be stored at least 30 days in the current storage class before you can transition them to STANDARD_IA or ONEZONE_IA.

      • Expiration actions – define when objects expire. S3 deletes expired objects on your behalf.
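
A sketch of one lifecycle rule combining both action types; the bucket name, prefix, and day counts are illustrative assumptions:

```python
import boto3

s3 = boto3.client("s3")

# Transition logs to colder storage over time, then expire them.
s3.put_bucket_lifecycle_configuration(
    Bucket="example-unique-bucket-name-2022",   # hypothetical
    LifecycleConfiguration={
        "Rules": [{
            "ID": "archive-then-expire-logs",
            "Status": "Enabled",
            "Filter": {"Prefix": "logs/"},
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},  # after the 30-day minimum
                {"Days": 365, "StorageClass": "GLACIER"},
            ],
            "Expiration": {"Days": 730},                      # then delete
        }]
    },
)
```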

Pricing

  • S3 charges you only for what you actually use, with no hidden fees and no overage charges
  • There is no charge for creating a bucket; you pay only for storing objects in the bucket and for transferring objects in and out of the bucket.

Charge             | Comments
Storage            | You pay for storing objects in your S3 buckets. The rate you’re charged depends on your objects’ size, how long you stored the objects during the month, and the storage class.
Requests           | You pay for requests, for example, GET requests, made against your S3 buckets and objects. This includes lifecycle requests. The rates for requests depend on what kind of request you’re making.
Retrievals         | You pay for retrieving objects that are stored in STANDARD_IA, ONEZONE_IA, and GLACIER storage.
Early Deletes      | If you delete an object stored in STANDARD_IA, ONEZONE_IA, or GLACIER storage before the minimum storage commitment has passed, you pay an early deletion fee for that object.
Storage Management | You pay for the storage management features that are enabled on your account’s buckets.
Bandwidth          | You pay for all bandwidth into and out of S3, except for: data transferred in from the internet; data transferred out to an Amazon EC2 instance in the same AWS Region as the S3 bucket; and data transferred out to Amazon CloudFront. You also pay a fee for any data transferred using Amazon S3 Transfer Acceleration.

Networking

  • Hosted-style access
    • Amazon S3 routes any virtual hosted–style requests to the US East (N. Virginia) Region by default if you use the endpoint s3.amazonaws.com instead of the Region-specific endpoint.
    • Format: https://bucket-name.s3.region-code.amazonaws.com/key-name
  • Path-style access
    • In a path-style URL, the endpoint you use must match the Region in which the bucket resides.
    • Format: https://s3.region-code.amazonaws.com/bucket-name/key-name
  • Customize S3 URLs with CNAMEs – the bucket name must be the same as the CNAME.
  • Amazon S3 Transfer Acceleration enables fast, easy, and secure transfers of files over long distances between your client and an S3 bucket. It takes advantage of Amazon CloudFront’s globally distributed edge locations.
  • Transfer Acceleration cannot be disabled, and can only be suspended.
  • Transfer Acceleration URL is: bucket.s3-accelerate.amazonaws.com
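
A sketch of enabling Transfer Acceleration and uploading through the accelerate endpoint; the bucket and file names are assumptions:

```python
import boto3
from botocore.config import Config

s3 = boto3.client("s3")

# Enable Transfer Acceleration on the bucket ("Suspended" pauses it later).
s3.put_bucket_accelerate_configuration(
    Bucket="example-unique-bucket-name-2022",       # hypothetical
    AccelerateConfiguration={"Status": "Enabled"},
)

# A client configured to send requests via bucket.s3-accelerate.amazonaws.com.
s3_accel = boto3.client("s3", config=Config(s3={"use_accelerate_endpoint": True}))
s3_accel.upload_file("big-file.bin", "example-unique-bucket-name-2022", "big-file.bin")
```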

Security

  • Policies contain the following:
    • Resources – buckets and objects
    • Actions – set of operations
    • Effect – can be either allow or deny. Access is denied by default, so you need to explicitly allow access to a resource.
    • Principal – the account, service or user who is allowed access to the actions and resources in the statement.
  • Resource Based Policies
    • Bucket Policies
      • Provides centralized access control to buckets and objects based on a variety of conditions, including S3 operations, requesters, resources, and aspects of the request (e.g., IP address).
      • Can grant or deny permissions across all (or a subset) of the objects within a bucket.
      • IAM users need additional permissions from the root account to perform bucket operations.
      • Bucket policies are limited to 20 KB in size.
    • Access Control Lists
      • A list of grants identifying grantee and permission granted.
      • ACLs use an S3–specific XML schema.
      • You can grant permissions only to other AWS accounts, not to users in your account.
      • You cannot grant conditional permissions, nor explicitly deny permissions.
      • Object ACLs are limited to 100 granted permissions per ACL.
      • The only recommended use case for the bucket ACL is to grant write permissions to the S3 Log Delivery group.
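
A hedged sketch of attaching a bucket policy; the account ID, bucket name, and CIDR range are placeholders, not values from these notes:

```python
import boto3
import json

s3 = boto3.client("s3")

# Grant another account read access, conditioned on the request's source IP.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": "arn:aws:iam::111122223333:root"},  # placeholder account
        "Action": ["s3:GetObject"],
        "Resource": "arn:aws:s3:::example-unique-bucket-name-2022/*",
        "Condition": {"IpAddress": {"aws:SourceIp": "203.0.113.0/24"}},
    }],
}
s3.put_bucket_policy(Bucket="example-unique-bucket-name-2022",
                     Policy=json.dumps(policy))
```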

  • User Policies
    • AWS IAM (see AWS Security and Identity Services)
      • IAM User Access Keys
      • Temporary Security Credentials

  • Versioning
    • Use versioning to keep multiple versions of an object in one bucket.
    • Versioning protects you from the consequences of unintended overwrites and deletions.
    • You can also use versioning to archive objects so you have access to previous versions.
    • Versioning is disabled by default, so you need to explicitly enable it (see the sketch after this list).
    • When you PUT an object into a versioning-enabled bucket, the existing version is not overwritten; S3 keeps it as a noncurrent version and stores the new upload as the current version.

    • When you DELETE an object, all versions remain in the bucket and Amazon S3 inserts a delete marker.

    • Performing a simple GET Object request when the current version is a delete marker returns a 404 Not Found error. You can, however, GET a non-current version of an object by specifying its version ID.

    • You can permanently delete an object by specifying the version you want to delete. Only the owner of an Amazon S3 bucket can permanently delete a version.
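
A sketch of the versioning behavior described above; the bucket and key are hypothetical:

```python
import boto3

s3 = boto3.client("s3")
bucket = "example-unique-bucket-name-2022"  # hypothetical

# Versioning is disabled by default; enable it explicitly.
s3.put_bucket_versioning(Bucket=bucket,
                         VersioningConfiguration={"Status": "Enabled"})

# Each PUT of the same key now creates a new version instead of overwriting.
s3.put_object(Bucket=bucket, Key="doc.txt", Body=b"v1")
s3.put_object(Bucket=bucket, Key="doc.txt", Body=b"v2")

# GET a non-current version by specifying its version ID.
versions = s3.list_object_versions(Bucket=bucket, Prefix="doc.txt")["Versions"]
old = [v for v in versions if not v["IsLatest"]][0]
body = s3.get_object(Bucket=bucket, Key="doc.txt",
                     VersionId=old["VersionId"])["Body"].read()  # b"v1"
```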

Encryption

    • Server-side Encryption using
      • Amazon S3-Managed Keys (SSE-S3)
      • AWS KMS-Managed Keys (SSE-KMS)
      • Customer-Provided Keys (SSE-C)
    • Client-side Encryption using
      • AWS KMS-managed customer master key
      • client-side master key
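
A sketch of requesting server-side encryption per object; the bucket, keys, and KMS key ARN are placeholders:

```python
import boto3

s3 = boto3.client("s3")
bucket = "example-unique-bucket-name-2022"  # hypothetical

# SSE-S3: S3 manages the encryption keys (AES-256).
s3.put_object(Bucket=bucket, Key="a.txt", Body=b"...",
              ServerSideEncryption="AES256")

# SSE-KMS: encrypt with an AWS KMS key; the key ARN below is a placeholder.
s3.put_object(Bucket=bucket, Key="b.txt", Body=b"...",
              ServerSideEncryption="aws:kms",
              SSEKMSKeyId="arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID")
```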
  • MFA Delete
    • MFA delete grants additional authentication for either of the following operations:
      • Change the versioning state of your bucket
      • Permanently delete an object version
    • MFA Delete requires two forms of authentication together:
      • Your security credentials
      • The concatenation of a valid serial number, a space, and the six-digit code displayed on an approved authentication device
  • Cross-Account Access
    • You can provide another AWS account access to an object that is stored in an Amazon Simple Storage Service (Amazon S3) bucket. These are the methods on how to grant cross-account access to objects that are stored in your own Amazon S3 bucket:
      • Resource-based policies and AWS Identity and Access Management (IAM) policies for programmatic-only access to S3 bucket objects 
      • Resource-based Access Control List (ACL) and IAM policies for programmatic-only access to S3 bucket objects 
      • Cross-account IAM roles for programmatic and console access to S3 bucket objects
  • Requester Pays Buckets 
    • Bucket owners pay for all of the Amazon S3 storage and data transfer costs associated with their bucket. To save on costs, you can enable the Requester Pays feature so the requester will pay the cost of the request and the data download from the bucket instead of the bucket owner. Take note that the bucket owner always pays the cost of storing data.

Monitoring

    • Automated monitoring tools to watch S3:
      • Amazon CloudWatch Alarms – Watch a single metric over a time period that you specify, and perform one or more actions based on the value of the metric relative to a given threshold over a number of time periods.
      • AWS CloudTrail Log Monitoring – Share log files between accounts, monitor CloudTrail log files in real time by sending them to CloudWatch Logs, write log processing applications in Java, and validate that your log files have not changed after delivery by CloudTrail.
    • Monitoring with CloudWatch
      • Daily Storage Metrics for Buckets ‐ You can monitor bucket storage using CloudWatch, which collects and processes storage data from S3 into readable, daily metrics.
      • Request metrics ‐ You can choose to monitor S3 requests to quickly identify and act on operational issues. The metrics are available at 1 minute intervals after some latency to process.
    • You can have a maximum of 1000 metrics configurations per bucket.
    • Supported event activities that occur in S3 are recorded in a CloudTrail event along with other AWS service events in Event history.
  • Website Hosting

    • Enable website hosting in your bucket Properties.
    • Your static website is available via the region-specific website endpoint.
    • You must make the objects that you want to serve publicly readable by writing a bucket policy that grants everyone s3:GetObject permission.
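
A sketch putting these website-hosting steps together; the bucket name and document keys are assumptions:

```python
import boto3
import json

s3 = boto3.client("s3")
bucket = "example-site-bucket"  # hypothetical

# Enable static website hosting with index and error documents.
s3.put_bucket_website(
    Bucket=bucket,
    WebsiteConfiguration={
        "IndexDocument": {"Suffix": "index.html"},
        "ErrorDocument": {"Key": "error.html"},
    },
)

# Make the content publicly readable, as the website endpoint requires.
public_read = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": "*",
        "Action": "s3:GetObject",
        "Resource": f"arn:aws:s3:::{bucket}/*",
    }],
}
s3.put_bucket_policy(Bucket=bucket, Policy=json.dumps(public_read))
```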

Key Difference                                             | REST API Endpoint                                 | Website Endpoint
Access control                                             | Supports both public and private content.         | Supports only publicly readable content.
Error message handling                                     | Returns an XML-formatted error response.          | Returns an HTML document.
Redirection support                                        | Not applicable                                    | Supports both object-level and bucket-level redirects.
Requests supported                                         | Supports all bucket and object operations.        | Supports only GET and HEAD requests on objects.
Responses to GET and HEAD requests at the root of a bucket | Returns a list of the object keys in the bucket.  | Returns the index document that is specified in the website configuration.
Secure Sockets Layer (SSL) support                         | Supports SSL connections.                         | Does not support SSL connections.

S3 Events Notification

    • To enable notifications, add a notification configuration identifying the events to be published, and the destinations where to send the event notifications.
    • Can publish the following events:
      • A new object created event
      • An object removal event
      • A Reduced Redundancy Storage (RRS) object lost event
    • Supports the following destinations for your events:
      • Amazon Simple Notification Service (Amazon SNS) topic
      • Amazon Simple Queue Service (Amazon SQS) queue
      • AWS Lambda
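
A sketch of a notification configuration targeting one of the destinations above; the Lambda ARN is a placeholder, and the function must already permit S3 to invoke it:

```python
import boto3

s3 = boto3.client("s3")

# Publish object-created events to a Lambda function.
s3.put_bucket_notification_configuration(
    Bucket="example-unique-bucket-name-2022",   # hypothetical
    NotificationConfiguration={
        "LambdaFunctionConfigurations": [{
            "LambdaFunctionArn": "arn:aws:lambda:us-east-1:111122223333:function:on-upload",
            "Events": ["s3:ObjectCreated:*"],
        }]
    },
)
```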
  • Cross Region Replication

    • Enables automatic, asynchronous copying of objects across buckets in different AWS Regions.
    • When to use:
      • Meet compliance requirements
      • Minimize latency
      • Increase operational efficiency
      • Maintain object copies under different ownership
    • Requirements of CRR:
      • Both source and destination buckets must have versioning enabled.
      • The source and destination buckets must be in different AWS Regions.
      • S3 must have permissions to replicate objects from the source bucket to the destination bucket on your behalf.
      • If the owner of the source bucket doesn’t own the object in the bucket, the object owner must grant the bucket owner READ and READ_ACP permissions with the object ACL.
    • Only the following are replicated:
      • Objects created after you add a replication configuration.
      • Both unencrypted objects and objects encrypted using Amazon S3 managed keys (SSE-S3) or AWS KMS managed keys (SSE-KMS), although you must explicitly enable the option to replicate objects encrypted using KMS keys. The replicated copy of the object is encrypted using the same type of server-side encryption that was used for the source object.
      • Object metadata.
      • Only objects in the source bucket for which the bucket owner has permissions to read objects and access control lists.
      • Object ACL updates, unless you direct S3 to change the replica ownership when source and destination buckets aren’t owned by the same accounts.
      • Object tags.
    • What isn’t replicated
      • Objects that existed before you added the replication configuration to the bucket.
      • Objects created with server-side encryption using customer-provided (SSE-C) encryption keys.
      • Objects in the source bucket that the bucket owner doesn’t have permissions for.
      • Updates to bucket-level subresources.
      • Actions performed by lifecycle configuration.
      • Objects in the source bucket that are replicas created by another cross-region replication. You can replicate objects from a source bucket to only one destination bucket.
    • CRR delete operations
      • If you make a DELETE request without specifying an object version ID, S3 adds a delete marker.
      • If you specify an object version ID to delete in a DELETE request, S3 deletes that object version in the source bucket, but it doesn’t replicate the deletion in the destination bucket. This protects data from malicious deletions.
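
A hedged sketch of a replication configuration satisfying the requirements above; the role ARN and bucket names are placeholders, and both buckets must already have versioning enabled:

```python
import boto3

s3 = boto3.client("s3")

s3.put_bucket_replication(
    Bucket="source-bucket-us-east-1",   # hypothetical source bucket
    ReplicationConfiguration={
        # Role granting S3 permission to replicate on your behalf (placeholder ARN).
        "Role": "arn:aws:iam::111122223333:role/s3-replication-role",
        "Rules": [{
            "ID": "replicate-everything",
            "Status": "Enabled",
            "Priority": 1,
            "Filter": {},   # empty filter = replicate all new objects
            "DeleteMarkerReplication": {"Status": "Disabled"},
            "Destination": {"Bucket": "arn:aws:s3:::replica-bucket-eu-west-1"},
        }]
    },
)
```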
  • S3 Batch Operations makes it simple to manage billions of objects stored in Amazon S3. Customers can make changes to object properties and metadata, and perform other storage management tasks – such as copying objects between buckets, replacing tag sets, modifying access controls, and restoring archived objects from Amazon S3 Glacier – for any number of S3 objects in minutes.

Amazon FSx

 

  • Amazon FSx is a fully managed third-party file system solution. It uses SSD storage to provide fast performance with low latency.
  • There are two FSx solutions available in AWS:
    • Amazon FSx for Windows File Server
      • A fully managed native Microsoft Windows file system with full support for the SMB protocol, Windows NTFS, and Microsoft Active Directory (AD) integration.
    • How It Works


Common Use Cases

        • File systems that are accessible by multiple users, with the ability to establish permissions at the file or folder level.
        • Application workloads that require shared file storage provided by Windows-based file systems (NTFS) and that use the SMB protocol.
        • Media workflows like media transcoding, processing, and streaming.
        • Data-intensive analytics workloads.
        • Content management and web serving applications such as IIS
      • Works with the following compute services:
        • Amazon EC2
        • Amazon WorkSpaces instances
        • Amazon AppStream 2.0 instances
        • VMs running in VMware Cloud on AWS environments
      • Works with Microsoft Active Directory (AD) to integrate your file system with your existing Windows environments.
      • Supports the use of AWS Direct Connect or AWS VPN to access your file systems from your on-premises compute instances.
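
A hedged boto3 sketch of creating an FSx for Windows file system joined to a directory; the subnet, security group, and directory IDs, plus the capacity and throughput values, are all illustrative assumptions:

```python
import boto3

fsx = boto3.client("fsx")

# Create an FSx for Windows File Server file system integrated with AD.
fsx.create_file_system(
    FileSystemType="WINDOWS",
    StorageCapacity=300,          # GiB (placeholder size)
    StorageType="SSD",
    SubnetIds=["subnet-0123456789abcdef0"],          # placeholder
    SecurityGroupIds=["sg-0123456789abcdef0"],       # placeholder
    WindowsConfiguration={
        "ActiveDirectoryId": "d-0123456789",         # placeholder directory ID
        "ThroughputCapacity": 32,                    # MB/s
        "DeploymentType": "SINGLE_AZ_2",
    },
)
```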
      • Microsoft Windows File Share
        • Microsoft Windows file share is a specific folder in your file system, including that folder’s subfolders, which you make accessible to your compute instances with the Server Message Block (SMB) protocol.
        • Your file system comes with a default Windows file share, named share. You can create and manage as many other Windows file shares as you want by using the Windows graphical user interface (GUI) tool called Shared Folders.
        • To access your file shares, you use the Windows Map Network Drive functionality to map a drive letter on your compute instance to your Amazon FSx file share.

Storage

        • You can configure user storage quotas on your file systems to limit how much storage users can consume. After you set quotas, you can track quota status to monitor usage and see when users surpass their quotas.
        • You can use DFS (Distributed File System) Namespaces to group file shares on multiple file systems into one common folder structure (a namespace) that you use to access the entire file dataset.


      • You can reduce your data storage costs by turning on data deduplication for your file system. Data deduplication reduces or eliminates redundant data by storing duplicated portions of the dataset only once.
      • Performance Metrics
        • Amazon FSx uses SSD storage, provides storage of up to 64 TB per file system, and throughput of up to 2 GB/s per file system.
        • You can use DFS Namespaces to create shared common namespaces spanning multiple Amazon FSx file systems to scale-out storage and throughput. Each common namespace is capable of scaling out storage to hundreds of petabytes of data, and performance to tens of GB/s of throughput, and millions of IOPS.

Migration

To migrate your existing file data into Amazon FSx, use the Windows Robust File Copy tool (RoboCopy) to copy your files (both the data and the full set of metadata, like ownership and Access Control Lists) directly to Amazon FSx.

Limits

Resource                                                    | Default Limit | Can Be Increased Up To
Number of file systems                                      | 100           | Thousands
Total storage for all file systems                          | 512 TiB       | Multiple PiBs
Total throughput capacity for all file systems              | 10 GBps       | Hundreds of GBps
Total number of user-initiated backups for all file systems | 500           | Thousands

    • Amazon FSx for Lustre
      • A high-performance file system optimized for fast processing of workloads. Lustre is a popular open-source parallel file system.
      • You can choose between SSD storage options and HDD storage options, each offering different levels of performance. The HDD options reduce storage costs by up to 80% for throughput-intensive workloads that don’t require the sub-millisecond latencies of SSD storage.
      • Since this is a high-performance parallel file system, you can use Amazon FSx as “hot” storage for your highly accessed files, and Amazon S3 as “cold” storage for rarely accessed files.
      • When linked to an S3 bucket, an FSx for Lustre file system transparently presents S3 objects as files and allows you to write results back to S3. 
      • The Lustre file system provides a POSIX-compliant file system interface.
      • Also supports concurrent access to the same file or directory from thousands of compute instances
      • How It Works


      • You can mount your Amazon FSx for Lustre file system from an on-premises client over AWS Direct Connect or VPN, and then use parallel copy tools to import your data to your Amazon FSx for Lustre file system.
      • You can export files and their associated metadata that you have written or modified in your file system to your durable data repository on Amazon S3
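
A hedged sketch of creating a Lustre file system linked to an S3 data repository, as described above; the bucket, paths, subnet ID, and capacity are illustrative assumptions:

```python
import boto3

fsx = boto3.client("fsx")

# Scratch Lustre file system linked to S3: objects in the import path appear
# as files, and results can be exported back to the export path.
fsx.create_file_system(
    FileSystemType="LUSTRE",
    StorageCapacity=1200,                            # GiB (placeholder size)
    SubnetIds=["subnet-0123456789abcdef0"],          # placeholder
    LustreConfiguration={
        "DeploymentType": "SCRATCH_2",
        "ImportPath": "s3://example-dataset-bucket",          # placeholder bucket
        "ExportPath": "s3://example-dataset-bucket/results",  # placeholder prefix
    },
)
```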
      • Performance Metrics
        • Provides sub-millisecond access to your data
        • Allows you to read and write data at speeds of up to hundreds of gigabytes per second of throughput and millions of IOPS.
        • FSx for Lustre provides 200 MBps of baseline throughput per TB of storage provisioned. As your file system scales, your throughput can scale up to hundreds of GB/s.
      • Limits

  • Amazon FSx for Windows File Server vs. Amazon EFS vs. Amazon FSx for Lustre
  • High Availability and Durability
    • Automatically replicates your data within the Availability Zone (AZ) it resides in (which you specify during creation).
    • Offers single-AZ and multi-AZ deployment options for your Windows file-based workloads.
    • Uses the Volume Shadow Copy Service (VSS) to automatically take highly durable, file-system-consistent daily backups and store them in S3.
    • Supports restoring individual files and folders to previous versions using Windows shadow copies.
    • The default backup retention period is 7 days.

Security

    • Supports identity-based authentication over the Server Message Block (SMB) protocol through Microsoft Active Directory.
    • Amazon FSx automatically encrypts your data at-rest using AWS KMS and in-transit using SMB Kerberos session keys. 
    • Amazon FSx is ISO, PCI-DSS, and SOC compliant, and is HIPAA eligible. 
    • Integrated with AWS CloudTrail that monitors and logs your API calls.

Pricing

    • You are billed hourly for your Amazon FSx file systems, based on your configured storage capacity (priced per GB-month) and throughput capacity (priced per MBps-month). 
    • You are billed hourly for your backup storage (priced per GB-month). You can enable data deduplication to automatically reduce costs associated with redundant data by storing duplicated portions of your dataset only once.