Thursday, 15 June 2023

Amazon Simple Storage Service (S3) – Versioning

Amazon Simple Storage Service (S3) is an object storage service that provides data availability, scalability, security, and high performance. S3 allows you to store as many objects as you'd like, with an individual object size limit of five terabytes. With cost-effective storage classes and easy-to-use management features, you can optimize costs, organize data, and configure fine-tuned access controls to meet specific business, organizational, and compliance requirements.

Organizations rely on services that give them security, reliability, performance, and data availability. Amazon S3 provides all of these while also scaling with the organization's needs and protecting its data. Now let's understand what S3 Versioning is. In layman's terms: suppose someone uploads a picture with ID 113 to an S3 bucket, and some time later replaces it with a picture with ID 112. Now suppose they decide the previous one was better and want to roll back to picture ID 113. How do they get it back? This is where S3 Versioning comes into the picture.
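The scenario above can be modeled with a tiny in-memory sketch (plain Python, not the AWS API): each key keeps a stack of versions, and rolling back simply discards the newest one.

```python
# Illustrative in-memory model of S3 versioning (not the boto3 API):
# each key maps to a list of versions, newest last.
class VersionedBucket:
    def __init__(self):
        self._versions = {}  # key -> list of object bodies, newest last

    def put(self, key, body):
        # With versioning on, an upload never destroys the old copy.
        self._versions.setdefault(key, []).append(body)

    def get(self, key):
        return self._versions[key][-1]  # latest version is current

    def rollback(self, key):
        self._versions[key].pop()       # discard the newest version
        return self.get(key)

bucket = VersionedBucket()
bucket.put("profile.jpg", "picture ID:113")
bucket.put("profile.jpg", "picture ID:112")  # replaces the current version
print(bucket.get("profile.jpg"))             # picture ID:112
print(bucket.rollback("profile.jpg"))        # picture ID:113
```

Because the old copy is still stored, the rollback is just a matter of making the previous version current again.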

Pictorial representation of S3-Versioning

Versioning allows storing of multiple versions, or variants, of an object. It makes it easier to preserve old versions of objects and roll back to them whenever needed. Moreover, it helps restore an object after unintended user activity, such as deleting the object accidentally.

Implementation:

Let's demonstrate it with a step-by-step procedure:

Step 1: Log in to your Amazon Web Services account >> In the console search bar, search for "S3" >> then select S3.

Step 2: On the Amazon S3 page, click on Create bucket.

Step 3: On the Create bucket page, give the bucket a name.

NOTE: The name must be unique and must not contain spaces or uppercase letters. >> Select any region >> Enable ACLs (an Access Control List helps manage access to the bucket and its object versions) >> Un-tick the "Block all public access" option (if you want to give the bucket public access) >> Click "I acknowledge" >> Enable Bucket Versioning >> Keep default encryption disabled >> Click on Create Bucket.

Step 4: Click on your created bucket>>Click on upload>>Upload any file

Step 5: Here I have uploaded a txt file named Text1 (content of Text1: "This is my text1") >> Click on the file you uploaded >> Below, you will find the Object URL >> Try opening the link in a browser; you won't be able to access the content yet. Now go to Object actions >> Click on "Make public using ACL".

Step 6: On the Make public page, click on the Make public option.

Step 7: Again, click on your uploaded file >> copy the Object URL shown below >> try opening it in your browser.

After hitting the URL in the browser:

Step 8: Now, go to the bucket where your file is present, make some changes to the file, and upload it again. My updated file content is "This is my updated text1". Then follow Steps 5, 6, and 7 again. This time you can see the updated version of the file.

Step 9: Now, to get the previous content of the file or to roll back: go to your created bucket >> click on the Show versions option >> you will find all your previous versions.

Step 10: To see how a deleted object can be recovered, first delete the file: go to your bucket >> select the file >> click the Delete option at the top >> type "delete" on the delete screen.

Step 11: Go back to the same bucket>>Click on show version  

You will find your deleted file with its type shown as "Delete marker". To recover the deleted object, delete the "Delete marker".

This ability to roll back across versions of objects is what makes versioning popular.
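The delete-marker behavior from Steps 10 and 11 can be sketched in plain Python (an illustrative model, not the boto3 API): deleting a versioned object only pushes a marker on top of the version stack, and removing the marker makes the previous version current again.

```python
# Sketch of S3 delete markers under versioning (not the AWS API).
DELETE_MARKER = object()  # sentinel standing in for S3's delete marker

class VersionedBucket:
    def __init__(self):
        self.versions = {}  # key -> list of versions, newest last

    def put(self, key, body):
        self.versions.setdefault(key, []).append(body)

    def delete(self, key):
        # With versioning on, delete inserts a marker; no data is erased.
        self.versions[key].append(DELETE_MARKER)

    def get(self, key):
        latest = self.versions[key][-1]
        if latest is DELETE_MARKER:
            raise KeyError(key)  # the object appears deleted
        return latest

    def remove_delete_marker(self, key):
        # Deleting the delete marker restores the previous version.
        if self.versions[key][-1] is DELETE_MARKER:
            self.versions[key].pop()

b = VersionedBucket()
b.put("Text1.txt", "This is my text1")
b.delete("Text1.txt")          # object now hidden behind a delete marker
b.remove_delete_marker("Text1.txt")
print(b.get("Text1.txt"))      # This is my text1
```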

Above, we created a bucket with versioning enabled.

Steps To Create an S3 Bucket With Versioning Disabled

STEP 1: Create or log in to your AWS account; you will land on the AWS Management Console. Go to Services and select S3.

 

STEP 3: Click on Create bucket. A new window will pop up, where you enter the details and configure your bucket.

 

 

STEP 4: Configure public access settings for your bucket.

 

STEP 5: Configure Bucket Versioning (leave it disabled for now) and add tags to your bucket.

Versioning in AWS S3 is used to store multiple variants of an object inside the same bucket.

 

STEP 6: Click on Create bucket 

 

So we have created a bucket with versioning off in this part.
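The practical difference between the two buckets can be sketched as follows (a plain-Python model of upload semantics, not the AWS API): with versioning off, a re-upload to the same key silently replaces the old content.

```python
# Illustrative model: upload behavior with versioning on vs. off.
def put(store, key, body, versioning_enabled):
    if versioning_enabled:
        store.setdefault(key, []).append(body)  # old copies are kept
    else:
        store[key] = [body]                     # previous content is lost

off = {}  # bucket with versioning disabled
put(off, "a.txt", "v1", versioning_enabled=False)
put(off, "a.txt", "v2", versioning_enabled=False)

on = {}   # bucket with versioning enabled
put(on, "a.txt", "v1", versioning_enabled=True)
put(on, "a.txt", "v2", versioning_enabled=True)

print(len(off["a.txt"]), len(on["a.txt"]))  # 1 2
```

With versioning off there is nothing to roll back to; with versioning on, both copies remain retrievable.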

Amazon S3 – Storage Classes

Amazon Simple Storage Service (S3) stores data in the form of objects, which makes it quite different from file or block storage devices and services. Amazon S3 provides industry-leading scalability, data availability, security, and performance. Data uploaded by a user is stored as an object with a unique ID inside a container called a bucket, and the maximum size of a single uploaded file is 5 terabytes (TB). The service is designed for online backup and archiving of data and applications on Amazon Web Services (AWS).

Amazon S3 Storage Classes:

Amazon S3 offers a range of storage classes designed for different access patterns and cost requirements. The types of storage classes are as follows:

  • Amazon S3 Standard
  • Amazon S3 Intelligent-Tiering
  • Amazon S3 Standard-Infrequent Access
  • Amazon S3 Glacier Instant Retrieval
  • Amazon S3 One Zone-Infrequent Access
  • Amazon S3 Glacier Flexible Retrieval
  • Amazon S3 Glacier Deep Archive

1. Amazon S3 Standard:

S3 Standard is the general-purpose class, offering high-durability, high-availability, high-performance object storage for frequently accessed data. It is appropriate for a wide variety of use cases, including cloud applications, dynamic websites, content distribution, mobile and gaming applications, and big data analytics or data mining.

Characteristics of  S3 Standard:

  • Designed for 99.99% availability.
  • Low latency and high throughput for retrieving objects.
  • Resilient against events that affect an entire Availability Zone.
  • Designed for 99.999999999% (11 9's) durability.
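To make the durability figure concrete, here is the arithmetic behind 11 9's (the 10-million-objects framing is a common illustration, not an AWS guarantee):

```python
# What 99.999999999% durability means per object per year.
durability = 0.99999999999            # 11 nines
annual_loss_probability = 1 - durability

# Expected object losses per year when storing 10,000,000 objects:
expected_losses = 10_000_000 * annual_loss_probability

print(annual_loss_probability)  # roughly 1e-11
print(expected_losses)          # roughly 1e-4, i.e. one object per 10,000 years
```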

2. Amazon S3 Intelligent-Tiering:

S3 Intelligent-Tiering is the first cloud storage class that automatically reduces storage costs at a granular, per-object level by moving data to the most cost-effective access tier based on access frequency, without performance impact or operational overhead. There are no retrieval charges in S3 Intelligent-Tiering.

Characteristics of  S3 Intelligent-Tiering:

  • Requires no monitoring; objects are tiered automatically.
  • No minimum storage duration and no retrieval charges to access the service.
  • Designed for 99.9% availability.
  • Designed for 99.999999999% (11 9's) durability.
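The tiering idea can be sketched as a simple rule. The 30-day threshold below matches Intelligent-Tiering's commonly documented behavior for moving objects to its Infrequent Access tier, but the code is an illustrative model, not the service itself:

```python
# Illustrative model of Intelligent-Tiering: objects not accessed for
# 30 consecutive days move to a cheaper infrequent-access tier.
def tier_for(days_since_last_access, archive_after=30):
    if days_since_last_access >= archive_after:
        return "INFREQUENT_ACCESS"   # cheaper storage, same object
    return "FREQUENT_ACCESS"

print(tier_for(5))   # FREQUENT_ACCESS
print(tier_for(45))  # INFREQUENT_ACCESS
```

Any access to an archived object would move it back to the frequent tier, which is why no monitoring is needed on the user's side.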

3. Amazon S3 Standard-Infrequent Access:

S3 Standard-IA is used for data that is accessed less frequently but requires rapid access when needed. It offers high durability, high throughput, and low latency at a lower per-GB price, which makes it a good fit for long-term backups and as a data store for disaster-recovery files.

Characteristics of  S3 Standard-Infrequent Access:

  • Same low latency and high throughput performance as S3 Standard.
  • Resilient across multiple Availability Zones.
  • Designed for 99.9% availability.
  • Designed for 99.999999999% (11 9's) durability.

4. Amazon S3 Glacier Instant Retrieval:

It is an archive storage class that delivers the lowest-cost storage for long-lived data that is rarely accessed yet still needs high performance and flexibility when retrieved. S3 Glacier Instant Retrieval delivers the fastest access to archive storage: as with S3 Standard, data is retrieved in milliseconds.

Characteristics of S3 Glacier Instant Retrieval:

  • Data can be retrieved in milliseconds.
  • The minimum billable object size is 128 KB.
  • Designed for 99.9% availability.
  • Designed for 99.999999999% (11 9's) durability.

5. Amazon S3 One Zone-Infrequent Access:

Unlike the other S3 storage classes, which store data in a minimum of three Availability Zones, S3 One Zone-IA stores data in a single Availability Zone and costs 20% less than S3 Standard-IA. It is a good choice for storing secondary backup copies of on-premises data or easily re-creatable data. S3 One Zone-IA provides the same high durability, high throughput, and low latency as S3 Standard.

Characteristics of S3 One Zone-Infrequent Access:-

  • Supports SSL/TLS for data in transit and encryption for data at rest.
  • Destruction of its single Availability Zone can result in data loss.
  • Designed for 99.5% availability.
  • Designed for 99.999999999% (11 9's) durability.
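The availability difference between Standard-IA (99.9%) and One Zone-IA (99.5%) is easier to feel as expected downtime per year:

```python
# Convert an availability percentage into maximum expected downtime.
def max_expected_downtime_hours(availability_percent, hours_per_year=365 * 24):
    return hours_per_year * (1 - availability_percent / 100)

print(round(max_expected_downtime_hours(99.9), 1))  # 8.8 hours/year (Standard-IA)
print(round(max_expected_downtime_hours(99.5), 1))  # 43.8 hours/year (One Zone-IA)
```

The 0.4-percentage-point gap translates to roughly 35 additional hours of potential unavailability per year, which is why One Zone-IA is recommended only for re-creatable or non-essential data.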

6. Amazon S3 Glacier Flexible Retrieval:

It provides lower-cost storage than S3 Glacier Instant Retrieval. It is a suitable solution for backing up data that needs to be recovered only a few times a year; retrievals take minutes to hours depending on the retrieval option.

Characteristics of S3 Glacier Flexible Retrieval:

  • Free bulk retrievals in large quantities.
  • Resilient against the destruction of an entire Availability Zone.
  • Best suited for backup and disaster-recovery use cases that retrieve large data sets.
  • Designed for 99.99% availability.
  • Designed for 99.999999999% (11 9's) durability.

7. Amazon S3 Glacier Deep Archive:

The Glacier Deep Archive storage class is designed to provide durable, secure, long-term storage for large amounts of data at a very low price, competitive with off-premises tape archival services, so you no longer need to manage such services yourself. Access is efficient: data can be restored within 12 hours. S3 Glacier Deep Archive also supports object replication.

Characteristics of S3 Glacier Deep Archive:-

  • Secure, lowest-cost archival storage.
  • Data can be restored within 12 hours.
  • Designed for 99.99% availability.
  • Designed for 99.999999999% (11 9's) durability.

Introduction to AWS Simple Storage Service (AWS S3)

AWS Storage Services: AWS offers a wide range of storage services that can be provisioned depending on your project requirements and use case. AWS storage services have different provisions for highly confidential data, frequently accessed data, and not-so-frequently accessed data. You can choose from various storage types, namely object storage, file storage, block storage, backups, and data-migration options, all of which fall under the AWS Storage Services list.

 

AWS Simple Storage Service (S3): From the aforementioned list, S3, is the object storage service provided by AWS. It is probably the most commonly used, go-to storage service for AWS users given the features like extremely high availability, security, and simple connection to other AWS Services. AWS S3 can be used by people with all kinds of use cases like mobile/web applications, big data, machine learning and many more.

AWS S3 Terminology:

  • Bucket: Data, in S3, is stored in containers called buckets.
    • Each bucket will have its own set of policies and configuration. This enables users to have more control over their data.
    • Bucket Names must be unique.
    • Can be thought of as a parent folder of data.
    • There is a limit of 100 buckets per AWS account, but it can be raised by a request to AWS Support.
  • Bucket Owner: The person or organization that owns a particular bucket is its bucket owner.
  • Import/Export Station:  A machine that uploads or downloads data to/from S3.
  • Key: Key, in S3, is a unique identifier for an object in a bucket. For example in a bucket ‘ABC’ your GFG.java file is stored at javaPrograms/GFG.java then ‘javaPrograms/GFG.java’ is your object key for GFG.java.
    • It is important to note that ‘bucketName+key’ is unique for all objects.
    • This also means that there can be only one current object for a given key in a bucket. If you upload two files with the same key, the file uploaded last overwrites the previous one.
  • Versioning:  Versioning means to always keep a record of previously uploaded files in S3. Points to note:
    • Versioning is not enabled by default. Once enabled, it is enabled for all objects in a bucket.
    • Versioning keeps all the copies of your file, so, it adds cost for storing multiple copies of your data. For example, 10 copies of a file of size 1GB will have you charged for using 10GBs for S3 space.
    • Versioning is helpful to prevent unintended overwrites and deletions.
    • Note that objects with the same key can be stored in a bucket if versioning is enabled (since they have a unique version ID).
  • null Object: Version ID for objects in a bucket where versioning is suspended is null. Such objects may be referred to as null objects.
    • For buckets with versioning enabled, each version of a file has a specific version ID.
  • Object: Fundamental entity type stored in AWS S3.
  • Access Control Lists (ACL): A document for verifying the access to S3 buckets from outside your AWS account. Each bucket has its own ACL.
  • Bucket Policies: A document for verifying the access to S3 buckets from within your AWS account, this controls which services and users have what kind of access to your S3 bucket. Each bucket has its own Bucket Policies.
  • Lifecycle Rules: This is a cost-saving practice that can move your files to AWS Glacier (The AWS Data Archive Service) or to some other S3 storage class for cheaper storage of old data or completely delete the data after the specified time.
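The key-uniqueness and versioning-cost points above can be sketched in a few lines (plain Python with hypothetical bucket and key names, not the AWS API):

```python
# "bucketName + key" is unique: at most one current object per pair,
# so re-uploading to the same key overwrites the previous object.
objects = {}
objects[("ABC", "javaPrograms/GFG.java")] = "first upload"
objects[("ABC", "javaPrograms/GFG.java")] = "second upload"  # same key: overwrite
print(len(objects))  # 1 -- still a single current object under that key

# With versioning enabled, every kept copy is billed as S3 storage:
size_gb, copies = 1, 10
print(size_gb * copies)  # 10 GB billed for 10 versions of a 1 GB file
```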

Features of AWS S3:

  • Durability: AWS claims Amazon S3 to have 99.999999999% durability (11 9's). In practical terms, if you store 10,000,000 objects, you can on average expect to lose a single object once every 10,000 years.
  • Availability: AWS ensures that the up-time of AWS S3 is 99.99% for standard access.
    • Note that availability is related to being able to access data and durability is related to losing data altogether.
  • Server-Side-Encryption (SSE): AWS S3 supports three types of SSE models:
    • SSE-S3: AWS S3 manages encryption keys.
    • SSE-C: The customer manages encryption keys.
    •  SSE-KMS: The AWS Key Management Service (KMS) manages the encryption keys.
  • File Size support: AWS S3 can hold files of size ranging from 0 bytes to 5 terabytes. A 5TB limit on file size should not be a blocker for most of the applications in the world.
  • Infinite storage space: Theoretically AWS S3 is supposed to have infinite storage space. This makes S3 infinitely scalable for all kinds of use cases.
  • Pay as you use: The users are charged according to the S3 storage they hold.
  • AWS-S3 is region-specific.

S3 storage classes:

AWS S3 provides multiple storage classes that offer different performance characteristics, features, and cost structures.

  • Standard: Suitable for frequently accessed data, that needs to be highly available and durable.
  • Standard Infrequent Access (Standard IA): This is a cheaper data-storage class and as the name suggests, this class is best suited for storing infrequently accessed data like log files or data archives. Note that there may be a per GB data retrieval fee associated with Standard IA class.
  • Intelligent Tiering: This service class classifies your files automatically into frequently accessed and infrequently accessed and stores the infrequently accessed data in infrequent access storage to save costs. This is useful for unpredictable data access to an S3 bucket.
  • One Zone Infrequent Access (One Zone IA): All the files on your S3 have their copies stored in a minimum of 3 Availability Zones. One Zone IA stores this data in a single availability zone. It is only recommended to use this storage class for infrequently accessed, non-essential data. There may be a per GB cost for data retrieval.
  • Reduced Redundancy Storage (RRS): All the other S3 classes ensure a durability of 99.999999999%; RRS ensures only 99.99% durability. AWS no longer recommends RRS due to its lower durability; however, it can still be used for non-essential data.
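As a rough summary of the classes above, here is a hypothetical helper that picks a class from an access pattern. The thresholds are illustrative assumptions, not official AWS guidance:

```python
# Hypothetical storage-class chooser; thresholds are illustrative only.
def suggest_storage_class(accesses_per_month, needs_ms_retrieval,
                          can_lose_az_copy=False):
    if accesses_per_month >= 10:
        return "STANDARD"               # frequently accessed data
    if accesses_per_month > 0:
        # Infrequent access: One Zone-IA only if the data is re-creatable.
        return "ONEZONE_IA" if can_lose_az_copy else "STANDARD_IA"
    # Archival data: Glacier Instant Retrieval if milliseconds matter,
    # otherwise the cheapest option, Deep Archive.
    return "GLACIER_IR" if needs_ms_retrieval else "DEEP_ARCHIVE"

print(suggest_storage_class(30, needs_ms_retrieval=True))    # STANDARD
print(suggest_storage_class(2, needs_ms_retrieval=True))     # STANDARD_IA
print(suggest_storage_class(0, needs_ms_retrieval=False))    # DEEP_ARCHIVE
```

In practice you would also weigh minimum storage durations and retrieval fees, which this sketch ignores.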

Difference Between Amazon EBS and Amazon EFS

 The AWS EFS(Elastic file system) and AWS EBS(Elastic block storage) are two different types of storage services provided by Amazon Web Services. This article highlights some major differences between Amazon EFS and Amazon EBS.

What is AWS EBS?

EBS (Elastic Block Store) is a block-level storage service provided by Amazon, designed to be used with individual EC2 instances; a standard EBS volume can be attached to only one instance at a time. Because EBS is directly attached to the instance, it provides a high-performance option for many use cases, including relational and non-relational databases and a wide range of applications such as software testing and development.

EBS stores data in volumes made up of blocks, which act like separate hard drives; this storage is not accessible directly via the internet.

Note that an Elastic Block Store volume is similar to a hard drive connected to a physical computer, and this storage can be attached and detached at any time.

What is AWS EFS?

EFS (Elastic File System) is a file-level storage service that provides a shared, elastic file system with virtually unlimited scalability. EFS is highly available storage that can be used by many servers at the same time. It is fully managed by Amazon and scales on the fly: if the workload suddenly grows, the file system automatically scales up, and if the workload shrinks, it scales back down, so users need not worry about capacity. This elasticity also provides cost benefits, since you pay only for the storage you actually use (utility-based computing).

One important feature that sets EFS apart from other storage services is that its performance scales with the amount of data stored: a small file system may deliver modest throughput and IOPS, but at larger scales EFS can offer as much as 10 GB/s of throughput along with 500,000 IOPS.

Comparison based on Characteristics:

Storage Type

EBS (Elastic Block Store) and EFS (Elastic File System): as the names suggest, EBS is block-level storage and EFS is file-level storage.

Availability

Because EBS is directly attached to a single instance in a single Availability Zone, its availability is tied to that zone, whereas Amazon EFS is highly durable, highly available storage spread across multiple Availability Zones.

Durability

EBS volumes are similar to hard disks, with the difference that they are attached to virtual EC2 instances, and they offer 20 times more reliability than normal hard disks.

EFS is highly durable storage.

Performance

EBS offers a baseline performance of 3 IOPS per GB for General Purpose (gp2) volumes, and Provisioned IOPS volumes can be used for higher performance, whereas EFS supports up to 7,000 file-system operations per second.
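The 3 IOPS/GB baseline can be computed directly. The 100-IOPS floor and 16,000-IOPS ceiling below are the commonly documented gp2 limits, included here as assumptions; exact limits vary by volume type:

```python
# Baseline IOPS for a gp2-style General Purpose EBS volume:
# 3 IOPS per GB, with an assumed floor of 100 and ceiling of 16,000.
def baseline_iops(volume_size_gb, per_gb=3, floor=100, ceiling=16000):
    return max(floor, min(ceiling, volume_size_gb * per_gb))

print(baseline_iops(100))    # 300
print(baseline_iops(10))     # 100  (small volumes get the minimum baseline)
print(baseline_iops(10000))  # 16000 (large volumes hit the cap)
```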

Data Stored

Data stored in EBS remains in a single Availability Zone, where multiple replicas are created, whereas data stored in EFS remains in the same region, with multiple replicas created across Availability Zones within that region.

Comprehensive managed service

EFS is a completely managed service, which means that your firm will never have to patch, deploy, or maintain your file system, but the same is not the case with EBS.

Data Access

One important limitation of EBS is that it cannot be accessed directly via the internet; it can be accessed only by the EC2 instance to which it is attached. EFS, by contrast, allows anywhere from one to thousands of EC2 instances to access it concurrently over the network, though these instances must be in the same region.

Encryption

Both EBS and EFS support encryption, using AWS KMS-managed Customer Master Keys (CMKs) and the AES-256 encryption standard.

File Size Limitation

Because EBS is block storage attached directly to the EC2 instance, there is no per-file size limitation beyond the volume size, whereas in EFS the maximum size of a single file is 47.9 TiB.

Cost savings

EFS is the only one of the two in which you pay for exactly what you use; there is no advance provisioning, no up-front fees, and no commitments. With EBS, you must provision a fixed volume size and are charged for it whether or not it is fully used.

Use cases

Amazon EBS use cases:

  • Software Testing and development: Amazon EBS is connected only to a particular instance, so it is best suited for testing and development purposes.
  • Business continuity: Amazon EBS provides a good level of business consistency as users can run applications in different AWS regions and all they require is EBS Snapshots and Amazon machine images.
  • Enterprise-wide applications: EBS provides block-level storage, so it allows users to run a wide variety of applications including Microsoft Exchange, Oracle, etc.
  • Transactional and NoSQL databases: As EBS provides a low level of latency so it offers an optimum level of performance for transactional and NO SQL databases. It also helps in database management.

Amazon EFS use cases:

  • Lift-and-shift application support: EFS is elastic, highly available, and highly scalable storage, and all these features enable users to move enterprise applications easily and quickly.
  • Analytics for big data: EFS has got the ability to run big data applications.
  • Web server support: EFS is a highly robust throughput file system and is capable of enabling web serving applications, such as websites, or blogs.
  • Application development and testing: Among the different storage services provided by Amazon, EFS is the only one that offers the shared file system needed to share code and files.

Let us see the differences in summary form:

  • Full form: Amazon EBS stands for Amazon Elastic Block Store; Amazon EFS stands for Amazon Elastic File System.
  • Purpose: EBS provides block-level storage volumes for use with EC2 instances; EFS provides a simple-to-use shared file system.
  • Typical data: EBS is mainly used for data that must be quickly accessible and requires long-term durability; EFS is used in modern application development.
  • Workloads: EBS suits both kinds of database-style applications, those that rely on random reads and those that rely on random writes; industries use EFS to enhance content management systems.