Thursday, 15 June 2023

Amazon S3 – Storage Classes

 Amazon Simple Storage Service (S3) stores data in the form of objects, which makes it quite different from file or block storage services. Amazon S3 provides industry-leading scalability, data availability, security, and performance. Data uploaded by the user is stored as objects, each identified by a key, and objects are grouped into containers called buckets; a single object can be up to 5 terabytes (TB) in size. The service is designed for the online backup and archiving of data and applications on Amazon Web Services (AWS).
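
As a minimal sketch of this object model, the following Python snippet uses the boto3 SDK to create a bucket and upload a small object. The bucket name, region, and key used here are placeholders, not values from this article.

    import boto3

    s3 = boto3.client("s3", region_name="eu-west-1")

    # Bucket names are globally unique; this one is a placeholder.
    s3.create_bucket(
        Bucket="example-gfg-demo-bucket",
        CreateBucketConfiguration={"LocationConstraint": "eu-west-1"},
    )

    # Store some bytes as an object; the key acts like a path inside the bucket.
    s3.put_object(
        Bucket="example-gfg-demo-bucket",
        Key="docs/hello.txt",
        Body=b"Hello, S3!",
    )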

Amazon S3 Storage Classes:

Amazon S3 offers a range of storage classes so that you can choose the right balance of cost, availability, and retrieval time for your data. The storage classes are as follows:

  • Amazon S3 Standard
  • Amazon S3 Intelligent-Tiering
  • Amazon S3 Standard-Infrequent Access
  • Amazon S3 One Zone-Infrequent Access
  • Amazon S3 Glacier Instant Retrieval
  • Amazon S3 Glacier Flexible Retrieval
  • Amazon S3 Glacier Deep Archive

1. Amazon S3 Standard:

S3 Standard is the general-purpose storage class, offering high durability, availability, and performance for frequently accessed data. It is appropriate for a wide variety of use cases, including cloud applications, dynamic websites, content distribution, mobile and gaming applications, and big data analytics.

Characteristics of  S3 Standard:

  • Availability is 99.99% in S3 Standard.
  • Delivers low latency and high throughput for frequently accessed objects.
  • Resilient against events that affect an entire Availability Zone.
  • Durability of S3 Standard is 99.999999999% (11 nines).

2. Amazon S3 Intelligent-Tiering:

 S3 Intelligent-Tiering is the first cloud storage class that automatically reduces storage costs at a granular, per-object level by moving data between access tiers based on access frequency, without affecting performance or adding operational overhead. There are no retrieval charges in S3 Intelligent-Tiering. A short upload sketch follows the characteristics list below.

Characteristics of  S3 Intelligent-Tiering:

  • Requires no monitoring; objects are moved between access tiers automatically.
  • No minimum storage duration and no retrieval charges to access your data.
  • Availability is 99.9% in S3 Intelligent-Tiering.
  • Durability of S3 Intelligent-Tiering is 99.999999999% (11 nines).
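
The storage class is chosen per object at upload time. A minimal boto3 sketch (bucket, key, and file names are placeholders) that writes one object directly into Intelligent-Tiering and uploads another into Standard-IA:

    import boto3

    s3 = boto3.client("s3")

    # Upload straight into S3 Intelligent-Tiering.
    s3.put_object(
        Bucket="example-gfg-demo-bucket",
        Key="reports/2023-06.csv",
        Body=b"id,value\n1,42\n",
        StorageClass="INTELLIGENT_TIERING",
    )

    # Upload a backup file into S3 Standard-IA instead.
    s3.upload_file(
        Filename="backup.tar.gz",
        Bucket="example-gfg-demo-bucket",
        Key="backups/backup.tar.gz",
        ExtraArgs={"StorageClass": "STANDARD_IA"},
    )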

3. Amazon S3 Standard-Infrequent Access:

S3 Standard-IA is used for data that is accessed less frequently but requires rapid access when needed. It offers the high durability, high throughput, and low latency of S3 Standard at a lower per-GB storage price (with a per-GB retrieval charge). It is well suited for long-term backups and acts as a data store for disaster recovery files.

Characteristics of  S3 Standard-Infrequent Access:

  • Same low latency and high throughput as S3 Standard.
  • Data is stored redundantly across multiple Availability Zones.
  • Availability is 99.9% in S3 Standard-IA.
  • Durability is 99.999999999% (11 nines).

4. Amazon S3 Glacier Instant Retrieval:

It is an archive storage class that delivers the lowest-cost storage for long-lived data that is rarely accessed but still needs to be available immediately. S3 Glacier Instant Retrieval delivers the fastest access among the archive storage classes, with the same millisecond retrieval as S3 Standard.

Characteristics of S3 Glacier Instant Retrieval:

  • It takes only milliseconds to retrieve the data.
  • The minimum billable object size is 128 KB.
  • Availability is 99.9% in S3 Glacier Instant Retrieval.
  • Durability is 99.999999999% (11 nines).

5. Amazon S3 One Zone-Infrequent Access:

Different from other S3 Storage Classes which store data in a minimum of three Availability Zones, S3 One Zone-IA stores data in a single Availability Zone and costs 20% less than S3 Standard-IA. It’s a very good choice for storing secondary backup copies of on-premises data or easily re-creatable data. S3 One Zone-IA provides you the same high durability, high throughput, and low latency as in S3 Standard.

Characteristics of S3 One Zone-Infrequent Access:-

  • Supports SSL/TLS for data in transit and encryption for data at rest.
  • Data can be lost if the single Availability Zone it is stored in is destroyed.
  • Availability is 99.5% in S3 One Zone-Infrequent Access.
  • Durability is 99.999999999% (11 nines).

6. Amazon S3 Glacier Flexible Retrieval:

It provides lower-cost storage than S3 Glacier Instant Retrieval and is a suitable solution for backups that only need to be restored a few times a year. Retrievals typically complete in minutes for expedited requests, or within hours for standard and bulk requests.

Characteristics of S3 Glacier Flexible Retrieval:

  • Bulk retrievals are free.
  • Data is stored across multiple Availability Zones, so it can withstand the destruction of an entire AZ.
  • When you occasionally have to retrieve large data sets, S3 Glacier Flexible Retrieval is best for backup and disaster recovery use cases (see the restore sketch after this list).
  • Availability is 99.99% in S3 Glacier Flexible Retrieval.
  • Durability is 99.999999999% (11 nines).
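
Objects in the Glacier storage classes (other than Glacier Instant Retrieval) must be restored before they can be downloaded. A hedged boto3 sketch, with placeholder bucket and key names, that initiates a standard-tier restore for 7 days:

    import boto3

    s3 = boto3.client("s3")

    # Ask S3 to stage a temporary, downloadable copy of an archived object.
    s3.restore_object(
        Bucket="example-gfg-demo-bucket",
        Key="backups/2022/archive.tar.gz",
        RestoreRequest={
            "Days": 7,  # keep the restored copy available for 7 days
            "GlacierJobParameters": {"Tier": "Standard"},  # or "Expedited" / "Bulk"
        },
    )

    # The restore runs asynchronously; head_object shows its progress.
    resp = s3.head_object(Bucket="example-gfg-demo-bucket", Key="backups/2022/archive.tar.gz")
    print(resp.get("Restore"))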

7. Amazon S3 Glacier Deep Archive:

The Glacier Deep Archive storage class is designed for secure, long-term retention of large amounts of data at a price competitive with off-premises tape archival services, so you no longer need to deal with expensive tape infrastructure. It is the lowest-cost S3 storage class, and data can be restored within 12 hours. S3 Glacier Deep Archive also supports S3 Replication for copying objects between buckets.

Characteristics of S3 Glacier Deep Archive:-

  • Highly secure, lowest-cost archival storage.
  • Data can be restored within 12 hours.
  • Availability is 99.99% in S3 Glacier Deep Archive.
  • Durability is 99.999999999% (11 nines).

Introduction to AWS Simple Storage Service (AWS S3)

 AWS Storage Services: AWS offers a wide range of storage services that can be provisioned depending on your project requirements and use case. AWS storage services have different provisions for highly confidential data, frequently accessed data, and the not so frequently accessed data. You can choose from various storage types namely, object storage, file storage, block storage services, backups, and data migration options. All of which fall under the AWS Storage Services list. 

 

AWS Simple Storage Service (S3): From the aforementioned list, S3, is the object storage service provided by AWS. It is probably the most commonly used, go-to storage service for AWS users given the features like extremely high availability, security, and simple connection to other AWS Services. AWS S3 can be used by people with all kinds of use cases like mobile/web applications, big data, machine learning and many more.

AWS S3 Terminology:

  • Bucket: Data, in S3, is stored in containers called buckets.
    • Each bucket will have its own set of policies and configuration. This enables users to have more control over their data.
    • Bucket Names must be unique.
    • Can be thought of as a parent folder of data.
    • There is a default limit of 100 buckets per AWS account, which can be increased by requesting a quota increase from AWS Support.
  • Bucket Owner: The person or organization that owns a particular bucket is its bucket owner.
  • Import/Export Station:  A machine that uploads or downloads data to/from S3.
  • Key: Key, in S3, is a unique identifier for an object in a bucket. For example in a bucket ‘ABC’ your GFG.java file is stored at javaPrograms/GFG.java then ‘javaPrograms/GFG.java’ is your object key for GFG.java.
    • It is important to note that ‘bucketName+key’ is unique for all objects.
    • This also means that there can be only one object for a given key in a bucket. If you upload two files with the same key, the most recently uploaded file overwrites the previous one.
  • Versioning:  Versioning means to always keep a record of previously uploaded files in S3. Points to note:
    • Versioning is not enabled by default. Once enabled, it is enabled for all objects in a bucket.
    • Versioning keeps all the copies of your file, so it adds cost for storing multiple copies of your data. For example, 10 copies of a 1 GB file will be billed as 10 GB of S3 storage.
    • Versioning is helpful to prevent unintended overwrites and deletions.
    • Note that objects with the same key can be stored in a bucket if versioning is enabled (since they have a unique version ID).
  • null Object: Version ID for objects in a bucket where versioning is suspended is null. Such objects may be referred to as null objects.
    • For buckets with versioning enabled, each version of a file has a specific version ID.
  • Object: Fundamental entity type stored in AWS S3.
  • Access Control Lists (ACL): A document for verifying the access to S3 buckets from outside your AWS account. Each bucket has its own ACL.
  • Bucket Policies: A document for verifying the access to S3 buckets from within your AWS account, this controls which services and users have what kind of access to your S3 bucket. Each bucket has its own Bucket Policies.
  • Lifecycle Rules: This is a cost-saving practice that can move your files to the Glacier archive classes or to another, cheaper S3 storage class as they age, or delete the data entirely after the specified time. A configuration sketch follows this list.
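
As a hedged illustration of the versioning and lifecycle concepts above (the bucket name, prefix, and rule ID are placeholders), the following boto3 calls enable versioning on a bucket and add a lifecycle rule that tiers old log objects down and eventually expires them:

    import boto3

    s3 = boto3.client("s3")
    bucket = "example-gfg-demo-bucket"

    # Turn on versioning; every overwrite now creates a new version ID.
    s3.put_bucket_versioning(
        Bucket=bucket,
        VersioningConfiguration={"Status": "Enabled"},
    )

    # Move objects under logs/ to Standard-IA after 30 days,
    # to Glacier Deep Archive after 365 days, and delete them after 5 years.
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration={
            "Rules": [
                {
                    "ID": "tier-down-old-logs",
                    "Status": "Enabled",
                    "Filter": {"Prefix": "logs/"},
                    "Transitions": [
                        {"Days": 30, "StorageClass": "STANDARD_IA"},
                        {"Days": 365, "StorageClass": "DEEP_ARCHIVE"},
                    ],
                    "Expiration": {"Days": 1825},
                }
            ]
        },
    )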

Features of AWS S3:

  • Durability: AWS claims Amazon S3 to have 99.999999999% durability (11 9’s). In practice this means that if you store 10,000,000 objects, you can on average expect to lose a single object once every 10,000 years.
  • Availability: AWS ensures that the up-time of AWS S3 is 99.99% for standard access.
    • Note that availability is related to being able to access data and durability is related to losing data altogether.
  • Server-Side-Encryption (SSE): AWS S3 supports three SSE models (an upload sketch follows the features list below):
    • SSE-S3: AWS S3 manages the encryption keys.
    • SSE-C: The customer provides and manages the encryption keys.
    • SSE-KMS: The AWS Key Management Service (KMS) manages the encryption keys.
  • File Size support: AWS S3 can hold files of size ranging from 0 bytes to 5 terabytes. A 5TB limit on file size should not be a blocker for most of the applications in the world.
  • Infinite storage space: Theoretically AWS S3 is supposed to have infinite storage space. This makes S3 infinitely scalable for all kinds of use cases.
  • Pay as you use: The users are charged according to the S3 storage they hold.
  • AWS-S3 is region-specific.
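
A minimal boto3 sketch of server-side encryption at upload time; the bucket, keys, and KMS key alias are placeholders:

    import boto3

    s3 = boto3.client("s3")

    # SSE-S3: S3 encrypts the object with keys it manages (AES-256).
    s3.put_object(
        Bucket="example-gfg-demo-bucket",
        Key="secure/report.pdf",
        Body=b"...",
        ServerSideEncryption="AES256",
    )

    # SSE-KMS: S3 encrypts the object with a KMS key you control.
    s3.put_object(
        Bucket="example-gfg-demo-bucket",
        Key="secure/report-kms.pdf",
        Body=b"...",
        ServerSideEncryption="aws:kms",
        SSEKMSKeyId="alias/example-demo-key",  # placeholder KMS key alias
    )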

S3 storage classes:

AWS S3 provides multiple storage types that offer different performance and features and different cost structure. 

  • Standard: Suitable for frequently accessed data, that needs to be highly available and durable.
  • Standard Infrequent Access (Standard IA): This is a cheaper data-storage class and as the name suggests, this class is best suited for storing infrequently accessed data like log files or data archives. Note that there may be a per GB data retrieval fee associated with Standard IA class.
  • Intelligent Tiering: This service class classifies your files automatically into frequently accessed and infrequently accessed and stores the infrequently accessed data in infrequent access storage to save costs. This is useful for unpredictable data access to an S3 bucket.
  • One Zone Infrequent Access (One Zone IA): All the files on your S3 have their copies stored in a minimum of 3 Availability Zones. One Zone IA stores this data in a single availability zone. It is only recommended to use this storage class for infrequently accessed, non-essential data. There may be a per GB cost for data retrieval.
  • Reduced Redundancy Storage (RRS): All the other S3 classes ensure a durability of 99.999999999%; RRS only ensures 99.99% durability. AWS no longer recommends RRS because of its lower durability; however, it can still be used to store non-essential data.

Difference Between Amazon EBS and Amazon EFS

 AWS EFS (Elastic File System) and AWS EBS (Elastic Block Store) are two different types of storage services provided by Amazon Web Services. This article highlights some major differences between Amazon EFS and Amazon EBS.

What is AWS EBS?

EBS (Elastic Block Store) is a block-level storage service provided by Amazon. It is designed to be used with individual EC2 instances; in general, an EBS volume is attached to a single instance at a time. Because EBS is directly attached to the instance, it is a high-performance option for many use cases, including relational and non-relational databases and a wide range of applications such as software testing and development.

EBS stores data in volumes, which behave like separate hard drives attached to the instance, and this storage is not accessible via the internet.

Note that Elastic block storage is similar to a hard-drive connected to a physical computer and this storage can be attached and detached at any time.
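
A hedged boto3 sketch of the attach/detach behaviour described above; the region, Availability Zone, size, and instance ID are placeholders:

    import boto3

    ec2 = boto3.client("ec2", region_name="eu-west-1")

    # Create a 100 GiB General Purpose volume in a specific Availability Zone.
    volume = ec2.create_volume(
        AvailabilityZone="eu-west-1a",
        Size=100,
        VolumeType="gp2",
    )

    # Wait until the volume is ready, then attach it to a single EC2 instance.
    ec2.get_waiter("volume_available").wait(VolumeIds=[volume["VolumeId"]])
    ec2.attach_volume(
        VolumeId=volume["VolumeId"],
        InstanceId="i-0123456789abcdef0",  # placeholder instance ID
        Device="/dev/sdf",
    )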

What is AWS EFS?

EFS (Elastic File System) is a file-level storage service that provides a shared, elastic file system with virtually unlimited scalability. EFS is highly available storage that can be used by many servers at the same time. AWS EFS is a fully managed service, and it scales on the fly: users need not worry about an increasing or decreasing workload, because the storage automatically grows when the workload rises and shrinks when it falls. This elasticity also provides cost benefits, since you pay nothing for storage you don’t use; you pay only for what you use (utility-based computing).
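
A minimal boto3 sketch of provisioning a shared, elastic file system; the creation token, subnet ID, and security group ID are placeholders:

    import boto3

    efs = boto3.client("efs", region_name="eu-west-1")

    # Create the file system; capacity grows and shrinks automatically.
    fs = efs.create_file_system(
        CreationToken="example-shared-fs",
        PerformanceMode="generalPurpose",
        Encrypted=True,
    )

    # Expose it to EC2 instances in one subnet via an NFS mount target.
    # (Real code would wait until the file system is 'available' first.)
    efs.create_mount_target(
        FileSystemId=fs["FileSystemId"],
        SubnetId="subnet-0123456789abcdef0",      # placeholder subnet
        SecurityGroups=["sg-0123456789abcdef0"],  # placeholder security group
    )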

One important characteristic of EFS that sets it apart from other storage is that its performance scales with the amount of data stored: for small file systems the baseline throughput and IOPS are modest, but when used more heavily EFS can offer as much as 10 GB/s of throughput along with 500,000 IOPS.

Comparison based on Characteristics:

Storage Type

As the names suggest, EBS (Elastic Block Store) is block-level storage and EFS (Elastic File System) is file-level storage.

Availability

Because an EBS volume lives in a single Availability Zone and is attached directly to one instance, it offers no built-in multi-AZ availability, whereas Amazon EFS is highly durable and highly available storage.

Durability

EBS is similar to a hard disk, with the difference that EBS volumes are attached to virtual EC2 instances, and AWS states that EBS offers 20 times more reliability than normal hard disks.

EFS is highly durable storage.

Performance

EBS offers a baseline performance of 3 IOPS per GiB for General Purpose (gp2) volumes, and Provisioned IOPS volumes can be used when more performance is needed, whereas EFS supports up to 7,000 file system operations per second.
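
For workloads that need more than the gp2 baseline, a Provisioned IOPS volume can be requested explicitly. A hedged boto3 sketch with placeholder values:

    import boto3

    ec2 = boto3.client("ec2", region_name="eu-west-1")

    # gp2 baseline: a 100 GiB volume gets roughly 3 * 100 = 300 IOPS.
    # io1 lets you provision IOPS independently of size (within AWS limits).
    ec2.create_volume(
        AvailabilityZone="eu-west-1a",
        Size=100,
        VolumeType="io1",
        Iops=3000,  # explicitly provisioned IOPS
    )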

Data Stored

The data stored in EBS remains in the same availability zone and multiple replicas are created within the same availability zone whereas in EFS the data stored remains in the same region and multiple replicas are created within the same region.

Comprehensive managed service

EFS is a completely managed service, which means that your firm will never have to patch, deploy, or maintain your file system, but the same is not the case with EBS.

Data Access

One important disadvantage of EBS is that it cannot be accessed directly via the internet; it can only be accessed by the single EC2 instance to which it is attached. EFS, on the other hand, allows anywhere from one to thousands of EC2 instances to access it concurrently over NFS, provided the instances can reach the file system's mount targets, which are typically in the same region as the file system.

Encryption

Both EBS and EFS support encryption at rest using an AWS KMS–managed Customer Master Key (CMK) and the AES-256 encryption standard.

File Size Limitation

Because EBS is presented to the EC2 instance as a raw block device, file size is limited only by the volume and the file system you create on it, whereas in EFS the maximum size of a single file is 47.9 TiB.

Cost savings

EFS is the only one of the two in which you pay exactly for what you use; there is no advance provisioning, up-front fee, or commitment. With EBS you provision a volume of a fixed size and are charged for that provisioned capacity, whether or not you use all of it.

Use cases

Amazon EBS use cases:

  • Software Testing and development: Amazon EBS is connected only to a particular instance, so it is best suited for testing and development purposes.
  • Business continuity: Amazon EBS provides a good level of business continuity, as users can run applications in different AWS regions using only EBS snapshots and Amazon Machine Images.
  • Enterprise-wide applications: EBS provides block-level storage, so it allows users to run a wide variety of applications including Microsoft Exchange, Oracle, etc.
  • Transactional and NoSQL databases: Because EBS provides low latency, it offers an optimal level of performance for transactional and NoSQL databases. It also helps in database management.

Amazon EFS use cases:

  • Lift-and-shift application support: EFS is elastic, highly available, and highly scalable storage, and these features enable users to move enterprise applications to AWS easily and quickly.
  • Analytics for big data: EFS has got the ability to run big data applications.
  • Web server support: EFS is a highly robust throughput file system and is capable of enabling web serving applications, such as websites, or blogs.
  • Application development and testing: Among the storage services offered by Amazon, EFS is the only one that provides the shared file system needed to share code and files across instances.

Let us summarize the differences point by point:

1. The full form of Amazon EBS is Amazon Elastic Block Store, while the full form of Amazon EFS is Amazon Elastic File System.
2. EBS is used to provide block-level storage volumes for use with EC2 instances, while EFS is simple to use.
3. EBS is mainly used for data that should be quickly accessible and requires long-term durability, while EFS is used to modernize application development.
4. EBS is suitable for database-style applications that rely on random reads and random writes, while EFS is used by industries for enhancing content management systems.

Introduction to Amazon Elastic Container Registry

 Amazon Web Services is a subsidiary of Amazon.com that provides on-demand cloud computing platforms to individuals, companies, and governments, on a paid subscription basis.

Cloud Computing:
Cloud computing is the on-demand delivery of compute power, database storage, applications, and other IT resources through a cloud services platform via the internet with pay-as-you-go pricing.

What Is Amazon Elastic Container Registry?
Amazon Elastic Container Registry (ECR) is a managed AWS Docker registry service. Amazon ECR is a secure and reliable AWS service. Just like any other cloud computing service, we can scale it up or scale it down based on our requirements. Amazon ECR uses AWS Identity and Access Management (IAM) to enable resource-based permissions for private Docker repositories. Through the Docker command line interface (CLI) we can push, pull, and manage images.

Components of Amazon ECR:
Amazon ECR has the following components:

  • Registry:
    Each AWS account is provided with an Amazon ECR registry. Within the registry, you can create image repositories and store images in them.
  • Authorization Token:
    Before pushing or pulling images, your Docker client must authenticate to the Amazon ECR registry as an AWS user. The AWS Command Line Interface (CLI) provides a get-login command (replaced by get-login-password in AWS CLI v2) that returns credentials to pass to Docker; a short sketch of this flow follows the components list below.
  • Repository:
    An Amazon ECR image repository contains your Docker images.
  • Repository Policy:
    Repository policies enable users to control access to a repository and the images within it.
  • Image:
    Users can easily push and pull Docker images to and from their repositories. An image can be used on a local system or referenced in Amazon ECS task definitions.
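
A hedged boto3 sketch of the registry, repository, and authorization-token flow above; the repository name is a placeholder, and the decoded credentials would normally be passed to docker login:

    import base64
    import boto3

    ecr = boto3.client("ecr", region_name="eu-west-1")

    # Create a private repository in this account's registry.
    repo = ecr.create_repository(repositoryName="example-demo-app")
    print(repo["repository"]["repositoryUri"])

    # Get a temporary authorization token for the Docker client.
    auth = ecr.get_authorization_token()["authorizationData"][0]
    user, password = base64.b64decode(auth["authorizationToken"]).decode().split(":", 1)
    # 'user' is "AWS"; pass user, password, and auth["proxyEndpoint"] to `docker login`.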

List of available AWS CLI commands for Amazon ECR:

  • batch-check-layer-availability
  • batch-delete-image
  • batch-get-image
  • complete-layer-upload
  • create-repository
  • delete-lifecycle-policy
  • delete-repository
  • delete-repository-policy
  • describe-images
  • describe-repositories
  • get-authorization-token
  • get-download-url-for-layer
  • get-lifecycle-policy
  • get-lifecycle-policy-preview
  • get-login
  • get-repository-policy
  • initiate-layer-upload
  • list-images
  • list-tags-for-resource
  • put-image
  • put-lifecycle-policy
  • set-repository-policy
  • start-lifecycle-policy-preview
  • tag-resource
  • untag-resource
  • upload-layer-part

Amazon Web Services – Generating Log Bundle for EKS Instance

 Amazon Elastic Kubernetes Service (EKS) is a managed Kubernetes service on AWS. When troubleshooting an EKS worker node, AWS Systems Manager provides an automation runbook that collects the operating system and Kubernetes logs from the instance into a single log bundle.

In this article, we will look into how users can generate a log bundle for their Amazon Elastic Kubernetes Service instances. To do so follow the below steps:

Step 1: After logging into the AWS Management Console, navigate to the Systems Manager console.

Step 2: Then go to automation in the left pane.

Step 3: Then choose Execute automation.

Step 4: Then choose AWSSupport-CollectEKSInstanceLogs in the list and choose Next.

Step 5: Now enter the Amazon Elastic Compute Cloud (EC2) instance ID of your Amazon EKS node in the EKS instance ID field.

Step 6: To upload the collected logs to an Amazon S3 bucket, enter the bucket name in the log destination field. Note that the S3 bucket used for this purpose can’t be public; otherwise, for security reasons, the logs aren’t uploaded to the provided S3 bucket.
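
The same automation can also be started programmatically. A hedged boto3 sketch, assuming the runbook exposes the instance ID and log destination under the parameter names shown (check the document's actual parameter names in your account before relying on them):

    import boto3

    ssm = boto3.client("ssm", region_name="eu-west-1")

    # Start the log-collection runbook; the parameter names below are assumptions.
    execution = ssm.start_automation_execution(
        DocumentName="AWSSupport-CollectEKSInstanceLogs",
        Parameters={
            "EKSInstanceId": ["i-0123456789abcdef0"],  # placeholder EKS node instance ID
            "LogDestination": ["example-log-bucket"],   # placeholder, non-public S3 bucket
        },
    )

    # Poll the execution to see its status and, eventually, the bundle location.
    status = ssm.get_automation_execution(
        AutomationExecutionId=execution["AutomationExecutionId"]
    )
    print(status["AutomationExecution"]["AutomationExecutionStatus"])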

To successfully run this automation and see the output the user running it needs the following permissions:

  • ssm:ExecuteAutomation
  • ssm:GetAutomationExecution
  • ssm:SendCommand

If the user invoking the document doesn’t have the required permissions, you must provide an appropriate AWS Identity and Access Management (IAM) role in the automation assume role field and then choose the Execute button.

The first Run Command step runs the log-collection script, which saves the manual effort of collecting logs by hand. A branch step then checks whether an S3 bucket was provided in the automation execution; if so, a second Run Command step uploads the log bundle to that bucket. When the automation execution is complete, preview the output of the upload step to view the S3 location of the log bundle.

Amazon Web Services – Introduction to Amazon EKS

 Amazon Elastic Kubernetes Service (EKS) is a fully managed service that you can use to run Kubernetes on Amazon Web Services. Kubernetes is open-source software that enables you to deploy and manage containerized applications at scale.

Its characteristics are:

  • Availability: To ensure high availability, EKS runs and scales the Kubernetes control plane across multiple AWS Availability Zones.
  • Strength: EKS automatically scales control plane instances to handle load and replaces unhealthy instances, avoiding control plane issues.
  • Security and integration: It works with various AWS services to provide security and scalability for applications, for example:
    • Amazon ECR for container images
    • Elastic Load Balancing for load distribution.

AWS Fargate: It is a serverless compute engine for containers. It works with Amazon EKS  or Amazon ECS.

Amazon EKS Sections :

Amazon EKS organization contains the following sections: clusters, nodes, and networking.

  1. Clusters – A cluster consists of the Kubernetes control plane and EKS nodes.
  2. EKS nodes – Kubernetes worker nodes run on Amazon EC2 in your organization’s AWS account (each cluster uses its own certificate). Pods can be scheduled onto nodes provisioned in three ways:
    • Self-Managed Nodes
    • Managed Node Groups
    • Amazon Fargate
  3. Amazon EKS networking – EKS operates inside a Virtual Private Cloud (VPC), so cluster resources are launched into existing subnets of your network.

Advantages of AWS EKS :

Following are the advantages of using Amazon EKS:

  1. EKS automates load distribution and parallel processing better than a DevOps engineer could by hand.
  2. EKS uses VPC networking (explained above).
  3. It supports EC2 Spot Instances using managed node groups that follow Spot best practices.
  4. Your Kubernetes resources integrate smoothly with AWS services if you use EKS.
  5. EKS allows you to run standard Kubernetes tooling without modification.

Amazon EKS Control Plane Architecture: 

Each cluster runs its own dedicated Kubernetes control plane. The control plane consists of at least two API server instances and three etcd instances (etcd is the key-value store that holds Kubernetes cluster data) running across multiple Availability Zones. EKS scales control plane instances to maintain high performance under load and detects and replaces unhealthy instances. The control plane cannot be accessed by other AWS accounts or clusters; only authorized users of the cluster can reach it.

Working of Amazon EKS: 

  • First, create an Amazon EKS cluster in the console (or programmatically, as in the sketch after this list).
  • Then launch managed or self-managed EC2 nodes, or schedule your workloads on AWS Fargate.
  • After your cluster is ready, you can communicate with it using standard Kubernetes tools such as kubectl.
  • Users can now manage and run their workloads on Kubernetes.
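
A hedged boto3 sketch of the first step; the cluster name, IAM role ARN, and subnet IDs are placeholders, and the IAM role must already exist and trust the EKS service:

    import boto3

    eks = boto3.client("eks", region_name="eu-west-1")

    # Ask EKS to provision a managed control plane inside your VPC subnets.
    eks.create_cluster(
        name="example-demo-cluster",
        roleArn="arn:aws:iam::123456789012:role/exampleEksClusterRole",  # placeholder
        resourcesVpcConfig={
            "subnetIds": ["subnet-0123456789abcdef0", "subnet-0fedcba9876543210"],
        },
    )

    # Cluster creation takes several minutes; poll its status before adding nodes.
    status = eks.describe_cluster(name="example-demo-cluster")["cluster"]["status"]
    print(status)  # e.g. "CREATING", then "ACTIVE"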

Pricing of EKS: 

You pay a fixed hourly charge for each EKS cluster you run, plus the cost of the EC2 instances or Fargate resources that run your workloads. The compute portion can be paid for on demand (short term) or through 1- to 3-year commitments such as Reserved Instances or Savings Plans; the long-term option is somewhat cheaper because it sets a commitment of one to three years.