Monday, 4 July 2022

AWS Elastic File System - Theory

ELASTIC FILE SYSTEM (EFS):

Amazon Elastic File System (Amazon EFS) provides a simple, serverless, set-and-forget elastic file system for use with AWS Cloud services and on-premises resources. It is built to scale on demand to petabytes without disrupting applications, growing and shrinking automatically as you add and remove files, eliminating the need to provision and manage capacity to accommodate growth.

Amazon EFS has a simple web services interface that allows you to create and configure file systems quickly and easily. The service manages all the file storage infrastructure for you, meaning that you can avoid the complexity of deploying, patching, and maintaining complex file system configurations. Amazon EFS supports the Network File System version 4 (NFSv4.1 and NFSv4.0) protocol, so the applications and tools that you use today work seamlessly with Amazon EFS. Multiple compute instances, including Amazon EC2, Amazon ECS, and AWS Lambda, can access an Amazon EFS file system at the same time, providing a common data source for workloads and applications running on more than one compute instance or server.

With Amazon EFS, you pay only for the storage used by your file system; there is no minimum fee or setup cost.

Amazon EFS offers a range of storage classes designed for different use cases:

  • Standard storage classes – EFS Standard and EFS Standard–Infrequent Access (Standard–IA), which offer multi-AZ resilience and the highest levels of durability and availability.
  • One Zone storage classes – EFS One Zone and EFS One Zone–Infrequent Access (One Zone–IA), which offer additional savings in exchange for storing data in a single Availability Zone.


Overview:

Amazon EFS provides a simple, serverless, set-and-forget elastic file system. With Amazon EFS, you can create a file system, mount the file system on an Amazon EC2 instance, and then read and write data to and from your file system. You can mount an Amazon EFS file system in your virtual private cloud (VPC) through the Network File System versions 4.0 and 4.1 (NFSv4) protocol.

We recommend using a current-generation Linux NFSv4.1 client, such as those found in the latest Amazon Linux, Amazon Linux 2, Red Hat, Ubuntu, and macOS Big Sur AMIs, in conjunction with the Amazon EFS mount helper (see Using the amazon-efs-utils Tools in the Amazon EFS User Guide). For some AMIs, you must install an NFS client before you can mount your file system on your Amazon EC2 instance.

You can access your Amazon EFS file system concurrently from multiple NFS clients, so applications that scale beyond a single connection can access a file system. Amazon EC2 and other AWS compute instances running in multiple Availability Zones within the same AWS Region can access the file system, so that many users can access and share a common data source. For a list of AWS Regions where you can create an Amazon EFS file system, see the Amazon Web Services General Reference.

To access your Amazon EFS file system in a VPC, you create one or more mount targets in the VPC:

  • For file systems using Standard storage classes, you can create a mount target in each Availability Zone in the AWS Region.
  • For file systems using One Zone storage classes, you create only a single mount target, in the same Availability Zone as the file system.




Features

  • The service manages all the file storage infrastructure for you, avoiding the complexity of deploying, patching, and maintaining complex file system configurations.
  • EFS supports the Network File System version 4 protocol.
  • You can mount EFS file systems onto EC2 instances running Linux or macOS Big Sur; Windows is not supported.
  • Aside from EC2 instances, you can also mount EFS filesystems on ECS tasks, EKS pods, and Lambda functions.
  • Multiple Amazon EC2 instances can access an EFS file system at the same time, providing a common data source for workloads and applications running on more than one instance or server.
  • EFS file systems store data and metadata across multiple Availability Zones in an AWS Region.
  • EFS file systems can grow to petabyte scale, drive high levels of throughput, and allow massively parallel access from EC2 instances to your data.
  • EFS provides file system access semantics, such as strong data consistency and file locking.
  • EFS enables you to control access to your file systems through Portable Operating System Interface (POSIX) permissions.
  • Moving your EFS file data can be managed simply with AWS DataSync – a managed data transfer service that makes it faster and simpler to move data between on-premises storage and Amazon EFS.
  • You can schedule automatic incremental backups of your EFS file system using the EFS-to-EFS Backup solution.
  • Amazon EFS Infrequent Access (EFS IA) is a storage class for Amazon EFS that is cost-optimized for files that are accessed less frequently. Customers can use EFS IA by creating a new file system and enabling Lifecycle Management. With Lifecycle Management enabled, EFS will automatically move files that have not been accessed for 30 days from the Standard storage class to the Infrequent Access storage class. To lower your costs further, in exchange for giving up multi-AZ resilience, you can use the EFS One Zone–IA storage class.


Performance Modes

  • General purpose performance mode (default)
    • Ideal for latency-sensitive use cases.
  • Max I/O mode
    • Can scale to higher levels of aggregate throughput and operations per second with a tradeoff of slightly higher latencies for file operations.

Throughput Modes

  • Bursting Throughput mode (default)
    • Throughput scales as your file system grows.
  • Provisioned Throughput mode
    • You specify the throughput of your file system independent of the amount of data stored.
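For illustration, both the performance mode and the throughput mode are chosen when the file system is created. Below is a minimal sketch using the AWS SDK for JavaScript (v2), matching the style of the Lambda code later in these notes; the region, creation token, and throughput figure are placeholder values:

var AWS = require("aws-sdk");
var efs = new AWS.EFS({ region: "us-east-1" });

// Create a file system with an explicit performance and throughput mode.
// The creation token simply makes the call idempotent.
var params = {
  CreationToken: "my-efs-example",      // placeholder
  PerformanceMode: "generalPurpose",    // or "maxIO"
  ThroughputMode: "provisioned",        // or "bursting" (the default)
  ProvisionedThroughputInMibps: 10,     // only valid with "provisioned"
  Encrypted: true
};

efs.createFileSystem(params, function (err, data) {
  if (err) console.log(err, err.stack);
  else console.log("File system ID:", data.FileSystemId);
});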

Mount Targets

  • To access your EFS file system in a VPC, you create one or more mount targets in the VPC. A mount target provides an IP address for an NFSv4 endpoint.
  • You can create one mount target in each Availability Zone in a region.
  • You mount your file system using its DNS name, which resolves to the IP address of the EFS mount target. The DNS name has the format (a sketch of creating a mount target with the SDK follows this list):
    file-system-id.efs.aws-region.amazonaws.com

  • When using Amazon EFS with an on-premises server, your on-premises server must have a Linux based operating system.
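As a rough sketch of creating a mount target with the AWS SDK for JavaScript (v2); the file system ID, subnet ID, and security group ID are placeholders:

var AWS = require("aws-sdk");
var efs = new AWS.EFS({ region: "us-east-1" });

// One mount target per Availability Zone; the subnet determines the AZ.
var params = {
  FileSystemId: "fs-12345678",          // placeholder
  SubnetId: "subnet-0abc1234",          // placeholder
  SecurityGroups: ["sg-0def5678"]       // controls NFS (TCP 2049) access
};

efs.createMountTarget(params, function (err, data) {
  if (err) console.log(err, err.stack);
  else console.log("Mount target IP:", data.IpAddress);
});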

Access Points

  • EFS Access Points simplify how applications are provided access to shared data sets in an EFS file system. 
  • EFS Access Points work together with AWS IAM and enforce an operating system user and group, and a directory for every file system request made through the access point. 
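A minimal sketch of creating an access point with the AWS SDK for JavaScript (v2); the file system ID, UID/GID, directory, and permissions are assumed values:

var AWS = require("aws-sdk");
var efs = new AWS.EFS({ region: "us-east-1" });

// Every request through this access point acts as UID/GID 1001 and is
// rooted at /app (created on first use if it doesn't exist).
var params = {
  FileSystemId: "fs-12345678",          // placeholder
  PosixUser: { Uid: 1001, Gid: 1001 },
  RootDirectory: {
    Path: "/app",
    CreationInfo: { OwnerUid: 1001, OwnerGid: 1001, Permissions: "750" }
  }
};

efs.createAccessPoint(params, function (err, data) {
  if (err) console.log(err, err.stack);
  else console.log("Access point ARN:", data.AccessPointArn);
});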

Components of a File System

  • ID
  • creation token
  • creation time
  • file system size in bytes
  • number of mount targets created for the file system
  • file system state
  • mount target

Data Consistency in EFS

  • EFS provides the open-after-close consistency semantics that applications expect from NFS.
  • Write operations will be durably stored across Availability Zones.
  • Applications that perform synchronous data access and perform non-appending writes will have read-after-write consistency for data access.

Managing File Systems

  • You can create encrypted file systems. EFS supports encryption in transit and encryption at rest.
  • Managing file system network accessibility refers to managing the mount targets:
    • Creating and deleting mount targets in a VPC
    • Updating the mount target configuration
  • You can create new tags, update values of existing tags, or delete tags associated with a file system.
  • The following list explains the metered data size for different types of file system objects.
    • Regular files – the metered data size of a regular file is the logical size of the file rounded to the next 4-KiB increment, except that it may be less for sparse files.
      • A sparse file is a file to which data is not written to all positions of the file before its logical size is reached. For a sparse file, if the actual storage used is less than the logical size rounded to the next 4-KiB increment, Amazon EFS reports actual storage used as the metered data size.
    • Directories – the metered data size of a directory is the actual storage used for the directory entries and the data structure that holds them, rounded to the next 4 KiB increment. The metered data size doesn’t include the actual storage used by the file data.
    • Symbolic links and special files – the metered data size for these objects is always 4 KiB.
  • File system deletion is a destructive action that you can’t undo. You lose the file system and any data you have in it, and you can’t restore the data. You should always unmount a file system before you delete it.
  • You can use AWS DataSync to automatically, efficiently, and securely copy files between two Amazon EFS resources, including file systems in different AWS Regions and ones owned by different AWS accounts.  Using DataSync to copy data between EFS file systems, you can perform one-time migrations, periodic ingest for distributed workloads, or automate replication for data protection and recovery.
  • File systems created using the Amazon EFS console are automatically backed up daily through AWS Backup with a retention of 35 days. You can also disable automatic backups for your file systems at any time.
  • Amazon CloudWatch metrics can monitor your EFS file system storage usage, including the size in each of the EFS storage classes.
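To see the metered sizes described above, you can read the SizeInBytes field returned by DescribeFileSystems. A minimal sketch with the AWS SDK for JavaScript (v2); the file system ID is a placeholder:

var AWS = require("aws-sdk");
var efs = new AWS.EFS({ region: "us-east-1" });

// SizeInBytes reports the metered size, broken down by storage class.
efs.describeFileSystems({ FileSystemId: "fs-12345678" }, function (err, data) {
  if (err) return console.log(err, err.stack);
  var fs = data.FileSystems[0];
  console.log("Total metered bytes:", fs.SizeInBytes.Value);
  console.log("Bytes in Standard:", fs.SizeInBytes.ValueInStandard);
  console.log("Bytes in IA:", fs.SizeInBytes.ValueInIA);
});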

Mounting File Systems

  • To mount your EFS file system on your EC2 instance, use the mount helper in the amazon-efs-utils package.
  • You can mount your EFS file systems on your on-premises data center servers when connected to your Amazon VPC with AWS Direct Connect or VPN.
  • You can use fstab so that the mount helper automatically remounts your file system whenever the EC2 instance reboots.
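For illustration, a typical mount sequence on Amazon Linux might look like the following sketch; the file system ID and mount point are placeholders, and it assumes the amazon-efs-utils mount helper mentioned above:

# Install the EFS mount helper (Amazon Linux / Amazon Linux 2)
sudo yum install -y amazon-efs-utils

# Mount the file system with encryption in transit
sudo mkdir -p /mnt/efs
sudo mount -t efs -o tls fs-12345678:/ /mnt/efs

# /etc/fstab entry so the file system is remounted after a reboot
fs-12345678:/ /mnt/efs efs _netdev,tls 0 0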

Lifecycle Management

  • You can choose from five EFS Lifecycle Management policies (7, 14, 30, 60, or 90 days) to automatically move files into the EFS Infrequent Access (EFS IA) storage class and save up to 85% in cost.
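A sketch of setting one of these policies with the AWS SDK for JavaScript (v2); the file system ID is a placeholder:

var AWS = require("aws-sdk");
var efs = new AWS.EFS({ region: "us-east-1" });

// Move files that haven't been accessed for 30 days into EFS IA.
var params = {
  FileSystemId: "fs-12345678",          // placeholder
  LifecyclePolicies: [
    { TransitionToIA: "AFTER_30_DAYS" } // AFTER_7/14/30/60/90_DAYS
  ]
};

efs.putLifecycleConfiguration(params, function (err, data) {
  if (err) console.log(err, err.stack);
  else console.log("Lifecycle policies:", data.LifecyclePolicies);
});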

Monitoring File Systems

  • Amazon CloudWatch Alarms
  • Amazon CloudWatch Logs
  • Amazon CloudWatch Events
  • AWS CloudTrail Log Monitoring
  • Log files on your file system

Security

  • You must have valid credentials to make EFS API requests, such as creating a file system.
  • You must also have permissions to create or access resources.
  • When you first create the file system, there is only one root directory at /. By default, only the root user (UID 0) has read-write-execute permissions.
  • Specify EC2 security groups for your EC2 instances and security groups for the EFS mount targets associated with the file system.
  • You can use AWS IAM to manage Network File System (NFS) access for Amazon EFS. You can use IAM roles to identify NFS clients with cryptographic security and use IAM policies to manage client-specific permissions.
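As an illustration of IAM-based access control, a resource policy can be attached to the file system with PutFileSystemPolicy. A minimal sketch; the account ID, role name, and file system ID are placeholders:

var AWS = require("aws-sdk");
var efs = new AWS.EFS({ region: "us-east-1" });

// Allow a specific role to mount and write over NFS.
var policy = JSON.stringify({
  Version: "2012-10-17",
  Statement: [{
    Effect: "Allow",
    Principal: { AWS: "arn:aws:iam::111122223333:role/app-role" },
    Action: ["elasticfilesystem:ClientMount", "elasticfilesystem:ClientWrite"],
    Resource: "arn:aws:elasticfilesystem:us-east-1:111122223333:file-system/fs-12345678"
  }]
});

efs.putFileSystemPolicy({ FileSystemId: "fs-12345678", Policy: policy },
  function (err, data) {
    if (err) console.log(err, err.stack);
    else console.log("File system policy attached.");
  });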

Pricing

  • You pay only for the storage used by your file system.
  • Costs related to Provisioned Throughput are determined by the throughput values you specify.

EFS vs EBS vs S3

  • Performance comparison (Amazon EFS vs Amazon EBS Provisioned IOPS):
    • Per-operation latency: EFS – low, consistent latency; EBS – lowest, consistent latency.
    • Throughput scale: EFS – multiple GBs per second; EBS – single GB per second.

  • Performance comparison (Amazon EFS vs Amazon S3):
    • Per-operation latency: EFS – low, consistent latency; S3 – low, for mixed request types, and integration with CloudFront.
    • Throughput scale: EFS – multiple GBs per second; S3 – multiple GBs per second.

  • Storage comparison (Amazon EFS vs Amazon EBS Provisioned IOPS):
    • Availability and durability: EFS – data is stored redundantly across multiple AZs; EBS – data is stored redundantly in a single AZ.
    • Access: EFS – up to thousands of EC2 instances from multiple AZs can connect concurrently to a file system; EBS – a single EC2 instance in a single AZ can connect to a volume.
    • Use cases: EFS – big data and analytics, media processing workflows, content management, web serving, and home directories; EBS – boot volumes, transactional and NoSQL databases, data warehousing, and ETL.


AWS Serverless Application Repository practical

STEP 1: Create an application

STEP 2:

STEP 3:

STEP 4:

AWS S3 Bucket - Practical

S3 Configuration

Services -> S3

Create Amazon S3 Bucket (Source Bucket)

Click on Create bucket.

  • Bucket Name: your_source_bucket_name
  • Region: US East (N. Virginia)

Note: Every S3 bucket name is unique globally, so create the bucket with a name not currently in use.

Leave other settings as default and click on the Create button.


Once the bucket is created successfully, select your S3 bucket (click on the checkbox).

Click on the Copy Bucket ARN to copy the ARN.

  • arn:aws:s3:::zacks-source-bucket

Save the source bucket ARN in a text file for later use.


Create Amazon S3 Bucket (Destination Bucket)

Click on Create bucket.

  • Bucket Name: your_destination_bucket_name
  • Region: US East (N. Virginia)

Note: Every S3 bucket name is unique globally, so create the bucket with a name not currently in use.

Leave other settings as default and click on the Create button.


Once the bucket is created successfully, select your S3 bucket (click on the checkbox).

Click on the Copy Bucket ARN to copy the ARN.

  • arn:aws:s3:::zacks-destination-bucket

Save the destination bucket ARN in a text file for later use.
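If you prefer to script the bucket creation instead of using the console, a rough equivalent with the AWS SDK for JavaScript (v2) is sketched below; the bucket names are the same example names used above:

var AWS = require("aws-sdk");
var s3 = new AWS.S3({ region: "us-east-1" });

// In us-east-1 no CreateBucketConfiguration is needed; other regions
// require { CreateBucketConfiguration: { LocationConstraint: "<region>" } }.
["zacks-source-bucket", "zacks-destination-bucket"].forEach(function (name) {
  s3.createBucket({ Bucket: name }, function (err, data) {
    if (err) console.log(name, err.code);
    else console.log("Created:", name);
  });
});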

AWS Serverless Application Repository Theory

Serverless Application Repository:

Serverless applications are changing the way companies do business by enabling them to deploy much faster and more frequently—a competitive advantage.

Amazon’s AWS Serverless Application Model (AWS SAM) has been a game changer in this space, making it easy for developers to create, access and deploy applications, thanks to simplified templates and code samples.

AWS Serverless Application Repository:

The AWS Serverless Application Repository is a searchable ecosystem that allows developers to find serverless applications and their components for deployment. It helps simplify serverless application development by providing ready-to-use apps.


Here are the basic steps:

  1. Search and discover. A developer can search the repository for code snippets, functions, serverless applications, and their components.
  2. Integrate with the AWS Lambda console. Repository components are already available to developers.
  3. Configure. Before deploying, developers can set environment variables, parameter values, and more. For example, you can go the plug-and-play route by adding repository components to a larger application framework, or you can deconstruct and tinker with the code for further customization. If needed, pull requests can also be submitted to repository authors.
  4. Deploy. Deployed applications can be managed from the AWS Management Console. A developer can follow prompts to name, describe, and upload their serverless applications and components to the ecosystem, where they can be shared internally and with other developers across the ecosystem. This feature makes AWS SAM a truly open-source environment.

Benefits of programming in AWS SAM :

You can build serverless applications for almost any type of backend service without having to worry about scalability and managing servers. Here are some of the many benefits that building serverless applications in AWS SAM has to offer.

Low cost & efficient :

AWS SAM is low-cost and efficient for developers because of its pay-as-you-go structure. The platform only charges developers for usage, meaning you never pay for more of a service than you use.

Simplified processes :

The overarching goal of AWS SAM is ease-of-use. By design, it’s focused on simplifying application development so that programmers have more freedom to create in the open-source ecosystem.

Quick, scalable deployment :

AWS SAM makes deployment quick and simple by allowing developers to upload code to AWS and letting Amazon handle the rest. They also provide robust testing environments, so developers don’t miss a beat. All of this occurs on a platform that is easy to scale, allowing apps to grow and change to meet business objectives.

Convenient & accessible :

Undoubtedly, AWS SAM offers a convenient solution for developing in the cloud. Its serverless nature also means that it is a universally accessible platform. The wide reach of the internet makes it easy to execute code on-demand from anywhere.

Decreased time to market :

Overall, choosing a serverless application platform saves time and money that would otherwise be spent managing and operating servers or runtimes, whether on-premises or in the cloud. Because developers can create apps in a fraction of the time (think hours—not weeks or months), they are able to focus more of their attention on accelerating innovation in today’s competitive digital economy.

AWS SAM for Serverless Applications :

It’s clear that AWS SAM is a highly efficient, highly scalable, low-cost, and convenient solution for cloud programming.

But for those who haven’t yet made the switch, there are some concerns that arise from developing using AWS SAM, including:

  1. A general lack of control over the ecosystem that developers are coding in.
  2. Vendor lock-in that may occur when you sign up for any FaaS.
  3. Session timeouts that require developers to rewrite code, making it more complex instead of simplifying the process.
  4. AWS Lambda timeouts: Lambda functions are limited by a timeout value that can be configured from 3 seconds to 900 seconds (15 minutes). Lambda automatically terminates functions running longer than their timeout value.

AWS Serverless Application Model (SAM):

The AWS Serverless Application Model (SAM) is designed to make the creation, deployment, and execution of serverless applications as simple as possible. This can be done using AWS SAM templates with just a few choice code snippets, so that practically anyone can create a serverless app.

    • An open-source framework for building serverless applications.
    • It provides shorthand syntax to express functions, APIs, databases, and event source mappings. 
    • You create a JSON or YAML configuration template to model your applications. 
    • During deployment, SAM transforms and expands the SAM syntax into AWS CloudFormation syntax. Any resource that you can declare in an AWS CloudFormation template you can also declare in an AWS SAM template.
    • The SAM CLI provides a Lambda-like execution environment that lets you locally build, test, and debug applications defined by SAM templates. You can also use the SAM CLI to deploy your applications to AWS.
    • You can use AWS SAM to build serverless applications that use any runtime supported by AWS Lambda. You can also use SAM CLI to locally debug Lambda functions written in Node.js, Java, Python, and Go.
    • Template Anatomy
      • If you are writing a standalone AWS SAM template, rather than embedding SAM resources in a plain CloudFormation template, the Transform section is required.
      • The Globals section is unique to AWS SAM templates. It defines properties that are common to all your serverless functions and APIs. All the AWS::Serverless::Function, AWS::Serverless::Api, and AWS::Serverless::SimpleTable resources inherit the properties that are defined in the Globals section. (A minimal template sketch follows this list.)
      • The Resources section can contain a combination of AWS CloudFormation resources and AWS SAM resources.
    • Overview of Syntax
      • AWS::Serverless::Api
        • This resource type describes an API Gateway resource. It’s useful for advanced use cases where you want full control and flexibility when you configure your APIs.
      • AWS::Serverless::Application
        • This resource type embeds a serverless application from the AWS Serverless Application Repository or from an Amazon S3 bucket as a nested application. Nested applications are deployed as nested stacks, which can contain multiple other resources.
      • AWS::Serverless::Function
        • This resource type describes configuration information for creating a Lambda function. You can describe any event source that you want to attach to the Lambda function—such as Amazon S3, Amazon DynamoDB Streams, and Amazon Kinesis Data Streams.
      • AWS::Serverless::LayerVersion
        • This resource type creates a Lambda layer version that contains library or runtime code needed by a Lambda function. When a serverless layer version is transformed, AWS SAM also transforms the logical ID of the resource so that old layer versions are not automatically deleted by AWS CloudFormation when the resource is updated.
      • AWS::Serverless::SimpleTable
        • This resource type provides simple syntax for describing how to create DynamoDB tables.
    • Commonly used SAM CLI commands
      • The sam init command generates pre-configured AWS SAM templates.
      • The sam local command supports local invocation and testing of your Lambda functions and SAM-based serverless applications by executing your function code locally in a Lambda-like execution environment.
      • The sam package and sam deploy commands let you bundle your application code and dependencies into a “deployment package” and then deploy your serverless application to the AWS Cloud.
      • The sam logs command enables you to fetch, tail, and filter logs for Lambda functions. 
      • The output of the sam publish command includes a link that takes you directly to your application in the AWS Serverless Application Repository.
      • Use sam validate to validate your SAM template.
    • Controlling access to APIs
      • You can use AWS SAM to control who can access your API Gateway APIs by enabling authorization within your AWS SAM template.
        • Lambda authorizer (formerly known as a custom authorizer) is a Lambda function that you provide to control access to your API. When your API is called, this Lambda function is invoked with a request context or an authorization token that are provided by the client application. The Lambda function returns a policy document that specifies the operations that the caller is authorized to perform, if any. There are two types of Lambda authorizers:
          • Token-based type receives the caller’s identity in a bearer token, such as a JSON Web Token (JWT) or an OAuth token.
          • Request-parameter-based type receives the caller’s identity in a combination of headers, query string parameters, stageVariables, and $context variables.
        • Amazon Cognito user pools are user directories in Amazon Cognito. A client of your API must first sign a user in to the user pool and obtain an identity or access token for the user. Then your API is called with one of the returned tokens. The API call succeeds only if the required token is valid.
    • The optional Transform section of a CloudFormation template specifies one or more macros that AWS CloudFormation uses to process your template. Aside from macros you create, AWS CloudFormation also supports the AWS::Serverless transform which is a macro hosted on AWS CloudFormation.
      • The AWS::Serverless transform specifies the version of the AWS Serverless Application Model (AWS SAM) to use. This model defines the AWS SAM syntax that you can use and how AWS CloudFormation processes it. 
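To make the template anatomy above concrete, here is a minimal SAM template sketch in YAML; the logical ID, handler, and code path are placeholder values:

AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31   # required in a standalone SAM template

Globals:
  Function:
    Runtime: nodejs12.x                 # inherited by every function below
    Timeout: 10

Resources:
  HelloFunction:                        # placeholder logical ID
    Type: AWS::Serverless::Function
    Properties:
      Handler: index.handler
      CodeUri: ./src                    # placeholder path
      Events:
        HelloApi:
          Type: Api                     # creates an implicit AWS::Serverless::Api
          Properties:
            Path: /hello
            Method: get

During deployment, the AWS::Serverless transform expands this into plain CloudFormation resources, as described above.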

AWS LAMBDA - Practical

 

Lambda Configuration

Services -> Lambda

Create a Lambda Function

Click on the Create a function button.

Choose Author from scratch.

  • Function name: mylambdafunction
  • Runtime: Select Node.js 12.x

Role: In the permissions section, select use an existing role.

  • Existing role: Select myrole

Click on Create function


Configuration Page: On this page, we need to configure our lambda function.

If you scroll down a little, you can see the Function code section. Here we need to write a Node.js function that copies the object from the source bucket into the destination bucket.

Remove the existing code in AWS lambda index.js. Copy the below code and paste it into your lambda index.js file.

// index.js – copies the newly uploaded object from the source bucket
// to the destination bucket.
var AWS = require("aws-sdk");

exports.handler = (event, context, callback) => {
  var s3 = new AWS.S3();
  var sourceBucket = "your_source_bucket_name";
  var destinationBucket = "your_destination_bucket_name";

  // The S3 trigger passes the key of the newly created object in the event.
  var objectKey = event.Records[0].s3.object.key;
  var copySource = encodeURI(sourceBucket + "/" + objectKey);
  var copyParams = { Bucket: destinationBucket, CopySource: copySource, Key: objectKey };

  s3.copyObject(copyParams, function (err, data) {
    if (err) {
      console.log(err, err.stack);
    } else {
      console.log("S3 object copy successful.");
    }
  });
};

You need to change the source and destination bucket name (not ARN!) in the index.js file based on your bucket names.

Save the function by clicking on Deploy in the right corner.


Adding Triggers to Lambda Function

Go to the top left of the page and click on + Add trigger under Designer.

Scroll down the list and select S3 from the trigger list. Once you select S3, a form will appear. Enter these details:

  • Bucket: Select your source bucket - your_source_bucket_name.
  • Event type: All object create events

Leave other fields as default.

Check the Recursive invocation acknowledgement box; this confirms that you understand the risk of recursive invocations when a function writes output back to the bucket that triggers it.

Click on Add.
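The console performs two pieces of wiring here: it grants S3 permission to invoke the function and it attaches the bucket notification. For reference, a rough SDK equivalent is sketched below; the account ID and ARNs are placeholders:

var AWS = require("aws-sdk");
var s3 = new AWS.S3({ region: "us-east-1" });
var lambda = new AWS.Lambda({ region: "us-east-1" });

var functionArn =
  "arn:aws:lambda:us-east-1:111122223333:function:mylambdafunction"; // placeholder

// 1. Let S3 invoke the function.
lambda.addPermission({
  FunctionName: "mylambdafunction",
  StatementId: "s3-invoke",
  Action: "lambda:InvokeFunction",
  Principal: "s3.amazonaws.com",
  SourceArn: "arn:aws:s3:::zacks-source-bucket"
}, function (err) {
  if (err) return console.log(err, err.stack);

  // 2. Send all object-created events in the source bucket to the function.
  s3.putBucketNotificationConfiguration({
    Bucket: "zacks-source-bucket",
    NotificationConfiguration: {
      LambdaFunctionConfigurations: [
        { LambdaFunctionArn: functionArn, Events: ["s3:ObjectCreated:*"] }
      ]
    }
  }, function (err2) {
    if (err2) console.log(err2, err2.stack);
    else console.log("Trigger configured.");
  });
});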

AWS LAMBDA - Theory

WHAT IS AWS LAMBDA?


Describing AWS Lambda

The AWS Lambda service is a high-scale, provision-free serverless compute offering based on functions. It is used only for the compute layer of a serverless application. The purpose of AWS Lambda is to build event-driven applications that can be triggered by a variety of events in AWS.

In the case where you have multiple simultaneous events, Lambda simply spins up multiple copies of the function to handle the events. In other words, Lambda can be described as a type of function as a service (FaaS). Three components comprise AWS Lambda:

  • A function. This is the actual code that performs the task.
  • A configuration. This specifies how your function is executed.
  • An event source (optional). This is the event that triggers the function. The function can be triggered by several AWS services or by a third-party service.

When you specify an event source, your function is invoked when an event from that source occurs.



Running a Lambda function

When configuring a lambda function, you specify which runtime environment you’d like to run your code in. Depending on the language you use, each environment provides its own set of binaries available for use in your code. You are also allowed to package any libraries or binaries you like as long as you can use them within the runtime environment. All environments are based on Amazon Linux AMI.

The current available runtime environments are:

  • Node.js
  • Python
  • Go
  • Java
  • Ruby
  • .NET
  • C#

When running a Lambda function, we only focus on the code because AWS manages capacity and all updates. AWS Lambda can be invoked synchronously using the RequestResponse invocation type and asynchronously using the Event invocation type.
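A minimal sketch of both invocation types with the AWS SDK for JavaScript (v2); the function name and payload are placeholders:

var AWS = require("aws-sdk");
var lambda = new AWS.Lambda({ region: "us-east-1" });

// Synchronous call: waits for the function and returns its result.
lambda.invoke({
  FunctionName: "mylambdafunction",     // placeholder
  InvocationType: "RequestResponse",    // use "Event" for asynchronous
  Payload: JSON.stringify({ hello: "world" })
}, function (err, data) {
  if (err) console.log(err, err.stack);
  else console.log("Status:", data.StatusCode, "Payload:", data.Payload);
});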

Concepts of Lambda function

To better understand how a Lambda function works, there are a few key concepts to understand.


Event source

Although AWS Lambda can be triggered using the Invoke API, the recommended way of triggering lambda is through event sources from within AWS.

There are two models of invocation supported:

(a) Push, where the function is triggered by an event from another service, such as Amazon API Gateway, a new object in S3, or Amazon Alexa.

(b) Pull, where Lambda polls an event source for new records. Examples of such event sources are DynamoDB and Amazon Kinesis.
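For the pull model, you create an event source mapping so that Lambda polls the source on your behalf. A hedged sketch with the AWS SDK for JavaScript (v2); the stream ARN and function name are placeholders:

var AWS = require("aws-sdk");
var lambda = new AWS.Lambda({ region: "us-east-1" });

// Lambda polls the Kinesis stream and invokes the function with batches.
lambda.createEventSourceMapping({
  FunctionName: "mylambdafunction",     // placeholder
  EventSourceArn: "arn:aws:kinesis:us-east-1:111122223333:stream/my-stream", // placeholder
  StartingPosition: "LATEST",
  BatchSize: 100
}, function (err, data) {
  if (err) console.log(err, err.stack);
  else console.log("Mapping UUID:", data.UUID);
});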

Lambda configuration

There are a few configuration settings that can be used with Lambda functions:

  • Memory dial, which controls not just the memory but also how much CPU and network resources are allocated to the function.
  • Versions/aliases, which are used to revert a function back to older versions. This is also key to implementing deployment strategies such as blue/green or separating production from lower environments.
  • IAM role, which gives the Lambda function permission to interact with other AWS services and APIs.
  • Lambda function permissions, which define which push-model event source is allowed to invoke the Lambda function.
  • Network configuration for outbound connectivity. There are two choices:
    • Default, which allows internet connectivity but no connectivity to private resources in your VPC.
    • VPC, which allows your function to be provisioned inside your VPC and use an ENI. You can then attach things like security groups as you would to any other ENI.
  • Environment variables for dynamically injecting values that are consumed by code. This idea of separating code from config is part of the 12-factor app methodology for cloud-native applications.
  • Dead letter queue, which is where you send all failed invocation events. This can be either an SNS topic or an SQS queue.
  • Timeout, which is the maximum amount of time a function is allowed to run before it is terminated. (See the configuration sketch below.)
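As a rough illustration, several of the settings above are plain fields on the function configuration and can be updated in a single call; the function name and DLQ ARN below are placeholders:

var AWS = require("aws-sdk");
var lambda = new AWS.Lambda({ region: "us-east-1" });

lambda.updateFunctionConfiguration({
  FunctionName: "mylambdafunction",     // placeholder
  MemorySize: 512,                      // the "memory dial", in MB
  Timeout: 30,                          // seconds, up to 900
  Environment: { Variables: { STAGE: "dev" } },
  DeadLetterConfig: {
    TargetArn: "arn:aws:sqs:us-east-1:111122223333:my-dlq" // placeholder
  }
}, function (err, data) {
  if (err) console.log(err, err.stack);
  else console.log("Updated:", data.FunctionName);
});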

Create an AWS Lambda

There are a few ways to create a Lambda function in AWS. The most common is the console, but this method should only be used for testing in dev. For production, it is best practice to automate the deployment of the Lambda function.

There are a few third-party tools for setting up this automation, like Terraform, but since we are specifically talking about an AWS service, AWS recommends using the Serverless Application Model (SAM) for this task. SAM is built on top of AWS CloudFormation, and a SAM template looks like a normal CloudFormation template except that it has a Transform block that says the template is a SAM template rather than a plain CloudFormation template. You can take a look at some example templates in the AWS Labs repositories.

AWS Lambda use cases

You can use AWS Lambda in a variety of situations, including but not limited to:

  • Log transfers, where a Lambda function is invoked every time there is a new log event in CloudWatch, to move the logs into tools like Elasticsearch and Kibana.
  • A website where you can invoke your Lambda function over HTTP using Amazon API Gateway as the HTTP endpoint.
  • Mobile applications where you can create a Lambda function to process events published by your custom application.