Sunday, 20 March 2022

Amazon Cognito User Pools vs Identity Pools

With the proliferation of smartphones in our connected world, more and more developers are quickly deploying their applications on the cloud. One of the first challenges in developing an application is allowing users to log in and authenticate. There are multiple stages involved in user verification, and most of these are not visible to the end user. AWS provides an easy solution for this situation.

User Identity verification is at the core of Amazon Cognito. It provides solutions for three key areas of user identification: 

  1. Authentication – provides sign-up and sign-in options for users. Enables support for federation with Enterprise Identities (Microsoft AD) or Social Identities (Amazon, Facebook, Google, etc.)
  2. Authorization – the set of permissions or operations allowed for a user. It provides fine-grained access control to resources. 
  3. User Management – allows management of user lifecycles, such as importing users, onboarding users, disabling users, and storing and managing user profiles.

In this post, we’ll talk about Cognito User Pools and Identity Pools, including an overview of how they are used to provide the authentication and authorization functionality that can be integrated into your mobile app.

Amazon Cognito User Pools

Amazon Cognito User Pools are used for authentication. To verify your users’ identities, you will want a way for them to log in with a username and password, or through federated login with Identity Providers such as Amazon, Facebook, Google, or a SAML-based provider such as Microsoft Active Directory. You can configure these Identity Providers in Cognito, and it handles the interactions with them, so you only have to worry about handling the authentication tokens in your app.

Amazon Cognito Integration with Identity Providers

With Cognito User Pools, you can provide sign-up and sign-in functionality for your mobile or web app users. You don’t have to build or maintain any server infrastructure on which users will authenticate. 

This diagram shows how authentication is handled with Cognito User Pools:

Cognito User Pool for Authentication

  1. Users send authentication requests to Cognito User Pools. 
  2. The Cognito user pool verifies the identity of the user or sends the request to Identity Providers such as Facebook, Google, Amazon, or SAML authentication (with Microsoft AD).
  3. The Cognito User Pool Token is sent back to the user. 
  4. The user can then use this token to access your backend APIs hosted on your EC2 clusters or on API Gateway and Lambda.

If you want a quick login page, you can even use the pre-built login UI provided by Amazon Cognito, which you just have to integrate into your application.

Default Amazon Cognito User Login Page

On the Amazon Cognito User Pool page, you can also manage users if you need to. You can reset passwords, disable/enable users, enroll/delete users, and perform other User Management actions. 

Amazon Cognito Identity Pools

Cognito Identity Pools (Federated Identities) provide different functionality compared to User Pools. Identity Pools are used for user authorization. You can create unique identities for your users and federate them with your identity providers. Using identity pools, users can obtain temporary AWS credentials to access other AWS services. 

Identity Pools can be thought of as the actual mechanism authorizing access to AWS resources. When you create Identity Pools, think of it as defining who is allowed to get AWS credentials and use those credentials to access AWS resources.

This diagram shows how authorization is handled with Cognito Identity Pools:

Cognito Identity Pools (Federated Identities)

  1. The web app or mobile app sends its authentication token to Cognito Identity Pools. The token can come from a valid Identity Provider, like Cognito User Pools, Amazon, or Facebook. 
  2. Cognito Identity Pool exchanges the user authentication token for temporary AWS credentials to access resources such as S3 or DynamoDB. AWS credentials are sent back to the user. 
  3. The temporary AWS credentials will be used to access AWS resources. 
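The exchange in step 2 is driven by a "Logins" map that pairs each identity provider's name with the token it issued. Below is a stdlib-only sketch of building that map for a Cognito User Pool token; the key format shown is the documented provider name for user pools (other providers use keys like graph.facebook.com or accounts.google.com). With boto3 you would pass this map to the cognito-identity GetId and GetCredentialsForIdentity calls, which are not made here since they need live AWS access. The token and pool ID are placeholders.

```python
def build_logins_map(id_token, user_pool_id, region):
    """Build the Logins map sent to the Cognito Identity API when the
    authentication token comes from a Cognito User Pool."""
    provider = "cognito-idp.{}.amazonaws.com/{}".format(region, user_pool_id)
    return {provider: id_token}

# Hypothetical values for illustration
logins = build_logins_map("eyJhbGciOi-example-token",
                          "us-east-1_EXAMPLE", "us-east-1")
print(logins)
```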

You can define rules in Cognito Identity Pools that map users to different IAM roles to provide fine-grained permissions. 

Here’s a table summary describing Cognito User Pool and Identity Pool:


Cognito User Pools | Cognito Identity Pools
Handles the IdP interactions for you | Provides AWS credentials for accessing resources on behalf of users
Provides profiles to manage users | Supports rules to map users to different IAM roles
Provides OpenID Connect and OAuth standard tokens | Free
Priced per monthly active user |

AWS Fargate

 

  • A serverless compute engine for containers that works with both Amazon Elastic Container Service (ECS) and Amazon Elastic Kubernetes Service (EKS).
  • With Fargate, no manual provisioning, patching, cluster capacity management, or other infrastructure management is required.


How It Works

AWS Fargate

Use Case

    • Launching containers without having to provision or manage EC2 instances.
    • If you want a managed service for container cluster management.
  • Configurations
    • Amazon ECS task definitions for Fargate require that you specify CPU and memory at the task level (task definition).
    • Amazon ECS task definitions for Fargate support the ulimits parameter to define the resource limits to set for a container.
    • Amazon ECS task definitions for Fargate support the awslogs, splunk, firelens, and fluentd log drivers for the log configuration.
    • When provisioned, each Fargate task receives the following storage:
      • 10 GB of Docker layer storage
      • An additional 4 GB for volume mounts.
    • Task storage is ephemeral.
    • If you have a service with running tasks and want to update their platform version, you can update your service, specify a new platform version, and choose Force new deployment. Your tasks are redeployed with the latest platform version.
    • If your service is scaled up without updating the platform version, those tasks receive the platform version that was specified on the service’s current deployment.
    • Amazon ECS Exec is a way for customers to execute commands in a container running on Amazon EC2 instances or AWS Fargate. ECS Exec gives you interactive shell or single command access to a running container.
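Because Fargate requires CPU and memory at the task level, only certain pairings are accepted. The sketch below encodes the classic combinations (in CPU units, where 1024 = 1 vCPU, and MiB of memory); AWS has since added larger sizes, so treat this table as illustrative and check the current ECS documentation before relying on it.

```python
# Valid task-level CPU/memory combinations for Fargate (classic sizes)
FARGATE_CPU_MEMORY = {
    256:  [512, 1024, 2048],                    # .25 vCPU
    512:  list(range(1024, 4097, 1024)),        # .5 vCPU, 1-4 GB
    1024: list(range(2048, 8193, 1024)),        # 1 vCPU, 2-8 GB
    2048: list(range(4096, 16385, 1024)),       # 2 vCPU, 4-16 GB
    4096: list(range(8192, 30721, 1024)),       # 4 vCPU, 8-30 GB
}

def is_valid_fargate_size(cpu_units, memory_mb):
    """Return True if this CPU/memory pair is one Fargate will accept
    at the task-definition level."""
    return memory_mb in FARGATE_CPU_MEMORY.get(cpu_units, [])

print(is_valid_fargate_size(256, 512))    # a small task
print(is_valid_fargate_size(256, 4096))   # invalid pairing for .25 vCPU
```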

Network

    • Amazon ECS task definitions for Fargate require that the network mode is set to awsvpc. The awsvpc network mode provides each task with its own elastic network interface.

Compliance

    • PCI DSS Level 1, ISO 9001, ISO 27001, ISO 27017, ISO 27018, SOC 1, SOC 2, SOC 3, and HIPAA

Pricing

    • You pay for the amount of vCPU and memory resources consumed by your containerized applications.

Elastic Fabric Adapter (EFA)

 

  • An Elastic Fabric Adapter (EFA) is a network device that you can attach to your Amazon EC2 instance to accelerate High Performance Computing (HPC) and machine learning applications.
  • An EFA is an Elastic Network Adapter (ENA) with an additional OS-bypass functionality. 
  • How It Works

Elastic Fabric Adapter

  • EFA integrates with 
    • Libfabric 1.9.0, which supports Open MPI 4.0.2 and Intel MPI 2019 Update 6 for HPC applications, and 
    • Nvidia Collective Communications Library (NCCL) for machine learning applications.

Elastic Fabric Adapter

  • With an EFA, HPC applications use Intel Message Passing Interface (MPI) or Nvidia Collective Communications Library (NCCL) to interface with the Libfabric API. The Libfabric API bypasses the operating system kernel and communicates directly with the EFA device to place packets on the network.
  • Supported AMIs
    • Amazon Linux
    • Amazon Linux 2
    • RHEL 7.6 and RHEL 7.7
    • CentOS 7
    • Ubuntu 16.04 and Ubuntu 18.04
  • Examples of HPC Applications
    • computational fluid dynamics (CFD)
    • crash simulations
    • weather simulations

Limitations

    • You can attach only one EFA per instance.
    • EFA OS-bypass traffic is limited to a single subnet. EFA traffic cannot be sent from one subnet to another. Normal IP traffic from the EFA can be sent from one subnet to another.
    • EFA OS-bypass traffic is not routable. Normal IP traffic from the EFA remains routable.
    • The EFA must be a member of a security group that allows all inbound and outbound traffic to and from the security group itself.

Pricing

EFA is available as an optional EC2 networking feature that you can enable on any supported EC2 instance at no additional cost.

AWS Serverless Application Model (SAM)

 

  • An open-source framework for building serverless applications.
  • It provides shorthand syntax to express functions, APIs, databases, and event source mappings. 
  • You create a JSON or YAML configuration template to model your applications. 
  • During deployment, SAM transforms and expands the SAM syntax into AWS CloudFormation syntax. Any resource that you can declare in an AWS CloudFormation template you can also declare in an AWS SAM template.
  • The SAM CLI provides a Lambda-like execution environment that lets you locally build, test, and debug applications defined by SAM templates. You can also use the SAM CLI to deploy your applications to AWS.
  • You can use AWS SAM to build serverless applications that use any runtime supported by AWS Lambda. You can also use SAM CLI to locally debug Lambda functions written in Node.js, Java, Python, and Go.
  • Template Anatomy
    • If you are writing an AWS SAM template (as opposed to a plain CloudFormation template), the Transform section is required; it identifies the template as an AWS SAM template.
    • The Globals section is unique to AWS SAM templates. It defines properties that are common to all your serverless functions and APIs. All of the AWS::Serverless::Function, AWS::Serverless::Api, and AWS::Serverless::SimpleTable resources inherit the properties that are defined in the Globals section.
    • The Resources section can contain a combination of AWS CloudFormation resources and AWS SAM resources.
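Putting this anatomy together, a minimal AWS SAM template might look like the following sketch (the resource names, handler, and paths are hypothetical):

```yaml
# Transform marks this as a SAM template; Globals applies to all functions;
# Resources mixes SAM resource types with plain CloudFormation ones.
Transform: AWS::Serverless-2016-10-31

Globals:
  Function:
    Runtime: python3.9
    Timeout: 10
    MemorySize: 128

Resources:
  HelloFunction:                      # AWS::Serverless::Function (SAM)
    Type: AWS::Serverless::Function
    Properties:
      Handler: app.handler
      CodeUri: src/
      Events:
        HelloApi:
          Type: Api
          Properties:
            Path: /hello
            Method: get

  AuditBucket:                        # plain CloudFormation resource
    Type: AWS::S3::Bucket
```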

  • Overview of Syntax
    • AWS::Serverless::Api
      • This resource type describes an API Gateway resource. It’s useful for advanced use cases where you want full control and flexibility when you configure your APIs.
    • AWS::Serverless::Application
      • This resource type embeds a serverless application from the AWS Serverless Application Repository or from an Amazon S3 bucket as a nested application. Nested applications are deployed as nested stacks, which can contain multiple other resources.
    • AWS::Serverless::Function
      • This resource type describes configuration information for creating a Lambda function. You can describe any event source that you want to attach to the Lambda function—such as Amazon S3, Amazon DynamoDB Streams, and Amazon Kinesis Data Streams.
    • AWS::Serverless::LayerVersion
      • This resource type creates a Lambda layer version that contains library or runtime code needed by a Lambda function. When a serverless layer version is transformed, AWS SAM also transforms the logical ID of the resource so that old layer versions are not automatically deleted by AWS CloudFormation when the resource is updated.
    • AWS::Serverless::SimpleTable
      • This resource type provides simple syntax for describing how to create DynamoDB tables.
  • Commonly used SAM CLI commands
    • The sam init command generates pre-configured AWS SAM templates.
    • The sam local command supports local invocation and testing of your Lambda functions and SAM-based serverless applications by executing your function code locally in a Lambda-like execution environment.
    • The sam package and sam deploy commands let you bundle your application code and dependencies into a “deployment package” and then deploy your serverless application to the AWS Cloud.
    • The sam logs command enables you to fetch, tail, and filter logs for Lambda functions. 
    • The output of the sam publish command includes a link directly to your application in the AWS Serverless Application Repository.
    • Use sam validate to validate your SAM template.
  • Controlling access to APIs
    • You can use AWS SAM to control who can access your API Gateway APIs by enabling authorization within your AWS SAM template.
      • Lambda authorizer (formerly known as a custom authorizer) is a Lambda function that you provide to control access to your API. When your API is called, this Lambda function is invoked with a request context or an authorization token that are provided by the client application. The Lambda function returns a policy document that specifies the operations that the caller is authorized to perform, if any. There are two types of Lambda authorizers:
        • Token based type receives the caller’s identity in a bearer token, such as a JSON Web Token (JWT) or an OAuth token.
        • Request parameter based type receives the caller’s identity in a combination of headers, query string parameters, stageVariables, and $context variables.
      • Amazon Cognito user pools are user directories in Amazon Cognito. A client of your API must first sign a user in to the user pool and obtain an identity or access token for the user. Then your API is called with one of the returned tokens. The API call succeeds only if the required token is valid.
  • The optional Transform section of a CloudFormation template specifies one or more macros that AWS CloudFormation uses to process your template. Aside from macros you create, AWS CloudFormation also supports the AWS::Serverless transform which is a macro hosted on AWS CloudFormation.
    • The AWS::Serverless transform specifies the version of the AWS Serverless Application Model (AWS SAM) to use. This model defines the AWS SAM syntax that you can use and how AWS CloudFormation processes it. 
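To make the Lambda authorizer flow above concrete, here is a sketch of a token-based authorizer handler. The event fields and the returned policy document follow the documented API Gateway authorizer contract, but the token check itself is a placeholder; a real authorizer would validate a signed JWT or OAuth token rather than compare a string.

```python
def lambda_handler(event, context):
    """Token-based Lambda authorizer sketch. API Gateway passes the bearer
    token in event['authorizationToken'] and the called method's ARN in
    event['methodArn']; the function returns an IAM policy document."""
    token = event.get("authorizationToken", "")
    effect = "Allow" if token == "allow-me" else "Deny"  # placeholder check
    return {
        "principalId": "user|anonymous",
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [{
                "Action": "execute-api:Invoke",
                "Effect": effect,
                "Resource": event["methodArn"],
            }],
        },
    }

# Local invocation with a fabricated event
resp = lambda_handler(
    {"authorizationToken": "allow-me",
     "methodArn": "arn:aws:execute-api:us-east-1:123456789012:api/stage/GET/x"},
    None,
)
print(resp["policyDocument"]["Statement"][0]["Effect"])  # -> Allow
```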

AWS ParallelCluster

 

  • An AWS-supported open source cluster management tool for deploying and managing High Performance Computing (HPC) clusters on AWS. ParallelCluster uses a simple text file to model and provision all the resources needed for your HPC applications in an automated and secure manner.
  • AWS ParallelCluster provisions a master instance for build and control, a cluster of compute instances, a shared filesystem, and a batch scheduler. You can also extend and customize your use cases using custom pre-install and post-install bootstrap actions.

How It Works

AWS ParallelCluster

  • You have four supported schedulers to use along with ParallelCluster:
    • SGE (Son of Grid Engine)
    • Torque
    • Slurm
    • AWS Batch
  • AWS ParallelCluster supports 
    • On-Demand,
    • Reserved,
    • and Spot Instances

Networking

    • AWS ParallelCluster uses Amazon Virtual Private Cloud (VPC) for networking. The VPC must have DNS Resolution = yes, DNS Hostnames = yes, and DHCP options with the correct domain-name for the Region. 
    • AWS ParallelCluster supports the following high-level configurations:
      • One subnet for both master and compute instances.
      • Two subnets, with the master in one public subnet, and compute instances in a private subnet. The subnets can be new or existing.
    • AWS ParallelCluster can also be deployed to use an HTTP proxy for all AWS requests.

Storage

    • By default, AWS ParallelCluster automatically configures an external volume of 15 GB of Elastic Block Store (EBS) attached to the cluster’s master node and exported to the cluster’s compute nodes via Network File System (NFS).
    • AWS ParallelCluster is also compatible with Amazon Elastic File System (EFS), RAID, and Amazon FSx for Lustre file systems. 
    • You can configure AWS ParallelCluster with Amazon S3 object storage as the source of job inputs or as a destination for job output.
  • Cluster Configuration
    • By default, AWS ParallelCluster uses the file ~/.parallelcluster/config for all configuration parameters. A custom configuration file may be specified via the -c or --config command line option or the AWS_PCLUSTER_CONFIG_FILE environment variable.
    • The following sections are required: 
      • [global] section and [aws] section.
      • At least one [cluster] section and one [vpc] section.
  • Cluster Processes
    • When a cluster is running, a process called a jobwatcher monitors the configured scheduler (SGE, Slurm, or Torque) and, each minute, evaluates the queue to decide when to scale up.
    • The sqswatcher process monitors for Amazon SQS messages that are sent by Auto Scaling, to notify you of state changes within the cluster.
    • The nodewatcher process runs on each node in the compute fleet and terminates instances that have been idle for a set amount of time.
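A minimal configuration file illustrating the required sections listed above might look like the following (region, key pair, instance types, and IDs are placeholders):

```ini
; Minimal sketch of ~/.parallelcluster/config
[global]
cluster_template = default
update_check = true
sanity_check = true

[aws]
aws_region_name = us-east-1

[cluster default]
key_name = my-keypair
scheduler = slurm
master_instance_type = c5.xlarge
compute_instance_type = c5.xlarge
vpc_settings = default

[vpc default]
vpc_id = vpc-0123456789abcdef0
master_subnet_id = subnet-0123456789abcdef0
```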

Pricing

    • AWS ParallelCluster is available at no additional charge. You pay only for the AWS resources needed to run your applications.

Limitations

    • AWS ParallelCluster does not support building Windows clusters.
    • AWS ParallelCluster does not currently support mixed instance types for a cluster. However, you can pick one instance type for the master node and another instance type for the compute nodes.

AWS Lambda

 

  • A serverless compute service.
  • Lambda executes your code only when needed and scales automatically.
  • Lambda functions are stateless – no affinity to the underlying infrastructure.
  • You choose the amount of memory you want to allocate to your functions and AWS Lambda allocates proportional CPU power, network bandwidth, and disk I/O.
  • AWS Lambda is SOC, HIPAA, PCI, ISO compliant.
  • Natively supports the following languages:
    • Node.js
    • Java
    • C#
    • Go
    • Python
    • Ruby
    • PowerShell
  • You can also provide your own custom runtime.

Introduction to AWS Lambda & Serverless Applications

Components of a Lambda Application

  • Function – a script or program that runs in Lambda. Lambda passes invocation events to your function. The function processes an event and returns a response.
  • Runtimes – Lambda runtimes allow functions in different languages to run in the same base execution environment. The runtime sits in-between the Lambda service and your function code, relaying invocation events, context information, and responses between the two.
  • Layers – Lambda layers are a distribution mechanism for libraries, custom runtimes, and other function dependencies. Layers let you manage your in-development function code independently from the unchanging code and resources that it uses.
  • Event source – an AWS service or a custom service that triggers your function and executes its logic.
  • Downstream resources – an AWS service that your Lambda function calls once it is triggered.
  • Log streams – While Lambda automatically monitors your function invocations and reports metrics to CloudWatch, you can annotate your function code with custom logging statements that allow you to analyze the execution flow and performance of your Lambda function.
  • AWS Serverless Application Model

Lambda Functions

  • You upload your application code in the form of one or more Lambda functions. Lambda stores code in Amazon S3 and encrypts it at rest.
  • To create a Lambda function, you first package your code and dependencies in a deployment package. Then, you upload the deployment package to create your Lambda function.
  • After your Lambda function is in production, Lambda automatically monitors functions on your behalf, reporting metrics through Amazon CloudWatch.
  • Configure basic function settings including the description, memory usage, execution timeout, and role that the function will use to execute your code.
  • Environment variables are always encrypted at rest, and can be encrypted in transit as well.
  • Versions and aliases are secondary resources that you can create to manage function deployment and invocation.
  • A layer is a ZIP archive that contains libraries, a custom runtime, or other dependencies. Use layers to manage your function’s dependencies independently and keep your deployment package small.
  • You can configure a function to mount an Amazon EFS file system to a local directory. With Amazon EFS, your function code can access and modify shared resources securely and at high concurrency.
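At its core, the code you upload is just a handler that Lambda invokes with an event and a context object. Here is a minimal, locally runnable sketch; the return shape shown is what an API Gateway proxy integration expects, and the event fields are fabricated for the example.

```python
import json

def lambda_handler(event, context):
    """Minimal Lambda function handler. Lambda passes the invocation
    event as a dict and a context object carrying runtime metadata."""
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": "Hello, {}!".format(name)}),
    }

# Invoke locally with a fabricated event (context unused here)
resp = lambda_handler({"name": "Lambda"}, None)
print(resp["statusCode"], json.loads(resp["body"])["message"])
```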

Invoking Functions

  • Lambda supports synchronous and asynchronous invocation of a Lambda function. You can control the invocation type only when you invoke a Lambda function (referred to as on-demand invocation).
  • An event source is the entity that publishes events, and a Lambda function is the custom code that processes the events.
  • Event source mapping maps an event source to a Lambda function. It enables automatic invocation of your Lambda function when events occur. 
  • Lambda provides event source mappings for the following services.
    • Amazon Kinesis
    • Amazon DynamoDB
    • Amazon Simple Queue Service
  • Your functions’ concurrency is the number of instances that serve requests at a given time. When your function is invoked, Lambda allocates an instance of it to process the event. When the function code finishes running, it can handle another request. If the function is invoked again while a request is still being processed, another instance is allocated, which increases the function’s concurrency.
  • To ensure that a function can always reach a certain level of concurrency, you can configure the function with reserved concurrency. When a function has reserved concurrency, no other function can use that concurrency. Reserved concurrency also limits the maximum concurrency for the function.
  • To enable your function to scale without fluctuations in latency, use provisioned concurrency. By allocating provisioned concurrency before an increase in invocations, you can ensure that all requests are served by initialized instances with very low latency.
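The scaling arithmetic above can be sketched with a toy model: each in-flight request occupies one instance, and once a function's reserved concurrency is reached, further simultaneous invocations are throttled. This is a simplification of Lambda's actual behavior (which also involves burst limits and retries), used only to illustrate the concept.

```python
def simulate_invocations(simultaneous_requests, reserved_concurrency):
    """Toy model: return (served, throttled) for a burst of simultaneous
    requests against a function with the given reserved concurrency."""
    served = min(simultaneous_requests, reserved_concurrency)
    throttled = simultaneous_requests - served
    return served, throttled

print(simulate_invocations(5, 10))   # light load: everything is served
print(simulate_invocations(25, 10))  # burst: excess requests are throttled
```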

Configuring a Lambda Function to Access Resources in a VPC

In AWS Lambda, you can set up your function to establish a connection to your virtual private cloud (VPC). With this connection, your function can access the private resources of your VPC during execution, such as EC2 and RDS instances.

By default, AWS Lambda runs your function code securely within a VPC owned and managed by the Lambda service. Alternatively, you can enable your Lambda function to access resources inside your own VPC by providing additional VPC-specific configuration information, such as VPC subnet IDs and security group IDs. Lambda uses this information to set up elastic network interfaces that enable your function to connect securely to other resources within your VPC.

Lambda@Edge

  • Lets you run Lambda functions to customize content that CloudFront delivers, executing the functions in AWS locations closer to the viewer. The functions run in response to CloudFront events, without provisioning or managing servers.
  • You can use Lambda functions to change CloudFront requests and responses at the following points:
    • After CloudFront receives a request from a viewer (viewer request)
    • Before CloudFront forwards the request to the origin (origin request)
    • After CloudFront receives the response from the origin (origin response)
    • Before CloudFront forwards the response to the viewer (viewer response)
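As an example of the fourth hook, here is a sketch of a viewer-response function that adds a security header before CloudFront returns the response to the viewer. The nested event shape follows the documented CloudFront event structure; the event itself is a minimal fabrication for local testing.

```python
def lambda_handler(event, context):
    """Lambda@Edge viewer-response sketch: inject an HSTS header into the
    response CloudFront is about to send back to the viewer."""
    response = event["Records"][0]["cf"]["response"]
    response["headers"]["strict-transport-security"] = [{
        "key": "Strict-Transport-Security",
        "value": "max-age=31536000; includeSubDomains",
    }]
    return response

# Local invocation with a minimal fabricated CloudFront event
event = {"Records": [{"cf": {"response": {"status": "200", "headers": {}}}}]}
resp = lambda_handler(event, None)
print(resp["headers"]["strict-transport-security"][0]["value"])
```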

AWS Training Lambda

  • You can automate your serverless application’s release process using AWS CodePipeline and AWS CodeDeploy.
  • Lambda will automatically track the behavior of your Lambda function invocations and provide feedback that you can monitor. In addition, it provides metrics that allow you to analyze the full function invocation spectrum, including event source integration and whether downstream resources perform as expected.

AWS Elastic Beanstalk

 

  • Allows you to quickly deploy and manage applications in the AWS Cloud without worrying about the infrastructure that runs those applications.
  • Elastic Beanstalk automatically handles the details of capacity provisioning, load balancing, scaling, and application health monitoring for your applications.
  • It is a Platform as a Service (PaaS) offering.
  • Elastic Beanstalk supports the following languages:
    • Go
    • Java
    • .NET
    • Node.js
    • PHP
    • Python
    • Ruby
  • Elastic Beanstalk supports the following web containers:
    • Tomcat
    • Passenger
    • Puma
  • Elastic Beanstalk supports Docker containers.
  • Elastic Beanstalk Workflow

AWS Training Elastic Beanstalk

  • Your application’s domain name is in the format:
    subdomain.region.elasticbeanstalk.com

Environment Pages

  • The Configuration page shows the resources provisioned for this environment. This page also lets you configure some of the provisioned resources.
  • The Health page shows the status and detailed health information about the EC2 instances running your application.
  • The Monitoring page shows the statistics for the environment, such as average latency and CPU utilization. You also use this page to create alarms for the metrics that you are monitoring.
  • The Events page shows any informational or error messages from services that this environment is using.
  • The Tags page shows tags — key-value pairs that are applied to resources in the environment. You use this page to manage your environment’s tags.

Elastic Beanstalk Concepts

  • Application – a logical collection of Elastic Beanstalk components, including environments, versions, and environment configurations. It is conceptually similar to a folder.
  • Application Version – refers to a specific, labeled iteration of deployable code for a web application. An application version points to an Amazon S3 object that contains the deployable code. Applications can have many versions and each application version is unique.
  • Environment – a version that is deployed onto AWS resources. Each environment runs only a single application version at a time; however, you can run the same version or different versions in many environments at the same time.
  • Environment Tier – determines whether Elastic Beanstalk provisions resources to support an application that handles HTTP requests or an application that pulls tasks from a queue. An application that serves HTTP requests runs in a web server environment. An environment that pulls tasks from an Amazon SQS queue runs in a worker environment.
  • Environment Configuration – identifies a collection of parameters and settings that define how an environment and its associated resources behave.
  • Configuration Template – a starting point for creating unique environment configurations.
  • There is a limit to the number of application versions you can have. You can avoid hitting the limit by applying an application version lifecycle policy to your applications to tell Elastic Beanstalk to delete application versions that are old, or to delete application versions when the total number of versions for an application exceeds a specified number.

Environment Types

  • Load-balancing, Autoscaling Environment – automatically starts additional instances to accommodate increasing load on your application.
  • Single-Instance Environment – contains one Amazon EC2 instance with an Elastic IP address.

Environment Configurations

  • Your environment contains:
    • Your EC2 virtual machines configured to run web apps on the platform that you choose.
    • An Auto Scaling group that ensures that there is always one instance running in a single-instance environment, and allows configuration of the group with a range of instances to run in a load-balanced environment.
    • When you enable load balancing, Elastic Beanstalk creates an Elastic Load Balancing load balancer to distribute traffic among your environment’s instances.
    • Elastic Beanstalk provides integration with Amazon RDS to help you add a database instance to your Elastic Beanstalk environment: MySQL, PostgreSQL, Oracle, or SQL Server. When you add a database instance to your environment, Elastic Beanstalk provides connection information to your application by setting environment properties for the database hostname, port, user name, password, and database name.
    • You can use environment properties to pass secrets, endpoints, debug settings, and other information to your application. Environment properties help you run your application in multiple environments for different purposes, such as development, testing, staging, and production.
    • You can configure your environment to use Amazon SNS to notify you of important events that affect your application.
    • Your environment is available to users at a subdomain of elasticbeanstalk.com. When you create an environment, you can choose a unique subdomain that represents your application.
  • You can use a shared Application Load Balancer to serve traffic for multiple applications running on multiple Elastic Beanstalk environments within the same VPC. 
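The database connection details mentioned above reach your application as plain environment properties. As a sketch, application code can read them like this; the RDS_* names are the documented properties Elastic Beanstalk sets when you add an RDS instance, and the defaults and sample values here are placeholders.

```python
import os

def rds_settings_from_environment():
    """Read the connection properties Elastic Beanstalk sets when an RDS
    instance is attached to the environment. Defaults are placeholders
    for local development."""
    return {
        "host": os.environ.get("RDS_HOSTNAME", "localhost"),
        "port": int(os.environ.get("RDS_PORT", "3306")),
        "db": os.environ.get("RDS_DB_NAME", "ebdb"),
        "user": os.environ.get("RDS_USERNAME", ""),
        "password": os.environ.get("RDS_PASSWORD", ""),
    }

# Simulate the values Elastic Beanstalk would inject
os.environ["RDS_HOSTNAME"] = "mydb.example.us-east-1.rds.amazonaws.com"
os.environ["RDS_PORT"] = "3306"
print(rds_settings_from_environment()["host"])
```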

Monitoring

  • Elastic Beanstalk Monitoring console displays your environment’s status and application health at a glance.
  • Elastic Beanstalk reports the health of a web server environment depending on how the application running in it responds to the health check.
  • Enhanced health reporting is a feature that you can enable on your environment to allow AWS Elastic Beanstalk to gather additional information about resources in your environment. Elastic Beanstalk analyzes the information gathered to provide a better picture of overall environment health and aid in the identification of issues that can cause your application to become unavailable.
  • You can create alarms for metrics to help you monitor changes to your environment so that you can easily identify and mitigate problems before they occur.
  • EC2 instances in your Elastic Beanstalk environment generate logs that you can view to troubleshoot issues with your application or configuration files.

Security

  • When you create an environment, Elastic Beanstalk prompts you to provide two AWS IAM roles: a service role and an instance profile.
    • Service Roles – assumed by Elastic Beanstalk to use other AWS services on your behalf.
    • Instance Profiles – applied to the instances in your environment and allows them to retrieve application versions from S3, upload logs to S3, and perform other tasks that vary depending on the environment type and platform.
  • User Policies – allow users to create and manage Elastic Beanstalk applications and environments.

Pricing

  • There is no additional charge for Elastic Beanstalk. You pay only for the underlying AWS resources that your application consumes.