Sunday, 20 March 2022

Amazon RDS vs DynamoDB

 


Type of database

RDS: Managed relational (SQL) database.

DynamoDB: Fully managed key-value and document (NoSQL) database.

Features

RDS: Offers several database instance types for different kinds of workloads and supports six database engines – Amazon Aurora, PostgreSQL, MySQL, MariaDB, Oracle Database, and SQL Server.

DynamoDB: Delivers single-digit millisecond performance at any scale.

Storage Size

RDS:
– 128 TB for the Aurora engine.
– 64 TB for the MySQL, MariaDB, Oracle, and PostgreSQL engines.
– 16 TB for the SQL Server engine.

DynamoDB: Supports tables of virtually any size.

Number of tables per unit

RDS: Depends on the database engine.

DynamoDB: 256 tables per Region (default limit).

Performance

RDS:
– General Purpose storage is an SSD-backed storage option that delivers a consistent baseline of 3 IOPS per provisioned GB, with the ability to burst up to 3,000 IOPS.
– Provisioned IOPS storage is an SSD-backed storage option designed to deliver a consistent IOPS rate that you specify when creating a database instance, up to 40,000 IOPS per database instance. Amazon RDS provisions that IOPS rate for the lifetime of the database instance. Optimized for OLTP database workloads.
– Magnetic – Amazon RDS also supports magnetic storage for backward compatibility.

DynamoDB:
– Single-digit millisecond read and write performance. Can handle more than 10 trillion requests per day, with peaks of more than 20 million requests per second, over petabytes of storage.
– DynamoDB Accelerator (DAX) is an in-memory cache that can improve the read performance of your DynamoDB tables by up to 10 times – taking the time required for reads from milliseconds to microseconds, even at millions of requests per second.
– You specify the read and write throughput for each of your tables (provisioned capacity mode).
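As a rough illustration of provisioned throughput sizing, the sketch below estimates read and write capacity units from item size and request rate, using the published unit definitions (one RCU covers one strongly consistent read per second of an item up to 4 KB; one WCU covers one write per second of an item up to 1 KB):

```python
import math

def required_rcu(item_size_kb: float, reads_per_sec: float,
                 strongly_consistent: bool = True) -> int:
    """Estimate read capacity units: 1 RCU = one strongly consistent
    read/sec of up to 4 KB; eventually consistent reads cost half."""
    units_per_read = math.ceil(item_size_kb / 4)
    factor = 1.0 if strongly_consistent else 0.5
    return math.ceil(units_per_read * reads_per_sec * factor)

def required_wcu(item_size_kb: float, writes_per_sec: float) -> int:
    """Estimate write capacity units: 1 WCU = one write/sec of up to 1 KB."""
    return math.ceil(math.ceil(item_size_kb) * writes_per_sec)

# 6 KB items read 10x/sec (strongly consistent): 2 units/read -> 20 RCU
print(required_rcu(6, 10))         # 20
# Same reads, eventually consistent -> 10 RCU
print(required_rcu(6, 10, False))  # 10
# 1.5 KB items written 5x/sec: 2 units/write -> 10 WCU
print(required_wcu(1.5, 5))        # 10
```

This is only a sizing estimate; actual consumption depends on access patterns, and on-demand mode removes the need to specify throughput at all.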

Availability and durability

RDS:
– Amazon RDS Multi-AZ deployments synchronously replicate your data to a standby instance in a different Availability Zone.
– Amazon RDS will automatically replace the compute instance powering your deployment in the event of a hardware failure.

DynamoDB:
– DynamoDB automatically replicates your data across three Availability Zones within a Region. DynamoDB global tables replicate your data across your choice of AWS Regions and automatically scale capacity to accommodate your workloads.

Backups

RDS: The automated backup feature enables point-in-time recovery for your database instance. Database snapshots are user-initiated backups of your instance stored in Amazon S3 that are kept until you explicitly delete them.

DynamoDB: Point-in-time recovery (PITR) provides continuous backups of your DynamoDB table data, and you can restore that table to any point in time up to the second during the preceding 35 days. On-demand backup and restore lets you create full backups of your DynamoDB tables for data archiving.
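To make the PITR restore concrete, here is a minimal sketch that assembles the parameters for DynamoDB's RestoreTableToPointInTime API (the table names are hypothetical); with credentials configured, boto3 would send it as shown in the trailing comment:

```python
from datetime import datetime, timedelta, timezone

def build_pitr_restore_params(source_table: str, target_table: str,
                              restore_time: datetime) -> dict:
    """Assemble a RestoreTableToPointInTime request. PITR keeps
    continuous backups for the preceding 35 days only."""
    earliest = datetime.now(timezone.utc) - timedelta(days=35)
    if restore_time < earliest:
        raise ValueError("restore_time is outside the 35-day PITR window")
    return {
        "SourceTableName": source_table,
        "TargetTableName": target_table,   # restores always create a new table
        "RestoreDateTime": restore_time,
    }

params = build_pitr_restore_params(
    "orders",           # hypothetical source table
    "orders-restored",  # hypothetical target table
    datetime.now(timezone.utc) - timedelta(days=1),
)
# boto3.client("dynamodb").restore_table_to_point_in_time(**params)
print(sorted(params))
```

Note that a point-in-time restore never overwrites the source table; you restore into a new table and cut over.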

Scalability

RDS:
– The Amazon Aurora engine automatically grows the size of your database volume. The MySQL, MariaDB, SQL Server, Oracle, and PostgreSQL engines let you scale storage on the fly with zero downtime, and RDS also supports storage auto scaling.
– Read replicas are available in Amazon RDS for MySQL, MariaDB, and PostgreSQL, as well as Amazon Aurora.

DynamoDB:
– Supports tables of virtually any size with horizontal scaling.
– For tables using on-demand capacity mode, DynamoDB instantly accommodates your workloads as they ramp up or down to any previously reached traffic level. For tables using provisioned capacity, DynamoDB delivers automatic scaling of throughput and storage based on your previously set capacity.

Security

RDS:
– Isolate your database in your own virtual network.
– Connect to your on-premises IT infrastructure using industry-standard encrypted IPsec VPNs.
– You can configure firewall settings and control network access to your database instances.
– Integrates with IAM.

DynamoDB:
– Integrates with IAM.

Encryption

RDS:
– Encrypt your databases using keys you manage through AWS KMS. With encryption enabled, data stored at rest is encrypted, as are its automated backups, read replicas, and snapshots.
– Supports Transparent Data Encryption in SQL Server and Oracle.
– Supports the use of SSL to secure data in transit.

DynamoDB:
– Encrypts data at rest by default using encryption keys stored in AWS KMS.

Maintenance

RDS: Amazon RDS updates databases with the latest patches. You can exert optional control over when and whether your database instance is patched.

DynamoDB: No maintenance required, since DynamoDB is serverless.

Pricing

RDS:
– A monthly charge for each database instance that you launch.
– Option to reserve a DB instance for a one- or three-year term and receive discounted pricing compared to On-Demand instance pricing.

DynamoDB:
– Charges for reading, writing, and storing data in your DynamoDB tables, along with any optional features you choose to enable. There are specific billing options for each of DynamoDB's capacity modes.

Use cases

RDS: Traditional applications, ERP, CRM, and e-commerce.

DynamoDB: Internet-scale applications, real-time bidding, shopping carts, customer preferences, content management, personalization, and mobile applications.

Amazon Kinesis Data Streams vs Data Firehose vs Data Analytics vs Video Streams

 


Short definition

Data Streams: Scalable and durable real-time data streaming service.

Data Firehose: Capture, transform, and deliver streaming data into data lakes, data stores, and analytics services.

Data Analytics: Transform and analyze streaming data in real time with Apache Flink.

Video Streams: Stream video from connected devices to AWS for analytics, machine learning, playback, and other processing.

Data sources

Data Streams: Any data source (servers, mobile devices, IoT devices, etc.) that can call the Kinesis API to send data.

Data Firehose: Any data source (servers, mobile devices, IoT devices, etc.) that can call the Kinesis API to send data.

Data Analytics: Amazon MSK, Amazon Kinesis Data Streams, servers, mobile devices, IoT devices, etc.

Video Streams: Any streaming device that supports the Kinesis Video Streams SDK.

Data consumers

Data Streams: Kinesis Data Analytics, Amazon EMR, Amazon EC2, AWS Lambda.

Data Firehose: Amazon S3, Amazon Redshift, Amazon Elasticsearch Service, generic HTTP endpoints, Datadog, New Relic, MongoDB, and Splunk.

Data Analytics: Analysis results can be sent to another Kinesis stream, a Kinesis Data Firehose delivery stream, or a Lambda function.

Video Streams: Amazon Rekognition, Amazon SageMaker, MxNet, TensorFlow, HLS-based media playback, custom media processing applications.

Use cases

Data Streams:
– Log and event data collection
– Real-time analytics
– Mobile data capture
– Gaming data feed

Data Firehose:
– IoT analytics
– Clickstream analytics
– Log analytics
– Security monitoring

Data Analytics:
– Streaming ETL
– Real-time analytics
– Stateful event processing

Video Streams:
– Smart technologies
– Video-related AI/ML
– Video processing
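For example, a producer pushes records into a Data Stream through the Kinesis API. The sketch below builds a PutRecord request (the stream name and payload are hypothetical); with boto3 it would be sent as shown in the trailing comment:

```python
import json

def build_put_record_params(stream_name: str, payload: dict,
                            partition_key: str) -> dict:
    """Assemble a Kinesis PutRecord request. The partition key
    determines which shard receives the record, so records sharing
    a key keep their relative order."""
    return {
        "StreamName": stream_name,
        "Data": json.dumps(payload).encode("utf-8"),  # up to 1 MB per record
        "PartitionKey": partition_key,
    }

params = build_put_record_params(
    "clickstream-events",                 # hypothetical stream name
    {"user": "u-123", "action": "view"},  # hypothetical event payload
    "u-123",                              # key by user to keep per-user order
)
# boto3.client("kinesis").put_record(**params)
print(sorted(params))
```

Keying by user ID is a common choice when per-user ordering matters; a random key spreads load more evenly across shards.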

Amazon EFS vs Amazon FSx for Windows vs Amazon FSx for Lustre

 

Amazon EFS
  •  Amazon EFS is a serverless, scalable, high-performance file system in the cloud.
  •  EFS file systems can be accessed by Amazon EC2 Linux instances, Amazon ECS, Amazon EKS, AWS Fargate, and AWS Lambda functions via a file system interface using the NFS protocol.
  •  Amazon EFS supports file system access semantics such as strong consistency and file locking.
  •  EFS file systems can automatically scale in storage to handle petabytes of data. With Bursting mode, the throughput available to a file system scales as a file system grows. Provisioned Throughput mode allows you to provision a constant file system throughput independent of the amount of data stored.
  •  EFS file systems can be concurrently accessed by thousands of compute services without sacrificing performance.
  •  Common use cases for EFS file systems include big data and analytics workloads, media processing workflows, content management, web serving, and home directories.
  •  Amazon EFS has four storage classes: Standard, Standard Infrequent Access, One Zone, and One Zone Infrequent Access
  •  You can create lifecycle management rules to move your data from standard storage classes to infrequent access storage classes.
  •  Every EFS file system object of Standard storage is redundantly stored across multiple AZs.
  •  EFS offers the ability to encrypt data at rest and in transit. Data at rest is encrypted using AWS KMS keys, and data in transit is encrypted using TLS 1.2.
  •  To access EFS file systems from on-premises, you must have an AWS Direct Connect or AWS VPN connection between your on-premises datacenter and your Amazon VPC.
Amazon FSx for Windows File Server
  •  Amazon FSx for Windows File Server is a fully managed, scalable file storage that is accessible over the SMB protocol.
  •  Since it is built on Windows Server, it natively supports administrative features such as user quotas, end-user file restore, and Microsoft Active Directory integration.
  •  FSx for WFS is accessible from Windows, Linux, and MacOS compute instances and devices. Thousands of compute instances and devices can access a file system concurrently.
  •  You can connect your FSx for WFS file system to Amazon EC2, Amazon ECS, VMware Cloud on AWS, Amazon WorkSpaces, and Amazon AppStream 2.0 instances.
  •  Every file system comes with a default Windows file share, named “share”.
  •  Common use cases for FSx for WFS include CRM, ERP, custom or .NET applications, home directories, data analytics, media and entertainment workflows, software build environments, and Microsoft SQL Server.
  •  You can access FSx file systems from your on-premises environment using an AWS Direct Connect or AWS VPN connection between your on-premises datacenter and your Amazon VPC. 
  •  You can choose the storage type for your file system: SSD storage for latency-sensitive workloads or workloads requiring the highest levels of IOPS/throughput. HDD storage for throughput-focused workloads that aren’t latency-sensitive.
  •  Every FSx for WFS file system has a throughput capacity that you configure when the file system is created and that you can change at any time.
  •  Each Windows File Server file system can store up to 64 TB of data. You can only manually increase the storage capacity.
  •  Your file system can be deployed in multiple AZs or a single AZ only. Multi-AZ file systems provide automatic failover.
  •  FSx for Windows File Server always encrypts your file system data and your backups at-rest using keys you manage through AWS KMS. Data-in-transit encryption uses SMB Kerberos session keys.
Amazon FSx for Lustre
  •  Amazon FSx for Lustre is a fully managed file system that runs on Lustre – an open-source, high-performance file system.
  •  The Lustre file system is designed for applications that require fast storage. FSx for Lustre file systems can scale to hundreds of GB/s of throughput and millions of IOPS. FSx for Lustre also supports concurrent access to the same file or directory from thousands of compute instances.
  •  Unlike EFS, storage capacity must be increased manually, and you can request an increase only once every six hours.
  •  Amazon FSx for Lustre also integrates with Amazon S3, which lets you process cloud data sets with the Lustre high-performance file system.
  •  Common use cases for Lustre include machine learning, high-performance computing (HPC), video processing, financial modeling, genome sequencing, and electronic design automation (EDA).
  •  FSx for Lustre can only be used by Linux-based instances. To access your file system, you first install the open-source Lustre client on that instance. Then you mount your file system using standard Linux commands. Lustre file systems can also be used with Amazon EKS and AWS Batch.
  •  FSx for Lustre provides two deployment options: 
    1. Scratch file systems are for temporary storage and shorter-term processing of data. Data is not replicated and does not persist if a file server fails.
    2. Persistent file systems are for longer-term storage and workloads. The file servers are highly available, and data is automatically replicated within the AZ that is associated with the file system.
  •  You can choose the storage type for your file system: SSD storage for latency-sensitive workloads or workloads requiring the highest levels of IOPS/throughput. HDD storage for throughput-focused workloads that aren’t latency-sensitive.
  •  FSx for Lustre always encrypts your file system data and your backups at-rest using keys you manage through AWS KMS. FSx encrypts data-in-transit when accessed from supported EC2 instances.

Amazon Cognito User Pools vs Identity Pools

With the proliferation of smartphones in our connected world, more and more developers are quickly deploying their applications on the cloud. One of the first challenges in developing an application is allowing users to log in and authenticate. There are multiple stages involved in user verification, and most of these are not visible to the end user. AWS provides an easy solution for this situation.

User Identity verification is at the core of Amazon Cognito. It provides solutions for three key areas of user identification: 

  1. Authentication – provides users sign-up and sign-in options. Enables support for federation with Enterprise Identities (Microsoft AD), or Social Identities (Amazon, Facebook, Google, etc.)
  2. Authorization – sets of permission or operations allowed for a user. It provides fine-grained access control to resources. 
  3. User Management – allows management of user lifecycles, such as importing users, onboarding users, disabling users, and storing and managing user profiles.

In this post, we’ll talk about Cognito User Pools and Identity Pools, including an overview of how they are used to provide authentication and authorization functionalities that can be integrated on your mobile app.


Amazon Cognito User Pools

Amazon Cognito User Pools are used for authentication. To verify your users' identities, you will want a way for them to log in using a username and password, or a federated login through identity providers such as Amazon, Facebook, Google, or a SAML-supported provider such as Microsoft Active Directory. You can configure these identity providers in Cognito, and it will handle the interactions with them, so you only have to handle the authentication tokens in your app.

Amazon Cognito Integration with Identity Providers

With Cognito User Pools, you can provide sign-up and sign-in functionality for your mobile or web app users. You don’t have to build or maintain any server infrastructure on which users will authenticate. 

This diagram shows how authentication is handled with Cognito User Pools:

Cognito User Pool for Authentication

  1. Users send authentication requests to Cognito User Pools. 
  2. The Cognito user pool verifies the identity of the user or sends the request to Identity Providers such as Facebook, Google, Amazon, or SAML authentication (with Microsoft AD).
  3. The Cognito User Pool Token is sent back to the user. 
  4. The person can then use this token to access your backend APIs hosted on your EC2 clusters or in API Gateway and Lambda.
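The user pool token in step 3 is a JSON Web Token. The sketch below decodes the payload of a locally constructed sample token to show the kind of claims it carries; the claim values are illustrative, and a real Cognito token is RS256-signed and must be verified against the user pool's JWKS before being trusted:

```python
import base64
import json

def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs do."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def decode_jwt_payload(token: str) -> dict:
    """Decode (without verifying!) the middle segment of a JWT."""
    payload_b64 = token.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore base64 padding
    return json.loads(base64.urlsafe_b64decode(payload_b64))

# Build a sample unsigned token purely for illustration.
header = b64url(json.dumps({"alg": "RS256", "kid": "example"}).encode())
claims = {"sub": "example-user-id", "cognito:username": "alice",
          "token_use": "id", "aud": "example-app-client-id"}
token = f"{header}.{b64url(json.dumps(claims).encode())}.signature"

decoded = decode_jwt_payload(token)
print(decoded["cognito:username"], decoded["token_use"])  # alice id
```

Your backend (or API Gateway) performs the signature check; application code should never trust an unverified payload.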

If you want a quick login page, you can even use the pre-built login UI provided by Amazon Cognito, which you just have to integrate into your application.

Default Amazon Cognito User Login Page

On the Amazon Cognito User Pool page, you can also manage users if you need to. You can reset the password, disable/enable users, and enroll/delete users or other actions needed for User Management. 

Amazon Cognito Identity Pools

Cognito Identity Pools (Federated Identities) provides different functionality compared to User Pools. Identity Pools are used for User Authorization. You can create unique identities for your users and federate them with your identity providers. Using identity pools, users can obtain temporary AWS credentials to access other AWS services. 

Identity Pools can be thought of as the actual mechanism authorizing access to AWS resources. When you create Identity Pools, think of it as defining who is allowed to get AWS credentials and use those credentials to access AWS resources.

This diagram shows how authorization is handled with Cognito Identity Pools:

Cognito Identity Pools (Federated Identities)

  1. The web app or mobile app sends its authentication token to Cognito Identity Pools. The token can come from a valid Identity Provider, like Cognito User Pools, Amazon, or Facebook. 
  2. Cognito Identity Pool exchanges the user authentication token for temporary AWS credentials to access resources such as S3 or DynamoDB. AWS credentials are sent back to the user. 
  3. The temporary AWS credentials will be used to access AWS resources. 

You can define rules in Cognito Identity Pools that map users to different IAM roles to provide fine-grained permissions.
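The exchange in steps 1 and 2 maps onto two Cognito Identity API calls, GetId and GetCredentialsForIdentity. The sketch below assembles their request parameters (the pool ID, identity ID, and token are hypothetical); with boto3 you would pass them to `client.get_id(...)` and `client.get_credentials_for_identity(...)`:

```python
def build_get_id_params(identity_pool_id: str, provider: str,
                        token: str) -> dict:
    """Request an identity ID for a user authenticated by an IdP.
    The Logins map pairs a provider name with that provider's token."""
    return {"IdentityPoolId": identity_pool_id,
            "Logins": {provider: token}}

def build_get_credentials_params(identity_id: str, provider: str,
                                 token: str) -> dict:
    """Exchange the identity ID (plus the IdP token) for temporary
    AWS credentials scoped by the identity pool's IAM roles."""
    return {"IdentityId": identity_id,
            "Logins": {provider: token}}

# Hypothetical values: a Cognito user pool acting as the identity provider.
provider = "cognito-idp.us-east-1.amazonaws.com/us-east-1_EXAMPLE"
get_id = build_get_id_params("us-east-1:example-pool-id", provider,
                             "eyJ...idToken")
creds = build_get_credentials_params("us-east-1:example-identity-id",
                                     provider, "eyJ...idToken")
print(sorted(get_id), sorted(creds))
```

The credentials returned by the second call are the temporary AWS keys the app then uses against S3, DynamoDB, and so on.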

Here’s a table summary describing Cognito User Pool and Identity Pool:


Cognito User Pools:
– Handles the IdP interactions for you
– Provides profiles to manage users
– Provides OpenID Connect and OAuth standard tokens
– Priced per monthly active user

Cognito Identity Pools:
– Provides AWS credentials for accessing resources on behalf of users
– Supports rules to map users to different IAM roles
– Free

Amazon Aurora vs Amazon RDS

 

Type of database

Both Aurora and RDS are relational databases.

Features

Aurora:
  • MySQL and PostgreSQL compatible.
  • 5x faster than standard MySQL databases and 3x faster than standard PostgreSQL databases.
  • Use Parallel Query to run transactional and analytical workloads in the same Aurora database, while maintaining high performance.
  • You can distribute and load balance your unique workloads across different sets of Aurora DB instances using custom endpoints.
  • Aurora Serverless allows for on-demand auto scaling of your Aurora DB instance capacity.

RDS:
  • Has several database instance types for different kinds of workloads and supports five database engines – MySQL, PostgreSQL, MariaDB, Oracle, and SQL Server.
  • Can use either General Purpose storage or Provisioned IOPS storage to deliver consistent IOPS performance.
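The custom-endpoint feature mentioned above corresponds to the RDS CreateDBClusterEndpoint API. Here is a minimal sketch of the request (the cluster and replica identifiers are hypothetical); with boto3 it would be sent as shown in the trailing comment:

```python
def build_custom_endpoint_params(cluster_id: str, endpoint_id: str,
                                 members: list) -> dict:
    """Assemble an RDS CreateDBClusterEndpoint request that pins a
    reader endpoint to a specific set of Aurora replicas, so one
    workload is load balanced across only those instances."""
    return {
        "DBClusterIdentifier": cluster_id,
        "DBClusterEndpointIdentifier": endpoint_id,
        "EndpointType": "READER",
        "StaticMembers": members,  # only these instances serve this endpoint
    }

params = build_custom_endpoint_params(
    "my-aurora-cluster",                       # hypothetical cluster ID
    "analytics-readers",                       # endpoint for heavy read queries
    ["aurora-replica-3", "aurora-replica-4"],  # hypothetical replica IDs
)
# boto3.client("rds").create_db_cluster_endpoint(**params)
print(params["EndpointType"])
```

This lets you, for example, route reporting queries to a couple of dedicated replicas while the rest serve latency-sensitive application reads.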

Maximum storage capacity

Aurora: 128 TB.

RDS:
  • 64 TB for the MySQL, MariaDB, Oracle, and PostgreSQL engines.
  • 16 TB for the SQL Server engine.

DB instance classes

Aurora:
  • Memory Optimized classes – for workloads that need to process large data sets in memory.
  • Burstable classes – give the instance the ability to burst to a higher level of CPU performance when required by the workload.

RDS:
  • Standard classes – general purpose instances for a wide range of workloads, offering a balance of compute, memory, and networking resources.
  • Memory Optimized classes – for workloads that need to process large data sets in memory.
  • Burstable classes – give the instance the ability to burst to a higher level of CPU performance when required by the workload.

Availability and durability

Aurora:
  • Amazon Aurora uses RDS Multi-AZ technology to automate failover to one of up to 15 Amazon Aurora Replicas across three Availability Zones.
  • Amazon Aurora Global Database uses storage-based replication to replicate a database across multiple AWS Regions, with typical latency of less than 1 second.
  • Self-healing: data blocks and disks are continuously scanned for errors and replaced automatically.

RDS:
  • Amazon RDS Multi-AZ deployments synchronously replicate your data to a standby instance in a different Availability Zone.
  • Amazon RDS will automatically replace the compute instance powering your deployment in the event of a hardware failure.

Backups

Aurora:
  • Point-in-time recovery to restore your database to any second during your retention period, up to the last five minutes.
  • Automatic backup retention period of up to thirty-five days.
  • Backtrack to the original database state without needing to restore data from a backup.

RDS:
  • The automated backup feature enables point-in-time recovery for your database instance.
  • Database snapshots are user-initiated backups of your instance stored in Amazon S3 that are kept until you explicitly delete them.

Scalability

Aurora:
  • Aurora automatically increases the size of your volume as your database grows (in increments of 10 GB).
  • Aurora also supports replica auto scaling, automatically adding and removing DB replicas in response to changes in performance metrics.
  • Cross-region replicas provide fast local reads to your users, and each region can have an additional 15 Aurora replicas to further scale local reads.

RDS:
  • The MySQL, MariaDB, SQL Server, Oracle, and PostgreSQL engines scale your storage automatically as your database workload grows, with zero downtime.
  • Read replicas are available for Amazon RDS for MySQL, MariaDB, PostgreSQL, Oracle, and SQL Server. Amazon RDS creates a second DB instance using a snapshot of the source DB instance and uses the engines' native asynchronous replication to update the read replica whenever there is a change to the source.
  • Can scale compute and memory resources vertically, up to a maximum of 32 vCPUs and 244 GiB of RAM.

Security

Both Aurora and RDS:
  • Isolate the database in your own virtual network via a VPC.
  • Connect to your on-premises IT infrastructure using encrypted IPsec VPNs, or AWS Direct Connect and VPC endpoints.
  • Configure security group firewall and network access rules for your database instances.
  • Integrate with IAM.

Encryption

Aurora:
  • Encrypt your databases using keys you manage through AWS KMS. With Amazon Aurora encryption, data stored at rest is encrypted, as are its automated backups, snapshots, and replicas in the same cluster.
  • Supports the use of SSL (AES-256) to secure data in transit.

RDS:
  • Encrypt your databases using keys you manage through AWS KMS. With Amazon RDS encryption, data stored at rest is encrypted, as are its automated backups, read replicas, and snapshots.
  • Supports Transparent Data Encryption in SQL Server and Oracle.
  • Supports the use of SSL to secure data in transit.

DB Authentication

Aurora:
  • Password authentication
  • Password and IAM database authentication

RDS:
  • Password authentication
  • Password and IAM database authentication
  • Password and Kerberos authentication

Maintenance

Aurora:
  • Amazon Aurora automatically updates the database with the latest patches.
  • Amazon Aurora Serverless enables you to run your database in the cloud without managing or maintaining any database infrastructure.

RDS:
  • Amazon RDS updates databases with the latest major and minor patches during scheduled maintenance windows. You can exert optional control over when and whether your database instance is patched.

Monitoring

Both Aurora and RDS:
  • Use Enhanced Monitoring to collect metrics from the operating system instance.
  • Use Performance Insights to detect database performance problems and take corrective action.
  • Use Amazon SNS to receive notifications on database events.

Pricing

Both Aurora and RDS:
  • A monthly charge for each database instance that you launch when using On-Demand pricing. This includes both the instance compute capacity and the amount of storage used.
  • Option to reserve a DB instance for a one- or three-year term (reserved instances) and receive discounted pricing.

Use Cases

Aurora:
  • Enterprise applications – a great option for any enterprise application that uses a relational database, since Aurora handles provisioning, patching, backup, recovery, failure detection, and repair.
  • SaaS applications – you can concentrate on building high-quality applications without worrying about the underlying database that powers them.
  • Web and mobile gaming – games need a database with high throughput and storage scalability that is highly available; Aurora suits the variable usage pattern of these apps perfectly.

RDS:
  • Web and mobile applications – applications that need a database with high throughput and storage scalability that is highly available; RDS fulfills the needs of such highly demanding apps.
  • E-commerce applications – a managed database service that offers PCI compliance, so you can focus on building high-quality customer experiences without thinking about the underlying database.
  • Mobile and online games – game developers don't need to worry about provisioning, scaling, and monitoring database servers, since RDS manages the database infrastructure.