Saturday, 26 March 2022

Google Cloud Dataproc

 

  • Cloud Dataproc is a fully managed service for running Apache Spark, Apache Hadoop, Presto, and other open source software (OSS) clusters on Google Cloud Platform.

Features

  • You can quickly spin up resizable clusters and choose the virtual machine types, disk sizes, number of nodes, and networking options.
  • Dataproc autoscaling automatically adds and removes cluster workers for you.
  • Cloud Dataproc has built-in integration with the following Google Cloud services:
    • Cloud Storage
    • BigQuery
    • Cloud Bigtable
    • Cloud Logging
    • Cloud Monitoring
    • AI Hub
  • Dataproc supports image versioning, which lets you switch between different versions of the tools you want to use.
  • To avoid charges for inactive clusters, you can use Dataproc's scheduled deletion.
  • You can manage your clusters via:
    • Cloud Console Web UI
    • Cloud SDK
    • RESTful APIs
    • SSH access
  • Dataproc can be provisioned with custom images according to your needs.
  • Workflow templates provide a flexible and simple mechanism for managing and executing workflows.
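
As a rough illustration of how several of the options above fit together (machine types, disk sizes, number of nodes, image version, and scheduled deletion), the sketch below builds a JSON body shaped like a Dataproc cluster creation request. The field names follow the cluster resource of the Dataproc REST API as understood here and should be checked against the API reference. The helper name build_cluster_request, the cluster name, and every value are hypothetical, and nothing is sent to Google Cloud.

```python
import json


def build_cluster_request(name, num_workers, machine_type, boot_disk_gb,
                          image_version, idle_delete_seconds):
    """Return a dict shaped like a Dataproc cluster creation request body.

    Hypothetical helper for illustration only; it makes no API calls.
    """
    def node_group(num_instances):
        # Each node group picks its own size, VM type, and boot disk.
        return {
            "numInstances": num_instances,
            "machineTypeUri": machine_type,
            "diskConfig": {"bootDiskSizeGb": boot_disk_gb},
        }

    return {
        "clusterName": name,
        "config": {
            "masterConfig": node_group(1),
            "workerConfig": node_group(num_workers),
            # Image versioning: pin the bundle of tool versions to use.
            "softwareConfig": {"imageVersion": image_version},
            # Scheduled deletion: delete the cluster after it sits idle.
            "lifecycleConfig": {"idleDeleteTtl": f"{idle_delete_seconds}s"},
        },
    }


# Example values are assumptions, not recommendations.
request = build_cluster_request(
    name="example-cluster",
    num_workers=2,
    machine_type="n1-standard-4",
    boot_disk_gb=500,
    image_version="2.0-debian10",
    idle_delete_seconds=3600,
)
print(json.dumps(request, indent=2))
```

A body like this could be passed to the RESTful API or adapted for the client libraries; the Cloud SDK exposes comparable options as command-line flags.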

Pricing

  • You pay only for the resources you use, which lowers the total cost of ownership of OSS.
  • Dataproc pricing is based on the number of vCPUs and the duration that they run.
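
The vCPU-based charge can be estimated with a few lines of arithmetic. This sketch assumes a rate of $0.010 per vCPU per hour, which should be checked against the current Dataproc pricing page. The function name and the cluster sizes are made up for illustration. It covers only the Dataproc charge; the underlying Compute Engine VMs and disks are billed separately.

```python
# Assumed Dataproc rate in USD per vCPU per hour; verify before relying on it.
ASSUMED_RATE_PER_VCPU_HOUR = 0.010


def estimate_dataproc_charge(num_nodes, vcpus_per_node, hours,
                             rate=ASSUMED_RATE_PER_VCPU_HOUR):
    """Estimate the Dataproc charge: rate * total vCPUs * hours run."""
    total_vcpus = num_nodes * vcpus_per_node
    return rate * total_vcpus * hours


# Hypothetical cluster: 1 master + 5 workers, 4 vCPUs each, running 2 hours.
charge = estimate_dataproc_charge(num_nodes=6, vcpus_per_node=4, hours=2)
print(f"Estimated Dataproc charge: ${charge:.2f}")
```

Because the charge scales with both vCPU count and run time, shutting down idle clusters (for example with scheduled deletion) directly reduces the bill.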
