Thursday, 24 March 2022

Amazon Elastic Inference

  • Allows attaching low-cost GPU-powered inference acceleration to EC2 instances, SageMaker instances, or ECS tasks.
  • Reduces machine learning inference costs by up to 75%.
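
The attachment can be requested at launch time. Below is a minimal boto3 sketch; the AMI ID, instance type, and accelerator type are illustrative placeholders, not recommendations, and the actual API call is left commented out because it requires AWS credentials and a configured VPC endpoint.

```python
# Sketch: launching an EC2 instance with an Elastic Inference accelerator
# attached via boto3. All IDs and sizes below are illustrative placeholders.

def build_run_instances_params(ami_id, accelerator_type="eia2.medium"):
    """Build run_instances parameters that request one accelerator."""
    return {
        "ImageId": ami_id,
        "InstanceType": "m5.large",
        "MinCount": 1,
        "MaxCount": 1,
        # Requests a network-attached, GPU-powered accelerator for the instance.
        "ElasticInferenceAccelerators": [
            {"Type": accelerator_type, "Count": 1}
        ],
    }

# Actual launch (requires AWS credentials and a PrivateLink endpoint, see
# Concepts below):
# import boto3
# ec2 = boto3.client("ec2")
# ec2.run_instances(**build_run_instances_params("ami-0123456789abcdef0"))
```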

Common use cases

  • Computer vision
  • Natural language processing
  • Speech recognition

Concepts

  • Accelerator
    • A GPU-powered hardware device that is provisioned for your instance.
    • It is not part of the hardware that hosts your instance.
    • It attaches to the instance over the network through an AWS PrivateLink endpoint service.
  • Only a single endpoint service is required in each Availability Zone to connect Elastic Inference accelerators to instances.
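
The PrivateLink attachment is an interface VPC endpoint for the Elastic Inference runtime service. The sketch below builds the `create_vpc_endpoint` parameters; the VPC and subnet IDs are placeholders, and the service name follows the `elastic-inference.runtime` naming pattern.

```python
# Sketch: creating the AWS PrivateLink interface endpoint that Elastic
# Inference uses to reach accelerators. IDs below are placeholders.

def build_endpoint_params(vpc_id, subnet_ids, region="us-east-1"):
    """Build create_vpc_endpoint parameters for the EI runtime service."""
    return {
        "VpcEndpointType": "Interface",
        "VpcId": vpc_id,
        # One endpoint per Availability Zone is enough, so pass one subnet
        # per AZ in which accelerators will be used.
        "SubnetIds": subnet_ids,
        "ServiceName": f"com.amazonaws.{region}.elastic-inference.runtime",
    }

# Actual call (requires AWS credentials):
# import boto3
# ec2 = boto3.client("ec2")
# ec2.create_vpc_endpoint(**build_endpoint_params("vpc-0abc", ["subnet-0abc"]))
```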

Features

  • Supports TensorFlow, Apache MXNet, PyTorch, and ONNX models.
  • Can provide 1 to 32 trillion floating-point operations per second (TFLOPS) per accelerator.
  • Accelerators attached to the instances in an Auto Scaling group scale with the group as your application’s compute demand changes.
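
For SageMaker-hosted models, the accelerator is requested per production variant when creating an endpoint configuration. A minimal sketch follows; the model name, variant name, instance type, and accelerator size are illustrative placeholders.

```python
# Sketch: attaching an Elastic Inference accelerator to a SageMaker hosted
# endpoint by setting AcceleratorType on a production variant. Names and
# sizes below are illustrative placeholders.

def build_endpoint_config(model_name, accelerator_type="ml.eia2.medium"):
    """Build create_endpoint_config parameters with an EI accelerator."""
    return {
        "EndpointConfigName": f"{model_name}-ei-config",
        "ProductionVariants": [
            {
                "VariantName": "primary",
                "ModelName": model_name,
                "InstanceType": "ml.m5.large",
                "InitialInstanceCount": 1,
                # Requests an Elastic Inference accelerator for this variant.
                "AcceleratorType": accelerator_type,
            }
        ],
    }

# Actual call (requires AWS credentials and an existing SageMaker model):
# import boto3
# sm = boto3.client("sagemaker")
# sm.create_endpoint_config(**build_endpoint_config("my-tf-model"))
```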

Pricing

  • You are charged for the accelerator hours you consume.
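
The billing model is simple enough to sketch as arithmetic: cost is accelerator hours consumed multiplied by the per-hour rate. The rate below is a hypothetical placeholder, not an actual AWS price; real rates vary by accelerator size and region.

```python
# Sketch: accelerator-hour billing. The hourly rate used in the example is a
# hypothetical placeholder, not a real AWS price.

def accelerator_cost(hours, hourly_rate):
    """Cost = accelerator hours consumed x the per-hour rate."""
    return hours * hourly_rate

# e.g. one accelerator attached around the clock for a 30-day month,
# at a hypothetical rate of $0.12 per accelerator hour:
monthly_cost = accelerator_cost(hours=24 * 30, hourly_rate=0.12)
```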
