Thursday, 24 March 2022

Amazon SageMaker

 

  • A fully managed service that allows data scientists and developers to easily build, train, and deploy machine learning models at scale.
  • Provides built-in algorithms that you can immediately use for model training.
  • Also supports custom algorithms through Docker containers.
  • One-click model deployment.

Concepts

  • Hyperparameters
    • A set of variables that control how a model is trained.
    • You can think of them as “volume knobs” that you tune to reach your model’s objective.
  • Automatic Model Tuning
    • Finds the best version of a model by running multiple training jobs across the hyperparameter ranges that you specify and comparing them on a metric that you choose.
  • Training
    • The process where you create a machine learning model.
  • Inference
    • The process of using the trained model to make predictions.
  • Local Mode
    • Allows you to train and deploy estimators on your local machine for testing.
    • You must install the Amazon SageMaker Python SDK in your local environment to use local mode.
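
To make the idea of tuning "within the limits of the hyperparameters" concrete, here is a minimal sketch of random search over hyperparameter ranges. It is not the SageMaker API: the objective `toy_validation_error`, the range names, and the job count are all made up for illustration, and the real service can also use other strategies such as Bayesian optimization.

```python
import random


def toy_validation_error(learning_rate, num_layers):
    """Stand-in for a real training job: returns a made-up validation error."""
    return (learning_rate - 0.1) ** 2 + 0.01 * abs(num_layers - 4)


def random_search(ranges, objective, max_jobs, seed=0):
    """Try max_jobs random configurations drawn from ranges and keep the best."""
    rng = random.Random(seed)
    best_params, best_score = None, float("inf")
    for _ in range(max_jobs):
        # Each "training job" gets one point sampled inside the allowed ranges.
        params = {
            "learning_rate": rng.uniform(*ranges["learning_rate"]),
            "num_layers": rng.randint(*ranges["num_layers"]),
        }
        score = objective(**params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical ranges: (low, high) for each hyperparameter.
ranges = {"learning_rate": (0.001, 0.5), "num_layers": (1, 8)}
best_params, best_score = random_search(ranges, toy_validation_error, max_jobs=20)
print(best_params, best_score)
```

The ranges play the role of the limits you give the tuner, and `max_jobs` caps how many training jobs it may launch.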

Common Training Data Formats For Built-in Algorithms

  • CSV
  • Protobuf RecordIO
  • JSON
  • LibSVM
  • JPEG
  • PNG
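
As a rough illustration of two of the text formats above, the sketch below writes the same rows as CSV and as LibSVM. It assumes the layout the built-in algorithms generally expect for CSV training data (no header row, label in the first column) and the common 1-based `index:value` LibSVM convention; the rows themselves are invented, so check the documentation of the specific algorithm before relying on either layout.

```python
import csv
import io

# Hypothetical rows: (label, feature_1, feature_2, feature_3)
rows = [
    (1, 0.5, 0.0, 2.0),
    (0, 0.0, 1.5, 0.0),
]


def to_csv(rows):
    """Label first, features after, no header row."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerows(rows)
    return buf.getvalue()


def to_libsvm(rows):
    """One 'label index:value ...' line per row; zero-valued features are omitted."""
    lines = []
    for label, *features in rows:
        pairs = [f"{i}:{v}" for i, v in enumerate(features, start=1) if v != 0]
        lines.append(" ".join([str(label), *pairs]))
    return "\n".join(lines) + "\n"


print(to_csv(rows))
print(to_libsvm(rows))
```

LibSVM is a sparse format, which is why the zero features disappear from the second output.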

Input modes for transferring training data

  • File mode
    • Downloads the data onto the training instance's storage volume before model training commences.
    • Slower than Pipe mode.
    • Used for incremental training.
  • Pipe mode
    • Streams data directly from Amazon S3 into the training algorithm container.
    • There’s no need to provision large volumes to store large datasets.
    • Provides shorter startup and training times.
    • Higher I/O throughput.
    • Faster than File mode.
    • Many built-in algorithms accept Pipe mode only with protobuf RecordIO training data, so check the algorithm's supported formats before switching.
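
The choice between File and Pipe mode is made in the training job request. The sketch below builds only the input-related fragment of such a request as a plain dictionary. The field names follow the CreateTrainingJob API as I understand it, the helper `training_input_config` and the S3 URI are hypothetical, and a real request needs more fields (algorithm image, role, output location, resources), so treat this as an outline to verify against the API reference.

```python
def training_input_config(s3_uri, input_mode="File", content_type="text/csv"):
    """Build the input-related part of a training job request (fragment only)."""
    # Only the two modes discussed in this section are accepted by this sketch.
    if input_mode not in ("File", "Pipe"):
        raise ValueError(f"unsupported input mode: {input_mode}")
    return {
        "AlgorithmSpecification": {"TrainingInputMode": input_mode},
        "InputDataConfig": [
            {
                "ChannelName": "train",
                "ContentType": content_type,
                "DataSource": {
                    "S3DataSource": {
                        "S3DataType": "S3Prefix",
                        "S3Uri": s3_uri,
                        "S3DataDistributionType": "FullyReplicated",
                    }
                },
            }
        ],
    }


# Hypothetical bucket and prefix; RecordIO content type paired with Pipe mode.
config = training_input_config(
    "s3://example-bucket/train/",
    input_mode="Pipe",
    content_type="application/x-recordio-protobuf",
)
print(config["AlgorithmSpecification"])
```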

Two methods of deploying a model for inference

  • Amazon SageMaker Hosting Services
    • Provides a persistent HTTPS endpoint for getting predictions one at a time.
    • Suited for web applications that need sub-second response latency.
  • Amazon SageMaker Batch Transform
    • Doesn’t need a persistent endpoint.
    • Generates inferences for an entire dataset in one job.
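
Getting a prediction "one at a time" from a hosted endpoint can be sketched as a single `invoke_endpoint` call. The function below takes the runtime client as a parameter so it can run offline: `FakeRuntimeClient` and the endpoint name `my-endpoint` are hypothetical stand-ins, and in real use you would presumably pass `boto3.client("sagemaker-runtime")` instead.

```python
import io


def predict_one(runtime_client, endpoint_name, features):
    """Send one CSV record to a hosted endpoint and return the raw response text."""
    payload = ",".join(str(x) for x in features)
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="text/csv",
        Body=payload,
    )
    # The response body is a stream, so it has to be read and decoded.
    return response["Body"].read().decode("utf-8")


class FakeRuntimeClient:
    """Offline stand-in so the sketch runs without AWS credentials."""

    def invoke_endpoint(self, EndpointName, ContentType, Body):
        n = len(Body.split(","))
        return {"Body": io.BytesIO(f"received {n} features".encode("utf-8"))}


print(predict_one(FakeRuntimeClient(), "my-endpoint", [5.1, 3.5, 1.4]))
```

Batch Transform has no equivalent per-record call; it reads a whole dataset from S3 and writes the results back.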

Optimization

  • Convert training data into protobuf RecordIO format to make use of Pipe mode.
  • Use Amazon FSx for Lustre to accelerate File mode training jobs.

Monitoring

  • You can publish SageMaker instance metrics to a CloudWatch dashboard to gain a unified view of CPU utilization, memory utilization, and latency.
  • You can also send training metrics to a CloudWatch dashboard to monitor model performance in real time.
  • AWS CloudTrail logs SageMaker API calls, which helps you detect unauthorized ones.
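
For custom training code, training metrics typically reach CloudWatch by being parsed out of the job's log output with regular expressions. The sketch below imitates that step locally. The metric names, regexes, and log lines are all hypothetical; only the Name/Regex shape of a metric definition is meant to mirror what a training job accepts.

```python
import re

# Hypothetical metric definitions in the Name/Regex shape used for training jobs.
metric_definitions = [
    {"Name": "train:loss", "Regex": r"train_loss=([0-9.]+)"},
    {"Name": "validation:accuracy", "Regex": r"val_acc=([0-9.]+)"},
]

# Hypothetical log lines a training script might print.
log_lines = [
    "epoch=1 train_loss=0.693 val_acc=0.51",
    "epoch=2 train_loss=0.412 val_acc=0.78",
]


def extract_metrics(lines, definitions):
    """Return {metric name: [values]} by applying each regex to every line."""
    metrics = {d["Name"]: [] for d in definitions}
    for line in lines:
        for d in definitions:
            match = re.search(d["Regex"], line)
            if match:
                metrics[d["Name"]].append(float(match.group(1)))
    return metrics


metrics = extract_metrics(log_lines, metric_definitions)
print(metrics)
```

Testing the regexes against sample log lines like this before launching a job is a cheap way to avoid empty metric graphs.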

Pricing

  • Building, training, and deploying ML models is billed by the second, with no minimum fees and no upfront commitments.
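
Per-second billing means the cost of a job scales with its exact duration rather than rounding up to a full hour. A small worked example, assuming a made-up hourly rate of $0.36 per instance (actual rates depend on instance type and region):

```python
def training_cost(seconds, hourly_rate_usd, instance_count=1):
    """Cost of a job billed per second at a given per-instance hourly rate."""
    return seconds * instance_count * hourly_rate_usd / 3600


# Hypothetical job: 754 seconds on 2 instances at an assumed $0.36/hour each.
cost = training_cost(seconds=754, hourly_rate_usd=0.36, instance_count=2)
print(f"${cost:.4f}")
```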
