What is Amazon Elastic Inference?
- Amazon Elastic Inference (EI) allows you to attach GPU-powered acceleration to existing EC2 instances, Amazon SageMaker instances, or ECS tasks.
- It provides flexibility by letting you add just the right amount of GPU acceleration instead of overprovisioning with an entire dedicated GPU instance.
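Attaching an accelerator happens at launch time. The sketch below builds the request parameters a boto3 EC2 client's `run_instances()` call would accept, using the real `ElasticInferenceAccelerators` field; the AMI ID, instance type, and accelerator size are illustrative placeholders, not recommendations.

```python
# Sketch: launching an EC2 instance with an EI accelerator attached.
# AMI ID, instance type, and accelerator size are illustrative assumptions;
# the request shape matches the EC2 RunInstances API.

def build_run_instances_params(ami_id,
                               instance_type="m5.large",
                               accelerator_type="eia2.medium"):
    """Return kwargs for boto3's ec2_client.run_instances() with an
    Elastic Inference accelerator attached at launch."""
    return {
        "ImageId": ami_id,
        "InstanceType": instance_type,
        "MinCount": 1,
        "MaxCount": 1,
        # EI accelerators are attached at launch time, not afterwards.
        "ElasticInferenceAccelerators": [
            {"Type": accelerator_type, "Count": 1}
        ],
    }

# Placeholder AMI ID for illustration only.
params = build_run_instances_params("ami-0123456789abcdef0")
print(params["ElasticInferenceAccelerators"])
```

Passing these kwargs to `ec2_client.run_instances(**params)` would launch a CPU instance with a network-attached accelerator, assuming the required VPC endpoint and IAM setup for EI are in place.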
Strengths
- Cost Reduction: EI can significantly reduce the cost of running deep learning inference workloads compared to dedicated GPU instances. You pay only for the GPU resources you need.
- Flexibility: Choose the right amount of GPU acceleration to match your workload's specific requirements.
- Ease of Integration: Easily attach EI accelerators to existing EC2 instances, SageMaker endpoints, or ECS tasks without major code changes.
- Broad Instance Compatibility: Works with a range of CPU instance types across various families.
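To make the cost-reduction point concrete, here is a back-of-the-envelope comparison of a CPU instance plus an EI accelerator versus a dedicated GPU instance. All hourly prices are illustrative assumptions, not current AWS pricing; check the AWS pricing pages for real numbers.

```python
# Back-of-the-envelope cost comparison for the "Cost Reduction" point.
# All hourly prices are ILLUSTRATIVE assumptions, not actual AWS pricing.

CPU_INSTANCE = 0.10    # assumed $/hr for a general-purpose CPU instance
EI_ACCELERATOR = 0.12  # assumed $/hr for a small EI accelerator
GPU_INSTANCE = 0.75    # assumed $/hr for a dedicated GPU instance

def monthly_cost(hourly_rate, hours=730):
    """Cost of running a resource continuously for one month (~730 h)."""
    return hourly_rate * hours

ei_setup = monthly_cost(CPU_INSTANCE + EI_ACCELERATOR)
gpu_setup = monthly_cost(GPU_INSTANCE)
print(f"CPU + EI: ${ei_setup:.2f}/mo, GPU: ${gpu_setup:.2f}/mo, "
      f"savings: {100 * (1 - ei_setup / gpu_setup):.0f}%")
```

Under these assumed rates the combined setup costs a fraction of the dedicated GPU instance; the actual gap depends on the accelerator size your model needs and on real regional pricing.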
Weaknesses
- Limited Performance: EI accelerators provide less compute and GPU memory than full-fledged GPU instances (like P3 or G4 instances), so they suit light-to-moderate inference rather than heavy workloads.
- Availability: Not all instance families are compatible with EI, and the service is offered only in certain AWS Regions.
- Network Overhead: Some latency overhead is introduced due to the network communication between the EC2 instance and the EI accelerator.
- Sunset of Service: As of April 15, 2023, AWS no longer onboards new customers to EI. Existing customers can continue using the service while they migrate their workloads.
Real-World Use Case: Batch Image Processing
- Varying Inference Needs: Suppose you need to process a large batch of images with a deep learning model, where the required level of GPU acceleration varies with image complexity.
- Cost Optimization: Attaching EI accelerators to general-purpose EC2 instances allows you to provision the appropriate level of GPU power on-demand, paying only for what you use instead of dedicated GPU instances that may be underutilized.
- Scalability: You can horizontally scale your EC2 instances with EI accelerators, ensuring capacity for processing large image batches.
Important Notes
- AWS encourages migrating from EI to alternatives such as AWS Inferentia-based Inf1 instances (on EC2 or SageMaker), which offer better price-performance for inference.
- Always consult the latest documentation for AWS's recommendations and available inference options: https://aws.amazon.com/machine-learning/elastic-inference/
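The migration path AWS recommends can be sketched as the same endpoint configuration targeting an Inferentia (Inf1) instance instead of a CPU instance plus an EI accelerator. Names are illustrative; note that models must first be compiled for Inferentia (for example with SageMaker Neo or the AWS Neuron SDK) before they can serve on Inf1.

```python
# Sketch: the same endpoint config shape, migrated from "CPU + EI" to an
# AWS Inferentia (Inf1) instance. Names are illustrative assumptions.

def build_inf1_endpoint_config(model_name, config_name):
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",
                "ModelName": model_name,
                # Inferentia instance: no separate AcceleratorType field.
                "InstanceType": "ml.inf1.xlarge",
                "InitialInstanceCount": 1,
            }
        ],
    }

inf1 = build_inf1_endpoint_config("image-classifier", "image-classifier-inf1")
```

The structural change is small: drop `AcceleratorType` and switch `InstanceType` to an Inf1 size; the real migration effort is in recompiling the model for the Inferentia chip.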