Batch Processing
Batch processing in AWS SageMaker (the Batch Transform feature) is a method of running inference jobs over large datasets in a non-interactive manner, without a persistent endpoint.
How It Works
In batch processing, you point SageMaker at a dataset in S3 and submit it as a single job. SageMaker provisions compute for the duration of the job, runs inference over the entire dataset, and writes the results to an S3 bucket for later retrieval.
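As a sketch of this flow, the snippet below builds the parameters for SageMaker's `CreateTransformJob` API and submits them with boto3. The job name, model name, bucket paths, and instance settings are placeholder assumptions; the model is assumed to already be registered in SageMaker.

```python
# Hypothetical configuration for a SageMaker Batch Transform job.
# Names and S3 paths are placeholders; substitute your own.
transform_config = {
    "TransformJobName": "my-batch-job",
    "ModelName": "my-model",  # a model already created in SageMaker
    "TransformInput": {
        "DataSource": {
            "S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": "s3://my-bucket/input/",  # dataset to score
            }
        },
        "ContentType": "text/csv",
        "SplitType": "Line",  # treat each line as one record
    },
    "TransformOutput": {
        "S3OutputPath": "s3://my-bucket/output/"  # where results land
    },
    "TransformResources": {
        "InstanceType": "ml.m5.xlarge",
        "InstanceCount": 2,  # >1 instance enables parallel processing
    },
}

def start_batch_job(config):
    """Submit the transform job; SageMaker tears down the compute when done."""
    import boto3  # imported lazily so the config can be built without AWS deps
    client = boto3.client("sagemaker")
    return client.create_transform_job(**config)
```

Setting `InstanceCount` above 1 is what lets SageMaker shard the input across machines, which is the parallelism mentioned below.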
Benefits
- Efficiency: Large datasets can be split across multiple instances and processed in parallel.
- Cost-Effective: You pay only for the compute used during the job; there is no always-on endpoint to keep running.
Limitations
- Latency: Results are only available once the job finishes, so there is a delay between submission and output.
- Less Interactive: You must wait for the entire job to complete before retrieving any results.
Features
- Non-Interactive Processing: It processes the entire dataset in one go.
- Parallel Processing: It allows for parallel processing of data.
Use Cases
- Large Datasets: It’s useful when you have large datasets that don’t require real-time responses.
- Offline Tasks: It’s beneficial for work that can be done offline, like nightly scoring runs or other large-scale predictions.
Real-Time Inference
Real-time inference in AWS SageMaker is a method of running inference jobs that require immediate responses.
How It Works
In real-time inference, you send a single observation to your model and get a prediction back immediately. The model is hosted on a persistent endpoint that stays available for as long as you need it.
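A minimal sketch of invoking such an endpoint with boto3 is below, using the SageMaker Runtime `InvokeEndpoint` API. The endpoint name is a placeholder assumption, and the small CSV helper is a hypothetical convenience for formatting one observation:

```python
def to_csv_payload(features):
    """Serialize one observation as a CSV line for a text/csv endpoint."""
    return ",".join(str(f) for f in features)

def predict(endpoint_name, features):
    """Send a single observation to a deployed endpoint and return the raw prediction."""
    import boto3  # imported lazily so the helper above works without AWS deps
    runtime = boto3.client("sagemaker-runtime")
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,  # e.g. "my-endpoint" (placeholder)
        ContentType="text/csv",
        Body=to_csv_payload(features),
    )
    # The response body is a stream; read and decode it.
    return response["Body"].read().decode("utf-8")
```

Unlike a batch job, each call here returns a single prediction synchronously, which is why the endpoint must stay provisioned (and billed) between requests.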