1. VariantName: A unique identifier for the production variant within the endpoint. It's used to differentiate between the different models or model versions deployed on the same endpoint.
  2. ModelName: The name of the Amazon SageMaker model that you want to host. This must be a model that already exists in SageMaker (for example, one created with CreateModel); the model in turn points to the model artifacts and the inference container.
  3. InitialInstanceCount: The initial number of instances to be launched for the model variant. It dictates the scale at which the model starts serving inference requests.
  4. InstanceType: The ML instance type to use for hosting the model (for example, ml.m5.xlarge). This determines the compute and memory resources available to each instance of the model variant.
  5. InitialVariantWeight: The relative weight that determines how much inference traffic is initially routed to this model variant. A variant's share of traffic is its weight divided by the sum of the weights of all variants on the endpoint, so the weights do not need to sum to 1.
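  6. AcceleratorType (optional): Specifies the type of Elastic Inference accelerator (for example, ml.eia2.medium) to attach to each instance of the model variant. This is relevant for models that can benefit from accelerated computing. Amazon Elastic Inference is no longer offered to new customers, so this parameter mainly matters for existing deployments.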
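  7. DesiredWeight (used with the UpdateEndpointWeightsAndCapacities operation): Plays the same role as InitialVariantWeight, but sets the variant's relative traffic weight on an endpoint that is already in service. It is supplied in the operation's DesiredWeightsAndCapacities list rather than in the endpoint configuration.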
  8. DesiredInstanceCount (used with the UpdateEndpointWeightsAndCapacities operation): Specifies a new instance count for a model variant on a running endpoint, so capacity can be adjusted as demand changes without recreating the endpoint.
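
To make the weight arithmetic concrete, here is a minimal Python sketch of how these parameters might be laid out for two variants. The variant names, model names, instance type, and the helper `traffic_shares` are hypothetical and only for illustration; the dictionaries follow the field names listed above. The sketch does not call AWS. With boto3, lists like these would typically be passed as `ProductionVariants` to `create_endpoint_config` and as `DesiredWeightsAndCapacities` to `update_endpoint_weights_and_capacities`.

```python
def traffic_shares(variants, weight_key):
    """Return each variant's share of traffic: its weight / sum of all weights."""
    total = sum(v[weight_key] for v in variants)
    return {v["VariantName"]: v[weight_key] / total for v in variants}


# Hypothetical endpoint configuration with two variants (names are assumptions).
production_variants = [
    {
        "VariantName": "variant-a",
        "ModelName": "my-model-v1",
        "InitialInstanceCount": 1,
        "InstanceType": "ml.m5.large",
        "InitialVariantWeight": 3.0,
    },
    {
        "VariantName": "variant-b",
        "ModelName": "my-model-v2",
        "InitialInstanceCount": 1,
        "InstanceType": "ml.m5.large",
        "InitialVariantWeight": 1.0,
    },
]

# Weights 3.0 and 1.0 do not sum to 1; only their ratio matters.
initial_shares = traffic_shares(production_variants, "InitialVariantWeight")

# Hypothetical later update: equal traffic, and scale variant-b to 2 instances.
desired_weights_and_capacities = [
    {"VariantName": "variant-a", "DesiredWeight": 1.0, "DesiredInstanceCount": 1},
    {"VariantName": "variant-b", "DesiredWeight": 1.0, "DesiredInstanceCount": 2},
]
updated_shares = traffic_shares(desired_weights_and_capacities, "DesiredWeight")

print(initial_shares)   # {'variant-a': 0.75, 'variant-b': 0.25}
print(updated_shares)   # {'variant-a': 0.5, 'variant-b': 0.5}
```

In this sketch, variant-a initially receives three quarters of the traffic, and the later update shifts the split to an even 50/50 while doubling variant-b's capacity.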