Last Updated on September 14, 2021 by Admin 2
endpoint using the built-in object detection algorithm on a P3 instance for real-time predictions in a company’s production application. When evaluating the model’s resource utilization, the specialist notices that the model is using only a fraction of the GPU.
Which architecture changes would ensure that provisioned resources are being utilized effectively?
- Redeploy the model as a batch transform job on an M5 instance.
- Redeploy the model on an M5 instance. Attach Amazon Elastic Inference to the instance.
- Redeploy the model on a P3dn instance.
- Deploy the model onto an Amazon Elastic Container Service (Amazon ECS) cluster using a P3 instance.