Once you have set up the integration with SageMaker, you can start creating SageMaker Deployments. Generally, creating a SageMaker Deployment follows the steps outlined in Creating a Deployment. However, some settings are only available for KServe Deployments and not for SageMaker Deployments, and vice versa. Below, the settings unique to SageMaker Deployments are described per step.
Step 1: Add a repository, and select a version (branch, commit)
For any deployment, it is very important that the repository used adheres to the Contract. More information about the Contract can be found here: Preparing a repository.
For SageMaker Deployments specifically, the reference system must be used. Unique to SageMaker Deployments is that both a blob URL and a Docker reference can be included, as illustrated in the example below.
{
  "reference": {
    "blob": {
      "url": "s3://deeploy-examples/sklearn/census/sagemaker/model.tar.gz"
    },
    "docker": {
      "image": "492215442770.dkr.ecr.eu-central-1.amazonaws.com/sagemaker-scikit-learn:0.20.0-cpu-py3",
      "uri": "/model:predict",
      "port": 8000
    }
  }
}
In this case we used a scikit-learn pre-built Docker image provided by SageMaker and uploaded a .tar.gz file to S3. This compressed file consists of:
- model.joblib: your exported trained model
- inference.py: instructions that tell SageMaker how to run inference with the model
# inference.py
import joblib
import os
import json


def model_fn(model_dir):
    """Deserialize the fitted model."""
    model = joblib.load(os.path.join(model_dir, "model.joblib"))
    return model


def input_fn(request_body, request_content_type):
    """Parse the incoming request.

    request_body: the body of the request sent to the model.
    request_content_type: (string) the format/variable type of the request.
    """
    if request_content_type == 'application/json':
        request_body = json.loads(request_body)
        inpVar = request_body['instances']
        return inpVar
    else:
        raise ValueError("This model only supports application/json input")


def predict_fn(input_data, model):
    """Run the prediction.

    input_data: the array returned by input_fn above.
    model: the sklearn model returned by model_fn above.
    """
    return model.predict(input_data)


def output_fn(prediction, content_type):
    """Serialize the prediction.

    prediction: the value returned by predict_fn above.
    content_type: the content type the endpoint expects to return, e.g. JSON or string.
    """
    res = int(prediction[0])
    respJSON = {'predictions': res}
    return respJSON
- requirements.txt: additional Python requirements that are not installed in the pre-built image by default
# requirements.txt
joblib>=1.1.0
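The three artifacts listed above need to be bundled into the model.tar.gz archive before uploading to S3. A minimal packaging sketch, assuming the example file names above (the placeholder-file step is demo-only so the sketch runs standalone; in a real repository the artifacts already exist):

```python
import os
import tarfile

# Artifacts from the example layout described above
ARTIFACTS = ["model.joblib", "inference.py", "requirements.txt"]

# Demo only: create empty placeholders so the sketch runs standalone
for name in ARTIFACTS:
    if not os.path.exists(name):
        open(name, "wb").close()

with tarfile.open("model.tar.gz", "w:gz") as tar:
    for name in ARTIFACTS:
        # arcname keeps each file at the archive root, where the
        # SageMaker container expects to find them
        tar.add(name, arcname=name)

print(sorted(tarfile.open("model.tar.gz", "r:gz").getnames()))
```

The resulting archive can then be uploaded to S3 (for example with the AWS CLI or boto3) and referenced under the blob URL shown earlier.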
For more information on AWS SageMaker custom and pre-built images, start here.
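Before packaging, the JSON handling in input_fn and output_fn can be exercised locally without a trained model. In this sketch a stub stands in for the model's predict call, and the feature vector is purely illustrative:

```python
import json

# Stub standing in for the trained model: a real model.predict would
# return an array of predictions
def fake_predict(instances):
    return [1 for _ in instances]

# Mirrors input_fn from inference.py above
def input_fn(request_body, request_content_type):
    if request_content_type == "application/json":
        return json.loads(request_body)["instances"]
    raise ValueError("This model only supports application/json input")

# Mirrors output_fn from inference.py above
def output_fn(prediction, content_type):
    return {"predictions": int(prediction[0])}

# Illustrative request payload in the {"instances": [...]} shape
body = json.dumps({"instances": [[39, 7, 13, 4, 1, 0, 2174, 0, 40]]})
instances = input_fn(body, "application/json")
response = output_fn(fake_predict(instances), "application/json")
print(response)  # {'predictions': 1}
```

A quick check like this catches content-type and payload-shape mistakes before the image is ever deployed.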
Step 2: Add deployment metadata
A 'use default backend settings' toggle is included in the second step (Image 1). When toggled on, the default Deployment backend (as defined in the Workspace settings) is used. When toggled off, a Deployment backend can be selected from the dropdown. If SageMaker is selected, the AWS region used for the Deployment can also be changed here.
Image 1: 'Use default backend settings' toggled off
Step 3: Define the inference type
For SageMaker Deployments, only the Custom Docker model type is allowed. Therefore, this option is preselected and the dropdown is disabled. The path to your custom image is automatically taken from the reference.json file in your repository.
The instance type selection gives you the flexibility to choose the appropriate mix of resources for your model. Here you can find a guide on how to choose an instance type based on your use case.
Image 2: Model type selection disabled, advanced configuration expanded.
Step 4: Select the explainer
For SageMaker Deployments, the only available options are No explainer and Custom Docker. As with the model, if you choose Custom Docker, the explainer image is automatically taken from the reference.json file in your repository.
Image 3: Explainer type options for SageMaker Deployments