	# Deployment on Azure Machine Learning

	## Pre-requisites

	```
	cd inference/triton_server
	```

	Set the environment variables for Azure Machine Learning (AML), replacing the example values with your own:
	```
	export RESOURCE_GROUP=Dhruva-prod
	export WORKSPACE_NAME=dhruva--central-india
	export DOCKER_REGISTRY=dhruvaprod
	```

	Also edit the `yml` files in `azure_ml/` (`model.yml`, `environment.yml`, `endpoint.yml` and `deployment.yml`) to match your setup.
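
Every command in this README depends on the three variables exported above, so it can help to fail early when one is missing. The following is a minimal sketch; `require_vars` is a hypothetical helper and is not part of this repository.

```shell
# Report every listed variable that is unset or empty.
# Returns 0 when all are set, 1 otherwise.
require_vars() {
    missing=0
    for name in "$@"; do
        # Look up the value of the variable whose name is stored in $name
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing environment variable: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

require_vars RESOURCE_GROUP WORKSPACE_NAME DOCKER_REGISTRY \
    || echo "set the variables above before continuing" >&2
```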

	## Registering the model

	```
	az ml model create --file azure_ml/model.yml --resource-group $RESOURCE_GROUP --workspace-name $WORKSPACE_NAME
	```

	## Pushing the docker image to Container Registry

	```
	az acr login --name $DOCKER_REGISTRY
	docker tag indictrans2_triton $DOCKER_REGISTRY.azurecr.io/nmt/triton-indictrans-v2:latest
	docker push $DOCKER_REGISTRY.azurecr.io/nmt/triton-indictrans-v2:latest
	```

	## Creating the execution environment

	```
	az ml environment create -f azure_ml/environment.yml -g $RESOURCE_GROUP -w $WORKSPACE_NAME
	```

	## Publishing the endpoint for online inference

	```
	az ml online-endpoint create -f azure_ml/endpoint.yml -g $RESOURCE_GROUP -w $WORKSPACE_NAME
	```

	Next, in the Azure Portal, open the Container Registry and assign the `AcrPull` role to the endpoint created above, so that it is allowed to pull the Docker image.
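
The same role assignment can probably be done from the command line instead of the Portal. The sketch below assumes the endpoint uses a system-assigned managed identity; `grant_acr_pull` is a hypothetical helper, and the endpoint name must be taken from your `azure_ml/endpoint.yml`.

```shell
# Assign the AcrPull role on the registry to an online endpoint's identity.
# $1: endpoint name as defined in azure_ml/endpoint.yml
grant_acr_pull() {
    endpoint_name="$1"

    # Principal ID of the endpoint's system-assigned identity (assumed to be enabled)
    principal_id=$(az ml online-endpoint show --name "$endpoint_name" \
        -g "$RESOURCE_GROUP" -w "$WORKSPACE_NAME" \
        --query identity.principal_id -o tsv) || return 1

    # Full resource ID of the container registry, used as the role scope
    acr_id=$(az acr show --name "$DOCKER_REGISTRY" --query id -o tsv) || return 1

    az role assignment create \
        --assignee-object-id "$principal_id" \
        --assignee-principal-type ServicePrincipal \
        --role AcrPull \
        --scope "$acr_id"
}

# Usage (replace the placeholder with your endpoint name):
# grant_acr_pull "<endpoint-name>"
```

Role assignments can take a few minutes to propagate, so a deployment created immediately afterwards may still fail to pull the image.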

	## Attaching a deployment

	```
	az ml online-deployment create -f azure_ml/deployment.yml --all-traffic -g $RESOURCE_GROUP -w $WORKSPACE_NAME
	```
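
If the deployment does not come up, its provisioning state and container logs are usually the first things to check. The following is a rough sketch; `check_deployment` is a hypothetical helper, and both names passed to it are placeholders for the values in your `endpoint.yml` and `deployment.yml`.

```shell
# Print the provisioning state of a deployment, then its recent container logs.
# $1: endpoint name, $2: deployment name (both from the yml files)
check_deployment() {
    endpoint_name="$1"
    deployment_name="$2"

    # Provisioning state of the deployment, for example "Succeeded" or "Failed"
    az ml online-deployment show --name "$deployment_name" \
        --endpoint-name "$endpoint_name" \
        -g "$RESOURCE_GROUP" -w "$WORKSPACE_NAME" \
        --query provisioning_state -o tsv || return 1

    # Last 50 lines of the inference container log
    az ml online-deployment get-logs --name "$deployment_name" \
        --endpoint-name "$endpoint_name" \
        -g "$RESOURCE_GROUP" -w "$WORKSPACE_NAME" \
        --lines 50
}

# Usage (replace the placeholders with your own names):
# check_deployment "<endpoint-name>" "<deployment-name>"
```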

	## Testing if inference works

	1. In Azure ML Studio, open the endpoint's "Consume" tab and copy the endpoint domain (without `https://` or a trailing `/`) and an authentication key.
	2. In `client.py`, set `ENABLE_SSL = True`, then set the `ENDPOINT_URL` variable and the `Authorization` value inside `HTTP_HEADERS`.
	3. Run `python3 client.py`.
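
The values from step 1 can likely also be fetched with the CLI rather than copied from the Studio. The sketch below assumes the endpoint uses key-based authentication; `endpoint_domain` is a hypothetical helper that reduces a scoring URI to the bare domain the steps above ask for, and `<endpoint-name>` is a placeholder.

```shell
# Strip the "https://" prefix and anything from the first "/" onwards,
# leaving only the host name.
endpoint_domain() {
    host="${1#https://}"
    printf '%s\n' "${host%%/*}"
}

# Usage (replace the placeholder with your endpoint name):
# uri=$(az ml online-endpoint show --name "<endpoint-name>" \
#     -g "$RESOURCE_GROUP" -w "$WORKSPACE_NAME" --query scoring_uri -o tsv)
# endpoint_domain "$uri"
# az ml online-endpoint get-credentials --name "<endpoint-name>" \
#     -g "$RESOURCE_GROUP" -w "$WORKSPACE_NAME" --query primaryKey -o tsv
```

The printed domain and key are what go into `ENDPOINT_URL` and `HTTP_HEADERS` in `client.py`.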