9. Deploy a ModelVersion and set the Data Pipeline up

Objectives:

  • Deploy your ModelVersion
  • Set your Data Pipeline up
  • Access your Model Monitoring Dashboard

1. Deploy your ModelVersion

From the Model Version view, click Deploy to deploy your ModelVersion so that it can make Predictions.

The Deployment can be done on the OVH infrastructure provided by Picsellia or directly on your own infrastructure.

While deploying the ModelVersion, you will be asked for a confidence threshold. Any Shape with a confidence score lower than this threshold is ignored, so we advise you to start with a low threshold and adjust it later in the Settings of the Deployment.
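
As an illustration of how the confidence threshold affects what you see, here is a minimal sketch in plain Python. It does not use the Picsellia SDK: the `Shape` class and the `filter_shapes` function are hypothetical names introduced only for this example, and confidence scores are assumed to lie between 0 and 1.

```python
from dataclasses import dataclass


@dataclass
class Shape:
    """Hypothetical stand-in for a predicted Shape (not the Picsellia class)."""
    label: str
    confidence: float  # assumed to be a score between 0 and 1


def filter_shapes(shapes, threshold):
    """Keep Shapes whose confidence is not lower than the threshold."""
    return [shape for shape in shapes if shape.confidence >= threshold]


predicted = [Shape("car", 0.92), Shape("car", 0.41), Shape("truck", 0.18)]

# A low threshold keeps more Shapes available for review.
kept_low = filter_shapes(predicted, 0.2)
# A higher threshold discards more of them.
kept_high = filter_shapes(predicted, 0.5)

print(len(kept_low), len(kept_high))
```

Starting low and raising the threshold later keeps borderline Shapes visible while you are still learning how the ModelVersion behaves in production.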

Once the confidence threshold is defined, the ModelVersion is deployed. You can access all your Deployments from the Deployment view and select any of them to open its Monitoring Dashboard.

2. Monitoring Dashboard

The Monitoring Dashboard of a Deployment gives you access to many metrics related to the performance of your ModelVersion, such as latency, heat map, KS Drift, Outlier score, mAP, and global distribution. Details about these metrics are documented separately.

Right after the ModelVersion is deployed, these metrics remain empty until the ModelVersion makes its first Predictions.

3. Set the Data Pipeline up

The main idea of this Data Pipeline is to ensure that the ModelVersion is always improving by leveraging human-reviewed production Data.

As a consequence, before making a Prediction, it is useful to set up the following things in the Settings tab:

  • The Training Data indicates the DatasetVersion used to train the deployed ModelVersion, so that unsupervised metrics such as KS Drift can be computed. If you pushed and deployed a ModelVersion on Picsellia without its Training Data, you can skip this step, but be advised that unsupervised metrics will not be computed.
  • The feedback loop is a way to leverage Data coming from production by adding it, once reviewed by a human, to a new DatasetVersion and using this enriched DatasetVersion for further training. The philosophy behind it is that frequently retraining the ModelVersion on Data from the ground maintains its performance over time and limits the effect of Data drift.
  • Continuous training is the part of the pipeline that triggers and orchestrates automatic ModelVersion retraining once the Training Data has been enriched with enough Data coming from the ground.
  • Continuous deployment is the part of the pipeline that exports the retrained ModelVersion as a new version and, optionally, deploys it automatically on the impacted Deployment on Picsellia.

👍

An automated process

All those steps are crucial to building a customized and automated Data Pipeline. Once it is set up, you'll be able to retrain and redeploy improved ModelVersions without any human action (except the prediction review).

To set everything up, you need to access the Settings view of your Deployment and browse all tabs to define each step of your Data Pipeline.

To activate the computing of unsupervised metrics and the Feedback Loop, go into the related Settings tabs.

In the Training Data tab, select the DatasetVersion used to train the deployed ModelVersion. In the Feedback Loop tab, select the DatasetVersion that will be enriched with production Data for further ModelVersion retraining.

🚧

Be patient

When setting up the dataset in Training Data, the initialization can take several minutes before the unsupervised metrics become available.
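
To give an intuition of why a reference DatasetVersion is needed for a drift metric, here is a small sketch of a two-sample Kolmogorov-Smirnov statistic in plain Python. This is an assumption about what "KS Drift" refers to: this page does not describe how Picsellia actually computes it, and the function name `ks_statistic` and the sample values are hypothetical.

```python
from bisect import bisect_right


def ks_statistic(reference, production):
    """Two-sample KS statistic: largest gap between the two empirical CDFs.

    `reference` stands for a numeric feature measured on the training data,
    `production` for the same feature measured on production data.
    """
    ref = sorted(reference)
    prod = sorted(production)
    if not ref or not prod:
        raise ValueError("both samples must be non-empty")

    max_gap = 0.0
    # The empirical CDFs only change at observed values, so checking
    # those points is enough to find the largest gap.
    for value in set(ref) | set(prod):
        cdf_ref = bisect_right(ref, value) / len(ref)
        cdf_prod = bisect_right(prod, value) / len(prod)
        max_gap = max(max_gap, abs(cdf_ref - cdf_prod))
    return max_gap


# Hypothetical feature values (for example, mean image brightness).
same = ks_statistic([1, 2, 3], [1, 2, 3])
shifted = ks_statistic([1, 2, 3], [10, 11, 12])
print(same, shifted)
```

A value close to 0 means production Data looks like the Training Data for that feature, while a value close to 1 means the two distributions barely overlap. Without the Training Data there is no reference sample to compare against.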

Continuous Training and Continuous Deployment can also be initiated from this "Settings" view.

For continuous training, you will be asked to define the retraining parameters, such as the project, the ModelVersion to retrain, the training DatasetVersion, and the training hyperparameters.

It is important to know that this new training and deployment loop is activated when the number of predictions reviewed by a human and added to the dataset reaches a threshold defined in the "Trigger" subpart.
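
The trigger logic described above can be sketched as a simple counter check. This is only an illustration in plain Python, not Picsellia code: the function name, the batch sizes, and the threshold value of 100 are hypothetical.

```python
def should_trigger_retraining(reviewed_count, trigger_threshold):
    """Return True once enough reviewed predictions are in the dataset."""
    if trigger_threshold <= 0:
        raise ValueError("trigger_threshold must be positive")
    return reviewed_count >= trigger_threshold


TRIGGER_THRESHOLD = 100  # hypothetical value set in the "Trigger" subpart

reviewed_total = 0
triggered_after_batch = None
# Each number is a batch of human-reviewed predictions added to the dataset.
for batch_index, batch_size in enumerate([40, 35, 30], start=1):
    reviewed_total += batch_size
    if should_trigger_retraining(reviewed_total, TRIGGER_THRESHOLD):
        triggered_after_batch = batch_index
        break

print(reviewed_total, triggered_after_batch)
```

Choosing the threshold is a trade-off: a low value retrains often on small additions, while a high value waits for more reviewed Data before each retraining.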

Regarding continuous deployment, you have three options:

  • Deploy Shadow: The new ModelVersion trained through Continuous Training is deployed as a shadow ModelVersion of the current Deployment. This means that both the Champion and the Shadow ModelVersion will make Predictions on further inferences. It is the best way to check that the new ModelVersion outperforms the previous one before turning it into the Champion.
  • Deploy Champion: Replace the Champion ModelVersion with the new ModelVersion created through Continuous Training. Further inferences will be done by the newly created ModelVersion.
  • Deploy manual: Do not deploy the new ModelVersion created through Continuous Training. This new ModelVersion remains stored on your Private Model Registry.
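
The three options can be summarized with a small state sketch. It is plain Python with hypothetical names (`DeployPolicy`, `Deployment`, `apply_policy`) and does not reflect the Picsellia SDK. What happens to an existing Shadow when a new Champion is deployed is not stated here, so the sketch simply leaves it untouched, which is an assumption.

```python
from dataclasses import dataclass, replace
from enum import Enum
from typing import Optional


class DeployPolicy(Enum):
    SHADOW = "shadow"
    CHAMPION = "champion"
    MANUAL = "manual"


@dataclass(frozen=True)
class Deployment:
    """Hypothetical view of which ModelVersions serve a Deployment."""
    champion: str
    shadow: Optional[str] = None


def apply_policy(deployment, new_model_version, policy):
    """Return the Deployment state after a continuous training run."""
    if policy is DeployPolicy.SHADOW:
        # The new ModelVersion predicts alongside the Champion.
        return replace(deployment, shadow=new_model_version)
    if policy is DeployPolicy.CHAMPION:
        # The new ModelVersion takes over further inferences.
        return replace(deployment, champion=new_model_version)
    # MANUAL: nothing is deployed, the new ModelVersion stays in the registry.
    return deployment


current = Deployment(champion="v1")
as_shadow = apply_policy(current, "v2", DeployPolicy.SHADOW)
as_champion = apply_policy(current, "v2", DeployPolicy.CHAMPION)
as_manual = apply_policy(current, "v2", DeployPolicy.MANUAL)
print(as_shadow, as_champion, as_manual)
```

Deploy Shadow is the only option in which both ModelVersions predict on the same production Data, which is what makes a side-by-side comparison possible before promoting the new one.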

You can check from the Dashboard view that your whole Data Pipeline is properly activated.

👍

The "Settings" tab also allows you to update the confidence threshold and the name of your Deployment.