A Processing is a piece of code (like a Python script) that interacts with your data on the platform on demand.
To explain this, let's start with a use case:
Let's say that you want to perform data augmentation on a Picsellia Dataset.
Normally, the steps to achieve this would be:
- Downloading your images locally
- Running a script with some data-augmentation techniques (like rotating the image for example) on all of your images
- Creating a new DatasetVersion in the Dataset you are using
- Uploading the augmented images to this new DatasetVersion
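To make the augmentation step in the list above concrete, here is a minimal sketch in plain Python. It assumes an image is represented as a nested list of pixel values (a stand-in for a real image library), and `rotate_90` and `augment` are hypothetical helper names for illustration, not part of the Picsellia SDK.

```python
from typing import Dict, List

# Assumption: a grayscale image is modeled as rows of pixel values.
Image = List[List[int]]


def rotate_90(image: Image) -> Image:
    """Rotate an image 90 degrees clockwise."""
    # Reverse the row order, then transpose: the bottom row becomes the first column.
    return [list(row) for row in zip(*image[::-1])]


def augment(images: Dict[str, Image]) -> Dict[str, Image]:
    """Return one rotated copy per input image, keyed by a new filename."""
    return {f"rot90_{name}": rotate_90(img) for name, img in images.items()}


if __name__ == "__main__":
    # Hypothetical images that were downloaded locally.
    local_images = {"car.png": [[1, 2], [3, 4]]}
    print(augment(local_images))
```

Even in this toy form, the script still has to be run by hand on a machine with the right environment, and the download and upload steps around it remain manual.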
We know it can feel a little bit overwhelming 😮‍💨 Although running a script can be considered an automatic task, the overall process is fully manual. In addition, you must be using a computer that is actually able to run the code (it has to have the correct environment, etc.).
This is why we came up with Processing 🎉 to let you automate this process and launch it whenever you want, on the data you want, directly from the platform!
So let's see how to use the most common Processing already available (handmade by Picsellia ❤️).
Processing can be run on a DatasetVersion, so you can perform actions like:
- Pre-annotation with a ModelVersion
- Data Augmentation
- Smart Version Creations
- Or anything you can think of regarding your data!
In the future, you will find Processing in every part of Picsellia:
- Models (automatically optimize and convert model weights)
- Experiments (perform evaluation or run benchmarks)
- Deployments (compute Custom metrics...)
But that's just a tease for now 😉
Our journey starts on the Processings page, which you can access right below the Datasets tab in the Navigation Bar:
If you go there, you will have access to all the Processing created by the Picsellia team alongside the ones you created.
Let's have a look at this page:
For now, we can see that only two Processing are available. Given their names, we can conclude that they can be used to pre-annotate our DatasetVersion with either YOLO or TensorFlow models.
Let's click on the edit icon of yolo-preannotation to see what's inside.
You will see the same interface whether the Processing you want to edit is one of yours or one of ours.
To illustrate how you can use a Processing, let's see one of the most useful examples: Pre-annotation with a ModelVersion from your Registry (or our HUB).
Let's assume that we want to annotate all the cars and pedestrians in our Sample Dataset.
First, we are going to check in the Model Registry if we have a ModelVersion suitable for the task.
This ModelVersion has been trained on many Labels, and among them are car and people, so it should be suitable to pre-annotate my DatasetVersion.
But first, let's go back to my DatasetVersion and create the Labels that we want our ModelVersion to annotate (in the settings).
Now that the labels are set up, our ModelVersion will know which Labels to predict.
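One way to picture this: the model may predict more classes than you care about, and only predictions whose label exists in the DatasetVersion are kept. The sketch below is an assumption about the general idea, not Picsellia's actual implementation; `Prediction` and `keep_known_labels` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Prediction:
    label: str
    confidence: float


def keep_known_labels(
    predictions: List[Prediction], dataset_labels: Set[str]
) -> List[Prediction]:
    """Keep only predictions whose label was created in the dataset settings."""
    return [p for p in predictions if p.label in dataset_labels]


if __name__ == "__main__":
    # Hypothetical raw model output containing a label we did not create.
    raw = [
        Prediction("car", 0.91),
        Prediction("truck", 0.80),
        Prediction("people", 0.75),
    ]
    for p in keep_known_labels(raw, {"car", "people"}):
        print(p.label, p.confidence)
```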
Let's go back to my DatasetVersion. From the Assets overview, you can click on the process button.
After clicking on this button, a modal where you can select a Processing will open.
As we have decided, we are going to pre-annotate using a YOLO Model. This means that we can select the yolo-preannotation Processing. A new menu to select the ModelVersion will then appear.
Let's select our smart-city-yolo model.
As we saw in the previous section, we can now edit (if we want) the default parameters of this Processing. We could increase the prediction batch_size for example, but let's keep it at 8 for now.
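As a rough illustration of what a parameter like batch_size controls, here is a minimal sketch that merges user overrides into default parameters and then splits a list of assets into batches. Only batch_size and its default of 8 come from the text above; `DEFAULT_PARAMETERS`, `resolve_parameters`, and `batches` are hypothetical names, and the actual Processing may handle its parameters differently.

```python
from typing import Dict, Iterator, List, Optional

# Assumption: batch_size is the only parameter named in this guide.
DEFAULT_PARAMETERS: Dict[str, int] = {"batch_size": 8}


def resolve_parameters(overrides: Optional[Dict[str, int]] = None) -> Dict[str, int]:
    """Start from the defaults and apply any edited values on top."""
    return {**DEFAULT_PARAMETERS, **(overrides or {})}


def batches(assets: List[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive groups of at most batch_size assets."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(assets), batch_size):
        yield assets[start:start + batch_size]


if __name__ == "__main__":
    params = resolve_parameters()  # keep the default of 8
    assets = [f"image_{i}.jpg" for i in range(20)]  # hypothetical asset names
    for batch in batches(assets, params["batch_size"]):
        print(len(batch), "assets sent to the model in one call")
```

A larger batch size means fewer prediction calls, at the cost of more memory per call.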
Now let's finally launch our Processing.
When you launch a Processing, it creates a Job running in the background. You can access its status and much more information about it in the Jobs tab.
On this page, you can see the history of all the Jobs that ran or are currently running on your different DatasetVersions.
If you just launched a Processing, you should see it at the top of the list. Let's inspect our freshly launched pre-annotation Job.
When you launch a Processing, there will be a short moment when the status will be pending. Once your Job has been scheduled (and you start being billed), the status will change to running and you will see some logs being displayed in real time (those come from the stdout of the server it runs on).
In this way, you can really track the progress and the status of your Job and check that everything is going well.
Once your Job is over, you will have access to the full history of logs and the total running time, and the status will switch to succeeded (or failed, if there were issues at runtime).
Your Job will fail sometimes, but you'll be able to find the issue thanks to the stack trace in the logs.
Once you have detected the issue, fixed it, and updated your Processing's Docker image, you can click on the Re-run Job button. This will create and launch a second run, just like the one on the left of the screen.
You can retry your Job as many times as you want, as long as there is no active run (meaning no run in the pending or running status).
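The retry rule can be sketched as a small check over run statuses. The four status names come from this guide; `RunStatus`, `ACTIVE_STATUSES`, and `can_rerun` are hypothetical names used for illustration and do not reflect how the platform implements it.

```python
from enum import Enum
from typing import List


class RunStatus(Enum):
    PENDING = "pending"
    RUNNING = "running"
    SUCCEEDED = "succeeded"
    FAILED = "failed"


# A run is considered active while it is waiting to be scheduled or executing.
ACTIVE_STATUSES = {RunStatus.PENDING, RunStatus.RUNNING}


def can_rerun(run_statuses: List[RunStatus]) -> bool:
    """A Job can be re-run only if none of its runs is still active."""
    return not any(status in ACTIVE_STATUSES for status in run_statuses)


if __name__ == "__main__":
    print(can_rerun([RunStatus.FAILED]))                     # first run failed
    print(can_rerun([RunStatus.FAILED, RunStatus.RUNNING]))  # retry in progress
```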
Now that our Job has finished, let's have a look at our DatasetVersion! It should be fully annotated with cars and pedestrians!
That's a full success! Our DatasetVersion has been nicely pre-annotated by our ModelVersion with barely any effort. That's the power of Data Processings on Picsellia.
If you want to create your own Processing, you can follow this guide.