Dataset - Fork a Dataset Version

An earlier section of the documentation showed how to create a DatasetVersion from a selection of Data in the Datalake.

For many reasons, you may need to modify a DatasetVersion. To ensure traceability and avoid losing the history of your work, it is highly recommended to use the Picsellia Dataset versioning system and fork a new DatasetVersion from the one to be modified.

Picsellia provides features that let you fork a DatasetVersion easily and customize what the fork contains.

1. Fork a DatasetVersion

First, open the DatasetVersion you want to fork.

From there, you can fork the entire DatasetVersion or a subset of it. Depending on your needs, use the Asset selection features to select the Assets to include in the new DatasetVersion.

Once the Assets are selected, click on Dataset > Create New Dataset Version.

_Create New Dataset Version_ button


A modal opens so that you can customize the fork. You can name the new DatasetVersion and decide what to include in addition to the Assets:

  • Copy Label, which creates the new DatasetVersion with the same Detection Type and Labelmap as the initial one.
  • Copy Label and Annotation, which creates the new DatasetVersion with the same Detection Type and Labelmap as the initial one, and also copies the Annotations related to the Assets of the new DatasetVersion.
  • Copy Tags, which copies the AssetTags to the new DatasetVersion and attaches them to the new Assets the same way as in the initial one.

Several options can be checked for the same fork:

Fork options


To finalize the fork, click on Create.
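To make the effect of each fork option concrete, the behaviour described above can be modelled with a small, self-contained Python sketch. This is not the Picsellia SDK: the `Asset` and `DatasetVersion` classes, the `fork` function, its parameters, and the `"OBJECT_DETECTION"` value are hypothetical names introduced only for illustration, and the sketch assumes that copying Annotations also copies the Detection Type and Labelmap, as the Copy Label and Annotation option suggests.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Asset:
    # Hypothetical in-memory stand-in for an Asset of a DatasetVersion.
    filename: str
    tags: List[str] = field(default_factory=list)
    annotations: List[str] = field(default_factory=list)  # simplified: label names only


@dataclass
class DatasetVersion:
    # Hypothetical in-memory stand-in for a DatasetVersion.
    name: str
    detection_type: Optional[str] = None
    labelmap: List[str] = field(default_factory=list)
    assets: List[Asset] = field(default_factory=list)


def fork(
    source: DatasetVersion,
    name: str,
    selected: Optional[Set[str]] = None,
    copy_labels: bool = False,
    copy_annotations: bool = False,
    copy_tags: bool = False,
) -> DatasetVersion:
    """Return a new DatasetVersion built from `source`, leaving `source` untouched."""
    # Assumption: copying Annotations implies copying the Detection Type and Labelmap.
    copy_labels = copy_labels or copy_annotations

    # Fork everything when no selection is given, otherwise only the selected Assets.
    chosen = [a for a in source.assets if selected is None or a.filename in selected]

    new_assets = [
        Asset(
            filename=a.filename,
            tags=list(a.tags) if copy_tags else [],
            annotations=list(a.annotations) if copy_annotations else [],
        )
        for a in chosen
    ]
    return DatasetVersion(
        name=name,
        detection_type=source.detection_type if copy_labels else None,
        labelmap=list(source.labelmap) if copy_labels else [],
        assets=new_assets,
    )


# Usage: fork a one-Asset subset, keeping Labels and Annotations but not Tags.
source = DatasetVersion(
    name="v1",
    detection_type="OBJECT_DETECTION",  # hypothetical value
    labelmap=["car", "truck"],
    assets=[
        Asset("a.jpg", tags=["night"], annotations=["car"]),
        Asset("b.jpg", tags=["day"], annotations=["truck"]),
    ],
)
forked = fork(source, "v2", selected={"a.jpg"}, copy_annotations=True)
```

In this model the initial DatasetVersion is never mutated, which mirrors the reason for forking in the first place: the history of the original stays intact while the new version is modified.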


Once the fork is launched, its completion can be tracked from the Jobs panel.