Dataset - Fork a Dataset Version

Earlier in the documentation, we saw how to create a DatasetVersion from a selection of Data in the Datalake.

There are many reasons why you might want to modify a DatasetVersion. To ensure traceability and avoid losing the history of your work, it is highly recommended to use the Picsellia Dataset versioning system and fork a new DatasetVersion from the one you want to modify.

Picsellia provides features that let you easily fork a DatasetVersion or copy Assets from one DatasetVersion to another.

1. Fork a DatasetVersion

First, open the DatasetVersion you want to fork.

From there, you can fork the entire DatasetVersion or only a subset of it. Depending on your needs, use the Asset selection features to select the Assets to include in the new DatasetVersion.

Once the Assets are selected, click on Dataset > Create New Dataset Version.

_Create New Dataset Version_ button

A modal opens so that you can customize the fork. You can name the new DatasetVersion and choose what to include in addition to the Assets:

  • Copy Label, which creates a new DatasetVersion with the same Detection Type and Labelmap as the initial one.
  • Copy Label and Annotation, which creates a new DatasetVersion with the same Detection Type and Labelmap as the initial one. The Annotations related to the Assets of the new DatasetVersion are also copied.
  • Copy Tags, which copies the AssetTags to the new DatasetVersion and attaches them to the new Assets the same way as in the initial one.

Several options can be checked for the same fork:

Fork options

To finalize the fork, click on Create.

📈

Once the fork is launched, its completion can be tracked from the Jobs panel.
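
To make the effect of each fork option concrete, here is a minimal in-memory sketch in Python. It is not the Picsellia SDK: the `Asset` and `DatasetVersion` classes, the `fork` function, and its parameters are hypothetical names chosen for illustration, and the sketch assumes that a fork without Copy Label starts with no Detection Type and an empty Labelmap.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Asset:
    # Hypothetical, simplified stand-in for an Asset.
    filename: str
    tags: set[str] = field(default_factory=set)
    annotations: list[str] = field(default_factory=list)


@dataclass
class DatasetVersion:
    # Hypothetical, simplified stand-in for a DatasetVersion.
    name: str
    detection_type: Optional[str] = None
    labelmap: set[str] = field(default_factory=set)
    assets: list[Asset] = field(default_factory=list)


def fork(source, name, selected, copy_labels=False,
         copy_annotations=False, copy_tags=False):
    """Return a new DatasetVersion holding the selected Assets of `source`."""
    # "Copy Label and Annotation" implies that the labels are copied too.
    copy_labels = copy_labels or copy_annotations
    forked = DatasetVersion(
        name=name,
        detection_type=source.detection_type if copy_labels else None,
        labelmap=set(source.labelmap) if copy_labels else set(),
    )
    for asset in source.assets:
        if asset.filename not in selected:
            continue  # only the selected Assets are part of the fork
        forked.assets.append(Asset(
            filename=asset.filename,
            tags=set(asset.tags) if copy_tags else set(),
            annotations=list(asset.annotations) if copy_annotations else [],
        ))
    return forked
```

In this model, forking with `copy_annotations=True` and `copy_tags=False` yields Assets that keep their Annotations but carry no AssetTags, which mirrors checking Copy Label and Annotation while leaving Copy Tags unchecked in the modal.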

2. Copy an Asset to another DatasetVersion

If you want to copy Assets from a given DatasetVersion to another existing DatasetVersion of your Dataset, use the Copy to an existing dataset version button.

From any DatasetVersion, select the Assets to copy and click on Dataset > Copy to an existing dataset version.

A popup then opens, letting you select the DatasetVersion of the Dataset to which the selected Assets will be copied.

You can choose to copy only the Assets, or to also include the related Annotations or AssetTags.

Finally, click on Copy to perform the copy.


❗️

Ensure Detection Type and Labelmap consistency

If you include Annotations in your copy, you must ensure that the source and destination DatasetVersions have the same Detection Type and that the Labelmap of the source DatasetVersion is included in that of the destination.
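
The consistency rule above boils down to an equality test on the Detection Type and a subset test on the Labelmap. The Python sketch below illustrates that logic only. It does not call the Picsellia SDK: `copy_problems` is a hypothetical helper, and the Detection Type strings and label names are illustrative values.

```python
def copy_problems(source_type, source_labels, dest_type, dest_labels):
    """List the reasons why Annotations cannot be copied (empty if none)."""
    problems = []
    # Rule 1: both DatasetVersions must share the same Detection Type.
    if source_type != dest_type:
        problems.append(
            f"Detection Type mismatch: {source_type} vs {dest_type}"
        )
    # Rule 2: every source label must exist in the destination Labelmap.
    missing = sorted(set(source_labels) - set(dest_labels))
    if missing:
        problems.append(
            f"Labels missing from destination: {', '.join(missing)}"
        )
    return problems


# The destination may hold extra labels; only missing ones are a problem.
ok = copy_problems("OBJECT_DETECTION", ["car"],
                   "OBJECT_DETECTION", ["car", "person"])
ko = copy_problems("OBJECT_DETECTION", ["car", "bike"],
                   "OBJECT_DETECTION", ["car"])
```

Here `ok` is an empty list, while `ko` reports that the `bike` label is missing from the destination Labelmap.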