7. Inherit a new DatasetVersion

Objectives:

  • Easily create a DatasetVersion
  • Transfer Annotation from one DatasetVersion to another in a customized way

1. Fork a new DatasetVersion from an existing one

It is highly recommended to use the Dataset versioning system to keep track of the work done on a dataset. To create a new DatasetVersion from an existing one, select all or some of the Asset in the initial DatasetVersion and click Dataset > Create new version. This initializes a new DatasetVersion containing the selected Asset. Depending on your needs, you can choose which elements of the initial DatasetVersion to carry over into the new one: AssetTag, Label, and Annotation.

To fork a DatasetVersion with the SDK:
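A minimal sketch of such a fork, assuming the `picsellia` package exposes `Client`, `get_dataset`, `get_version`, and a `DatasetVersion.fork` method that accepts the keyword arguments shown. The token, organization, dataset, and version names are placeholders, and `fork_version` and `load_source_version` are hypothetical helpers, so check the signatures against the SDK reference for your installed version.

```python
def fork_version(source_version, new_name, assets=None,
                 with_tags=True, with_labels=True, with_annotations=True):
    """Fork `source_version` into a new DatasetVersion called `new_name`.

    Assumption: `source_version` exposes fork(...) with these keyword
    arguments, as the Picsellia DatasetVersion object is expected to.
    """
    return source_version.fork(
        version=new_name,
        description=f"Forked as {new_name}",
        assets=assets,  # None is assumed to mean "all Asset"
        with_tags=with_tags,
        with_labels=with_labels,
        with_annotations=with_annotations,
    )


def load_source_version():
    """Fetch the DatasetVersion to fork (every name here is a placeholder)."""
    from picsellia import Client  # lazy import; needs `pip install picsellia`

    client = Client(api_token="YOUR_API_TOKEN",
                    organization_name="YOUR_ORGANIZATION")
    return client.get_dataset("my-dataset").get_version("initial")
```

With valid credentials, `fork_version(load_source_version(), "cleaned", with_annotations=False)` would fork the Asset, AssetTag, and Label while leaving the Annotation behind, which is the starting point for the customized transfer described in the next part.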

2. Transfer Label & Annotation from one version to another

If you want to import the Annotation while customizing the Label or Annotation being forked, first fork your DatasetVersion without copying Label or Annotation. Once the new DatasetVersion is created, go to its Settings tab. There, you can import all or a subset of the Label from any other existing DatasetVersion. After the import, you can still add extra Label.

To create a brand-new Label, select Add new label.

🚧

Importing Label from another DatasetVersion is independent of the Detection Type.

This means that you can, for instance, import the Label names from a classification DatasetVersion into an object detection DatasetVersion. For Annotation import, however, the source and destination DatasetVersion must have the same Detection Type.

Now that the new Label are set, you can customize the import of Annotation from another DatasetVersion. To do so, go to the Settings > Annotations section of the new DatasetVersion. Select the DatasetVersion from which you want to copy the Annotation, then choose, Label by Label, which Annotation to inherit from the source version. This lets you copy Annotation even if the Label name has changed, and merge the Annotation of two or more Label into one.
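The Label-by-Label transfer can be pictured with a small, self-contained model. This is an illustrative sketch only: `Version`, `import_labels`, and `transfer_annotations` are hypothetical names, not the Picsellia SDK, and an Annotation is reduced to an (asset, label) pair. It shows a label map that renames one Label and merges two others, and the rule that only Annotation transfer requires matching Detection Types.

```python
from dataclasses import dataclass, field


@dataclass
class Version:
    """Toy stand-in for a DatasetVersion (not the Picsellia class)."""
    detection_type: str
    labels: set = field(default_factory=set)
    annotations: list = field(default_factory=list)  # (asset, label) pairs


def import_labels(source, destination, names=None):
    """Copy label names; the Detection Type is deliberately not checked."""
    wanted = source.labels if names is None else source.labels & set(names)
    destination.labels |= wanted


def transfer_annotations(source, destination, label_map):
    """Copy annotations whose source label appears in `label_map`.

    `label_map` maps a source label name to a destination label name, so it
    can rename a label or merge several source labels into one.
    """
    if source.detection_type != destination.detection_type:
        raise ValueError("source and destination must share a Detection Type")
    unknown = set(label_map.values()) - destination.labels
    if unknown:
        raise ValueError(f"labels missing in destination: {sorted(unknown)}")
    copied = [(asset, label_map[label])
              for asset, label in source.annotations if label in label_map]
    destination.annotations.extend(copied)
    return len(copied)


v1 = Version("object_detection", {"car", "truck", "person", "bike"},
             [("img1", "car"), ("img1", "person"),
              ("img2", "truck"), ("img3", "bike")])
v2 = Version("object_detection", {"vehicle", "pedestrian"})

# Merge "car" and "truck" into "vehicle", rename "person"; "bike" is skipped.
copied = transfer_annotations(
    v1, v2, {"car": "vehicle", "truck": "vehicle", "person": "pedestrian"})
```

Keeping the mapping explicit means any source Label left out of it is simply not inherited, which mirrors selecting Annotation one Label at a time in the Settings > Annotations view.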

👍

Overall, these Dataset management tools aim to help data scientists build the perfect DatasetVersion for their future Model training collaboratively, efficiently, and with traceability!

You can also create a new DatasetVersion at any time using the Python SDK: