5. Create your first Dataset

Objectives:

  • Create your first Dataset & DatasetVersion
  • Use the Dataset versioning system

1. It is now time to create your first DatasetVersion.

Now that your Data is uploaded to Picsellia, you can create your first Dataset, which can then be used, for instance, to train a Model. To do so, start from your Datalake.

From your Datalake, use the Search Bar to find the images to include in your DatasetVersion, then select the subset of images that should be part of it. For example, to create a DatasetVersion with the images uploaded with the DataTag smart_city, search for those images using the Search Bar, select all or part of them (Select all or Select subset buttons), and initiate the DatasetVersion by clicking on Dataset.

For traceability, and to make your DatasetVersion easier to manage, you will be asked to provide a title and a description.

You can now access the newly created DatasetVersion from the Datasets view. Picsellia's Dataset versioning system allows you to create as many versions of a given Dataset as you want; each version of a Dataset is called a DatasetVersion. We believe it is crucial to keep the history of a Dataset for a given project, because finding the right, balanced Dataset requires work and repeated modifications to the Data.

You can select the Dataset you're interested in from the Datasets view and display all its existing DatasetVersions (at this step, only one DatasetVersion should be available).

Click on the DatasetVersion to see all the Assets that make up this first version of your Dataset.

At this step, the DatasetVersion should not contain any Annotation.
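To make the relationship between Datalake, Dataset, DatasetVersion and Asset more concrete, here is a minimal conceptual sketch in plain Python. It is not the Picsellia SDK: every class and function below (Data, Asset, DatasetVersion, Dataset, search) and the sample filenames are hypothetical stand-ins written only for illustration, assuming that a DatasetVersion is a named selection of Datalake images that starts without annotations.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Data:
    """One image stored in the Datalake (hypothetical stand-in)."""
    filename: str
    tags: frozenset = frozenset()


@dataclass
class Asset:
    """A Data item once it belongs to a DatasetVersion."""
    data: Data
    annotations: list = field(default_factory=list)  # empty until annotated


@dataclass
class DatasetVersion:
    """A named, described selection of Assets."""
    name: str
    description: str
    assets: list


@dataclass
class Dataset:
    """A Dataset keeps every version created for it, indexed by name."""
    name: str
    versions: dict = field(default_factory=dict)

    def create_version(self, name, description, selection):
        """Register the selected Data as a new version of this Dataset."""
        if name in self.versions:
            raise ValueError(f"version {name!r} already exists")
        version = DatasetVersion(name, description, [Asset(d) for d in selection])
        self.versions[name] = version
        return version


def search(datalake, tag):
    """Mimic the Search Bar: keep only the Data carrying the given tag."""
    return [d for d in datalake if tag in d.tags]


# Hypothetical Datalake content, for illustration only.
datalake = [
    Data("street_01.jpg", frozenset({"smart_city"})),
    Data("street_02.jpg", frozenset({"smart_city"})),
    Data("factory_01.jpg", frozenset({"industry"})),
]

selection = search(datalake, "smart_city")
dataset = Dataset("smart_city_dataset")
first = dataset.create_version("first", "All images tagged smart_city", selection)
```

In this sketch, creating a second version later would add another entry to dataset.versions and leave the first one untouched, which is the idea behind keeping the history of a Dataset.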

📘

Each image in a DatasetVersion is called an Asset on Picsellia.

To do it with the SDK: