⬆️ Datalake - Import Data from local drive

1. Accessing the Datalake

To access the proper Organization's Datalake, ensure that you're within the right Organization. Each Organization has its own independent Datalake.

After accessing the correct Organization, you can reach its Datalake through the Navigation Bar as illustrated below:

Access Datalake


You are now viewing the Datalake of your Organization.

Now, it's time to upload your Data to Picsellia!

2. Uploading your Data

Every machine learning project begins with data, and in Computer Vision that data is images.

There are two ways to access your Data using Picsellia:

  1. Connect your remote storage to Picsellia's Datalake and access your Data through Picsellia.
  2. Upload locally stored Data directly to Picsellia. In this scenario, your Data will be stored in Picsellia's AWS-operated storage.

This page focuses on the second approach: the direct upload of Data to Picsellia's Datalake.

To proceed, simply click on the Upload data button.

Upload Data to Datalake

Upload Data

🚧

Sample Data

If you're new to Picsellia, your Datalake may either be empty or contain 150 sample Data.

A modal will appear. From here, click on Select files, which will open your OS's file browser, allowing you to choose the images you wish to import into your Picsellia Datalake.

Once you've selected files from your local device, you can create or select the appropriate DataTag to associate with the batch of Data you're about to upload to your Datalake. To learn more about tags and their usage, refer to this documentation page.

After selecting the necessary files and DataTags, click on Upload.

Select and tag to-be-uploaded `Data`

Data upload modal

A progress bar is displayed during the upload. Large uploads can take a while, so for large volumes of Data, consider uploading with our SDK instead.
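For the SDK route mentioned above, the following is a minimal sketch of uploading a large file list in batches. It assumes the Datalake object returned by the Picsellia Python SDK exposes an `upload_data(filepaths=..., tags=...)` method. That signature, the `upload_in_batches` helper, and the default batch size of 100 are assumptions for illustration and should be checked against the SDK reference before use.

```python
from pathlib import Path
from typing import Iterator, List, Sequence


def batched(paths: Sequence[Path], batch_size: int) -> Iterator[List[Path]]:
    """Yield consecutive slices of `paths`, each holding at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(paths), batch_size):
        yield list(paths[start:start + batch_size])


def upload_in_batches(
    datalake,
    paths: Sequence[Path],
    tag_names: List[str],
    batch_size: int = 100,
) -> int:
    """Upload `paths` batch by batch and return the number of files submitted.

    ASSUMPTION: `datalake` exposes `upload_data(filepaths=..., tags=...)`.
    Verify the method name and parameters against the SDK reference.
    """
    submitted = 0
    for batch in batched(paths, batch_size):
        datalake.upload_data(filepaths=[str(p) for p in batch], tags=tag_names)
        submitted += len(batch)
    return submitted
```

In practice, `datalake` would be obtained from an authenticated SDK client for the right Organization. Because the helper only relies on the `upload_data` call, it can be exercised with a stand-in object before pointing it at a real Datalake.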

📘

Supported Image Formats

Currently, the following image formats are supported:

  • .png
  • .jpg
  • .jpeg

If you require support for additional formats, please don't hesitate to contact us.
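Before selecting files, it can help to check which files in a local folder match the supported formats listed above. The sketch below uses only the Python standard library. The `find_supported_images` helper is hypothetical, and matching extensions case-insensitively (so that `.JPG` is kept) is an assumption about what the platform accepts.

```python
from pathlib import Path
from typing import List

# Extensions listed as supported on this page.
SUPPORTED_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def find_supported_images(folder: Path) -> List[Path]:
    """Return files under `folder` (recursively) with a supported extension, sorted by path."""
    return sorted(
        path
        for path in folder.rglob("*")
        # ASSUMPTION: extensions are compared case-insensitively.
        if path.is_file() and path.suffix.lower() in SUPPORTED_EXTENSIONS
    )
```

Files with any other extension are skipped, which makes it easy to spot content that would need a format conversion first.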

Once the upload is complete, you can see the total number of Data in your Datalake and sort it based on various criteria (updated date, filename, or random), as depicted below:

Visualize Data in the Datalake

Visualize and sort uploaded Data

🚧

What about Annotations?

Annotations are not created in the Datalake: labeling occurs at the DatasetVersion level, not at the Datalake level.

3. Understanding Data in Picsellia

Data refers to an image that is part of your Datalake. Any image used within any Picsellia feature is always associated with Data stored in the Datalake. For example, the Assets composing a Dataset and the Predictions made within a Deployment are always linked to Data stored in the Datalake.

The Data in Picsellia's Datalake includes both the image and its linked Metadata. This Metadata can be viewed using the Table button, which opens the Table view.

Switch to Table view in the Datalake

Switch from Grid view to Table view

The Table view provides access to the following Metadata:

  • preview: thumbnail of your image
  • filename: complete filename of your Data
  • shape: dimensions in pixels of your Data
  • source: source of your Data (e.g., direct_upload, AWS_bucket, serving, ...)
  • tags: DataTags associated with the Data (more about tags here)
  • created at: creation date of the Data on Picsellia
  • created by: Picsellia user who created the Data
  • Personalized Metadata

Table view of Data in the Datalake

Table view
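To make the Metadata fields listed above concrete, here is a minimal sketch of how one Table view row could be mirrored in local Python code. `DataRecord` and `with_tag` are hypothetical names, not SDK classes. The `(width, height)` ordering of `shape` and the string-to-string form of personalized Metadata are assumptions, and the preview thumbnail is omitted.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List, Tuple


@dataclass
class DataRecord:
    """Hypothetical local mirror of one Table view row; not an SDK class."""

    filename: str
    shape: Tuple[int, int]  # ASSUMPTION: (width, height) in pixels
    source: str  # e.g. "direct_upload"
    tags: List[str] = field(default_factory=list)
    created_at: datetime = field(default_factory=datetime.now)
    created_by: str = ""
    custom_metadata: Dict[str, str] = field(default_factory=dict)


def with_tag(records: List[DataRecord], tag: str) -> List[DataRecord]:
    """Keep the records carrying `tag`, ordered by filename."""
    return sorted((r for r in records if tag in r.tags), key=lambda r: r.filename)
```

For example, `with_tag(records, "train")` returns only the records tagged `train`, sorted by filename, similar to combining a tag filter with the filename sort in the Datalake.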

All Metadata can be customized and personalized using the SDK. However, because Picsellia relies on this Metadata to handle images, modifying it may lead to unexpected behavior.

📘

Searching for Data

The search bar is the most effective way to leverage Metadata and navigate through your Datalake. More information is available here.