Datalake - Set up your own Metadata

We created a system of Metadata to enhance your Data management. Here is how to use it.

For each Data you push to Picsellia, you can attach additional information that is kept throughout your pipelines.

Here are the Metadata fields that can be stored in Picsellia, with the type expected for each:

latitude: float
longitude: float
altitude: float
acquired_at: datetime
acquired_by: string
weather: string
resolution_x: float
resolution_y: float
resolution_unit: string
compression: string
manufacturer: string
software: string
color_space: string
custom_id: string
reference: string
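
As a minimal sketch, the field list above can be turned into a local check that runs before any upload. The `FIELD_TYPES` table and the `validate_metadata` helper are hypothetical names used for illustration and are not part of the Picsellia SDK. Accepting an `int` where a `float` is expected is also an assumption.

```python
from datetime import datetime

# Field names and types copied from the list above.
FIELD_TYPES = {
    "latitude": float,
    "longitude": float,
    "altitude": float,
    "acquired_at": datetime,
    "acquired_by": str,
    "weather": str,
    "resolution_x": float,
    "resolution_y": float,
    "resolution_unit": str,
    "compression": str,
    "manufacturer": str,
    "software": str,
    "color_space": str,
    "custom_id": str,
    "reference": str,
}


def validate_metadata(metadata: dict) -> list[str]:
    """Hypothetical helper: return a list of problems, empty if none were found."""
    problems = []
    for key, value in metadata.items():
        expected = FIELD_TYPES.get(key)
        if expected is None:
            problems.append(f"unknown field: {key}")
            continue
        # Assumption: plain integers are acceptable where a float is expected.
        if expected is float and isinstance(value, int) and not isinstance(value, bool):
            continue
        if not isinstance(value, expected):
            problems.append(
                f"{key}: expected {expected.__name__}, got {type(value).__name__}"
            )
    return problems
```

Calling `validate_metadata({"latitude": "north"})` reports a type problem for `latitude`, while a dict that only uses listed fields with matching types yields an empty list.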

1. Uploading Metadata

Metadata can only be uploaded through the SDK.

🚧

This feature is available from SDK 6.8.0 onward.
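
A minimal sketch of a version guard, for scripts that should fail early on an older SDK. The helpers `parse_version` and `supports` are hypothetical, and the sketch assumes plain dotted numeric versions such as `6.8.0` (no pre-release suffixes).

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Turn '6.8.0' into (6, 8, 0); assumes a plain dotted numeric version."""
    return tuple(int(part) for part in version.split("."))


def supports(installed: str, minimum: str) -> bool:
    """Return True when the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


# Possible usage, assuming the SDK is installed under the distribution
# name "picsellia":
#
#   from importlib.metadata import version
#   if not supports(version("picsellia"), "6.8.0"):
#       raise RuntimeError("Metadata upload needs SDK 6.8.0 or later")
```

Comparing integer tuples rather than raw strings avoids ranking `6.10.0` below `6.9.0`.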

To upload a single Data with additional information, for example a business reference or an acquisition date that differs from the upload time, pass a metadata dictionary to the SDK:

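from datetime import datetime

# `client` is assumed to be an already authenticated Picsellia client.
datalake = client.get_datalake()
data = datalake.upload_data(
    filepaths="./files/1.jpg",
    metadata={
        "reference": "7NGOMI",
        "acquired_by": "USER-178",
        "acquired_at": datetime.utcnow(),
    },
)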

When sending multiple Data, pass two lists ("filepaths" and "metadata") of exactly the same length, with one metadata dictionary per file:

datalake = client.get_datalake()
data = datalake.upload_data(
    filepaths=[
        "./files/1.jpg",
        "./files/2.jpg",
        "./files/3.jpg",
    ],
    metadata=[
        {
            "latitude": 43.6027394,
            "longitude": 1.4540158,
        },
        {
            "latitude": 43.601717,
            "longitude": 1.456135,
        },
        {
            "latitude": 43.6013007,
            "longitude": 1.4560448,
        },
    ],
)
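
When the Metadata is kept per file, for instance in a dictionary keyed by path, the two lists can be derived from it so that they cannot drift out of sync. This is a sketch only: `build_upload_args` is a hypothetical helper, not an SDK function, and it assumes each metadata entry is matched to the file at the same position.

```python
def build_upload_args(records: dict[str, dict]) -> tuple[list[str], list[dict]]:
    """Hypothetical helper: split {filepath: metadata} into two aligned lists."""
    filepaths = list(records.keys())
    # Build metadata in the same order as filepaths so indices line up.
    metadata = [records[path] for path in filepaths]
    return filepaths, metadata


records = {
    "./files/1.jpg": {"latitude": 43.6027394, "longitude": 1.4540158},
    "./files/2.jpg": {"latitude": 43.601717, "longitude": 1.456135},
    "./files/3.jpg": {"latitude": 43.6013007, "longitude": 1.4560448},
}

filepaths, metadata = build_upload_args(records)

# The two lists always have the same length by construction.
assert len(filepaths) == len(metadata)

# They can then be passed to the call shown above:
#   datalake.upload_data(filepaths=filepaths, metadata=metadata)
```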

2. Fill Metadata from EXIF Tags

🚧

This feature is available from SDK 6.9.0 onward.

To upload Data and automatically copy its EXIF tags into the Picsellia Metadata system, pass fill_metadata=True when calling upload_data:

datalake = client.get_datalake()
data = datalake.upload_data(
  filepaths="./files/1.jpg",
  fill_metadata=True
)

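These Metadata fields are filled automatically from the image's EXIF tags, which are read with the PIL library.

The snippet below shows which EXIF tag feeds each Metadata field. You can run it on a local image to preview the values that will be attached to your Data: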

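from PIL import Image

image = Image.open("./files/1.jpg")
exif_data = image.getexif()
acquired_at = exif_data.get(0x0132)      # DateTime
acquired_by = exif_data.get(0x013B)      # Artist
resolution_x = exif_data.get(0x011A)     # XResolution
resolution_y = exif_data.get(0x011B)     # YResolution
resolution_unit = exif_data.get(0x0128)  # ResolutionUnit
compression = exif_data.get(0x0103)      # Compression
manufacturer = exif_data.get(0x010F)     # Make
software = exif_data.get(0x0131)         # Software
color_space = exif_data.get(0xA001)      # ColorSpace (Exif sub-IFD tag; may be None here)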
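
The tag lookups above can also be gathered into a single dictionary shaped like the `metadata` argument of `upload_data`. The sketch below is illustrative only: `TAG_TO_FIELD` and `exif_to_metadata` are hypothetical names, and whether the SDK converts values in the same way is an assumption. The one conversion shown relies on the standard EXIF date format, `YYYY:MM:DD HH:MM:SS`.

```python
from datetime import datetime

# EXIF tag IDs from the snippet above, keyed to the Metadata field they fill.
TAG_TO_FIELD = {
    0x0132: "acquired_at",
    0x013B: "acquired_by",
    0x011A: "resolution_x",
    0x011B: "resolution_y",
    0x0128: "resolution_unit",
    0x0103: "compression",
    0x010F: "manufacturer",
    0x0131: "software",
    0xA001: "color_space",
}


def exif_to_metadata(exif) -> dict:
    """Hypothetical helper: map a tag-to-value mapping onto Metadata field names."""
    metadata = {}
    for tag, field in TAG_TO_FIELD.items():
        value = exif.get(tag)
        if value is None:
            continue  # Tag absent from this image: leave the field unset.
        if field == "acquired_at":
            # EXIF stores dates as "YYYY:MM:DD HH:MM:SS" strings.
            value = datetime.strptime(value, "%Y:%m:%d %H:%M:%S")
        metadata[field] = value
    return metadata
```

Any object with a `.get` method works as input, so a plain dict or the result of `image.getexif()` can be passed directly.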