The Datalake is a space shared by all the members of an Organization. It gathers all the images (called Data) used in your Computer Vision projects.
The main purpose of the Datalake is to make all your Data available for visualization, structuring, and exploration.
First of all, it is important to note that an Organization can have several Datalakes. Each Datalake is connected through a Storage Connector to a bucket on an Object Storage (hosted by a Cloud provider), where the Data visualized on the Datalake is physically stored.
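The relationship between an Organization, its Datalakes, Storage Connectors, and buckets can be pictured as a small data model. This is only an illustrative sketch: the class names, field names, and example values below (`StorageConnector`, `Datalake`, `Organization`, `bucket_for`, the bucket and connector names) are hypothetical and are not part of the Picsellia SDK.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class StorageConnector:
    """Hypothetical link between a Datalake and a bucket on an Object Storage."""
    name: str
    bucket: str
    provider: str  # Cloud provider hosting the Object Storage


@dataclass(frozen=True)
class Datalake:
    """Hypothetical Datalake: its Data is physically stored in connector.bucket."""
    name: str
    connector: StorageConnector


@dataclass
class Organization:
    """Hypothetical Organization that owns one or more Datalakes."""
    name: str
    datalakes: List[Datalake] = field(default_factory=list)

    def bucket_for(self, datalake_name: str) -> str:
        """Return the bucket where the Data of the given Datalake is stored."""
        for datalake in self.datalakes:
            if datalake.name == datalake_name:
                return datalake.connector.bucket
        raise KeyError(f"No Datalake named {datalake_name!r}")


# Example values are made up for illustration only.
org = Organization(
    name="acme",
    datalakes=[
        Datalake("default", StorageConnector("picsellia-connector", "picsellia-bucket", "AWS")),
        Datalake("custom", StorageConnector("my-connector", "my-own-bucket", "my-cloud-provider")),
    ],
)
```

Each Datalake resolves to exactly one bucket through its connector, which is why switching Datalakes also changes where the Data you see is physically stored.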
When you create a new Picsellia Organization, a new dedicated bucket is created on the Picsellia Object Storage (hosted by AWS). The Datalake of the freshly created Organization is called Default and is connected to this bucket, so all Data uploaded to this Datalake will be physically stored on this Picsellia Object Storage.
However, you can also decide to create a new Datalake for your Organization and connect it to your own bucket hosted by your Cloud provider. To do so, please refer to this tutorial.
You can easily switch from one Datalake to another using the navigation bar as shown below:
Every machine learning project begins with data, and in our case of Computer Vision, it starts with images.
There are two ways to upload your Data using Picsellia:
- Import Data already stored on your own Cloud Object Storage to Picsellia's Datalake and access it through Picsellia.
- Upload locally stored Data directly to Picsellia. In this scenario, your Data will be physically stored by Picsellia on the bucket linked to the current Datalake.
Please note that, depending on the Datalake you are using, either one or both methods are available.
If you are using the Datalake called default, which was created for you when your Organization was initialized and is connected to the bucket created for you on the Picsellia Object Storage, you can only upload Data from your local drive. The uploaded Data will be physically stored on Picsellia's Object Storage and visualized on the default Datalake through the Storage Connector created by default for your Organization, named hinokuni-storage-production.
If you create a new Datalake following this tutorial, which will use a new Storage Connector linked to your own bucket, you will be able to either:
- Upload Data from a local drive as explained here. In this case, the uploaded Data will be physically stored by Picsellia on your bucket.
- Import Data already stored on your bucket as explained here. In this case, you will be able to visualize and explore Data already physically stored on your bucket through your Picsellia Datalake.
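The availability rule described above can be summarized in a few lines of code. This is a minimal sketch, not Picsellia code: the enum `UploadMethod`, the function `available_upload_methods`, and the flag `uses_own_bucket` are hypothetical names introduced only for illustration.

```python
from enum import Enum
from typing import Set


class UploadMethod(Enum):
    LOCAL_UPLOAD = "upload Data from a local drive"
    IMPORT_FROM_BUCKET = "import Data already stored on your bucket"


def available_upload_methods(uses_own_bucket: bool) -> Set[UploadMethod]:
    """Return the upload methods available for a Datalake.

    uses_own_bucket is False for the default Datalake (bucket on the
    Picsellia Object Storage) and True for a Datalake whose Storage
    Connector is linked to your own bucket.
    """
    # Uploading from a local drive is possible in both situations.
    methods = {UploadMethod.LOCAL_UPLOAD}
    if uses_own_bucket:
        # Importing existing Data requires a bucket that you own.
        methods.add(UploadMethod.IMPORT_FROM_BUCKET)
    return methods
```

Calling `available_upload_methods(False)` models the default Datalake, while `available_upload_methods(True)` models a Datalake connected to your own bucket.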
All users with Admin rights in a given Organization can access the Organization Settings, in particular the Storages and Datalakes tab. From there, you can manage the existing Datalakes and Storage Connectors. More details are available in this tutorial.