2. Evaluate a Classification Model

1. Context

A. Pre-requisite

A Project on Picsellia that hosts your Experiment and DatasetVersion.
An Experiment with a DatasetVersion attached to it. You can give this DatasetVersion the alias test.
A local classification model or a Classification ModelVersion.

📘
If you want to integrate your local custom classification model into Picsellia, you can check out this tutorial 👉 Migrate your Models to Picsellia

B. Variables

Let's say that:

The Project is called Documentation Project
The Experiment is called my_experiment
The DatasetVersion attached to it has the alias test

C. Setup

You need a prediction function, including post-processing, that returns the predicted class_name and its confidence score:


def predict(image, model):
    # `model` is your TensorFlow or PyTorch model.
    # pre_process and post_process are your own helpers.
    preprocessed_input = pre_process(image)
    prediction = model(preprocessed_input)

    # post_process turns the raw output into (class name, confidence score)
    class_name, confidence_score = post_process(prediction)
    return class_name, confidence_score
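
The post_process helper depends on your model, so it is not shown here. As a minimal sketch, assuming the model output is a flat list of raw logits and that class_names (a hypothetical list) gives the class names in the same order as the model's outputs, it could look like this:

```python
import math


def post_process(logits, class_names):
    """Turn raw logits into (class_name, confidence_score)."""
    # Numerically stable softmax: subtract the max before exponentiating.
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # The predicted class is the one with the highest probability.
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs[best]


class_name, confidence_score = post_process([0.0, 2.0], ["cat", "dog"])
```

If your model already outputs probabilities, the softmax step can be dropped and only the argmax is needed.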

You should also create a script that initializes the Picsellia Client connection and fetches your Project, Experiment, and DatasetVersion.

from picsellia import Client 

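# api_token and organization_name are your own credentials; keyword arguments avoid any mix-up in positional order
client = Client(api_token=api_token, organization_name=organization_name, host='https://app.picsellia.com')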

project = client.get_project(name='Documentation Project')
experiment = project.get_experiment(name='my_experiment')

testing_dataset = experiment.get_dataset('test')

We also need a dictionary that maps each class name to the corresponding Picsellia Label object, so that the right Label is attached to each evaluation. Something like this:

{
  "cat": PicselliaLabel(Python Object),
  "dog": PicselliaLabel(Python Object)
}

picsellia_labels_name = testing_dataset.list_labels()

label_matching = {k.name: k for k in picsellia_labels_name}
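
A class name that the model can predict but that is missing from label_matching would only surface as a KeyError in the middle of the evaluation loop. The sketch below checks this up front. FakeLabel is a hypothetical stand-in used so the example runs without a Picsellia connection (real Label objects come from testing_dataset.list_labels()), and missing_class_names is an illustrative helper, not part of the SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeLabel:
    """Hypothetical stand-in for a Picsellia Label; only `name` is used here."""
    name: str


def build_label_matching(labels):
    """Map each label name to its label object."""
    return {label.name: label for label in labels}


def missing_class_names(model_class_names, label_matching):
    """Return the model class names that have no matching label."""
    return [name for name in model_class_names if name not in label_matching]


label_matching = build_label_matching([FakeLabel("cat"), FakeLabel("dog")])
print(missing_class_names(["cat", "dog", "bird"], label_matching))  # ['bird']
```

An empty list means every class the model can output has a Label to attach.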

2. Implementing the Model Testing

Let's take a look at the Experiment add_evaluation() method:

add_evaluation(
    asset: Asset,
    add_type: Union[str, AddEvaluationType] = AddEvaluationType.REPLACE,
    rectangles: Optional[List[Tuple[int, int, int, int, Label, float]]] = None,
    polygons: Optional[List[Tuple[List[List[int]], Label, float]]] = None,
    classifications: Optional[List[Tuple[Label, float]]] = None
)

Let's dive into 3 of the arguments:

asset: the Asset being evaluated. An Asset can have only one evaluation.
add_type: an enum with two possible values, KEEP and REPLACE. The default is REPLACE. KEEP keeps the existing evaluation if there is one.
classifications: a list of tuples, each tuple being (Label, confidence_score).
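
To make the shape of the classifications argument concrete, here is a small sketch that builds it for one asset. The to_classifications helper is illustrative and not part of the Picsellia SDK, the strings in label_matching are stand-ins for real Label objects, and the range check assumes your confidence scores are probabilities in [0, 1].

```python
from typing import Any, Dict, List, Tuple


def to_classifications(
    class_name: str, confidence_score: float, label_matching: Dict[str, Any]
) -> List[Tuple[Any, float]]:
    """Build a one-element list of (Label, confidence_score) tuples."""
    if class_name not in label_matching:
        raise KeyError(f"No Label found for class name {class_name!r}")
    # Assumption: confidence scores are probabilities.
    if not 0.0 <= confidence_score <= 1.0:
        raise ValueError(f"Confidence out of range: {confidence_score}")
    # Cast to a plain float in case the score is a NumPy or tensor scalar.
    return [(label_matching[class_name], float(confidence_score))]


# Stand-in values; in practice these are Label objects from list_labels().
label_matching = {"cat": "<Label cat>", "dog": "<Label dog>"}
classifications = to_classifications("dog", 0.92, label_matching)
```

The resulting list can then be passed as experiment.add_evaluation(asset, classifications=classifications).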

Let's put everything together with a zero-shot classifier from OpenAI (CLIP), available on HuggingFace. Here is the snippet from HuggingFace:

from PIL import Image
import requests

from transformers import CLIPProcessor, CLIPModel

model = CLIPModel.from_pretrained("openai/clip-vit-large-patch14")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-large-patch14")

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

inputs = processor(text=["a photo of a cat", "a photo of a dog"], images=image, return_tensors="pt", padding=True)

outputs = model(**inputs)
logits_per_image = outputs.logits_per_image # this is the image-text similarity score
probs = logits_per_image.softmax(dim=1) # we can take the softmax to get the label probabilities
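
In this snippet, probs has one row per image and one column per text prompt, in the order the prompts were passed to the processor. The sketch below shows one way to build the prompts from class names and map the best column back to a class name. PROMPT_TEMPLATE, build_prompts and pick_class are illustrative names, not part of transformers, and the row of probabilities is passed as a plain list (for example probs[0].tolist()).

```python
PROMPT_TEMPLATE = "a photo of a {}"  # assumption: same wording as the prompts above


def build_prompts(class_names):
    """Create one text prompt per class name, keeping the input order."""
    return [PROMPT_TEMPLATE.format(name) for name in class_names]


def pick_class(image_probs, class_names):
    """Return (class_name, probability) for the highest-scoring prompt."""
    if len(image_probs) != len(class_names):
        raise ValueError("Expected one probability per class name")
    best = max(range(len(class_names)), key=lambda i: image_probs[i])
    return class_names[best], float(image_probs[best])


class_names = ["cat", "dog"]
prompts = build_prompts(class_names)  # pass these as `text=` to the processor
class_name, confidence_score = pick_class([0.9, 0.1], class_names)
```

Keeping class_names as an ordered list is what makes the column index usable as a lookup back to the class name.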

Let's adapt this snippet so that it runs on every Asset of the DatasetVersion and sends the predictions to Picsellia:

from PIL import Image
import requests
import numpy as np
from transformers import CLIPProcessor, CLIPModel
from picsellia import Client
from picsellia.sdk.asset import Asset
from picsellia.types.enums import InferenceType

client = Client(api_token="", organization_name="")  # fill in your credentials
project = client.get_project(name='Documentation Project')
experiment = project.get_experiment(name='my_experiment')
testing_dataset = experiment.get_dataset('test')
picsellia_labels_name = testing_dataset.list_labels()
label_matching = {k.name: k for k in picsellia_labels_name}
# Fix the label order once so that prompt indices map back to label names
labels_list = list(label_matching.keys())


model = CLIPModel.from_pretrained("openai/clip-vit-large-patch14")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-large-patch14")

for asset in testing_dataset.list_assets():
    image = Image.open(requests.get(asset.url, stream=True).raw)
    inputs = processor(text=labels_list, images=image, return_tensors="pt", padding=True)
    outputs = model(**inputs)
    logits_per_image = outputs.logits_per_image # this is the image-text similarity score
    probs = logits_per_image.softmax(dim=1).detach().numpy() # we can take the softmax to get the label probabilities
    class_name = labels_list[int(np.argmax(probs))]
    # One (Label, confidence_score) tuple for this asset
    experiment.add_evaluation(asset, classifications=[(label_matching[class_name], float(np.max(probs)))])

# Compute the metrics once every asset has been evaluated
experiment.compute_evaluations_metrics(inference_type=InferenceType.CLASSIFICATION)