
Metrics

Metrics define how model performance is measured and evaluated.

Custom Logic

Define custom evaluation functions that match your specific business requirements.

Flexible Inputs

Accept any input format and compare against expected outputs flexibly.

Aggregation Support

Aggregate individual scores across datasets for comprehensive evaluation.

Optimization Ready

Use metrics directly with Tune for automatic prompt optimization.

Creating Metrics

Choose from four available options when creating metrics:

Auto

How to use: Select fields from your dataset schema to create exact match metrics; a rough Python equivalent is sketched after the requirements below.

Required:
  • A dataset with a defined schema containing the fields you want to compare
  • Select the specific fields by clicking the “Select” button next to each field name
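For intuition, an exact match metric built this way scores 1.0 when the selected field in the model output equals the corresponding field in the expected output, and 0.0 otherwise. The sketch below assumes dictionary inputs and a hypothetical 'answer' field:

  # Illustrative only: what an exact match metric computes for one selected field
  def exact_match(output, expected):
      # Score 0.0 when either side is missing
      if output is None or expected is None:
          return 0.0
      return 1.0 if output.get("answer") == expected.get("answer") else 0.0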

Code

How to use: Write Python code using the provided template; a minimal sketch follows the requirements below.

Required:
  • Define a function called metric_func(output, expected) that returns a float value (typically 0.0 or 1.0)
  • Replace the 'field_name' placeholders with your actual field names
  • The function must handle None/missing values and return appropriate scores
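A minimal sketch of such a function, assuming dictionary inputs and a hypothetical 'answer' field standing in for 'field_name':

  def metric_func(output, expected):
      # Handle missing inputs, as required above
      if output is None or expected is None:
          return 0.0
      # 'answer' is a placeholder; substitute your actual field name
      model_value = output.get("answer")
      expected_value = expected.get("answer")
      if model_value is None or expected_value is None:
          return 0.0
      # Case-insensitive exact match; adjust the comparison to your business logic
      return 1.0 if str(model_value).strip().lower() == str(expected_value).strip().lower() else 0.0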

Existing

How to use: Select from previously created metrics in your project.

Required:
  • At least one metric must already exist in your project
  • Select the desired metric from the list by checking the checkbox

LLM

How to use: Enter evaluation criteria and instructions for an LLM judge; an example prompt follows the requirements below.

Required:
  • Write the evaluation criteria and instructions in the text area
  • Your prompt must instruct the LLM to return either ‘true’ or ‘false’ in its response
  • The LLM judge receives both the model output and the expected output for comparison
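For example, the evaluation criteria might read as follows (illustrative wording, not a required template):

  Compare the model output to the expected output.
  Return 'true' if the model output conveys the same answer as the expected output,
  even if the wording differs. Otherwise return 'false'.
  Respond with only 'true' or 'false'.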

Optimize Prompts

Let Tune automatically improve prompts based on your metrics.