Simple Prompt Change Experiment
Using Arize with Experiments
This guide demonstrates how to use Arize for logging and analyzing prompt iteration experiments with your LLM. We're going to build a simple prompt experimentation pipeline for a haiku generator. In this tutorial, you will:
- Set up an Arize dataset
- Implement a script that generates LLM outputs
- Set up a function to evaluate the output using an LLM
- Log the data in Arize to compare results across prompts
ℹ️ This notebook requires:
- An OpenAI API key
- An Arize Space ID & Developer Key (explained below)
Setup Config
Copy the Arize developer API Key and Space ID from the Datasets page (shown below) to the variables in the cell below.
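If you would rather not paste secrets directly into the notebook, one option is to read them from environment variables. The sketch below assumes hypothetical variable names (`ARIZE_SPACE_ID`, `ARIZE_DEVELOPER_KEY`, `OPENAI_API_KEY`) and a hypothetical helper `load_config`; none of these names are required by Arize or OpenAI.

```python
import os
from typing import Mapping

# Hypothetical environment variable names; rename to match your own setup.
REQUIRED_KEYS = ("ARIZE_SPACE_ID", "ARIZE_DEVELOPER_KEY", "OPENAI_API_KEY")


def load_config(env: Mapping[str, str] = os.environ) -> dict:
    """Return the required credentials, failing early if any are unset."""
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise KeyError(f"Missing configuration values: {', '.join(missing)}")
    return {key: env[key] for key in REQUIRED_KEYS}
```

Calling `load_config()` at the top of the notebook surfaces a missing key immediately instead of partway through an experiment run.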

Upload Dataset
Below, we'll create a dataframe of example data points to use in your experiments.
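As a minimal sketch of what those data points might look like for a haiku generator, the block below builds one record per topic. The topics, the `topic` column name, and the `build_examples` helper are illustrative assumptions, not the notebook's actual data.

```python
# Hypothetical haiku topics; swap in whatever subjects you want to test.
TOPICS = ["autumn rain", "morning coffee", "city lights", "quiet snow"]


def build_examples(topics):
    """Turn a list of topics into one record per dataset example."""
    return [{"topic": topic} for topic in topics]


rows = build_examples(TOPICS)
# pandas.DataFrame(rows) would turn these records into a one-column dataframe.
```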
Notebooks already run an event loop, so let's make sure we can run async code inside one.
Define Task
A task is a callable that maps the input of a dataset example to an output by invoking a chain, query engine, or LLM.
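To make the idea concrete, here is a minimal sketch of a task for the haiku generator. It assumes each dataset example is a mapping with a `topic` field and that the LLM is wrapped in a plain `prompt -> text` callable; `PROMPT_TEMPLATE`, `make_task`, and `fake_llm` are hypothetical names, and `fake_llm` stands in for a real model call so the sketch runs offline.

```python
from typing import Callable, Mapping

# Hypothetical prompt template; this is the part you would vary between experiments.
PROMPT_TEMPLATE = "Write a haiku about {topic}."


def make_task(llm: Callable[[str], str], template: str = PROMPT_TEMPLATE):
    """Build a task that maps one dataset example to an LLM output."""

    def task(dataset_row: Mapping[str, str]) -> str:
        prompt = template.format(topic=dataset_row["topic"])
        return llm(prompt)

    return task


def fake_llm(prompt: str) -> str:
    """Stand-in for a real LLM call; echoes the prompt it received."""
    return f"[haiku for: {prompt}]"


task = make_task(fake_llm)
print(task({"topic": "autumn rain"}))
```

Keeping the template as a parameter means a prompt change becomes a one-argument difference between two otherwise identical tasks.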
Define Evaluators
An evaluator grades the task outputs. Here, the function tone_eval uses an LLM to determine the tone of each output.
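The notebook's own `tone_eval` is not reproduced here, so the following is only a sketch of how an LLM-graded tone evaluator could be written. The label set, the prompt wording, the return shape, and the `judge` callable (a `prompt -> text` wrapper around whichever model you use) are all assumptions.

```python
from typing import Callable

# Assumed label set; the original evaluator may use different categories.
TONE_LABELS = ("positive", "neutral", "negative")

TONE_PROMPT = (
    "Classify the tone of the following haiku as positive, neutral, or negative. "
    "Answer with a single word.\n\n{output}"
)


def tone_eval(output: str, judge: Callable[[str], str]) -> dict:
    """Ask a judge LLM for the tone of `output` and normalize its answer."""
    raw = judge(TONE_PROMPT.format(output=output))
    answer = raw.strip().lower().rstrip(".")
    # Anything outside the expected labels is recorded as "unknown".
    label = answer if answer in TONE_LABELS else "unknown"
    return {"label": label, "raw_answer": raw}


def stub_judge(prompt: str) -> str:
    """Stand-in judge so the sketch runs without an API key."""
    return "Neutral."


print(tone_eval("An old silent pond", stub_judge))
```

Normalizing the judge's reply before comparing it to the label set keeps small formatting differences (case, trailing period) from being counted as failures.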
Run Experiment
Run the function below to execute your task and evaluator across the whole dataset, then view the results of your experiment in Arize.
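Conceptually, an experiment run applies the task to every example and then grades each output. The loop below is a local stand-in that illustrates that flow; it is not the Arize client call, and `run_local_experiment`, `stub_task`, and `stub_evaluator` are hypothetical names.

```python
def run_local_experiment(rows, task, evaluator):
    """Apply `task` to each row and grade each output with `evaluator`."""
    results = []
    for row in rows:
        output = task(row)
        results.append({"input": row, "output": output, "eval": evaluator(output)})
    return results


# Stand-ins so the sketch runs without an LLM or an Arize account.
def stub_task(row):
    return f"A haiku about {row['topic']}"


def stub_evaluator(output):
    return "neutral"


results = run_local_experiment(
    [{"topic": "rain"}, {"topic": "snow"}], stub_task, stub_evaluator
)
for result in results:
    print(result["output"], "->", result["eval"])
```

Running the same rows through two tasks that differ only in their prompt gives two result sets you can compare side by side, which is the comparison the experiment view in Arize is meant to support.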