Compare prompt versions

When crafting prompts, small changes in wording or configuration can lead to major differences in results. Managing and sharing different versions of these prompts, especially across a team, can be challenging. The Baserun Compare feature lets the whole team compare prompt templates, models, and configurations side by side, which makes it easier to iterate and to evaluate performance.

Use cases

Choose a model or configuration: Pick the best option by comparing results side by side.
Choose a prompt version: Find the most effective phrasing by evaluating different versions against each other.
Regression tests and backtesting: Check whether new results match the expected results to spot regressions or deviations.
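
To make the regression-testing use case concrete, here is a minimal sketch of the underlying idea: run each test input through the new version and flag the inputs whose output no longer matches the expected one. It does not use the Baserun SDK; `find_regressions`, `fake_generate`, and the sample data are hypothetical names invented for illustration, and the exact-match comparison is an assumption (a real check might use a looser similarity measure).

```python
from typing import Callable, Dict, List


def find_regressions(
    generate: Callable[[str], str],
    expected: Dict[str, str],
) -> List[str]:
    """Return the test inputs whose new output differs from the expected output."""
    regressions = []
    for test_input, expected_output in expected.items():
        actual = generate(test_input)
        # Trim whitespace so a trailing newline alone is not reported as a regression.
        if actual.strip() != expected_output.strip():
            regressions.append(test_input)
    return regressions


# Hypothetical stand-in for a model call; deterministic so the sketch runs offline.
def fake_generate(test_input: str) -> str:
    return test_input.upper()


# Expected outputs recorded from an earlier, trusted version (illustrative data).
expected_outputs = {"hello": "HELLO", "bye": "Goodbye"}

regressed_inputs = find_regressions(fake_generate, expected_outputs)
print(regressed_inputs)
```

In this sketch only the `bye` input is flagged, because the stand-in model returns `BYE` where `Goodbye` was expected.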

Features

Version control
Edit prompt versions and test cases on the fly
Side-by-side comparisons
Bulk testing
Share reports
Export reports

Instructions

Click the `Compare` button in the top-right corner of the prompt details in your playground.

In the latest version, the “Compare” and “Playground” tabs are merged into one view, and you can switch between the two modes through tabs. “New session” defaults to “Playground” mode. “New compare session” defaults to “Compare” mode.

Select the prompt and the versions you would like to compare.

By default, Baserun pre-selects the previous version, but you can also create a new version within the comparison report.

Click `Run all` to generate outputs.

You can also edit the test inputs at any time and rerun the test.

Save as the active version.

After you have decided on the best-performing version, click its prompt card, then click the `Save as the Active Version` button. This sets the selected version as the active version for the prompt.
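
Conceptually, the `Run all` step above fills in a grid: every selected prompt version is rendered with every test input, and the outputs are laid out next to each other. The sketch below illustrates that idea only. It is not the Baserun implementation or SDK; `run_all`, `echo_generate`, the `{text}` template variable, and the sample versions are hypothetical and chosen for the example.

```python
from typing import Callable, Dict, List


def run_all(
    versions: Dict[str, str],
    test_inputs: List[Dict[str, str]],
    generate: Callable[[str], str],
) -> List[Dict[str, str]]:
    """Render every template version with every test input and collect the outputs."""
    rows = []
    for variables in test_inputs:
        # One row per test input, one column per prompt version.
        row = {"input": str(variables)}
        for version_name, template in versions.items():
            prompt = template.format(**variables)
            row[version_name] = generate(prompt)
        rows.append(row)
    return rows


# Hypothetical stand-in for a model call: it returns the rendered prompt unchanged.
def echo_generate(prompt: str) -> str:
    return prompt


# Two illustrative versions of the same prompt template.
versions = {
    "v1": "Summarize: {text}",
    "v2": "Summarize in one sentence: {text}",
}
test_inputs = [{"text": "cats"}, {"text": "dogs"}]

rows = run_all(versions, test_inputs, echo_generate)
for row in rows:
    print(row)
```

Editing a test input and rerunning corresponds to changing an entry in `test_inputs` and calling `run_all` again; the rows are rebuilt from scratch each time.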
