vitest integration

When using typescript and vitest, the only thing you need to add is beforeAll in your test file.

Setup

npm install baserun
# or
yarn add baserun

test_module.spec.ts

import { baserun } from "baserun";
import { describe, beforeAll } from "vitest";

describe("My Test Suite", () => {
  beforeAll(async () => {
    // the only thing needed to make baserun work
    await baserun.init();
  });

  // ... the tests
});

npm run vitest test_module.spec.ts

Running multiple tests

It’s often helpful to exercise the same test over multiple examples. To do this, we suggest doing a simple for loop to autogenerate your tests and for larger numbers of examples you can read from a file or other data structure.

typescript

import { baserun } from "baserun";

const examples = [
  { place: "Paris", expected: "Eiffel Tower" },
  { place: "Rome", expected: "Colosseum" },
];

describe("Baserun end-to-end", () => {
  beforeAll(async () => {
    // add this
    await baserun.init();
  });

  for (const { place, expected } of examples) {
    it(`should suggest the ${expected}`, async () => {
      await baserun.trace(async () => {
        // Use baserun.trace here if your test involves multiple LLM calls or if you want to include any custom logs
        const chatCompletion = await openai.chat.completions.create({
          model: "gpt-3.5-turbo",
          temperature: 0.7,
          messages: [
            {
              role: "user",
              content: `What are three activities to do in ${place}?`,
            },
          ],
        });

        expect(chatCompletion.choices[0].message!.content!).toContain(expected);
        return chatCompletion;
      })();
    });
  }
});

Naming

Baserun lets you compare test results and debug each step side by side. We consider the full path of the describe and it names of the jest test as a unique identifier for the test. In the example above, two tests would be run and identified by ("Baserun end-to-end", "should suggest the Eiffel Tower") and ("Baserun end-to-end", "should suggest the Colosseum")

Automatic evaluations

Evaluate tests’ outputs.

Introduction

Prompt playground

Python SDK 2.0

Get started with SDK

Monitoring

Prompt templates

Evaluation

Testing

Datasets

Fine-tune

Integrations

vitest integration

Setup

Running multiple tests

Naming

Automatic evaluations

Introduction

Prompt playground

Python SDK 2.0

Get started with SDK

Monitoring

Prompt templates

Evaluation

Testing

Datasets

Fine-tune

Integrations

​Setup

​Running multiple tests

​Naming

Automatic evaluations

Setup

Running multiple tests

Naming