> ## Documentation Index
> Fetch the complete documentation index at: https://docs.baserun.ai/llms.txt
> Use this file to discover all available pages before exploring further.

# Testing overview

> With Baserun, testing your LLM app is as easy as writing a unit test

We leverage the existing ecosystem for testing, so you don't have to learn a new tool and can call your application logic directly.

<Frame>
  <img src="https://mintcdn.com/baserun/RoOPWwfCH85oSKT_/images/offline_evaluation_run_report.png?fit=max&auto=format&n=RoOPWwfCH85oSKT_&q=85&s=d7800ddf23972955a4cf9e570cffc137" alt="Testing Dashboard" width="2880" height="1800" data-path="images/offline_evaluation_run_report.png" />
</Frame>

### Aggregation

We automatically aggregate all of your [LLM and custom logs](/logging) by test. This is helpful to get full visibility into each step of your workflow.

### Integrations

We provide plugins for the most popular testing frameworks. Check out the [pytest](/pytest) and [jest](/jest) documentation to get started.

### Assertions

To validate the behavior of your code you can make assertions like you normally would to cause the test to pass or fail. In the following example, the test will fail if the string "Eiffel Tower" is not present in the LLM output.

<Frame>
  <img src="https://mintcdn.com/baserun/RoOPWwfCH85oSKT_/images/offline_testing_assertion.png?fit=max&auto=format&n=RoOPWwfCH85oSKT_&q=85&s=0eb1ca7b44049fc5edc7f36b1dc6a1eb" alt="Offline testing assertion" width="2788" height="376" data-path="images/offline_testing_assertion.png" />
</Frame>

<CodeGroup>
  ```python python theme={null}
  import openai

  def test_paris_trip():
      response = openai.ChatCompletion.create(
          model="gpt-3.5-turbo",
          temperature=0.7,
          messages=[
              {
                  "role": "user",
                  "content": "What are three activities to do in Paris?"
              }
          ],
      )

      assert "Eiffel Tower" in response["choices"][0]["message"]["content"]
  ```

  ```typescript typescript theme={null}
  import OpenAI from "openai";

  const openai = new OpenAI({
    apiKey: process.env.OPENAI_API_KEY,
  });

  describe("Baserun end-to-end", () => {
    it("should suggest the Eiffel Tower", async () => {
      const chatCompletion = await openai.chat.completions.create({
        model: "gpt-3.5-turbo",
        temperature: 0.7,
        messages: [
          {
            role: "user",
            content: "What are three activities to do in Paris?",
          },
        ],
      });

      expect(chatCompletion.choices[0].message!.content!).toContain(
        "Eiffel Tower"
      );
    });
  });
  ```
</CodeGroup>

### Automatic evaluation

When working with LLMs, it is also helpful to gain more visibility into the outputs using a series of checks or evaluations. Evaluations capture information about the output but will not cause the test to fail by default. Eval results are then aggregated and displayed in your Baserun dashboard.

<Frame>
  <img src="https://mintcdn.com/baserun/RoOPWwfCH85oSKT_/images/offline_testing_evals.png?fit=max&auto=format&n=RoOPWwfCH85oSKT_&q=85&s=e71ebc7a8a71675ffea0f159a446959a" alt="Offline testing assertion" width="2788" height="376" data-path="images/offline_testing_evals.png" />
</Frame>

In the following example, we add an eval to check if the output has the phrase "AI Language Model". See the [evaluation](/evaluation) documentation for more information.

<CodeGroup>
  ```python python theme={null}
  import baserun
  import openai

  def test_paris_trip():
      response = openai.ChatCompletion.create(
          model="gpt-3.5-turbo",
          temperature=0.7,
          messages=[
              {
                  "role": "user",
                  "content": "What are three activities to do in Paris?"
              }
          ],
      )

      output = response["choices"][0]["message"]["content"]
      not_has_ai_language_model = baserun.evals.not_includes("AI Language Model", output, ["AI language model"])
      # Optional: will fail the test if "AI language model" is present in the output
      assert not_has_ai_language_model
  ```

  ```typescript typescript theme={null}
  import { baserun } from "baserun";
  import OpenAI from "openai";

  const openai = new OpenAI({
    apiKey: process.env.OPENAI_API_KEY,
  });

  describe("Baserun end-to-end", () => {
    it("should suggest the Eiffel Tower", async () => {
      const chatCompletion = await openai.chat.completions.create({
        model: "gpt-3.5-turbo",
        temperature: 0.7,
        messages: [
          {
            role: "user",
            content: "What are three activities to do in Paris?",
          },
        ],
      });

      const output = chatCompletion.choices[0].message!.content!;
      const notHasLanguageModel = baserun.evals.notIncludes(
        "AI Language Model",
        output,
        ["AI language model"]
      );
      // Optional: will fail the test if "AI language model" is present in the output
      expect(notHasLanguageModel).toBeTruthy();
    });
  });
  ```
</CodeGroup>
