Complete Reference
match(name: str, submission: str, expected: Union[str, List[str]]) -> bool
includes(name: str, submission: str, expected: Union[str, List[str]]) -> bool
fuzzy_match(name: str, submission: str, expected: Union[str, List[str]]) -> bool
Checks whether the submission contains any of the expected values, or whether any of the expected values contains the submission. Returns true if there is a fuzzy match, otherwise false.
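The fuzzy-match rule described above (containment in either direction) can be illustrated with a small standalone function. This is only a sketch: `fuzzy_match_sketch` is a hypothetical stand-in, not the library's implementation, it assumes plain case-sensitive substring comparison, and it leaves out the `name` parameter because this reference does not describe it.

```python
from typing import List, Union


def fuzzy_match_sketch(submission: str, expected: Union[str, List[str]]) -> bool:
    """Hypothetical illustration of the two-way containment rule."""
    # Accept either a single expected string or a list of them.
    candidates = [expected] if isinstance(expected, str) else expected
    # A match in either direction counts: expected inside submission,
    # or submission inside expected.
    return any(e in submission or submission in e for e in candidates)


print(fuzzy_match_sketch("Paris, France", ["Paris"]))       # expected value inside submission
print(fuzzy_match_sketch("Paris", "The capital is Paris"))  # submission inside expected value
print(fuzzy_match_sketch("Berlin", ["Paris", "Rome"]))      # no containment either way
```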
not_match(name: str, submission: str, expected: Union[str, List[str]]) -> bool
Checks that the submission does not start with any of the expected values. Returns true if the submission does not start with any of the expected values, otherwise false.
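The prefix rule for not_match can be sketched as follows. `not_match_sketch` is a hypothetical helper written for illustration, assuming a case-sensitive prefix test; it is not the library's code and omits the `name` parameter.

```python
from typing import List, Union


def not_match_sketch(submission: str, expected: Union[str, List[str]]) -> bool:
    """Hypothetical illustration: true when no expected value is a prefix."""
    # str.startswith accepts a tuple of prefixes, so normalize to a tuple.
    prefixes = (expected,) if isinstance(expected, str) else tuple(expected)
    return not submission.startswith(prefixes)


print(not_match_sketch("Sorry, I cannot help", ["Sorry", "I apologize"]))  # starts with "Sorry"
print(not_match_sketch("Here is the answer", ["Sorry", "I apologize"]))    # starts with neither
```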
not_includes(name: str, submission: str, expected: Union[str, List[str]]) -> bool
Checks that the submission does not contain any of the expected values. Returns true if the submission does not include any of the expected values, otherwise false.
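For not_includes, the difference from the prefix check is that an expected value anywhere in the submission causes a failure. A minimal sketch, assuming case-sensitive substring tests; `not_includes_sketch` is a hypothetical name and not part of the library:

```python
from typing import List, Union


def not_includes_sketch(submission: str, expected: Union[str, List[str]]) -> bool:
    """Hypothetical illustration: true when no expected value occurs anywhere."""
    values = [expected] if isinstance(expected, str) else expected
    # Every expected value must be absent from the submission.
    return all(v not in submission for v in values)


print(not_includes_sketch("The total is 42", ["error", "exception"]))  # neither word appears
print(not_includes_sketch("An error occurred", ["error", "exception"]))  # "error" appears mid-string
```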
not_fuzzy_match(name: str, submission: str, expected: Union[str, List[str]]) -> bool
Checks that the submission neither contains any of the expected values nor is contained by any of the expected values. Returns true if there is no fuzzy match, otherwise false.
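The not_fuzzy_match rule requires that containment fail in both directions for every expected value. The sketch below illustrates that condition under the assumption of case-sensitive substring comparison; `not_fuzzy_match_sketch` is a hypothetical helper, not the library's implementation.

```python
from typing import List, Union


def not_fuzzy_match_sketch(submission: str, expected: Union[str, List[str]]) -> bool:
    """Hypothetical illustration: true only when no containment exists either way."""
    values = [expected] if isinstance(expected, str) else expected
    # Fails as soon as one expected value contains, or is contained by, the submission.
    return not any(v in submission or submission in v for v in values)


print(not_fuzzy_match_sketch("Berlin", ["Paris", "Rome"]))    # unrelated strings
print(not_fuzzy_match_sketch("Rome", ["Paris", "Rome, Italy"]))  # submission sits inside an expected value
```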
valid_json(name: str, submission: str) -> bool
check_injection(name: str, submission: str) -> bool
custom(name: str, submission: str, fn: Callable[[str], bool]) -> bool
custom_async(name: str, submission: str, fn: Callable[[str], Awaitable[bool]]) -> bool
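The `fn` argument of custom and custom_async is typed as a callable that takes the submission string and returns a bool, or an awaitable bool in the async case. The sketch below only shows callables of those two shapes; `ends_with_period` and `ends_with_period_async` are hypothetical checks invented for illustration, and the snippet deliberately does not call the library itself.

```python
import asyncio
from typing import Awaitable, Callable


def ends_with_period(submission: str) -> bool:
    """Hypothetical synchronous check: Callable[[str], bool]."""
    return submission.rstrip().endswith(".")


async def ends_with_period_async(submission: str) -> bool:
    """Hypothetical asynchronous check: Callable[[str], Awaitable[bool]]."""
    await asyncio.sleep(0)  # placeholder for real async work such as an HTTP call
    return ends_with_period(submission)


# These annotations mirror the `fn` parameter types in the signatures above.
sync_check: Callable[[str], bool] = ends_with_period
async_check: Callable[[str], Awaitable[bool]] = ends_with_period_async

print(sync_check("Done."))
print(asyncio.run(async_check("Not done")))
```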
model_graded_custom(name: str, prompt: str, choices: dict[str, float], model: str, metadata: Optional[Dict[str, Any]], **variables) -> str
model_graded_fact(name: str, question: str, expert: str, submission: str) -> str
- “A”: The output is a subset of the expert answer and fully consistent with it.
- “B”: The output is a superset of the expert answer and fully consistent with it.
- “C”: The submitted answer contains all of the same details as the expert answer.
- “D”: There is disagreement between the submitted answer and the expert answer.
- “E”: The answers differ, but these differences don’t matter from the perspective of factuality.
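One way to consume the result, assuming the returned string is one of the letters listed above (the signature only states that a str is returned), is to map the letter back to its meaning and flag disagreement. `FACT_CHOICES` and `contradicts_expert` are hypothetical names introduced for this sketch.

```python
# Choice letters and meanings, copied from the list above.
FACT_CHOICES = {
    "A": "subset of the expert answer and fully consistent with it",
    "B": "superset of the expert answer and fully consistent with it",
    "C": "contains all of the same details as the expert answer",
    "D": "disagreement between the submitted answer and the expert answer",
    "E": "answers differ, but not in ways that matter for factuality",
}


def contradicts_expert(choice: str) -> bool:
    """Hypothetical helper: true only for the disagreement choice."""
    if choice not in FACT_CHOICES:
        # Assumption: anything outside A-E is treated as an unexpected result.
        raise ValueError(f"unexpected choice: {choice!r}")
    return choice == "D"


for letter in ("A", "D"):
    print(letter, FACT_CHOICES[letter], contradicts_expert(letter))
```

Treating only "D" as a failure is a design choice of this sketch; a stricter pipeline might also reject "A" when partial answers are unacceptable.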
model_graded_closed_qa(name: str, task: str, submission: str, criterion: str) -> str
model_graded_security(name: str, submission: str) -> str