Interface ForgeTestCase

A single test case used by the LLM judge to evaluate a newly forged tool.

The judge invokes the tool with input and compares the result against expectedOutput using semantic equivalence (not strict equality).

interface ForgeTestCase {
    input: Record<string, unknown>;
    expectedOutput: unknown;
}
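
As an illustration of the shape declared above, here is a minimal sketch of a few test cases. The `slugify` tool, its `text` argument, and the expected values are hypothetical and exist only for this example. The interface is repeated locally so the snippet compiles on its own.

```typescript
// Local copy of the documented interface, so this sketch is self-contained.
interface ForgeTestCase {
    input: Record<string, unknown>;
    expectedOutput: unknown;
}

// Hypothetical tool "slugify": assumed to take { text: string } and return a string.
const slugifyCases: ForgeTestCase[] = [
    { input: { text: "Hello World" }, expectedOutput: "hello-world" },
    { input: { text: "  Trim me  " }, expectedOutput: "trim-me" },
    // Edge case: empty input is assumed to yield an empty slug.
    { input: { text: "" }, expectedOutput: "" },
];
```

Because `expectedOutput` is typed as `unknown`, the same array type would also accept objects, arrays, or numbers as reference outputs.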

Properties

input: Record<string, unknown>

Arguments object passed to the tool's run (execution) entry point. It must conform to the tool's declared inputSchema.

expectedOutput: unknown

Expected output value used for correctness scoring. The judge treats this as a reference rather than an exact target, so partial matches may still score well.
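
The sketch below illustrates why a strict comparison would be too brittle for this purpose. It does not reproduce the judge's scoring, which is not specified here. The helper `strictJsonEqual` and the sample values are assumptions made up for the example.

```typescript
// Naive strict comparison: serialize both values and compare the strings.
// This is NOT how the judge works; it only shows what strict equality rejects.
function strictJsonEqual(a: unknown, b: unknown): boolean {
    return JSON.stringify(a) === JSON.stringify(b);
}

// Hypothetical reference value from a test case.
const expectedOutput: unknown = { total: 3, items: ["a", "b", "c"] };

// Hypothetical tool result: same data, different key order, plus one extra field.
const actualOutput: unknown = { items: ["a", "b", "c"], total: 3, note: "ok" };

// false: the serialized strings differ even though the data agrees.
const strictMatch = strictJsonEqual(expectedOutput, actualOutput);
```

A judge that compares by semantic equivalence could reasonably treat `actualOutput` as a close or full match, which is the behavior described for `expectedOutput`.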