Testing infrastructure for LLM-powered code. Provides testthat integration with custom expectations for evaluating AI agent performance, tool accuracy, and hallucination rates.
Testing infrastructure for LLM-powered code. Provides testthat integration with custom expectations for evaluating AI agent performance, tool accuracy, and hallucination rates.