Create a Genesis V2 benchmark suite
Usage
benchmark_genesis_v2(
  tasks,
  skill_paths = "auto",
  model = "claude-3-5-sonnet-20241022",
  max_iterations = 3,
  quality_threshold = 70
)
Arguments
- tasks
Character vector of tasks to benchmark. Required; there is no default.
- skill_paths
Skill paths to use. Defaults to "auto".
- model
Identifier of the model to use. Defaults to "claude-3-5-sonnet-20241022".
- max_iterations
Maximum number of iterations to run per task. Defaults to 3.
- quality_threshold
Numeric quality threshold. Defaults to 70.
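Examples
The call below is a minimal sketch of how the arguments above fit together, not an example taken from the package itself. The task strings are hypothetical placeholders, and the sketch assumes that the package providing benchmark_genesis_v2() is already attached and that any credentials the chosen model requires are configured. The structure of the returned object is not documented here, so the sketch only inspects it with str().

```r
# Hypothetical task descriptions; replace these with your own.
tasks <- c(
  "Summarize a CSV file of sales data",
  "Write a function that validates email addresses"
)

# Guard the call so the snippet still runs when the package that provides
# benchmark_genesis_v2() is not attached (an assumption of this sketch).
if (exists("benchmark_genesis_v2", mode = "function")) {
  suite <- benchmark_genesis_v2(
    tasks,
    skill_paths = "auto",                  # documented default
    model = "claude-3-5-sonnet-20241022",  # documented default
    max_iterations = 3,                    # documented default
    quality_threshold = 70                 # documented default
  )

  # The return value is not described on this page, so just inspect it.
  str(suite)
}
```

Every argument after tasks is shown with its documented default, so an equivalent shorter call is benchmark_genesis_v2(tasks).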