Claude + Codex agent skill

Test Drive

Test an idea before you trust it. Test Drive helps people test ideas, claims, decisions, prompts, skills, strategies, data claims, and artifacts through small evidence-seeking trials.

What it does

Test Drive identifies what kind of evidence would matter, chooses the smallest credible test, creates or recommends the needed artifact, and names any connector or approval gate required.

Evidence Types

Test Drive is horizontal: it works across domains because it routes each question by the kind of evidence it needs, not by subject area.

Human Reaction

Use interviews, surveys, posts, messages, or prototype feedback when the idea depends on what people understand, want, remember, or choose.

Behavioral / Data

Use cohort analysis, funnel analysis, segmentation, correlation, regression, or before/after readouts when the claim depends on observed behavior.

Reasoning And Expert Judgment

Use Ground Truth, The Quorum, pre-mortems, counterexample searches, or stakeholder lenses when the question depends on logic and trade-offs.

Artifact And Operational Performance

Use dry runs, eval cases, rubric scoring, connector checks, permission reviews, or pilot workflows when the thing itself has to work in practice.
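The routing above can be pictured as a lookup from evidence type to candidate test methods. The following is a minimal sketch, not Test Drive's actual implementation: `EvidenceType`, `CANDIDATE_TESTS`, and `candidate_tests` are hypothetical names chosen for illustration, and the method lists are copied from the four categories above.

```python
from enum import Enum


class EvidenceType(Enum):
    # The four evidence types named in this section.
    HUMAN_REACTION = "Human Reaction"
    BEHAVIORAL_DATA = "Behavioral / Data"
    REASONING = "Reasoning And Expert Judgment"
    ARTIFACT_PERFORMANCE = "Artifact And Operational Performance"


# Hypothetical lookup table; entries come from the category descriptions.
CANDIDATE_TESTS = {
    EvidenceType.HUMAN_REACTION: [
        "interviews", "surveys", "posts", "messages", "prototype feedback",
    ],
    EvidenceType.BEHAVIORAL_DATA: [
        "cohort analysis", "funnel analysis", "segmentation",
        "correlation", "regression", "before/after readouts",
    ],
    EvidenceType.REASONING: [
        "Ground Truth", "The Quorum", "pre-mortems",
        "counterexample searches", "stakeholder lenses",
    ],
    EvidenceType.ARTIFACT_PERFORMANCE: [
        "dry runs", "eval cases", "rubric scoring", "connector checks",
        "permission reviews", "pilot workflows",
    ],
}


def candidate_tests(evidence_types):
    """Return candidate methods for the given evidence types, in order, without duplicates."""
    methods = []
    for evidence_type in evidence_types:
        for method in CANDIDATE_TESTS[evidence_type]:
            if method not in methods:
                methods.append(method)
    return methods
```

Calling `candidate_tests([EvidenceType.HUMAN_REACTION, EvidenceType.ARTIFACT_PERFORMANCE])` would list the human-reaction methods first, followed by the artifact and operational ones. Nothing in the lookup depends on the domain of the idea being tested.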

Example Output Shape

Before

"I think this positioning is right, but I do not know if I am overfitting to language I personally like."

After

Test Drive classifies the uncertainty as Human Reaction plus Artifact And Operational Performance, drafts two message variants, defines the signals that would count as resonance or failure, recommends a manual or connector-assisted posting path, and sets a learning loop.

Install

Download the packaged skill from the release, or ask Codex to install it from GitHub.

FAQ

How is Test Drive different from Ground Truth?

Ground Truth challenges the reasoning. Test Drive designs the evidence-seeking trial that would make the idea more or less trustworthy.

Can Test Drive run statistical analysis?

It can recommend and structure analytical tests such as cohort analysis, segmentation, funnel analysis, correlation, or regression. It can also run the analysis itself when the environment provides approved data access and tooling.
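As an illustration of what a small analytical test might look like, here is a sketch of a before/after conversion readout using a two-proportion z-test. It is an assumption about one reasonable way to structure such a readout, not a description of what Test Drive runs internally. The function name `before_after_readout` and the counts in the usage note are made up for the example.

```python
from math import sqrt
from statistics import NormalDist


def before_after_readout(before_conversions, before_n, after_conversions, after_n):
    """Compare two conversion rates with a pooled two-proportion z-test.

    Returns the absolute lift (after rate minus before rate) and a
    two-sided p-value. Assumes both periods are independent samples.
    """
    rate_before = before_conversions / before_n
    rate_after = after_conversions / after_n
    pooled = (before_conversions + after_conversions) / (before_n + after_n)
    std_err = sqrt(pooled * (1 - pooled) * (1 / before_n + 1 / after_n))
    if std_err == 0:
        # Both rates are exactly 0 or exactly 1: no variation to test.
        return {"lift": rate_after - rate_before, "p_value": 1.0}
    z = (rate_after - rate_before) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return {"lift": rate_after - rate_before, "p_value": p_value}
```

For example, `before_after_readout(40, 1000, 70, 1000)` compares a hypothetical 4% conversion rate before a change with 7% after it. A before/after comparison like this cannot rule out other causes for the shift, so it is best read as one signal among several.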

Can Test Drive take action?

It can draft or build safe artifacts. It should ask for explicit approval before sending, posting, publishing, querying sensitive systems, or changing external state.
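The approval rule can be sketched as a simple gate in front of any action. This is a hypothetical illustration: the action names, `needs_approval`, and `perform` are invented for the example, and treating unknown actions as gated is an assumption, not documented behavior.

```python
# Actions the FAQ describes as safe to do without asking.
SAFE_ACTIONS = {"draft", "build_artifact"}

# Actions the FAQ says need explicit approval first.
GATED_ACTIONS = {
    "send", "post", "publish",
    "query_sensitive_system", "change_external_state",
}


def needs_approval(action):
    """Return True unless the action is known to be safe.

    Assumption: anything not explicitly listed as safe is treated as gated.
    """
    return action in GATED_ACTIONS or action not in SAFE_ACTIONS


def perform(action, approved=False):
    """Report whether the action would run or is blocked pending approval."""
    if needs_approval(action) and not approved:
        return f"blocked: '{action}' needs explicit approval"
    return f"ok: '{action}' performed"
```

With this gate, `perform("draft")` goes through, `perform("post")` is blocked, and `perform("post", approved=True)` proceeds once a person has signed off.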