Evidence
Evidence is the recorded history of one OptPilot run.
The public configs define what should happen.
The run directory shows what actually happened: which candidates were proposed, how they were materialized, which trials succeeded or failed, which metrics were returned, and where evaluator output files were written.
Run Directory
By default, runs are written to a runs/ directory next to the directory that contains the study config. The built-in job-shop studies live under examples/studies/, so their default output root is examples/runs/. You can override this with --output-root or evidence.outputDir.
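The default location rule can be sketched in a few lines. This is only an illustration of the rule as stated above: `default_output_root` is a hypothetical helper, not an OptPilot function, and it deliberately ignores `--output-root` and `evidence.outputDir`.

```python
from pathlib import Path


def default_output_root(study_config_path: str) -> Path:
    """Return the runs/ directory that sits beside the study config's directory.

    Hypothetical helper: it mirrors the documented default only and does not
    account for --output-root or evidence.outputDir overrides.
    """
    config_dir = Path(study_config_path).parent  # e.g. examples/studies
    return config_dir.parent / "runs"            # e.g. examples/runs


root = default_output_root("examples/studies/job_shop_rule_parameters_baseline.yaml")
print(root.as_posix())
```

For a study config under examples/studies/, the helper resolves to the same examples/runs/ root described above.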
Common files:
| File | Meaning |
|---|---|
| `summary.json` | Final run summary, best metric, failure count, and run status. |
| `study_spec.json` | Compiled run spec generated from the study, environment, and method configs. |
| `candidates.jsonl` | Candidate records, validation details, and materialization details. |
| `observations.jsonl` | Trial observations and metric values. |
| `trials.jsonl` | Trial lifecycle records and backend metadata. |
| `method_calls.jsonl` | Method requests, responses, and errors. |
| `method_events.jsonl` | Events emitted by methods. |
| `scheduler_events.jsonl` | Scheduling and worker events. |
| `environment_snapshot.json` | Environment contract used by the run. |
| `run_policy.json` | Budget, retry, parallelism, and timeout policy. |
| `run_lineage.json` | Resume and branch lineage metadata. |
The exact set can vary by evidence level and by which parts of the runtime are used.
A typical local run looks like:
    runs/my-study-2026-06-20T.../
      summary.json
      study_spec.json
      run_policy.json
      run_lineage.json
      environment_snapshot.json
      candidates.jsonl
      trials.jsonl
      observations.jsonl
      method_calls.jsonl
      scheduler_events.jsonl
      prompts/
        prompt-.../prompt.json
      candidates/
        candidate-.../files/...
      trials/
        trial-.../
          candidate/
          candidate.json
          workspace_manifest.json
          evaluator outputs...
      evidence_files/
        trial-.../
          copied outputs when evidence.outputFileStorage: copy
The most important files for debugging are usually summary.json, observations.jsonl, candidates.jsonl, and the corresponding trials/<trial-id>/ directory.
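When debugging, a small reader for these files is often enough. The sketch below assumes only that the `.jsonl` files hold one JSON object per line and that `summary.json` is a single JSON document. `read_jsonl` and `load_run` are hypothetical helper names, and no particular field names inside the records are assumed.

```python
import json
from pathlib import Path


def read_jsonl(path):
    """Yield one parsed JSON object per non-empty line of a .jsonl file."""
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def load_run(run_dir):
    """Load summary.json plus observation and candidate records, if present.

    Missing files are tolerated because the exact set of evidence files can
    vary from run to run.
    """
    run_dir = Path(run_dir)
    summary_path = run_dir / "summary.json"
    summary = None
    if summary_path.is_file():
        summary = json.loads(summary_path.read_text(encoding="utf-8"))
    records = {}
    for name in ("observations.jsonl", "candidates.jsonl"):
        path = run_dir / name
        records[name] = list(read_jsonl(path)) if path.is_file() else []
    return summary, records
```

Calling `load_run("path/to/existing-run")` returns the parsed summary (or `None`) and a dictionary of record lists keyed by file name, which can then be inspected interactively.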
Storage Roles
OptPilot uses four runtime storage areas, each with a distinct role.
| Runtime storage | Purpose |
|---|---|
| Method workspace | Scratch space for one method invocation. Command wrappers often write request files and logs here. |
| Candidate store | Durable handoff area for candidates produced by methods, especially generated files. |
| Trial workspace | Fresh evaluation directory for one trial. trialWorkspace entries are copied here and file candidates are materialized here. |
| Evidence directory | Run-level records, summaries, and retained evaluator outputs. |
The evaluator normally reads the trial workspace, not the candidate store. For file candidates, the runner copies files from the candidate store into the trial workspace according to the environment candidate contract.
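The copy step from candidate store to trial workspace can be pictured as follows. This is a simplified sketch, not the runner's implementation: `materialize_candidate` is a hypothetical name, and it preserves relative paths as-is, whereas the real placement is governed by the environment candidate contract.

```python
import shutil
from pathlib import Path


def materialize_candidate(candidate_files: Path, trial_workspace: Path) -> list[str]:
    """Copy every file under candidate_files into trial_workspace.

    Relative paths are preserved. Returns the copied relative paths so the
    caller can record what was materialized.
    """
    copied = []
    for source in sorted(candidate_files.rglob("*")):
        if not source.is_file():
            continue  # directories are created on demand below
        relative = source.relative_to(candidate_files)
        target = trial_workspace / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, target)
        copied.append(relative.as_posix())
    return copied
```

Because the evaluator only sees the copies, later changes to the candidate store do not affect a trial that has already been materialized in this sketch.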
Output Files
Evaluators may produce logs, JSON summaries, CSV files, SQLite databases, images, or other files inside the trial workspace.
There are two ways those files become visible in evidence:
- the evaluator returns `output_files` descriptors
- the environment config lists `outputFiles` patterns to collect after evaluation
evidence.outputFileStorage controls whether file bytes are copied into evidence storage:
| Value | Behavior |
|---|---|
| `reference` | Evidence records paths to files where they were produced, usually inside trial workspaces. |
| `copy` | Matching output files are copied into evidence storage so they remain easy to inspect even if trial workspaces are later cleaned up. |
Metric values should still be returned or extracted through metrics. Output files are for supporting evidence, debugging, traces, plots, logs, and databases.
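The difference between the two storage modes can be illustrated with a short sketch. It assumes the collection patterns are glob patterns relative to the trial workspace. `retain_output_files` and the `name`/`path` record keys are hypothetical and are not the actual OptPilot record schema.

```python
import shutil
from pathlib import Path


def retain_output_files(trial_workspace, patterns, evidence_files_dir, storage="reference"):
    """Return one record per matched output file.

    With storage="reference" the record points at the file inside the trial
    workspace. With storage="copy" the file is first copied under
    evidence_files_dir and the record points at the copy.
    """
    if storage not in ("reference", "copy"):
        raise ValueError(f"unknown storage mode: {storage!r}")
    trial_workspace = Path(trial_workspace)
    records = []
    seen = set()  # avoid duplicate records when patterns overlap
    for pattern in patterns:
        for source in sorted(trial_workspace.glob(pattern)):
            if not source.is_file() or source in seen:
                continue
            seen.add(source)
            relative = source.relative_to(trial_workspace)
            if storage == "copy":
                target = Path(evidence_files_dir) / relative
                target.parent.mkdir(parents=True, exist_ok=True)
                shutil.copy2(source, target)
                stored_at = target
            else:
                stored_at = source
            records.append({"name": relative.as_posix(), "path": str(stored_at)})
    return records
```

In reference mode nothing is written to the evidence directory, so the records become stale if the trial workspace is cleaned up. Copy mode trades extra disk usage for records that stay valid.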
EvidenceView
Methods can inspect previous results through EvidenceView during iterative optimization.
Typical information available through this API includes:
- observations and metric values
- trial records
- candidate records
- method call records
- scheduler events
- method events
- extracted records
- evaluator output files and artifacts
This gives methods a stable way to learn from previous trials without parsing raw run files by hand.
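    def propose(self, n_candidates, study_state, evidence_view):
        # Up to 3 observations (trial results and metric values)
        recent = evidence_view.observations(limit=3)
        # Metadata for up to 5 JSON files produced during evaluation
        traces = evidence_view.artifacts(kind="json", limit=5)
        # Up to 20 rows from the extracted "events" record stream
        rows = evidence_view.records("events", limit=20)
        ...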
records(...) reads rows extracted from configured JSONL, CSV, SQLite, or custom record streams. artifacts(...) and output_files(...) return metadata for files produced during evaluation, such as logs, plots, JSON reports, CSV files, or SQLite databases. They return paths and content references so a method can decide what to read.
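Because only metadata is returned, a method can filter before reading any bytes. The sketch below assumes each metadata entry is a dictionary with a `path` key pointing at a local file. That shape is an assumption made for illustration, not the documented return type, and `load_small_json_artifacts` is a hypothetical helper.

```python
import json
from pathlib import Path


def load_small_json_artifacts(artifacts, max_bytes=64_000):
    """Parse only the JSON artifacts that exist and are at most max_bytes long.

    `artifacts` is an iterable of dicts with a "path" key (assumed shape).
    """
    loaded = []
    for meta in artifacts:
        path = Path(meta["path"])
        if path.suffix != ".json" or not path.is_file():
            continue  # skip non-JSON files and dangling references
        if path.stat().st_size > max_bytes:
            continue  # skip large files instead of reading them whole
        loaded.append(json.loads(path.read_text(encoding="utf-8")))
    return loaded
```

A size cap like this keeps a method from pulling large traces or databases into memory when it only needs small reports.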
Resume And Branch
Resume appends more trials to an existing run:
    uv run optpilot run examples/studies/job_shop_rule_parameters_baseline.yaml \
      --resume-run-dir path/to/existing-run
Branch starts a new run that records a previous run as its parent: