Architecture¶

Project Structure¶

soma-evals/
├── src/soma_evals/          # Python package
│   ├── __main__.py          # CLI entry point (Click)
│   ├── runner.py            # Orchestrates eval runs
│   ├── llm_adapter.py       # Wraps llm library
│   ├── pdf_utils.py         # PDF text extraction (PyMuPDF)
│   ├── prompt_builder.py    # Assembles prompts from parts
│   └── schema_context.py    # Generates schema context per ablation level
├── results/                 # Output directory (per-level, per-model)
│   └── baseline/
│       ├── run_metadata.yaml
│       └── <model>/paper.yaml
├── extract.yaml             # Prompt template
├── models.yaml              # Model config and pricing
├── mkdocs.yml               # Documentation config
└── justfile                 # Build automation

Data Flow¶

                    ┌─────────────────┐
                    │   extract.yaml  │  prompt template
                    └────────┬────────┘
                             │
┌──────────┐    ┌────────────▼────────────┐    ┌──────────────┐
│  PDF      │───▶│   prompt_builder.py     │───▶│  LLM API     │
│  (paper)  │    │  + schema_context.py    │    │  (via llm)   │
└──────────┘    └─────────────────────────┘    └──────┬───────┘
                                                      │
                    ┌─────────────────┐               │
                    │  runner.py      │◀──────────────┘
                    │  orchestration  │
                    └────────┬────────┘
                             │
                    ┌────────▼────────┐
                    │  results/       │
                    │  level/model/   │
                    │    paper.yaml   │
                    └─────────────────┘

Key Modules¶

`main.py` -- CLI¶

Click-based interface with four commands:

Command	Description
`run --level <LEVEL>`	Run a single ablation level
`run-all`	Run all four levels sequentially
`list-models`	Display configured models and tiers
`show-context`	Debug: print schema context at each level

`runner.py` -- Orchestration¶

Manages the eval workflow:

run_single() -- one model, one level, one paper
run_level() -- all models for a single level; writes run_metadata.yaml
run_all_levels() -- all levels for all models

The runner extracts PDF text once per level and reuses it across models.

`schema_context.py` -- Ablation Levels¶

Uses linkml-runtime's SchemaView to programmatically generate schema context at four cumulative levels. See the Ablation Levels page for details.

`prompt_builder.py` -- Prompt Assembly¶

Combines three parts into the final prompt:

Schema context (if level > baseline) as ## Schema Context
Template prompt from extract.yaml
PDF text as ## Source Text

`llm_adapter.py` -- LLM Integration¶

Thin wrapper around Simon Willison's llm library. This supports any model plugin (llm-claude-3, llm-gemini, OpenAI built-in) with a uniform interface. Temperature defaults to 0.0 for deterministic output.

`pdf_utils.py` -- PDF Extraction¶

Simple PyMuPDF wrapper that extracts all text from a PDF, concatenating pages with double-newline separators.

Output Format¶

Each model run produces a YAML file with the LLM's structured extraction. Metadata is captured in run_metadata.yaml per level:

level: baseline
timestamp: '2026-04-09T00:05:30.216735+00:00'
paper_slug: montgomery2020-pm25-mucociliary
pdf_path: rcmb.2019-0454OC.pdf
runs:
  - model: gpt-4o
    status: success
    input_tokens: 18707
    output_tokens: 937
    duration_ms: 16264

Architecture¶

Project Structure¶

Data Flow¶

Key Modules¶

__main__.py -- CLI¶

runner.py -- Orchestration¶

schema_context.py -- Ablation Levels¶

prompt_builder.py -- Prompt Assembly¶

llm_adapter.py -- LLM Integration¶

pdf_utils.py -- PDF Extraction¶