About
The SOMA
The SOMA is part of the EHS Data Standards initiative, focused on developing standardized data models for environmental health sciences research.
Project Goals
This project aims to:
- Standardize data representation for exposure-outcome relationships in EHS research
- Enable data interoperability across studies, cohorts, and institutions
- Support mechanistic understanding through integration with Adverse Outcome Pathways (AOPs)
- Bridge epidemiological and toxicological data from human studies and model systems
The Data Model
Design Principles
The SOMA follows these principles:
- Ontology-first - All entities are mapped to established biomedical ontologies
- FAIR-compliant - Supports Findable, Accessible, Interoperable, and Reusable data
- Extensible - New assay types can be added without breaking existing data
- Multi-scale - Captures data from molecular to population levels
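As a rough illustration of the ontology-first principle, the sketch below checks whether an identifier is a CURIE (`prefix:local_id`) whose prefix belongs to a known ontology. The prefix allow-list and the helper names `split_curie` and `is_ontology_mapped` are assumptions made for this example; the prefixes SOMA actually accepts are defined in its schema.

```python
import re

# Hypothetical allow-list for illustration only; not taken from the SOMA schema.
KNOWN_PREFIXES = {"CL", "UBERON", "GO", "CHEBI"}

# A simple CURIE shape: a prefix, a colon, and a non-empty local identifier.
CURIE_PATTERN = re.compile(r"^(?P<prefix>[A-Za-z][A-Za-z0-9_.]*):(?P<local>\S+)$")


def split_curie(identifier: str) -> tuple[str, str]:
    """Split 'PREFIX:local' into its two parts, or raise ValueError."""
    match = CURIE_PATTERN.match(identifier)
    if match is None:
        raise ValueError(f"not a CURIE: {identifier!r}")
    return match.group("prefix"), match.group("local")


def is_ontology_mapped(identifier: str) -> bool:
    """Return True if the identifier is a CURIE with a known ontology prefix."""
    try:
        prefix, _ = split_curie(identifier)
    except ValueError:
        return False
    return prefix in KNOWN_PREFIXES


# A free-text label fails the check; a CURIE with a known prefix passes.
print(is_ontology_mapped("CL:0000066"))       # True
print(is_ontology_mapped("epithelial cell"))  # False
```

A check like this only confirms the shape and prefix of an identifier. It does not confirm that the term exists in the ontology.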
Technology Stack
The model is built using:
- LinkML - Linked Data Modeling Language for schema definition
- MkDocs with Material theme for documentation
- Python for data validation and transformation
Core Domains
| Domain | Description |
|---|---|
| Assays | Domain-specific assay classes with named measurement slots (e.g., CiliaryFunctionAssay, LungFunctionAssay) |
| Study Subjects | Biological systems under study: cell cultures (CellularSystem), human/animal subjects (InVivoSubject), populations (PopulationSubject) |
| Protocols | Typed experimental procedures: ImagingProtocol, MolecularAssayProtocol, StainingProtocol, SpirometryProtocol |
| AOP Framework | Adverse Outcome Pathways: KeyEvent, AdverseOutcomePathway, with assay linkage via informs_on_key_event |
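The relationships between these domains can be sketched in plain Python. This is a minimal illustration, not the generated code in src/soma/datamodel/: the class names and `informs_on_key_event` come from the table above, while the other slot names (`cell_type`, `beat_frequency_hz`, `study_subject`, `protocol`), the `example:` identifiers, and the helper `assays_for_key_event` are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class KeyEvent:
    """A measurable step in an Adverse Outcome Pathway."""
    id: str
    name: str


@dataclass
class AdverseOutcomePathway:
    """An ordered collection of key events leading to an adverse outcome."""
    id: str
    name: str
    key_events: List[KeyEvent] = field(default_factory=list)


@dataclass
class CellularSystem:
    """A cell culture under study."""
    id: str
    cell_type: str  # hypothetical slot


@dataclass
class ImagingProtocol:
    """A typed experimental procedure."""
    id: str
    name: str


@dataclass
class CiliaryFunctionAssay:
    """A domain-specific assay with a named measurement slot."""
    id: str
    study_subject: CellularSystem  # hypothetical slot name
    protocol: ImagingProtocol      # hypothetical slot name
    beat_frequency_hz: Optional[float] = None  # hypothetical measurement slot
    informs_on_key_event: List[KeyEvent] = field(default_factory=list)


def assays_for_key_event(assays, key_event_id):
    """Return the assays that inform on the key event with the given id."""
    return [
        assay for assay in assays
        if any(ke.id == key_event_id for ke in assay.informs_on_key_event)
    ]


# Link one assay to one key event, then look the assay up by key event.
ke = KeyEvent(id="example:KE1", name="Decreased ciliary beat frequency")
aop = AdverseOutcomePathway(id="example:AOP1", name="Example pathway", key_events=[ke])
assay = CiliaryFunctionAssay(
    id="example:assay1",
    study_subject=CellularSystem(id="example:cs1", cell_type="airway epithelial cell"),
    protocol=ImagingProtocol(id="example:p1", name="High-speed video microscopy"),
    beat_frequency_hz=11.5,
    informs_on_key_event=[ke],
)
print([a.id for a in assays_for_key_event([assay], "example:KE1")])  # ['example:assay1']
```

Keeping the link on the assay side means a new assay type can point at existing key events without any change to the pathway records.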
Contributing
We welcome contributions from the community. To contribute:
- Visit the GitHub repository
- Review the existing schema in src/soma/schema/
- Open an issue to discuss proposed changes
- Submit a pull request with your contributions
Development
Prerequisites
Quick Start
# Install dependencies
just install
# Generate documentation
just gen-doc
# Run local documentation server
just testdoc
# Run all tests
just test
Project Structure
soma/
├── src/
│   ├── docs/            # Documentation source files
│   └── soma/
│       ├── schema/      # LinkML schema definition
│       └── datamodel/   # Generated Python models
├── docs/
│   └── elements/        # Generated schema docs
├── project/             # Generated artifacts
├── tests/
│   └── data/            # Test data files
└── examples/            # Usage examples
License
This project is released under the MIT License.
Acknowledgments
This project uses the linkml-project-copier template for project structure and build tooling.
Contact
For questions or feedback, please open an issue on the GitHub repository.