
biotope build

Materialise a runnable BioCypher project from mappings/*.mapping.yaml and (optionally) alignment.yaml. Compiles the semantic mapping IR (entities / relations / ids) into BioCypher tuple streams.

Output layout

build/
├── config/
│   ├── biocypher_config.yaml
│   └── schema_config.yaml         # generated from the semantic IR — never hand-edited
├── mappings/                       # copies of each input mapping.yaml (provenance)
├── generated/
│   └── <mapping_stem>/
│       ├── __init__.py
│       └── adapter.py              # deterministic per-mapping adapter (import-only)
└── create_knowledge_graph.py       # BioCypher entry point
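
As a quick sanity check after a build, the layout above can be verified from Python. This is a minimal sketch, not part of biotope: the helper name `missing_build_outputs` is hypothetical, and it only checks the files named in the tree (the copies under mappings/ are skipped because their exact file names are not specified here).

```python
from pathlib import Path

# Relative paths taken from the layout above; nothing here is a biotope API.
EXPECTED_FILES = (
    "config/biocypher_config.yaml",
    "config/schema_config.yaml",
    "create_knowledge_graph.py",
)


def missing_build_outputs(build_dir: Path, mapping_stems: list[str]) -> list[str]:
    """Return the expected relative paths that are absent under build_dir."""
    expected = list(EXPECTED_FILES)
    for stem in mapping_stems:
        # One generated package per mapping, as shown under generated/<mapping_stem>/.
        expected.append(f"generated/{stem}/__init__.py")
        expected.append(f"generated/{stem}/adapter.py")
    return [rel for rel in expected if not (build_dir / rel).is_file()]
```

An empty return value means every file in the documented layout is present; anything else lists what a rebuild should have produced.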

Schema-config emission

For each entity the build writes:

  • schema entry key = explicit schema_term or the mapping key in lower-sentence-case
  • represented_as: node
  • input_label: <mapping_key> (autogenerated, never manually synchronised)
  • namespace: <explicit value | derived from as_curie prefix | "id">
  • properties typed from Croissant scalars where possible

For each relation the build writes:

  • represented_as: edge
  • input_label: <mapping_key>
  • source / target resolved through the referenced entities' schema terms

preferred_id is never emitted; is_a is no longer hardcoded.
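
The entity rules above can be illustrated with a small sketch. It is an assumption-laden approximation, not biotope's implementation: the function names, the `curie_prefix` parameter (standing in for whatever prefix as_curie carries) and the exact lower-sentence-case rule (split camel case, replace underscores and hyphens with spaces, lowercase) are all hypothetical.

```python
import re


def lower_sentence_case(key: str) -> str:
    """Assumed rule: 'GeneProduct' or 'gene_product' becomes 'gene product'."""
    # Insert a space at each lower/digit-to-upper boundary (camel case).
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", key)
    # Collapse underscores, hyphens and whitespace runs into single spaces.
    return re.sub(r"[_\-\s]+", " ", spaced).strip().lower()


def entity_schema_entry(mapping_key, schema_term=None, namespace=None, curie_prefix=None):
    """Return (schema key, entry) following the precedence described above."""
    term = schema_term or lower_sentence_case(mapping_key)
    # Namespace precedence: explicit value, then the as_curie prefix, then "id".
    resolved_namespace = namespace or curie_prefix or "id"
    entry = {
        "represented_as": "node",
        "input_label": mapping_key,
        "namespace": resolved_namespace,
    }
    return term, entry
```

For example, `entity_schema_entry("GeneProduct", curie_prefix="ncbigene")` yields the key `"gene product"` with `input_label` set to `"GeneProduct"` and `namespace` set to `"ncbigene"`. Property typing from Croissant scalars is left out of the sketch.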

Strictness

The build is strict — it aborts with a clear regeneration message when:

  • a mapping has unresolved slots (missing record_set, id, endpoint, etc.)
  • a mapping still uses the legacy nodes / edges schema
  • a property selector resolves to a struct-valued field

Run biotope map preview to see exactly what's unresolved before invoking build.
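
To make the first two abort conditions concrete, here is a hedged sketch of what such a check could look like. The mapping is modelled as a plain dict with entities / relations sections; the slot names checked for relations (source, target) and both function names are assumptions for illustration, and the struct-valued property check is omitted because it needs the Croissant metadata.

```python
def find_strictness_problems(mapping: dict) -> list[str]:
    """Collect human-readable reasons why a mapping would not be buildable."""
    problems = []
    # Legacy schema: top-level nodes / edges sections instead of entities / relations.
    for legacy_key in ("nodes", "edges"):
        if legacy_key in mapping:
            problems.append(f"legacy '{legacy_key}' section present")
    # Unresolved slots on entities (slot names taken from the list above).
    for name, entity in mapping.get("entities", {}).items():
        for slot in ("record_set", "id"):
            if not entity.get(slot):
                problems.append(f"entity '{name}': unresolved {slot}")
    # Unresolved slots on relations; 'source' / 'target' are assumed endpoint names.
    for name, relation in mapping.get("relations", {}).items():
        for slot in ("record_set", "source", "target"):
            if not relation.get(slot):
                problems.append(f"relation '{name}': unresolved {slot}")
    return problems


def ensure_buildable(mapping: dict) -> None:
    """Raise ValueError listing every problem, or return silently if there are none."""
    problems = find_strictness_problems(mapping)
    if problems:
        raise ValueError("mapping is not buildable: " + "; ".join(problems))
```

Raising ValueError is a deliberate choice in this sketch: the build command shown under "Source code" catches ValueError and turns it into a "Build aborted" message.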

biotope build — materialise a runnable BioCypher project from mappings.

Reads every mappings/*.mapping.yaml in the project, optionally an alignment.yaml at the project root, and emits a build/ directory containing config/schema_config.yaml, the materialised mappings, per-mapping generated Python under build/generated/<stem>/, and a create_knowledge_graph.py entry point.

Strict: unresolved or legacy nodes/edges mappings cause the build to abort with a regeneration hint.

Headless ontology by default. The generated biocypher_config.yaml sets head_ontology: null, so the per-build class hierarchy is defined exclusively by schema_config.yaml, which is regenerated deterministically from the resolved mappings on every run. This avoids the remote Biolink fetch that has historically made graph builds slow and fragile.

Schema evolution happens between builds, via project.yaml (required_entities / required_relations) and biotope map, never within a single build, so agents cannot reassign node classes once the schema is locked in. To re-enable the Biolink hierarchy for a specific project, edit the generated build/config/biocypher_config.yaml; biotope writes that file only if it is missing, so user-authored changes survive later builds.
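
The write-only-if-missing behaviour for biocypher_config.yaml can be sketched as follows. This is an illustration under stated assumptions rather than biotope's code: `write_if_missing` is a hypothetical helper, and `DEFAULT_CONFIG` is a placeholder string whose nesting may differ from the file biotope actually generates.

```python
from pathlib import Path

# Placeholder content; the real generated file may nest this key differently.
DEFAULT_CONFIG = "head_ontology: null\n"


def write_if_missing(path: Path, content: str) -> bool:
    """Write content to path only when no file exists there yet.

    Returns True if the file was written, False if an existing file was kept.
    """
    if path.exists():
        # A user-authored (or previously generated) config is left untouched.
        return False
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(content, encoding="utf-8")
    return True
```

By contrast, a file such as schema_config.yaml, which is regenerated on every run, would be written unconditionally.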

build(mappings_dir, alignment_path, out_dir)

Build a deterministic BioCypher project from this biotope's mappings.

Source code in biotope/biotope/commands/build.py
@click.command()
@click.option(
    "--mappings-dir",
    type=click.Path(exists=True, file_okay=False, path_type=Path),
    default=None,
    help="Directory of mapping YAML files. Default: <project_root>/mappings.",
)
@click.option(
    "--alignment",
    "alignment_path",
    type=click.Path(exists=True, dir_okay=False, path_type=Path),
    default=None,
    help="Path to alignment.yaml. Default: <project_root>/alignment.yaml if present.",
)
@click.option(
    "--out",
    "-o",
    "out_dir",
    type=click.Path(file_okay=False, path_type=Path),
    default=None,
    help="Where to materialise the BioCypher project. Default: <project_root>/build.",
)
def build(mappings_dir: Path | None, alignment_path: Path | None, out_dir: Path | None) -> None:
    """Build a deterministic BioCypher project from this biotope's mappings."""
    project_path = find_project()
    if project_path is None:
        click.echo("❌ No project.yaml found. Run `biotope init <name>` first.")
        raise click.Abort
    project_root = project_path.parent.parent if project_path.parent.name == ".biotope" else project_path.parent

    mappings_dir = mappings_dir or (project_root / "mappings")
    mapping_paths = _discover_mapping_paths(mappings_dir)

    if not mapping_paths:
        click.echo(f"❌ No mapping YAML files found under {mappings_dir}.")
        click.echo("   Run `biotope map scaffold <croissant>` first.")
        raise click.Abort

    if alignment_path is None:
        candidate = project_root / "alignment.yaml"
        if candidate.is_file():
            alignment_path = candidate

    project = Project.load(project_path)
    out_dir = out_dir or (project_root / "build")
    try:
        result = materialize(
            out_dir,
            mapping_paths,
            alignment_path,
            required_entities=list(project.required_entities),
            required_relations=list(project.required_relations),
        )
    except ValueError as exc:
        click.echo(f"❌ Build aborted: {exc}")
        raise click.Abort from exc

    console.print(f"✅ Built BioCypher project at [cyan]{out_dir}[/cyan]")
    click.echo(json.dumps(result, indent=2, default=str))
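
The path defaulting in the command above is compact, so here is the same logic pulled out into two standalone functions for illustration. The function names are hypothetical and not part of biotope; the behaviour mirrors the listing: a project.yaml inside a .biotope directory resolves to the directory above it, and alignment.yaml is only used when it exists.

```python
from pathlib import Path


def resolve_project_root(project_path: Path) -> Path:
    """Return the project root for a project.yaml path, as in the listing above."""
    parent = project_path.parent
    # project.yaml may live in <root>/.biotope/ or directly in <root>/.
    return parent.parent if parent.name == ".biotope" else parent


def default_build_paths(project_root: Path) -> dict:
    """Defaults used when --mappings-dir, --alignment and --out are not given."""
    alignment = project_root / "alignment.yaml"
    return {
        "mappings_dir": project_root / "mappings",
        # alignment.yaml is optional: it is picked up only if the file is present.
        "alignment_path": alignment if alignment.is_file() else None,
        "out_dir": project_root / "build",
    }
```

Both `Path("/work/demo/.biotope/project.yaml")` and `Path("/work/demo/project.yaml")` resolve to `/work/demo`, so the defaults land in the same place regardless of where project.yaml is stored.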