biotope map

Semantic mapping command group. It replaces the removed biotope describe command (intent capture is now handled by intent flags on biotope map) and the deprecated heuristic biotope propose-mapping command (scaffolding is now biotope map scaffold).

The command never auto-picks a record set or fields. All semantic decisions are made by the human or copilot agent against deterministic inspection output.

biotope map (bare)

  • No flags: launches the guided wizard. The wizard prompts for the Croissant file or mapping if it can't auto-discover them, captures intent on first run if project.yaml is empty, walks each unresolved entity / relation slot in order, autosaves after every confirmed edit, and offers inline entity creation when a relation references a not-yet-defined entity.
  • Any intent flag present: applies the edit non-interactively and exits (agent-friendly). For example:
biotope map --purpose "..."                          # replace the project's purpose
biotope map --entity gene --entity disease           # append to required_entities
biotope map --relation gene_associated_with_disease  # append to required_relations
biotope map --source <path-or-url>                   # append to data_sources
biotope map --notes "..."                            # replace the notes field
biotope map --clear-entities --clear-relations --clear-sources  # empty the corresponding lists
biotope map --show                                   # print intent + mapping progress

The wizard also accepts --croissant <path> and --mapping <path> options to pin which files it operates on.
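
An agent driving the non-interactive form typically needs to assemble the flag list programmatically. The following is a minimal sketch, not part of biotope: the helper name intent_argv is hypothetical, and it only builds the argument list from the flags listed above without executing anything.

```python
import shlex


def intent_argv(purpose=None, entities=(), relations=(), sources=(), notes=None):
    """Build an argv list for a non-interactive `biotope map` call.

    Hypothetical helper: it only mirrors the intent flags documented above.
    """
    argv = ["biotope", "map"]
    if purpose is not None:
        argv += ["--purpose", purpose]
    # Repeatable flags are emitted once per value.
    for name in entities:
        argv += ["--entity", name]
    for name in relations:
        argv += ["--relation", name]
    for src in sources:
        argv += ["--source", src]
    if notes is not None:
        argv += ["--notes", notes]
    return argv


argv = intent_argv(
    entities=["gene", "disease"],
    relations=["gene_associated_with_disease"],
)
print(shlex.join(argv))
```

Passing the resulting list to subprocess.run avoids shell quoting issues for purpose or notes strings that contain spaces.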

biotope map inspect <croissant>

Deterministic inspector for a Croissant dataset. For each record set: name, description, source, field inventory (with scalar / array / struct kind), identifier-like candidates, explode-eligible arrays, and sample rows rendered as vertical key-value blocks. --json produces a stable machine-readable form for agents.

biotope map scaffold <croissant>

Emit an unresolved semantic mapping scaffold:

  • top-level croissant, empty ids
  • entities and relations keyed by project.yaml's required_entities / required_relations, normalised to snake_case
  • a YAML comment appendix at the bottom with the inspector output (record sets, fields, sample rows) so a human or agent can edit the file without re-running inspection
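
The exact snake_case rule biotope applies to required_entities / required_relations is not shown on this page. As an illustration only, one common normalisation looks like the sketch below; the function name to_snake_case and its handling of camelCase are assumptions.

```python
import re


def to_snake_case(name: str) -> str:
    """One plausible snake_case normalisation (assumed, not biotope's own)."""
    # Insert "_" at lower/digit -> upper boundaries, e.g. "GeneDisease".
    s = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name.strip())
    # Collapse every run of non-alphanumeric characters into one "_".
    s = re.sub(r"[^0-9A-Za-z]+", "_", s)
    return s.strip("_").lower()


print(to_snake_case("Gene Associated-With Disease"))
```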

Defaults to mappings/<croissant-stem>.mapping.yaml; pass --out or --stdout to override.
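
To make the default location concrete, here is a rough sketch of how such a path could be derived. The helper default_mapping_path is hypothetical, and how biotope computes <croissant-stem> for multi-part suffixes is an assumption here (this sketch strips ".croissant.json" or ".jsonld").

```python
from pathlib import Path


def default_mapping_path(croissant: str, project_root: Path) -> Path:
    """Derive mappings/<stem>.mapping.yaml (illustrative; stem rule assumed)."""
    name = Path(croissant).name
    for suffix in (".croissant.json", ".jsonld"):
        if name.endswith(suffix):
            stem = name[: -len(suffix)]
            break
    else:
        # Fall back to dropping only the last extension.
        stem = Path(name).stem
    return project_root / "mappings" / f"{stem}.mapping.yaml"


print(default_mapping_path("data/genes.jsonld", Path("proj")).as_posix())
```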

biotope map preview [<mapping>]

Validate a (partial) mapping and project its outputs. Tolerant of unresolved slots — reports them rather than crashing. Output:

  • resolved vs unresolved slots
  • validation findings (missing record sets, unknown fields, invalid explode targets, illegal $item placement, unresolved endpoints)
  • projected BioCypher schema (schema_term, input_label, namespace, represented_as, properties, edge source/target)
  • sample emitted tuples from resolved sections

Pass --json for agent consumption. The wizard reuses this engine after every confirmed edit.

Agent workflow

Agents bypass the wizard and instead:

  1. Set intent — biotope map --entity ... --relation ....
  2. Generate scaffold — biotope map scaffold <croissant>.
  3. Inspect — biotope map inspect <croissant> --json.
  4. Edit mappings/*.mapping.yaml directly.
  5. Validate — biotope map preview --json.
  6. biotope build.
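
The steps above can be sketched as an ordered command list that an agent runs around its own editing step. This is an illustrative outline, not a biotope API: the function names are hypothetical and the entity and relation values are the examples used earlier on this page.

```python
import subprocess


def agent_workflow(croissant: str) -> list[list[str]]:
    """Return the CLI calls for the agent workflow, in order."""
    return [
        # 1. Set intent.
        ["biotope", "map", "--entity", "gene", "--entity", "disease",
         "--relation", "gene_associated_with_disease"],
        # 2. Generate scaffold.
        ["biotope", "map", "scaffold", croissant],
        # 3. Inspect.
        ["biotope", "map", "inspect", croissant, "--json"],
        # 4. (The agent edits mappings/*.mapping.yaml itself; no command.)
        # 5. Validate.
        ["biotope", "map", "preview", "--json"],
        # 6. Build.
        ["biotope", "build"],
    ]


def run_all(commands: list[list[str]]) -> None:
    """Run each command in turn, stopping at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)


steps = agent_workflow("<croissant>")
```

In practice the agent would pause between the inspect and preview calls to edit the mapping file, and would repeat the preview call until no unresolved slots remain.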

biotope map — semantic mapping authoring commands.

  • biotope map (bare) launches the guided wizard, unless any intent flag (--purpose, --entity, --relation, --source, --notes, --clear-*, --show) is present, in which case it applies the edit non-interactively and exits.
  • biotope map inspect <croissant> — deterministic Croissant/data inspector.
  • biotope map scaffold <croissant> — non-interactive unresolved scaffold.
  • biotope map preview [<mapping>] — compile-in-memory preview.

All semantic decisions are made by the user or the editing agent. The CLI never auto-picks record sets or fields.

discover_croissants(project_root)

Return all Croissant metadata files under a project's .biotope/datasets/.

Looks for both *.jsonld (baker output) and *.croissant.json (legacy / user-supplied) shapes.

Source code in biotope/biotope/commands/map.py
def discover_croissants(project_root: Path) -> list[Path]:
    """Return all Croissant metadata files under a project's ``.biotope/datasets/``.

    Looks for both ``*.jsonld`` (baker output) and ``*.croissant.json`` (legacy /
    user-supplied) shapes.
    """
    datasets_root = project_root / ".biotope" / "datasets"
    if not datasets_root.is_dir():
        return []
    files = sorted(datasets_root.rglob("*.jsonld")) + sorted(datasets_root.rglob("*.croissant.json"))
    # Stable de-dup while preserving order.
    seen: set[Path] = set()
    out: list[Path] = []
    for f in files:
        if f not in seen:
            seen.add(f)
            out.append(f)
    return out
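
A small self-contained demonstration of the behaviour, using a temporary directory. The function body is copied from the listing above so the example runs on its own; the file names are made up for illustration.

```python
import tempfile
from pathlib import Path


def discover_croissants(project_root: Path) -> list[Path]:
    datasets_root = project_root / ".biotope" / "datasets"
    if not datasets_root.is_dir():
        return []
    files = sorted(datasets_root.rglob("*.jsonld")) + sorted(datasets_root.rglob("*.croissant.json"))
    seen: set[Path] = set()
    out: list[Path] = []
    for f in files:
        if f not in seen:
            seen.add(f)
            out.append(f)
    return out


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    datasets = root / ".biotope" / "datasets"
    (datasets / "nested").mkdir(parents=True)
    for rel in ("b.jsonld", "nested/a.jsonld", "legacy.croissant.json", "notes.txt"):
        (datasets / rel).write_text("{}")

    found = [p.relative_to(datasets).as_posix() for p in discover_croissants(root)]
    empty = discover_croissants(root / "missing")  # no .biotope/datasets here

print(found)  # ['b.jsonld', 'nested/a.jsonld', 'legacy.croissant.json']
print(empty)  # []
```

Note that the result is not globally sorted: all *.jsonld matches come first (sorted among themselves), followed by the *.croissant.json matches.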

inspect(croissant, as_json, preview_rows)

Inspect a Croissant dataset deterministically.

Source code in biotope/biotope/commands/map.py
@map_group.command()
@click.argument("croissant", type=str)
@click.option("--json", "as_json", is_flag=True, help="Emit a machine-readable JSON inspection.")
@click.option("--preview-rows", type=click.IntRange(min=0), default=3, show_default=True)
def inspect(croissant: str, as_json: bool, preview_rows: int) -> None:
    """Inspect a Croissant dataset deterministically."""
    dataset = _load_croissant(croissant)
    datasets_location = infer_datasets_location(croissant)
    inspection = inspect_dataset(
        dataset,
        datasets_location=datasets_location,
        preview_rows=preview_rows,
    )
    if as_json:
        click.echo(json.dumps(inspection.to_json(), indent=2, default=str))
        return
    click.echo(render_inspection_text(inspection))
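
The listing passes default=str to json.dumps. A brief illustration of what that argument does, using made-up values (the real inspection payload's keys are not shown on this page): any value the JSON encoder cannot serialise natively is converted with str() instead of raising TypeError.

```python
import json
from datetime import date
from pathlib import PurePosixPath

# Hypothetical payload containing values that are not JSON-native.
payload = {"source": PurePosixPath("data/genes.csv"), "modified": date(2024, 1, 2)}

text = json.dumps(payload, indent=2, default=str)
decoded = json.loads(text)
print(decoded)  # {'source': 'data/genes.csv', 'modified': '2024-01-02'}
```

The trade-off is that such values come back as plain strings, so a consumer has to re-parse them if it needs the original types.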

map_group(ctx, croissant, mapping_path, purpose, entities, relations, sources, notes, clear_entities, clear_relations, clear_sources, show)

Semantic mapping for a Croissant dataset.

Source code in biotope/biotope/commands/map.py
@click.group(invoke_without_command=True, name="map")
@click.option(
    "--croissant",
    "-c",
    "croissant",
    type=str,
    default=None,
    help="Path to a Croissant JSON-LD file for the wizard to operate on.",
)
@click.option("--mapping", "mapping_path", type=click.Path(path_type=Path), default=None)
@click.option("--purpose", "-p", type=str, default=None, help="Replace the project's purpose statement.")
@click.option("--entity", "-e", "entities", multiple=True, help="Add to required_entities. Repeatable.")
@click.option("--relation", "-r", "relations", multiple=True, help="Add to required_relations. Repeatable.")
@click.option("--source", "-s", "sources", multiple=True, help="Add to data_sources. Repeatable.")
@click.option("--notes", type=str, default=None, help="Replace the notes field.")
@click.option("--clear-entities", is_flag=True, help="Empty required_entities before adding.")
@click.option("--clear-relations", is_flag=True, help="Empty required_relations before adding.")
@click.option("--clear-sources", is_flag=True, help="Empty data_sources before adding.")
@click.option("--show", is_flag=True, help="Print intent + mapping progress and exit.")
@click.pass_context
def map_group(
    ctx: click.Context,
    croissant: str | None,
    mapping_path: Path | None,
    purpose: str | None,
    entities: tuple[str, ...],
    relations: tuple[str, ...],
    sources: tuple[str, ...],
    notes: str | None,
    clear_entities: bool,
    clear_relations: bool,
    clear_sources: bool,
    show: bool,
) -> None:
    """Semantic mapping for a Croissant dataset."""
    if ctx.invoked_subcommand is not None:
        return

    intent_flags_present = any(
        [
            purpose is not None,
            notes is not None,
            entities,
            relations,
            sources,
            clear_entities,
            clear_relations,
            clear_sources,
            show,
        ],
    )

    if intent_flags_present:
        _apply_intent_flags(
            purpose=purpose,
            entities=entities,
            relations=relations,
            sources=sources,
            notes=notes,
            clear_entities=clear_entities,
            clear_relations=clear_relations,
            clear_sources=clear_sources,
            show=show,
        )
        return

    from biotope.commands.map_wizard import launch_wizard

    launch_wizard(croissant_arg=croissant, mapping_arg=mapping_path)
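
The dispatch rule in the listing can be isolated as a plain function to show its edge cases. This is a sketch that mirrors the any([...]) check above; the standalone function name intent_flags_present is introduced here for illustration.

```python
def intent_flags_present(
    purpose=None,
    notes=None,
    entities=(),
    relations=(),
    sources=(),
    clear_entities=False,
    clear_relations=False,
    clear_sources=False,
    show=False,
) -> bool:
    """Mirror of the check that decides between wizard and non-interactive mode."""
    return any(
        [
            purpose is not None,  # an empty string still counts as "given"
            notes is not None,
            entities,  # empty tuples are falsy, so unused repeatable flags do not count
            relations,
            sources,
            clear_entities,
            clear_relations,
            clear_sources,
            show,
        ]
    )


print(intent_flags_present())                    # False
print(intent_flags_present(purpose=""))          # True
print(intent_flags_present(entities=("gene",)))  # True
```

Because --croissant and --mapping are not part of the check, passing only those options still launches the wizard.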

preview(mapping_path, as_json, sample_rows)

Validate a (partial) mapping and project its outputs.

With no path, previews every mapping under the project's mappings/ dir (multi-mapping projects are the norm); pass an explicit path to preview just one.

Source code in biotope/biotope/commands/map.py
@map_group.command()
@click.argument(
    "mapping_path",
    type=click.Path(exists=True, dir_okay=False, path_type=Path),
    required=False,
)
@click.option("--json", "as_json", is_flag=True, help="Emit a machine-readable JSON preview.")
@click.option("--rows", "sample_rows", type=click.IntRange(min=0), default=3, show_default=True)
def preview(mapping_path: Path | None, as_json: bool, sample_rows: int) -> None:
    """Validate a (partial) mapping and project its outputs.

    With no path, previews every mapping under the project's ``mappings/`` dir
    (multi-mapping projects are the norm); pass an explicit path to preview
    just one.
    """
    if mapping_path is not None:
        paths = [mapping_path]
    else:
        paths = _discover_project_mappings()
        if not paths:
            click.echo("❌ No mapping file found. Pass a path or run `biotope map scaffold` first.")
            raise click.Abort

    previews: list[tuple[Path, Mapping, object]] = []
    for path in paths:
        mapping = load_mapping(path)
        dataset = _load_croissant(mapping.croissant)
        datasets_location = infer_datasets_location(mapping.croissant)
        result = preview_mapping(
            mapping,
            dataset,
            datasets_location=datasets_location,
            sample_rows=sample_rows,
        )
        previews.append((path, mapping, result))

    aggregated = aggregate_previews([(path.name, result) for path, _, result in previews])

    if as_json:
        payload = {
            "global": aggregated.to_json(),
            "mappings": {path.name: result.to_json() for path, _, result in previews},
        }
        click.echo(json.dumps(payload, indent=2, default=str))
        return

    _render_global_schema_rich(aggregated)
    _render_slot_resolution_rich(aggregated, [path.name for path, _, _ in previews])
    _render_global_findings_rich(aggregated)
    for path, _, result in previews:
        _render_per_file_panels(path, result)
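
With --json, the listing emits one object with two top-level keys: "global" (the aggregated view) and "mappings" (one entry per mapping file name). A minimal sketch of consuming that envelope follows; the inner objects are left empty because their shape is not documented on this page, and the file names are invented.

```python
import json


def split_preview_payload(text: str):
    """Split the preview JSON into its aggregated and per-file parts."""
    payload = json.loads(text)
    return payload["global"], payload["mappings"]


# Stand-in for the stdout of `biotope map preview --json`.
sample = json.dumps(
    {
        "global": {},
        "mappings": {"genes.mapping.yaml": {}, "diseases.mapping.yaml": {}},
    }
)

global_view, per_file = split_preview_payload(sample)
names = sorted(per_file)
print(names)  # ['diseases.mapping.yaml', 'genes.mapping.yaml']
```

The "mappings" object is keyed by path.name, so two mapping files with the same file name in different directories would collide under that key.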

scaffold(croissant, out, to_stdout, preview_rows)

Generate an unresolved semantic mapping scaffold for a Croissant file.

Source code in biotope/biotope/commands/map.py
@map_group.command()
@click.argument("croissant", type=str)
@click.option(
    "--out",
    "-o",
    type=click.Path(dir_okay=False, path_type=Path),
    default=None,
    help="Where to write the scaffold. Default: mappings/<stem>.mapping.yaml under the project root.",
)
@click.option("--stdout", "to_stdout", is_flag=True, help="Print to stdout instead of writing a file.")
@click.option("--preview-rows", type=click.IntRange(min=0), default=3, show_default=True)
def scaffold(croissant: str, out: Path | None, to_stdout: bool, preview_rows: int) -> None:
    """Generate an unresolved semantic mapping scaffold for a Croissant file."""
    if out is not None and to_stdout:
        raise click.UsageError("Choose either --out or --stdout, not both.")

    _load_croissant(croissant)  # validate up front with friendly errors

    target = out
    if target is None and not to_stdout:
        target = _default_output_path(croissant)
        if target is not None:
            target.parent.mkdir(parents=True, exist_ok=True)

    project = _load_project_optional()
    result = scaffold_mapping(
        croissant,
        required_entities=list(project.required_entities) if project else [],
        required_relations=list(project.required_relations) if project else [],
        purpose=project.purpose if project else None,
        write_to=target,
        preview_rows=preview_rows,
    )
    if target:
        console.print(f"✅ Wrote {target}")
        unresolved = result.get("unresolved") or []
        if unresolved:
            console.print(
                f"[yellow]ℹ[/yellow] {len(unresolved)} unresolved slot(s); run [bold]biotope map[/bold] to resolve."
            )
    else:
        click.echo(result["yaml"], nl=False)
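
The output-target decision in the listing reduces to a small rule, sketched below as a standalone function. The name resolve_scaffold_target is hypothetical, ValueError stands in for click.UsageError, and the default path is passed in rather than derived (in the listing it comes from _default_output_path, which may also return None).

```python
from pathlib import Path
from typing import Optional


def resolve_scaffold_target(out: Optional[Path], to_stdout: bool, default: Path) -> Optional[Path]:
    """Return the file to write, or None when the YAML should go to stdout."""
    if out is not None and to_stdout:
        raise ValueError("Choose either --out or --stdout, not both.")
    if out is not None:
        return out
    return None if to_stdout else default


default = Path("mappings") / "genes.mapping.yaml"
print(resolve_scaffold_target(None, False, default) == default)  # True
print(resolve_scaffold_target(None, True, default))              # None
```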