Biotope Init
Draft stage
Biotope is in draft stage. Functionality may be missing or incomplete.
The API is subject to change.
Overview
The biotope init command initializes a new biotope project with interactive configuration. It sets up the necessary directory structure and configuration files for metadata management.
Features
Interactive Configuration
The init process guides you through several configuration options:
- Project Name: Set a name for your biotope project
- Git Integration: Choose whether to initialize Git version control
- Knowledge Graph: Optionally install a knowledge graph for enhanced data management
- Output Format: Select output format (only shown if knowledge graph is enabled)
- Project Metadata: Collect project-level metadata for annotation pre-filling
Project-Level Metadata Collection
During initialization, you can optionally provide project-level metadata, which is then used to pre-fill annotation fields:
- Description: Brief description of the project and its purpose
- URL: Project homepage, repository, or documentation URL
- Creator: Name and contact information of the project maintainer
- License: Data usage license (e.g., MIT, CC-BY)
- Citation: How to cite the project or dataset
This metadata is stored in .biotope/config/biotope.yaml and is loaded automatically when you run biotope annotate edit.
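The pre-fill behaviour described above can be pictured as a gap-filling merge. The sketch below is illustrative only and is not biotope's implementation: `prefill_annotation` is a hypothetical helper, and `PROJECT_METADATA` is a hand-written stand-in for the parsed `project_metadata` section of the configuration file. It assumes that values the user has already entered take precedence over project-level defaults.

```python
from copy import deepcopy

# Hypothetical stand-in for the parsed `project_metadata` section of
# .biotope/config/biotope.yaml (values copied from the example config).
PROJECT_METADATA = {
    "description": "Project description",
    "url": "https://example.com/project",
    "creator": {"name": "John Doe", "email": "john@example.com"},
    "license": "MIT",
    "citation": "Doe, J. (2024). Project Title. Journal Name.",
}


def prefill_annotation(annotation: dict, project_metadata: dict) -> dict:
    """Return a copy of `annotation` with missing or empty fields filled in.

    Assumption: user-entered values win over project-level defaults.
    """
    result = deepcopy(annotation)
    for key, value in project_metadata.items():
        # Fill only gaps: absent keys or empty placeholders.
        if result.get(key) in (None, "", {}, []):
            result[key] = deepcopy(value)
    return result


draft = {"name": "rna-seq-run-1", "license": "CC-BY"}
filled = prefill_annotation(draft, PROJECT_METADATA)
print(filled["license"])          # user value is kept
print(filled["creator"]["name"])  # taken from project metadata
```

Returning a copy rather than mutating the draft keeps the original annotation available for comparison.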
Conditional Output Format Selection
The output format prompt appears only if you choose to install a knowledge graph, because the format applies only to knowledge graph functionality.
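As a rough illustration of this conditional flow, the following sketch computes which prompts would be shown for a given knowledge graph answer. The function and the prompt identifiers are hypothetical names chosen for this example; they are not taken from biotope's source.

```python
from __future__ import annotations


def prompts_to_show(install_knowledge_graph: bool) -> list[str]:
    """List the init prompts in order, skipping the irrelevant one.

    The identifiers are illustrative labels for the options listed
    under "Interactive Configuration", not real biotope names.
    """
    prompts = ["project_name", "git_integration", "knowledge_graph"]
    if install_knowledge_graph:
        # Output format only matters when a knowledge graph is installed.
        prompts.append("output_format")
    prompts.append("project_metadata")
    return prompts


print(prompts_to_show(True))
print(prompts_to_show(False))
```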
Usage
Options
--dir, -d: Directory to initialize biotope project in (default: current directory)
Example
# Initialize in current directory
biotope init
# Initialize in specific directory
biotope init --dir /path/to/project
Configuration File Structure
The initialization creates a .biotope/config/biotope.yaml file with the following structure:
version: "1.0"
croissant_schema_version: "1.0"
default_metadata_template: "scientific"
data_storage:
  type: "local"
  path: "data"
checksum_algorithm: "sha256"
auto_stage: true
commit_message_template: "Update metadata: {description}"

# Project information (consolidated from internal metadata)
project_info:
  name: "my-project"
  created_at: "2024-01-01T00:00:00Z"
  biotope_version: "0.1.0"
  last_modified: "2024-01-01T00:00:00Z"
  builds: []
  knowledge_sources: []

# Project-level metadata for annotation pre-fill
project_metadata:
  description: "Project description"
  url: "https://example.com/project"
  creator:
    name: "John Doe"
    email: "john@example.com"
  license: "MIT"
  citation: "Doe, J. (2024). Project Title. Journal Name."

# Validation configuration
annotation_validation:
  enabled: true
  minimum_required_fields:
    - "name"
    - "description"
    - "creator"
    - "dateCreated"
    - "distribution"
  field_validation:
    name:
      type: "string"
      min_length: 1
    description:
      type: "string"
      min_length: 10
    creator:
      type: "object"
      required_keys: ["name"]
    dateCreated:
      type: "string"
      format: "date"
    distribution:
      type: "array"
      min_length: 1
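To make the `annotation_validation` section more concrete, here is a minimal sketch of how such rules could be applied to an annotation. It is an assumption-laden illustration, not biotope's validator: `validate_annotation` and `PYTHON_TYPES` are hypothetical names, the `VALIDATION` dict mirrors the example config by hand, and `format: "date"` is interpreted here as an ISO `YYYY-MM-DD` string, which the source does not specify.

```python
from datetime import date

# Hand-written mirror of the `annotation_validation` section above.
VALIDATION = {
    "enabled": True,
    "minimum_required_fields": [
        "name", "description", "creator", "dateCreated", "distribution",
    ],
    "field_validation": {
        "name": {"type": "string", "min_length": 1},
        "description": {"type": "string", "min_length": 10},
        "creator": {"type": "object", "required_keys": ["name"]},
        "dateCreated": {"type": "string", "format": "date"},
        "distribution": {"type": "array", "min_length": 1},
    },
}

# Assumed mapping from config type names to Python types.
PYTHON_TYPES = {"string": str, "object": dict, "array": list}


def validate_annotation(annotation: dict, config: dict) -> list:
    """Return a list of problems; an empty list means the annotation passes."""
    if not config.get("enabled", True):
        return []
    errors = []
    for field in config.get("minimum_required_fields", []):
        if field not in annotation:
            errors.append(f"missing required field: {field}")
    for field, rules in config.get("field_validation", {}).items():
        if field not in annotation:
            continue  # absence is already reported above
        value = annotation[field]
        expected = PYTHON_TYPES.get(rules.get("type"))
        if expected is not None and not isinstance(value, expected):
            errors.append(f"{field}: expected {rules['type']}")
            continue
        if "min_length" in rules and isinstance(value, (str, list, dict)):
            if len(value) < rules["min_length"]:
                errors.append(f"{field}: shorter than {rules['min_length']}")
        for key in rules.get("required_keys", []):
            if key not in value:
                errors.append(f"{field}: missing key {key}")
        if rules.get("format") == "date":
            try:
                date.fromisoformat(value)
            except ValueError:
                errors.append(f"{field}: not an ISO date (YYYY-MM-DD)")
    return errors


good = {
    "name": "sample",
    "description": "RNA-seq counts for run 1",
    "creator": {"name": "John Doe"},
    "dateCreated": "2024-01-01",
    "distribution": [{"name": "counts.csv"}],
}
bad = {"name": "", "description": "short", "creator": {}, "dateCreated": "not-a-date"}

print(validate_annotation(good, VALIDATION))
for problem in validate_annotation(bad, VALIDATION):
    print(problem)
```

Collecting every problem instead of stopping at the first lets a user fix an annotation in one pass.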
Directory Structure
The init command creates the following directory structure:
project-root/
├── .biotope/
│   ├── config/
│   │   └── biotope.yaml    # Consolidated configuration (Git-like)
│   ├── datasets/           # Croissant ML metadata files
│   ├── workflows/          # Bioinformatics workflow definitions
│   └── logs/               # Command execution logs
├── config/
│   └── biotope.yaml        # User-facing configuration
├── data/
│   ├── raw/
│   └── processed/
├── schemas/
└── outputs/
Note: The configuration follows a Git-like approach where .biotope/config/biotope.yaml contains all biotope-specific configuration, similar to how Git uses .git/config for its configuration.
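For readers who want to see the layout as code, the sketch below recreates the directory tree above in a temporary directory using only the standard library. `LAYOUT` and `scaffold` are hypothetical names for this example; the real command also writes configuration content, which this sketch replaces with empty placeholder files.

```python
import tempfile
from pathlib import Path

# Directories from the tree above, relative to the project root.
LAYOUT = [
    ".biotope/config",
    ".biotope/datasets",
    ".biotope/workflows",
    ".biotope/logs",
    "config",
    "data/raw",
    "data/processed",
    "schemas",
    "outputs",
]


def scaffold(root: Path) -> list:
    """Create the directory layout under `root` and return the created paths."""
    created = []
    for rel in LAYOUT:
        path = root / rel
        path.mkdir(parents=True, exist_ok=True)  # safe to re-run
        created.append(path)
    # Empty placeholders; the real files carry actual configuration.
    (root / ".biotope" / "config" / "biotope.yaml").touch()
    (root / "config" / "biotope.yaml").touch()
    return created


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "project-root"
    scaffold(root)
    for entry in sorted(root.rglob("*")):
        print(entry.relative_to(root))
```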
Managing Project Metadata
After initialization, you can manage project metadata using the biotope config command:
# Set project metadata
biotope config set-project-metadata
# Show current project metadata
biotope config show-project-metadata
biotope init — scaffold a new biotope project.
By default, init only scaffolds: it creates the directory layout, adds an
AGENTS.md for the agent surface, writes an empty project.yaml, and runs
git init. It asks no content questions. The agent (or the user, via
biotope map) fills in the competence questions afterwards.
Use --interactive to open $EDITOR on the freshly written
project.yaml so the user can fill in purpose: before init exits.
init(name, dir, purpose, no_prompt, no_git, visible, interactive)
Scaffold a new biotope project.
Default invocation: biotope init my-project. Creates my-project/ with
.biotope/, data/, mappings/, an AGENTS.md for agents to read,
and an empty project.yaml. Runs git init unless --no-git is set.
Source code in biotope/biotope/commands/init.py