Installation
Install as a dependency for your KG pipeline
The recommended way of installing BioCypher is through the PyPI distribution. You can use any package manager that can install from PyPI, such as pip, conda, or Poetry. We recommend Poetry; see the Poetry documentation for how to install it. You can install BioCypher as a dependency as follows:
poetry new my-awesome-kg-project
cd my-awesome-kg-project
poetry add biocypher
Alternatively, using conda/pip:
conda create --name biocypher python=3.10
conda activate biocypher
pip install biocypher
Note
BioCypher generally supports the most recent three Python versions. If you encounter any issues with a specific Python version, please open an issue on GitHub.
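To confirm which Python and BioCypher versions are active in the current environment, a minimal sketch using only the standard library could look like the following. The helper name installed_version is hypothetical and not part of BioCypher.

```python
import sys
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Report the interpreter version and the BioCypher distribution, if any.
    print("Python:", ".".join(map(str, sys.version_info[:3])))
    print("biocypher:", installed_version("biocypher") or "not installed")
```

Including this output when opening an issue on GitHub makes version-specific problems easier to reproduce.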
For developers
If you want to install BioCypher directly from the source repository, follow these steps (requires Poetry):
git clone https://github.com/biocypher/biocypher
cd biocypher
poetry install
Poetry creates a virtual environment for you (its name starts with biocypher-; alternatively, you can name it yourself) and installs all dependencies.
If you want to run the tests that use a local Neo4j or PostgreSQL DBMS (database management system) instance:
Make sure that you have a Neo4j instance running on the standard Bolt port 7687, with the APOC plugin installed and a database named test.
Make sure that a PostgreSQL instance is installed locally, together with the psql command line tool, and running on the standard port 5432.
Activate the virtual environment by running
poetry shell
and then run the tests from the root directory of the repository by running
pytest --password=<your DBMS password>
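Before running the DBMS-backed tests, it can help to check that both services are actually listening. The sketch below only tests TCP reachability of the two standard ports named above. It does not verify the APOC plugin, the test database, or credentials. The helper name port_open and the use of localhost are assumptions for illustration.

```python
import socket


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # Ports from the text: 7687 (Neo4j Bolt) and 5432 (PostgreSQL).
    for name, port in [("Neo4j (Bolt)", 7687), ("PostgreSQL", 5432)]:
        status = "reachable" if port_open("localhost", port) else "not reachable"
        print(f"{name} on port {port}: {status}")
```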
Once this is set up, you can go through the tutorial or use BioCypher in your project as a local dependency.
Configuration
BioCypher comes with a default set of configuration parameters. You can override them by creating a biocypher_config.yaml file in the root directory or the config directory of your project. You only need to specify the parameters you wish to override. If you want to create global user settings, you can create a biocypher_config.yaml in your default BioCypher user directory (as found using appdirs.user_config_dir('biocypher')). For instance, on macOS, this would be ~/Library/Caches/biocypher/biocypher_config.yaml. Finally, you can also point an instance of the BioCypher class to any YAML file using the biocypher_config_path parameter.
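The two project-level locations mentioned above can be checked with a short standard-library sketch. The helper names are hypothetical, and the choice to prefer the root directory over the config directory when both files exist is an assumption made for this example, not documented BioCypher behavior.

```python
from pathlib import Path

CONFIG_NAME = "biocypher_config.yaml"


def project_config_candidates(project_root="."):
    """List the project-level locations named in the text, root directory first."""
    root = Path(project_root)
    return [root / CONFIG_NAME, root / "config" / CONFIG_NAME]


def find_project_config(project_root="."):
    """Return the first candidate that exists as a file, or None if there is none."""
    for candidate in project_config_candidates(project_root):
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    found = find_project_config(".")
    print(found if found is not None else "no project-level config found")
```

A path found this way, or any other YAML file, can then be handed to the BioCypher class through the biocypher_config_path parameter described above.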
Note
It is important to follow the rules of indentation in the YAML file. BioCypher module configuration is found under the top-level keyword biocypher, while the settings for each DBMS (e.g., Neo4j) are found under its respective keyword (e.g., neo4j).
Quote characters
If possible, avoid using quote characters in your YAML files. If you need to quote, for instance a tab delimiter (\t), use single quotes ('), since double quotes (") allow parsing of escape characters in YAML, which can cause issues downstream. It is safe to use double quotes to quote a single quote character ("'").
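The difference between the two quoting styles can be seen by parsing a small fragment. This sketch assumes the PyYAML package (imported as yaml) is available in the environment. The key name delimiter is only an illustrative placeholder, not necessarily a real BioCypher setting.

```python
import yaml  # PyYAML; assumed to be installed

# Single quotes: the scalar stays a literal backslash followed by "t".
single = yaml.safe_load(r"delimiter: '\t'")

# Double quotes: YAML interprets the escape and yields a real tab character.
double = yaml.safe_load(r'delimiter: "\t"')

print(repr(single["delimiter"]), len(single["delimiter"]))
print(repr(double["delimiter"]), len(double["delimiter"]))
```

The single-quoted value has two characters, while the double-quoted value is a single tab, which is why the two forms can behave differently downstream.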
Configuration files are read in the order default -> user level -> project level, with later files overriding earlier ones. The following parameters are available:
BioCypher settings
biocypher:  ### BioCypher module configuration ###

  ### Required parameters ###

  # DBMS type
  dbms: neo4j

  # Offline mode: do not connect to a running DBMS instance
  # Can be used e.g. for writing batch import files
  offline: true

  # Strict mode: do not allow to create new nodes or relationships without
  # specifying source, version, and license parameters
  strict_mode: false

  # Schema configuration: mapping of inputs to ontology
  user_schema_config_path: biocypher/_config/test_schema_config.yaml

  # Ontology configuration
  head_ontology:
    url: https://github.com/biolink/biolink-model/raw/v3.2.1/biolink-model.owl.ttl
    root_node: entity

  ### Optional parameters ###

  # Logging granularity
  # Set debug to true if more granular logging is desired
  debug: false

  # Set to change the log directory
  log_directory: biocypher-log

  # Set to change the output directory
  output_directory: biocypher-out

  # Set to change the Resource cache directory
  cache_directory: .cache

  # Optional tail ontologies
  tail_ontologies:
    so:
      url: test/ontologies/so.owl
      head_join_node: sequence variant
      tail_join_node: sequence_variant
    mondo:
      url: test/ontologies/mondo.owl
      head_join_node: disease
      tail_join_node: disease
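The read order described earlier (default -> user level -> project level) can be illustrated with plain dictionaries standing in for the parsed YAML files. This is a minimal sketch of section-wise overriding under that stated order, not BioCypher's actual merging code. The function name merge_configs and the example values are assumptions.

```python
import copy


def merge_configs(default, user=None, project=None):
    """Merge configs section by section; later sources override earlier ones."""
    merged = copy.deepcopy(default)
    for source in (user or {}, project or {}):
        for section, values in source.items():
            if isinstance(values, dict) and isinstance(merged.get(section), dict):
                # Update individual keys so unspecified settings keep their value.
                merged[section].update(copy.deepcopy(values))
            else:
                merged[section] = copy.deepcopy(values)
    return merged


# Illustrative values only; each dict stands in for one parsed YAML file.
default = {"biocypher": {"dbms": "neo4j", "offline": True, "debug": False}}
user = {"biocypher": {"debug": True}}
project = {"biocypher": {"offline": False, "output_directory": "my-out"}}

merged = merge_configs(default, user, project)
print(merged["biocypher"])
# {'dbms': 'neo4j', 'offline': False, 'debug': True, 'output_directory': 'my-out'}
```

Only the keys that a user-level or project-level file specifies change; everything else falls through from the earlier source, which matches the advice to list only the parameters you wish to override.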