API Reference

The main BioCypher interface

Create a BioCypher instance by running:

from biocypher import BioCypher
bc = BioCypher()

Most settings should be configured in YAML files. See below for more information on the BioCypher class.

BioCypher([dbms, offline, strict_mode, ...])

Orchestration of BioCypher operations.

Database creation by file

Using the BioCypher instance, you can create a database from files with the BioCypher.write_nodes() and BioCypher.write_edges() methods, which accept collections of nodes and edges either as tuples or as BioCypherNode and BioCypherEdge objects. For example:

# given lists of nodes and edges
bc.write_nodes(node_list)
bc.write_edges(edge_list)
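
As a rough illustration of the tuple form mentioned above, the sketch below builds node and edge collections in plain Python. The tuple layouts (ID, label, properties for nodes; ID, source ID, target ID, label, properties for edges) are an assumption based on the convention used in the BioCypher tutorials, and all identifiers, labels, and helper names are made up. The final calls are commented out because they need a configured BioCypher instance.

```python
# Hypothetical input data; identifiers and labels are illustrative only.
proteins = ["P12345", "Q67890"]


def make_nodes(ids):
    """Yield node tuples in the assumed (id, label, properties) layout."""
    for pid in ids:
        yield (pid, "protein", {"source": "example"})


def make_edges(ids):
    """Yield edge tuples in the assumed (id, source, target, label, properties) layout."""
    for src, tgt in zip(ids, ids[1:]):
        yield (None, src, tgt, "protein_protein_interaction", {"method": "example"})


node_list = list(make_nodes(proteins))
edge_list = list(make_edges(proteins))

# With a configured instance, these would be passed on as in the example above:
# bc.write_nodes(node_list)
# bc.write_edges(edge_list)
```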

Note

To facilitate the interaction with the various database management systems (DBMSs), BioCypher provides utility functions, such as writing a Neo4j admin import statement to be used for creating a Neo4j database (BioCypher.write_import_call()). The most commonly used utility functions are also available in the wrapper function BioCypher.summary(). See the BioCypher class for more information.

Details about the biocypher.output.write module responsible for these methods can be found below.

_get_writer.get_writer(dbms, translator, ...)

Function to return the writer class based on the selection in the config file.

_writer._Writer(translator, deduplicator[, ...])

Abstract class for writing node and edge representations to disk.

_batch_writer._BatchWriter(translator, ...)

Abstract batch writer class.

graph._neo4j._Neo4jBatchWriter(*args, **kwargs)

Class for writing node and edge representations to disk using the format specified by Neo4j for the use of admin import.

graph._arangodb._ArangoDBBatchWriter(*args, ...)

Class for writing node and edge representations to disk using the format specified by ArangoDB for the use of "arangoimport".

graph._rdf._RDFWriter(translator, ...[, ...])

Class to write BioCypher's property graph into an RDF format using rdflib and all the extensions it supports (RDF/XML, N3, NTriples, N-Quads, Turtle, TriX, Trig and JSON-LD).

graph._networkx._NetworkXWriter(*args, **kwargs)

Class for writing nodes and edges to a networkx DiGraph.

relational._postgresql._PostgreSQLBatchWriter(...)

Class for writing node and edge representations to disk using the format specified by PostgreSQL for the use of "COPY FROM...".

relational._sqlite._SQLiteBatchWriter(*args, ...)

Class for writing node and edge representations to a SQLite database.

relational._csv._PandasCSVWriter(*args[, ...])

Class for writing node and edge representations to a CSV file.

In-memory Pandas knowledge graph

BioCypher provides a wrapper around the pandas.DataFrame class to facilitate the creation of a knowledge graph in memory. This is useful for testing, small datasets, and for workflows that should remain purely in Python. Example usage:

from biocypher import BioCypher
bc = BioCypher()
# given lists of nodes and edges
bc.add(node_list)
bc.add(edge_list)
# show list of dataframes (one per node/edge type)
dfs = bc.to_df()
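
The idea of one table per node or edge type can be sketched without pandas by grouping entities by their label. This illustrates the concept only and is not how the Pandas class is implemented; the tuple layout, the sample data, and the group_by_type helper are assumptions.

```python
from collections import defaultdict

# Hypothetical node tuples in an assumed (id, label, properties) layout.
nodes = [
    ("P12345", "protein", {"name": "A"}),
    ("Q67890", "protein", {"name": "B"}),
    ("DOID:1", "disease", {"name": "C"}),
]


def group_by_type(entities):
    """Collect one table (a list of row dicts) per label."""
    tables = defaultdict(list)
    for entity_id, label, props in entities:
        tables[label].append({"id": entity_id, **props})
    return dict(tables)


tables = group_by_type(nodes)
# Each list of rows could be handed to pandas.DataFrame(rows) if pandas is installed.
```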

Details about the biocypher.output.in_memory module responsible for these methods can be found below.

_pandas.Pandas(translator, deduplicator)

Database creation and manipulation by Driver

BioCypher also provides a driver for each of the supported DBMSs. The driver can create a database and write nodes and edges to it, and it also supports finer-grained operations that the file-based workflow, which builds a database from scratch, does not need. These include merging (creating entities only if they do not already exist) and deletion. For example:

from biocypher import BioCypher
bc = BioCypher()
# given lists of nodes and edges
bc.merge_nodes(node_set_1)
bc.merge_edges(edge_set_1)
bc.merge_nodes(node_set_2)
bc.merge_edges(edge_set_2)
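
The merge semantics described above (create an entity only if it does not exist yet) can be sketched with a plain dictionary standing in for the database. This is an illustration only, not the driver's implementation; the store, the merge_nodes helper, and the tuple layout are assumptions.

```python
def merge_nodes(store, nodes):
    """Add nodes whose ID is not in the store yet; return how many were created."""
    created = 0
    for node_id, label, props in nodes:
        if node_id not in store:
            store[node_id] = (label, dict(props))
            created += 1
    return created


store = {}  # stand-in for the database
node_set_1 = [("P12345", "protein", {}), ("Q67890", "protein", {})]
node_set_2 = [("Q67890", "protein", {}), ("O11111", "protein", {})]

created_first = merge_nodes(store, node_set_1)   # both nodes are new
created_second = merge_nodes(store, node_set_2)  # only "O11111" is new
```

Because the second set overlaps with the first, only the unseen node is created, which is the property that makes repeated merges safe.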

Details about the biocypher.output.connect module responsible for these methods can be found below.

_neo4j_driver.get_driver(dbms, translator)

Function to return the driver class.

_neo4j_driver._Neo4jDriver(database_name, ...)

Manages a BioCypher connection to a Neo4j database using the neo4j_utils.Driver class.

Download and cache functionality

BioCypher provides download and cache functionality for resources. Resources are defined via the abstract Resource class, which has a name, a URL (or a list of URLs), and a lifetime (in days; set to 0 for infinite). Two classes inherit from Resource: FileDownload and APIRequest. The Downloader can deal with single files, lists of files, compressed files, and directories (which must be indicated using the is_dir parameter of FileDownload). It uses Pooch under the hood to handle the downloading of files and Python’s requests library to perform API requests. Example usage:

from biocypher import BioCypher, FileDownload, APIRequest
bc = BioCypher()

resource1 = FileDownload(
    name="file_list_resource",
    url_s=[
        "https://example.com/file_download1.txt",
        "https://example.com/file_download2.txt"
    ],
    lifetime=1
)
resource2 = FileDownload(
    name="zipped_resource",
    url_s="https://example.com/file_download3.zip",
    lifetime=7
)
resource3 = FileDownload(
    name="directory_resource",
    url_s="https://example.com/file_download4/",
    lifetime=7,
    is_dir=True,
)
resource4 = APIRequest(
    name="list_api_request",
    url_s=[
        "https://api.example.org/api_request1",
        "https://api.example.org/api_request2",
    ],
    lifetime=7,
)
resource5 = APIRequest(
    name="api_request",
    url_s="https://api.example.org/api_request1",
    lifetime=7,
)
resource_list = [resource1, resource2, resource3, resource4, resource5]
paths = bc.download(resource_list)

The files and the results of API requests will be stored in the cache directory, in subfolders named after the resources; the exact layout is additionally determined by Pooch (e.g., extraction of zip files can result in multiple new files). All paths of downloaded files are returned by the download method. The Downloader class can also be used directly, without the BioCypher instance. You can set the cache directory in the configuration file; if not set, it will use the name attribute of a TemporaryDirectory from the tempfile module. More details about the Resource, FileDownload, APIRequest, and Downloader classes can be found below.

Resource(name, url_s[, lifetime])

APIRequest(name, url_s[, lifetime])

FileDownload(name, url_s[, lifetime, is_dir])

Downloader([cache_dir])
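
The lifetime rule described in this section (a lifetime in days, with 0 meaning the cached copy never expires) can be sketched as a small check. This is a minimal illustration under stated assumptions, not the Downloader's actual logic; the is_expired helper and the way the download time is recorded are hypothetical.

```python
from datetime import datetime, timedelta


def is_expired(downloaded_at, lifetime_days, now=None):
    """Return True if a cached resource should be downloaded again."""
    if lifetime_days == 0:
        # A lifetime of 0 is treated as infinite: never expire.
        return False
    now = now or datetime.now()
    return now - downloaded_at > timedelta(days=lifetime_days)


downloaded = datetime(2024, 1, 1)
check_time = datetime(2024, 1, 5)  # four days after the download

week_expired = is_expired(downloaded, 7, check_time)      # lifetime=7
day_expired = is_expired(downloaded, 1, check_time)       # lifetime=1
infinite_expired = is_expired(downloaded, 0, check_time)  # lifetime=0
```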

Ontology ingestion, parsing, and manipulation

Ontology(head_ontology[, ontology_mapping, ...])

A class that represents the ontological "backbone" of a BioCypher knowledge graph.

OntologyAdapter(ontology_file, root_label[, ...])

Class that represents an ontology to be used in the BioCypher framework.

Mapping of data inputs to KG ontology

OntologyMapping([config_file])

Class to store the ontology mapping and extensions.

Base classes for node and edge representations in BioCypher

BioCypherNode(node_id, node_label, ...)

Handoff class to represent biomedical entities as Neo4j nodes.

BioCypherEdge(source_id, target_id, ...)

Handoff class to represent biomedical relationships in Neo4j.

BioCypherRelAsNode(node, source_edge, ...)

Class to represent relationships as nodes (with in- and outgoing edges) as a triplet of a BioCypherNode and two BioCypherEdges.
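
The relationship-as-node idea can be sketched with simplified stand-in classes: one relationship becomes a node plus one incoming and one outgoing edge. The Node and Edge dataclasses, the reify helper, and the edge labels below are hypothetical and much simpler than BioCypherNode, BioCypherEdge, and BioCypherRelAsNode.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    source_id: str
    target_id: str
    label: str


def reify(rel_id, rel_label, source_id, target_id, properties=None):
    """Turn one relationship into a node plus an incoming and an outgoing edge."""
    node = Node(rel_id, rel_label, properties or {})
    # Edge labels are placeholders chosen for this sketch.
    source_edge = Edge(source_id, rel_id, "SOURCE_TO_REL")
    target_edge = Edge(rel_id, target_id, "REL_TO_TARGET")
    return node, source_edge, target_edge


node, source_edge, target_edge = reify(
    "int1", "interaction", "P12345", "Q67890", {"method": "example"}
)
```

Representing a relationship this way lets it carry its own properties and be linked to further entities, at the cost of two edges per relationship.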

Translation functionality for implemented types of representation

Translator(ontology[, strict_mode])

Class responsible for executing the translation process that is configured in the schema_config.yaml file.

Logging

get_logger([name])

Access the module logger; create a new one if it does not exist yet.

Miscellaneous utility functions

to_list(value)

Ensures that value is a list.

ensure_iterable(value)

Returns iterables as-is, except strings; wraps strings and simple types in a tuple.

create_tree_visualisation(inheritance_graph)

Creates a visualisation of the inheritance tree using treelib.

from_pascal(s[, sep])

pascalcase_to_sentencecase(s)

Convert PascalCase to sentence case.

snakecase_to_sentencecase(s)

Convert snake_case to sentence case.

sentencecase_to_snakecase(s)

Convert sentence case to snake_case.

sentencecase_to_pascalcase(s[, sep])

Convert sentence case to PascalCase.
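
The behaviour described for these helpers can be approximated with the standard library. The functions below are stand-ins written for illustration, with hypothetical names; they are not the library's implementations and may differ from them in edge cases such as acronyms or unusual separators.

```python
import re


def to_list_sketch(value):
    """Return lists unchanged, wrap strings and scalars, convert other iterables."""
    if isinstance(value, list):
        return value
    if isinstance(value, (str, bytes)):
        return [value]
    try:
        return list(value)
    except TypeError:
        return [value]


def ensure_iterable_sketch(value):
    """Return iterables unchanged, but wrap strings and scalars in a tuple."""
    if isinstance(value, (str, bytes)):
        return (value,)
    try:
        iter(value)
    except TypeError:
        return (value,)
    return value


def pascal_to_sentence(s):
    # Insert a space before each capital that follows a lowercase letter or digit.
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", s).lower()


def snake_to_sentence(s):
    return s.replace("_", " ").lower()


def sentence_to_snake(s):
    return re.sub(r"\s+", "_", s.strip()).lower()


def sentence_to_pascal(s):
    # Capitalise the first letter of each whitespace-separated word.
    return "".join(word[:1].upper() + word[1:] for word in s.split())


sentence = pascal_to_sentence("ProteinProteinInteraction")
snake = sentence_to_snake(sentence)
pascal = sentence_to_pascal(sentence)
```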