PostgreSQL

When setting the dbms parameter in the biocypher_config.yaml to postgres, the BioCypher Knowledge Graph is written to a PostgreSQL database. PostgreSQL is a relational database management system.

Install PostgreSQL

To get a PostgreSQL instance running quickly, you can use Docker. The following command will start a PostgreSQL instance with the password postgres and the port 5432 exposed to the host system.

docker run --restart always --publish=5432:5432 --env POSTGRES_PASSWORD=postgres -d postgres

The postgresql-client (also known as psql command line tool) can be used to connect and interact with the running PostgreSQL database instance. Installation instructions can be found here.

PostgreSQL settings

To overwrite the standard settings of PostgreSQL, add a postgresql section to the biocypher_config.yaml file. The following settings are possible:

biocypher_config.yaml

postgresql:  ### PostgreSQL configuration ###

  # PostgreSQL connection credentials
  database_name: postgres
  user: postgres
  password: postgres
  port: 5432

  # PostgreSQL import batch writer settings
  quote_character: '"'
  delimiter: '\t'

  # Import call prefixes to adjust the autogenerated shell script
  import_call_bin_prefix: ''  # path to "psql"
  import_call_file_prefix: /path/to/files/

Offline mode

Running BioCypher

After running BioCypher with the offline parameter set to true and the dbms set to postgres, the output folder contains:

entity-create_table.sql: The SQL scripts to create the tables for the nodes/edges. Entity is replaced by your nodes and edges and for each node and edge type an own SQL script is generated.
entity-part000.csv: The CSV file containing the data for the entity.
postgres-import-call.sh: The import script to create a database with the SQL scripts and insert the data from the CSV files.

Note

If the postgres-import-call.sh is missing, you can create it by running bc.write_import_call().

Create the PostgreSQL database

A running PostgreSQL instance (e.g. in a Docker container created with the command from above) is required to create the database and import the data. By running the import script postgres-import-call.sh, the content should be written to the PostgreSQL database.

Access the PostgreSQL database

To check the content of the database, you can use the psql command line tool.

First connect to the running PostgreSQL database instance:

psql -h localhost -p 5432 -U postgres

Then you can execute SQL queries. For example, you can list all tables in the database by running the following command in the terminal:

SELECT table_name
FROM information_schema.tables
WHERE table_schema = 'public';