PostgreSQL

When setting the dbms parameter in the biocypher_config.yaml to postgres, the BioCypher Knowledge Graph is written to a PostgreSQL database. PostgreSQL is a relational database management system.

Install PostgreSQL

To get a PostgreSQL instance running quickly, you can use Docker. The following command will start a PostgreSQL instance with the password postgres and the port 5432 exposed to the host system.

docker run --restart always --publish=5432:5432 --env POSTGRES_PASSWORD=postgres -d postgres

The postgresql-client (also known as psql command line tool) can be used to connect and interact with the running PostgreSQL database instance. Installation instructions can be found here.

PostgreSQL settings

To overwrite the standard settings of PostgreSQL, add a postgresql section to the biocypher_config.yaml file. The following settings are possible:

biocypher_config.yaml
postgresql:  ### PostgreSQL configuration ###

  # PostgreSQL connection credentials
  database_name: postgres
  user: postgres
  password: postgres
  port: 5432

  # PostgreSQL import batch writer settings
  quote_character: '"'
  delimiter: '\t'

  # Import call prefixes to adjust the autogenerated shell script
  import_call_bin_prefix: ''  # path to "psql"
  import_call_file_prefix: /path/to/files/

Offline mode

Running BioCypher

After running BioCypher with the offline parameter set to true and the dbms set to postgres, the output folder contains:

  • entity-create_table.sql: The SQL scripts to create the tables for the nodes/edges. Entity is replaced by your nodes and edges and for each node and edge type an own SQL script is generated.

  • entity-part000.csv: The CSV file containing the data for the entity.

  • postgres-import-call.sh: The import script to create a database with the SQL scripts and insert the data from the CSV files.

Note

If the postgres-import-call.sh is missing, you can create it by running bc.write_import_call().

Create the PostgreSQL database

A running PostgreSQL instance (e.g. in a Docker container created with the command from above) is required to create the database and import the data. By running the import script postgres-import-call.sh, the content should be written to the PostgreSQL database.

Access the PostgreSQL database

To check the content of the database, you can use the psql command line tool.

First connect to the running PostgreSQL database instance:

psql -h localhost -p 5432 -U postgres

Then you can execute SQL queries. For example, you can list all tables in the database by running the following command in the terminal:

SELECT table_name
FROM information_schema.tables
WHERE table_schema = 'public';