PostgreSQL
When setting the dbms
parameter in the biocypher_config.yaml
to postgres
, the BioCypher Knowledge Graph is written to a PostgreSQL database.
PostgreSQL is a relational database management system.
Install PostgreSQL
To get a PostgreSQL instance running quickly, you can use Docker. The following command will start a PostgreSQL instance with the password postgres
and the port 5432
exposed to the host system.
docker run --restart always --publish=5432:5432 --env POSTGRES_PASSWORD=postgres -d postgres
The postgresql-client
(also known as psql
command line tool) can be used to connect and interact with the running PostgreSQL database instance. Installation instructions can be found here.
PostgreSQL settings
To overwrite the standard settings of PostgreSQL, add a postgresql
section to the biocypher_config.yaml
file.
The following settings are possible:
postgresql: ### PostgreSQL configuration ###
# PostgreSQL connection credentials
database_name: postgres
user: postgres
password: postgres
port: 5432
# PostgreSQL import batch writer settings
quote_character: '"'
delimiter: '\t'
# Import call prefixes to adjust the autogenerated shell script
import_call_bin_prefix: '' # path to "psql"
import_call_file_prefix: /path/to/files/
Offline mode
Running BioCypher
After running BioCypher with the offline
parameter set to true
and the dbms
set to postgres
,
the output folder contains:
entity-create_table.sql
: The SQL scripts to create the tables for the nodes/edges. Entity is replaced by your nodes and edges and for each node and edge type an own SQL script is generated.entity-part000.csv
: The CSV file containing the data for the entity.postgres-import-call.sh
: The import script to create a database with the SQL scripts and insert the data from the CSV files.
Note
If the postgres-import-call.sh
is missing, you can create it by running bc.write_import_call()
.
Create the PostgreSQL database
A running PostgreSQL instance (e.g. in a Docker container created with the command from above) is required to create the database and import the data.
By running the import script postgres-import-call.sh
, the content should be written to the PostgreSQL database.
Access the PostgreSQL database
To check the content of the database, you can use the psql
command line tool.
First connect to the running PostgreSQL database instance:
psql -h localhost -p 5432 -U postgres
Then you can execute SQL queries. For example, you can list all tables in the database by running the following command in the terminal:
SELECT table_name
FROM information_schema.tables
WHERE table_schema = 'public';