Ontology Handling
Ontology Base Class
A class that represents the ontological "backbone" of a KG.
The ontology can be built from a single resource or hybridised from a combination of resources: one resource serves as the "head" ontology, and any number of other resources can be attached as "tail" ontologies at arbitrary fusion points inside the head ontology.
Source code in biocypher/_ontology.py
__init__(head_ontology, ontology_mapping=None, tail_ontologies=None)
Initialize the Ontology class.
head_ontology (OntologyAdapter): The head ontology.
tail_ontologies (list): A list of OntologyAdapters that will be
added to the head ontology. Defaults to None.
Source code in biocypher/_ontology.py
_add_properties()
Add properties to the ontology.
For each entity in the mapping, update the ontology with the properties specified in the mapping. Updates synonym information in the graph, setting the synonym as the primary node label.
Source code in biocypher/_ontology.py
_connect_biolink_classes()
Experimental: Adds edges from disjoint classes to the entity node.
Source code in biocypher/_ontology.py
_extend_ontology()
Add the user extensions to the ontology.
Tries to find the parent in the ontology, adds it if necessary, and adds the child and a directed edge from child to parent. Can handle multiple parents.
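The extension step described above can be sketched without any library. This is a minimal illustration, not the BioCypher implementation: the graph is a plain dict mapping each node to the set of its parents, `extend_ontology` is a hypothetical name, and a list of parents is assumed to describe a chain in which each entry is the parent of the previous one.

```python
def extend_ontology(graph, extensions):
    """Add user extensions to a child -> {parents} mapping (illustrative only).

    `extensions` maps a new child label to one parent label or to a list of
    labels. Assumption: a list is read as a chain, child -> first -> second.
    """
    for child, parents in extensions.items():
        chain = [child] + ([parents] if isinstance(parents, str) else list(parents))
        # Make sure every node of the chain exists, adding missing parents.
        for node in chain:
            graph.setdefault(node, set())
        # Add a directed edge from each node to the next one up the chain.
        for lower, upper in zip(chain, chain[1:]):
            graph[lower].add(upper)
    return graph


ontology = {"entity": set(), "protein": {"entity"}}
extend_ontology(
    ontology,
    {"isoform": "protein", "mutant isoform": ["isoform variant", "isoform"]},
)
```

Here "isoform variant" does not exist beforehand, so it is created and linked to its own parent "isoform" before "mutant isoform" is attached to it.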
Source code in biocypher/_ontology.py
_get_current_id()
Instantiate a version ID for the current session.
For now does simple versioning using datetime.
Can later implement incremental versioning, versioning from config file, or manual specification via argument.
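A datetime-based version ID of the kind described above could look like the following sketch. The function name and the exact format string are assumptions made for illustration; the source only states that datetime is used.

```python
from datetime import datetime


def get_current_id(now=None):
    """Return a version string derived from a timestamp (illustrative only).

    The format "vYYYYMMDD-HHMMSS" is an assumption, not a documented contract.
    """
    now = now or datetime.now()
    return now.strftime("v%Y%m%d-%H%M%S")


# A fixed timestamp gives a reproducible ID.
example_id = get_current_id(datetime(2024, 1, 2, 3, 4, 5))
```

Accepting an optional timestamp keeps the sketch testable; incremental or config-driven versioning would replace the body of this function.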
Source code in biocypher/_ontology.py
_get_head_join_node(adapter)
Try to find the head join node of the given ontology adapter.
Find the node in the head ontology that is the head join node. If the join node is not found, the method will raise an error.
adapter (OntologyAdapter): The ontology adapter of which to find the
join node in the head ontology.
str: The head join node in the head ontology.
ValueError: If the head join node is not found in the head ontology.
Source code in biocypher/_ontology.py
_join_ontologies(adapter, head_join_node)
Join the present ontologies.
Join two ontologies by adding the tail ontology as a subgraph to the head ontology at the specified join nodes.
adapter (OntologyAdapter): The ontology adapter of the tail ontology
to be added to the head ontology.
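The join can be pictured with plain dictionaries. The sketch below is an assumption-laden illustration, not the library code: each ontology is a dict mapping a node to the list of its parents, `join_ontologies` and its arguments are hypothetical, and the `merge_nodes` flag mirrors the behaviour described for the adapter (merge the two join nodes under the head label, or attach the tail join node as a child).

```python
def join_ontologies(head, tail, head_join_node, tail_join_node, merge_nodes=True):
    """Attach `tail` to `head` at the join nodes (illustrative only)."""
    if head_join_node not in head:
        raise ValueError(f"Head join node '{head_join_node}' not found.")

    # Copy the head so that the input mapping is left untouched.
    joined = {node: list(parents) for node, parents in head.items()}

    def rename(node):
        # When merging, the tail join node takes the label of the head join node.
        return head_join_node if merge_nodes and node == tail_join_node else node

    for node, parents in tail.items():
        target = joined.setdefault(rename(node), [])
        for parent in map(rename, parents):
            if parent not in target:
                target.append(parent)

    if not merge_nodes:
        # Keep the tail join node and make it a child of the head join node.
        target = joined.setdefault(tail_join_node, [])
        if head_join_node not in target:
            target.append(head_join_node)
    return joined


head = {"entity": [], "disease": ["entity"]}
tail = {"disease or disorder": [], "cancer": ["disease or disorder"]}
merged = join_ontologies(head, tail, "disease", "disease or disorder")
attached = join_ontologies(head, tail, "disease", "disease or disorder", merge_nodes=False)
```

In `merged`, "cancer" hangs directly below "disease"; in `attached`, "disease or disorder" is kept as an intermediate node.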
Source code in biocypher/_ontology.py
_load_ontologies()
For each ontology, load the OntologyAdapter object.
Store it as an instance variable (head) or in an instance dictionary (tail).
Source code in biocypher/_ontology.py
_main()
Instantiate the ontology.
Loads the ontologies, joins them, and returns the hybrid ontology. Loads only the head ontology if nothing else is given. Adds user extensions and properties from the mapping.
Source code in biocypher/_ontology.py
get_ancestors(node_label)
Get the ancestors of a node in the ontology.
node_label (str): The label of the node in the ontology.
list: A list of the ancestors of the node.
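Because edges point from child to parent, collecting ancestors amounts to following outgoing edges until no new node is reached. The sketch below shows this on a plain dict; the function is a stand-in for illustration, and whether the real method includes the start node in its result is not stated here (this version excludes it).

```python
from collections import deque


def get_ancestors(parents_of, node_label):
    """Return all ancestors of `node_label`, nearest first (illustrative only).

    `parents_of` maps each node label to a list of its parent labels.
    """
    ancestors = []
    queue = deque(parents_of[node_label])
    while queue:
        node = queue.popleft()
        if node in ancestors:
            continue  # already reached through another parent
        ancestors.append(node)
        queue.extend(parents_of.get(node, []))
    return ancestors


hierarchy = {
    "entity": [],
    "named thing": ["entity"],
    "polypeptide": ["named thing"],
    "protein": ["polypeptide"],
}
protein_ancestors = get_ancestors(hierarchy, "protein")
```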
Source code in biocypher/_ontology.py
get_dict()
Return a dictionary representation of the ontology.
The dictionary follows the BioCypher node format so that it can be used with the Neo4j driver.
Source code in biocypher/_ontology.py
get_rdf_graph()
Return the merged RDF graph.
Return the merged graph of all loaded ontologies (head and tails).
Source code in biocypher/_ontology.py
show_ontology_structure(to_disk=None, full=False)
Show the ontology structure using treelib or write to GRAPHML file.
to_disk (str): If specified, the ontology structure will be saved
to disk as a GRAPHML file at the location (directory) specified
by the `to_disk` string, to be opened in your favourite graph
visualisation tool.
full (bool): If True, the full ontology structure will be shown,
including all nodes and edges. If False, only the nodes and
edges that are relevant to the extended schema will be shown.
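The tree view can be approximated with the standard library alone. The following sketch does not use treelib and is not the method's implementation; `render_tree` is a hypothetical helper that inverts a child -> parents mapping and prints an indented hierarchy from a chosen root.

```python
def render_tree(parents_of, root):
    """Render a child -> parents mapping as an indented tree (illustrative only)."""
    # Invert the mapping so that each parent knows its children.
    children_of = {}
    for child, parents in parents_of.items():
        for parent in parents:
            children_of.setdefault(parent, []).append(child)

    lines = []

    def walk(node, depth):
        lines.append("    " * depth + node)
        for child in sorted(children_of.get(node, [])):
            walk(child, depth + 1)

    walk(root, 0)
    return "\n".join(lines)


structure = render_tree(
    {"entity": [], "named thing": ["entity"], "protein": ["named thing"], "gene": ["named thing"]},
    "entity",
)
```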
Source code in biocypher/_ontology.py
Ontology Adapter
Class that represents an ontology to be used in the BioCypher framework.
Can read from a variety of formats, including OWL, OBO, and RDF/XML. The ontology is represented by a networkx.DiGraph object; an RDFlib graph is also kept. By default, the DiGraph reverses the label and identifier of the nodes, such that the node name in the graph is the human-readable label. The edges are oriented from child to parent. Labels are formatted in lower sentence case and underscores are replaced by spaces. Identifiers are taken as defined and the prefixes are removed by default.
Source code in biocypher/_ontology.py
__init__(ontology_file, root_label, ontology_file_format=None, head_join_node_label=None, merge_nodes=True, switch_label_and_id=True, remove_prefixes=True)
Initialize the OntologyAdapter class.
ontology_file (str): Path to the ontology file. Can be local or
remote.
root_label (str): The label of the root node in the ontology. In
case of a tail ontology, this is the tail join node.
ontology_file_format (str): The format of the ontology file (e.g. "application/rdf+xml").
If the format is not passed, it is determined automatically.
head_join_node_label (str): Optional variable to store the label of the
node in the head ontology that should be used to join to the
root node of the tail ontology. Defaults to None.
merge_nodes (bool): If True, head and tail join nodes will be
merged, using the label of the head join node. If False, the
tail join node will be attached as a child of the head join
node.
switch_label_and_id (bool): If True, the node names in the graph will be
the human-readable labels. If False, the node names will be the
identifiers. Defaults to True.
remove_prefixes (bool): If True, the prefixes of the identifiers will
be removed. Defaults to True.
Source code in biocypher/_ontology.py
_add_labels_to_nodes(nx_graph, switch_label_and_id)
Add labels to the nodes in the networkx graph.
nx_graph (nx.DiGraph): The networkx graph
switch_label_and_id (bool): If True, id and label are switched
nx.DiGraph: The networkx graph with labels
Source code in biocypher/_ontology.py
_change_nodes_to_biocypher_format(nx_graph, switch_label_and_id, rename_nodes=True)
Change the nodes in the networkx graph to BioCypher format.
This involves
- removing the prefix of the identifier
- switching the id and label if requested
- adapting the labels (replace _ with space and convert to lower sentence case)
Args:
nx_graph (nx.DiGraph): The networkx graph
switch_label_and_id (bool): If True, id and label are switched
rename_nodes (bool): If True, the nodes are renamed
nx.DiGraph: The networkx ontology graph in BioCypher format
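The label adaptation step (underscores to spaces, lower sentence case) can be illustrated as follows. The helper name and the rule for splitting PascalCase words are assumptions for this sketch; the library may treat acronyms and other edge cases differently.

```python
import re


def to_lower_sentence_case(label):
    """Convert a class label to lower sentence case (illustrative only)."""
    # Replace underscores by spaces.
    label = label.replace("_", " ")
    # Insert a space wherever a lowercase letter or digit meets an uppercase letter.
    label = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", label)
    # Lowercase everything and collapse repeated whitespace.
    return " ".join(label.lower().split())


examples = [
    to_lower_sentence_case("SmallMolecule"),
    to_lower_sentence_case("gene_to_disease_association"),
]
```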
Source code in biocypher/_ontology.py
_convert_to_nx(one_to_one, one_to_many)
Convert the one to one and one to many inheritance graphs to networkx.
one_to_one (rdflib.Graph): The one to one inheritance graph
one_to_many (dict): The one to many inheritance dictionary
nx.DiGraph: The networkx graph
Source code in biocypher/_ontology.py
_get_all_ancestors(renamed, root_label, switch_label_and_id, rename_nodes=True)
Get all ancestors of the root node in the networkx graph.
renamed (nx.DiGraph): The renamed networkx graph
root_label (str): The label of the root node in the ontology
switch_label_and_id (bool): If True, id and label are switched
rename_nodes (bool): If True, the nodes are renamed
nx.DiGraph: The filtered networkx graph
Source code in biocypher/_ontology.py
_get_format(ontology_file)
Get the format of the ontology file.
Source code in biocypher/_ontology.py
_get_multiple_inheritance_dict(g)
Get the multiple inheritance dictionary from the RDF graph.
g (rdflib.Graph): The RDF graph
dict: The multiple inheritance dictionary
Source code in biocypher/_ontology.py
_get_nx_id_and_label(node, switch_id_and_label, rename_nodes=True)
Rename node id and label for nx graph.
node (str): The node to rename
switch_id_and_label (bool): If True, switch id and label
tuple[str, str]: The renamed node id and label
Source code in biocypher/_ontology.py
_get_one_to_one_inheritance_triples(g)
Get the one to one inheritance triples from the RDF graph.
g (rdflib.Graph): The RDF graph
rdflib.Graph: The one to one inheritance graph
Source code in biocypher/_ontology.py
_load_rdf_graph(ontology_file)
Load the ontology into an RDFlib graph.
The ontology file can be in OWL, OBO, or RDF/XML format.
ontology_file (str): The path to the ontology file
rdflib.Graph: The RDFlib graph
Source code in biocypher/_ontology.py
_remove_prefix(uri)
Remove the prefix of a URI.
URIs can contain either "#" or "/" as a separator between the prefix and the local name. The prefix is everything before the last separator.
uri (str): The URI to remove the prefix from
str: The URI without the prefix
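Prefix removal as described above reduces to cutting at the last "#" or "/". A minimal sketch on plain strings (the real method operates inside the adapter; this standalone function is only an illustration):

```python
def remove_prefix(uri):
    """Return the local name of a URI (illustrative only).

    The prefix is everything up to and including the last "#" or "/".
    """
    cut = max(uri.rfind("#"), uri.rfind("/"))
    # rfind returns -1 when no separator exists, so the whole string is kept.
    return uri[cut + 1:]


local_names = [
    remove_prefix("http://purl.obolibrary.org/obo/SO_0000704"),
    remove_prefix("http://www.w3.org/2002/07/owl#Thing"),
]
```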
Source code in biocypher/_ontology.py
_retrieve_rdf_linked_list(subject)
Recursively retrieve a linked list from RDF.
Example RDF list with the items [item1, item2]:
list_node - first -> item1
list_node - rest -> list_node2
list_node2 - first -> item2
list_node2 - rest -> nil
subject (rdflib.URIRef): One list_node of the RDF list
list: The items of the RDF list
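The recursion over an RDF list can be shown without rdflib. In this sketch the triples are stored in a dict keyed by (subject, predicate), and the strings "first", "rest" and "nil" stand in for the corresponding RDF terms; all names are illustrative assumptions.

```python
NIL = "nil"


def retrieve_linked_list(triples, subject):
    """Collect the items of an RDF-style linked list recursively (illustrative only)."""
    if subject == NIL:
        return []  # end of the list
    first = triples[(subject, "first")]
    rest = triples[(subject, "rest")]
    return [first] + retrieve_linked_list(triples, rest)


rdf_list = {
    ("list_node", "first"): "item1",
    ("list_node", "rest"): "list_node2",
    ("list_node2", "first"): "item2",
    ("list_node2", "rest"): NIL,
}
items = retrieve_linked_list(rdf_list, "list_node")
```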
Source code in biocypher/_ontology.py
get_ancestors(node_label)
get_head_join_node()
get_nx_graph()
get_rdf_graph()
get_root_node()
Get root node in the ontology.
Returns
root_node: If _switch_label_and_id is True, the root node label is
returned, otherwise the root node id is returned.
Source code in biocypher/_ontology.py
has_label(node, g)
Check if the node has a label in the graph.
node (rdflib.URIRef): The node to check
g (rdflib.Graph): The graph to check in
Returns: bool: True if the node has a label, False otherwise
Source code in biocypher/_ontology.py
Mapping of data inputs to KG ontology
Class to store the ontology mapping and extensions.
Source code in biocypher/_mapping.py
_extend_schema(d=None)
Get leaves of the tree hierarchy from the data structure dict contained in the schema_config.yaml. Creates virtual leaves (as children) from entries that provide more than one preferred id type (and corresponding inputs).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
d | Optional[dict] | Data structure dict from yaml file. | None |
Source code in biocypher/_mapping.py
_horizontal_inheritance_pid(key, value)
Create virtual leaves for multiple preferred id types or sources.
If we create virtual leaves, input_label/label_in_input always has to be a list.
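To make the idea of virtual leaves concrete, the sketch below splits one schema entry with several preferred ids into one child entry per id. It is an illustration under stated assumptions, not the library code: the keys `preferred_id`, `input_label` and `is_a` and the "<id>.<key>" naming of the leaves are assumed here, and the function name is hypothetical.

```python
def horizontal_inheritance_pid(key, value):
    """Create one virtual leaf per preferred id of a schema entry (illustrative only)."""
    pids = value["preferred_id"]
    labels = value["input_label"]
    # With virtual leaves, the input labels have to be a list matching the ids.
    if not isinstance(labels, list) or len(labels) != len(pids):
        raise ValueError("input_label must be a list with one entry per preferred id")

    leaves = {}
    for pid, label in zip(pids, labels):
        leaf = dict(value)  # shallow copy, the parent entry is left unchanged
        leaf["preferred_id"] = pid
        leaf["input_label"] = label
        leaf["is_a"] = key  # the leaf becomes a child of the original entry
        leaves[f"{pid}.{key}"] = leaf
    return leaves


protein = {
    "represented_as": "node",
    "preferred_id": ["uniprot", "entrez"],
    "input_label": ["up_protein", "ez_protein"],
}
virtual_leaves = horizontal_inheritance_pid("protein", protein)
```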
Source code in biocypher/_mapping.py
_horizontal_inheritance_source(key, value)
Create virtual leaves for multiple sources.
If we create virtual leaves, input_label/label_in_input always has to be a list.
Source code in biocypher/_mapping.py
_read_config(config_file=None)
Read the configuration file and store the ontology mapping and extensions.
Source code in biocypher/_mapping.py
_vertical_property_inheritance(d)
Inherit properties from parents to children and update d accordingly.
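Property inheritance from parent to child can be sketched as a dictionary merge. The keys `is_a` and `properties` and the rule that a child's own properties take precedence are assumptions for this illustration; the sketch also only looks at the direct parent, and any opt-in flag the library may require is omitted.

```python
def vertical_property_inheritance(d):
    """Copy parent properties into children and update `d` in place (illustrative only)."""
    for value in d.values():
        parent = value.get("is_a")
        if isinstance(parent, list):
            # Assumption: the first entry of a list is the direct parent.
            parent = parent[0] if parent else None
        if parent is None or parent not in d:
            continue
        # Start from the parent's properties, then let the child override them.
        inherited = dict(d[parent].get("properties", {}))
        inherited.update(value.get("properties", {}))
        if inherited:
            value["properties"] = inherited
    return d


schema = {
    "protein": {"properties": {"name": "str", "mass": "int"}},
    "isoform": {"is_a": "protein", "properties": {"mass": "float"}},
}
vertical_property_inheritance(schema)
```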