Graph Class
The Graph class provides a unified graph representation supporting various graph types including simple graphs, directed graphs, weighted graphs, multigraphs, and hypergraphs. The design focuses on simplicity and extensibility for knowledge representation.
Overview
Constructor
Graph.__init__()
Parameters:
- name (str): Name of the graph (default: "graph")
- directed (bool): Whether the graph is directed (default: True)
Data Structures
The Graph class uses the following core data structures:
- _nodes: Dictionary mapping node IDs to Node objects
- _edges: Dictionary mapping edge IDs to Edge objects
- _hyperedges: Dictionary mapping hyperedge IDs to HyperEdge objects
- _node_types: Index of nodes by type
- _edge_types: Index of edges by type
- _hyperedge_types: Index of hyperedges by type
- _outgoing: Adjacency index for outgoing edges
- _incoming: Adjacency index for incoming edges
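One way to picture how these structures fit together is the sketch below. It is not the library's implementation: the `MiniGraph` class, the plain-dict node and edge records, and the `defaultdict(set)` index layout are all assumptions made for illustration. Hyperedge structures are left out for brevity.

```python
from collections import defaultdict


class MiniGraph:
    """Hypothetical stand-in that mirrors the index names listed above."""

    def __init__(self):
        self._nodes = {}                      # node_id -> node record
        self._edges = {}                      # edge_id -> edge record
        self._node_types = defaultdict(set)   # node_type -> {node_id, ...}
        self._edge_types = defaultdict(set)   # edge_type -> {edge_id, ...}
        self._outgoing = defaultdict(set)     # node_id -> IDs of edges leaving it
        self._incoming = defaultdict(set)     # node_id -> IDs of edges entering it

    def add_node(self, node_id, node_type):
        self._nodes[node_id] = {"id": node_id, "type": node_type}
        self._node_types[node_type].add(node_id)

    def add_edge(self, edge_id, edge_type, source, target):
        self._edges[edge_id] = {
            "id": edge_id, "type": edge_type, "source": source, "target": target,
        }
        # All indexes are updated together so that lookups stay consistent.
        self._edge_types[edge_type].add(edge_id)
        self._outgoing[source].add(edge_id)
        self._incoming[target].add(edge_id)


g = MiniGraph()
g.add_node("TP53", "protein")
g.add_node("BRAF", "protein")
g.add_edge("interaction_1", "interaction", "TP53", "BRAF")
```

With this layout, a type-filtered query or a neighbor lookup reads one index entry instead of scanning every node or edge.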
Node Operations
add_node()
Add a node to the graph.
Parameters:
- node_id (str): Unique identifier for the node
- node_type (str): Type/category of the node
- properties (dict, optional): Node properties dictionary
Returns:
- bool: True if node was added, False if it already exists
Example:
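A minimal sketch of the return-value contract described above. The `add_node` function here is a hypothetical free function over a plain dict, not the library's method, and leaving the existing record unchanged on a duplicate ID is an assumption.

```python
def add_node(nodes, node_id, node_type, properties=None):
    """Hypothetical helper: insert a node record unless the ID is taken."""
    if node_id in nodes:
        return False  # assumption: the existing record is left untouched
    nodes[node_id] = {"type": node_type, "properties": dict(properties or {})}
    return True


nodes = {}
first = add_node(nodes, "TP53", "protein", {"function": "tumor_suppressor"})
second = add_node(nodes, "TP53", "protein", {"function": "changed"})
```

Here `first` is True and `second` is False, so callers can use the return value to tell a new node from a repeated ID.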
get_node()
Get a node by ID.
Parameters:
- node_id (str): Node identifier
Returns:
- Node | None: Node object or None if not found
has_node()
Check if a node exists.
Parameters:
- node_id (str): Node identifier
Returns:
- bool: True if node exists
remove_node()
Remove a node from the graph.
Parameters:
- node_id (str): Node identifier
Returns:
- bool: True if node was removed, False if not found
get_nodes()
Get nodes, optionally filtered by type.
Parameters:
- node_type (str, optional): Filter by node type
Returns:
- list[Node]: List of nodes
get_node_ids()
Get all node IDs, optionally filtered by type.
Parameters:
- node_type (str, optional): Filter by node type
Returns:
- set[str]: Set of node IDs
Edge Operations
add_edge()
def add_edge(
    self,
    edge_id: str,
    edge_type: str,
    source: str,
    target: str,
    properties: dict[str, Any] | None = None
) -> bool
Add an edge to the graph.
Parameters:
- edge_id (str): Unique identifier for the edge
- edge_type (str): Type/category of the edge
- source (str): Source node ID
- target (str): Target node ID
- properties (dict, optional): Edge properties dictionary
Returns:
- bool: True if edge was added, False if it already exists
Raises:
- ValueError: If source or target node does not exist
Example:
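A minimal sketch of the documented behavior: False for a repeated edge ID, ValueError for a missing endpoint. The free function `add_edge` and the dict-based storage are hypothetical, and checking the ID before the endpoints is an assumption; the source does not state the order.

```python
def add_edge(nodes, edges, edge_id, edge_type, source, target, properties=None):
    """Hypothetical helper following the documented add_edge() contract."""
    if edge_id in edges:
        return False  # duplicate ID: nothing is added
    for endpoint in (source, target):
        if endpoint not in nodes:
            raise ValueError(f"Node {endpoint!r} does not exist")
    edges[edge_id] = {
        "type": edge_type,
        "source": source,
        "target": target,
        "properties": dict(properties or {}),
    }
    return True


nodes = {"TP53": {}, "BRAF": {}}
edges = {}

added = add_edge(nodes, edges, "interaction_1", "interaction", "TP53", "BRAF",
                 {"confidence": 0.8})
duplicate = add_edge(nodes, edges, "interaction_1", "interaction", "TP53", "BRAF")

# "MDM2" was never added as a node, so this call is rejected.
try:
    add_edge(nodes, edges, "interaction_2", "interaction", "TP53", "MDM2")
    rejected = False
except ValueError:
    rejected = True
```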
get_edge()
Get an edge by ID.
Parameters:
- edge_id (str): Edge identifier
Returns:
- Edge | None: Edge object or None if not found
has_edge()
Check if an edge exists.
Parameters:
- edge_id (str): Edge identifier
Returns:
- bool: True if edge exists
remove_edge()
Remove an edge from the graph.
Parameters:
- edge_id (str): Edge identifier
Returns:
- bool: True if edge was removed, False if not found
get_edges()
Get edges, optionally filtered by type.
Parameters:
- edge_type (str, optional): Filter by edge type
Returns:
- list[Edge]: List of edges
get_edges_between()
Get edges between two specific nodes.
Parameters:
- source (str): Source node ID
- target (str): Target node ID
- edge_type (str, optional): Filter by edge type
Returns:
- list[Edge]: List of edges between the nodes
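The lookup can be sketched with an outgoing-edge index, as below. This is an illustrative assumption rather than the library's code: the data, the free function, and the choice to return edge IDs instead of Edge objects are all hypothetical, and the sketch treats every edge as directed.

```python
from collections import defaultdict

# Hypothetical edge records keyed by edge ID.
edges = {
    "e1": {"type": "interaction", "source": "TP53", "target": "BRAF"},
    "e2": {"type": "regulation", "source": "TP53", "target": "BRAF"},
    "e3": {"type": "interaction", "source": "TP53", "target": "MDM2"},
}

# Build the adjacency index: source node -> IDs of edges leaving it.
outgoing = defaultdict(set)
for edge_id, edge in edges.items():
    outgoing[edge["source"]].add(edge_id)


def get_edges_between(source, target, edge_type=None):
    """Scan only the edges that leave `source`, then filter by target and type."""
    return sorted(
        edge_id
        for edge_id in outgoing.get(source, ())
        if edges[edge_id]["target"] == target
        and (edge_type is None or edges[edge_id]["type"] == edge_type)
    )


all_between = get_edges_between("TP53", "BRAF")
only_regulation = get_edges_between("TP53", "BRAF", edge_type="regulation")
reverse = get_edges_between("BRAF", "TP53")
```

Two edges with different IDs connect the same pair of nodes here, which is the multigraph case the method's list return type accommodates.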
HyperEdge Operations
add_hyperedge()
def add_hyperedge(
    self,
    hyperedge_id: str,
    hyperedge_type: str,
    nodes: set[str],
    properties: dict[str, Any] | None = None
) -> bool
Add a hyperedge connecting multiple nodes.
Parameters:
- hyperedge_id (str): Unique identifier for the hyperedge
- hyperedge_type (str): Type/category of the hyperedge
- nodes (set[str]): Set of node IDs to connect
- properties (dict, optional): Hyperedge properties dictionary
Returns:
- bool: True if hyperedge was added, False if it already exists
Raises:
- ValueError: If any node in the set does not exist
Example:
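A minimal sketch of the membership check described above. The free function, the dict-based storage, the `frozenset` member record, and the error message wording are hypothetical; only the False and ValueError outcomes come from the description.

```python
def add_hyperedge(nodes, hyperedges, hyperedge_id, hyperedge_type, members,
                  properties=None):
    """Hypothetical helper following the documented add_hyperedge() contract."""
    if hyperedge_id in hyperedges:
        return False
    # Every member must already be a node; collect all offenders at once.
    missing = set(members).difference(nodes)
    if missing:
        raise ValueError(f"Unknown node IDs: {sorted(missing)}")
    hyperedges[hyperedge_id] = {
        "type": hyperedge_type,
        "nodes": frozenset(members),
        "properties": dict(properties or {}),
    }
    return True


nodes = {"TP53": {}, "BRAF": {}, "MDM2": {}}
hyperedges = {}

added = add_hyperedge(nodes, hyperedges, "complex_1", "complex",
                      {"TP53", "BRAF", "MDM2"}, {"function": "cell_cycle_control"})

# "EGFR" is not a known node, so the whole hyperedge is rejected.
try:
    add_hyperedge(nodes, hyperedges, "complex_2", "complex", {"TP53", "EGFR"})
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```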
get_hyperedge()
Get a hyperedge by ID.
Parameters:
- hyperedge_id (str): Hyperedge identifier
Returns:
- HyperEdge | None: Hyperedge object or None if not found
has_hyperedge()
Check if a hyperedge exists.
Parameters:
- hyperedge_id (str): Hyperedge identifier
Returns:
- bool: True if hyperedge exists
remove_hyperedge()
Remove a hyperedge from the graph.
Parameters:
- hyperedge_id (str): Hyperedge identifier
Returns:
- bool: True if hyperedge was removed, False if not found
get_hyperedges()
Get hyperedges, optionally filtered by type.
Parameters:
- hyperedge_type (str, optional): Filter by hyperedge type
Returns:
- list[HyperEdge]: List of hyperedges
Graph Traversal
get_neighbors()
Get neighbors of a node.
Parameters:
- node_id (str): Node identifier
- direction (str): "in", "out", or "both" (default: "both")
Returns:
- set[str]: Set of neighbor node IDs
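The effect of the direction parameter can be sketched over a plain edge list. The data and the free function are hypothetical, and raising ValueError for an unrecognized direction is an assumption; the source does not say how invalid values are handled.

```python
# Hypothetical directed edges as (source, target) pairs.
edges = [("TP53", "BRAF"), ("MDM2", "TP53"), ("BRAF", "MAPK1")]


def get_neighbors(node_id, direction="both"):
    """Collect neighbor IDs along outgoing edges, incoming edges, or both."""
    if direction not in {"in", "out", "both"}:
        raise ValueError(f"Unknown direction: {direction!r}")  # assumption
    neighbors = set()
    for source, target in edges:
        if direction in ("out", "both") and source == node_id:
            neighbors.add(target)   # edge leaves node_id
        if direction in ("in", "both") and target == node_id:
            neighbors.add(source)   # edge enters node_id
    return neighbors


out_neighbors = get_neighbors("TP53", "out")
in_neighbors = get_neighbors("TP53", "in")
all_neighbors = get_neighbors("TP53")
```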
get_connected_edges()
Get all edges connected to a node.
Parameters:
- node_id (str): Node identifier
Returns:
- list[Edge]: List of connected edges
find_paths()
Find paths between two nodes using breadth-first search.
Parameters:
- source (str): Source node ID
- target (str): Target node ID
- max_length (int, optional): Maximum path length
Returns:
- list[list[str]]: List of paths (each path is a list of node IDs)
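A breadth-first path search of this kind can be sketched as follows. This is a generic illustration, not the library's code: the adjacency dict is hypothetical, paths are restricted to simple paths (no repeated nodes), and `max_length` is assumed to count edges rather than nodes.

```python
from collections import deque

# Hypothetical directed adjacency: node -> list of successor nodes.
adjacency = {
    "TP53": ["BRAF", "MDM2"],
    "MDM2": ["BRAF"],
    "BRAF": [],
}


def find_paths(source, target, max_length=None):
    """Breadth-first enumeration of simple paths, shortest first."""
    paths = []
    queue = deque([[source]])
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            paths.append(path)
            continue
        # Assumption: max_length counts edges, i.e. len(path) - 1.
        if max_length is not None and len(path) - 1 >= max_length:
            continue
        for neighbor in adjacency.get(node, []):
            if neighbor not in path:  # never revisit a node on the same path
                queue.append(path + [neighbor])
    return paths


all_paths = find_paths("TP53", "BRAF")
direct_only = find_paths("TP53", "BRAF", max_length=1)
```

Because the queue is processed in first-in, first-out order, shorter paths are always returned before longer ones. Setting `max_length` also bounds the work on dense graphs, where the number of simple paths can grow very quickly.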
find_connected_components()
def find_connected_components(self, start_node: str, max_depth: int | None = None) -> dict[str, Any]
Find connected components starting from a node.
Parameters:
- start_node (str): Starting node ID
- max_depth (int, optional): Maximum depth to search
Returns:
- dict[str, Any]: Connected components data
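The role of max_depth can be illustrated with a depth-limited breadth-first search. The function `reachable_within` is deliberately given a different name because it is not the library's method; the adjacency data and every key in the returned dict are invented for this sketch, since the source does not specify the result layout.

```python
from collections import deque

# Hypothetical undirected adjacency; "E" is isolated.
adjacency = {
    "A": {"B"},
    "B": {"A", "C"},
    "C": {"B", "D"},
    "D": {"C"},
    "E": set(),
}


def reachable_within(start_node, max_depth=None):
    """Collect the nodes reachable from start_node within max_depth hops."""
    depths = {start_node: 0}
    queue = deque([start_node])
    while queue:
        node = queue.popleft()
        if max_depth is not None and depths[node] >= max_depth:
            continue  # do not expand beyond the depth limit
        for neighbor in adjacency.get(node, ()):
            if neighbor not in depths:
                depths[neighbor] = depths[node] + 1
                queue.append(neighbor)
    # Hypothetical result layout.
    return {"start": start_node, "nodes": set(depths), "depths": depths}


full = reachable_within("A")
limited = reachable_within("A", max_depth=1)
```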
Analysis and Statistics
get_statistics()
Get comprehensive graph statistics.
Returns:
- dict[str, Any]: Statistics dictionary with the following keys:
  - basic: Basic counts (nodes, edges, hyperedges, types)
  - connectivity: Connectivity metrics (density, clustering, etc.)
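As an illustration of one connectivity metric, the sketch below computes the standard density of a simple graph: the number of edges divided by the number of possible edges. Whether the library uses exactly this formula is an assumption, and the `density` function and the "density" key name are hypothetical.

```python
def density(num_nodes, num_edges, directed=True):
    """Standard simple-graph density: edges present / edges possible."""
    if num_nodes < 2:
        return 0.0  # no edge is possible between fewer than two nodes
    possible = num_nodes * (num_nodes - 1)   # ordered pairs of distinct nodes
    if not directed:
        possible //= 2                       # unordered pairs
    return num_edges / possible


# Hypothetical statistics dictionary in the documented two-part shape.
stats = {
    "basic": {"nodes": 3, "edges": 3, "hyperedges": 0},
    "connectivity": {"density": density(3, 3, directed=True)},
}
```

For a multigraph this ratio can exceed 1, because parallel edges are counted individually.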
get_summary()
Get a human-readable summary of the graph.
Returns:
- dict[str, Any]: Summary dictionary
Serialization
to_dict()
Convert the graph to a dictionary representation.
Returns:
- dict[str, Any]: Dictionary representation
from_dict()
Create a graph from a dictionary representation.
Parameters:
- data (dict): Dictionary representation
Returns:
- Graph: New graph instance
to_json()
Export the graph to JSON format.
Returns:
- str: JSON string representation
from_json()
Create a graph from a JSON string.
Parameters:
- json_str (str): JSON string representation
Returns:
- Graph: New graph instance
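A round trip through a dictionary and JSON can be sketched with the standard `json` module. The dictionary layout below (the "nodes", "edges", and "hyperedges" keys and their record fields) is an assumption for illustration; the source does not document the actual schema.

```python
import json

# Hypothetical dictionary representation of a small graph.
graph_dict = {
    "name": "protein_network",
    "directed": True,
    "nodes": [
        {"id": "TP53", "type": "protein", "properties": {"function": "tumor_suppressor"}},
        {"id": "BRAF", "type": "protein", "properties": {"function": "kinase"}},
    ],
    "edges": [
        {"id": "interaction_1", "type": "interaction",
         "source": "TP53", "target": "BRAF", "properties": {"confidence": 0.8}},
    ],
    "hyperedges": [
        # JSON has no set type, so member sets are stored as sorted lists.
        {"id": "complex_1", "type": "complex",
         "nodes": sorted({"TP53", "BRAF"}), "properties": {}},
    ],
}

json_str = json.dumps(graph_dict, indent=2)
restored = json.loads(json_str)

# Convert the member list back to a set after loading.
members = set(restored["hyperedges"][0]["nodes"])
```

The main design point is the set conversion: `json.dumps` raises TypeError on a Python set, so any serialization of hyperedge members has to pass through a list or a similar JSON-compatible type.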
Utility Methods
clear()
Clear all nodes, edges, and hyperedges from the graph.
copy()
Create a deep copy of the graph.
Returns:
- Graph: New graph instance
__len__()
Return the number of nodes in the graph.
Returns:
- int: Number of nodes
__contains__()
Check if a node exists in the graph.
Parameters:
- node_id (str): Node identifier
Returns:
- bool: True if node exists
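These two special methods are what make `len(graph)` and `node_id in graph` work. The sketch below shows the mechanism on a hypothetical `NodeContainer` class; it is not the library's Graph.

```python
class NodeContainer:
    """Hypothetical stand-in showing the two special methods."""

    def __init__(self):
        self._nodes = {}

    def add_node(self, node_id):
        self._nodes[node_id] = {}

    def __len__(self):
        # Called by len(obj).
        return len(self._nodes)

    def __contains__(self, node_id):
        # Called by the `in` operator.
        return node_id in self._nodes


container = NodeContainer()
container.add_node("TP53")
container.add_node("BRAF")

count = len(container)
has_tp53 = "TP53" in container
has_mdm2 = "MDM2" in container
```

Python falls back to `__len__` for truth testing, so an object defined this way is falsy while it is empty unless the class also defines `__bool__`. Writing `if graph is not None` is therefore safer than `if graph` when checking that a graph object exists.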
Properties
- name (str): Name of the graph
- directed (bool): Whether the graph is directed
- _stats (dict): Internal statistics tracking
Built-in Deduplication
The Graph class has built-in deduplication that prevents:
- Duplicate nodes with the same ID
- Duplicate edges with the same ID
- Duplicate hyperedges with the same ID
ID uniqueness is a core invariant of the Graph class and cannot be disabled.
Examples
Basic Usage
from biocypher import Graph
# Create graph
graph = Graph("protein_network", directed=True)
# Add nodes
graph.add_node("TP53", "protein", {"name": "TP53", "function": "tumor_suppressor"})
graph.add_node("BRAF", "protein", {"name": "BRAF", "function": "kinase"})
# Add edge
graph.add_edge("interaction_1", "interaction", "TP53", "BRAF", {"confidence": 0.8})
# Query
proteins = graph.get_nodes("protein")
print(f"Found {len(proteins)} proteins")
# Traversal
neighbors = graph.get_neighbors("TP53")
print(f"TP53 has {len(neighbors)} neighbors")
HyperEdge Usage
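# Add the remaining member node first; add_hyperedge() raises ValueError for unknown nodes
graph.add_node("MDM2", "protein", {"name": "MDM2"})
# Add hyperedge (complex)
graph.add_hyperedge(
    "complex_1",
    "complex",
    {"TP53", "BRAF", "MDM2"},
    {"function": "cell_cycle_control"}
)
# Get complex
complex_edges = graph.get_hyperedges("complex")
print(f"Found {len(complex_edges)} complexes")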
Path Finding
# Find paths between nodes
paths = graph.find_paths("TP53", "BRAF", max_length=3)
print(f"Found {len(paths)} paths between TP53 and BRAF")
for path in paths:
    print(f"Path: {' -> '.join(path)}")
Statistics
# Get comprehensive statistics
stats = graph.get_statistics()
print(f"Nodes: {stats['basic']['nodes']}")
print(f"Edges: {stats['basic']['edges']}")
print(f"Hyperedges: {stats['basic']['hyperedges']}")
# Get summary
summary = graph.get_summary()
print(f"Graph: {summary['name']}")
print(f"Top node types: {summary['top_node_types']}")