Traversing is a high-performance way to search your database. The traversal API used here is essentially the same as the one used in the Java API, with a few modifications.
Traversals start at a given node and uses a set of rules to move through the graph and to decide what parts of the graph to return.
You may also want to look at Chapter 15, Cypher Query Language, which is a high level query language for your graph.
The most basic traversals simply follow certain relationship types, and return everything they encounter. By default, each node is visited only once, so there is no risk of infinite loops.
traverser = db.traversal()\ .relationships('related_to')\ .traverse(start_node) # The graph is traversed as # you loop through the result. for node in traverser.nodes: pass
You can tell the traverser to only follow relationships in some specific direction.
from neo4j import OUTGOING, INCOMING, ANY traverser = db.traversal()\ .relationships('related_to', OUTGOING)\ .traverse(start_node)
A traversal can give you one of three different result types: nodes, relationships or paths.
Traversals are performed lazily, which means that the graph is traversed as you loop through the result.
traverser = db.traversal()\ .relationships('related_to')\ .traverse(start_node) # Get each possible path for path in traverser: pass # Get each node for node in traverser.nodes: pass # Get each relationship for relationship in traverser.relationships: pass
To avoid infinite loops, it’s important to define what parts of the graph can be re-visited during a traversal.
By default, uniqueness is set to NODE_GLOBAL
, which means that each node is only visited once.
Here are the other options that are available.
from neo4j import Uniqueness # Available options are: Uniqueness.NONE # Any position in the graph may be revisited. Uniqueness.NODE_GLOBAL # Default option # No node in the entire graph may be visited # more than once. This could potentially # consume a lot of memory since it requires # keeping an in-memory data structure # remembering all the visited nodes. Uniqueness.RELATIONSHIP_GLOBAL # No relationship in the entire graph may be # visited more than once. For the same # reasons as NODE_GLOBAL uniqueness, this # could use up a lot of memory. But since # graphs typically have a larger number of # relationships than nodes, the memory # overhead of this uniqueness level could # grow even quicker. Uniqueness.NODE_PATH # A node may not occur previously in the # path reaching up to it. Uniqueness.RELATIONSHIP_PATH # A relationship may not occur previously in # the path reaching up to it. Uniqueness.NODE_RECENT # Similar to NODE_GLOBAL uniqueness in that # there is a global collection of visited # nodes each position is checked against. # This uniqueness level does however have a # cap on how much memory it may consume in # the form of a collection that only # contains the most recently visited nodes. # The size of this collection can be # specified by providing a number as the # second argument to the # uniqueness()-method along with the # uniqueness level. Uniqueness.RELATIONSHIP_RECENT # works like NODE_RECENT uniqueness, but # with relationships instead of nodes. traverser = db.traversal()\ .uniqueness(Uniqueness.NODE_PATH)\ .traverse(start_node)
You can traverse either depth first, or breadth first. Depth first is the default, because it has lower memory overhead.
# Depth first traversal, this # is the default. traverser = db.traversal()\ .depthFirst()\ .traverse(self.source) # Breadth first traversal traverser = db.traversal()\ .breadthFirst()\ .traverse(start_node)
In order to traverse based on other critera, such as node properties, or more complex things like neighboring nodes or patterns, we use evaluators. An evaluator is a normal Python method that takes a path as an argument, and returns a description of what to do next.
The path argument is the current position the traverser is at, and the description of what to do can be one of four things, as seen in the example below.
from neo4j import Evaluation # Evaluation contains the four # options that an evaluator can # return. They are: Evaluation.INCLUDE_AND_CONTINUE # Include this node in the result and # continue the traversal Evaluation.INCLUDE_AND_PRUNE # Include this node in the result, but don't # continue the traversal Evaluation.EXCLUDE_AND_CONTINUE # Exclude this node from the result, but # continue the traversal Evaluation.EXCLUDE_AND_PRUNE # Exclude this node from the result and # don't continue the traversal # An evaluator def my_evaluator(path): # Filter on end node property if path.end['message'] == 'world': return Evaluation.INCLUDE_AND_CONTINUE # Filter on last relationship type if path.last_relationship.type.name() == 'related_to': return Evaluation.INCLUDE_AND_PRUNE # You can do even more complex things here, like subtraversals. return Evaluation.EXCLUDE_AND_CONTINUE # Use the evaluator traverser = db.traversal()\ .evaluator(my_evaluator)\ .traverse(start_node)
Copyright © 2012 Neo Technology