The traversal framework consists of a few main interfaces in addition to Node
and Relationship
: TraversalDescription
, Evaluator
, Traverser
and Uniqueness
are the main ones.
The Path
interface also has a special purpose in traversals, since it is used to represent a position in the graph when evaluating that position.
Furthermore the RelationshipExpander
and Expander
interfaces are central to traversals, but users of the API rarely need to implement them.
There are also a set of interfaces for advanced use, when explicit control over the traversal order is required: BranchSelector
, BranchOrderingPolicy
and TraversalBranch
.
The TraversalDescription
is the main interface used for defining and initializing traversals.
It is not meant to be implemented by users of the traversal framework, but rather to be provided by the implementation of the traversal framework as a way for the user to describe traversals.
TraversalDescription
instances are immutable and its methods returns a new TraversalDescription
that is modified compared to the object the method was invoked on with the arguments of the method.
Adds a relationship type to the list of relationship types to traverse. By default that list is empty and it means that it will traverse all relationships, irregardless of type. If one or more relationships are added to this list only the added types will be traversed. There are two methods, one including direction and another one excluding direction, where the latter traverses relationships in both directions.
Evaluator
s are used for deciding, at each position (represented as a Path
): should the traversal continue, and/or should the node be included in the result.
Given a Path
, it asks for one of four actions for that branch of the traversal:
Evaluation.INCLUDE_AND_CONTINUE
: Include this node in the result and continue the traversal
Evaluation.INCLUDE_AND_PRUNE
: Include this node in the result, but don’t continue the traversal
Evaluation.EXCLUDE_AND_CONTINUE
: Exclude this node from the result, but continue the traversal
Evaluation.EXCLUDE_AND_PRUNE
: Exclude this node from the result and don’t continue the traversal
More than one evaluator can be added. Note that evaluators will be called for all positions the traverser encounters, even for the start node.
The Traverser
object is the result of invoking traverse()
of a TraversalDescription object.
It represents a traversal positioned in the graph, and a specification of the format of the result.
The actual traversal is performed lazily each time the next()
-method of the iterator of the Traverser
is invoked.
Sets the rules for how positions can be revisited during a traversal as stated in Uniqueness
.
Default if not set is NODE_GLOBAL
.
A Uniqueness can be supplied to the TraversalDescription to dictate under what circumstances a traversal may revisit the same position in the graph. The various uniqueness levels that can be used in Neo4j are:
NONE
- any position in the graph may be revisited.
NODE_GLOBAL
uniqueness - no node in the entire graph may be visited more than once. This could potentially consume a lot of memory since it requires keeping an in-memory data structure remembering all the visited nodes.
RELATIONSHIP_GLOBAL
uniqueness - no relationship in the entire graph may be visited more than once. For the same reasons as NODE_GLOBAL uniqueness, this could use up a lot of memory. But since graphs typically have a larger number of relationships than nodes, the memory overhead of this uniqueness level could grow even quicker.
NODE_PATH
uniqueness - a node may not occur previously in the path reaching up to it.
RELATIONSHIP_PATH
uniqueness - a relationship may not occur previously in the path reaching up to it.
NODE_RECENT
uniqueness - Similar to NODE_GLOBAL uniqueness in that there is a global collection of visited nodes each position is checked against. This uniqueness level does however have a cap on how much memory it may consume in the form of a collection that only contains the most recently visited nodes. The size of this collection can be specified by providing a number as the second argument to the TraversalDescription.uniqueness()-method along with the uniqueness level.
RELATIONSHIP_RECENT
uniqueness - works like NODE_RECENT uniqueness, but with relationships instead of nodes.
These are convenience methods for setting preorder depth-first/ breadth-first BranchSelector
|ordering
policies.
The same result can be achieved by calling the order
method with ordering policies from the Traversal
factory
, or to write your own BranchSelector
/BranchOrderingPolicy
and pass in.
A more generic version of depthFirst/breadthFirst methods in that it allows an arbitrary BranchOrderingPolicy
to be injected into the description.
A BranchSelector is used for selecting which branch of the traversal to attempt next. This is used for implementing traversal orderings. The traversal framework provides a few basic ordering implementations:
Traversal.preorderDepthFirst()
- Traversing depth first, visiting each node before visiting its child nodes.
Traversal.postorderDepthFirst()
- Traversing depth first, visiting each node after visiting its child nodes.
Traversal.preorderBreadthFirst()
- Traversing breadth first, visiting each node before visiting its child nodes.
Traversal.postorderBreadthFirst()
- Traversing breadth first, visiting each node after visiting its child nodes.
Note | |
---|---|
Please note that breadth first traversals have a higher memory overhead than depth first traversals. |
BranchSelectors carries state and hence needs to be uniquely instantiated for each traversal. Therefore it is supplied to the TraversalDescription through a BranchOrderingPolicy interface, which is a factory of BranchSelector instances.
A user of the Traversal framework rarely needs to implement his own BranchSelector or BranchOrderingPolicy, it is provided to let graph algorithm implementors provide their own traversal orders. The Neo4j Graph Algorithms package contains for example a BestFirst order BranchSelector/BranchOrderingPolicy that is used in BestFirst search algorithms such as A* and Dijkstra.
A factory for creating BranchSelectors to decide in what order branches are returned (where a branch’s position is represented as a Path
from the start node to the current node). Common policies are depth-first
and breadth-first
and that’s why there are convenience methods for those.
For example, calling TraversalDescription#depthFirst()
is equivalent to:
description.order( Traversal.preorderDepthFirst() );
An object used by the BranchSelector to get more branches from a certain branch. In essence these are a composite of a Path and a RelationshipExpander that can be used to get new TraversalBranch
es from the current one.
A Path
is a general interface that is part of the Neo4j API.
In the traversal API of Neo4j the use of Paths are twofold.
Traversers can return their results in the form of the Paths of the visited positions in the graph that are marked for being returned.
Path objects are also used in the evaluation of positions in the graph, for determining if the traversal should continue from a certain point or not, and whether a certain position should be included in the result set or not.
The traversal framework use RelationshipExpanders to discover the relationships that should be followed from a particular node to further branches in the traversal.
A more generic version of relationships where a RelationshipExpander
is injected, defining all relationships to be traversed for any given node. By default (and when using relationships) a default expander is used, where any particular order of relationships isn’t guaranteed. There’s another implementation which guarantees that relationships are traversed in order of relationship type, where types are iterated in the order they were added.
The Expander
interface is an extension of the RelationshipExpander
interface that makes it possible to build customized versions of an Expander
.
The implementation of TraversalDescription
uses this to provide methods for defining which relationship types to traverse, this is the usual way a user of the API would define a RelationshipExpander
- by building it internally in the TraversalDescription
.
All the RelationshipExpanders provided by the Neo4j traversal framework also implement the Expander interface. For a user of the traversal API it is easier to implement the RelationshipExpander interface, since it only contains one method - the method for doing getting the relationships from a node, the methods that the Expander interface adds are just for building new Expanders.
In contrary to Node#traverse a traversal description is built (using a fluent interface) and such a description can spawn traversers.
With the definition of the RelationshipTypes
as
private enum Rels implements RelationshipType { LIKES, KNOWS }
The graph can be traversed with for example the following traverser, starting at the “Joe” node:
for ( Path position : Traversal.description() .depthFirst() .relationships( Rels.KNOWS ) .relationships( Rels.LIKES, Direction.INCOMING ) .evaluator( Evaluators.toDepth( 5 ) ) .traverse( joe ) ) { output += position + "\n"; }
The traversal will output:
(7) (7)<--[LIKES,1]--(4) (7)<--[LIKES,1]--(4)--[KNOWS,6]-->(1) (7)<--[LIKES,1]--(4)--[KNOWS,6]-->(1)--[KNOWS,4]-->(6) (7)<--[LIKES,1]--(4)--[KNOWS,6]-->(1)--[KNOWS,4]-->(6)--[KNOWS,3]-->(5) (7)<--[LIKES,1]--(4)--[KNOWS,6]-->(1)--[KNOWS,4]-->(6)--[KNOWS,3]-->(5)--[KNOWS,2]-->(2) (7)<--[LIKES,1]--(4)--[KNOWS,6]-->(1)<--[KNOWS,5]--(3)
Since TraversalDescription
s
are immutable it is also useful to create template descriptions which holds common
settings shared by different traversals. For example, let’s start with this traverser:
final TraversalDescription FRIENDS_TRAVERSAL = Traversal.description() .depthFirst() .relationships( Rels.KNOWS ) .uniqueness( Uniqueness.RELATIONSHIP_GLOBAL );
This traverser would yield the following output (we will keep starting from the “Joe” node):
(7) (7)--[KNOWS,0]-->(2) (7)--[KNOWS,0]-->(2)<--[KNOWS,2]--(5) (7)--[KNOWS,0]-->(2)<--[KNOWS,2]--(5)<--[KNOWS,3]--(6) (7)--[KNOWS,0]-->(2)<--[KNOWS,2]--(5)<--[KNOWS,3]--(6)<--[KNOWS,4]--(1) (7)--[KNOWS,0]-->(2)<--[KNOWS,2]--(5)<--[KNOWS,3]--(6)<--[KNOWS,4]--(1)<--[KNOWS,5]--(3) (7)--[KNOWS,0]-->(2)<--[KNOWS,2]--(5)<--[KNOWS,3]--(6)<--[KNOWS,4]--(1)<--[KNOWS,6]--(4)
Now let’s create a new traverser from it, restricting depth to three:
for ( Path path : FRIENDS_TRAVERSAL .evaluator( Evaluators.toDepth( 3 ) ) .traverse( joe ) ) { output += path + "\n"; }
This will give us the following result:
(7) (7)--[KNOWS,0]-->(2) (7)--[KNOWS,0]-->(2)<--[KNOWS,2]--(5) (7)--[KNOWS,0]-->(2)<--[KNOWS,2]--(5)<--[KNOWS,3]--(6)
Or how about from depth two to four? That’s done like this:
for ( Path path : FRIENDS_TRAVERSAL .evaluator( Evaluators.fromDepth( 2 ) ) .evaluator( Evaluators.toDepth( 4 ) ) .traverse( joe ) ) { output += path + "\n"; }
This traversal gives us:
(7)--[KNOWS,0]-->(2)<--[KNOWS,2]--(5) (7)--[KNOWS,0]-->(2)<--[KNOWS,2]--(5)<--[KNOWS,3]--(6) (7)--[KNOWS,0]-->(2)<--[KNOWS,2]--(5)<--[KNOWS,3]--(6)<--[KNOWS,4]--(1)
For various useful evaluators, see the Evaluators Java API or simply implement the Evaluator interface yourself.
If you’re not interested in the Path
s,
but the Node
s
you can transform the traverser into an iterable of nodes
like this:
for ( Node node : FRIENDS_TRAVERSAL .traverse( joe ) .nodes() ) { output += node.getProperty( "name" ) + "\n"; }
In this case we use it to retrieve the names:
Joe Sara Peter Dirk Lars Ed Lisa
Relationships are fine as well, here’s how to get them:
for ( Relationship relationship : FRIENDS_TRAVERSAL .traverse( joe ) .relationships() ) { output += relationship.getType() + "\n"; }
Here the relationship types are written, and we get:
KNOWS KNOWS KNOWS KNOWS KNOWS KNOWS
The full source for this example is available at
Copyright © 2012 Neo Technology