To find nodes or relationships quickly based on their properties, Neo4j supports indexing. Indexes are commonly used to find start nodes for traversals.
By default, the underlying index is powered by Apache Lucene, but it is also possible to use Neo4j with other index implementations.
You can create an arbitrary number of named indexes. Each index handles either nodes or relationships, and each index works by indexing key/value/object triplets, object being either a node or a relationship, depending on the index type.
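To make the key/value/object triplet idea concrete, here is a minimal in-memory sketch. It is not Neo4j code and does not reflect how Lucene stores data; the `ToyIndex` class, its methods, and the string identifiers standing in for nodes are all assumptions made for illustration.

```python
from collections import defaultdict


class ToyIndex:
    """Illustrative in-memory stand-in for one named index (not Neo4j code)."""

    def __init__(self, name):
        self.name = name
        # Nested mapping: key -> value -> set of indexed objects.
        self._entries = defaultdict(lambda: defaultdict(set))

    def add(self, key, value, obj):
        """Record one key/value/object triplet."""
        self._entries[key][value].add(obj)

    def lookup(self, key, value):
        """Return a copy of the objects stored under an exact key/value pair."""
        return set(self._entries.get(key, {}).get(value, set()))


# Plain strings stand in for nodes here (hypothetical identifiers).
people = ToyIndex('people')
people.add('name', 'Alice', 'node-1')
people.add('name', 'Alice', 'node-2')  # several objects can share a key/value pair
people.add('city', 'Malmo', 'node-1')  # one object can be indexed under several keys

alice_hits = people.lookup('name', 'Alice')
```

The point of the sketch is that an index entry is always the full triplet: the same object can appear under many key/value pairs, and one key/value pair can lead to many objects.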
Just like the rest of the API, all write operations to the index must be performed from within a transaction.
Create a new index, with optional configuration.
    with db.transaction:
        # Create a relationship index
        rel_idx = db.relationship.indexes.create('my_rels')

        # Create a node index, passing optional
        # arguments to the index provider.
        # In this case, enable full-text indexing.
        node_idx = db.node.indexes.create('my_nodes', type='fulltext')
Retrieve an existing index by name.

    with db.transaction:
        node_idx = db.node.indexes.get('my_nodes')
        rel_idx = db.relationship.indexes.get('my_rels')
Delete an index.

    with db.transaction:
        node_idx = db.node.indexes.get('my_nodes')
        node_idx.delete()

        rel_idx = db.relationship.indexes.get('my_rels')
        rel_idx.delete()
Add nodes or relationships to an index.

    with db.transaction:
        # Indexing nodes
        a_node = db.node()
        node_idx = db.node.indexes.create('my_nodes')

        # Add the node to the index
        node_idx['akey']['avalue'] = a_node

        # Indexing relationships
        a_relationship = a_node.knows(db.node())
        rel_idx = db.relationship.indexes.create('my_rels')

        # Add the relationship to the index
        rel_idx['akey']['avalue'] = a_relationship
Removing items from an index can be done at several levels of granularity. See the example below.
    # Remove a specific key/value/item triplet
    del idx['akey']['avalue'][item]

    # Remove all instances under a certain key
    del idx['akey'][item]

    # Remove all instances altogether
    del idx[item]
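The three levels of removal can be modelled on a plain nested dictionary. The following is only a sketch of the semantics, assuming an index shaped like `{key: {value: set_of_items}}`; the `remove` and `make_entries` functions are hypothetical helpers, not part of the Neo4j API.

```python
def remove(entries, item, key=None, value=None):
    """Drop `item` from a dict shaped like {key: {value: set_of_items}}.

    With key and value: remove one key/value/item triplet.
    With key only: remove the item from every value under that key.
    With neither: remove the item from the whole index.
    """
    if key is None and value is not None:
        raise ValueError("a value can only be given together with a key")
    keys = [key] if key is not None else list(entries)
    for k in keys:
        values = entries.get(k, {})
        targets = [value] if value is not None else list(values)
        for v in targets:
            values.get(v, set()).discard(item)


def make_entries():
    """Build a fresh toy index so each removal starts from the same state."""
    return {
        'akey': {'avalue': {'n1', 'n2'}, 'other': {'n1'}},
        'bkey': {'avalue': {'n1'}},
    }


one_triplet = make_entries()
remove(one_triplet, 'n1', key='akey', value='avalue')

one_key = make_entries()
remove(one_key, 'n1', key='akey')

everywhere = make_entries()
remove(everywhere, 'n1')
```

Each call narrows or widens the scope of the deletion in the same way the three `del` statements do: one triplet, one key, or the entire index.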
You can retrieve indexed items in two ways: by direct lookup or by query. Direct lookup works the same way across index providers, while the query syntax depends on the index provider you use. As mentioned previously, Lucene is the default and by far the most common index provider. To query Lucene, use the Lucene query language.
There is a Python library for programmatically generating Lucene queries, available on GitHub.
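If you would rather not depend on a library, a Lucene query string can also be assembled by hand. The sketch below is a hypothetical helper, not the library mentioned above and not part of Neo4j; the names `escape`, `term`, `all_of`, and `any_of` are assumptions. It escapes the characters that the classic Lucene query parser treats as syntax and joins clauses with `AND` and `OR`.

```python
import re

# Characters and operators the classic Lucene query parser treats as syntax.
_SPECIAL = re.compile(r'([+\-!(){}\[\]^"~*?:\\/]|&&|\|\|)')


def escape(text):
    """Backslash-escape Lucene query syntax in a literal value."""
    return _SPECIAL.sub(r'\\\1', text)


def term(key, value):
    """Build a `key:value` clause, quoting values that contain whitespace."""
    escaped = escape(value)
    if any(ch.isspace() for ch in value):
        escaped = '"%s"' % escaped
    return '%s:%s' % (key, escaped)


def all_of(*clauses):
    """Require every clause to match."""
    return ' AND '.join(clauses)


def any_of(*clauses):
    """Require at least one clause to match; parenthesized for safe nesting."""
    return '(' + ' OR '.join(clauses) + ')'


query = all_of(
    term('name', 'Alice'),
    any_of(term('city', 'Malmo'), term('city', 'New York')),
)
# query is now: name:Alice AND (city:Malmo OR city:"New York")
```

The resulting string is what you would hand to the index when querying. Escaping matters most when values come from user input, since an unescaped `:` or `*` would change the meaning of the query.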
Important: Unless you loop through the entire index result, you have to close the result when you are done with it. If you do not, the database does not know when it can release the resources the result is taking up.
    hits = idx['akey']['avalue']
    for item in hits:
        pass

    # Always close index results when you are
    # done, to free up resources.
    hits.close()
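One way to make sure a result is closed even when you stop iterating early is to wrap it in `contextlib.closing` from the Python standard library, which calls `close()` on exit from the `with` block. The sketch below uses a hypothetical `FakeHits` class in place of a real index result; the only assumption about the real object is that it is iterable and has a `close()` method, as the example above shows.

```python
from contextlib import closing


class FakeHits:
    """Hypothetical stand-in for an index result: iterable, with close()."""

    def __init__(self, items):
        self._items = list(items)
        self.closed = False

    def __iter__(self):
        return iter(self._items)

    def close(self):
        self.closed = True


def first_match(hits, predicate):
    """Return the first item satisfying predicate, always closing hits."""
    with closing(hits):
        for item in hits:
            if predicate(item):
                # Early exit: closing() still calls hits.close().
                return item
    return None


hits = FakeHits(['n1', 'n2', 'n3'])
found = first_match(hits, lambda item: item == 'n2')
```

Because the `with` block runs `close()` on normal exit, early return, and exceptions alike, the cleanup cannot be skipped by accident.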
Copyright © 2012 Neo Technology