Neo4j has a batch insertion mode intended for initial imports. It must run in a single thread, and it bypasses transactions and other checks in favor of performance. Indexing during batch insertion is done with BatchInserterIndex instances, which are provided by a BatchInserterIndexProvider. An example:
    BatchInserter inserter = new BatchInserterImpl( "target/neo4jdb-batchinsert" );
    BatchInserterIndexProvider indexProvider = new LuceneBatchInserterIndexProvider( inserter );
    BatchInserterIndex actors = indexProvider.nodeIndex( "actors", MapUtil.stringMap( "type", "exact" ) );
    actors.setCacheCapacity( "name", 100000 );

    Map<String, Object> properties = MapUtil.map( "name", "Keanu Reeves" );
    long node = inserter.createNode( properties );
    actors.add( node, properties );

    // Make the changes visible for reading. Use this sparingly, since it requires IO!
    actors.flush();

    // Make sure to shut down the index provider
    indexProvider.shutdown();
    inserter.shutdown();
The configuration parameters are the same as mentioned in Section 14.10, “Configuration and fulltext indexes”.
Here are some pointers to get the most performance out of BatchInserterIndex:
Note: Changes to the index become available for reading only after they have been flushed to disk. For optimal performance, keep read and lookup operations to a minimum during batch insertion, since they involve IO and slow the import down.
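The visibility rule in the note above suggests grouping work into phases: add everything first, flush once, then do lookups. The following is a minimal, self-contained sketch of that pattern. It does not use the Neo4j API. `BufferedIndex` is a hypothetical in-memory stand-in that only mimics the "additions are invisible until flush" behavior, and the names "Actor B" and "Actor C" are made-up sample data.

```java
import java.util.HashMap;
import java.util.Map;

public class FlushPhases {

    /**
     * Hypothetical stand-in for a batch index (NOT the Neo4j class):
     * additions are buffered and only become readable after flush().
     */
    static class BufferedIndex {
        private final Map<String, Long> visible = new HashMap<>();
        private final Map<String, Long> pending = new HashMap<>();
        private int flushCount = 0;

        void add(long node, String name) {
            pending.put(name, node);
        }

        void flush() {
            // Publish everything added since the last flush.
            visible.putAll(pending);
            pending.clear();
            flushCount++;
        }

        /** Returns the node id, or null if the entry is absent or not yet flushed. */
        Long get(String name) {
            return visible.get(name);
        }

        int flushCount() {
            return flushCount;
        }
    }

    public static void main(String[] args) {
        BufferedIndex actors = new BufferedIndex();
        String[] names = { "Keanu Reeves", "Actor B", "Actor C" };

        // Write phase: additions only, no lookups and no flushes.
        for (int i = 0; i < names.length; i++) {
            actors.add(i, names[i]);
        }

        // A lookup before flushing finds nothing.
        System.out.println(actors.get("Keanu Reeves")); // prints null

        // One flush at the end of the write phase.
        actors.flush();

        // Read phase: all additions are now visible.
        for (String name : names) {
            System.out.println(name + " -> " + actors.get(name));
        }
        System.out.println("flushes: " + actors.flushCount());
    }
}
```

With a real BatchInserterIndex the same shape would apply: call `actors.add( ... )` for a whole batch, then `actors.flush()` once before the first lookup, rather than flushing after every addition.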
Copyright © 2012 Neo Technology