
Lucene::Net::Index::IndexWriter Class Reference


Detailed Description

An IndexWriter creates and maintains an index. The third constructor argument, create, determines whether a new index is created or an existing index is opened so that new documents can be added. In either case, documents are added with the AddDocument method. When you have finished adding documents, call Close.

If no more documents will be added to an index for a while and optimal search performance is desired, call the Optimize method before closing the index.
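
The lifecycle described above (construct, add documents, optionally optimize, close) might look like the following minimal sketch. It uses only the IndexWriter members listed on this page. The IndexBuilder class and BuildIndex helper are hypothetical names introduced for illustration, and the caller is assumed to supply an Analyzer and already-built Document objects.

```csharp
using System.Collections.Generic;
using Lucene.Net.Analysis;
using Lucene.Net.Documents;
using Lucene.Net.Index;

// Hypothetical helper class; not part of Lucene.Net.
public static class IndexBuilder
{
    // Builds a fresh index at indexPath and returns the number of documents in it.
    public static int BuildIndex(string indexPath, Analyzer analyzer,
                                 IEnumerable<Document> documents)
    {
        // create == true: start a new, empty index, replacing any index already there.
        IndexWriter writer = new IndexWriter(indexPath, analyzer, true);
        try
        {
            foreach (Document doc in documents)
            {
                writer.AddDocument(doc);
            }

            // Merge all segments into one, since no further additions are planned.
            writer.Optimize();
            return writer.DocCount();
        }
        finally
        {
            // Flush all changes and close the associated files, even on failure.
            writer.Close();
        }
    }
}
```

Passing false as the third constructor argument would instead open the existing index at indexPath for further additions.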

Opening an IndexWriter creates a lock file for the directory in use. Trying to open another IndexWriter on the same directory will lead to an IOException. The IOException is also thrown if an IndexReader on the same directory is used to delete documents from the index.
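
The exclusive-lock behavior can be illustrated with a small stand-alone model. This is a sketch only, not the Lucene.Net locking code: the DirectoryWriteLock class is hypothetical, the lock file is assumed to live inside the index directory itself, and the model fails immediately instead of retrying until WRITE_LOCK_TIMEOUT expires. The file name is taken from WRITE_LOCK_NAME.

```csharp
using System;
using System.IO;

// Hypothetical model of a per-directory write lock; not the Lucene.Net implementation.
public sealed class DirectoryWriteLock : IDisposable
{
    private readonly string lockPath;
    private FileStream handle;

    public DirectoryWriteLock(string indexDirectory)
    {
        // Assumption: the lock file sits in the index directory itself.
        lockPath = Path.Combine(indexDirectory, "write.lock");

        // FileMode.CreateNew throws IOException if the file already exists,
        // which is how a second writer on the same directory is rejected here.
        handle = new FileStream(lockPath, FileMode.CreateNew,
                                FileAccess.Write, FileShare.None);
    }

    public void Dispose()
    {
        if (handle == null)
        {
            return; // already released
        }
        handle.Dispose();
        handle = null;
        File.Delete(lockPath); // release the lock for the next writer
    }
}
```

In this model, constructing a second DirectoryWriteLock for the same directory throws an IOException until the first one has been disposed.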

See also:
IndexModifier, which supports the important methods of IndexWriter plus deletion

Definition at line 52 of file IndexWriter.cs.


Public Member Functions

virtual void AddDocument (Document doc, Analyzer analyzer)
 Adds a document to this index, using the provided analyzer instead of the value of GetAnalyzer(). If the document contains more than SetMaxFieldLength(int) terms for a given field, the remainder are discarded.
virtual void AddDocument (Document doc)
 Adds a document to this index. If the document contains more than SetMaxFieldLength(int) terms for a given field, the remainder are discarded.
virtual void AddIndexes (IndexReader[] readers)
 Merges the provided indexes into this index.
virtual void AddIndexes (Directory[] dirs)
 Merges all segments from an array of indexes into this index.
virtual void Close ()
 Flushes all changes to an index and closes all associated files.
virtual int DocCount ()
 Returns the number of documents currently in this index.
virtual Analyzer GetAnalyzer ()
 Returns the analyzer used by this index.
virtual Directory GetDirectory ()
 Returns the Directory used by this index.
virtual System.IO.TextWriter GetInfoStream ()
virtual int GetMaxBufferedDocs ()
virtual int GetMaxFieldLength ()
virtual int GetMaxMergeDocs ()
virtual int GetMergeFactor ()
virtual Similarity GetSimilarity ()
 Expert: Return the Similarity implementation used by this IndexWriter.
virtual int GetTermIndexInterval ()
 Expert: Return the interval between indexed terms.
virtual bool GetUseCompoundFile ()
 Gets the current setting of whether to use the compound file format. Note that this just returns the value you set with SetUseCompoundFile(bool), or the default. You cannot use this to query the status of an existing index.
 IndexWriter (Directory d, Analyzer a, bool create)
 Constructs an IndexWriter for the index in d. Text will be analyzed with a. If create is true, then a new, empty index will be created in d, replacing the index already there, if any.
 IndexWriter (System.IO.FileInfo path, Analyzer a, bool create)
 Constructs an IndexWriter for the index in path. Text will be analyzed with a. If create is true, then a new, empty index will be created in path, replacing the index already there, if any.
 IndexWriter (System.String path, Analyzer a, bool create)
 Constructs an IndexWriter for the index in path. Text will be analyzed with a. If create is true, then a new, empty index will be created in path, replacing the index already there, if any.
virtual void Optimize ()
 Merges all segments together into a single segment, optimizing an index for search.
virtual void SetInfoStream (System.IO.TextWriter infoStream)
 If non-null, information about merges and a message when maxFieldLength is reached will be printed to this.
virtual void SetMaxBufferedDocs (int maxBufferedDocs)
 Determines the minimal number of documents required before the buffered in-memory documents are merged and a new Segment is created. Since Documents are merged in a Lucene.Net.Store.RAMDirectory, a large value gives faster indexing. At the same time, mergeFactor limits the number of files open in an FSDirectory.
virtual void SetMaxFieldLength (int maxFieldLength)
 The maximum number of terms that will be indexed for a single field in a document. This limits the amount of memory required for indexing, so that collections with very large files will not crash the indexing process by running out of memory.

Note that this effectively truncates large documents, excluding from the index terms that occur later in the document. If you know your source documents are large, be sure to set this value high enough to accommodate the expected size. If you set it to System.Int32.MaxValue, then the only limit is your memory, but you should anticipate an OutOfMemoryException.

By default, no more than 10,000 terms will be indexed for a field.

virtual void SetMaxMergeDocs (int maxMergeDocs)
 Determines the largest number of documents ever merged by AddDocument(). Small values (e.g., less than 10,000) are best for interactive indexing, as this limits the length of pauses while indexing to a few seconds. Larger values are best for batched indexing and speedier searches.
virtual void SetMergeFactor (int mergeFactor)
 Determines how often segment indices are merged by AddDocument(). With smaller values, less RAM is used while indexing, and searches on unoptimized indices are faster, but indexing speed is slower. With larger values, more RAM is used during indexing, and while searches on unoptimized indices are slower, indexing is faster. Thus larger values (> 10) are best for batch index creation, and smaller values (< 10) for indices that are interactively maintained.
virtual void SetSimilarity (Similarity similarity)
 Expert: Set the Similarity implementation used by this IndexWriter.
virtual void SetTermIndexInterval (int interval)
 Expert: Set the interval between indexed terms. Large values cause less memory to be used by an IndexReader, but slow random access to terms. Small values cause more memory to be used by an IndexReader, and speed up random access to terms.
virtual void SetUseCompoundFile (bool value_Renamed)
 Setting to turn on usage of a compound file. When on, multiple files for each segment are merged into a single file once the segment creation is finished. This is done regardless of what directory is in use.
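
How SetMaxBufferedDocs and SetMergeFactor interact can be seen in a simplified, stand-alone model of logarithmic segment merging. This is a sketch under stated assumptions and not a transcription of MaybeMergeSegments: the MergeModel class is hypothetical, every added document starts as its own one-document segment, no distinction is made between RAM and disk segments, and maxMergeDocs is ignored.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified model of segment merging; not the Lucene.Net code.
public static class MergeModel
{
    // Returns the segment sizes (oldest first) after adding docCount
    // documents one at a time.
    public static List<int> Simulate(int docCount, int maxBufferedDocs, int mergeFactor)
    {
        List<int> segments = new List<int>();
        for (int i = 0; i < docCount; i++)
        {
            segments.Add(1); // each new document starts as a one-document segment

            long target = maxBufferedDocs;
            while (true)
            {
                // Walk back from the newest segment, summing the sizes of all
                // trailing segments that are smaller than the current target.
                int minSegment = segments.Count;
                long mergeDocs = 0;
                while (--minSegment >= 0)
                {
                    if (segments[minSegment] >= target)
                    {
                        break;
                    }
                    mergeDocs += segments[minSegment];
                }

                if (mergeDocs < target)
                {
                    break; // not enough small segments to merge at this level
                }

                // Replace the trailing small segments with one merged segment,
                // then check whether the next level can be merged as well.
                int first = minSegment + 1;
                segments.RemoveRange(first, segments.Count - first);
                segments.Add((int)mergeDocs);
                target *= mergeFactor;
            }
        }
        return segments;
    }
}
```

With both values at their default of 10, this model leaves 1,234 added documents in segments of 1000, 100, 100, 10, 10, 10, 1, 1, 1 and 1 documents. Raising either value delays merging, which is the trade-off between indexing speed and the number of segments a search must visit.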

Public Attributes

const System.String COMMIT_LOCK_NAME = "commit.lock"
const long COMMIT_LOCK_TIMEOUT = 10000
 Default value is 10,000 (milliseconds).
const int DEFAULT_MAX_BUFFERED_DOCS = 10
 Default value is 10. Change using SetMaxBufferedDocs(int).
const int DEFAULT_MAX_FIELD_LENGTH = 10000
 Default value is 10,000. Change using SetMaxFieldLength(int).
const int DEFAULT_MERGE_FACTOR = 10
 Default value is 10. Change using SetMergeFactor(int).
const int DEFAULT_TERM_INDEX_INTERVAL = 128
 Default value is 128. Change using SetTermIndexInterval(int).
System.IO.TextWriter infoStream = null
 If non-null, information about merges will be printed to this.
int maxFieldLength = DEFAULT_MAX_FIELD_LENGTH
 The maximum number of terms that will be indexed for a single field in a document. This limits the amount of memory required for indexing, so that collections with very large files will not crash the indexing process by running out of memory.

Note that this effectively truncates large documents, excluding from the index terms that occur later in the document. If you know your source documents are large, be sure to set this value high enough to accommodate the expected size. If you set it to System.Int32.MaxValue, then the only limit is your memory, but you should anticipate an OutOfMemoryException.

By default, no more than 10,000 terms will be indexed for a field.

int maxMergeDocs = DEFAULT_MAX_MERGE_DOCS
 Determines the largest number of documents ever merged by AddDocument(). Small values (e.g., less than 10,000) are best for interactive indexing, as this limits the length of pauses while indexing to a few seconds. Larger values are best for batched indexing and speedier searches.
int mergeFactor = DEFAULT_MERGE_FACTOR
 Determines how often segment indices are merged by AddDocument(). With smaller values, less RAM is used while indexing, and searches on unoptimized indices are faster, but indexing speed is slower. With larger values, more RAM is used during indexing, and while searches on unoptimized indices are slower, indexing is faster. Thus larger values (> 10) are best for batch index creation, and smaller values (< 10) for indices that are interactively maintained.
int minMergeDocs = DEFAULT_MIN_MERGE_DOCS
 Determines the minimal number of documents required before the buffered in-memory documents are merged and a new Segment is created. Since Documents are merged in a Lucene.Net.Store.RAMDirectory, a large value gives faster indexing. At the same time, mergeFactor limits the number of files open in an FSDirectory.
const System.String WRITE_LOCK_NAME = "write.lock"
const long WRITE_LOCK_TIMEOUT = 1000
 Default value is 1,000 (milliseconds).
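
The truncation performed by maxFieldLength can be pictured with a small stand-alone model. It is a sketch only: the FieldLengthModel class is hypothetical, and a field is assumed to arrive as an already-analyzed sequence of terms.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model of the maxFieldLength limit; not the Lucene.Net code.
public static class FieldLengthModel
{
    // Returns the terms of one field that would be indexed: the first
    // maxFieldLength terms are kept and the remainder are discarded.
    public static List<string> TermsToIndex(IEnumerable<string> terms, int maxFieldLength)
    {
        List<string> kept = new List<string>();
        foreach (string term in terms)
        {
            if (kept.Count >= maxFieldLength)
            {
                break; // everything later in the field is left out of the index
            }
            kept.Add(term);
        }
        return kept;
    }
}
```

Under this model, a search for a term that first appears after the limit would not match the document, which is why large source documents call for a larger limit.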

Static Public Attributes

static readonly int DEFAULT_MAX_MERGE_DOCS = System.Int32.MaxValue
 Default value is System.Int32.MaxValue. Change using SetMaxMergeDocs(int).
static readonly int DEFAULT_MIN_MERGE_DOCS = DEFAULT_MAX_BUFFERED_DOCS

Private Member Functions

void DeleteFiles (System.Collections.ArrayList files, System.Collections.ArrayList deletable)
void DeleteFiles (System.Collections.ArrayList files, Directory directory)
void DeleteFiles (System.Collections.ArrayList files)
void DeleteSegments (System.Collections.ArrayList segments)
void FlushRamSegments ()
 Merges all RAM-resident segments.
internal int GetSegmentsCounter ()
 IndexWriter (Directory d, Analyzer a, bool create, bool closeDir)
void InitBlock ()
void MaybeMergeSegments ()
 Incremental segment merger.
void MergeSegments (int minSegment, int end)
 Merges the named range of segments, replacing them in the stack with a single segment.
void MergeSegments (int minSegment)
 Pops segments off of segmentInfos stack down to minSegment, merges them, and pushes the merged index onto the top of the segmentInfos stack.
System.String NewSegmentName ()
System.Collections.ArrayList ReadDeleteableFiles ()
void WriteDeleteableFiles (System.Collections.ArrayList files)
 ~IndexWriter ()
 Release the write lock, if needed.

Private Attributes

Analyzer analyzer
bool closeDir
Directory directory
Directory ramDirectory = new RAMDirectory()
SegmentInfos segmentInfos = new SegmentInfos()
Similarity similarity
int singleDocSegmentsCount = 0
int termIndexInterval = DEFAULT_TERM_INDEX_INTERVAL
bool useCompoundFile = true
 Use compound file setting. Defaults to true, minimizing the number of files used. Setting this to false may improve indexing performance, but may also cause file handle problems.
Lock writeLock

Classes

class  AnonymousClassWith
class  AnonymousClassWith1
class  AnonymousClassWith2
class  AnonymousClassWith3
class  AnonymousClassWith4

The documentation for this class was generated from the following file:

Generated by Doxygen 1.6.0