lucene - Why is my index directory so small when fields are compressed?

- September 15, 2012

The answer is probably in the question itself, but I would like to make sure :-)

I have 10,000 documents indexed. Each has a field that stores a text of about 100 KB (it comes from a text file encoded in UTF-8). When this field is uncompressed, the index is 436 MB, but when the field is compressed, it is only 11.4 MB. That is a compression ratio of roughly 38 (436 / 11.4). Is that too good to be true or not? Or is it possible that data from the index directory is stored somewhere else on my computer?

When I retrieve the field there is no error and everything is fine, but I know from life that if something seems too good to be true, there is usually something wrong. :D

Here is the code:

  // raw: stored only, not searchable
  FieldType fieldType2 = new FieldType();
  fieldType2.setIndexed(false);
  fieldType2.setTokenized(false);
  fieldType2.setStored(true);
  fieldType2.setOmitNorms(true);
  fieldType2.setIndexOptions(FieldInfo.IndexOptions.DOCS_ONLY);
  fieldType2.freeze();
  // compress the text before storing it as a binary field
  Field raw = new Field("raw", CompressionTools.compressString(text), fieldType2);
  doc.add(raw);

The author of the compression feature reports 76M -> 1.7M, so your results look comparable.
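To see why ratios in this range are plausible for text, here is a minimal sketch that compresses a sample with `java.util.zip.Deflater` at `BEST_COMPRESSION` (to my knowledge the default level used by Lucene's `CompressionTools`) and reports the ratio. The class name `CompressionRatioSketch`, its helper methods, and the sample text are assumptions made for illustration; the sample is far more repetitive than real documents, so it will compress better than they do.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper class, not part of Lucene.
public class CompressionRatioSketch {

    // Compress a byte array with Deflater at BEST_COMPRESSION.
    public static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        while (!deflater.finished()) {
            int n = deflater.deflate(buffer);
            out.write(buffer, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Inverse of compress(): restore the original bytes.
    public static byte[] decompress(byte[] input) {
        Inflater inflater = new Inflater();
        inflater.setInput(input);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && inflater.needsInput()) {
                    throw new IllegalArgumentException("truncated input");
                }
                out.write(buffer, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("not valid deflate data", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    // Original size divided by compressed size.
    public static double ratio(byte[] original, byte[] compressed) {
        return (double) original.length / compressed.length;
    }

    public static void main(String[] args) {
        // Assumed sample text: one sentence repeated many times.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 2000; i++) {
            sb.append("The quick brown fox jumps over the lazy dog. ");
        }
        byte[] original = sb.toString().getBytes(StandardCharsets.UTF_8);
        byte[] compressed = compress(original);
        byte[] restored = decompress(compressed);

        System.out.println("original bytes:   " + original.length);
        System.out.println("compressed bytes: " + compressed.length);
        System.out.println("ratio:            " + ratio(original, compressed));
        System.out.println("round trip ok:    " + Arrays.equals(original, restored));
    }
}
```

Running the same measurement on one of your own 100 KB texts gives a per-document ratio you can compare against the ratio of the two index sizes.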

And of course, Lucene does not write files outside the configured directory; that would be a big bug.
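If you want to confirm the on-disk footprint yourself, one option is to sum the sizes of the files in the index directory. This is a minimal sketch using only `java.nio.file`; the class name `IndexDirSize` and the path taken from the command line are assumptions, so substitute the directory you configured for your index.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

// Hypothetical helper class, not part of Lucene.
public class IndexDirSize {

    // Sum the sizes of all regular files below dir (recursively).
    public static long totalBytes(Path dir) {
        try (Stream<Path> paths = Files.walk(dir)) {
            return paths.filter(Files::isRegularFile)
                        .mapToLong(IndexDirSize::sizeOf)
                        .sum();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Files.size with the checked exception wrapped for use in a stream.
    private static long sizeOf(Path file) {
        try {
            return Files.size(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed usage: pass the index directory as the first argument.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        long bytes = totalBytes(dir);
        System.out.printf("%s: %d bytes (%.1f MB)%n",
                dir, bytes, bytes / (1024.0 * 1024.0));
    }
}
```

Comparing this total for the compressed and uncompressed index shows whether the sizes reported by your file manager match what is actually in the directory.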
