org.apache.lucene.benchmark.utils

Classes:

ExtractReuters   Split the Reuters SGML documents into simple text files containing: Title, Date, Dateline, Body.
ExtractWikipedia   Extract the downloaded Wikipedia dump into separate files for indexing.
ExtractWikipedia.Parser
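
The kind of splitting described for ExtractReuters can be illustrated with a minimal sketch. This is not the Lucene implementation: the class name ReutersSketch and its methods are hypothetical, and the tag names (REUTERS, TITLE, DATE, DATELINE, BODY) are assumed from the Reuters-21578 SGML layout. It returns one text blob per article instead of writing files.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the code in org.apache.lucene.benchmark.utils.
class ReutersSketch {
    // Assumption: each article is wrapped in <REUTERS ...> ... </REUTERS>.
    private static final Pattern ARTICLE =
        Pattern.compile("<REUTERS\\b.*?</REUTERS>", Pattern.DOTALL);

    // Returns the trimmed text inside the first <tag>...</tag> pair, or "" if absent.
    static String field(String article, String tag) {
        Matcher m = Pattern
            .compile("<" + tag + ">(.*?)</" + tag + ">", Pattern.DOTALL)
            .matcher(article);
        return m.find() ? m.group(1).trim() : "";
    }

    // Builds one simple-text document per article: Title, Date, Dateline, Body,
    // separated by blank lines.
    static List<String> extract(String sgml) {
        List<String> docs = new ArrayList<>();
        Matcher m = ARTICLE.matcher(sgml);
        while (m.find()) {
            String article = m.group();
            docs.add(field(article, "TITLE") + "\n\n"
                + field(article, "DATE") + "\n\n"
                + field(article, "DATELINE") + "\n\n"
                + field(article, "BODY"));
        }
        return docs;
    }
}
```

A real extractor would also need to decode SGML character entities and write each document to its own file; both steps are left out here to keep the sketch short.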