org.apache.nutch.crawl
public class: CrawlDb
java.lang.Object
   org.apache.hadoop.conf.Configured
      org.apache.nutch.crawl.CrawlDb

All Implemented Interfaces:
    org.apache.hadoop.util.Tool

This class takes the output of the fetcher and updates the crawldb accordingly.
Field Summary
public static final  Log LOG     
public static final  String CRAWLDB_ADDITIONS_ALLOWED     
public static final  String CURRENT_NAME     
public static final  String LOCK_NAME     
Constructor:
 public CrawlDb() 
 public CrawlDb(Configuration conf) 
Method from org.apache.nutch.crawl.CrawlDb Summary:
createJob,   install,   main,   run,   update,   update
Methods from java.lang.Object:
clone,   equals,   finalize,   getClass,   hashCode,   notify,   notifyAll,   toString,   wait,   wait,   wait
Method from org.apache.nutch.crawl.CrawlDb Detail:
 public static JobConf createJob(Configuration config,
    Path crawlDb) throws IOException
 public static void install(JobConf job,
    Path crawlDb) throws IOException
 public static void main(String[] args) throws Exception
 public int run(String[] args) throws Exception
 public void update(Path crawlDb,
    Path[] segments,
    boolean normalize,
    boolean filter) throws IOException
 public void update(Path crawlDb,
    Path[] segments,
    boolean normalize,
    boolean filter,
    boolean additionsAllowed,
    boolean force) throws IOException
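
The listing above gives only signatures, so the following is a toy, in-memory sketch of what "takes the output of the fetcher and updates the crawldb" could look like, with particular attention to the additionsAllowed flag. It is not the Nutch implementation: the class name CrawlDbSketch, the use of plain maps in place of Path arguments, and the status strings are all hypothetical, and the reading of additionsAllowed (newly discovered URLs are merged only when the flag is true) is an assumption rather than something this page states.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy in-memory model of a crawl database update; NOT the Nutch implementation. */
final class CrawlDbSketch {

    /**
     * Merges per-segment fetch results into a copy of the crawl database.
     * Keys are URLs; values are hypothetical status strings.
     *
     * @param crawlDb          current URL-to-status map (left unmodified)
     * @param segments         fetcher output, one map per segment
     * @param additionsAllowed assumed meaning: if false, URLs not already
     *                         present in crawlDb are ignored
     * @return a new map holding the updated state
     */
    static Map<String, String> update(Map<String, String> crawlDb,
                                      List<Map<String, String>> segments,
                                      boolean additionsAllowed) {
        // Work on a copy so the caller's map stays untouched.
        Map<String, String> updated = new LinkedHashMap<>(crawlDb);
        for (Map<String, String> segment : segments) {
            for (Map.Entry<String, String> entry : segment.entrySet()) {
                boolean known = updated.containsKey(entry.getKey());
                // Known URLs are always refreshed; unknown ones only if allowed.
                if (known || additionsAllowed) {
                    updated.put(entry.getKey(), entry.getValue());
                }
            }
        }
        return updated;
    }
}
```

Under these assumptions, calling CrawlDbSketch.update with additionsAllowed set to false refreshes the status of URLs that are already tracked and drops any URL that appears only in a segment. Setting it to true also adds those new URLs.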