A
- ACCEPT_CONFIG - Static variable in class org.apache.cocoon.components.crawler.SimpleCocoonCrawlerImpl
- Config element name specifying HTTP header value for Accept.
- ACCEPT_DEFAULT - Static variable in class org.apache.cocoon.components.crawler.SimpleCocoonCrawlerImpl
- Default value of accept configuration value.
- accept - Variable in class org.apache.cocoon.components.crawler.SimpleCocoonCrawlerImpl
C
- CocoonCrawler - interface org.apache.cocoon.components.crawler.CocoonCrawler.
- The Avalon behavioural component interface for crawling.
- cocoonCrawler - Variable in class org.apache.cocoon.components.crawler.SimpleCocoonCrawlerImpl.CocoonCrawlerIterator
- configure(Configuration) - Method in class org.apache.cocoon.components.crawler.SimpleCocoonCrawlerImpl
- Configure the crawler component.
- crawl(URL) - Method in interface org.apache.cocoon.components.crawler.CocoonCrawler
- Start crawling the URL.
- crawl(URL) - Method in class org.apache.cocoon.components.crawler.SimpleCocoonCrawlerImpl
- Start crawling a URL.
- crawled - Variable in class org.apache.cocoon.components.crawler.SimpleCocoonCrawlerImpl
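
The crawl(URL) and iterator() entries above suggest a two-step calling pattern: seed the crawler with a start URL, then iterate over the URLs it reports. The sketch below illustrates that pattern only. LinkCrawler, FixedListCrawler and CrawlUsage are hypothetical stand-ins written for this example, not Cocoon classes, and the toy implementation performs no network access; the behaviour of the real component should be checked against the CocoonCrawler class documentation.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in mirroring the two methods the index lists
// on CocoonCrawler; it is not the Cocoon interface itself.
interface LinkCrawler {
    void crawl(URL start);
    Iterator<URL> iterator();
}

// Toy implementation: "crawling" records the start URL plus a fixed
// set of relative paths, so the example runs without a network.
class FixedListCrawler implements LinkCrawler {
    private final List<URL> found = new ArrayList<>();
    private final String[] childPaths;

    FixedListCrawler(String... childPaths) {
        this.childPaths = childPaths;
    }

    @Override
    public void crawl(URL start) {
        found.clear();
        found.add(start);
        for (String path : childPaths) {
            try {
                // Resolve each relative path against the start URL.
                found.add(start.toURI().resolve(path).toURL());
            } catch (URISyntaxException | MalformedURLException e) {
                throw new IllegalArgumentException(e);
            }
        }
    }

    @Override
    public Iterator<URL> iterator() {
        return found.iterator();
    }
}

class CrawlUsage {
    // Seed the crawler, then drain its iterator into a list of strings.
    static List<String> collect(LinkCrawler crawler, String startUrl) {
        URL start;
        try {
            start = URI.create(startUrl).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
        crawler.crawl(start);
        List<String> urls = new ArrayList<>();
        for (Iterator<URL> it = crawler.iterator(); it.hasNext();) {
            urls.add(it.next().toString());
        }
        return urls;
    }

    public static void main(String[] args) {
        LinkCrawler crawler = new FixedListCrawler("a.html", "b.html");
        for (String u : collect(crawler, "http://localhost:8080/cocoon/")) {
            System.out.println(u);
        }
    }
}
```

The caller depends only on the two-method interface, so a different crawler implementation could be swapped in without changing the loop.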
D
- dispose() - Method in class org.apache.cocoon.components.crawler.SimpleCocoonCrawlerImpl
- Dispose at end of life cycle, releasing all resources.
E
- EXCLUDE_CONFIG - Static variable in class org.apache.cocoon.components.crawler.SimpleCocoonCrawlerImpl
- Config element name specifying excluding regular expression pattern.
- excludeCrawlingURL - Variable in class org.apache.cocoon.components.crawler.SimpleCocoonCrawlerImpl
G
- getLinks(URL) - Method in class org.apache.cocoon.components.crawler.SimpleCocoonCrawlerImpl
- Compute list of links from the URL.
H
- hasNext() - Method in class org.apache.cocoon.components.crawler.SimpleCocoonCrawlerImpl.CocoonCrawlerIterator
- Check if crawling is finished.
I
- INCLUDE_CONFIG - Static variable in class org.apache.cocoon.components.crawler.SimpleCocoonCrawlerImpl
- Config element name specifying including regular expression pattern.
- includeCrawlingURL - Variable in class org.apache.cocoon.components.crawler.SimpleCocoonCrawlerImpl
- isExcludedURL(String) - Method in class org.apache.cocoon.components.crawler.SimpleCocoonCrawlerImpl
- Check if URL is excluded from crawling.
- isIncludedURL(String) - Method in class org.apache.cocoon.components.crawler.SimpleCocoonCrawlerImpl
- Check if URL is a candidate for indexing.
- iterator() - Method in interface org.apache.cocoon.components.crawler.CocoonCrawler
- Iterate over crawling URLs.
- iterator() - Method in class org.apache.cocoon.components.crawler.SimpleCocoonCrawlerImpl
- Return iterator, iterating over all links of the currently crawled URL.
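
The INCLUDE_CONFIG, EXCLUDE_CONFIG, isIncludedURL and isExcludedURL entries describe filtering URLs by regular expression. A minimal sketch of one way such a filter can work follows. UrlFilter is a hypothetical helper, not a Cocoon class; it uses java.util.regex rather than whatever regex library the crawler itself uses, and the rules for empty pattern lists (include everything, exclude nothing) are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Cocoon: decides whether a URL
// should be crawled, given include and exclude regular expressions.
class UrlFilter {
    private final List<Pattern> include = new ArrayList<>();
    private final List<Pattern> exclude = new ArrayList<>();

    UrlFilter include(String regex) {
        include.add(Pattern.compile(regex));
        return this;
    }

    UrlFilter exclude(String regex) {
        exclude.add(Pattern.compile(regex));
        return this;
    }

    // Assumption: with no include patterns, every URL is included.
    boolean isIncluded(String url) {
        return include.isEmpty() || matchesAny(include, url);
    }

    // Assumption: with no exclude patterns, no URL is excluded.
    boolean isExcluded(String url) {
        return matchesAny(exclude, url);
    }

    // A URL is a candidate only if it is included and not excluded.
    boolean isCandidate(String url) {
        return isIncluded(url) && !isExcluded(url);
    }

    private static boolean matchesAny(List<Pattern> patterns, String url) {
        for (Pattern p : patterns) {
            // find() matches anywhere in the URL; matches() would
            // require the pattern to cover the whole string.
            if (p.matcher(url).find()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        UrlFilter f = new UrlFilter()
                .include("/docs/")
                .exclude("\\.(gif|png|css)$");
        System.out.println(f.isCandidate("http://example.org/docs/index.html"));
        System.out.println(f.isCandidate("http://example.org/docs/logo.png"));
        System.out.println(f.isCandidate("http://example.org/other/page.html"));
    }
}
```

Checking exclusion after inclusion means an exclude pattern always wins when both match, which is the conservative choice for a crawler.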
L
- LINK_CONTENT_TYPE_CONFIG - Static variable in class org.apache.cocoon.components.crawler.SimpleCocoonCrawlerImpl
- Config element name specifying expected link content-type.
- LINK_CONTENT_TYPE_DEFAULT - Variable in class org.apache.cocoon.components.crawler.SimpleCocoonCrawlerImpl
- Default value of link-content-type configuration value.
- LINK_VIEW_QUERY_CONFIG - Static variable in class org.apache.cocoon.components.crawler.SimpleCocoonCrawlerImpl
- Config element name specifying query-string appended for requesting links of a URL.
- LINK_VIEW_QUERY_DEFAULT - Static variable in class org.apache.cocoon.components.crawler.SimpleCocoonCrawlerImpl
- Default value of link-view-query configuration value.
- linkContentType - Variable in class org.apache.cocoon.components.crawler.SimpleCocoonCrawlerImpl
- linkViewQuery - Variable in class org.apache.cocoon.components.crawler.SimpleCocoonCrawlerImpl
N
- next() - Method in class org.apache.cocoon.components.crawler.SimpleCocoonCrawlerImpl.CocoonCrawlerIterator
- Return the next URL.
O
- org.apache.cocoon.components.crawler - package org.apache.cocoon.components.crawler
R
- ROLE - Static variable in interface org.apache.cocoon.components.crawler.CocoonCrawler
- Role name of this Avalon component.
- recycle() - Method in class org.apache.cocoon.components.crawler.SimpleCocoonCrawlerImpl
- Recycle this object, releasing resources.
- remove() - Method in class org.apache.cocoon.components.crawler.SimpleCocoonCrawlerImpl.CocoonCrawlerIterator
- Remove is not implemented.
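
The hasNext(), next() and remove() entries for CocoonCrawlerIterator, together with the urlsToProcess and crawled variables, point to a queue-driven iterator that does not support removal. The sketch below shows one conventional way to build such an iterator. LinkGraphIterator is a hypothetical class written for this example, the in-memory link map stands in for fetching links over HTTP, and the roles given to urlsToProcess and crawled are assumptions based only on their names.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.Set;

// Hypothetical sketch, not the Cocoon class: walks a link graph held
// in memory, where a real crawler would fetch links over HTTP.
class LinkGraphIterator implements Iterator<String> {
    private final Map<String, List<String>> links;   // url -> outgoing links
    private final Deque<String> urlsToProcess = new ArrayDeque<>();
    private final Set<String> crawled = new HashSet<>();

    LinkGraphIterator(Map<String, List<String>> links, String start) {
        this.links = links;
        urlsToProcess.add(start);
        crawled.add(start);
    }

    // Crawling is finished once no URLs are left in the queue.
    @Override
    public boolean hasNext() {
        return !urlsToProcess.isEmpty();
    }

    // Return the next URL and queue any of its links not seen before.
    @Override
    public String next() {
        if (urlsToProcess.isEmpty()) {
            throw new NoSuchElementException();
        }
        String url = urlsToProcess.removeFirst();
        for (String link : links.getOrDefault(url, Collections.emptyList())) {
            if (crawled.add(link)) {       // add() is false for known URLs
                urlsToProcess.addLast(link);
            }
        }
        return url;
    }

    // Removal makes no sense for a crawl, so it is rejected.
    @Override
    public void remove() {
        throw new UnsupportedOperationException("remove is not implemented");
    }
}
```

Recording each URL in the seen set at the moment it is queued keeps cyclic links from being visited twice.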
S
- SimpleCocoonCrawlerImpl - class org.apache.cocoon.components.crawler.SimpleCocoonCrawlerImpl.
- A simple cocoon crawler.
- SimpleCocoonCrawlerImpl() - Constructor for class org.apache.cocoon.components.crawler.SimpleCocoonCrawlerImpl
- Constructor for the SimpleCocoonCrawlerImpl object
- SimpleCocoonCrawlerImpl.CocoonCrawlerIterator - class org.apache.cocoon.components.crawler.SimpleCocoonCrawlerImpl.CocoonCrawlerIterator.
- Helper class implementing an Iterator
- SimpleCocoonCrawlerImpl.CocoonCrawlerIterator(SimpleCocoonCrawlerImpl) - Constructor for class org.apache.cocoon.components.crawler.SimpleCocoonCrawlerImpl.CocoonCrawlerIterator
- Constructor for the CocoonCrawlerIterator object
- setDefaultExcludeFromCrawling() - Method in class org.apache.cocoon.components.crawler.SimpleCocoonCrawlerImpl
- Default exclude patterns.
U
- USER_AGENT_CONFIG - Static variable in class org.apache.cocoon.components.crawler.SimpleCocoonCrawlerImpl
- Config element name specifying HTTP header value for User-Agent.
- USER_AGENT_DEFAULT - Static variable in class org.apache.cocoon.components.crawler.SimpleCocoonCrawlerImpl
- Default value of user-agent configuration value.
- urlsToProcess - Variable in class org.apache.cocoon.components.crawler.SimpleCocoonCrawlerImpl
- userAgent - Variable in class org.apache.cocoon.components.crawler.SimpleCocoonCrawlerImpl
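
Each *_CONFIG constant in this index is paired with a *_DEFAULT constant, which suggests that configure(Configuration) reads an element and falls back to the default when it is absent. A minimal sketch of that lookup follows. CrawlerSettings is a hypothetical class that reads from a plain Map instead of an Avalon Configuration; the element names are taken from the index entries, while the default values are placeholders invented for this example, since the real ones are not given here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not Cocoon code: resolves crawler settings from
// a key/value map, falling back to a default when a key is absent.
class CrawlerSettings {
    // Element names as they appear in the index entries.
    static final String USER_AGENT_CONFIG = "user-agent";
    static final String ACCEPT_CONFIG = "accept";
    static final String LINK_VIEW_QUERY_CONFIG = "link-view-query";
    static final String LINK_CONTENT_TYPE_CONFIG = "link-content-type";

    // Placeholder defaults for this example only; the real
    // *_DEFAULT values are not stated in the index.
    private static final Map<String, String> DEFAULTS = new HashMap<>();
    static {
        DEFAULTS.put(USER_AGENT_CONFIG, "example-crawler/0.1");
        DEFAULTS.put(ACCEPT_CONFIG, "*/*");
        DEFAULTS.put(LINK_VIEW_QUERY_CONFIG, "view=links");
        DEFAULTS.put(LINK_CONTENT_TYPE_CONFIG, "text/plain");
    }

    private final Map<String, String> configured;

    CrawlerSettings(Map<String, String> configured) {
        // Copy so later changes to the caller's map have no effect.
        this.configured = new HashMap<>(configured);
    }

    // Configured value if present, otherwise the default for that name.
    String get(String name) {
        String value = configured.get(name);
        return value != null ? value : DEFAULTS.get(name);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(USER_AGENT_CONFIG, "my-bot/1.0");
        CrawlerSettings settings = new CrawlerSettings(conf);
        System.out.println(settings.get(USER_AGENT_CONFIG));
        System.out.println(settings.get(ACCEPT_CONFIG));
    }
}
```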