AdaptiveRevisitHostQueue
class.
AdaptiveRevisitHostQueues used by a
Frontier.
Double at the specified index to this list.
double at the specified index to this list.
Double at the end of this list.
Float at the specified index to this list.
float at the specified index to this list.
Float at the end of this list.
Integer at the specified index to this list.
int at the specified index to this list.
Integer at the end of this list.
Long at the specified index to this list.
long at the specified index to this list.
Long at the end of this list.
String at the specified index to this list.
String at the end of this list.
Configurable
attributes.
curi with response status and
content type.
int to the buffer.
long to the buffer.
curi.
curi.
RecyclingFastBufferedOutputStream.pos.
content digest
with the one from a previous crawl.
CrawlController when checkpointing.
UURI.
CandidateURI
type
giving the credential the passed name.
ListIterator over the criteria set for this
refinement.
CrawlUriSWFAction
action.
CustomSWFTags(SWFActions) - Constructor for class org.archive.crawler.extractor.CustomSWFTags
DecideRule.ACCEPT,
DecideRule.REJECT, or
DecideRule.PASS.
DecideRules have been set up inside
it.
object.
matcher.match(object)
returns true will be deleted from the queue.
BdbFrontier (i.e., a basic mostly breadth-first
frontier), but with the addition that you can set the number of documents
to download on a per site basis.
ExternalImplDecideRule.ExternalImplDecideRule.ValueErrorHandler.
CrawlURI from the passed CandidateURI.
attributeName on Configurable
component.
attributeName on Configurable
component.
attributeName on Configurable
component.
offset.
offset.
CandidateURI.getString(String).
System.currentTimeMillis() at that time).
System.currentTimeMillis() at that time).
System.currentTimeMillis() when the crawl started).
-Dheritrix.home if available to us.
host.
CrawlHost associated with name.
CrawlHost associated with curi.
URIFrontierMarker initialized with the given
regular expression at the 'start' of the Frontier.
host if it is in
IPV4 quads format (e.g.
File object pointing to the order file.
ComplexType owning the checked attribute.
parameters associated
with this connection manager.
curi.
classType or a
subclass of it.
LongWrapper.
CrawlServer associated with name.
CrawlServer associated with curi.
CrawlerSettings for the checked attribute.
CrawlerSettings object this refinement refers to.
key
in settings.
key
in settings.
getState() except this method returns a
human readable name for the state instead of its constant integer value.
numberOfMatches is reached.
CandidateURI.toString().
ConfigurableX509TrustManager.
HttpRecorderGetMethod and HttpRecorderPostMethod.
URIFrontierMarker that has become invalid.
HttpConnectionParams.isStaleCheckingEnabled(),
HttpConnectionManager.getParams().
CandidateURI with the Frontier.
Level.WARNING).
Level.WARNING) and default error message.
Level.WARNING).
Level.WARNING) and default error message.
Long.
CrawlURI as
requiring a prerequisite.
Queue.ARCReader.RecoverableIOException.
int to a String, and pad it to
pad spaces.
String to pad characters wide
by pre-pending spaces.
String to pad characters wide
by pre-pending padChar.
Configuration Pointers.
int, right-aligned to the given column.
long, right-aligned to the given column.
RecyclingFastBufferedOutputStream.DEFAULT_BUFFER_SIZE bytes.
ListIterator over the refinements for this
settings object.
ValueErrorHandler.
Level.WARNING).
name.
name.
ReplayCharSequence.close() method.
InputStream to make a primitive Repositionable
stream.
CandidateURI with the Frontier.
the primary DB, URIs indexed
by the time when they can next be processed again.
SeedCachingScope.SeedFileIterator.
HttpConnectionParams.setStaleCheckingEnabled(boolean),
HttpConnectionManager.getParams().
parameters for this
connection manager.
typeName.
type.
type.
ValueErrorHandler.
SettingsHandler, only
constraints with level Level.SEVERE will throw an
InvalidAttributeValueException.
true if the WorkQueue implementation of this
Frontier stores its workload on disk instead of relying
on serialization mechanisms.