G
get()
- Method in class spider.crawl.
ActiveThreads
get the latest active threads count
get(String)
- Method in class spider.util.
RobotExclusion
get an entry for a server (a vector of disallowed paths or parts of paths)
getCanonical(String)
- Static method in class spider.util.
Helper
returns the canonical URL
getContents()
- Method in class spider.util.
XMLParser
get the contents of the document that is being parsed
getCosine(double[], double[])
- Static method in class spider.util.
Helper
cosine of the angle between two vectors
getDocument()
- Method in class spider.util.
XMLParser
returns the starting node of the DOM tree.
getDomainName(String)
- Static method in class spider.util.
Helper
get the second-level domain name from a given URL
getElement()
- Method in class spider.crawl.
Frontier
get the top frontier element (according to priority) and delete it
getFileName(String)
- Method in class spider.crawl.
History
returns filename that stores the history file
getFileScore(String)
- Method in class spider.crawl.
History
returns score for a given url
getHashValue(String)
- Static method in class spider.util.
Hashing
getHistory()
- Method in class spider.crawl.
History
Returns the ht.
getHistoryElement(String)
- Method in class spider.crawl.
History
return HistoryElement for a given url
getHostName(String)
- Static method in class spider.util.
Helper
get the host name from the given URL
getHostNameWithPort(String)
- Static method in class spider.util.
Helper
get host name with port from a given URL
getLinkContext(int)
- Method in class spider.util.
XMLParser
provides links with context; the depth of the aggregation node is based on rel_depth
getLinkContext(String)
- Method in class spider.util.
XMLParser
provides a given link's context at different levels in the DOM tree
getLinkContextAdaptive(int)
- Method in class spider.util.
XMLParser
climbs up the tree until it finds an appropriately sized (w words) context
getLinkContextWords(String, int)
- Static method in class spider.util.
Helper
provides links with context; noWords is the number of words around a link text used for context
getLinks()
- Method in class spider.util.
XMLParser
get links from given XML (html).
getLocation(String)
- Static method in class spider.util.
Redirections
get the redirected location
getMaxFrontier()
- Method in class spider.crawl.
BasicCrawler
Returns the maxFrontier.
getMaxPages()
- Method in class spider.crawl.
BasicCrawler
Returns the maxPages.
getMaxSize()
- Method in class spider.util.
RobotExclusion
Returns the maxSize.
getMaxSize()
- Method in class spider.crawl.
Frontier
get max size
getMaxThreads()
- Method in class spider.crawl.
BasicCrawler
Returns the maxThreads.
getPath(String)
- Method in class spider.crawl.
Cache
Returns the path.
getQuery()
- Method in class spider.crawl.
DOMCrawler
Returns the query.
getQuery()
- Method in class spider.crawl.
BestFirst
Returns the query.
getResultBuffer()
- Method in class spider.util.
Stemmer
Returns a reference to a character buffer containing the results of the stemming process.
getResultLength()
- Method in class spider.util.
Stemmer
Returns the length of the word resulting from the stemming process.
getSim(String, String)
- Static method in class spider.util.
Helper
cosine similarity between two strings (without idf)
getSim(String, String, Hashtable, int)
- Static method in class spider.util.
Helper
cosine similarity between two strings (SMART atc weighting, idf included)
getSimInQuerySpace(String, String)
- Static method in class spider.util.
Helper
cosine similarity computed by projecting the text onto the query space
getStorageFile()
- Method in class spider.crawl.
BasicCrawler
Returns the storageFile.
getText()
- Method in class spider.util.
XMLParser
get text from the given XML (html)
getTopN()
- Method in class spider.crawl.
BasicCrawler
Returns the topN.
getURLPath(String)
- Static method in class spider.util.
Helper
get the path from the given URL
getURLScores(Hashtable, double, String)
- Method in class spider.crawl.
HubSeeker
getURLScores(Hashtable, double, String)
- Method in class spider.crawl.
DOMCrawler
getVector(String)
- Static method in class spider.util.
RobotExclusion
a static function that returns the robot exclusion disallow vector for generic user agents, given the content of a robots.txt file
Globals
- class spider.crawl.
Globals
.
Global parameters that are used by the crawlers
Globals()
- Constructor for class spider.crawl.
Globals
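The getCosine(double[], double[]) and getSim(String, String) entries above describe cosine-based measures but not the computation itself. The following is a minimal sketch of what such methods might compute, not the actual spider.util.Helper implementation. The class name CosineSketch, the method names, the tokenization rule, and the zero-vector convention are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration; not part of the spider.util package.
public class CosineSketch {

    // Cosine of the angle between two equal-length vectors.
    // Assumption: returns 0.0 if either vector has zero magnitude.
    public static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Raw term counts for a string. Assumption: tokens are lowercase runs
    // of ASCII letters and digits; no stemming or stop-word removal.
    public static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> tf = new HashMap<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                tf.merge(token, 1, Integer::sum);
            }
        }
        return tf;
    }

    // Cosine similarity between two strings using raw term frequencies
    // only (no idf weighting).
    public static double similarity(String s1, String s2) {
        Map<String, Integer> tf1 = termFrequencies(s1);
        Map<String, Integer> tf2 = termFrequencies(s2);
        double dot = 0.0, norm1 = 0.0, norm2 = 0.0;
        for (Map.Entry<String, Integer> e : tf1.entrySet()) {
            int count = e.getValue();
            norm1 += (double) count * count;
            Integer other = tf2.get(e.getKey());
            if (other != null) {
                dot += (double) count * other;
            }
        }
        for (int count : tf2.values()) {
            norm2 += (double) count * count;
        }
        if (norm1 == 0.0 || norm2 == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(norm1) * Math.sqrt(norm2));
    }

    public static void main(String[] args) {
        // Orthogonal vectors have cosine 0.0.
        System.out.println(cosine(new double[] {1, 0}, new double[] {0, 1}));
        // Strings sharing two of three terms.
        System.out.println(similarity("focused web crawler", "web crawler"));
    }
}
```

The idf-weighted variant listed above would scale each term count by a weight derived from document frequencies before taking the dot product and norms; that weighting is left out here because the index does not specify it.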