A B C D E F G H I J L M N P R S T U X

G

get() - Method in class spider.crawl.ActiveThreads
get the latest active threads count
get(String) - Method in class spider.util.RobotExclusion
get an entry for a server (a vector of disallowed paths or parts of paths)
getCanonical(String) - Static method in class spider.util.Helper
returns the canonical URL
getContents() - Method in class spider.util.XMLParser
get the contents of the document that is being parsed
getCosine(double[], double[]) - Static method in class spider.util.Helper
cosine of the angle between two vectors
getDocument() - Method in class spider.util.XMLParser
returns the starting node of the DOM tree.
getDomainName(String) - Static method in class spider.util.Helper
get the second level domain name from a given url
getElement() - Method in class spider.crawl.Frontier
get the top frontier element (according to priority) and delete it
getFileName(String) - Method in class spider.crawl.History
returns filename that stores the history file
getFileScore(String) - Method in class spider.crawl.History
returns score for a given url
getHashValue(String) - Static method in class spider.util.Hashing
 
getHistory() - Method in class spider.crawl.History
Returns the ht.
getHistoryElement(String) - Method in class spider.crawl.History
return HistoryElement for a given url
getHostName(String) - Static method in class spider.util.Helper
get the host name from the given URL
getHostNameWithPort(String) - Static method in class spider.util.Helper
get host name with port from a given URL
getLinkContext(int) - Method in class spider.util.XMLParser
provides links with context; the depth of the aggregation node is based on rel_depth
getLinkContext(String) - Method in class spider.util.XMLParser
provides a given link's context at different levels in the DOM tree
getLinkContextAdaptive(int) - Method in class spider.util.XMLParser
climbs up the tree until it finds an appropriately sized context (w words)
getLinkContextWords(String, int) - Static method in class spider.util.Helper
provides links with context; noWords is the number of words around the link text used for context
getLinks() - Method in class spider.util.XMLParser
get links from given XML (html).
getLocation(String) - Static method in class spider.util.Redirections
get the redirected location
getMaxFrontier() - Method in class spider.crawl.BasicCrawler
Returns the maxFrontier.
getMaxPages() - Method in class spider.crawl.BasicCrawler
Returns the maxPages.
getMaxSize() - Method in class spider.util.RobotExclusion
Returns the maxSize.
getMaxSize() - Method in class spider.crawl.Frontier
get max size
getMaxThreads() - Method in class spider.crawl.BasicCrawler
Returns the maxThreads.
getPath(String) - Method in class spider.crawl.Cache
Returns the path.
getQuery() - Method in class spider.crawl.DOMCrawler
Returns the query.
getQuery() - Method in class spider.crawl.BestFirst
Returns the query.
getResultBuffer() - Method in class spider.util.Stemmer
Returns a reference to a character buffer containing the results of the stemming process.
getResultLength() - Method in class spider.util.Stemmer
Returns the length of the word resulting from the stemming process.
getSim(String, String) - Static method in class spider.util.Helper
cosine similarity between two strings (without idf)
getSim(String, String, Hashtable, int) - Static method in class spider.util.Helper
cosine similarity between two strings - SMART atc - idf included
getSimInQuerySpace(String, String) - Static method in class spider.util.Helper
cosine similarity computed by projecting the text onto the query space
getStorageFile() - Method in class spider.crawl.BasicCrawler
Returns the storageFile.
getText() - Method in class spider.util.XMLParser
get text from the given XML (html)
getTopN() - Method in class spider.crawl.BasicCrawler
Returns the topN.
getURLPath(String) - Static method in class spider.util.Helper
get the path from the given URL
getURLScores(Hashtable, double, String) - Method in class spider.crawl.HubSeeker
 
getURLScores(Hashtable, double, String) - Method in class spider.crawl.DOMCrawler
 
getVector(String) - Static method in class spider.util.RobotExclusion
a static function that gives back the robot exclusion disallow vector for generic user agents, given the content of a robots.txt file
Globals - class spider.crawl.Globals.
Global parameters that are used by the crawlers
Globals() - Constructor for class spider.crawl.Globals
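
Several entries above (getCosine and the getSim variants) refer to the cosine of the angle between two vectors. The following is a minimal sketch of that computation and is not the spider.util.Helper implementation. The class name CosineSketch, the method name cosine, and the choice to return 0.0 for a zero-length vector are assumptions made for illustration.

```java
public final class CosineSketch {
    private CosineSketch() {}

    /**
     * Cosine of the angle between vectors a and b:
     * dot(a, b) / (|a| * |b|).
     * Assumption: returns 0.0 when either vector has zero magnitude.
     */
    public static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];     // accumulate the dot product
            normA += a[i] * a[i];   // squared magnitude of a
            normB += b[i] * b[i];   // squared magnitude of b
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0; // hypothetical convention for the undefined case
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Orthogonal vectors, then parallel vectors
        System.out.println(cosine(new double[] {1, 0}, new double[] {0, 1}));
        System.out.println(cosine(new double[] {1, 2}, new double[] {2, 4}));
    }
}
```

For string similarity, as in the getSim entries, the two vectors would typically be term-weight vectors built from the two texts; how the indexed classes build those vectors is not shown in this index.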
 
