spider.util
Class RobotExclusion

java.lang.Object
  |
  +--spider.util.RobotExclusion

public class RobotExclusion
extends java.lang.Object

Stores the robot exclusion permissions for various Web sites, one entry per server.

Author:
pant

Constructor Summary
RobotExclusion()
           
 
Method Summary
 void add(java.lang.String server, java.util.Vector perm)
          Adds an entry to the robot exclusion hash (one entry per server).
 java.util.Vector get(java.lang.String server)
          Gets the entry for a server (a vector of disallowed paths or parts of paths).
 int getMaxSize()
          Returns the maxSize.
static java.util.Vector getVector(java.lang.String content)
          Returns the robot exclusion disallow vector for generic user agents, given the content of a robots.txt file.
static boolean isDisallowed(java.lang.String url, java.util.Vector perm)
          Returns true if the given URL is not okay to fetch under the given disallow vector.
 void setMaxSize(int maxSize)
          Sets the maxSize.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

RobotExclusion

public RobotExclusion()
Method Detail

add

public void add(java.lang.String server,
                java.util.Vector perm)
Adds an entry to the robot exclusion hash (one entry per server).


get

public java.util.Vector get(java.lang.String server)
Gets the entry for a server (a vector of disallowed paths or parts of paths).

Returns:
the Vector of disallowed paths stored for the server, or null if the server has no entry
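
The add/get contract described above can be pictured with a plain hash map. The following is only a minimal sketch: the class name PermissionStoreSketch, its internals, and the host names are assumptions made for illustration, not the actual spider.util.RobotExclusion implementation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Vector;

// Hypothetical stand-in for the per-server store; not the real RobotExclusion.
class PermissionStoreSketch {
    // One entry per server: server name -> disallowed paths or parts of paths.
    private final Map<String, Vector<String>> table = new HashMap<>();

    // Adds (or replaces) the entry for a server.
    void add(String server, Vector<String> perm) {
        table.put(server, perm);
    }

    // Returns the stored vector, or null when the server has no entry.
    Vector<String> get(String server) {
        return table.get(server);
    }

    public static void main(String[] args) {
        PermissionStoreSketch store = new PermissionStoreSketch();
        Vector<String> perm = new Vector<>();
        perm.add("/cgi-bin/");
        store.add("www.example.com", perm);
        System.out.println(store.get("www.example.com"));      // the stored vector
        System.out.println(store.get("unknown.example.org"));  // no entry stored
    }
}
```

A caller would typically treat a null result as "robots.txt not yet fetched for this server" and fetch it before crawling, although the source does not state how RobotExclusion itself is used.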

getVector

public static java.util.Vector getVector(java.lang.String content)
Returns the robot exclusion disallow vector for generic user agents, given the content of a robots.txt file.
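
To make the idea concrete, here is a rough sketch of how a disallow vector for the generic user agent ("User-agent: *") could be extracted from robots.txt content. The class RobotsParseSketch and the method disallowedForAnyAgent are hypothetical names; the parsing rules shown (comment stripping, grouping of consecutive User-agent lines, skipping empty Disallow values) are assumptions and may differ from what getVector actually does.

```java
import java.util.Vector;

// Hypothetical parser sketch; not the real RobotExclusion.getVector.
class RobotsParseSketch {
    // Collects Disallow values from records that apply to "User-agent: *".
    static Vector<String> disallowedForAnyAgent(String content) {
        Vector<String> paths = new Vector<>();
        boolean inRules = false;       // true once a rule line of the current record was seen
        boolean appliesToAll = false;  // true if the current record names the agent "*"
        for (String raw : content.split("\\r?\\n")) {
            // Strip trailing comments and surrounding whitespace.
            int hash = raw.indexOf('#');
            String line = (hash >= 0 ? raw.substring(0, hash) : raw).trim();
            int colon = line.indexOf(':');
            if (colon < 0) continue;   // blank or malformed line
            String field = line.substring(0, colon).trim();
            String value = line.substring(colon + 1).trim();
            if (field.equalsIgnoreCase("user-agent")) {
                if (inRules) {         // a User-agent line after rules starts a new record
                    inRules = false;
                    appliesToAll = false;
                }
                if (value.equals("*")) appliesToAll = true;
            } else {
                inRules = true;
                // An empty Disallow value excludes nothing, so it is skipped.
                if (field.equalsIgnoreCase("disallow") && appliesToAll && !value.isEmpty()) {
                    paths.add(value);
                }
            }
        }
        return paths;
    }

    public static void main(String[] args) {
        String robots = "User-agent: *\nDisallow: /cgi-bin/\nDisallow: /tmp/\n";
        System.out.println(disallowedForAnyAgent(robots));
    }
}
```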


isDisallowed

public static boolean isDisallowed(java.lang.String url,
                                   java.util.Vector perm)
Returns:
true if the given URL is not okay to fetch, false otherwise
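
Since a permission vector holds "disallowed paths or parts of paths", one plausible reading is a prefix match of the URL's path against each entry. The sketch below illustrates that reading only. DisallowCheckSketch is a hypothetical class, and both the prefix-matching rule and the treatment of a null vector as "nothing excluded" are assumptions rather than documented behavior.

```java
import java.net.URI;
import java.util.Vector;

// Hypothetical matcher sketch; not the real RobotExclusion.isDisallowed.
class DisallowCheckSketch {
    static boolean isDisallowed(String url, Vector<String> perm) {
        if (perm == null) return false;  // assumption: no entry means nothing is excluded
        // Only the path component is compared; the query string is ignored.
        String path = URI.create(url).getPath();
        if (path == null || path.isEmpty()) path = "/";
        for (String prefix : perm) {
            if (path.startsWith(prefix)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        Vector<String> perm = new Vector<>();
        perm.add("/cgi-bin/");
        System.out.println(isDisallowed("http://www.example.com/cgi-bin/search?q=x", perm));
        System.out.println(isDisallowed("http://www.example.com/index.html", perm));
    }
}
```

Note that URI.create throws IllegalArgumentException for malformed URLs, so a crawler using this approach would need to decide how to handle such input.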

getMaxSize

public int getMaxSize()
Returns the maxSize.

Returns:
the current maxSize value

setMaxSize

public void setMaxSize(int maxSize)
Sets the maxSize.

Parameters:
maxSize - The maxSize to set