Spreadsheet Toolkit

corpus.gobbler
Class Gobbler

java.lang.Object
  |
  +--corpus.gobbler.Gobbler

public class Gobbler
extends java.lang.Object

Gobbler.java 2.1 28/06/2002 Rapid automated corpus identification via keywords and search engines.


Field Summary
static int GOOGLEAPI
          Perform search using the Google API.
static int HTTPGET
          Perform search using an HTTP GET.
 java.lang.String outputpath
          Path where the output URLs are stored.
 
Constructor Summary
Gobbler(boolean useUI)
          Constructor for the Gobbler object.
 
Method Summary
 void html(java.lang.String data)
          Display HTML code.
static void main(java.lang.String[] args)
          Starts the Gobbler.
 void newEstimate(int ne)
          Update the user interface with the new estimate of the number of results available.
 void performSearch(java.lang.String searchterm, int numperpage, int targetnum, java.lang.String filetype)
          Starts interaction with Google.
 void setSearchMethod(int i)
          Sets the search method: GOOGLEAPI or HTTPGET.
 void status(java.lang.String s)
          Display useful information.
 void status(java.lang.String s, boolean error)
          Display useful information; the error flag marks the message as an error.
 void store(java.lang.String outputfile, java.lang.String[] urls)
          Given an array of URLs, stores them in a file.
 void urlCountTick()
          Update the GUI counter.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

outputpath

public java.lang.String outputpath
Path where the output URLs are stored. The path must match the one used by Fetcher.


GOOGLEAPI

public static final int GOOGLEAPI
Perform search using the Google API.

See Also:
Constant Field Values

HTTPGET

public static final int HTTPGET
Perform search using an HTTP GET.

See Also:
Constant Field Values

Constructor Detail

Gobbler

public Gobbler(boolean useUI)
Constructor for the Gobbler object.

Method Detail

performSearch

public void performSearch(java.lang.String searchterm,
                          int numperpage,
                          int targetnum,
                          java.lang.String filetype)
Starts interaction with Google. Does the following:
  1. Sends a request to Google for a search page.
  2. Checks whether the target number of results is greater than the number actually available.
  3. Extracts URLs from the search page.
  4. Continues fetching search pages and extracting URLs until the target is met.
  5. Stores the URLs for later retrieval.

Parameters:
searchterm - The string to search for.
numperpage - The number of results per Google results page.
targetnum - The total number of results wanted. (Will be reduced if it exceeds the total number of results available.)
filetype - The file extension to limit results to; xls is suggested.
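The five steps above amount to a paging loop with a target that can shrink. The following is a minimal sketch of that loop only, not the Gobbler source: the Page class, the fetchPage callback, and the fakeSource helper are hypothetical stand-ins introduced so the example can run without any network access.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntFunction;

/** Hypothetical sketch of the paging loop described above; not the Gobbler implementation. */
class PagingSketch {

    /** Hypothetical stand-in for one fetched page of search results. */
    static final class Page {
        final int estimatedTotal;  // the search engine's estimate of available results
        final List<String> urls;   // URLs extracted from this page

        Page(int estimatedTotal, List<String> urls) {
            this.estimatedTotal = estimatedTotal;
            this.urls = urls;
        }
    }

    /**
     * Collects up to targetnum URLs, requesting numperpage results at a time.
     * fetchPage maps a zero-based start offset to a page (an assumed interface).
     */
    static List<String> collect(IntFunction<Page> fetchPage, int numperpage, int targetnum) {
        List<String> collected = new ArrayList<>();
        int target = targetnum;
        int start = 0;
        while (collected.size() < target) {
            Page page = fetchPage.apply(start);               // step 1: request a page
            target = Math.min(target, page.estimatedTotal);   // step 2: shrink the target if needed
            if (page.urls.isEmpty()) {
                break;                                        // no more results; avoid looping forever
            }
            for (String url : page.urls) {                    // step 3: take the extracted URLs
                if (collected.size() >= target) {
                    break;
                }
                collected.add(url);
            }
            start += numperpage;                              // step 4: move on to the next page
        }
        return collected;                                     // step 5 (storing) is left to the caller
    }

    /** Fake page source that pretends exactly `total` results exist. */
    static IntFunction<Page> fakeSource(int total, int numperpage) {
        return start -> {
            List<String> urls = new ArrayList<>();
            for (int i = start; i < Math.min(start + numperpage, total); i++) {
                urls.add("http://example.com/sheet" + i + ".xls");
            }
            return new Page(total, urls);
        };
    }

    public static void main(String[] args) {
        // Ask for 100 results when only 25 exist: the target is reduced to 25.
        List<String> urls = collect(fakeSource(25, 10), 10, 100);
        System.out.println(urls.size());
    }
}
```

With the fake source above, a request for 100 results at 10 per page stops after three pages, once the reduced target of 25 is reached.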

store

public void store(java.lang.String outputfile,
                  java.lang.String[] urls)
Given an array of URLs, stores them in a file.

Parameters:
outputfile - The name of the file.
urls - The URLs read from Google.
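The on-disk format used by store is not documented here. As an illustration only, the sketch below assumes one URL per line in UTF-8 and assumes the file is created inside the outputpath directory; the StoreSketch class and its load helper are hypothetical and are not part of Gobbler.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of storing URLs; the real file format is an assumption. */
class StoreSketch {

    /** Writes one URL per line to outputpath/outputfile and returns the file's path. */
    static Path store(String outputpath, String outputfile, String[] urls) {
        try {
            Path dir = Paths.get(outputpath);
            Files.createDirectories(dir);            // make sure the output directory exists
            Path file = dir.resolve(outputfile);
            Files.write(file, Arrays.asList(urls), StandardCharsets.UTF_8);
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the URLs back, as a later retrieval step might. */
    static List<String> load(Path file) {
        try {
            return Files.readAllLines(file, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String dir = System.getProperty("java.io.tmpdir");
        Path file = store(dir, "gobbler-urls.txt",
                new String[] {"http://example.com/a.xls", "http://example.com/b.xls"});
        System.out.println(load(file));
    }
}
```

A line-per-URL text file keeps the output easy to inspect by hand and easy for a separate program to read, which fits the note that the output path must agree with Fetcher.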

status

public void status(java.lang.String s)
Display useful information.


status

public void status(java.lang.String s,
                   boolean error)
Display useful information; the error flag marks the message as an error.

newEstimate

public void newEstimate(int ne)
Update the user interface with the new estimate of the number of results available.


html

public void html(java.lang.String data)
Display HTML code.


urlCountTick

public void urlCountTick()
Update the GUI counter.


setSearchMethod

public void setSearchMethod(int i)
Sets the search method: GOOGLEAPI or HTTPGET.


main

public static void main(java.lang.String[] args)
Starts the Gobbler.

