Spreadsheet Toolkit

corpus.analyser
Class Analyser

java.lang.Object
  |
  +--corpus.analyser.Analyser
Direct Known Subclasses:
ClusterAnalyser, CompressionAnalyser, FormulaAnalyser, FormulaComponentAnalyser, HeatmapAnalyser, MetricsAnalyser, OccupancyAnalyser, PowerLawAnalyser, TreeDepthAnalyser

public abstract class Analyser
extends java.lang.Object

This class is responsible for reading in a set of spreadsheets, using Extractor to get the data out. Where possible, the data is read from a serialized version of the WorkBook. The abstract method process(WorkSheet ws, String[] sheetnames, int sheetnum) is then called to do the rest of the work.

See Also:
process(WorkSheet,String[],int)
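
The division of labour described above (the base class reads the sheets, the subclass computes on each one) can be illustrated with a minimal sketch. SheetStub, AnalyserSketch, NonEmptyCountAnalyser and run() are hypothetical stand-ins invented for this example; the toolkit's real WorkSheet and Analyser classes are not reproduced here and may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the toolkit's WorkSheet; the real class is not shown here.
class SheetStub {
    final String[][] cells;
    SheetStub(String[][] cells) { this.cells = cells; }
}

// Simplified mirror of Analyser's contract: the base class drives the loop,
// subclasses supply process(), display() and saveResults().
abstract class AnalyserSketch {
    protected int worksheets = 0; // count of worksheets processed so far

    // Assumed, simplified counterpart of loadSheets(): hands every sheet to process().
    void run(List<SheetStub> sheets, String[] sheetnames) {
        for (int i = 0; i < sheets.size(); i++) {
            process(sheets.get(i), sheetnames, i);
            worksheets++;
        }
    }

    protected abstract void process(SheetStub ws, String[] sheetnames, int sheetnum);
    public abstract void display();
    public abstract void saveResults(String path);
}

// Example subclass: counts the non-empty cells in each worksheet.
class NonEmptyCountAnalyser extends AnalyserSketch {
    final List<String> report = new ArrayList<>();

    @Override
    protected void process(SheetStub ws, String[] sheetnames, int sheetnum) {
        int count = 0;
        for (String[] row : ws.cells) {
            for (String cell : row) {
                if (cell != null && !cell.isEmpty()) count++;
            }
        }
        report.add(sheetnames[sheetnum] + ": " + count);
    }

    @Override
    public void display() {
        for (String line : report) System.out.println(line);
    }

    @Override
    public void saveResults(String path) {
        // Left empty in this sketch; a real subclass would write the report to path.
    }
}
```

A caller would construct the subclass, pass it the sheets, and then call display() or saveResults(path) once all sheets have been processed.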

Field Summary
protected  java.util.Vector averageMathVector
          This stores the average formula cell referencing vector for each worksheet processed.
static int CLUSTER
          Similar to occupancy, except performs clustering using ClusterGraph.
static int COMPRESSION
          Reports on zip compression levels for saved workbooks.
protected  int diagram
          The style of analysis being performed.
static int FORMULA_COM
          Draws the formula complexity.
static int FORMULACOMPONENT
          Similar to metrics, except counts number of each type of formula component used.
static int HEATMAP
          Similar to occupancy, except data is displayed using a heatmap.
static int METRICS
          Dump an assortment of Metrics for each workbook into a file
static int OCCUPANCY
          Counts how many times each cell is occupied.
protected  java.lang.String path
          Where to look for Excel files by default.
static int POWERLAW
          Perform powerlaw analysis for cell references.
static int TREEDEPTH
          Draws the depth of the complexity
protected  Grid values
          The values grid stores the occupancy level for each cell.
static int VECTOR
          Draws the average vector for each cell
static int VECTOR_MAG
          Draws the average magnitude vector for each worksheet
protected  Grid vectorGrid
          This Grid contains a vector for each cell.
protected  int workbooks
          A quick count of how many workbooks have been processed.
protected  int worksheets
          A quick count of how many worksheets have been processed.
static int WS_OCCUPANCY
          Same as OCCUPANCY except separated at worksheet level.
 
Constructor Summary
Analyser(java.lang.String path, int diagram)
           
 
Method Summary
abstract  void display()
          Takes all the results from computation and passes them off to Doodler.
static int distinctFamilyTrees(Cell[] cells)
          Count the number of distinct family trees that can be found for a given set of cells.
protected  WorkBook load(java.lang.String path, boolean alreadySaved)
          Loads a WorkBook, using Extractor if needed.
 void loadSheets()
          Given a path, this method will load all the Excel spreadsheets in the folder.
static void main(java.lang.String[] args)
          Application entry point.
protected static java.lang.String mode(int mode)
          Mode of operation
static java.lang.String mode(int mode, java.lang.String path, WorkBook wb, int sheet, java.lang.String options, boolean infoReq)
           
protected abstract  void process(WorkSheet ws, java.lang.String[] sheetnames, int sheetnum)
          Takes a worksheet from a workbook and computes something on it.
abstract  void saveResults(java.lang.String path)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

values

protected Grid values
The values grid stores the occupancy level for each cell.
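
As a rough illustration of what storing occupancy per cell can look like, the sketch below keeps one counter per grid position and increments it for every worksheet in which that position is non-empty. OccupancyGridSketch and its methods are hypothetical; the toolkit's Grid class is not shown here and may work differently.

```java
// Hypothetical stand-in for an occupancy grid; not the toolkit's Grid class.
class OccupancyGridSketch {
    private final int[][] counts;

    OccupancyGridSketch(int rows, int cols) {
        counts = new int[rows][cols];
    }

    // Adds one worksheet: every non-empty cell bumps the counter at its position.
    // Cells that fall outside the grid bounds are ignored.
    void addSheet(String[][] cells) {
        for (int r = 0; r < cells.length && r < counts.length; r++) {
            for (int c = 0; c < cells[r].length && c < counts[r].length; c++) {
                String v = cells[r][c];
                if (v != null && !v.isEmpty()) counts[r][c]++;
            }
        }
    }

    // Number of worksheets seen so far in which (row, col) was occupied.
    int get(int row, int col) {
        return counts[row][col];
    }
}
```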


vectorGrid

protected Grid vectorGrid
This Grid contains a vector for each cell. Each of these internal vectors contains all the references made to other cells, stored as a series of MathVectors.


averageMathVector

protected java.util.Vector averageMathVector
This stores the average formula cell referencing vector for each worksheet processed.


workbooks

protected int workbooks
A quick count of how many workbooks have been processed.


worksheets

protected int worksheets
A quick count of how many worksheets have been processed.


path

protected java.lang.String path
Where to look for Excel files by default.


diagram

protected int diagram
The style of analysis being performed.


OCCUPANCY

public static final int OCCUPANCY
Counts how many times each cell is occupied.

See Also:
Constant Field Values

WS_OCCUPANCY

public static final int WS_OCCUPANCY
Same as OCCUPANCY except separated at worksheet level.

See Also:
Constant Field Values

VECTOR

public static final int VECTOR
Draws the average vector for each cell

See Also:
Constant Field Values

VECTOR_MAG

public static final int VECTOR_MAG
Draws the average magnitude vector for each worksheet

See Also:
Constant Field Values

FORMULA_COM

public static final int FORMULA_COM
Draws the formula complexity.

See Also:
Constant Field Values

TREEDEPTH

public static final int TREEDEPTH
Draws the depth of the complexity

See Also:
Constant Field Values

METRICS

public static final int METRICS
Dump an assortment of Metrics for each workbook into a file

See Also:
Constant Field Values

CLUSTER

public static final int CLUSTER
Similar to occupancy, except performs clustering using ClusterGraph.

See Also:
Constant Field Values

HEATMAP

public static final int HEATMAP
Similar to occupancy, except data is displayed using a heatmap.

See Also:
Constant Field Values

FORMULACOMPONENT

public static final int FORMULACOMPONENT
Similar to metrics, except counts number of each type of formula component used.

See Also:
Constant Field Values

POWERLAW

public static final int POWERLAW
Perform powerlaw analysis for cell references.

See Also:
Constant Field Values

COMPRESSION

public static final int COMPRESSION
Reports on zip compression levels for saved workbooks.

See Also:
Constant Field Values
Constructor Detail

Analyser

public Analyser(java.lang.String path,
                int diagram)
Method Detail

mode

protected static final java.lang.String mode(int mode)
Mode of operation

Returns:
A string describing the mode of operation.

loadSheets

public void loadSheets()
Given a path, this method will load all the Excel spreadsheets in the folder. Each spreadsheet is processed using Extractor if it hasn't been processed before; otherwise the file is loaded straight from disk into a WorkBook.
The real work happens once the Excel file has been converted into the toolkit's internal representation. For each WorkSheet in each WorkBook, process(WorkSheet, String[], int) will be called on it.
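
The folder scan and the "processed before?" check might be sketched as follows. The class name, the ".xls" filter and the ".ser" suffix for the serialized copy are all assumptions made for this example; the page does not say how the toolkit names or locates its saved workbooks.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SheetFolderSketch {
    // Assumed naming convention for the cached copy: "book.xls" -> "book.xls.ser".
    static File cacheFileFor(File spreadsheet) {
        return new File(spreadsheet.getPath() + ".ser");
    }

    // Returns the .xls files in the folder, sorted; empty if the folder cannot be read.
    static List<File> listSpreadsheets(File folder) {
        File[] found = folder.listFiles((dir, name) -> name.toLowerCase().endsWith(".xls"));
        List<File> result = new ArrayList<>();
        if (found != null) {
            Arrays.sort(found);
            result.addAll(Arrays.asList(found));
        }
        return result;
    }

    // True when a serialized copy already exists, so extraction could be skipped.
    static boolean alreadySaved(File spreadsheet) {
        return cacheFileFor(spreadsheet).isFile();
    }
}
```

The result of alreadySaved(file) is the kind of flag that could be handed to load(path, alreadySaved) for each file returned by listSpreadsheets.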


load

protected WorkBook load(java.lang.String path,
                        boolean alreadySaved)
Loads a WorkBook, using Extractor if needed.

Parameters:
path - The path to the Spreadsheet file
alreadySaved - Is there already a serialized version on disk?
Returns:
The WorkBook that was read, or null if all attempts to read failed.
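
One way to get the behaviour described here (try the serialized copy first, fall back to extraction, return null if everything fails) is sketched below using standard Java serialization. BookStub, extract() and the ".ser" cache location are hypothetical placeholders; the real WorkBook and Extractor APIs are not shown on this page.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for the toolkit's WorkBook.
class BookStub implements Serializable {
    private static final long serialVersionUID = 1L;
    final String source;
    BookStub(String source) { this.source = source; }
}

class CachedLoaderSketch {
    // Placeholder for the Extractor step; the real extraction API is not shown here.
    static BookStub extract(File spreadsheet) {
        return spreadsheet.isFile() ? new BookStub(spreadsheet.getName()) : null;
    }

    static BookStub load(String path, boolean alreadySaved) {
        File cache = new File(path + ".ser"); // assumed cache location
        if (alreadySaved) {
            try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(cache))) {
                return (BookStub) in.readObject();
            } catch (IOException | ClassNotFoundException e) {
                // Fall through and try a fresh extraction instead.
            }
        }
        BookStub book = extract(new File(path));
        if (book == null) {
            return null; // all attempts to read failed
        }
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(cache))) {
            out.writeObject(book);
        } catch (IOException e) {
            // Caching is best effort; the extracted workbook is still usable.
        }
        return book;
    }
}
```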

process

protected abstract void process(WorkSheet ws,
                                java.lang.String[] sheetnames,
                                int sheetnum)
Takes a worksheet from a workbook and computes something on it. The something depends on the analysis/diagram mode.


display

public abstract void display()
Takes all the results from computation and passes them off to Doodler.


saveResults

public abstract void saveResults(java.lang.String path)

mode

public static java.lang.String mode(int mode,
                                    java.lang.String path,
                                    WorkBook wb,
                                    int sheet,
                                    java.lang.String options,
                                    boolean infoReq)

distinctFamilyTrees

public static int distinctFamilyTrees(Cell[] cells)
Count the number of distinct family trees that can be found for a given set of cells.

Parameters:
cells - the cells to start the family tree searches from.
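
The page does not define "family tree". Assuming it means a group of cells connected to one another through formula references, the count could be sketched with a union-find structure as below. The int-indexed cells, the references array and the class name are assumptions for illustration only, not the toolkit's Cell API or its actual algorithm.

```java
import java.util.HashSet;
import java.util.Set;

class FamilyTreeSketch {
    // Union-find lookup with path halving.
    private static int find(int[] parent, int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];
            x = parent[x];
        }
        return x;
    }

    // cellCount:  total number of cells, indexed 0..cellCount-1.
    // references: pairs {from, to} meaning cell 'from' refers to cell 'to'.
    // starts:     indices of the cells to start the searches from.
    static int distinctFamilyTrees(int cellCount, int[][] references, int[] starts) {
        int[] parent = new int[cellCount];
        for (int i = 0; i < cellCount; i++) parent[i] = i;

        // Merge the groups of every pair of cells linked by a reference.
        for (int[] ref : references) {
            int a = find(parent, ref[0]);
            int b = find(parent, ref[1]);
            if (a != b) parent[a] = b;
        }

        // Each distinct root among the starting cells is one family tree.
        Set<Integer> roots = new HashSet<>();
        for (int s : starts) roots.add(find(parent, s));
        return roots.size();
    }
}
```

Under this assumption, two starting cells that reach each other through any chain of references count as one tree, and an unreferenced starting cell counts as a tree of its own.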

main

public static void main(java.lang.String[] args)
Application entry point.

