edu.cmu.minorthird.text
Class FancyLoader

java.lang.Object
  extended by edu.cmu.minorthird.text.FancyLoader

public class FancyLoader
extends java.lang.Object

Configurable method of loading data objects.

Author:
William Cohen

Field Summary
static java.lang.String DATADIR_PROP
          Property defining location of raw data
static java.lang.String LABELDIR_PROP
          Property defining location of labels added to data
static java.lang.String REPOSITORY_PROP
          Property defining root of repository
static java.lang.String SCRIPTDIR_PROP
          Property defining location of scripts for loading data
static java.lang.String SGML_MARKUP_PATTERN_PROP
          Property defining the regex that determines when SGML markup is expected
 
Constructor Summary
FancyLoader()
           
 
Method Summary
static java.lang.Object[] getPossibleTextLabelKeys()
          Returns an array of the possible arguments to FancyLoader.loadTextLabels()
static java.lang.String getProperty(java.lang.String prop)
           
static TextLabels loadTextLabels(java.lang.String script)
          Tries to load the TextLabels object named by the script argument in one of several ways.
static void main(java.lang.String[] args)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

REPOSITORY_PROP

public static final java.lang.String REPOSITORY_PROP
Property defining root of repository

See Also:
Constant Field Values

DATADIR_PROP

public static final java.lang.String DATADIR_PROP
Property defining location of raw data

See Also:
Constant Field Values

LABELDIR_PROP

public static final java.lang.String LABELDIR_PROP
Property defining location of labels added to data

See Also:
Constant Field Values

SCRIPTDIR_PROP

public static final java.lang.String SCRIPTDIR_PROP
Property defining location of scripts for loading data

See Also:
Constant Field Values

SGML_MARKUP_PATTERN_PROP

public static final java.lang.String SGML_MARKUP_PATTERN_PROP
Property defining the regex that determines when SGML markup is expected

See Also:
Constant Field Values

Constructor Detail

FancyLoader

public FancyLoader()

Method Detail

getPossibleTextLabelKeys

public static java.lang.Object[] getPossibleTextLabelKeys()
Returns an array of the possible arguments to FancyLoader.loadTextLabels()


loadTextLabels

public static TextLabels loadTextLabels(java.lang.String script)
Tries to load a TextLabels object in one of the following ways. Below, "foo" stands for the value of the script argument.
  1. If "foo" is "sampleK.train" or "sampleK.test" for K=1,2,3, a small hard-coded sample TextLabels object is returned.
  2. If "foo" is the name of a file, the file is treated as a bean shell script, and the result of executing it is returned.
  3. If "foo" is a file stem and a file "foo.base" exists, a textBase is loaded from "foo.base" (one document per line, line name used as document id).
  4. If "foo" is a file stem and a directory "foo" exists, a textBase is loaded from "foo" (one document per file).
  5. If a file named "data.properties" is on the classpath, and "foo" is the name of a file in the directory given by the property edu.cmu.minorthird.scriptDir, as defined in data.properties, that file is treated as a bean shell script, and the result of executing it is returned. When the script is executed, the variables "dataDir" and "labelDir" are bound to the Files defined by edu.cmu.minorthird.dataDir and edu.cmu.minorthird.labelDir.
SGML markup in the files "foo/*" or "foo.base" is interpreted as annotations if and only if "foo" matches the regex defined by edu.cmu.minorthird.sgmlPattern. After any SGML markup is interpreted, FancyLoader looks for additional labels in "foo.labels" or "foo.mixup", in that order.

Parameters:
script - the name of the bean shell script, directory, file, ...
Returns:
TextLabels object
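
The lookup rules listed for loadTextLabels can be illustrated with a small standalone sketch. It is not FancyLoader's implementation: the class LoadOrderSketch, the enum Source and the method classify are hypothetical names, and the sketch assumes the rules are tried in the order they are listed. It only reports which rule would apply to a given argument and does not load anything. Rule 5 (the scriptDir lookup through data.properties) is left out.

```java
import java.io.File;

public class LoadOrderSketch {
    /** Hypothetical labels for the documented cases; not part of FancyLoader. */
    enum Source { SAMPLE, SCRIPT_FILE, BASE_FILE, DIRECTORY, UNRESOLVED }

    /** Reports which documented rule would apply, assuming listed order. */
    static Source classify(String script) {
        // Rule 1: built-in samples "sampleK.train" / "sampleK.test", K=1,2,3.
        if (script.matches("sample[123]\\.(train|test)")) {
            return Source.SAMPLE;
        }
        File f = new File(script);
        // Rule 2: an existing file is treated as a bean shell script.
        if (f.isFile()) {
            return Source.SCRIPT_FILE;
        }
        // Rule 3: the stem plus ".base" holds one document per line.
        if (new File(script + ".base").isFile()) {
            return Source.BASE_FILE;
        }
        // Rule 4: a directory with that name holds one document per file.
        if (f.isDirectory()) {
            return Source.DIRECTORY;
        }
        // Rule 5 (scriptDir lookup via data.properties) is not modelled here.
        return Source.UNRESOLVED;
    }

    public static void main(String[] args) {
        // Prints the rule that would apply to each command-line argument.
        for (String arg : args) {
            System.out.println(arg + " -> " + classify(arg));
        }
    }
}
```

Running the sketch with the arguments a caller intends to pass to loadTextLabels gives a quick check of which files or directories would be picked up under the stated assumptions.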

getProperty

public static java.lang.String getProperty(java.lang.String prop)
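
How getProperty resolves a key is not documented on this page. As an illustration only, the sketch below shows how a data.properties file of the kind described under loadTextLabels could be read with java.util.Properties, and how the edu.cmu.minorthird.sgmlPattern regex could be checked against a file stem. The class DataPropertiesSketch, its methods, the directory paths and the pattern value are all hypothetical, and the sketch assumes the regex must match the whole stem.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;
import java.util.regex.Pattern;

public class DataPropertiesSketch {
    /** Parses properties-file text; stands in for reading data.properties. */
    static Properties parse(String text) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(text));
        } catch (IOException e) {
            // A StringReader does not fail in practice; rethrow unchecked.
            throw new IllegalStateException(e);
        }
        return p;
    }

    /** True if the stem fully matches the configured SGML pattern (assumption). */
    static boolean expectsSgml(Properties p, String stem) {
        String regex = p.getProperty("edu.cmu.minorthird.sgmlPattern");
        return regex != null && Pattern.matches(regex, stem);
    }

    public static void main(String[] args) {
        // Hypothetical file contents; only the key names come from this page.
        String example =
            "edu.cmu.minorthird.scriptDir=/hypothetical/repo/loaders\n"
            + "edu.cmu.minorthird.dataDir=/hypothetical/repo/data\n"
            + "edu.cmu.minorthird.labelDir=/hypothetical/repo/labels\n"
            + "edu.cmu.minorthird.sgmlPattern=.*-sgml\n";
        Properties p = parse(example);
        System.out.println(p.getProperty("edu.cmu.minorthird.scriptDir"));
        System.out.println(expectsSgml(p, "news-sgml"));
        System.out.println(expectsSgml(p, "news"));
    }
}
```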

main

public static void main(java.lang.String[] args)
                 throws bsh.EvalError,
                        java.io.IOException
Throws:
bsh.EvalError
java.io.IOException