edu.cmu.minorthird.classify.sequential
Class SequenceDataset

java.lang.Object
  extended by edu.cmu.minorthird.classify.sequential.SequenceDataset
All Implemented Interfaces:
Dataset, SequenceConstants, Visible, Saveable

public class SequenceDataset
extends java.lang.Object
implements Dataset, SequenceConstants, Visible, Saveable

A dataset of sequences of examples.

Author:
William Cohen

Nested Class Summary
 
Nested classes/interfaces inherited from interface edu.cmu.minorthird.classify.Dataset
Dataset.Split
 
Field Summary
protected  java.util.Set<java.lang.String> classNameSet
           
protected  FeatureFactory factory
           
protected  java.util.List<Example[]> sequenceList
           
protected  int totalSize
           
 
Fields inherited from interface edu.cmu.minorthird.classify.sequential.SequenceConstants
HISTORY_FEATURE, NULL_CLASS_NAME
 
Constructor Summary
SequenceDataset()
           
 
Method Summary
 void add(Example example)
          Add a new example to the dataset.
 void add(Example example, boolean compress)
          Add a new example to the dataset.
 void addSequence(Example[] sequence)
          Add a new sequence of examples to the dataset.
 void addSequence(Example[] sequence, boolean compress)
          Add a new sequence of examples to the dataset. The caller specifies whether the examples are compressed.
 java.lang.String getExtensionFor(java.lang.String s)
          Recommended extension for the format with the given name.
 FeatureFactory getFeatureFactory()
          Get the FeatureFactory associated with the dataset.
 java.lang.String[] getFormatNames()
          List of formats in which the object can be saved.
 int getHistorySize()
          Return the current history length.
 int getNumPosExamples()
           
 ExampleSchema getSchema()
          Get the schema associated with the dataset.
protected  Dataset invertIteration(java.util.Iterator<Example[]> i)
           
 java.util.Iterator<Example> iterator()
          Iterate over all examples, extended so as to contain history information.
static void main(java.lang.String[] args)
           
 int numberOfSequences()
          Return the number of sequences.
 java.lang.Object restore(java.io.File file)
          Restore the object from a file.
 void saveAs(java.io.File file, java.lang.String format)
          Save this object to the given file, in the given format.
 java.util.Iterator<Example[]> sequenceIterator()
          Return an iterator over all sequences.
 void setHistorySize(int k)
          Set the current history length.
 Dataset shallowCopy()
          Make a shallow copy of the dataset.
 void shuffle()
          Randomly re-order the examples.
 void shuffle(java.util.Random r)
          Randomly re-order the examples.
 int size()
          Return the number of examples.
 Dataset.Split split(Splitter<Example> splitter)
          Partition the dataset as required by the splitter.
 Dataset.Split splitSequence(Splitter<Example[]> splitter)
           
 Viewer toGUI()
          A GUI view of the dataset.
 java.lang.String toString()
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

sequenceList

protected java.util.List<Example[]> sequenceList

totalSize

protected int totalSize

classNameSet

protected java.util.Set<java.lang.String> classNameSet

factory

protected FeatureFactory factory
Constructor Detail

SequenceDataset

public SequenceDataset()
Method Detail

getFeatureFactory

public FeatureFactory getFeatureFactory()
Description copied from interface: Dataset
Get the FeatureFactory associated with the dataset.

Specified by:
getFeatureFactory in interface Dataset

setHistorySize

public void setHistorySize(int k)
Set the current history length to k. Examples produced by iterator() will contain the last k class labels as features.
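
The idea of carrying the last k class labels as features can be illustrated with a small standalone sketch. It does not use MinorThird classes: the class name HistorySketch, the method historyFeatures, and the placeholder value "NULL" are hypothetical names chosen for illustration, and the real feature encoding used by iterator() may differ.

```java
import java.util.ArrayList;
import java.util.List;

public class HistorySketch {
    // Hypothetical placeholder for positions before the start of a sequence.
    // The actual value of SequenceConstants.NULL_CLASS_NAME is not shown here.
    static final String NULL_LABEL = "NULL";

    /** For each position, collect the k labels that precede it, most recent first. */
    static List<List<String>> historyFeatures(String[] labels, int k) {
        List<List<String>> out = new ArrayList<>();
        for (int i = 0; i < labels.length; i++) {
            List<String> history = new ArrayList<>();
            for (int j = 1; j <= k; j++) {
                // Pad with the placeholder when the history runs off the front.
                history.add(i - j >= 0 ? labels[i - j] : NULL_LABEL);
            }
            out.add(history);
        }
        return out;
    }

    public static void main(String[] args) {
        String[] labels = {"B", "I", "O", "B"};
        // prints [[NULL, NULL], [B, NULL], [I, B], [O, I]]
        System.out.println(historyFeatures(labels, 2));
    }
}
```

With k = 0 every history list is empty, which corresponds to examples that carry no label history at all.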


getHistorySize

public int getHistorySize()
Return the current history length k, as set by setHistorySize(int). Examples produced by iterator() will contain the last k class labels as features.


getSchema

public ExampleSchema getSchema()
Description copied from interface: Dataset
Get the schema associated with the dataset.

Specified by:
getSchema in interface Dataset

add

public void add(Example example)
Add a new example to the dataset.

This method compresses the example before adding it to the dataset. To prevent this compression, call add(Example, boolean).

Specified by:
add in interface Dataset
Parameters:
example - The example to add to the dataset.

add

public void add(Example example,
                boolean compress)
Add a new example to the dataset.

This method allows the caller to specify whether the example is compressed.

Specified by:
add in interface Dataset
Parameters:
example - The example to add to the dataset.
compress - Boolean specifying whether or not to compress the example.

addSequence

public void addSequence(Example[] sequence)
Add a new sequence of examples to the dataset.

This method compresses each example before adding it to the dataset. To prevent this compression, call addSequence(Example[], boolean).


addSequence

public void addSequence(Example[] sequence,
                        boolean compress)
Add a new sequence of examples to the dataset.

This method allows the caller to specify whether the examples are compressed.

Parameters:
sequence - The sequence of examples to add to the dataset
compress - Boolean specifying whether or not to compress the examples.

iterator

public java.util.Iterator<Example> iterator()
Iterate over all examples, extended so as to contain history information.

Specified by:
iterator in interface Dataset

size

public int size()
Return the number of examples.

Specified by:
size in interface Dataset

numberOfSequences

public int numberOfSequences()
Return the number of sequences.


sequenceIterator

public java.util.Iterator<Example[]> sequenceIterator()
Return an iterator over all sequences. Each item returned by this will be of type Example[].
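
The difference between counting sequences (numberOfSequences()) and counting examples (size()) can be sketched without the library. The following is a minimal illustration that models a dataset as a list of arrays; the class SequenceCountSketch and its methods are hypothetical stand-ins, not MinorThird code, and String stands in for Example.

```java
import java.util.ArrayList;
import java.util.List;

public class SequenceCountSketch {
    /** Number of sequences: one per array in the list. */
    static int numberOfSequences(List<String[]> sequences) {
        return sequences.size();
    }

    /** Number of examples: the sum of the lengths of all sequences. */
    static int totalExamples(List<String[]> sequences) {
        int n = 0;
        for (String[] sequence : sequences) {
            n += sequence.length;
        }
        return n;
    }

    public static void main(String[] args) {
        List<String[]> sequences = new ArrayList<>();
        sequences.add(new String[]{"a1", "a2"});
        sequences.add(new String[]{"b1", "b2", "b3"});
        System.out.println(numberOfSequences(sequences)); // prints 2
        System.out.println(totalExamples(sequences));     // prints 5
    }
}
```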


shuffle

public void shuffle(java.util.Random r)
Randomly re-order the examples.

Specified by:
shuffle in interface Dataset

shuffle

public void shuffle()
Randomly re-order the examples.

Specified by:
shuffle in interface Dataset

shallowCopy

public Dataset shallowCopy()
Make a shallow copy of the dataset. Sequences are shared, but not the ordering of the Sequences.
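
What "sequences are shared, but not the ordering" means can be shown with plain Java collections. This is only a sketch under the assumption that the dataset behaves like a list of arrays; ShallowCopySketch and its shallowCopy method are hypothetical and are not the library's implementation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class ShallowCopySketch {
    /** Copy the list of sequences without copying the sequences themselves. */
    static List<String[]> shallowCopy(List<String[]> sequences) {
        return new ArrayList<>(sequences);
    }

    public static void main(String[] args) {
        List<String[]> original = new ArrayList<>();
        original.add(new String[]{"a1", "a2"});
        original.add(new String[]{"b1"});
        original.add(new String[]{"c1", "c2", "c3"});

        List<String[]> copy = shallowCopy(original);
        // Shuffling the copy re-orders only the copy's own list.
        Collections.shuffle(copy, new Random(0));

        // The original order is untouched.
        System.out.println(original.get(0)[0]); // prints a1
        // Both lists refer to the same array objects.
        System.out.println(copy.contains(original.get(0))); // prints true
    }
}
```

Because the arrays are shared, a change made to an element through one list is visible through the other; only the ordering is independent.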

Specified by:
shallowCopy in interface Dataset

split

public Dataset.Split split(Splitter<Example> splitter)
Description copied from interface: Dataset
Partition the dataset as required by the splitter.

Specified by:
split in interface Dataset

splitSequence

public Dataset.Split splitSequence(Splitter<Example[]> splitter)

invertIteration

protected Dataset invertIteration(java.util.Iterator<Example[]> i)

toString

public java.lang.String toString()
Overrides:
toString in class java.lang.Object

getFormatNames

public java.lang.String[] getFormatNames()
Description copied from interface: Saveable
List of formats in which the object can be saved.

Specified by:
getFormatNames in interface Saveable

getExtensionFor

public java.lang.String getExtensionFor(java.lang.String s)
Description copied from interface: Saveable
Recommended extension for the format with the given name.

Specified by:
getExtensionFor in interface Saveable

saveAs

public void saveAs(java.io.File file,
                   java.lang.String format)
            throws java.io.IOException
Description copied from interface: Saveable
Save this object to the given file, in the given format.

Specified by:
saveAs in interface Saveable
Throws:
java.io.IOException

restore

public java.lang.Object restore(java.io.File file)
                         throws java.io.IOException
Description copied from interface: Saveable
Restore the object from a file.

Specified by:
restore in interface Saveable
Throws:
java.io.IOException

toGUI

public Viewer toGUI()
A GUI view of the dataset.

Specified by:
toGUI in interface Visible

main

public static void main(java.lang.String[] args)
                 throws java.io.IOException
Throws:
java.io.IOException

getNumPosExamples

public int getNumPosExamples()