edu.cmu.minorthird.classify
Class BasicDataset

java.lang.Object
  extended by edu.cmu.minorthird.classify.BasicDataset
All Implemented Interfaces:
Dataset, Visible, Saveable, java.io.Serializable
Direct Known Subclasses:
CoreRelationalDataset, RandomAccessDataset

public class BasicDataset
extends java.lang.Object
implements Dataset, java.io.Serializable, Visible, Saveable

A set of examples for learning.

Author:
William Cohen
See Also:
Serialized Form

Nested Class Summary
static class BasicDataset.SimpleDatasetViewer
           
 
Nested classes/interfaces inherited from interface edu.cmu.minorthird.classify.Dataset
Dataset.Split
 
Field Summary
protected  java.util.Set<java.lang.String> classNameSet
           
protected  java.util.List<Example> examples
           
protected  FeatureFactory featureFactory
           
protected  java.util.List<Instance> unlabeledExamples
           
 
Constructor Summary
BasicDataset()
           
BasicDataset(FeatureFactory featureFactory)
           
 
Method Summary
 void add(Example example)
          Add an example to the dataset.
 void add(Example example, boolean compress)
          Add an Example to the dataset.
 void addUnlabeled(Instance instance)
           
 java.lang.String getExtensionFor(java.lang.String s)
          Recommended extension for the format with the given name.
 FeatureFactory getFeatureFactory()
          Get the FeatureFactory associated with the dataset.
 java.lang.String[] getFormatNames()
          List of formats in which the object can be saved.
 ExampleSchema getSchema()
          Get the schema associated with the dataset.
 boolean hasUnlabeled()
           
 java.util.Iterator<Example> iterator()
          Return an iterator over all examples.
 java.util.Iterator<Instance> iteratorOverUnlabeled()
           
static void main(java.lang.String[] args)
          Simple test routine.
 java.lang.Object restore(java.io.File file)
          Restore the object from a file.
 void saveAs(java.io.File file, java.lang.String format)
          Save this object to the given file, in the given format.
 Dataset shallowCopy()
          Make a shallow copy of the dataset.
 void shuffle()
          Randomly re-order the examples.
 void shuffle(java.util.Random r)
          Randomly re-order the examples.
 int size()
          Return the number of examples.
 int sizeUnlabeled()
           
 Dataset.Split split(Splitter<Example> splitter)
          Partition the dataset as required by the splitter.
 Viewer toGUI()
          A GUI view of the dataset.
 java.lang.String toString()
          A string view of the dataset.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

featureFactory

protected FeatureFactory featureFactory

examples

protected java.util.List<Example> examples

unlabeledExamples

protected java.util.List<Instance> unlabeledExamples

classNameSet

protected java.util.Set<java.lang.String> classNameSet
Constructor Detail

BasicDataset

public BasicDataset(FeatureFactory featureFactory)

BasicDataset

public BasicDataset()
Method Detail

getSchema

public ExampleSchema getSchema()
Description copied from interface: Dataset
Get the schema associated with the dataset.

Specified by:
getSchema in interface Dataset

addUnlabeled

public void addUnlabeled(Instance instance)

iteratorOverUnlabeled

public java.util.Iterator<Instance> iteratorOverUnlabeled()

sizeUnlabeled

public int sizeUnlabeled()

hasUnlabeled

public boolean hasUnlabeled()

getFeatureFactory

public FeatureFactory getFeatureFactory()
Description copied from interface: Dataset
Get the FeatureFactory associated with the dataset.

Specified by:
getFeatureFactory in interface Dataset

add

public void add(Example example)
Add an example to the dataset.

This method compresses the example before adding it to the dataset. To control whether the example is compressed, call add(Example, boolean) instead.

Specified by:
add in interface Dataset
Parameters:
example - The Example that you want to add to the dataset.

add

public void add(Example example,
                boolean compress)
Add an Example to the dataset.

This method lets the caller specify whether or not to compress the example before adding it to the dataset.

Specified by:
add in interface Dataset
Parameters:
example - The example to add to the dataset
compress - Boolean specifying whether or not to compress the example.

iterator

public java.util.Iterator<Example> iterator()
Description copied from interface: Dataset
Return an iterator over all examples. This iterator must always return examples in the order in which they were added, unless the data has been shuffled.

Specified by:
iterator in interface Dataset

size

public int size()
Description copied from interface: Dataset
Return the number of examples.

Specified by:
size in interface Dataset

shuffle

public void shuffle(java.util.Random r)
Description copied from interface: Dataset
Randomly re-order the examples.

Specified by:
shuffle in interface Dataset
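
Passing an explicit java.util.Random makes the re-ordering repeatable, which helps when experiments need to be reproduced. The sketch below illustrates the idea on a plain java.util.List standing in for the dataset; it is not MinorThird's implementation, and the class name SeededShuffleSketch and the helper shuffled are hypothetical names introduced only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class SeededShuffleSketch {

    // Hypothetical helper: returns a shuffled copy of the given list,
    // leaving the input list untouched. A plain List<String> stands in
    // for the dataset's examples; this is an assumption, not the library's code.
    public static List<String> shuffled(List<String> examples, long seed) {
        List<String> copy = new ArrayList<>(examples);
        Collections.shuffle(copy, new Random(seed));
        return copy;
    }

    public static void main(String[] args) {
        List<String> examples = Arrays.asList("e1", "e2", "e3", "e4", "e5");

        // Two shuffles driven by generators with the same seed
        // produce the same ordering.
        List<String> first = shuffled(examples, 42L);
        List<String> second = shuffled(examples, 42L);

        System.out.println(first.equals(second)); // prints "true"
    }
}
```

With the real class, the analogous call would be shuffle(new java.util.Random(seed)) on the dataset itself, assuming the implementation draws all of its randomness from the supplied generator.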

shuffle

public void shuffle()
Description copied from interface: Dataset
Randomly re-order the examples.

Specified by:
shuffle in interface Dataset

shallowCopy

public Dataset shallowCopy()
Description copied from interface: Dataset
Make a shallow copy of the dataset. Examples are shared, but not the ordering of the examples.

Specified by:
shallowCopy in interface Dataset
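
The phrase "examples are shared, but not the ordering" can be illustrated with ordinary collections. The sketch below is a minimal stand-in, not MinorThird's code: the nested Item class plays the role of an example, and ShallowCopySketch, build, and shallowCopy are hypothetical names. Copying the list yields a new ordering container that holds references to the same objects, so shuffling the copy leaves the original order intact.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class ShallowCopySketch {

    // Hypothetical stand-in for an example; not MinorThird's Example class.
    public static final class Item {
        public final String name;
        public Item(String name) { this.name = name; }
    }

    // Builds a list of items in the order the names are given.
    public static List<Item> build(String... names) {
        List<Item> items = new ArrayList<>();
        for (String n : names) {
            items.add(new Item(n));
        }
        return items;
    }

    // Copies the list (the ordering) but not the elements themselves.
    public static List<Item> shallowCopy(List<Item> original) {
        return new ArrayList<>(original);
    }

    public static void main(String[] args) {
        List<Item> original = build("a", "b", "c", "d");
        List<Item> copy = shallowCopy(original);

        // Re-ordering the copy does not disturb the original list.
        Collections.shuffle(copy, new Random(7L));
        System.out.println(original.get(0).name); // prints "a"

        // The copy still refers to the very same Item objects.
        // Item does not override equals, so contains() compares by identity.
        System.out.println(copy.contains(original.get(0))); // prints "true"
    }
}
```

The practical consequence is that a shallow copy is cheap and safe to shuffle or split independently, while any change made to a shared example object would be visible through both datasets.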

getFormatNames

public java.lang.String[] getFormatNames()
Description copied from interface: Saveable
List of formats in which the object can be saved.

Specified by:
getFormatNames in interface Saveable

getExtensionFor

public java.lang.String getExtensionFor(java.lang.String s)
Description copied from interface: Saveable
Recommended extension for the format with the given name.

Specified by:
getExtensionFor in interface Saveable

saveAs

public void saveAs(java.io.File file,
                   java.lang.String format)
            throws java.io.IOException
Description copied from interface: Saveable
Save this object to the given file, in the given format.

Specified by:
saveAs in interface Saveable
Throws:
java.io.IOException

restore

public java.lang.Object restore(java.io.File file)
                         throws java.io.IOException
Description copied from interface: Saveable
Restore the object from a file.

Specified by:
restore in interface Saveable
Throws:
java.io.IOException

toString

public java.lang.String toString()
A string view of the dataset.

Overrides:
toString in class java.lang.Object

toGUI

public Viewer toGUI()
A GUI view of the dataset.

Specified by:
toGUI in interface Visible

split

public Dataset.Split split(Splitter<Example> splitter)
Description copied from interface: Dataset
Partition the dataset as required by the splitter.

Specified by:
split in interface Dataset

main

public static void main(java.lang.String[] args)
Simple test routine.