edu.cmu.minorthird.classify.semisupervised
Class SemiSupervisedDataset

java.lang.Object
  extended by edu.cmu.minorthird.classify.semisupervised.SemiSupervisedDataset
All Implemented Interfaces:
Dataset, SemiSupervisedActions, Visible, Saveable

public class SemiSupervisedDataset
extends java.lang.Object
implements Dataset, SemiSupervisedActions, Visible, Saveable

Author:
Edoardo Airoldi (Date: Jul 19, 2004)

Nested Class Summary
static class SemiSupervisedDataset.SimpleDatasetViewer
           
 
Nested classes/interfaces inherited from interface edu.cmu.minorthird.classify.Dataset
Dataset.Split
 
Field Summary
protected  java.util.Set<java.lang.String> classNameSet
           
protected  java.util.List<Example> examples
           
protected  FeatureFactory factory
           
protected  java.util.List<Instance> unlabeledExamples
           
 
Constructor Summary
SemiSupervisedDataset()
           
 
Method Summary
 void add(Example example)
          Add an example to the dataset.
 void add(Example example, boolean compress)
          Add an Example to the dataset.
 void addUnlabeled(Instance instance)
          Add a new semisupervised example to the dataset.
 java.lang.String getExtensionFor(java.lang.String s)
          Recommended extension for the format with the given name.
 FeatureFactory getFeatureFactory()
          Get the FeatureFactory associated with the dataset.
 java.lang.String[] getFormatNames()
          List of formats in which the object can be saved.
 int getNumPosExamples()
           
 ExampleSchema getSchema()
          Get the schema associated with the dataset.
 boolean hasUnlabeled()
          Return whether the Dataset contains semisupervised examples available for semi-supervised learning.
 java.util.Iterator<Example> iterator()
          Return an iterator over all examples.
 java.util.Iterator<Instance> iteratorOverUnlabeled()
          Return an iterator over all the semisupervised examples.
static void main(java.lang.String[] args)
          Simple test routine.
 java.lang.Object restore(java.io.File file)
          Restore the object from a file.
 void saveAs(java.io.File file, java.lang.String format)
          Save this object to the given file, in the given format.
 Dataset shallowCopy()
          Make a shallow copy of the dataset.
 void shuffle()
          Randomly re-order the examples.
 void shuffle(java.util.Random r)
          Randomly re-order the examples.
 int size()
          Return the number of examples.
 int sizeUnlabeled()
          Return the number of semisupervised examples.
 Dataset.Split split(Splitter<Example> splitter)
          Partition the dataset as required by the splitter.
 Viewer toGUI()
          A GUI view of the dataset.
 java.lang.String toString()
          A string view of the dataset.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

examples

protected java.util.List<Example> examples

unlabeledExamples

protected java.util.List<Instance> unlabeledExamples

classNameSet

protected java.util.Set<java.lang.String> classNameSet

factory

protected FeatureFactory factory
Constructor Detail

SemiSupervisedDataset

public SemiSupervisedDataset()
Method Detail

getSchema

public ExampleSchema getSchema()
Description copied from interface: Dataset
Get the schema associated with the dataset.

Specified by:
getSchema in interface Dataset

addUnlabeled

public void addUnlabeled(Instance instance)
Description copied from interface: SemiSupervisedActions
Add a new semisupervised example to the dataset.

Specified by:
addUnlabeled in interface SemiSupervisedActions

iteratorOverUnlabeled

public java.util.Iterator<Instance> iteratorOverUnlabeled()
Description copied from interface: SemiSupervisedActions
Return an iterator over all the semisupervised examples. This iterator must always return examples in the order in which they were added, unless the data has been shuffled.

Specified by:
iteratorOverUnlabeled in interface SemiSupervisedActions

sizeUnlabeled

public int sizeUnlabeled()
Description copied from interface: SemiSupervisedActions
Return the number of semisupervised examples.

Specified by:
sizeUnlabeled in interface SemiSupervisedActions

hasUnlabeled

public boolean hasUnlabeled()
Description copied from interface: SemiSupervisedActions
Return whether the Dataset contains semisupervised examples available for semi-supervised learning.

Specified by:
hasUnlabeled in interface SemiSupervisedActions
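
The four unlabeled-data methods above (addUnlabeled, iteratorOverUnlabeled, sizeUnlabeled, hasUnlabeled) can be pictured with a small stand-in. The sketch below does not use MinorThird and is not the library's implementation: the class name UnlabeledPoolSketch, the generic type parameter, and the ArrayList backing store are all assumptions made for illustration. It only shows the documented contract, in particular that iteration follows insertion order when nothing has been shuffled, and it assumes that hasUnlabeled is equivalent to a non-empty pool.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in, NOT part of MinorThird: mirrors the names of the
// four unlabeled-data methods using a plain ArrayList as storage.
public class UnlabeledPoolSketch<T> {
    // An ArrayList preserves insertion order, matching the documented
    // iteration contract (order of addition unless shuffled).
    private final List<T> unlabeled = new ArrayList<>();

    public void addUnlabeled(T instance) {
        unlabeled.add(instance);
    }

    public Iterator<T> iteratorOverUnlabeled() {
        return unlabeled.iterator();
    }

    public int sizeUnlabeled() {
        return unlabeled.size();
    }

    // Assumption: "has unlabeled" means at least one unlabeled instance is stored.
    public boolean hasUnlabeled() {
        return !unlabeled.isEmpty();
    }

    public static void main(String[] args) {
        UnlabeledPoolSketch<String> pool = new UnlabeledPoolSketch<>();
        System.out.println("has unlabeled before adding: " + pool.hasUnlabeled());

        pool.addUnlabeled("doc-a");
        pool.addUnlabeled("doc-b");
        System.out.println("unlabeled count: " + pool.sizeUnlabeled());

        // Walk the pool in the order the instances were added.
        for (Iterator<String> it = pool.iteratorOverUnlabeled(); it.hasNext();) {
            System.out.println(it.next());
        }
    }
}
```

In the real class the stored elements are Instance objects and the labeled examples live in a separate list, as the Field Summary shows.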

getFeatureFactory

public FeatureFactory getFeatureFactory()
Description copied from interface: Dataset
Get the FeatureFactory associated with the dataset.

Specified by:
getFeatureFactory in interface Dataset

add

public void add(Example example)
Add an example to the dataset.

This method compresses the example before adding it to the dataset. If the example should not be compressed, call add(Example, boolean) instead.

Specified by:
add in interface Dataset
Parameters:
example - The Example that you want to add to the dataset.

add

public void add(Example example,
                boolean compress)
Add an Example to the dataset.

This method lets the caller specify whether or not to compress the example before adding it to the dataset.

Specified by:
add in interface Dataset
Parameters:
example - The example to add to the dataset
compress - Boolean specifying whether or not to compress the example.

iterator

public java.util.Iterator<Example> iterator()
Description copied from interface: Dataset
Return an iterator over all examples. This iterator must always return examples in the order in which they were added, unless the data has been shuffled.

Specified by:
iterator in interface Dataset

size

public int size()
Description copied from interface: Dataset
Return the number of examples.

Specified by:
size in interface Dataset

shuffle

public void shuffle(java.util.Random r)
Description copied from interface: Dataset
Randomly re-order the examples.

Specified by:
shuffle in interface Dataset
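
Passing an explicit java.util.Random is typically useful when an experiment needs a reproducible ordering. The sketch below illustrates that idea with the standard Collections.shuffle(List, Random) call. It is an assumption that SemiSupervisedDataset.shuffle(Random) behaves analogously; this page does not say how the method is implemented. The class name SeededShuffleSketch and the helper shuffled are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical illustration of seeded shuffling using only the JDK.
public class SeededShuffleSketch {
    // Builds the list 0..n-1 and shuffles it with a Random seeded by `seed`.
    static List<Integer> shuffled(int n, long seed) {
        List<Integer> items = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            items.add(i);
        }
        Collections.shuffle(items, new Random(seed));
        return items;
    }

    public static void main(String[] args) {
        List<Integer> first = shuffled(10, 42L);
        List<Integer> second = shuffled(10, 42L);

        // The same seed yields the same permutation on both runs.
        System.out.println("same order with same seed: " + first.equals(second));
        System.out.println(first);
    }
}
```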

shuffle

public void shuffle()
Description copied from interface: Dataset
Randomly re-order the examples.

Specified by:
shuffle in interface Dataset

shallowCopy

public Dataset shallowCopy()
Description copied from interface: Dataset
Make a shallow copy of the dataset. Examples are shared, but not the ordering of the examples.

Specified by:
shallowCopy in interface Dataset
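
The phrase "examples are shared, but not the ordering" can be illustrated with plain JDK lists. This is a minimal sketch under the assumption that the copy is a new list holding the same element references. The class ShallowCopySketch and its helper are hypothetical and do not reflect how SemiSupervisedDataset implements shallowCopy(). StringBuilder stands in for Example so that the sharing of mutable elements is visible.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration of shallow-copy semantics using only the JDK.
public class ShallowCopySketch {
    // Copies the list structure only; the element references are shared.
    static <T> List<T> shallowCopy(List<T> original) {
        return new ArrayList<>(original);
    }

    public static void main(String[] args) {
        List<StringBuilder> original = new ArrayList<>();
        original.add(new StringBuilder("ex0"));
        original.add(new StringBuilder("ex1"));
        original.add(new StringBuilder("ex2"));

        List<StringBuilder> copy = shallowCopy(original);

        // Reordering the copy leaves the original's ordering untouched.
        Collections.reverse(copy);
        System.out.println("original: " + original);
        System.out.println("copy:     " + copy);

        // The elements themselves are the same objects in both lists,
        // so a change made through one list is visible through the other.
        copy.get(0).append("-edited");
        System.out.println("shared element: " + (original.get(2) == copy.get(0)));
        System.out.println("original after edit: " + original);
    }
}
```

A practical consequence is that shuffling or splitting a shallow copy is cheap and leaves the source ordering intact, while any in-place change to an example affects every copy that holds it.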

getFormatNames

public java.lang.String[] getFormatNames()
Description copied from interface: Saveable
List of formats in which the object can be saved.

Specified by:
getFormatNames in interface Saveable

getExtensionFor

public java.lang.String getExtensionFor(java.lang.String s)
Description copied from interface: Saveable
Recommended extension for the format with the given name.

Specified by:
getExtensionFor in interface Saveable

saveAs

public void saveAs(java.io.File file,
                   java.lang.String format)
            throws java.io.IOException
Description copied from interface: Saveable
Save this object to the given file, in the given format.

Specified by:
saveAs in interface Saveable
Throws:
java.io.IOException

restore

public java.lang.Object restore(java.io.File file)
                         throws java.io.IOException
Description copied from interface: Saveable
Restore the object from a file.

Specified by:
restore in interface Saveable
Throws:
java.io.IOException

toString

public java.lang.String toString()
A string view of the dataset.

Overrides:
toString in class java.lang.Object

toGUI

public Viewer toGUI()
A GUI view of the dataset.

Specified by:
toGUI in interface Visible

split

public Dataset.Split split(Splitter<Example> splitter)
Description copied from interface: Dataset
Partition the dataset as required by the splitter.

Specified by:
split in interface Dataset

main

public static void main(java.lang.String[] args)
Simple test routine.


getNumPosExamples

public int getNumPosExamples()