edu.cmu.minorthird.ui
Class Recommended.VPSMMLearner

java.lang.Object
  extended by edu.cmu.minorthird.text.learn.AnnotatorLearner
      extended by edu.cmu.minorthird.text.learn.ConditionalSemiMarkovModel.CSMMLearner
          extended by edu.cmu.minorthird.ui.Recommended.VPSMMLearner
Enclosing class:
Recommended

public static class Recommended.VPSMMLearner
extends ConditionalSemiMarkovModel.CSMMLearner

Uses the voted perceptron algorithm to learn the parameters for a hidden semi-Markov model (SMM).

This is a somewhat more expensive version of the VPHMMLearner that allows features to describe properties of multi-token spans, rather than only properties of single tokens. It implements the training algorithm described in the initial draft of Cohen & Sarawagi's KDD paper. This implementation is less memory-intensive than VPSMMLearner2, but slower, since the feature-extraction step is repeated many times.

Reference: William W. Cohen and Sunita Sarawagi, Exploiting Dictionaries in Named Entity Extraction: Combining Semi-Markov Extraction Processes and Data Integration Methods, Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2004).
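
The extra cost relative to a token-level learner comes from the number of candidate spans that have to be considered. The following is a minimal, library-independent sketch of that idea only. It is not MinorThird code, and the class and method names (SpanEnumerationSketch, candidateSpans) are hypothetical. It enumerates every token span of length at most maxLength, which is the set of segments a span-based feature extractor would be asked to describe.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only: NOT part of MinorThird. All names here are hypothetical.
class SpanEnumerationSketch {

    /**
     * Returns every half-open token span [start, end) whose length is
     * between 1 and maxLength, for a document of numTokens tokens.
     */
    static List<int[]> candidateSpans(int numTokens, int maxLength) {
        List<int[]> spans = new ArrayList<>();
        for (int start = 0; start < numTokens; start++) {
            // Stop growing the span at maxLength or at the end of the document.
            for (int len = 1; len <= maxLength && start + len <= numTokens; len++) {
                spans.add(new int[] {start, start + len});
            }
        }
        return spans;
    }

    public static void main(String[] args) {
        String[] tokens = {"the", "voted", "perceptron", "algorithm", "works"};
        // A maximum length of 4 mirrors the default stated for the no-argument constructor.
        for (int[] span : candidateSpans(tokens.length, 4)) {
            String text = String.join(" ", Arrays.copyOfRange(tokens, span[0], span[1]));
            System.out.println("[" + span[0] + ", " + span[1] + ") " + text);
        }
    }
}
```

For a five-token document, a maximum length of 4 gives 14 candidate spans, while a maximum length of 1 (the single-token case) gives 5. In general the count grows roughly as the number of tokens times the maximum length, which is one reason a smaller maximum length keeps training cheaper.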


Constructor Summary
Recommended.VPSMMLearner()
          Extracted entities must be of length 4 or less.
Recommended.VPSMMLearner(int maxLength)
           
 
Method Summary
 
Methods inherited from class edu.cmu.minorthird.text.learn.ConditionalSemiMarkovModel.CSMMLearner
getAnnotationType, getAnnotator, getEpochs, getLearner, getMaxSegmentSize, getSpanFeatureExtractor, hasNextQuery, nextQuery, reset, setAnnotationType, setAnswer, setDocumentPool, setEpochs, setLearner, setMaxSegmentSize, setSpanFeatureExtractor
 
Methods inherited from class edu.cmu.minorthird.text.learn.AnnotatorLearner
getAnnotationTypeHelp, getSpanFeatureExtractorHelp
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

Recommended.VPSMMLearner

public Recommended.VPSMMLearner()
Extracted entities must be of length 4 or less.


Recommended.VPSMMLearner

public Recommended.VPSMMLearner(int maxLength)