edu.cmu.minorthird.ui
Class Recommended.VPSMMLearner2
java.lang.Object
edu.cmu.minorthird.text.learn.AnnotatorLearner
edu.cmu.minorthird.text.learn.SegmentAnnotatorLearner
edu.cmu.minorthird.ui.Recommended.VPSMMLearner2
- Enclosing class:
- Recommended
public static class Recommended.VPSMMLearner2
- extends SegmentAnnotatorLearner
Uses the voted perceptron algorithm to learn the parameters for a
hidden semi-Markov model (SMM).
This is a somewhat more expensive variant of the VPHMMLearner:
its features can describe properties of multi-token spans,
rather than only properties of single tokens. It implements the
training algorithm described in the final draft of
Cohen & Sarawagi's KDD paper. This implementation is more
memory-intensive than the VPSMMLearner2 package below, but
faster, since the feature-extraction step is performed only once.
I generally prefer this method to the (older) VPHMMLearner.
Reference: William W. Cohen and Sunita Sarawagi, Exploiting
Dictionaries in Named Entity Extraction: Combining Semi-Markov
Extraction Processes and Data Integration Methods,
Proceedings of the Tenth ACM SIGKDD International Conference on
Knowledge Discovery and Data Mining (KDD-2004).
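To illustrate what span-level features and a maximum segment length mean in a semi-Markov model, the following is a minimal, self-contained sketch of semi-Markov Viterbi decoding. It is not MinorThird code and does not use the MinorThird API: the class SemiMarkovSketch, the SpanScorer interface, the Segment type, and the toy dictionaryScorer are all hypothetical names introduced here, and the scores are hand-set rather than learned by a voted perceptron. The sketch assumes that label 0 means "outside" and label 1 means "entity".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Illustrative sketch only; all names here are hypothetical, not MinorThird classes. */
final class SemiMarkovSketch {

    /** Scores tokens[start..end) as one segment; prevLabel is -1 for the first segment. */
    public interface SpanScorer {
        double score(String[] tokens, int start, int end, int prevLabel, int label);
    }

    /** A labeled segment covering tokens[start..end). */
    public static final class Segment {
        public final int start, end, label;
        Segment(int start, int end, int label) {
            this.start = start; this.end = end; this.label = label;
        }
        @Override public String toString() {
            return "[" + start + "," + end + ")=" + label;
        }
    }

    /** Finds the highest-scoring segmentation whose segments span at most maxLen tokens. */
    public static List<Segment> decode(String[] tokens, int numLabels, int maxLen, SpanScorer scorer) {
        if (numLabels < 1 || maxLen < 1) {
            throw new IllegalArgumentException("numLabels and maxLen must be at least 1");
        }
        int n = tokens.length;
        List<Segment> result = new ArrayList<>();
        if (n == 0) return result;
        double[][] best = new double[n + 1][numLabels];  // best[end][label]
        int[][] backStart = new int[n + 1][numLabels];   // start of the last segment
        int[][] backPrev = new int[n + 1][numLabels];    // label of the segment before it
        for (int end = 1; end <= n; end++) {
            for (int label = 0; label < numLabels; label++) {
                best[end][label] = Double.NEGATIVE_INFINITY;
                for (int len = 1; len <= Math.min(maxLen, end); len++) {
                    int start = end - len;
                    // The first segment has no predecessor, encoded as prevLabel == -1.
                    int firstPrev = (start == 0) ? -1 : 0;
                    int lastPrev = (start == 0) ? -1 : numLabels - 1;
                    for (int prev = firstPrev; prev <= lastPrev; prev++) {
                        double base = (start == 0) ? 0.0 : best[start][prev];
                        double s = base + scorer.score(tokens, start, end, prev, label);
                        if (s > best[end][label]) {
                            best[end][label] = s;
                            backStart[end][label] = start;
                            backPrev[end][label] = prev;
                        }
                    }
                }
            }
        }
        // Pick the best final label, then follow back-pointers to recover the segments.
        int label = 0;
        for (int l = 1; l < numLabels; l++) {
            if (best[n][l] > best[n][label]) label = l;
        }
        int end = n;
        while (end > 0) {
            int start = backStart[end][label];
            result.add(new Segment(start, end, label));
            int prev = backPrev[end][label];
            end = start;
            label = prev;
        }
        Collections.reverse(result);
        return result;
    }

    /** Toy span-level scorer: rewards entity spans whose full text is a dictionary entry. */
    public static SpanScorer dictionaryScorer(Set<String> dictionary) {
        return (tokens, start, end, prevLabel, label) -> {
            if (label == 1) {
                String text = String.join(" ", Arrays.copyOfRange(tokens, start, end));
                return dictionary.contains(text) ? 3.0 : -1.0;
            }
            // Assumption: "outside" segments are single tokens; longer ones are penalized.
            return (end - start == 1) ? 0.0 : -1.0;
        };
    }

    public static void main(String[] args) {
        String[] tokens = {"talk", "by", "William", "W.", "Cohen", "today"};
        Set<String> dictionary = new HashSet<>(Arrays.asList("William W. Cohen"));
        System.out.println(decode(tokens, 2, 4, dictionaryScorer(dictionary)));
    }
}
```

The scorer sees the whole candidate span at once, which is what lets a feature such as "the span text matches a dictionary entry" be expressed; a token-level model can only score one token at a time. The maxLen argument bounds the inner loop, so decoding cost grows with the maximum segment length, and no segment longer than maxLen can ever be proposed.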
Methods inherited from class edu.cmu.minorthird.text.learn.SegmentAnnotatorLearner
getAnnotationType, getAnnotator, getCompressDataset, getCompressDatasetHelp, getDisplayDatasetBeforeLearning, getDisplayDatasetBeforeLearningHelp, getHistorySize, getSemiMarkovLearner, getSemiMarkovLearnerHelp, getSpanFeatureExtractor, hasNextQuery, nextQuery, reset, setAnnotationType, setAnswer, setCompressDataset, setDisplayDatasetBeforeLearning, setDocumentPool, setSemiMarkovLearner, setSpanFeatureExtractor
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Recommended.VPSMMLearner2
public Recommended.VPSMMLearner2()
- Extracted entities must be of length 4 or less.
Recommended.VPSMMLearner2
public Recommended.VPSMMLearner2(int epochs, int maxLen)