edu.cmu.minorthird.text.learn
Class SampleFE.ExtractionFE

java.lang.Object
  extended by edu.cmu.minorthird.text.learn.SpanFE
      extended by edu.cmu.minorthird.text.learn.SampleFE.AnnotatedSpanFE
          extended by edu.cmu.minorthird.text.learn.SampleFE.ExtractionFE
All Implemented Interfaces:
MixupCompatible, SpanFeatureExtractor, java.io.Serializable
Direct Known Subclasses:
ConditionalSemiMarkovModel.CSMMSpanFE
Enclosing class:
SampleFE

public static class SampleFE.ExtractionFE
extends SampleFE.AnnotatedSpanFE

A feature extractor for extraction tasks, intended to be applied to one-token spans.

See Also:
Serialized Form

Nested Class Summary
 
Nested classes/interfaces inherited from class edu.cmu.minorthird.text.learn.SpanFE
SpanFE.Filter, SpanFE.Function, SpanFE.Result, SpanFE.SetResult<T>, SpanFE.SpanResult, SpanFE.SpanSetResult, SpanFE.StringBagResult, SpanFE.TokenSetResult
 
Field Summary
protected  java.lang.String[] tokenPropertyFeatures
           
protected  boolean useCharType
           
protected  boolean useCompressedCharType
           
protected  int windowSize
           
 
Fields inherited from class edu.cmu.minorthird.text.learn.SpanFE
annotatorLoader, instance, requiredAnnotation, requiredAnnotationFileToLoad, STORE_AS_BINARY, STORE_AS_COUNTS, STORE_COMPACTLY
 
Constructor Summary
SampleFE.ExtractionFE()
           
SampleFE.ExtractionFE(int windowSize)
           
 
Method Summary
 void extractFeatures(Span s)
          Implement this with a specific set of SpanFE 'pipelines'.
 void extractFeatures(TextLabels labels, Span s)
          Implement this with a specific set of SpanFE 'pipelines'.
 int getFeatureWindowSize()
           
 java.lang.String getTokenPropertyFeatures()
           
 boolean getUseCharType()
           
 boolean getUseCompressedCharType()
           
 void setFeatureWindowSize(int n)
          Specify the number of tokens before and after the span for which features are emitted.
 void setTokenPropertyFeatures(java.util.Set<java.lang.String> propertySet)
           
 void setTokenPropertyFeatures(java.lang.String commaSeparatedTokenPropertyList)
          Specify the token properties from the TextLabels environment that will be used as features.
 void setUseCharType(boolean flag)
          If set to true, produce features like "token.charTypePattern.Aaaa" for the word "Bill".
 void setUseCompressedCharType(boolean flag)
          If set to true, produce features like "token.charTypePattern.Aa+" for the word "Bill".
 
Methods inherited from class edu.cmu.minorthird.text.learn.SpanFE
emit, emit, emit, emit, extractInstance, extractInstance, from, from, getAnnotationProvider, getRequiredAnnotation, requireMyAnnotation, setAnnotationProvider, setAnnotatorLoader, setFeatureStoragePolicy, setRequiredAnnotation, setRequiredAnnotation, trace
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

windowSize

protected int windowSize

useCharType

protected boolean useCharType

useCompressedCharType

protected boolean useCompressedCharType

tokenPropertyFeatures

protected java.lang.String[] tokenPropertyFeatures
Constructor Detail

SampleFE.ExtractionFE

public SampleFE.ExtractionFE()

SampleFE.ExtractionFE

public SampleFE.ExtractionFE(int windowSize)
Method Detail

setFeatureWindowSize

public void setFeatureWindowSize(int n)
Specify the number of tokens before and after the span for which features are emitted.
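To illustrate what a feature window means for a one-token span, here is a minimal, self-contained sketch. It does not use the MinorThird classes and is not the library's implementation: the class name WindowSketch, the method windowFeatures, and the feature names "left.k." and "right.k." are hypothetical and chosen only for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of MinorThird.
class WindowSketch {
    // Collects lower-cased tokens up to n positions to the left and right
    // of the one-token span at index i, skipping positions outside the text.
    static List<String> windowFeatures(String[] tokens, int i, int n) {
        List<String> out = new ArrayList<>();
        for (int k = 1; k <= n; k++) {
            if (i - k >= 0) {
                out.add("left." + k + "." + tokens[i - k].toLowerCase());
            }
            if (i + k < tokens.length) {
                out.add("right." + k + "." + tokens[i + k].toLowerCase());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String[] tokens = {"Yesterday", "Bill", "Clinton", "spoke"};
        // Span is the token "Bill" (index 1), window size 2.
        // prints [left.1.yesterday, right.1.clinton, right.2.spoke]
        System.out.println(windowFeatures(tokens, 1, 2));
    }
}
```

A larger window gives the learner more context per token at the cost of more features per instance.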


getFeatureWindowSize

public int getFeatureWindowSize()

setUseCharType

public void setUseCharType(boolean flag)
If set to true, produce features like "token.charTypePattern.Aaaa" for the word "Bill".
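The character-type pattern in the example above can be sketched as follows. This is an assumption about how such a pattern might be computed, not the library's code: the class CharTypeSketch, the mapping of digits to '9', and the choice to copy other characters unchanged are all hypothetical. Only the "Bill" to "Aaaa" mapping comes from the description.

```java
// Hypothetical illustration only; not part of MinorThird.
class CharTypeSketch {
    // Maps each character to a type symbol: 'A' for upper case,
    // 'a' for lower case, '9' for digits (assumed), anything else unchanged.
    static String charTypePattern(String token) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < token.length(); i++) {
            char c = token.charAt(i);
            if (Character.isUpperCase(c)) {
                sb.append('A');
            } else if (Character.isLowerCase(c)) {
                sb.append('a');
            } else if (Character.isDigit(c)) {
                sb.append('9');
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints token.charTypePattern.Aaaa
        System.out.println("token.charTypePattern." + charTypePattern("Bill"));
    }
}
```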


getUseCharType

public boolean getUseCharType()

setUseCompressedCharType

public void setUseCompressedCharType(boolean flag)
If set to true, produce features like "token.charTypePattern.Aa+" for the word "Bill".
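A compressed pattern collapses each run of identical character types. The sketch below shows one plausible way to do this, assuming that a run longer than one character is written as the type symbol followed by '+'. The class CompressedCharTypeSketch, the helper typeOf, and the treatment of digits and punctuation are hypothetical; only the "Bill" to "Aa+" mapping comes from the description.

```java
// Hypothetical illustration only; not part of MinorThird.
class CompressedCharTypeSketch {
    // 'A' for upper case, 'a' for lower case, '9' for digits (assumed),
    // any other character stands for itself.
    static char typeOf(char c) {
        if (Character.isUpperCase(c)) return 'A';
        if (Character.isLowerCase(c)) return 'a';
        if (Character.isDigit(c)) return '9';
        return c;
    }

    // Emits one symbol per run of equal types; runs of length > 1 get a '+'.
    static String compressedPattern(String token) {
        StringBuilder sb = new StringBuilder();
        int i = 0;
        while (i < token.length()) {
            char t = typeOf(token.charAt(i));
            int j = i + 1;
            while (j < token.length() && typeOf(token.charAt(j)) == t) {
                j++;
            }
            sb.append(t);
            if (j - i > 1) {
                sb.append('+');
            }
            i = j;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints token.charTypePattern.Aa+
        System.out.println("token.charTypePattern." + compressedPattern("Bill"));
    }
}
```

Compared with the uncompressed pattern, this form maps words of different lengths, such as "Bill" and "William", to the same feature.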


getUseCompressedCharType

public boolean getUseCompressedCharType()

setTokenPropertyFeatures

public void setTokenPropertyFeatures(java.lang.String commaSeparatedTokenPropertyList)
Specify the token properties from the TextLabels environment that will be used as features. A value of '*' means to use all defined token properties.
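The comma-separated format and the '*' wildcard can be illustrated with a small stand-alone sketch. It is not the library's parsing code: the class TokenPropertyListSketch, the method resolve, and the property names "pos" and "lemma" are hypothetical, and the set of defined properties is passed in explicitly rather than read from a TextLabels object.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only; not part of MinorThird.
class TokenPropertyListSketch {
    // Resolves a comma-separated property list against the set of defined
    // token properties. A value of "*" selects every defined property.
    static Set<String> resolve(String commaSeparated, Set<String> defined) {
        if ("*".equals(commaSeparated.trim())) {
            return new TreeSet<>(defined);
        }
        Set<String> out = new TreeSet<>();
        for (String name : commaSeparated.split(",")) {
            String p = name.trim();
            if (!p.isEmpty()) {
                out.add(p);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Set<String> defined = new TreeSet<>(Arrays.asList("lemma", "pos", "shape"));
        System.out.println(resolve("pos, lemma", defined)); // prints [lemma, pos]
        System.out.println(resolve("*", defined));          // prints [lemma, pos, shape]
    }
}
```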


getTokenPropertyFeatures

public java.lang.String getTokenPropertyFeatures()

setTokenPropertyFeatures

public void setTokenPropertyFeatures(java.util.Set<java.lang.String> propertySet)

extractFeatures

public void extractFeatures(Span s)
Description copied from class: SpanFE
Implement this with a specific set of SpanFE 'pipelines'. Each pipeline will typically start with 'from(span)' and end with 'emit()'.

Overrides:
extractFeatures in class SpanFE

extractFeatures

public void extractFeatures(TextLabels labels,
                            Span s)
Description copied from class: SpanFE
Implement this with a specific set of SpanFE 'pipelines'. Each pipeline will typically start with 'from(span)' and end with 'emit()'.

Specified by:
extractFeatures in class SpanFE