java.lang.Object edu.cmu.minorthird.text.TextLabelsLoader
public class TextLabelsLoader
Loads and saves the contents of a TextLabels into a file. Labels can be loaded from operations (see importOps) or from a serialized TextLabels object. Labels can be serialized or types can be saved as operations, xml, or plain lists.
Field Summary | |
---|---|
static int |
CLOSE_ALL_TYPES
Spans in labels are a complete list of all spans. |
static int |
CLOSE_BY_OPERATION
|
static int |
CLOSE_TYPES_IN_LABELED_DOCS
If a document has been labeled for a type, assume all spans of that type are there. |
static java.lang.String[] |
CLOSURE_NAMES
|
static int |
DONT_CLOSE_TYPES
Make no assumptions about closure. |
Constructor Summary | |
---|---|
TextLabelsLoader()
|
Method Summary | |
---|---|
void |
closeLabels(MutableTextLabels labels,
int policy)
Close the given labels according to the policy. |
java.lang.String |
createXMLmarkup(java.lang.String documentId,
TextLabels labels)
Save extracted data in an XML format. |
void |
importOps(MutableTextLabels labels,
TextBase base,
java.io.File file)
Load lines modifying a TextLabels from a file. |
MutableTextLabels |
loadOps(TextBase base,
java.io.File file)
Create a new labeling by importing from a file with importOps. |
MutableTextLabels |
loadSerialized(java.io.File file,
TextBase base)
Read in a serialized TextLabels. |
java.lang.String |
markupDocumentSpan(java.lang.String documentId,
TextLabels labels)
Deprecated. Use createXMLmarkup(String documentId, TextLabels labels) instead. Save extracted data in an XML format, converted to a string of the form <root>..<type>...</type>..</root>. Nested spans such as <a>A<b>B</b>C</a> are stored as <a>A<set v=a,b>B</set>C</a>, where single sets are simplified, so mismatched spans like [A (B C] D)E are stored as <a>A<set v=a,b>B C</set></a><b>D</b>E |
java.lang.String |
printTypesAsOps(TextLabels labels)
Save extracted data in a format readable with loadOps. |
void |
saveDocsWithEmbeddedTypes(TextLabels labels,
java.io.File dir)
Save documents to specified directory with extracted types embedded as xml. |
void |
saveSerialized(MutableTextLabels labels,
java.io.File file)
Serialize a TextLabels. |
void |
saveTypesAsOps(TextLabels labels,
java.io.File file)
Save extracted data in a format readable with loadOps. |
void |
saveTypesAsStrings(TextLabels labels,
java.io.File file,
boolean includeOffset)
Save spans of given type into the file, one per line. |
java.lang.String |
saveTypesAsXML(TextLabels labels)
Save extracted data in an XML format. |
void |
setClosurePolicy(int policy)
Set the closure policy. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
public static final int CLOSE_ALL_TYPES
public static final int CLOSE_TYPES_IN_LABELED_DOCS
public static final int DONT_CLOSE_TYPES
public static final int CLOSE_BY_OPERATION
public static final java.lang.String[] CLOSURE_NAMES
Constructor Detail |
---|
public TextLabelsLoader()
Method Detail |
---|
public void setClosurePolicy(int policy)
policy
- one of CLOSE_ALL_TYPES, CLOSE_TYPES_IN_LABELED_DOCS, DONT_CLOSE_TYPES
public MutableTextLabels loadOps(TextBase base, java.io.File file) throws java.io.IOException, java.io.FileNotFoundException
java.io.IOException
java.io.FileNotFoundException
public void importOps(MutableTextLabels labels, TextBase base, java.io.File file) throws java.io.IOException, java.io.FileNotFoundException
For addToType: Lines must be addToType ID LOW LENGTH TYPE
where ID is a documentID in the given TextBase, LOW is a character index into that document, and LENGTH is the length in characters of the span that will be created with the given type TYPE. If LENGTH==-1, the created span extends to the end of the document.
For closeType: Lines must be closeType ID TYPE
where ID is a documentID in the given TextBase and TYPE is the label type to close over that document.
For closeAllTypes: Lines must be closeAllType ID
where ID is a documentID in the given TextBase. The document will be closed for all types present in the TextLabels after all operations are performed.
For setClosure: Lines must be setClosure POLICY
where POLICY is one of the policy types defined in this class. It immediately changes the closure policy for the loader. This is best used at the beginning of the file to indicate one of the generic policies or the CLOSE_BY_OPERATION (default) policy.
java.io.IOException
java.io.FileNotFoundException
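The addToType line format described above can be illustrated with a small standalone sketch. It is not the MinorThird implementation: the class OpsLineSketch, the holder AddToType, and the helper parseAddToType are hypothetical names introduced only for this example, and splitting fields on whitespace is an assumption about the file format. The sketch shows how LOW and LENGTH map to character offsets, including the LENGTH==-1 case.

```java
public class OpsLineSketch {

    /** Hypothetical holder for one parsed addToType line; not a MinorThird class. */
    public static final class AddToType {
        public final String documentId;
        public final int low;
        public final int length;
        public final String type;

        AddToType(String documentId, int low, int length, String type) {
            this.documentId = documentId;
            this.low = low;
            this.length = length;
            this.type = type;
        }

        /** Exclusive end offset; LENGTH == -1 means the span runs to the end of the document. */
        public int end(int documentLength) {
            return length == -1 ? documentLength : low + length;
        }
    }

    /**
     * Parses a line of the form "addToType ID LOW LENGTH TYPE".
     * Returns null for any other operation or a malformed line.
     * Assumption: fields are separated by whitespace.
     */
    public static AddToType parseAddToType(String line) {
        String[] tok = line.trim().split("\\s+");
        if (tok.length != 5 || !tok[0].equals("addToType")) {
            return null;
        }
        return new AddToType(tok[1], Integer.parseInt(tok[2]), Integer.parseInt(tok[3]), tok[4]);
    }

    public static void main(String[] args) {
        String doc = "Hello brave new world";

        // A span of 5 characters starting at character index 6.
        AddToType a = parseAddToType("addToType doc1 6 5 adjective");
        System.out.println(doc.substring(a.low, a.end(doc.length())));

        // LENGTH == -1: the span extends to the end of the document.
        AddToType b = parseAddToType("addToType doc1 12 -1 tail");
        System.out.println(doc.substring(b.low, b.end(doc.length())));
    }
}
```

In real use, the lines would be read by importOps or loadOps against an actual TextBase; the sketch only mirrors the offset arithmetic on a plain string.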
public void closeLabels(MutableTextLabels labels, int policy)
labels
policy
public MutableTextLabels loadSerialized(java.io.File file, TextBase base) throws java.io.IOException, java.io.FileNotFoundException
java.io.IOException
java.io.FileNotFoundException
public void saveSerialized(MutableTextLabels labels, java.io.File file) throws java.io.IOException
java.io.IOException
public java.lang.String printTypesAsOps(TextLabels labels)
public void saveTypesAsOps(TextLabels labels, java.io.File file) throws java.io.IOException
java.io.IOException
public void saveTypesAsStrings(TextLabels labels, java.io.File file, boolean includeOffset) throws java.io.IOException
java.io.IOException
public void saveDocsWithEmbeddedTypes(TextLabels labels, java.io.File dir) throws java.io.IOException
java.io.IOException
public java.lang.String markupDocumentSpan(java.lang.String documentId, TextLabels labels)
public java.lang.String createXMLmarkup(java.lang.String documentId, TextLabels labels)
public java.lang.String saveTypesAsXML(TextLabels labels)