EditLabels Tutorial
EditLabels is useful for hand-editing the labels produced by applying an extraction annotator. It requires a data directory and a labels file created by running ApplyAnnotator with an Extraction Annotator. To see how to create an Extraction Annotator, look at the TrainExtractor Tutorial.htm. To see how to apply the annotator, look at the ApplyAnnotator Tutorial.htm. The extracted type (such as extracted_name) and the true type (such as true_name) must also be known.
For this example, we will use a name annotator and apply it to a directory of data. For a quick reference, here are the command lines needed to do that:
% java -Xmx500M edu.cmu.minorthird.ui.TrainExtractor -labels sample1.train -spanType trueName -saveAs sample1.ann
% java -Xmx500M edu.cmu.minorthird.ui.ApplyAnnotator -labels DATA_DIR -loadFrom sample1.ann -saveAs sample1.labels
Note: For this example, you must create your own data directory containing whatever files you like. I created a simple data directory with one document that contains this text:
Did <trueName>Andrew Carnegie</trueName> found Carnegie Tech?
Feel free to create the same sample or use the annotator on another directory to run through this example.
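As a rough sketch, a one-document data directory like the one described above could be created from the shell as follows. The file name sample.txt is an assumption made for illustration; the tutorial does not name the document, and any file name should work.

```shell
# Create the data directory used in this tutorial (DATA_DIR).
mkdir -p DATA_DIR

# Write a single document with one hand-labeled name.
# sample.txt is a hypothetical file name chosen for this sketch.
printf '%s\n' 'Did <trueName>Andrew Carnegie</trueName> found Carnegie Tech?' > DATA_DIR/sample.txt

# Show the document to confirm its contents.
cat DATA_DIR/sample.txt
```

The ApplyAnnotator command above can then be pointed at DATA_DIR to produce sample1.labels.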
To run EditLabels, start with:
% java -Xmx500M edu.cmu.minorthird.ui.EditLabels
Editing Parameters:
Like all ui tasks, the parameters for EditLabels may be specified either in the gui or on the command line. To use the gui, simply add -gui to the command line. It is also possible to mix and match where the parameters are specified; for example, one can specify two parameters on the command line and use the gui to select the rest. For this reason, the step-by-step process for this experiment first explains how to select each parameter value in the gui and then how to set the same parameter on the command line.
Note: In this experiment every parameter must be specified.
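To illustrate mixing the two, here is a minimal sketch that supplies two parameters on the command line and adds -gui so the rest can be chosen in the Property Editor. Which two parameters to supply is an arbitrary choice made for this example, and the variable name MIXED_CMD is only illustrative. The sketch prints the command rather than running it.

```shell
# Two parameters (-labels and -edit) are given on the command line;
# -gui opens the gui so the remaining parameters can be set there.
MIXED_CMD="java -Xmx500M edu.cmu.minorthird.ui.EditLabels -labels DATA_DIR -edit sample1.labels -gui"

# Print the command for inspection. To launch, run the printed line
# (this assumes MinorThird and its libraries are on the CLASSPATH).
echo "$MIXED_CMD"
```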
If using the gui, a window will appear; click the Edit button next to EditLabels to edit the parameters. A Property Editor window will appear:
1) Base Parameters:
a. GUI: Enter the name of the data directory in the labelsFilename text field
b. Command Line: use the -labels option followed by the repositoryKey or the directory of files to load. In this case specify -labels DATA_DIR
2) Edit Parameters
a. GUI
i. editFilename: enter the file name of the labels (the result of the ApplyAnnotator experiment), in this case sample1.labels
ii. extractedType: enter the type that ApplyAnnotator predicted (Note: this type is set in TrainExtractor using the -output option and the default is _prediction)
iii. trueType: enter the name of the type that has been hand labeled, in this case trueName
b. Command Line
i. Use the -edit option to specify the file name of the labels (the result of the ApplyAnnotator experiment), in this case -edit sample1.labels
ii. Use the -extractedType option to specify the type that ApplyAnnotator predicted (Note: this type is set in TrainExtractor using the -output option and the default is _prediction), in this case -extractedType _prediction
iii. Use the -trueType option to specify the correct hand label, in this case -trueType trueName
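Putting the command-line options above together, the full invocation can be sketched as follows. The values are the ones used in this tutorial; the shell variable names and the RUN switch are assumptions added for illustration, not part of MinorThird. By default the sketch only prints the assembled command so it can be checked first.

```shell
# Parameter values from this tutorial; adjust them to your own setup.
DATA_DIR="DATA_DIR"             # directory of documents (-labels)
LABELS_FILE="sample1.labels"    # result of ApplyAnnotator (-edit)
EXTRACTED_TYPE="_prediction"    # type predicted by the annotator (default)
TRUE_TYPE="trueName"            # hand-labeled type

# Assemble the EditLabels command with every parameter specified.
CMD="java -Xmx500M edu.cmu.minorthird.ui.EditLabels -labels $DATA_DIR -edit $LABELS_FILE -extractedType $EXTRACTED_TYPE -trueType $TRUE_TYPE"

# Print the command so it can be inspected before launching.
echo "$CMD"

# Set RUN=1 to launch EditLabels; this assumes MinorThird and its
# libraries are on the CLASSPATH.
if [ "${RUN:-0}" = "1" ]; then
    $CMD
fi
```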
At this point, press Enter on the command line, or, in the gui, click “OK” in the Property Editor and then press “Start Task”. A window like this will appear: