edu.cmu.minorthird.text
Class SpanDifference

java.lang.Object
  extended by edu.cmu.minorthird.text.SpanDifference

public class SpanDifference
extends java.lang.Object

Compares two sets of spans.

Author:
William Cohen

Nested Class Summary
static class SpanDifference.Invoker
           
static class SpanDifference.Looper
          A Span.Looper that also provides two additional types of information about each returned span s, including whether s is a FALSE_POS, FALSE_NEG, or TRUE_POS relative to the original spans.
 
Field Summary
static int FALSE_NEG
          Indicates a false negative span.
static int FALSE_POS
          Indicates a false positive span.
static org.apache.log4j.Logger log
           
static int MAX_STATUS
          Maximum value of a status indicator (e.g. FALSE_POS, FALSE_NEG).
static int TRUE_POS
          Indicates a true positive span.
static int UNKNOWN_POS
          Indicates something inside a guess span which may or may not be inside a truth span.
 
Constructor Summary
SpanDifference(java.util.Iterator<Span> guess, java.util.Iterator<Span> truth)
          Create machinery to analyze the differences between the two sets of spans.
SpanDifference(java.util.Iterator<Span> guess, java.util.Iterator<Span> truth, java.util.Iterator<Span> closures)
          Create machinery to analyze the differences between the two sets of spans.
SpanDifference(SpanDifference[] spanDifferences)
          Create an aggregation of the results of several SpanDifference objects.
 
Method Summary
 SpanDifference.Looper differenceIterator()
           
static void main(java.lang.String[] args)
           
 double spanPrecision()
          Return the percentage of 'guess' spans that are also 'truth' spans, ignoring non-truth spans that are not inside closure spans.
 double spanRecall()
          Return the percentage of 'truth' spans that are also 'guess' spans.
 double tokenPrecision()
          Return the percentage of tokens in 'guess' spans that are true positives (ignoring tokens that are UNKNOWN_POS).
 double tokenRecall()
          Return the percentage of tokens in 'truth' spans that are also in 'guess' spans (ignoring tokens that are UNKNOWN_POS).
 java.lang.String toString()
           
 java.lang.String toSummary()
          Return a string containing all the summary statistics printed moderately neatly on two lines.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

log

public static org.apache.log4j.Logger log

MAX_STATUS

public static final int MAX_STATUS
Maximum value of a status indicator (e.g. FALSE_POS, FALSE_NEG).

See Also:
Constant Field Values

FALSE_POS

public static final int FALSE_POS
Indicates a false positive span. Specifically, indicates a part of the document that is inside a 'guess' span but not inside a 'truth' span, where the set of truth spans for this area is known to be complete.

See Also:
Constant Field Values

FALSE_NEG

public static final int FALSE_NEG
Indicates a false negative span. Specifically, indicates a part of the document that is inside a 'truth' span but not inside a 'guess' span.

See Also:
Constant Field Values

TRUE_POS

public static final int TRUE_POS
Indicates a true positive span. Specifically, indicates a part of the document that is inside a 'truth' span and also inside a 'guess' span.

See Also:
Constant Field Values

UNKNOWN_POS

public static final int UNKNOWN_POS
Indicates something inside a guess span which may or may not be inside a truth span.

See Also:
Constant Field Values
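
The four status constants above can be read as a decision rule over three facts about a piece of the document: whether it lies in a guess span, in a truth span, and in a region where the truth spans are known to be complete. The following is a minimal sketch of that rule, not the library's implementation. The class TokenStatusSketch, the Status enum, and the classify method are hypothetical names, and the enum stands in for the int constants, whose actual values are not given here.

```java
import java.util.Optional;

// Hypothetical illustration of the status definitions; not part of minorthird.
final class TokenStatusSketch {

    // Stand-in for the int constants documented above (actual values unknown).
    enum Status { TRUE_POS, FALSE_POS, FALSE_NEG, UNKNOWN_POS }

    /**
     * Classify one piece of the document.
     *
     * @param inGuess   true if it lies inside a guess span
     * @param inTruth   true if it lies inside a truth span
     * @param inClosure true if the truth spans for this area are known to be complete
     * @return the status, or empty if it lies in neither a guess nor a truth span
     */
    static Optional<Status> classify(boolean inGuess, boolean inTruth, boolean inClosure) {
        if (inGuess && inTruth) {
            return Optional.of(Status.TRUE_POS);
        }
        if (inTruth) {
            // In a truth span but missed by the guesses.
            return Optional.of(Status.FALSE_NEG);
        }
        if (inGuess) {
            // Guessed but not in any known truth span: only a definite error
            // when the truth spans for this area are complete.
            return Optional.of(inClosure ? Status.FALSE_POS : Status.UNKNOWN_POS);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(classify(true, true, true));
        System.out.println(classify(true, false, true));
        System.out.println(classify(true, false, false));
        System.out.println(classify(false, true, false));
    }
}
```

Under this reading, UNKNOWN_POS arises only when the truth spans may be incomplete for the area in question, as with the three-argument constructor.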
Constructor Detail

SpanDifference

public SpanDifference(SpanDifference[] spanDifferences)
Create an aggregation of the results of several SpanDifference objects.


SpanDifference

public SpanDifference(java.util.Iterator<Span> guess,
                      java.util.Iterator<Span> truth)
Create machinery to analyze the differences between the two sets of spans. It is assumed that the first argument is a complete list of all guess spans and the second argument is a complete list of all truth spans.


SpanDifference

public SpanDifference(java.util.Iterator<Span> guess,
                      java.util.Iterator<Span> truth,
                      java.util.Iterator<Span> closures)
Create machinery to analyze the differences between the two sets of spans. It is assumed that the first argument is a complete list of all guess spans, the second argument is a possibly partial list of the truth spans, and the third argument is the set of spans S for which all truth spans contained by S are known.

Method Detail

differenceIterator

public SpanDifference.Looper differenceIterator()

tokenPrecision

public double tokenPrecision()
Return the percentage of tokens in 'guess' spans that are true positives (ignoring tokens that are UNKNOWN_POS).


tokenRecall

public double tokenRecall()
Return the percentage of tokens in 'truth' spans that are also in 'guess' spans (ignoring tokens that are UNKNOWN_POS).
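
As a rough illustration of the two token-level measures, the sketch below computes them from token counts by status. It is an assumption-laden sketch rather than the library's code: the class TokenMetricsSketch and its methods are hypothetical, the results are returned as fractions in [0, 1] (whether the real methods return a fraction or a value scaled to 100 is not stated here), and returning 0.0 for an empty denominator is a choice made only for this example.

```java
// Hypothetical illustration of token-level precision and recall; not part of minorthird.
final class TokenMetricsSketch {

    // Fraction of guessed tokens with a known status that are true positives.
    // UNKNOWN_POS tokens are ignored simply by not being counted here.
    static double precision(int truePos, int falsePos) {
        int counted = truePos + falsePos;
        return counted == 0 ? 0.0 : (double) truePos / counted; // 0.0 is an assumed convention
    }

    // Fraction of truth tokens that are also covered by a guess span.
    static double recall(int truePos, int falseNeg) {
        int counted = truePos + falseNeg;
        return counted == 0 ? 0.0 : (double) truePos / counted; // 0.0 is an assumed convention
    }

    public static void main(String[] args) {
        // Made-up counts: 6 true positive tokens, 2 false positives,
        // 4 false negatives, plus 3 UNKNOWN_POS tokens that play no role.
        int truePos = 6, falsePos = 2, falseNeg = 4;
        System.out.println("token precision = " + precision(truePos, falsePos));
        System.out.println("token recall    = " + recall(truePos, falseNeg));
    }
}
```

With these made-up counts, precision is 6 / (6 + 2) and recall is 6 / (6 + 4); adding more UNKNOWN_POS tokens would change neither.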


spanPrecision

public double spanPrecision()
Return the percentage of 'guess' spans that are also 'truth' spans, ignoring non-truth spans that are not inside closure spans.


spanRecall

public double spanRecall()
Return the percentage of 'truth' spans that are also 'guess' spans.
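
The span-level measures, including the effect of closure spans on precision, can be sketched as follows. This is a simplified model under stated assumptions, not the library's implementation: the class SpanMetricsSketch is hypothetical, a span is modelled as a two-element list of token offsets [start, end) rather than a minorthird Span, a guess counts as correct only if it matches a truth span exactly, results are fractions in [0, 1], and an empty denominator yields 0.0.

```java
import java.util.List;
import java.util.Set;

// Hypothetical illustration of span-level precision and recall; not part of minorthird.
final class SpanMetricsSketch {

    // True if span lies entirely within closure; both are [start, end) offset pairs.
    static boolean inside(List<Integer> span, List<Integer> closure) {
        return closure.get(0) <= span.get(0) && span.get(1) <= closure.get(1);
    }

    // Fraction of guess spans that are truth spans, skipping guesses that are
    // neither truth spans nor inside any closure (their status is unknown).
    static double spanPrecision(Set<List<Integer>> guess,
                                Set<List<Integer>> truth,
                                Set<List<Integer>> closures) {
        int truePos = 0;
        int falsePos = 0;
        for (List<Integer> g : guess) {
            if (truth.contains(g)) {
                truePos++;
            } else if (closures.stream().anyMatch(c -> inside(g, c))) {
                falsePos++;
            }
        }
        int counted = truePos + falsePos;
        return counted == 0 ? 0.0 : (double) truePos / counted; // assumed convention
    }

    // Fraction of truth spans that also appear among the guess spans.
    static double spanRecall(Set<List<Integer>> guess, Set<List<Integer>> truth) {
        if (truth.isEmpty()) {
            return 0.0; // assumed convention
        }
        long found = truth.stream().filter(guess::contains).count();
        return (double) found / truth.size();
    }

    public static void main(String[] args) {
        Set<List<Integer>> guess = Set.of(List.of(0, 2), List.of(5, 7), List.of(10, 12));
        Set<List<Integer>> truth = Set.of(List.of(0, 2), List.of(20, 22));
        Set<List<Integer>> closures = Set.of(List.of(0, 8));
        System.out.println("span precision = " + spanPrecision(guess, truth, closures));
        System.out.println("span recall    = " + spanRecall(guess, truth));
    }
}
```

In the example data, the guess [10, 12) is outside the only closure [0, 8), so it is left out of the precision calculation instead of being counted as a false positive.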


toString

public java.lang.String toString()
Overrides:
toString in class java.lang.Object

toSummary

public java.lang.String toSummary()
Return a string containing all the summary statistics printed moderately neatly on two lines.


main

public static void main(java.lang.String[] args)