FilterTokenizer

edu.cmu.minorthird.text
Class FilterTokenizer

java.lang.Object
  edu.cmu.minorthird.text.CompoundTokenizer
      edu.cmu.minorthird.text.FilterTokenizer

All Implemented Interfaces: Tokenizer

public class FilterTokenizer
extends CompoundTokenizer

This implementation of the Tokenizer interface is used when filtering a text base by a specified span type. It is a trivial tokenizer in the sense that it takes a document from the new text base, maps it to the corresponding document in the old (parent) text base, and copies over the existing tokens. If no mapping is found (i.e., the document being added is not in the parent text base), the parent tokenizer is used instead.
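
The copy-or-fall-back behavior described above can be illustrated with a small, self-contained sketch. It does not use the minorthird API: the class `FilterTokenizerSketch`, the `PARENT_TOKENS` map, the document ids, and the whitespace-based `parentTokenize` helper are all hypothetical stand-ins, and the real class resolves the mapping through its `TextBaseManager` rather than a simple id lookup.

```java
import java.util.HashMap;
import java.util.Map;

public class FilterTokenizerSketch {

    // Hypothetical stand-in for the parent text base: document id -> tokens
    // that were already produced when the parent text base was built.
    static final Map<String, String[]> PARENT_TOKENS = new HashMap<>();

    // Hypothetical stand-in for the parent tokenizer: splits on whitespace.
    static String[] parentTokenize(String text) {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    // Copy the parent's tokens when the document maps to the parent text base;
    // otherwise fall back to tokenizing the raw text with the parent tokenizer.
    static String[] splitIntoTokens(String docId, String text) {
        String[] existing = PARENT_TOKENS.get(docId);
        if (existing != null) {
            return existing.clone(); // mapping found: reuse the existing tokens
        }
        return parentTokenize(text); // no mapping: delegate to the parent
    }

    public static void main(String[] args) {
        PARENT_TOKENS.put("doc1", new String[] {"Hello", ",", "world"});
        // Known document: tokens are copied, the text is not re-tokenized.
        System.out.println(String.join("|", splitIntoTokens("doc1", "Hello, world")));
        // Unknown document: the parent tokenizer handles it.
        System.out.println(String.join("|", splitIntoTokens("doc2", "not in parent")));
    }
}
```

Reusing the parent's tokens keeps token boundaries in the filtered text base consistent with those in the parent, which is presumably why the class copies tokens instead of tokenizing again.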

Author: Quinten Mercer

Field Summary

Fields inherited from class edu.cmu.minorthird.text.CompoundTokenizer
`parentTokenizer`

Constructor Summary
`FilterTokenizer(TextBaseManager tbMan, java.lang.String levelName, java.lang.String parentLevelName)`

Method Summary
`TextToken[]`	`splitIntoTokens(Document document)` Tokenize a document.
`java.lang.String[]`	`splitIntoTokens(java.lang.String string)` Tokenize a string.

Methods inherited from class edu.cmu.minorthird.text.CompoundTokenizer
`getParent`

Methods inherited from class java.lang.Object
`clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`

Constructor Detail

FilterTokenizer

public FilterTokenizer(TextBaseManager tbMan,
                       java.lang.String levelName,
                       java.lang.String parentLevelName)

Method Detail

splitIntoTokens

public java.lang.String[] splitIntoTokens(java.lang.String string)

Tokenize a string.

Specified by: splitIntoTokens in interface Tokenizer
Specified by: splitIntoTokens in class CompoundTokenizer

splitIntoTokens

public TextToken[] splitIntoTokens(Document document)

Tokenize a document.

Specified by: splitIntoTokens in interface Tokenizer
Specified by: splitIntoTokens in class CompoundTokenizer
