net.sf.textkit4j.matching
Class NGramFactory

java.lang.Object
  extended by net.sf.textkit4j.matching.NGramFactory
Direct Known Subclasses:
CharacterNGramFactory, WordNGramFactory

public abstract class NGramFactory
extends java.lang.Object

Generates a map of n-grams and their counts for the supplied text. Depending on how the factory is configured, the supplied text can be lower-cased, runs of white space can be collapsed into a single space, and punctuation can be filtered out.
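The following is a minimal, self-contained sketch of the idea described above. It is not this library's implementation. The class NGramCountSketch and its methods normalize and countCharacterNGrams are hypothetical names introduced for illustration only, and the order in which the three normalization steps are applied is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration class; not part of net.sf.textkit4j.
public class NGramCountSketch {

    // Applies the three optional normalization steps.
    // The order (lower-case, strip punctuation, collapse white space) is an assumption.
    static String normalize(String text, boolean lowerCase,
                            boolean collapseWhiteSpace, boolean stripPunctuation) {
        String result = text;
        if (lowerCase) {
            result = result.toLowerCase(Locale.ROOT);
        }
        if (stripPunctuation) {
            // \p{Punct} matches ASCII punctuation characters only.
            result = result.replaceAll("\\p{Punct}", "");
        }
        if (collapseWhiteSpace) {
            // Replace each run of white space with a single space.
            result = result.replaceAll("\\s+", " ");
        }
        return result;
    }

    // Counts every character n-gram of length n, in order of first appearance.
    static Map<String, Integer> countCharacterNGrams(String text, int n) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (int i = 0; i + n <= text.length(); i++) {
            counts.merge(text.substring(i, i + n), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        String normalized = normalize("To be,  or not to be!", true, true, true);
        System.out.println(normalized); // prints "to be or not to be"
        System.out.println(countCharacterNGrams(normalized, 2));
    }
}
```

Stripping punctuation before collapsing white space means that a punctuation mark surrounded by spaces does not leave a double space behind. Whether the library uses the same order is not stated in this page.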

Author:
rich

Constructor Summary
NGramFactory()
           
 
Method Summary
 NGrams bigrams(java.lang.String text)
           
 boolean isCollapseWhiteSpace()
           
 boolean isLowerCase()
           
 boolean isStripPunctuation()
           
 void setCollapseWhiteSpace(boolean collapseWhiteSpace)
           
 void setLowerCase(boolean lowerCase)
           
 void setStripPunctuation(boolean stripPunctuation)
           
 NGrams trigrams(java.lang.String text)
           
 NGrams unigrams(java.lang.String text)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

NGramFactory

public NGramFactory()

Method Detail

unigrams

public NGrams unigrams(java.lang.String text)
Parameters:
text - the text to extract unigrams from
Returns:
the unigrams found in the text and their counts

bigrams

public NGrams bigrams(java.lang.String text)
Parameters:
text - the text to extract bigrams from
Returns:
the bigrams found in the text and their counts

trigrams

public NGrams trigrams(java.lang.String text)
Parameters:
text - the text to extract trigrams from
Returns:
the trigrams found in the text and their counts
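As a rough illustration of how unigrams, bigrams, and trigrams relate, the sketch below treats them as the n = 1, 2, and 3 cases of one counting routine over words. It is an assumption-laden example and not the library's code: the class WordNGramSketch and its count method are hypothetical, words are assumed to be separated by white space, and a plain Map stands in for the NGrams return type, whose API is not documented on this page.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; not part of net.sf.textkit4j.
public class WordNGramSketch {

    // Counts word n-grams of length n, in order of first appearance.
    // Assumes words are separated by white space.
    static Map<String, Integer> count(String text, int n) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return counts; // no words, so no n-grams
        }
        List<String> words = Arrays.asList(trimmed.split("\\s+"));
        for (int i = 0; i + n <= words.size(); i++) {
            String gram = String.join(" ", words.subList(i, i + n));
            counts.merge(gram, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        String text = "the cat saw the cat";
        System.out.println(count(text, 1)); // prints {the=2, cat=2, saw=1}
        System.out.println(count(text, 2)); // prints {the cat=2, cat saw=1, saw the=1}
        System.out.println(count(text, 3)); // prints {the cat saw=1, cat saw the=1, saw the cat=1}
    }
}
```

A text with fewer than n words yields an empty map in this sketch. How the library handles that case is not stated here.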

isLowerCase

public boolean isLowerCase()

setLowerCase

public void setLowerCase(boolean lowerCase)

isCollapseWhiteSpace

public boolean isCollapseWhiteSpace()

setCollapseWhiteSpace

public void setCollapseWhiteSpace(boolean collapseWhiteSpace)

isStripPunctuation

public boolean isStripPunctuation()

setStripPunctuation

public void setStripPunctuation(boolean stripPunctuation)


Copyright © 2009 All Eight, LLC. All Rights Reserved.