net.sf.textkit4j.matching
Class NGrams

java.lang.Object
  extended by net.sf.textkit4j.matching.NGrams
All Implemented Interfaces:
java.io.Serializable

public class NGrams
extends java.lang.Object
implements java.io.Serializable

(Plagiarized from Wikipedia) An n-gram is a sub-sequence of n items from a given sequence. N-grams are used in various areas of statistical natural language processing and genetic sequence analysis. The NGrams class is a map (table) of counts keyed by n-grams, where each n-gram is a sequence of characters or words. Such a table can serve as a simple model for classification, clustering, and grouping applications.
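To make the idea of a count table keyed by n-grams concrete, the following is a minimal sketch of counting character n-grams. It is not the NGrams implementation: the class name NGramCountsSketch and the method charNGrams are hypothetical, and the sketch assumes Java 8 or later for Map.merge.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not part of net.sf.textkit4j.
final class NGramCountsSketch {

    /** Counts every contiguous run of n characters in the given text. */
    static Map<String, Integer> charNGrams(String text, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        Map<String, Integer> counts = new HashMap<>();
        // Slide a window of width n across the text, one character at a time.
        for (int i = 0; i + n <= text.length(); i++) {
            String gram = text.substring(i, i + n);
            // Insert 1 for a new n-gram, otherwise add 1 to the stored count.
            counts.merge(gram, 1, Integer::sum);
        }
        return counts;
    }
}
```

For the text "banana" with n = 2, the windows are ba, an, na, an, na, so the table holds ba=1, an=2, na=2. A text shorter than n produces an empty table.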

Author:
rich
See Also:
Serialized Form

Method Summary
 java.lang.Double similiarity(NGrams other)
          Calculates the similarity between this NGrams and the other NGrams as the cosine of the angle between their vector representations.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

similiarity

public java.lang.Double similiarity(NGrams other)
Calculates the similarity between this NGrams and the other NGrams as the cosine of the angle between their vector representations.
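The cosine of the angle between two count vectors a and b is (a . b) / (|a| |b|), where each distinct n-gram is one dimension and its count is the coordinate. The sketch below shows one way this could be computed over two plain count maps. It is an assumption about the general technique, not the actual body of similiarity: the class CosineSketch and its methods are hypothetical, and returning 0.0 for an empty table is a choice made here, not documented behavior.

```java
import java.util.Map;

// Hypothetical illustration only; not part of net.sf.textkit4j.
final class CosineSketch {

    /** Cosine of the angle between two n-gram count vectors. */
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        // Dot product: only n-grams present in both tables contribute.
        double dot = 0.0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            Integer other = b.get(e.getKey());
            if (other != null) {
                dot += (double) e.getValue() * other;
            }
        }
        double normA = norm(a);
        double normB = norm(b);
        // Assumption: an empty table has no direction, so report 0.0.
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (normA * normB);
    }

    /** Euclidean length of a count vector. */
    private static double norm(Map<String, Integer> counts) {
        double sumOfSquares = 0.0;
        for (int c : counts.values()) {
            sumOfSquares += (double) c * c;
        }
        return Math.sqrt(sumOfSquares);
    }
}
```

Because counts are never negative, the result lies between 0 and 1: tables with no n-grams in common score 0, and tables whose counts are proportional to each other score 1 (up to floating-point rounding).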



Copyright © 2009 All Eight, LLC. All Rights Reserved.