java.lang.Object
com.pervasive.datarush.analytics.text.NGramMap
com.pervasive.datarush.analytics.text.WordMap

public class WordMap extends NGramMap
Implementation of a word frequency model.
  • Constructor Details

    • WordMap

      public WordMap()
      Creates an empty word map.
    • WordMap

      public WordMap(Map<String,Integer> map)
      Creates a word-to-frequency map from the given mappings.
      Parameters:
      map - the mappings to use
    • WordMap

      public WordMap(Map<String,Integer> map, int textSize)
      Creates a word-to-frequency map from the given mappings and the size of the original text.
      Parameters:
      map - the mappings to use
      textSize - the number of elements in the original text
    • WordMap

      public WordMap(WordMap map)
      Creates a copy of a word-to-frequency map.
      Parameters:
      map - the word map to copy
    • WordMap

      public WordMap(NGramMap map)
      Converts a valid NGramMap into a word map. A valid map has an N of one. If the NGramMap is invalid, the WordMap remains empty.
      Parameters:
      map - the n-gram map to convert
  • Method Details

    • getWordList

      public List<String> getWordList()
      Get an ordered list of the words contained in the map.
      Returns:
      the list of words
    • getFrequency

      public int getFrequency(String word)
      Get the absolute frequency of a word in the map.
      Parameters:
      word - the word to get the frequency of
      Returns:
      the absolute frequency of the word
    • getProbability

      public double getProbability(String word)
      Get the relative frequency of a word in the map. If OrigTextSize has not been set, the relative frequency is calculated from the current map.
      Parameters:
      word - the word to get the relative frequency of
      Returns:
      the relative frequency of the word
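
The relative-frequency rule described for getProbability can be illustrated with a minimal, self-contained sketch. It is not the DataRush implementation: the class RelativeFrequencySketch, the UNSET sentinel, and the choice to use the sum of all counts in the map as the fallback denominator are assumptions made for illustration only.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for illustration; NOT the DataRush WordMap class.
final class RelativeFrequencySketch {
    // Assumed sentinel meaning "original text size has not been set".
    static final int UNSET = -1;

    // Relative frequency = absolute frequency / total number of elements.
    // Assumption: when the original text size is unset, the total is the
    // sum of all counts currently held in the map.
    static double probability(Map<String, Integer> counts, int origTextSize, String word) {
        int frequency = counts.getOrDefault(word, 0);
        int total = origTextSize;
        if (total == UNSET) {
            total = 0;
            for (int count : counts.values()) {
                total += count;
            }
        }
        // Avoid dividing by zero for an empty map.
        return total == 0 ? 0.0 : (double) frequency / total;
    }
}
```

With counts of 3 for one word and 1 for another, the sketch yields 3/4 for the first word when no text size is given, and 3/10 when a text size of 10 is supplied.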
    • increaseFreq

      public boolean increaseFreq(String word)
      Adds a word to the map or increases the frequency if it is already present.
      Parameters:
      word - element to increase the frequency of in the map
      Returns:
      true if word is valid and could be incremented
    • decreaseFreq

      public boolean decreaseFreq(String word)
      Decreases the frequency of a word if its absolute frequency is greater than one; otherwise removes the word from the map.
      Parameters:
      word - element to decrease the frequency of in the map
      Returns:
      true if word is valid and could be decremented
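
The increment and decrement behavior described for increaseFreq and decreaseFreq can be sketched as follows. This is a hypothetical stand-in built on java.util.HashMap, not the DataRush class: the name FrequencyCounterSketch, the rule that a null or empty word is invalid, and returning false when decrementing an absent word are all assumptions, since the documentation above does not define what makes a word valid.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for illustration; NOT the DataRush WordMap class.
final class FrequencyCounterSketch {
    private final Map<String, Integer> counts = new HashMap<>();

    // Assumption: a word is valid when it is non-null and non-empty.
    private static boolean isValid(String word) {
        return word != null && !word.isEmpty();
    }

    // Adds the word with frequency 1, or increments an existing frequency.
    boolean increaseFreq(String word) {
        if (!isValid(word)) {
            return false;
        }
        counts.merge(word, 1, Integer::sum);
        return true;
    }

    // Decrements a frequency greater than one; removes the word when its
    // frequency is exactly one. Assumption: an absent word returns false.
    boolean decreaseFreq(String word) {
        if (!isValid(word) || !counts.containsKey(word)) {
            return false;
        }
        int current = counts.get(word);
        if (current > 1) {
            counts.put(word, current - 1);
        } else {
            counts.remove(word);
        }
        return true;
    }

    // Absolute frequency; 0 for a word that is not in the map (assumption).
    int getFrequency(String word) {
        return counts.getOrDefault(word, 0);
    }
}
```

In this sketch, two increments followed by two decrements leave the word absent from the map, and a third decrement reports failure.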
    • removeWord

      public int removeWord(String word)
      Removes a word from the map.
      Parameters:
      word - element to remove from the map
      Returns:
      the frequency previously associated with the word or null
    • getStringMap

      public Map<String,Integer> getStringMap()
      Get a copy of the map that backs this object.
      Returns:
      map of Strings to Integers
    • toString

      public String toString()
      Overrides:
      toString in class NGramMap
    • getEncoder

      public static TokenEncoder getEncoder()
    • getDecoder

      public static TokenDecoder getDecoder()