- java.lang.Object
-
- com.pervasive.datarush.analytics.text.NGramMap
-
- com.pervasive.datarush.analytics.text.WordMap
-
public class WordMap extends NGramMap
Implementation of a word frequency model.
-
-
Constructor Summary
Constructor: Description
- WordMap(): Default constructor of an empty word map.
- WordMap(NGramMap map): Convert a valid NGramMap into a word map.
- WordMap(WordMap map): Copy a word-to-frequency map.
- WordMap(Map<String,Integer> map): Create a word-to-frequency map.
- WordMap(Map<String,Integer> map, int textSize): Create a word-to-frequency map.
-
Method Summary
Modifier and Type, Method: Description
- boolean decreaseFreq(String word): Removes a word from the map, or decreases its frequency if the absolute frequency is greater than one.
- static TokenDecoder getDecoder()
- static TokenEncoder getEncoder()
- int getFrequency(String word): Get the absolute frequency of a word in the map.
- double getProbability(String word): Get the relative frequency of a word in the map.
- Map<String,Integer> getStringMap(): Get a copy of the map that backs this object.
- List<String> getWordList(): Get an ordered list of the words contained in the map.
- boolean increaseFreq(String word): Adds a word to the map, or increases its frequency if it is already present.
- int removeWord(String word): Removes a word from the map.
- String toString()
Methods inherited from class com.pervasive.datarush.analytics.text.NGramMap
calcOrigTextSize, decreaseFreq, equals, filterByThreshold, filterByTotal, getFrequency, getFrequencyList, getMap, getN, getNGramList, getOrigTextSize, getProbability, getProbabilityList, hashCode, increaseFreq, iterator, removeNGram, setOrigTextSize
-
-
-
-
Constructor Detail
-
WordMap
public WordMap()
Default constructor of an empty word map.
-
WordMap
public WordMap(Map<String,Integer> map)
Create a word-to-frequency map.
Parameters:
map - the mappings to use
-
WordMap
public WordMap(Map<String,Integer> map, int textSize)
Create a word-to-frequency map.
Parameters:
map - the mappings to use
textSize - the number of elements in the original text
-
WordMap
public WordMap(WordMap map)
Copy a word-to-frequency map.
Parameters:
map - the word map to copy
-
WordMap
public WordMap(NGramMap map)
Convert a valid NGramMap into a word map. A valid map has an N of one. If the NGramMap is invalid, the WordMap remains empty.
Parameters:
map - the n-gram map to convert
-
-
Method Detail
-
getWordList
public List<String> getWordList()
Get an ordered list of the words contained in the map.
Returns:
the list of words
-
getFrequency
public int getFrequency(String word)
Get the absolute frequency of a word in the map.
Parameters:
word - the word to get the frequency of
Returns:
the absolute frequency of the word
-
getProbability
public double getProbability(String word)
Get the relative frequency of a word in the map. If OrigTextSize has not been set, the relative frequency is calculated from the current contents of the map.
Parameters:
word - the word to get the relative frequency of
Returns:
the relative frequency of the word
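A minimal sketch of the relative-frequency calculation described for getProbability, written against a plain java.util.Map so that it runs without the library. The class RelativeFrequencySketch, its probability method, and the convention that a non-positive textSize means "not set" are assumptions made for illustration; they are not taken from the WordMap implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for illustration; not the DataRush WordMap class.
class RelativeFrequencySketch {

    // Relative frequency = absolute frequency / size of the original text.
    // Assumption: textSize <= 0 means "not set", in which case the size is
    // derived from the current map by summing all absolute frequencies.
    static double probability(Map<String, Integer> counts, int textSize, String word) {
        int total = textSize;
        if (total <= 0) {
            total = 0;
            for (int count : counts.values()) {
                total += count;
            }
        }
        if (total == 0) {
            return 0.0; // empty map: avoid dividing by zero
        }
        return counts.getOrDefault(word, 0) / (double) total;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        counts.put("data", 3);
        counts.put("rush", 1);

        // No text size given: 3 of the 4 counted words are "data".
        System.out.println(probability(counts, 0, "data")); // prints 0.75
        // Original text had 8 elements: 3 of 8.
        System.out.println(probability(counts, 8, "data")); // prints 0.375
    }
}
```

The two calls show why the text size matters: when the map has been filtered or built from a subset of the text, the counts in the map no longer sum to the size of the original text.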
-
increaseFreq
public boolean increaseFreq(String word)
Adds a word to the map, or increases its frequency if it is already present.
Parameters:
word - element to increase the frequency of in the map
Returns:
true if word is valid and could be incremented
-
decreaseFreq
public boolean decreaseFreq(String word)
Removes a word from the map, or decreases its frequency if the absolute frequency is greater than one.
Parameters:
word - element to decrease the frequency of in the map
Returns:
true if word is valid and could be decremented
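The increaseFreq and decreaseFreq behavior described above can be sketched with a plain HashMap. The class FrequencyUpdateSketch is a hypothetical stand-in, not the library class. Two details are assumptions, since this page does not define them: a null or empty word is treated as invalid, and a word that is absent from the map cannot be decremented.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for illustration; not the DataRush WordMap class.
class FrequencyUpdateSketch {
    private final Map<String, Integer> counts = new HashMap<>();

    // Adds the word with frequency 1, or adds 1 to its existing frequency.
    // Assumption: null and empty strings count as invalid words.
    boolean increaseFreq(String word) {
        if (word == null || word.isEmpty()) {
            return false;
        }
        counts.merge(word, 1, Integer::sum);
        return true;
    }

    // Subtracts 1 when the frequency is greater than one; otherwise removes
    // the word entirely. Assumption: an absent word returns false.
    boolean decreaseFreq(String word) {
        Integer current = counts.get(word);
        if (current == null) {
            return false;
        }
        if (current > 1) {
            counts.put(word, current - 1);
        } else {
            counts.remove(word);
        }
        return true;
    }

    // Absolute frequency; 0 for a word that is not in the map (assumption).
    int getFrequency(String word) {
        return counts.getOrDefault(word, 0);
    }

    public static void main(String[] args) {
        FrequencyUpdateSketch map = new FrequencyUpdateSketch();
        map.increaseFreq("data");
        map.increaseFreq("data");
        map.decreaseFreq("data");
        System.out.println(map.getFrequency("data")); // prints 1
        map.decreaseFreq("data");
        System.out.println(map.getFrequency("data")); // prints 0
    }
}
```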
-
removeWord
public int removeWord(String word)
Removes a word from the map.
Parameters:
word - element to remove from the map
Returns:
the frequency previously associated with the word or null
-
getStringMap
public Map<String,Integer> getStringMap()
Get a copy of the map that backs this object.
Returns:
map of Strings to Integers
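Because getStringMap is documented as returning a copy, changes made to the returned map should not be visible through the word map itself. The sketch below illustrates that contract with a hypothetical stand-in class, DefensiveCopySketch. It assumes a shallow HashMap copy, which may differ from how the library produces its copy.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for illustration; not the DataRush WordMap class.
class DefensiveCopySketch {
    private final Map<String, Integer> counts = new HashMap<>();

    void put(String word, int frequency) {
        counts.put(word, frequency);
    }

    int getFrequency(String word) {
        return counts.getOrDefault(word, 0);
    }

    // Returns a new map with the same entries, so callers cannot modify
    // the backing map through the returned reference.
    Map<String, Integer> getStringMap() {
        return new HashMap<>(counts);
    }

    public static void main(String[] args) {
        DefensiveCopySketch words = new DefensiveCopySketch();
        words.put("data", 2);

        Map<String, Integer> copy = words.getStringMap();
        copy.put("data", 99);  // modifies only the copy
        copy.put("rush", 1);

        System.out.println(words.getFrequency("data")); // prints 2
        System.out.println(words.getFrequency("rush")); // prints 0
    }
}
```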
-
getEncoder
public static TokenEncoder getEncoder()
-
getDecoder
public static TokenDecoder getDecoder()
-
-