Module datarush.analytics
Class NaiveBayesLearner
java.lang.Object
com.pervasive.datarush.operators.AbstractLogicalOperator
com.pervasive.datarush.operators.CompositeOperator
com.pervasive.datarush.analytics.naivebayes.learner.NaiveBayesLearner
- All Implemented Interfaces:
LogicalOperator
Operator responsible for building a Naive Bayes PMML model from input data.
The base algorithm used is specified here, with the following differences:
- Provides the ability to predict based on numerical data. For numerical data, we compute probability based on the assumption of a Gaussian distribution.
- We use Laplace smoothing in place of the "threshold" parameter.
- We provide an option to count missing values. If selected, missing values are treated like any other single distinct value. Probability is calculated in terms of the ratio of missing to non-missing.
- Calculation is performed in terms of log-likelihood rather than likelihood.
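The differences listed above can be illustrated with a minimal, self-contained sketch. It is not the operator's implementation: the class and method names are hypothetical, and add-one smoothing is assumed as the form of Laplace smoothing, since the exact smoothing constant is not stated here.

```java
// Illustrative sketch only; NOT the DataRush implementation.
// All names below are hypothetical.
final class NaiveBayesScoringSketch {

    // Laplace-smoothed log-probability of a categorical value given a class.
    // count:          rows of this class that carry the value
    // classCount:     total rows of this class
    // distinctValues: number of distinct values of the feature
    //                 (one more if missing values are counted as a value)
    // Assumption: add-one smoothing.
    static double logCategorical(long count, long classCount, int distinctValues) {
        return Math.log((count + 1.0) / (classCount + distinctValues));
    }

    // Log of the Gaussian density with the given mean and variance at x,
    // used for numerical features.
    static double logGaussian(double x, double mean, double variance) {
        double diff = x - mean;
        return -0.5 * Math.log(2.0 * Math.PI * variance)
                - (diff * diff) / (2.0 * variance);
    }

    // Log-likelihood score of one class: the log prior plus the sum of the
    // per-feature log terms. Summing logs avoids the underflow that
    // multiplying many small probabilities would cause.
    static double score(double logPrior, double... featureLogTerms) {
        double total = logPrior;
        for (double term : featureLogTerms) {
            total += term;
        }
        return total;
    }
}
```

With smoothing, a value never seen for a class still receives a finite log-probability, so a single unseen value does not rule the class out. The predicted class is the one with the highest score.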
-
Constructor Summary
Constructors
NaiveBayesLearner() - The default constructor.
NaiveBayesLearner(String targetColumn) - Creates a new instance of NaiveBayesLearner, specifying the minimal set of required parameters.
-
Method Summary
Methods
protected void compose - Compose the body of this operator.
getInput() - The input data.
getLearningColumns() - Returns the list of columns to be used to predict the output value.
getModel() - Returns the output PMML model port.
getTargetColumn() - Gets the column to be predicted.
final void setLearningColumns(List<String> learningColumns) - Sets the list of columns to be used to predict the output value.
void setTargetColumn(String targetColumn) - Sets the column to be predicted.
Methods inherited from class com.pervasive.datarush.operators.AbstractLogicalOperator
disableParallelism, getInputPorts, getOutputPorts, newInput, newInput, newOutput, newRecordInput, newRecordInput, newRecordOutput, notifyError
-
Constructor Details
-
NaiveBayesLearner
public NaiveBayesLearner()
The default constructor. Prior to graph compilation the following required properties must be specified or an exception will be raised:
-
NaiveBayesLearner
Creates a new instance of NaiveBayesLearner, specifying the minimal set of required parameters.
- Parameters:
targetColumn - the target column to predict. Must be of type StringValued.
-
-
Method Details
-
getInput
The input data. String fields are assumed to be categorical. Double fields are assumed to be numerical. All other fields are ignored.
- Returns:
- the input data
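The field handling described for the input can be sketched as a small classification rule. This is a hedged illustration only: plain Java classes stand in for the operator's field types, and the `FieldRoleSketch` class and `Role` enum are hypothetical names, not part of the library.

```java
// Illustrative sketch only; plain Java classes stand in for input field types.
// FieldRoleSketch and Role are hypothetical names.
final class FieldRoleSketch {

    enum Role { CATEGORICAL, NUMERICAL, IGNORED }

    // String fields are categorical, double fields are numerical,
    // and every other field type is ignored by the learner.
    static Role roleOf(Class<?> fieldType) {
        if (fieldType == String.class) {
            return Role.CATEGORICAL;
        }
        if (fieldType == Double.class || fieldType == double.class) {
            return Role.NUMERICAL;
        }
        return Role.IGNORED;
    }
}
```

Under this rule an integer-typed field, for example, would contribute nothing to the model unless it is converted to a string or a double first.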
-
getModel
Returns the output PMML model port.
- Returns:
- the output PMML model port.
-
getLearningColumns
Returns the list of columns to be used to predict the output value. The default, an empty list, means "everything but targetColumn".
- Returns:
- The list of columns to be used to predict the output value.
-
setLearningColumns
Sets the list of columns to be used to predict the output value. The default, an empty list, means "everything but targetColumn".
- Parameters:
learningColumns - The list of columns to be used to predict the output value.
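The empty-list default can be sketched as follows. This is an assumption-labeled illustration of the documented rule, not the operator's code: `LearningColumnsSketch` and `effectiveColumns` are hypothetical names, and the input column names are passed in as a plain list.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; names are hypothetical.
final class LearningColumnsSketch {

    // Resolves the columns actually used for learning:
    // an explicit, non-empty list is used as given; an empty list
    // falls back to every input column except the target column.
    static List<String> effectiveColumns(List<String> inputColumns,
                                         List<String> learningColumns,
                                         String targetColumn) {
        if (!learningColumns.isEmpty()) {
            return new ArrayList<>(learningColumns);
        }
        List<String> result = new ArrayList<>(inputColumns);
        result.remove(targetColumn);
        return result;
    }
}
```

Leaving the list empty is therefore the convenient choice when every non-target column should be considered. Setting it explicitly narrows the model to the named columns.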
-
setTargetColumn
Sets the column to be predicted. Must be of type string.
- Parameters:
targetColumn - the column to be predicted
-
getTargetColumn
Gets the column to be predicted. Must be of type string.
- Returns:
- the column to be predicted
-
compose
Description copied from class: CompositeOperator
Compose the body of this operator. Implementations should do the following:
- Perform any validation of configuration, input types, etc.
- Instantiate and configure sub-operators, adding them to the provided context via the method OperatorComposable.add(O)
- Create necessary connections via the method OperatorComposable.connect(P, P). This includes connections from the composite's input ports to sub-operators, connections between sub-operators, and connections from sub-operators' output ports to the composite's output ports.
- Specified by:
compose in class CompositeOperator
- Parameters:
ctx - the context
-