Class NaiveBayesClassifier


  • public class NaiveBayesClassifier
    extends Object

    Add class labels, features with any number of states, and prior and conditional probabilities. Then compute the probability of each class label given the observed features.

    Example

    Imagine you want to use the classifier to quickly diagnose if a patient has either the flu or measles, and you are able to observe fever and (red) spots on the skin of the patient. This can be modelled as follows:

    Class labels:

    • Flu
    • Measles
    • No disease

    Features:

    • Fever (yes, no)
    • Red spots (yes, no)

    This can be programmed and tested as follows.

    try {
            NaiveBayesClassifier c = new NaiveBayesClassifier();
            c.addClassLabel("Flu");
            c.addClassLabel("Measles");
            c.addClassLabel("No disease");
    
            c.addFeature("Fever");
            c.addState("Fever", "yes");
            c.addState("Fever", "no");
    
            c.addFeature("Red spots");
            c.addState("Red spots", "yes");
            c.addState("Red spots", "no");
    
            c.setPriorProbability("Flu", 0.06d);
            c.setPriorProbability("Measles", 0.04d);
            c.setPriorProbability("No disease", 0.90d);
        
            // Probability of observing Fever=yes given different diseases 
            c.setConditionalProbability("Fever", "yes", "Flu", 0.90d);
            c.setConditionalProbability("Fever", "yes", "Measles", 0.90d);
            c.setConditionalProbability("Fever", "yes", "No disease", 0.01d);
            
            // Probability of observing Fever=no given different diseases 
            c.setConditionalProbability("Fever", "no", "Flu", 0.10d);
            c.setConditionalProbability("Fever", "no", "Measles", 0.10d);
            c.setConditionalProbability("Fever", "no", "No disease", 0.99d);
    
            // Probability of observing Red spots=yes given different diseases
            c.setConditionalProbability("Red spots", "yes", "Flu", 0.05d);
            c.setConditionalProbability("Red spots", "yes", "Measles", 0.90d);
            c.setConditionalProbability("Red spots", "yes", "No disease", 0.01d);
        
            // Probability of observing Red spots=no given different diseases 
            c.setConditionalProbability("Red spots", "no", "Flu", 0.95d);
            c.setConditionalProbability("Red spots", "no", "Measles", 0.10d);
            c.setConditionalProbability("Red spots", "no", "No disease", 0.99d);
    
            // Inject observations as a Map (requires java.util.HashMap and java.util.Map)
            Map<String, String> observations = new HashMap<>();
            observations.put("Fever", "yes");
            observations.put("Red spots", "no");
            Double[] result = c.classify(observations);
            
            // Present results
            System.out.println("Results (Fever=yes, Red spots=no)");
            System.out.println("Probability of Flu: " + result[0]);
            System.out.println("Probability of Measles: " + result[1]);
            System.out.println("Probability of No disease: " + result[2]);
    } catch (Exception e) {
            // Handle the error, e.g. an incomplete or inconsistent classifier definition
    }
     

    Output from this code:

      Results (Fever=yes, Red spots=no)
      Probability of Flu: 0.8039492242595203
      Probability of Measles: 0.056417489421720736
      Probability of No disease: 0.13963328631875882
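
The numbers above can be checked by hand. The sketch below is not part of the library; the class name WorkedExample is hypothetical. It multiplies each prior by the conditional probabilities of the two observations (Fever=yes, Red spots=no) and then divides by the sum. The unnormalized products are 0.0513, 0.0036 and 0.00891, which sum to 0.06381.

```java
import java.util.Arrays;

// Hypothetical helper, not part of NaiveBayesClassifier: reproduces the example by hand.
final class WorkedExample {

    // Returns the posteriors in the order Flu, Measles, No disease.
    static double[] posteriors() {
        // prior * P(Fever=yes | label) * P(Red spots=no | label)
        double flu = 0.06 * 0.90 * 0.95;
        double measles = 0.04 * 0.90 * 0.10;
        double noDisease = 0.90 * 0.01 * 0.99;

        // Normalize so that the three values sum to 1.
        double evidence = flu + measles + noDisease;
        return new double[] {flu / evidence, measles / evidence, noDisease / evidence};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(posteriors()));
    }
}
```

Up to floating-point rounding, the three values agree with the output listed above.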
     
    Version:
    1.0
    Author:
    Lars Moltsen
    • Constructor Detail

      • NaiveBayesClassifier

        public NaiveBayesClassifier()
    • Method Detail

      • setPriorProbability

        public void setPriorProbability(String classLabel,
                                        double priorProbability)
                                 throws DataStructureException
        Set the prior (default) probability of a class label. The prior probability is what you get if you use the classify method with no observed features.
        Parameters:
        classLabel - The name of the target class label.
        priorProbability - The prior probability of the class label.
        Throws:
        DataStructureException
      • setConditionalProbability

        public void setConditionalProbability(String featureName,
                                              String ofState,
                                              String givenLabel,
                                              double conditionalProbability)
                                       throws DataStructureException
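        Set the conditional probability of a feature state given a class label. The conditional probability is what you would expect for this feature if the given class label were known.
        Parameters:
        featureName - The feature that the state belongs to.
        ofState - The state whose probability is being set.
        givenLabel - The class label that the probability is conditioned on.
        conditionalProbability - The probability of observing ofState when the class label is givenLabel.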
        Throws:
        DataStructureException
      • getClassLabels

        public String[] getClassLabels()
        Returns all class labels.
        Returns:
        Class labels as an array of String.
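
In the class example, result[0] to result[2] follow the order in which the labels were added. Assuming getClassLabels() returns the labels in that same order (an assumption, not stated in this documentation), the two arrays can be paired to pick the most probable label. The helper below is a hypothetical sketch and not part of the class.

```java
// Hypothetical helper, not part of NaiveBayesClassifier.
final class MostLikelyLabel {

    // Returns the label whose probability is largest (the first one on ties).
    // Assumes labels[i] corresponds to probabilities[i].
    static String mostLikely(String[] labels, Double[] probabilities) {
        if (labels.length == 0 || labels.length != probabilities.length) {
            throw new IllegalArgumentException(
                    "labels and probabilities must be non-empty and of equal length");
        }
        int best = 0;
        for (int i = 1; i < probabilities.length; i++) {
            if (probabilities[i] > probabilities[best]) {
                best = i;
            }
        }
        return labels[best];
    }

    public static void main(String[] args) {
        String[] labels = {"Flu", "Measles", "No disease"};
        // Values copied from the example output (Fever=yes, Red spots=no).
        Double[] result = {0.8039492242595203, 0.056417489421720736, 0.13963328631875882};
        System.out.println(mostLikely(labels, result));
    }
}
```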
      • getFeatures

        public String[] getFeatures()
        Returns all features.
        Returns:
        Features as an array of String.
      • getPriorProbabilities

        public Double[] getPriorProbabilities()
        Returns prior probabilities.
        Returns:
        The prior probability distribution over the class labels.
      • getConditionalProbabilities

        public Double[] getConditionalProbabilities(String featureName,
                                                    String ofState)
                                             throws DataStructureException
        Returns the conditional probabilities of some feature state given all class labels.
        Parameters:
        featureName - The feature of interest.
        ofState - The state of interest.
        Returns:
        The conditional probability of the given state per class label.
        Throws:
        DataStructureException
      • validate

        public void validate()
                      throws DataStructureException
        Examines the consistency of prior and conditional probabilities and throws an exception if inconsistent. This is done as the first step in the classify method.
        Throws:
        DataStructureException
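
This documentation does not spell out what "consistent" means. A plausible reading, offered here only as an assumption, is that the priors sum to 1 and that, for each feature and class label, the conditional probabilities over the feature's states sum to 1. The sketch below shows such a check with a hypothetical helper isDistribution and a hypothetical tolerance; the actual rules applied by validate may differ.

```java
// Hypothetical sketch of a consistency check; not the library's implementation.
final class ConsistencyCheck {

    // True if every value lies in [0, 1] and the values sum to 1 within the tolerance.
    static boolean isDistribution(double[] p, double tolerance) {
        double sum = 0.0;
        for (double v : p) {
            if (Double.isNaN(v) || v < 0.0 || v > 1.0) {
                return false;
            }
            sum += v;
        }
        return Math.abs(sum - 1.0) <= tolerance;
    }

    public static void main(String[] args) {
        double tol = 1e-9; // assumed tolerance for floating-point rounding

        // Priors from the class example: Flu, Measles, No disease.
        System.out.println(isDistribution(new double[] {0.06, 0.04, 0.90}, tol));

        // P(Fever=yes | Flu) and P(Fever=no | Flu) from the class example.
        System.out.println(isDistribution(new double[] {0.90, 0.10}, tol));

        // An inconsistent pair: the two states sum to 1.10.
        System.out.println(isDistribution(new double[] {0.90, 0.20}, tol));
    }
}
```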
      • classify

        public Double[] classify(Map<String,String> observations)
                          throws DataStructureException
        The Naive Bayes classification algorithm. The structure of the classifier must be complete and consistent; otherwise an exception is thrown.
        Parameters:
        observations - A map (e.g. HashMap) of feature (key) and state (value) pairs.
        Returns:
        A probability distribution over the class labels given the observations.
        Throws:
        DataStructureException
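
For orientation, the textbook Naive Bayes computation can be sketched as follows. This is not the library's source: the class NaiveBayesSketch, its table layout (feature, then state, then one probability per class label) and its exception type are assumptions made for illustration. Unobserved features are skipped, so an empty observation map returns the priors, which matches the description given for setPriorProbability.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical standalone sketch of Naive Bayes; not the library's implementation.
final class NaiveBayesSketch {

    // conditionals: feature -> state -> P(state | label), one entry per class label.
    static double[] classify(double[] priors,
                             Map<String, Map<String, double[]>> conditionals,
                             Map<String, String> observations) {
        double[] posterior = priors.clone();

        // Multiply in the likelihood of every observed feature state.
        for (Map.Entry<String, String> obs : observations.entrySet()) {
            Map<String, double[]> states = conditionals.get(obs.getKey());
            if (states == null || !states.containsKey(obs.getValue())) {
                throw new IllegalArgumentException("Unknown feature or state: " + obs);
            }
            double[] likelihood = states.get(obs.getValue());
            for (int k = 0; k < posterior.length; k++) {
                posterior[k] *= likelihood[k];
            }
        }

        // Normalize so that the result is a probability distribution.
        double total = 0.0;
        for (double v : posterior) {
            total += v;
        }
        if (total == 0.0) {
            throw new IllegalArgumentException(
                    "Observations have zero probability under every class label");
        }
        for (int k = 0; k < posterior.length; k++) {
            posterior[k] /= total;
        }
        return posterior;
    }

    public static void main(String[] args) {
        // Tables from the class example; label order: Flu, Measles, No disease.
        Map<String, double[]> fever = new HashMap<>();
        fever.put("yes", new double[] {0.90, 0.90, 0.01});
        fever.put("no", new double[] {0.10, 0.10, 0.99});
        Map<String, Map<String, double[]>> conditionals = new HashMap<>();
        conditionals.put("Fever", fever);

        double[] priors = {0.06, 0.04, 0.90};

        // No observations: the result equals the priors.
        double[] none = classify(priors, conditionals, new HashMap<>());
        System.out.println("P(Flu), nothing observed: " + none[0]);

        // Only Fever=yes observed; Red spots is left unobserved.
        Map<String, String> observations = new HashMap<>();
        observations.put("Fever", "yes");
        double[] feverOnly = classify(priors, conditionals, observations);
        System.out.println("P(Flu | Fever=yes): " + feverOnly[0]);
    }
}
```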