
Naive Bayes (Naive Bayesian Model, NBM)

The origin of the Bayesian method

Bayesian methods originate with the British scholar Thomas Bayes, who proposed a theory of inductive reasoning that was later developed into a systematic method of statistical inference, now known as the Bayesian method.

Naive Bayes

Naive Bayes is a classification method based on Bayes' theorem and the assumption that features are conditionally independent. Its advantages are that it still works with little data and that it can handle multi-class problems. Its disadvantage is that it is rather sensitive to how the input data is prepared. It works with nominal data.

Conditional independence of features: the assumption that the n features of a sample X are conditionally independent of one another once the class is given. This greatly reduces computational complexity, at the cost of some accuracy.
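In symbols, for a sample with features x1, x2, …, xn and class Y, this assumption lets the class-conditional probability factor into a product of single-feature terms:

P(x1, x2, …, xn | Y) = P(x1 | Y) × P(x2 | Y) × … × P(xn | Y)

This product form is exactly what Example 2 below computes, feature by feature.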

Nominal data: values drawn only from a limited set of possibilities, for example true and false.

Bayes' theorem

Conditional probability is the probability that event A occurs given that event B has occurred. It is written P(A|B) and read as "the probability of A under the condition that B has occurred."

[Figure: Venn diagram of events A and B]

From the Venn diagram, we can see that when event B has occurred, the probability that event A also occurs is P(A∩B) divided by P(B).

P(A|B) = P(A∩B) / P(B)

Since likewise P(B|A) = P(A∩B) / P(A), substituting P(A∩B) = P(B|A) × P(A) gives Bayes' theorem:

P(A|B) = P(B|A) × P(A) / P(B)

where:

  1. P(A) is the prior (or marginal) probability of A; it does not take B into account.
  2. P(A|B) is the conditional probability of A given that B has occurred, also called the posterior probability of A.
  3. P(B|A) is the conditional probability of B given that A has occurred, also called the posterior probability of B, or the likelihood.
  4. P(B) is the prior (or marginal) probability of B, which acts as a normalizing constant.
  5. P(B|A) / P(B) is called the standardized likelihood.

Example 1: Stones in two buckets

Suppose there are two buckets, A and B. Bucket A contains 4 stones: 2 black and 2 gray. Bucket B contains 3 stones: 2 black and 1 gray. A stone is drawn at random from the two buckets and turns out to be gray. What is the probability that it was drawn from bucket A?

Let A be the event that the stone is drawn from bucket A, and B the event that the stone drawn is gray; then the probability of drawing a gray stone from bucket A is P(B|A). We have P(A) = 4/7, P(B) = 3/7, and P(B|A) = 1/2. By the formula:

P(A|B) = P(B|A) × P(A) / P(B) = (1/2 × 4/7) / (3/7) = 2/3

Therefore, given that a stone drawn at random from the two buckets is gray, the probability that it came from bucket A is 2/3.
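As a quick sanity check, the calculation can be reproduced with Python's fractions module; this is a minimal sketch, and the variable names are ours:

from fractions import Fraction

p_a = Fraction(4, 7)          # P(A): the stone comes from bucket A
p_b = Fraction(3, 7)          # P(B): the stone drawn is gray
p_b_given_a = Fraction(1, 2)  # P(B|A): gray, given it comes from bucket A

# Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)
p_a_given_b = p_b_given_a * p_a / p_b
print(p_a_given_b)  # 2/3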

Example 2: Deciding whether to go out based on the weather

In reality, we often judge whether to go out to play according to the weather , Let's make a table

Weather      Temperature  Humidity  Wind     Result
cloudy       hot          high      strong   yes
cloudy       hot          high      strong   no
cloudy       cold         high      weak     no
cloudy       cold         high      weak     no
cloudy       cold         low       weak     yes
cloudy       hot          low       medium   yes
light rain   hot          high      weak     no
light rain   cold         high      weak     no
light rain   hot          low       medium   yes
light rain   cool         low       strong   no

Now a friend calls and asks you out to play, but it is cloudy, the temperature is cold, the humidity is low, and the wind is weak. Should you go?

Applying the naive Bayes formula above, we need P(class | features) for both classes: P(yes | cloudy, cold, low, weak) and P(no | cloudy, cold, low, weak).

If P(yes | cloudy, cold, low, weak) > P(no | cloudy, cold, low, weak), go out and play. If P(yes | cloudy, cold, low, weak) < P(no | cloudy, cold, low, weak), stay home.

P(yes | cloudy, cold, low, weak)
  = P(cloudy | yes) × P(cold | yes) × P(low | yes) × P(weak | yes) × P(yes)
    / (P(cloudy) × P(cold) × P(low) × P(weak))

and similarly for the "no" class.

Computing the feature probabilities for going out

Next, we calculate the required probabilities one by one.

1. First, collect the samples where we went out to play. There are 4 such rows:

Weather      Temperature  Humidity  Wind     Result
cloudy       hot          high      strong   yes
cloudy       cold         low       weak     yes
cloudy       hot          low       medium   yes
light rain   hot          low       medium   yes

P(yes) = 4/10 = 2/5

2. The samples where it is cloudy and we went out, for P(cloudy | yes):

Weather  Temperature  Humidity  Wind     Result
cloudy   hot          high      strong   yes
cloudy   cold         low       weak     yes
cloudy   hot          low       medium   yes

P(cloudy | yes) = 3/4

3. The samples where it is cold and we went out, for P(cold | yes):

Weather  Temperature  Humidity  Wind  Result
cloudy   cold         low       weak  yes

P(cold | yes) = 1/4

4. The samples where the humidity is low and we went out, for P(low | yes):

Weather      Temperature  Humidity  Wind     Result
cloudy       cold         low       weak     yes
cloudy       hot          low       medium   yes
light rain   hot          low       medium   yes

P(low | yes) = 3/4

5. The samples where the wind is weak and we went out, for P(weak | yes):

Weather  Temperature  Humidity  Wind  Result
cloudy   cold         low       weak  yes

P(weak | yes) = 1/4

So far we have P(cloudy | yes), P(cold | yes), P(low | yes), P(weak | yes), and P(yes). Next, we compute the marginal probabilities P(cloudy), P(cold), P(low), and P(weak):

1. Cloudy weather appears in 6 of the 10 rows, so P(cloudy) = 6/10 = 3/5.

2. Cold temperature appears in 4 of the 10 rows, so P(cold) = 4/10 = 2/5.

3. Low humidity appears in 4 of the 10 rows, so P(low) = 4/10 = 2/5.

4. Weak wind appears in 5 of the 10 rows, so P(weak) = 5/10 = 1/2. (The pandas sketch below verifies these counts.)
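As an aside, these counts are easy to verify with pandas. Below is a minimal sketch; the DataFrame simply mirrors the table above, and the English column names are our own:

import pandas as pd

# The Example 2 table; column names are illustrative.
df = pd.DataFrame(
    [['cloudy', 'hot', 'high', 'strong', 'yes'],
     ['cloudy', 'hot', 'high', 'strong', 'no'],
     ['cloudy', 'cold', 'high', 'weak', 'no'],
     ['cloudy', 'cold', 'high', 'weak', 'no'],
     ['cloudy', 'cold', 'low', 'weak', 'yes'],
     ['cloudy', 'hot', 'low', 'medium', 'yes'],
     ['light rain', 'hot', 'high', 'weak', 'no'],
     ['light rain', 'cold', 'high', 'weak', 'no'],
     ['light rain', 'hot', 'low', 'medium', 'yes'],
     ['light rain', 'cool', 'low', 'strong', 'no']],
    columns=['weather', 'temperature', 'humidity', 'wind', 'result'])

print(df['result'].value_counts(normalize=True))  # P(yes) = 0.4, P(no) = 0.6
print((df['weather'] == 'cloudy').mean())         # P(cloudy) = 0.6
print((df['humidity'] == 'low').mean())           # P(low) = 0.4
print((df['wind'] == 'weak').mean())              # P(weak) = 0.5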

Calculating the probability of going out

P(yes | cloudy, cold, low, weak) = (3/4 × 1/4 × 3/4 × 1/4 × 2/5) / (3/5 × 2/5 × 2/5 × 1/2) ≈ 0.293

Computing the feature probabilities for not going out

Having computed the probability of going out to play under cloudy, cold, low-humidity, weak-wind conditions, P(yes | cloudy, cold, low, weak), we also need the probability of staying home under the same conditions, P(no | cloudy, cold, low, weak). In the same way as above, this requires P(cloudy | no), P(cold | no), P(low | no), P(weak | no), and P(no).

1. The probability of not going out to play: P(no) = 6/10 = 3/5.

2. The samples where it is cloudy and we stayed home give P(cloudy | no) = 3/6 = 1/2.

3. The samples where it is cold and we stayed home give P(cold | no) = 3/6 = 1/2.

4. The samples where the humidity is low and we stayed home give P(low | no) = 1/6.

5. The samples where the wind is weak and we stayed home give P(weak | no) = 4/6 = 2/3. (Again, the sketch below checks these.)
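Continuing with the df from the pandas sketch above, the class-conditional probabilities fall out of a boolean mean over the rows of one class:

# Continuing with the df defined in the earlier sketch.
no_rows = df[df['result'] == 'no']
print((no_rows['weather'] == 'cloudy').mean())    # P(cloudy|no) = 0.5
print((no_rows['temperature'] == 'cold').mean())  # P(cold|no) = 0.5
print((no_rows['humidity'] == 'low').mean())      # P(low|no) ≈ 0.167
print((no_rows['wind'] == 'weak').mean())         # P(weak|no) ≈ 0.667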

Calculating the probability of not going out

P(no | cloudy, cold, low, weak) = (1/2 × 1/2 × 1/6 × 2/3 × 3/5) / (3/5 × 2/5 × 2/5 × 1/2) ≈ 0.347

Probability comparison

The result is clear: (3/4 × 1/4 × 3/4 × 1/4 × 2/5) / (3/5 × 2/5 × 2/5 × 1/2) ≈ 0.293 is less than (1/2 × 1/2 × 1/6 × 2/3 × 3/5) / (3/5 × 2/5 × 2/5 × 1/2) ≈ 0.347, so P(yes | cloudy, cold, low, weak) < P(no | cloudy, cold, low, weak), and the decision is to stay home.
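To double-check the comparison with exact arithmetic, here is a minimal sketch using Python's fractions module (the variable names are ours):

from fractions import Fraction as F

# Shared denominator: P(cloudy) * P(cold) * P(low) * P(weak)
evidence = F(3, 5) * F(2, 5) * F(2, 5) * F(1, 2)

# P(cloudy|yes) * P(cold|yes) * P(low|yes) * P(weak|yes) * P(yes) / evidence
p_yes = F(3, 4) * F(1, 4) * F(3, 4) * F(1, 4) * F(2, 5) / evidence
# P(cloudy|no) * P(cold|no) * P(low|no) * P(weak|no) * P(no) / evidence
p_no = F(1, 2) * F(1, 2) * F(1, 6) * F(2, 3) * F(3, 5) / evidence

print(float(p_yes), float(p_no))                  # ~0.293 vs ~0.347
print('go out' if p_yes > p_no else 'stay home')  # stay home

Note that the denominator is the same for both classes, so it can be dropped when all we need is the comparison; the implementation below does exactly that.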

Python Implementation

In Python, naive Bayes can be implemented with the help of the pandas and numpy modules. The code needs to do the following:

  1. Select the samples, e.g. the weather samples from Example 2.
  2. Compute the probability of each class; these are the prior probabilities.
  3. Compute the joint probability of each feature value occurring together with each class.
  4. Compute the conditional probabilities from those joint probabilities.
  5. Compare the resulting probability of the feature vector under each class.

import pandas as pd
import numpy as np


class Nbm(object):
    def getSampleSet(self):
        dataSet = np.array(pd.read_csv('csv file'))  # load the samples as an array
        featureData = dataSet[:, 0:dataSet.shape[1] - 1]  # feature columns
        labels = dataSet[:, dataSet.shape[1] - 1]  # class column
        return featureData, labels

    def priori(self, labels):
        # Prior probability of each class (yes/no): P(y) = count(y) / count(all)
        labels = list(labels)
        priori_ny = {}
        for label in labels:
            priori_ny[label] = labels.count(label) / float(len(labels))
        return priori_ny

    def feature_probability(self, trainData, labels, priori_ny, features):
        # Joint probability of each feature value with each class,
        # e.g. P(cloudy and yes), P(cloudy and no), P(cold and yes), ...
        p_feature_ny = {}
        for ny in priori_ny.keys():
            ny_index = [i for i, label in enumerate(labels) if label == ny]  # rows of this class
            for j in range(len(features)):
                f_index = [i for i, feature in enumerate(trainData[:, j])
                           if feature == features[j]]  # rows with this feature value
                xy_count = len(set(f_index) & set(ny_index))  # rows with both
                pkey = str(features[j]) + '+' + str(ny)
                p_feature_ny[pkey] = xy_count / float(len(labels))  # joint probability
        return p_feature_ny

    def conditional_probability(self, priori_ny, feature_probability, features):
        # Conditional probability: P(x|y) = P(x and y) / P(y)
        P = {}
        for y in priori_ny.keys():
            for x in features:
                pkey = str(x) + '|' + str(y)
                P[pkey] = feature_probability[str(x) + '+' + str(y)] / float(priori_ny[y])
        return P

    def classify(self, priori_ny, feature_probability, features):
        # Conditional probabilities for every feature/class pair
        p = self.conditional_probability(priori_ny, feature_probability, features)
        # Score each class for [cloudy, cold, low, weak]:
        # P(cloudy|y) * P(cold|y) * P(low|y) * P(weak|y) * P(y)
        # (the shared denominator is omitted; it does not affect the comparison)
        f = {}
        for ny in priori_ny:
            f[ny] = priori_ny[ny]
            for x in features:
                f[ny] = f[ny] * p[str(x) + '|' + str(ny)]
        return max(f, key=f.get)  # class with the highest score


if __name__ == '__main__':
    nbm = Nbm()
    features = ['cloudy', 'cold', 'low', 'weak']
    trainData, labels = nbm.getSampleSet()
    priori_ny = nbm.priori(labels)
    feature_probability = nbm.feature_probability(trainData, labels, priori_ny, features)
    result = nbm.classify(priori_ny, feature_probability, features)
    print(features, 'result is', result)
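To try the script without an existing data file, one option is to write the Example 2 table to a CSV first and point getSampleSet at it; the file name weather.csv and the column names below are our own, not from the original repository:

import pandas as pd

rows = [
    ['cloudy', 'hot', 'high', 'strong', 'yes'],
    ['cloudy', 'hot', 'high', 'strong', 'no'],
    ['cloudy', 'cold', 'high', 'weak', 'no'],
    ['cloudy', 'cold', 'high', 'weak', 'no'],
    ['cloudy', 'cold', 'low', 'weak', 'yes'],
    ['cloudy', 'hot', 'low', 'medium', 'yes'],
    ['light rain', 'hot', 'high', 'weak', 'no'],
    ['light rain', 'cold', 'high', 'weak', 'no'],
    ['light rain', 'hot', 'low', 'medium', 'yes'],
    ['light rain', 'cool', 'low', 'strong', 'no'],
]
cols = ['weather', 'temperature', 'humidity', 'wind', 'result']
pd.DataFrame(rows, columns=cols).to_csv('weather.csv', index=False)

Run against this file, the script should report 'no' for the features ['cloudy', 'cold', 'low', 'weak'], matching the hand calculation above.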

Summary

This article briefly introduced the concepts behind naive Bayes and worked through two examples to reinforce them. I hope you found it helpful.


Code

Sample code: https://github.com/JustDoPython/python-100-day/tree/master/day-116
