## Naive Bayes (Naive Bayesian Model, NBM)

### The origin of the Bayesian method

The Bayesian method originates with the British scholar Thomas Bayes, who proposed a theory of inductive reasoning. It was later developed into a systematic method of statistical inference, known as the Bayesian method.

### Naive Bayes

Naive Bayes is a classification method based on Bayes' theorem and the assumption that features are conditionally independent. Its advantages are that it still works with small amounts of data and that it can handle multi-class problems. Its disadvantage is that it is sensitive to how the input data is prepared. It applies to nominal data.

Conditional independence of features: assume that the N features of X are mutually independent once the class is given. This greatly reduces the computational complexity, but at the expense of some accuracy.
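Written as a formula, the independence assumption says that the likelihood of all features factorizes into one term per feature. The symbols $x_1, \dots, x_N$ (the features of X) and $y$ (the class) are notation assumed for this sketch:

$$P(x_1, x_2, \dots, x_N \mid y) = \prod_{i=1}^{N} P(x_i \mid y)$$

Each factor can be estimated by simple counting, which is where the reduction in complexity comes from.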

Nominal data: data that takes values only from a finite set, for example true and false.

### Bayes theorem

Conditional probability is the probability that event A occurs given that event B has occurred. It is written P(A|B) and read as "the probability of A given B". From a Venn diagram it can be seen that, when event B has occurred, the probability of event A is P(A∩B) divided by P(B):

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$

Since P(A∩B) = P(B|A) P(A) by the same definition, this gives Bayes' theorem:

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$

where:

1. P(A) is the prior probability (or marginal probability) of A; it does not take B into account.
2. P(A|B) is the conditional probability of A given that B has occurred, also called the posterior probability of A.
3. P(B|A) is the conditional probability of B given that A has occurred, also called the posterior probability of B; here it plays the role of the likelihood.
4. P(B) is the prior probability (or marginal probability) of B; it acts as the normalizing constant.
5. P(B|A)/P(B) is called the standardized likelihood.

### Example 1: Stones in buckets

Suppose there are two buckets, A and B. Bucket A contains 4 stones: 2 black and 2 gray. Bucket B contains 3 stones: 2 black and 1 gray. A stone is taken at random from the two buckets and it turns out to be gray. What is the probability that this gray stone came from bucket A?

Let event A be "the stone is taken from bucket A" and event B be "the stone is gray". The probability of taking a gray stone given that it comes from bucket A is P(B|A). Then P(A) = 4/7, P(B) = 3/7 and P(B|A) = 1/2. By Bayes' theorem:

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)} = \frac{\frac{1}{2} \times \frac{4}{7}}{\frac{3}{7}} = \frac{2}{3}$$

Therefore, when a stone taken at random from the two buckets is gray, the probability that it came from bucket A is 2/3.
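The arithmetic of this example can be checked with a few lines of Python. This is only a sketch: the `buckets` dictionary and the variable names are chosen here for illustration, and exact fractions are used so the result is not affected by floating-point rounding.

```python
from fractions import Fraction

# Stones in each bucket, as described in the example
buckets = {
    "A": ["black", "black", "gray", "gray"],
    "B": ["black", "black", "gray"],
}

total = sum(len(stones) for stones in buckets.values())  # 7 stones overall
gray_total = sum(stones.count("gray") for stones in buckets.values())  # 3 gray stones

p_a = Fraction(len(buckets["A"]), total)  # P(A) = 4/7
p_gray = Fraction(gray_total, total)  # P(B) = 3/7
p_gray_given_a = Fraction(buckets["A"].count("gray"), len(buckets["A"]))  # P(B|A) = 1/2

# Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)
p_a_given_gray = p_gray_given_a * p_a / p_gray
print(p_a_given_gray)  # 2/3

# Standardized likelihood P(B|A) / P(B): a value above 1 means that
# seeing a gray stone raises the probability of bucket A above its prior.
print(p_gray_given_a / p_gray)  # 7/6
```

Because the standardized likelihood is 7/6, observing a gray stone moves the probability of bucket A from the prior 4/7 up to the posterior 2/3.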

### Example 2: Deciding whether to go out based on the weather

In real life we often decide whether to go out based on the weather. Let's put some observations in a table:

| Weather | Temperature | Humidity | Wind | Result |
| --- | --- | --- | --- | --- |
| cloudy | hot | high | strong | yes |
| cloudy | hot | high | strong | no |
| cloudy | cold | high | weak | no |
| cloudy | cold | high | weak | no |
| cloudy | cold | low | weak | yes |
| cloudy | hot | low | medium | yes |
| light rain | hot | high | weak | no |
| light rain | cold | high | weak | no |
| light rain | hot | low | medium | yes |
| light rain | low | low | strong | no |

Now a friend invites you out, but the weather is cloudy, the temperature is cold, the humidity is low and the wind is weak. Decide whether to go out.

Applying the Bayes formula above, P(category | features) becomes P(yes | cloudy, cold, low, weak) and P(no | cloudy, cold, low, weak).

If P(yes | cloudy, cold, low, weak) > P(no | cloudy, cold, low, weak), go out. If P(yes | cloudy, cold, low, weak) < P(no | cloudy, cold, low, weak), do not go out.

#### Feature probabilities for going out

Next, we compute the probabilities feature by feature.

1. First, collect the samples where the result is to go out. There are 4 of them:

| Weather | Temperature | Humidity | Wind | Result |
| --- | --- | --- | --- | --- |
| cloudy | hot | high | strong | yes |
| cloudy | cold | low | weak | yes |
| cloudy | hot | low | medium | yes |
| light rain | hot | low | medium | yes |

P(yes) = 4/10 = 2/5

2. Samples for P(cloudy | yes), the weather being cloudy among the days we went out:

| Weather | Temperature | Humidity | Wind | Result |
| --- | --- | --- | --- | --- |
| cloudy | hot | high | strong | yes |
| cloudy | cold | low | weak | yes |
| cloudy | hot | low | medium | yes |

P(cloudy | yes) = 3/4

3. Samples for P(cold | yes), the temperature being cold among the days we went out:

| Weather | Temperature | Humidity | Wind | Result |
| --- | --- | --- | --- | --- |
| cloudy | cold | low | weak | yes |

P(cold | yes) = 1/4

4. Samples for P(low | yes), the humidity being low among the days we went out:

| Weather | Temperature | Humidity | Wind | Result |
| --- | --- | --- | --- | --- |
| cloudy | cold | low | weak | yes |
| cloudy | hot | low | medium | yes |
| light rain | hot | low | medium | yes |

P(low | yes) = 3/4

5. Samples for P(weak | yes), the wind being weak among the days we went out:

| Weather | Temperature | Humidity | Wind | Result |
| --- | --- | --- | --- | --- |
| cloudy | cold | low | weak | yes |

P(weak | yes) = 1/4

We now have P(cloudy | yes), P(cold | yes), P(low | yes), P(weak | yes) and P(yes). Next, compute the marginal probabilities P(cloudy), P(cold), P(low) and P(weak):

1. The weather is cloudy in 6 of the 10 samples, so P(cloudy) = 6/10 = 3/5

2. The temperature is cold in 4 of the 10 samples, so P(cold) = 4/10 = 2/5

3. The humidity is low in 4 of the 10 samples, so P(low) = 4/10 = 2/5

4. The wind is weak in 5 of the 10 samples, so P(weak) = 5/10 = 1/2
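The counts above can be reproduced by tallying rows of the table. The following is a minimal sketch, not the article's implementation: `SAMPLES` is a transcription of the Example 2 table, and the helper names `prior`, `likelihood` and `marginal` are hypothetical.

```python
from fractions import Fraction

# Rows of the Example 2 table: (weather, temperature, humidity, wind, result)
SAMPLES = [
    ("cloudy", "hot", "high", "strong", "yes"),
    ("cloudy", "hot", "high", "strong", "no"),
    ("cloudy", "cold", "high", "weak", "no"),
    ("cloudy", "cold", "high", "weak", "no"),
    ("cloudy", "cold", "low", "weak", "yes"),
    ("cloudy", "hot", "low", "medium", "yes"),
    ("light rain", "hot", "high", "weak", "no"),
    ("light rain", "cold", "high", "weak", "no"),
    ("light rain", "hot", "low", "medium", "yes"),
    ("light rain", "low", "low", "strong", "no"),
]
# The conditions to classify, in column order
QUERY = ("cloudy", "cold", "low", "weak")


def prior(label):
    """P(label): share of all samples with this result."""
    return Fraction(sum(row[-1] == label for row in SAMPLES), len(SAMPLES))


def likelihood(column, value, label):
    """P(value | label): share of the label's samples with this feature value."""
    rows = [row for row in SAMPLES if row[-1] == label]
    return Fraction(sum(row[column] == value for row in rows), len(rows))


def marginal(column, value):
    """P(value): share of all samples with this feature value."""
    return Fraction(sum(row[column] == value for row in SAMPLES), len(SAMPLES))


yes_terms = [likelihood(i, v, "yes") for i, v in enumerate(QUERY)]
marginals = [marginal(i, v) for i, v in enumerate(QUERY)]

print(prior("yes"))  # 2/5
print([str(p) for p in yes_terms])  # ['3/4', '1/4', '3/4', '1/4']
print([str(p) for p in marginals])  # ['3/5', '2/5', '2/5', '1/2']
```

Comparing values per column (rather than anywhere in the row) matters here, because "low" appears in both the temperature and the humidity columns.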

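#### Calculate the probability of going out

$$P(\text{yes} \mid \text{cloudy, cold, low, weak}) = \frac{P(\text{cloudy} \mid \text{yes})\,P(\text{cold} \mid \text{yes})\,P(\text{low} \mid \text{yes})\,P(\text{weak} \mid \text{yes})\,P(\text{yes})}{P(\text{cloudy})\,P(\text{cold})\,P(\text{low})\,P(\text{weak})} = \frac{\frac{3}{4} \times \frac{1}{4} \times \frac{3}{4} \times \frac{1}{4} \times \frac{2}{5}}{\frac{3}{5} \times \frac{2}{5} \times \frac{2}{5} \times \frac{1}{2}}$$

#### Feature probabilities for not going out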

Having computed P(yes | cloudy, cold, low, weak), the probability of going out in cloudy, cold, low-humidity, weak-wind weather, we also need P(no | cloudy, cold, low, weak), the probability of not going out under the same conditions. As above, this requires P(cloudy | no), P(cold | no), P(low | no), P(weak | no) and P(no).

1. The probability of not going out: P(no) = 6/10 = 3/5

2. The weather being cloudy among the days we did not go out: P(cloudy | no) = 3/6 = 1/2

3. The temperature being cold among the days we did not go out: P(cold | no) = 3/6 = 1/2

4. The humidity being low among the days we did not go out: P(low | no) = 1/6

5. The wind being weak among the days we did not go out: P(weak | no) = 4/6 = 2/3

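#### Calculate the probability of not going out

$$P(\text{no} \mid \text{cloudy, cold, low, weak}) = \frac{P(\text{cloudy} \mid \text{no})\,P(\text{cold} \mid \text{no})\,P(\text{low} \mid \text{no})\,P(\text{weak} \mid \text{no})\,P(\text{no})}{P(\text{cloudy})\,P(\text{cold})\,P(\text{low})\,P(\text{weak})} = \frac{\frac{1}{2} \times \frac{1}{2} \times \frac{1}{6} \times \frac{2}{3} \times \frac{3}{5}}{\frac{3}{5} \times \frac{2}{5} \times \frac{2}{5} \times \frac{1}{2}}$$

#### Probability comparison

Evaluating both expressions gives:

$$\frac{\frac{3}{4} \times \frac{1}{4} \times \frac{3}{4} \times \frac{1}{4} \times \frac{2}{5}}{\frac{3}{5} \times \frac{2}{5} \times \frac{2}{5} \times \frac{1}{2}} = \frac{75}{256} \approx 0.293 \;<\; \frac{\frac{1}{2} \times \frac{1}{2} \times \frac{1}{6} \times \frac{2}{3} \times \frac{3}{5}}{\frac{3}{5} \times \frac{2}{5} \times \frac{2}{5} \times \frac{1}{2}} = \frac{25}{72} \approx 0.347$$

Therefore P(yes | cloudy, cold, low, weak) < P(no | cloudy, cold, low, weak), and the decision is not to go out.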
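The comparison can be verified with exact fractions. This is a small sketch that plugs in the probabilities counted in the previous sections; the variable names are illustrative only, and `math.prod` requires Python 3.8 or later.

```python
from fractions import Fraction
from math import prod

# Shared denominator: P(cloudy) * P(cold) * P(low) * P(weak)
evidence = prod([Fraction(3, 5), Fraction(2, 5), Fraction(2, 5), Fraction(1, 2)])

# Numerators: per-feature conditional probabilities times the class prior
yes_numerator = prod([Fraction(3, 4), Fraction(1, 4), Fraction(3, 4), Fraction(1, 4), Fraction(2, 5)])
no_numerator = prod([Fraction(1, 2), Fraction(1, 2), Fraction(1, 6), Fraction(2, 3), Fraction(3, 5)])

p_yes = yes_numerator / evidence
p_no = no_numerator / evidence

print(p_yes, p_no)  # 75/256 25/72
print("go out" if p_yes > p_no else "do not go out")  # do not go out
```

Since both sides share the same denominator, comparing the numerators alone leads to the same decision, which is why a classifier only needs the numerators.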

### Python implementation

In Python, naive Bayes can be computed with the help of the pandas and numpy modules. The code needs to do the following:

1. Select the samples, for example the weather samples from Example 2
2. Calculate the probability of each category; this is the prior probability
3. Calculate the probability of each feature value and category occurring together; this is the joint probability
4. Calculate the conditional probability of each feature value given the category
5. Compare the resulting probabilities across categories and pick the largest
```python
import numpy as np
import pandas as pd


class Nbm(object):
    def getSampleSet(self):
        # Load the samples and turn them into an array
        dataSet = np.array(pd.read_csv('csv file'))  # placeholder: path to the sample CSV
        featureData = dataSet[:, 0:dataSet.shape[1] - 1]  # every column but the last holds a feature
        labels = dataSet[:, dataSet.shape[1] - 1]  # the last column holds the category
        return featureData, labels

    def priori(self, labels):
        # Prior probability of each category (yes / no)
        labels = list(labels)
        priori_ny = {}
        for label in labels:
            priori_ny[label] = labels.count(label) / float(len(labels))  # P = count(label) / count(labels)
        return priori_ny

    def feature_probability(self, priori_ny, features):
        # Joint probability of each feature value and category:
        # cloudy + yes, cloudy + no, cold + yes, cold + no, ...
        # Note: reads the module-level trainData and labels defined in the __main__ block.
        p_feature_ny = {}
        for ny in priori_ny.keys():
            ny_index = [i for i, label in enumerate(labels) if label == ny]  # row indices of this category
            for j in range(len(features)):
                f_index = [i for i, feature in enumerate(trainData[:, j]) if feature == features[j]]  # row indices with this feature value
                xy_count = len(set(f_index) & set(ny_index))  # rows with both the feature value and the category
                pkey = str(features[j]) + '+' + str(ny)
                p_feature_ny[pkey] = xy_count / float(len(labels))  # P(feature, category)
        return p_feature_ny

    def conditional_probability(self, priori_ny, feature_probability, features):
        # Conditional probability of each feature value given the category
        P = {}
        for y in priori_ny.keys():
            for x in features:
                pkey = str(x) + '|' + str(y)
                P[pkey] = feature_probability[str(x) + '+' + str(y)] / float(priori_ny[y])  # P(x|y) = P(x, y) / P(y)
        return P

    def classify(self, priori_ny, feature_probability, features):
        # Conditional probabilities
        p = self.conditional_probability(priori_ny, feature_probability, features)
        # Score each category for [cloudy, cold, low, weak]
        f = {}
        for ny in priori_ny:
            f[ny] = priori_ny[ny]
            for x in features:
                f[ny] = f[ny] * p[str(x) + '|' + str(ny)]  # P(cloudy|yes) * P(cold|yes) * P(low|yes) * P(weak|yes) * P(yes)
        return max(f, key=f.get)  # category with the largest score


if __name__ == '__main__':
    nbm = Nbm()
    features = ['cloudy', 'cold', 'low', 'weak']
    trainData, labels = nbm.getSampleSet()
    priori_ny = nbm.priori(labels)
    feature_probability = nbm.feature_probability(priori_ny, features)
    result = nbm.classify(priori_ny, feature_probability, features)
    print(features, 'The result is', result)
```
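The class reads its samples with `pd.read_csv`, which by default treats the first line as a header and, in this code, expects the category in the last column. A file with that layout could be produced as sketched below. The file name `weather.csv` and the column names are assumptions made for this illustration; the rows are a transcription of the Example 2 table.

```python
from pathlib import Path

# Header line followed by the ten samples from Example 2.
# The last column is the category (whether to go out).
CSV_TEXT = """weather,temperature,humidity,wind,result
cloudy,hot,high,strong,yes
cloudy,hot,high,strong,no
cloudy,cold,high,weak,no
cloudy,cold,high,weak,no
cloudy,cold,low,weak,yes
cloudy,hot,low,medium,yes
light rain,hot,high,weak,no
light rain,cold,high,weak,no
light rain,hot,low,medium,yes
light rain,low,low,strong,no
"""

# Hypothetical output path; point pd.read_csv at the same file
csv_path = Path("weather.csv")
csv_path.write_text(CSV_TEXT, encoding="utf-8")
```

The feature values in the file have to match the strings in the `features` list exactly, since the classifier compares them with `==`.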

### Summary

This article briefly introduced some concepts of naive Bayes and used two examples to reinforce them. I hope it helps.

### References

Machine Learning in Action

https://baike.baidu.com/item/ Bayes' formula

https://www.ruanyifeng.com/blog/2011/08/bayesian_inference_part_one.html

https://zhuanlan.zhihu.com/p/26262151

### Source code

Sample code: https://github.com/JustDoPython/python-100-day/tree/master/day-116
