Definition
<span style="color: rgb(51, 51, 51); font-family: Arial; font-size: 14px; line-height: 26px;">Let </span><img alt="" src="http://latex.codecogs.com/gif.latex?x=\{a_1,a_2,...,a_m\}" style="border: none; max-width: 100%; color: rgb(51, 51, 51); font-family: Arial; font-size: 14px; line-height: 26px;" original-title="x=\{a_1,a_2,...,a_m\}"><span style="color: rgb(51, 51, 51); font-family: Arial; font-size: 14px; line-height: 26px;"> be an item to be classified, where each a_j (j = 1, ..., m) is a feature attribute of x.</span>
<span style="color: rgb(51, 51, 51); font-family: Arial; font-size: 14px; line-height: 26px;">Let the set of classes be </span><img alt="" src="http://latex.codecogs.com/gif.latex?C=\{y_1,y_2,...,y_n\}" style="border: none; max-width: 100%; color: rgb(51, 51, 51); font-family: Arial; font-size: 14px; line-height: 26px;" original-title="C=\{y_1,y_2,...,y_n\}">
<span style="color: rgb(51, 51, 51); font-family: Arial; font-size: 14px; line-height: 26px;">Compute the posterior probabilities </span><img alt="" src="http://latex.codecogs.com/gif.latex?P(y_1|x),P(y_2|x),...,P(y_n|x)" style="border: none; max-width: 100%; color: rgb(51, 51, 51); font-family: Arial; font-size: 14px; line-height: 26px;" original-title="P(y_1|x),P(y_2|x),...,P(y_n|x)">
<span style="color: rgb(51, 51, 51); font-family: Arial; font-size: 14px; line-height: 26px;">If </span><img alt="" src="http://latex.codecogs.com/gif.latex?P(y_k|x)=max\{P(y_1|x),P(y_2|x),...,P(y_n|x)\}" style="border: none; max-width: 100%; color: rgb(51, 51, 51); font-family: Arial; font-size: 14px; line-height: 26px;" original-title="P(y_k|x)=max\{P(y_1|x),P(y_2|x),...,P(y_n|x)\}"><span style="color: rgb(51, 51, 51); font-family: Arial; font-size: 14px; line-height: 26px;">, then x is assigned to class y_k: </span><img title="x \in y_k" alt="" src="http://latex.codecogs.com/gif.latex?x%20\in%20y_k" style="border: none; max-width: 100%; color: rgb(51, 51, 51); font-family: Arial; font-size: 14px; line-height: 26px;">
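The decision rule above simply picks the class with the largest posterior. A minimal sketch in Python, assuming the posteriors are already known; the dictionary values and the function name `classify` are made up for illustration:

```python
# Hypothetical posteriors P(y_i | x) for three classes; the numbers are made up.
posteriors = {"y1": 0.2, "y2": 0.7, "y3": 0.1}


def classify(posteriors):
    """Return the class label whose posterior probability is largest."""
    return max(posteriors, key=posteriors.get)


predicted = classify(posteriors)
print(predicted)
```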
Adopting the attribute conditional independence assumption
<img src="http://img.blog.csdn.net/20160512152902255">
Computing the class-conditional probabilities
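Given a training set of samples whose classes are known,
<span style="color: rgb(51, 51, 51); font-family: Arial; font-size: 14px; line-height: 26px;">count the samples to estimate the conditional probability of each feature attribute under each class:</span>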
<img src="http://latex.codecogs.com/gif.latex?P(a_1|y_1),P(a_2|y_1),...,P(a_m|y_1);P(a_1|y_2),P(a_2|y_2),...,P(a_m|y_2);...;P(a_1|y_n),P(a_2|y_n),...,P(a_m|y_n)">
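The counting step can be sketched as follows. This is only an illustration under stated assumptions: the features are discrete, the probabilities are estimated by relative frequency with no smoothing, and the toy data set and the name `estimate` are invented for the example.

```python
from collections import Counter, defaultdict

# Toy training set (made up): each sample is (feature tuple, class label).
samples = [
    (("sunny", "hot"), "no"),
    (("sunny", "mild"), "no"),
    (("rainy", "mild"), "yes"),
    (("rainy", "hot"), "yes"),
    (("sunny", "mild"), "yes"),
]


def estimate(samples):
    """Estimate P(y) and P(a_j = v | y) by relative frequency."""
    class_counts = Counter(label for _, label in samples)
    # feature_counts[y][j][v] = number of class-y samples whose j-th feature equals v
    feature_counts = defaultdict(lambda: defaultdict(Counter))
    for features, label in samples:
        for j, value in enumerate(features):
            feature_counts[label][j][value] += 1

    total = len(samples)
    priors = {y: n / total for y, n in class_counts.items()}
    cond = {
        y: {
            j: {v: c / class_counts[y] for v, c in counts.items()}
            for j, counts in per_feature.items()
        }
        for y, per_feature in feature_counts.items()
    }
    return priors, cond


priors, cond = estimate(samples)
print(priors)
print(cond["yes"][0])  # estimated P(a_1 = v | y = "yes") for each observed v
```

A value that never occurs together with a class has no entry in `cond` for that class, which corresponds to an estimated probability of zero.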
By Bayes' rule:
<img src="http://latex.codecogs.com/gif.latex?P(y_i|x)=\frac{P(x|y_i)P(y_i)}{P(x)}">
<span style="color: rgb(51, 51, 51); font-family: Arial; font-size: 14px; line-height: 26px;">The denominator P(x) is the same for every class, so it suffices to </span><span style="color: rgb(51, 51, 51); font-family: Arial; font-size: 14px; line-height: 26px;">maximize</span><span style="color: rgb(51, 51, 51); font-family: Arial; font-size: 14px; line-height: 26px;"> the numerator.</span>
Naive Bayes assumes the attributes are conditionally independent given the class, so the numerator factorizes:
<img src="http://latex.codecogs.com/gif.latex?P(x|y_i)P(y_i)=P(a_1|y_i)P(a_2|y_i)...P(a_m|y_i)P(y_i)=P(y_i)\prod^m_{j=1}P(a_j|y_i)">
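The factorized numerator translates directly into code. The sketch below assumes discrete features and uses hypothetical probability tables (`priors`, `cond`) with made-up values; the helper names `score` and `predict` are also illustrative.

```python
import math

# Hypothetical estimates; in practice they come from counting over a training set.
priors = {"yes": 0.6, "no": 0.4}
cond = {
    "yes": [{"sunny": 1 / 3, "rainy": 2 / 3}, {"hot": 1 / 3, "mild": 2 / 3}],
    "no": [{"sunny": 1.0}, {"hot": 0.5, "mild": 0.5}],
}


def score(x, label):
    """Return P(y_i) * prod_j P(a_j | y_i) for the feature tuple x."""
    # A value never seen with this class gets probability 0 (no smoothing here).
    return priors[label] * math.prod(
        cond[label][j].get(value, 0.0) for j, value in enumerate(x)
    )


def predict(x):
    """Pick the class that maximizes the numerator of Bayes' rule."""
    return max(priors, key=lambda label: score(x, label))


x = ("sunny", "mild")
print({label: score(x, label) for label in priors})
print(predict(x))
```

Because P(x) is never computed, the scores are not probabilities and do not sum to 1; only their ordering matters for the decision.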
The overall workflow is as follows:
<img src="http://images.cnblogs.com/cnblogs_com/leoo2sk/WindowsLiveWriter/4f6168bb064a_9C14/1_thumb.png">