The use of a Bayesian classifier implies that we know:
A priori probability $P(C_l)$
If the examples used to train the system to recognize each class are sufficiently numerous, the a priori probability $P(C_l)$ can be estimated as the relative frequency of that class among all training examples. This is the approach most often used when the system is trained from samples.
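As a concrete illustration, here is a minimal Python sketch of this frequency-based estimate; the function name and the sample labels are hypothetical, not from the original text:

```python
from collections import Counter

def estimate_priors(labels):
    """Estimate P(Cl) as the relative frequency of each class label."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: n / total for cls, n in counts.items()}

# Toy example: 6 training examples over two classes.
labels = ["spam", "ham", "ham", "spam", "ham", "ham"]
print(estimate_priors(labels))  # {'spam': 0.333..., 'ham': 0.666...}
```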
Conditional probability $P(X \mid C_l)$
Estimating the conditional probability is the main problem. It is very difficult because, in principle, it requires estimating the conditional probability of every possible combination of element values, given each particular class.
In reality, it is impossible to compute these estimates directly. Simplifying assumptions are therefore made in order to keep the training of the system feasible. The assumption used most frequently is conditional independence, which states that the joint probability of two elements $x_i$ and $x_j$, given that the class is $C_l$, is the product of the probabilities of each element taken separately, given that the class is $C_l$:

$$P(x_i, x_j \mid C_l) = P(x_i \mid C_l)\, P(x_j \mid C_l) \qquad (8)$$
With such an assumption, the conditional probability of an element $X = (x_1, \ldots, x_n)$ described by attributes $A_i$ becomes:

$$P(X \mid C_l) = \prod_i P(x_i \mid C_l) \qquad (9)$$
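A short sketch of how these per-attribute probabilities can be estimated by counting, and then multiplied as in Eq. (9); the data layout (examples as tuples of attribute values) and the function names are assumptions for illustration:

```python
from collections import defaultdict

def estimate_likelihoods(examples, labels):
    """Estimate P(x_i | Cl): the fraction of class-Cl examples whose
    i-th attribute takes the value x_i."""
    value_counts = defaultdict(int)   # (class, attribute index, value) -> count
    class_counts = defaultdict(int)   # class -> number of examples
    for x, cls in zip(examples, labels):
        class_counts[cls] += 1
        for i, value in enumerate(x):
            value_counts[(cls, i, value)] += 1
    return {k: n / class_counts[k[0]] for k, n in value_counts.items()}

def likelihood(x, cls, tables):
    """P(X | Cl) as the product over attributes, per Eq. (9)."""
    p = 1.0
    for i, value in enumerate(x):
        p *= tables.get((cls, i, value), 0.0)
    return p
```

Unseen attribute values get probability 0 here; in practice a smoothing scheme would avoid zeroing out the whole product.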
Bayesian decision rule
This leads to the following Bayesian decision rule, where $C_l$ is the class selected:

$$C_l = \arg\max_{C_j} \; P(C_j) \prod_i P(x_i \mid C_j) \qquad (10)$$
Here $P(x_i \mid C_j)$ represents the proportion of examples taking the value $x_i$ for attribute $A_i$ among all examples belonging to class $C_j$.
The product of the probabilities contributed by each attribute $A_i$ for element $X$ is what performs the information fusion in this Bayesian rule: the evidence from all attributes is combined into a single score per class.
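Continuing the earlier sketches (and reusing the hypothetical `estimate_priors`, `estimate_likelihoods`, and `likelihood` functions defined above), the decision rule of Eq. (10) can be written as:

```python
def classify(x, priors, tables):
    """Bayesian decision rule (Eq. 10): pick the class maximizing
    P(Cj) * prod_i P(x_i | Cj)."""
    return max(priors, key=lambda cls: priors[cls] * likelihood(x, cls, tables))

# Toy usage with two categorical attributes (hypothetical data):
examples = [("sunny", "hot"), ("sunny", "cool"), ("rainy", "cool"), ("rainy", "hot")]
labels   = ["out", "out", "in", "in"]
priors = estimate_priors(labels)
tables = estimate_likelihoods(examples, labels)
print(classify(("sunny", "cool"), priors, tables))  # -> 'out'
```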