The use of a Bayesian classifier implies that we know:
A priori probability $P(C_l)$
If the examples used to train the system to recognize each class are sufficiently numerous, the a priori probability $P(C_l)$ can be estimated as the relative frequency of that class among all training examples. This is the approach most often used when the system is trained from samples.
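As a concrete illustration, here is a minimal Python sketch of this frequency-based estimate; the function name and the sample labels are hypothetical, not from the original text:

```python
from collections import Counter

def estimate_priors(labels):
    """Estimate P(Cl) as the relative frequency of each class label."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: n / total for cls, n in counts.items()}

# Toy example: 6 training examples over two classes.
labels = ["spam", "ham", "ham", "spam", "ham", "ham"]
print(estimate_priors(labels))  # {'spam': 0.333..., 'ham': 0.666...}
```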
Conditional probability $P(X \mid C_l)$
Estimating the conditional probability is the main problem. It is very difficult because, in principle, it requires estimating the conditional probability of every possible combination of element values, given each particular class.
In reality, it is impossible to compute these estimates directly. Simplifying assumptions are therefore made in order to keep the training of the system feasible. The assumption used most frequently is conditional independence, which states that the joint probability of two elements $x_i$ and $x_j$, given that the class is $C_l$, is the product of the probabilities of each element taken separately, given that the class is $C_l$:

$$P(x_i, x_j \mid C_l) = P(x_i \mid C_l)\, P(x_j \mid C_l) \qquad (8)$$
With such an assumption, the conditional probability of an element $X = (x_1, \ldots, x_n)$ described by attributes $A_i$ becomes:

$$P(X \mid C_l) = \prod_i P(x_i \mid C_l) \qquad (9)$$
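A short sketch of how these per-attribute probabilities can be estimated by counting, and then multiplied as in Eq. (9); the data layout (examples as tuples of attribute values) and the function names are assumptions for illustration:

```python
from collections import defaultdict

def estimate_likelihoods(examples, labels):
    """Estimate P(x_i | Cl): the fraction of class-Cl examples whose
    i-th attribute takes the value x_i."""
    value_counts = defaultdict(int)   # (class, attribute index, value) -> count
    class_counts = defaultdict(int)   # class -> number of examples
    for x, cls in zip(examples, labels):
        class_counts[cls] += 1
        for i, value in enumerate(x):
            value_counts[(cls, i, value)] += 1
    return {k: n / class_counts[k[0]] for k, n in value_counts.items()}

def likelihood(x, cls, tables):
    """P(X | Cl) as the product over attributes, per Eq. (9)."""
    p = 1.0
    for i, value in enumerate(x):
        p *= tables.get((cls, i, value), 0.0)
    return p
```

Unseen attribute values get probability 0 here; in practice a smoothing scheme would avoid zeroing out the whole product.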
Bayesian decision rule
This leads to the following Bayesian decision rule, where $C_l$ is the class selected:

$$C_l = \arg\max_{C_j} \; P(C_j) \prod_i P(x_i \mid C_j) \qquad (10)$$
Here $P(x_i \mid C_j)$ represents the proportion of examples taking the value $x_i$ for attribute $A_i$ among all examples belonging to class $C_j$.
The product of the probabilities contributed by each attribute $A_i$ for element $X$ is what performs the information fusion in this Bayesian rule: the evidence from all attributes is combined into a single score per class.
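Continuing the earlier sketches (and reusing the hypothetical `estimate_priors`, `estimate_likelihoods`, and `likelihood` functions defined above), the decision rule of Eq. (10) can be written as:

```python
def classify(x, priors, tables):
    """Bayesian decision rule (Eq. 10): pick the class maximizing
    P(Cj) * prod_i P(x_i | Cj)."""
    return max(priors, key=lambda cls: priors[cls] * likelihood(x, cls, tables))

# Toy usage with two categorical attributes (hypothetical data):
examples = [("sunny", "hot"), ("sunny", "cool"), ("rainy", "cool"), ("rainy", "hot")]
labels   = ["out", "out", "in", "in"]
priors = estimate_priors(labels)
tables = estimate_likelihoods(examples, labels)
print(classify(("sunny", "cool"), priors, tables))  # -> 'out'
```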