Bayesian Classification
Play-tennis example: classifying X
•An unseen sample X = <rain, hot, high, false>
•P(X|p)·P(p) =
P(rain|p)·P(hot|p)·P(high|p)·P(false|p)·P(p) = 3/9·2/9·3/9·6/9·9/14 = 0.010582
•P(X|n)·P(n) =
P(rain|n)·P(hot|n)·P(high|n)·P(false|n)·P(n) = 2/5·2/5·4/5·2/5·5/14 = 0.018286
•Sample X is classified as class n (don’t play), since 0.018286 > 0.010582
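A minimal sketch in Python reproducing the computation above. The fractions are the counts from the slide; the names PRIORS, LIKELIHOODS, and naive_bayes_score are illustrative, not from the source:

    from fractions import Fraction as F

    # Class priors: P(p) = 9/14, P(n) = 5/14
    PRIORS = {"p": F(9, 14), "n": F(5, 14)}

    # Conditional probabilities P(value | class) for the attributes of X
    LIKELIHOODS = {
        "p": {"rain": F(3, 9), "hot": F(2, 9), "high": F(3, 9), "false": F(6, 9)},
        "n": {"rain": F(2, 5), "hot": F(2, 5), "high": F(4, 5), "false": F(2, 5)},
    }

    def naive_bayes_score(x, cls):
        """P(X|cls) * P(cls) under the attribute-independence assumption."""
        score = PRIORS[cls]
        for value in x:
            score *= LIKELIHOODS[cls][value]
        return score

    X = ["rain", "hot", "high", "false"]
    scores = {cls: naive_bayes_score(X, cls) for cls in PRIORS}
    print({cls: float(s) for cls, s in scores.items()})  # p: 0.010582, n: 0.018286
    print("predicted:", max(scores, key=scores.get))     # -> n (don't play)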
The independence hypothesis…
•… makes computation possible
•… yields optimal classifiers when satisfied
•… but is seldom satisfied in practice, as attributes (variables) are often correlated.
•Attempts to overcome this limitation:
–Bayesian networks, which combine Bayesian reasoning with causal relationships between attributes
–Decision trees, which reason on one attribute at a time, considering the most important attributes first
Bayesian Belief Networks
•A Bayesian belief network allows a subset of the variables to be conditionally independent
•A graphical model of causal relationships
•Several cases of learning Bayesian belief networks
–Given both the network structure and all the variables: easy
–Given the network structure but only some of the variables
–When the network structure is not known in advance
•The classification process returns a probability distribution for the class label attribute (not just a single class label)
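A minimal sketch of inference in a tiny belief network, done by brute-force enumeration of the joint distribution. The chain structure Play -> Outlook -> Humidity and the CPT for Humidity are illustrative assumptions (not from the slides); the CPTs for Play and Outlook reuse the play-tennis counts. Given evidence, the query returns a probability distribution over the class label, as noted above:

    # P(Play): class prior, from the play-tennis counts
    P_play = {"p": 9/14, "n": 5/14}

    # P(Outlook | Play), also from the play-tennis counts
    P_outlook = {
        "p": {"sunny": 2/9, "overcast": 4/9, "rain": 3/9},
        "n": {"sunny": 3/5, "overcast": 0/5, "rain": 2/5},
    }

    # P(Humidity | Outlook): ASSUMED numbers, for illustration only.
    # Humidity depends on Play only through Outlook, i.e. it is
    # conditionally independent of Play given Outlook.
    P_humidity = {
        "sunny":    {"high": 0.7, "normal": 0.3},
        "overcast": {"high": 0.5, "normal": 0.5},
        "rain":     {"high": 0.6, "normal": 0.4},
    }

    def joint(play, outlook, humidity):
        """The joint probability factorizes along the network structure."""
        return P_play[play] * P_outlook[play][outlook] * P_humidity[outlook][humidity]

    # Posterior over the class label given evidence Humidity = high,
    # marginalizing out the unobserved Outlook:
    unnorm = {play: sum(joint(play, o, "high") for o in P_humidity)
              for play in P_play}
    total = sum(unnorm.values())
    print({play: v / total for play, v in unnorm.items()})  # a distribution, not a single label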