void LearnCPTs_bn ( learner_bn*  learner,   const nodelist_bn*  nodes,   const caseset_cs*  cases,   double  degree )

Learns the CPTs (conditional probability tables) of nodes from data. For the EM and gradient descent algorithms, learning iterates until a termination condition is met.

learner is the learner object that performs the learning steps. Construct it with NewLearner_bn.

nodes is the list of nodes whose experience and conditional probability tables are to be updated by learning. They must all be from the same net. Other nodes in that net will not be modified.

cases is the set of cases to be used for learning.

degree is the frequency factor to apply to each case in the case set. It must be greater than zero. It is multiplied by the "NumCases" value (the multiplicity number) that appears for each case in the file; if the number doesn't appear in the file, it is taken as 1.

When you create the learner_bn (see NewLearner_bn), you choose the algorithm you wish, which may be one of:

1. Counting Learning   This is traditional one-pass learning (see ReviseCPTsByFindings_bn). It is the preferred learning method if there are no hidden (also known as 'latent') nodes in the net and no missing values in the case data. If there are hidden variables, that is, variables for which you have no observations but which you suspect exist and can be useful for modeling your world, or if there are a substantial number of missing values in the case data, then the iterative learning algorithms may yield better results.
Because this learning method is not iterative, SetLearnerMaxIters_bn and SetLearnerMaxTol_bn have no effect on it.

2. EM Learning   EM learning optimizes the net's CPTs using the well-known expectation maximization algorithm, in an attempt to maximize the probability of the data set given the net (i.e., to minimize the negative log likelihood of the data). If the nodes have CPTs and experience tables before the learning starts, those are treated as part of the data (properly weighted using the experience table), so the knowledge from the data set is combined with the knowledge already in the net. If you do not want this effect, be sure to delete the tables first (see DeleteNodeTables_bn). During EM learning, for each case in the case file, only the CPTs of nodes with findings, and of their ancestor nodes, are modified, so only those nodes will have their experience tables incremented.

3. Gradient Descent Learning   Gradient descent learning works similarly to EM learning, but it uses a very different algorithm internally: conjugate gradient descent, which maximizes the probability of the data given the net by adjusting the CPT entries. Generally speaking, this algorithm converges faster than EM learning, but it may be more susceptible to local maxima. It has similarities to the backpropagation algorithm for neural nets.

After the learner is created, you can set its termination conditions. For both EM learning and gradient descent learning, the two possible termination conditions are the maximum number of iterations over the whole batch of cases (see SetLearnerMaxIters_bn) and the minimum change in log likelihood from one pass through the batch to the next (see SetLearnerMaxTol_bn). Termination occurs when either of the two conditions is met. For counting learning, there are currently no termination conditions to set.


Versions 2.26 and later have this function.

See also:

NewLearner_bn    Creates the learner
SetLearnerMaxIters_bn    Sets a learning termination parameter: the maximum number of batch iterations
SetLearnerMaxTol_bn    Sets a learning termination parameter: the minimum log likelihood increase
NewCaseset_cs    Creates the caseset_cs
ReviseCPTsByCaseFile_bn    Uses a different learning algorithm (better suited if there is little missing data)
DeleteNodeTables_bn    May want to do this before learning
AddDBCasesToCaseset_cs    Adds cases from a database to the caseset