norsys.netica
Class NetTester

java.lang.Object
  |
  +--norsys.netica.NetTester

public class NetTester
extends java.lang.Object

A tool for grading a Bayes net by using a set of real cases to see how well the predictions or diagnosis of the net match the actual cases. It is not for decision networks.

The following outlines how you typically use a NetTester object.

1. Choose the nodes that you wish the Net to predict and get rated on, and then place them in a NodeList. These nodes are called the "test" nodes. Their values in the case file are treated as though they were hidden from the Bayes net.

2. Choose any additional nodes you do not wish the network to know the value of during its inference. For example, if the network is for medical diagnosis, you might select the disease nodes and nodes representing other unobservable internal states. These nodes are called the "unobserved" nodes. It is okay if this list is empty or null when passed to the NetTester constructor. It is also okay if test nodes are included in the "unobserved" nodes list, although this is redundant, since test nodes are unobserved by definition.

3. Construct the NetTester. E.g.:   NetTester tester = new NetTester (testNodes, unObsvNodes, -1);

4. Run the method testWithCaseset(), supplying the caseset to use for testing. Netica will take the cases, processing them one-by-one. When Netica examines a case, it will ignore any findings for the unobserved nodes. It then does belief updating to generate beliefs for each of the unobserved nodes. It goes back and checks the true value for those nodes as supplied by the case file (if they are supplied for that case), and compares them with the beliefs it generated. It accumulates all the comparisons into summary statistics.

5. Retrieve any of the summary statistics for the nodes you are interested in. Call any of getConfusion, getErrorRate, getLogLoss, or getQuadraticLoss, as desired.

6. Repeat steps 4 and 5 as often as desired, on possibly new case files, thus accumulating the cases and observing how the statistics change.

7. Optionally, clean up all the resources allocated for testing by calling the finalize method.

A coding example of the above is given in the example for the NetTester constructor.

Since:
2.08
Version:
5.04 - January 21, 2012

Constructor Summary
NetTester(NodeList testNodes, NodeList unObsvNodes, int tests)

Creates a NetTester which is a tool for grading a Bayes net, using a set of real cases to see how well the predictions or diagnosis of the net match the actual cases.

 
Method Summary
 void finalize()

Removes the NetTester object and frees all its resources (e.g., memory).

 double getConfusion(Node node, int predictedState, int actualState)

Returns the number of times the Net predicted predictedState for node, but the case file actually held actualState as the value of that node, during the performance test of a net.

 double getErrorRate(Node node)

Returns the accumulated "error rate" of node under the tests previously performed with testWithCaseset.

 double getLogLoss(Node node)

Returns the "logarithmic loss" of node under the tests previously performed with testWithCaseset.

 double getQuadraticLoss(Node node)

Returns the "quadratic loss" of node under the tests previously performed.

 void testWithCaseset(Caseset caseset)

Scans through the case data in caseset to do a number of performance tests on a Bayes net (specified when creating this NetTester).

 
Methods inherited from class java.lang.Object
clone, equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

public NetTester (
 NodeList  testNodes,
 NodeList  unObsvNodes,
 int  tests 
) throws NeticaException
Creates a NetTester which is a tool for grading a Bayes net, using a set of real cases to see how well the predictions or diagnosis of the net match the actual cases. It is not for decision networks.

testNodes are the nodes that the Bayes net will predict and get rated on. Their values in the case file are all hidden from the Bayes net (i.e., unobserved) whenever a case is read. For each such case, the Bayes net does a prediction and compares that prediction with the true value from the case file, accumulating statistics as it goes.

If unObsvNodes is non-null, then the nodes it contains will also be unobserved. It is okay if it repeats nodes already in testNodes.

Pass -1 for tests.

After creating the NetTester object, you run the tests using testWithCaseset, and then read out the results of the tests with the get... methods. When done, you discard the NetTester with finalize.

IMPORTANT: Before calling testWithCaseset, you may want to call Net.retractFindings to remove any findings entered, because otherwise those findings will be considered while testing each case in the file.

The same net-testing capability is available as "Cases -> Test With Cases" in Netica Application.

Parameters:
NodeList    testNodes    the nodes that the Bayes net will predict and get rated on
NodeList    unObsvNodes    a list of nodes that will also be unobserved. If there are no such nodes, you may pass null.
int    tests    a parameter for future use; pass -1 for now.

Version:

Versions 2.08 and later have this method.
In the C Version of the API, this function is named NewNetTester_bn.
See Also:
testWithCaseset    Accumulate case data into the test
getConfusion    Get elements of the confusion matrix
getErrorRate    Get fraction of test cases where prediction failed
getLogLoss    Get the "logarithmic loss" score of the test
getQuadraticLoss    Get the "quadratic loss" score of the test
finalize    Free up tester and all its resources
NodeList.NodeList    Create the node lists

Example:
  
  Net      net         = new Net (new Streamer ("ChestClinic.dne"));
  NodeList testNodes   = new NodeList (net);
  NodeList unobsvNodes = new NodeList (net);
  
  Node visitAsia    = net.getNode ("VisitAsia");
  Node tuberculosis = net.getNode ("Tuberculosis");
  Node cancer       = net.getNode ("Cancer");
  Node smoking      = net.getNode ("Smoking");
  Node tbOrCa       = net.getNode ("TbOrCa");
  Node xRay         = net.getNode ("XRay");
  Node dyspnea      = net.getNode ("Dyspnea");
  Node bronchitis   = net.getNode ("Bronchitis");
  
  // The test nodes are the ones the net will predict and be rated on (their case values are hidden):
  testNodes.add (visitAsia);
  testNodes.add (smoking);
  testNodes.add (xRay);
  testNodes.add (dyspnea);
  
  // The unobserved nodes are typically the factors not known during diagnosis:
  unobsvNodes.add (cancer);
  unobsvNodes.add (tuberculosis);
  unobsvNodes.add (tbOrCa);
  
  net.retractFindings();  // IMPORTANT: Otherwise any findings will be part of tests !!
  net.compile();
  
  NetTester tester = new NetTester (testNodes, unobsvNodes, -1);
  
  Streamer inStream = new Streamer ("ChestClinic.cas");
  Caseset cs = new Caseset();
  cs.addCases( inStream, 1.0, null );
  tester.testWithCaseset (cs);
  
  //printConfusionMatrix() is defined in the example for getConfusion
  //It can also be found in examples\TestNet.java
  printConfusionMatrix (tester, smoking);
  System.out.println ("Error rate for " + smoking.getName() + " = " + tester.getErrorRate (smoking));
  System.out.println ("Logarithmic loss = " + tester.getLogLoss (smoking));
  
  //====================================================
  // If ChestClinic.cas contains 200 cases randomly generated from the ChestClinic.dne net,
  // then sample output from running the above program might be:
  
  Confusion matrix for Smoking:
          Smoker  NonSmoker       Actual
          52      32              Smoker
          37      79              NonSmoker
  Error rate for Smoking = 0.345
  Logarithmic loss = 0.6490544360707616
  
Method Detail
public void finalize ( ) throws NeticaException
Removes the NetTester object and frees all its resources (e.g., memory).

If you override this method, be sure to call the base class method (super.finalize();).

Version:

Versions 2.08 and later have this method.
In the C Version of the API, this function is named DeleteNetTester_bn.
See Also:
NetTester    Construct the NetTester object

Overrides:
finalize in class java.lang.Object

public double getConfusion (
 Node  node,
 int  predictedState,
 int  actualState 
) throws NeticaException
Returns the number of times the Net predicted predictedState for node, but the case file actually held actualState as the value of that node, during the performance test of a net. These are the entries of a table traditionally called the "confusion matrix".

For each case, the "prediction" is formed by reading the values of the "observed nodes" of that case in the file, using them to update beliefs in the net, and then picking the state of node which has the highest resultant belief (posterior probability) to be the prediction. The set of "observed nodes" is specified when creating the NetTester.

node is required to have been in the testNodes list originally passed to NetTester.

Parameters:
Node    node    The node being examined.
int    predictedState    The state predicted by this net, as compared to the actualState in the case file.
int    actualState    the actual state in the case file, as compared to the predictedState

Version:

Versions 2.08 and later have this method.
In the C Version of the API, this function is named GetTestConfusion_bn.
See Also:
getErrorRate    Get the fraction of test cases for which the prediction failed
getLogLoss    Get the "logarithmic loss" score of the test
getQuadraticLoss    Get the "quadratic loss" score of the test
NetTester    Construct the NetTester object

Example:
See NetTester for a program that creates a NetTester and uses the method below.
The method below appears in examples\TestNet.java, which comes with this distribution:
  
  /*
   * Print a confusion matrix table.
   */
  public static void printConfusionMatrix (NetTester nt, Node node) throws NeticaException {
      int numStates = node.getNumStates();
      System.out.println ("\nConfusion matrix for " + node.getName() + ":");
  
      // Header row: one column per predicted state
      for (int i = 0;  i < numStates;  ++i) {
          System.out.print ("\t" + node.state(i).getName());
      }
      System.out.println ("\tActual");
  
      // One row per actual state (a), one column per predicted state (p)
      for (int a = 0;  a < numStates;  ++a) {
          for (int p = 0;  p < numStates;  ++p) {
              System.out.print ("\t" + (int) (nt.getConfusion (node, p, a)));
          }
          System.out.println ("\t" + node.state(a).getName());
      }
  }
  
  // Sample output:
  
  Confusion matrix for Cancer:
          Present Absent  Actual
          11      1       Present
          4       184     Absent
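
The way the confusion counts are accumulated can be illustrated without Netica. The following is a minimal sketch in plain Java, not part of the Netica API: the class ConfusionSketch and its methods are hypothetical names, the beliefs are made-up numbers standing in for the posteriors Netica would compute, and breaking ties in favor of the lowest-numbered state is an assumption (this page does not say how ties are resolved).

```java
public class ConfusionSketch {

    /** Index of the state with the highest belief (ties go to the lowest index, by assumption). */
    static int mostLikelyState(double[] beliefs) {
        int best = 0;
        for (int s = 1; s < beliefs.length; s++) {
            if (beliefs[s] > beliefs[best]) {
                best = s;
            }
        }
        return best;
    }

    /**
     * Accumulates counts[predicted][actual], mirroring the argument order of
     * getConfusion (node, predictedState, actualState).
     */
    static double[][] confusion(double[][] beliefsPerCase, int[] actualState, int numStates) {
        double[][] counts = new double[numStates][numStates];
        for (int i = 0; i < beliefsPerCase.length; i++) {
            int predicted = mostLikelyState(beliefsPerCase[i]);
            counts[predicted][actualState[i]] += 1.0;
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up beliefs for three cases of a two-state node.
        double[][] beliefs = { {0.8, 0.2}, {0.3, 0.7}, {0.6, 0.4} };
        int[] actual = { 0, 0, 1 };
        double[][] counts = confusion(beliefs, actual, 2);
        for (int a = 0; a < 2; a++) {          // one row per actual state
            for (int p = 0; p < 2; p++) {      // one column per predicted state
                System.out.print("\t" + (int) counts[p][a]);
            }
            System.out.println("\tactual state " + a);
        }
    }
}
```

The counts are kept as double to match the return type of getConfusion.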

public double getErrorRate (
 Node  node 
) throws NeticaException
Returns the accumulated "error rate" of node under the tests previously performed with testWithCaseset. This is the fraction of times the Net predicted or diagnosed states incorrectly for node, out of all the cases which provided a value for node.

For each case, the "prediction" is formed by reading the values of the "observed nodes" of that case in the file, using them to update beliefs in the net, and then picking the state of node which has the highest resultant belief (posterior probability) to be the prediction. The set of "observed nodes" is specified when creating the NetTester.

A result of 0.0 means no prediction errors, whereas a result of 1.0 means all predictions were in error.

node is required to have been in the testNodes list originally passed to NetTester.

Parameters:
Node    node    The node being examined.

Version:

Versions 2.08 and later have this method.
In the C Version of the API, this function is named GetTestErrorRate_bn.
See Also:
getConfusion    Get elements of the confusion matrix
getLogLoss    Get the "logarithmic loss" score of the test
getQuadraticLoss    Get the "quadratic loss" score of the test
NetTester    Construct the NetTester object

Example:
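The sample confusion matrix shown in the NetTester constructor example can be used to illustrate how the error rate relates to the confusion counts. The following is a minimal sketch in plain Java, not Netica code: the class ErrorRateSketch and the method errorRate are hypothetical names, and it assumes every case supplied a value for the node, so that the confusion counts cover exactly the cases the error rate is averaged over.

```java
public class ErrorRateSketch {

    /**
     * Fraction of cases where the predicted state differs from the actual state.
     * confusion[actual][predicted] follows the layout of the printed table:
     * one row per actual state, one column per predicted state.
     */
    static double errorRate(double[][] confusion) {
        double total = 0.0;
        double wrong = 0.0;
        for (int a = 0; a < confusion.length; a++) {
            for (int p = 0; p < confusion[a].length; p++) {
                total += confusion[a][p];
                if (a != p) {
                    wrong += confusion[a][p];   // off-diagonal entries are mispredictions
                }
            }
        }
        return wrong / total;
    }

    public static void main(String[] args) {
        // Counts copied from the sample output for the Smoking node.
        double[][] smoking = {
            { 52, 32 },   // actual Smoker
            { 37, 79 }    // actual NonSmoker
        };
        System.out.println("Error rate for Smoking = " + errorRate(smoking));
    }
}
```

Here the off-diagonal entries sum to 32 + 37 = 69 out of 200 cases, which agrees with the error rate of 0.345 shown in that sample output.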

public double getLogLoss (
 Node  node 
) throws NeticaException
Returns the "logarithmic loss" of node under the tests previously performed with testWithCaseset.

The "logarithmic loss" is defined as:  MOAC [ - log (p_c) ]

  where MOAC stands for the mean (average) over all cases (i.e., all cases
  for which the case file provides a value for the node in question), and

  where log (p_c) is the natural logarithm of p_c, the probability predicted for the state that turns out to be correct.

Values for logarithmic loss vary from 0 to infinity (inclusive), with 0 being a perfect score. If you must use a single number to grade the predictive/diagnostic quality of a net with respect to a certain discrete node, then we recommend the logarithmic loss.

node is required to have been in the testNodes list originally passed to NetTester.

Parameters:
Node    node    The node being examined.

Version:

Versions 2.08 and later have this method.
In the C Version of the API, this function is named GetTestLogLoss_bn.
See Also:
getConfusion    Get elements of the confusion matrix
getErrorRate    Get the fraction of test cases for which the prediction failed
getQuadraticLoss    Get the "quadratic loss" score of the test
NetTester    Construct the NetTester object
Sensitivity.getMutualInfo    Find the mutual info (entropy reduction) between two nodes

Example:
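The definition of logarithmic loss given above can be worked through by hand. The following is a minimal sketch in plain Java, not Netica code: the class LogLossSketch and the method logLoss are hypothetical names, and the probabilities are made-up values standing in for the belief Netica assigned to the correct state in each case.

```java
public class LogLossSketch {

    /**
     * Mean over all cases of -ln(p_c), where p_c is the probability that was
     * predicted for the state that turned out to be correct.
     */
    static double logLoss(double[] probOfCorrectState) {
        double sum = 0.0;
        for (double pc : probOfCorrectState) {
            sum += -Math.log(pc);   // natural logarithm; pc == 0 contributes +infinity
        }
        return sum / probOfCorrectState.length;
    }

    public static void main(String[] args) {
        // Made-up probabilities for the correct state in three cases.
        double[] pc = { 0.9, 0.6, 0.5 };
        System.out.println("Logarithmic loss = " + logLoss(pc));
    }
}
```

As a point of reference, a two-state node that is always predicted at 0.5 scores ln 2 (about 0.693), and a single case in which the correct state was given probability 0 makes the loss infinite.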

public double getQuadraticLoss (
 Node  node 
) throws NeticaException
Returns the "quadratic loss" of node under the tests previously performed.

The "quadratic loss" (also known as "Brier score") is defined as:  MOAC [ 1 - 2 * p_c + sum[j=1 to n] (p_j^2) ]

  where MOAC stands for the mean (average) over all cases (i.e., all cases
  for which the case file provides a value for the node in question),

  where p_c is the probability predicted for the state that turns out to be correct,

  where p_j is the probability predicted for state j, and

  where n is the number of states of the node.

Values for quadratic loss vary from 0 to 2, with 0 being a perfect score.

node is required to have been in the testNodes list originally passed to NetTester.

Parameters:
Node    node    The node being examined.

Version:

Versions 2.08 and later have this method.
In the C Version of the API, this function is named GetTestQuadraticLoss_bn.
See Also:
getConfusion    Get elements of the confusion matrix
getErrorRate    Get the fraction of test cases for which the prediction failed
getLogLoss    Get the "logarithmic loss" score of the test
NetTester    Construct the NetTester object
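
Example:
The definition of quadratic loss given above can be worked through by hand. The following is a minimal sketch in plain Java, not Netica code: the class QuadraticLossSketch and the method quadraticLoss are hypothetical names, and the beliefs are made-up values standing in for the posteriors Netica would compute for each case.

```java
public class QuadraticLossSketch {

    /**
     * Mean over all cases of 1 - 2 * p_c + sum over j of p_j^2, where p_j is the
     * probability predicted for state j and p_c is the one for the correct state.
     */
    static double quadraticLoss(double[][] beliefsPerCase, int[] actualState) {
        double sum = 0.0;
        for (int i = 0; i < beliefsPerCase.length; i++) {
            double sumOfSquares = 0.0;
            for (double pj : beliefsPerCase[i]) {
                sumOfSquares += pj * pj;
            }
            double pc = beliefsPerCase[i][actualState[i]];
            sum += 1.0 - 2.0 * pc + sumOfSquares;
        }
        return sum / beliefsPerCase.length;
    }

    public static void main(String[] args) {
        // Made-up beliefs for three cases of a two-state node.
        double[][] beliefs = { {0.9, 0.1}, {0.4, 0.6}, {0.5, 0.5} };
        int[] actual = { 0, 0, 1 };
        System.out.println("Quadratic loss = " + quadraticLoss(beliefs, actual));
    }
}
```

The two extremes of the stated range follow directly: probability 1 on the correct state gives 1 - 2 + 1 = 0, and probability 1 on a wrong state gives 1 - 0 + 1 = 2.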


public void testWithCaseset (
 Caseset  caseset 
) throws NeticaException
Scans through the case data in caseset to do a number of performance tests on a Bayes net (specified when creating this NetTester).

Netica will pass through the caseset, processing the cases one-by-one. Netica first reads in the case, except for any findings for the unobserved nodes (specified when creating the NetTester). It then does belief updating to generate beliefs for each of the unobserved nodes, and checks those beliefs against the true value for those nodes as supplied by the case file (if they are supplied for that case). It accumulates all the comparisons into summary statistics (which may be retrieved by the various get... methods).

IMPORTANT: Before calling testWithCaseset, you may want to call retractFindings to remove any findings entered, because otherwise those findings will be considered while testing each case in the file.

The net must be compiled (see Net.compile) before calling this.

This method can be called multiple times with different files to accumulate the results of all the cases.

Calls to this method can be intermingled with calls to getConfusion, getErrorRate, getLogLoss, and getQuadraticLoss.

This function will properly support a 'NumCases' column in any case file used to create the caseset, if such a column was present.

Parameters:
Caseset    caseset    the case data to test the net with

Version:

Versions 2.08 and later have this method. In versions previous to 3.15, this method was named testWithFile.
In the C Version of the API, this function is named TestWithCaseset_bn.
See Also:
NetTester    Construct the NetTester object

Example: