System Requirements, Speed, Scalability.

  • Software. The program uses standard Java and MySQL. Your Java installation should be recent enough to run Version 4.3.10 of Tetrad. JRE 1.6 should suffice. For MySQL, we have used version 5.0.95 and later.

  • Memory. For larger databases, the current version of the system requires a substantial amount of memory. For the Java heap space, we have needed 4 GB. For MySQL, to be on the safe side, you should allow temporary file space of around 3 times the size of the original database. The memory requirements are mainly due to the need to compute and store large contingency tables. We are working on reducing these requirements.

  • Scalability. The algorithms scale well in terms of the number of rows in a table. Hundreds of thousands of rows should be no problem, and even millions are feasible if your database server can handle them. The program does not scale as well in terms of the number of columns and the number of relationships. Self-relationships (e.g., Friend(Person,Person)) also increase the complexity. The reason is that the size of the contingency tables that are the input to a Bayes net learner grows exponentially with the number of columns and relationships. We are researching ways to reduce the memory cost due to large contingency tables.
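
To make the exponential growth concrete, here is a minimal Java sketch that estimates the worst-case size of a contingency table. It is an illustration only, not part of the program: the class ContingencyTableEstimate, its methods, and the figure of 8 bytes per cell are assumptions made for this example, and the program's actual storage format may be more compact. The worst case assumes that every combination of column values occurs in the data.

```java
import java.math.BigInteger;
import java.util.Arrays;

// Hypothetical helper for illustration; not part of the program described here.
class ContingencyTableEstimate {

    // Worst-case number of cells: the product of the number of distinct
    // values of each column (relationship indicators count as columns).
    static BigInteger cellCount(int[] valuesPerColumn) {
        BigInteger cells = BigInteger.ONE;
        for (int v : valuesPerColumn) {
            if (v < 1) {
                throw new IllegalArgumentException("each column needs at least one value");
            }
            cells = cells.multiply(BigInteger.valueOf(v));
        }
        return cells;
    }

    // Rough size in bytes, ASSUMING one 8-byte count per cell.
    static BigInteger estimatedBytes(int[] valuesPerColumn) {
        return cellCount(valuesPerColumn).multiply(BigInteger.valueOf(8));
    }

    // Compares the estimate with the maximum heap the JVM will try to use.
    static boolean fitsInHeap(int[] valuesPerColumn) {
        BigInteger maxHeap = BigInteger.valueOf(Runtime.getRuntime().maxMemory());
        return estimatedBytes(valuesPerColumn).compareTo(maxHeap) <= 0;
    }

    public static void main(String[] args) {
        // Ten versus twenty columns, each with three distinct values.
        int[] tenColumns = new int[10];
        int[] twentyColumns = new int[20];
        Arrays.fill(tenColumns, 3);
        Arrays.fill(twentyColumns, 3);

        System.out.println("10 columns: " + cellCount(tenColumns) + " cells");
        System.out.println("20 columns: " + cellCount(twentyColumns) + " cells");
        System.out.println("20 columns fit in heap: " + fitsInHeap(twentyColumns));
    }
}
```

Under these assumptions, doubling the number of three-valued columns from 10 to 20 raises the worst-case cell count from 59,049 to 3,486,784,401, or roughly 28 GB at 8 bytes per cell, whereas doubling the number of rows changes only the counts stored in the cells. The heap limit that Runtime.maxMemory() reports is the one set with the standard JVM option -Xmx (for example, -Xmx4g).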

The following table reports learning times for the program when run on a system with these resources: QEMU Virtual CPU single processor (cpu64-rhel6), 19 GB RAM, 26.4 GB hard disk storage. This should give you an idea of the runtime to expect.

Name                    Number of Tables  Total Number of Tuples  Maximum Number of Rows in Table  Time to learn Bayes Net (ms / min)
unielwin                5                 336                     92                               7,358 ms / 0.1 min
Mutagenesis_std         4                 24,326                  4,893                            19,457 ms / 0.32 min
MovieLens_std           3                 83,402                  79,779                           7,339 ms / 0.1 min
imdb_MovieLens          7                 1,251,038               996,515                          14,606,026 ms / 243.4 min
Hepatitis_std           7                 11,316                  5,691                            12,470 ms / 0.2 min
English Premier League  8                 12,081                  4,954                            141,702 ms / 2.4 min