Programming the WEKA Datamining Toolkit With Jython

Tuesday, 29 November 2005

WEKA is a data mining library written in Java. It implements many machine learning algorithms for classification, along with some clustering algorithms, and supports them with a good set of data pre-processing utilities. WEKA also has a GUI called the Explorer and a drag-and-drop tool for building custom applications.

However, WEKA does not lend itself to fast prototyping. If I have a dataset and want to try various algorithms on it interactively, or change parameters on the fly, doing so in Java is not very convenient.

It would be good to have an interpreter in which I can play around with data and algorithms.

Jython comes to the rescue. Jython is an implementation of the high-level, dynamic, object-oriented language Python written in 100% Pure Java, and seamlessly integrated with the Java platform. It thus allows you to run Python on any Java platform.

This means we should be able to access all the WEKA classes and methods from Jython, because WEKA uses only Sun's standard Java libraries (including Swing).

Let's see how to do that:

Step 1 : Download and install Jython

Download Jython 2.1 from www.jython.org, then run the downloaded jython21.class file with java from the directory that contains it:

java jython21

The graphical installer will ask you for the installation location. Choose any suitable location. My installation location is /home/pradeep/jython-2.1.

Step 2 : Download WEKA and unzip the archive into a suitable location. Mine is /home/pradeep/weka-3-4-6.

Step 3 : Configure Jython

Open the launcher script /home/pradeep/jython-2.1/jython and edit the paths so that weka.jar is on the classpath:

#!/bin/sh
###############################################################################
#
# This file generated by Jython installer
# Created on XXX by pradeep

"/usr/bin/java" -Dpython.home="/home/pradeep/jython-2.1" \
  -classpath "/home/pradeep/jython-2.1/jython.jar:/home/pradeep/weka-3-4-6/weka.jar:$CLASSPATH" \
  "org.python.util.jython" "$@"

In the script above:

  1. /usr/bin/java is the Java runtime executable.
  2. /home/pradeep/jython-2.1 is the home of the Jython installation.
  3. The string following -classpath is the colon-separated list of all the libraries that you want available inside the Jython environment.
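
To make the structure of that command line explicit, here is a small sketch in plain Python that assembles the same arguments from their parts. The helper names build_classpath and build_command are made up for this illustration and are not part of Jython or WEKA. One difference from the shell script: an empty CLASSPATH is skipped here instead of leaving a trailing colon.

```python
def build_classpath(entries, existing=""):
    """Join classpath entries with ':' (the Unix separator).

    'existing' stands in for the current $CLASSPATH; it is appended
    only when it is non-empty.
    """
    parts = list(entries)
    if existing:
        parts.append(existing)
    return ":".join(parts)


def build_command(java, jython_home, extra_jars, existing_classpath=""):
    """Return the launcher command as a list of arguments."""
    jars = [jython_home + "/jython.jar"] + list(extra_jars)
    return [
        java,                                    # 1. the Java runtime executable
        "-Dpython.home=" + jython_home,          # 2. the Jython installation home
        "-classpath",
        build_classpath(jars, existing_classpath),  # 3. libraries visible to Jython
        "org.python.util.jython",                # main class of the interpreter
    ]


cmd = build_command(
    "/usr/bin/java",
    "/home/pradeep/jython-2.1",
    ["/home/pradeep/weka-3-4-6/weka.jar"],
)
print(" ".join(cmd))
```

Any other jar you want to use from the interpreter would go into the extra_jars list in the same way as weka.jar.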

Step 4 : Testing the setup.

$ /home/pradeep/jython-2.1/jython
Jython 2.1 on java1.4.2-02 (JIT: null)

>>> import weka
>>> dir(weka)
['__name__', 'associations', 'attributeSelection', 'classifiers', 
'clusterers', 'core', 'datagenerators', 'estimators', 'experiment', 'filters', 'gui']
>>>

Now you have access to all the WEKA classes and methods from the Jython prompt.
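
As a rough sketch of what this makes possible, the function below loads an ARFF file and trains a J48 decision tree. It only works inside Jython with weka.jar on the classpath. The function name train_j48 and its min_leaf parameter are my own inventions for the example, and it assumes the class attribute is the last attribute in the file.

```python
def train_j48(arff_path, min_leaf=2):
    """Train a J48 tree on the ARFF file at arff_path (Jython + weka.jar only)."""
    # Under Jython, Java and WEKA classes are imported like Python modules.
    from java.io import BufferedReader, FileReader
    from weka.core import Instances
    from weka.classifiers.trees import J48

    reader = BufferedReader(FileReader(arff_path))
    data = Instances(reader)  # parse the ARFF file into a dataset
    reader.close()

    # Assumption: the class attribute is the last one in the file.
    data.setClassIndex(data.numAttributes() - 1)

    tree = J48()
    tree.setMinNumObj(min_leaf)  # minimum number of instances per leaf
    tree.buildClassifier(data)
    return tree
```

At the prompt you would call train_j48 with the path to one of your own ARFF files and print the result to see the tree. Calling it again with, say, min_leaf=10 retrains with a different setting without any recompilation, which is the kind of on-the-fly experiment that is awkward in plain Java.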