Director:Philippsen, M.
Period:October 1, 2009 - October 1, 2015
Coworkers:Veldema, R.; Dotzler, G.; Blaß, T.

JaMP is an implementation of the well-known OpenMP standard, adapted for Java. JaMP allows one to program, for instance, a parallel for loop or a barrier without resorting to low-level thread programming. For example:

class Test {
    void foo() {
        //#omp parallel for
        for (int i = 0; i < N; i++) {
            a[i] = b[i] + c[i];
        }
    }
}

is valid JaMP code. JaMP currently supports all of OpenMP 2.0, with partial support for 3.0 features, e.g., the collapse clause. JaMP generates pure Java 1.5 code that runs on every JVM. It also translates parallel for loops to CUDA-enabled graphics cards for extra speed gains; if a particular loop is not CUDA-able, it is translated to a threaded version that uses the cores of a typical multi-core machine.

JaMP also supports the use of multiple machines and compute accelerators to solve a single problem. This is achieved by means of two abstraction layers. The lower layer provides abstract compute devices that wrap the actual CUDA GPUs, OpenCL GPUs, or multi-core CPUs, wherever they might be in a cluster. The upper layer provides partitioned and replicated arrays. A partitioned array automatically partitions itself over the abstract compute devices and takes the individual accelerator speeds into account to achieve an equitable distribution. The JaMP compiler applies code analysis to decide which type of abstract array to use for a specific Java array in the user's program.
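As a further illustration of the loop-level parallelism described above, the following sketch sums an array with an OpenMP-style reduction. It assumes that JaMP accepts standard OpenMP 2.0 clauses such as reduction(+:sum) inside its //#omp comment pragmas; the class and method names are made up for this example. Because the pragmas are ordinary Java comments, a plain javac/JVM compiles and runs the code sequentially.

```java
// Hypothetical example; only the //#omp pragma style is taken from JaMP.
// A plain JVM treats the pragma as a comment and runs the loop sequentially.
class SumExample {
    static double sum(double[] a) {
        double sum = 0.0;
        //#omp parallel for reduction(+:sum)
        for (int i = 0; i < a.length; i++) {
            sum += a[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sum(new double[] {1.0, 2.0, 3.0, 4.0})); // prints 10.0
    }
}
```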

In 2015 we added OpenMP tasks (OpenMP 3.0) to JaMP. This makes it possible to parallelize recursive algorithms with JaMP.
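A minimal sketch of what task-parallel recursion might look like, assuming JaMP's //#omp comment pragmas follow the standard OpenMP 3.0 task syntax (task, taskwait, single); the Fibonacci class here is illustrative, not part of the distribution. On a plain JVM the pragmas are comments and the code runs sequentially.

```java
// Hypothetical example; only the //#omp pragma style is taken from JaMP.
class FibTask {
    static int fib(int n) {
        if (n < 2) return n;
        int x, y;
        //#omp task shared(x)
        x = fib(n - 1);
        //#omp task shared(y)
        y = fib(n - 2);
        //#omp taskwait
        return x + y;
    }

    public static void main(String[] args) {
        //#omp parallel
        {
            //#omp single
            System.out.println(fib(10)); // prints 55
        }
    }
}
```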


  • JaMPCuda-2015-01-16.tar.gz Last Update: 2015-01-16
  • JaMPCudaJar.tar.gz Version without the source code. Last Update: 2010-07-21
  • This is the first version of Java/OpenMP that is pure Java (no Jackal) and is therefore suitable for running on a normal JVM. This version is special in that it allows parallel regions to be executed (transparently) on a CUDA-capable graphics card for more speed. You may need to tune the included setup script to adapt it to your environment. This release is Linux-only.
  • Note that the CUDA version also runs without a GPU. In this case, temporarily install CUDA to satisfy the install script, then install per the instructions. To run without CUDA (using your normal multi-core CPU), start the application as usual without the CudaClassloader.
  • To set the number of cores/threads used, do one of the following: (1) set the JAMP_NUM_THREADS property, or (2) call Omp.omp_set_num_threads(num_threads). There was code to automatically detect the number of cores, but it isn't 100% reliable/portable.
  • Please drop me an email if these links do not work.
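For option (1) above, a minimal sketch of setting the documented JAMP_NUM_THREADS property from inside the program before the JaMP runtime reads it (one can equivalently pass -DJAMP_NUM_THREADS=4 on the command line); option (2), Omp.omp_set_num_threads, needs the JaMP runtime on the classpath and is therefore only shown as a comment:

```java
public class ThreadConfig {
    public static void main(String[] args) {
        // Option (1): set the documented JAMP_NUM_THREADS property.
        System.setProperty("JAMP_NUM_THREADS", "4");
        // Option (2), with the JaMP runtime installed:
        // Omp.omp_set_num_threads(4);
        System.out.println(System.getProperty("JAMP_NUM_THREADS"));
    }
}
```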

Cluster-Java/OpenMP: March 2013 (experimental)

  • jamp-dist-2013-04-04.tgz
  • This version supports multiple GPUs and machines. It is, however, more experimental than the older version.

GPU-GC: a garbage collector for GPU code in a Java-like parallel language.
