PaToH (Partitioning Tools for Hypergraph) is an extremely fast multilevel hypergraph partitioning tool. Important features of PaToH:

You can find more information about PaToH, as well as binary distributions for various platforms here.


DGL (Deep Graph Library) is a graph machine learning library. It provides fast and memory-efficient message-passing primitives for training Graph Neural Networks. In collaboration with the AWS Shanghai AI Lab, we are building dgl.graphbolt, a highly optimized multi-GPU GNN data-loading library. Important features of GraphBolt:

You can find more information about GraphBolt, as well as information on how to get started here.


ElGA (Elastic Graph Analysis) is a distributed graph analysis system that is designed for handling dynamic graphs. It is elastic and so can scale during operation as the dynamic graphs grow and shrink. More information about ElGA can be found here.


BOA (Bucket-Order-Assemble) is a parallel de novo genome assembly framework that utilizes hypergraph and graph partitioning. This library is designed to improve assembly quality as well as expose a high degree of parallelism for standalone assemblers. More information and source code of BOA can be found here.


PIGO (a Parallel Graph Input and Output library) is a library built to assist you with common sparse graph or matrix input, output, and preprocessing. It supports easily loading a variety of graph formats rapidly in parallel, performing standard preprocessing, and saving results and intermediate representations. More information about PIGO can be found here.


HiSVSIM is a hierarchical state vector simulator of quantum circuits that works with a valid acyclic partitioning of the DAG representation of quantum circuits. More information about HiSVSIM can be found here.


SARMA (SpatiAl Rectilinear Matrix pArtitioning) is a template-based, header-only library for spatial rectilinear partitioning. The main goal of this library is to introduce novel symmetric rectilinear partitioning algorithms. More information about SARMA can be found here.


bbTC (A Block Based Triangle Counting Algorithm on Heterogeneous Environments) is a triangle counting algorithm for shared-memory heterogeneous systems with CPUs and GPUs. More information about bbTC can be found here.
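
For readers unfamiliar with the underlying problem, the following is a minimal sequential sketch of triangle counting by set intersection. It is not bbTC's block-based algorithm or its API; the function name `count_triangles` and the edge-list input are assumptions made for illustration.

```python
def count_triangles(edges):
    """Count triangles in an undirected simple graph given as (u, v) pairs."""
    # Orient every edge from the smaller to the larger vertex ID so that
    # each triangle is discovered exactly once.
    out = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        a, b = (u, v) if u < v else (v, u)
        out.setdefault(a, set()).add(b)
        out.setdefault(b, set())
    # For each oriented edge (u, v), every common out-neighbor closes a triangle.
    return sum(len(out[u] & out[v]) for u in out for v in out[u])
```

Orienting the edges avoids counting the same triangle from each of its three corners, which is the usual starting point before work is split across devices.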


gsaNA is an iterative labeled network aligner that leverages a global structure-based vertex positioning technique to reduce the problem size and produce high-quality alignments. More information about gsaNA can be found here.


dagP is a fast multilevel directed acyclic graph partitioning tool. You can find more information about dagP here.

dagP Scheduler

dagPscheduler is a static scheduler built on top of the dagP partitioning tool. You can find more information about dagPscheduler here.

Nucleus Decomposition

Nucleus decomposition is a framework for finding a hierarchy of dense subgraphs, based on a generalization of the k-core concept. It also includes a visualization feature to show the hierarchy between dense subgraphs. More information about nucleus decomposition can be found here.
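
As background, the k-core concept that nucleus decomposition generalizes can be computed by repeatedly peeling a vertex of minimum degree. The sketch below shows only that base case, under the assumption of an undirected simple graph given as an edge list; `core_numbers` is a hypothetical name, not part of the nucleus decomposition code.

```python
from collections import defaultdict


def core_numbers(edges):
    """Return {vertex: core number} for an undirected simple graph."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # Peel a vertex of minimum remaining degree. A linear scan keeps the
        # sketch short; bucket queues make this step constant time.
        v = min(remaining, key=degree.__getitem__)
        k = max(k, degree[v])  # core numbers never decrease while peeling
        core[v] = k
        remaining.remove(v)
        for w in adj[v]:
            if w in remaining:
                degree[w] -= 1
    return core
```

For a triangle with one pendant vertex attached, the pendant vertex gets core number 1 and the three triangle vertices get core number 2.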

S3G2 - Scalable Shell Sequence Graph Generator

S3G2 is a sequential/distributed/shared-memory synthetic graph generator that uses a distribution histogram of k-shell values as input. You can find more information about S3G2 here.


matchmaker2 is a framework for maximum cardinality matching algorithms on bipartite graphs. It includes several GPU-based maximum cardinality matching algorithms in addition to sequential ones. More information about matchmaker2 can be found here.
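
To make the problem concrete, here is a minimal sequential sketch of maximum cardinality bipartite matching using the classic augmenting-path approach. It is not one of matchmaker2's algorithms or its interface; `max_bipartite_matching` and its adjacency-list input are assumptions for illustration.

```python
def max_bipartite_matching(adj, n_right):
    """adj[u] lists the right-side neighbors of left vertex u.

    Returns (matching size, match_right), where match_right[v] is the left
    vertex matched to right vertex v, or -1 if v is unmatched.
    """
    match_right = [-1] * n_right

    def try_augment(u, visited):
        # Depth-first search for an augmenting path starting at left vertex u.
        for v in adj[u]:
            if v in visited:
                continue
            visited.add(v)
            # v is free, or its current partner can be re-matched elsewhere.
            if match_right[v] == -1 or try_augment(match_right[v], visited):
                match_right[v] = u
                return True
        return False

    size = sum(try_augment(u, set()) for u in range(len(adj)))
    return size, match_right
```

The recursion depth grows with the length of augmenting paths, so this form suits small inputs only.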


BADIOS is a framework to shatter and compress graphs for fast betweenness centrality computation. It also includes a preordering procedure. More information about BADIOS can be found here.


gpuBC is software that includes a set of techniques to make betweenness centrality computations faster on GPUs as well as on heterogeneous CPU/GPU architectures. Our techniques are based on virtualization of high-degree vertices, strided access to adjacency lists, removal of degree-1 vertices, and graph ordering. More information can be found here.


theadvisor is an academic paper recommendation service that helps researchers with their literature search. The service starts with a simple keyword search or takes a bibliography file (in BibTeX, RIS, or EndNote XML format) of a paper the researcher is currently working on, and suggests other relevant publications. It also gives venue and reviewer recommendations. You can access the service at


mrSNP is a web service that predicts the impact of a SNP in a 3'UTR on miRNA binding. It reduces manual work and allows users to input any SNP that has been captured with any SNP-calling program. More information about mrSNP can be found here.


CPB (Correlated Patterns Biclustering) is a biclustering-based tool to mine genes that are co-regulated with a given reference gene in order to discover genes that function in a common biological process. More information about CPB can be found here.


MICA is an R package designed to integrate microRNA and mRNA expression to better discover active modules from the protein-protein interaction (PPI) network. More information about MICA can be found here.


BiBench (Bicluster Benchmarking) is a Python library designed to make biclustering analysis easy by providing a common interface to several biclustering algorithms. It also provides features such as generation of synthetic datasets for different bicluster models, transformations of the datasets, and validation and visualization of the algorithms' findings. More information about BiBench can be found here.

Benchmarking short sequence mapping tools

This is a benchmarking suite that extensively analyzes sequencing tools with respect to various aspects and provides an objective comparison. Information about the tools included in the comparison and the options used in the experiments, as well as the code used to verify the tools, can be found here.


pMap is an MPI-based tool to parallelize the alignment step of state-of-the-art sequence mapping programs. It allows transparent execution of the alignment step of a selected program in parallel on a compute-cluster. pMap is publicly available and currently supports BWA, SOAP, Bowtie, GSNAP, MAQ and RMAP. More information about pMap can be found here.


SPart is a C++ library for partitioning a spatially located workload into balanced parts. SPart provides numerous algorithms to partition one-dimensional workloads into intervals and two-dimensional workloads into rectangles. Spatial partitioning techniques are commonly used to distribute scientific applications including particle-in-cell simulation, direct volume rendering, linear algebra, and collision detection. More information about SPart can be found here.
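
The one-dimensional case can be illustrated with a short sketch: cut a sequence of task weights into a fixed number of consecutive intervals so that the heaviest interval is as light as possible. The code below uses a greedy feasibility probe inside a binary search over the bottleneck value, assuming a non-empty list of non-negative integer weights. It is written in Python for brevity and does not reflect SPart's C++ interface; `probe` and `partition_1d` are hypothetical names.

```python
def probe(weights, parts, bottleneck):
    """Greedy check: do at most `parts` intervals of load <= bottleneck suffice?"""
    used, load = 1, 0
    for w in weights:
        if w > bottleneck:
            return False  # a single task already exceeds the bound
        if load + w > bottleneck:
            used += 1  # close the current interval and start a new one
            load = w
        else:
            load += w
    return used <= parts


def partition_1d(weights, parts):
    """Return (optimal bottleneck, cut indices) for non-empty integer weights."""
    lo, hi = max(weights), sum(weights)
    while lo < hi:  # binary search for the smallest feasible bottleneck
        mid = (lo + hi) // 2
        if probe(weights, parts, mid):
            hi = mid
        else:
            lo = mid + 1
    # Replay the greedy pass to recover where each new interval begins.
    cuts, load = [], 0
    for i, w in enumerate(weights):
        if load + w > lo:
            cuts.append(i)
            load = w
        else:
            load += w
    return lo, cuts
```

For example, `partition_1d([1, 2, 3, 4, 5, 6, 7, 8, 9], 3)` yields a bottleneck of 17 with new intervals starting at indices 5 and 7.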


Zoltan, developed and maintained by Sandia National Laboratories, is a toolkit for load balancing and parallel data management. In the last couple of years, we have been collaborating with the Zoltan team. As an outcome of this collaboration, we developed a parallel multilevel hypergraph partitioning algorithm (which can be used for static and dynamic load balancing as well as matrix partitioning), and also distance-1 and distance-2 coloring algorithms. The current Zoltan release includes implementations of these algorithms, and its source code is distributed under the GNU Lesser General Public License. More information and the distribution of Zoltan can be found at the project web site.


DataCutter is a component-based middleware framework initially designed to support coarse-grain data-flow execution on heterogeneous environments. In DataCutter, the application processing structure is implemented as a set of components, named filters, that exchange data through logical streams. A stream denotes a unidirectional data flow from one filter (i.e., the producer) to another (i.e., the consumer). A filter is required to read data from its input streams and write data to its output streams only.

The DataCutter runtime system supports both data and task parallelism. Processing, network, and data copying overheads are minimized by the ability to place filters on different platforms. DataCutter's filtering service performs all steps necessary to instantiate filters on the desired hosts, to connect all logical endpoints, and to call the filter's interface functions for processing work. Data exchange between two filters on the same host takes place by memory copy operations, while a message passing communication layer (e.g., TCP sockets or MPI) is used for communication between filters on different hosts.
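
The filter-stream model described above can be mimicked on a single host with threads and in-memory queues. The sketch below is only an analogy in Python, not DataCutter's API: the `END` sentinel, the two filter functions, and `run_pipeline` are all hypothetical names, and each queue stands in for one unidirectional logical stream.

```python
import queue
import threading

END = object()  # sentinel that marks the end of a stream


def producer(out_stream, items):
    """Source filter: writes items to its output stream only."""
    for item in items:
        out_stream.put(item)
    out_stream.put(END)


def square_filter(in_stream, out_stream):
    """Reads from its input stream and writes squares to its output stream."""
    while (item := in_stream.get()) is not END:
        out_stream.put(item * item)
    out_stream.put(END)


def run_pipeline(items):
    # Each queue is a one-way stream connecting a producer to a consumer.
    s1, s2 = queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=producer, args=(s1, items)),
        threading.Thread(target=square_filter, args=(s1, s2)),
    ]
    for t in threads:
        t.start()
    results = []
    while (item := s2.get()) is not END:
        results.append(item)
    for t in threads:
        t.join()
    return results
```

Because the filters touch only their own streams, either one could be moved to another host by swapping the queue for a socket or MPI channel without changing the filter body, which is the property the runtime relies on when placing filters.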

We are currently developing a light-weight version of DataCutter, DataCutter-Lite, for multi-to-many core architectures.