TDA Lab - Software - BiBench

bibench Package

BiBench: a framework for biclustering tasks.

all Module

bicluster Module

Classes for representing biclusters, and some utility functions for common bicluster tasks, such as I/O.

class bibench.bicluster.Bicluster(rows, cols, data=None)[source]

A class for representing biclusters.

area()[source]

Returns the number of elements in this bicluster.

array(rows=None, cols=None)[source]

Returns this bicluster as a numpy array, taken from data at the given row and column indices.

Note: requires that this Bicluster’s data member is not None.

Args:
  • rows: the row indices to use; defaults to this bicluster’s rows.
  • cols: the column indices to use; defaults to this bicluster’s columns.
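
As a minimal sketch of the kind of selection array() describes, the snippet below picks the cells at the crossing of given row and column indices from a nested list. The helper name submatrix is hypothetical and not part of BiBench; with a NumPy array the same selection can be written as data[np.ix_(rows, cols)].

```python
def submatrix(data, rows, cols):
    """Return the cells of `data` at the crossing of `rows` and `cols`."""
    return [[data[r][c] for c in cols] for r in rows]


data = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]

# Rows 0 and 2, columns 1 and 3.
sub = submatrix(data, rows=[0, 2], cols=[1, 3])
print(sub)  # [[2, 4], [10, 12]]
```
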
copy()[source]

Returns a deep copy of this instance.

difference(other)[source]

Returns the difference of two biclusters.

Args:
  • other: a Bicluster
Returns:

A Bicluster instance with self’s rows and columns, but not other’s.

If other and self have the same data attribute, the returned Bicluster also has it; else its data attribute is None.

filter_cols()[source]

Returns the dataset with only the columns from this bicluster.

Note: requires that this Bicluster’s data member is not None.

filter_rows()[source]

Returns the dataset with only the rows from this bicluster.

Note: requires that this Bicluster’s data member is not None.

intersection(other)[source]

Returns a new bicluster with common rows and columns.

Args:
  • other: a Bicluster
Returns:

A Bicluster instance, with rows and columns common to both self and other.

If other and self have the same data attribute, the returned Bicluster also has it; else its data attribute is None.

issubset(other)[source]

Returns True if self’s rows and columns are both subsets of other’s; else False.

overlap(other)[source]

Returns the ratio of the overlap area to self’s total size.
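
Reading "overlap area" as the number of cells shared by both biclusters, the ratio can be sketched as below. That reading and the helper name overlap_ratio are assumptions made for illustration, not BiBench's implementation.

```python
def overlap_ratio(rows_a, cols_a, rows_b, cols_b):
    """Shared area of A and B divided by the area of A (assumed definition)."""
    shared_rows = set(rows_a) & set(rows_b)
    shared_cols = set(cols_a) & set(cols_b)
    area_a = len(set(rows_a)) * len(set(cols_a))
    if area_a == 0:
        return 0.0
    return len(shared_rows) * len(shared_cols) / area_a


# A has 4 rows x 2 cols = 8 cells; 2 rows x 1 col = 2 cells are shared with B.
ratio = overlap_ratio([0, 1, 2, 3], [0, 1], [2, 3, 4], [1, 2])
print(ratio)  # 0.25
```

Under this reading the measure is not symmetric: swapping the two biclusters divides the same shared area by the other bicluster's size.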

shape()[source]

Returns the number of rows and columns in this bicluster.

symmetric_difference(other)[source]

Returns a new bicluster with only unique rows and columns, i.e. the inverse of the intersection.

Args:
  • other: a Bicluster
Returns:

A Bicluster instance with all rows and columns unique to either self or other.

If other and self have the same data attribute, the returned Bicluster also has it; else its data attribute is None.

union(other)[source]

Returns a new bicluster with union of rows and columns.

Args:
  • other: a Bicluster
Returns:

A Bicluster instance with all rows and columns from both self and other.

If other and self have the same data attribute, the returned Bicluster also has it; else its data attribute is None.
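
The set-style methods above apply one set operation to the rows and the same operation to the columns. The following sketch shows that idea with a hypothetical stand-in class, SketchBicluster, which is not the library's Bicluster and leaves out the handling of the data attribute.

```python
class SketchBicluster:
    """Hypothetical stand-in for Bicluster; stores rows and cols as sets."""

    def __init__(self, rows, cols):
        self.rows = frozenset(rows)
        self.cols = frozenset(cols)

    def _combine(self, other, op):
        # Apply the same set operation to rows and to columns.
        return SketchBicluster(op(self.rows, other.rows),
                               op(self.cols, other.cols))

    def intersection(self, other):
        return self._combine(other, frozenset.intersection)

    def union(self, other):
        return self._combine(other, frozenset.union)

    def difference(self, other):
        return self._combine(other, frozenset.difference)

    def symmetric_difference(self, other):
        return self._combine(other, frozenset.symmetric_difference)

    def issubset(self, other):
        return self.rows <= other.rows and self.cols <= other.cols


a = SketchBicluster(rows=[0, 1, 2], cols=[0, 1])
b = SketchBicluster(rows=[1, 2, 3], cols=[1, 2])

common = a.intersection(b)
print(sorted(common.rows), sorted(common.cols))  # [1, 2] [1]
print(common.issubset(a))                        # True
```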

class bibench.bicluster.BiclusterList(itr, algorithm=None, arguments=None, properties=None)[source]

Bases: list

A list of biclusters with three extra attributes:

  • alg: the algorithm that generated these biclusters
  • args: the arguments to ‘alg’
  • properties: properties, such as likelihood, of this clustering, if any.
bibench.bicluster.bicluster_algorithm(f)[source]

Decorator to automatically set ‘alg’ and ‘args’ attribute of results of a biclustering algorithm.
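
A decorator of this kind can be sketched as follows. TaggedList and tagging_algorithm are hypothetical names standing in for BiclusterList and bicluster_algorithm, and recording only the keyword arguments is an assumption made to keep the example short.

```python
import functools


class TaggedList(list):
    """Hypothetical stand-in for BiclusterList: a list with extra attributes."""

    def __init__(self, itr, algorithm=None, arguments=None):
        super().__init__(itr)
        self.alg = algorithm
        self.args = arguments


def tagging_algorithm(f):
    """Wrap f so its result records which function and arguments produced it."""
    @functools.wraps(f)
    def wrapper(*args, **kwargs):
        result = f(*args, **kwargs)
        # Assumption: only keyword arguments are recorded.
        return TaggedList(result, algorithm=f.__name__, arguments=kwargs)
    return wrapper


@tagging_algorithm
def toy_algorithm(data, nclus=2):
    # Placeholder "biclusters": plain labels instead of real results.
    return ["bic%d" % i for i in range(nclus)]


found = toy_algorithm([[0]], nclus=3)
print(found, found.alg, found.args)
```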

bibench.bicluster.filter(biclusters, minrows=2, mincols=2, max_overlap=1.0, remove_subsets=True, datashape=None)[source]

Removes duplicates, small biclusters, overlapping biclusters, and biclusters that are as large as the dataset from a list.

Args:
  • biclusters: a list of biclusters to filter.
  • minrows: the minimum allowed number of rows.
  • mincols: the minimum allowed number of columns.
  • max_overlap: the maximum allowed overlap between any two biclusters, as a fraction; a float between 0 and 1.
  • remove_subsets: filter out biclusters that are subsets of existing biclusters.
  • datashape: the shape of the dataset; use if bicluster.data is None.

Returns:
A sublist of the given biclusters.
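
The size and subset criteria can be sketched as below, with each bicluster modelled as a (rows, cols) pair of frozensets. The function filter_sketch is hypothetical; it leaves out max_overlap and datashape, and visiting larger biclusters first is a design choice of this sketch, not a description of the library's behaviour.

```python
def filter_sketch(biclusters, minrows=2, mincols=2, remove_subsets=True):
    """Keep (rows, cols) pairs that are large enough and not inside a kept pair."""
    kept = []
    # Visit larger biclusters first so subsets are compared against supersets.
    by_area = sorted(biclusters, key=lambda b: len(b[0]) * len(b[1]), reverse=True)
    for rows, cols in by_area:
        if len(rows) < minrows or len(cols) < mincols:
            continue  # too small
        if remove_subsets and any(rows <= r and cols <= c for r, c in kept):
            continue  # duplicate or subset of a bicluster already kept
        kept.append((rows, cols))
    return kept


big = (frozenset({0, 1, 2}), frozenset({0, 1, 2}))
candidates = [
    (frozenset({0, 1}), frozenset({0, 1})),   # subset of `big`
    big,
    (frozenset({5}), frozenset({0, 1, 2})),   # only one row
    big,                                      # duplicate
]
result = filter_sketch(candidates)
print(len(result))  # 1
```
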
bibench.bicluster.get_row_col_matrices(biclusters)[source]

Returns the row-by-bicluster and column-by-bicluster membership matrices for the given list of biclusters.

Requires that ‘data’ member be set and equal for all biclusters.

Args:
  • biclusters: a list of Bicluster instances.
Returns:

The tuple (rowmatrix, colmatrix), where rowmatrix has dimensions m by len(biclusters) and colmatrix has dimensions n by len(biclusters), where the dataset has m rows and n columns.

Element rowmatrix[x, y] is 1 if row x is in bicluster y, else it is zero. Element colmatrix[x, y] is 1 if column x is in bicluster y, else zero.
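
The layout of the two matrices can be illustrated with plain nested lists. The helper membership_matrices is hypothetical and takes the dataset dimensions explicitly, whereas the documented function reads them from the biclusters' data member.

```python
def membership_matrices(biclusters, nrows, ncols):
    """Build 0/1 indicator matrices from (rows, cols) index pairs."""
    k = len(biclusters)
    rowmatrix = [[0] * k for _ in range(nrows)]
    colmatrix = [[0] * k for _ in range(ncols)]
    for y, (rows, cols) in enumerate(biclusters):
        for x in rows:
            rowmatrix[x][y] = 1  # row x belongs to bicluster y
        for x in cols:
            colmatrix[x][y] = 1  # column x belongs to bicluster y
    return rowmatrix, colmatrix


# Two biclusters in a dataset with 3 rows and 3 columns.
bics = [([0, 1], [0]), ([1, 2], [1, 2])]
rowmatrix, colmatrix = membership_matrices(bics, nrows=3, ncols=3)
print(rowmatrix)  # [[1, 0], [1, 1], [0, 1]]
print(colmatrix)  # [[1, 0], [0, 1], [0, 1]]
```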

bibench.bicluster.read_biclusters(filename)[source]

Reads the biclusters from a file written by write_biclusters().

Args:
  • filename: a string.
bibench.bicluster.write_biclusters(biclusters, filename)[source]

Write biclusters to an output file.

Uses the format:

<rows> <cols>

separated by empty lines.

Args:
  • biclusters: the list of biclusters that will be written to the file
  • filename: a string containing the output file name.

rutil Module

util Module

bibench.util.bootstrap(data, size)[source]

Bootstrap a new dataset, of any size, from the given dataset, with replacement.

Args:
  • data: numpy.ndarray
  • size: int or sequence of ints
Returns:
A numpy.ndarray
bibench.util.dict_combinations(d)[source]

Takes a dictionary containing lists. Generates all combinations of values from those lists.

Useful for ranges of parameters for functions.

>>> [i for i in dict_combinations(dict(first=[1,2]))]
[{'first': 1}, {'first': 2}]
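
With more than one key, the combinations form a Cartesian product of the value lists. A minimal sketch using itertools.product follows; the name dict_combinations_sketch and the parameter names in the example are illustrative, and the order in which the library yields combinations is not specified here.

```python
import itertools


def dict_combinations_sketch(d):
    """Yield one dict per combination of the values listed under each key."""
    keys = list(d)
    for values in itertools.product(*(d[k] for k in keys)):
        yield dict(zip(keys, values))


combos = list(dict_combinations_sketch({"k": [2, 3], "metric": ["euclid"]}))
print(combos)
```
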
bibench.util.flatten(nested)[source]

Flatten a list of lists into a single list.

>>> flatten([[1, 2, 3], [4, 5, 6]])
[1, 2, 3, 4, 5, 6]
bibench.util.get_hidden_dir(subdir=None)[source]

Get the BiBench cache directory, and create it if necessary.

Args:
  • subdir: a subdir to create if it does not exist.
bibench.util.grouper(iterable, n, fillvalue=None)[source]

Iterate over a list in chunks. From http://stackoverflow.com/questions/434287/what-is-the-most-pythonic-way-to-iterate-over-a-list-in-chunks

>>> list(grouper([1, 2, 3, 4], 3, 'x'))
[(1, 2, 3), (4, 'x', 'x')]
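
Chunking of this kind is commonly written with the itertools "grouper" recipe, sketched below under the name grouper_sketch; whether BiBench uses exactly this code is an assumption.

```python
from itertools import zip_longest


def grouper_sketch(iterable, n, fillvalue=None):
    """Collect items into tuples of length n, padding the last one."""
    # n references to the same iterator: each tuple pulls n consecutive items.
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)


chunks = list(grouper_sketch("abcde", 2, "-"))
print(chunks)  # [('a', 'b'), ('c', 'd'), ('e', '-')]
```
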
bibench.util.isiterable(obj)
bibench.util.make_index_map(mylist)[source]

Map each item in the list to its list index.
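
Such a mapping can be built with a dict comprehension, as in the sketch below. The name make_index_map_sketch is hypothetical; in this version a repeated item maps to its last index.

```python
def make_index_map_sketch(items):
    """Return a dict from each item to its position in the list."""
    return {item: index for index, item in enumerate(items)}


index_of = make_index_map_sketch(["g1", "g2", "g3"])
print(index_of["g3"])  # 2
```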

bibench.util.shuffle(data)[source]

Shuffle an array along all axes. Returns the shuffled array.

bibench.util.which(program)[source]

Check for an executable on the PATH; return its absolute path.

Taken from http://stackoverflow.com/questions/377017/test-if-executable-exists-in-python
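
On Python 3.3 and later, the standard library provides shutil.which, which performs a comparable PATH lookup. The snippet below uses that standard function rather than bibench.util.which, and the program name is a made-up example of something that is not installed.

```python
import shutil

# shutil.which returns the path of the executable, or None if it is not found.
missing = shutil.which("surely-no-such-program-xyz")
print(missing)  # None
```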

bibench.util.zdumps(obj)[source]

Dump an object, compressing as much as possible.

bibench.util.zloads(zstr)[source]

Load a compressed string dumped by zdumps().
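
One plausible way to build such a pair is to pickle the object and compress the result with zlib. The sketch below takes that approach as an assumption; the names zdumps_sketch and zloads_sketch are hypothetical and this is not necessarily how BiBench implements it.

```python
import pickle
import zlib


def zdumps_sketch(obj):
    """Pickle obj and compress the bytes at the highest zlib level."""
    return zlib.compress(pickle.dumps(obj, pickle.HIGHEST_PROTOCOL), 9)


def zloads_sketch(zstr):
    """Invert zdumps_sketch: decompress, then unpickle."""
    return pickle.loads(zlib.decompress(zstr))


original = {"rows": [0, 1, 2], "cols": [3, 4]}
packed = zdumps_sketch(original)
restored = zloads_sketch(packed)
print(restored == original)  # True
```

As with any pickle-based format, such strings should only be loaded from trusted sources.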

visualization Module