This package contains modules for working with datasets: generating data, I/O, data transformations, etc.
Utilities for reading and writing datasets for various algorithms.
Bases: numpy.ndarray
A numpy array with extra attributes ‘genes’, ‘samples’, and ‘annotation’.
Adapted from http://docs.scipy.org/doc/numpy/user/basics.subclassing.html
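As an illustration of the subclassing pattern the linked NumPy guide describes, the following is a minimal sketch, not the package's actual class. The class name AnnotatedArray and the constructor signature are assumptions made for this example; only the three attribute names come from the description above.

```python
import numpy as np


class AnnotatedArray(np.ndarray):
    """Hypothetical ndarray subclass carrying genes/samples/annotation."""

    def __new__(cls, data, genes=None, samples=None, annotation=None):
        # View the input data as an instance of this subclass.
        obj = np.asarray(data).view(cls)
        obj.genes = genes
        obj.samples = samples
        obj.annotation = annotation
        return obj

    def __array_finalize__(self, obj):
        # Runs on explicit construction, view casting, and slicing.
        # When there is a parent array, inherit its extra attributes.
        if obj is None:
            return
        self.genes = getattr(obj, "genes", None)
        self.samples = getattr(obj, "samples", None)
        self.annotation = getattr(obj, "annotation", None)


expr = AnnotatedArray([[1.0, 2.0], [3.0, 4.0]],
                      genes=["g1", "g2"], samples=["s1", "s2"])
first_row = expr[:1]  # a slice is still an AnnotatedArray
```

Note that in this sketch a slice inherits the full label lists unchanged; the labels are not subset to match the sliced rows or columns.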
Reads a TSV file in the format written by write_expression_data().
Writes every bicluster in a list of lists of biclusters to a file in the format read by BicOverlapper.
written to the file.
filename: output file name.
File format:
[number_of_biclusters]
bicluster set 1
#rows bic1.1 #columns bic1.1
row1 row2 ... rowN
col1 col2 ... colN
#rows bic1.2 #columns bic1.2
row1 row2 ... rowN
col1 col2 ... colN
...
bicluster set 2
#rows bic2.1 #columns bic2.1
row1 row2 ... rowN
col1 col2 ... colN
#rows bic2.2 #columns bic2.2
row1 row2 ... rowN
col1 col2 ... colN
...
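The layout above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the package's writer: the function name write_bicoverlapper_sketch is hypothetical, each bicluster is assumed to be a (rows, cols) pair of labels, items are assumed to be space-separated, the first line is assumed to hold the total number of biclusters across all sets, and the "bicluster set N" lines are assumed to be written literally.

```python
import io


def write_bicoverlapper_sketch(bicluster_sets, out):
    """Write nested (rows, cols) biclusters to an open text stream."""
    # Assumption: the leading count covers every bicluster in every set.
    total = sum(len(bics) for bics in bicluster_sets)
    out.write(f"{total}\n")
    for set_index, bics in enumerate(bicluster_sets, start=1):
        out.write(f"bicluster set {set_index}\n")
        for rows, cols in bics:
            # Size line, then the row labels, then the column labels.
            out.write(f"{len(rows)} {len(cols)}\n")
            out.write(" ".join(map(str, rows)) + "\n")
            out.write(" ".join(map(str, cols)) + "\n")


sets = [
    [(["g1", "g2"], ["s1"]), (["g3"], ["s1", "s2"])],
    [(["g1"], ["s2"])],
]
buffer = io.StringIO()
write_bicoverlapper_sketch(sets, buffer)
text = buffer.getvalue()
```

Writing to a stream rather than a filename keeps the sketch easy to test; passing an open file object would write the same text to disk.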
Writes a DAVID (http://david.abcc.ncifcrf.gov/) list of genes.
Writes a DAVID multilist, with each list in one column.
The first row gives the name of the list, which is just ‘name#’.
The gene names must be the same for each bicluster.
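A column-per-list layout like the one described can be sketched as follows. The function name write_david_multilist_sketch, the tab separator, the default name prefix "list", numbering from 1, and padding shorter lists with empty cells are all assumptions for illustration; the source only states that each list occupies one column and that the first row holds names of the form ‘name#’.

```python
import io
from itertools import zip_longest


def write_david_multilist_sketch(gene_lists, out, name="list"):
    """Write gene lists side by side, one list per column."""
    # Header row: name1, name2, ... (numbering from 1 is an assumption).
    header = [f"{name}{i}" for i in range(1, len(gene_lists) + 1)]
    out.write("\t".join(header) + "\n")
    # Transpose the lists into rows, padding short lists with "".
    for row in zip_longest(*gene_lists, fillvalue=""):
        out.write("\t".join(row) + "\n")


gene_lists = [["BRCA1", "TP53", "EGFR"], ["MYC"]]
buffer = io.StringIO()
write_david_multilist_sketch(gene_lists, buffer)
multilist_text = buffer.getvalue()
```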
Writes a dataset as a tab-separated file in the following relatively standard format:
Genes/Conditions [col ID] [col ID] ... [col ID]
[row ID] [value] [value] ... [value]
[row ID] [value] [value] ... [value]
...
[row ID] [value] [value] ... [value]
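To make the format concrete, here is a minimal round-trip sketch. It is not the package's write_expression_data() or its reader: the function names, argument order, and return values below are hypothetical, and the tab separator is assumed from the description of the reader as taking a TSV file.

```python
import io

import numpy as np


def write_expression_sketch(data, genes, conditions, out):
    """Write a 2-D array with row and column IDs to a text stream."""
    out.write("\t".join(["Genes/Conditions", *conditions]) + "\n")
    for gene, row in zip(genes, data):
        # repr() of a Python float round-trips exactly through float().
        values = [repr(float(v)) for v in row]
        out.write("\t".join([gene, *values]) + "\n")


def read_expression_sketch(source):
    """Read the format above back into (array, genes, conditions)."""
    header = source.readline().rstrip("\n").split("\t")
    conditions = header[1:]  # skip the corner label
    genes, rows = [], []
    for line in source:
        if not line.strip():
            continue  # tolerate trailing blank lines
        fields = line.rstrip("\n").split("\t")
        genes.append(fields[0])
        rows.append([float(v) for v in fields[1:]])
    return np.array(rows), genes, conditions


data = np.array([[1.5, -0.25], [0.0, 3.0]])
buffer = io.StringIO()
write_expression_sketch(data, ["g1", "g2"], ["c1", "c2"], buffer)
buffer.seek(0)
loaded, genes, conditions = read_expression_sketch(buffer)
```

The header's first cell is only a corner label, so the reader in this sketch discards it and treats the remaining cells as column IDs.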