.2bit in ExpressionPlot

.2bit is a format for storing genomic sequences. It offers both significant compression over FASTA and fast random access.
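
As an illustration of random access, the sketch below wraps the two UCSC utilities in small shell functions. The function names extract_region and list_sequences are hypothetical and not part of ExpressionPlot. The sketch assumes twoBitToFa and twoBitInfo are on your PATH and that they accept the -seq, -start and -end options and the special output name stdout, as the UCSC builds do.

```shell
# Hypothetical helper: write one region of a .2bit file to a FASTA file.
# Usage: extract_region FILE.2bit SEQ START END OUT.fa
# START is zero-based and END is non-inclusive (twoBitToFa conventions).
extract_region() {
    twoBitToFa "-seq=$2" "-start=$3" "-end=$4" "$1" "$5"
}

# Hypothetical helper: print each sequence name and its length.
# Usage: list_sequences FILE.2bit
list_sequences() {
    twoBitInfo "$1" stdout
}

# Example (not run here): first 100 bases of chr1 from mm9
#   extract_region mm9.2bit chr1 0 100 chr1_head.fa
```

Only the requested region is read from disk, so extracting a short stretch of a large chromosome does not require scanning the whole file.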

Because .2bit is a binary format, the utilities twoBitToFa and twoBitInfo are needed to read the data. Copies of both are packaged with ExpressionPlot in $EP_HOME/other_tools. If they are missing, ExpressionPlot uses the Unix which utility to look for another copy elsewhere on your system. If you need a new copy, for example because you are running Mac OS, the utilities are available from UCSC.
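
The lookup order described above can be sketched roughly as follows. This is not ExpressionPlot's own code: find_2bit_tool is a hypothetical name, and the sketch uses the POSIX built-in command -v in place of which, since it behaves the same way for this purpose and is available in every POSIX shell.

```shell
# Hypothetical helper: print the path of a 2bit utility, or fail if none is found.
# Usage: find_2bit_tool twoBitToFa
find_2bit_tool() {
    tool="$1"
    # Prefer the copy packaged with ExpressionPlot.
    if [ -n "${EP_HOME:-}" ] && [ -x "$EP_HOME/other_tools/$tool" ]; then
        printf '%s\n' "$EP_HOME/other_tools/$tool"
        return 0
    fi
    # Otherwise fall back to a copy found on the PATH (stand-in for `which`).
    command -v "$tool"
}
```

If neither location has the utility, the function prints nothing and returns a non-zero status, which is a convenient signal that you need to download a copy from UCSC.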

Storing .2bit files

A few ExpressionPlot tools use .2bit files. You can always specify a .2bit file using the complete path. However, if you keep all your .2bit files in the $EP_HOME/genomes directory then you can specify them with just the basename. For example, instead of /usr/local/bin/expressionplot/genomes/mm9.2bit you can just say mm9 and ExpressionPlot will both add the .2bit extension (if necessary) and look in that directory to find the file. You can change this "special directory" by setting the GENOMES_DIR variable in your $EP_HOME/config.ini file.
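
The name resolution described above might look something like the following sketch. The function resolve_2bit is hypothetical, and two details are assumptions rather than documented behavior: any name containing a slash is treated as a complete path and returned unchanged, and the genomes directory is passed as an optional second argument instead of being read from config.ini.

```shell
# Hypothetical helper: turn a genome name into the path of its .2bit file.
# Usage: resolve_2bit NAME [GENOMES_DIR]
resolve_2bit() {
    name="$1"
    dir="${2:-$EP_HOME/genomes}"   # default "special directory"
    case "$name" in
        */*)                       # assumption: a slash means a complete path
            printf '%s\n' "$name"
            return 0
            ;;
    esac
    case "$name" in
        *.2bit) ;;                 # extension already present
        *) name="$name.2bit" ;;    # add it if necessary
    esac
    printf '%s\n' "$dir/$name"
}
```

With EP_HOME set to /usr/local/bin/expressionplot, both resolve_2bit mm9 and resolve_2bit mm9.2bit print /usr/local/bin/expressionplot/genomes/mm9.2bit.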

Getting .2bit files

UCSC makes .2bit files available from its download site. For example, to get the mm9 genome, use:

 cd `expressionplot-config GENOMES_DIR`
 wget -N http://hgdownload.cse.ucsc.edu/gbdb/mm9/mm9.2bit

Substitute other genomes for mm9 (like hg18 or rn4) as needed. You can also see the entire list of available genomes.
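
To fetch several genomes at once, the substitution can be wrapped in a small loop. The sketch below is only an illustration: ucsc_2bit_url and fetch_genomes are hypothetical names, and it assumes every genome you request follows the same URL pattern as mm9, which you should confirm against the UCSC list.

```shell
# Hypothetical helper: build the download URL for one genome,
# following the mm9 pattern shown above.
ucsc_2bit_url() {
    printf 'http://hgdownload.cse.ucsc.edu/gbdb/%s/%s.2bit\n' "$1" "$1"
}

# Hypothetical helper: download each named genome into a directory.
# Usage: fetch_genomes DIR GENOME [GENOME ...]
fetch_genomes() {
    dir="$1"
    shift
    for genome in "$@"; do
        # -N skips the download when the local copy is already up to date.
        (cd "$dir" && wget -N "$(ucsc_2bit_url "$genome")") || return 1
    done
}

# Example (not run here):
#   fetch_genomes "$(expressionplot-config GENOMES_DIR)" hg18 rn4
```

The loop stops at the first failed download, so a mistyped genome name is reported before any later genomes are attempted.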