Regular Installation

From ExpressionPlot
Revision as of 22:44, 30 September 2011 by Brad (Talk | contribs) (Dependencies)

ExpressionPlot is designed to run on Linux and other Unix-like platforms. It was developed under Ubuntu but has run successfully on a growing list of distributions and operating systems, including CentOS, Fedora and Mac OS X. Please post installation-related questions for any platform to the ExpressionPlot Google Group or e-mail them to expressionplot@googlegroups.com. We aim to answer all questions by the next business day.

Dependencies

ExpressionPlot has the following dependencies which you must have running before you begin:

  • Apache web server
  • MySQL database server
  • R statistical language
  • Many other components that are present on almost any Unix system: bash, make (on a Mac this requires Xcode), perl, head, tail, grep, and others.
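
A quick pre-flight check for the commands listed above can be sketched in shell. The binary names used here (apache2 or httpd, mysql, R) are assumptions that vary by distribution, and finding a binary on the PATH does not prove that the Apache or MySQL service is actually running.

```shell
# Print each command name from the arguments that is not found on PATH.
missing_cmds() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Binary names are assumptions; Apache is usually apache2 on Debian/Ubuntu
# and httpd on many other systems.
missing=$(missing_cmds mysql R bash make perl head tail grep)
if ! command -v apache2 >/dev/null 2>&1 && ! command -v httpd >/dev/null 2>&1; then
    missing="$missing apache2/httpd"
fi

if [ -n "$missing" ]; then
    echo "Not found on PATH:" $missing
else
    echo "All checked commands were found."
fi
```

Anything reported as missing should be installed with your platform's package manager before you start the installer.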

ExpressionPlot comes with copies of several open source bioinformatics programs. However, the bundled binaries will only run on 64-bit Linux machines. If you have a different architecture, you will have to download and install them yourself, then make ExpressionPlot aware of their location by setting the appropriate variables in config.ini.
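
Whether the bundled binaries apply to your machine can be checked with uname. This is a minimal sketch; reading "64-bit Linux" as the x86_64 architecture is an assumption, since this page does not name the CPU family.

```shell
# Succeed only for a Linux kernel on x86_64 hardware.
# $1 = kernel name (uname -s), $2 = machine type (uname -m).
# Treating "64-bit Linux" as x86_64 is an assumption.
is_linux64() {
    [ "$1" = "Linux" ] && [ "$2" = "x86_64" ]
}

if is_linux64 "$(uname -s)" "$(uname -m)"; then
    echo "The bundled binaries should be usable on this machine."
else
    echo "Install the bundled tools yourself and point config.ini at them."
fi
```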

It also has the following dependencies, but if you don't have them installed then the install script will try to install them for you:

  • RApache, a system for generating web pages using Apache and R
  • Perl libraries from CPAN including Date::Formatter Proc::ProcessTable Config::Simple Clone Mysql Email::Valid Bio::SeqIO Captcha::reCAPTCHA JSON Inline Cwd File::Basename File::Temp
  • R libraries from CRAN including RMySQL Hmisc rjson
  • Bioconductor, DESeq
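
If you would like to know in advance which Perl libraries the install script will need to fetch, a sketch like the following asks Perl to load each module by name. The helper name have_perl_module is made up for this example, and only a few of the modules listed above are shown.

```shell
# Succeed if Perl can load the named module.
have_perl_module() {
    perl -M"$1" -e 1 >/dev/null 2>&1
}

# A few of the CPAN modules named above; extend the list as needed.
for mod in Date::Formatter Proc::ProcessTable Config::Simple Email::Valid JSON; do
    if have_perl_module "$mod"; then
        echo "found:   $mod"
    else
        echo "missing: $mod"
    fi
done
```

Modules reported as missing are the ones the install script would try to install for you.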

Install Notes:

Getting Install Script

To install ExpressionPlot download and unpack the expressionplot-install tar file (50 KB):

 wget http://expressionplot.com/download/install/expressionplot-install-current.tar
 tar xf expressionplot-install-current.tar

The install script uses wget extensively to download components, so install wget first if it is not already on your machine. It is not installed by default on Mac OS X, but it can be added easily, for example with fink install wget.

The tar file will unpack into its own directory, which will contain two files:

  • install.pl: The ExpressionPlot install script.
  • install-config.ini: A configuration file giving system-specific details for the installation.

Configuration

You will have to make some decisions about your installation before you begin. These are specified by setting variables in install-config.ini.

Directories

ExpressionPlot will install into three different directories:

Variable | Default | Description
EP_HOME | /usr/local/bin/expressionplot | The base directory for most ExpressionPlot files. Can be anywhere on your filesystem.
HTDOCS_HOME | /var/www/expressionplot | The base directory for web documents, including JavaScript, HTML, and R scripts (the latter to be served by RApache). These must be in a directory from which Apache can serve them. You should set the HTDOCS_URL to the corresponding URL for that directory.
CGI_HOME | /usr/lib/cgi-bin/expressionplot | The base directory for CGI scripts. These must be in a directory from which Apache can serve them. You should set the CGI_URL to the corresponding URL for that directory.
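
To illustrate how a directory corresponds to its URL, here is a minimal sketch. It assumes that Apache serves /var/www as the document root and maps /cgi-bin/ to /usr/lib/cgi-bin; both are common defaults but are not guaranteed on your system, so check your Apache configuration. The function name url_for_dir is made up for this example.

```shell
# Map a directory below a served root to its URL.
# $1 = served root on disk, $2 = URL prefix for that root, $3 = directory
url_for_dir() {
    case "$3" in
        "$1"/*) printf '%s/%s\n' "$2" "${3#"$1"/}" ;;
        *) return 1 ;;   # directory is not below the served root
    esac
}

# The served roots below are assumptions about the Apache configuration.
url_for_dir /var/www http://SERVERNAME /var/www/expressionplot
url_for_dir /usr/lib/cgi-bin http://SERVERNAME/cgi-bin /usr/lib/cgi-bin/expressionplot
```

Under those assumptions the two calls print http://SERVERNAME/expressionplot and http://SERVERNAME/cgi-bin/expressionplot, which would be the values for HTDOCS_URL and CGI_URL.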

These directories will be created during the install process, so you don't have to prepare them. You only have to decide in advance where you want them and make sure Apache will be able to serve the web and CGI directories. Aside from these three locations, the config script expressionplot-config will be installed directly into /usr/local/bin. You can change this with the --epc-dir switch when you call install.pl. However, it is essential that the script be in the path of every user of ExpressionPlot (including the Apache user).
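
Whether a given directory is on a user's path can be tested with a small helper like the one below (dir_in_path is a name chosen for this sketch). It inspects only the PATH of the shell that runs it.

```shell
# Succeed if directory $1 is one of the colon-separated entries in $2.
dir_in_path() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

if dir_in_path /usr/local/bin "$PATH"; then
    echo "/usr/local/bin is on this user's PATH"
else
    echo "/usr/local/bin is missing from this user's PATH"
fi
```

The Apache user's environment is usually set by the service configuration rather than by a login shell, so it has to be checked separately.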

MySQL

The other decision you will make is what database and MySQL user ExpressionPlot will use. This is controlled by the following three variables:

Variable | Default | Description
MYSQL_USER | expressionplot | MySQL username
MYSQL_PWD | highthroughput | Password for MYSQL_USER.
MYSQL_DB | expressionplot | Name of MySQL database to be used for all ExpressionPlot-related tables.

It is not necessary to create the user account or database; the install script will take care of this for you.
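
Because the default password is published on this page, you may want to confirm that you changed it before installing. The following sketch assumes that the variable name and its value appear on the same line of install-config.ini; the exact file syntax is not shown here, so adjust the pattern if your file differs.

```shell
# Succeed if the file still sets MYSQL_PWD to the published default.
# Assumes the variable name and its value share one line.
uses_default_pwd() {
    grep -Eq '^[[:space:]]*MYSQL_PWD.*highthroughput' "$1"
}

cfg=install-config.ini
if [ -f "$cfg" ] && uses_default_pwd "$cfg"; then
    echo "MYSQL_PWD is still the default; consider changing it before installing."
fi
```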

Installation

Once you've made any desired changes to install-config.ini you can run the install script:

 perl install.pl install-config.ini

It is an interactive script and will try to check and fix your dependencies, then download, build and install the ExpressionPlot base system.

Once successful, you should be able to run

 expressionplot-config

from anywhere on your system to get the base directory of your ExpressionPlot install, in case you forgot where you put it. You can also run

 expressionplot-config VARNAME

to see the value of the configuration variable VARNAME. Finally,

 expressionplot-config -all

will list all of your configuration variables.
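
In your own scripts it can help to fail early when expressionplot-config is not on the path. A minimal sketch, using a wrapper function named ep_home that is made up for this example:

```shell
# Print the ExpressionPlot base directory, or fail with a message if the
# config script is not on PATH.
ep_home() {
    if ! command -v expressionplot-config >/dev/null 2>&1; then
        echo "expressionplot-config not found on PATH" >&2
        return 1
    fi
    expressionplot-config
}

if EP=$(ep_home); then
    echo "ExpressionPlot base directory: $EP"
fi
```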


Add-ons: EP-manage.pl

After you complete the initial installation you will have the base ExpressionPlot system, but you won't be able to do any analysis until you add some data. This is done easily with the EP-manage.pl script (located in the util/ subdirectory of the ExpressionPlot home).

If you are only running the front-end (you have your own back-end), then strictly speaking you could get away without adding anything on, but you would of course have to populate the database by other methods. Even so, it might be useful to download at least an annotation for the genome that you are using so that the SeqView Tool can show known transcripts along with your data.

One way to try out ExpressionPlot is to get it running on some human tissue panel data. Here is a sequence of add-ons that should populate your database with some of the processed data (the download for the hg18 annotation is about 70 MB and for the tissue panel about 600 MB, so it may take a little while):

# Go to ExpressionPlot util directory
cd `expressionplot-config`/util

# Get hg18 annotation files
./EP-manage.pl get_annot hg18

# Get processed Human Tissue Panel data
./EP-manage.pl get_project Human_Tissue_Panel_processed
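
Given the download sizes, you may want to check free disk space before running the commands above. This sketch reads the available space from df; doubling the download size to allow for unpacking is a rough assumption and not a documented requirement.

```shell
# Succeed if the filesystem holding directory $1 has at least $2 KB available.
has_free_kb() {
    avail=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    [ -n "$avail" ] && [ "$avail" -ge "$2" ]
}

# 70 MB + 600 MB of downloads, doubled as a rough allowance for unpacking
# (the factor of two is an assumption).
need_kb=$(( (70 + 600) * 2 * 1024 ))
if has_free_kb . "$need_kb"; then
    echo "Enough free space for the processed tissue panel."
else
    echo "Less than $need_kb KB free here."
fi
```

Run it from a directory on the same filesystem as your ExpressionPlot home.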

If you want to try out the back-end, then you could download the raw sequencing data instead of the processed data. This download is bigger (1.6GB), and you still will need the annotation files. These commands will download the annotation and the raw data, then start the pipeline:

# Go to ExpressionPlot util directory
cd `expressionplot-config`/util

# Get hg18 annotation files
./EP-manage.pl get_annot hg18

# Get Human Tissue Panel sequences
./EP-manage.pl get_project Human_Tissue_Panel

# Start up screen (optional)
screen -S EP-pipeline-on-Tissue-Panel

# Start pipeline
EP=`expressionplot-config`
cd $EP/projects/Human_Tissue_Panel
$EP/RNASeq/RNASeq-pipeline.pl lanes.txt hg18 -j hg18_all_junctions -hjl 31 \
  -cl $EP/annot/hg18/hg18_trimmed_gene_clusters.tsv \
  -ae $EP/annot/hg18/hg18_acembly_AE_events_with_flanking_SS.tsv \
  -p iDEA -l pipeline.%d%b%y.log \
  -ri $EP/annot/hg18/hg18_acembly_intron_events.xls \
  -ate $EP/annot/hg18/hg18_ensGene_term_exons.tsv \
  -ensT $EP/annot/hg18/hg18_ensembl_and_tRNA_clusters.tsv \
  -admin expressionplot

Then point your web browser to http://SERVERNAME/cgi-bin/expressionplot/home.pl and you are ready to go!
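
You can also check from the command line that the front end responds. The sketch below builds the URL given above and probes it with wget; it assumes the default CGI location, and the function names are made up for this example.

```shell
# Build the front-end URL for a given server name (assumes the default CGI_URL).
ep_home_url() {
    printf 'http://%s/cgi-bin/expressionplot/home.pl\n' "$1"
}

# Succeed if the server answers the request; --spider does not download the page.
ep_front_end_up() {
    wget -q --spider "$(ep_home_url "$1")"
}

# Example call (not run here):
#   ep_front_end_up localhost && echo "front end is responding"
```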


See Installing add-ons for more details.

Next

Next: Preparing Raw Data with The EP Backend or Using the EP Web Interface