Bellerophontes

Download and Install

You can install Bellerophontes in two different ways.

The easiest (and preferred) way is to install the provided .deb file. In Debian-based distributions you can open the bellerophontes.deb file directly; once the automated procedure completes, the tool is installed.

Or, if you prefer, you can install it by typing:

sudo dpkg -i bellerophontes.deb

If you don't have a Debian-based Linux distribution, or if you prefer to install it manually, download the archive and decompress it into a local folder:

tar xzf bellerophontes_0.4.0.tgz

If you choose to decompress the archive, you need to add the install directory to your PATH manually and follow the instructions provided in the README.txt file.

Bellerophontes is built on top of the following programs:

TopHat (1.0.14) - optional
Cufflinks (0.9.3) - optional
Bowtie (> 0.12.5)
Blast

These programs have changed in later releases, and Bellerophontes was developed against the versions reported above, so it is recommended to use those versions. Please note that TopHat and Cufflinks are fully optional, but strongly recommended in order to generate a new transcriptome based on the sample under study.
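As a quick sanity check after installation, the following minimal Python sketch reports which of the tools in the dependency list above cannot be found on your PATH. The executable names are assumptions: "bowtie" matches the bowtie_exec entry of the default configuration and "blastall" is the command named under Requirements, while "tophat" and "cufflinks" are the usual command names for those tools. The helper missing_tools is a hypothetical name and is not part of Bellerophontes.

```python
import shutil

# Executable names are assumptions (see the text above); adjust them
# if your installation uses different names.
REQUIRED = ["bowtie", "blastall"]
OPTIONAL = ["tophat", "cufflinks"]


def missing_tools(names, which=shutil.which):
    """Return the names from `names` that cannot be found on PATH."""
    return [name for name in names if which(name) is None]


if __name__ == "__main__":
    for label, names in (("required", REQUIRED), ("optional", OPTIONAL)):
        missing = missing_tools(names)
        status = "all found" if not missing else "missing: " + ", ".join(missing)
        print(f"{label} tools: {status}")
```

This only checks that the commands exist; it does not check their versions.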

Requirements

Be sure you have installed java6-jdk and java6-jre on your machine. In Ubuntu 11.04 you can do:

 sudo vi /etc/apt/sources.list

Add the following line at the end of the file:

 deb http://archive.canonical.com/ubuntu maverick partner

Then type:

sudo apt-get update

sudo apt-get install sun-java6-jre sun-java6-jdk

The BLAST program is also needed. Specifically, Bellerophontes needs the blastall command, which is part of the blast2 package available in the Ubuntu distribution:

sudo apt-get install blast2

Install the EMBOSS suite:

sudo apt-get install emboss

If you are using an OS other than Ubuntu, check the Oracle site for instructions on how to install Java.

Data Sets

In order to run Bellerophontes you need to provide a Bowtie index of the human genome. The Bellerophontes distribution includes a pre-built index of the human genome HG19, located under Bellerophontes/reference.

To build a different reference, follow the bowtie-build instructions.

We also provide a test data set: a sample of paired-end 75 bp RNA-Seq data from Chronic Myelogenous Leukaemia. The two mate FASTQ files are provided with the Bellerophontes distribution in the Bellerophontes/samples folder. The provided sample is a subset of the reads of the real sample that still reveals the BCR-ABL1 fusion.

Configuration

A configuration file properties.config is included in the Bellerophontes distribution. If you want to run the Bellerophontes test data set (recommended the first time) you should use the default configuration. A configuration file should look like this:

# Configuration File

#AutoGenerated
#Fri May 18 12:24:40 CEST 2012

intersectBED_exec=intersectBED

min_enco_read=8

trim_size=0

bowtie_exec=bowtie

skip_bowtie=yes

min_word_length_spanning=15

mate_length=50

max_gap_distance=5000

maximum_inner_length=400
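If you want to inspect or validate the configuration from a script, the listing above can be read as plain key=value lines. The following Python sketch is an illustration only, assuming that lines starting with "#" are comments and that every other non-empty line has the form key=value; the function name parse_properties and the INT_KEYS tuple are hypothetical and not part of Bellerophontes.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and '#' comments."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a key=value line: {raw_line!r}")
        settings[key.strip()] = value.strip()
    return settings


# Keys whose values look like integers in the listing above (an assumption).
INT_KEYS = ("min_enco_read", "trim_size", "min_word_length_spanning",
            "mate_length", "max_gap_distance", "maximum_inner_length")

# A shortened copy of the default configuration, used here as sample input.
sample = """\
#AutoGenerated
bowtie_exec=bowtie
skip_bowtie=yes
mate_length=50
max_gap_distance=5000
"""

config = parse_properties(sample)
numbers = {key: int(config[key]) for key in INT_KEYS if key in config}
print(config["bowtie_exec"], numbers)
```

Converting the numeric entries with int() makes a mistyped value fail early, before a long run is started.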

Gene Filtering Candidates (optional)

The candidates to keep can be specified in detail using the optional file gene_filter_list.txt.

Each line of the file contains either a list of gene ids joined by "|" (an "or" condition) or a pair of gene ids involved in a fusion gene joined by "&" (an "and" condition), in the following way:

<gene_id1>|<gene_id2>|<gene_id3>|<gene_id4>|<gene_id5>
<gene_id6>&<gene_id7>
<gene_id8>&<gene_id9>
<gene_id(N)>&<gene_id(M)>

For example:

ENSG00000149212|ENSG0000019885|ENSG000002422996|ENSG00000249193|ENSG00000121594
ENSG00000121594&ENSG00000114631

Using Gene Filtering Candidates, it is possible to focus Bellerophontes on a subset of genes or fusions.
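To make the two line types concrete, here is a minimal Python sketch that splits the lines of a gene_filter_list.txt file into "or" groups and "and" pairs. It is not the parser used by Bellerophontes: the function name parse_gene_filter is hypothetical, and rejecting lines that mix "|" and "&" is an assumption, since the format description does not say how such lines are handled.

```python
def parse_gene_filter(lines):
    """Split filter lines into OR-groups ('|') and AND-pairs ('&')."""
    or_groups, and_pairs = [], []
    for raw_line in lines:
        line = raw_line.strip()
        if not line:
            continue
        if "&" in line and "|" in line:
            # Assumption: a line uses one operator only.
            raise ValueError(f"mixed '|' and '&' on one line: {line!r}")
        if "&" in line:
            ids = [part.strip() for part in line.split("&")]
            if len(ids) != 2 or not all(ids):
                raise ValueError(f"expected exactly two gene ids: {line!r}")
            and_pairs.append((ids[0], ids[1]))
        else:
            ids = [part.strip() for part in line.split("|") if part.strip()]
            or_groups.append(ids)
    return or_groups, and_pairs


# Gene ids taken from the example above.
example = [
    "ENSG00000149212|ENSG00000249193|ENSG00000121594",
    "ENSG00000121594&ENSG00000114631",
]
or_groups, and_pairs = parse_gene_filter(example)
print(or_groups)
print(and_pairs)
```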

Run

In order to run Bellerophontes you have to launch the Bellerophontes executable with the following options:

bellerophontes

 -a,--annotation-file [arg]   GTF Annotation file

 -d,--working-dir [arg]       working_dir

 -f1,--fastq1 [arg]           First mate file

 -f2,--fastq2 [arg]           Second mate file

 -g,--genome-file [arg]       Genome file

 -h,--help                    Print this help message

 -n,--de-novo                 A new reference will be generated, using the
			genome coverage computed using the provided
			samples

 -t,--thread-level [n]      The number of thread that will be used (Default: 1)

 -u1,--unmapped1 [arg]        Unmapped mate file (optional)

 -u2,--unmapped2 [arg]        Unmapped mate file (optional)

Notes: the genome reference folder should contain both the Bowtie indexes and the original FASTA file. For instance, if the FASTA file is hg19.fa (provided in the Bellerophontes distribution under Bellerophontes/reference), the folder should contain the following files:

 hg19.1.ebwt

 hg19.2.ebwt

 hg19.3.ebwt

 hg19.4.ebwt

 hg19.fa

 hg19.fa.fai

 hg19.rev.1.ebwt

 hg19.rev.2.ebwt
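Before a run, the reference folder can be checked against the file listing above. The following Python sketch is an illustration under stated assumptions: the suffix list is copied from that listing, the helper name missing_reference_files is hypothetical, and the folder "reference/hg19" with prefix "hg19" is inferred from the -g reference/hg19/hg19 argument used in the example command of this manual.

```python
from pathlib import Path

# Suffixes taken from the file listing above, for a prefix such as "hg19".
INDEX_SUFFIXES = (".1.ebwt", ".2.ebwt", ".3.ebwt", ".4.ebwt",
                  ".rev.1.ebwt", ".rev.2.ebwt", ".fa", ".fa.fai")


def missing_reference_files(folder, prefix):
    """Return the expected reference files that are absent from `folder`."""
    folder = Path(folder)
    expected = [prefix + suffix for suffix in INDEX_SUFFIXES]
    return [name for name in expected if not (folder / name).is_file()]


if __name__ == "__main__":
    # Folder and prefix are assumptions; change them to match your setup.
    missing = missing_reference_files("reference/hg19", "hg19")
    print("missing files:", missing if missing else "none")
```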

A default annotation file, named Homo_sapiens.GRCh37.60.chr.gtf.tar.gz, is also provided in the data set. Alternatively, the annotation file can be retrieved from Ensembl with the following procedure:

 wget ftp://ftp.ensembl.org/pub/current/gtf/homo_sapiens/Homo_sapiens.GRCh37.60.gtf.gz

 gunzip Homo_sapiens.GRCh37.60.gtf.gz

For example, from any directory you can run:

 bellerophontes -t 8 -a Homo_sapiens.GRCh37.60.chr.gtf -g reference/hg19/hg19 -d working_dir/ -f1 samples/s_7_1_sequence_chr22chr9.fq -f2 samples/s_7_2_sequence_chr22chr9.fq --de-novo

All temporary files and results will be stored in the "working_dir" folder.
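When Bellerophontes is launched from a pipeline script, it can be convenient to assemble the command as an argument list instead of a single string. The sketch below uses only options listed in the help text above; the function name build_command is hypothetical, and the file names are the ones from the example command.

```python
import shlex


def build_command(annotation, genome, working_dir, fastq1, fastq2,
                  threads=1, de_novo=False):
    """Assemble a bellerophontes argument list from the documented options."""
    cmd = ["bellerophontes", "-t", str(threads),
           "-a", annotation, "-g", genome, "-d", working_dir,
           "-f1", fastq1, "-f2", fastq2]
    if de_novo:
        cmd.append("--de-novo")
    return cmd


cmd = build_command(
    annotation="Homo_sapiens.GRCh37.60.chr.gtf",
    genome="reference/hg19/hg19",
    working_dir="working_dir/",
    fastq1="samples/s_7_1_sequence_chr22chr9.fq",
    fastq2="samples/s_7_2_sequence_chr22chr9.fq",
    threads=8,
    de_novo=True,
)
# Print the command for inspection; it is not executed here.
print(shlex.join(cmd))
```

To actually start the run, the list can be passed to subprocess.run(cmd, check=True), provided the bellerophontes executable is on your PATH.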

So far, Bellerophontes produces two report files:

report.txt: list of detected fusions. Each entry has the following format:

gene1_name gene1_id strand1 chr_gene1 gene2_name gene2_id strand2 chr_gene2 start_spanning_gene1 breakpoint1 | breakpoint2 end_spanning_gene2 #spanning #encompassing
breakpoint_sequence

FilterHS_final_result.txt: list of valid encompassing regions.
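For downstream processing, the entries of report.txt can be loaded into dictionaries. The following Python sketch rests on several assumptions that the format description above does not confirm: fields are separated by whitespace, the "|" between the two breakpoints is a literal token, gene names contain no spaces, the file has no header line, and every entry spans exactly two lines (the fields, then the breakpoint sequence). The names FIELDS and parse_report are hypothetical.

```python
# Field names follow the format line above; "#spanning" and "#encompassing"
# are stored without the leading "#".
FIELDS = ["gene1_name", "gene1_id", "strand1", "chr_gene1",
          "gene2_name", "gene2_id", "strand2", "chr_gene2",
          "start_spanning_gene1", "breakpoint1", "breakpoint2",
          "end_spanning_gene2", "spanning", "encompassing"]


def parse_report(text):
    """Parse two-line entries: a field line followed by the breakpoint sequence."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if len(lines) % 2:
        raise ValueError("expected an even number of non-empty lines")
    records = []
    for field_line, sequence in zip(lines[0::2], lines[1::2]):
        # Drop the literal "|" that separates the two breakpoints.
        tokens = [tok for tok in field_line.split() if tok != "|"]
        if len(tokens) != len(FIELDS):
            raise ValueError(f"unexpected field count: {field_line!r}")
        record = dict(zip(FIELDS, tokens))
        record["breakpoint_sequence"] = sequence
        records.append(record)
    return records
```

All values are kept as strings; convert the coordinates and counts with int() once you have confirmed the layout against a real report.txt.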
