
Standalone version of VirusFaster

For experienced users who need to integrate VirusFaster into a local NGS analysis pipeline, a standalone version is available for download. Note, however, that the VirusFaster web server provides additional functions that help interpret the analysis results: annotating virus integration sites, generating Circos plots, viewing breakpoint alignments, and identifying related studies from Dr vis v2.0, a database of human disease-related viral integration sites.

Installation:
1. Download CAP3 and untar the file.
2. Download the local version of VirusFaster and unzip the file.

Usage:

1. Prepare the BAM file: Align the reads with BWA against a combined reference that contains the human reference genome (hg19) and the virus genomes of interest, then remove duplicate reads.
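
The BAM preparation described in step 1 could look roughly like the sketch below. It is not taken from the VirusFaster documentation: the use of `bwa mem`, the samtools-based duplicate removal (which needs a samtools release that provides `markdup`), the function name `prepare_bam`, and all file names are assumptions made for illustration.

```shell
# Sketch: build a duplicate-free BAM aligned to a combined human + virus reference.
# Assumptions: bwa mem as the aligner subcommand, samtools markdup for duplicate
# removal, and hypothetical file names derived from the output prefix.
prepare_bam() {
    human_fa=$1; virus_fa=$2; r1=$3; r2=$4; out=$5   # out = output prefix

    # 1. Concatenate the human and virus references into one FASTA and index it.
    cat "$human_fa" "$virus_fa" > "$out.combined.fasta" || return 1
    bwa index "$out.combined.fasta" || return 1

    # 2. Align the paired-end reads, then name-sort so fixmate sees mates together.
    bwa mem "$out.combined.fasta" "$r1" "$r2" > "$out.sam" || return 1
    samtools sort -n -o "$out.namesorted.bam" "$out.sam" || return 1

    # 3. Add mate tags, coordinate-sort, then remove (-r) duplicate reads.
    samtools fixmate -m "$out.namesorted.bam" "$out.fixmate.bam" || return 1
    samtools sort -o "$out.sorted.bam" "$out.fixmate.bam" || return 1
    samtools markdup -r "$out.sorted.bam" "$out.dedup.bam" || return 1
    samtools index "$out.dedup.bam"
}

# Example call (hypothetical file names):
# prepare_bam hg19.fasta HBV.fasta sample_R1.fastq.gz sample_R2.fastq.gz sample
```

The resulting `sample.dedup.bam` would then be the file passed to `-b`. Any other duplicate-removal tool should serve equally well, as long as the alignment is made against the combined reference.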

2. Run VirusFaster to detect virus integration sites:
java -jar LocalVirusFaster.jar \
    -b [INPUT_BAM] \
    -g [HUMAN_REFERENCE_GENOME] \
    -v [VIRUS_REFERENCE_GENOME] \
    -o [OUTPUT_DIRECTORY] \
    -p [CAP3_PROGRAM_PATH] \
    -s [PREFIX] \
    -t [TEMP_DIRECTORY] \
    -m [MIN_SOFT_CLIP] \
    -i [INSERT_SIZE] \
    -l [READ_LENGTH] \
    -y [MODE]

Description:

INPUT_BAM
    The BAM file of the sample.
HUMAN_REFERENCE_GENOME
    The path of the human reference genome file in FASTA format, e.g. hg19.fasta.
VIRUS_REFERENCE_GENOME
    The path of a virus genome file in FASTA format, e.g. HBV.fasta.
OUTPUT_DIRECTORY
    The path of the output directory.
CAP3_PROGRAM_PATH
    The path of the cap3 program.
PREFIX
    The prefix of the result files.
TEMP_DIRECTORY
    The path of the temporary directory.
MIN_SOFT_CLIP
    The minimum number of soft-clipped reads required to keep a candidate viral integration event during preliminary filtering. We recommend 3 as the default.
INSERT_SIZE
    The insert size. It can be specified by the user or estimated automatically by VirusFaster based on the Gaussian distribution of the paired-end (PE) reads in the input BAM.
READ_LENGTH
    The read length. It can be specified by the user or calculated automatically by VirusFaster.
MODE
    Either strict_mode or loose_mode.
    Strict mode: VirusFaster applies steps 1-5 (Figure 1) to detect virus integration breakpoints, which yields results with higher confidence. Breakpoints in both the human genome and the virus genome are confirmed by soft-clipped reads at single-base resolution.
    Loose mode: For low-depth NGS data in which no breakpoints are detected in strict mode, we recommend trying loose mode, which applies only steps 1-3 (Figure 1) to detect virus integration sites. This mode detects more breakpoints than strict mode, possibly with lower specificity.
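
As an illustration of how these parameters fit together, the following sketch wraps the command in a small shell function that rejects an invalid MODE before starting Java. The function name `run_virusfaster`, the environment variables `HUMAN_FA`, `VIRUS_FA`, `CAP3` and `MIN_SOFT_CLIP`, their default values, and the `tmp` subdirectory are hypothetical; only the flags themselves come from the usage above. Insert size and read length are passed explicitly here because this page does not say how automatic estimation is requested.

```shell
# Sketch of a wrapper around the VirusFaster command line.
# Assumptions: the function name, the environment variables and their defaults
# are illustrative; LocalVirusFaster.jar is expected in the current directory.
run_virusfaster() {
    bam=$1; outdir=$2; prefix=$3; insert_size=$4; read_length=$5
    mode=${6:-strict_mode}   # default to the higher-confidence mode

    # Refuse anything other than the two documented modes.
    case $mode in
        strict_mode|loose_mode) ;;
        *) echo "MODE must be strict_mode or loose_mode, got: $mode" >&2
           return 2 ;;
    esac

    # Create the output directory and a temporary directory inside it.
    mkdir -p "$outdir/tmp" || return 1

    java -jar LocalVirusFaster.jar \
        -b "$bam" \
        -g "${HUMAN_FA:-hg19.fasta}" \
        -v "${VIRUS_FA:-HBV.fasta}" \
        -o "$outdir" \
        -p "${CAP3:-cap3}" \
        -s "$prefix" \
        -t "$outdir/tmp" \
        -m "${MIN_SOFT_CLIP:-3}" \
        -i "$insert_size" \
        -l "$read_length" \
        -y "$mode"
}

# Example call (hypothetical values): strict mode first, loose mode as a retry.
# run_virusfaster sample.dedup.bam results sample 300 100 strict_mode
# run_virusfaster sample.dedup.bam results_loose sample 300 100 loose_mode
```

Running strict mode first and falling back to loose mode only when nothing is reported follows the recommendation given for low-depth data.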


Output:

Each VirusFaster output file contains the following columns:

Column index    Description
Column 1        The virus name
Column 2        The position of the breakpoint in the virus genome
Column 3        The orientation of the soft clip at the breakpoint in the virus genome
Column 4        The number of soft-clipped reads that support the virus breakpoint
Column 5        The consensus sequence of the virus breakpoint
Column 6        The human chromosome
Column 7        The position of the breakpoint in the human genome
Column 8        The orientation of the soft clip at the breakpoint in the human genome
Column 9        The number of soft-clipped reads that support the human breakpoint
Column 10       The consensus sequence of the human breakpoint
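
For downstream filtering, a result file with these ten columns could be processed with a short awk command such as the sketch below. The field delimiter is not stated on this page, so the sketch assumes a tab-separated file with the columns in the order listed; the function name `filter_breakpoints` and the threshold are likewise hypothetical.

```shell
# Sketch: keep integration events whose virus-side (column 4) and human-side
# (column 9) soft-clipped read counts both reach a minimum.
# Assumption: the result file is tab-separated; change -F if it is not.
filter_breakpoints() {
    # $1 = result file, $2 = minimum supporting reads required on each side
    awk -F '\t' -v min="$2" '
        NF >= 10 && $4 + 0 >= min && $9 + 0 >= min {
            # human chromosome, human position, virus name, virus position,
            # virus-side support, human-side support
            printf "%s\t%s\t%s\t%s\t%d\t%d\n", $6, $7, $1, $2, $4, $9
        }' "$1"
}

# Example call (hypothetical file name):
# filter_breakpoints results/sample.txt 3
```

Because non-numeric text evaluates to 0 in the comparisons, a header line, if the file has one, is dropped whenever the minimum is greater than zero.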
Copyright© 2016-2017, All Rights Reserved.
Center for Cancer Bioinformatics, Peking Cancer Hospital