How Do I Install and Use BUSCO on Ubuntu 14.04.1 LTS

What is BUSCO

BUSCO stands for Benchmarking Universal Single-Copy Orthologs. It is used to assess the completeness of genome assemblies and annotations.

Why BUSCO

I tried several times to use CEGMA (Core Eukaryotic Genes Mapping Approach) on my Ubuntu 14.04 machine but failed. I then found that the developer of CEGMA has stopped supporting it and suggests using BUSCO instead.

Requirements

  • Python 3
Run this command in your terminal:
sudo apt-get install python3
  • NCBI BLAST+
Run this command in your terminal:
sudo apt-get install ncbi-blast+
  • HMMER (HMMER 3.1b2)
Run this command in your terminal:
sudo apt-get install hmmer
  • Augustus 3.0.x (genome only)
Download from HERE and install accordingly.
  • EMBOSS tools 6.x.x (transcriptome only)
Run this command in your terminal:
sudo apt-get install emboss
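
Once the packages are installed, a quick check that the executables are on your PATH can save a failed run later. The sketch below is not part of BUSCO: missing_tools is a helper name made up for this post, and the executable names (tblastn and makeblastdb from BLAST+, hmmsearch from HMMER, augustus, transeq from EMBOSS) are what I would expect these packages to provide, so adjust the list if your installation differs.

```shell
#!/bin/sh
# Print the name of every listed tool that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
        fi
    done
}

# Executable names below are assumptions about what each package installs.
missing=$(missing_tools python3 tblastn makeblastdb hmmsearch augustus transeq)
if [ -n "$missing" ]; then
    echo "Not found on PATH:"
    echo "$missing"
else
    echo "All required tools were found."
fi
```

Augustus is only needed for genome runs and transeq only for transcriptome runs, so a missing entry for the mode you do not use can be ignored.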

Installation

  • Download the latest BUSCO script from HERE (Software & User Guide section) and unzip it. This creates a directory 'busco', which should contain the following files: BUSCO_userguide.pdf, LICENSE, release_notes, BUSCO_v1.1.py, README.txt, sample_data
  • Download the lineage-specific BUSCO data from HERE (Dataset section) and extract it into the same directory as the script. I downloaded the eukaryote-specific dataset, which is named eukaryota.
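
The extraction step can be sketched as follows, assuming the lineage data arrives as a gzip-compressed tar archive. The archive name eukaryota.tar.gz and the helper extract_lineage are placeholders for illustration, not names taken from the BUSCO site; substitute the file you actually downloaded.

```shell
#!/bin/sh
# Extract a lineage archive into the directory that holds the BUSCO script.
# Usage: extract_lineage ARCHIVE BUSCO_DIR
extract_lineage() {
    archive="$1"
    busco_dir="$2"
    if [ ! -f "$archive" ]; then
        echo "archive not found: $archive" >&2
        return 1
    fi
    if [ ! -d "$busco_dir" ]; then
        echo "directory not found: $busco_dir" >&2
        return 1
    fi
    # -x extract, -z gunzip, -f archive file, -C target directory
    tar -xzf "$archive" -C "$busco_dir"
}

# Example with a placeholder file name:
# extract_lineage eukaryota.tar.gz busco
```

After this, the lineage directory should sit next to the BUSCO script inside 'busco', which is where the -l option in the commands further down expects to find it.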

Uses

  • Genome assembly assessment:
python BUSCO_v1.1b.py -o NAME -in ASSEMBLY -l LINEAGE -m genome
  • Gene set assessment:
python BUSCO_v1.1b.py -o NAME -in GENE_SET -l LINEAGE -m OGS
  • Transcriptome assessment:
python BUSCO_v1.1b.py -o NAME -in TRANSCRIPTOME -l LINEAGE -m trans
NAME - name to use for the run and all temporary files
ASSEMBLY/GENE_SET/TRANSCRIPTOME - input file in FASTA format
LINEAGE - path to the lineage to be used (-l eukaryota, for example)
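
To make the placeholders concrete, here is a small sketch that assembles the command line for a given mode and rejects anything other than the three modes listed above. busco_cmd, run1 and my_assembly.fasta are made-up names for illustration; the script name and flags are copied from the commands above. It only prints the command and does not run BUSCO.

```shell
#!/bin/sh
# Print the BUSCO command line for one run.
# Usage: busco_cmd NAME INPUT_FASTA LINEAGE MODE
busco_cmd() {
    name="$1"
    input="$2"
    lineage="$3"
    mode="$4"
    # Only the three modes used in this post are accepted.
    case "$mode" in
        genome|OGS|trans) ;;
        *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac
    echo "python BUSCO_v1.1b.py -o $name -in $input -l $lineage -m $mode"
}

# Hypothetical run name and input file:
busco_cmd run1 my_assembly.fasta eukaryota genome
# prints: python BUSCO_v1.1b.py -o run1 -in my_assembly.fasta -l eukaryota -m genome
```

To start the run, paste the printed line into a terminal from inside the 'busco' directory.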
