Best Method to Find Homologous Genes of a Species?

So, you have a gene of interest that you would like to compare with its homologs. Since the number of sequenced genomes is increasing day by day, the obvious question is: what is the best method to find homologous genes across different organisms? Before starting, let's clear up some terms. Homology is a broad term, and several related terms are used alongside it. Some of them are:

Homologous : genes that share a common ancestor, whether they sit in the same species or in different ones. For instance, the hemoglobin genes of human and rat.
Orthologous : homologs in different species that were separated by a speciation event.
Paralogous : homologs that arose by gene duplication, e.g. the hemoglobin A, hemoglobin A2, hemoglobin B and hemoglobin F genes in the human genome.
Ohnologous : paralogs that originated through whole-genome duplication.
Xenologous : homologs resulting from horizontal gene transfer between two organisms.

I have said many times that NCBI is my first love, so whenever I look for homologs, I start with the NCBI BLAST services. BLAST gives you a good rough idea of the homologs of your gene of interest. NCBI itself provides a very good tutorial on finding a homologous gene in another organism. To find a homolog of a gene in another organism, you can search NCBI by gene name, accession number, or sequence.
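As a rough illustration of turning a BLAST result into a shortlist of candidate homologs, here is a minimal Python sketch. It assumes you saved the hits in the standard 12-column tabular format of BLAST+ (`-outfmt 6`). The sample rows, the gene names, and the thresholds (`max_evalue`, `min_identity`) are made up for the example and are not recommendations from NCBI.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    query: str
    subject: str
    pident: float    # percent identity
    length: int      # alignment length
    evalue: float
    bitscore: float


def parse_tabular(text: str) -> List[Hit]:
    """Parse BLAST+ tabular output (12 default columns, tab-separated)."""
    hits = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        f = line.split("\t")
        hits.append(Hit(f[0], f[1], float(f[2]), int(f[3]),
                        float(f[10]), float(f[11])))
    return hits


def candidate_homologs(hits: List[Hit],
                       max_evalue: float = 1e-5,
                       min_identity: float = 30.0) -> List[Hit]:
    """Keep hits passing both (illustrative) thresholds, best bit score first."""
    kept = [h for h in hits
            if h.evalue <= max_evalue and h.pident >= min_identity]
    return sorted(kept, key=lambda h: h.bitscore, reverse=True)


# Invented example rows: one query gene against three subject sequences.
SAMPLE = "\n".join([
    "geneX\tratA\t85.0\t140\t21\t0\t1\t140\t1\t140\t1e-80\t250",
    "geneX\tratB\t42.5\t135\t70\t3\t3\t137\t5\t139\t2e-20\t95.1",
    "geneX\tratC\t28.0\t60\t40\t2\t10\t69\t200\t259\t0.5\t25.0",
])

best = candidate_homologs(parse_tabular(SAMPLE))
print([h.subject for h in best])
```

A filter like this only gives the "good rough idea" mentioned above: a significant hit suggests homology, but it does not tell you whether the hit is an ortholog or a paralog.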

Alternatively, you can identify homologs of your gene of interest from pre-computed databases, and a huge number of them are available. The best thing about an existing database is that it was built with more sophisticated ortholog-detection methods than a simple BLAST search, and since everything is already pre-computed, lookups are much faster. Roughly, these databases are built on two different but closely related principles:

  • Phylogeny-based analysis
  • Blast-based analysis
Note that neither of these methods is free of artifacts, and both can lead to wrong predictions, so the right choice depends on the purpose of your study. For this reason a third, independent approach is emerging, which infers homologs from other genomic features rather than from the coding sequence alone. A list of the most common homolog databases is given below:

Phylogeny-based analysis

  • Homology predictions computed from the resources of seven popular homology prediction services: PhylomeDB, EnsemblCompara, EggNOG, OrthoMCL, COG, Fungal Orthogroups, and TreeFam.
  • PhylomeDB 'allows users to interactively explore the evolutionary history of genes through the visualization of phylogenetic trees and multiple sequence alignments.'
  • EnsemblCompara is another phylogeny-based server for orthology and paralogy predictions.
  • PhyloFacts contains multiple sequence alignments, one or more phylogenetic trees, predicted 3D protein structures, cellular localization, and Gene Ontology (GO) annotations and evidence codes.

Blast-based analysis

  • InParanoid includes 100 organisms in its database. A stand-alone version can also be downloaded.
  • EggNOG is a database of orthologous groups of genes, with 721,801 orthologous groups in 1133 species, covering 4,396,591 proteins.
  • OrthoMCL DB covers 124,740 ortholog groups from 150 genomes. You can search for homologs by ID, keyword, Pfam domain, phyletic pattern, or group properties.
The list is endless, as new tools become publicly available very often.
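To make the BLAST-based principle more concrete, here is a small Python sketch of reciprocal best hits, a simple and widely used heuristic for calling putative orthologs between two species. It is a simplified illustration, not the actual algorithm of any database listed above. The gene names and bit scores are invented, and the functions `best_hits` and `reciprocal_best_hits` are hypothetical helpers written for this example.

```python
from typing import Dict, List, Tuple

# (query gene, subject gene) -> bit score from a BLAST search
Scores = Dict[Tuple[str, str], float]


def best_hits(scores: Scores) -> Dict[str, str]:
    """For every query gene, return its highest-scoring subject gene."""
    best: Dict[str, Tuple[str, float]] = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b: Scores, b_vs_a: Scores) -> List[Tuple[str, str]]:
    """Pairs (a, b) where b is the best hit of a AND a is the best hit of b."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Invented scores: two human genes searched against two rat genes, and back.
human_vs_rat = {
    ("hA1", "rA1"): 250.0, ("hA1", "rA2"): 120.0,
    ("hA2", "rA1"): 200.0, ("hA2", "rA2"): 90.0,
}
rat_vs_human = {
    ("rA1", "hA1"): 248.0, ("rA1", "hA2"): 199.0,
    ("rA2", "hA1"): 118.0, ("rA2", "hA2"): 88.0,
}

pairs = reciprocal_best_hits(human_vs_rat, rat_vs_human)
print(pairs)
```

In this toy data only `hA1` and `rA1` pick each other, so `hA2` and `rA2` are left unpaired. That is exactly the kind of artifact mentioned earlier: after a gene duplication, a best-hit approach can miss or mislabel homologs, which is why phylogeny-based databases reconstruct gene trees instead.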
