How to Predict Genes from Multiple Sequences
Gene prediction tools, or ORF finders, are indispensable for both molecular biologists and bioinformaticians, so there are many programs and servers for predicting genes in genomic DNA sequences. Some of these tools are trained to predict genes in a specific genome, while others work ab initio. The problem with most gene prediction servers is their output. It is fine if your input is a single gene or a few genes, but it becomes hard to handle when the input file is very large. In other words, gene prediction for multiple sequences is difficult if you don't know any programming language.
In this post, let's discuss a server that can predict genes from multiple sequences. ORF FIND is hosted on GreenGene, University of Massachusetts, Lowell. Its simple interface is really easy to use. This ORF finder on the GreenGene server finds ORFs in a DNA file containing multiple sequences by using GLIMMER to find the ORF coordinates and EMBOSS to extract the amino acid sequences from the predicted ORF DNA sequences.
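To make the last step described above more concrete, here is a minimal Python sketch of turning an ORF's coordinates into an amino acid sequence. It is not the code that GLIMMER, EMBOSS, or the server runs. The function name `translate_orf`, the 1-based inclusive coordinates, the forward-strand-only handling, and the use of the standard genetic code are all assumptions made for illustration.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_orf(dna, start, end):
    """Translate the ORF at 1-based inclusive coordinates (assumed convention).

    Only the forward strand is handled. Codons containing anything other
    than A, C, G, or T are reported as 'X'; stop codons are reported as '*'.
    """
    orf = dna[start - 1:end].upper()
    usable = len(orf) - len(orf) % 3  # ignore a trailing partial codon
    return "".join(
        CODON_TABLE.get(orf[i:i + 3], "X") for i in range(0, usable, 3)
    )
```

For example, calling `translate_orf` on a sequence with the coordinates of one predicted ORF returns that ORF's protein as a one-letter string. A real pipeline would also need to handle ORFs on the reverse strand.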
[Figure: Steps in gene prediction from multiple sequences by ORF finder]
Finally, the results of gene prediction for all the sequences will appear in a temporary folder, where the predicted ORFs, the predicted proteins, and the input file can easily be found.
[Figure: ORF Finder result folder]
Here, it's important to note that the input file format matters for successful prediction from multiple sequences. In your multi-FASTA file, each sequence should always be on a single line after its '>sequence description' line. Look below for details:
> Correct Format
CCTCCTCCTGTTTTTCCCTCAATACAACCTCATTGGATTATTCAATTCACCATCCTGCCCTTGTTCCTTCCATTATACAGCTGTCTTTGCCCTCTCCTTCTCTCGCTGGACTGTTCACCAACTCTCAGCCCGCGATCCCAATTTCCAGACAACCCATCTTATCAGCTTGGCCACGGCCTCGACCCGAACAGACCGGCGTCCAGCGAGAAGAGCGTCGCCTCGACGCCTCTGCTTGACCGCACCTTGATGCTCAAGACTTATCGCGATGCCAAGAAGCGTCTCATCATGTTCGACTACGA

> Wrong Format
CGAAACGGGCACCTATACAACGATTGAAACCATTATTCAAGCTCAGCAAGCGTCTATGC
TAGCGGTTATTGCGAGCACTTCAGCGGTTGCTACTACGACTACTACTTGATAAATGAAA
CGGCTATAAAAGAGGCTGGGGCAAAAGTATGTTAGTTGAAGGGTGACCTGAACGATGAA
TCGGTCGAATTTTTTATTGGCAGAGGGAAGGTAGGTTTACTCAATTTAGTTACTTCTAG
CCGTTGATTGGAGGAGCGCAAGCGACGAGGAGGCTCATCGGCCGCCCGCGGAAAGCGTA
GTCTTACACGGAAATCAACGGCGGTGTCATAAGCGAG
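If your sequences are wrapped over several lines like the wrong example, a few lines of Python can join them before you upload the file. This is only a minimal sketch: the function name `linearize_fasta` is my own, it assumes every record starts with a '>' line, and it has not been tested against the server itself.

```python
def linearize_fasta(text):
    """Return FASTA text with each sequence joined onto a single line.

    Assumes every record begins with a '>' description line. Blank lines
    and any sequence lines before the first header are dropped.
    """
    records = []
    header, chunks = None, []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record begins, so store the previous one first.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line, []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return "".join(f"{h}\n{seq}\n" for h, seq in records)
```

Read your input file into a string, pass it to `linearize_fasta`, and write the returned text to a new file. That new file should then have every sequence on one line, as in the correct example.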
First of all, let me make it clear that I am not a bioinformatician. I am simply a plant biology researcher who faces problems in her daily research life and who bothers to post the solutions to those problems on this webpage. That's it.