Perl Script 2: Convert a Multi-FASTA File into a Single-Line FASTA File

Question:
I have a multi-FASTA file with several unique names (e.g. >Cryptococcus gattii), and I need to generate another file in single-line format, i.e. each header on one line with its whole sequence on a single line below it.

>Cryptococcus gattii
MGIKGLTGLLSENAPKCMKDHEMKTLFGRKVAIDASMSIYQFLIAVRQQDGQMLMNESGDVTSHLMGFFYRTIRMVDHGIKPCYIFDGKPPELKGSVLAKRFARREEAKEGEEEAKETGTAEDVDKLARRQVRVTREHNEECKKLLSLMGIPVVTAPGEAEAQCAELARAGKVYAAGSEDMDTLTFHSPILLRHLTFSEAKKMPISEIHLDVALRDLEMSMDQFIELCILLGCDYLEPCKGIGPKTALKLMREHGTLGKVVEHIRGKMAEKAEEIKAAADEEAEAEAEAEKYDSDPENEEGGETMINSDGEEVPAPSKPKSPKKKAPAKKKKIASSGMQIPEFWPWEEAKQLFLKPDVVNGDDLVLEWKQPDTEGLVEFLCRDKGFNEDRVRAGAAKLSKMLAAKQQGRLDGFFTVKPKEPAAKDAGKGKGKDTKGEKRKAEEKGAAKKKTKK
>Daphnia pulex
MGIKGLTQVIGDTAPTAIKENEIKNYFGRKVAIDASMSIYQFLIAVRSEGAMLTSADGETTSHLMGIFYRTIRMVDNGIKPVYVFDGKPPDMKGGELTKRAEKREEASKQLVLATDAGDAVEMEKMNKRLVKVNKGHTDECKQLLTLMGIPYVEAPCEAEAQCAALVKAGKVYATATEDMDSLTFGSNVLLRYLTYSEAKKMPIKEFHLDKILDGLSYTMDEFIDLCIMLGCDYCDTIKGIGAKRAKELIDKHRCIEKVIENLDTKKYTVPENWPYQEARRLFKTPDVADAETLDLKWTQPDEEGLVKFMCGDKNFNEERIRSGAKKLCKAKTGQTQGRLDSFFKVLPSSKPSTPSTPASKRKVGCIIYLFLYF

Answer
Multi-FASTA files often store each sequence wrapped across several lines below its header (multi-line FASTA) instead of on a single line. Some software, however, expects each sequence on a single line, so the wrapped lines need to be joined into one. The following Perl script can help to do that:
singleline.pl
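
For readers who cannot obtain singleline.pl, a minimal sketch of such a script might look like the following. It is not necessarily identical to the original script: the subroutine name fasta_to_single_line is hypothetical, and skipping blank lines and removing whitespace inside sequences are assumptions made for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (name chosen for this sketch): reads FASTA records
# from the filehandle $in and writes each header followed by its whole
# sequence on one line to the filehandle $out.
sub fasta_to_single_line {
    my ($in, $out) = @_;
    my $seen_sequence = 0;    # true once sequence text was printed for the current record

    while (my $line = <$in>) {
        $line =~ s/[\r\n]+\z//;        # strip Unix or Windows line endings
        next if $line =~ /^\s*$/;      # assumption: blank lines are ignored

        if ($line =~ /^>/) {
            # A new header: finish the previous sequence line first.
            print {$out} "\n" if $seen_sequence;
            print {$out} "$line\n";
            $seen_sequence = 0;
        }
        else {
            $line =~ s/\s+//g;         # assumption: whitespace inside sequences is dropped
            print {$out} $line;        # no newline, so wrapped lines are joined
            $seen_sequence = 1;
        }
    }
    print {$out} "\n" if $seen_sequence;    # terminate the last sequence
    return;
}

# Command-line use: perl singleline.pl input.txt > output.txt
if (@ARGV) {
    my $file = $ARGV[0];
    open(my $in, '<', $file) or die "Cannot open '$file': $!\n";
    fasta_to_single_line($in, \*STDOUT);
    close($in);
}
else {
    warn "Usage: perl singleline.pl input.txt > output.txt\n";
}
```

The input file is opened with the three-argument form of open and read through a lexical filehandle. The '<' mode and the <$in> read operator are exactly the kind of angle-bracket text that a web page can swallow, so it is worth checking that they are present if you copy the code from a browser.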



Assuming the sequences are stored in 'input.txt', the result should be written to 'output.txt', and the Perl script is saved on your desktop as 'singleline.pl', you can run the script as follows:


perl singleline.pl input.txt > output.txt



In the result, each header line is followed by its complete sequence on a single line.
Source: Biostar Forum

6 comments:

  1. I noticed a few possible errors with your script when I tried running it as posted, and I had to make a few small changes for it to actually work. The issue arises from the way this website handles the '<' and '>' characters: it effectively strips the filehandle reads from your script, so it does not work as displayed. You might want to consider hosting the script elsewhere or finding a way around this. Either way, thank you for posting this; I too have needed to convert sequence files to single-line format.

  2. Hi jordyn,
    Thanks for your suggestion. I will consider it in the future.

  3. Thanks. It's very useful

  4. Thanks, the template helped me a lot.

  5. Thank you. Your blog has been very useful. This script works very well

    Replies
    1. Hi Chucao,
      Always welcome. Pleased to learn that the Perl script worked for you.

