Converting NBRF Format

NBRF files consist of one or more groups of non-blank lines separated by one or more blank lines; the non-blank lines look similar to this:

 

 >P1;Chloroflex
 Chloroflex 428 bases
  MSKEHVQTIA TDDVSKNGHT PPTNASTPPY PFVAIVGQAE LKLALLLCVV
  NPTIGGVMVM GHRGTAKSTA VRALAAMLPP IKAVAGCPYS CAPDRTAGLC
  DQCRALEQQS GKTKKPAVIN IPVPVVDLPL GATEDRVCGT LDIERALTQG
  VQAFAPGLLA RANRGFLYID EVNLLEDHLV DVLLDVAASG VNVVEREGVS
  VRHPARFVLV GSGNPEEGDL RPQLLDRFGL HARITTITDV SERVEIVKRR
  REYDADPFAF VEKWAKETQK LQRKIKQAQR RLPEVILPDP VLYKIAELCV
  KLEVDGHRGE LTLARA-ATA LAALEGRNEV TVQDVRRIAV LALRHRLRKD
  PLETQD---- ---DAVRIER AVEEVLVP-- ---------- ----------
  ---------- ---------- --------*

 

Each group begins with a line starting with a greater-than symbol (‘>’). This line is ignored. The first word on the following line (e.g., Chloroflex above) is treated as the name of the sequence; the rest of that line is ignored. Subsequent lines are taken as the sequence. Note that the asterisk (‘*’) that ends the sequence in the example does not appear in the converted output. This example would be converted to the MEGA file format as follows:

 

#mega
!Title: filename

#Chloroflex
MSKEHVQTIA TDDVSKNGHT PPTNASTPPY PFVAIVGQAE LKLALLLCVV
NPTIGGVMVM GHRGTAKSTA VRALAAMLPP IKAVAGCPYS CAPDRTAGLC
DQCRALEQQS GKTKKPAVIN IPVPVVDLPL GATEDRVCGT LDIERALTQG
VQAFAPGLLA RANRGFLYID EVNLLEDHLV DVLLDVAASG VNVVEREGVS
VRHPARFVLV GSGNPEEGDL RPQLLDRFGL HARITTITDV SERVEIVKRR
REYDADPFAF VEKWAKETQK LQRKIKQAQR RLPEVILPDP VLYKIAELCV
KLEVDGHRGE LTLARA-ATA LAALEGRNEV TVQDVRRIAV LALRHRLRKD
PLETQD---- ---DAVRIER AVEEVLVP-- ---------- ----------
---------- ---------- --------
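
The conversion rules described above can be sketched in a few lines of Python. This is only an illustration of those rules, not MEGA's own converter: the function name `nbrf_to_mega`, its `title` parameter, and the decision to skip groups that do not start with ‘>’ are assumptions made for this example. Dropping the trailing asterisk follows the example output above.

```python
def nbrf_to_mega(text, title="filename"):
    """Convert NBRF-formatted text to MEGA-formatted text (illustrative sketch)."""
    # Collect groups of consecutive non-blank lines; blank lines separate groups.
    groups, current = [], []
    for line in text.splitlines():
        if line.strip():
            current.append(line.strip())
        elif current:
            groups.append(current)
            current = []
    if current:
        groups.append(current)

    out = ["#mega", f"!Title: {title}", ""]
    for group in groups:
        # Assumption: groups without a '>' line or a name line are skipped.
        if not group[0].startswith(">") or len(group) < 2:
            continue
        # The '>' line is ignored; the first word of the next line is the name.
        name = group[1].split()[0]
        out.append(f"#{name}")
        # Remaining lines are the sequence; the terminating '*' is dropped.
        for seq_line in group[2:]:
            seq_line = seq_line.rstrip("*").rstrip()
            if seq_line:
                out.append(seq_line)
        out.append("")
    return "\n".join(out).rstrip("\n") + "\n"


sample = (
    ">P1;Chloroflex\n"
    "Chloroflex 428 bases\n"
    "  MSKEHVQTIA TDDVSKNGHT\n"
    "  ---------- --------*\n"
)
print(nbrf_to_mega(sample))
```

Each sequence name is written on its own line prefixed with ‘#’, which matches the layout of the converted example above.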