Converting NBRF Format

NBRF files consist of one or more groups of non-blank lines separated by one or more blank lines; the non-blank lines look similar to this:

 

 >P1;Chloroflex
 Chloroflex 428 bases
  MSKEHVQTIA TDDVSKNGHT PPTNASTPPY PFVAIVGQAE LKLALLLCVV
  NPTIGGVMVM GHRGTAKSTA VRALAAMLPP IKAVAGCPYS CAPDRTAGLC
  DQCRALEQQS GKTKKPAVIN IPVPVVDLPL GATEDRVCGT LDIERALTQG
  VQAFAPGLLA RANRGFLYID EVNLLEDHLV DVLLDVAASG VNVVEREGVS
  VRHPARFVLV GSGNPEEGDL RPQLLDRFGL HARITTITDV SERVEIVKRR
  REYDADPFAF VEKWAKETQK LQRKIKQAQR RLPEVILPDP VLYKIAELCV
  KLEVDGHRGE LTLARA-ATA LAALEGRNEV TVQDVRRIAV LALRHRLRKD
  PLETQD---- ---DAVRIER AVEEVLVP-- ---------- ----------
  ---------- ---------- --------*

 

Each group begins with a line starting with a greater-than symbol (‘>’). This line is ignored. The first word on the following line (e.g., Chloroflex above) is treated as the name of the sequence; the rest of that line is ignored. Subsequent lines are taken as the sequence. This example would be converted to the MEGA file format as follows:

 

#mega
!Title: filename

#Chloroflex
MSKEHVQTIA TDDVSKNGHT PPTNASTPPY PFVAIVGQAE LKLALLLCVV
NPTIGGVMVM GHRGTAKSTA VRALAAMLPP IKAVAGCPYS CAPDRTAGLC
DQCRALEQQS GKTKKPAVIN IPVPVVDLPL GATEDRVCGT LDIERALTQG
VQAFAPGLLA RANRGFLYID EVNLLEDHLV DVLLDVAASG VNVVEREGVS
VRHPARFVLV GSGNPEEGDL RPQLLDRFGL HARITTITDV SERVEIVKRR
REYDADPFAF VEKWAKETQK LQRKIKQAQR RLPEVILPDP VLYKIAELCV
KLEVDGHRGE LTLARA-ATA LAALEGRNEV TVQDVRRIAV LALRHRLRKD
PLETQD---- ---DAVRIER AVEEVLVP-- ---------- ----------
---------- ---------- --------
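
The conversion rules described above can be sketched in a few lines of Python. This is only an illustration, not the converter that MEGA itself uses: the function name nbrf_to_mega and its title parameter are hypothetical, and two details are assumptions read off the example rather than stated rules, namely that the trailing asterisk is dropped from the sequence and that leading indentation is removed.

```python
def nbrf_to_mega(text, title="filename"):
    """Convert NBRF-formatted text to MEGA-formatted text (illustrative sketch)."""
    out = ["#mega", f"!Title: {title}", ""]
    group = []  # non-blank lines of the group currently being collected

    def flush():
        # A group is: a '>' line (ignored), a name line, then sequence lines.
        if len(group) >= 2 and group[0].startswith(">"):
            name = group[1].split()[0]  # first word only; rest of line ignored
            out.append(f"#{name}")
            for line in group[2:]:
                # Assumption: the terminating '*' is not carried over.
                seq = line.rstrip("*").rstrip()
                if seq:
                    out.append(seq)
            out.append("")
        group.clear()

    for raw in text.splitlines():
        line = raw.strip()
        if line:
            group.append(line)
        else:
            flush()  # a blank line ends the current group
    flush()  # the last group may not be followed by a blank line
    return "\n".join(out).rstrip() + "\n"
```

Calling nbrf_to_mega on the contents of an NBRF file returns the MEGA text as a single string, with one #name block per group. Groups that do not begin with a ‘>’ line are silently skipped in this sketch; a real converter would probably report them.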