Introduction to GenBank


Here are some sample GenBank files:
NM_003938
XM_045915
NM_000229
BC000578

Study the organization of these files. Notice indentation, use of slashes, capitalization, spacing, etc.
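
To make the layout concrete, here's a minimal sketch that prints the top-level keywords of a record. It assumes the usual flat-file layout: top-level keywords such as LOCUS and ACCESSION start in column 1 in capitals, sub-keywords such as ORGANISM are indented, feature qualifiers start with a slash, and a line containing only // ends the record. The record in the example is a made-up, heavily shortened stand-in, not a real GenBank entry, and the sub name top_level_keywords is just an illustrative choice.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the top-level keywords of a GenBank record, in order.
# Assumption: top-level keywords are capitalized words starting in column 1,
# and a line beginning with "//" marks the end of the record.
sub top_level_keywords {
    my @lines = @_;
    my @keywords;
    foreach my $line (@lines) {
        last if $line =~ m{^//};      # end-of-record marker
        if ($line =~ /^([A-Z]+)/) {   # indented lines do not match
            push @keywords, $1;
        }
    }
    return @keywords;
}

# Made-up, shortened record for illustration only (not a real entry).
my $sample = <<'END_RECORD';
LOCUS       EXAMPLE01     24 bp    DNA     linear   SYN 01-JAN-2000
DEFINITION  Made-up record for illustration only.
ACCESSION   EXAMPLE01
SOURCE      synthetic construct
  ORGANISM  synthetic construct
FEATURES             Location/Qualifiers
     source          1..24
                     /organism="synthetic construct"
ORIGIN
        1 acgtacgtac gtacgtacgt acgt
//
END_RECORD

my @record = split /\n/, $sample;
print "$_\n" foreach top_level_keywords(@record);
```

Run on the sample, it prints LOCUS, DEFINITION, ACCESSION, SOURCE, FEATURES and ORIGIN, one per line; the indented ORGANISM line, the feature lines, and the sequence line are skipped.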

You can download GenBank files from GenBank. Try doing a nucleotide search for melanoma. The search returns a list of records containing the term melanoma. Select one of these records. Once it is displayed, click the "Text" button near the top of the page. Download the file by selecting File-->Save As... and give it an appropriate name.

Here's a little script that lists the files in a directory and then displays the entire contents of the file the user names:

#!/usr/bin/perl -w
print "Available Files:\n";
@files = ();
$folder = 'GENBANK';
unless(opendir(FOLDER, $folder)){
    print "Cannot open folder\n";
    exit;
}
@files = readdir(FOLDER);
closedir(FOLDER);
foreach $f (@files){
    print "$f\n";
}
print "Enter name of file: ";
$filename = <STDIN>;
chomp $filename;
$filename = 'GENBANK/'.$filename;
open(FH, $filename);
@data = <FH>;
close(FH);
print @data;

In order for this script to work you must have all your GenBank files stored in a directory called GENBANK and the script must exist in the same directory as the GENBANK directory.

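
Two details in the script above are worth a closer look: readdir also returns the special entries . and .., and the result of open is never checked, so a mistyped file name just prints nothing (plus a warning under -w). Here's a minimal sketch of one way to handle both. The sub names list_files and read_file are illustrative and not part of the original script; the folder name GENBANK is the one used above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the names of the regular files in $folder, sorted.
sub list_files {
    my ($folder) = @_;
    opendir(my $dh, $folder) or die "Cannot open folder $folder: $!\n";
    # -f keeps regular files only, which also drops '.' and '..'
    my @files = grep { -f "$folder/$_" } readdir($dh);
    closedir($dh);
    return sort @files;
}

# Return the lines of a file, or die with the reason if it cannot be opened.
sub read_file {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!\n";
    my @data = <$fh>;
    close($fh);
    return @data;
}

my $folder = 'GENBANK';
if (-d $folder) {
    print "Available Files:\n";
    print "$_\n" foreach list_files($folder);
}
```

The prompt for a file name can stay as it is; the only change would be to print the result of read_file("$folder/$filename") instead of reading from a bare filehandle.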
Here's a slightly modified version of this script which returns the ACCESSION number contained in the file:

#!/usr/bin/perl -w
print "Available Files:\n";
@files = ();
$folder = 'GENBANK';
unless(opendir(FOLDER, $folder)){
    print "Cannot open folder\n";
    exit;
}
@files = readdir(FOLDER);
closedir(FOLDER);
foreach $f (@files){
    print "$f\n";
}
print "Enter name of file: ";
$filename = <STDIN>;
chomp $filename;
$filename = 'GENBANK/'.$filename;
open(FH, $filename);
@data = <FH>;
close(FH);
foreach my $line (@data){
    if($line =~ /^ACCESSION/ ){
        $line =~ s/^ACCESSION\s*//;
        chomp ($line);
        print "$line\n";
        exit;
    }
}
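
The same pattern match works for other one-line fields. As a rough sketch, the keyword can be passed in as a parameter instead of being written into the regular expression. The sub name get_field and the sample lines are made up for illustration. Note that this only returns the first line of a field, so a value that continues over several lines would need extra handling.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the text following $keyword on the first line that starts with it,
# or undef if no line starts with that keyword.
# Only the first line of the field is returned.
sub get_field {
    my ($keyword, @lines) = @_;
    foreach my $line (@lines) {
        # \Q...\E treats the keyword literally; surrounding whitespace is trimmed
        if ($line =~ /^\Q$keyword\E\s+(.*?)\s*$/) {
            return $1;
        }
    }
    return;
}

# Made-up lines standing in for the contents of a GenBank file.
my @data = (
    "LOCUS       EXAMPLE01     24 bp    DNA\n",
    "DEFINITION  Made-up record for illustration only.\n",
    "VERSION     EXAMPLE01.1\n",
);

foreach my $keyword ('DEFINITION', 'VERSION', 'KEYWORDS') {
    my $value = get_field($keyword, @data);
    print "$keyword: ", (defined $value ? $value : 'not found'), "\n";
}
```

Returning undef for a missing keyword lets the caller decide what to print, rather than exiting from inside the loop as the script above does.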

ASSIGNMENT:
First of all you will need to download another six files from GenBank so that you have a directory containing ten GenBank files. Next you will write a script which lists the files in your GENBANK directory, allows the user to select a file from the list, and then displays accession number, base count, and source for the selected file.