String Processing

Two-Dimensional Arrays

In this lesson the student will learn how to:
  1. Create two-dimensional arrays.
  2. Access values stored in two-dimensional arrays.
By the end of this lesson the student will be able to:

	Write a script which stores all the single, double,
	triple, and quadruple nucleotide sets in a
	two-dimensional array.

Repetitive DNA

In a previous lesson we discussed telomeres, which are made of repeated TTAGGG sequences. Other repeated motifs are also common in DNA, although in many cases their exact function or cause is unknown. Dinucleotide and trinucleotide repeats show up in many places throughout mammalian genomes and are somewhat less prevalent in prokaryotes.

Some trinucleotide repeats are associated with specific diseases:

  1. CCG - a higher than normal number of repeats is associated with ataxia
  2. CAG - normally found in many locations; increased numbers of repeats are associated with male infertility, schizophrenia, ataxia, and Huntington's disease, depending on the locus involved, while a decrease in repeats is associated with prostate cancer
  3. CTG - many extra repeats are associated with myotonic dystrophy
  4. GCG - extra repeats are associated with several different disorders, including various neurological and skeletal disorders
  5. AAG - extra repeats are associated with many disorders, including prion disease

Two-Dimensional Arrays

#!/usr/bin/perl
@stuff = (
    ["coin", "pretzel", "cucumber", "frog", "bride"],
    ["found", "lost", "consumed", "created", "vaporized"],
    ["on", "under", "near", "beneath", "next to"],
    ["rock", "tree", "elephant", "house", "boulder"]
);
srand(time);
for($i=0; $i<5; $i++){
    $a = int rand(5);
    $b = int rand(5);
    $c = int rand(5);
    $d = int rand(5);
    print "Jill $stuff[1][$a] the $stuff[0][$b] $stuff[2][$c] the $stuff[3][$d].\n";
}

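Notice the use of square brackets both when creating the 2D array, where each inner pair of brackets encloses one row, and when accessing it, where the first index selects the row and the second selects the item within that row.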
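
As a minimal sketch of that indexing, the following script reads single items out of a small 2D array and counts its rows and the items in one row. The array name @grid and its contents are made up for illustration, and "use strict" with "my" is added here only as a common safety habit; the lesson's own examples work without them.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# A small 2D array (hypothetical data): each inner [ ... ] is one row
my @grid = (
    [ "A",  "C",  "G",  "T"  ],
    [ "AA", "AC", "AG", "AT" ],
);

# The first index picks the row, the second picks the item within that row
my $first = $grid[0][0];    # "A"
my $last  = $grid[1][3];    # "AT"

# Number of rows, and number of items in row 1
my $rows = scalar @grid;            # 2
my $cols = scalar @{ $grid[1] };    # 4

print "first=$first last=$last rows=$rows cols=$cols\n";
```

The expression @{ $grid[1] } treats row 1 as an ordinary array, which is why scalar can count its items.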

#!/usr/bin/perl
#2D arrays can hold several arrays each containing a
#different number of items
@AR = (
    [ "this", "that", "the", "then", "those"],
    [ 1, 2, 3, 4, 5, 6, 7, 8 ],
    [ "the dog", "that cat", "my nose", "your foot"],
    [ "thunderous applause", "towering peaks", "tormentuous rain"],
    [ 1000, 5000, 10000 ]
);
#add an item to the array
$AR[4][3]=25000;
#print all the items in the array
foreach $item (@AR){
    foreach $thing (@$item){
        print "$thing\n";
    }
}
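
Because rows can have different lengths, it can help to ask each row how long it is instead of hard-coding an index. The sketch below is one possible way to do that, using a cut-down version of the @AR array; the @lengths array is a hypothetical helper added only for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# A cut-down array with rows of different lengths
my @AR = (
    [ 1000, 5000, 10000 ],
    [ "this", "that" ],
);

# push adds to the end of a row, so the next free index need not be known
push @{$AR[0]}, 25000;

# Record and print the length of every row
my @lengths;
for my $i (0 .. $#AR) {
    my $n = scalar @{$AR[$i]};
    push @lengths, $n;
    print "row $i has $n items: @{$AR[$i]}\n";
}
```

Here push does the same job as assigning to $AR[4][3] in the example above, but without counting the existing items by hand.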

#!/usr/bin/perl
#row 0 holds the single letters; row 1 collects every two-letter combination
@abc = ( [ "a", "b", "c" ] );
$cnt = 0;
foreach $letter (@{$abc[0]}){
    foreach $alpha (@{$abc[0]}){
        $abc[1][$cnt] = "$letter$alpha";
        $cnt++;
    }
}
for($i=0; $i<9; $i++){
    print "$abc[1][$i]\n";
}

ASSIGNMENT:

Write a script which stores all the single, double, triple, and quadruple nucleotide sets in a two-dimensional array. Notice that the last example on this page shows how to build the single and double sets, using the letters a, b, and c in place of nucleotides. Your job is to adapt it to the four nucleotides and add code which produces the triple and quadruple nucleotide sets. Make sure you are clear on how many entries each set will contain before you begin working on this script.
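
One way to check the size of each set before writing the script is a short calculation. The sketch below assumes the four nucleotides A, C, G, and T, so a set of sequences of length n should contain 4 to the power n entries. The names $alphabet_size and @expected are hypothetical and are not part of the assignment.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Four possible nucleotides at each position: A, C, G, T
my $alphabet_size = 4;

# A set of sequences of length n should have 4 ** n entries
my @expected;
for my $n (1 .. 4) {
    $expected[$n] = $alphabet_size ** $n;
    print "length $n: $expected[$n] entries\n";
}
```

Comparing these numbers with the length of each row in your finished array is a quick test that no combination was missed or repeated.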