String Processing

Arrays: Pop, Push, Shift, and Unshift

In this lesson the student will learn how to:
  1. Add items to an array using push
  2. Add items to an array using unshift
  3. Retrieve items from an array using pop
  4. Retrieve items from an array using shift
By the end of this lesson the student will be able to:

     Write a script which can add or remove one
     item at a time from the front or the end of
     an array.

Ends of a DNA Sequence

As you might expect, a DNA sequence has two ends. The ends of a DNA sequence are chemically distinct. The two ends are usually called the 5' (five-prime) end and the 3' (three-prime) end. A phosphate group is found at the 5' end and a hydroxyl (-OH) group is found at the 3' end. A chain of nucleotides is linked covalently by phosphodiester bonds.

Complementary strands of DNA can be written like this:

     
      5'-ATTCCTCCA-3'
      3'-TAAGGAGGT-5'

Normally, though, only one strand is written, in the 5' to 3' direction:

       ATTCCTCCA

The complementary strand is left out because its content can be inferred from the first strand.

The two strands are held together by hydrogen bonds. Two hydrogen bonds form between each A and T pair, and three form between each G and C pair. (The chemical properties of DNA are important to know about, but they are not the focus of this unit, so we will not go into further detail here.)
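To see how the second strand can be inferred from the first, here is a minimal sketch that applies the A-T and G-C pairings to the strand shown above. The helper name complement is hypothetical and is not part of this lesson; the sketch assumes the strand contains only the uppercase letters A, C, G, and T.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the lesson): returns the complementary
# strand, base by base, using the pairings A-T and G-C.
sub complement {
    my ($strand) = @_;
    (my $paired = $strand) =~ tr/ACGT/TGCA/;
    return $paired;
}

my $top    = "ATTCCTCCA";        # written 5' to 3'
my $bottom = complement($top);   # reads 3' to 5' when lined up under $top

print "5'-$top-3'\n";
print "3'-$bottom-5'\n";

# Reversing the result gives the second strand written 5' to 3'.
my $bottom_5_to_3 = scalar(reverse($bottom));
print "$bottom_5_to_3\n";
```

The first two lines of output should match the pair of strands written out earlier in this section.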

push and unshift

You can use push and unshift to add items to an array. The big difference between the two functions is that one adds items to the front of the array and the other adds items to the end of the array.

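#!/usr/bin/perl
use strict;
use warnings;

# Add the numbers 1 to 10 with push, printing the array after each step.
my @L1 = ();
foreach my $item (1..10) {
    push(@L1, $item);
    print "@L1\n";
}

# Do the same with unshift.
my @L2 = ();
foreach my $item (1..10) {
    unshift(@L2, $item);
    print "@L2\n";
}

Run this sample script to see which adds to the front and which adds to the end.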
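Both functions can also take several items in one call, and each returns the new number of items in the array. The short sketch below shows this with an arbitrary starting array chosen only for illustration. Run it alongside the sample script and compare the results.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Arbitrary starting array, chosen only for illustration.
my @seq = ("C", "C");

# Each call adds two items at once and returns the new length of the array.
my $len_after_push = push(@seq, "G", "T");
print "after push:    @seq (length $len_after_push)\n";

my $len_after_unshift = unshift(@seq, "A", "T");
print "after unshift: @seq (length $len_after_unshift)\n";
```

Notice the order of the two items added by unshift: the whole list is added as a unit, so the items keep the order in which they were written.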

You might not be familiar with the .. construct, called the range operator. To see how it works, experiment with the following script:

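#!/usr/bin/perl
use strict;
use warnings;

# A range of numbers
foreach my $item (55..60) {
    print $item . " ";
}
print "\n";

# A range of uppercase letters (quoted so that they are strings)
foreach my $item ('A'..'Z') {
    print $item . " ";
}
print "\n";

# A range of lowercase letters
foreach my $item ('a'..'f') {
    print $item . " ";
}
print "\n";

# A range of numbers in reverse order
foreach my $item (reverse 1..10) {
    print $item . " ";
}
print "\n";
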
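Two further properties of the range operator are worth trying for yourself. A range counts upward only, so a range such as 10..1 produces no items at all, which is why the script above uses reverse to count down. A range can also be stored in an array or used to select several elements at once. The variable names in this sketch are illustrative only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# A range whose left end is greater than its right end is empty.
my @down = (10..1);
print "items in (10..1): ", scalar(@down), "\n";

# Reversing an upward range is one way to count down.
my @countdown = reverse(1..10);
print "@countdown\n";

# A range can fill an array, and a range of indexes selects several elements.
my @alphabet = ('A'..'Z');
print "first four letters: @alphabet[0..3]\n";
```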
pop and shift

You can also remove items from an array with pop and shift:

#!/usr/bin/perl
use strict;
use warnings;

# Empty the array with pop, printing each item as it is removed.
my @a = ("hello", 2, 33, -14, "A", "AA", "AAA", 13, 2);
while (@a) {
    my $item = pop(@a);
    print $item . " ";
}
print "\n";

# Refill the array and empty it again, this time with shift.
@a = ("hello", 2, 33, -14, "A", "AA", "AAA", 13, 2);
while (@a) {
    my $item = shift(@a);
    print $item . " ";
}
print "\n";

Inspect the output of this script to see the difference between the way shift and pop retrieve items from an array.
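The loops above rely on the condition while(@a). As a minimal sketch of why this works: when an array is used as a condition, it evaluates to the number of items it holds, so the loop ends once the array is empty. The sketch also shows what pop and shift return when nothing is left to remove. The array contents are arbitrary.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @a = ("A", "C", "G");

# In a condition, an array evaluates to the number of items it holds,
# so the loop stops as soon as the array is empty.
while (@a) {
    my $item = pop(@a);
    print "removed $item, ", scalar(@a), " left\n";
}

# Calling pop or shift on an empty array returns undef instead of an item.
my $nothing = shift(@a);
print "shift on an empty array returned undef\n" unless defined $nothing;
```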

Here's a slightly modified version of the same script giving you an array-centric view of the results of shift and pop:

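#!/usr/bin/perl
use strict;
use warnings;

# Remove one item at a time with pop, printing what is left each time.
my @a = ("hello", 33, -14, "A", "AA", "AAA", 13, 2);
while (@a) {
    pop(@a);
    print "@a\n";
}

# Refill the array and repeat with shift.
@a = ("hello", 33, -14, "A", "AA", "AAA", 13, 2);
while (@a) {
    shift(@a);
    print "@a\n";
}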
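To connect these functions back to the two ends of a DNA sequence, here is a small sketch that removes one nucleotide from each end of a strand. It assumes the strand is stored one nucleotide per array element, written 5' to 3'. The sequence GATTACA is an arbitrary example and does not come from this lesson.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumption for illustration: one nucleotide per array element,
# written 5' to 3'. The sequence itself is arbitrary.
my @strand = split(//, "GATTACA");

my $five_prime  = shift(@strand);   # nucleotide taken from the front
my $three_prime = pop(@strand);     # nucleotide taken from the end

print "removed from front: $five_prime\n";
print "removed from end:   $three_prime\n";
print "remaining: ", join("", @strand), "\n";
```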

ASSIGNMENT:

Write a script which repeatedly prompts the user to add or remove one item at a time from the front or the end of an array. Begin the script with an array containing only five nucleotides. Ensure that the items added are single nucleotides. In other words, reject any input which is not "A", "C", "G", or "T". Display the altered array after each pass through the loop, and end the loop when the user enters "Q".