For these problems, you will be processing DNA data from the file dna

Subject: Computer Science

For these problems, you will be processing DNA data from the file dna.txt. Data is printed in the file in pairs of lines. The first line in the pair is the name of the DNA sequence and the second line is the DNA sequence itself. The following provides you with some context for the task, but an understanding of DNA is not required for this assignment.

DNA consists of long chains of chemical compounds called nucleotides. Four nucleotides are present in DNA: Adenine (A), Cytosine (C), Guanine (G), and Thymine (T). Certain regions of the DNA are called genes. Most genes encode instructions for building proteins (they're called "protein-coding" genes). These proteins are responsible for carrying out most of the life processes of the organism. Nucleotides in a gene are organized into codons. Codons are groups of three nucleotides and are written as the first letters of their nucleotides (e.g., TAC or GGA). Each codon uniquely encodes a single amino acid, a building block of proteins.

For these problems you will be identifying protein-encoding genes as well as other attributes of the genetic data in the file. Note that all matches for these problems will be case-insensitive.

For each problem, write a single bash shell statement using grep that performs the requested operation(s).

You may use input/output redirection operators such as >, <, and |.

1) Print all of the DNA sequences in the file (all of the non-names).

2) Print the full DNA sequences that contain the word "CAT", each preceded by the name of the sequence. Hint: read man grep and look for options that alter what gets printed for a match.

dna.txt content:

Simple Protein-Coding Gene

ATGCCACTATGGTAG

Upper and Lowercase Protein

ATgCCAACATGgATGCCcGATAtGGATTgA

Valid Gene

ATGCGACCCTAGTAG

Invalid Gene

ATGCGACCCTAGTAGGG

Another Protein-Coding Gene

ATGACCGACTCAGTATAA

Yet-Another Protein-Coding Gene

ATGATCGACTACGATTAG

Yet-Again-Another Protein-Coding Gene

ATGATTGGGCCCGCTTAGTAGTGA

Invalid Protein-Coding Gene

ATGACCGACTCAGTAAAT

Another Invalid Protein-Coding Gene

ATGACCGACTAG

Yet-Another Invalid Protein-Coding Gene

AGGATTGGGCCCGCTTAGTAGTGA

Feline-Encoding DNA

CATCATCATCATCATCATCATCATCATCAT

Palindrome DNA

ACTTCA

1) correct output:

-ATGCCACTATGGTAG

-ATgCCAACATGgATGCCcGATAtGGATTgA

-ATGCGACCCTAGTAG

-ATGCGACCCTAGTAGGG

-ATGACCGACTCAGTATAA

-ATGATCGACTACGATTAG

-ATGATTGGGCCCGCTTAGTAGTGA

-ATGACCGACTCAGTAAAT

-ATGACCGACTAG

-AGGATTGGGCCCGCTTAGTAGTGA

-CATCATCATCATCATCATCATCATCATCAT

-ACTTCA

-TAGACGTACCTTAG
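One possible statement for problem 1, as a sketch rather than the official answer: match only lines made up entirely of the letters A, C, G, and T in any case. The sample file below is an abbreviated copy of the listing, and it assumes that each name is followed directly by its sequence with no blank line in between, so it prints fewer lines than the full expected output.

```shell
# Work in a scratch directory with an abbreviated dna.txt (assumption: the
# real file has no blank lines between a name and its sequence).
cd "$(mktemp -d)"
cat > dna.txt <<'EOF'
Simple Protein-Coding Gene
ATGCCACTATGGTAG
Upper and Lowercase Protein
ATgCCAACATGgATGCCcGATAtGGATTgA
Feline-Encoding DNA
CATCATCATCATCATCATCATCATCATCAT
Palindrome DNA
ACTTCA
EOF

# -i ignores case, -E enables extended regular expressions.
# ^[acgt]+$ accepts a line only if every character is a nucleotide letter,
# so names (which contain spaces, hyphens, or other letters) are dropped.
grep -iE '^[acgt]+$' dna.txt
```

This relies on no sequence name consisting solely of the letters A, C, G, and T, which holds for the names in the listing.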

2) correct output:

-Upper and Lowercase Protein

-ATgCCAACATGgATGCCcGATAtGGATTgA

---

-Feline-Encoding DNA

-CATCATCATCATCATCATCATCATCATCAT
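One possible statement for problem 2, again a sketch under the same assumption that each name sits on the line directly above its sequence. The -B option (supported by GNU and BSD grep) prints lines of leading context before each match, so -B1 pulls in the name. grep separates non-adjacent groups of output with a line containing "--", which appears to be the separator line in the output above.

```shell
# Work in a scratch directory with an abbreviated dna.txt (assumption: the
# real file has no blank lines between a name and its sequence).
cd "$(mktemp -d)"
cat > dna.txt <<'EOF'
Simple Protein-Coding Gene
ATGCCACTATGGTAG
Upper and Lowercase Protein
ATgCCAACATGgATGCCcGATAtGGATTgA
Valid Gene
ATGCGACCCTAGTAG
Feline-Encoding DNA
CATCATCATCATCATCATCATCATCATCAT
EOF

# -i ignores case; -B1 prints the one line before each matching line,
# which is the sequence name when names and sequences alternate.
grep -i -B1 'cat' dna.txt
```

If a sequence name could itself contain "cat", a stricter pattern such as '^[acgt]*cat[acgt]*$' would restrict matches to sequence lines; none of the names in the listing require this.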
