
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>

  
  <meta content="text/html; charset=ISO-8859-1" http-equiv="content-type">

  
  <title>Chapter1</title>
</head>


<body>

In this chapter, you will need the following:<br>

<br>

<br>

- <span style="font-weight: bold;">In Exercise 5:</span> <br>
You need a long English text. Here are plain-text (.txt) files of several of Mark Twain's books:
<a href="hfinn10.txt">hfinn10.txt</a>,
<a href="sawy210.txt">sawy210.txt</a>,
<a href="sawyr10.txt">sawyr10.txt</a>,
<a href="yanke11.txt">yanke11.txt</a>,
<a href="lmiss10.txt">lmiss10.txt</a>,
<a href="puddn10.txt">puddn10.txt</a>,
<a href="sawy310.txt">sawy310.txt</a> and
<a href="tramp11.txt">tramp11.txt</a>.

<br>
You will also find here <a href="MergedMatlabDotMFiles.txt">a .txt file</a> containing several merged Matlab files that are useful for reformatting the texts (removing punctuation, etc.) and for analyzing the n-tuples.
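<br>
<br>
The reformatting and n-tuple counting mentioned above can be sketched as follows. This is a minimal Python illustration, not a translation of the Matlab files; the function names <code>clean_text</code> and <code>ntuple_counts</code> and the sample sentence are assumptions made for this example.<br>
<pre>
```python
import re
from collections import Counter


def clean_text(raw):
    """Lowercase the text and keep only the letters a-z and single spaces."""
    letters_only = re.sub(r"[^a-z]+", " ", raw.lower())
    return letters_only.strip()


def ntuple_counts(text, n):
    """Count every run of n consecutive characters in text."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


# Hypothetical sample sentence; a real run would use one of the novels.
sample = clean_text("The cat sat; the cat ran!")
pairs = ntuple_counts(sample, 2)
print(pairs.most_common(3))
```
</pre>
To try it on a real text, replace the sample sentence with the contents of a downloaded file such as hfinn10.txt. Dividing each count by the total number of n-tuples gives the empirical n-tuple frequencies.<br>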

<br>

<br>

- <span style="font-weight: bold;">In Exercise 6:</span> <br>
You need statistics of DNA sequences. The genetic code reads the DNA three bases at a time: each of the
64 possible triplets (codons) codes either for one of the amino
acids or for the end of the protein, called the stop. A table of the triplets and their frequencies in human DNA is
available at <a href="http://www.kazusa.or.jp/codon/cgi-bin/showcodon.cgi?species=9606">http://www.kazusa.or.jp/codon/cgi-bin/showcodon.cgi?species=9606</a><br>

A vast library of DNA sequences is available online from either GenBank: <a href="http://www.ncbi.nlm.nih.gov/Genbank/">http://www.ncbi.nlm.nih.gov/Genbank/</a><br>

or EMBL: <a href="http://www.ebi.ac.uk/embl/">http://www.ebi.ac.uk/embl/</a>.<br>
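<br>
The division of a DNA sequence into triplets described in Exercise 6 can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function names <code>codons</code> and <code>codon_frequencies</code>, the <code>frame</code> parameter and the toy sequence are invented for this example, and the stop triplets are those of the standard genetic code.<br>
<pre>
```python
from collections import Counter

# Stop triplets of the standard genetic code, written in DNA letters.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(dna, frame=0):
    """Split a DNA string into consecutive triplets, starting at offset frame."""
    dna = dna.upper()
    # Drop the one or two trailing bases that do not fill a whole triplet.
    usable = len(dna) - (len(dna) - frame) % 3
    return [dna[i:i + 3] for i in range(frame, usable, 3)]


def codon_frequencies(dna, frame=0):
    """Relative frequency of each triplet in the chosen reading frame."""
    counts = Counter(codons(dna, frame))
    total = sum(counts.values())
    return {codon: count / total for codon, count in counts.items()}


# Hypothetical toy sequence, not taken from any database.
toy = "ATGGCCGCCTAAGG"
triplets = codons(toy)
ends_with_stop = triplets[-1] in STOP_CODONS
print(triplets, ends_with_stop)
print(codon_frequencies(toy))
```
</pre>
Changing <code>frame</code> to 1 or 2 shifts where the triplets start, which gives the other two ways of dividing the same strand.<br>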

</body>
</html>