# Hamming

Write a program that can calculate the Hamming distance between two DNA strands.

A mutation is simply a mistake that occurs during the creation or
copying of a nucleic acid, in particular DNA. Because nucleic acids are
vital to cellular functions, mutations tend to cause a ripple effect
throughout the cell. Although mutations are technically mistakes, a very
rare mutation may equip the cell with a beneficial attribute. In fact,
the macro effects of evolution are attributable to the accumulated
result of beneficial microscopic mutations over many generations.

The simplest and most common type of nucleic acid mutation is a point
mutation, which replaces one base with another at a single nucleotide.

By counting the number of differences between two homologous DNA strands
taken from different genomes with a common ancestor, we get a measure of
the minimum number of point mutations that could have occurred on the
evolutionary path between the two strands.

This is called the 'Hamming distance'.

It is found by comparing two DNA strands and counting how many of the
nucleotides are different from their equivalent in the other strand.

    GAGCCTACTAACGGGAT
    CATCGTAATGACGGCCT
    ^ ^ ^  ^ ^    ^^

The Hamming distance between these two DNA strands is 7.
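
The comparison described above can be sketched as a simple loop. This is a minimal illustration, not the exercise's reference solution: the function name `count_differences` is hypothetical, and the sketch assumes both strands have the same length.

```C++
#include <cstddef>
#include <string>

// Hypothetical helper: counts the positions at which two strands differ.
// Precondition (assumed, not checked): both strands have the same length.
std::size_t count_differences(const std::string& first, const std::string& second)
{
    std::size_t differences = 0;
    for (std::size_t i = 0; i < first.size(); ++i) {
        // A position contributes to the distance only when the nucleotides differ.
        if (first[i] != second[i]) {
            ++differences;
        }
    }
    return differences;
}
```

Called with the two strands shown above, `count_differences("GAGCCTACTAACGGGAT", "CATCGTAATGACGGCCT")` returns 7.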

# Implementation notes

The Hamming distance is only defined for sequences of equal length. Because the
definition says nothing about sequences of unequal length, each language
may handle that case differently.
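
One possible way to handle strands of unequal length is to reject them outright. The sketch below does this by throwing `std::domain_error`. Both the function name `checked_distance` and the choice of exception are assumptions made for illustration; the tests for this exercise decide what is actually expected.

```C++
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: refuses to compare strands of different lengths.
// Throwing std::domain_error is an assumption of this sketch, not a stated requirement.
std::size_t checked_distance(const std::string& first, const std::string& second)
{
    if (first.size() != second.size()) {
        throw std::domain_error("strands must have equal length");
    }

    std::size_t differences = 0;
    for (std::size_t i = 0; i < first.size(); ++i) {
        if (first[i] != second[i]) {
            ++differences;
        }
    }
    return differences;
}
```

An alternative policy would be to compare only up to the length of the shorter strand, at the cost of silently ignoring the extra nucleotides.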

## Getting Started

Make sure you have read the [getting started with C++](http://help.exercism.io/getting-started-with-cpp.html)
page on the [exercism help site](http://help.exercism.io/). This covers
the basic information on setting up the development environment expected
by the exercises.

## Passing the Tests

Get the first test compiling, linking and passing by following the [three
rules of test-driven development](http://butunclebob.com/ArticleS.UncleBob.TheThreeRulesOfTdd).
Create just enough structure by declaring namespaces, functions, classes,
etc., to satisfy any compiler errors and get the test to fail. Then write
just enough code to get the test to pass. Once you've done that,
enable the next test by moving the following line past the next test.

```C++
#if defined(EXERCISM_RUN_ALL_TESTS)
```
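
As a rough illustration of how this guard behaves, the preprocessor skips everything between the `#if` line and its matching `#endif` unless `EXERCISM_RUN_ALL_TESTS` is defined. The sketch below is a hypothetical stand-in, not the actual test file: the function `compiled_checks` and the check names are made up for the example.

```C++
#include <string>
#include <vector>

// Hypothetical stand-in for a test file: returns the names of the checks
// that the preprocessor leaves in the compiled program.
std::vector<std::string> compiled_checks()
{
    std::vector<std::string> names;
    names.push_back("first check");   // above the guard: always compiled
#if defined(EXERCISM_RUN_ALL_TESTS)
    names.push_back("second check");  // below the guard: compiled only if the macro is defined
    names.push_back("third check");
#endif
    return names;
}
```

Moving the `#if` line below the second check would compile that check unconditionally, so moving the guard down one test at a time enables the tests one by one.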

This may result in compile errors as new constructs may be invoked that
you haven't yet declared or defined. Again, fix the compile errors minimally
to get a failing test, then change the code minimally to pass the test,
refactor your implementation for readability and expressiveness, and then
go on to the next test.

Try to use standard C++11 facilities in preference to writing your own
low-level algorithms or facilities by hand. [CppReference](http://en.cppreference.com/)
is a wiki reference to the C++ language and standard library. If you
are new to C++, but have programmed in C, beware of
[C traps and pitfalls](http://www.slideshare.net/LegalizeAdulthood/c-traps-and-pitfalls-for-c-programmers).
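
As one example of preferring standard facilities over a hand-written loop, the count of differing positions can be expressed with `std::inner_product` from `<numeric>`, using `std::not_equal_to` to compare each pair and `std::plus` to add up the results. This is only a sketch of one possible approach: the function name `count_differences_std` is hypothetical, and the strands are assumed to have equal length.

```C++
#include <cstddef>
#include <functional>
#include <numeric>
#include <string>

// Hypothetical helper built on the standard library.
// Precondition (assumed, not checked): both strands have the same length.
std::size_t count_differences_std(const std::string& first, const std::string& second)
{
    // For each pair of characters, not_equal_to yields true (1) or false (0);
    // plus accumulates those values into the running total.
    return std::inner_product(first.begin(), first.end(), second.begin(),
                              std::size_t(0),
                              std::plus<std::size_t>(),
                              std::not_equal_to<char>());
}
```

This version uses only C++11 library components and leaves the iteration to the algorithm.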

## Source

The Calculating Point Mutations problem at Rosalind [view source](http://rosalind.info/problems/hamm/)