Minimum Edit Distance
Insertion Operation
Deletion Operation
Substitution Operation
Needed to transform one string into the other string.
Two strings and their alignment:
If each operation has cost of 1, Distance between these is 5
If substitutions cost 2 (Levenstein), Distance between them is 8
Alignment in Computational Biology
Given a sequence of bases
AGGCTATCACCTGACCTCCAGGCCGATGCCC TAGCTATCACGACCGCGGTCGATTTGCCCGAC
An alignment:
Given two sequences, align each letter to a letter or gap
How to find the Minimum Edit Distance?
Searching for a path (sequence of edits) from the start string to the final string:
◦ Initial state: the word we’re transforming
◦ Operators: insert, delete, substitute
◦ Goal state: the word we’re trying to get to
◦ Path cost: what we want to minimize: the number of edits
You can also see the video of solving minimum edit distance on YouTube. click here to see video.
Minimum Edit as Search
But the space of all edit sequences is huge!
◦ We can’t afford to navigate naïvely
◦ Lots of distinct paths wind up at the same state.
◦ We don’t have to keep track of all of them
◦ Just the shortest path to each of those revisted states.
Comments
Post a Comment