Minimum Edit Distance

minimum edit distance


The minimum edit distance between two strings Is the minimum number of editing operations.

Insertion Operation

Deletion Operation

Substitution Operation

      Needed to transform one string into the other string.


Two strings and their alignment: 

Alignment of two strings
 

If each operation has cost of 1, Distance between these is 5

If substitutions cost 2 (Levenstein), Distance between them is 8 


Alignment in Computational Biology

Given a sequence of bases

AGGCTATCACCTGACCTCCAGGCCGATGCCC TAGCTATCACGACCGCGGTCGATTTGCCCGAC 

An alignment:

Alignment of two biological DNA

Given two sequences, align each letter to a letter or gap


How to find the Minimum Edit Distance?

Searching for a path (sequence of edits) from the start string to the final string:

Initial state: the word we’re transforming 

Operators: insert, delete, substitute 

Goal state: the word we’re trying to get to 

Path cost: what we want to minimize: the number of edits 

How to find the Minimum Edit Distance

You can also see the video of solving minimum edit distance on YouTube. click here to see video.

Minimum Edit as Search 

But the space of all edit sequences is huge! 

◦ We can’t afford to navigate naïvely 

◦ Lots of distinct paths wind up at the same state. 

◦ We don’t have to keep track of all of them 

◦ Just the shortest path to each of those revisted states. 

Comments

Popular posts from this blog

Global Warming