Monday, February 17, 2014

LeetCode:Edit Distance

Given two words word1 and word2, find the minimum number of steps required to convert word1 to word2. (each operation is counted as 1 step.)

You have the following 3 operations permitted on a word:

a) Insert a character
b) Delete a character
c) Replace a character

Solution: Two Dimensional Dynamic Programming


Remember when a problem is to find the optimal value, DP is a possible solution.

This is a classic DP problem.

dp[i][j] refers to edit distance between X[0...i] and Y[0...j], so

Think about it. For the last step, we can either:

a. insert a letter of Y[len2-1] to X

b. delete a letter X[len1-1] from X
c. substitue X[len1-1] with Y[len2-1]

Basic cases: dp[0][0] = 0; dp[i][0] = i,  i delete operations, similarly (empty string to word[0..i]), dp[0][j] = j, j insert operations


2. Optimal functions:

dp[i][j]=min((d[i-1][j]+1),(d[i][j-1]+1),(d[i-1][j-1]+0 or 1))

d[i-1][j] last step is to delete X[i] (X[1.i-1] to Y[1..j] is already optimal)

d[i][j-1] last step is to insert Y[j] ( X[1..i] to Y[1..j-1] is already optimal)

d[i-1][j-1] last step is to substitute X[i] with Y[j], if(X[i] == Y[j]) plus 0, else plus 1


这里是中文解释(改进版)

但对于编辑距离的话,当我们要计算d(i,j)时,即计算A(i)到B(j)之间的编辑距离,此时,设A(i)形式是somestr1c;B(i)形如somestr2d的话,

将somestr1变成somestr2的编辑距离已知是d(i-1,j-1)

将somestr1c变成somestr2的编辑距离已知是d(i,j-1) 将somestr1变成somestr2d的编辑距离已知是d(i-1,j) 那么利用这三个变量,就可以递推出d(i,j)了: 如果c==d,显然编辑距离和d(i-1,j-1)是一样的 如果c!=d,情况稍微复杂一点,
如果将c替换成d,编辑距离是somestr1变成somestr2的编辑距离 + 1,也就是d(i-1,j-1) + 1
d(i,j-1) 意味着已经将somestr1c变成somestr2的编辑距离, 现在只要再insert 字母d即可, 即d(i,j-1) + 1
d(i-1,j) 意味着已经将somestr1变成somestr2d了,即已经将somestr1c变成(somestr2d)c了 现在只需把A的最后一个字母c delete即可,d(i-1,j) + 1 



Experience:

DP array, use dp[m][n] to represent the result, so have to assign (m+1)*(n+1).

Using dp[m-1][n-1] is not recommended here.

No comments:

Post a Comment