LPS array (failure function)
int[] computeLPS(String pat) {
int n = pat.length();
int[] lps = new int[n];
int len = 0, i = 1;
while (i < n) {
if (pat.charAt(i) == pat.charAt(len)) {
lps[i++] = ++len;
} else if (len != 0) {
len = lps[len - 1];
} else {
lps[i++] = 0;
}
}
return lps;
}Advertisement
Main KMP loop
int i = 0, j = 0;
while (i < text.length()) {
if (text.charAt(i) == pat.charAt(j)) { i++; j++; }
if (j == pat.length()) {
matches.add(i - j);
j = lps[j - 1];
} else if (i < text.length() && text.charAt(i) != pat.charAt(j)) {
if (j != 0) j = lps[j - 1];
else i++;
}
}Advertisement
Why O(N + M)
Each text character examined at most twice. LPS lets us skip forward instead of restart.
LPS array (failure function)
int[] computeLPS(String pat) {
int n = pat.length();
int[] lps = new int[n];
int len = 0, i = 1;
while (i < n) {
if (pat.charAt(i) == pat.charAt(len)) {
lps[i++] = ++len;
} else if (len != 0) {
len = lps[len - 1];
} else {
lps[i++] = 0;
}
}
return lps;
}Main KMP loop
int i = 0, j = 0;
while (i < text.length()) {
if (text.charAt(i) == pat.charAt(j)) { i++; j++; }
if (j == pat.length()) {
matches.add(i - j);
j = lps[j - 1];
} else if (i < text.length() && text.charAt(i) != pat.charAt(j)) {
if (j != 0) j = lps[j - 1];
else i++;
}
}Why O(N + M)
Each text character examined at most twice. LPS lets us skip forward instead of restart.
Alternatives
Boyer-Moore: skip more chars, better for large alphabets. Rabin-Karp: hashing for multi-pattern.