The polynomial hash
Hash of string = sum(char[i] × base^(n-i-1)) mod prime. Rolling: subtract removed char × base^(n-1), multiply by base, add new char.
Advertisement
Implementation
long mod = 1_000_000_007L, base = 31;
long pow = 1;
for (int i = 0; i < pat.length() - 1; i++) pow = (pow * base) % mod;
long patHash = hash(pat);
long winHash = hash(text.substring(0, pat.length()));
for (int i = 0; i <= text.length() - pat.length(); i++) {
if (winHash == patHash && text.substring(i, i + pat.length()).equals(pat))
matches.add(i);
if (i < text.length() - pat.length()) {
winHash = ((winHash - text.charAt(i) * pow) * base + text.charAt(i + pat.length())) % mod;
if (winHash < 0) winHash += mod;
}
}Advertisement
Collision handling
Different strings can have same hash. On hash match, verify byte-by-byte. Choose prime large enough that collisions are rare.
Complexity
O(N + M) expected. O(NM) worst case if many collisions. Choose good hash to avoid.