This is a follow up on the previous post where we solved Regular Expression Matching using simple recursion. In this post we will solve the same problem using Dynamic Programming. Dynamic Programming (DP) is an algorithmic technique for solving an optimization problem by breaking it down into simpler subproblems where the optimal solution to the problem is obtained by solving for optimal solutions to the subproblems.
Problem Statement
Given an input string and a regex pattern, implement regular expression matching with support for ‘.’ and ‘*’.
- ‘.’ matches any single character
- ‘*’ matches zero or more of the preceding element (the char before *)
Examples
- s = ‘aa’, p = ‘ab’ Not a match
- s= ‘bbbbb’, p = ‘b*’ Match because ‘*’ matches zero or more of the preceding element b
- s= ‘fljflf’, p = ‘.*’ Match because we want to match zero or more of the preceding element which is any character in this case
- s = ‘oop’, p = ‘r*o*p’ Match because r is present 0 times, o is present twice and p is a direct match
Thought Process
In any dynamic programming problem we should first define the subproblem. Let’s consider an example where s = ‘bbbbb’ and p = ‘b*b*c*’. If we use the recursive solution to solve the problem, the recursion tree will branch out as shown below. Although not all the branches are represented in the diagram, we can clearly see that some of the subproblems are repeatedly solved over and over again and this is an expensive computation. One way to avoid solving repeated problems is to remember the results of a computation using a technique called Memoization.

Each subproblem call will resemble isRegexMatch(s[i:], pattern[j:]) where i and j are indexes used to slice the string and the pattern respectively. So we do not have to memoize the entire substring and substring of the pattern. So instead of
mem[(s[i:], pattern[j:])) = result
We will use only i and j to index the result in mem such as
mem[(i, j)] = result
We will therefore use the exact same solution from our previous solution except this time we will store the results for future lookup.
Complexity
For a string of length S and pattern of length P, we call dp(i, j) at the worst case for all the possible subproblems for i = 0 to S and j = 0 to P. So the time complexity will be O(SP).
Same for space complexity since we are caching the results of the subproblems O(SP).
Solution
def isRegexMatch(s, pattern):
mem = {}
def dp(i, j):
if (i, j) not in mem:
if j == len(pattern): # Check if pattern is empty, if so string s should also be empty
mem[(i,j)] = i == len(s)
else:
# Check the first character match in s and pattern
# String should not be empty
# The first characters exactly match (Or)
# pattern's first char contains "." to match any char in s
is_first_char_match = len(s) > i and (s[i] == pattern[j] or pattern[j] == ".")
# When '*' is present in the second char of pattern
if j < len(pattern) - 1 and pattern[j+1] == "*":
mem[(i,j)] = dp(i, j+2) or (is_first_char_match and dp(i+1, j))
else:
# Continue checking the remaining characters using recursion
mem[(i,j)] = is_first_char_match and dp(i+1, j+1)
return mem[(i,j)]
return dp(0, 0)
Hope you enjoyed solving this dynamic programming example using Memoization technique. See you in the next post!


