Word Subsets
— Leetcode, Hashtable — 2 min read
You are given two string arrays words1 and words2.
A string b is a subset of string a if every letter in b occurs in a including multiplicity.
- For example, "wrr" is a subset of "warrior" but is not a subset of "world".
A string a from words1 is universal if for every string b in words2, b is a subset of a.
Return an array of all the universal strings in words1. You may return the answer in any order.
Example:
Input: words1 = ["amazon","apple","facebook","google","leetcode"], words2 = ["e","o"]Output: ["facebook","google","leetcode"]
Brute Force
A naive way of solving this problem will be:
- Count the letter frequency for each word in words1 and words2
- Compare the letter frequency of each word in words1 against each word in word2
from collections import Counter
class Solution: def wordSubsets(self, words1: List[str], words2: List[str]) -> List[str]: dicts_a = list(map(lambda x: Counter(x), words1)) dicts_b = list(map(lambda x: Counter(x), words2)) result = []
for index, dict_a in enumerate(dicts_a): add = True for dict_b in dicts_b: for letter, value in dict_b.items(): if not letter in a or dict_a[letter] < value: add = False break if not add: break if add: result.append(words1[index]) return result
However, the time complexity of this method is O(abmn), where a, b are the lengths of longest word in words1 and words2, m,n are the lengths for words1 and words2. This is not ideal.
Optimized Method
When comparing with each word in words2, we may notice that there is actully no need for word in words1 to be compared with every word in words2. We can find the maximum frequency of letters that appear in words2, and compare word in words1 against this frequency list. Since if there is a letter in any word with a higher frequency than the letter in word in words1, the word will not be universal
from collections import Counter
class Solution: def wordSubsets(self, words1: List[str], words2: List[str]) -> List[str]: dicts_a = list(map(lambda x: Counter(x), words1)) dicts_b = list(map(lambda x: Counter(x), words2)) max_dict_b = {} result = [] #creating the maximum frequency list for words2 for dict_b in dicts_b: for letter, count in dict_b.items(): if letter not in max_dict_b: max_dict_b[letter] = count else: max_dict_b[letter] = max(max_dict_b[letter], count) for index, a in enumerate(dicts_a): add = True for letter, value in max_dict_b.items(): if not letter in a or a[letter] < value: add = False break if add: result.append(words1[index]) return result
The time complexity of this solution will be O(abm), since we removed iteration of words2 when comparing.
In this problem, we extract the information we care about (max-frequency list) from the original information (frequency list of every word), therefore reduce the problem size.