哈希算法题目：实现一个魔法字典

字数 645 2025-10-27 12:20:21

哈希算法题目：实现一个魔法字典

题目描述
设计一个数据结构，支持以下两种操作：

buildDict(words): 使用字符串字典初始化数据结构
search(word): 判断能否只修改一个字符使得修改后的字符串与字典中的某个字符串相同

解题思路分析

核心问题是如何高效判断"只差一个字符"的匹配关系
直接暴力匹配会超时，需要优化搜索过程
哈希表可以存储模式化的键，将相似单词分组

详细解题步骤

第一步：理解匹配规则

当搜索单词"hello"时，需要判断字典中是否存在如"hallo"、"helli"等单词
关键观察：只差一个字符的单词有共同特征
例如"hello"可以生成模式："ello", "hllo", "helo", "helo", "hell*"

第二步：设计数据结构

使用哈希表存储模式到单词的映射
对于每个单词，生成所有可能的模式（将每个位置替换为通配符）
模式作为键，对应的原始单词集合作为值

第三步：构建字典实现

from collections import defaultdict

class MagicDictionary:
    def __init__(self):
        self.patterns = defaultdict(set)  # 模式到单词集合的映射
    
    def buildDict(self, words):
        # 清空已有数据
        self.patterns.clear()
        
        # 为每个单词生成所有可能的模式
        for word in words:
            for i in range(len(word)):
                # 将第i个字符替换为通配符*
                pattern = word[:i] + '*' + word[i+1:]
                # 将原始单词添加到该模式对应的集合中
                self.patterns[pattern].add(word)

第四步：搜索实现

    def search(self, word):
        # 生成搜索单词的所有模式
        for i in range(len(word)):
            pattern = word[:i] + '*' + word[i+1:]
            
            # 检查该模式是否存在于字典中
            if pattern in self.patterns:
                # 如果模式对应的单词集合包含多个单词，或者
                # 包含的单词不是搜索单词本身，则返回True
                candidates = self.patterns[pattern]
                if len(candidates) > 1 or word not in candidates:
                    return True
        return False

第五步：完整代码示例

from collections import defaultdict

class MagicDictionary:
    def __init__(self):
        self.patterns = defaultdict(set)
    
    def buildDict(self, words):
        self.patterns.clear()
        for word in words:
            for i in range(len(word)):
                pattern = word[:i] + '*' + word[i+1:]
                self.patterns[pattern].add(word)
    
    def search(self, word):
        for i in range(len(word)):
            pattern = word[:i] + '*' + word[i+1:]
            if pattern in self.patterns:
                candidates = self.patterns[pattern]
                if len(candidates) > 1 or word not in candidates:
                    return True
        return False

第六步：算法复杂度分析

构建字典时间复杂度：O(N×L)，其中N是单词数量，L是单词平均长度
搜索时间复杂度：O(L)，只需要生成L个模式并查询哈希表
空间复杂度：O(N×L)，需要存储所有模式

第七步：测试用例验证

# 测试示例
magic_dict = MagicDictionary()
magic_dict.buildDict(["hello", "leetcode"])

# 测试用例1：搜索"hello" -> False（需要修改成不同单词）
print(magic_dict.search("hello"))  # False

# 测试用例2：搜索"hhllo" -> True（修改第二个字符匹配"hello"）
print(magic_dict.search("hhllo"))  # True

# 测试用例3：搜索"hell" -> False（长度不同）
print(magic_dict.search("hell"))  # False

# 测试用例4：搜索"leetcoded" -> False（长度不同）
print(magic_dict.search("leetcoded"))  # False

关键要点总结

模式化思维：通过通配符将相似单词分组
哈希表优化：将O(N)的搜索优化为O(L)的模式匹配
边界处理：注意排除完全匹配的情况
这种模式化方法在字符串模糊匹配中很常用