[Translation] Implementing a Simple Hash Table in C (5)

lynjay

2019-06-30

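In the previous chapter we handled collisions with double hashing and implemented it in C. In this chapter we implement the hash table's insert, search, and delete functions.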

Implementing the API

Our hash table will implement the following API:

// hash_table.h
void ht_insert(ht_hash_table* ht, const char* key, const char* value);
char* ht_search(ht_hash_table* ht, const char* key);
void ht_delete(ht_hash_table* ht, const char* key);

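Insert

To insert a record, we probe the bucket indexes produced by the hash function, one attempt at a time, until we find an empty bucket. We then store the record in that bucket and increment the hash table's count attribute. count tracks how many records the table holds, which will be useful in the next chapter when we resize the hash table: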

void ht_insert(ht_hash_table* ht, const char* key, const char* value) {
    ht_item* item = ht_new_item(key, value);
    int index = ht_get_hash(item->key, ht->size, 0);
    ht_item* cur_item = ht->items[index];
    int i = 1;
    while(cur_item != NULL) {
        index = ht_get_hash(item->key, ht->size, i);
        cur_item = ht->items[index];
        ++i;
    }
    ht->items[index] = item;
    ht->count++;
}
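The attempt argument passed to ht_get_hash is what moves the probe from one bucket to the next. The sketch below illustrates the idea with stand-in functions: demo_hash, demo_get_hash, and demo_probe_sequence are hypothetical names, and the hash is not the one built earlier in the series. It assumes a prime table size and forces the step into the range 1 to size - 1, so that the first size attempts visit every bucket exactly once.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in string hash (hypothetical; not the series' ht_hash). */
static unsigned long demo_hash(const char* s, unsigned long seed) {
    unsigned long h = seed;
    for (; *s != '\0'; ++s) {
        h = h * 31u + (unsigned char)*s;
    }
    return h;
}

/* Bucket index for a given attempt, in the spirit of ht_get_hash.
 * The step is kept in [1, size - 1]; with a prime size this means the
 * first `size` attempts land on `size` different buckets. */
static int demo_get_hash(const char* s, int size, int attempt) {
    assert(s != NULL && size > 1 && attempt >= 0);
    unsigned long m = (unsigned long)size;
    unsigned long start = demo_hash(s, 151u) % m;
    unsigned long step = 1u + demo_hash(s, 163u) % (m - 1u);
    return (int)((start + (unsigned long)attempt * step) % m);
}

/* Writes the first n probe indexes for key into out. */
static void demo_probe_sequence(const char* key, int size, int n, int* out) {
    for (int i = 0; i < n; ++i) {
        out[i] = demo_get_hash(key, size, i);
    }
}
```

Calling demo_probe_sequence("cat", 7, 7, seq) fills seq with the buckets that an insert of "cat" would try, in order, in a table of 7 buckets. Insert stops at the first of these that is empty.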

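Search

Searching is similar to inserting, but on each iteration of the while loop we check whether the record's key matches the key we are searching for. If it does, we return the record's value. If the loop reaches an empty (NULL) bucket, the key is not in the table and we return NULL: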

char* ht_search(ht_hash_table* ht, const char* key) {
    int index = ht_get_hash(key, ht->size, 0);
    ht_item* item = ht->items[index];
    int i = 1;
    while (item != NULL) {
        if (strcmp(item->key, key) == 0) {
            return item->value;
        }
        index = ht_get_hash(key, ht->size, i);
        item = ht->items[index];
        i++;
    }
    return NULL;
}

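Delete

Deleting from an open-addressed hash table is more complicated than inserting or searching. Because of collisions, the record we want to delete may be part of a collision chain. Removing it from the table would break that chain and make the records further along the chain impossible to find. To solve this, instead of actually removing the record, we simply mark it as deleted.

We mark a record as deleted by replacing it with a pointer to a global sentinel item, which represents a bucket that held a deleted record: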

// hash_table.c
static ht_item HT_DELETED_ITEM = {NULL, NULL};

void ht_delete(ht_hash_table* ht, const char* key) {
    int index = ht_get_hash(key, ht->size, 0);
    ht_item* item = ht->items[index];
    int i = 1;
    while (item != NULL) {
        if (item != &HT_DELETED_ITEM) {
            if (strcmp(item->key, key) == 0) {
                ht_del_item(item);
                ht->items[index] = &HT_DELETED_ITEM;
                // Decrement only when a record was actually removed,
                // then stop: keys are unique, so there is nothing more to find.
                ht->count--;
                return;
            }
        }
        index = ht_get_hash(key, ht->size, i);
        item = ht->items[index];
        i++;
    }
}

After deleting a record, we decrement the hash table's count attribute.
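To see why the sentinel matters, here is a minimal, self-contained sketch that compares the two ways of deleting. It is not the series' code: demo_table, demo_index, and the other demo_ names are hypothetical, keys are stored as borrowed pointers without values, and the hash deliberately looks only at the first character and probes linearly so that "apple" and "avocado" are guaranteed to collide.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_SIZE 7

/* Sentinel: its address marks a bucket whose record was deleted. */
static const char DEMO_DELETED[] = "";

typedef struct {
    const char* keys[DEMO_SIZE];  /* NULL means the bucket was never used */
} demo_table;

/* Deliberately weak hash: first character, then linear probing. */
static int demo_index(const char* key, int attempt) {
    return ((unsigned char)key[0] + attempt) % DEMO_SIZE;
}

/* Stores key in the first empty or deleted bucket on its probe path.
 * Does nothing if all DEMO_SIZE buckets are occupied. */
static void demo_insert(demo_table* t, const char* key) {
    assert(t != NULL && key != NULL);
    for (int i = 0; i < DEMO_SIZE; ++i) {
        int index = demo_index(key, i);
        if (t->keys[index] == NULL || t->keys[index] == DEMO_DELETED) {
            t->keys[index] = key;
            return;
        }
    }
}

/* Returns the bucket index holding key, or -1 if it is not found. */
static int demo_find(const demo_table* t, const char* key) {
    for (int i = 0; i < DEMO_SIZE; ++i) {
        int index = demo_index(key, i);
        const char* cur = t->keys[index];
        if (cur == NULL) {
            return -1;  /* an empty bucket ends the collision chain */
        }
        if (cur != DEMO_DELETED && strcmp(cur, key) == 0) {
            return index;
        }
    }
    return -1;
}

/* use_tombstone == 0 clears the bucket to NULL, which breaks the chain. */
static void demo_delete(demo_table* t, const char* key, int use_tombstone) {
    int index = demo_find(t, key);
    if (index >= 0) {
        t->keys[index] = use_tombstone ? DEMO_DELETED : NULL;
    }
}
```

After inserting "apple" and then "avocado", deleting "apple" with use_tombstone set to 1 leaves "avocado" reachable. Deleting it with use_tombstone set to 0 makes demo_find stop at the emptied bucket, so "avocado" is lost even though it is still stored in the table.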

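We also need to modify ht_insert and ht_search to take deleted items into account. When searching, we ignore and skip over deleted items. When inserting, if we reach a deleted item, we can put the new record into its bucket: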

// hash_table.c
void ht_insert(ht_hash_table* ht, const char* key, const char* value) {
    // ...
    while (cur_item != NULL && cur_item != &HT_DELETED_ITEM) {
        // ...
    }
    // ...
}

char* ht_search(ht_hash_table* ht, const char* key) {
    // ...
    while (item != NULL) {
        if (item != &HT_DELETED_ITEM) { 
            if (strcmp(item->key, key) == 0) {
                return item->value;
            }
        }
        // ...
    }
    // ...
}

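Update

Our hash table does not yet support updating the value of a key. If we insert two records with the same key, the keys collide and the second record is placed in the next available bucket. When we search for the key, the first record is always found and the second can never be reached. We fix this by modifying ht_insert so that inserting a key that already exists deletes the previous record and puts the new one in its place: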

// hash_table.c
void ht_insert(ht_hash_table* ht, const char* key, const char* value) {
    // ...
    while (cur_item != NULL) {
        if (cur_item != &HT_DELETED_ITEM) {
            if (strcmp(cur_item->key, key) == 0) {
                ht_del_item(cur_item);
                ht->items[index] = item;
                return;
            }
        }
        // ...
    } 
    // ...
}
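The update loop above walks past deleted items, while the earlier version of ht_insert stopped at them, so it is worth seeing one way the two behaviours could be combined. The sketch below is one possible arrangement, not the author's final code: it remembers the first reusable bucket on the probe path, keeps probing until the chain ends to make sure the key is not already stored, and only then inserts. The struct layouts are assumed from how the fields are used in this chapter, ht_get_hash is replaced by a simple first-character stand-in with linear probing, demo_copy is a hypothetical helper, and the loops are bounded by ht->size because resizing is left to the next chapter.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumed layouts, inferred from the fields this chapter uses. */
typedef struct { char* key; char* value; } ht_item;
typedef struct { int size; int count; ht_item** items; } ht_hash_table;

static ht_item HT_DELETED_ITEM = {NULL, NULL};

/* Hypothetical helper: returns a malloc'd copy of s. */
static char* demo_copy(const char* s) {
    char* copy = malloc(strlen(s) + 1);
    assert(copy != NULL);
    strcpy(copy, s);
    return copy;
}

static ht_item* ht_new_item(const char* k, const char* v) {
    ht_item* item = malloc(sizeof(ht_item));
    assert(item != NULL);
    item->key = demo_copy(k);
    item->value = demo_copy(v);
    return item;
}

static void ht_del_item(ht_item* item) {
    free(item->key);
    free(item->value);
    free(item);
}

/* Stand-in for the series' ht_get_hash: first character, linear probing. */
static int ht_get_hash(const char* s, int size, int attempt) {
    return ((unsigned char)s[0] + attempt) % size;
}

void ht_insert(ht_hash_table* ht, const char* key, const char* value) {
    assert(ht != NULL && key != NULL && value != NULL);
    int free_index = -1;  /* first empty or deleted bucket on the probe path */
    for (int i = 0; i < ht->size; ++i) {
        int index = ht_get_hash(key, ht->size, i);
        ht_item* cur = ht->items[index];
        if (cur == NULL) {
            if (free_index < 0) free_index = index;
            break;  /* end of the chain: the key is not in the table */
        }
        if (cur == &HT_DELETED_ITEM) {
            if (free_index < 0) free_index = index;
        } else if (strcmp(cur->key, key) == 0) {
            /* Update: replace the old record, count stays the same. */
            ht_del_item(cur);
            ht->items[index] = ht_new_item(key, value);
            return;
        }
    }
    assert(free_index >= 0);  /* table full; resizing is out of scope here */
    ht->items[free_index] = ht_new_item(key, value);
    ht->count++;
}

char* ht_search(ht_hash_table* ht, const char* key) {
    for (int i = 0; i < ht->size; ++i) {
        ht_item* cur = ht->items[ht_get_hash(key, ht->size, i)];
        if (cur == NULL) break;
        if (cur != &HT_DELETED_ITEM && strcmp(cur->key, key) == 0) {
            return cur->value;
        }
    }
    return NULL;
}

void ht_delete(ht_hash_table* ht, const char* key) {
    for (int i = 0; i < ht->size; ++i) {
        int index = ht_get_hash(key, ht->size, i);
        ht_item* cur = ht->items[index];
        if (cur == NULL) return;
        if (cur != &HT_DELETED_ITEM && strcmp(cur->key, key) == 0) {
            ht_del_item(cur);
            ht->items[index] = &HT_DELETED_ITEM;
            ht->count--;
            return;
        }
    }
}
```

With this arrangement, re-inserting a key that sits behind a deleted bucket updates the existing record instead of creating a duplicate in the deleted bucket, and a genuinely new key still reuses the deleted bucket. The cost is that every insert probes to the end of its chain.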

Previous chapter: Handling collisions
Next chapter: Resizing the hash table
