# Redis Source - Hash

References:

- Geektime (极客时间) column "Redis 源码剖析与实战", Lecture 3: How to Implement a High-Performance Hash Table

The source code in this post is based on the Redis 6.2 branch.
## How Redis Implements Chained Hashing

### What is a hash collision?

- The key space is larger than the hash table space, so when a hash function maps keys into the table, different keys inevitably get mapped to the same array slot. If each slot could hold only one key-value pair, the table could store very little data. This is the classic hash collision problem.
### How are hash collisions resolved?

- Option 1: chained hashing.
  - Limitation: the chains must not grow too long, or hash table performance degrades.
- Option 2: when the chains grow beyond a certain length, perform a rehash.
  - Improvement: incremental (progressive) rehash.
### The core structures

The key data structures live in `src/dict.h` (Redis 6.2):

```c
// src/dict.h
typedef struct dictEntry {
    void *key;
    union {
        void *val;
        uint64_t u64;
        int64_t s64;
        double d;
    } v;
    struct dictEntry *next;  /* next entry in the same bucket's chain */
} dictEntry;

/* This is our hash table structure. Every dictionary has two of this as we
 * implement incremental rehashing, for the old to the new table. */
typedef struct dictht {
    dictEntry **table;
    unsigned long size;
    unsigned long sizemask;
    unsigned long used;
} dictht;

typedef struct dict {
    dictType *type;
    void *privdata;
    dictht ht[2];
    long rehashidx; /* rehashing not in progress if rehashidx == -1 */
    int16_t pauserehash; /* If >0 rehashing is paused (<0 indicates coding error) */
} dict;
```
## How Redis Implements rehash

### The basic idea

- First, Redis keeps two hash tables (`ht[0]` and `ht[1]`) that take turns holding the data during a rehash.
- Second, during normal request serving, all key-value pairs are written to hash table `ht[0]`.
- Third, when a rehash takes place, key-value pairs are migrated to hash table `ht[1]`.
- Finally, once migration completes, `ht[0]`'s space is freed, `ht[1]` is assigned to `ht[0]`, and `ht[1]`'s size is reset to 0. The dict is then back in the normal serving state: `ht[0]` serves requests, and `ht[1]` stands by as the target table for the next rehash.
## When Is rehash Triggered?

The function Redis uses to decide whether to trigger a rehash is `_dictExpandIfNeeded`:
```c
// src/dict.c
/* Expand the hash table if needed */
static int _dictExpandIfNeeded(dict *d)
{
    /* Incremental rehashing already in progress. Return. */
    if (dictIsRehashing(d)) return DICT_OK;

    /* If the hash table is empty expand it to the initial size. */
    if (d->ht[0].size == 0) return dictExpand(d, DICT_HT_INITIAL_SIZE);

    /* If we reached the 1:1 ratio, and we are allowed to resize the hash
     * table (global setting) or we should avoid it but the ratio between
     * elements/buckets is over the "safe" threshold, we resize doubling
     * the number of buckets. */
    if (d->ht[0].used >= d->ht[0].size &&
        (dict_can_resize ||
         d->ht[0].used/d->ht[0].size > dict_force_resize_ratio) &&
        dictTypeExpandAllowed(d))
    {
        return dictExpand(d, d->ht[0].used + 1);
    }
    return DICT_OK;
}
```
- `_dictExpandIfNeeded` defines three expansion conditions:
  - Condition 1: `ht[0]` has size 0. The hash table is empty, so Redis needs to set it to its initial size. This is initialization work, not a rehash.
  - Condition 2: the number of elements stored in `ht[0]` has reached its size, and the table is currently allowed to expand.
  - Condition 3: the number of elements stored in `ht[0]` is more than `dict_force_resize_ratio` times its size, where `dict_force_resize_ratio` defaults to 5.
- The ratio between the current number of elements (`d->ht[0].used`) and the table's current size (`d->ht[0].size`) is commonly called the load factor. In other words, Redis decides whether to rehash by checking whether the load factor is greater than or equal to 1, and whether it is greater than 5.
- `dict_can_resize` controls whether the rehash feature is enabled:

```c
// src/dict.h
typedef enum {
    DICT_RESIZE_ENABLE,
    DICT_RESIZE_AVOID,
    DICT_RESIZE_FORBID,
} dictResizeEnable;

// src/server.c
/* This function is called once a background process of some kind terminates,
 * as we want to avoid resizing the hash tables when there is a child in order
 * to play well with copy-on-write (otherwise when a resize happens lots of
 * memory pages are copied). The goal of this function is to update the ability
 * for dict.c to resize or rehash the tables accordingly to the fact we have an
 * active fork child running. */
void updateDictResizePolicy(void) {
    /* If this process is itself a forked child
     * (server.in_fork_child != CHILD_TYPE_NONE), forbid dict resizing:
     * resizing in the child could dirty pages shared with the parent
     * through copy-on-write. */
    if (server.in_fork_child != CHILD_TYPE_NONE)
        dictSetResizeEnabled(DICT_RESIZE_FORBID);
    /* If there is an active child process (checked via
     * hasActiveChildProcess()), avoid dict resizing so as not to
     * disturb the child's memory. */
    else if (hasActiveChildProcess())
        dictSetResizeEnabled(DICT_RESIZE_AVOID);
    else
        /* When none of the above holds, enable rehash. */
        dictSetResizeEnabled(DICT_RESIZE_ENABLE);
}
```
- In `src/dict.c`, `_dictExpandIfNeeded` is reached through the following call chain:

```
dictAdd
dictReplace   --> dictAddRaw --> _dictKeyIndex --> _dictExpandIfNeeded
dictAddOrFind
```

- `dictAdd`: adds a key-value pair to the hash table.

```c
/* Add an element to the target hash table */
int dictAdd(dict *d, void *key, void *val)
{
    dictEntry *entry = dictAddRaw(d,key,NULL);

    if (!entry) return DICT_ERR;
    dictSetVal(d, entry, val);
    return DICT_OK;
}
```
- `dictReplace`: adds a key-value pair to the hash table, or updates the value if the key already exists.

```c
/* Add or Overwrite:
 * Add an element, discarding the old value if the key already exists.
 * Return 1 if the key was added from scratch, 0 if there was already an
 * element with such key and dictReplace() just performed a value update
 * operation. */
int dictReplace(dict *d, void *key, void *val)
{
    dictEntry *entry, *existing, auxentry;

    /* Try to add the element. If the key
     * does not exists dictAdd will succeed. */
    entry = dictAddRaw(d,key,&existing);
    if (entry) {
        dictSetVal(d, entry, val);
        return 1;
    }

    /* Set the new value and free the old one. Note that it is important
     * to do that in this order, as the value may just be exactly the same
     * as the previous one. In this context, think to reference counting,
     * you want to increment (set), and then decrement (free), and not the
     * reverse. */
    auxentry = *existing;
    dictSetVal(d, existing, val);
    dictFreeVal(d, &auxentry);
    return 0;
}
```
- `dictAddOrFind`: a thin wrapper that calls `dictAddRaw` directly.

```c
/* Add or Find:
 * dictAddOrFind() is simply a version of dictAddRaw() that always
 * returns the hash entry of the specified key, even if the key already
 * exists and can't be added (in that case the entry of the already
 * existing key is returned.)
 *
 * See dictAddRaw() for more information. */
dictEntry *dictAddOrFind(dict *d, void *key) {
    dictEntry *entry, *existing;
    entry = dictAddRaw(d,key,&existing);
    return entry ? entry : existing;
}
```
- Summary: the keys to triggering a rehash in Redis are the `_dictExpandIfNeeded` and `updateDictResizePolicy` functions. `_dictExpandIfNeeded` decides whether to rehash based on the hash table's load factor and on the flag that says whether rehash is currently allowed, while `updateDictResizePolicy` enables or disables rehash depending on whether RDB or AOF child processes are running.
## How Large Does a rehash Expansion Grow the Table?

- In Redis, expanding the hash table space during a rehash is done by calling `dictExpand`. The function takes two parameters: the hash table to expand, and the target capacity.

```c
// src/dict.c
/* return DICT_ERR if expand was not performed */
int dictExpand(dict *d, unsigned long size) {
    return _dictExpand(d, size, NULL);
}
```
- Starting with Redis 6.2 (the change landed on 2020/12/06), the expansion target changed from `d->ht[0].used*2` to `d->ht[0].used+1`. From the discussion on the Redis GitHub issue:
> Had a discuss this with @yossigo @guybe7 @MeirShpilraien @inbaryuval @yaacovhazan-Redislabs, and here's the conclusions:
>
> - It's hard to predict the impact of such a mechanism when enabled on all dicts, we feel we should focus on just solving the major known problem of the main dict, in which is easier to realize all the implications of this change. for that, a new field can be added in the `dictType` struct, so it doesn't have a cost per instance of the dictionary.
> - this is essentially a balance between memory and execution performance, currently redis aims to hit a dict saturation ratio between 0.5 and 1.0 (as soon at it reaches 1.0 it bring it back to 0.5), but these two constants can in theory be set to different ones (0.8 and 1.2 or anything else).
> - the code that does both `used*2` and later `_dictNextPower` seems like a bug. since the check is that `used >= size`, the outcome of this will usually be the same as `_dictNextPower(used+1)` (no need for the *2). the only case where it'll differ is when `dict_can_resize` is false during fork, so that later the `_dictNextPower(used*2)` will cause the dict to jump to *4 (i.e. `_dictNextPower(1025*2)` will give you 4096).
>
> proposals:
>
> - replace the *2 with +1 (will only affect wakeup after fork)
> - when there is enough memory, keep the logic like it is (saturation between 0.5 and 1.0), but when the memory is in shortage (rehash is going to cause eviction), we can relax the saturation threshold, but only up to a limit. let's say up to saturation of 1.61803 (after that we will prefer to rehash, despite the eviction it'll cause).
> - we can probably revise the rehashing code to be a little bit less memory hungry during rehashing (i'll open another issue to describe the design).
### How the Expansion Executes

The actual work happens in `_dictExpand` (Redis 6.2, `src/dict.c`):

```c
/* Expand or create the hash table,
 * when malloc_failed is non-NULL, it'll avoid panic if malloc fails (in which case it'll be set to 1).
 * Returns DICT_OK if expand was performed, and DICT_ERR if skipped. */
int _dictExpand(dict *d, unsigned long size, int* malloc_failed)
{
    if (malloc_failed) *malloc_failed = 0;

    /* the size is invalid if it is smaller than the number of
     * elements already inside the hash table */
    if (dictIsRehashing(d) || d->ht[0].used > size)
        return DICT_ERR;

    dictht n; /* the new hash table */
    unsigned long realsize = _dictNextPower(size);

    /* Rehashing to the same table size is not useful. */
    if (realsize == d->ht[0].size) return DICT_ERR;

    /* Allocate the new hash table and initialize all pointers to NULL */
    n.size = realsize;
    n.sizemask = realsize-1;
    if (malloc_failed) {
        n.table = ztrycalloc(realsize*sizeof(dictEntry*));
        *malloc_failed = n.table == NULL;
        if (*malloc_failed)
            return DICT_ERR;
    } else
        n.table = zcalloc(realsize*sizeof(dictEntry*));

    n.used = 0;

    /* Is this the first initialization? If so it's not really a rehashing
     * we just set the first hash table so that it can accept keys. */
    if (d->ht[0].table == NULL) {
        d->ht[0] = n;
        return DICT_OK;
    }

    /* Prepare a second hash table for incremental rehashing */
    d->ht[1] = n;
    d->rehashidx = 0;
    return DICT_OK;
}
```
## How Is Incremental rehash Implemented?

### Why incremental rehash?

- When the hash table is rehashed, the table space grows, so a key that mapped to one position may now map to a new one. Many keys therefore need to be copied from their old buckets to new ones. While keys are being copied, the Redis main thread cannot serve other requests, so a full one-shot copy would block the main thread. This is the **rehash overhead**.
- Put simply, incremental rehash means Redis does not copy all the keys in the current hash table to their new positions in one go. Instead, it copies them in batches: each step migrates only the entries in a single `bucket` of the hash table. This bounds the time spent on each copy step, and therefore bounds the impact on the main thread.
### How is incremental rehash implemented in the code?

- Two key functions: `_dictRehashStep` and `dictRehash`, reached through this call chain:

```
dictAddRaw
dictGenericDelete
dictFind           --> _dictRehashStep --> dictRehash
dictGetRandomKey
dictGetSomeKeys
```

```c
/* This function performs just a step of rehashing, and only if hashing has
 * not been paused for our hash table. When we have iterators in the
 * middle of a rehashing we can't mess with the two hash tables otherwise
 * some element can be missed or duplicated.
 *
 * This function is called by common lookup or update operations in the
 * dictionary so that the hash table automatically migrates from H1 to H2
 * while it is actively used. */
static void _dictRehashStep(dict *d) {
    if (d->pauserehash == 0) dictRehash(d,1);
}
```
- The overall logic of `dictRehash` has two parts:
  - First, the function runs a loop: given the number of buckets `n` to migrate, it moves all the keys inside each of those `n` buckets in turn. The loop also stops early if all the data in `ht[0]` has already been migrated.
  - Second, after copying `n` buckets, the function checks whether all the data in `ht[0]` has been migrated. If so, `ht[0]`'s space is freed. Because the rest of the Redis code always operates on `ht[0]`, once the rehash finishes (with all the data now in `ht[1]`), Redis assigns `ht[1]` to `ht[0]` so the other code paths keep working unchanged.
  - After `ht[1]` is assigned to `ht[0]`, its size is reset to 0, waiting for the next rehash. At the same time, the dict's `rehashidx` field is set to -1, indicating that the rehash has finished.
## Appendix

All articles on this blog are licensed under CC BY-NC-SA 4.0 unless otherwise noted. When reposting, please credit HoleLin's Blog!