LRUCache的实现原理及利用python实现的方法

作者：蒂米时间：2022-06-26 06:51:51　

简介

LRU(Least Recently Used)最近最少使用，最近有时间和空间最近的歧义，所以我更喜欢叫它近期最少使用算法。它的核心思想是，如果一个数据被访问过，我们有理由相信它在将来被访问的概率就越高。于是当LRU缓存达到设定的最大值时将缓存中近期最少使用的对象移除。LRUCache内部使用LinkedHashMap来存储key-value键值对，并将LinkedHashMap设置为访问顺序来体现LRU算法。

无论是对某个key的get，还是set都算做是对该key的一次使用。当set一个不存在的key，并且LRU Cache中key的数量超过cache size的时候，需要将使用时间距离现在最长的那个key从LRU Cache中清除。

LRU Cache实现

在Java中，LRUCache是通过LinkedHashMap实现的。鄙人照猫画虎，实现一个Python版的LRU Cache（可能和其他大神的实现有所区别）。

首先，需要说明的是：

LRU Cache对象内部会维护一个双端循环链表的头节点

LRU Cache对象内部会维护一个dict

内部dict的value都是Entry对象，每个Entry对象包含：

key的hash_code（hash_code = hash(key)，在本实现中，hash_code相同的不同key，会被当作一个key来处理。因此，对于自定义类，应该实现魔术方法：__hash__）
v - (key, value)对中的value
prev - 前一个对象
next - 后一个对象

具体实现是：

当从LRU Cache中get一个key的时候：

计算该key的hash_code
从内部dict中获取到entry
将该entry移动到双端循环链表的第一个位置
返回entry.value

当向LRU Cache中set一个(key, value)对的时候：

计算该key的hash_code，

从LRU Cache的内部dict中，取出该hash_code对应的old_entry（可能不存在），然后根据(key, value)对生成一个new_entry，之后执行：

dict[hash_code] = new_entry
将new_entry提到双端循环链表的第一个位置
如果old_entry存在，则从链表中删除old_entry
如果是新增了一个(key, value)对，并且cache中key的数量超过了cache size，那么将双端链表的最后一个元素删除（该元素就是那个最近最少被使用的元素），并且从内部dict中删除该元素

HashMap的实现原理

（面试过程中也经常会被问到）：数组和链表组合成的链表散列结构，通过hash算法，尽量将数组中的数据分布均匀，如果hashcode相同再比较equals方法，如果equals方法返回false，那么就将数据以链表的形式存储在数组的对应位置，并将之前在该位置的数据往链表的后面移动，并记录一个next属性，来指示后移的那个数据。

注意：数组中保存的是entry（其中保存的是键值）

Python实现

class Entry:
def __init__(self, hash_code, v, prev=None, next=None):
self.hash_code = hash_code
self.v = v
self.prev = prev
self.next = next

def __str__(self):
return "Entry{hash_code=％d, v=％s}" ％ (
self.hash_code, self.v)
__repr__ = __str__

class LRUCache:
def __init__(self, max_size):
self._max_size = max_size
self._dict = dict()
self._head = Entry(None, None)
self._head.prev = self._head
self._head.next = self._head

def __setitem__(self, k, v):
try:
hash_code = hash(k)
except TypeError:
raise

old_entry = self._dict.get(hash_code)
new_entry = Entry(hash_code, v)
self._dict[hash_code] = new_entry

if old_entry:
prev = old_entry.prev
next = old_entry.next
prev.next = next
next.prev = prev

head = self._head
head_prev = self._head.prev
head_next = self._head.next

head.next = new_entry
if head_prev is head:
head.prev = new_entry
head_next.prev = new_entry
new_entry.prev = head
new_entry.next = head_next

if not old_entry and len(self._dict) > self._max_size:
last_one = head.prev
last_one.prev.next = head
head.prev = last_one.prev
self._dict.pop(last_one.hash_code)

def __getitem__(self, k):
entry = self._dict[hash(k)]
head = self._head
head_next = head.next
prev = entry.prev
next = entry.next

if entry.prev is not head:
if head.prev is entry:
head.prev = prev
head.next = entry

head_next.prev = entry
entry.prev = head
entry.next = head_next

prev.next = next
next.prev = prev

return entry.v

def get_dict(self):
return self._dict

if __name__ == "__main__":
cache = LRUCache(2)
inner_dict = cache.get_dict()

cache[1] = 1
assert inner_dict.keys() == [1], "test 1"
cache[2] = 2
assert sorted(inner_dict.keys()) == [1, 2], "test 2"
cache[3] = 3
assert sorted(inner_dict.keys()) == [2, 3], "test 3"
cache[2]
assert sorted(inner_dict.keys()) == [2, 3], "test 4"
assert inner_dict[hash(2)].next.v == 3
cache[4] = 4
assert sorted(inner_dict.keys()) == [2, 4], "test 5"
assert inner_dict[hash(4)].v == 4, "test 6"

来源：http://timd.cn/python-lru-cache/

标签：python,lrucache,实现原理

投稿

LRUCache的实现原理及利用python实现的方法

猜你喜欢

Python中多线程及程序锁浅析

用ASP显示ACCESS数据库的GIF图象

谈中国站长站的文章干扰码实现方法

Golang中goroutine和channel使用介绍深入分析

Linux ORCLE数据库增量备份脚本

php Exception异常处理详解

网页代码更清晰高效的一些经验

python删除文件夹下相同文件和无法打开的图片

Python使用matplotlib 画矩形的三种方式分析

python多线程http下载实现示例

python将天数转换为日期字符串的方法实例

python3.3使用tkinter开发猜数字游戏示例

PHP strip_tags() 去字符串中的 HTML、XML 以及 PHP 标签的函数

MSSQL存储过程解秘过程全析

asp 防盗链代码(彻底屏蔽迅雷，旋风，快车下载站内资源)

Div+CSS网页布局对SEO的影响漫谈

解析SQL Server中数据库快照的工作原理

python Django模板的使用方法（图文）

python自动化操作之动态验证码、滑动验证码的降噪和识别

Python自动生产表情包