哈希表的实现-Yinode Blog

哈希表

散列表（Hash table，也叫哈希表），是根据键（Key）而直接访问在内存存储位置的数据结构。也就是说，它通过计算一个关于键值的函数，将所需查询的数据映射到表中一个位置来访问记录，这加快了查找速度。这个映射函数称做散列函数，存放记录的数组称做散列表。

哈希表的最大特性就是能使用O(1)的时间访问一个键值对。哈希表技术的重点有两点获取Key的Hash值和解决冲突

获取HASH

哈希函数可以被认为是一种纯函数，接收一个字符串，返回一个Int数。字符串参数相同，返回的Int值必定相同，我这里就介绍一种名为BKDR的实现，我这里做了一些改装，这里要注意，一个Hash函数的性能是直接影响整个Hash表的性能的。

/**
 * 
 * @description 获取一个字符串对应的HashCode
 * @param {String} str 要转换的字符串
 * @returns {Number} 映射出来的整数
 * 要不是JS新增了BigInt 随便算算就超出最大IFEE754的极限了
 */
function getHash(str) {
	str = str.split('')
	let seed = BigInt(133),
		hash = BigInt(0)
	while (str.length > 0) {
		let charCode = BigInt(str[0].charCodeAt(0))
		str.shift()
		hash = BigInt(hash * seed + charCode)
	}
	return hash & BigInt(0x7fffffff)
}

一旦获取到Hash值，我们就能通过取模运算直接获取到最终的index值

// 获取index
let hash = getHash('imkey')
// 无论是Set过程中 还是Get过程中 都需要进行Index的计算
// 这个Index就代表了当前键值对象存储在表中的下标
let index = hash % table.length

解决冲突

事实上，没有一个Hash函数的实现能够彻底避免不同的Key最终生成相同的Hash，简单来讲就是会出现两个Key被映射到同一个槽位的情况。

常见的的解决方式有

链接法每个坑位里面放置一个链表一个坑位就能放多个值
开放地址法含有多个变种，但核心思想是通过某种规则尽可能的填向其他表格比如向下一个个检测
再散列设立一个Hash函数的池如果发生冲突按序调用直到Hash出来的Index坑位为空为止

这里我选择了链接法，直接上完整版的实现。

Code

/**
 * 
 * @description 获取一个字符串对应的HashCode
 * @param {String} str 要转换的字符串
 * @returns {Number} 映射出来的整数
 * 要不是JS新增了BigInt 随便算算就超出最大IFEE754的极限了
 */
function getHash(str) {
	str = str.split('')
	let seed = BigInt(133),
		hash = BigInt(0)
	while (str.length > 0) {
		let charCode = BigInt(str[0].charCodeAt(0))
		str.shift()
		hash = BigInt(hash * seed + charCode)
	}
	return hash & BigInt(0x7fffffff)
}

class HashTable {
	constructor() {
		// 初始表的大小设置为50 具体值要根据情况
		this.table = new Array(50).fill(null)
		// 表中元素个数
		this.tableNum = 0
		// 表的大小
		this.tableLen = 50
	}
	set(key, val) {
		// 装载因子过大 需要执行换表
		if (this._factor > 2) {
			this._moveNewTable()
		}
		let index = this._getIndex(key)
		let ele = this.table[index]
		// 空槽
		if (!ele) {
			this.table[index] = {
				prev: null,
				next: null,
				key,
				val
			}
			this.tableNum++
		} else {
			// 检查是否重复 重复直接覆盖值即可
			let node = this._find(ele, key)
			if (node) {
				node.val = val
				return
			}
			this.table[index] = {
				prev: null,
				next: ele,
				key,
				val
			}
			ele.prev = this.table[index]
			this.tableNum++
		}
	}
	get(key) {
		let index = this._getIndex(key)
		let node = this._find(this.table[index], key)
		if (node) {
			return node.val
		} else {
			return null
		}
	}
	// 删除元素
	del(key) {
		let index = this._getIndex(key)
		let node = this._find(this.table[index], key)
		if (node) {
			if (node.prev) {
				node.prev.next = node.next
			}
			if (node.next) {
				node.next.prev = node.prev
			}
			this.tableNum--
		}
	}
	// 搜寻目标 返回node引用 未找到返回Null
	_find(node, key) {
		while (node) {
			if (node.key === key) {
				return node
			}
			node = node.next
		}
		return null
	}
	_getIndex(key) {
		return Number(getHash(key) % BigInt(this.tableLen))
	}
	// 装载因子
	get _factor() {
		return this.tableNum / this.tableLen
	}
	// 进行表的扩增
	_moveNewTable() {
		let nTable = new Array(this.tableLen * 2).fill(null)
		let len = this.tableLen
		for (let i = 0; i < len; i++) {
			nTable[i] = this.table[i]
		}
		this.table = nTable
		this.tableLen = this.tableLen * 2
	}
}

const table = new HashTable()

table.set('name', 'Jack')
table.set('name', 'TOM')

for (let i = 0; i < 100; i++) {
	table.set('KEY' + i + Math.random(), i)
}

总结

哈希表是一种完美利用数组的线性内存，直接通过偏移访问的特性，再加上Hash函数的隐射，来做到KeyVal的O(1)访问的技术。

在很多语言内部都有直接的实现，但是理解其原理依旧非常重要

哈希表的实现

TOC

哈希表

获取HASH

解决冲突

Code

总结

FEATURED TAGS