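Keras: A Detailed Guide to Using the Embedding Layer

Author: MXuDong  Date: 2021-06-05 01:08:17

I have recently been working on NLP tasks, again using the Embedding layer in Keras for word embeddings.

This post gives an introduction to the Embedding layer in Keras.

Chinese documentation: https://keras.io/zh/layers/embeddings/

The layer's signature, as given in that documentation, is:

keras.layers.Embedding(input_dim, output_dim, embeddings_initializer='uniform', embeddings_regularizer=None, activity_regularizer=None, embeddings_constraint=None, mask_zero=False, input_length=None)

The most important parameters are input_dim (the vocabulary size, i.e. the largest index + 1) and output_dim (the dimension of each embedding vector), together with the optional input_length (the length of the input sequences when it is constant).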
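To make these three parameters concrete, here is a minimal NumPy sketch of what an embedding lookup computes. It is not Keras code: the table `embeddings` and the helper `embed` are hypothetical names chosen for illustration, and the random values in [-0.05, 0.05) only mimic the layer's default 'uniform' initializer.

```python
import numpy as np

input_dim = 5      # vocabulary size: valid indices are 0..4
output_dim = 3     # length of each embedding vector
input_length = 4   # number of indices per input sequence

rng = np.random.default_rng(0)
# The layer's only weight: one row of length output_dim per vocabulary index.
embeddings = rng.uniform(-0.05, 0.05, size=(input_dim, output_dim))

def embed(indices):
    """Look up one row of `embeddings` for every integer in `indices`."""
    indices = np.asarray(indices)
    return embeddings[indices]

batch = np.array([[0, 1, 2, 3],
                  [4, 4, 0, 1]])   # shape (batch, input_length)
output = embed(batch)              # shape (batch, input_length, output_dim)
print(output.shape)
```

Each integer in the input is replaced by the corresponding row of the table, which is why the output gains a trailing axis of size output_dim.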

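The initializer settings are summarized separately further below.

The demo uses pretrained vectors (word2vec trained on a Baidu Baike corpus).

A demo of building the embedding matrix: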

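import numpy as np

def create_embedding(word_index, num_words, word2vec_model):
    # EMBEDDING_DIM is defined elsewhere; it must equal the size of the pretrained vectors.
    embedding_matrix = np.zeros((num_words, EMBEDDING_DIM))
    for word, i in word_index.items():
        if i >= num_words:
            continue  # index falls outside the matrix
        try:
            embedding_matrix[i] = word2vec_model[word]
        except KeyError:
            # Words missing from the pretrained model keep an all-zero row.
            continue
    return embedding_matrix

# word_index: dictionary mapping each word to its integer index
# num_words: vocabulary size + 1
# word2vec_model: the loaded word-vector model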
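As a usage sketch, the function can be exercised with a toy vocabulary. Everything below is made up for illustration: a plain dict stands in for the word2vec model (like a gensim KeyedVectors lookup, it raises KeyError for unknown words), and EMBEDDING_DIM is set to 3. The copy of create_embedding shown here catches only KeyError and skips out-of-range indices. The word indices start at 1, as they do with the Keras Tokenizer, which is why num_words is the vocabulary size plus one: row 0 stays reserved for padding.

```python
import numpy as np

EMBEDDING_DIM = 3  # toy value; in practice it must match the pretrained vectors

def create_embedding(word_index, num_words, word2vec_model):
    embedding_matrix = np.zeros((num_words, EMBEDDING_DIM))
    for word, i in word_index.items():
        if i >= num_words:
            continue
        try:
            embedding_matrix[i] = word2vec_model[word]
        except KeyError:
            continue  # unknown word: leave the row as zeros
    return embedding_matrix

# Hypothetical stand-ins for a tokenizer's word_index and a word2vec model.
word_index = {"cat": 1, "dog": 2, "unicorn": 3}
toy_vectors = {
    "cat": np.array([0.1, 0.2, 0.3]),
    "dog": np.array([0.4, 0.5, 0.6]),
}
num_words = len(word_index) + 1

embedding_matrix = create_embedding(word_index, num_words, toy_vectors)
print(embedding_matrix)
```

Rows 0 (padding) and 3 ("unicorn", absent from the toy vectors) remain all zeros, while rows 1 and 2 hold the pretrained vectors.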

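How to load the word-vector model:

import gensim

def pre_load_embedding_model(model_file):
    # For a full model saved with gensim's own save():
    # model = gensim.models.Word2Vec.load(model_file)
    # For vectors stored in the binary word2vec format:
    # model = gensim.models.KeyedVectors.load_word2vec_format(model_file, binary=True)
    # For vectors stored in the text word2vec format (the default):
    model = gensim.models.KeyedVectors.load_word2vec_format(model_file)
    return model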
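For orientation, load_word2vec_format in its default (non-binary) mode reads the plain-text word2vec format: a header line with the vocabulary size and the vector dimension, then one line per word with the word followed by its vector components. The pure-Python reader below is only a sketch of that format, not gensim's implementation; the function name read_word2vec_text and the sample data are hypothetical.

```python
import io

def read_word2vec_text(stream):
    """Parse word2vec text format into a {word: [floats]} dict."""
    header = stream.readline().split()
    vocab_size, dim = int(header[0]), int(header[1])
    vectors = {}
    for line in stream:
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    if len(vectors) != vocab_size:
        raise ValueError("header announces %d words, found %d"
                         % (vocab_size, len(vectors)))
    return vectors

# Toy file contents: 2 words, 3 dimensions.
sample = io.StringIO("2 3\ncat 0.1 0.2 0.3\ndog 0.4 0.5 0.6\n")
vectors = read_word2vec_text(sample)
print(vectors["dog"])
```

For real files gensim remains the better choice, since it also handles the binary variant and large vocabularies efficiently.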

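Setting up the Embedding layer in the model (note the parameters, the input coming from the Input layer, and the initializer):

from keras.initializers import Constant
from keras.layers import Embedding, Input

embedding_matrix = create_embedding(word_index, num_words, word2vec_model)

# trainable=False keeps the pretrained vectors frozen during training.
embedding_layer = Embedding(num_words,
                            EMBEDDING_DIM,
                            embeddings_initializer=Constant(embedding_matrix),
                            input_length=MAX_SEQUENCE_LENGTH,
                            trainable=False)
sequence_input = Input(shape=(MAX_SEQUENCE_LENGTH,), dtype='int32')
embedded_sequences = embedding_layer(sequence_input)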

Initializing the embedding layer

Two ways to set initial values for a Keras embedding

Randomly initialized Embedding

from keras.models import Sequential
from keras.layers import Embedding
import numpy as np

model = Sequential()
model.add(Embedding(1000, 64, input_length=10))
# the model will take as input an integer matrix of size (batch, input_length).
# the largest integer (i.e. word index) in the input should be no larger than 999 (vocabulary size).
# now model.output_shape == (None, 10, 64), where None is the batch dimension.

input_array = np.random.randint(1000, size=(32, 10))

model.compile('rmsprop', 'mse')
output_array = model.predict(input_array)
print(output_array)
assert output_array.shape == (32, 10, 64)

Using the weights argument to specify the embedding's initial values

import numpy as np
import keras

m = keras.models.Sequential()
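"""
Initial values can be supplied through the weights argument.
The Embedding layer is not differentiable with respect to its integer input,
so gradients flowing backward stop here. Placing an Embedding in the middle of
a network is therefore pointless; it can only serve as the first layer.
Note that the path from weights to the embeddings variable is indirect,
and that weights must be a list.
"""
embedding = keras.layers.Embedding(
    input_dim=3, output_dim=2, input_length=1,
    weights=[np.arange(3 * 2).reshape((3, 2))], mask_zero=True)
m.add(embedding)  # add() automatically triggers the embedding's build()
print(keras.backend.get_value(embedding.embeddings))
m.compile(keras.optimizers.RMSprop(), keras.losses.mse)
# One index per sample, shape (6, 1), matching input_length=1.
print(m.predict(np.array([1, 2, 2, 1, 2, 0]).reshape(-1, 1)))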
print(m.get_layer(index=0).get_weights())
print(keras.backend.get_value(embedding.embeddings))
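The example above sets mask_zero=True, which tells Keras to treat index 0 as padding and to hand a boolean mask to downstream layers; the mask is simply "input not equal to 0". A NumPy sketch of that computation, with hypothetical variable names, looks like this:

```python
import numpy as np

# Two padded sequences; 0 is the padding index.
inputs = np.array([[1, 2, 0, 0],
                   [2, 1, 2, 0]])

# Same rule the Embedding layer applies for its mask when mask_zero=True.
mask = inputs != 0
print(mask)

# Number of real (non-padding) tokens per sequence.
lengths = mask.sum(axis=1)
print(lengths)
```

Because index 0 is reserved in this mode, it cannot be used for a real word, so input_dim should be the vocabulary size plus one.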

The second way to give the embedding initial values: use an initializer

import numpy as np
import keras

m = keras.models.Sequential()
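"""
The second way to set the Embedding weights: pass a Constant initializer
instead of the weights argument.
The Embedding layer is not differentiable with respect to its integer input,
so gradients flowing backward stop here. Placing an Embedding in the middle of
a network is therefore pointless; it can only serve as the first layer.
"""
embedding = keras.layers.Embedding(
    input_dim=3, output_dim=2, input_length=1,
    embeddings_initializer=keras.initializers.Constant(
        np.arange(3 * 2, dtype=np.float32).reshape((3, 2))))
m.add(embedding)
print(keras.backend.get_value(embedding.embeddings))
m.compile(keras.optimizers.RMSprop(), keras.losses.mse)
# One index per sample, shape (5, 1), matching input_length=1.
print(m.predict(np.array([1, 2, 2, 1, 2]).reshape(-1, 1)))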
print(m.get_layer(index=0).get_weights())
print(keras.backend.get_value(embedding.embeddings))

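The key difficulty is working out how weights ends up in the embedding.embeddings tensor.

Embedding is a layer that inherits from Layer. Layer accepts a weights argument, which is a list whose elements are all NumPy arrays. When the Layer constructor runs, the weights argument is stored in the _initial_weights attribute.

The Layer class in base_layer.py:

if 'weights' in kwargs:
    self._initial_weights = kwargs['weights']
else:
    self._initial_weights = None

When the Embedding layer is added to a model and connected to the previous layer, layer(previous_layer_output) is called, where layer is the Embedding instance. Embedding is a subclass of Layer and does not override __call__(); Layer implements __call__().

The parent class's __call__() invokes the subclass's call() method to obtain the result.

So what ultimately runs is Layer.__call__(). Inside this method, the layer checks whether it has already been built (via the boolean self.built).

Layer.__call__ is a very important function.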

def __call__(self, inputs, **kwargs):
    """Wrapper around self.call(), for handling internal references.

    If a Keras tensor is passed:
        - We call self._add_inbound_node().
        - If necessary, we `build` the layer to match
            the _keras_shape of the input(s).
        - We update the _keras_shape of every input tensor with
            its new shape (obtained via self.compute_output_shape).
            This is done as part of _add_inbound_node().
        - We update the _keras_history of the output tensor(s)
            with the current layer.
            This is done as part of _add_inbound_node().

    # Arguments
        inputs: Can be a tensor or list/tuple of tensors.
        **kwargs: Additional keyword arguments to be passed to `call()`.

    # Returns
        Output of the layer's `call` method.

    # Raises
        ValueError: in case the layer is missing shape information
            for its `build` call.
    """
    if isinstance(inputs, list):
        inputs = inputs[:]
    with K.name_scope(self.name):
        # Handle laying building (weight creating, input spec locking).
        if not self.built:  # not built yet: run build() before calling call()
            # Raise exceptions in case the input is not compatible
            # with the input_spec specified in the layer constructor.
            self.assert_input_compatibility(inputs)

            # Collect input shapes to build layer.
            input_shapes = []
            for x_elem in to_list(inputs):
                if hasattr(x_elem, '_keras_shape'):
                    input_shapes.append(x_elem._keras_shape)
                elif hasattr(K, 'int_shape'):
                    input_shapes.append(K.int_shape(x_elem))
                else:
                    raise ValueError('You tried to call layer "' +
                                     self.name +
                                     '". This layer has no information'
                                     ' about its expected input shape, '
                                     'and thus cannot be built. '
                                     'You can build it manually via: '
                                     '`layer.build(batch_input_shape)`')
            self.build(unpack_singleton(input_shapes))
            self.built = True  # somewhat redundant: self.build() already sets built to True

            # Load weights that were specified at layer instantiation.
            # If weights were passed in, they are assigned to the variables here,
            # overwriting the values produced by self.build() above.
            if self._initial_weights is not None:
                self.set_weights(self._initial_weights)

        # Raise exceptions in case the input is not compatible
        # with the input_spec set at build time.
        self.assert_input_compatibility(inputs)

        # Handle mask propagation.
        previous_mask = _collect_previous_mask(inputs)
        user_kwargs = copy.copy(kwargs)
        if not is_all_none(previous_mask):
            # The previous layer generated a mask.
            if has_arg(self.call, 'mask'):
                if 'mask' not in kwargs:
                    # If mask is explicitly passed to __call__,
                    # we should override the default mask.
                    kwargs['mask'] = previous_mask
        # Handle automatic shape inference (only useful for Theano).
        input_shape = _collect_input_shape(inputs)

        # Actually call the layer,
        # collecting output(s), mask(s), and shape(s).
        output = self.call(inputs, **kwargs)
        output_mask = self.compute_mask(inputs, previous_mask)

        # If the layer returns tensors from its inputs, unmodified,
        # we copy them to avoid loss of tensor metadata.
        output_ls = to_list(output)
        inputs_ls = to_list(inputs)
        output_ls_copy = []
        for x in output_ls:
            if x in inputs_ls:
                x = K.identity(x)
            output_ls_copy.append(x)
        output = unpack_singleton(output_ls_copy)

        # Inferring the output shape is only relevant for Theano.
        if all([s is not None
                for s in to_list(input_shape)]):
            output_shape = self.compute_output_shape(input_shape)
        else:
            if isinstance(input_shape, list):
                output_shape = [None for _ in input_shape]
            else:
                output_shape = None

        if (not isinstance(output_mask, (list, tuple)) and
                len(output_ls) > 1):
            # Augment the mask to match the length of the output.
            output_mask = [output_mask] * len(output_ls)

        # Add an inbound node to the layer, so that it keeps track
        # of the call and of all new variables created during the call.
        # This also updates the layer history of the output tensor(s).
        # If the input tensor(s) had not previous Keras history,
        # this does nothing.
        self._add_inbound_node(input_tensors=inputs,
                               output_tensors=output,
                               input_masks=previous_mask,
                               output_masks=output_mask,
                               input_shapes=input_shape,
                               output_shapes=output_shape,
                               arguments=user_kwargs)

        # Apply activity regularizer if any:
        if (hasattr(self, 'activity_regularizer') and
                self.activity_regularizer is not None):
            with K.name_scope('activity_regularizer'):
                regularization_losses = [
                    self.activity_regularizer(x)
                    for x in to_list(output)]
            self.add_loss(regularization_losses,
                          inputs=to_list(inputs))
    return output

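If the layer has not been built yet, the Embedding class's build() function is called automatically. Embedding.build() pays no attention to weights; if no initializer was passed in, self.embeddings_initializer falls back to the default random ('uniform') initialization.

If an initializer was passed in, the embeddings are already initialized to the intended values at this step.

If both embeddings_initializer and weights are passed in, the weights argument subsequently overwrites Embedding.embeddings.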
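The order of operations described here (build() runs the initializer first, then the stored weights overwrite the result) can be mimicked with a small stand-alone class. This is a simplified sketch, not the Keras source: ToyEmbedding and its methods are hypothetical and keep only the steps relevant to weight initialization.

```python
import numpy as np

class ToyEmbedding:
    """Stripped-down imitation of the build-then-set_weights flow."""

    def __init__(self, input_dim, output_dim, initializer=None, weights=None):
        self.input_dim = input_dim
        self.output_dim = output_dim
        # Fall back to random uniform values when no initializer is given.
        self.initializer = initializer or (
            lambda shape: np.random.uniform(-0.05, 0.05, size=shape))
        self._initial_weights = weights  # stored, not applied yet
        self.built = False

    def build(self):
        # build() only knows about the initializer; it ignores weights.
        self.embeddings = self.initializer((self.input_dim, self.output_dim))
        self.built = True

    def set_weights(self, weights):
        self.embeddings = np.asarray(weights[0], dtype=np.float32)

    def call(self, inputs):
        return self.embeddings[np.asarray(inputs)]

    def __call__(self, inputs):
        if not self.built:
            self.build()
            # Weights given at construction time overwrite the initializer's values.
            if self._initial_weights is not None:
                self.set_weights(self._initial_weights)
        return self.call(inputs)

ones = lambda shape: np.ones(shape, dtype=np.float32)
table = np.arange(6, dtype=np.float32).reshape((3, 2))

# Both an initializer and weights are supplied: weights win.
layer = ToyEmbedding(3, 2, initializer=ones, weights=[table])
print(layer([1, 2]))

# Only an initializer is supplied: its values are kept.
only_init = ToyEmbedding(3, 2, initializer=ones)
print(only_init([0]))
```

The first layer returns rows of `table` even though an all-ones initializer was also given, which mirrors the overwrite behavior described in the text.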

The build function of the Embedding class in embeddings.py:

def build(self, input_shape):
    self.embeddings = self.add_weight(
        shape=(self.input_dim, self.output_dim),
        initializer=self.embeddings_initializer,
        name='embeddings',
        regularizer=self.embeddings_regularizer,
        constraint=self.embeddings_constraint,
        dtype=self.dtype)
    self.built = True

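In summary, assigning values to a Layer's variables through weights is a fairly general mechanism in Keras, but it is not very intuitive. Keras encourages using an explicit initializer wherever possible and leaving weights alone.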

Source: https://blog.csdn.net/qq_33472765/article/details/86561245
