How can I add a Bi-LSTM layer on top of a BERT model?
Problem description
I'm using PyTorch with the base pretrained BERT model to classify sentences for hate speech. I want to implement a Bi-LSTM layer that takes as input all outputs of the last transformer encoder of the BERT model, in a new model (a class implementing nn.Module), and I'm confused about the nn.LSTM parameters. I set up the pretrained model with
bert = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=int(data['class'].nunique()),
    output_attentions=False,
    output_hidden_states=False,
)
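(For reference, tokenization itself is done with the matching tokenizer, not with the model above. A minimal sketch, assuming a pandas DataFrame named data with a sentence column as described below; the padding/truncation settings are illustrative assumptions:)

from transformers import BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
# Encode all sentences into input ids and attention masks in one call.
encodings = tokenizer(list(data['sentence']), padding=True, truncation=True,
                      max_length=128, return_tensors="pt")
input_ids = encodings['input_ids']            # shape: (num_sentences, max_length)
attention_mask = encodings['attention_mask']  # 1 for real tokens, 0 for padding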
My data set has 2 columns: class (label) and sentence. Can someone help me with this? Thanks in advance.
Edit: Also, after processing the input in the Bi-LSTM, the network should send the final hidden state to a fully connected network that performs classification using the softmax activation function. How can I do that?
Recommended answer
You can do it like this:
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizerFast

class CustomBERTModel(nn.Module):
    def __init__(self, number_of_classes):
        super(CustomBERTModel, self).__init__()
        self.bert = BertModel.from_pretrained("bert-base-uncased")
        ### New layers:
        self.lstm = nn.LSTM(768, 256, batch_first=True, bidirectional=True)
        self.linear = nn.Linear(256 * 2, number_of_classes)

    def forward(self, ids, mask):
        # In transformers v4+ the model returns an output object rather than
        # a tuple; sequence_output has shape (batch_size, sequence_length, 768).
        sequence_output = self.bert(ids, attention_mask=mask).last_hidden_state

        # lstm_output has shape (batch_size, sequence_length, 2 * 256).
        lstm_output, (h, c) = self.lstm(sequence_output)

        # Final hidden state of a Bi-LSTM: concatenate the last step of the
        # forward direction with the first step of the backward direction.
        # (With padded batches you may want the last non-padded position instead.)
        hidden = torch.cat((lstm_output[:, -1, :256], lstm_output[:, 0, 256:]), dim=-1)

        # hidden has shape (batch_size, 512); the linear layer maps it to class logits.
        linear_output = self.linear(hidden)
        return linear_output

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = CustomBERTModel(number_of_classes=int(data['class'].nunique()))
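To answer the follow-up question about softmax: in PyTorch you normally do not add an explicit softmax layer for training. nn.CrossEntropyLoss expects raw logits and applies log-softmax internally, so the model above can be used as-is; softmax is only applied explicitly at inference time if you want probabilities. A minimal usage sketch, continuing from the tokenizer and model defined above (the example sentence and label are hypothetical):

# Encode a batch and run a forward pass.
encoding = tokenizer(["an example sentence"], padding=True,
                     truncation=True, return_tensors="pt")
logits = model(encoding["input_ids"], encoding["attention_mask"])

# Training step: CrossEntropyLoss applies log-softmax internally,
# so the model returns raw logits and needs no softmax layer.
criterion = nn.CrossEntropyLoss()
labels = torch.tensor([0])  # hypothetical label for the example sentence
loss = criterion(logits, labels)
loss.backward()

# Inference: apply softmax explicitly to get class probabilities.
probs = torch.softmax(logits, dim=-1)
predicted_class = probs.argmax(dim=-1)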
