
Understanding ELMo's number of representations

Python

達令說 2021-11-23 16:40:21

I am trying out ELMo by simply using it as part of a larger PyTorch model. The documentation gives a basic example:

"This is a torch.nn.Module subclass that computes any number of ELMo representations and introduces trainable scalar weights for each. For example, this code snippet computes two layers of representations (as in the SNLI and SQuAD models from our paper):"

    from allennlp.modules.elmo import Elmo, batch_to_ids

    options_file = "https://s3-us-west-2.amazonaws.com/allennlp/models/elmo/2x4096_512_2048cnn_2xhighway/elmo_2x4096_512_2048cnn_2xhighway_options.json"
    weight_file = "https://s3-us-west-2.amazonaws.com/allennlp/models/elmo/2x4096_512_2048cnn_2xhighway/elmo_2x4096_512_2048cnn_2xhighway_weights.hdf5"

    # Compute two different representations for each token.
    # Each representation is a linear weighted combination of the
    # 3 layers in ELMo (i.e., charcnn, the outputs of the two BiLSTM)
    elmo = Elmo(options_file, weight_file, 2, dropout=0)

    # use batch_to_ids to convert sentences to character ids
    sentences = [['First', 'sentence', '.'], ['Another', '.']]
    character_ids = batch_to_ids(sentences)

    embeddings = elmo(character_ids)

    # embeddings['elmo_representations'] is a length-two list of tensors.
    # Each element contains one layer of ELMo representations with shape
    # (2, 3, 1024).
    #   2    - the batch size
    #   3    - the sequence length of the batch
    #   1024 - the length of each ELMo vector

My question concerns the "representations". Can they be compared to the output layer of an ordinary word2vec model? You can choose how many representations ELMo returns (growing the output along one dimension), but what is the difference between these generated representations, and what are they typically used for?

To give you an idea: for the code above, embeddings['elmo_representations'] returns a list of two items (the two representation layers), but they are identical.

In short, how should a "representation" in ELMo be defined?


1 Answer

肥皂起泡泡


See Section 3.2 of the original paper:

"ELMo is a task specific combination of the intermediate layer representations in the biLM. For each token, a L-layer biLM computes a set of 2L+1 representations."
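In the notation of that section, the set of layer outputs for token k, and the task-specific weighted sum that collapses it into a single ELMo vector, can be written as follows. Here j = 0 denotes the context-independent token layer, and each h for j ≥ 1 is the concatenation of the forward and backward LSTM outputs at layer j:

```latex
R_k = \{\, \mathbf{x}_k^{LM},\ \overrightarrow{\mathbf{h}}_{k,j}^{LM},\ \overleftarrow{\mathbf{h}}_{k,j}^{LM} \mid j = 1, \dots, L \,\}
    = \{\, \mathbf{h}_{k,j}^{LM} \mid j = 0, \dots, L \,\}

\mathrm{ELMo}_k^{task} = \gamma^{task} \sum_{j=0}^{L} s_j^{task}\, \mathbf{h}_{k,j}^{LM}
```

The weights s_j are softmax-normalized and γ is a scalar scale factor; both are learned for the downstream task. For the model in the question L = 2, so each token has 2L + 1 = 5 vectors, grouped into 3 layers.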

Earlier, Section 3.1 says:

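"Recent state-of-the-art neural language models compute a context-independent token representation (via token embeddings or a CNN over characters) then pass it through L layers of forward LSTMs. At each position k, each LSTM layer outputs a context-dependent representation. The top layer LSTM output is used to predict the next token with a Softmax layer."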

To answer your question: the representations are these LSTM-based, context-dependent representations.
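As for why the two returned items are identical, here is a minimal pure-Python sketch of the weighted layer combination described in Section 3.2 of the paper (softmax-normalized scalar weights times a scale factor). The names scalar_mix, scalars and gamma are illustrative and not part of AllenNLP's API, and the toy vectors are made up. The sketch assumes that each requested representation starts with all-zero scalar weights and gamma = 1, which I believe matches the library defaults but have not verified here. Under that assumption, and with dropout=0, every representation is the same plain average of the layers until training updates its weights:

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scalar_mix(layers, scalars, gamma=1.0):
    """Collapse L+1 layer vectors for one token into a single vector.

    layers:  list of L+1 equal-length vectors (lists of floats)
    scalars: L+1 unnormalized weights, one per layer (hypothetical name)
    gamma:   overall scale factor (hypothetical name)
    """
    weights = softmax(scalars)
    dim = len(layers[0])
    return [
        gamma * sum(w * layer[i] for w, layer in zip(weights, layers))
        for i in range(dim)
    ]


# Toy stand-in for one token: 3 layers (char CNN + 2 biLSTM layers), dim 4.
layers = [
    [1.0, 0.0, 2.0, 4.0],
    [3.0, 3.0, 0.0, 1.0],
    [2.0, 6.0, 1.0, 1.0],
]

# Two "representations" with the same assumed initial parameters.
rep_a = scalar_mix(layers, scalars=[0.0, 0.0, 0.0])
rep_b = scalar_mix(layers, scalars=[0.0, 0.0, 0.0])

# After training, each representation has its own weights and can differ.
rep_b_trained = scalar_mix(layers, scalars=[2.0, 0.0, -1.0], gamma=0.5)

print(rep_a == rep_b)          # same parameters, same output
print(rep_a == rep_b_trained)  # different parameters, different output
```

In a real model the separate representations become useful once they are fed into different parts of the network (for example, one at the input layer and one at the output layer) and their weights are trained along with everything else.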

2021-11-23