mindtext.modules.encoder.bart

class mindtext.modules.encoder.bart.BartConfig (max_sent_len=5, vocab_size=50265, max_position_embeddings=1024, encoder_layers=12, encoder_ffn_dim=4096, encoder_attention_heads=16, decoder_layers=12, decoder_ffn_dim=4096, decoder_attention_heads=16, encoder_layerdrop=1.0, decoder_layerdrop=1.0, activation_function="gelu", d_model=1024, dropout=0.9, attention_dropout=1.0, activation_dropout=1.0, init_std=0.02, classifier_dropout=1.0, scale_embedding=False, gradient_checkpointing=False, use_cache=True, num_labels=3, pad_token_id=1, bos_token_id=0, eos_token_id=2, is_encoder_decoder=True, decoder_start_token_id=2, forced_eos_token_id=2)

Configuration class that stores the model configuration. It is used to initialize a BART model and defines the model architecture.

__init__ (max_sent_len=5, vocab_size=50265, max_position_embeddings=1024, encoder_layers=12, encoder_ffn_dim=4096, encoder_attention_heads=16, decoder_layers=12, decoder_ffn_dim=4096, decoder_attention_heads=16, encoder_layerdrop=1.0, decoder_layerdrop=1.0, activation_function="gelu", d_model=1024, dropout=0.9, attention_dropout=1.0, activation_dropout=1.0, init_std=0.02, classifier_dropout=1.0, scale_embedding=False, gradient_checkpointing=False, use_cache=True, num_labels=3, pad_token_id=1, bos_token_id=0, eos_token_id=2, is_encoder_decoder=True, decoder_start_token_id=2, forced_eos_token_id=2)

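Parameters

Note: the dropout and layerdrop defaults below are stated as drop probabilities, while the signature lists 0.9 and 1.0, which appear to be the complementary keep probabilities.

vocab_size (int): Vocabulary size of the BART model. Default: 50265.
d_model (int): Dimensionality of the layers and the pooler layer. Default: 1024.
encoder_layers (int): Number of encoder layers. Default: 12.
decoder_layers (int): Number of decoder layers. Default: 12.
encoder_attention_heads (int): Number of attention heads in each attention layer of the Transformer encoder. Default: 16.
decoder_attention_heads (int): Number of attention heads in each attention layer of the Transformer decoder. Default: 16.
decoder_ffn_dim (int): Dimensionality of the intermediate (feed-forward) layer in the decoder. Default: 4096.
encoder_ffn_dim (int): Dimensionality of the intermediate (feed-forward) layer in the encoder. Default: 4096.
activation_function (str): Activation function. Default: "gelu".
dropout (float): Dropout probability for all fully connected layers in the embeddings, encoder, and pooler. Default: 0.1.
attention_dropout (float): Dropout probability for attention. Default: 0.0.
activation_dropout (float): Dropout probability for the activations inside the fully connected layers. Default: 0.0.
classifier_dropout (float): Dropout probability for the classifier. Default: 0.0.
max_position_embeddings (int): Maximum sequence length the model can handle. A large value is recommended (e.g., 512, 1024, or 2048). Default: 1024.
init_std (float): Standard deviation of the truncated normal distribution used to initialize the weight matrices. Default: 0.02.
encoder_layerdrop (float): LayerDrop probability for the encoder. Default: 0.0. See https://arxiv.org/abs/1909.11556 for details.
decoder_layerdrop (float): LayerDrop probability for the decoder. Default: 0.0. See https://arxiv.org/abs/1909.11556 for details.
gradient_checkpointing (bool): If True, use gradient checkpointing to save memory at the expense of a slower backward pass. Default: False.
scale_embedding (bool): Whether to scale the embeddings by sqrt(d_model). Default: False.
use_cache (bool): Whether the model returns the most recent key/value attention states. Default: True.
num_labels (int): Number of labels used in transformers.BartForSequenceClassification. Default: 3.
forced_eos_token_id (int): Id of the token to force as the last generated token when the maximum length is reached. Default: 2.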
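
As a rough illustration of how these settings relate to each other, the sketch below mirrors a few of the documented defaults in a plain dataclass and derives the per-head width. `BartConfigSketch` and `head_dim` are hypothetical names, not part of mindtext, and the even split of `d_model` across attention heads is an assumption based on standard multi-head attention.

```python
from dataclasses import dataclass


@dataclass
class BartConfigSketch:
    """Hypothetical stand-in for BartConfig; not the mindtext class."""
    vocab_size: int = 50265
    d_model: int = 1024
    encoder_layers: int = 12
    decoder_layers: int = 12
    encoder_attention_heads: int = 16
    decoder_attention_heads: int = 16
    encoder_ffn_dim: int = 4096
    decoder_ffn_dim: int = 4096
    max_position_embeddings: int = 1024

    def head_dim(self, decoder: bool = False) -> int:
        """Per-head width, assuming d_model is split evenly across heads."""
        heads = self.decoder_attention_heads if decoder else self.encoder_attention_heads
        if self.d_model % heads != 0:
            raise ValueError("d_model must be divisible by the number of heads")
        return self.d_model // heads


# Documented defaults: 1024 // 16 = 64 features per head.
config = BartConfigSketch()
print(config.head_dim())

# A smaller configuration for quick experiments (values chosen for illustration).
small = BartConfigSketch(d_model=256, encoder_layers=2, decoder_layers=2,
                         encoder_attention_heads=4, decoder_attention_heads=4,
                         encoder_ffn_dim=1024, decoder_ffn_dim=1024)
print(small.head_dim(decoder=True))
```

Keeping the head count a divisor of `d_model` is the main constraint to watch when shrinking the model.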

shift_tokens_right (input_ids: mindspore.Tensor, pad_token_id: int, decoder_start_token_id: int)

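Shifts the input ids to the right.

Parameters

input_ids (mindspore.Tensor): The input ids.
pad_token_id (int): Id of the padding token.
decoder_start_token_id (int): Id of the token that starts the decoder sequence.

Returns

shifted_input_ids (mindspore.Tensor): The shifted input ids.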
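
The shift can be sketched on plain Python lists. This is a minimal illustration rather than the mindtext implementation: it assumes the convention used by the Hugging Face BART code, where the last token of each row is dropped, `decoder_start_token_id` is prepended, and any `-100` label placeholder is replaced by `pad_token_id`.

```python
from typing import List


def shift_tokens_right(input_ids: List[List[int]], pad_token_id: int,
                       decoder_start_token_id: int) -> List[List[int]]:
    """Shift every row one position to the right (list-based sketch)."""
    shifted = []
    for row in input_ids:
        # Drop the last token and prepend the decoder start token.
        new_row = [decoder_start_token_id] + list(row[:-1])
        # Assumption: -100 marks ignored label positions and maps to padding.
        new_row = [pad_token_id if tok == -100 else tok for tok in new_row]
        shifted.append(new_row)
    return shifted


batch = [[0, 31414, 232, 2], [0, 9226, 2, 1]]
shifted_batch = shift_tokens_right(batch, pad_token_id=1, decoder_start_token_id=2)
print(shifted_batch)  # [[2, 0, 31414, 232], [2, 0, 9226, 2]]
```

The shifted rows keep their original length, so they can serve as decoder inputs aligned with the unshifted labels.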

class mindtext.modules.encoder.bart.BartLearnedPositionalEmbedding (num_embeddings: int, embedding_dim: int)

Learns positional embeddings up to a fixed maximum size.

__init__ (num_embeddings: int, embedding_dim: int)

Parameters

num_embeddings (int): Number of embeddings.
embedding_dim (int): Dimensionality of each embedding.

construct (input_ids_shape: mindspore.Tensor.shape, past_key_values_length: int = 0)

Parameters

Returns
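
The `construct` signature suggests that positions are derived from the input shape and the number of already cached tokens. The sketch below shows that index computation under this assumption; `position_ids` is a hypothetical helper, and any fixed offset the real module may add to the indices is not modeled.

```python
from typing import List, Tuple


def position_ids(input_ids_shape: Tuple[int, int],
                 past_key_values_length: int = 0) -> List[int]:
    """Return the position index of each token in the current chunk."""
    _, seq_len = input_ids_shape
    start = past_key_values_length  # tokens already processed and cached
    return list(range(start, start + seq_len))


full_pass = position_ids((2, 5))
next_step = position_ids((2, 1), past_key_values_length=5)
print(full_pass)  # [0, 1, 2, 3, 4]
print(next_step)  # [5]
```

During incremental decoding only one new token is fed per step, so the cache length is what keeps its position index correct.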

class mindtext.modules.encoder.bart.BartAttention (embed_dim: int, num_heads: int, dropout: float = 0.0, is_decoder: bool = False, bias: bool = True)

Multi-head attention.

__init__ (embed_dim: int, num_heads: int, dropout: float = 0.0, is_decoder: bool = False, bias: bool = True)

Parameters

embed_dim (int): Dimensionality of the embeddings.
num_heads (int): Number of attention heads.
dropout (float): Dropout probability. Default: 0.0.
bias (bool): Whether to use a bias term. Default: True.

construct (hidden_states: mindspore.Tensor, key_value_states: Optional[mindspore.Tensor] = None, past_key_value: Optional[Tuple[mindspore.Tensor]] = None, attention_mask: Optional[mindspore.Tensor] = None)

Parameters

hidden_states (mindspore.Tensor):
key_value_states (Optional[mindspore.Tensor]):
past_key_value (Optional[Tuple[mindspore.Tensor]]):
attention_mask (Optional[mindspore.Tensor]):

Returns

attn_output (mindspore.Tensor): Attention output, with the embedding dimension.
past_key_value (Optional[mindspore.Tensor]): Cached hidden states of the past key and value projections.
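
For orientation, the core computation of a single attention head can be sketched in plain Python. This is generic scaled dot-product attention with an additive mask, not the mindtext code: the helper names are hypothetical, and the input and output projections, dropout, and the split of `embed_dim` into `num_heads` heads are left out.

```python
import math
from typing import List, Optional

Matrix = List[List[float]]


def softmax(row: List[float]) -> List[float]:
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(query: Matrix, key: Matrix, value: Matrix,
              mask: Optional[Matrix] = None) -> Matrix:
    """Scaled dot-product attention for one head."""
    scale = 1.0 / math.sqrt(len(query[0]))
    output = []
    for i, q in enumerate(query):
        # Similarity of this query with every key, scaled by sqrt(head_dim).
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in key]
        if mask is not None:
            # Additive mask: large negative entries suppress a position.
            scores = [s + m for s, m in zip(scores, mask[i])]
        weights = softmax(scores)
        # Weighted sum of the value rows.
        output.append([sum(w * v[d] for w, v in zip(weights, value))
                       for d in range(len(value[0]))])
    return output


q = [[1.0, 0.0]]
k = [[1.0, 0.0], [0.0, 1.0]]
v = [[1.0, 2.0], [3.0, 4.0]]
masked = attention(q, k, v, mask=[[0.0, -1e9]])
print(masked)  # the second key is masked out, leaving the first value row
```

With the second position masked, all attention weight falls on the first key, so the output equals the first value row.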

class mindtext.modules.encoder.bart.BartEncoderLayer (config: BartConfig)

Encoder layer.

__init__ (config: BartConfig)

Parameters

config (BartConfig): Configuration of the BART model.

construct (hidden_states: mindspore.Tensor, attention_mask: mindspore.Tensor)

Parameters

hidden_states (mindspore.Tensor): Input tensor of shape (seq_len, batch, embed_dim).
attention_mask (Optional[mindspore.Tensor]): Attention mask of size (batch, 1, tgt_len, src_len), in which padding elements are marked with very large negative values.

Returns

outputs (mindspore.Tensor):
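
The layer expects an additive mask of size (batch, 1, tgt_len, src_len) rather than a 0/1 mask. One way such a mask could be built from a (batch, src_len) 0/1 mask is sketched below on nested lists. `expand_mask` and `NEG_INF` are hypothetical names, and `-1e9` is only an assumed stand-in for the "very large negative value".

```python
from typing import List

NEG_INF = -1e9  # assumed stand-in for a very large negative number


def expand_mask(mask: List[List[int]], tgt_len: int) -> List[List[List[List[float]]]]:
    """Turn a (batch, src_len) 0/1 mask into a (batch, 1, tgt_len, src_len) additive mask."""
    expanded = []
    for row in mask:
        # 1 means "attend" (add 0.0); 0 means "padding" (add a large negative).
        additive = [0.0 if keep == 1 else NEG_INF for keep in row]
        # Repeat the same row for every target position, with a singleton head axis.
        expanded.append([[list(additive) for _ in range(tgt_len)]])
    return expanded


m = expand_mask([[1, 1, 0]], tgt_len=2)
print(len(m), len(m[0]), len(m[0][0]), len(m[0][0][0]))  # 1 1 2 3
print(m[0][0][0])  # [0.0, 0.0, -1000000000.0]
```

Adding this mask to the attention scores before the softmax drives the weight of padded positions to effectively zero.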

class mindtext.modules.encoder.bart.BartDecoderLayer (config: BartConfig)

Decoder layer.

__init__ (config: BartConfig)

Parameters

config (BartConfig): Configuration of the BART model.

construct (hidden_states: mindspore.Tensor, attention_mask: Optional[mindspore.Tensor] = None, encoder_hidden_states: Optional[mindspore.Tensor] = None, encoder_attention_mask: Optional[mindspore.Tensor] = None, past_key_value: Optional[Tuple[mindspore.Tensor]] = None)

Parameters

hidden_states (mindspore.Tensor): Input tensor of shape (seq_len, batch, embed_dim).
attention_mask (Optional[mindspore.Tensor]): Attention mask of size (batch, 1, tgt_len, src_len), in which padding elements are marked with very large negative values.
encoder_hidden_states (mindspore.Tensor): Cross-attention input to the layer, of shape (seq_len, batch, embed_dim).
encoder_attention_mask (mindspore.Tensor): Encoder attention mask of size (batch, 1, tgt_len, src_len), in which padding elements are marked with very large negative values.
past_key_value (mindspore.Tensor): Cached hidden states of the past key and value projections.

Returns

outputs (mindspore.Tensor):

class mindtext.modules.encoder.bart.BartClassificationHead (input_dim: int, inner_dim: int, num_classes: int, pooler_dropout: float)

Head for sentence-level classification tasks.

__init__ (input_dim: int, inner_dim: int, num_classes: int, pooler_dropout: float)

Parameters

input_dim (int): Dimensionality of the input.
inner_dim (int):
num_classes (int): Number of classes.
pooler_dropout (float): Dropout probability of the pooler.

construct (hidden_states: mindspore.Tensor)

Parameters

hidden_states (mindspore.Tensor):

Returns

hidden_states (mindspore.Tensor):
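
The internal layout of the head is not described here. The sketch below assumes the layout used by the Hugging Face BART classification head (a dense layer from `input_dim` to `inner_dim`, a tanh, then a projection to `num_classes`), with dropout skipped as it would be at inference time. All helper names are hypothetical and the weights are random placeholders.

```python
import math
import random
from typing import List, Tuple

Layer = Tuple[List[List[float]], List[float]]  # (weight, bias)


def make_layer(in_dim: int, out_dim: int, rng: random.Random) -> Layer:
    """Build a dense layer with small random weights and a zero bias."""
    weight = [[rng.gauss(0.0, 0.02) for _ in range(in_dim)] for _ in range(out_dim)]
    return weight, [0.0] * out_dim


def linear(x: List[float], layer: Layer) -> List[float]:
    """Apply y = W x + b to a single vector."""
    weight, bias = layer
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weight, bias)]


def classification_head(hidden: List[float], dense: Layer, out_proj: Layer) -> List[float]:
    # Assumed layout: dense -> tanh -> out_proj; dropout is skipped (inference).
    inner = [math.tanh(h) for h in linear(hidden, dense)]
    return linear(inner, out_proj)


rng = random.Random(0)
input_dim, inner_dim, num_classes = 8, 8, 3
dense = make_layer(input_dim, inner_dim, rng)
out_proj = make_layer(inner_dim, num_classes, rng)
logits = classification_head([1.0] * input_dim, dense, out_proj)
print(len(logits))  # one logit per class
```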

class mindtext.modules.encoder.bart.BartPretrainedModel (module)

BART pretrained model.

class mindtext.modules.encoder.bart.BartEncoder (config: BartConfig, embed_tokens: Optional[nn.Embedding] = None)

Transformer encoder consisting of config.encoder_layers self-attention layers. Each layer is a BartEncoderLayer.

__init__ (config: BartConfig, embed_tokens: Optional[nn.Embedding] = None)

Parameters

config (BartConfig): A BartConfig instance.
embed_tokens (mindspore.nn.Embedding): The output embedding.

construct (input_ids=None, attention_mask=None)

Parameters

input_ids (mindspore.Tensor): Indices of the input tokens in the vocabulary. Padding is ignored by default if you provide it.
attention_mask (mindspore.Tensor): Attention mask with values 0 or 1.

Returns

hidden_states (mindspore.Tensor):

class mindtext.modules.encoder.bart.BartDecoder (config: BartConfig, embed_tokens: Optional[nn.Embedding] = None)

Transformer decoder consisting of config.decoder_layers layers. Each layer is a BartDecoderLayer.

__init__ (config: BartConfig, embed_tokens: Optional[nn.Embedding] = None)

Parameters

config (BartConfig): A BartConfig instance.
embed_tokens (mindspore.nn.Embedding): The output embedding.

construct (input_ids=None, attention_mask=None, encoder_hidden_states=None, encoder_attention_mask=None, past_key_values=None)

Parameters

input_ids (mindspore.Tensor): Indices of the input tokens in the vocabulary. Padding is ignored by default if you provide it.
attention_mask (Optional[mindspore.Tensor]): Attention mask with values 0 or 1.
encoder_hidden_states (mindspore.Tensor): Hidden states output by the last layer of the encoder.
encoder_attention_mask (mindspore.Tensor): Attention mask with values 0 or 1.
past_key_values (mindspore.Tensor): Precomputed key and value hidden states of the attention blocks. Can be used to speed up decoding.

Returns

hidden_states (mindspore.Tensor):
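
Why cached keys and values speed up decoding can be sketched without any framework. In this illustration, `update_cache` is a hypothetical helper and the one-element vectors stand in for real key and value projections. At each step only the newest token's projections are computed and appended, and earlier ones are reused.

```python
from typing import List, Optional, Tuple

Vector = List[float]
Cache = Tuple[List[Vector], List[Vector]]  # (keys, values)


def update_cache(past: Optional[Cache], new_key: Vector, new_value: Vector) -> Cache:
    """Append the newest key/value pair to the cached ones."""
    keys, values = past if past is not None else ([], [])
    return keys + [new_key], values + [new_value]


cache = None
for step in range(3):
    # Placeholder projections for the token generated at this step.
    k, v = [float(step)], [float(step) * 10.0]
    cache = update_cache(cache, k, v)

print(len(cache[0]))  # 3 cached keys after three steps
print(cache[1])       # [[0.0], [10.0], [20.0]]
```

Without the cache, every step would recompute the projections for the whole prefix, so the work per step would grow with the sequence length.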

class mindtext.modules.encoder.bart.BartModel (config: BartConfig)

BART model.

__init__ (config: BartConfig)

Parameters

config (BartConfig): A BartConfig instance.

construct (input_ids=None, attention_mask=None, decoder_input_ids=None, decoder_attention_mask=None, past_key_values=None)

Parameters

input_ids (mindspore.Tensor): Indices of the input tokens in the vocabulary. Padding is ignored by default if you provide it.
attention_mask (Optional[mindspore.Tensor]): Attention mask with values 0 or 1.
decoder_input_ids (mindspore.Tensor):
decoder_attention_mask (mindspore.Tensor): Attention mask with values 0 or 1.
past_key_values (mindspore.Tensor): Precomputed key and value hidden states of the attention blocks. Can be used to speed up decoding.

Returns

hidden_states (mindspore.Tensor):

class mindtext.modules.encoder.bart.BartForConditionalGeneration (config: BartConfig)

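BART model for conditional generation.

__init__ (config: BartConfig)

Parameters

config (BartConfig): A BartConfig instance.

construct (input_ids=None, attention_mask=None, labels=None, decoder_attention_mask=None, past_key_values=None)

Parameters

input_ids (mindspore.Tensor): Indices of the input tokens in the vocabulary. Padding is ignored by default if you provide it.
attention_mask (Optional[mindspore.Tensor]): Attention mask with values 0 or 1.
labels (mindspore.Tensor): Labels for computing the masked language modeling loss.
decoder_attention_mask (mindspore.Tensor): Attention mask with values 0 or 1.
past_key_values (mindspore.Tensor): Precomputed key and value hidden states of the attention blocks. Can be used to speed up decoding.

Returns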
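
How `labels` might enter the loss can be sketched with a plain token-level cross-entropy. This is an assumption about the general technique, not a description of the mindtext code: whether padded positions are ignored, and which reduction is used, is not stated here. `cross_entropy` is a hypothetical helper.

```python
import math
from typing import List


def cross_entropy(logits: List[List[float]], labels: List[int]) -> float:
    """Mean negative log-likelihood of the labels under softmax(logits)."""
    total = 0.0
    for row, label in zip(logits, labels):
        # log-sum-exp with the row maximum subtracted for numerical stability.
        m = max(row)
        log_sum = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum - row[label]
    return total / len(labels)


# Two positions, vocabulary of four tokens, all logits equal.
uniform = [[0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
loss = cross_entropy(uniform, [1, 3])
print(loss)  # equals log(4), since every token is equally likely
```

A uniform prediction over a vocabulary of size V always gives a loss of log(V), which is a convenient sanity check for an untrained model.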

class mindtext.modules.encoder.bart.BartForConditionalGenerationOneStep (net, optimizer, sens=1.0)

BartForConditionalGenerationOneStep

__init__ (net, optimizer, sens=1.0)

Parameters

net: The network to wrap.
optimizer: The optimizer.
sens (float): Sensitivity (gradient scaling) value. Default: 1.0.

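construct (input_ids, attention_mask=None, labels=None)

Parameters

input_ids (mindspore.Tensor): Indices of the input tokens in the vocabulary. Padding is ignored by default if you provide it.
attention_mask (Optional[mindspore.Tensor]): Attention mask with values 0 or 1.
labels (mindspore.Tensor): Labels for computing the masked language modeling loss.

Returns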
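
The (net, optimizer, sens) signature resembles a one-step training wrapper: compute the loss, scale the gradient by `sens`, and let the optimizer update the weights. The toy sketch below illustrates that pattern only. `ToyNet`, `ToySGD`, and `OneStepSketch` are hypothetical, and whether the mindtext class behaves this way is an assumption.

```python
class ToyNet:
    """Hypothetical one-parameter model with loss (w * x - y) ** 2."""

    def __init__(self, w: float) -> None:
        self.w = w

    def loss(self, x: float, y: float) -> float:
        return (self.w * x - y) ** 2

    def grad(self, x: float, y: float, sens: float) -> float:
        # d(loss)/dw, scaled by the sensitivity value.
        return sens * 2.0 * (self.w * x - y) * x


class ToySGD:
    """Plain gradient descent with a fixed learning rate."""

    def __init__(self, lr: float) -> None:
        self.lr = lr

    def step(self, net: ToyNet, grad: float) -> None:
        net.w -= self.lr * grad


class OneStepSketch:
    """Run forward, backward, and update in a single call."""

    def __init__(self, net: ToyNet, optimizer: ToySGD, sens: float = 1.0) -> None:
        self.net = net
        self.optimizer = optimizer
        self.sens = sens

    def __call__(self, x: float, y: float) -> float:
        loss = self.net.loss(x, y)  # loss before the update
        self.optimizer.step(self.net, self.net.grad(x, y, self.sens))
        return loss


net = ToyNet(w=0.0)
train_step = OneStepSketch(net, ToySGD(lr=0.25))
first_loss = train_step(1.0, 2.0)
print(first_loss)  # 4.0, the loss before the update
print(net.w)       # 1.0 after one step
```

Bundling the three phases in one callable keeps the training loop to a single call per batch.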