mindtext.modules.encoder.xlnet

class mindtext.modules.encoder.xlnet.XLNetConfig (vocab_size: int = 32000, d_model: int = 1024, n_layer: int = 24, n_head: int = 16, d_inner: int = 4096, ff_activation: str = "gelu", untie_r: bool = True, attn_type_bi: bool = True, initializer_range: float = 0.02, layer_norm_eps: float = 1e-12, dropout: float = 0.1, mem_len: int = 512, reuse_len: Optional[int] = None, use_mems: bool = False, bi_data: bool = False, clamp_len: int = -1, same_length: bool = False, pad_token_id: int = 5, bos_token_id: int = 1, eos_token_id: int = 2, **kwargs)

Configuration for XLNetModel. It can be loaded from a JSON file.

Example

>>> xlconfig = XLNetConfig.from_json_file('json_file_path')
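
The JSON file presumably holds the constructor arguments as key/value pairs. The sketch below writes such a file with the standard library only. That the keys mirror the parameter names in the class signature is an assumption not confirmed on this page, and the file name xlnet_config.json is hypothetical.

```python
import json
import os
import tempfile

# Assumption: the JSON keys mirror the XLNetConfig constructor parameters.
# The values below are the defaults listed in the class signature.
config_dict = {
    "vocab_size": 32000,
    "d_model": 1024,
    "n_layer": 24,
    "n_head": 16,
    "d_inner": 4096,
    "ff_activation": "gelu",
    "dropout": 0.1,
    "mem_len": 512,
    "use_mems": False,
    "clamp_len": -1,
}

# Hypothetical file name, written to a temporary directory.
json_file_path = os.path.join(tempfile.mkdtemp(), "xlnet_config.json")
with open(json_file_path, "w", encoding="utf-8") as f:
    json.dump(config_dict, f, indent=2)

# Read the file back to confirm that it round-trips unchanged.
with open(json_file_path, encoding="utf-8") as f:
    loaded = json.load(f)
```

The resulting json_file_path could then be passed to XLNetConfig.from_json_file(json_file_path).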

__init__ (vocab_size: int = 32000, d_model: int = 1024, n_layer: int = 24, n_head: int = 16, d_inner: int = 4096, ff_activation: str = "gelu", untie_r: bool = True, attn_type_bi: bool = True, initializer_range: float = 0.02, layer_norm_eps: float = 1e-12, dropout: float = 0.1, mem_len: int = 512, reuse_len: Optional[int] = None, use_mems: bool = False, bi_data: bool = False, clamp_len: int = -1, same_length: bool = False, pad_token_id: int = 5, bos_token_id: int = 1, eos_token_id: int = 2, **kwargs)

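Parameters

vocab_size (int): Vocabulary size of the XLNet model, i.e. the number of distinct tokens that can be represented. Defaults to 32000.
d_model (int): Dimensionality of the encoder layers and the pooler layer. Defaults to 1024.
n_layer (int): Number of encoder layers in the Transformer. Defaults to 24.
n_head (int): Number of attention heads in each attention layer. Defaults to 16.
d_inner (int): Dimensionality of the feed-forward layer in the Transformer. Defaults to 4096.
ff_activation (str): Non-linear activation function. Defaults to "gelu".
untie_r (bool): Whether to untie the relative position biases. Defaults to True.
attn_type_bi (bool): Attention type used in the model; True selects bidirectional ("bi") attention. Defaults to True.
initializer_range (float): Standard deviation of the truncated normal distribution used for initialization. Defaults to 0.02.
layer_norm_eps (float): Epsilon used by the layer normalization layers. Defaults to 1e-12.
dropout (float): Dropout probability applied in the embedding, encoder, and pooler layers. Defaults to 0.1.
mem_len (int): Number of tokens to cache. Key/value pairs already computed in a previous forward pass are not recomputed. Defaults to 512.
reuse_len (Optional[int]): Number of tokens in the current batch to cache and reuse in later steps. Defaults to None.
use_mems (bool): Whether the model uses the memory mechanism. Defaults to False.
bi_data (bool): Whether to use a bidirectional input pipeline. Usually set to True during pretraining and False during fine-tuning. Defaults to False.
clamp_len (int): Clamp all relative distances larger than clamp_len. Setting it to -1 means no clamping. Defaults to -1.
same_length (bool): Whether to use the same attention length for each token. Defaults to False.
pad_token_id (int): ID of the padding token. Defaults to 5.
bos_token_id (int): ID of the beginning-of-sequence token. Defaults to 1.
eos_token_id (int): ID of the end-of-sequence token. Defaults to 2.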
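
The interplay of clamp_len, mem_len, and reuse_len may be easier to see in code. The following is a minimal sketch in plain Python, not the library's implementation: the helper names are hypothetical, the symmetric clamping range is an assumption, and integers stand in for per-token hidden states.

```python
from typing import List, Optional


def clamp_relative_distance(distance: int, clamp_len: int = -1) -> int:
    """Limit a relative distance to [-clamp_len, clamp_len]; -1 disables clamping."""
    if clamp_len < 0:
        return distance
    return max(-clamp_len, min(clamp_len, distance))


def update_memory(prev_mem: List[int], curr_out: List[int], mem_len: int,
                  reuse_len: Optional[int] = None) -> List[int]:
    """Append the current segment to the cache and keep the last mem_len entries."""
    if reuse_len is not None and reuse_len > 0:
        # Only the first reuse_len positions of the current segment are cached.
        curr_out = curr_out[:reuse_len]
    if mem_len <= 0:
        return []
    return (prev_mem + curr_out)[-mem_len:]


clamped = clamp_relative_distance(700, clamp_len=512)
new_mem = update_memory([1, 2, 3], [4, 5, 6], mem_len=4)
```

With these toy inputs the cache drops its oldest entries first, so the model only ever attends to the most recent mem_len cached positions.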

class mindtext.modules.encoder.xlnet.XLNetRelativeAttention (config: XLNetConfig)

Example

>>> xlconfig = XLNetConfig.from_json_file('json_file_path')
>>> xl_rel_attn = XLNetRelativeAttention(xlconfig)

__init__ (config: XLNetConfig)

Parameters

config (XLNetConfig): The model configuration.

construct (h: Tensor, g: Tensor, attn_mask_h: Tensor, attn_mask_g: Tensor, r: Tensor, seg_mat: Tensor, mems: Optional[Tensor] = None, target_mapping: Optional[Tensor] = None)

Forward pass of the XLNet relative attention layer.

Parameters

h (mindspore.Tensor): H hidden states.
g (mindspore.Tensor): G hidden states.
attn_mask_h (mindspore.Tensor): Attention mask for the H hidden states.
attn_mask_g (mindspore.Tensor): Attention mask for the G hidden states.
r (mindspore.Tensor): Positional encoding.
seg_mat (mindspore.Tensor): Segment encoding.
mems (Optional[Tensor]): Cached memory (mems) tensor.
target_mapping (Optional[Tensor]): XLNet target mapping.

Returns

output_h (Union[Tensor, Tuple[Tensor]]): Output of the relative attention.
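
This page does not explain why there are two hidden states and two masks. In the XLNet paper, h is the content stream, where a position may attend to itself, and g is the query stream, where it may not. The sketch below builds both visibility masks from a factorization order in plain Python. The function name and the convention that True means "may attend" are assumptions; the mask convention this layer actually expects is not stated here.

```python
from typing import List, Tuple


def two_stream_masks(order: List[int]) -> Tuple[List[List[bool]], List[List[bool]]]:
    """Build content-stream and query-stream visibility masks.

    order[k] is the position predicted at step k of the factorization order.
    mask[i][j] is True when position i may attend to position j.
    """
    n = len(order)
    rank = [0] * n
    for step, pos in enumerate(order):
        rank[pos] = step
    # Content stream: sees earlier positions in the order and itself.
    mask_h = [[rank[j] <= rank[i] for j in range(n)] for i in range(n)]
    # Query stream: sees strictly earlier positions only, never itself.
    mask_g = [[rank[j] < rank[i] for j in range(n)] for i in range(n)]
    return mask_h, mask_g


# Factorization order: position 2 first, then 0, then 1.
mask_h, mask_g = two_stream_masks([2, 0, 1])
```

The two masks differ only on the diagonal, which is what keeps the query stream from seeing the token it is asked to predict.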

class mindtext.modules.encoder.xlnet.XLNetFeedForward (config: XLNetConfig)

Example

>>> xlconfig = XLNetConfig.from_json_file('json_file_path')
>>> xl_feed_forward = XLNetFeedForward(xlconfig)

__init__ (config: XLNetConfig)

Parameters

config (XLNetConfig): The model configuration.

construct (inp: Tensor)

Parameters

inp (mindspore.Tensor): Input tensor.

Returns

output (mindspore.Tensor): Output tensor.
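
As a rough illustration of what a position-wise feed-forward block computes, the sketch below expands one token vector from d_model to d_inner, applies GELU, and projects back to d_model. It uses plain Python with toy sizes and hypothetical helper names. Whether XLNetFeedForward also applies dropout, a residual connection, or layer normalization is not stated on this page, so none of those are shown.

```python
import math
from typing import List


def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def linear(x: List[float], weight: List[List[float]], bias: List[float]) -> List[float]:
    """Compute weight @ x + bias for a single token vector."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def feed_forward(x: List[float], w1: List[List[float]], b1: List[float],
                 w2: List[List[float]], b2: List[float]) -> List[float]:
    """Expand d_model -> d_inner, apply GELU, project back to d_model."""
    hidden = [gelu(v) for v in linear(x, w1, b1)]
    return linear(hidden, w2, b2)


# Toy sizes: d_model = 2, d_inner = 4 (the documented defaults are 1024 and 4096).
w1 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 1.0]]
b1 = [0.0, 0.0, 0.0, 0.0]
w2 = [[0.5, 0.0, 0.5, 0.0], [0.0, 0.5, 0.0, 0.5]]
b2 = [0.0, 0.0]
out = feed_forward([1.0, 2.0], w1, b1, w2, b2)
```

The output has the same length as the input vector, which is why the block can be stacked after the attention layer without changing shapes.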

class mindtext.modules.encoder.xlnet.XLNetLayer (config: XLNetConfig)

XLNet encoder layer.

Example

>>> xlconfig = XLNetConfig.from_json_file('json_file_path')
>>> xl_layer = XLNetLayer(xlconfig)

__init__ (config: XLNetConfig)

Parameters

config (XLNetConfig): The model configuration.

construct (h: Tensor, g: Tensor, attn_mask_h: Tensor, attn_mask_g: Tensor, r: Tensor, seg_mat: Tensor, mems: Optional[Tensor] = None, target_mapping: Optional[Tensor] = None)

Parameters

h (mindspore.Tensor): H hidden states.
g (mindspore.Tensor): G hidden states.
attn_mask_h (mindspore.Tensor): Attention mask for the H hidden states.
attn_mask_g (mindspore.Tensor): Attention mask for the G hidden states.
r (mindspore.Tensor): Positional encoding.
seg_mat (mindspore.Tensor): Segment encoding.
mems (Optional[Tensor]): Cached memory (mems) tensor.
target_mapping (Optional[Tensor]): XLNet target mapping.

Returns

output_h (Union[Tensor, Tuple[Tensor]]): Output of the XLNet encoder layer.
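
How seg_mat is derived is not described on this page. In the XLNet paper, the relative segment encoding depends only on whether two positions belong to the same segment, so one plausible construction compares token type ids pairwise. The sketch below shows that idea in plain Python; the function name and the convention that 1 marks "different segment" are assumptions.

```python
from typing import List


def segment_matrix(token_type_ids: List[int]) -> List[List[int]]:
    """seg_mat[i][j] is 1 when tokens i and j belong to different segments."""
    return [[int(a != b) for b in token_type_ids] for a in token_type_ids]


# Two tokens from segment 0 followed by two tokens from segment 1.
seg_mat = segment_matrix([0, 0, 1, 1])
```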

class mindtext.modules.encoder.xlnet.XLNetModel (config: XLNetConfig)

The XLNet model.

Example

>>> xlconfig = XLNetConfig.from_json_file('json_file_path')
>>> xlnet = XLNetModel(xlconfig)

__init__ (config: XLNetConfig)

Parameters

config (XLNetConfig): The model configuration.

construct (input_ids: Tensor, attention_mask: Tensor, token_type_ids: Tensor, mems: Optional[List[Tensor]] = None, perm_mask: Optional[Tensor] = None, target_mapping: Optional[Tensor] = None)

Parameters

input_ids (mindspore.Tensor): Token index sequence of the input.
attention_mask (mindspore.Tensor): Attention mask of the input.
token_type_ids (mindspore.Tensor): Token type (segment) ids of the input sequence.
mems (Optional[List[Tensor]]): Cached memory (mems) tensors.
perm_mask (Optional[Tensor]): Permutation mask.
target_mapping (Optional[Tensor]): Target mapping.

Returns

output (Union[Tensor, Tuple[Tensor]]): The model output.
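
The three required inputs are usually prepared together from a batch of variable-length sequences. The sketch below pads a batch in plain Python, using the default pad_token_id of 5 from XLNetConfig. Right-side padding, a mask value of 1 for real tokens, and segment id 0 for padding are assumptions, since this page does not state the expected conventions; pad_batch is a hypothetical helper.

```python
from typing import List, Tuple

PAD_TOKEN_ID = 5  # default pad_token_id listed for XLNetConfig


def pad_batch(sequences: List[List[int]], segments: List[List[int]],
              pad_id: int = PAD_TOKEN_ID
              ) -> Tuple[List[List[int]], List[List[int]], List[List[int]]]:
    """Right-pad token ids and segment ids to the longest sequence in the batch."""
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask, token_type_ids = [], [], []
    for seq, seg in zip(sequences, segments):
        n_pad = max_len - len(seq)
        input_ids.append(seq + [pad_id] * n_pad)
        # Assumed convention: 1 marks a real token, 0 marks padding.
        attention_mask.append([1] * len(seq) + [0] * n_pad)
        token_type_ids.append(seg + [0] * n_pad)
    return input_ids, attention_mask, token_type_ids


ids, mask, types = pad_batch([[10, 11, 12], [20]], [[0, 0, 1], [0]])
```

The nested lists would still need to be converted to mindspore.Tensor objects before being passed to construct.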

class mindtext.modules.encoder.xlnet.XLNetFinetuneCell (network: nn.Cell, optimizer: nn.Optimizer, scale_update_cell: Optional[nn.Cell] = None)

XLNet fine-tuning cell.

__init__ (network: nn.Cell, optimizer: nn.Optimizer, scale_update_cell: Optional[nn.Cell] = None)

Parameters

network (nn.Cell): The XLNet model to train, for example XLNetForClassification.
optimizer (nn.Optimizer): The optimizer.
scale_update_cell (Optional[nn.Cell]): Cell that updates the loss scale.

construct (input_ids: Tensor, token_type_id: Tensor, attention_mask: Tensor, label: Tensor, sens: Optional[int] = None)

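Parameters

input_ids (mindspore.Tensor): Token index sequence of the input.
attention_mask (mindspore.Tensor): Attention mask of the input.
token_type_ids (mindspore.Tensor): Token type (segment) ids of the input sequence.
label (mindspore.Tensor): Label indices.
sens (Optional[Tensor]): Sensitivity of the gradient with respect to the output.

Returns

loss, cond (Tuple[Tensor, Tensor]): The loss, returned together with cond.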
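
The page says little about scale_update_cell or the returned cond. A common scheme in mixed-precision training is dynamic loss scaling: shrink the scale and skip the step when gradients overflow, and grow the scale again after a run of clean steps. The sketch below shows that rule in plain Python. Whether XLNetFinetuneCell follows exactly this rule, and whether cond is the overflow flag, are assumptions; the class and its parameter names are hypothetical.

```python
class LossScaleSketch:
    """Minimal dynamic loss scaling rule (illustrative, not the library's code)."""

    def __init__(self, loss_scale: float = 2.0 ** 16, scale_factor: float = 2.0,
                 scale_window: int = 1000) -> None:
        self.loss_scale = loss_scale
        self.scale_factor = scale_factor
        self.scale_window = scale_window
        self.good_steps = 0

    def update(self, overflow: bool) -> bool:
        """Adjust the scale; return True when the optimizer step should be skipped."""
        if overflow:
            # Gradients overflowed: shrink the scale, but never below 1.
            self.loss_scale = max(self.loss_scale / self.scale_factor, 1.0)
            self.good_steps = 0
            return True
        self.good_steps += 1
        if self.good_steps >= self.scale_window:
            # Enough clean steps in a row: try a larger scale again.
            self.loss_scale *= self.scale_factor
            self.good_steps = 0
        return False


scaler = LossScaleSketch(loss_scale=8.0, scale_factor=2.0, scale_window=2)
skipped = scaler.update(overflow=True)
```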

class mindtext.modules.encoder.xlnet.XLNetForClassification (model: XLNetModel, config: XLNetConfig, num_class: int = 2, loss: Optional[nn.Cell] = None)

XLNet fine-tuning module for classification.

__init__ (model: XLNetModel, config: XLNetConfig, num_class: int = 2, loss: Optional[nn.Cell] = None)

Parameters

model (nn.Cell): The XLNet model.
config (XLNetConfig): The model configuration.
num_class (int): Number of classes.
loss (Optional[nn.Cell]): The loss function.

construct (input_ids, attention_mask, token_type_ids, label: Optional[Tensor] = None)

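Parameters

input_ids (mindspore.Tensor): Token index sequence of the input.
attention_mask (mindspore.Tensor): Attention mask of the input.
token_type_ids (mindspore.Tensor): Token type (segment) ids of the input sequence.
label (Optional[Tensor]): Label indices.

Returns

loss (Tuple[Tensor, Tensor]): The loss.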
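
The page does not say which loss is used when the loss argument is left as None. For a num_class-way classifier, a softmax cross-entropy over the logits is the usual choice, so the sketch below computes that for a single example in plain Python. Treating it as this module's loss is an assumption, and the function name is hypothetical.

```python
import math
from typing import List


def softmax_cross_entropy(logits: List[float], label: int) -> float:
    """Negative log-probability of `label` under softmax(logits)."""
    max_logit = max(logits)
    # Subtract the maximum before exponentiating for numerical stability.
    log_sum_exp = max_logit + math.log(sum(math.exp(v - max_logit) for v in logits))
    return log_sum_exp - logits[label]


# Two classes with equal logits: each class has probability 0.5.
loss = softmax_cross_entropy([0.0, 0.0], label=1)
```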