Yuqiang Xie



Paper Notes: Towards a Deep and Unified Understanding of Deep Neural Models in NLP [Neural Network Interpretability]

Posted on 2019-10-07 | In Paper Notes, Interpretability

From SJTU, MSRA and PKU.
Authors: Chaoyu Guan, Xiting Wang, Quanshi Zhang, Runjin Chen, Di He, Xing Xie.
Title: Towards a Deep and Unified Understanding of Deep Neural Models in NLP
In: ICML 2019.
Codes: icml2019paper2428/Towards-A-Deep-and-Unified-Understanding-of-Deep-Neural-Models-in-NLP

Introduction

A while ago I read Prof. Quanshi Zhang's article introducing this paper and found it very interesting. I had some related ideas of my own but had never put them into practice, so I got a lot out of this paper.


Paper Notes: COMET: Commonsense Transformers for Automatic Knowledge Graph Construction

Posted on 2019-07-03 | In Paper Notes, Commonsense

From Allen Institute for Artificial Intelligence and Microsoft Research.

Authors: Antoine Bosselut, Hannah Rashkin, Maarten Sap, Chaitanya Malaviya, Asli Celikyilmaz, Yejin Choi

Title: COMET: Commonsense Transformers for Automatic Knowledge Graph Construction

In: ACL, 2019.

Introduction

This ACL 2019 paper from the Allen Institute is about automatic commonsense knowledge base construction. The authors propose Commonsense Transformers (COMET), a generative model whose backbone is a Transformer language model; it is pre-trained on a seed set of knowledge tuples selected from the ATOMIC and ConceptNet knowledge bases, so that the model can then construct commonsense knowledge automatically. The Allen Institute also provides a demo and code. The demo is quite fun: you enter an event (with participants) and it returns a commonsense knowledge graph.
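As a concrete illustration of the training setup, here is a minimal sketch of how seed tuples might be flattened into text sequences for a Transformer language model. The tuples, relation markers, and formatting below are my own illustration, not the paper's exact preprocessing.

```python
# Illustrative only: flatten ATOMIC-style seed tuples into LM training sequences.
# The model is trained to generate the target phrase given the subject event and
# relation; the <relation> and <END> markers are hypothetical placeholders.
seed_tuples = [
    ("PersonX goes to the store", "xIntent", "to buy food"),
    ("PersonX goes to the store", "xNeed", "to have money"),
]

def to_sequence(subject, relation, target):
    # The LM conditions on "subject <relation>" and learns to produce the target.
    return f"{subject} <{relation}> {target} <END>"

for s, r, t in seed_tuples:
    print(to_sequence(s, r, t))
```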


Paper Notes: Transformer-XL

Posted on 2019-06-25 | In Paper Notes, Pre-trained LM

From Google Brain and CMU.

Authors: Zihang Dai∗, Zhilin Yang∗, Yiming Yang, Jaime Carbonell, Quoc V. Le, Ruslan Salakhutdinov

Title: Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context.

In: ACL, 2019.

Introduction

To help understand XLNet [4], this post walks through its core framework, Transformer-XL. The paper was published at ACL 2019, and the problem it tackles is how to give the encoder the ability to capture long-range dependencies. In NLP today, the Transformer's encoding power has surpassed RNNs, but its ability to model long-range dependencies is still limited. In LSTM-based models, gating mechanisms and gradient clipping were introduced to model long-range dependencies, and the longest distance such models can encode is around 200 tokens. Transformer-based models allow direct self-attention between words and thus capture long-term dependencies better, but they are still limited by the fixed-length context.
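To make the core idea concrete, below is a minimal PyTorch sketch of segment-level recurrence: hidden states from the previous segment are cached with gradients stopped and concatenated to the current segment as extra attention context. This is my own simplification, not the authors' code; it omits Transformer-XL's relative positional encodings and proper memory-length management.

```python
import torch
import torch.nn as nn

class RecurrentSelfAttention(nn.Module):
    """Self-attention over the current segment plus a cached previous segment."""
    def __init__(self, d_model=64, n_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x, memory=None):
        # x:      (batch, seg_len, d_model)  current segment
        # memory: (batch, mem_len, d_model)  cached states of the previous segment
        if memory is not None:
            context = torch.cat([memory.detach(), x], dim=1)  # no gradient into the cache
        else:
            context = x
        out, _ = self.attn(query=x, key=context, value=context)
        new_memory = x.detach()  # becomes the cache for the next segment
        return out, new_memory

# Usage: feed a long sequence segment by segment, carrying the memory forward.
layer = RecurrentSelfAttention()
memory = None
for segment in torch.randn(3, 2, 8, 64):  # 3 segments, batch 2, length 8
    out, memory = layer(segment, memory)
```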


Paper Notes: XLNet: Generalized Autoregressive Pretraining for Language Understanding

Posted on 2019-06-21 | In Paper Notes, Pre-trained LM

From Google Brain and CMU.

Authors: Zhilin Yang∗, Zihang Dai∗, Yiming Yang, Jaime Carbonell, Ruslan Salakhutdinov, Quoc V. Le

Title: XLNet: Generalized Autoregressive Pretraining for Language Understanding.

Preprint, 2019-06-20.

Introduction

This paper builds on Transformer-XL (the authors' ACL 2019 work). Readers familiar with Transformer-XL will know that its encoding scheme is already a big improvement and handles long text better than the vanilla Transformer. This paper introduces the Permutation Language Model (PLM) in place of BERT's masked LM [Permutation: a way, especially one of several possible variations, in which a set or number of things can be ordered or arranged], then introduces Masked Two-Stream Self-Attention to solve the target-prediction problem that PLM creates [see Motivation], and finally pre-trains on roughly three times as much data as BERT, topping the leaderboards on SQuAD, GLUE, RACE, and more.
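To make the PLM idea concrete, here is a small sketch that samples a factorization order and builds the corresponding attention mask, so each token may only attend to tokens that precede it in the sampled order. This is my own illustration rather than XLNet's implementation, and it leaves out the two-stream attention entirely.

```python
# Illustrative only: sample a factorization order and build the permutation mask.
import torch

def permutation_mask(seq_len):
    order = torch.randperm(seq_len)               # random factorization order
    rank = torch.empty(seq_len, dtype=torch.long)
    rank[order] = torch.arange(seq_len)           # rank[i] = position of token i in the order
    # mask[i, j] is True when token i may attend to token j,
    # i.e. token j comes strictly earlier in the sampled order.
    mask = rank.unsqueeze(1) > rank.unsqueeze(0)
    return order, mask

order, mask = permutation_mask(5)
print("factorization order:", order.tolist())
print(mask.int())
```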


Learn python with socratica [My notes] - Part 15 - Logging

Posted on 2019-06-20 | In Learn python with socratica

Lesson 17

Introduction

Logging means keeping a record, i.e., a way of producing logs. While a program runs, the logging module can record everything, or only what you need. This matters a lot to developers: with a good program, whenever something goes wrong you know exactly where it happened and what the problem was.
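A short example of the logging module in action: configure a logger that writes timestamped records to a file, then emit messages at the standard severity levels. The file name and format string are just one sensible choice.

```python
import logging

# Write timestamped records at DEBUG level and above to app.log.
logging.basicConfig(
    filename="app.log",
    level=logging.DEBUG,
    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
)
logger = logging.getLogger(__name__)

logger.debug("Detailed diagnostic information")
logger.info("Normal program flow")
logger.warning("Something unexpected, but the program keeps running")
logger.error("A serious problem: some functionality failed")
logger.critical("The program may not be able to continue")
```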
