Yuqiang Xie



Paper Notes: Towards a Deep and Unified Understanding of Deep Neural Models in NLP [Neural Network Interpretability]

Posted on 2019-10-07 | In Paper Notes, Interpretability

From SJTU, MSRA and PKU.
Authors: Chaoyu Guan, Xiting Wang, Quanshi Zhang, Runjin Chen, Di He, Xing Xie.
Title: Towards a Deep and Unified Understanding of Deep Neural Models in NLP
In: ICML 2019.
Codes: icml2019paper2428/Towards-A-Deep-and-Unified-Understanding-of-Deep-Neural-Models-in-NLP

Introduction

A while ago I read Prof. Quanshi Zhang's article introducing this paper and found it very interesting. I had some related ideas of my own but had never put them into practice, so I got a lot out of this paper.


Paper Notes: COMET: Commonsense Transformers for Automatic Knowledge Graph Construction

Posted on 2019-07-03 | In Paper Notes, Commonsense

From Allen Institute for Artificial Intelligence and Microsoft Research.

Authors: Antoine Bosselut, Hannah Rashkin, Maarten Sap, Chaitanya Malaviya, Asli Celikyilmaz, Yejin Choi

Title: COMET: Commonsense Transformers for Automatic Knowledge Graph Construction

In: ACL, 2019.

Introduction

This ACL 2019 paper from the Allen Institute is about automatic commonsense knowledge base construction. The authors propose Commonsense Transformers (COMET), a generative model whose backbone is a Transformer language model; it is pre-trained on a seed set of knowledge tuples selected from the ATOMIC and ConceptNet knowledge bases, so that the model can then construct commonsense knowledge automatically. The Allen Institute also provides a demo and code. The demo is quite fun: you enter an event (with participants) and it returns a commonsense knowledge graph.
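As a concrete illustration of the training setup, here is a minimal sketch of how seed tuples might be flattened into text sequences for a Transformer language model. The tuples, relation markers, and formatting below are my own illustration, not the paper's exact preprocessing.

```python
# Illustrative only: flatten ATOMIC-style seed tuples into LM training sequences.
# The model is trained to generate the target phrase given the subject event and
# relation; the <relation> and <END> markers are hypothetical placeholders.
seed_tuples = [
    ("PersonX goes to the store", "xIntent", "to buy food"),
    ("PersonX goes to the store", "xNeed", "to have money"),
]

def to_sequence(subject, relation, target):
    # The LM conditions on "subject <relation>" and learns to produce the target.
    return f"{subject} <{relation}> {target} <END>"

for s, r, t in seed_tuples:
    print(to_sequence(s, r, t))
```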


Paper Notes: Transformer-XL

Posted on 2019-06-25 | In Paper Notes, Pre-trained LM

From Google Brain and CMU.

Authors: Zihang Dai∗, Zhilin Yang∗, Yiming Yang, Jaime Carbonell, Quoc V. Le, Ruslan Salakhutdinov

Title: Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context.

In: ACL, 2019.

Introduction

To help understand XLNet [4], this post walks through its core framework, Transformer-XL. The paper was published at ACL 2019, and the problem it tackles is how to give the encoder the ability to capture long-range dependencies. In NLP today, the Transformer's encoding power has surpassed RNNs, but its ability to model long-range dependencies is still limited. In LSTM-based models, gating mechanisms and gradient clipping were introduced to model long-range dependencies, and the longest distance such models can encode is around 200 tokens. Transformer-based models allow direct self-attention between words and thus capture long-term dependencies better, but they are still limited by the fixed-length context.
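To make the core idea concrete, below is a minimal PyTorch sketch of segment-level recurrence: hidden states from the previous segment are cached with gradients stopped and concatenated to the current segment as extra attention context. This is my own simplification, not the authors' code; it omits Transformer-XL's relative positional encodings and proper memory-length management.

```python
import torch
import torch.nn as nn

class RecurrentSelfAttention(nn.Module):
    """Self-attention over the current segment plus a cached previous segment."""
    def __init__(self, d_model=64, n_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x, memory=None):
        # x:      (batch, seg_len, d_model)  current segment
        # memory: (batch, mem_len, d_model)  cached states of the previous segment
        if memory is not None:
            context = torch.cat([memory.detach(), x], dim=1)  # no gradient into the cache
        else:
            context = x
        out, _ = self.attn(query=x, key=context, value=context)
        new_memory = x.detach()  # becomes the cache for the next segment
        return out, new_memory

# Usage: feed a long sequence segment by segment, carrying the memory forward.
layer = RecurrentSelfAttention()
memory = None
for segment in torch.randn(3, 2, 8, 64):  # 3 segments, batch 2, length 8
    out, memory = layer(segment, memory)
```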


Paper Notes: XLNet: Generalized Autoregressive Pretraining for Language Understanding

Posted on 2019-06-21 | In Paper Notes, Pre-trained LM

From Google Brain and CMU.

Authors: Zhilin Yang∗, Zihang Dai∗, Yiming Yang, Jaime Carbonell, Ruslan Salakhutdinov, Quoc V. Le

Title: XLNet: Generalized Autoregressive Pretraining for Language Understanding.

Preprint, 2019-06-20.

Introduction

This paper builds on Transformer-XL (the authors' ACL 2019 work). Readers familiar with Transformer-XL will know that its encoding scheme is already a big improvement and handles long text better than the vanilla Transformer. This paper introduces the Permutation Language Model (PLM) in place of BERT's masked LM [Permutation: a way, especially one of several possible variations, in which a set or number of things can be ordered or arranged], then introduces Masked Two-Stream Self-Attention to solve the target-prediction problem that PLM creates [see Motivation], and finally pre-trains on roughly three times as much data as BERT, topping the leaderboards on SQuAD, GLUE, RACE, and more.
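To make the PLM idea concrete, here is a small sketch that samples a factorization order and builds the corresponding attention mask, so each token may only attend to tokens that precede it in the sampled order. This is my own illustration rather than XLNet's implementation, and it leaves out the two-stream attention entirely.

```python
# Illustrative only: sample a factorization order and build the permutation mask.
import torch

def permutation_mask(seq_len):
    order = torch.randperm(seq_len)               # random factorization order
    rank = torch.empty(seq_len, dtype=torch.long)
    rank[order] = torch.arange(seq_len)           # rank[i] = position of token i in the order
    # mask[i, j] is True when token i may attend to token j,
    # i.e. token j comes strictly earlier in the sampled order.
    mask = rank.unsqueeze(1) > rank.unsqueeze(0)
    return order, mask

order, mask = permutation_mask(5)
print("factorization order:", order.tolist())
print(mask.int())
```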


Learn python with socratica [My notes] - Part 15 - Logging

Posted on 2019-06-20 | In Learn python with socratica

Lesson 17

Introduction

Logging means keeping a record, i.e., a way of producing logs. While a program runs, the logging module can record everything, or only what you need. This matters a lot to developers: with a good program, whenever something goes wrong you know exactly where it happened and what the problem was.
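A short example of the logging module in action: configure a logger that writes timestamped records to a file, then emit messages at the standard severity levels. The file name and format string are just one sensible choice.

```python
import logging

# Write timestamped records at DEBUG level and above to app.log.
logging.basicConfig(
    filename="app.log",
    level=logging.DEBUG,
    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
)
logger = logging.getLogger(__name__)

logger.debug("Detailed diagnostic information")
logger.info("Normal program flow")
logger.warning("Something unexpected, but the program keeps running")
logger.error("A serious problem: some functionality failed")
logger.critical("The program may not be able to continue")
```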
