ShanghaiTech University Knowledge Management System
Probabilistic Transformer: A Probabilistic Dependency Model for Contextual Word Representation
2023-07
Proceedings Title | Findings of the Association for Computational Linguistics: ACL 2023
ISSN | 0736-587X |
Pages | 7613–7636
Publication Status | Published
DOI | 10.18653/v1/2023.findings-acl.482 |
Abstract | Syntactic structures used to play a vital role in natural language processing (NLP), but since the deep learning revolution, NLP has been gradually dominated by neural models that do not consider syntactic structures in their design. One vastly successful class of neural models is transformers. When used as an encoder, a transformer produces contextual representations of words in the input sentence. In this work, we propose a new model of contextual word representation, not from a neural perspective, but from a purely syntactic and probabilistic perspective. Specifically, we design a conditional random field that models discrete latent representations of all words in a sentence as well as dependency arcs between them, and we use mean field variational inference for approximate inference. Strikingly, we find that the computation graph of our model resembles transformers, with correspondences between dependencies and self-attention and between distributions over latent representations and contextual embeddings of words. Experiments show that our model performs competitively with transformers on small to medium-sized datasets. We hope that our work can help bridge the gap between traditional syntactic and probabilistic approaches and cutting-edge neural approaches to NLP, and inspire more linguistically principled neural approaches in the future. [See the code sketch following this record.]
Proceedings Editors / Conference Sponsors | Association for Computational Linguistics; Bloomberg; Google Research; LIVEPERSON; Meta; Microsoft; et al.
Keywords | Deep learning; Natural language processing systems; Structural design; Contextual words; Dependency model; Language processing; Mean-field; Natural languages; Neural modelling; Probabilistics; Random fields; Syntactic structure; Word representations
Conference Name | ACL 2023
Place of Publication | Toronto, Canada
Conference Venue | Toronto, Canada
Conference Date | 2023-07
Discipline | Engineering :: Computer Science and Technology (may confer Engineering or Science degrees)
URL | View original
Indexed By | EI
Language | English
Publisher | Association for Computational Linguistics (ACL)
EI Accession Number | 20234515012242
EI Subject Terms | Syntactics
EI Classification Codes | 408.1 Structural Design, General; 461.4 Ergonomics and Human Factors Engineering; 723.2 Data Processing and Image Processing
Original Document Type | Conference article (CA)
Document Type | Conference paper
Item Identifier | https://kms.shanghaitech.edu.cn/handle/2MSLDSTB/345942
Collection | School of Information Science and Technology_Master's Students; School of Information Science and Technology_PI Research Groups_Kewei Tu Group
Corresponding Author | Kewei Tu
Author Affiliations | 1. School of Information Science and Technology, ShanghaiTech University; 2. Shanghai Engineering Research Center of Intelligent Vision and Imaging
First Author's Affiliation | School of Information Science and Technology
Corresponding Author's Affiliation | School of Information Science and Technology
First Author's First Affiliation | School of Information Science and Technology
Recommended Citation (GB/T 7714) | Haoyi Wu, Kewei Tu. Probabilistic Transformer: A Probabilistic Dependency Model for Contextual Word Representation[C]//Association for Computational Linguistics, Bloomberg, Google Research, LIVEPERSON, Meta, Microsoft, et al. Toronto, Canada: Association for Computational Linguistics (ACL), 2023: 7613–7636.
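
Code sketch. The abstract's analogy between mean-field inference and self-attention can be made concrete with a small illustration. The Python/NumPy sketch below is a hedged assumption, not the authors' released implementation or exact parameterization: the function name mean_field_updates, the single label-compatibility matrix psi, and the per-word head-selection softmax are hypothetical simplifications of the CRF the abstract describes (discrete latent labels per word plus dependency arcs, updated by mean field variational inference).

    import numpy as np

    def softmax(x, axis=-1):
        # Numerically stable softmax along the given axis.
        x = x - x.max(axis=axis, keepdims=True)
        e = np.exp(x)
        return e / e.sum(axis=axis, keepdims=True)

    def mean_field_updates(unary, psi, n_iters=8):
        """Mean-field marginals for a CRF over discrete word labels and
        dependency arcs (a hypothetical simplification of the model).

        unary : (n, L) unary scores for each word's latent label.
        psi   : (L, L) compatibility between a dependent's label (rows)
                and its head's label (cols) when the arc is present.
        Returns (q_z, q_a): label marginals (n, L) and arc marginals
        (n, n), where q_a[i, j] approximates P(head of word i is word j).
        """
        n, L = unary.shape
        q_z = softmax(unary)              # init label beliefs from unaries
        q_a = np.full((n, n), 1.0 / n)    # init arc beliefs uniformly

        for _ in range(n_iters):
            # Arc update: expected label compatibility under current
            # beliefs; s[i, j] = E_q[psi(z_i, z_j)] plays the role of
            # attention logits.
            s = q_z @ psi @ q_z.T
            np.fill_diagonal(s, -np.inf)  # a word cannot head itself
            q_a = softmax(s)              # each word softly picks a head

            # Label update: unary score plus arc-weighted messages from
            # the word's likely heads and from its likely dependents.
            msg_from_heads = q_a @ (q_z @ psi.T)
            msg_from_deps = q_a.T @ (q_z @ psi)
            q_z = softmax(unary + msg_from_heads + msg_from_deps)

        return q_z, q_a

    # Toy usage with random potentials: 6 words, 16 latent labels.
    rng = np.random.default_rng(0)
    q_z, q_a = mean_field_updates(rng.normal(size=(6, 16)),
                                  rng.normal(size=(16, 16)))
    print(q_z.shape, q_a.shape)           # (6, 16) (6, 6)

Each iteration first recomputes arc beliefs from expected label compatibility (the analogue of attention weights in the abstract's correspondence) and then refines every word's label distribution with messages weighted by those beliefs (the analogue of contextual word embeddings).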