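Department of Computer Science and Technology, Nanjing University
Collaborative Innovation Center of Novel Software Technology and Industrialization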
Abstract:
Existing studies on question answering on knowledge bases (KBQA) mainly operate
under the standard i.i.d. assumption, i.e., the training distribution over
questions is the same as the test distribution. However, i.i.d. may be neither
reasonably achievable nor desirable on large-scale KBs because 1) the true user
distribution is hard to capture and 2) randomly sampling training examples from
the enormous space would be highly data-inefficient. Instead, we suggest that
KBQA models should have three levels of built-in generalization: i.i.d.,
compositional, and zero-shot. To
facilitate the development of KBQA models with stronger generalization, we
construct and release a new large-scale, high-quality dataset with 64,331
questions, GrailQA, and provide evaluation settings for all
three levels of generalization. In addition, we propose a novel BERT-based KBQA
model. The combination of our dataset and model enables us to thoroughly
examine and demonstrate, for the first time, the key role of pre-trained
contextual embeddings like BERT in the generalization of KBQA.
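About the speaker:
Yu Gu (谷雨) is currently a Ph.D. student at The Ohio State University, advised by Yu Su. He received his bachelor's and master's degrees from the Department of Computer Science and Technology at Nanjing University, where he worked in the Websoft research group under the supervision of Gong Cheng (程龚) on intelligent question answering and semantic search. His current research area is natural language processing, with a main interest in semantic parsing and natural language processing over knowledge bases. He has published papers at conferences including WSDM, WWW, and ACL. In this talk, he will focus on GrailQA, a brand-new dataset for knowledge base question answering (KBQA), and discuss recent trends in KBQA and its future directions.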
Time: Monday, November 8, 18:45
Venue: Room 229, Computer Science and Technology Building