Article
PASTA: Table-Operations Aware Fact Verification via Sentence-Table Cloze Pre-training
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022
  • Zihui Gu, Renmin University of China
  • Ju Fan, Renmin University of China
  • Nan Tang, Qatar Computing Research Institute
  • Preslav Nakov, Mohamed Bin Zayed University of Artificial Intelligence
  • Xiaoman Zhao, Renmin University of China
  • Xiaoyong Du, Renmin University of China
Document Type
Conference Proceeding
Abstract

Fact verification has attracted a lot of research attention recently, e.g., in journalism, marketing, and policymaking, as misinformation and disinformation online can sway one's opinion and affect one's actions. While fact-checking is a hard task in general, in many cases, false statements can be easily debunked based on analytics over tables with reliable information. Hence, table-based fact verification has recently emerged as an important and growing research area. Yet, progress has been limited due to the lack of datasets that can be used to pre-train language models (LMs) to be aware of common table operations, such as aggregating a column or comparing tuples. To bridge this gap, in this paper we introduce PASTA, a novel state-of-the-art framework for table-based fact verification via pre-training with synthesized sentence-table cloze questions. In particular, we design six types of common sentence-table cloze tasks, including Filter, Aggregation, Superlative, Comparative, Ordinal, and Unique, based on which we synthesize a large corpus consisting of 1.2 million sentence-table pairs from WikiTables. PASTA uses a recent pre-trained LM, DeBERTaV3, and further pre-trains it on our corpus. Our experimental results show that PASTA achieves new state-of-the-art performance on two table-based fact verification benchmarks: TabFact and SEM-TAB-FACTS. In particular, on the complex set of TabFact, which contains multiple operations, PASTA largely outperforms the previous state of the art by 4.7 points (85.6% vs. 80.9%), and the gap between PASTA and human performance on the small TabFact test set is narrowed to just 1.5 points (90.6% vs. 92.1%).
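To make the idea of sentence-table cloze synthesis concrete, the following is a minimal, hypothetical sketch (not the authors' code) of how two of the six operation types named in the abstract, Superlative and Aggregation, might turn a table into a masked sentence plus its answer. The toy table, the sentence templates, and the `[MASK]` token are all illustrative assumptions.

```python
# Hypothetical sketch of sentence-table cloze synthesis for two of the
# six operation types (Superlative, Aggregation). Table contents,
# templates, and the mask token are illustrative, not from the paper.

MASK = "[MASK]"

def superlative_cloze(table, column):
    """Superlative cloze: mask the entity whose `column` value is largest."""
    best = max(table, key=lambda row: row[column])
    sentence = f"{MASK} has the highest {column} of {best[column]}."
    answer = best["name"]
    return sentence, answer

def aggregation_cloze(table, column):
    """Aggregation cloze: mask the sum over `column`."""
    total = sum(row[column] for row in table)
    sentence = f"The total {column} across all rows is {MASK}."
    return sentence, str(total)

# A toy table standing in for a WikiTables table.
table = [
    {"name": "Alice", "score": 88},
    {"name": "Bob", "score": 95},
    {"name": "Carol", "score": 79},
]

s1, a1 = superlative_cloze(table, "score")  # answer: "Bob"
s2, a2 = aggregation_cloze(table, "score")  # answer: "262"
```

Pairs like these, generated at scale, are what an LM such as DeBERTaV3 would be further pre-trained on: it must fill the mask by reasoning over the table, which is the table-operation awareness the abstract describes.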

DOI
10.18653/v1/2022.emnlp-main.331
Publication Date
12-1-2022
Comments

Archived with thanks to ACL Anthology

Preprint License: CC BY 4.0 DEED

Uploaded 27 November 2023

Citation Information
Z. Gu, J. Fan, N. Tang, P. Nakov, X. Zhao, and X. Du, "PASTA: Table-Operations Aware Fact Verification via Sentence-Table Cloze Pre-training", in Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, ACL, pp. 4971–4983, Dec 2022. doi:10.18653/v1/2022.emnlp-main.331