Adapt and Decompose: Efficient Generalization of Text-to-SQL via Domain Adapted Least-To-Most Prompting

Aseem Arora, Shabbirhussain Bhaisaheb, Harshit Nigam, Manasi Patwardhan, Lovekesh Vig, Gautam Shroff


Abstract
Cross-domain and cross-compositional generalization of Text-to-SQL semantic parsing is a challenging task. Existing Large Language Model (LLM) based solutions rely on inference-time retrieval of few-shot exemplars from the training set to synthesize a run-time prompt for each Natural Language (NL) test query. In contrast, we devise an algorithm which performs offline sampling of a minimal set of few-shot exemplars from the training data, with complete coverage of SQL clauses, operators, and functions, and maximal domain coverage within the allowed token length. This allows for the synthesis of a fixed Generic Prompt (GP) with a diverse set of exemplars common across NL test queries, avoiding expensive test-time exemplar retrieval. We further auto-adapt the GP to the target database domain (DA-GP) to better handle cross-domain generalization, followed by decomposed Least-To-Most Prompting (LTMP-DA-GP) to handle cross-compositional generalization. Synthesizing LTMP-DA-GP is an offline task, performed once per new database with minimal human intervention. Our approach demonstrates superior performance on the KaggleDBQA dataset, designed to evaluate generalizability for the Text-to-SQL task. We further showcase consistent performance improvement of LTMP-DA-GP over GP across LLMs and databases of KaggleDBQA, highlighting the efficacy and model-agnostic benefits of our prompt-based adapt-and-decompose approach.
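To make the offline sampling step concrete, here is a minimal Python sketch of one way such exemplar selection could work: a greedy set cover over SQL clauses, operators, and functions, with a domain-diversity tie-break, under a token budget. The SQL_FEATURES list, the (nl, sql, domain) record layout, and the count_tokens callback are illustrative assumptions, not the authors' exact algorithm.

```python
# Illustrative sketch only: greedy, coverage-driven exemplar sampling in the
# spirit of the abstract. The feature list, record fields, and token counting
# are hypothetical stand-ins for the paper's actual procedure.
import re

# SQL clauses, operators, and aggregate functions the few-shot set should cover.
SQL_FEATURES = [
    "SELECT", "WHERE", "GROUP BY", "HAVING", "ORDER BY", "LIMIT",
    "JOIN", "UNION", "INTERSECT", "EXCEPT", "LIKE", "IN", "EXISTS",
    "COUNT(", "SUM(", "AVG(", "MIN(", "MAX(", "DISTINCT",
]

def features(sql: str) -> set:
    """Return the subset of SQL_FEATURES appearing (word-bounded) in a SQL string."""
    upper = sql.upper()
    return {f for f in SQL_FEATURES if re.search(r"\b" + re.escape(f), upper)}

def sample_exemplars(train, token_budget, count_tokens):
    """Greedily pick training examples (dicts with 'nl', 'sql', 'domain')
    until every feature is covered or the token budget is spent."""
    uncovered = set(SQL_FEATURES)
    seen_domains, picked, used = set(), [], 0
    candidates = list(train)
    while uncovered and candidates:
        # Rank by newly covered features; break ties toward unseen domains.
        best = max(
            candidates,
            key=lambda ex: (len(features(ex["sql"]) & uncovered),
                            ex["domain"] not in seen_domains),
        )
        if not features(best["sql"]) & uncovered:
            break  # nothing left adds coverage
        candidates.remove(best)
        cost = count_tokens(best["nl"] + best["sql"])
        if used + cost > token_budget:
            continue  # exemplar too long; try the next-best one
        picked.append(best)
        used += cost
        uncovered -= features(best["sql"])
        seen_domains.add(best["domain"])
    return picked
```

Under a scheme like this, the picked exemplars would be concatenated into the fixed Generic Prompt (GP); adapting it to a new database (DA-GP) and layering Least-To-Most decomposition on top (LTMP-DA-GP) remain offline, one-time steps, so no per-query retrieval is needed at test time.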
Anthology ID: 2023.genbench-1.3
Volume: Proceedings of the 1st GenBench Workshop on (Benchmarking) Generalisation in NLP
Month: December
Year: 2023
Address: Singapore
Editors: Dieuwke Hupkes, Verna Dankers, Khuyagbaatar Batsuren, Koustuv Sinha, Amirhossein Kazemnejad, Christos Christodoulopoulos, Ryan Cotterell, Elia Bruni
Venues: GenBench | WS
Publisher: Association for Computational Linguistics
Pages: 25–47
URL: https://aclanthology.org/2023.genbench-1.3
DOI: 10.18653/v1/2023.genbench-1.3
Cite (ACL): Aseem Arora, Shabbirhussain Bhaisaheb, Harshit Nigam, Manasi Patwardhan, Lovekesh Vig, and Gautam Shroff. 2023. Adapt and Decompose: Efficient Generalization of Text-to-SQL via Domain Adapted Least-To-Most Prompting. In Proceedings of the 1st GenBench Workshop on (Benchmarking) Generalisation in NLP, pages 25–47, Singapore. Association for Computational Linguistics.
Cite (Informal): Adapt and Decompose: Efficient Generalization of Text-to-SQL via Domain Adapted Least-To-Most Prompting (Arora et al., GenBench-WS 2023)
PDF: https://aclanthology.org/2023.genbench-1.3.pdf