Abstract
This study explores how large language models (LLMs) can generate synthetic survey data to make construct validation and survey pretesting faster and more reliable. We propose a strategy and a framework for designing prompts and personas that enable LLMs to simulate human responses. We apply the framework to three validated survey instruments and collect ChatGPT-generated answers. Using this synthetic data, we conduct widely used validity tests and perform structural estimation. We find that LLMs can replicate human behavior and validate instruments developed for the organizational setting in a way consistent with theory, addressing challenges in survey design, construct validity, reliability, and generalizability. Despite the limitations that we document, we posit that LLMs can help evaluate the epistemic relationships of constructs and that synthetic data can complement real-world data to advance the rigor and efficiency of survey-based research.
Original language | English |
---|---|
Publisher | SSRN |
Publication status | Published - 6 Nov 2023 |
Datasets
- Replication Data for: Charting New Territory: Using Large Language Models to Enhance Survey Instruments in the Organizational Setting
Motoki, F. (Creator), Monteiro, J. (Creator), Malagueno de Santana, R. (Creator) & Rodrigues, V. (Creator), Harvard Dataverse, 13 Sep 2024
DOI: 10.7910/DVN/RQGKHW