Abstract
This study examines the potential of large language models (LLMs) to generate synthetic survey data for organizational research. We propose a structured framework for prompting LLMs to simulate human-like responses, incorporating persona creation and impulse variables to enhance variability. Using previously validated constructs in the organizational deviance literature, we assess whether synthetic data exhibits response patterns aligned with theoretical expectations. Our findings suggest that LLMs can produce structured data that approximates real-world constructs, paving the way for more efficient pre-testing and refinement of survey instruments. While our results highlight promising potential, they also reveal key challenges, including response homogeneity, overestimated reliability, and model-specific biases. Ethical considerations, such as bias propagation and transparency, further emphasize the need for careful application. As LLMs continue to advance, their role in methodological innovation may expand, enabling researchers to explore new avenues in survey-based studies. This study represents a foundational step in integrating synthetic data into organizational research, broadening methodological possibilities while acknowledging current limitations.
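To make the prompting framework concrete, below is a minimal Python sketch of how persona creation and an impulse variable might be combined into a single survey prompt. The `Persona` fields, the per-item `impulse` draw, the 7-point Likert wording, and the `query_llm` stub are all illustrative assumptions for exposition, not the authors' exact implementation.

```python
import random
from dataclasses import dataclass

# Hypothetical sketch of persona-based prompting with an "impulse" variable
# intended to increase response variability. Not the paper's actual code.

@dataclass
class Persona:
    age: int
    role: str
    tenure_years: int
    temperament: str

LIKERT_SCALE = "1 = strongly disagree ... 7 = strongly agree"

def build_prompt(persona: Persona, item: str, impulse: float) -> str:
    """Compose a survey prompt embedding the persona and an impulse value."""
    return (
        f"You are a {persona.age}-year-old {persona.role} with "
        f"{persona.tenure_years} years of tenure; your temperament is "
        f"{persona.temperament}. Your current impulsivity is {impulse:.2f} "
        f"on a 0-1 scale; let it color your answer.\n"
        f"Rate the following statement ({LIKERT_SCALE}).\n"
        f"Statement: {item}\n"
        f"Answer with a single integer from 1 to 7."
    )

def simulate_response(persona: Persona, item: str) -> str:
    impulse = random.random()  # fresh impulse draw per item adds variability
    prompt = build_prompt(persona, item, impulse)
    return query_llm(prompt)   # placeholder for an actual LLM API call

def query_llm(prompt: str) -> str:
    """Stub standing in for a chat-completion call to any LLM provider."""
    raise NotImplementedError("wire up your LLM client here")

if __name__ == "__main__":
    p = Persona(age=34, role="mid-level accountant", tenure_years=6,
                temperament="conscientious but stressed")
    item = "I have taken company property without permission."
    print(build_prompt(p, item, impulse=0.42))
```

Drawing a new impulse value for each item, rather than fixing it per persona, is one plausible way to counter the response homogeneity the abstract flags as a key challenge.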
Original language | English |
---|---|
Publisher | SSRN |
Publication status | Published - 6 Nov 2023 |
Datasets
- Replication Data for: Charting New Territory: Using Large Language Models to Enhance Survey Instruments in the Organizational Setting
Motoki, F. (Creator), Monteiro, J. (Creator), Malagueno de Santana, R. (Creator) & Rodrigues, V. (Creator), Harvard Dataverse, 13 Sept 2024
DOI: 10.7910/DVN/RQGKHW