TY - JOUR
T1 - Detecting shadow lobbying
AU - Slobozhan, Ivan
AU - Ormosi, Peter
AU - Sharma, Rajesh
N1 - Funding Information: This work has received funding from the EU H2020 program under the SoBigData++ project (grant agreement No. 871042).
PY - 2022/4/5
Y1 - 2022/4/5
N2 - Lobbying activity is subject to strict disclosure requirements in the USA. Failure to comply with these requirements can lead to criminal and civil penalties. It is claimed that these tight lobbying disclosure measures have resulted in an increase in ‘underground lobbying’. This research proposes a method to discover non-compliance in lobbying disclosure and gauge the magnitude of underground lobbying. We start from the premise that lobbying changes the text of the bills it targets. If these changes happen to some extent systematically, then the texts of lobbied bills should be discernible from those of non-lobbied bills. We combine the corpus of US legislative bills with a large dataset of lobbying activity to give us a partially labelled dataset, where a positive label indicates a lobbied bill, and the lack of a label indicates either that the bill was not lobbied, or that it was lobbied but the lobbying was not disclosed. To address this partial labelling problem, we first set up a naive classification task, where we assume all unlabelled bills to have a negative label and train a model on a large corpus of US bills. After finding the best-performing model, we design a bagging method and collect out-of-fold predictions to predict, for each unlabelled bill, whether it was lobbied or not. From these predictions, we infer that a sizable number of bills are likely to have been lobbied without this lobbying activity being disclosed. We then investigate how the political affiliation of the sponsoring senators and congressmen relates to these predicted probabilities.
AB - Lobbying activity is subject to strict disclosure requirements in the USA. Failure to comply with these requirements can lead to criminal and civil penalties. It is claimed that these tight lobbying disclosure measures have resulted in an increase in ‘underground lobbying’. This research proposes a method to discover non-compliance in lobbying disclosure and gauge the magnitude of underground lobbying. We start from the premise that lobbying changes the text of the bills it targets. If these changes happen to some extent systematically, then the texts of lobbied bills should be discernible from those of non-lobbied bills. We combine the corpus of US legislative bills with a large dataset of lobbying activity to give us a partially labelled dataset, where a positive label indicates a lobbied bill, and the lack of a label indicates either that the bill was not lobbied, or that it was lobbied but the lobbying was not disclosed. To address this partial labelling problem, we first set up a naive classification task, where we assume all unlabelled bills to have a negative label and train a model on a large corpus of US bills. After finding the best-performing model, we design a bagging method and collect out-of-fold predictions to predict, for each unlabelled bill, whether it was lobbied or not. From these predictions, we infer that a sizable number of bills are likely to have been lobbied without this lobbying activity being disclosed. We then investigate how the political affiliation of the sponsoring senators and congressmen relates to these predicted probabilities.
KW - Corruption
KW - Lobbying
KW - Lobbying disclosure
KW - Machine learning
UR - http://www.scopus.com/inward/record.url?scp=85127816980&partnerID=8YFLogxK
U2 - 10.1007/s13278-022-00875-y
DO - 10.1007/s13278-022-00875-y
M3 - Article
VL - 12
JO - Social Network Analysis and Mining
JF - Social Network Analysis and Mining
SN - 1869-5469
M1 - 48
ER -