I am currently a PhD candidate in Economics at Nanyang Technological University. My research applies machine learning and natural language processing within standard applied econometric analyses. I use Stata for the econometrics and Python for everything else.
In this paper I explore the use of text processing, machine learning (ML), and natural language processing (NLP) to assemble a panel of direct quotations from Singapore parliamentary speeches that appear in The Straits Times. In particular, I use NLP methods to generate measures of coverage accuracy, and an unsupervised ML method (Latent Dirichlet Allocation) to generate controls for the topical content of the political speeches and the news articles that report them. Conditional on the observables, I find the coverage of opposition speeches to be less accurate than that of ruling-party speeches. In addition to the usual robustness tests, I provide evidence against the possibility that the observed differences in coverage accuracy occur because of differences in language competency. While the finding that the opposition receives less accurate coverage cannot be unambiguously interpreted as causal, I provide two arguments suggesting that the estimates in this paper are lower bounds on the true magnitude of the effect of opposition status on political media coverage. To the best of my knowledge, this paper is the first that attempts to detect media slant by focusing on coverage accuracy rather than coverage intensity.
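To make the idea of a quotation-level accuracy measure concrete, here is a minimal sketch using only the Python standard library. It is an illustration under stated assumptions, not the measure used in the paper: the function `quote_accuracy`, the word-level tokenizer, and the example sentences are all hypothetical.

```python
import re
from difflib import SequenceMatcher


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def quote_accuracy(quote, speech):
    """Share of the quoted words that SequenceMatcher aligns, in order,
    with words in the original speech.

    Hypothetical illustrative measure: 1.0 for a quote that appears word
    for word in the speech, lower when words are changed or inserted.
    """
    quote_tokens = tokenize(quote)
    speech_tokens = tokenize(speech)
    if not quote_tokens:
        return 0.0
    # autojunk=False disables the popularity heuristic for long sequences.
    matcher = SequenceMatcher(None, quote_tokens, speech_tokens, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(quote_tokens)


# Invented sentences, for illustration only.
speech = "Mr Speaker, we must lower the cost of housing for young families."
verbatim = "we must lower the cost of housing"
altered = "we must reduce the cost of housing"

print(quote_accuracy(verbatim, speech))  # verbatim quote
print(quote_accuracy(altered, speech))   # one word substituted
```

A score like this would then serve as the dependent variable in a regression on opposition status and the topic controls.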
I exploit the plausibly exogenous increase in women's representation in the parliament of Singapore over the period 2000–17, and find that this increase predicts a similar increase in women's representation on the boards of directors of top SGX-listed firms, but only for firms with close government ties, the government-linked companies (GLCs). I interpret this finding as the corporate sector taking cues from the government on representation issues, with the GLCs responding more strongly to these cues. I then use this finding as the first stage of a 2SLS analysis to identify the causal effect of higher women's board representation on tangible firm outcomes such as firm value and leverage.
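The two-stage logic can be sketched on synthetic data with one instrument and one endogenous regressor. This is a toy illustration, not the paper's specification: the data-generating process, the coefficient values, and the names `simulate`, `slope`, and `two_stage_least_squares` are assumptions, and controls and standard errors are omitted. Here `z` plays the role of the instrument, `x` the endogenous regressor, and `y` the outcome.

```python
import random


def mean(values):
    return sum(values) / len(values)


def cov(a, b):
    """Covariance of two equal-length lists (population formula)."""
    mean_a, mean_b = mean(a), mean(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / len(a)


def slope(regressor, outcome):
    """OLS slope from regressing outcome on a single regressor plus a constant."""
    return cov(regressor, outcome) / cov(regressor, regressor)


def two_stage_least_squares(y, x, z):
    """2SLS with one instrument z for one endogenous regressor x."""
    # Stage 1: regress x on z and keep the fitted values.
    first_stage_slope = slope(z, x)
    first_stage_const = mean(x) - first_stage_slope * mean(z)
    x_hat = [first_stage_const + first_stage_slope * zi for zi in z]
    # Stage 2: regress y on the fitted values from stage 1.
    return slope(x_hat, y)


def simulate(n=20000, seed=0):
    """Synthetic data: the true effect of x on y is 2.0, and an unobserved
    confounder u drives both x and y, so plain OLS is biased upward."""
    rng = random.Random(seed)
    y, x, z = [], [], []
    for _ in range(n):
        zi = rng.gauss(0, 1)   # instrument, independent of u
        ui = rng.gauss(0, 1)   # unobserved confounder
        xi = zi + ui + rng.gauss(0, 1)
        yi = 2.0 * xi + 2.0 * ui + rng.gauss(0, 1)
        z.append(zi)
        x.append(xi)
        y.append(yi)
    return y, x, z


y, x, z = simulate()
ols_estimate = slope(x, y)
iv_estimate = two_stage_least_squares(y, x, z)
print("OLS:", ols_estimate)
print("2SLS:", iv_estimate)
```

In this setup the OLS slope absorbs part of the confounder's effect, while the 2SLS estimate uses only the variation in `x` that is driven by `z`.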
I use Python for my research work, and have benefitted extensively from open-source libraries in the Python ecosystem. As a tiny contribution, I wrote Lexical richness, which I use to generate proxies for language sophistication in my research.
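As an illustration of the kind of proxy involved, the sketch below computes the type-token ratio, one of the simplest lexical richness measures: the number of distinct words divided by the total number of words. It is written from scratch with the standard library and does not reproduce the package's API. The function names and the tokenizer are assumptions.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def type_token_ratio(text):
    """Distinct words (types) divided by total words (tokens).

    Returns 0.0 for text with no word tokens.
    """
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


sample = "The cat sat on the mat"
print(type_token_ratio(sample))  # 5 distinct words out of 6
```

The raw ratio tends to fall as texts get longer, so measures that adjust for text length are usually preferred when comparing speeches of different lengths.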