Essays in financial economics
Author(s)
Uettwiller, Antoine
Type
Thesis or dissertation
Abstract
The three chapters in this thesis study agent behavior, whether it be that of retail investors
(chapter 1), the reporting behavior of home-buyers for the purpose of taxation
(chapter 2), or that of firms choosing what data to collect on their consumers (chapter
3). All three chapters employ data-driven approaches and novel methods to explore
heterogeneity within their respective fields of study and draw conclusions that have
implications for policymakers, market participants, and other stakeholders.
In the first chapter, I study the heterogeneity of retail investors based on the stocks
they choose to discuss. I use Reddit WallStreetBets data totalling more than 1.5 million
posts and 35 million comments from 2018 to the end of 2021. I find retail investors to
be diverse, with distinct subgroups showing interest in stocks with attention-grabbing
features or in fundamentals-based investing. I also find those attributes to be connected
to echo chamber effects that likely cause confirmation bias, as subgroups do
not mix. Finally, I find subgroups to be of differential importance in a predictability
exercise. Effects commonly attributed to retail investors, such as increased volatility,
might be coming from a sub-sample of them but also a sub-sample of stocks.
In the second chapter, we develop a new method to estimate under-reporting and employ
it on large, granular administrative data from the Mumbai real-estate market. The
approach compares bunching of reported values around government-assessed guidance
values with the distribution of a third-party measure of market values. We estimate
that 11 percent of Mumbai real-estate value is under-reported between 2013
and 2022. We detect a decline in bunching post-demonetization, but no change in the
overall under-reporting rate. Properties with mortgages from banks with high nonperforming
assets and from public-sector banks exhibit greater under-reporting.
In the third chapter, we scrape a comprehensive set of US firms’ privacy policies and
study these policies alongside firms’ web data extraction behaviour. We find considerable
and systematic variation in privacy policies along multiple dimensions including
ease of access, length, readability, and clarity, both within and between industries.
Surprisingly, firms’ data extraction, measured by the extent of cookies they place on
consumers browsing their websites, is strongly and positively related to the length and
complexity of their policies. Firms with intermediate levels of technical sophistication
have longer, more complex policies than firms with very high or low technical sophistication,
controlling for a range of other firm attributes. A simple signalling model of
firms engaging in data extraction in an economy with both myopic and sophisticated
consumers, where privacy is a “shrouded attribute” of firm-consumer interactions,
helps to rationalize these findings.
Version
Open Access
Date Issued
2023-04
Date Awarded
2023-07
Copyright Statement
Creative Commons Attribution NonCommercial NoDerivatives Licence
Advisor
Ramadorai, Tarun
Hansman, Christopher John
Publisher Department
Imperial College Business School
Publisher Institution
Imperial College London
Qualification Level
Doctoral
Qualification Name
Doctor of Philosophy (PhD)