The Markup’s most recent study has shed more light on the existence of a lucrative, $12 billion shadow industry that thrives off the extraction and brokering of location data. This industry has also burgeoned during the pandemic, particularly through the extensive deployment of contact tracing apps which have also relied on forms of mobility data.
While the need for and efficacy of contact-tracing apps are now under the scanner, understanding mobility patterns will remain relevant as countries continue to shape COVID-19 policies. However, the current collection and usage of mobility data, particularly through contact tracing technology, highlights a number of fissures around the lack of accountability and loss of citizen trust in these systems.
While these findings are valid causes for concern, it is possible to re-imagine a more responsible and conditional collection, use and sharing of mobility data. Reclaiming power and control around how data and related technology will continue to be employed, during COVID-19 and after, will require matters of data governance and stewardship to be placed at the forefront of our conversations.
Through the lens of stewardship, the narrative on privacy and data empowerment can also be extended beyond compliance oriented data governance practices. Stewards are typically independent entities that can: intermediate the flow of data, represent the rights of individuals and communities, implement mechanisms to safeguard data rights and empower users to unlock the value of data – however models of stewardship may differ in structure, purpose and services provided.
Stewardship may also help address several emerging questions. How can mobility data be responsibly collected or unlocked for public health applications? What are the rights and protections that should be embedded to safeguard data that relates to individuals and communities? What can be done to ensure that data collected is representative of the populations it aims to serve?
This paper examines the unlocking, use and sharing of mobility data during the pandemic. It also proposes that community-based organizations and civil society organizations can act as stewards or data intermediaries, to better facilitate the critical and responsible collection/exchange of mobility data.
Background: Mobility Data & COVID-19
Mobility data refers to information (often passively captured) that provides insights into the location and movement of a population – often through their interactions with digital mobility devices (like our smartphones) or transport services. Sources of mobility data, while diverse, include call detail records from telecom companies, GPS details from phones or vehicles, geotagged social media data or first or third-party software data.
Geolocation, a subset of mobility data, may be useful in shaping responsive courses of action as it can be leveraged in granular form to understand hyperlocal realities or, when aggregated, regional, national or international patterns. However, privacy concerns arise from the sensitive or personal data that may be inferred from these records and the often opaque conditions around their usage. The ongoing deployment of contact tracing applications, which largely depend on individual-level location data, has demonstrated extensive potential for misuse and surveillance.
In Norway, citizens’ concerns around the invasive nature of the ‘Smittestopp’ app led to a ruling by the country’s Data Protection Authority that temporarily suspended its use and called for the deletion of all collected data. Israel’s security agency, Shin Bet, repurposed its surveillance tools for contact tracing. These tools captured geolocation, cellular data and credit card details. Enforced with no legislative oversight, the move drew significant flak from opposition leaders and civil liberties advocates, who argued that it flouted democratic principles and undermined the individual rights of Israeli citizens. In October 2020, the government aimed to pass legislation that would allow the police unfettered access to data collected by Shin Bet, the Health Ministry and the army for “criminal investigations”. The implications of misuse or abuse of this data are also greater for Palestinians, who live under military occupation and are not entitled to any legal protections or rights. The imposition of surveillance tools that depend on an overcollection of granular data may in fact only expand the scope of an existing surveillance regime and allow for further contravention of digital rights.
Despite the surveillance and privacy concerns around the use of contact tracing apps and mobility data, it is undeniable that this data has immense public value and has helped officials understand the development of the COVID-19 virus and map its variants and waves. It has also been used to track: areas of mobility that contribute towards increased transmission of the virus, adherence to social distancing norms and the effectiveness of measures like lockdowns or restrictions.
Unlocking Mobility Data: India vs Taiwan
Taiwan’s government has been championed for its proactivity, robust infrastructure and speed in addressing the COVID-19 pandemic. Its response has relied on the extensive use of personal data, in what some have referred to as “participatory self-surveillance.” Their electronic ‘digital fence’ system collects cellular data of visitors entering Taiwan to ensure they comply with a mandatory 14 day quarantine. Cellular data is ‘unlocked’ by telecom providers who are required to partner with the government under the Communicable Disease Control Act. The digital fence system is implemented by the country’s Central Epidemic Command Center, which must abide by Taiwan’s arguably robust Personal Data Protection Act (PDPA). Recognizing concerns around privacy, the government has imposed access controls around the data and a retention period that defines when it should be deleted.
While these measures may be considered extraordinary and raise privacy related-concerns, citizens have largely demonstrated confidence in their government’s actions. Clear communication and processes that enhance trust, transparency and partnership through the vTaiwan public engagement platform have contributed towards this. Through this participatory approach, the government has enabled both top-down and bottom-up flow of data. It has also encouraged a range of stakeholders – like developers – to take part in developing innovative solutions through hackathons.
In stark contrast to Taiwan’s model, India’s lack of a robust public data infrastructure and general inability to provide access to credible and relevant government data has been exposed during the pandemic. Poor reporting coupled with a lack of cohesion between state and central agencies in transparent data collection and sharing efforts has had devastating effects on its population and has, in large part, prevented the creation of an effective national pandemic response. Institutional failures coupled with limited access to data also made it challenging for the government to provide a safe and secure passage to migrants at the peak of India’s first wave which had acute implications for their health, safety, livelihood and wellbeing.
The dearth of reliable data from the public sector has required that mobility data be unlocked by the private sector. Through Google’s ‘Community Mobility Reports’, researchers have been able to assess how increased mobility associated with the lifting of travel restrictions may have contributed towards a rise in cases. To better map intra- and inter-regional mobility patterns, researchers also leveraged Facebook’s ‘Data for Good’ platform, combining anonymised data from Facebook’s mobile users with census data from 2011. These derived insights can inform policymakers on the efficacy of interventions like mobility restrictions, help assess which restrictions are most impactful in reducing the spread of COVID-19 and clarify the impact of relaxations on public health outcomes so that re-opening measures can be better strategised.
Relying on the private sector to access mobility data
While the unavailability of public sector data has hampered COVID-19 efforts, the vacuum it has created has given way to an overdominance of private sector influence. Although Google’s commitment to publicly release its ‘Community Mobility Reports’ may be commendable for the rich insights it has generated and the studies it has been able to power, we must not hesitate to acknowledge and be critical of these companies’ ability to capture this volume of data. Moreover, the swiftness with which large tech companies assumed a role typically occupied by governments, who as a result were absolved of their own responsibilities, sets a dangerous precedent. There are also ramifications related to privacy, representation, control and transparency around the data collected.
Most studies carried out with Google, Apple or Facebook data indicate that they have relied on anonymised and aggregated data. Google claims it preserves privacy by implementing their ‘world-class anonymization technology’, use of differential privacy techniques and adding noise to datasets – however computational privacy experts call for more detail into these methods and greater transparency.
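To make the idea of "adding noise" concrete, the following is a minimal sketch of one common differential privacy technique, the Laplace mechanism, applied to aggregated counts. It is not a description of Google's pipeline, whose details are not public. The function names, the region labels and the choice of `epsilon` are illustrative assumptions, and the sketch assumes each person contributes to at most one regional count (a sensitivity of 1).

```python
import random
from typing import Dict, Optional


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one sample from a Laplace(0, scale) distribution."""
    # The difference of two independent exponential draws with rate 1/scale
    # follows a Laplace distribution centred on zero with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_counts(
    counts: Dict[str, int],
    epsilon: float,
    sensitivity: float = 1.0,  # assumption: one person affects one count by at most 1
    seed: Optional[int] = None,
) -> Dict[str, float]:
    """Return the aggregated counts with Laplace noise added to each one."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = random.Random(seed)
    scale = sensitivity / epsilon  # smaller epsilon means more noise
    return {region: count + laplace_noise(scale, rng) for region, count in counts.items()}


# Hypothetical visit counts per region; not real data.
released = noisy_counts({"ward_a": 120, "ward_b": 45}, epsilon=0.5)
```

A smaller `epsilon` gives stronger privacy but noisier released figures. That trade-off is one reason experts ask for the actual parameters to be disclosed.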
Additionally, it has been revealed that companies like SafeGraph, that have been heavily relied upon for aggregate data by public agencies, researchers and scientists, were only able to provide much of this location data due to their shadowy and extractive business model of tracking, ingesting and brokering data through a software development kit embedded in various apps and services.
The broader issue that remains is whether the platforms that provide access to mobility data implement any formal or informal consent mechanisms. Referring to the situation in China where companies like Alibaba and Tencent were similarly better equipped to provide data more efficiently and at scale, Huang, Sun and Sui argue that “By harvesting colossal amounts of user data in real-time, these firms may know more about population movement than the government itself.” As citizens possess no real power to seek accountability with private sector companies, this requires individuals and communities to merely rely on companies’ goodwill, which will also determine how they choose to wield the data they collect and when to make it accessible. In principle, the narrative of ‘private sector to the rescue’ also contributes to normalizing and justifying the existing pace of data extractivism and other inequities of the data economy.
Another challenge, which persists despite the public availability of this data, is its limited representativeness. Experts argue that the representativeness of mobility data must be viewed from three perspectives: the fraction of the population it covers, demographic diversity (age, sex, gender, race, caste, class, etc.) and how it maps to geographical differences (urban, semi-urban, rural). An audit carried out by researchers from Carnegie Mellon University and Stanford University revealed that studies that depend on smartphone mobility data may be demographically biased. Drawing on a mobility dataset from SafeGraph in the U.S., they discovered an underrepresentation of older people as well as those from African-American, Native-American and Latinx communities.
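One simple way to check the second of these perspectives is to compare each group's share of a mobility sample with its share of the population. The sketch below illustrates that comparison only; it is not the method used in the audit cited above, and the function name, group labels and figures are hypothetical.

```python
from typing import Dict


def representation_ratios(
    sample_counts: Dict[str, int],
    population_counts: Dict[str, int],
) -> Dict[str, float]:
    """Ratio of each group's sample share to its population share.

    A ratio below 1 suggests the group is underrepresented in the sample,
    and a ratio above 1 suggests it is overrepresented.
    """
    sample_total = sum(sample_counts.values())
    population_total = sum(population_counts.values())
    if sample_total == 0 or population_total == 0:
        raise ValueError("both datasets must contain at least one person")

    ratios = {}
    for group, population in population_counts.items():
        if population == 0:
            continue  # no baseline to compare against
        sample_share = sample_counts.get(group, 0) / sample_total
        population_share = population / population_total
        ratios[group] = sample_share / population_share
    return ratios


# Hypothetical figures for illustration only.
sample = {"18-39": 600, "40-64": 300, "65+": 100}
census = {"18-39": 400, "40-64": 400, "65+": 200}
ratios = {g: round(r, 2) for g, r in representation_ratios(sample, census).items()}
# ratios == {'18-39': 1.5, '40-64': 0.75, '65+': 0.5}
```

In this made-up example the oldest group appears in the sample at half the rate it appears in the population. That is the kind of gap auxiliary demographic information is needed to detect.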
It is vital that datasets are representative particularly because these groups are likely to face a disproportionate risk of contracting the virus. This highlights the importance of including what researchers refer to as ‘auxiliary information’ and greater transparency around sources of mobility data used. In the Indian context, concerns around representation in datasets must also account for the widening digital divide as this risks further excluding and rendering invisible a large portion of poor and marginalized individuals across the country.
Stewardship: Bottom-up mechanisms to responsibly capture data
It is necessary for governments to reckon with these challenges and build transparent, accountable data exchange ecosystems. This will allow citizens, relief agencies and innovators to share and harness data in a manner that is inclusive, efficient and privacy-centric. Moreover, it is critical that the earlier mentioned challenges of a lack of representation, control, transparency and agency around the often passive collection of mobility data by the private sector are addressed. However, at a time where valid trepidations exist around building public infrastructures that could further monitor or surveil data – particularly in the absence of concrete data protection regulation – there is also a need to explore decentralised, bottom-up solutions.
In favor of shaping a more equitable and rights-focused system of data governance, stewardship may offer a potential path forward. Introducing more active stewards of data can help to diversify sources of mobility data. To this end, greater attention must be paid to the role of community-based collectives and civil society organizations (CSO) who may be well-equipped to steward data on behalf of the beneficiaries they serve. CSOs like Jan Sahas have been collecting data on migrants and carrying out research to inform policymakers. Engaging organizations as stewards could ensure that data collection, sharing and usage follows principles of minimization, informed consent and conditional, purpose-driven usage.
An example of a steward in action is PLACE which has chosen to structure itself as a data trust. Data trusts, which are one model of stewardship, are imagined to support users to pool their data (or data rights) to collectively negotiate its terms of use, within a framework of strong fiduciary duties. PLACE was established with these ideals in order to democratise the collection, availability and sharing of map-based geospatial data. To ensure that the model remains ‘inclusive, independent and shared’, they have adopted an open membership model. In order to be a part of PLACE, members must agree to a set of terms and conditions on data usage, ethics, and principles, and pay a fee that is contingent on their size and use of data. Structured to retain greater answerability and accountability over data collected, the organization is overseen by trustees that have been onboarded on the basis of their experience and their alignment with the broader vision.
While perhaps not formalised as a steward, another example to potentially draw lessons from is the City of Amsterdam, which co-built Public Eye, an open-source solution for crowd monitoring. The system relies on a combination of sensors and cameras to measure footfall density, which citizens can then use to avoid crowded spaces and adhere to curfews. To address concerns around privacy and be transparent about its use of data, the City aims to minimize the identification of individuals through the use of heat-maps and stores only a minimal amount of data, sufficient to train algorithms, on an encrypted, city-owned network. In an effort to encourage public oversight and engagement, the project also features on the City’s AI register, a platform that aims to provide citizens with information and the possibility to give feedback and participate in shaping AI systems deployed in the City. In this context, the City acts as a steward that carries a greater degree of answerability and accountability, and going forward it will likely provide greater transparency around the technology deployed and its data governance practices.
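The general idea of publishing density rather than individual positions can be sketched as follows. This is an illustration, not Public Eye's actual implementation: the grid size, the suppression threshold and the function name are all assumptions, and suppressing small cells is a common safeguard that the City's description does not itself mention.

```python
import math
from collections import Counter
from typing import Dict, Iterable, Tuple

Cell = Tuple[int, int]


def density_grid(
    points: Iterable[Tuple[float, float]],
    cell_size: float,
    min_count: int = 5,  # assumption: cells with fewer detections are withheld
) -> Dict[Cell, int]:
    """Aggregate detected positions into grid cells and keep only the counts.

    Individual coordinates are discarded. Cells with fewer than `min_count`
    detections are dropped so that sparse cells cannot single anyone out.
    """
    if cell_size <= 0:
        raise ValueError("cell_size must be positive")
    cells = Counter(
        (math.floor(x / cell_size), math.floor(y / cell_size)) for x, y in points
    )
    return {cell: n for cell, n in cells.items() if n >= min_count}


# Hypothetical detections in metres from an arbitrary origin.
detections = [(0.5, 0.5)] * 5 + [(12.0, 3.0)]
heatmap = density_grid(detections, cell_size=10.0)
# heatmap == {(0, 0): 5}; the single detection in cell (1, 0) is withheld
```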
Going forward, aside from its possible usage for COVID-19, it is likely that the use of mobility data will continue to be a critical facet to shape bottom-up policy, determine more responsive city-planning and shape sustainability and inclusion goals. To this end, a varied landscape of stewardship models will need to be supported, through both top-down and bottom-up infrastructural support.
Therefore, this will require action around the following dimensions:
- Shape dynamic regulation that seeks to build guardrails around the unlocking, collection, usage and sharing of location and mobility data
Taiwan’s data infrastructure and contingency plans, while entirely centralised, offered transparency and accountability to citizens, which in turn generated trust across stakeholder groups, incentivizing greater collaboration and innovation around solution generation. This allowed for necessary data from various sources including public agencies, private sector companies and civil society organizations to be both aggregated and responsibly shared.
These processes and data flows were informed and guided by regulations that sought to ensure the collection and use of data was conditional and limited by scope and clear purpose. A joint statement released by the WHO has similarly advocated for greater transparency, protection and systems of accountability around the use of data for COVID-19.
- Build collaborative, privacy by design infrastructure and supportive ecosystem for critical data exchange
Initiatives like the Mobility Data Collaborative, which represent a diverse cross-section of stakeholders have developed an assessment tool in partnership with the Future of Privacy Forum that supports cities in identifying how to incentivize the sharing of mobility data while safeguarding individual privacy rights, considering community interests and functioning with greater transparency.
Ecosystems of data exchange can also be facilitated by organizations like Visions.pol, which act as account aggregators, another model of stewardship. They create human-centric data exchange networks that support multi-stakeholder data collection and sharing while handing control to the individual user through robust consent processes. Their model is not predicated on the brokerage of location data, but rather on the management of consent.
- Strengthen the capacity of citizen-centric organizations and communities to better map, collect and control data that pertains to their mobility and health
Through citizen-science tools, individuals and communities can be activated to collect and contribute to bottom-up crowdsourcing data movements. Enabling active contribution and civic engagement at this point in the data lifecycle allows processes of data collection and governance to extend beyond seeking mere ‘consent’. This can be channeled through civil society organizations that already act as important intermediaries and data holders, like Safetipin.
Established non-profits or social enterprises like Safetipin also have the requisite infrastructure to collect important data through crowdsourcing and high citizen involvement. Safetipin collects a range of geo-tagged data, qualitative information and images to showcase the spatial and social nature of how individuals (particularly women) inhabit and traverse public spaces. In a recent survey carried out with the World Bank in Sri Lanka, they reported the realities of a pre- and post-pandemic world, including how instances of harassment persisted and how walking declined as a mode of transport, even though it had been the most common way for women to get around prior to COVID-19. Safetipin’s ability to highlight gendered dimensions of COVID-19, and their approach to representing and collecting data with active participation from the community, underlines the need to include these voices and actors in data stewarding efforts.
There are opportunities to explore the conditional and responsible use of mobility data, whether it is to shape smarter, people-centric and more equitable cities or to address ongoing public emergencies like COVID-19. However, its use over the last few years, particularly through contact tracing apps, has demonstrated a scope for misuse that ranges from surveillance to woeful mismanagement of the pandemic due to a lack of data. While Taiwan has created centralized solutions to use location and mobility data that command high citizen trust and are subject to regulations and protocol, other countries like India have had to increasingly rely on the private sector to unlock data. Private sector platforms possess valuable data that must be made accessible to multiple stakeholders. However, their dominance in being able to capture this data at such volumes is also indicative of larger systemic injustices in the data economy that have allowed for this degree of monopolization. Stewardship offers ways to rethink how to build more collaborative and transparent systems of bottom-up data collection and exchange around mobility data, whether in the form of civil society organizations capturing more representative data or consent-centric data exchange ecosystems that can support the collection and use of mobility data to drive policy.