Karnataka’s open data access: At what cost?

By Soujanya Sridharan & Vinay Narayan

March 24th, 2022

On October 19, 2021, the Government of Karnataka (GoK) notified its Open Data Policy (the “Policy”). The Policy purports to achieve three overarching goals: one, effective management and interoperability of data across state government departments; two, defined processes and standards for enabling proactive open access to government data for research, innovation, and evidence-based governance; and three, monetisation of anonymised citizen data. This comes on the heels of the recommendation in the Economic Survey of India, 2019 to monetise citizen data to facilitate the use of data as a “public good”.

However, the Policy is concerning for several reasons. First, it calls for sharing anonymised citizen data, which could pose grave privacy threats through the risk of de-anonymisation. Second, the conception of “data ownership” adopted in the Policy makes the state department processing the data its owner, when in both practice and policy the state is considered the fiduciary or custodian of citizen data and is compelled to act in citizens’ best interests. As advocated by the Personal Data Protection Bill, 2019 and the Non-personal Data Governance Framework, 2020, the ultimate ownership of data lies with the individuals and communities who help produce it. Lastly, the data monetisation aspect of the Policy is problematic for its lack of transparency, its potential to prop up private data monopolies, and its lack of clarity on pricing mechanisms.

Perils of anonymisation

While the Policy allows for sharing of personal data once it has been anonymised, it does not prescribe the standards to be followed for anonymisation, much like most data governance-related policies in India. With India lacking data protection legislation (for both personal and non-personal data), there is no legal threshold or standard for anonymisation. This could lead to the use of weak methods such as pseudonymisation, where personal identifiers are replaced with pseudonyms. Pseudonymisation was the stated method of anonymisation in the contracts between the UK National Health Service (NHS) and various private firms, and the NHS has come under immense criticism because users could be re-identified from their health data. Indeed, the EU’s General Data Protection Regulation (GDPR) recognises this risk, noting that “Personal data which have undergone pseudonymisation… should be considered to be information on an identifiable natural person”; such data are therefore not outside the purview of the GDPR.
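To make the re-identification risk concrete, here is a minimal Python sketch of how pseudonymised records might be linked back to named individuals. Everything in it is an assumption made for illustration: the records, the field names, the secret value, and the "public list" are hypothetical, and the sketch does not describe the method used by the GoK, the NHS, or any real dataset. It replaces names with hashed tokens, then matches the remaining attributes (pincode, birth year, gender) against a separate list that carries names.

```python
import hashlib

# Attributes that are not names but can still single a person out.
QUASI_IDENTIFIERS = ("pincode", "birth_year", "gender")


def pseudonymise(records, secret):
    """Replace each name with a hashed token; leave all other fields intact."""
    released = []
    for record in records:
        token = hashlib.sha256((secret + record["name"]).encode("utf-8")).hexdigest()[:12]
        row = {key: value for key, value in record.items() if key != "name"}
        row["pid"] = token
        released.append(row)
    return released


def link(released, auxiliary):
    """Return {name: diagnosis} for rows matching exactly one auxiliary entry."""
    reidentified = {}
    for row in released:
        candidates = [
            person["name"]
            for person in auxiliary
            if all(person[key] == row[key] for key in QUASI_IDENTIFIERS)
        ]
        # A unique match on quasi-identifiers reveals who the row belongs to.
        if len(candidates) == 1:
            reidentified[candidates[0]] = row["diagnosis"]
    return reidentified


# Hypothetical health records held by a department.
health_records = [
    {"name": "Asha", "pincode": "560001", "birth_year": 1984, "gender": "F", "diagnosis": "diabetes"},
    {"name": "Ravi", "pincode": "560001", "birth_year": 1990, "gender": "M", "diagnosis": "asthma"},
    {"name": "Meena", "pincode": "570010", "birth_year": 1984, "gender": "F", "diagnosis": "hypertension"},
]

# Hypothetical auxiliary list with names, such as a publicly available roll.
public_list = [
    {"name": "Asha", "pincode": "560001", "birth_year": 1984, "gender": "F"},
    {"name": "Ravi", "pincode": "560001", "birth_year": 1990, "gender": "M"},
    {"name": "Meena", "pincode": "570010", "birth_year": 1984, "gender": "F"},
    {"name": "Kiran", "pincode": "570010", "birth_year": 1984, "gender": "F"},
]

released = pseudonymise(health_records, secret="hypothetical-secret")
reidentified = link(released, public_list)
print(reidentified)
```

In this toy example, two of the three pseudonymised records are re-identified without ever reversing the hash, because their combination of pincode, birth year, and gender is unique in the auxiliary list. The third record escapes only because another person happens to share the same attributes, which is the kind of property a legal anonymisation standard would need to guarantee rather than leave to chance.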

Pseudonymisation aside, other methods of anonymisation are not foolproof either. Earlier this year, private individuals were able to unmask a user by analysing location data acquired from a third-party data vendor, even though the dataset had been anonymised. This served as further proof of the fallibility of anonymisation, especially of mobility and location data. It is particularly concerning in the context of the Policy, which specifically allows geospatial data to be shared. The threat to privacy is amplified by the absence of penalties in the Policy for accidental or wilful de-anonymisation carried out using data acquired through the Policy. Given that there is no framework for data protection in India, it would be very difficult for affected people to seek legal recourse.

Ownership or obscurement?

The Policy bestows ownership of datasets on the Chief Data Officer of the respective departments of the GoK. To begin with, adopting an ownership approach to data is problematic and unsuitable. Data is not like property and other goods that can be owned or exchanged; moreover, an individual’s data on its own has far less value than it does when pooled with data about other people from other sources. Many of the problems around unfair use of data cannot be addressed simply by assigning ownership of data. What is required instead is a system of data rights that will address the gamut of harms arising from the sharing and processing of data.

Even within an ownership framework, the approach adopted in the Policy goes against the principles set out in the Personal Data Protection Bill, where individuals are de facto owners of the data relating to them, and in the NPD Report, where non-personal data derived from personal data of an individual will be owned by the individual, and rights in non-personal data relating to a community will be vested in the trustee of that community. The Policy does not provide any legal basis for this assignment of ownership. Such assignment of ownership also has a detrimental impact on the decisional autonomy of the individuals and communities to whom that data relates, taking away from their ability to decide whom data relating to them is shared with, and the purposes that such data could be put to.

Additionally, as recognised in the NPD Report, government departments are to function as custodians or fiduciaries of data, with a fiduciary responsibility to act in the interests of communities and to seek public value for this data. Simply making data available does not satisfy this duty, as there is no guarantee that the benefits of data analysis will accrue to the public. A reading of the Policy indicates that the purpose of this assignment of ownership is to make departments and their staff responsible for the quality and authenticity of data. If this is indeed the intent, the GoK must change the terminology it uses so as to dissociate this responsibility from notions of ownership.

True costs of monetisation

The Policy stands out for proposing monetisation of aggregated and anonymised citizen data held by various GoK departments. This is problematic for several reasons. First, as previously stated, anonymisation is not a foolproof mechanism against re-identification, particularly in the absence of distinct legal thresholds for anonymisation. Second, the monetisation contracts between the GoK and authorised data buyers will be covered under a non-disclosure agreement, which has disconcerting implications for transparency. Citizens will have little visibility into the prices at which data is sold, the purposes for which it will be used, and the parties involved in the agreement. The role of individuals and communities as producers of data is obscured, making their participation in data-sharing decisions impossible. Third, the Policy mentions Data Stewards as entities that will fix the prices for data monetisation, without clarifying their roles, responsibilities, and oversight mechanisms.

More fundamentally, data monetisation presents several institutional challenges in itself. The Policy treats data as a mere economic resource or property that can be exploited. Scholars have drawn attention to the unique nature of data that renders it unsuitable to be treated as property. This is compounded by the relational value of data, which is what makes it useful and which property approaches to data cannot support. Moreover, intensified commodification of datasets risks propping up private data monopolies, which may be better able to afford the GoK’s datasets than smaller players such as civil society organisations. This risk hinges on the specific pricing formula to be adopted by the GoK, but the Policy fails to provide clarity on it. Lastly, the accumulation of datasets in the hands of a few private players could undermine the ability to use data for public interest and innovation in ways that would distribute the value of data equitably and truly benefit the communities who produce the datasets.

Pathways to democratic data governance

Any policy framework seeking to share data must respect the interests of individuals and communities who generate this data as well as their rights over it. Data must be deployed for public welfare in consultation with the citizens who are affected by it. Intermediaries such as a data steward present a promising avenue to mediate public access to data, by representing and enabling community participation in data decisions. Barcelona’s DECODE project and Korea’s Gyeonggi province Data Dividend program are two valuable examples of citizen-led data stewardship efforts that benefit local communities and businesses alike. The GoK must rethink its policy in light of such models surfacing democratic approaches to data governance, expanding the scope of the current Policy to enable meaningful citizen participation in decision-making.

Note: A version of this article originally appeared in Deccan Herald on November 13, 2021.