Comments on the Revised Report by the Committee of Experts on Non-Personal Data Governance Framework

By Soujanya Sridharan & Siddharth Manohar

February 12th, 2021


‘Communities’ as conceived by the Committee under Section 7.2 is ambiguous and risks causing more harm than good to their interest.

The report makes communities the owners of NPD, with a community defined as “any group of people that are bound by common interests and purposes, and involved in social and/or economic interactions.” A community could be virtual or delimited by common geographic, livelihood, economic or social interests. However, it is possible that a group of individual data principals sharing a common interest, say subscribers of a milk delivery service by BigBasket Daily in a particular locality, might not be aware that they are indeed a “community”. How such communities would create a Section 8 company, Society or Trust and choose a Data Trustee to represent and protect their interests is difficult to comprehend, particularly when the individual data principals involved in common social or economic interactions do not view themselves as a community.1

Moreover, the Report is unclear about the terms of the association between data principals and the larger community. Consent, particularly community consent, is another dimension that the Committee has not considered in the report. In its current form, individuals must merely consent to the anonymisation of their data and are given an option to withdraw that consent prior to anonymisation. 2 An effort must be made to involve communities and/or their representatives at every step of the data value chain 3 – from collection to processing through to data exchange and use by third parties – to ensure that their data is used for social benefit.

Data Principal v. Community: Lack of vision in organising principles for Communities and Data Principals; lack of procedure for the protection of rights

The NPD Policy states that the creation of a regulatory structure – including Data Trustees, Data Custodians, Data Processors, the Non-personal Data Authority (NPDA), and High-Value Datasets (HVDs) – will help Data Communities exercise rights over datasets that are relevant to them.

It is important to note that this objective has not been achieved. Paragraph 7.7 states that Data Trustees are “obligated to establish grievance redressal mechanisms so that community can raise grievances”. It is unclear why the burden of devising grievance redressal mechanisms has been laid entirely at the feet of the Data Trustee, without any principles on how such mechanisms ought to be designed. Even on an initial analysis, the process should comply with the principles of natural justice and sound administrative law. Excessive delegation is a violation of these principles, 4 and the Committee must elaborate on the forms of recourse available to Data Communities in order to engage with the activities of Data Trustees.

The Report requires a clearer articulation of the harms from which it seeks to protect Data Principals and Communities

The Report states that Data Custodians and Trustees have a right to act to minimize harms, and they are in fact obligated to carry out this task by virtue of their duty of care towards the relevant Data Communities. What forms these harms take, however, has not been specified. This lack of detail leaves stakeholders in the dark as to what possible claims or causes of action they may have. The definition of harms under the Report cannot be left to the imagination of stakeholders.

The Report must lay out the processes involved and the form of submissions to be made as part of the grievance redressal requirements prescribed under Paragraphs 7.2(ii), 7.7(ii), and 7.4(iv). Clarity of procedure and uniformity of process are necessary for transparency in protecting Data Communities and Principals against harm. The Report in its current form ignores Constitutional principles of procedural justice.

The Report fails to make a convincing case for the NPDA outlined in Section 7.10 and avoids any discussion on the possibility of sectoral overlaps that could manifest as an enhanced regulatory burden for data-driven businesses.

Despite its attempt at articulating the nature and roles of the NPDA, the Committee has not made a convincing case for the institution of an independent regulatory authority. 5 For one, a variety of existing mechanisms could achieve the same ends as the NPDA. An MoU between two consenting parties, say a hyperlocal delivery platform and a government agency, for the release of data relating to self-employed gig workers in order to ascertain the number of informal workers in a city, could result in a similar disclosure of data, without the regulation of an overarching entity such as the NPDA.

Further, the Committee has failed to consider the ramifications of regulatory overlaps arising from the creation of an NPDA. Firstly, the jurisdiction of the DPA vis-à-vis the NPDA is ambiguous, inasmuch as the presence of confounding variables such as mixed datasets and re-identification of NPD creates challenges for outlining clear adjudicatory mandates for the two authorities. In this context, the regulatory perimeter of the NPDA is undefined, and more deliberation is required on the principles that compel its creation. Secondly, the Committee proposes that HVDs be excluded from copyright protection. The rationale offered is that the formulation of HVDs does not entail any creativity or skill, but a mere compilation of existing “fields of data”. However, the existence of certain “fields of data” might itself be a product of innovation and skill, producing competing claims for “trade secrets” protection simultaneously. In such a case, the mandate of the NPDA to compel disclosure by Data Custodians will necessarily conflict with the existing IPR regime. 6

The Report also suggests that sectoral regulators can delineate protocols for data sharing, in addition to those imposed by the sector-agnostic horizontal NPDA. This could result in a situation of regulatory excess where data-driven businesses will have to comply with multiple regimes: the PDP Bill in cases of personal data, the NPD framework for anonymised data, and a third set of compliance obligations outlined by sectoral regulators. Efforts must be made to harmonise compliance requirements across regulatory bodies such as the DPA, NPDA and other sector-specific authorities to ease the consequent regulatory burden on businesses.

The dichotomy between the functions of a non-profit under Paragraph 7.2(ii) and a Trustee under Paragraph 7.7 of the Report, leading to inadequate protection against harms from the processing of NPD

The role of the Data Trustee is to protect the interests of its user groups and communities. The Data Trustee plays a pivotal role in the creation of the relevant HVD. However, the community relevant to the HVD cannot approach the Data Trustee to express its concerns about the HVD's usage through a grievance redressal mechanism. To do this, the community must instead approach a different non-profit, as described under Paragraph 7.2(ii). Any engagement on harm experienced by Data Communities is required to go through this alternate regulatory route, as opposed to the better-established framework of the Data Trustee under the Report. Given that articulation of the economic interests of the Community is in the hands of the Data Trustee, it is unclear why the prevention of harm has been delegated to a different entity. Both these processes aim to protect the interests of the same group – the Community impacted by the HVD. The Report adds to the regulatory burden on Data Communities when different platforms must be approached for different concerns.

It is recommended that protection against harm be backed by grievance redressal processes and accountability standards at least as strong as those governing the protection of Data Communities' interests in granting access to HVDs. Potential harm should be an explicit consideration in the Data Trustee's decision to grant access to HVDs for any data requester.

Anonymisation and data sanitization with respect to “mixed datasets” are loopholes in the framework that could be used to circumvent mandatory data sharing obligations under Section 7.4 (iii).

The mandatory data sharing obligations sought to be imposed on Data Custodians and Data Businesses by the Revised Report introduce complications of regulatory arbitrage and an enhanced compliance burden on private entities. This is highlighted by the clauses on anonymisation 7 and the governance of mixed datasets – wherein Data Custodians/Businesses could simply alter their terms of service and choose not to anonymise data, or claim that they operate with mixed datasets (containing both PII and NPD) – which could be used to circumvent the NPD framework itself, as explained below.

This is problematic because the creation of structured HVDs for mandatory sharing, including raw and factual datasets, is a resource-intensive activity that could disincentivise Data Custodians and Data Businesses from anonymising data at all. In fact, it is possible that mandatory data sharing could engender practices of regulatory arbitrage – a phenomenon by which Data Custodians circumvent mandatory data sharing obligations by explicitly choosing not to anonymise data. Regulatory arbitrage is all too common in the tech industry, where companies like Facebook have modified their terms of service to escape stringent privacy guidelines under the GDPR. 8

In the Indian context, Data Custodians could avoid the mandatory disclosure obligations imposed on them by either choosing not to anonymise user data or claiming that such data is a “mixed dataset” 9 that is “inextricably linked” to the personal data of users, thereby bypassing the mandatory data-sharing obligations. Consequently, the Data Custodians would fall not under the purview of the NPD framework, but under that of the Data Protection Authority mentioned in the PDP Bill. This glaring loophole in the NPD framework is compounded by a lack of coherence in the definition of a “mixed dataset”, specifically the scope of what constitutes being “inextricably linked” to personal data. It is unclear whether such a link is to be determined by economic or privacy considerations, even where the association with personal data is tenuous at best.

Alternatives to Mandatory Data-sharing: Creating a voluntary sharing ecosystem

A plausible alternative to the mandatory data sharing obligations imposed by the NPD framework is the creation of “data marketplaces”. Data marketplaces would provide an avenue for data providers/suppliers to offer curated datasets at a price, which data consumers/requestors can purchase or subscribe to. 10 Similarly, the proposed EU Data Governance Act, 2020 makes provisions for “data altruism”. Data altruism refers to a mechanism of data sharing in which data subjects (data principals in India) consent to the processing of their personal data, or the permissions of other data holders (custodians, processors) are obtained to authorise the use of their non-personal data, without reward, for purposes of public welfare such as scientific research or enhanced delivery of public services. 11

For more information about Aapti’s submission to the Committee, please visit the link here.


  2. Paragraph 5.4(iii) of the Report.
  4. Hamdard Dawakhana v. UoI, AIR 1960 SC 554.
  7. See Section 5.4 of the Report.
  9. See Paragraph 5.1(v) of the Report.