Background: The Non-Personal Data Governance Framework
The Ministry of Electronics & Information Technology constituted a Committee of Experts on the Non-Personal Data (NPD) Governance Framework, which released an initial report in July 2020 and a revised report in December 2020. The principal goals of the report include unlocking the economic value of non-personal data by creating a data sharing framework, and bringing regulatory certainty to non-personal data. The report’s proposal to allow governments to mandatorily access certain classes of data held by private entities has stirred up debate on whether the measure is justified or necessary. One important subset of this discourse is the interplay between data governance and intellectual property rights (IPR). IPR are valuable for creators and investors because they permit exclusive use, which allows them to profit from investments in creations. While this competitive edge enhances innovation, adequately balancing the rights of investors and companies against community interests is crucial to ensuring both that these investments happen and that they reach and benefit society at large. Mandatory government access to databases affects the exclusivity that IPR may otherwise provide to entities that deal in or market databases. Clarity on IPR for non-personal databases is therefore vital to the development of the data economy.
It is important to understand how the report categorises data, because the category determines whether data is subject to mandatory sharing with the government. The NPD report creates three classes of data: raw data (the base level of data provided or observed), aggregate data (an aggregate view of data across data points), and inferred data (a derived view of data, where insights are developed by combining different data points involving trade secrets, algorithms etc.). It provides that the government may mandatorily access subsets of raw data, as well as aggregate data, but not inferred data. The report also justifies mandatory access to raw databases and their subsets on the ground that they are not entitled to protection under the current IPR regime. Currently, database protections are available under the Copyright Act and under trade secrets jurisprudence (contract law), and the nature of the protection available depends on the type of database or dataset. This piece revisits the issues around database protections in the NPD context.
Addressing IPR for raw databases
Copyright law protects the creators of original works against unauthorised use or reproduction. It refers to a bundle of rights that vest with the owner, or assignee, in respect of literary, dramatic, musical and artistic works. For authors, these include economic rights to reproduce, perform and distribute their works, as well as moral rights that allow authors to preserve and protect their link with their works.
Originality is an important threshold for entitlement to copyright protection. Copyright does not subsist in ideas but only in the expression of these ideas. In determining originality, the courts in India have shifted from the ‘sweat of the brow’ test to the ‘modicum of creativity’ approach. This shift is apparent in Eastern Book Company v. DB Modak, where the Supreme Court held that although the creativity standard for establishing copyright does not require that something be novel or non-obvious, some amount of creativity in the work is required to claim a copyright. Put simply, it is not sufficient that labour has been exercised or capital invested for a work to gain copyright protection; the work must also exhibit a minimum level of creativity.
Section 2(o) of the Copyright Act, 1957, protects computer databases as “literary works”, making the aforementioned originality threshold equally applicable to databases. This raises questions about the protection available for raw databases, or even subsets of raw databases. There have been arguments that raw data is in itself entitled to copyright protection. These arguments stem from Article 10(2) of the Agreement on Trade-Related Aspects of Intellectual Property Rights (TRIPS), which states that the copyright subsisting in a compilation of data “shall be without prejudice to any copyright subsisting in the data or material itself”; on this reading, the provision might imply proprietary rights vesting in the underlying data.
However, this proposition is problematic because copyright is premised on the idea-expression dichotomy, under which ideas themselves gain no protection and only their manifestations or expressions are protected. While this is a well-settled principle in copyright jurisprudence, the Supreme Court gave much clarity to the idea-expression dichotomy in R.G. Anand v. Deluxe Films, where it observed that ‘an idea, principle, theme, or subject matter or historical or legendary facts being common property cannot be the subject matter of copyright of a particular person. It is always open to any person to choose an idea as a subject matter and develop it in his own manner and give expression to the idea by treating it differently from others.’ Therefore, copyright protection may exist for databases that involve the expression of insights or value, but arguing for copyright protection for raw databases is far-fetched, as they are generally not expressions of ideas but compilations of facts or available information.
The NPD report may be right in stating that the current copyright regime in India does not support protections for raw data or its subsets, but excluding the raw database class from protection, and expropriating it, is still potentially problematic. Companies often invest in collecting, collating and compiling data even where no selection, arrangement or creativity exists. Stepanov argues that for companies that trade in data, the value of the data is created not from internal use within the company but from its readiness for market trade. Legal protection to incentivise the creation or collection of data might therefore be justifiable in these cases.
The World Intellectual Property Organization (WIPO) has also recognized that the implication of the originality requirement is that some databases are not protected under copyright even if substantial investments have been made to produce them. Introducing sui generis database protection therefore becomes a viable option, as seen in the EU. The protection of non-original databases is known as the sui generis right. This is a specific property right for databases, unrelated to other forms of protection such as copyright, under which the maker of the database may protect against the extraction and/or reuse of the whole or a substantial part of the database’s content. While databases that fulfil the originality requirement can also avail of copyright protection, databases that are otherwise unprotected become entitled to sui generis protection. As per the EU Database Directive, the database maker must prove that ‘there has been qualitatively and/or quantitatively a substantial investment in either obtaining, verifying or presenting the contents’ in order to avail of sui generis protection for non-original databases. While sui generis rights protect databases, they do not protect the individual data underlying the database, because those data are merely factual in nature.
A WIPO report (2003) on the impact of protection of unoriginal databases in India highlighted that 80% of Indian databases emanate from the government domain. The dearth of private participation in the creation of databases could be attributable to the absence of sui generis protection of databases in India. The report highlights concerns raised by database industries that infringement and piracy could disincentivise the creation of new value-added databases. Further, with effective legal protection, database providers would have the confidence to disseminate data willingly and thus make information more readily accessible.
On IPR violations in mandatory access of aggregate databases
For aggregate databases, copyright protection exists where the originality threshold is fulfilled. Any restriction of copyright protection for aggregate data must therefore be aligned with the provisions of the TRIPS Agreement. One argument is that the three-part test under Article 13 of TRIPS restricts the scope of exceptions that member states may grant to IP rights. One requirement of this test, for an exception to be valid, is that the restriction of IP rights must be limited to special cases. However, the purpose prescribed for seeking mandatory access to databases under the NPD framework is ‘public good’, an all-encompassing term that is neither specific nor restrictive, and that may therefore extend to circumstances beyond the ‘special cases’ contemplated under TRIPS.
Cases that have attempted to define ‘public purpose’ in the context of land acquisition by the government illustrate how problematic and complex the concept of ‘public good’ could become. This is a term that both courts and executing bodies have struggled to define for decades. In decisions such as Somavanti v. State of Punjab, the court has held that the government’s decision that a property is needed for a specific purpose is in itself sufficient justification for the acquisition being legitimate, and that no further considerations are necessary. The courts have often refrained from defining ‘public purpose’ on the ground that it is a policy issue that does not warrant judicial intervention. In the absence of any other checks and balances, this has allowed the government to acquire land for any conceivable purpose (including private projects) and bring it within the definition of public purpose. This lack of clarity has caused massive confusion around the eminent domain powers of the State to acquire land for ‘public purpose’. The scope of ‘public good’ is likely to run into similar hurdles in the absence of a more nuanced approach that specifically defines the circumstances permitting mandatory access.
Conclusion: Why are database protections important?
The value that IPR brings for owners and creators is the ability to earn from their investments, which further incentivises innovation. For society, the value lies in access to improved services. In the context of databases, businesses invest time, labour and organizational skills to collect the required volume of data, verify its accuracy, and create marketable products or services from it. One of the biggest ill-effects of confusion over IP for databases is that it disincentivises database creation, so that fewer databases are produced than is socially desirable. Community ownership and exceptions for research and academia are also important questions to factor into the discourse on database protections.
It is important for regulatory clarity on the ownership of non-personal data to come before any attempt to mandate government access to it. If left unresolved, the resulting uncertainty is likely to produce large-scale negative effects for the data economy by impeding business interoperability and access to data, and by hampering data reuse. Considering that one of the principal objectives of the NPD framework is to incentivise innovation and entrepreneurship, it becomes important for the framework to balance private ownership against public and government interests through an approach that creates certainty for investments while also incentivising creation and sharing.