To say that the agricultural sector is a crucial part of the Indian economy would be an understatement. The sector’s contribution to the national GDP increased from 17.8% in 2019-20 to 19.9% in 2020-21 – the steepest rise in more than 30 years. It is hardly surprising, then, that the government has turned its attention to developing the sector. However, rather than focusing on the interests of the farming community, the new policies give preference to the private sector.

In 2020, protests erupted across the country in response to the three farm bills (now Acts) introduced by the government. The bills paved the way for corporate entry into the agricultural sector, putting the livelihoods of farmers at risk. One of the main points of contention concerns the changes brought in by one Act, The Farmers’ Produce Trade and Commerce (Promotion and Facilitation) Act, 2020. This Act allows farmers to sell their produce directly to businesses, outside the Agricultural Produce Market Committees (APMCs), where governments buy produce from farmers at a Minimum Support Price (MSP). A number of farmers and farmer groups fear that this will make the MSP system irrelevant, depriving farmers of an assured source of income and forcing them to sell only to major corporates. Farmers were not adequately consulted when the bills were drafted, and key issues such as the imbalance of power between big businesses and small farmers were not taken into account. This has only heightened the trust deficit between farmers and the government.

It is thus no wonder that the AgriStack was viewed with suspicion when farmers found out about it. The AgriStack is a collection of technologies and digital databases, proposed by the central government, focusing on India’s farmers and the agricultural sector. It is supposed to facilitate innovative agri-focused solutions, leveraging digital technologies, to address a wide range of issues, including farmers’ access to credit and better implementation of government welfare schemes.

Information about the creation of this platform came into the public domain via a question in Parliament; no official documentation yet exists detailing ownership of the platform and the data to be stored on it. According to this explainer, there is a possibility that banks and insurance companies might get access to the information, particularly financial information, on the basis of which they will make crucial decisions pertaining to loans or premium amounts. There is no documented evidence of a pilot having been carried out, or of a consultative process for assessing the impact of this platform before its rollout.

The government also recently released the “Consultation Paper on India Digital Ecosystem of Agriculture” (hereinafter ‘IDEA’) with the objective of digitising the agricultural sector and thus doubling farmer income. It approaches the entire process of digitisation from a market-centric perspective while ignoring the interests of the data principals, i.e., farmers. This is evident from the data flow describing a one-way process – information is collected from farmers and shared with businesses which glean valuable insights and create databases and services. 

An important takeaway from these developments is that farmers are being perceived less as individuals engaged in an integral occupation and more as individual data points needed to build a larger data set. The strongest evidence of this is that these policies are being drafted in the absence of legislation protecting farmers’ privacy. India does not have a personal data protection law yet, with the Personal Data Protection Bill, 2019 (PDP Bill) remaining in limbo in Parliament. The government has only recently begun to think about non-personal data (NPD) governance, with the revised report of the Committee of Experts on Framework for Non Personal Data Governance (NPDR) being released in December 2020. The report, which only proposes a framework for NPD governance, has drawn considerable criticism, and a good deal of work remains to be done before even draft NPD legislation is arrived at. Government and big business are prone to taking advantage of this situation, resulting in a crisis of data extractivism. Farmers’ data is collected, analysed and used for market purposes, without considering crucial questions relating to consent or compensation to the community for sharing data. This leads to a gradual erosion of user agency – farmers have less of a say in how their data is going to be used, and with whom it is being shared. That farmers’ data holds value, however, is undeniable. Agricultural data can benefit farmers directly (by helping improve crop yield and optimise land usage) as well as further larger public policy goals (ensuring food safety and sustainable resource usage). But individual farmers are ill-placed to negotiate with businesses for adequate compensation for their data.

This paper hypothesises that farmers would be better placed if they were represented by a neutral intermediary who is incentivised to act in their best interests. This intermediary – a data steward – would engage at the community level by including farmers in all stages of the data sharing process. Stewards can also be made responsible for ensuring that data is unlocked to generate societal value. The paper examines the possibility of the steward being a representative of the farmer. Different types of existing models of stewardship are reimagined from the perspective of the agricultural sector. The advantages and limitations of such models are then set out in the hope that they will help improve future policies.

For purposes of research, the author has referred to existing literature on the subject (which is limited, as the concept of stewardship is still evolving) while also deriving inputs from interviews with community-based organisations that work with agricultural data. This analysis draws on Aapti Institute’s Data Stewardship Taxonomy report and ongoing study mapping stewardship variations in the ecosystem. 

Giving back control of data to farmers

Farmers, the generators of agricultural data, possess comprehensive knowledge of their land and crops. The combination of historical data and years of experience makes them best equipped to take decisions on issues such as which fertilisers will increase yield and which crop is best suited to the soil condition. Making such decisions, however, is not a stand-alone process. Farmers depend on external resources such as the India Meteorological Department for weather data; this is also where the private sector comes in. Businesses provide farmers with applications whose selling point is individual farmer-specific advisories. Digital Green’s application, CoCo, is an example of such a business model. These applications build their data sets from raw data collected from individual farmers, draw inferences from it, and then give those inferences back to the farmer as an ‘agricultural advisory’.

Undoubtedly, these applications have benefitted farmers significantly by helping increase output and income. However, the drawback is that over the years, farmers have increasingly been distanced from their data and the benefits accruing from it. Proprietary models of data collection adopted by the private sector exclude farmers from accessing the inferred data, instead using this data to drive their businesses and profits. This system of proprietary models also makes interoperability difficult for the farmer – shifting from one company to another is almost impossible because they cannot access historical data. Technological lock-ins are just one of the many ways of taking away farmers’ ownership and agency regarding their data.

A primary reason for this steady loss of control has been the stagnant growth of digital literacy in the sector. According to a NITI Aayog study, casual workers in the agricultural sector have the lowest levels of digital literacy, at 13%. The figure stands at 24% for those who earn a regular salary working in the agricultural sector, and 26% for those who are self-employed in agriculture. These numbers are low in comparison to the non-agricultural sector (15% for casual labour, 53% for regular salary earners and 32% for the self-employed). The knowledge gap leads to an information asymmetry.

This information asymmetry manifests in two resultant problems. The first is the limited bargaining power of farmers, which in turn exacerbates the already existing power imbalance between big business and farmer communities. Take, for example, the failure of the Land Acquisition Act to compensate communities it has displaced over the years. Their only recourse is to approach the courts – an option seldom exercised, particularly by individual farmers who lack the financial wherewithal. The situation is worse when it comes to data privacy, as farmers are left with little recourse in the absence of fundamental data protection legislation.

The second is the absence of an accessible grievance redressal mechanism for farmers to address harms and hold big businesses and data processors accountable. Over the years, these mechanisms have become less approachable for impacted individuals, for many reasons, including the onus being on the farmer to detect harms (which can be difficult in the absence of digital literacy), difficulties in compliance, and soft penalties on companies. For instance, while the Plant Varieties Act itself contained provisions for farmers to claim compensation from companies selling bad seeds, the new Seeds Bill, 2019 requires individuals to approach the Consumer Court under completely different legislation (Consumer Protection Act) to claim compensation. Such changes raise barriers to grievance redressal for farmers by requiring a higher level of legal literacy. The issue becomes more complicated in a situation where the privacy rights of farmers are violated. Without the PDP Bill, the only option for farmers is to approach the High Courts or the Supreme Court, which can be a very time- and cost-intensive process.

Aspirations regarding technological development have led to a situation where farmers have to fight to retain control over their data by finding ways to exercise user agency. In light of this, we recommend the use of the data stewardship model to support farmers.

The case for stewardship

A data steward is defined as a trusted intermediary who works on behalf of users to manage their data without any vested interests. A data steward has four main functions: collaboration, management, accountability, and intermediation. A data steward provides a platform for individuals and entities to pool and share their data and bring different data sets together. Further, the data steward manages the sharing of data and can negotiate with data requesters on behalf of users. With the help of a data steward, farmers can exercise greater control over their data and also ensure that they benefit from the use of their data. However, as data stewardship is an evolving model, there are many nuances to consider.

For one, a steward cannot always be expected to be a neutral body acting without any interests regarding the data. As a business model, stewardship might not be profitable if the monetary compensation a steward receives is small compared to the services it provides. The steward might choose to run parallel businesses to make up for these costs. Regulations on stewardship can set standards to ensure that data stewards work for the community and that their separate businesses do not conflict with the farmers’ interests.

Secondly, the structure could be such that the steward will be directly accountable to the community – its decisions to share data must be transparent and full disclosures have to be made about the safety and security standards followed in the process. The steward must also ensure confirmation of the community’s consent before sharing data by including its members in the decision-making process. The data steward will act as a negotiator between the community and a third-party data requester. 

The stewardship model has been experimented with in other jurisdictions. In the US, Ag Data Transparent was developed as a result of the Privacy and Security Principles for Farm Data drafted by the American Farm Bureau Federation, working with commodity groups, farm organisations, and agriculture technology providers. These principles set certain standards for companies to fulfil, such as formulating data protection policies to ensure that the information collected is protected from harm. Companies that comply are given a certification by the data steward, which helps farmers trust the companies they engage with. These ‘standards’ were set keeping in mind the best interests of farmer communities, which Ag Data Transparent is bound to protect.

Data stewards establish a relationship of trust through their constant engagement with the community, which ensures a bottom-up representation of interests. They are also usually more prepared in terms of technical capacity and digital literacy, which is why they can take on this role. 

This evolving model must account for critical questions such as the extent of the data steward’s liability in case of harm caused to the community due to de-anonymisation of data, or the termination of the contract with the steward if it begins to show vested interests while handling community data.

Reimagining existing use cases of data stewardship from an agricultural sector perspective

There already exist different models of data stewardship in India, though none are sector-specific. We now look at these models from farmer community perspectives to understand their likely working if implemented specifically for farmer benefit. 

  1. Data cooperatives: This is a system in which individual farmers or existing farmer groups voluntarily come together to pool their data, which is then shared further for their benefit. Alternatively, existing farmer cooperatives can partner with institutions that can help them implement necessary systems to exercise control over their data. The benefit of this model is that all the members have an equal stake in the partnership, and an equal say in the decision-making process. Since the cooperative consists of community members themselves, there is an assurance that it will work in their best interests. A possible disadvantage of this model is the method used to make decisions while sharing data. Should cooperatives listen to the majority while taking a vote on sharing data, it would mean that the data of those who have not consented or approved would be shared as well. On the other hand, can one voice of disapproval or dissent prevent the data sharing from taking place at all? There must also be standards set to ensure that members reluctant to share their data do not conform merely because other members have done so. 
  2. Data trusts: Data trusts are intermediaries established under a policy or legal framework. They store and use data only for specified purposes, and thus do not have any vested interest in the information. This model has been recommended in the NPDR. The policy reiterates the use of data from the agricultural sector as a prime example of ‘high value data sets’ – data that must be ‘extracted’ in order to promote innovation in the economy. Communities can exercise control over their data through the ‘data trustee’ which can be a government body, a not-for-profit organisation or a group where community members come together to make decisions. The data trustee model in the NPDR is not a viable option for many reasons. To begin with, the policy does not define what constitutes a ‘community’. If the trustee is a government body, there is a possibility of a conflict of interest when it comes to making decisions on data sharing for governance purposes – should community interests be prioritised over national interest or vice versa? The NPDR provides little visibility on the working of data trustees and must ensure more clarity on their role before finalisation of the policy.
  3. Personal Data Stores (PDS): These are “single storage points” containing an “aggregated set” of users’ information, and they empower users to have greater control over their data. They are similar to data trusts in that “they accord users with a high degree of specific control over usage and sharing of data”. The only difference is that in the data trust model, the sharing is for a predefined purpose. Here, however, the user can choose specific instances when data is to be shared. This is still a pilot concept and not regulated in most countries. The disadvantages of this model when applied to the agricultural sector are similar to those of the account aggregator model – it places too much burden on the farmer to understand the terms and conditions before sharing data, which might lead to sharing data in ways that harm the farmer’s interests. This dilutes the efficacy of user control.
  4. Data exchanges and collaboratives: In this model, large existing data sets are pooled, either for public benefit or commercial purposes, and the data steward decides where the data is going to be used. This can be done by government or private bodies, which can unlock data for greater value. The pooled data can be used for prediction and forecasting of disasters, assessing impacts of projects or even to design public services. Some examples in the agricultural space include WayCool and Digital Green, both data exchange platforms that also act as consent managers for registered users. However, there are still no regulations to guide usage of these models, which leaves them open to misuse. For instance, there must be direct accountability to users by ensuring that the data exchanged is being used for the specified purpose only. Data in these pools must necessarily be shared in an anonymised form to protect data subjects from harm.


The stewardship model, prima facie, seems ideal for representation of farmer interests. It not only ensures user agency over data, but also works in the best interests of the farmer. The characteristics from each of the data stewardship prototypes discussed above must be considered from the perspective of the community before implementing on a larger scale. This can be done through pilot studies across states and the model can be taken forward based on the feedback.

Additionally, since data stewards will not have any monetary interest in the community’s data, the government must make investments to support maintenance of the structure. Setting up sector-specific data stewards would be more beneficial than general data stewards because they would grasp community needs better. Stewards should preferably be independent not-for-profit organisations rather than a public or private body to avoid any conflict of interest while managing the community data. The possibility of a public or private body representing the community may be considered if there are clear regulations assuring that the community’s interests get preference in case of such a conflict.

The data stewardship model cannot be developed in a vacuum. Holistic participation can take place if the farmer community’s digital literacy skills are simultaneously developed, another task for the government to take up. Learnings from the training can be honed through further interactions with the data stewards, which would enable farmers to be exposed to the intricacies of digital developments.

As an alternative to the current neo-liberal system that prioritises corporate interference in the agricultural sector, the food sovereignty model can be adopted to ensure community participation in the system. According to this concept, people have the right to define their own agricultural policies. This means that they become an essential part of the discussions around the framing of these policies, which can only happen if a bottom-up approach is used in the drafting process. Policies by, for and of the community cannot be formed without the community’s involvement, and for this, necessary support structures such as a data steward must be provided.