October 30th, 2020
Amid the rapid growth of the tech industry, and of the platform companies within it, data has become the primary source of market power – the focal point of the digital age. In 2018, 2.5 exabytes of data (2,500,000,000,000,000,000 bytes) were generated each day, and the pace has only quickened – it is estimated that by 2025, 463 exabytes will be created globally every day. A large portion of this is generated by average users of online platforms – the backbone of the digital age. In an economy centred around user data, it is pertinent to question users’ role and agency in how their data is used – particularly given the staggering revenue it generates. This article considers the current power structure embedded in the data economy, the imbalances within it, and how data stewardship can potentially resolve some of these imbalances.
Power structures in the data economy today
If the statistics detailing global data generation are startling, the question of who owns the value of this data is doubly so. The majority of this data comes from individuals using free services offered by platform companies (Facebook, Google, Uber, etc.), so the usage and value extraction of this data also lie in the hands of such companies. It is used to generate relevant insights about users’ thought and behaviour, and to guide that behaviour – informing decisions like when to advertise what, and to whom. It is this model that Zuboff articulated as “surveillance capitalism” and, as profit-driven organisations are wont to do, these insights are deployed to benefit only a narrowly defined group of people. As a result, the role of the individual in this vast data network is increasingly diminished, despite individuals being the very source of this data. The unique nature of data also magnifies the power that technology giants hold. Unlike that of natural resources, the value of data does not diminish with usage. Instead, data can be used a near-unlimited number of times by various entities to fuel further inferences. This value can also be exponentially enhanced by combining different datasets.
Yet individuals’ relationships with large data controllers or requestors feature little agency, transparency or accountability, which poses many problems. For example, there is a serious dearth of algorithmic accountability on the technology platforms that mediate our relationships with work, the state, health, access to services, etc. In-design and post-analysis decisions made by algorithms are not limited to entertainment recommendations (which have themselves received serious critique – for biases favouring harmful, radicalising or age-inappropriate content, and for often allowing the monetisation of such content). They also extend to labour rights issues, like the sorting of applications and the problematic algorithm-based management of gig workers’ opportunities (and consequently, their earnings). Algorithmic decisions are also routinely subject to societal biases, which enter these systems by various means – like biased training data or flawed causal assumptions in their foundational programming.
And when these algorithms are “black boxes” with little transparency or accountability, they unfairly limit opportunities, restrict services and service offerings, and lead to ‘digital redlining’: a form of data discrimination that uses digital identities and activities to bolster inequality and oppression. Even when protected attributes like race, gender or income are not explicitly stated, they can still form the basis of algorithmic decision making, because numerous effective digital proxies exist for the same information.
At the micro level of consent, average users regularly receive privacy notices and terms of service regarding their data and its management. However, most users are not equipped with the requisite expertise, foresight or nuance to understand these terms, or their possible consequences. Data literacy is a complex and dynamic field, while the reach of these platforms cuts across various socio-economic, cultural and literacy contexts. Yet these notices essentially serve as a barrier to a desired service, situating consent on a one-way street rather than in a conversation.
But whether we think of this (inexhaustive) list of issues with consent, or of algorithmic accountability (opaque to users as well as external researchers), or of debacles like Cambridge Analytica – the imbalance, opacity and sheer concentration of power in the data economy are glaring. Not only are the volume and commodification of people’s data growing swiftly, but this data lies disproportionately siloed with, and controlled by, a handful of technology giants. It is also clear that individuals, without collective action, cannot protect themselves or exercise any agency over how their data is used and managed.
There is an urgent need to redistribute power in the data economy and to ask – how do we as a society want to govern our data, and who should be in control of it?
Stewardship – a piece of the data governance puzzle?
With growing conversation around building a more equitable data economy, a kind of ‘new social contract’ for the digital age, concerns are not limited to user protection. There is also a need to unlock the societal value of data – given its unique and exponential nature – beyond corporate commodification, and toward public good. As Mazzucato puts it, “government platforms now have enormous potential to improve the efficiency of the public sector and to democratise the platform economy.” The challenge of data governance and of redistributing power does not stop at regulation and retroactive harm-reduction – solutions must be poised to proactively empower the user to negotiate better with technology companies, and to create a fertile environment in which to responsibly explore and harness the societal value of data. Given the currently skewed power structure, and the vision of restructuring the data economy and creating proactive data governance, data stewardship may well form a sizable piece of the puzzle.
A data steward is an intermediary entity between data-requestor and data-sharer (let us call them requestor and principal respectively). This intermediary holds a set of governance, technological and structural measures to unlock the value of data while safeguarding rights. A steward’s intermediary actions would include facilitating data negotiations with technical expertise that principals may not have. Stewards could also act as consent managers in such negotiations – ensuring that principals can effectively revoke consent, provide conditional consent and proactively be made aware of any violations. Depending upon the structure of the steward, this entity could also magnify users’ leverage and power in these negotiations by acting as a representative for a group of principals. A steward can also curate the third parties with which data principals may share their data, and present these to principals as options. In this way, stewardship could potentially cover large ground in the current power chasm between principals and requestors – affording principals agency, insight and more.
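To make the consent-manager role a little more concrete, the following is a minimal sketch in Python of how a steward might record conditional consent, honour revocation, and keep a log of refused requests that a principal can be told about. It is an illustration only, not a description of any existing stewardship system: the names ConsentGrant, Steward, request_access and refused_requests are hypothetical, and treating every refused request as something to report to the principal is an assumption made for simplicity.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class ConsentGrant:
    """One principal's consent for one requestor (hypothetical structure)."""
    principal_id: str
    requestor_id: str
    allowed_purposes: frozenset          # the conditions attached to consent
    expires_at: Optional[datetime] = None
    revoked: bool = False

    def permits(self, purpose: str, at: datetime) -> bool:
        # Consent holds only if it is unrevoked, unexpired, and the
        # requested purpose is one the principal agreed to.
        if self.revoked:
            return False
        if self.expires_at is not None and at >= self.expires_at:
            return False
        return purpose in self.allowed_purposes


@dataclass
class Steward:
    """Hypothetical intermediary that mediates every data request."""
    grants: dict = field(default_factory=dict)
    refused_requests: list = field(default_factory=list)

    def grant(self, consent: ConsentGrant) -> None:
        self.grants[(consent.principal_id, consent.requestor_id)] = consent

    def revoke(self, principal_id: str, requestor_id: str) -> None:
        consent = self.grants.get((principal_id, requestor_id))
        if consent is not None:
            consent.revoked = True

    def request_access(self, principal_id: str, requestor_id: str,
                       purpose: str, at: Optional[datetime] = None) -> bool:
        at = at or datetime.now(timezone.utc)
        consent = self.grants.get((principal_id, requestor_id))
        allowed = consent is not None and consent.permits(purpose, at)
        if not allowed:
            # Assumption: every refused request is logged for the principal.
            self.refused_requests.append((principal_id, requestor_id, purpose))
        return allowed

    def refused_for(self, principal_id: str) -> list:
        return [r for r in self.refused_requests if r[0] == principal_id]
```

In this sketch, a principal who consents to "research" use only would see a request for "advertising" refused and logged, and every request would be refused once the principal revokes consent. A real steward would also need authentication, audit trails and legal safeguards, none of which are modelled here.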
Beyond proactively empowering principals, a steward would also seek out, evaluate and facilitate valuable data-sharing partnerships. This function is key to accessing the societal value of data. As mentioned earlier, combining datasets, reusing data, and having insight into what these combinations can look like greatly magnify the knowledge and inferences that data sharing can yield. A steward would not only be equipped to understand and facilitate this process, but is envisioned to be incentivised toward public good – something one cannot meaningfully expect from profit-driven commercial entities. There already exist examples of data used to streamline public good, which stewardship could greatly enhance. In this way, a data steward could steer data practices toward community value – ‘for the people’ as opposed to a more commercial, singular ‘of the people’.
Theoretically, then, such an intermediary is capable of enhancing the accountability of platforms, strengthening users’ control over their own data, and harnessing the societal value of data – all of which appear to be steps toward a more equitable data economy.
What makes a ‘good’ data steward?
Scholarship and discourse around the roles and functions of a data steward are ongoing and varied. However, if the need of the hour is to redistribute power in the data economy, afford agency to principals and forge responsible, valuable data-sharing structures, the most fundamental stewardship principle seems to be ‘people first.’ It is essential for a steward to be incentivised toward user rights and public good at every stage, and to have codified responsibility toward those it serves to empower. Structurally, it follows that a stewardship role cannot be absorbed into a profit-driven organisation for ‘self stewardship,’ given the conflict of interest.
Additionally, a steward must be well suited to the kind of data and data use it is stewarding – both in scale and technical capacity. Across sectors, data-use purposes and contexts, user protection and data value-extraction have various nuances and requirements. For example, a principal may use one steward to manage their health data, and another to manage their entertainment data. This brings us to the challenge within a challenge:
What structures can a data steward embody?
As with the roles and functions of stewardship, research and implementation surrounding the possible structures of these stewards also vary greatly. There exist a number of different models of stewardship across different use-cases, sectors, contexts and types of data. Some of the currently debated and piloted models include data trusts, data exchanges, data collaboratives, account aggregators, personal data stores and data commons. These models may overlap or differ on metrics such as roles, functions, principles, access and limitations. Data stewardship holds great potential to solve power imbalances in the data economy, but ideas of stewardship frameworks are constantly evolving in their structures and implementation – and the task of data governance remains as hefty as it is imperative.
To this end, we at Aapti’s Data Economy Lab research the ways in which data stewardship can unlock the societal value of data while safeguarding individual and community rights. We evaluate existing and possible future frameworks, and facilitate pilots of data stewardship.