In a COVID-disrupted world where technology plays a more intrusive role than ever, 2011 seems like the distant past. Back then, awareness of the responsibilities around personal data stewardship was sparked by the European Union (EU) ePrivacy Directive, or ‘Cookie Law’.

Developers had originally used cookies as tools, small snippets of data that ‘make the web work’ in terms of personalisation. This morphed into a leading role for cookies – creating profiles to track people across the web as part of the huge market in behavioural advertising. Despite a constant stream of regulations, breaches, and personal data scandals, data growth has continued on an explosive trajectory as tracking gets more granular and more connected devices come online. On average, every human created 1.7 MB of data per second by 2020.

People now distrust and fear the use of their data, with concerns that range from annoying ads to misinformation and emotional manipulation. It’s clear the current system also does not serve consumers of data well; ad fraud rates are around 50%, with much flowing to organised crime. Each organisation must comply with the EU General Data Protection Regulation (GDPR) equally, so a disproportionate burden falls on smaller businesses, startups, and less well-funded public projects. They lack legal resources and revenues from personal data but must still discover, manage and disclose the risks to citizens around its use.

While cookies may have had their day, centralised solutions like Google’s Privacy Sandbox, curated “FLoC” audiences and less scrutinised plans from other ‘walled garden’ platforms take the web further from its original, decentralised ideals.

FLoCs, or Federated Learning of Cohorts, work by assigning each browser an anonymised ID, then adding that into a large group where only the overall patterns are accessible to advertisers. Since this happens on people’s computers, their data wouldn’t get stored on a server; server-side storage is one of the privacy concerns associated with third-party cookies. So far, so good. However, this rouses the concerns of privacy activists because it raises the level of browser fingerprinting. Since FLoC takes browsing behaviour and uses it to create an identifier before assigning people to a group, whoever wants to track you already has a lot of the work done for them.

More worrying is the increased concentration of control in the hands of dominant platforms; Google and Facebook already represent 70% of internet traffic, and are responsible for 90% of the ad industry’s annual growth. SimHash, the algorithm Google uses to create IDs and cohorts, was originally created for use by their web crawlers to find nearly identical web pages. As search has evolved from a useful tool to a self-referential advertising ecosystem with Google at its centre, the current danger is that the same playbook gets applied to citizens, data scientists and even regulators, relying on a single, all-seeing counterparty to ‘do the right thing’.
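To make the SimHash idea concrete, here is a minimal Python sketch of how a locality-sensitive fingerprint could be derived from a set of visited domains and truncated into a cohort label. It is an illustration only, not Google’s implementation: the function names, the 16-bit fingerprint, the use of SHA-256 as the per-feature hash and the 4-bit cohort prefix are all assumptions made for the example.

```python
import hashlib


def simhash(features, bits=16):
    """Return a `bits`-bit SimHash fingerprint for a collection of strings.

    Each feature votes +1 or -1 on every bit position; the sign of the
    final tally decides whether that bit is set in the fingerprint.
    """
    counts = [0] * bits
    for feature in features:
        digest = hashlib.sha256(feature.encode("utf-8")).digest()
        value = int.from_bytes(digest[:8], "big")
        for i in range(bits):
            if (value >> i) & 1:
                counts[i] += 1
            else:
                counts[i] -= 1
    fingerprint = 0
    for i, tally in enumerate(counts):
        if tally > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming_distance(a, b):
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def cohort_id(domains, prefix_bits=4, bits=16):
    """Hypothetical cohort label: the top `prefix_bits` of the fingerprint."""
    return simhash(domains, bits) >> (bits - prefix_bits)


# Hypothetical browsing histories (illustrative domain names only).
history_a = ["news.example", "recipes.example", "cycling.example"]
history_b = ["cycling.example", "news.example", "recipes.example"]

# The fingerprint depends on the set of domains, not the order visited.
same_cohort = cohort_id(history_a) == cohort_id(history_b)
```

Histories that share most of their domains tend to produce fingerprints that differ in only a few bits, so anyone who learns a cohort label has already learnt something about the behaviour behind it. That is the fingerprinting concern in miniature.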

As we’ve seen ESG (Environmental, Social and Governance) and responsible investing reach a tipping point in 2021, this reign of centralised platforms may now be coming to an end. New regulations focused on individual sovereignty, decentralised technology and a humanistic framework for federated income will play major roles in shaping a New Data Economy.

Are we there yet? The long road to data emancipation

Despite the role of regulators, privacy commentators and technologists in propagating cookie banner technology, many are now frustrated with the lack of progress beyond this most basic tool for individual data control. We all click banners and are served legal disclaimers every day, without believing our rights are really being respected. Advertisers, data scientists and other data consumers are also concerned about using aggregated third-party information with consent obtained at best through acquiescence, rather than “freely given” as the GDPR requires.

They worry such consent is not valid, and the new ePrivacy Regulation, successor to the Cookie Law and the subject of four years of deliberations, increases the likelihood that insights from such data will be outweighed by risks of regulatory action and class-action lawsuits.

Unlike the GDPR, it covers not only personal data but also metadata and confidentiality requirements, and will apply to instant messaging apps and machine-to-machine communication. ‘Cookie walls’, although not totally outlawed, need to offer a choice, and organisations face a greater burden to prove consent and to offer control to individuals.

This change is reflected in brand and publisher moves to build stores of first-party data where they can be confident of its legitimacy. Loyalty schemes that had previously been outsourced to third parties have been brought back in-house, and brands such as IKEA have committed to a “Customer Data Promise” which in spirit at least goes beyond compliance obligations.

In practice, preference centres and in-app controls to manage consent choices, limit data use, and request deletion, such as IKEA’s, offer more control to individuals, but this still takes place on the organisation’s digital estate. The problem is that means of data production and management stay with the Data Controller. Organisations ‘mark their own homework’, managing rights of citizens with complaints to the regulator the only redress available. What if we each had a preference centre, a choice of consumer-grade tools and web destinations where we can run our daily lives without being bought and sold in the marketplace?

The idea of data wallets or Personal Data Vaults (PDVs) is not new. They were embraced most visibly by Tim Berners-Lee, the inventor of the World Wide Web, as tools for people to have a meaningful role in the system which is meant to provide them with value. A PDV inverts the cookie model; instead of websites and e-retailers forcing cookies on users, users ‘carry’ their own vault (identity credentials and personal data), which can only be interrogated on terms they set. Unfortunately, a data wallet without interoperability with the existing digital ecosystem is a pretty limited piece of software, which is how most experiments with PDVs at scale turned out.
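The inversion can be pictured with a short Python sketch of a vault that answers queries only on its owner’s terms. This is a toy model under stated assumptions, not a description of any real PDV product: the class name `PersonalDataVault`, its methods and the idea of keying consent by purpose are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PersonalDataVault:
    """Toy vault: the owner holds the data and sets the terms of access."""

    data: dict
    # Maps a purpose (e.g. "health_research") to the fields shared for it.
    consents: dict = field(default_factory=dict)

    def grant(self, purpose, fields):
        """Owner agrees to share the given fields for one purpose."""
        self.consents.setdefault(purpose, set()).update(fields)

    def revoke(self, purpose):
        """Owner withdraws all consent for a purpose."""
        self.consents.pop(purpose, None)

    def query(self, purpose, requested_fields):
        """Return only the requested fields the owner has consented to."""
        allowed = self.consents.get(purpose, set())
        return {
            name: self.data[name]
            for name in requested_fields
            if name in allowed and name in self.data
        }


# Hypothetical owner data and a single consent grant.
vault = PersonalDataVault(
    data={"age_band": "25-34", "postcode": "N1", "steps": 8000}
)
vault.grant("health_research", {"steps", "age_band"})

shared = vault.query("health_research", ["steps", "postcode"])
refused = vault.query("advertising", ["steps"])
```

Here a researcher asking for `steps` and `postcode` receives only `steps`, and an advertiser with no grant receives nothing. The decision sits with the vault, not with the party making the request.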

The problem is two-fold. First, the current system prioritises convenience (form filling, logins) from data already kept in others’ silos, hampering adoption. Second, the process of repatriating data to your own PDV is difficult (and risky) enough to deter all but the most dedicated. Projects to launch ethical apps or social platforms were stuck, needing to provide the functionality of current incumbents or educate users in new products while lacking buy-side engagement from advertisers habituated to the ad tech economy. This flows through to a relative lack of investment and, in many cases, niche status. We need a structure for academic projects and data activism to evolve into new commercial frameworks and tools for motivated individuals and counterparties to engage with a fair and transparent marketplace – at scale.

Beyond compliance – data unions and digital free trade

One proposal to recast the digital economy pushes the notion of a Universal Basic Income (UBI) from data, led by Andrew Yang and the Data Dividend movement. This hinges on compulsory redistribution of data income from platforms to individuals.

More recently, the movement has expanded into attempts to litigate on behalf of “data subjects” (people) to seek redress for mispricing or misuse of a particular community’s personal data through its technological or non-technological platform. This casts the state as a prime intermediary but does not solve the problem of data being siloed, or uncover its fair value. While popular, it’s best seen in the context of broader antitrust and tax initiatives focused on big tech. An unfulfilled promise of GDPR was around data portability. Organisations got their house in order in terms of security and compliance obligations but didn’t progress to opening their silos.

In late 2020 regulators in Europe and California drew up laws to encourage the share and earn model. Their intentions were for the New Data Economy to succeed, and for people to bargain collectively for their rights:

“Value of data produced in the EU per year is €1.5T…we want to put those who generate data in the driver’s seat… data needs to be shared and exchanged more widely….” (Ursula von der Leyen, European Commission president, February 2021).

In particular, the EU’s Data Governance and Digital Markets Acts build on progress made by California’s Consumer Privacy Act, namely:

  • Legal standing and powers for data unions, so data buyers and regulators can trust and delegate to professional, human representatives instead of big tech.
  • Allowing users of Amazon, Facebook, Google, and other massive platforms (known as gatekeepers) to transfer their data in real-time to third parties like data unions.
  • Funding in the form of two billion euros in grants to build systems and infrastructure to support the New Data Economy.

So how do data unions (or data trusts in the UK) represent the key to digital free trade? As accountable, transparent structures for individuals to collectively exercise control and bargain for their data’s value. Data unions can be based on existing institutions from the offline world like clubs, public bodies, and charities, mirroring their own articles of association around data use and ethics.

Each time you interact with a digital streaming platform, the data you choose is sent to a trusted third party to be monetised on your behalf. How about donating your wearable data to the UK’s National Health Service for research, instead of being siloed with Fitbit’s owners, Google? Not a problem. Data portability creates a multiplier effect for pioneering data unions focused on elements of the existing tech stack, like search or browsing. One example is Swash, which runs a browser plugin that separates identity from data and monetises for users based on their explicit consent and choices.

Focusing on existing web activity helped adoption to a point where Swash has 60,000 users and sells insights directly to buyers instead of existing platforms. This new legislation allows Swash users to plug in fresh sources of data, knowing their identity, privacy, and transparency around use are protected.

Data wallets also get a new lease of life: the ‘plugged in PDV’ combined with data portability is open and interoperable, but access to data stays in the control of the user. Once the PDV owner has joined several data unions, over time their data becomes more significant and valuable to advertisers, application developers, and data scientists. Based on data union members’ consent and preferences, their PDVs become discoverable and query-able by these third parties. Clearly, a different proposition altogether from cookies, or curated ‘audiences’ like FLoCs.

While the role of data unions in data monetisation is a key driver, whether they take off as predicted over the next three to five years equally depends on how they affect broader economic justice and agency over data, beyond material ownership. Collective agency is a key weapon in effecting positive change around rights, and this needs to be brought to the fore if privacy activists (who tend to be against data monetisation) are to be brought along. Through data unions, people can connect with others who share the same values and use their data strategically to influence corporations and other institutions that may have proved resistant to individual requests.

Bringing it all together

The past decade produced a regulatory environment and openness regarding a new deal around data, but creating a new ecosystem, where portability and ownership are practically enshrined in balance with this, requires mass adoption by stakeholders. While decentralised Web3 technologies exist, including crypto-secure wallets, data markets like Ocean, and rails for micropayments to individuals, the task of knitting these together in an accessible way is significant. How can individuals and groups new to this get started?

Incubators are needed for data unions to be created at scale and matched with standards and protocols for advertisers and data consumers to engage commercially, just as the ad tech Lumascape set market conventions and agreements which made it attractive for investors. These are starting to emerge; Pool will offer a service to nascent data unions which covers setup, legal advice and technical support.

With unions up and running, Pool operates a data ‘clearing house’. Once members’ checks and preferences are met, they transmit a stream of information for aggregation with other union members’ data streams. These aggregated streams are bundled and retailed onward to third parties requiring access to large data sets of consumer information. In the face of unilateral schemes from platforms, data buyers, brands and Web3 technologists are cooperating through the Data Privacy Protocol Alliance (DPPA) to define best practices and technical standards for an open data economy. This departs from the traditional model of a trade association and reflects how far the Overton window has moved in terms of data commerce.
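The clearing-house step described above (check each member’s preferences, then aggregate the streams that pass) might look something like the following Python sketch. It is not based on Pool’s actual system: the data layout, the function name `aggregate_streams` and the `min_group_size` threshold, which withholds output when too few members contribute, are assumptions added for illustration.

```python
from collections import Counter


def aggregate_streams(members, category, min_group_size=3):
    """Combine members' events for one category into aggregate counts.

    `members` is a list of dicts, each with:
      - "shared": set of categories the member has agreed to share
      - "events": dict mapping a category to a list of observed values
    Members who have not opted in to `category` are skipped entirely.
    If fewer than `min_group_size` members contribute, nothing is
    released (an assumed safeguard, not something the article specifies).
    """
    counts = Counter()
    contributors = 0
    for member in members:
        if category not in member["shared"]:
            continue
        contributors += 1
        counts.update(member["events"].get(category, []))
    if contributors < min_group_size:
        return {}
    return dict(counts)


# Hypothetical union members and their sharing preferences.
members = [
    {"shared": {"music"}, "events": {"music": ["jazz", "rock"]}},
    {"shared": {"music"}, "events": {"music": ["jazz"]}},
    {"shared": set(), "events": {"music": ["pop"]}},  # opted out
    {"shared": {"music"}, "events": {"music": ["rock", "rock"]}},
]

bundle = aggregate_streams(members, "music")
```

The buyer receives only the bundled counts. The opted-out member’s `pop` event never enters the aggregate, and no individual stream is passed on.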

A proliferation of crypto tokens is not in the interest of consumers; central banks can play a role in creating or providing stablecoins with oversight and guarantees, enabling a common currency for micropayments rather than acting as dispensers of UBI.

Conclusion – from consensus to coalition

“The current system of every digital offering managing its own set of data profiles cobbled together from unreliable third party device graphs and tracking data will seem bizarrely inefficient in retrospect.” – WPP Data 2030 Report

As the digital ‘wild west’ age transitions to a mature, hyper-connected phase, the reality of data centralisation among a handful of platforms is becoming more evident, and the counterbalance to this must be structural and social. Jaron Lanier and others have made the observation that services like Google and Facebook exist only because of the acceptance of a massive amount of distributed ‘volunteer labour’ through billions of people’s data. The opportunity for data unions, once placed in the context of a regulatory framework to act as a force in collective bargaining, is extraordinary. Bringing representation and balance to matters of policy, legal redress, and embedding new, privacy-centric technology in these platforms is a multiplier for the common good which goes beyond the monetary aspect.

Instead of being an ‘eyeball’, people in this individual data economy can manage their rights to carry, delete or share their data in one click. We can know who uses our information through distributed technologies embedded in the web, mobile and smart devices that follow open standards. We can seek out the best data unions whose missions and practices align with our values, and be assured this flows through to action. The focus should now be on education and integration of this movement with the legitimate needs of state institutions, data consumers, and responsible advertisers, with the goal of a sustainable data ecosystem flourishing without state support within the next decade.

