April 6, 2023

Data Privacy and Consumer Trust Part III

Reconciling privacy and transparency.

Catherine Daly

Head of Product Marketing

In this three-part blog series, we explore how data collection in the digital age has led to a breakdown in consumer trust, how blockchain technology seeks to solve that problem, and how to address the limitations of the blockchain solution.

Part III introduces new solutions that address the limitations of blockchain as a tool for enterprise-scale data management. Read Part I here and Part II here.

While blockchain technology offers users control over their data and the ability to share it securely, it also makes that data publicly accessible, which isn’t appropriate in cases where sensitive or personal information is being shared. In order to protect user privacy, blockchain has to be used in conjunction with other technologies that allow some data to remain private. 

Encryption

The obvious way to introduce privacy to the blockchain is to encrypt the data before it’s added to the ledger. This ensures that only authorized parties can read the data, even though it’s publicly available. But encryption also makes it more difficult to verify the integrity of the data, which defeats the purpose of blockchain transparency. To connect encrypted data to the blockchain, you need a way to verify it.
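One simple way to see the tension is a salted hash commitment: only a digest goes on the public ledger, while the record itself stays private. The sketch below is illustrative only. It uses a commitment in place of real encryption, and the function names and the sample record are assumptions, not part of any specific blockchain API.

```python
import hashlib
import hmac
import secrets


def commit(record: bytes) -> tuple[bytes, bytes]:
    """Return (salt, digest). Only the digest would be published on-chain."""
    salt = secrets.token_bytes(32)  # random salt prevents guessing small records
    digest = hashlib.sha256(salt + record).digest()
    return salt, digest


def verify(record: bytes, salt: bytes, digest: bytes) -> bool:
    """Check a privately revealed record against the published digest."""
    expected = hashlib.sha256(salt + record).digest()
    return hmac.compare_digest(expected, digest)


# Hypothetical record, kept off-chain by the data owner.
record = b'{"age": 34, "purchases": 12}'
salt, onchain_digest = commit(record)
```

Anyone holding the record and salt can confirm it matches the published digest, and any tampering is detected. The limitation is that the verifier has to see the record to check it, which is the gap the next section addresses.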

Zero-knowledge proofs

A zero-knowledge (zk) proof is a cryptographic technology that allows data to be verified without revealing its content. Zk-proofs can be used in conjunction with data encryption to prove that consumer data is being used and shared appropriately without revealing the data itself. Let’s say a company wants to collect user data to offer personalized product recommendations. As a user, you are concerned about the company collecting more data than you’ve consented to. To maintain your trust, the company would need to prove that it is only collecting the data it says it will. Let’s walk through this use case:

  1. The company defines the minimum set of data points that are required to offer personalized product recommendations, such as the user's age, gender, and purchasing history.
  2. The user consents to the collection of this data and the use of it for personalized product recommendations.
  3. The company provides a zk-proof to verify that only the consented-to data was collected.
  4. The company stores the encrypted user data on the blockchain, along with a record of the proof.
  5. When the user interacts with the company's product recommendation system, the company uses the encrypted user data to generate personalized recommendations, without ever revealing the actual data to anyone.
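The walk-through above treats the zk-proof as a black box. To make the core idea concrete, here is a minimal sketch of a classic zero-knowledge-style protocol, Schnorr identification, in which a prover convinces a verifier that it knows a secret number without disclosing it. This is not a proof about data collection and not what a production system would deploy. The group parameters are toy-sized and insecure, and all function names are assumptions chosen for illustration.

```python
import secrets

# Toy parameters (insecure, for illustration): G has order Q modulo P.
P, Q, G = 23, 11, 2


def keygen() -> tuple[int, int]:
    """Prover picks a secret x and publishes y = G^x mod P."""
    x = secrets.randbelow(Q - 1) + 1
    return x, pow(G, x, P)


def commit() -> tuple[int, int]:
    """Prover picks a one-time nonce r and sends t = G^r mod P."""
    r = secrets.randbelow(Q)
    return r, pow(G, r, P)


def respond(x: int, r: int, c: int) -> int:
    """Prover answers the verifier's random challenge c."""
    return (r + c * x) % Q


def check(y: int, t: int, c: int, s: int) -> bool:
    """Verifier accepts if G^s == t * y^c mod P; it never sees x."""
    return pow(G, s, P) == (t * pow(y, c, P)) % P


x, y = keygen()              # x stays with the prover
r, t = commit()              # message 1: prover -> verifier
c = secrets.randbelow(Q)     # message 2: verifier -> prover
s = respond(x, r, c)         # message 3: prover -> verifier
accepted = check(y, t, c, s)
```

The verifier learns that the prover knows the secret behind `y`, but the transcript `(t, c, s)` does not reveal `x`. Proofs about richer statements, such as "only consented-to fields were collected", follow the same spirit with far more machinery.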

But implementing zk-proofs can be complex and technically challenging, particularly for smaller companies with limited resources. The computational cost of some zk-proofs also means that they’re poorly suited to large-scale systems with high volumes of data. As the size and complexity of the data set increase, the time and resources required to compute zk-proofs can become prohibitively high.

zk-SNARKs

A SNARK (Succinct Non-interactive ARgument of Knowledge) is a type of zk-proof designed to be particularly compact and efficient. SNARK proofs are small and quick to verify, and because they are non-interactive, the prover and the verifier don’t need to exchange messages back and forth during verification. For a large-scale system where efficiency is paramount, such as the data warehouse a business uses to manage its data, zk-SNARKs are a more appropriate choice than other zk-proof solutions.
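To illustrate what "non-interactive" means in practice, the sketch below shows a proof of knowledge of a secret number in which the verifier's random challenge is replaced by a hash (the Fiat-Shamir technique). The prover produces a single proof that anyone can check later with no back-and-forth. This is not a SNARK: it is not succinct for general computations, the toy parameters are insecure, and the function names are assumptions for illustration.

```python
import hashlib
import secrets

# Toy parameters (insecure, for illustration): G has order Q modulo P.
P, Q, G = 23, 11, 2


def challenge(y: int, t: int, message: bytes) -> int:
    """Derive the challenge from a hash instead of asking the verifier."""
    data = f"{G}|{y}|{t}|".encode() + message
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


def prove(x: int, message: bytes) -> tuple[int, int]:
    """Produce a one-shot proof (t, s) of knowledge of x, bound to message."""
    y = pow(G, x, P)
    r = secrets.randbelow(Q)
    t = pow(G, r, P)
    c = challenge(y, t, message)
    return t, (r + c * x) % Q


def verify(y: int, message: bytes, proof: tuple[int, int]) -> bool:
    """Anyone can check the proof offline using only public values."""
    t, s = proof
    c = challenge(y, t, message)
    return pow(G, s, P) == (t * pow(y, c, P)) % P


secret_x = 7                       # known only to the prover
public_y = pow(G, secret_x, P)     # published
proof = prove(secret_x, b"query-result-123")  # hypothetical message label
```

Because the proof is a single self-contained object, it can be stored or posted somewhere public and checked by any number of verifiers afterwards, which is the property that makes non-interactive proofs a natural fit for blockchains.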

Querying and storing data on-chain

In addition to challenges with privacy, blockchains alone aren’t a practical solution for data storage. Storing enterprise-scale datasets on a blockchain is completely untenable, and storing even a moderate amount of data on-chain can be extremely expensive. Blockchains also don’t have a query language—meaning there’s no built-in way to ask questions about the data stored on-chain.
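A rough back-of-the-envelope calculation shows why. The sketch below assumes an Ethereum-style chain, where writing a new 32-byte storage slot costs on the order of 20,000 gas. The gas price is a hypothetical input, and the function names are illustrative.

```python
SSTORE_NEW_SLOT_GAS = 20_000  # approximate gas to write one new storage slot
SLOT_BYTES = 32               # size of one storage slot in bytes


def storage_gas(num_bytes: int) -> int:
    """Gas needed to write num_bytes into fresh storage slots."""
    slots = -(-num_bytes // SLOT_BYTES)  # ceiling division
    return slots * SSTORE_NEW_SLOT_GAS


def storage_cost_eth(num_bytes: int, gas_price_gwei: float) -> float:
    """Cost in ETH at a given (assumed) gas price; 1 ETH = 1e9 gwei."""
    return storage_gas(num_bytes) * gas_price_gwei / 1e9


one_mib_gas = storage_gas(1024 * 1024)
one_mib_eth = storage_cost_eth(1024 * 1024, gas_price_gwei=30)  # hypothetical price
```

Under these assumptions, a single mebibyte takes 655,360,000 gas, or roughly 19.7 ETH at 30 gwei. An enterprise dataset measured in terabytes is many orders of magnitude beyond that.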

Decentralized storage solutions

Decentralized storage solutions like IPFS, Filecoin, and Arweave have sought to solve this problem by providing cheap data storage in a decentralized environment. But that’s really all these solutions provide. They’re not built for queries, certainly not for analytics, and they’re not easily integrated with existing enterprise systems.

Query language

Some Web3 protocols, like The Graph, are making blockchain data API-accessible and queryable. But these solutions are not designed to scale past around 1 terabyte of data, and they don’t incorporate off-chain data. As mentioned before, storing the volume of data required by an enterprise-scale consumer data management system on-chain is extremely expensive at best and impossible at worst. Most of the data would therefore have to be stored in an off-chain system, with the most important information connected back to the blockchain via an oracle network.

Zero-trust data platform

Above all, the truth is that, if you’re a major enterprise, you’re not going to upend your entire infrastructure to thread together blockchain technology, blockchain-compatible data encryption, decentralized data storage, a query language, cryptographic proofs, and an oracle network. You need a single platform that integrates with your existing systems and provides it all for you. At Space and Time, we’re building that solution.

Real-time blockchain indexing and oracle integration

Space and Time natively reads from and writes to the blockchain, allowing businesses to easily use blockchain data and publish query results back on-chain. Users can join this blockchain data with their own off-chain data in a single query, making it simple to integrate the blockchain with existing enterprise systems. Query results can be written back on-chain directly from Space and Time through our integration with Chainlink, so businesses can put only the most important aggregations on-chain and don’t have to pay for expensive blockchain storage. Unlike on almost all other data platforms, data storage in Space and Time is completely free, and data can be encrypted in-database for maximum security guarantees.
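As a rough illustration of what joining on-chain and off-chain data in a single query can look like, the sketch below uses Python’s built-in `sqlite3` module as a stand-in database. It does not use Space and Time’s API, and the table names, columns, and sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table of indexed blockchain transfers.
    CREATE TABLE onchain_transfers (wallet TEXT, amount REAL);
    -- Hypothetical table of the business's own off-chain customer data.
    CREATE TABLE offchain_customers (wallet TEXT, loyalty_tier TEXT);

    INSERT INTO onchain_transfers VALUES
        ('0xaaa', 10.0), ('0xaaa', 5.0), ('0xbbb', 2.5);
    INSERT INTO offchain_customers VALUES
        ('0xaaa', 'gold'), ('0xbbb', 'silver');
""")

# One query spans both sources: total on-chain volume per loyalty tier.
totals_by_tier = conn.execute("""
    SELECT c.loyalty_tier, SUM(t.amount) AS total
    FROM onchain_transfers AS t
    JOIN offchain_customers AS c ON c.wallet = t.wallet
    GROUP BY c.loyalty_tier
    ORDER BY c.loyalty_tier
""").fetchall()
```

Only a small aggregate like `totals_by_tier` would need to be published back on-chain, rather than the underlying rows.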

Decentralized HTAP data warehouse

Space and Time is the first decentralized data warehouse that supports both transactional queries (quick lookups to power applications) and analytics (complex queries used to generate business insights) in a single cluster, which means that businesses don’t have to spend on a separate database, data warehouse, and tools to move data between them. 
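The difference between the two workloads can be sketched with a small example. The code below again uses Python’s built-in `sqlite3` as a stand-in, not Space and Time itself, and the `cart_items` table and its contents are hypothetical. The point is only that both kinds of query run against the same store, with no copy step in between.

```python
import sqlite3


def build_store() -> sqlite3.Connection:
    """Create one in-memory store that serves both workloads."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE cart_items (user_id INTEGER, sku TEXT, category TEXT, qty INTEGER);
        INSERT INTO cart_items VALUES
            (1, 'A1', 'books', 2), (1, 'B7', 'toys', 1),
            (2, 'A1', 'books', 1), (2, 'C3', 'books', 3);
    """)
    return conn


def cart_for(conn: sqlite3.Connection, user_id: int) -> list[tuple[str, int]]:
    """Transactional query: a quick lookup for one user's cart."""
    return conn.execute(
        "SELECT sku, qty FROM cart_items WHERE user_id = ? ORDER BY sku",
        (user_id,),
    ).fetchall()


def units_by_category(conn: sqlite3.Connection) -> list[tuple[str, int]]:
    """Analytic query: an aggregate across all users."""
    return conn.execute(
        "SELECT category, SUM(qty) FROM cart_items GROUP BY category ORDER BY category"
    ).fetchall()


store = build_store()
```

In a split architecture, `cart_for` would typically hit an operational database and `units_by_category` a separate warehouse fed by a data pipeline.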

Proof of SQL

Proof of SQL is the novel zk-SNARK developed by Space and Time that cryptographically guarantees SQL operations. In other words, you can run a query and publish the result on-chain with the assurance that the computation was run accurately and the underlying data hasn’t been tampered with. Proof of SQL allows smart contracts to run complex computations without spending ridiculous amounts of gas, which makes it much more realistic for businesses to use them as a tool for automation.

Built-in API gateway

Space and Time has pre-built APIs for blockchain data, security, data streaming, SQL operations, and more, which means that building applications on Space and Time or connecting the platform to existing environments requires minimal time and setup from the end user.

Implications for consumer data

Imagine a retail giant like Amazon using Space and Time as its data management solution. As the general public is well aware, Amazon collects a massive amount of consumer data, including browsing history, purchase history, and demographic information about each of its 300 million+ active users. This data is paramount to Amazon’s ability to provide personalized shopping experiences and relevant product recommendations to customers. Many of the reasons we all find Amazon so convenient are made possible by its collection of our data. But do you trust that that data is being managed appropriately?

With Space and Time, Amazon could stream all of this collected data into the secure, decentralized platform via the built-in API gateway. Consumers have the peace of mind that their data is stored on a community-operated platform that’s not susceptible to breaches or centralized tampering from Amazon. Space and Time’s encryption solution means that all of this sensitive data, like someone’s address or credit card number, can still remain private and highly secure. 

Amazon can use Space and Time to facilitate the quick lookups that power their application in real time—such as checking a user's shopping cart contents or verifying their shipping address during the checkout process. It can also use Space and Time to run analytics against this data and generate personalized product recommendations based on the user's browsing history and preferences to enhance their shopping experience. All of this can be done in a single Space and Time cluster, without Amazon having to spend on a separate database, data warehouse, and ETL tools to move the data around.

Finally, Proof of SQL and the ability to write tamperproof data and analytics to the blockchain allow Amazon to demonstrate that the operations performed on customer data are accurate and untampered with, without revealing the private data itself. This helps assure consumers that their collected data is used only for its intended purpose, such as generating personalized recommendations or improving user experience. Using Space and Time as a data management solution could help Amazon reestablish trust with its consumers without sacrificing its very real need to collect and use their data.

A new era of privacy, transparency, and zero-trust

Web3 is creating a new age of trustless business/consumer relationships. Space and Time is laying the foundation for enterprises to adopt blockchain as a framework for data management by offering a comprehensive solution that combines the security and trustworthiness of blockchain technology with the efficiency and scalability needed for large-scale data operations.

Space and Time enables businesses to build robust data management systems that prioritize both consumer privacy and transparency. As more companies embrace the potential of Web3 and platforms like Space and Time, we can expect to witness a significant shift towards more responsible, ethical, and trust-based data management practices in the digital age, benefiting both businesses and consumers alike.

Catherine Daly is a senior marketing strategist with a passion for building community around emerging technology. Prior to Space and Time, Catherine managed full-funnel marketing for both startups and established global organizations in the semiconductor industry. She is accomplished in developing data-driven integrated communications strategies to accelerate growth for businesses across Web3 and the technology ecosystem. At Space and Time, Catherine oversees all growth, community, brand, product marketing, and content strategy.