Building Blocks: What is Decentralized AI?
Decentralized AI is a new paradigm based on the idea that AI is too powerful to be controlled by a small group of people.
By applying the mindset of self-sovereignty to AI development, decentralized AI seeks to redistribute power and mitigate concerns over privacy, equity, and accessibility in a post-LLM world.
These goals have generally been approached through some combination of cryptography, database systems, and P2P protocols that distribute the training and inference of large models. Below, we'll explore this new, rapidly evolving field.
Centralized AI vs Decentralized AI
Centralized AI refers to the status quo: an extreme concentration of capital, computing resources, and data, combined with a lack of oversight. Because of these intensive requirements, it is hard, if not impossible, to build cutting-edge models outside a select group of organizations. As a result, influence over this critical technology is in the hands of not just a group of companies but of a few leaders within the market.
Decentralized AI aims to address this by leveraging blockchains to distribute ownership and governance of models and to increase transparency and accessibility. This more user-centric vision of AI has the potential to unlock new GPU resources and, via techniques such as trusted execution environment (TEE) verifiability, new sources of data for custom models. It promises to improve user privacy, lower the barrier to entry for developers, make investing in AI more open, and address the society-level issue of who controls these systems.
The Decentralized AI Stack
As 'decentralized AI' is an umbrella term, it can mean many different things. However, there are some core components or layers. These include energy generation and distribution, physical infrastructure, compute and storage, and silicon chips. On top of this foundation, you've got the models, data, and application layer. Here's a visualization of a crypto-oriented version of this stack.
Whether you're referring to a DeFi application that uses machine intelligence to optimize yield or a DAO that governs a model democratically, some of the key technologies or categories that comprise this stack include: blockchains, smart contracts, federated learning, coprocessors, fully homomorphic encryption, zkML, TEEs, and MPC, to name a few. Each of these is used differently depending on the context.
Decentralized AI Use Cases
To dive deeper into how the tech is applied and what it enables, let's look at a few examples. First, you have a bunch of distributed compute networks (e.g. Akash) that aim to coordinate the supply of the large amounts of physical hardware, primarily GPUs, needed to run these models.
Others are looking to solve this problem by tokenizing centralized GPUs instead. In addition to generalized compute platforms, there is a subset of projects focused specifically on machine learning model training (e.g. Gensyn).
Numerous solutions are also being explored for verifying offchain data and computation (e.g. ezkl, Ora). Optimistic ML (opML), trusted execution environment ML (teeML), and zero-knowledge ML (zkML) are being employed so that applications can handle heavy compute requests offchain and then submit a verifiable output proving that the offchain workload ran as claimed.
In this way, we accomplish a few things:
- Add greater integrity to the process of model development.
- Work around the fact that blockchains are resource-constrained while AI requires a vast amount of resources.
- Enable uses like training and inference in a decentralized and verifiable fashion.
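To make the "compute offchain, then submit a verifiable output" idea concrete, here is a minimal sketch in the spirit of optimistic ML: a worker posts a claimed result plus a hash commitment, and a challenger re-executes the job to check it. Everything here is hypothetical and illustrative (the toy integer model, `WEIGHTS`, and the function names are assumptions, not the API of any project named above), and a real system would add dispute resolution, staking, and deterministic floating-point handling.

```python
import hashlib
import json

# Toy deterministic "model" (assumption): integer weights so that every
# node that re-runs the job gets a bit-identical result.
WEIGHTS = [3, -1, 2]


def run_model(weights, features):
    """Deterministic inference: an integer dot product."""
    return sum(w * x for w, x in zip(weights, features))


def commit(weights, features, output):
    """Hash that binds the model, the input, and the claimed output together."""
    payload = json.dumps(
        {"weights": weights, "input": features, "output": output},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode()).hexdigest()


def offchain_worker(features, cheat=False):
    """Run the heavy job offchain and return (claimed_output, commitment)."""
    output = run_model(WEIGHTS, features)
    if cheat:
        output += 1  # a malicious worker reports a wrong result
    return output, commit(WEIGHTS, features, output)


def challenge(features, claimed_output, commitment):
    """Challenger re-executes the job; True means the claim stands."""
    # The commitment must match what was claimed...
    if commit(WEIGHTS, features, claimed_output) != commitment:
        return False
    # ...and the claimed output must match an independent re-execution.
    return run_model(WEIGHTS, features) == claimed_output


if __name__ == "__main__":
    honest = offchain_worker([1, 2, 3])
    dishonest = offchain_worker([1, 2, 3], cheat=True)
    print("honest claim accepted:", challenge([1, 2, 3], *honest))
    print("dishonest claim accepted:", challenge([1, 2, 3], *dishonest))
```

Only the small commitment would need to live onchain in a design like this; the expensive re-execution happens offchain and only when someone disputes a result.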
A final commonly cited use case is AI agents (e.g. Fetch.AI). AI agents are autonomous bots capable of receiving instructions and executing tasks via an AI model. They can be paired with a wallet and thus given the freedom to transact with smart contracts. In this scenario, the agents' private keys can be stored confidentially onchain (e.g. on Sapphire) and used to sign messages as needed.
The Benefits of Decentralized AI
Applying blockchain to AI has been touted for everything from greater transparency, efficiency, and governance to democratizing access to AI development. While it's not a cure-all, a distributed approach to AI does present some interesting possibilities.
Primary benefits include security and privacy. For example, under the current centralized model, most AI-powered apps collect data, which is then stored on centralized servers and often shared with third parties. Data breaches are common, and users typically have little to no visibility into how their data is used.
On the other hand, a more decentralized setup promises fewer data breaches and more agency when it comes to personal data. It's possible to envision a world where related value flows are transparent, and users can permission their data for training and get compensated.
Following this thread, there can also be greater confidentiality guarantees through open-source models, local storage, alternative GPU infra, and end-to-end encryption. The diagram below is a good way to conceptualize the full scope of benefits.
Challenges of Decentralized AI
Like any emerging field, decentralized AI faces many unknowns and challenges. For instance, data quality is an issue, there are no agreed standards, and the best use cases are still being worked out. But perhaps the biggest challenge is cost and scalability.
Due to their complexity, training large models and running queries on them are very expensive. The higher the computational complexity of an algorithm, the more resources it takes to run. At present, distributed alternatives have neither the capital nor the infrastructure to mount a serious challenge to incumbents.
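To get a feel for the scale involved, here is a rough back-of-the-envelope sketch. It uses the widely cited approximation that training a dense transformer takes about 6 floating-point operations per parameter per training token. The model size, token count, sustained GPU throughput, and hourly price below are all illustrative assumptions, not figures from this article or quotes from any provider.

```python
def training_flops(n_params, n_tokens):
    """Approximate training compute: ~6 FLOPs per parameter per token."""
    return 6 * n_params * n_tokens


def gpu_hours(total_flops, sustained_flops_per_sec):
    """GPU-hours needed at a given sustained (not peak) throughput."""
    return total_flops / sustained_flops_per_sec / 3600


# Illustrative assumptions only:
N_PARAMS = 7e9            # a 7-billion-parameter model
N_TOKENS = 1e12           # trained on 1 trillion tokens
SUSTAINED_FLOPS = 1e14    # 100 TFLOP/s sustained per GPU
PRICE_PER_GPU_HOUR = 2.0  # hypothetical rental price in USD

flops = training_flops(N_PARAMS, N_TOKENS)
hours = gpu_hours(flops, SUSTAINED_FLOPS)
cost = hours * PRICE_PER_GPU_HOUR

if __name__ == "__main__":
    print(f"training compute: {flops:.2e} FLOPs")
    print(f"GPU-hours:        {hours:,.0f}")
    print(f"rental cost:      ${cost:,.0f}")
```

Under these assumptions, a single training run needs on the order of a hundred thousand GPU-hours. Compute grows with the product of model size and data, so costs rise quickly as models scale.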
Another obstacle is trust. Even if a user were to run a forward pass (inference) through a deep learning model in a P2P fashion, right now there's no way to be certain that what they receive genuinely originated from the model in question, or that the input they provided actually produced the output.
If we can't be sure about this, any malicious node could return improper results and corrupt the entire process. And this is where we get back to the core of decentralized AI's value prop: verifiability. For most of these use cases to take off, we need a cryptographically verifiable method of proving outputs and bringing offchain data into the onchain realm.
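One way to picture what such a proof has to bind together is a signed "receipt" that ties a specific model, a specific input, and a specific output to one another. The sketch below is a simplified illustration only. It uses an HMAC with a shared key as a stand-in for the asymmetric signature and remote attestation that a TEE-based system would rely on, and every name in it (`make_receipt`, `verify_receipt`, the receipt fields) is hypothetical.

```python
import hashlib
import hmac


def digest(data: bytes) -> str:
    """SHA-256 hex digest of raw bytes."""
    return hashlib.sha256(data).hexdigest()


def _message(model_hash: str, input_hash: str, output_hash: str) -> bytes:
    """Canonical byte string covering all three hashes."""
    return "|".join([model_hash, input_hash, output_hash]).encode()


def make_receipt(key: bytes, model: bytes, user_input: bytes, output: bytes) -> dict:
    """Node side: bind model, input, and output under one authentication tag."""
    m, i, o = digest(model), digest(user_input), digest(output)
    # Assumption: HMAC stands in for a hardware-backed attestation signature.
    tag = hmac.new(key, _message(m, i, o), hashlib.sha256).hexdigest()
    return {"model": m, "input": i, "output": o, "tag": tag}


def verify_receipt(key: bytes, receipt: dict, expected_model_hash: str,
                   user_input: bytes, output: bytes) -> bool:
    """User side: check the receipt covers this model, this input, this output."""
    if receipt["model"] != expected_model_hash:
        return False  # result came from a different model
    if receipt["input"] != digest(user_input):
        return False  # result was computed for a different input
    if receipt["output"] != digest(output):
        return False  # output was swapped after the receipt was made
    expected = hmac.new(
        key,
        _message(receipt["model"], receipt["input"], receipt["output"]),
        hashlib.sha256,
    ).hexdigest()
    return hmac.compare_digest(expected, receipt["tag"])


if __name__ == "__main__":
    key = b"demo-key"
    model, prompt, answer = b"model-weights-v1", b"prompt", b"answer"
    receipt = make_receipt(key, model, prompt, answer)
    print(verify_receipt(key, receipt, digest(model), prompt, answer))
    print(verify_receipt(key, receipt, digest(model), prompt, b"tampered"))
```

A receipt like this only shows that whoever holds the key vouched for the triple; it does not prove that the computation itself was performed correctly. Closing that gap is what approaches such as zkML, opML, and teeML aim to do.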
Decentralized AI and the Future
The maturing of blockchains allows us to extend trust-minimized computation into physical infrastructure, traditional data assets — and, critically, AI. This points toward a future where blockchain helps to verify data sources and properly train AI to ensure accuracy, completeness, and integrity.
For our part, we've created Runtime OFf-chain Logic, or ROFL, a framework that extends runtimes like Sapphire to offchain components. ROFL provides mechanisms similar to Sapphire's for verifiable computing, but with the ability to run heavy workloads offchain. This opens up a world of use cases and enables an entirely new type of composability across blockchains and offchain computation stacks. ROFL is currently in development. Learn more about it here.