It’s impossible to trust the applications we use every day without trusting the underlying institutions that ensure every implicit and explicit agreement is honored.
This trust centralizes power around those institutions, and renders the individual powerless. Blockchains like Ethereum decentralize the power into a well-defined set of rules that are executed by a network of computers in a manner that is open and fair to everyone.
Decentralization is the key to shifting the power from institutions to individuals. That’s why it’s so important. A blockchain running only on a central server is no better than the status quo.
Centralizing a Decentralized Network
To the untrained eye, Ethereum applications often appear decentralized. Smart contracts manifest as web-based interfaces which display data garnered from on-chain activity. User funds are custodied by private keys stored in local wallets. Transactions are quickly propagated to miners and included in blocks. However, although the Ethereum network itself is sufficiently decentralized, there are many central points where failure can compromise users.
For example, if a DApp’s server is infiltrated, it may be possible to populate the web interface with false information. This could cause users to make decisions they otherwise wouldn’t have. Or suppose a user’s wallet provider is compromised. An attacker could cut the wallet provider off from the network and stop propagating user transactions to miners, thereby censoring users’ ability to transact with the chain.
These are exactly the scenarios that Ethereum was created to avoid, and on November 11th, the ecosystem’s inability to avoid those scenarios became glaringly clear.
Falling Out of Sync
At 7:08 UTC on November 11th, block 11234873 was mined containing a transaction that some old versions of go-ethereum (Geth) rejected as invalid. Although the clients running these outdated versions were significantly outnumbered by up-to-date clients, one of the operators running an outdated version was the popular infrastructure provider Infura.
At 7:13 UTC, Infura’s monitoring system alerted the team that it was lagging behind the tip of the chain. Over the next five hours, they worked through the issue and deployed a hot fix to resume service. By all accounts, they handled the situation well and with haste. Unfortunately, the reality is that for approximately five hours, many Ethereum applications were tracking an invalid chain and most users were unable to send transactions.
Although a chain split is the worst error a blockchain can incur, it’s unavoidable if operators don’t stay up-to-date. Before this event, it was generally understood that to stay in sync, operators only needed to apply updates that coincided with scheduled hard forks. For this reason, Infura assumed that any other consensus-critical update would be advertised more prominently by the go-ethereum team.
Because they operate custom versions of go-ethereum that are specifically tuned to meet the demands of their high-performance infrastructure and billions of requests per day, it’s difficult for them to update their nodes every two weeks to stay in sync with the upstream repository. That’s why they were still running their custom versions of go-ethereum v1.9.9 and v1.9.13, even though the latest version was v1.9.23.
Why We Rely on Infura
It’s clear that if DApps simply didn’t rely on Infura, the impact from the chain split would’ve been reduced drastically. So why do most DApp interfaces not get their data from Ethereum nodes directly?
First of all, most users don’t run their own nodes. Running a node is costly and requires experience to set up correctly, especially in a production environment. Therefore, the burden is shouldered by DApps.
Infura has wrapped the functionality of Ethereum clients and created a straightforward API that allows any developer to write software that interfaces with the Ethereum network. The product they’ve built is excellent, and is often more economical than being self-sufficient. This causes DApps to routinely use Infura as their single source of truth.
Second, Ethereum clients are only good at answering simple queries such as “What is Alice’s balance?”. More complicated queries are prohibitively expensive to answer using only an Ethereum client. Even if users did have their own local node, they still wouldn’t be able to get the answer to queries like “How much did Alice spend between March and April?”.
The crux is that DApps often rely on these rich queries, and are therefore relegated to maintaining their own off-chain databases to support their application. These databases are usually built by listening to the chain on their own server, logging events as they come in, and then organizing and aggregating the events into a form that is amenable to the desired queries. Requiring every user to do this would create a lot of friction in acquiring new users, so the data is regularly provided to the world via a central API.
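To make the "log events, then aggregate" step concrete, here is a minimal sketch in Python. It is not how any particular DApp builds its database. The `TransferEvent` shape, the `SpendIndex` class, and the `ts` helper are hypothetical names invented for illustration, and the events are hard-coded rather than read from a node.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class TransferEvent:
    # Hypothetical, simplified event shape; real logs carry more fields.
    sender: str
    recipient: str
    amount: int      # smallest unit, e.g. wei
    timestamp: int   # block timestamp, seconds since the Unix epoch


class SpendIndex:
    """Aggregates outgoing transfers per sender and calendar month (UTC)."""

    def __init__(self) -> None:
        # (sender, year, month) -> total amount sent
        self._spent = defaultdict(int)

    def record(self, event: TransferEvent) -> None:
        # Called once per event as it is observed on chain.
        when = datetime.fromtimestamp(event.timestamp, tz=timezone.utc)
        self._spent[(event.sender, when.year, when.month)] += event.amount

    def spent_between(self, sender: str, year: int,
                      first_month: int, last_month: int) -> int:
        # Sums the pre-aggregated monthly totals, both months inclusive.
        return sum(
            self._spent.get((sender, year, month), 0)
            for month in range(first_month, last_month + 1)
        )


def ts(year: int, month: int, day: int) -> int:
    """Helper for the example: midnight UTC as a Unix timestamp."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())


index = SpendIndex()
for event in [
    TransferEvent("alice", "bob", 5, ts(2020, 2, 28)),
    TransferEvent("alice", "bob", 10, ts(2020, 3, 1)),
    TransferEvent("alice", "carol", 7, ts(2020, 4, 30)),
    TransferEvent("bob", "alice", 100, ts(2020, 3, 15)),
]:
    index.record(event)

# "How much did Alice spend between March and April?"
print(index.spent_between("alice", 2020, 3, 4))  # 17
```

The point of the sketch is that the answer comes from the index, not from the node. Whoever operates the index is therefore the party the user ends up trusting.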
Even if a DApp is proactive and runs its own node to populate its database, avoiding Infura altogether, its users connect to Infura by default via the MetaMask wallet. While it’s possible to connect to a personal Ethereum node through MetaMask, most users don’t even understand the problem with only connecting to a single source and consequently cannot adjust their behavior.
Although Infura is a prominent example of centralization in the Ethereum ecosystem, it’s far from the only one. Etherscan, oracles, DApp APIs, and even a single connection to the internet are all susceptible to coercion. Relying on any one of these services for information means completely trusting that operator.
It’s surprisingly difficult to be self-sovereign using a technology whose primary goal is self-sovereignty. So don’t trust, verify.
Blockchains are a decentralization primitive. Maintaining this property for all users is a tooling problem. Anyone with sufficient knowledge can already run their own full node and interact directly with the network today. But this isn’t the goal. The goal is to allow everyone to interact with applications without needing to trust any institution. To do that, the tooling must improve.
The status quo is to use a single source of truth when interacting with the chain. This inherently centralizes the flow of information and requires trust in that source. It is also in stark contrast to full nodes, which connect with 25-50 unique peers. Full nodes can maintain an accurate view of the network even if only one of their peers is honest.
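As a rough illustration of what not relying on a single source could look like, the sketch below asks several independent sources for the hash of the same block and only accepts an answer that enough of them agree on. This is a simple majority-style cross-check, which is a weaker guarantee than the validation a full node performs. The `cross_check` function and the stand-in sources are hypothetical; in practice each source would wrap a different RPC endpoint.

```python
from collections import Counter
from typing import Callable, Dict, List, Optional, Tuple

# A source is any callable mapping a block number to the block hash it reports.
BlockHashSource = Callable[[int], str]


def cross_check(
    sources: Dict[str, BlockHashSource],
    block_number: int,
    quorum: int,
) -> Tuple[Optional[str], List[str]]:
    """Return (agreed_hash, dissenters).

    agreed_hash is None when no hash reaches the quorum; in that case every
    source that answered is listed, since none of them can be trusted alone.
    Pick a quorum larger than half the number of sources to avoid ties.
    """
    reported: Dict[str, str] = {}
    for name, fetch in sources.items():
        try:
            reported[name] = fetch(block_number)
        except Exception:
            # An unreachable source simply does not vote.
            continue

    if not reported:
        return None, []

    winner, votes = Counter(reported.values()).most_common(1)[0]
    if votes < quorum:
        return None, sorted(reported)

    dissenters = sorted(n for n, h in reported.items() if h != winner)
    return winner, dissenters


# Stand-in sources: two agree, one is stuck on a different chain.
example_sources = {
    "provider_a": lambda n: "0xabc",
    "provider_b": lambda n: "0xabc",
    "provider_c": lambda n: "0xdef",
}
agreed, dissenters = cross_check(example_sources, 11234873, quorum=2)
print(agreed, dissenters)  # 0xabc ['provider_c']
```

A client built this way would have noticed that one of its sources disagreed with the others instead of silently following it.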
It’s critical that we continue developing tools that allow users to maintain their own view of the network, based on many different sources. There are many techniques that will make this easier, including light clients, block witnesses, state proofs, data retrieval networks and zero-knowledge proofs.
Each technique has different setup assumptions and security thresholds. Ultimately, a combination of them will be bundled together in software that end users use every day. They’ll “just work” the same way that the internet “just works”, allowing us to achieve our goal of decentralized applications for everyone.
In the meantime, it’s important to be aware of the points of failure that exist today.