Ethereum faces a critical challenge: the inherent tendency of blockchain protocols to grow in complexity and data size over time. This occurs in two primary areas:
- Historical Data: Every transaction and account created at any point in history must be permanently stored by all clients and downloaded by new ones to fully synchronize with the network. This leads to increasing client loads and synchronization times, even if the chain’s capacity remains unchanged.
- Protocol Features: Adding new functionalities is easier than removing outdated ones, resulting in escalating code complexity.
To ensure Ethereum’s long-term sustainability, we must counteract these trends by systematically reducing complexity and bloat while preserving blockchain’s core attribute: persistence. Whether it’s an NFT, a love letter embedded in transaction calldata, or a $1M smart contract, users should be able to retrieve their data after years of inactivity. Decentralized applications (DApps) relying on this permanence need assurance that foundational layers won’t undergo breaking changes.
The Purge: 2025 Roadmap
Balancing continuity with simplification is achievable. Biologically immortal organisms and long-lived social structures such as Japan’s millennium-old shrine demonstrate longevity through renewal. Ethereum has already made strides:
– Phasing out Proof-of-Work
– Neutralizing the SELFDESTRUCT opcode
– Beacon chain nodes now store only ~6 months of historical data
The ultimate challenge lies in forging a path toward long-term stability without compromising scalability, security, or technical sustainability.
Key Goals of The Purge
- Reduce client storage demands by eliminating the need for nodes to permanently store all historical data or even complete state.
- Lower protocol complexity through strategic removal of obsolete features.
Historical Data Expiry
The Problem
A fully synced Ethereum node currently requires ~1.1 TB of disk space for the execution client, plus a few hundred GB more for the consensus client. Over 80% of this is historical data: blocks, transactions, and receipts from years past. Even if the gas limit never increases, node sizes grow by hundreds of GB annually.
The Solution
Historical data relies on a 1-of-N trust model: a single honest holder of a given piece of data is enough, because consensus on the latest block implicitly validates all prior blocks via cryptographic links (hashes, EIP-4788). This allows decentralized storage approaches:
– Distributed Networks: Nodes store random subsets of data (e.g., 10% each). With 100,000 nodes, each piece of data is replicated about 10,000 times, the same replication factor as 10,000 full-archive nodes.
– Erasure Coding: Enhances robustness without increasing storage (already used for blob data availability sampling).
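The replication arithmetic above can be checked with a short sketch. It assumes every node picks its subset independently and uniformly at random, which is a simplification; the function names are illustrative and not taken from any client.

```python
def expected_replicas(num_nodes: int, fraction_stored: float) -> float:
    """Average number of nodes that hold any one piece of data."""
    return num_nodes * fraction_stored


def prob_item_unavailable(num_nodes: int, fraction_stored: float) -> float:
    """Chance that no node at all holds a given piece of data.

    Assumption: each node chooses its subset independently, so it
    misses a given item with probability (1 - fraction_stored).
    """
    return (1.0 - fraction_stored) ** num_nodes


if __name__ == "__main__":
    # The article's example: 100,000 nodes, each storing 10%.
    print(expected_replicas(100_000, 0.10))
    # A much smaller network, to show how quickly the risk falls off.
    print(prob_item_unavailable(50, 0.10))
```

Under these assumptions the loss probability shrinks exponentially with the number of nodes, which is why storing a random fraction per node can be comparable in robustness to a large set of full archives.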
Current Progress:
– Consensus blocks: ~6 months retention
– Blobs: ~18 days
– EIP-4444 proposes 1-year expiry for execution history
Future Vision: Unified expiry (~18 days) with peer-to-peer networks (e.g., Portal Network) handling older data.
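Data fetched from such a network does not have to be trusted, because of the hash links described earlier. The following is a minimal sketch of that check, not how any client implements it: `Block`, `block_hash`, and `verify_history` are hypothetical names, and SHA-256 over a simple byte layout stands in for Ethereum's real header hashing.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    number: int
    parent_hash: bytes
    payload: bytes

    def block_hash(self) -> bytes:
        # Stand-in hash: real headers use a different encoding and hash.
        h = hashlib.sha256()
        h.update(self.number.to_bytes(8, "big"))
        h.update(self.parent_hash)
        h.update(self.payload)
        return h.digest()


def verify_history(trusted_head_hash: bytes, blocks_newest_first: list) -> bool:
    """Walk backwards from a trusted head hash, checking every link."""
    expected = trusted_head_hash
    for block in blocks_newest_first:
        if block.block_hash() != expected:
            return False  # this block is not the one the chain commits to
        expected = block.parent_hash
    return True


if __name__ == "__main__":
    genesis = Block(0, b"\x00" * 32, b"genesis")
    b1 = Block(1, genesis.block_hash(), b"one")
    b2 = Block(2, b1.block_hash(), b"two")
    # Only the head hash comes from consensus; the blocks come from peers.
    print(verify_history(b2.block_hash(), [b2, b1, genesis]))
```

The only trusted input is the head hash. Any peer can supply the older blocks, and a single altered block breaks the chain of hashes.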
Key Trade-offs
Approach | Pros | Cons |
---|---|---|
Immediate expiry + centralized archives | Simple implementation | Weakens decentralization |
Torrent/Portal integration | Robust, decentralized | Requires protocol upgrades |
State Expiry
The Challenge
Even without historical data, state (account balances, contract storage) grows ~50 GB/year. Users pay a one-time fee to create state, yet nodes bear the cost of storing it indefinitely.
Proposed Solutions
- Partial State Expiry (e.g., EIP-7736):
  - Divides state into “stems” (≤7936B groupings).
  - Inactive stems (>6 months) convert to 32B stubs.
  - Requires Merkle proofs to revive expired data.
- Address-Cycle Based Expiry:
  - New state trees added annually.
  - Full nodes store only the latest two trees.
  - Accesses to expired states require proofs.
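The expire-then-revive flow of partial state expiry can be illustrated with a toy model. This is a sketch under stated assumptions, not the EIP-7736 mechanism: `StemStore` and its methods are hypothetical, the 32-byte stub is a plain SHA-256 commitment to the stem's contents, and the "proof" is simply the full contents rechecked against that stub rather than a real tree proof.

```python
import hashlib
from typing import Dict

STUB_SIZE = 32                      # bytes kept per expired stem
EXPIRY_SECONDS = 183 * 24 * 3600    # roughly 6 months (assumed threshold)


class StemStore:
    def __init__(self) -> None:
        self.live: Dict[bytes, Dict[int, bytes]] = {}   # stem id -> values
        self.last_access: Dict[bytes, int] = {}         # stem id -> timestamp
        self.stubs: Dict[bytes, bytes] = {}             # stem id -> 32B stub

    @staticmethod
    def commitment(values: Dict[int, bytes]) -> bytes:
        """Deterministic 32-byte digest of a stem's contents."""
        h = hashlib.sha256()
        for index in sorted(values):
            h.update(index.to_bytes(2, "big"))
            h.update(len(values[index]).to_bytes(2, "big"))
            h.update(values[index])
        return h.digest()

    def write(self, stem_id: bytes, index: int, value: bytes, now: int) -> None:
        if stem_id in self.stubs:
            raise KeyError("stem is expired; revive it first")
        self.live.setdefault(stem_id, {})[index] = value
        self.last_access[stem_id] = now

    def expire_inactive(self, now: int) -> int:
        """Replace stems idle for longer than the threshold with stubs."""
        idle = [s for s, t in self.last_access.items() if now - t > EXPIRY_SECONDS]
        for stem_id in idle:
            self.stubs[stem_id] = self.commitment(self.live.pop(stem_id))
            del self.last_access[stem_id]
        return len(idle)

    def revive(self, stem_id: bytes, claimed: Dict[int, bytes], now: int) -> bool:
        """Restore a stem only if the supplied contents match its stub."""
        stub = self.stubs.get(stem_id)
        if stub is None or self.commitment(claimed) != stub:
            return False
        del self.stubs[stem_id]
        self.live[stem_id] = dict(claimed)
        self.last_access[stem_id] = now
        return True
```

The point of the design is that a node keeps only a constant-size stub for inactive data, while the user (or an archive service) is responsible for supplying the contents again when the data is next needed.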
Address Space Expansion/Contraction
- Expansion: 32-byte addresses (version + cycle + hash). Backward compatibility via mapping.
- Contraction: Ban address subsets (e.g., 0xffffffff prefix), freeing space for cycle tags. Risks 256 hash collisions.
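To make the expanded address format concrete, here is a sketch of packing and unpacking a 32-byte address with version, cycle, and hash fields. The field widths (1 + 2 + 29 bytes), the use of SHA-256, and every name below are assumptions for illustration only; the text above does not specify them. The `needs_proof` helper also simplifies by treating an address's state as living only in the tree of its own cycle.

```python
import hashlib
from typing import NamedTuple

VERSION_BYTES = 1   # assumed width
CYCLE_BYTES = 2     # assumed width
HASH_BYTES = 29     # assumed width; the three fields total 32 bytes


class CycleAddress(NamedTuple):
    version: int
    cycle: int
    digest: bytes


def make_address(version: int, cycle: int, preimage: bytes) -> bytes:
    """Pack version, cycle, and a truncated hash into 32 bytes."""
    digest = hashlib.sha256(preimage).digest()[:HASH_BYTES]
    return (
        version.to_bytes(VERSION_BYTES, "big")
        + cycle.to_bytes(CYCLE_BYTES, "big")
        + digest
    )


def parse_address(address: bytes) -> CycleAddress:
    if len(address) != VERSION_BYTES + CYCLE_BYTES + HASH_BYTES:
        raise ValueError("expected a 32-byte address")
    version = address[0]
    cycle = int.from_bytes(address[1:1 + CYCLE_BYTES], "big")
    return CycleAddress(version, cycle, address[1 + CYCLE_BYTES:])


def needs_proof(address: bytes, current_cycle: int, trees_kept: int = 2) -> bool:
    """True if the address's cycle tree is older than the trees full nodes keep."""
    return parse_address(address).cycle <= current_cycle - trees_kept


if __name__ == "__main__":
    addr = make_address(version=1, cycle=7, preimage=b"example")
    print(parse_address(addr).cycle, needs_proof(addr, current_cycle=9))
```

Embedding the cycle in the address is what lets a node decide, from the address alone, which state tree to consult and whether a proof is required.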
Trade-offs
Option | Permanent Growth | Complexity | User Experience |
---|---|---|---|
Statelessness | ~8 TB in decades | Low | Unchanged |
Partial Expiry | Minimal | Moderate | Slight complexity |
Full Expiry | Zero | High | Address changes |
Feature Cleanup
Rationale
Simplicity fosters security, accessibility, and credible neutrality. Without deliberate pruning, protocols accumulate complexity.
Implemented and In-Progress Examples
- SELFDESTRUCT Opcode: Neutralized in Dencun (EIP-6780).
- RLP → SSZ Migration: In progress; aims for better serialization and hashing (EIP-6493).
Future Targets
Category | Changes | Benefit |
---|---|---|
EVM | Remove dynamic jumps, gas observability | Easier static analysis |
Precompiles | Remove unused (e.g., RIPEMD160) | Reduced consensus risk |
Logging | Replace Bloom filters with SNARKs | Lower complexity |
Radical Approach: Protocol-to-Contract Migration
Moving core functions (e.g., EVM) into contract code could maximize flexibility. Options:
– Minimal L1: Beacon chain + lightweight VM (RISC-V/Cairo).
– EVM Evolution: Transition to EOF-based strict EVM.
FAQ
1. Will historical data expiry break blockchain explorers?
No. Explorers can specialize as archive nodes or source data from distributed networks like Portal.
2. How does state expiry affect dormant accounts?
Users can “reactivate” expired state via proofs. Address-cycle designs ensure accessibility.
3. When will EIP-4444 activate?
Targeting 2025, pending client readiness for distributed storage solutions.
4. Are there risks to removing precompiles?
Yes. Some legacy contracts may break, although analysis shows minimal impact (e.g., identity precompile used in <0.1% of transactions).
5. Could EOF replace the current EVM?
Potentially, but requires multi-year migration to avoid breaking existing contracts.
6. How does The Purge impact L2s?
L2s benefit from reduced base-layer complexity but must adapt to state/address changes.
Conclusion
The Purge represents Ethereum’s commitment to sustainable decentralization. By methodically reducing historical data burdens, state growth, and feature creep, we pave the way for a blockchain that remains accessible, secure, and scalable for decades.