Possible futures of the Ethereum protocol, part 3: The Scourge
2024 Oct 20
Special thanks to Justin Drake, Caspar Schwarz-Schilling, Phil
Daian, Dan Robinson, Charlie Noyes and Max Resnick for feedback and
review, and the ethstakers community for discussion.
One of the biggest risks to the Ethereum L1 is proof-of-stake
centralizing due to economic pressures. If there are economies-of-scale
in participating in core proof of stake mechanisms, this would naturally
lead to large stakers dominating, and small stakers dropping out to join
large pools. This leads to higher risk of 51% attacks, transaction
censorship, and other crises. In addition to the centralization risk,
there are also risks of value extraction: a small group
capturing value that would otherwise go to Ethereum's users.
Over the last year, our understanding of these risks has increased
greatly. It's well understood that there are two key places where this
risk exists: (i) block construction, and (ii)
staking capital provision. Larger actors can afford to
run more sophisticated algorithms ("MEV extraction") to generate blocks,
giving them a higher revenue per block. Very large actors can also more
effectively deal with the inconvenience of having their capital locked
up, by releasing it to others as a liquid staking token (LST). In
addition to the direct questions of small vs large stakers, there is
also the question of whether or not there is (or will be) too
much staked ETH.
The Scourge, 2023 roadmap
This year, there have been significant advancements on block
construction, most notably convergence on "committee inclusion lists
plus some targeted solution for ordering" as the ideal solution, as well
as significant research on proof of stake economics, including ideas
such as two-tiered staking models and reducing issuance to cap the
percent of ETH staked.
The Scourge: key goals
- Minimize centralization risks at Ethereum's staking layer (notably,
in block construction and capital provision, aka. MEV and staking
pools)
- Minimize risks of excessive value extraction from users
In this chapter
Fixing the block construction pipeline
What problem are we solving?
Today, Ethereum block construction is largely done through
extra-protocol proposer-builder separation with MEVBoost. When a validator gets
an opportunity to propose a block, they auction off the job of choosing
block contents to specialized actors called builders. The task of
choosing block contents that maximize revenue is very economies-of-scale
intensive: specialized algorithms are needed to determine which
transactions to include, in order to extract as much value as possible
from on-chain financial gadgets and users' transactions interacting with
them (this is what is called "MEV extraction"). Validators are left with
the relatively economies-of-scale-light "dumb pipe" task of listening
for bids and accepting the highest bid, as well as other
responsibilities like attesting.
Stylized diagram of what MEVBoost is doing: specialized
builders take on the tasks in the red, and stakers take on the tasks in
blue.
There are various versions of this, including "proposer-builder
separation" (PBS) and "attester-proposer separation" (APS). The
difference between these has to do with fine-grained details around
which responsibilities go to which of the two actors: roughly, in PBS,
validators still propose blocks, but receive the payload from builders,
and in APS, the entire slot becomes the builder's responsibility.
Recently, APS has come to be preferred over PBS, because it further reduces
incentives for proposers to co-locate with builders. Note that APS would
only apply to execution blocks, which contain transactions;
consensus blocks, which contain proof-of-stake-related data
such as attestations, would still be randomly assigned to
validators.
This separation of powers helps keep validators decentralized, but it
has one important cost: the actors that are doing the "specialized"
tasks can easily become very centralized. Here's Ethereum block
building today:
Two actors are choosing the contents of roughly 88% of Ethereum
blocks. What if those two actors decide to censor a transaction? The
answer is not quite as bad as it might seem: they are not able to reorg
blocks, and so you don't need 51% censoring to prevent a transaction
from getting included at all: you need 100%. With 88% censoring, a user
would need to wait an average of 9 slots to get included (technically,
an average of 114 seconds, instead of 6 seconds). For some use cases,
waiting for two or even five minutes for certain transactions is fine.
But for other use cases, eg. defi liquidations, even the ability to
delay inclusion of someone else's transaction by a few blocks is a
significant market manipulation risk.
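The delay figure above can be checked with a quick back-of-the-envelope model. The assumption here (mine, slightly simpler than the post's ~9-slot figure) is that each slot's block producer censors independently with probability p, so the wait until the first non-censoring producer is geometrically distributed:

```python
# Back-of-the-envelope check of the censorship-delay claim, assuming each
# slot's producer censors independently with probability p_censor.

def expected_inclusion_delay(p_censor: float, slot_seconds: float = 12.0) -> tuple[float, float]:
    """Expected slots (and seconds) until a censored transaction reaches
    a non-censoring block producer."""
    expected_slots = 1.0 / (1.0 - p_censor)  # mean of a geometric distribution
    return expected_slots, expected_slots * slot_seconds

slots, seconds = expected_inclusion_delay(0.88)
print(f"{slots:.2f} slots, ~{seconds:.0f} seconds")  # ~8.33 slots, ~100 seconds
```

The point survives the simplification: the delay grows hyperbolically as the censoring fraction approaches 100%, but stays bounded below it.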
The strategies that block builders can employ to maximize revenue can
also have other negative consequences for users. A "sandwich
attack" could cause users making token swaps to suffer significant
losses from slippage. The extra transactions introduced to carry out
these attacks also clog the chain, increasing gas prices for other users.
What is it, and how does it work?
The leading solution is to break down the block production task
further: we give the task of choosing transactions back to the proposer
(ie. a staker), and the builder can only choose the ordering and insert
some transactions of their own. This is what inclusion
lists seek to do.
At time T, a randomly selected staker creates an inclusion list, a
list of transactions that are valid given the current state of the
blockchain at that time. At time T+1, a block builder, perhaps chosen
through an in-protocol auction mechanism ahead of time,
creates a block. This block is required to include every transaction in
the inclusion list, but they can choose the order, and they can add in
their own transactions.
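The core validity rule is simple enough to state in a few lines. Here is a minimal sketch (my own illustration, not a client implementation): the builder's block satisfies the inclusion list as long as every listed transaction appears somewhere in it, with reordering and builder-inserted transactions left unconstrained:

```python
# Core inclusion-list rule: every inclusion-list transaction must appear
# in the block; order and extra builder transactions are unconstrained.

def block_satisfies_inclusion_list(block_txs: list[str], inclusion_list: list[str]) -> bool:
    """True iff every inclusion-list transaction made it into the block."""
    return set(inclusion_list).issubset(block_txs)

inclusion_list = ["tx_a", "tx_b"]
# Reordering and adding a builder transaction is fine:
print(block_satisfies_inclusion_list(["tx_b", "builder_tx", "tx_a"], inclusion_list))  # True
# Dropping tx_b is censorship, and the block is invalid:
print(block_satisfies_inclusion_list(["tx_a", "builder_tx"], inclusion_list))          # False
```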
Fork-choice-enforced inclusion lists (FOCIL)
proposals involve a committee of multiple inclusion list creators per
block. To delay a transaction by one block, k of k inclusion list
creators (eg. k = 16) would have to censor the transaction. The
combination of FOCIL with a final proposer chosen by auction that is
required to include the inclusion lists, but can reorder and add new
transactions, is often called "FOCIL + APS".
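Some rough numbers for the committee argument (the independence assumption is mine): if a fraction f of potential inclusion list creators censor, and a committee of k is drawn at random, a transaction is delayed a block only when all k members censor, with probability f**k:

```python
# Probability that a randomly drawn k-member inclusion-list committee is
# entirely censoring, given a censoring fraction f among creators.

def delay_probability(f_censoring: float, k: int) -> float:
    return f_censoring ** k

# Even at today's ~88% builder-level censorship rate, a 16-member
# committee is fully censoring only ~13% of the time, and an n-block
# delay decays as (f**k)**n.
print(f"{delay_probability(0.88, 16):.3f}")
```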
A different approach to the problem is multiple concurrent
proposers (MCP) schemes such as BRAID. BRAID
seeks to avoid splitting up the block proposer role into a
low-economies-of-scale part and a high-economies-of-scale part, and
instead tries to distribute the block production process among many
actors, in such a way that each proposer only needs to have a medium
amount of sophistication to maximize their revenue. MCP works by having
k parallel proposers generate lists of transactions, and
then using a deterministic algorithm (eg. order by highest-to-lowest
fee) to choose the order.
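A toy sketch of that ordering rule (assumptions mine: transactions are (id, priority fee) pairs, duplicates across proposers are deduplicated, and ties are broken by id so the result is fully deterministic):

```python
# MCP-style deterministic ordering: union the k proposers' lists, then
# sort by highest-to-lowest priority fee, breaking ties by transaction id.

def merge_proposals(proposals: list[list[tuple[str, int]]]) -> list[tuple[str, int]]:
    seen: dict[str, int] = {}
    for txs in proposals:
        for tx_id, fee in txs:
            seen.setdefault(tx_id, fee)  # deduplicate across proposers
    return sorted(seen.items(), key=lambda t: (-t[1], t[0]))

proposals = [
    [("tx1", 5), ("tx2", 9)],  # proposer 1
    [("tx2", 9), ("tx3", 1)],  # proposer 2 (tx2 seen by both)
]
print(merge_proposals(proposals))  # [('tx2', 9), ('tx1', 5), ('tx3', 1)]
```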
BRAID does not seek to attain the goal of dumb-pipe block proposers
running default software being optimal. Two easy-to-understand reasons
why it cannot do so are:
- Last-mover arbitrage attacks: suppose that the
average time that proposers submit is T, and the last possible
time you can submit and still get included is around T+1. Now, suppose
that on centralized exchanges, the ETH/USDC price moves from $2500 to
$2502 between T and T+1. A proposer can wait an extra second and add an
additional transaction to arbitrage on-chain decentralized exchanges,
claiming up to $2 per ETH in profit. Sophisticated proposers who are
very well-connected to the network have more ability to do this.
- Exclusive order flow: users have the incentive to
send transactions directly to one single proposer, to minimize their
vulnerability to front-running and other attacks. Sophisticated
proposers have an advantage because they can set up infrastructure to
accept these direct-from-user transactions, and they have stronger
reputations so users who send them transactions can trust that the
proposer will not betray and front-run them (this can be mitigated with
trusted hardware, but then trusted hardware has trust assumptions of its
own).
In BRAID, attesters can still be separated off and run as a dumb-pipe
functionality.
In addition to these two extremes, there is a spectrum of
possible designs in between. For example, you could auction off
a role that only has the right to append to a block, and not to
reorder or prepend. You could even let them append or prepend, but not
insert in the middle or reorder. The attraction of these techniques is
that the winners of the auction market are likely to be very concentrated,
and so there is a lot of benefit to reducing their authority.
Encrypted mempools
One technology that is crucial to the successful implementation of
many of these designs (specifically, either BRAID or a version of APS
where there are strict limits on the capability being auctioned off) is
encrypted mempools. Encrypted mempools are a technology
where users broadcast their transactions in encrypted form, along with
some kind of proof of their validity, and the transactions are included
into blocks in encrypted form, without the block builder knowing the
contents. The contents of the transactions are revealed later.
The main challenge in implementing encrypted mempools is coming up
with a design that ensures that transactions do all get revealed later:
a simple "commit and reveal" scheme does not work, because if revealing
is voluntary, the act of choosing to reveal or not reveal is itself a
kind of "last-mover" influence on a block that could be exploited. The
two leading techniques for this are (i) threshold
decryption, and (ii) delay encryption, a primitive closely related
to verifiable delay functions
(VDFs).
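To make the threshold idea concrete, here is a toy illustration using plain Shamir secret sharing over a prime field (my own sketch: real threshold-decryption designs never reassemble the key in one place, instead combining partial decryptions, but the t-of-n property is the same). A decryption key split this way cannot be withheld or exploited by any single committee member:

```python
# Toy t-of-n secret sharing: split a key so any t of n committee members
# can reconstruct it, but fewer than t learn nothing.
import random

P = 2**61 - 1  # a Mersenne prime, used as the field modulus

def split_secret(secret: int, t: int, n: int) -> list[tuple[int, int]]:
    """Split `secret` into n shares, any t of which reconstruct it."""
    coeffs = [secret] + [random.randrange(P) for _ in range(t - 1)]
    def f(x: int) -> int:  # evaluate the random degree-(t-1) polynomial
        return sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P
    return [(x, f(x)) for x in range(1, n + 1)]

def reconstruct(shares: list[tuple[int, int]]) -> int:
    """Lagrange interpolation at x = 0 recovers the secret."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        secret = (secret + yi * num * pow(den, P - 2, P)) % P
    return secret

shares = split_secret(123456789, t=3, n=5)
print(reconstruct(shares[:3]) == 123456789)  # any 3 of 5 shares suffice
```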
What are some links to existing research?
What is left to do, and what are the tradeoffs?
We can think of all of the above schemes as being different ways of
dividing up the authority involved in staking, arranged on a spectrum
from lower economies of scale ("dumb-pipe") to higher economies of scale
("specialization-friendly"). Pre-2021, all of these authorities were
bundled together in one actor:
The core conundrum is this: any meaningful authority that
remains in the hands of stakers, is authority that could end up being
"MEV-relevant". We want a highly decentralized set of actors to
have as much authority as possible; this implies (i) putting a lot of
authority in the hands of stakers, and (ii) making sure stakers are as
decentralized as possible, meaning that they have few
economies-of-scale-driven incentives to consolidate. This is a difficult
tension to navigate.
One particular challenge is multi-block MEV: in some cases,
execution auction winners can make even more money if they capture
multiple slots in a row, and do not allow any MEV-relevant
transactions in blocks other than the last one that they control. If
inclusion lists force them to include such transactions, they can try to
bypass that by not publishing any block at all during those slots. One could make
unconditional inclusion lists, which directly become the block
if the builder does not provide one, but this makes the
inclusion list MEV-relevant. The solution here may involve some
compromise that involves accepting some low degree of incentive to bribe
people to include transactions in an inclusion list, and hoping that
it's not high enough to lead to mass outsourcing.
We can view FOCIL + APS as follows. Stakers continue to have the
authority on the left part of the spectrum, while the right part of the
spectrum gets auctioned off to the highest bidder.
BRAID is quite different. The "staker" piece is larger, but it gets
split into two pieces: light stakers and heavy stakers. Meanwhile,
because transactions are ordered in decreasing order of priority fee,
the top-of-block choice gets de-facto auctioned off via the fee market,
in a scheme that can be viewed as analogous to enshrined
PBS.
Note that the safety of BRAID depends heavily on encrypted mempools;
otherwise, the top-of-block auction mechanism becomes vulnerable to
strategy-stealing attacks (essentially: copying other people's
transactions, swapping the recipient address, and paying a 0.01% higher
fee). This need for pre-inclusion privacy is also the reason why
enshrined PBS is so tricky to implement.
Finally, more "aggressive" versions of FOCIL + APS, eg. the option
where APS only determines the end of the block, look like this:
The main remaining task is to (i) work on solidifying the various
proposals and analyzing their consequences, and (ii) combine this
analysis with an understanding of the Ethereum community's goals in
terms of what forms of centralization it will tolerate. There is also
work to be done on each individual proposal, such as:
- Continuing work on encrypted mempool designs, and
getting to the point where we have a design that is both robust and
reasonably simple, and plausibly ready for inclusion.
- Optimizing the design of multiple inclusion lists
to make sure that (i) it does not waste data, particularly in the
context of inclusion lists covering blobs, and (ii) it
is friendly to stateless validators.
- More work on the optimal auction design for
APS.
Additionally, it's worth noting that these different proposals are
not necessarily incompatible forks in the road. For
example, implementing FOCIL + APS could easily serve as a stepping stone
to implementing BRAID. A valid conservative strategy would be a
"wait-and-see" approach where we first implement a solution where
stakers' authority is limited and most of the authority is auctioned
off, and then slowly increase stakers' authority over time as we learn
more about the MEV market operation on the live network.
How does it interact with other parts of the roadmap?
There are positive interactions between solving one staking
centralization bottleneck and solving the others. To give an analogy,
imagine a world where starting your own company required growing your
own food, making your own computers and having your own army. In this
world, only a few companies could exist. Solving one of the three
problems would help the situation, but only a little. Solving two
problems would help more than twice as much as solving one. And
solving three would be far more than three times as helpful - if you're
a solo entrepreneur, either 3/3 problems are solved or you stand no
chance.
In particular, the centralization bottlenecks for staking are:
- Block construction centralization (this section)
- Staking centralization for economic reasons (next section)
- Staking centralization because of the 32 ETH minimum (solved with
Orbit or other techniques; see the post on the Merge)
- Staking centralization because of hardware requirements (solved in
the Verge, with stateless clients and later ZK-EVMs)
Solving any one of the four increases the gains from solving any of
the others.
Additionally, there are interactions between the block construction
pipeline and the single slot finality design, particularly in the
context of trying to reduce slot times. Many block construction
pipeline designs end up increasing slot times, and many involve roles
for attesters at multiple steps in the process. For this reason, it can
be worth thinking about the block construction pipelines and single slot
finality simultaneously.
Fixing staking economics
What problem are we solving?
Today, about 30% of the ETH supply is
actively staking. This is far more than enough to protect Ethereum
from 51% attacks. If the percent of ETH staked grows much larger,
researchers fear a different scenario: the risks that would arise if
almost all ETH becomes staked. These risks include:
- Staking turns from being a profitable task for specialists into a
duty for all ETH holders. Hence, the average staker would be much more
unenthusiastic, and would choose the easiest approach (realistically,
delegating their tokens to whichever centralized operator offers the
most convenience)
- Credibility of the slashing mechanism weakens if almost all ETH is
staked
- A single liquid staking token could take over the bulk of the stake
and even take over "money" network effects from ETH itself
- Ethereum needlessly issuing an extra ~1m ETH/year. In the case where
one liquid staking token gets dominant network effect, a large portion
of this value could potentially even get captured by the LST.
What is it, and how does it work?
Historically, one class of solution has been: if everyone staking is
inevitable, and a liquid staking token is inevitable, then let's make
staking friendly to having a liquid staking token that is actually
trustless, neutral and maximally decentralized. One simple way to do
this is to cap staking penalties at eg. 1/8, which would make 7/8 of
staked ETH unslashable, and thus eligible to be put into the same liquid
staking token. Another option is to explicitly create two
tiers of staking: "risk-bearing" (slashable) staking, which would
somehow be capped to eg. 1/8 of all ETH, and "risk-free" (unslashable)
staking, which everyone could participate in.
However, one criticism of this approach is that it seems
economically equivalent to something much simpler: massively reduce
issuance if the stake approaches some pre-determined
cap. The basic argument is: if we end up in a world
where the risk-bearing tier has 3.4% returns and the risk-free tier
(which everyone participates in) has 2.6% returns, that's actually the
same thing as a world where staking ETH has 0.8% returns and just
holding ETH has 0% returns. The dynamics of the risk-bearing tier,
including both total quantity staked and centralization, would be the
same in both cases. And so we should just do the simple thing and reduce
issuance.
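The equivalence argument, in numbers (the rates are the post's illustrative figures, expressed in basis points; nothing here is a concrete proposal). What drives who joins the risk-bearing tier is the spread over the option everyone has by default, and that spread is identical in the two worlds:

```python
# Spread of the risk-bearing option over the universally available one,
# in basis points; this spread is what shapes the risk-bearing tier.

def risk_premium_bps(risky_apr_bps: int, default_apr_bps: int) -> int:
    return risky_apr_bps - default_apr_bps

two_tier = risk_premium_bps(340, 260)       # 3.4% slashable vs 2.6% risk-free
reduced_issuance = risk_premium_bps(80, 0)  # 0.8% staking vs 0% plain holding
print(two_tier, reduced_issuance, two_tier == reduced_issuance)  # 80 80 True
```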
The main counterargument to this line of argument would be if we can
make the "risk-free tier" still have some useful role and
some level of risk (eg. as proposed by
Dankrad here).
Both of these lines of proposals imply changing the issuance curve,
in a way that makes returns prohibitively low if the amount of stake
gets too high.
Left: one proposal for an adjusted issuance curve, by
Justin Drake. Right: another set of proposals, by Anders Elowsson.
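As a stylized comparison (my own sketch, not the exact curves in the figure): under today's design, total issuance grows like the square root of total stake, so per-ETH yield falls like 1/sqrt(stake) but never reaches zero; a commonly cited approximation is APR ≈ 166 / sqrt(total ETH staked). A stake-capping curve additionally drives the yield to zero, and below, as stake approaches a chosen cap. The scaling factor and the 40M ETH cap below are made up for illustration:

```python
import math

def current_apr(staked_eth: float) -> float:
    # Commonly cited approximation of today's issuance curve.
    return 166.0 / math.sqrt(staked_eth)

def capped_apr(staked_eth: float, cap_eth: float = 40e6) -> float:
    # Illustrative only: scale today's curve by a factor that crosses
    # zero at the cap and goes negative beyond it.
    return current_apr(staked_eth) * (1.0 - staked_eth / cap_eth)

for s in (10e6, 30e6, 40e6, 50e6):
    print(f"{s/1e6:.0f}M staked: current {current_apr(s):.2%}, capped {capped_apr(s):+.2%}")
```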
Two-tier staking, on the other hand, requires setting two
return curves: (i) the return rate for "basic" (risk-free or low-risk)
staking, and (ii) the premium for risk-bearing staking. There are
different ways to set these parameters: for example, if you set a hard
parameter that 1/8 of stake is slashable, then market dynamics will
determine the premium on the return rate that slashable stake gets.
Another important topic here is MEV capture. Today,
revenue from MEV (eg. DEX arbitrage, sandwiching...) goes to proposers,
ie. stakers. This is revenue that is completely "opaque" to the
protocol: the protocol has no way of knowing if it's 0.01% APR, 1% APR
or 20% APR. The existence of this revenue stream is highly inconvenient
from multiple angles:
- It is a volatile revenue source, as each individual
staker only gets it when they propose a block, which is once every ~4
months today. This creates an incentive to join pools for more stable
income.
- It leads to an unbalanced allocation of incentives:
too much for proposing, too little for attesting.
- It makes stake capping very difficult to implement:
even if the "official" return rate is zero, the MEV revenue alone may be
enough to drive all ETH holders to stake. As a result, a realistic stake
capping proposal would in fact have to have returns approach
negative infinity, as eg. proposed
here. This, needless to say, creates more risk for stakers,
especially solo stakers.
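Some rough numbers behind the volatility point in the first bullet (assumptions mine: roughly 1M active validators, 12-second slots, and a fixed MEV amount per block; real MEV is heavy-tailed, which makes the lottery effect even worse):

```python
import math

SLOTS_PER_YEAR = 365.25 * 24 * 3600 / 12  # ~2.63M slots
N_VALIDATORS = 1_000_000                  # assumed, roughly today's scale

p = 1 / N_VALIDATORS                      # per-slot proposal probability
expected_blocks = SLOTS_PER_YEAR * p      # ~2.6 proposals per year
print(f"one proposal every ~{12 / expected_blocks:.1f} months")

# Proposals per year are ~Binomial(SLOTS_PER_YEAR, p), so the relative
# standard deviation of a validator's MEV income is ~1/sqrt(mean); a pool
# of 1000 validators cuts it by a further factor of sqrt(1000).
rel_std_solo = math.sqrt((1 - p) / expected_blocks)
rel_std_pool = rel_std_solo / math.sqrt(1000)
print(f"relative std: solo ~{rel_std_solo:.0%}, 1000-validator pool ~{rel_std_pool:.1%}")
```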
We can solve these problems by finding a way to make MEV revenue
legible to the protocol, and capturing it. The earliest proposal was Francesco's
MEV smoothing; today, it's widely understood that any mechanism for
auctioning off block proposer rights (or, more generally, sufficient
authority to capture almost all MEV) ahead of time accomplishes the same
goal.
What are some links to existing research?
What is left to do, and what are the tradeoffs?
The main remaining task is to either agree to do nothing, and accept
the risks of almost all ETH being inside LSTs, or finalize and agree on
the details and parameters of one of the above proposals. An approximate
summary of the benefits and risks is:
| Approach | What needs to be worked out | Risks |
| --- | --- | --- |
| Do nothing | MEV burn implementation, if any | Almost 100% of ETH staked, likely in LSTs (perhaps a single dominant one); macroeconomic risks |
| Stake capping (via changing issuance curve) | Reward function and parameters (esp. what the cap is); MEV burn implementation | Open question of which stakers enter and leave; possibility that remaining staker set is centralized |
| Two-tiered staking | The role of the risk-free tier; parameters (eg. the economics that determine the amount staked in the risk-bearing tier); MEV burn implementation | Open question of which stakers enter and leave; possibility that risk-bearing set is centralized |
How does it interact with other parts of the roadmap?
One important point of intersection has to do with solo
staking. Today, the cheapest VPSes that can run an Ethereum
node cost about $60 per month, primarily due to hard disk storage costs.
For a 32 ETH staker ($84,000 at the time of this writing), this
decreases APY by (60 * 12) / 84000 ~= 0.85%. If total
staking returns drop below 0.85%, solo staking will be unviable for many
people at these levels.
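The cost-drag arithmetic, spelled out (figures are the post's: a $60/month VPS, and 32 ETH worth about $84,000 at the time of writing):

```python
# Node operating costs as a drag on a solo staker's APY.
vps_cost_per_year = 60 * 12  # $720 per year
stake_usd = 84_000           # 32 ETH at the time of writing
apy_drag = vps_cost_per_year / stake_usd
print(f"APY lost to node costs: {apy_drag:.2%}")  # ~0.86%
```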
If we want solo staking to continue to be viable, this puts further
emphasis on the need to reduce node operation costs, which will be done
in the Verge: statelessness will remove storage space requirements,
which may be sufficient on its own, and then L1 EVM validity proofs will
make costs completely trivial.
On the other hand, MEV burn arguably helps solo staking.
Although it decreases returns for everyone, it more importantly
decreases variance, making staking less like a lottery.
Finally, any change in issuance interacts with other fundamental
changes to the staking design (eg. rainbow staking). One particular
point of concern is that if staking returns become very low, this means
we have to choose between (i) making penalties also low,
reducing disincentives against bad behavior, and (ii) keeping penalties
high, which would increase the set of circumstances in which even
well-meaning validators accidentally end up with negative returns if
they get unlucky with technical issues or even attacks.
Application layer solutions
The above sections focused on changes to the Ethereum L1 that can
solve important centralization risks. However, Ethereum is not just an
L1, it is an ecosystem, and there are also important application-layer
strategies that can help mitigate the above risks. A few examples
include:
- Specialized staking hardware solutions - some
companies, such as Dappnode, are
selling hardware that is specifically designed to make it as easy as
possible to operate a staking node. One way to make this solution more
effective is to ask the question: if a user is already spending the
effort to have a box running and connected to the internet 24/7, what
other services could it provide (to the user or to others) that benefit
from decentralization? Examples that come to mind include (i) running
locally hosted LLMs, for self-sovereignty and privacy reasons, and (ii)
running nodes for a decentralized VPN.
- Squad staking - this solution from Obol allows multiple
people to stake together in an M-of-N format. This will likely get more
and more popular over time, as statelessness and later L1 EVM validity
proofs will reduce the overhead of running more nodes, and the benefit
of each individual participant needing to worry much less about being
online all the time starts to dominate. This is another way to reduce
the cognitive overhead of staking, and ensure solo staking prospers in
the future.
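Why M-of-N staking relaxes the always-online requirement can be seen with a small availability calculation (assumptions mine: node failures are independent, each node is online 95% of the time, and 5-of-7 is just one hypothetical cluster configuration):

```python
# Probability that at least m of n independently-failing nodes are online,
# given per-node uptime p.
from math import comb

def cluster_uptime(p: float, m: int, n: int) -> float:
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(m, n + 1))

solo = 0.95                          # a lone node online 95% of the time
squad = cluster_uptime(0.95, 5, 7)   # a 5-of-7 cluster of the same nodes
print(f"solo: {solo:.2%}, 5-of-7 squad: {squad:.3%}")  # squad > 99.5%
```

The same mediocre nodes that would make solo staking stressful combine into a cluster whose effective uptime is far higher than any individual member's.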
- Airdrops - Starknet gave an airdrop
to solo stakers. Other projects wishing to have a decentralized and
values-aligned set of users may also consider giving airdrops or
discounts to validators that are identified as probably being solo
stakers.
- Decentralized block building marketplaces - using a
combination of ZK, MPC and TEEs, it's possible to create a decentralized
block builder that participates in, and wins, the APS auction game, but
at the same time provides pre-confirmation privacy and censorship
resistance guarantees to its users. This is another path toward
improving users' welfare in an APS world.
- Application-layer MEV minimization - individual
applications can be built in a way that "leaks" less MEV to L1, reducing
the incentive for block builders to create specialized algorithms to
collect it. One simple strategy that is universal, though inconvenient
and composability-breaking, is for the contract to put all incoming
operations into a queue and execute them in the next block, and auction
off the right to jump the queue. Other more sophisticated approaches
include doing more work offchain eg. as Cowswap does. Oracles can also be
redesigned to minimize oracle-extractable
value.
Possible futures of the Ethereum protocol, part 3: The Scourge
2024 Oct 20 See all postsSpecial thanks to Justin Drake, Caspar Schwarz-Schilling, Phil Daian, Dan Robinson, Charlie Noyes and Max Resnick for feedback and review, and the ethstakers community for discussion.
One of the biggest risks to the Ethereum L1 is proof-of-stake centralizing due to economic pressures. If there are economies-of-scale in participating in core proof of stake mechanisms, this would naturally lead to large stakers dominating, and small stakers dropping out to join large pools. This leads to higher risk of 51% attacks, transaction censorship, and other crises. In addition to the centralization risk, there are also risks of value extraction: a small group capturing value that would otherwise go to Ethereum's users.
Over the last year, our understanding of these risks has increased greatly. It's well understood that there are two key places where this risk exists: (i) block construction, and (ii) staking capital provision. Larger actors can afford to run more sophisticated algorithms ("MEV extraction") to generate blocks, giving them a higher revenue per block. Very large actors can also more effectively deal with the inconvenience of having their capital locked up, by releasing it to others as a liquid staking token (LST). In addition to the direct questions of small vs large stakers, there is also the question of whether or not there is (or will be) too much staked ETH.
The Scourge, 2023 roadmap
This year, there have been significant advancements on block construction, most notably convergence on "committee inclusion lists plus some targeted solution for ordering" as the ideal solution, as well as significant research on proof of stake economics, including ideas such as two-tiered staking models and reducing issuance to cap the percent of ETH staked.
The Scourge: key goals
In this chapter
Fixing the block construction pipeline
What problem are we solving?
Today, Ethereum block construction is largely done through extra-protocol propser-builder separation with MEVBoost. When a validator gets an opportunity to propose a block, they auction off the job of choosing block contents to specialized actors called builders. The task of choosing block contents that maximize revenue is very economies-of-scale intensive: specialized algorithms are needed to determine which transactions to include, in order to extract as much value as possible from on-chain financial gadgets and users' transactions interacting with them (this is what is called "MEV extraction"). Validators are left with the relatively economies-of-scale-light "dumb pipe" task of listening for bids and accepting the highest bid, as well as other responsibilities like attesting.
Stylized diagram of what MEVBoost is doing: specialized builders take on the tasks in the red, and stakers take on the tasks in blue.
There are various versions of this, including "proposer-builder separation" (PBS) and "attester-proposer separation" (APS). The difference between these has to do with fine-grained details around which responsibilities go to which of the two actors: roughly, in PBS, validators still propose blocks, but receive the payload from builders, and in APS, the entire slot becomes the builder's responsibility. Recently, APS is preferred over PBS, because it further reduces incentives for proposers to co-locate with builders. Note that APS would only apply to execution blocks, which contain transactions; consensus blocks, which contain proof-of-stake-related data such as attestations, would still be randomly assigned to validators.
This separation of powers helps keep validators decentralized, but it has one important cost: the actors that are doing the "specialized" tasks can easily become very centralized. Here's Ethereum block building today:
Two actors are choosing the contents of roughly 88% of Ethereum blocks. What if those two actors decide to censor a transaction? The answer is not quite as bad as it might seem: they are not able to reorg blocks, and so you don't need 51% censoring to prevent a transaction from getting included at all: you need 100%. With 88% censoring, a user would need to wait an average of 9 slots to get included (technically, an average of 114 seconds, instead of 6 seconds). For some use cases, waiting for two or even five minutes for certain transactions is fine. But for other use cases, eg. defi liquidations, even the ability to delay inclusion of someone else's transaction by a few blocks is a significant market manipulation risk.
The strategies that block builders can employ to maximize revenue can also have other negative consequences for users. A "sandwich attack" could cause users making token swaps to suffer significant losses from slippage. The transactions introduced to make these attacks clog the chain, increasing gas prices for other users.
What is it, and how does it work?
The leading solution is to break down the block production task further: we give the task of choosing transactions back to the proposer (ie. a staker), and the builder can only choose the ordering and insert some transactions of their own. This is what inclusion lists seek to do.
At time T, a randomly selected staker creates an inclusion list, a list of transactions that are valid given the current state of the blockchain at that time. At time T+1, a block builder, perhaps chosen through an in-protocol auction mechanism ahead of time, creates a block. This block is required to include every transaction in the inclusion list, but they can choose the order, and they can add in their own transactions.
Fork-choice-enforced inclusion lists (FOCIL) proposals involve a committee of multiple inclusion list creators per block. To delay a transaction by one block,
k
ofk
inclusion list creators (eg.k = 16
) would have to censor the transaction. The combination of FOCIL with a final proposer chosen by auction that is required to include the inclusion lists, but can reorder and add new transactions, is often called "FOCIL + APS".A different approach to the problem is multiple concurrent proposers (MCP) schemes such as BRAID. BRAID seeks to avoid splitting up the block proposer role into a low-economies-of-scale part and a high-economies-of-scale part, and instead tries to distribute the block production process among many actors, in such a way that each proposer only needs to have a medium amount of sophistication to maximize their revenue. MCP works by having
k
parallel proposers generate lists of transactions, and then using a deterministic algorithm (eg. order by highest-to-lowest fee) to choose the order.BRAID does not seek to attain the goal of dumb-pipe block proposers running default software being optimal. Two easy-to-understand reasons why it cannot do so are:
In BRAID, attesters can still be separated off and run as a dumb-pipe functionality.
In addition to these two extremes, there is a spectrum of possible designs in between. For example, you could auction off a role that only has the right to append to a block, and not to reorder or prepend. You could even let them append or prepend, but not insert in the middle or reorder. The attraction of these techniques is that the winners of the auction market are likely to be very concentrated, and so there is a lot of benefit to reducing their authority.
Encrypted mempools
One technology that is crucial to the successful implementation of many of these designs (specifically, either BRAID or a version of APS where there are strict limits on the capability being auctioned off) is encrypted mempools. Encrypted mempools are a technology where users broadcast their transactions in encrypted form, along with some kind of proof of their validity, and the transactions are included into blocks in encrypted form, without the block builder knowing the contents. The contents of the transactions are revealed later.
The main challenge in implementing encrypted mempools is coming up with a design that ensures that transactions do all get revealed later: a simple "commit and reveal" scheme does not work, because if revealing is voluntary, the act of choosing to reveal or not reveal is itself a kind of "last-mover" influence on a block that could be exploited. The two leading techniques for this are (i) threshold decryption, and (ii) delay encryption, a primitive closely related to verifiable delay functions (VDFs).
What are some links to existing research?
What is left to do, and what are the tradeoffs?
We can think of all of the above schemes as being different ways of dividing up the authority involved in staking, arranged on a spectrum from lower economies of scale ("dumb-pipe") to higher economies of scale ("specialization-friendly"). Pre-2021, all of these authorities were bundled together in one actor:
The core conundrum is this: any meaningful authority that remains in the hands of stakers is authority that could end up being "MEV-relevant". We want a highly decentralized set of actors to have as much authority as possible; this implies (i) putting a lot of authority in the hands of stakers, and (ii) making sure stakers are as decentralized as possible, meaning that they have few economies-of-scale-driven incentives to consolidate. This is a difficult tension to navigate.
One particular challenge is multi-block MEV: in some cases, execution auction winners can make even more money if they capture multiple slots in a row, and do not allow any MEV-relevant transactions in blocks other than the last one that they control. If inclusion lists force them to include those transactions, they can try to bypass that by not publishing any block at all during those slots. One could make unconditional inclusion lists, which directly become the block if the builder does not provide one, but this makes the inclusion list MEV-relevant. The solution here may involve some compromise: accepting a low degree of incentive to bribe people to include transactions in an inclusion list, and hoping that it's not high enough to lead to mass outsourcing.
We can view FOCIL + APS as follows. Stakers continue to have the authority on the left part of the spectrum, while the right part of the spectrum gets auctioned off to the highest bidder.
BRAID is quite different. The "staker" piece is larger, but it gets split into two pieces: light stakers and heavy stakers. Meanwhile, because transactions are ordered in decreasing order of priority fee, the top-of-block choice gets de-facto auctioned off via the fee market, in a scheme that can be viewed as analogous to enshrined PBS.
Note that the safety of BRAID depends heavily on encrypted mempools; otherwise, the top-of-block auction mechanism becomes vulnerable to strategy-stealing attacks (essentially: copying other people's transactions, swapping the recipient address, and paying a 0.01% higher fee). This need for pre-inclusion privacy is also the reason why enshrined PBS is so tricky to implement.
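The strategy-stealing attack is simple enough to sketch directly. In this toy model (all names and numbers are illustrative), an attacker watching a public mempool copies a profitable transaction, redirects the proceeds, and outbids the original by a tiny margin, so fee-priority ordering puts the copy first:

```python
# Toy sketch of the strategy-stealing attack on fee-priority ordering
# without an encrypted mempool. All names and values are illustrative.

def steal_strategy(victim_tx, fee_bump=1.0001):
    """Copy a profitable transaction, redirect proceeds, outbid by ~0.01%."""
    return {
        "payload": victim_tx["payload"],        # same arbitrage logic
        "recipient": "attacker_address",        # proceeds redirected
        "priority_fee": victim_tx["priority_fee"] * fee_bump,
    }

victim = {"payload": "dex_arb", "recipient": "searcher_address", "priority_fee": 100.0}
attacker = steal_strategy(victim)

# Fee-priority ordering puts the slightly-higher-fee copy first, so the
# attacker's transaction captures the opportunity and the original fails.
ordered = sorted([victim, attacker], key=lambda tx: -tx["priority_fee"])
assert ordered[0]["recipient"] == "attacker_address"
```

With an encrypted mempool, the attacker never sees the payload before inclusion, so there is nothing to copy.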
Finally, more "aggressive" versions of FOCIL + APS, eg. the option where APS only determines the end of the block, look like this:
The main remaining task is to (i) work on solidifying the various proposals and analyzing their consequences, and (ii) combine this analysis with an understanding of the Ethereum community's goals in terms of what forms of centralization it will tolerate. There is also work to be done on each individual proposal, such as:
Additionally, it's worth noting that these different proposals are not necessarily incompatible forks on the road from each other. For example, implementing FOCIL + APS could easily serve as a stepping stone to implementing BRAID. A valid conservative strategy would be a "wait-and-see" approach where we first implement a solution where stakers' authority is limited and most of the authority is auctioned off, and then slowly increase stakers' authority over time as we learn more about the MEV market operation on the live network.
How does it interact with other parts of the roadmap?
There are positive interactions between solving one staking centralization bottleneck and solving the others. To give an analogy, imagine a world where starting your own company required growing your own food, making your own computers and having your own army. In this world, only a few companies could exist. Solving one of the three problems would help the situation, but only a little. Solving two problems would help more than twice as much as solving one. And solving three would be far more than three times as helpful - if you're a solo entrepreneur, either 3/3 problems are solved or you stand no chance.
In particular, the centralization bottlenecks for staking are:
Solving any one of the four increases the gains from solving any of the others.
Additionally, there are interactions between the block construction pipeline and the single slot finality design, particularly in the context of trying to reduce slot times. Many block construction pipeline designs end up increasing slot times, because they involve roles for attesters at multiple steps in the process. For this reason, it can be worth thinking about block construction pipelines and single slot finality simultaneously.
Fixing staking economics
What problem are we solving?
Today, about 30% of the ETH supply is actively staking. This is far more than enough to protect Ethereum from 51% attacks. If the percent of ETH staked grows much larger, researchers fear a different scenario: the risks that would arise if almost all ETH becomes staked. These risks include:
What is it, and how does it work?
Historically, one class of solution has been: if everyone staking is inevitable, and a liquid staking token is inevitable, then let's make staking friendly to having a liquid staking token that is actually trustless, neutral and maximally decentralized. One simple way to do this is to cap staking penalties at eg. 1/8, which would make 7/8 of staked ETH unslashable, and thus eligible to be put into the same liquid staking token. Another option is to explicitly create two tiers of staking: "risk-bearing" (slashable) staking, which would somehow be capped to eg. 1/8 of all ETH, and "risk-free" (unslashable) staking, which everyone could participate in.
However, one criticism of this approach is that it seems economically equivalent to something much simpler: massively reduce issuance if the stake approaches some pre-determined cap. The basic argument is: if we end up in a world where the risk-bearing tier has 3.4% returns and the risk-free tier (which everyone participates in) has 2.6% returns, that's actually the same thing as a world where staking ETH has 0.8% returns and just holding ETH has 0% returns. The dynamics of the risk-bearing tier, including both total quantity staked and centralization, would be the same in both cases. And so we should just do the simple thing and reduce issuance.
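The equivalence argument above is just arithmetic, and can be checked directly. Using the post's illustrative numbers, what matters for the risk-bearing tier is its premium over the return that any passive holder can get:

```python
# Worked check of the equivalence argument, using the post's toy numbers:
# incentives to take on slashing risk depend only on the premium over the
# baseline return that everyone can get passively.

two_tier = {"risk_bearing": 0.034, "risk_free": 0.026}  # everyone holds risk-free
reduced_issuance = {"staking": 0.008, "holding": 0.0}   # everyone can just hold

premium_two_tier = two_tier["risk_bearing"] - two_tier["risk_free"]
premium_reduced = reduced_issuance["staking"] - reduced_issuance["holding"]

# Both worlds offer the same 0.8% premium for bearing slashing risk, so the
# quantity and centralization dynamics of risk-bearing stake are identical.
assert abs(premium_two_tier - premium_reduced) < 1e-12
```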
The main counterargument to this line of argument would be if we can make the "risk-free tier" still have some useful role and some level of risk (eg. as proposed by Dankrad here).
Both of these lines of proposals imply changing the issuance curve, in a way that makes returns prohibitively low if the amount of stake gets too high.
Left: one proposal for an adjusted issuance curve, by Justin Drake. Right: another set of proposals, by Anders Elowsson.
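To make the shape of such a curve concrete, here is a toy issuance function. This is not any of the actual proposals (the soft cap, the 1/sqrt dilution term, and the penalty slope are all made-up parameters for illustration); it only shows the qualitative property that returns fall off sharply, and can go negative, as the staked fraction approaches a cap:

```python
import math

# Toy issuance curve (NOT any specific proposal): yield falls as the staked
# fraction grows, and turns sharply negative past an illustrative soft cap.

SOFT_CAP = 0.25  # illustrative target fraction of all ETH staked

def staking_yield(staked_fraction, base_yield=0.03):
    """Annual staking yield under a made-up dilution-plus-penalty rule."""
    # mild 1/sqrt dilution below the cap, steep linear penalty above it
    dilution = base_yield / math.sqrt(max(staked_fraction, 0.01))
    penalty = max(0.0, staked_fraction - SOFT_CAP) * 0.5
    return dilution - penalty

assert staking_yield(0.10) > staking_yield(0.25) > staking_yield(0.60)
assert staking_yield(0.60) < 0  # negative yield past the cap deters more staking
```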
Two-tier staking, on the other hand, requires setting two return curves: (i) the return rate for "basic" (risk-free or low-risk) staking, and (ii) the premium for risk-bearing staking. There are different ways to set these parameters: for example, if you set a hard parameter that 1/8 of stake is slashable, then market dynamics will determine the premium on the return rate that slashable stake gets.
Another important topic here is MEV capture. Today, revenue from MEV (eg. DEX arbitrage, sandwiching...) goes to proposers, ie. stakers. This is revenue that is completely "opaque" to the protocol: the protocol has no way of knowing if it's 0.01% APR, 1% APR or 20% APR. The existence of this revenue stream is highly inconvenient from multiple angles:
We can solve these problems by finding a way to make MEV revenue legible to the protocol, and capturing it. The earliest proposal was Francesco's MEV smoothing; today, it's widely understood that any mechanism for auctioning off block proposer rights (or, more generally, sufficient authority to capture almost all MEV) ahead of time accomplishes the same goal.
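The mechanism can be sketched with a toy auction. This is a simplified model (sealed bids, toy amounts): the key idea is that competitive bidding for proposer rights pushes the winning bid toward the slot's expected MEV, which makes a previously opaque revenue stream legible to the protocol, and burning the bid captures it:

```python
# Toy sketch of MEV capture via an ahead-of-time block-proposer auction:
# bidders' willingness to pay reveals expected MEV, and the winning bid
# can be burned ("MEV burn"). Purely illustrative, not a real mechanism spec.

def run_proposer_auction(bids):
    """Sealed-bid auction for the right to propose a future slot."""
    winner = max(bids, key=lambda b: b["amount"])
    burned = winner["amount"]  # winning bid is destroyed, not paid to anyone
    return winner["bidder"], burned

bids = [
    {"bidder": "builder_a", "amount": 0.31},  # amounts in ETH, toy values
    {"bidder": "builder_b", "amount": 0.45},
    {"bidder": "builder_c", "amount": 0.40},
]
winner, burned = run_proposer_auction(bids)
assert winner == "builder_b"
assert burned == 0.45  # with competition, this approaches the slot's expected MEV
```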
What are some links to existing research?
What is left to do, and what are the tradeoffs?
The main remaining task is to either agree to do nothing, and accept the risks of almost all ETH being inside LSTs, or finalize and agree on the details and parameters of one of the above proposals. An approximate summary of the benefits and risks is:
* Macroeconomic risks
* MEV burn implementation
* Parameters (eg. the economics that determine the amount staked in the risk-bearing tier)
* MEV burn implementation
How does it interact with other parts of the roadmap?
One important point of intersection has to do with solo staking. Today, the cheapest VPSes that can run an Ethereum node cost about $60 per month, primarily due to hard disk storage costs. For a 32 ETH staker ($84,000 at the time of this writing), this decreases APY by (60 * 12) / 84000 ~= 0.85%. If total staking returns drop below 0.85%, solo staking will be unviable for many people at these levels.

If we want solo staking to continue to be viable, this puts further emphasis on the need to reduce node operation costs, which will be done in the Verge: statelessness will remove storage space requirements, which may be sufficient on its own, and then L1 EVM validity proofs will make costs completely trivial.
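The break-even arithmetic above is worth making explicit. All inputs are the post's own illustrative figures, not live market data:

```python
# The solo-staking break-even calculation, using the post's figures.

VPS_COST_PER_MONTH = 60   # USD, cheapest VPS that can run a node
STAKE_VALUE = 84000       # USD value of 32 ETH at the time of writing

annual_cost = VPS_COST_PER_MONTH * 12        # $720/year
apy_drag = annual_cost / STAKE_VALUE         # fraction of stake eaten by costs

# ~0.857%, i.e. roughly the 0.85% quoted: if total staking returns fall
# below this, solo staking at this scale runs at a loss.
assert abs(apy_drag - 0.00857) < 0.0001
```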
On the other hand, MEV burn arguably helps solo staking. Although it decreases returns for everyone, it more importantly decreases variance, making staking less like a lottery.
Finally, any change in issuance interacts with other fundamental changes to the staking design (eg. rainbow staking). One particular point of concern is that if staking returns become very low, this means we have to choose between (i) making penalties also low, reducing disincentives against bad behavior, and (ii) keeping penalties high, which would increase the set of circumstances in which even well-meaning validators accidentally end up with negative returns if they get unlucky with technical issues or even attacks.
Application layer solutions
The above sections focused on changes to the Ethereum L1 that can solve important centralization risks. However, Ethereum is not just an L1, it is an ecosystem, and there are also important application-layer strategies that can help mitigate the above risks. A few examples include: