Possible futures of the Ethereum protocol, part 6: The Splurge
2024 Oct 29
Special thanks to Justin Drake, Tim Beiko and Yoav Weiss for
feedback and review
Some things are just not easy to put into a single category. There
are lots of "little things" in Ethereum protocol design that are very
valuable for Ethereum's success, but don't fit nicely into a larger
sub-category. In practice, about half of this chapter ends up being about
EVM improvements of various kinds, and the rest is made up of various
niche topics. This is what "the Splurge" is for.
The Splurge, 2023 roadmap
The Splurge: key goals
- Bring the EVM to a performant and stable "endgame state"
- Bring account abstraction in-protocol, allowing all users to benefit
from much more secure and convenient accounts
- Optimize transaction fee economics, increasing scalability while
reducing risks
- Explore advanced cryptography that could make Ethereum far better in
the long run
In this chapter
EVM improvements
What problem does it solve?
The EVM today is difficult to statically analyze, making it difficult
to create highly efficient implementations, formally verify code, and
make further extensions over time. Additionally, it is highly
inefficient, making it difficult to implement many forms of advanced
cryptography unless they are explicitly supported through
precompiles.
What is it, and how does it
work?
The first step in the current EVM improvement roadmap, scheduled to
be included in the next hard fork, is the EVM Object Format (EOF). EOF is
a series of EIPs
that specifies a new version of EVM code that has a number of distinct
features, most notably:
- Separation between code (executable, but not
readable from the EVM) and data (readable, but not
executable)
- Dynamic jumps banned, static jumps only.
- EVM code can no longer observe gas-related
information.
- A new explicit subroutine mechanism is added.
Structure of EOF code
Old-style contracts would continue to exist and be creatable,
although there is a possible path to deprecate old-style contracts (and
perhaps even force-convert them to EOF code) eventually. New-style
contracts would benefit from efficiency gains created by EOF - first,
from slightly smaller bytecode taking advantage of the subroutine
feature, and later from new EOF-specific features, or EOF-specific gas
cost decreases.
After EOF is introduced, it becomes easier to introduce further
upgrades. The most well-developed today is the EVM Modular Arithmetic
Extensions (EVM-MAX). EVM-MAX creates a new set of operations
specifically designed for modular arithmetic, and puts them into a new
memory space that cannot be accessed with other opcodes. This enables
the use of optimizations, such as Montgomery
multiplication.
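As a concrete illustration of the kind of optimization this enables, here
is a minimal Python sketch of Montgomery multiplication. The function
names and parameters are illustrative only (not part of any EIP), and a
real EVM-MAX implementation would work in fixed-width limbs rather than
Python bigints.

    # Minimal sketch of Montgomery reduction (REDC), the kind of trick that
    # modular-arithmetic opcodes could use internally. Illustrative only.
    def montgomery_setup(n, bits=256):
        R = 1 << bits                   # R must be coprime to the odd modulus n
        n_prime = -pow(n, -1, R) % R    # n * n_prime == -1 (mod R)
        return R, n_prime

    def redc(t, n, R, n_prime):
        # Computes t * R^{-1} mod n without ever dividing by n.
        m = (t * n_prime) % R
        u = (t + m * n) >> 256
        return u - n if u >= n else u

    def montgomery_mul(a, b, n, R, n_prime):
        # a, b are in Montgomery form (a = x*R mod n); result is x*y*R mod n.
        return redc(a * b, n, R, n_prime)

    # Usage: multiply 7 * 11 modulo a 255-bit odd modulus.
    n = 2**255 - 19
    R, n_prime = montgomery_setup(n)
    a_m, b_m = (7 * R) % n, (11 * R) % n                # convert into Montgomery form
    out = redc(montgomery_mul(a_m, b_m, n, R, n_prime), n, R, n_prime)
    assert out == (7 * 11) % n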
A newer idea is to combine EVM-MAX with a
single-instruction-multiple-data (SIMD) feature. SIMD has been around as
an idea for Ethereum for a long time starting with Greg Colvin's EIP-616.
SIMD can be used to speed up many forms of cryptography, including hash
functions, 32-bit STARKs, and lattice-based cryptography. EVM-MAX plus
SIMD make for a natural pair of performance-oriented extensions to the
EVM.
An approximate design for a combined EIP would be to take EIP-6690 as a
starting point, and then:
- Allow (i) any odd number or (ii) any power of 2 up to 2^768 as a
modulus
- For each EVMMAX opcode (add, sub, mul), add a version which, instead
of taking 3 immediates x, y, z, takes 7 immediates: x_start, x_skip,
y_start, y_skip, z_start, z_skip, count. In Python code, these opcodes
would do something equivalent to:

    for i in range(count):
        mem[z_start + z_skip * i] = op(
            mem[x_start + x_skip * i],
            mem[y_start + y_skip * i]
        )

Except in an actual implementation, it would be processed in parallel.
- Potentially, add XOR, AND, OR, NOT and SHIFT (both cyclic and
noncyclic), at least for power-of-two moduli. Also add ISZERO (which
pushes the output to the EVM main stack).
This would be powerful enough to implement elliptic curve
cryptography, small-field cryptography (eg. Poseidon, circle STARKs),
conventional hash functions (eg. SHA256, KECCAK, BLAKE), and
lattice-based cryptography.
Other EVM upgrades may also be possible, but so far they have seen
much less attention.
What are some links to
existing research?
What is left to
do, and what are the tradeoffs?
Currently, EOF is scheduled to be included in the next hard fork.
While there is always a possibility to remove it - features have been
last-minute-removed from hard forks before - doing so would be an uphill
battle. Removing EOF would imply making any future upgrades to the EVM
without EOF, which can be done but may be more difficult.
The main tradeoff in upgrading the EVM is L1 complexity versus infrastructure
complexity. EOF is a significant amount of code to add to EVM
implementations, and the static code checks are pretty complex. In
exchange, however, we get simplifications to higher-level languages,
simplifications to EVM implementations, and other benefits. Arguably, a
roadmap which prioritizes continued improvement to the Ethereum L1 would
include and build on EOF.
One important piece of work to do is to implement something like
EVM-MAX plus SIMD and benchmark how much gas various cryptographic
operations would take.
How does
it interact with other parts of the roadmap?
The L1 adjusting its EVM makes it easier for L2s to do the same. One
adjusting without the other creates some incompatibilities, which has
its own downsides. Additionally, EVM-MAX plus SIMD can reduce gas costs
for many proof systems, enabling more efficient L2s. It also makes it
easier to remove more precompiles, by replacing them with EVM code that
can perform the same task perhaps without a large penalty to
efficiency.
Account abstraction
What problem does it solve?
Today, a transaction can only be verified in one way: ECDSA
signatures. Originally, account abstraction was meant to expand beyond
this, and allow an account's verification logic to be arbitrary EVM
code. This could enable a range of applications:
- Switching to quantum-resistant cryptography
- Rotating out old keys (widely understood to be a recommended
security practice)
- Multisig wallets and social
recovery wallets
- Signing with one key for low-value operations and another key (or
set of keys) for high-value operations
- Allowing privacy protocols to work without relayers, significantly
lowering their complexity and removing a key central point of
dependency
Since account abstraction began in 2015, the goals have expanded to
also include a large set of "convenience goals", such as an account that
has no ETH but has some ERC20 being able to pay gas in that ERC20.
Instead of the account abstraction roadmap just abstracting validation,
it aims to abstract everything: authentication (who can perform an
action), authorization (what can they do), replay protection, gas
payment and execution. One summary of these goals is the following
chart:
MPC here is multi-party computation: a 40-year-old
technique to split a key into multiple pieces that are stored on
multiple devices, and use cryptographic techniques to generate a
signature without combining the pieces of the key directly.
EIP-7702 is an
EIP planned to be introduced in the next hard fork. EIP-7702 is the
result of the growing recognition of a need to give the convenience
benefits of account abstraction to all users, including EOA users, to
improve user experience for everyone in the short term, and in a way
that avoids bifurcation into two ecosystems. This work started with EIP-3074, and
culminated in EIP-7702. EIP-7702
makes the "convenience features" of account abstraction available to all
users, including EOAs (externally-owned
accounts, ie. accounts controlled by ECDSA signatures),
today.
As we can see from the chart, while some challenges (especially the
"convenience" challenges) can be solved with incremental techniques such
as multi-party computation or EIP-7702, the bulk of the security goals
that motivated the original account abstraction proposal can only be
solved by going back and solving the original problem: allowing smart
contract code to control transaction verification. The reason why this
has not been done so far is that implementing it safely is a
challenge.
What is it, and how does it
work?
At the core, account abstraction is simple: allow transactions to be
initiated by smart contracts, and not just EOAs. The entire complexity
comes from doing this in a way that is friendly to maintaining a
decentralized network and protecting against denial of service
attacks.
One illustrative example of a key challenge is the multi-invalidation
problem:
If there are 1000 accounts whose validation functions all depend on
some single value S, and there are transactions in the mempool that
are valid given the current value of S, then one single transaction
flipping the value of S could invalidate all of the other transactions
in the mempool. This allows an attacker to spam the mempool, clogging
up the resources of nodes on the network, at a very low cost.
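Here is a toy model of the issue; everything about the mempool and the
validation rule below is invented purely for illustration:

    # Illustrative-only model of the multi-invalidation problem.
    S = 0  # some shared on-chain value that many validation functions read

    def validate(tx):
        # 1000 accounts all use a validation rule that depends on S
        return tx["expected_S"] == S

    mempool = [{"sender": i, "expected_S": 0} for i in range(1000)]
    assert all(validate(tx) for tx in mempool)      # all valid right now

    S = 1  # one cheap transaction flips S...
    assert not any(validate(tx) for tx in mempool)  # ...and invalidates all 1000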
Years of effort trying to expand functionality while limiting DoS
risks have led to convergence on one solution for how to implement
"ideal account abstraction": ERC-4337.
ERC-4337 works by dividing processing of user operations into two
phases: validation and execution. All
validations are processed first, and all executions are processed
second. In the mempool, a user operation is only accepted if its
validation phase only touches its own account (plus a few special-case
extensions, see "associated storage" in ERC-7562), and does
not read environmental variables. This prevents multi-invalidation
attacks. A strict gas limit on the validation step is also enforced.
ERC-4337 was designed as an extra-protocol standard (an ERC), because
at the time the Ethereum client developers were focused on the Merge,
and did not have any spare capacity to work on other features. This is
why ERC-4337 uses its own object called user operations, instead of
regular transactions. More recently, however, we have been realizing
that there is a need to enshrine at least parts of it in the protocol.
Two key reasons are:
- The inherent inefficiencies of the EntryPoint being a contract: a
flat ~100k gas overhead per bundle and thousands extra per user
operation
- The need to make sure Ethereum properties such as inclusion
guarantees created by inclusion lists carry over to account abstraction
users.
Additionally, ERC-4337 has been extended by two features:
- Paymasters: a feature that allows an account to pay
fees on behalf of another account. This violates the rule that only the
sender account itself can be accessed during the validation phase, so
special handling is introduced to allow the paymaster mechanism and
ensure that it is safe.
- Aggregators: a feature that supports signature
aggregation, such as BLS aggregation or SNARK-based aggregation. This is
needed to enable the highest level of data efficiency on rollups.
What are some links
to existing research?
What is left to
do, and what are the tradeoffs?
The main remaining thing to figure out is how to fully bring account
abstraction into the protocol. A recently popular enshrined account
abstraction EIP is EIP-7701, which
implements account abstraction on top of EOF. An account can have a
separate code section for validation, and if an account has that code
section set, that is the code that gets executed during the validation
step of a transaction from that account.
EOF code structure for an EIP-7701 account
What is fascinating about this approach is that it makes it clear
that there are two equivalent ways to view native account
abstraction:
- EIP-4337, but as part of the protocol
- A new type of EOA, where the signature algorithm is EVM code
execution
If we start with strict bounds on the complexity of code that can be
executed during validation - allowing no external state access, and even
at first setting a gas limit too low to be useful for quantum-resistant
or privacy-preserving applications - then the safety of this approach is
very clear: it's just swapping out ECDSA verification for an EVM code
execution that takes a similar amount of time. However, over time we
would need to loosen these bounds, because allowing
privacy-preserving applications to work without relayers, and quantum
resistance, are both very important. And in order to do this,
we do need to find ways to address the DoS risks in a more flexible way,
without requiring the validation step to be ultra-minimalistic.
The main tradeoff seems to be "enshrine something that fewer people
are happy with, sooner" versus "wait longer, and perhaps get a more
ideal solution". The ideal approach will likely be some hybrid approach.
One hybrid approach is to enshrine some use cases more quickly, and
leave more time to figure out others. Another is to deploy more
ambitious versions of account abstraction on L2s first. However, this
has the challenge that for an L2 team to be willing to do the work to
adopt a proposal, they need to be confident that L1 and/or other L2s
will adopt something compatible later on.
Another application that we need to think about explicitly is keystore
accounts, which store account-related state on either L1 or a dedicated
L2, but can be used on both L1 and any compatible L2. Doing this
effectively likely requires L2s to support opcodes such as L1SLOAD or
REMOTESTATICCALL, though it also requires account abstraction
implementations on L2 to support it.
How does
it interact with other parts of the roadmap?
Inclusion lists need to support account abstracted transactions. In
practice, the needs of inclusion lists and the needs of decentralized
mempools end up being pretty similar, though there is slightly more
flexibility for inclusion lists. Additionally, account abstraction
implementations should ideally be harmonized on L1 and L2 as much as
possible. If, in the future, we expect most users to be using keystore
rollups, the account abstraction designs should be built with this in
mind. Gas payment abstraction should also be designed with cross-chain
use cases in mind (see eg. RIP-7755).
EIP-1559 improvements
What problem does it solve?
EIP-1559
activated on Ethereum in 2021, and led to significant improvements in
average block inclusion time.
However, the current implementation of EIP-1559 is imperfect in
several ways:
- The formula is slightly flawed: instead of targeting 50% full blocks, it
targets ~50-53% full blocks depending on variance (this has to do with
what mathematicians call the "AM-GM
inequality")
- It doesn't
adjust fast enough in extreme conditions.
The formula later used for blobs (EIP-4844) was explicitly designed to
address the first concern, and is overall cleaner. Neither EIP-1559
itself, nor EIP-4844, attempt to address the second problem. As a
result, the status quo is a confusing halfway state involving two
different mechanisms, and there is even a case that over time both will
need to be improved.
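For concreteness, here is a simplified side-by-side of the two update
rules; the rounding details and exact constants are abbreviated
relative to the actual EIPs:

    # Simplified comparison of the two basefee update rules. Illustrative only.
    import math

    def eip1559_update(basefee, gas_used, target, denom=8):
        # Multiplicative per-block update: the fee moves by up to 1/8 per block.
        # Because the multiplicative ups and downs don't exactly cancel out
        # (AM-GM), long-run average fullness ends up slightly above 50%.
        return basefee + basefee * (gas_used - target) // target // denom

    def eip4844_style_excess(excess_gas, gas_used, target):
        # Excess usage accumulates (and floors at zero)...
        return max(excess_gas + gas_used - target, 0)

    def eip4844_style_price(excess_gas, min_price=1, update_fraction=3_338_477):
        # ...and the price is an exponential of the excess, so in the long run
        # average usage converges to exactly the target.
        return min_price * math.exp(excess_gas / update_fraction)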
In addition to this, there are other weaknesses of Ethereum resource
pricing that are independent of EIP-1559, but which could be solved by
tweaks to EIP-1559. A major one is average case vs worst case
discrepancies: resource prices in Ethereum have to be set to be
able to handle the worst case, where a block's entire gas consumption
takes up one resource, but average-case use is much less than this,
leading to inefficiencies.
What is it, and how does it
work?
A solution to these inefficiencies is multidimensional
gas: having separate prices and limits for separate resources. This
concept is technically independent from EIP-1559, but EIP-1559 makes it
easier: without EIP-1559, optimally packing a block with multiple
resource constraints is a complicated multidimensional
knapsack problem. With EIP-1559, most blocks are not at full
capacity on any resource, and so the simple algorithm of "accept
anything that pays a sufficient fee" suffices.
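A sketch of the resulting block-building logic under multiple
dimensions; the resource names and limits below are illustrative, not a
proposal:

    # Illustrative sketch: greedy block packing under multidimensional limits.
    # Under EIP-1559-style pricing, blocks are rarely full on any dimension,
    # so "take everything that pays the basefee" is good enough, with no
    # knapsack optimization over which combination of transactions to include.
    LIMITS = {"execution": 30_000_000, "calldata": 1_000_000, "state_growth": 50_000}

    def pack_block(txs, basefees):
        used = {k: 0 for k in LIMITS}
        block = []
        for tx in txs:
            if any(tx["fee_per_gas"][k] < basefees[k] for k in LIMITS):
                continue  # doesn't pay the basefee on some dimension
            if any(used[k] + tx["gas"][k] > LIMITS[k] for k in LIMITS):
                continue  # would exceed some per-dimension limit
            block.append(tx)
            for k in LIMITS:
                used[k] += tx["gas"][k]
        return block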
We have multidimensional gas for execution and blobs today; in
principle, we could increase this to more dimensions:
calldata, state reads/writes, and
state size expansion.
EIP-7706
introduces a new gas dimension for calldata. At the same time, it
streamlines the multidimensional gas mechanism by making all three types
of gas fall under one (EIP-4844-style) framework, thus also solving the
mathematical flaws with EIP-1559.
EIP-7623 is a
more surgical solution to the average case vs worst case resource
problem that more strictly bounds max calldata without introducing a
whole new dimension.
A further direction to go would be to tackle the update rate problem,
and find a basefee calculation algorithm that is faster, and at the same
time preserves the key invariants introduced by the EIP-4844 mechanism
(namely: in the long run average usage approaches exactly the
target).
What are some links
to existing research?
What is left to
do, and what are the tradeoffs?
Multidimensional gas has two primary tradeoffs:
- It adds complexity to the protocol
- It adds complexity to the optimal algorithm needed to fill a block
to capacity
Protocol complexity is a relatively small issue for calldata, but
becomes a larger issue for gas dimensions that are "inside the EVM",
such as storage reads and writes. The problem is that it's not just
users that set gas limits: it's also contracts that set limits when they
call other contracts. And today, the only way they have to set limits is
one-dimensional.
One easy way to eliminate this problem is to make multidimensional
gas only available inside EOF, because EOF does not allow contracts to
set gas limits in calls to other contracts. Non-EOF contracts would have
to pay a fee in all types of gas when making a storage operation (eg.
if an SLOAD costs 0.03% of a block's storage access gas limit, the
non-EOF user would also be charged 0.03% of the execution gas limit).
More research on multidimensional gas would be very helpful in
understanding the tradeoffs and figuring out the ideal balance.
How does
it interact with other parts of the roadmap?
A successful implementation of multidimensional gas can greatly
reduce certain "worst-case" resource usages, and thus reduce pressure on
the need to optimize performance in order to support eg. STARKed
hash-based binary trees. Having a hard target for state size growth
would make it much easier for client developers to plan and estimate
their requirements going forward into the future.
As described above, EOF makes more extreme versions of
multidimensional gas significantly easier to implement due to its gas
non-observability properties.
Verifiable delay functions
(VDFs)
What problem does it solve?
Today, Ethereum uses RANDAO-based
randomness to choose proposers. RANDAO-based randomness works by
asking each proposer to reveal a secret that they committed to ahead of
time, and mixing each revealed secret into the randomness. Each proposer
thus has "1 bit of manipulation": they can change the randomness (at a
cost) by not showing up. This is reasonably okay for finding proposers,
because it's very rare that you can give yourself two new proposal
opportunities by giving up one. But it's not okay for on-chain
applications that need randomness. Ideally, we would find a more robust
source of randomness.
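As a rough sketch of the mechanism, heavily simplified relative to the
actual beacon chain spec:

    # Simplified sketch of RANDAO-style randomness accumulation.
    from hashlib import sha256

    def mix_in(randao_mix: bytes, revealed_secret: bytes) -> bytes:
        # Each proposer's revealed value is hashed and XORed into the mix.
        h = sha256(revealed_secret).digest()
        return bytes(a ^ b for a, b in zip(randao_mix, h))

    mix = bytes(32)
    for reveal in [b"proposer 1 reveal", b"proposer 2 reveal"]:
        mix = mix_in(mix, reveal)
    # A proposer's only lever is to withhold their reveal (at a cost),
    # which leaves the previous mix in place: "1 bit of manipulation".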
What is it, and how does it
work?
Verifiable delay
functions are a type of function that can only be computed
sequentially, with no speedups from parallelization. A simple example is
repeated hashing: compute for i in range(10**9): x = hash(x). The
output, proven with
a SNARK proof of correctness, could be used as a random value. The idea
is that the input is selected based on information available at time T,
and the output is not yet known at time T: it only becomes available
some time after T, once someone fully runs the computation. Because
anyone can run the computation, there is no possibility to withhold the
result, and so there is no ability to manipulate the outcome.
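A toy version of this flow, with the SNARK proof left out (here the
"verifier" simply recomputes) and an unrealistically small iteration
count:

    # Toy sketch of using a delay function for randomness. Illustrative only:
    # the real proposal pairs this with a SNARK proof so verification is cheap.
    from hashlib import sha256

    ITERATIONS = 10_000   # real parameters target minutes of sequential work

    def vdf(seed: bytes) -> bytes:
        x = seed
        for _ in range(ITERATIONS):
            x = sha256(x).digest()   # sequential: each step needs the previous one
        return x

    seed = sha256(b"information available at time T, eg. a RANDAO mix").digest()
    output = vdf(seed)           # only known some time after T
    assert vdf(seed) == output   # anyone can recompute it, so no one can withhold it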
The main risk to a verifiable delay function is unexpected
optimization: someone figures out how to run the function much
faster than expected, allowing them to manipulate the information they
reveal at time T based on the future output. Unexpected optimization can
happen in two ways:
- Hardware acceleration: someone makes an ASIC that
runs the computation loop much faster than existing hardware.
- Unexpected parallelization: someone finds a way to
run the function faster by parallelizing it, even if doing so requires
100x more resources.
The task of creating a successful VDF is to avoid these two issues,
while at the same time keeping efficiency practical (eg. one problem
with the hash-based approach is that SNARK-proving over hashing in real
time has heavy hardware requirements). Hardware acceleration is
typically addressed by having a public-good actor itself create and
distribute reasonably-close-to-optimal ASICs for the VDF.
What are some links
to existing research?
What is left to
do, and what are the tradeoffs?
Currently, there is no VDF construction that fully satisfies Ethereum
researchers on all axes. More work is left to find such a function. If
we have it, the main tradeoff is simply whether or not to include it: a
simple tradeoff of functionality versus protocol complexity and risk to
security. If we think a VDF is secure, but it ends up being insecure,
then depending on how it's implemented security degrades to either the
RANDAO assumption (1 bit of manipulation per attacker) or something
slightly worse. Hence, even a broken VDF would not break the protocol,
though it would break applications or any new protocol features that
strongly depend on it.
How does
it interact with other parts of the roadmap?
The VDF is a relatively self-contained ingredient of the Ethereum
protocol, though in addition to increasing the security of proposer
selection it also has uses in (i) onchain applications that depend on
randomness, and potentially (ii) encrypted mempools, though making
encrypted mempools based on a VDF still depends on additional
cryptographic discoveries which have not yet happened.
One point to keep in mind is that given uncertainty in hardware,
there will be some "slack" between when a VDF output is produced and
when it becomes needed. This means that information will be accessible a
few blocks ahead. This can be an acceptable cost, but should be taken
into account in eg. single-slot finality or committee selection
designs.
Obfuscation
and one-shot signatures: the far future of cryptography
What problem does it solve?
One of Nick Szabo's most famous posts is a 1997 essay on "God
protocols". In this essay, he points out that often, multi-party
applications depend on a "trusted third party" to manage the
interaction. The role of cryptography, in his view, is to create a
simulated trusted third party that does the same job, without actually
requiring any trust in any specific actor.
"Mathematically trustworthy protocol", diagram by Nick
Szabo
So far, we have only been able to partially approach this ideal. If
all we need is a transparent virtual computer, where the data
and computation cannot be shut down, censored or tampered with, but
privacy is not a goal, then blockchains can do it, though with limited
scalability. If privacy is a goal, then up until recently we
have only been able to make a few specific protocols for specific
applications: digital signatures for basic authentication, ring signatures
and linkable
ring signatures for primitive forms of anonymity, identity-based
encryption to enable more convenient encryption under specific
assumptions about a trusted issuer, blind
signatures for Chaumian e-cash, and so
on. This approach requires lots of work for every new application.
In the 2010s, we saw the first glimpse of a different, and more
powerful approach, based on programmable
cryptography. Instead of creating a new protocol for each new
application, we could use powerful new protocols - specifically,
ZK-SNARKs - to add cryptographic guarantees to
arbitrary programs. ZK-SNARKs allow a user to prove any
arbitrary statement about data that they hold, in a way that the
proof (i) is easy to verify, and (ii) does not leak any data other than
the statement itself. This was a huge step forward for privacy and
scalability at the same time, that I have likened to the effect
of transformers
in AI. Thousands of man-years of application-specific work were
suddenly swept away by a general-purpose solution that you can just plug
in to solve a surprisingly wide range of problems.
But ZK-SNARKs are only the first in a trio of similar extremely
powerful general-purpose primitives. These protocols are so powerful
that when I think of them, they remind me of a set of extremely powerful
cards in Yu-Gi-Oh, a card game and a TV show that I used to play and
watch when I was a young child: the Egyptian god
cards. The Egyptian god cards are a trio of extremely powerful
cards, which according to legend are potentially deadly to manufacture,
and are so powerful that they are not allowed in duels. Similarly, in
cryptography, we have the trio of Egyptian god protocols:
What is it, and how does it
work?
ZK-SNARKs are one of these three protocols that we
already have, to a high level of maturity. After large improvements
to prover speed and developer-friendliness in the last five years,
ZK-SNARKs have become the bedrock of Ethereum's scalability and privacy
strategy. But ZK-SNARKs have an important limitation: you need to know
the data to make proofs about it. Each piece of state in a ZK-SNARK
application must have a single "owner", who must be around to approve
any reads or writes to it.
The second protocol, which does not have this limitation, is fully
homomorphic encryption (FHE). FHE lets you
do any computation on encrypted data without seeing the
data. This lets you do computations on a user's data for the
user's benefit while keeping the data and the algorithm private. It also
lets you extend voting systems such as
MACI to have almost-perfect security and privacy guarantees. FHE was
for a long time considered too inefficient for practical use, but now
it's finally becoming efficient enough that we are starting to see
applications.
Cursive, an
application that uses two-party computation and FHE to do
privacy-preserving discovery of common interests.
But FHE too has its limits: any FHE-based technology still requires
someone to hold the decryption key. This could be an M-of-N
distributed setup, and you can even use TEEs to add a second layer of
defense, but it's still a limitation.
This gets us to the third protocol, which is more powerful than the
other two combined: indistinguishability
obfuscation. While it's still
very far from maturity, as of 2020 we have
theoretically valid protocols for it based on standard security
assumptions, and work is recently
starting on implementations. Indistinguishability obfuscation lets
you create an "encrypted program" that performs an arbitrary
computation, in such a way that all internal details of the program are
hidden. As a simple example, you can put a private key into an
obfuscated program which only lets you use it to sign prime numbers, and
distribute this program to other people. They can use the program to
sign any prime number, but cannot take the key out. But it's far more
powerful than that: together with hashes, it can be used to implement
any other cryptographic primitive, and more.
The only thing that an obfuscated program can't do is prevent itself
from being copied. But for that, there is something even more powerful
on the horizon, though it depends on everyone having quantum computers:
quantum one-shot
signatures.
With obfuscation and one-shot signatures together, we can build
almost perfect trustless third parties. The only thing we can't do with
cryptography alone, and that we would still need a blockchain for, is
guaranteeing censorship resistance. These technologies would allow us to
not only make Ethereum itself much more secure, but also build much more
powerful applications on top of it.
To see how each of these primitives adds additional power, let us go
through a key example: voting. Voting is a fascinating
problem because it has so many tricky security properties that need to
be satisfied, including very strong forms of both verifiability and
privacy. While voting protocols with strong security properties have existed for decades, let
us make the problem harder for ourselves by saying that we want a design
that can handle arbitrary voting protocols: quadratic
voting, pairwise-bounded
quadratic funding, cluster-matching
quadratic funding, and so on. That is, we want the "tallying" step
to be an arbitrary program.
- First, suppose we put votes publicly on a
blockchain. This gets us public
verifiability (anyone can verify that the final outcome is
correct, including tallying rules and eligibility rules) and
censorship resistance (can't stop people from voting).
But we have no privacy.
- Then, we add ZK-SNARKs. Now, we have
privacy: each vote is anonymous, while ensuring that
only authorized voters can vote, and every voter can only vote
once.
- Now, we add the MACI mechanism.
Votes are encrypted to a central server's decryption key. The central
server is required to run the tallying process, including throwing out
duplicate votes, and it publishes a ZK-SNARK proving the answer. This
keeps the previous guarantees (even if the server is cheating!), but if
the server is honest it adds a coercion-resistance
guarantee: a user can't prove how they voted, even if they want to. This
is because while a user can prove the vote that they made, they have no
way to prove that they did not make another vote that cancels it out.
This prevents bribery and other attacks.
- We run the tallying inside FHE, and then have an
N/2-of-N threshold-decryption
computation decrypt it. This makes the coercion-resistance
guarantee N/2-of-N,
instead of 1-of-1.
- We make the tallying program obfuscated, and we
design the obfuscated program so that it can only give an output if
given permission to do so, either by a proof of blockchain consensus, or
by some quantity of proof of work, or both. This makes the
coercion-resistance guarantee almost perfect: in the blockchain
consensus case, you would need 51% of validators to collude to break it,
and in the proof of work case, even if everyone colludes, re-running the
tally with different subsets of voters to try to extract the behavior of
a single voter would be extremely expensive. We can even make the
program make a small random adjustment to the final tally, to make it
even harder to extract the behavior of an individual voter.
- We add one-shot signatures, a primitive that
depends on quantum computing that allows signatures that can only be
used to sign a message of a certain type once. This makes the
coercion-resistance guarantee truly perfect.
Indistinguishability obfuscation also allows for other powerful
applications. For example:
- DAOs, on-chain auctions, and other applications with
arbitrary internal secret state.
- A truly universal trusted
setup: someone can create an obfuscated program that
contains a key, and can run any program and provide the output,
putting hash(key, program) in as an input into the program. Given
such a program, anyone can also put the program into itself, combining
the program's pre-existing key with their own key, and in doing so
extend the setup. This can be used to generate a 1-of-N trusted setup
for any protocol.
- ZK-SNARKs whose verification is just a signature.
Implementing this is simple: have a trusted setup where someone creates
an obfuscated program that only signs a message with a key if it's a
valid ZK-SNARK.
- Encrypted mempools. It becomes trivially easy to
encrypt transactions in such a way that they only get decrypted when
some onchain event in the future happens. This could even include the
successful execution of a VDF.
With one-shot signatures, we can make blockchains immune to
finality-reverting 51% attacks, though censorship attacks continue to be
possible. Primitives similar to one-shot signatures enable quantum money,
solving the double-spend problem without a blockchain, though many more
complex applications would still require a chain.
If these primitives can be made efficient enough, then most
applications in the world can be made decentralized. The main bottleneck
would be verifying the correctness of implementations.
What are some links
to existing research?
What is left to
do, and what are the tradeoffs?
There is a heck of a lot left to do.
Indistinguishability obfuscation is incredibly immature, and candidate
constructions are millions of times too slow (if not more) to be usable
in applications. Indistinguishability obfuscation is famous for having
runtimes that are "theoretically" polynomial-time, but take longer than
the lifetime of the universe to run in practice. More recent protocols
have made runtimes less extreme, but the overhead is still far too high
for regular use: one implementer expects a runtime of one year.
Quantum computers do not even exist: all constructions you might read
about on the internet today are either prototypes not capable of doing
any computation larger than 4 bits, or are not real quantum computers,
in the sense that while they may have quantum parts in them, they cannot
run actually-meaningful computations like Shor's
algorithm or Grover's
algorithm. Recently, there have been signs that "real" quantum
computers are no longer
that far away. However, even if "real" quantum computers come soon,
the day when regular people have quantum computers on their laptops or
phones may well be decades after the day when powerful institutions get
one that can crack elliptic curve cryptography.
For indistinguishability obfuscation, one key tradeoff is in security
assumptions. There are more aggressive designs that use
exotic assumptions.
These often have more realistic runtimes, but the exotic assumptions
sometimes end up
broken. Over time, we may end up understanding lattices enough to
make assumptions that do not get broken. However, this path is more
risky. The more conservative path is to insist on protocols whose
security provably reduces to "standard" assumptions, but this may mean
that it takes much longer until we get protocols that run fast
enough.
How does
it interact with other parts of the roadmap?
Extremely powerful cryptography could change the game completely. For
example:
- If we get ZK-SNARKs which are as easy to verify as a signature, we
may not need any aggregation protocols; we can just verify onchain
directly.
- One-shot signatures could imply much more secure proof-of-stake
protocols.
- Many complicated privacy protocols could be replaced with "just"
having a privacy-preserving EVM.
- Encrypted mempools become much easier to implement.
At first, the benefits will come on the application layer, because
the Ethereum L1 inherently needs to be conservative on security
assumptions. However, even application-layer use alone could be
game-changing, by as much as the advent of ZK-SNARKs has been.
Possible futures of the Ethereum protocol, part 6: The Splurge
2024 Oct 29 See all postsSpecial thanks to Justin Drake, Tim Beiko and Yoav Weiss for feedback and review
Some things are just not easy to put into a single category. There are lots of "little things" in Ethereum protocol design that are very valuable for Ethereum's success, but don't fit nicely into a larger sub-category. In practice, about half of which has ended up being about EVM improvements of various kinds, and the rest is made up of various niche topics. This is what "the Splurge" is for.
The Splurge, 2023 roadmap
The Splurge: key goals
In this chapter
EVM improvements
What problem does it solve?
The EVM today is difficult to statically analyze, making it difficult to create highly efficient implementations, formally verify code, and make further extensions to over time. Additionally, it is highly inefficient, making it difficult to implement many forms of advanced cryptography unless they are explicitly supported through precompiles.
What is it, and how does it work?
The first step in the current EVM improvement roadmap, scheduled to be included in the next hard fork, is the EVM Object Format (EOF). EOF is a series of EIPs that specifies a new version of EVM code that has a number of distinct features, most notably:
Structure of EOF code
Old-style contracts would continue to exist and be createable, although there is a possible path to deprecate old-style contracts (and perhaps even force-convert them to EOF code) eventually. New-style contracts would benefit from efficiency gains created by EOF - first, from slightly smaller bytecode taking advantage of the subroutine feature, and later from new EOF-specific features, or EOF-specific gas cost decreases.
After EOF is introduced, it becomes easier to introduce further upgrades. The most well-developed today is the EVM Modular Arithmetic Extensions (EVM-MAX). EVM-MAX creates a new set of operations specifically designed for modular arithmetic, and puts them into a new memory space that cannot be accessed with other opcodes. This enables the use of optimizations, such as Montgomery multiplication.
A newer idea is to combine EVM-MAX with a single-instruction-multiple-data (SIMD) feature. SIMD has been around as an idea for Ethereum for a long time starting with Greg Colvin's EIP-616. SIMD can be used to speed up many forms of cryptography, including hash functions, 32-bit STARKs, and lattice-based cryptography. EVM-MAX plus SIMD make for a natural pair of performance-oriented extensions to the EVM.
An approximate design for a combined EIP would be to take EIP-6690 as a starting point, and then:
For each EVMMAX opcode (
Except in an actual implementation, it would be processed in parallel.add
,sub
,mul
) add a version which, instead of taking 3 immediatesx
,y
,z
, takes 7 immediates:x_start
,x_skip
,y_start
,y_skip
,z_start
,z_skip
,count
. In python code, these opcodes would do something equivalent to:XOR
,AND
,OR
,NOT
andSHIFT
(both cyclic and noncyclic), at least for power-of-two moduli. Also addISZERO
(which pushes the output to EVM main stack)This would be powerful enough to implement elliptic curve cryptography, small-field cryptography (eg. Poseidon, circle STARKs), conventional hash functions (eg. SHA256, KECCAK, BLAKE), and lattice-based cryptography.
Other EVM upgrades may also be possible, but so far they have seen much less attention.
What are some links to existing research?
What is left to do, and what are the tradeoffs?
Currently, EOF is scheduled to be included in the next hard fork. While there is always a possibility to remove it - features have been last-minute-removed from hard forks before - doing so would be an uphill battle. Removing EOF would imply making any future upgrades to the EVM without EOF, which can be done but may be more difficult.
The main tradeoff in EVM is L1 complexity versus infrastructure complexity. EOF is a significant amount of code to add to EVM implementations, and the static code checks are pretty complex. In exchange, however, we get simplifications to higher-level languages, simplifications to EVM implementations, and other benefits. Arguably, a roadmap which prioritizes continued improvement to the Ethereum L1 would include and build on EOF.
One important piece of work to do is to implement something like EVM-MAX plus SIMD and benchmark how much gas various cryptographic operations would take.
How does it interact with other parts of the roadmap?
The L1 adjusting its EVM makes it easier for L2s to do the same. One adjusting without the other creates some incompatibilities, which has its own downsides. Additionally, EVM-MAX plus SIMD can reduce gas costs for many proof systems, enabling more efficient L2s. It also makes it easier to remove more precompiles, by replacing them with EVM code that can perform the same task perhaps without a large penalty to efficiency.
Account abstraction
What problem does it solve?
Today, a transaction can only be verified in one way: ECDSA signatures. Originally, account abstraction was meant to expand beyond this, and allow an account's verification logic to be arbitrary EVM code. This could enable a range of applications:
Since account abstraction began in 2015, the goals have expanded to also include a large set of "convenience goals", such as an account that has no ETH but has some ERC20 being able to pay gas in that ERC20. Instead of the account abstraction roadmap just abstracting validation, it aims to abstract everyghing: authentication (who can perform an action), authorization (what can they do), replay protection, gas payment and execution. One summary of these goals is the following chart:
MPC here is multi-party computation: a 40-year-old technique to split a key into multiple pieces that are stored on multiple devices, and use cryptographic techniques to generate a signature without combining the pieces of the key directly.
EIP-7702 is an EIP planned to be introduced in the next hard fork. EIP-7702 is the result of the growing recognition of a need to give the convenience benefits of account abstraction to all users, including EOA users, to improve user experience for everyone in the short term, and in a way that avoids bifurcation into two ecosystems. This work started with EIP-3074, and culminated in EIP-7702. EIP-7702 makes the "convenience features" of account abstraction available to all users, including EOAs (externally-owned accounts, ie. accounts controlled by ECDSA signatures), today.
As we can see from the chart, while some challenges (especially the "convenience" challenges) can be solved with incremental techniques such as multi-party computation or EIP-7702, the bulk of the security goals that motivated the original account abstraction proposal can only be solved by going back and solving the original problem: allowing smart contract code to control transaction verification. The reason why this has not been done so far is that implementing it safely is a challenge.
What is it, and how does it work?
At the core, account abstraction is simple: allow transactions to be initiated by smart contracts, and not just EOAs. The entire complexity comes from doing this in a way that is friendly to maintaining a decentralized network and protecting against denial of service attacks.
One illustrative example of a key challenge is the multi-invalidation problem:
If there are 1000 accounts whose validation function all depends on some single value
S
, and there are transactions in the mempool that are valid given the current value ofS
, then one single transaction flipping the value ofS
could invalidate all of the other transactions in the mempool. This allows for an attacker to spam the mempool, clogging up the resources of nodes on the network, at a very low cost.Years of effort trying to expand functionality while limiting DoS risks have led to convergence on one solution for how to implement "ideal account abstraction": ERC-4337.
ERC-4337 works by dividing processing of user operations into two phases: validation and execution. All validations are processed first, and all executions are processed second. In the mempool, a user operation is only accepted if its validation phase only touches its own account (plus a few special-case extensions, see "associated storage" in ERC-7562), and does not read environmental variables. This prevents multi-invalidation attacks. A strict gas limit on the validation step is also enforced.
ERC-4337 was designed as an extra-protocol standard (an ERC), because at the time the Ethereum client developers were focused on the Merge, and did not have any spare capacity to work on other features. This is why ERC-4337 uses its own object called user operations, instead of regular transactions. More recently, however, we have been realizing that there is a need to enshrine at least parts of it in the protocol. Two key reasons are:
Additionally, ERC-4337 has been extended by two features:
What are some links to existing research?
What is left to do, and what are the tradeoffs?
The main remaining thing to figure out is how to fully bring account abstraction into the protocol. A recently popular enshrined account abstraction EIP is EIP-7701, which implements account abstraction on top of EOF. An account can have a separate code section for validation, and if an account has that code section set, that is the code gets executed during the validation step of a transaction from that account.
EOF code structure for an EIP-7701 account
What is fascinating about this approach is that it makes it clear that there are two equivalent ways to view native account abstraction:
If we start with strict bounds on the complexity of code that can be executed during validation - allowing no external state access, and even at first setting a gas limit too low to be useful for quantum-resistant or privacy-preserving applications - then the safety of this approach is very clear: it's just swapping out ECDSA verification for an EVM code execution that takes a similar amount of time. However, over time we would need to loosen these bounds, because allowing privacy-preserving applications to work without relayers, and quantum resistance, are both very important. And in order to do this, we do need to find ways to address the DoS risks in a more flexible way, without requiring the validation step to be ultra-minimalistic.
The main tradeoff seems to be "enshrine something that fewer people are happy with, sooner" versus "wait longer, and perhaps get a more ideal solution". The ideal approach will likely be some hybrid approach. One hybrid approach is to enshrine some use cases more quickly, and leave more time to figure out others. Another is to deploy more ambitious versions of account abstraction on L2s first. However, this has the challenge that for an L2 team to be willing to do the work to adopt a proposal, they need to be confident that L1 and/or other L2s will adopt something compatible later on.
Another application that we need to think about explicitly is keystore accounts, which store account-related state on either L1 or a dedicated L2, but can be used both L1 and any compatible L2. Doing this effectively likely requires L2s to support opcodes such as
L1SLOAD
orREMOTESTATICCALL
, though it also requires account abstraction implementations on L2 to support it.How does it interact with other parts of the roadmap?
Inclusion lists need to support account abstracted transactions. In practice, the needs of inclusion lists and the needs of decentralized mempools end up being pretty similar, though there is slightly more flexibility for inclusion lists. Additionally, account abstraction implementations should ideally be harmonized on L1 and L2 as much as possible. If, in the future, we expect most users to be using keystore rollups, the account abstraction designs should be built with this in mind. Gas payment abstraction should also be designed with cross-chain use cases in mind (see eg. RIP-7755).
EIP-1559 improvements
What problem does it solve?
EIP-1559 activated on Ethereum in 2021, and led to significant improvements in average block inclusion time.
However, the current implementation of EIP-1559 is imperfect in several ways:
The formula later used for blobs (EIP-4844) was explicitly designed to address the first concern, and is overall cleaner. Neither EIP-1559 itself, nor EIP-4844, attempt to address the second problem. As a result, the status quo is a confusing halfway state involving two different mechanisms, and there is even a case that over time both will need to be improved.
In addition to this, there are other weaknesses of Ethereum resource pricing that are independent of EIP-1559, but which could be solved by tweaks to EIP-1559. A major one is average case vs worst case discrepancies: resource prices in Ethereum have to be set to be able to handle the worst case, where a block's entire gas consumption takes up one resource, but average-case use is much less than this, leading to inefficiencies.
What is it, and how does it work?
A solution to these inefficiencies is multidimensional gas: having separate prices and limits for separate resources. This concept is technically independent from EIP-1559, but EIP-1559 makes it easier: without EIP-1559, optimally packing a block with multiple resource constraints is a complicated multidimensional knapsack problem. With EIP-1559, most blocks are not at full capacity on any resource, and so the simple algorithm of "accept anything that pays a sufficient fee" suffices.
We have multidimensional gas for execution and blobs today; in principle, we could increase this to more dimensions: calldata, state reads/writes, and state size expansion.
EIP-7706 introduces a new gas dimension for calldata. At the same time, it streamlines the multidimensional gas mechanism by making all three types of gas fall under one (EIP-4844-style) framework, thus also solving the mathematical flaws with EIP-1559.
EIP-7623 is a more surgical solution to the average case vs worst case resource problem that more strictly bounds max calldata without introducing a whole new dimension.
A further direction to go would be to tackle the update rate problem, and find a basefee calculation algorithm that is faster, and at the same time preserves the key invariants introduced by the EIP-4844 mechanism (namely: in the long run average usage approaches exactly the target).
What are some links to existing research?
What is left to do, and what are the tradeoffs?
Multidimensional gas has two primary tradeoffs:
Protocol complexity is a relatively small issue for calldata, but becomes a larger issue for gas dimensions that are "inside the EVM", such as storage reads and writes. The problem is that it's not just users that set gas limits: it's also contracts that set limits when they call other contracts. And today, the only way they have to set limits is one-dimensional.
One easy way to eliminate this problem is to make multidimensional gas only available inside EOF, because EOF does not allow contracts to set gas limits in calls to other contracts. Non-EOF contracts would have to pay a fee in all types of gas when making a storage operation (eg. if an
SLOAD
costs 0.03% of a block's storage access gas limit, the non-EOF user would also be charged 0.03% of the execution gas limit)More research on multidimensional gas would be very helpful in understanding the tradeoffs and figuring out the ideal balance.
How does it interact with other parts of the roadmap?
A successful implementation of multidimensional gas can greatly reduce certain "worst-case" resource usages, and thus reduce pressure on the need to optimize performance in order to support eg. STARKed hash-based binary trees. Having a hard target for state size growth would make it much easier for client developers to plan and estimate their requirements going forward into the future.
As described above, EOF makes more extreme versions of multidimensional gas significantly easier to implement due to its gas non-observability properties.
Verifiable delay functions (VDFs)
What problem does it solve?
Today, Ethereum uses RANDAO-based randomness to choose proposers. RANDAO-based randomness works by asking each proposer to reveal a secret that they committed to ahead of time, and mixing each revealed secret into the randomness. Each proposer thus has "1 bit of manipulation": they can change the randomness (at a cost) by not showing up. This is reasonably okay for finding proposers, because it's very rare that you can give yourself two new proposal opportunities by giving up one. But it's not okay for on-chain applications that need randomness. Ideally, we would find a more robust source of randomness.
What is it, and how does it work?
Verifiable delay functions are a type of function that can only be computed sequentially, with no speedups from parallelization. A simple example is repeated hashing: compute
for i in range(10**9): x = hash(x)
. The output, proven with a SNARK proof of correctness, could be used as a random value. The idea is that the input is selected based on information available at time T, and the output is not yet known at time T: it only becomes available some time after T, once someone fully runs the computation. Because anyone can run the computation, there is no possibility to withhold the result, and so there is no ability to manipulate the outcome.The main risk to a verifiable delay function is unexpected optimization: someone figures out how to run the function much faster than expected, allowing them to manipulate the information they reveal at time T based on the future output. Unexpected optimization can happen in two ways:
The tasks of creating a successful VDF is to avoid these two issues, while at the same time keeping efficiency practical (eg. one problem with the hash-based approach is that SNARK-proving over hashing in real time has heavy hardware requirements). Hardware acceleration is typically solved by having a public-good actor create and distribute reasonably-close-to-optimal ASICs for the VDF by itself.
What are some links to existing research?
What is left to do, and what are the tradeoffs?
Currently, there is no VDF construction that fully satisfies Ethereum researchers on all axes. More work is left to find such a function. If we have it, the main tradeoff is simply whether or not to include it: a simple tradeoff of functionality versus protocol complexity and risk to security. If we think a VDF is secure, but it ends up being insecure, then depending on how it's implemented security degrades to either the RANDAO assumption (1 bit of manipulation per attacker) or something slightly worse. Hence, even a broken VDF would not break the protocol, though it would break applications or any new protocol features that strongly depend on it.
How does it interact with other parts of the roadmap?
The VDF is a relatively self-contained ingredient of the Ethereum protocol, though in addition to increasing the security of proposer selection it also has uses in (i) onchain applications that depend on randomness, and potentially (ii) encrypted mempools, though making encrypted mempools based on a VDF still depends on additional cryptographic discoveries which have not yet happened.
One point to keep in mind is that given uncertainty in hardware, there will be some "slack" between when a VDF output is produced and when it becomes needed. This means that information will be accessible a few blocks ahead. This can be an acceptable cost, but should be taken into account in eg. single-slot finality or committee selection designs.
Obfuscation and one-shot signatures: the far future of cryptography
What problem does it solve?
One of Nick Szabo's most famous posts is a 1997 essay on "God protocols". In this essay, he points out that often, multi-party applications depend on a "trusted third party" to manage the interaction. The role of cryptography, in his view, is to create a simulated trusted third party that does the same job, without actually requiring any trust in any specific actor.
"Mathematically trustworthy protocol", diagram by Nick Szabo
So far, we have only been able to partially approach this ideal. If all we need is a transparent virtual computer, where the data and computation cannot be shut down, censored or tampered with, but privacy is not a goal, then blockchains can do it, though with limited scalability. If privacy is a goal, then up until recently we have only been able to make a few specific protocols for specific applications: digital signatures for basic authentication, ring signatures and linkable ring signatures for primitive forms of anonymity, identity-based encryption to enable more convenient encryption under specific assumptions about a trusted issuer, blind signatures for Chaumian e-cash, and so on. This approach requires lots of work for every new application.
In the 2010s, we saw the first glimpse of a different, and more powerful approach, based on programmable cryptography. Instead of creating a new protocol for each new application, we could use powerful new protocols - specifically, ZK-SNARKs - to add cryptographic guarantees to arbitrary programs. ZK-SNARKs allow a user to prove any arbitrary statement about data that they hold, in a way that the proof (i) is easy to verify, and (ii) does not leak any data other than the statement itself. This was a huge step forward for privacy and scalability at the same time, that I have likened to the effect of transformers in AI. Thousands of man-years of application-specific work were suddenly swept away by a general-purpose solution that you can just plug in to solve a surprisingly wide range of problems.
But ZK-SNARKs are only the first in a trio of similar extremely powerful general-purpose primitives. These protocols are so powerful that when I think of them, they remind me of a set of extremely powerful cards in Yu-Gi-Oh, a card game and a TV show that I used to play and watch when I was a young child: the Egyptian god cards. The Egyptian god cards are a trio of extremely powerful cards, which according to legend are potentially deadly to manufacture, and are so powerful that they are not allowed in duels. Similarly, in cryptography, we have the trio of Egyptian god protocols:
What is it, and how does it work?
Of these three protocols, ZK-SNARKs are the one that we already have, at a high level of maturity. After large improvements to prover speed and developer-friendliness over the last five years, ZK-SNARKs have become the bedrock of Ethereum's scalability and privacy strategy. But ZK-SNARKs have an important limitation: you need to know the data in order to make proofs about it. Each piece of state in a ZK-SNARK application must have a single "owner", who must be around to approve any reads or writes to it.
The second protocol, which does not have this limitation, is fully homomorphic encryption (FHE). FHE lets you do any computation on encrypted data without seeing the data. This lets you do computations on a user's data for the user's benefit while keeping the data and the algorithm private. It also lets you extend voting systems such as MACI to have almost-perfect security and privacy guarantees. FHE was for a long time considered too inefficient for practical use, but now it's finally becoming efficient enough that we are starting to see applications.
Cursive, an application that uses two-party computation and FHE to do privacy-preserving discovery of common interests.
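To give a minimal flavor of what "computation on encrypted data" means, here is a toy somewhat-homomorphic scheme loosely in the spirit of DGHV-style integer FHE. It is deliberately tiny and completely insecure, and real FHE libraries work very differently; it only shows that you can add and multiply bits without ever decrypting them.

```python
import secrets

# Toy "somewhat homomorphic" encryption over the integers: ciphertexts can be
# added and multiplied a limited number of times before noise grows too large.
# Parameters are tiny and insecure; this only illustrates the idea of
# computing on data without decrypting it.

P = 1000003                        # the secret key: an odd integer (toy-sized)

def encrypt(bit: int) -> int:
    q = secrets.randbelow(2**40)   # random multiple of the key hides the bit
    r = secrets.randbelow(10)      # small noise
    return q * P + 2 * r + bit

def decrypt(c: int) -> int:
    return (c % P) % 2

# Adding ciphertexts XORs the underlying bits; multiplying ANDs them,
# as long as the accumulated noise stays below P.
a, b = encrypt(1), encrypt(0)
assert decrypt(a + b) == 1                      # 1 XOR 0 = 1
assert decrypt(a * b) == 0                      # 1 AND 0 = 0
assert decrypt(encrypt(1) * encrypt(1)) == 1    # 1 AND 1 = 1
print("homomorphic add/mul on encrypted bits works")
```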
But FHE too has its limits: any FHE-based technology still requires someone to hold the decryption key. This could be a M-of-N distributed setup, and you can even use TEEs to add a second layer of defense, but it's still a limitation.
This gets us to the third protocol, which is more powerful than the other two combined: indistinguishability obfuscation. While it's still very far from maturity, as of 2020 we have theoretically valid protocols for it based on standard security assumptions, and work on implementations has recently begun. Indistinguishability obfuscation lets you create an "encrypted program" that performs an arbitrary computation, in such a way that all internal details of the program are hidden. As a simple example, you can put a private key into an obfuscated program which only lets you use it to sign prime numbers, and distribute this program to other people. They can use the program to sign any prime number, but cannot take the key out. But it's far more powerful than that: together with hashes, it can be used to implement any other cryptographic primitive, and more.
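To make the prime-signing example concrete, here is a sketch of the plaintext functionality that one would feed into an obfuscator. In plain Python the key is of course readable, and HMAC stands in for a real signature scheme; the point of indistinguishability obfuscation is that users could call the function but not extract the key.

```python
import hmac, hashlib

# The functionality that would be obfuscated: a program holding a signing key
# that only signs prime numbers. Under indistinguishability obfuscation, users
# could run sign_if_prime but could not read SECRET_KEY out of the program.

SECRET_KEY = b"hypothetical signing key"   # would be hidden by the obfuscator

def is_prime(n: int) -> bool:
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def sign_if_prime(n: int) -> bytes:
    if not is_prime(n):
        raise ValueError("this program only signs prime numbers")
    return hmac.new(SECRET_KEY, str(n).encode(), hashlib.sha256).digest()

print(sign_if_prime(101).hex())   # works: 101 is prime
```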
The only thing that an obfuscated program can't do is prevent itself from being copied. But for that, there is something even more powerful on the horizon, though it depends on everyone having quantum computers: quantum one-shot signatures.
With obfuscation and one-shot signatures together, we can build almost perfect trustless third parties. The only thing we can't do with cryptography alone, and that we would still need a blockchain for, is guaranteeing censorship resistance. These technologies would allow us to not only make Ethereum itself much more secure, but also build much more powerful applications on top of it.
To see how each of these primitives adds additional power, let us go through a key example: voting. Voting is a fascinating problem because it has so many tricky security properties that need to be satisfied, including very strong forms of both verifiability and privacy. While voting protocols with strong security properties have existed for decades, let us make the problem harder for ourselves by saying that we want a design that can handle arbitrary voting protocols: quadratic voting, pairwise-bounded quadratic funding, cluster-matching quadratic funding, and so on. That is, we want the "tallying" step to be an arbitrary program.
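As a sketch of what "the tallying step is an arbitrary program" means, here is a hypothetical election pipeline that accepts any tally function. In the real designs the ballots would stay encrypted (eg. inside MACI or under FHE) and only the tally output would ever be revealed; the names here are purely illustrative.

```python
import math

# The same pipeline works for any tally function: simple counting,
# quadratic voting, or anything more exotic.

def plurality_tally(ballots):
    # ballots: list of candidate ids
    totals = {}
    for choice in ballots:
        totals[choice] = totals.get(choice, 0) + 1
    return totals

def quadratic_tally(ballots):
    # ballots: list of (candidate id, tokens spent); vote weight = sqrt(tokens)
    totals = {}
    for choice, tokens in ballots:
        totals[choice] = totals.get(choice, 0) + math.sqrt(tokens)
    return totals

def run_election(ballots, tally_program):
    # In the "God protocol" version, only this output would ever be visible.
    return tally_program(ballots)

print(run_election(["A", "B", "A"], plurality_tally))
print(run_election([("A", 4), ("B", 9), ("A", 1)], quadratic_tally))
```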
Indistinguishability obfuscation also allows for other powerful applications. For example, it enables trusted setups without any trusted party: create an obfuscated program that contains a key and can run any program, putting hash(key, program) in as an input into the program. Given such a program, anyone can also put the program into itself, combining the program's pre-existing key with their own key, and in doing so extend the setup. This can be used to generate a 1-of-N trusted setup for any protocol (see the key-combining sketch below).

With one-shot signatures, we can make blockchains immune to finality-reverting 51% attacks, though censorship attacks continue to be possible. Primitives similar to one-shot signatures enable quantum money, solving the double-spend problem without a blockchain, though many more complex applications would still require a chain.
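Here is the key-combining sketch mentioned above, heavily simplified and ignoring the obfuscation machinery itself: each participant hashes their own secret into the running key, so reconstructing the final key requires every contribution, and a single honest participant who discards their secret is enough to keep it unknown. All names and the initial key are illustrative.

```python
import hashlib

# Each participant folds their own secret into the running key by hashing.
# The final key depends on every contribution, so as long as at least one
# participant keeps (or destroys) their secret, nobody can reconstruct it.

def combine(prev_key: bytes, contribution: bytes) -> bytes:
    return hashlib.sha256(prev_key + contribution).digest()

key = b"key baked into the initial obfuscated program"
for secret in [b"alice's secret", b"bob's secret", b"carol's secret"]:
    key = combine(key, secret)

print("combined setup key:", key.hex())
```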
If these primitives can be made efficient enough, then most applications in the world can be made decentralized. The main bottleneck would be verifying the correctness of implementations.
What are some links to existing research?
What is left to do, and what are the tradeoffs?
There is a heck of a lot left to do. Indistinguishability obfuscation is incredibly immature, and candidate constructions are millions of times too slow (if not more) to be usable in applications. Indistinguishability obfuscation is famous for having runtimes that are "theoretically" polynomial-time, but take longer than the lifetime of the universe to run in practice. More recent protocols have made runtimes less extreme, but the overhead is still far too high for regular use: one implementer expects a runtime of one year.
Quantum computers do not even exist: all constructions you might read about on the internet today are either prototypes not capable of doing any computation larger than 4 bits, or are not real quantum computers, in the sense that while they may have quantum parts in them, they cannot run actually-meaningful computations like Shor's algorithm or Grover's algorithm. Recently, there have been signs that "real" quantum computers are no longer that far away. However, even if "real" quantum computers come soon, the day when regular people have quantum computers on their laptops or phones may well be decades after the day when powerful institutions get one that can crack elliptic curve cryptography.
For indistinguishability obfuscation, one key tradeoff is in security assumptions. There are more aggressive designs that use exotic assumptions. These often have more realistic runtimes, but the exotic assumptions sometimes end up broken. Over time, we may end up understanding lattices enough to make assumptions that do not get broken. However, this path is more risky. The more conservative path is to insist on protocols whose security provably reduces to "standard" assumptions, but this may mean that it takes much longer until we get protocols that run fast enough.
How does it interact with other parts of the roadmap?
Extremely powerful cryptography could change the game completely, both for the Ethereum protocol itself and for the applications built on top of it.
At first, the benefits will come on the application layer, because the Ethereum L1 inherently needs to be conservative on security assumptions. However, even use on the application layer alone could be game-changing, to the same degree that the advent of ZK-SNARKs has been.