eth2 insights: aggregation performance
January 26, 2022
In our eth2 insights series, Elias Simos, Protocol Specialist at Coinbase Cloud, shares his joint research with Sid Shekhar (Blockchain Research lead at Coinbase), and looks at why aggregation in eth2 appeared ineffective over the course of Medalla testing.
In eth2, aggregation means collecting valuable network information from the peer-to-peer (P2P) layer and packaging it together for efficient submission on-chain. This is done by validators in each consensus round, and involves combining individual attestation [identification] signatures broadcast by other nodes and submitting them for inclusion on-chain as a group.
Attestations are the consensus function taking up most chain space — especially in Phase 0 when there will be no on-chain transactions. So, aggregation is a vital protocol function in an eth2 chain designed for scalability.
According to the protocol’s rules, every validator in the active set is either proposing a block or allocated an attestation slot at every epoch. Attesters are organized into committees, each of which represents a group of validators assigned to attest in the same target slot (or duty slot) in a given epoch. Some attesters on those committees are chosen to be aggregators.
Too many attestations
Given these rules, you would expect the number of attestations found on-chain in each epoch to be roughly equal to the number of validators in the active set for that epoch. “Roughly”, because some validators are chosen as proposers (32 per epoch in Phase 0), and some would probably not take part in consensus due to downtime. The protocol only needs two thirds of activated validators to attest to reach finality.
We found, however, that over the 14,500 epochs we surveyed (about two months in human time), Medalla testing included 50-100% more attestations than activated validators per epoch.
There is a possible reason behind this.
Figure 1: individual attestations included on-chain vs. active validators, over time
An aggregator collects “unaggregated” attestations from a broadcast domain on the P2P network, then amasses them into a single proof, which is published on another GossipSub broadcast domain. These aggregated attestations end up on-chain when the block proposer “sees” a pool of attestations either from the aggregated attestations broadcast domain or some other source (such as "unaggregated" attestations on the attestation subnets) and copies those that:
1. are valid for that block (include attestations targeting the slot proposed)
2. add value to that block (include information not previously included in the block)
So a block producer will sometimes include duplicate votes. Consider the following two grouped attestation examples, targeting the same block slot and checkpoints:
A. signed by validators [1, 2]
B. signed by validators [2, 3]
Because of how BLS (Boneh-Lynn-Shacham) aggregation works, these two attestations cannot be aggregated: they partially overlap (validator 2 signed both), so both will be included on-chain.
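The overlap rule can be sketched in a few lines of Python. This is a toy model, not client code: the `Aggregate` class, the `data` string and both helper functions are hypothetical names introduced here, and real attestations carry a bitfield and a BLS signature rather than a set of indices. It only illustrates why A and B above cannot be merged, yet B still "adds value" to a block that already holds A.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Aggregate:
    """Toy stand-in for an attestation: what was voted on, and by whom."""
    data: str            # hypothetical key standing in for slot + checkpoints
    signers: frozenset   # validator indices whose votes are included


def try_merge(a: Aggregate, b: Aggregate) -> Optional[Aggregate]:
    """Merge two aggregates only if they vote on the same data and share no signer."""
    if a.data != b.data or a.signers & b.signers:
        return None
    return Aggregate(a.data, a.signers | b.signers)


def adds_value(candidate: Aggregate, already_included: set) -> bool:
    """True if the candidate carries at least one vote the block does not hold yet."""
    return bool(candidate.signers - already_included)


a = Aggregate("slot-100", frozenset({1, 2}))
b = Aggregate("slot-100", frozenset({2, 3}))
c = Aggregate("slot-100", frozenset({4}))

print(try_merge(a, b))                # None: validator 2 appears in both
print(try_merge(a, c))                # one aggregate covering validators 1, 2 and 4
print(adds_value(b, set(a.signers)))  # True: validator 3 is new, so B goes in as well
```

Under these assumptions, validator 2's vote ends up on-chain twice, which is exactly the kind of duplicate the rest of this analysis counts.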
The ‘singles’ problem
We found that attestations containing duplicate votes were often included in different blocks, some after a delay. To some degree, this is expected and assists the network. However, given that the number of surplus inclusions is between 50% and 100%, it might be sensible to think of those attestations containing duplicate votes as bloat. It all depends on how they are aggregated.
Consider a committee of five attesting to the same target slot. For the surplus votes to stand at 60-80%, various combinations of aggregates might take up less or more chain state space:
More lean: 2 attestations: [1, 2, 3, 4] and [2, 3, 4, 5]
More bloated: 5 attestations: [1], [1, 2], [2, 3], [3, 4], [4, 5]
What we observed in Medalla looks more like the second option.
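The arithmetic behind the two cases can be checked with a small sketch. The `surplus` helper is a hypothetical name used only for this illustration. The bloated list uses one single-vote attestation plus four pairs, which is what an 80% surplus over a committee of five requires; which validator the single belongs to does not affect the result.

```python
def surplus(aggregates, committee_size):
    """Return (extra votes as a fraction of committee size, number of attestations)."""
    total_votes = sum(len(signers) for signers in aggregates)
    return total_votes / committee_size - 1, len(aggregates)


lean = [[1, 2, 3, 4], [2, 3, 4, 5]]
bloated = [[1], [1, 2], [2, 3], [3, 4], [4, 5]]

for name, aggregates in (("lean", lean), ("bloated", bloated)):
    extra, count = surplus(aggregates, committee_size=5)
    print(f"{name}: {extra:.0%} surplus votes in {count} attestations")
# lean: 60% surplus votes in 2 attestations
# bloated: 80% surplus votes in 5 attestations
```

The surplus in votes is similar in both cases, but the bloated packing spends more than twice as many attestation objects to carry them.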
To reduce the computational load, the demonstration here looks only at the span of epochs 0 to 1,500 (the first week of the Medalla testnet) and includes about 2.3m aggregated attestations.
Figure 2: distribution of aggregated attestations by number of votes they included
We found three distinct clusters:
Attestations that included a single validator vote
Attestations that included 2-15 validator votes
Attestations that included 80-120 validator votes
Notably, 37% of all attestations committed on-chain included only one validator vote, which we will call “singles”.
We can now also look at the average inclusion delay over included grouped attestations by the number of votes they packed together.
Figure 3: inclusion delay vs number of votes each aggregated attestation packed
At first, this looks like a case of “If you want to go fast, go alone!” The singles were included with a delay of 10 slots on average, while the larger groups were included with a delay of approximately 12 slots. This might imply, given the roughly 75% surplus in validator votes across Medalla, that singles — which appear to be included first — get added again in larger aggregates that were included later.
Grouping singles by the slot that they were targeting shows that, on average, it took 22 separate single attestations to get them included. For reference, this exercise grouped together about 39k separate target slots (80% of those in the sample).
Figure 4: distribution of aggregated attestations including only one vote, targeting the same block
When looking at the distribution of inclusion delays among singles, 37% of the singles — 14% of total grouped attestations in the sample — were included with a delay of 1.
Figure 5: distribution of the inclusion delay of “singles”
Finally, when grouping those singles included with a delay of 1 slot by target slot, 40% (5% of all attestations in the sample) were targeting the same block and being included at the same time, yet were committed in three or more separate attestations.
Figure 6: distribution of the inclusion delay of “singles” targeting the same block
Similarly, extending the analysis to those with an inclusion delay of between 2 and 5, we found these were included in over two separate attestations on average — covering about 60% of the sample together with those with an inclusion delay of 1, and mapping to 10% of all attestations included between epochs 0 and 1,500.
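The grouping used in this step can be sketched as follows. The record layout `(target_slot, inclusion_slot, signers)` and the function name are assumptions made for this example, not the schema of the original dataset. The sketch also ignores checkpoints and block roots, so it treats every single with the same target slot as mergeable, which is a simplification.

```python
from collections import Counter


def unmerged_singles(records, max_delay=1):
    """Count singles per (target_slot, inclusion_slot) included within max_delay slots.

    Only groups with two or more singles are returned: votes for the same slot
    that landed in the same block but were still stored as separate attestations.
    """
    groups = Counter()
    for target_slot, inclusion_slot, signers in records:
        delay = inclusion_slot - target_slot
        if len(signers) == 1 and 1 <= delay <= max_delay:
            groups[(target_slot, inclusion_slot)] += 1
    return {key: count for key, count in groups.items() if count >= 2}


# Invented toy rows, only to show the shape of the input.
sample = [
    (100, 101, [7]),
    (100, 101, [9]),
    (100, 101, [12]),
    (100, 101, [1, 2, 3]),
    (100, 103, [5]),
    (102, 103, [4]),
]
print(unmerged_singles(sample))               # {(100, 101): 3}
print(unmerged_singles(sample, max_delay=5))  # {(100, 101): 3}
```

Raising `max_delay` to 5 mirrors the extension to inclusion delays of 2 to 5; in the toy data the extra singles each sit alone in their group, so the result does not change.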
To boil down all these stats and terminology into one insight, 50-100% more attestations than activated validators were included per epoch over the course of Medalla testing.
Further investigation (in the sample of epochs 0 to 1,500) found that over a third of attestations included on-chain are “singles.” Of these, approximately 50% should have been aggregated as they were targeting the same block slot and were included at the same time.
While it is better to withhold final judgement until the whole dataset is analyzed, there are strong indications that aggregation in Medalla is not performing as intended.
Given the life cycle of aggregations, the probable cause is either the proposer (picking singles over aggregates) or the aggregator (packing singles as aggregates), with inclusion delays exaggerating the outcome. It’s also worth underlining that neither party is explicitly incentivized by the protocol to be more “efficient” when it comes to aggregations.
More effective aggregation means less state bloat and more space for useful information to be stored. It also enhances protocol security and overall guarantees, by allowing whistleblowers to survey past history for protocol level violations more quickly (and more cheaply) thanks to the reduced volume of data.
In a world of limited resources, an economically “rational” client side of the network will prioritize building features that help validators keep good uptime and get votes included as fast as possible. These functions help maximize rewards for operators. Rational operators will optimize similarly. This incentive structure may lead to a negative network-level externality that manifests as inefficient aggregation.
Given its lifecycle stage, the eth2 development team is focused on implementation, not on protocol design and specifications. However, the protocol could also incentivize efficient aggregation (even off-chain) by linking aggregators to the attestations they package and issuing rewards retrospectively.
If aggregation remains a problem that network participants will solve only out of goodwill, it’s likely to remain a lasting issue.
Thank you to Sid Shekhar, Lakshman Sankar, Paul Hauner and Jim McDonald for their thoughtful questions, prompts and the time they devoted to discussing the original findings of eth2data.github.io