eth2 insights: slashings

January 26, 2022

In our eth2 insights series, Elias Simos, Protocol Specialist at Coinbase Cloud, shares his joint research with Sid Shekhar, Blockchain Research Lead at Coinbase, and looks at the correlations and probable causes of slashings in Medalla testing.

Slashing is a core protocol function and mechanism enabler in proof of stake networks. In crude terms, if block rewards are the carrot, slashing is the stick — a negative incentive to stop network participants misbehaving.

In Phase 0 of eth2, slashing befalls validators who:

double attest or double propose
surround vote

To understand slashing, it’s necessary to remember how consensus is achieved in eth2. At a very high level, the body of validators broadcasts votes on the state of the chain (attestations), which are aggregated and passed to the block producer to include in a block. So what exactly do the votes include?

Figure 1: The form of an aggregate attestation as it is included on-chain

The form of attester votes

An attester must provide two types of vote:

one towards the head of the chain
one towards finality

These votes are described by the following core components/fields:

Target slot: the slot allocated to the attester by the RANDAO in a given epoch. It also signifies the vote towards the head of the chain, by LMD-GHOST.
Committee index: the group of validators to which each attester is assigned. Committees help aggregate votes, which end up on-chain as “grouped” attestations.
Source checkpoint (s): a checkpoint is the first (proposed) block of an epoch. The source checkpoint is a vote towards what the attester recognizes as the last justified epoch boundary, set by Casper FFG. A checkpoint is justified if two-thirds of validators in the active set vote for it.
Target checkpoint (t): a vote towards the next justified checkpoint, by Casper FFG. Source and target votes help get the chain state to finalize.
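The fields above can be pictured as a small record. What follows is a minimal Python sketch, loosely modelled on the attestation data container in the eth2 Phase 0 specification. The class names and sample values are illustrative assumptions, not Medalla data. The sketch also assumes that the head vote is carried as a block root next to the slot.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Checkpoint:
    epoch: int  # epoch whose first block is the checkpoint
    root: str   # block root of that checkpoint (short hex string for readability)


@dataclass(frozen=True)
class AttestationData:
    slot: int               # slot the attester was assigned to
    index: int              # committee index within that slot
    beacon_block_root: str  # head vote (LMD-GHOST): the attester's view of the chain head
    source: Checkpoint      # FFG vote: last justified checkpoint the attester recognizes
    target: Checkpoint      # FFG vote: checkpoint the attester wants justified next


# Illustrative values only (32 slots per epoch, so slot 70,405 falls in epoch 2,200).
vote = AttestationData(
    slot=70_405,
    index=3,
    beacon_block_root="0xaa",
    source=Checkpoint(epoch=2_199, root="0xbb"),
    target=Checkpoint(epoch=2_200, root="0xcc"),
)
```

Making the records immutable (frozen) means two votes can be compared field by field, which is what the slashing conditions rely on.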

Once two-thirds of active validator votes are on-chain for a pair of source (s) and target (t) checkpoints related to an epoch, they become justified. Then the previous epoch’s pair becomes finalized — that way the state of the chain before that checkpoint boundary cannot be changed.
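The two-thirds threshold can be written as a one-line check. This is a sketch only: the function name is hypothetical, and it assumes votes are weighted by staked balance rather than counted per validator.

```python
def is_supermajority(voting_balance: int, total_active_balance: int) -> bool:
    """Return True when at least two-thirds of the active stake voted for the pair."""
    # Integer arithmetic avoids floating-point rounding right at the threshold.
    return voting_balance * 3 >= total_active_balance * 2
```

For example, 200 out of 300 units of stake is enough to justify, while 199 out of 300 is not.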

Slashing rules!

Casper FFG outlines two slashing conditions:

Double vote: when two conflicting attester votes from the same validator, which share the same target checkpoint (t1 = t2), get included on-chain. The proposer equivalent, a double proposal, is when a proposer signs two different beacon blocks in the same slot.
Surround vote: when an attester vote surrounds, or is surrounded by, a previous vote from the same validator, such that s1 < s2 < t2 < t1.
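Both conditions reduce to comparisons on source and target epochs. The sketch below follows the shape of the slashability check in the Phase 0 specification, but the `Vote` type and function names are simplified stand-ins introduced here for illustration. It assumes each vote is well formed (source earlier than target).

```python
from typing import NamedTuple


class Vote(NamedTuple):
    source_epoch: int
    target_epoch: int
    head_root: str  # stands in for every other field of the attestation


def is_double_vote(a: Vote, b: Vote) -> bool:
    # Two different votes for the same target epoch (t1 = t2).
    return a != b and a.target_epoch == b.target_epoch


def surrounds(a: Vote, b: Vote) -> bool:
    # a surrounds b when s1 < s2 and t2 < t1.
    return a.source_epoch < b.source_epoch and b.target_epoch < a.target_epoch


def is_slashable(a: Vote, b: Vote) -> bool:
    # The surround check is applied in both directions.
    return is_double_vote(a, b) or surrounds(a, b) or surrounds(b, a)
```

An identical vote broadcast twice is not slashable under this check, because the two records compare as equal.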

Slashing in Phase 0 triggers a balance reduction (depending on how many other validators get slashed at the same time) and exit from the active set. Validators must join the exit queue and lose the right to participate in consensus and earn rewards. Their ETH will also be frozen until Phase 1.5.

A slashable offence is identified by a “whistleblower,” who collects evidence of the offence and submits proof(s) to a block proposer for inclusion on-chain, at which point the protocol-level penalties are enacted. After Phase 0 the whistleblower will earn the majority of the resulting reward (seven-eighths), but in Phase 0 the block proposer earns the full reward.
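The reward split described above can be sketched as follows. The function name and the flag are hypothetical, and the total reward is passed in as a plain number because this article does not state how it is sized.

```python
def split_slashing_reward(reward: int, separate_whistleblower: bool) -> dict:
    """Split a slashing reward between the block proposer and the whistleblower.

    separate_whistleblower=False models Phase 0, where the proposer keeps everything.
    """
    if not separate_whistleblower:
        return {"proposer": reward, "whistleblower": 0}
    proposer_cut = reward // 8                  # one-eighth to the proposer
    whistleblower_cut = reward - proposer_cut   # remaining seven-eighths
    return {"proposer": proposer_cut, "whistleblower": whistleblower_cut}
```

With a reward of 800 units, the post-Phase 0 split gives 100 to the proposer and 700 to the whistleblower, while the Phase 0 case gives all 800 to the proposer.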

Slashable offences need not be discovered immediately. The protocol allows a time window between an offence and the inclusion of slashing proof, known as the “weak-subjectivity” period. This can last for a maximum of 54,000 epochs.
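To put that window into wall-clock terms, the short calculation below converts epochs to days. It assumes the Phase 0 timing of 32 slots per epoch and 12 seconds per slot.

```python
SLOTS_PER_EPOCH = 32   # assumed Phase 0 value
SECONDS_PER_SLOT = 12  # assumed Phase 0 value
SECONDS_PER_DAY = 86_400


def epochs_to_days(epochs: int) -> float:
    """Convert a number of epochs to days under the timing assumptions above."""
    return epochs * SLOTS_PER_EPOCH * SECONDS_PER_SLOT / SECONDS_PER_DAY


print(epochs_to_days(54_000))  # 240.0
```

Under those assumptions, 54,000 epochs is roughly eight months of chain history.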

This window implies that whistleblowers need to store an extensive record of chain history and usage to check for violations. Given the size of the chain and the large space taken up by attestations, a leaner aggregated-attestations state means a leaner overall history, which might enable better detection of violations.

The Medalla testing slashathon

Over the 14,500 epochs surveyed in eth2data.github.io, we observed about 5,100 attester slashings and 50 proposer slashings. However, these numbers, taken from the beaconcha.in application programming interface (API), don't match those reported by beaconscan.com, which stand at approximately 1,900 attester slashings.

The error probably lies with beaconcha.in and is likely to relate to it tracking “submitted slashing proofs,” not actual slashings. Proofs for the same event can be submitted many times, but only the first enforces punishment.
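If the discrepancy does come from repeated proofs, one way to reconcile the two counts is to keep only the earliest proof per slashed validator. The sketch below assumes a simplified input of (inclusion epoch, slashed validator index) rows. That layout is hypothetical and is not the beaconcha.in API schema.

```python
def first_proof_per_validator(proofs):
    """Map each slashed validator to the epoch of the earliest proof against it.

    proofs: iterable of (inclusion_epoch, validator_index) rows (hypothetical layout).
    """
    first_seen = {}
    for epoch, validator in sorted(proofs):
        # sorted() orders rows by epoch first, so setdefault keeps the earliest one.
        first_seen.setdefault(validator, epoch)
    return first_seen


# Made-up rows: validator 7 is reported twice, validator 9 once.
rows = [(12, 7), (10, 7), (11, 9)]
enacted = first_proof_per_validator(rows)
print(len(rows), len(enacted))  # 3 2
```

Here three submitted proofs correspond to only two enacted slashings.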

But while the nominal values might be misleading, it is still worth examining overall trends.

Attester slashings

By far the majority of the attester violations in Medalla took place around epoch 2,200, at the start of the roughtime incident. This was when Prysm client clocks went out of sync with network time, so validators incorrectly proposed blocks and submitted attestations for future slots.

Figure 2: Density plot of attester violations committed over time (S = Blue, T = Orange)

While offences centered around the beginning of roughtime, only about 20% of the proofs were included at that time. Some 94% of the slashing proofs were included by epoch 4,000. The rest rolled in all the way to the end of our sample of 14,500 epochs — a significant delay.

Figure 3: Slashing proofs included on-chain over time

Let’s call the gap between the offence and proof being included on-chain the “detection delay.”

On average, the attester offence detection delay was 750 epochs. Yet the top 10% of slashers managed to get proofs included in less than 20 epochs. When we tried to link fast slashers to client choice, we found no deviation from the network-wide client distribution among the third we could identify.
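Computing the detection delay is a simple subtraction per slashing. The sketch below uses a made-up sample, not the Medalla data set, and the helper names are introduced here for illustration.

```python
from statistics import mean


def detection_delays(records):
    """records: (offence_epoch, inclusion_epoch) pairs. Returns delays in epochs."""
    return [included - offence for offence, included in records]


def share_below(delays, threshold):
    """Fraction of delays strictly below the threshold."""
    return sum(d < threshold for d in delays) / len(delays)


# Made-up sample, NOT Medalla data.
sample = [(2200, 2210), (2200, 2215), (2201, 2950), (2205, 4000), (2300, 2400)]
delays = detection_delays(sample)
average_delay = mean(delays)
fast_share = share_below(delays, 20)
```

A few very late inclusions pull the mean far above the typical fast case, which is the same skew the figures above describe.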

Figure 4: distribution plot of detection delays

Things get really interesting when we compare the aggregate attestations surplus over time with the average detection delay for slashable offences, both measured on a per-epoch basis.

Figure 5: Attestations surplus vs average detection delay over time

When the attestations surplus was highest (early in Medalla, up to epoch 2,200), slashings took an average of 3,000 epochs to be detected. As the testnet matured, the average detection delay dropped to about 500 epochs.

However, as the attestations surplus grew, the detection delay followed. This suggests that poor aggregation performance in Medalla directly affects how quickly slashings are enacted.
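One way to quantify how closely the two per-epoch series move together is a Pearson correlation coefficient. The helper below is a plain-Python sketch. It is an illustration, not necessarily the method used to produce Figure 5, and a high coefficient would indicate co-movement rather than prove causation.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # Raises ZeroDivisionError if either series is constant.
    return cov / sqrt(var_x * var_y)
```

Here `xs` would hold the attestations surplus per epoch and `ys` the average detection delay for offences committed in that epoch.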

Going into more detail around attester violations, we found all enacted slashings link back to double votes. Specifically, 90% were down to double votes with the same source and target epochs but different target slots.

Another 8% disagreed both in target slot and source epoch fields. Double vote slashings due to pure asynchrony in checkpoint values were far less frequent.
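The breakdown above amounts to checking which fields differ between the two conflicting votes in each proof. A minimal sketch, assuming each vote is a dictionary with hypothetical keys `source_epoch`, `target_epoch` and `slot`:

```python
FIELDS = ("source_epoch", "target_epoch", "slot")


def differing_fields(vote_a: dict, vote_b: dict) -> tuple:
    """Return the names of the fields on which two votes disagree."""
    return tuple(name for name in FIELDS if vote_a[name] != vote_b[name])


# Illustrative values only: same source and target epochs, different slots.
a = {"source_epoch": 2199, "target_epoch": 2200, "slot": 70_401}
b = {"source_epoch": 2199, "target_epoch": 2200, "slot": 70_402}
print(differing_fields(a, b))  # ('slot',)
```

Counting the returned tuples across all proofs gives a tally by type. An empty tuple would flag a pair that differs in none of these fields, which is worth a closer look before treating it as an offence.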

Table 1: type of slashing observed, described by S, T, and Target Slot

Here, 6.23% of reported slashings do not actually appear to be offences — probably because of a bug in the beaconcha.in API. Moreover, no observations satisfied the surround-vote condition. This doesn’t mean no surround-vote violations took place, but discovering them is far more complex than for simple double votes. This is being actively researched by the technical Ethereum community.

When investigating what type of clients included slashing proofs, we found that more were running Lighthouse software than the average distribution among validators in Medalla. There are two possible reasons why.

First, a larger proportion of validators running Lighthouse went down during roughtime.

Figure 6: distribution of proposers that included the slashing proofs, by client type

Looking at proofs submitted after epoch 4,000, when the network returned to normal, Lighthouse’s share appears even greater. The sample size is small but this could indicate the second reason for Lighthouse’s overperformance: that slashing software in the Lighthouse client might outperform the competition.

Proposer slashings

Proposer slashings are more straightforward than attester slashings and center around double proposals of blocks. In Medalla, these are rare (47 in total), with most concentrated around epoch 13,700.

Figure 7: distribution of proposer slashing proofs included on-chain, over time

This may be because of a bug or a client software upgrade that included a better “slasher” module. The former scenario is more likely as there was no obvious pattern in the whistleblower/proposer client type.

The vast majority of slashed proposers were running the Nimbus client (about 60%). When we examined the high-density proposer slashing epoch range, that grew to over 85%. Given that Nimbus suffered syncing issues as stakers’ attention turned from Medalla to Spadina/Zinken around that time, it’s highly likely this caused the slashings.

Figure 8: distribution of slashed proposers by CLIENT_IDENTIFIER

The detection delay in proposer slashings was one slot for about 85% of cases, a stark contrast to the average 750-epoch delay with attester slashings. This suggests that detecting attester slashings is much more complex than detecting proposer slashings.

Figure 9: Distribution of detection delay in proposer slashings in Medalla

Conclusion

Attributing slashings in Medalla to their probable root causes makes it apparent that client syncing issues (specifically with Prysm during roughtime and Nimbus around epoch 13,700) have been the most common reason.

We didn’t survey the attestations log for slashable offences not enacted, but it appears that poor aggregation performance directly affects how quickly slashable offences are detected.

Given our thinking about how the attestations surplus can potentially degrade whistleblowers’ performance, we can expect that correlating this surplus with unpunished slashable offences will reveal interesting insights. But that’s for another time.