Who’s who in eth2: Luke Youngblood from Coinbase
January 26, 2022
In the fifth edition of our Who’s who in eth2 series, Elias Simos interviews Luke on his expertise in staking infrastructure, the challenges facing eth2 validators, and more.
BY ELIAS SIMOS · OCT 8 2021
Welcome to Who’s who in eth2, presented by Elias Simos, Protocol Specialist at Coinbase Cloud. In this series, Elias interviews key contributors to the development and growth of Ethereum and eth2, exploring their involvement in eth2, visions for the future, and perspectives on what eth2 means for the world from people deeply embedded in its ecosystem.
The series includes interviews with notable builders, researchers, infrastructure experts, and leaders from the eth2 ecosystem, along with principal members of eth2’s four client teams—Prysmatic Labs, Sigma Prime, Nimbus, and Teku.
In this fifth post, Elias interviews Luke Youngblood, Senior Staff Software Engineer at Coinbase Cloud, on his expertise in staking infrastructure, the challenges facing eth2 validators, the balance between simplicity and complexity in infrastructure, and how Luke thinks about key considerations for Coinbase’s involvement in eth2.
Why don’t we start with a little bit of background on yourself and what you’re working on at Coinbase?
I've worked in tech for the last couple of decades. I started my career working as a systems engineer and a network engineer. In the mid‐2000s, I built a very large private cloud for McKesson, which is a large healthcare company spread across the US, Canada, and Europe. It was a really great experience because I got a chance to build automation that thousands of developers use to deploy their applications every day.
Then in 2016 I got a job at Amazon Web Services. I wanted to work for the biggest cloud. At Amazon Web Services, I helped some of our large customers migrate applications to the cloud, and really focused on distributed systems.
Separately, I've always been into crypto as a hobby. I started mining Bitcoin back in 2010, in the very early days, and then started fiddling with Ethereum back in 2016‐2017. While I was working at Amazon my brother and I started a small side‐business helping build proof of stake infrastructure for the Tezos Foundation. At the time, back in early 2018, I knew nothing about proof of stake. It was this very new technology. So, early as we were, we ended up being (I believe) the first to build technology that didn't exist at the time—things like remote signing systems.
Fast forward a little bit in the future to late 2018‐early 2019, Coinbase wanted to get into staking rewards. So they were looking for experts in this field. They found me through the Tezos Foundation and acqui‐hired me to come help build out staking rewards at Coinbase. For the last couple of years I've been working at Coinbase on adding staking rewards products. We started with Tezos, Cosmos, Algorand, and, more recently, eth2, which is the biggest staking rewards product we've launched so far.
As you were describing your story arc I couldn’t help but wonder, what are some of the most transferable learnings that you took from the AWS days and brought with you to a world of open, permissionless, distributed systems–like proof of stake blockchain networks?
It's a really interesting question because many of the concepts that we learned at AWS, and the core principles of cloud computing, apply very well to proof of stake blockchain networks, and decentralized networks in general.
So, for example, one of the largest benefits to using cloud services is that you can go global in minutes. The idea is that I can deploy infrastructure to any region in the world. You know — Asia, Europe, United States, Canada, South America, even Africa — I can deploy infrastructure to any continent, any region, within minutes, without having to build data centers, without having to buy hardware.
Applying this concept to decentralized networks means we don't have to limit ourselves to a single data center. If we want to create decentralized networks that are very robust, we can actually provision validators across many different continents, and the networks we provision can be more robust because of it. And so applying those concepts was kind of a revelation for me because I realized that we didn't have to worry about 20% of the hash power of the network going offline if a single data center in China goes off the internet.
Luke at an Ethereum developers meetup along with Protocol Specialist Viktor Bunin
Is that also helping make these networks more secure?
Yes, absolutely. I think in all of these decentralized networks the ideal state is that there are thousands of validators running across a wide variety of setups, including home setups, professional setups, and data centers. It would be perfect if there were, let's say, a hundred thousand Ethereum validators and each one was running at a home or on a different internet connection; that would be ideal.
But the reality is that there will be large concentrations of staking power in various locations. Those concentrations of hash power also exist in proof of work networks, including Bitcoin, mostly because the hardware required for mining is very specialized. With proof of stake, of course, we can use generalized hardware. You don't have to buy ASICs and we don't need as much electricity, so we don't have to be right next to hydroelectric power or some other cheap power source; we can provision those systems pretty much anywhere. So they can be much, much more decentralized if we do it properly.
I do think though, just as there are benefits to extreme decentralization (e.g. everyone running validators at home), there are also benefits to decentralization across a variety of data centers and cloud providers. You can imagine that network outcomes would end up being a lot more "random" with a network that is run by hobbyists in its majority — at least with the state of technology as it exists today.
Besides, 32 ETH has ended up being worth quite a lot.
Interesting! There is a side-narrative thread in the eth2 world, that eth2 enables anybody to run a validator, and I tend to think of that as a bit of a misnomer. The monetary hurdle is significant and the technical hurdle not insignificant. I would love to hear your thoughts around that.
Well, it's a great challenge to solve. There are a couple of challenges there, right? The first is that the tooling to run a validator may not be easy to use. You can run validators on Windows, Linux, and macOS; Prysm’s clients run on all three. However, it does require command line skills, and not everyone will have the skills to easily spin up a validator. But the tooling is constantly improving. And I do expect in a year or so that there will be easy point-and-click tooling where anyone can provision a validator without having a lot of command line Linux knowledge.
Another aspect is just the cost. I think with staking pools this can actually be reduced in some ways. For example, there are staking pools like Rocket Pool, where you might only need 16 ETH to start a validator and you can receive ETH from others that just want to stake with the staking pool.
Also, a really interesting project that is being worked on — I believe Consensys is working on this right now—is called Secret Shared Validators (SSV). The idea is that we could use threshold signing schemes to actually spread keys across many thousands of validators, so that an individual validator doesn't have to worry about losing their private key material due to a hardware failure, or something like that.
If this proves itself to be a viable strategy in SSV we could probably apply the same concepts to decentralizing validators such that it doesn't require 32 ETH; you could stake much smaller amounts, which would be interesting.
You touched on something really interesting there, which is the effort of staking pools to effectively enable delegation in the eth2 world, which is not natively baked into the protocol.
Given your experience with other proof of stake networks that do enable delegation natively, would you say this is more of a feature or a bug for eth2?
Yeah, it's definitely created some challenges for us. eth2 is pure proof of stake, not delegated. I think the design goals have a good intent behind them. The design goal of not having a single validator stake more than 32 ETH is intended to decentralize the network more, so we have thousands of validators, potentially tens or hundreds of thousands of validators, instead of just perhaps a hundred, like you see in some of the Tendermint‐based proof of stake networks. So it's a good design goal, but the reality is that there will be large staking pools regardless, whether they are exchanges or whether they are decentralized pools, those large staking pools will need to concentrate staking power. So they will just have to adapt and run thousands of validators each.
That's what we've had to do at Coinbase to launch staking rewards; we're running thousands of validators. Now, the other challenge that pure proof of stake brings is regarding funds movement. From a security and risk standpoint, it's actually very easy to do delegated proof of stake because we can delegate funds that are in a cold wallet so they're completely offline and the private keys are not stored online. It's really nice from a security standpoint when we're able to do that. With pure proof of stake, like eth2, of course we have to move those funds into the deposit contract so they're now no longer cold.
I think these are always tough considerations when you're designing a proof of stake network, but, I think in general, if you can minimize funds movement that's always a benefit. And, in general, if you can support delegation, that's also a really nice characteristic to have.
But on the flipside, I also do think that having hundreds of thousands of validators is a great benefit to have. However, some of the same benefits could potentially be achieved in delegated networks if you have the right economic incentives.
While true on the validator index level, when you aggregate up to deposit addresses you come down from 200k keys to about 7k unique addresses. If you further collapse that to the operator level, you would probably find that the majority of the network is run by maybe fewer than a hundred parties… so, not so dissimilar to Tendermint-based networks.
Given that dynamic, do you think that eth2 achieves meaningful differentiation?
I think you're bringing up a really good point in that we might have less decentralization than we appear to have on the surface. In delegated proof of stake networks this is more visible because the delegations are public and on the blockchain. Whereas the only way we can really determine how centralized or decentralized pure proof of stake is, is by trying to correlate things like deposit addresses.
So it's actually true that the network tends to centralize. However, I think for large entities that are staking, their goal is also to decentralize as much as possible and I'll explain the reasons for this.
For one, if I'm a large staker on the Ethereum network I probably don't want to have a third of the network by stake power. Because if I have a third of the network and there's some vulnerability in my setup, I could potentially have correlated slashing of 100% of my stake. So because the risk increases the closer you get to one third of the network, that dramatically reduces my incentive to want to have that much staking power on a single provider.
Likewise, we want to help promote the health of the network and the stability of the network, because of all the applications that exist on top of it. If a third of the network’s voting power goes offline due to maybe even a data center or an internet outage, finality can halt and the network can stop finalizing blocks which is also bad. So there's a variety of disincentives to discourage concentrating too much.
As large providers like Coinbase, we have to think about how we decentralize across as many different infrastructure providers and as many different hardware and software configurations as possible. Because the ideal state is that if there is a vulnerability or a software defect in one of our stacks it only affects a very small percentage of our total validator population and it doesn't affect our entire amount at stake.
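To illustrate why the risk grows as an operator approaches one third of the network, here is a minimal Python sketch of a correlation-style slashing penalty. It assumes the penalty is simply the share of total stake slashed together, scaled by a multiplier and capped at 100%. The multiplier of 3 is an assumption chosen to match the "one third means 100%" statement above, and the function name is hypothetical. The actual protocol calculation also includes an initial penalty and integer rounding, which are left out here.

```python
from fractions import Fraction

# Assumed multiplier: a value of 3 reproduces the "one third slashed together
# means 100% lost" behavior described in the interview. Illustrative only.
ASSUMED_MULTIPLIER = 3


def correlation_penalty_fraction(slashed_share, multiplier=ASSUMED_MULTIPLIER):
    """Fraction of a slashed validator's balance lost, given the share of
    total network stake that was slashed in the same window."""
    share = Fraction(slashed_share)
    if not 0 <= share <= 1:
        raise ValueError("slashed_share must be between 0 and 1")
    # The penalty scales linearly with the slashed share and is capped at 100%.
    return min(Fraction(1), share * multiplier)


for share in (Fraction(1, 1000), Fraction(1, 10), Fraction(1, 3)):
    lost = correlation_penalty_fraction(share)
    print(f"{share} of stake slashed together -> {lost} of balance lost")
```

Under these assumptions, an isolated failure affecting a tiny share of stake costs very little, while a failure that takes down an operator holding a third of the network costs everything, which is the incentive to spread validators across independent setups.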
Luke speaking at a staking conference along with Coinbase Cloud Engineering Director Aaron Henshaw
I imagine that translates well into diversifying your net exposure to the network via multiple clients, geo-locations of the data centers that these validators are in, etc. Talk to me a little bit more about how you think about those parameters.
This is where I think Coinbase Cloud’s architecture is really brilliant. When I'm provisioning validators on the Coinbase Cloud infrastructure I can choose from a variety of cloud regions and multiple cloud providers. When I launch what's called a cluster, that cluster is actually spread across two different cloud providers by default with only half of the validator clients on each. So we leverage this capability of Coinbase Cloud as much as possible, and we actually spread our validators across as many cloud regions and cloud providers as possible.
By simply provisioning validators across Amazon in Ireland, Google in Frankfurt, Amazon in Tokyo, Google in Hong Kong, and Amazon in Singapore, we can actually achieve really good decentralization across the entire infrastructure stack. Each one of those regions also has several data centers, which they call availability zones. So if there was a bad internet weather day that impacts one of those data centers, it only affects a very small percentage of our total validator population.
Likewise, on the client side, we support all three of the major client implementations — Prysm, Lighthouse, and Teku. This allows us to diversify our footprint on the client side as much as possible, as each client has strengths and weaknesses.
Last, we are leveraging more staking infrastructure providers than just Coinbase Cloud. Some of our providers run in the cloud and some run on bare metal, and this diversity gives us greater resilience and decentralization.
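The spreading strategy described above can be sketched as a simple round-robin placement. This is only an illustration, not Coinbase Cloud's actual scheduler: the provider and region labels are taken from the examples in the answer, and the function names and data layout are hypothetical.

```python
from collections import Counter

# Hypothetical placement targets, based on the provider/region examples above.
TARGETS = [
    ("aws", "ireland"),
    ("gcp", "frankfurt"),
    ("aws", "tokyo"),
    ("gcp", "hong-kong"),
    ("aws", "singapore"),
]
CLIENTS = ["prysm", "lighthouse", "teku"]


def place_validators(count, targets=TARGETS, clients=CLIENTS):
    """Assign each validator a (provider, region, client) tuple round-robin."""
    placements = []
    for i in range(count):
        provider, region = targets[i % len(targets)]
        # Advance the client once per full pass over the targets, so every
        # region ends up running a mix of clients.
        client = clients[(i // len(targets)) % len(clients)]
        placements.append((provider, region, client))
    return placements


def max_exposure(placements, field):
    """Largest share of validators that depend on a single value of `field`
    (0 = provider, 1 = region, 2 = client)."""
    counts = Counter(p[field] for p in placements)
    return max(counts.values()) / len(placements)


placements = place_validators(15)
print("worst-case region exposure:", max_exposure(placements, 1))
print("worst-case client exposure:", max_exposure(placements, 2))
```

The `max_exposure` figure is the share of validators that would be affected if one region or one client implementation failed, which is the quantity the diversification is trying to keep small.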
Kind of tangential to the stream we’re on now, I’ll say with great power comes great responsibility. I’d love to hear your thoughts, both as to the degree of responsibility an organization like Coinbase should assume in a network like eth2, but also as to what your perception of the network incentive design is.
You bring up a good point. Based on the way that the network economics are configured, it is advantageous for us to spread across as many providers as possible. So the incentives are aligned for us to decentralize as much as possible so that we don't have single points of failure—and if we do, to minimize their potential impact.
Where I do get somewhat concerned, is that I speculate that not every staking provider operates in the same way that we do. When you're operating in the cloud you have the capability of provisioning all of your validators in a single region, or you have the capability of provisioning them all in a dozen regions, and it's basically no additional cost to do so. So our incentives are aligned to decentralize as much as possible.
On the other hand, if you're a validator service operating in a physical location, you might not have the capability of easily turning on another data center or easily putting half of your validators in Asia and the other half in Europe. In that sense, not all of the actors in the ecosystem will necessarily have the same easy path to decentralization.
I want to take a step back before we dive into more eth2 specifics. I’m curious to understand more about your personal motivations. Why crypto? Why staking?
Well, first of all, when I first discovered Bitcoin and I started mining back in 2010 on my gaming PC, it was a fascinating hobby to me. I never really thought it would be worth a lot of money. I just thought it was really cool that I could generate money on my computer when I wasn't working. And I thought about how quickly I could send it across borders. I remember sending money to an exchange, and I thought it was fascinating that I could deposit funds and have them almost immediately available for trading. So crypto in general just became this rabbit hole that I went down.
Then when Ethereum launched and all of a sudden we had the world computer and smart contracts, and I realized that we could automate the value transfer between companies or between individuals, and we could codify these things in smart contracts, it was just mind blowing to me to think about the implications that would have for all financial transactions. I was hooked!
When I first got into proof of stake the thing that really appealed to me about it was the energy efficiency. And when I dug in a bit more, and I realized that instead of using raw compute power and electricity to secure the network, you use operational expertise, security skills, and ultimately value at risk, my mind was blown. The votes that protect the ledger are essentially really inexpensive if you have the private key material.
That meant that engineers like myself who happened to have expertise in these areas of infrastructure and security could build a career for ourselves. I never thought my career would end up in crypto, but it's been a really fun journey to kind of end up here.
Let’s unpack how the journey unfolded thereon. You mentioned earlier that you first got involved with Tezos and a few other networks early on. How do you feel about walking us through that arc of your story?
Sure. Indeed, I first got involved with Tezos — that dates back to early 2018. It was about three months before the betanet launch of Tezos. They had a very successful crowdsale two years before in 2016, and they had promised the community the network would launch in Q2 of 2018. The community, as you can imagine, was really anticipating the launch, and they only had about three months to go. At the time they had zero infrastructure.
Really what we had to build out was the global footprint for the Tezos Foundation bakers. Bakers are basically the same thing as validators on other proof of stake networks. These Tezos Foundation validators, or bakers, would be the only validators running on the network for the first seven cycles, or about a three-week period, while other validators would delegate, or receive delegations, and come online in cycle seven.
It was really interesting because in just about three months we had to build the remote signing infrastructure, including hardware security modules (HSMs) running across four different cloud regions, and all the bakers and boot node infrastructure that was necessary for others to connect their nodes to the network. Of course, we did this all in cloud infrastructure because that was the only way we could actually do it in a three month timeframe with the security requirements we had.
So it was a really exciting time to be a part of the proof of stake movement. We were building new technology — and we were also working crazy hours, while still working our day jobs. At some point I remember I took a couple of weeks off work and flew to Paris and spent time with the Nomadic Labs developer team there in Paris ahead of the launch.
Then the funny thing was, at the time I remember thinking that we were going to build this infrastructure for them and then hand it over to some team (I didn't know who) that was going to operate the validators going forward. But what ended up happening is that the Tezos Foundation is a charitable organization, and they don't necessarily have a team of DevOps and infrastructure people. So they asked, "Can you operate it for us?" So I ended up operating the Tezos Foundation bakers, and I still operate them to this day. Coinbase has generously allowed me to continue operating them while I work at Coinbase. It gave me a great experience and great background on how to operate proof of stake networks.
Going forward, later in 2018, I participated in Cosmos’ Game of Stakes. Cosmos had a really innovative approach to incentivizing validator participation. They had an incentivized testnet called Game of Stakes that took place in late 2018, and anyone could participate with testnet funds. There were prizes given to validators that remained online the longest (the uptime leaders) and prizes given to validators that never got jailed. People were encouraged to attack each other to try to find security vulnerabilities in the network. It was really great because you got firsthand experience running Cosmos and operating in an adversarial testnet. Then you received Cosmos ATOMs as a prize for participating.
There've been so many proof of stake networks after Cosmos that have copied this incentivized testnet model, because it was so successful. It accomplishes two things. First of all, many of these decentralized networks that are launching today are funded mostly by venture capital. So they have just a few large shareholders. By running an incentivized testnet and giving delegations to dozens of validators, you can very quickly decentralize your token supply and decentralize your network by delegating to validators that have proven competent at running incentivized testnets. So it allows the validator community to become more skilled at running these networks, and then allows individual validators to demonstrate proficiency in these networks.
Using all these networks that you’ve participated in and studied as the background, what are some things that you think eth2 could draw inspiration from?
I think in general delegation is just very nice from a not‐having‐to‐move‐funds perspective. I can keep my funds cold, I can delegate to a validator, and I don't have to worry about my funds that are in cold storage being lost. That's very appealing.
Another thing that would be really nice to see is perhaps allowing validators’ effective balance to grow over time. On Ethereum 2, validators cannot go above a 32 ETH effective balance, which means that in order for validators to compound their rewards they're effectively going to have to exit at some point and then create more validators.
All of these exiting and depositing transactions are going to cause a lot of churn on the network. So perhaps it would be nice to come up with an economic model that allows validators to continue to compound the rewards without having to exit validating. That would be one nice improvement, but I'm sure there are a lot of economic implications that I haven't thought through completely.
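The effect of the 32 ETH cap on compounding can be shown with a small Python sketch. It assumes rewards accrue once per year in proportion to effective balance; the 5% rate, the function names, and the yearly step are illustrative assumptions, and the real protocol tracks effective balance in finer increments with hysteresis.

```python
MAX_EFFECTIVE_BALANCE_ETH = 32.0  # cap described in the interview


def effective_balance(balance_eth):
    """Balance that actually counts toward rewards under the cap."""
    return min(balance_eth, MAX_EFFECTIVE_BALANCE_ETH)


def project_balance(start_eth, annual_rate, years, capped=True):
    """Project a validator balance, with rewards earned on either the capped
    effective balance or the full balance (hypothetical uncapped model)."""
    balance = start_eth
    for _ in range(years):
        base = effective_balance(balance) if capped else balance
        balance += base * annual_rate
    return balance


# Hypothetical 5% annual reward rate over ten years.
print("capped:  ", round(project_balance(32.0, 0.05, 10, capped=True), 2))
print("uncapped:", round(project_balance(32.0, 0.05, 10, capped=False), 2))
```

With the cap, rewards grow linearly because the excess above 32 ETH earns nothing, so compounding requires exiting and depositing into new validators, which is the churn described above.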
Switching gears a bit and thinking about Coinbase’s involvement in eth2, what were some of the key considerations that you made starting out?
The unique challenge with eth2 is just the number of validators we have to manage. With all the other networks that we've launched staking rewards on in the past, we've only had to operate a single validator and then delegate all funds to that validator. With eth2, of course, we have to be able to launch many thousands of validators, and we also need to orchestrate deposits into the Ethereum mainnet deposit contract.
As I thought through the problem space, what we settled on was the design for an orchestration tool that would allow us to securely create validators. We're using Bison Trails’ API for that. So kudos to the Bison Trails team for designing a really easy to use and user‐friendly API that we can integrate with. Essentially, we want to be able to use automation and allocate new validators on-demand because we realized we couldn't manually stand up a validator every time we need to deposit 32 ETH.
So it all starts with orchestration allocating the validators, and then once we've allocated the validators, we also need to allocate cold storage addresses — the withdrawal address. For example, once you're done staking and you want to exit the system, those funds have to go back somewhere.
Then we also need to orchestrate a deposit, and the deposit itself is a smart contract interaction on the Ethereum mainnet. So we built this tool that helps us to orchestrate the life cycle of thousands of validators, along with their deposits and withdrawal addresses. Then you need to monitor all the activity, so we built a monitoring system to have an overview of whether our validators are attesting or not.
Then we also use the same data from our monitoring system that checks validator balances to determine how much in rewards to pay our customers. We know exactly how much ETH we've earned every day through attestations and block proposals; then we're able to use that to calculate what APR we should pay our customers minus fees.
That was an interesting way to monitor the health, as well as to monitor the financial health, of our staking rewards product through monitoring validator balances.
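A minimal sketch of the balance-based reward calculation described above, in Python. It assumes two daily snapshots of validator balances in Gwei, a simple non-compounding annualization, and a hypothetical 25% fee; none of these specifics come from the interview, and deposits or exits between snapshots would need separate handling.

```python
def daily_rewards_gwei(prev_balances, curr_balances):
    """Sum of balance changes for validators present in both snapshots."""
    return sum(
        curr_balances[v] - prev_balances[v]
        for v in curr_balances
        if v in prev_balances
    )


def customer_apr(reward_gwei, staked_gwei, fee_rate):
    """Annualize one day's rewards (simple, non-compounding) and deduct fees."""
    gross_daily_rate = reward_gwei / staked_gwei
    return gross_daily_rate * 365 * (1 - fee_rate)


# Hypothetical snapshots for two validators, in Gwei.
yesterday = {"validator-1": 32_000_000_000, "validator-2": 32_000_000_000}
today = {"validator-1": 32_000_004_000, "validator-2": 32_000_003_000}

earned = daily_rewards_gwei(yesterday, today)
staked = sum(yesterday.values())
print("earned today (Gwei):", earned)
print("customer APR:", customer_apr(earned, staked, fee_rate=0.25))
```

Because the same snapshots drive both the health view and the payout figure, any discrepancy between what was earned and what was paid shows up in one data set.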
Luke speaking on a staking conference panel
I’m curious to hear why you’re looking at balances alone. Why are you not looking at, say, the inclusion delays, source/target checkpoint accuracy, etc.? Why only track the outputs and not the inputs as well?
Great question! We could monitor for just attestation inclusion distance, and all of those things individually, but what we really like to look for are simple metrics that can give us a strong signal of healthy or unhealthy, up or down — and the balance actually turned out to be one of the best signals for that.
But it’s also because we need to capture the data for financial reporting systems. We have to be able to audit every last penny of ETH and prove to our accountants that we've paid customers exactly the amount of rewards we've earned. It actually allows us to kill two birds with one stone and use that same monitoring data for financial reporting. We found balance changes were a really good proxy for validator health.
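As a rough illustration of using balance changes as a single health signal, the Python sketch below flags any validator whose balance did not grow between two snapshots. The threshold, the snapshot format, and the function name are assumptions made for the example.

```python
def unhealthy_validators(prev_balances, curr_balances, min_gain_gwei=1):
    """Return validators whose balance grew by less than `min_gain_gwei`
    between two snapshots. Validators missing from the earlier snapshot
    are skipped, since there is nothing to compare against."""
    return sorted(
        v
        for v, balance in curr_balances.items()
        if v in prev_balances and balance - prev_balances[v] < min_gain_gwei
    )


# Hypothetical snapshots: "b" earned nothing and "c" lost balance.
before = {"a": 100, "b": 100, "c": 100}
after = {"a": 105, "b": 100, "c": 97, "d": 50}
print(unhealthy_validators(before, after))
```

A flat or falling balance does not say why a validator is unhealthy, but it is a cheap up-or-down signal that can trigger a closer look at finer-grained metrics.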
Is there a point about simplicity that has more of a staying power for you? What’s the thread between keeping things simple and easy to understand and react to, versus doing really complex infrastructure buildups at the cutting edge of distributed systems?
That's exactly it. Simplicity is something we really value quite a bit. When deconstructing the problem of staking we initially thought to take a systems approach of just monitoring the health of all the servers. Then what we realized really quickly was that the server can be completely healthy and we still might be offline for some unknown reason. So the approach we decided to take was to actually index every transaction in the blockchain that we care about — in the case of staking, it's just staking transactions, typically votes and block proposals — and use that index data as our source of truth for monitoring.
That's the approach we initially developed for Tezos when I was first building out the staking infrastructure for the Tezos Foundation. We just index the blockchain itself and we use a separate node from the nodes our validators use. There have been a few incidents where we thought our validator was online, everything looked great, and the logs showed we were proposing blocks and voting, but our validator had actually lost consensus with the rest of the network. So it actually wasn't voting, and we detect those problems much more easily when we look at the blockchain data itself.
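The idea of treating indexed chain data as the source of truth can be sketched as a comparison between what a validator's own logs claim and what an independent node actually saw on chain. This is a simplified Python illustration; the `Vote` record and both input lists are hypothetical stand-ins for real log parsing and chain indexing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    validator: str
    slot: int


def find_silent_failures(logged_votes, indexed_votes):
    """Votes the validator's logs claim were sent but that never appeared in
    the index built from a separate, independent node."""
    missing = set(logged_votes) - set(indexed_votes)
    return sorted(missing, key=lambda v: (v.slot, v.validator))


# Hypothetical data: the validator logged three votes, but only two landed.
logged = [Vote("val-1", 100), Vote("val-1", 101), Vote("val-1", 102)]
indexed = [Vote("val-1", 100), Vote("val-1", 101)]

for vote in find_silent_failures(logged, indexed):
    print(f"{vote.validator} logged a vote at slot {vote.slot} that is not on chain")
```

Server-level health checks would pass in this scenario, since the process is running and logging normally; only the comparison against chain data reveals that the vote at the last slot never landed.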
As far as eth2 goes, what are you most excited about for the next one or two years? And what do you think is one of the biggest challenges that eth2 faces from your perspective, or more generally?
I'm most excited about the upcoming merge. I think it's going to be fantastic if we're able to pull off the quick merge by early next year. Immediately, it's going to do a couple of things that are really powerful. First of all, it's going to dramatically reduce the cost of gas and fees on the Ethereum network, which has been a pain point for many of us that use DeFi today. It’s going to also increase the revenue for validators on the eth2 beacon chain, because they'll be able to earn fees from including Ethereum mainnet transactions in blocks.
From a staking perspective, it's going to increase the rewards that people can earn. And for just anyone using the Ethereum network, it should actually improve the performance and reduce fees for them. I think that'll be a huge benefit in the near term, hopefully in the next six to twelve months.
Then going forward I think there are some challenges, obviously. One of the more immediate concerns is that there are a number of EVM‐compatible L1s that are popping up, which don't have some of the same decentralization guarantees and censorship resistances that Ethereum has. If we don't pull off the quick merge in time, and fees continue to rise, and the gas price continues to rise on the Ethereum mainnet, there’s a real risk that a lot of applications could migrate to these other L1s that don't have the strong security and decentralization guarantees that Ethereum has.
Final question. We’re having this conversation in two or three years’ time. What does the world of eth2 look like from where you stand three years in the future?
Amazing question. Think along the following lines: You’ve just downloaded an app on your phone and now you're able to use decentralized lending protocols on Ethereum. You're able to take out a home mortgage from Compound or Aave in minutes, and as a user you're able to do all these amazing things with the DeFi ecosystem that you can't do today. In that future, all finance is on-chain such that you won’t really even need a bank.
From a network perspective, we have a thousand shards operating and there are tens of thousands of transactions per second, and we've exceeded the Visa Network in terms of transactions per second throughput. And from a finance perspective, crypto probably now has a $10 trillion plus market cap, and decentralized finance protocols are larger than banks in the United States.
Amen. This is a future I’m very excited about and very much on board for. Thank you so much Luke.
Really fun chatting with you Elias, it's been great.
Interview by Elias Simos