Tuesday, September 4, 2018

Chia Network

Back in March I wrote Proofs of Space, analyzing Bram Cohen's fascinating EE380 talk. I've now learned more about Chia Network, the company that is implementing a network using his methods. Below the fold I look into their prospects.

Chia Network's blockchain

First, it is important to observe that, although Proofs of Space and Time are important for distributed storage networks such as FileCoin's, this is not what Chia Network is using Cohen's Proofs of Space and Proofs of Time (Verifiable Delay Functions VDF) for. Instead, they are using them as a replacement for Proof of Work in blockchains such as Bitcoin's. Here is the brief explanation of how they would be used I wrote in Proofs of Space:
As I understand it, the proof of space technique in essence works by having the prover fill storage space with an array of pseudo-random points in [0,1] (a plot) via a time-consuming process. The verifier can then pose to the prover a question that can be answered either by a single storage access (fast) or by repeating the process of filling the storage (slow). By observing the time the prover takes the verifier can distinguish these two cases, and thus be assured that the prover has stored the (otherwise useless) data.

As I understand it, verifiable delay functions work by forcing the prover to perform a specified number of iterations to generate a value that the verifier can quickly show is valid.
  • To use in a blockchain, each block is a proof of space followed by a proof of time which finalizes it.
  • To find a proof of space, take the hash of the last proof of time, put it on a point in [0,1], find the closest proof of space you can to that.
  • To find the number of iterations of the proof of time, multiply the difference between those two positions by the current work difficulty factor and round up to the next integer.
  • The result of this is that the best proof of space will finish first, with the distribution of arrival times of finalizations the same as happens in a proof of work system if resources are fixed over time.
  • The only discretion left on the part of farmers is whether to withhold their winning proofs of space.
In other words, the winning verification of a block will be the one from the peer whose plot contains the closest point, because that distance controls how long it will take for the peer to return its verification via the verifiable delay function. The more storage a peer devotes to its plot, and thus the shorter the average distance between points, the more likely it is to be the winner because its proof of space will suffer the shortest delay.
There are a number of additional points that need to be understood:
  • The process of filling storage with pseudo-random points must be slow enough that a peer cannot simulate extra storage by re-filling the limited storage it has with a new array of points. The process is presumably linear in the size of the storage to be filled, in the speed of the CPU, and in the write bandwidth of the storage medium. But since it is done only once, the slowness doesn't impair the Proof of Space time.
  • The delay imposed by the Proof of Time must be orders of magnitude longer than the read latency of the storage medium, so that slow storage is not penalized, and fast storage such as SSDs are not advantaged.
  • The process of filling storage has to be designed so that it is impossible to perform iterative refinement, in which a small amount of permanent storage is used to find the neighborhood of the target point, then a small amount of storage filled with just that neighborhood, and the process repeated to zero in on the target, hoping that the return in terms of reduced VDF time, exceeds the cost of populating the small neighborhood.
As I understand it, Chia Network's goals are that their blockchain would be more energy efficient, more ASIC-resistant, less centralized, and more secure than Proof of Work blockchains. Given the appalling energy waste of Proof of Work blockchains, even if Chia only managed the first of these it would be a really good thing.

Energy Efficiency

As regards energy efficiency, I believe a peer's duty cycle looks like this:
  • Compute 1 hash, consuming infinitesimal energy in the CPU.
  • Access 1 value from the disk by waking it from standby, doing a seek and a read, and dropping back into standby. This uses little energy.
  • Do a Proof of Time, which takes a long time with the CPU and RAM using energy, and the disk in standby using perhaps 1W.
So the overall energy consumption of PoSp/VDP depends entirely on the energy consumption of the Proof of Time, the Proof of Space is irrelevant. Two recent papers on VDF are Dan Boneh et al's Verifiable Delay Functions and A Survey of Two Verifiable Delay Functions. The math is beyond me, but I believe their VDFs all involve iterating on a computation in which the output of each iteration is the input for the next, to prevent parallelization. If this is the case, during the Proof of Time one of the CPU's cores will be running flat-out and not using any RAM. Thus I don't see that the energy demands of these VDFs are very different from the energy demands of hashing, as in Proof of Work. So if every peer ran a VDF the Chia network would use as much energy as Bitcoin.

But, as I understand it, this isn't what Chia is planning. Instead, the idea is that there would be a small number of VDF services, to which peers would submit their Proofs of Space. In this way the energy used by the network would be vastly smaller than Bitcoin's.

ASIC resistance

The math of VDFs is based on imposing a number of iterations, not on imposing a delay in wall clock time. There may be an advantage to executing the VDF faster than the next peer. As David Vorick argues:
At the end of the day, you will always be able to create custom hardware that can outperform general purpose hardware. I can’t stress enough that everyone I’ve talked to in favor of ASIC resistance has consistently and substantially underestimated the flexibility that hardware engineers have to design around specific problems, even under a constrained budget. For any algorithm, there will always be a path that custom hardware engineers can take to beat out general purpose hardware. It’s a fundamental limitation of general purpose hardware.
Thus one can expect that if the rewards are sufficient, ASICs will be built to speed up any time-consuming algorithm (I agree that Cohen's Proof of Space is unlikely to attract ASICs, since the time it takes is negligible).

I don't know what happens if a VDF service A with ASICs gets a point with distance d and, thanks to the ASICs, announces it significantly before VDF service B without ASIC assistance gets a point with distance d-ε and announces it. In the Bitcoin blockchain, all possible solutions to a block are equally valid. A later, different solution to the block is just later, and thus inferior. But in the Chia blockchain a later solution to the block could be unambiguously better (i.e. provably having performed fewer iterations) if it came from a slower peer. How does the network decide how long to wait for a potentially better solution? If the network doesn't wait long enough, and the rewards are enough, it will attract ASICs.

I believe that Chia hopes that the VDF services will all use ASICs, and thus be close in speed. But there won't be enough VDF servers to make a market for ASICs. Tim Swanson estimates that there are about 3.8M Antminer S9s mining on the Bitcoin blockchain. That's a market for ASICs, but a few 10s isn't. Perhaps FPGAs could be used instead.

Centralization

Proof of Work blockchains become centralized through the emergence of mining pools:
Bitcoin gives out a block reward, 25 BTC as of today, every 10 minutes on average. This means that a miner whose power is a small fraction of the total mining power is unlikely to win in a very long time. Since this can take years, miners congregate in pools. A pool has a manager, and whenever a participant in the pool finds proof of work, the reward goes to the manager, which distributes it among the participants according to their contribution to the pool.
But the emergence of pools isn't due to Proof of Work, it is a necessary feature of every successful blockchain. Consider a blockchain with a 10-minute block time and 100K equal miners. A miner will receive a reward on average once in 1.9 years. If the average Chia network miner's disk in such a network has a 5-year working life the probability that it will not receive any reward is 2.5%.

This is not a viable sales pitch to miners; they join pools to smooth out their income. Thus, all other things being equal, a successful blockchain would have no more than say 60 equal pools, generating on average a reward every 10 hours.

But all other things are not equal, and this is where Brian Arthur's Increasing Returns and Path Dependence in the Economy comes in. Randomly, one pool will be bigger than the others, and will generate better returns through economies of scale. The better returns will attract miners from other pools, so it will get bigger and generate even better returns, attracting more miners. This feedback loop will continue, as it did with Ghash.io.

Mining Pools 9/1/18
Bitcoin's mining pools are much larger than needed to provide reasonable income smoothing, though the need to retain the appearance of decentralization means that it currently takes 4 pools (two apparently operated by Bitmain) to exceed 51% of the mining power. Presumably economies of scale generating better return on investment account for the additional size. As regards Chia Network's centralization:
Just as with Bitcoin's Proof of Work, farming pools would arise in PoSp/VDF to smooth out income.
What would pools mining a successful Chia Network blockchain look like? Lets assume initially that it is uneconomic to develop ASICs to speed up the VDF. There would appear to be three possible kinds of participants in a pool:
  • Individuals using the spare space in their desktop PC's disk. The storage for the Proof of Space is effectively "free", but unless these miners joined pools, they would be unlikely to get a reward in the life of the disk.
  • Individuals buying systems with CPU, RAM and disk solely for mining. The disruption to the user's experience is gone, but now the whole cost of mining has to be covered by the rewards. To smooth out their income, these miners would join pools.
  • Investors in data-center scale mining pools. Economies of scale would mean that these participants would see better profits for less hassle than the individuals buying systems, so these investor pools would come to dominate the network, replicating the Bitcoin pool centralization.
Thus if Chia's network were to become successful, mining would be dominated by a few large pools. Each pool would run a VDF server to which the pool's participants would submit their Proofs of Space, so that the pool manager could verify their contribution to the pool.

The emergence of pools, and dominance of a small number of pools, has nothing to do with the particular consensus mechanism in use. Thus I am skeptical that alternatives to Proof of Work will significantly reduce centralization of mining in blockchains generally, and in Chia Network's blockchain specifically.

Security

Like Radia Perlman, Eric Budish in The Economic Limits Of Bitcoin And The Blockchain (commentary on it here) observes that:
From a computer security perspective, the key thing to note about (2) is that the security of the blockchain is linear in the amount of expenditure on mining power, ... In contrast, in many other contexts investments in computer security yield convex returns (e.g., traditional uses of cryptography) — analogously to how a lock on a door increases the security of a house by more than the cost of the lock.
Budish shows that the security of a blockchain against a 51% attack depends on the per-block mining reward being large relative to the maximum value of the transactions in a block. Reducing the expenditure on mining makes the blockchain less secure. As I wrote in Cryptocurrencies Have Limits:
Budish combines Equations 1 & 2 to get Equation 3:
Pblock > Vattack⁄α
This inequality expresses the honest equilibrium condition for deterring an outsider's 51% attack (to deter insiders, Pblock has to be twice as big):
the equilibrium per-block payment to miners for running the blockchain must be large relative to the one-off benefits of attacking it. Equation (3) places potentially serious economic constraints on the applicability of the Nakamoto (2008) blockchain innovation. By analogy, imagine if users of the Visa network had to pay fees to Visa, every ten minutes, that were large relative to the value of a successful one-off attack on the Visa network.
This is why, especially with the advent of "mining as a service", 51% attacks on alt-coins have become endemic.

There is nothing specific to Proof of Work in Budish's analysis; the same limit will apply to Chia Network's blockchain. But Cohen hopes to motivate the use of spare space in individual's desktops by keeping mining rewards small. His first slide claims:
As long as rewards are below depreciation value it will be unprofitable to buy storage just for farming
I believe that by depreciation value he means the depreciation of all the storage in the network. Robert Fontana and Gary Decad's 2016 numbers are vendor revenue of $0.039/GB for hard disk. Assuming 5-year straight-line depreciation (262,800 10-minute blocks), the block reward and thus the maximum value of the transactions in a block must be less than $0.15/PB. Kryder's law means this limit will decrease with time.

Note also that it is possible at low cost to rent very large amounts of storage and computation for short periods of time in order to mount a 51% attack on a PoSp/VDP network in the same way that "mining as a service" enables 51% attacks on alt-coins.

For example, a back-of-the-envelope computation of a hour-long Petabyte attack at Amazon would start by using about 7 days of AWS Free Tier to write the data to the sc1 version of Elastic Block Storage. The EBS would cost $35/PB/hr, so the setup would cost $2940. Then the actual hour-long attack would cost $35, for a total of just under $3000.

I'm skeptical that the low-reward approach to maintaining decentralization is viable.

Burstcoin

It turns out that for the past four years there has been a Proof-of-Space coin, BurstCoin. Its "market cap" spiked on launch and when Bitcoin spiked, but it is currently under 2.5K BTC with a 24-hr volume just over 12 BTC. There are currently 670 cryptocurrencies with greater 30-day volume than Burst. So it hasn't been a great success. As expected, mining is dominated by a few pools.

Despite the lack of success, BurstCoin claims to have about 284PB of storage devoted to mining. Lets take that at face value for now. Using the same numbers as above, that's $11M in capital. With 5-year straight-line depreciation and a 4-minute block time (657K blocks) the depreciation per block is $16.74 per block. The reward is currently 809 coins/block or at current "market price" $7.17, so they are below Bram Cohen's criterion for not being worth investing just for mining.

Despite that, last year people were claiming to be mining with 120-250TB, and earning ~700 coins/day. That clearly isn't happening today. There are 360 blocks/day so the daily mining reward is 291,240 coins, or $2.58K. 240TB would be 8.5E-4 of the claimed mining power, so should gain 247.6 coins or $2.18/day. 40 cheap 6TB drives at Newegg would be around $6K, so the 5-year depreciation would be $3.29/day for a loss of $1.11/day before considering power, etc. And as the rewards decreased the loss would increase.

Suppose I have 2TB spare space in my desktop PC. I'd have 7E-6 of the claimed mining power so, before pool expenses, I could expect to earn about 2c/day, decreasing. Why bother? Mining BurstCoin only makes sense if you have free access to large amounts of spare disk space for substantial periods. Despite the rhetoric, it isn't viable to aggregate the unused space in people's desktops.

Deployment

The history of BurstCoin might be seen as a poor omen for Chia. But BitTorrent has given Cohen a stellar reputation, and Chia has funding from Andreessen Horwitz, so it is likely to fare better. Nevertheless, Chia mining isn't likely to be the province of spare space on people's desktops:
  • There are fewer and fewer desktops, and the average size of their disks is decreasing as users prefer smaller, faster, more robust SSDs to the larger, slower, more fragile 2.5" drives that have mostly displaced the even larger 3.5" drives in desktops.
  • The only remaining market for large, 3.5" drives is data centers, and drive manufacturers are under considerable pressure to change the design of their drives to make them more efficient in this space. This will make them unsuitable for even those desktops that want large drives.
So, if the rewards are low enough to prevent investment in storage dedicated to mining, where is the vast supply of free hard disk space to come from? I can see two possible sources:
  • Manufacturers could add to their hard drives' firmware the code to fill them with a plot during testing and then mine during burn-in before shipment. This is analogous to what Butterfly Labs did with their mining hardware, just legal.
  • Cloud data centers need spare space to prepare for unexpected spikes in demand. Experience and "big data" techniques can reduce the amount, but enough uncertainty remains that their spare space is substantial.
So it is likely that the Chia blockchain would be dominated by a small number of large companies (2 disk manufacturers, maybe 10 clouds). Arguably, these companies would be more trustworthy and more decentralized than the current Bitcoin miners.

No comments: