Wednesday 12 June 2019

A short summary of solutions for UTXO scaling

Currently the UTXO set is about 3 or 4 GB stored on disk, and is bigger in memory so usually can't be stored entirely in memory and so relies on caching. At the moment, the UTXO set is bounded only by the blocksize limit - the UTXO set can grow at maximum nearly as fast as the blockchain does (if every transaction creates many outputs from a minimal number of inputs. The UTXO set shrank a bit recently (possibly in part due to incentives introduced for segwit), but in general its been growing around 50% per year (while memory cost effectiveness has been improving at only 15%/year). Without finding a solution to this, the UTXO set can be expected to grow to over 20 GB in 5 years. That's a major problem for running a full node.

Luckily people have been thinking about the problem for a while now and there are a number of ideas that could massively improve the situation. Here's a list of the major ones I know about:

UTXO Hash Set (UHS)

This method seems to be able to reduce the size of the representation of the UTXO set by almost half, at the cost of about 5% additional network traffic. That sounds great, but honestly pales in comparison to using Accumulators (full disclosure, I'm not actually 100% sure UHS is not also an accumulator).

Dynamic Accumulators

Dynamic accumulators are representations of set inclusion (ie accumulators) for which elements can be dynamically added or removed efficiently. Particularly interesting are universal accumulators, which can prove not only inclusion in the set, but also non-membership which can be super useful for things like SPV fraud proofs.

Merkle Accumulators

Merkle accumulators use a merkle tree to represent the UTXO set. Similar to UHS, this can lead to massive space savings at the cost of higher network traffic. Utreexo is one such proposal that can represent the entire UTXO set in a few kilobytes. The tradeoff is that transactions must include merkle proofs of inclusion which would require at least 25% more data being downloaded.

RSA Accumulators

RSA Accumulators are interesting because they can produce inclusion proofs of constant size (not dependent on the number of UTXOs). This is as opposed to Merkle based accumulators that scale logarithmicly with the number of UTXOs. They also have other interesting properties like universality, allowing non-membership proofs, and potential privacy improvements like delinking inputs from their corresponding outputs.

Eliptic Curve Accumulators

Eliptic curve cryptography is really interesting because keys and other cryptographic artifacts are much smaller than equivalents in RSA, since RSA relies on exponentiation while ECC basically relies on multiplication. I haven't found quite as much information on these as RSA accumulators, but it sounds like an area of active research.

More Papers

https://eprint.iacr.org/2005/123.pdf

Other thoughts

What's really interesting to me is the idea of stateless full nodes, where a full node doesn't really need to store any state at all and thus running one would require much lower computer resources. Theoretically, no node would have to store the entire data set (blockchain or UTXO set) and instead nodes could share data in a truly distributed way where each node could choose to store say 1/1000th of the total data and share that data on demand to nodes that needed it. As long as each piece of data can be verified from a tiny aggregation of it, no trustlessness would be sacrificed.

More Discussion

See more discussion about this on r/BitcoinDiscussion



Submitted June 12, 2019 at 02:10PM by fresheneesz http://bit.ly/2IABDoR

No comments :

Post a Comment