Is Monero’s (or All) Anonymity Broken?
Cryptonote anonymity algorithm cryptocurrencies such as Monero are designed to be honeypots.
I discovered that hypothetically every anonymity technology (except Zerocash*) has a pathological, irreparable design flaw outlined below, which renders the anonymity undependable. Worse yet, there might be no way for users to know their anonymity has been compromised because the perpetrator would want to maintain the deceit of the valuable honeypot; which by the way, is also the case with Tor†.
“Beware of Geeks bearing gifts”
* This blog discusses Zerocash (not zerocoin!), which is the original name of the technology employed in Zcash and its clones. This blog compares the technology of Zerocash to the Cryptonote/RingCT technologies in Monero and clones. This blog is not claiming that the Zcash token or any particular clone is superior to Monero as an investment. See the discussion in the comments for more details.
Divide and Conquer
||The flaw is that the set of transactions mixed together that provides the anonymity set, is never all of the unspent transactions (UTXO) on the blockchain.|
Thus, the miners can Sybil attack by creating transactions that spend to self, filling up the anonymity sets with as many of their own (i.e. “white sheep”) transactions as is necessary to deanonymize each honest (i.e. “black sheep”) payer.
These filler transactions cost the miner nothing because the perpetrating miner surreptitiously pays the outputs and the transaction fees to self.
It is unknowable which transactions were created by the miner, because there is no linkage (not even in proof-of-stake) between UTXO and the identify of the entity (“miner”) that appends a block of transactions to the blockchain. Thus the undetectable perpetrating miner can even recoup the transaction fees of sending transactions to blocks created by non-complicit miners, by including offsetting non-complicit transactions in the perpetrating miner’s blocks. Thus, users can’t defeat this vulnerability by limiting their mixes to transactions of blocks from miners they trust, because they are unable to distinguish the complicit transactions from the non-complicit ones—i.e. the complicity (identity) of the transactions can be Sybil attacked because that is the definition of anonymity.
Thus the perpetrator will own
X% of the transactions in every anonymity set, where
X is the perpetrator’s percentage of the network hashrate. And I will explain further how this vulnerability combined with two other vulnerabilities can enable the perpetrator to deanonymize (i.e. unmask the anonymity of) the users.
When I had discussed this vulnerability with @smooth earlier this year, he claimed complicit miners would deprive themselves of income if the block size is limited and blocks are full, because perpetrating miners would displace transaction fee income with their own transactions. At the time, I responded that a perpetrator with
51% of the hashrate could make sure only his blocks were allowed on the blockchain. But I henceforth realized that both of our superficial analyses were incorrect. I was incorrect because orphaning of
49% of the blocks would be an unambiguously detectable attack due to the long-term sustained sky high orphan rate (which normally is roughly
< 1%). And @smooth was incorrect because:
- Currently the transaction fees of
< $5per block (c.f. txns/block) are insignificant (i.e. only
2%of) compared to the protocol block reward of
- And research has modeled that proof-of-work becomes dysfunctional if ever transaction fee revenue ever becomes significant relative to the protocol determined block reward.
Thus perpetrating miners wouldn’t care about displacing 2% of their total rewards, when by doing so they can gain access to a very valuable honeypot.
And also to correct an ostensibly implicit misunderstanding that both @smooth and I had concerning whether Monero’s dynamically-adjusting block size algorithm has any impact on this vulnerability. Note that whether the block size is limited or not has nothing to do with the vulnerability, because if the perpetrator attempted to create for free more than
X% of the transactions, the excess must go in the perpetrator’s blocks (else the transaction fees cost will not be offset) and thus users could choose to not mix with transactions from larger blocks.
Some people are inclined to emphasize that the number of Monero transactions are too small; and the perpetrator wouldn’t have been around during the first three years of blockchain history. But it is the value of transactions and the value of any “illicit” activity being enabled by them that drives the economic value of the honeypot, such as to law enforcement and draconian capital controls frenzy descending on the world. The FBI has mentioned Monero is on its radar. And ratio between the complicit and non-complicit transactions in the recent UTXO (which isn’t knowable) is what matters, not the number of transactions. The older UTXO are harmful for mixing in recent transactions as explained in the subsequent Overlapping Mix Sets section.
For analyzing how large
X% may be, proof-of-work mining always becomes concentrated by those who have the most economies-of-scale. For example, Bitmain does not have to mine with all the ASICs it manufacturers in order to extract the majority of all the profit that is produced by mining with the said ASICs. The economics and game theory trend to oligarchy control over mining is the norm—not a roll of the dice. And for Cryptonote/Monero, the perpetrator has the added advantage of gaining additional value from this vulnerability that the non-complicit miner doesn’t. Economics rules outcomes. Selling access to the honeypot is irrefutably more profitable than not doing so, unless offsetting costs of doing so are too great. That applies even with Monero’s current CPU/GPU mining; and note that every proof-of-work algorithm can given sufficient capital investment be ASIC mined with orders-of-magnitude more efficiency (even @tromp agrees there are no possible exceptions!). Due to the natural phenomenon of economies-of-scale concentration, there are only two semiconductor chip fabs in the world which make all 14/16nm ASICs.
Additionally, per the research mentioned in the prior section, Monero’s perpetual (“tail”) block reward will not prevent its proof-of-work from becoming dysfunctional in the future, in the absence of an oligarchy of miners. But an oligarchy of miners would have an economic incentive to sell access to the deanonymization honeypot, because absent a whistle-blower like Edward Snowden, no one will know (or at least be able to prove) that such access and honeypot is deployed.
Dash, CoinJoin, and Masternodes
Silently deanonymizing the minions does not disturb the profit applecart of mining and speculation, as evident by Dash’s (not Dashcoin) speculation market success even though no one—who properly understands anonymity technology—uses Dash’s masternode scam honeypot which is analogous to the lack of robust anonymity provided by a VPN.
As I had been the first to point out in 2014 that CoinJoin can always be jammed, and thus masternodes were required in order to make CoinJoin functional. And as you explain, masternodes compromise the anonymity.
I also found a high school level probability math error in Dash’s security.
Combinatorial Blockchain Analysis Deanonymization
With the vulnerability of the Divide and Conquer section, the perpetrator owns (i.e. knows the identify of)
X% “white sheep” transactions in every anonymity set mix. But with that information alone, it is unknown which of the remaining
(100 − x)% “black sheep” is the payer—i.e. all the remaining “black sheep” UTXO in the anonymity set mix are equiprobable to be the payer.
However, the perpetrator has at least two additional vulnerabilities which can be exploited together to form a combinatorial analysis which further narrows the unknowns in the anonymity set mixes to potentially a single “black sheep” payee.
Firstly, although any given payee might be diligent about always employing a different IP address for every spend transaction, in general not all payees will, because as of now it is inconvenient, many users are ambivalent/unaware, and for the medium-term future probably eventually implausible for most users as governments clamp down on for example WiFi hotspots that do not adhere to increasingly draconian user tracking regulations. And even if users do employ a different IP address for every spend transaction, many of them due to their ignorance of the technical issues will still be correlated by other metadata such as browser fingerprints, web browsing activity, viruses on their device, etc..*
Such metadata correlation enables a linkage between spend transaction in any case, and possibly even identifying the person signing them (if the metadata can be linked to a real identity). But that doesn’t identify which UTXO “black sheep” in anonymity sets is the payer. To correlate to the UTXO, the payer would have not been running a full node when the UTXO was received as a payee. Because the payee would have linked his IP address (or potentially other metadata) to the UTXO when requesting that a full node scan for incoming payments.
Thus for the Cryptonote ring signatures (including Monero’s RingCT upgrade), the perpetrator is able to identify which of the “black sheep” in anonymity sets is the payer when correlating the same IP address (or other meta data) between ring signature mix transactions because for example users don’t run a full node or otherwise leak metadata correlations.
. . .
* I’m referring here to a perpetrator which has the resources of the national security agencies or is otherwise earning enough from having the profit advantage in mining provided by the honeypot, to hire an army of hackers, payoff insiders at ISPs, etc..
Some claim that absolute anonymity from national security agencies is not the most popular goal; rather that privacy from the general public is the more realistic goal. But such a distinction may not actually exist especially for businesses and high net worth individuals (i.e. the market that actually needs privacy), because private data as seen by nodes on the network often has a market value to various data mining markets (when aggregated for many users). For example, economics should dictate that metadata such as IP address is not likely obscured from the data mining market even when using a VPN unless users are paying more to the VPN service to obscure identity than that VPN (or a Sybil attacker combined with timing analysis on the low-latency of VPNs) can earn on the open market for revealing the data. Hackers already routinely hack secret data then auction it off on the Internet. “Elementary [economics] my dear Watson”.
Overlapping Mix Sets
Every payee must mix his transaction with other UTXO to form an anonymity set mix (which obviously is a subset of all the transaction outputs), but it is unknowable which of those UXTO (of those which aren’t owned by the perpetrator, which is also publicly unknowable) are compromised as explained in the prior section. Although that doesn’t directly identify the remaining payees who were diligent not to use the same IP address for each spend transaction, it does convert those identified careless “black sheep” into “white sheep” thus further decreasing size of the set of "black sheep" in anonymity sets which are equiprobable to be the remaining diligent payees. In other words, the carelessness of other users reduces the effective anonymity set size of the diligent users, in addition to the reduced anonymity set size due to the vulnerability of the Divide and Conquer section. And note that increasing the size of the anonymity set mixes (or equivalently mixing over and over again), doesn’t improve matters because the two vulnerabilities mentioned thus far scale proportionally to the anonymity set mix size—i.e. the number of bad apples is a proportion.
Secondly, the perpetrator has another vulnerability to exploit to further narrow the remaining diligent “black sheep” to one per anonymity set.
Ring signature transaction enable the payer to select which UTXO to mix into the anonymity set of his spend transaction (obviously in order to obscure which of the UTXO is the payee, the deanonymization of which is what we are discussing). Thus there is no restriction to prevent different spend transactions from having overlapping sets of UTXO. If these instances of overlap exceed the number of equiprobable unknown “black sheep”, then through a process of elimination it is mathematically possible to identify the payer of each spend transaction. Note that there is nothing any payer can do to prevent this because other payers are free to mix any payee’s UTXO. And the (risk of) instances of overlap for any UTXO increase indefinitely because no UTXO can ever be marked as spent, because it is supposed to be unknowable which of the UTXO was spent in each ring signature anonymity set. In other words, the probability of being deanonymized by overlapping mixes, increases the longer that a spend transaction has been on the blockchain.
A simplified example can demonstrate this. Each line is a ring signature where each number represents a UTXO address:
[1, 2, 3] [2, 3, 4] [3, 4, 1] [4, 2, 1]
Note that each UTXO is equiprobable to be the payer of each of the above four transactions. However if the perpetrator issued the first transaction spending UTXO numbered
1 and has identified the second transaction as spending UTXO numbered
2 due to metadata correlation, then the perpetrator knows that
4 are the payers of the third and fourth transaction respectively.
But only the perpetrator knows this, and so if another ring signature transaction includes any of those UTXO which the perpetrator has already identified as spent, e.g.
[3, 4, 9] so then the perpetrator knows that
9 is the payer and so on indefinitely.
. . .
I had communicated (archived here) this vulnerability to @smooth in 2014 because I was trying to help determine if there were any technological vulnerabilities in Monero that BitcoinEXpress could exploit in his threat to attack Monero at that time.*
I reiterated and further explained (archived here) the vulnerability and my proposed solution to Monero’s cryptographer Shen Noether (aka @NobleSir) on Reddit during our discussions in 2015 (archived here) about the RingCT homomorphic encryption upgrade he invented to add the hiding of monetary values to Cryptonote’s ring signatures, because I had been simultaneously inventing a similar integration of Cryptonote ring signatures with Confidential Compact Transactions (CCT) which I named Zero Knowledge Transactions.
I had proposed as an improvement to Monero the grouping of UTXO into sets to restrict ring signatures to mixing UTXO within its explicit grouping. This would have provided the benefit that it would be possible to know when all UTXO in a grouping were spent when the count of spends was equal to the number of UTXO in the grouping, so that the grouping could be marked as spent and purged from the UTXO (my use of the term “pruned”).
In hindsight, my aforementioned improvement is also vulnerable and thus untenable because it limits the payer to an anonymity set which is temporally local to the UTXO received as a payee; thus enabling a complicit miner to target specific UTXO for deanonymization by temporal positioning of the perpetrating miner’s transactions in the explicit grouping. Now I realize Monero couldn’t accept my proposed improvement to diminish the superset grab bag for selecting the anonymity set of payers,† because of the Sybil attack problem I explained in the Divide and Conquer section; thus the vulnerabilities I have enumerated compound each other in an irreparable contagion.
. . .
* At that time, I received gracious donations of
2.5 ₿ from @rpietila,
2.5 ₿ from @smooth, and
5 ₿ from @jl777 for my assistance when ₿ was roughly $300, which was very much appreciated as I had no other income at that time and my savings were nearly depleted due to my 5+ year struggle with a mystery illness which ended up being diagnosed as “gut TB” and treated in 2017.
I’ll probably write another blog, Zerocash’s Technology Explained for Dummies, since afaik no one has written that yet. Herein I will only summarize the facets of Zerocash (not zerocoin!) that make its anonymity more dependable.
The anonymity set mix in Zerocash is always the entire set of transactions past and future—there are no definable subsets. Thus the exploits in the sections Divide and Conquer and Overlapping Mix Sets aren’t possible. Although it is possible for users to deanonymize themselves by not being diligent as explained in the Metadata Correlation section, unlike in non-Zerocash anonymity technologies, the carelessness of some users doesn’t diminish the anonymity of the diligent users, because there are no subsets of anonymity which can be narrowed. Zooko Wilcox-O'Hearn wrote in 2015:
A usability difference might be that in a [non-Zerocash] system, a recipient may need to distinguish “how much privacy” comes with the sender's amount, whereas in Zerocash there is a known “standard amount of privacy”.
As alluded to in the Metadata Correlation section and thus same as for Cryptonote/Monero, the diligent user must run a full node so that his IP address is not correlated when filtering the entire set of new transactions for incoming payments. And the user should never submit new transactions to the network from the IP address of his full node, nor communicate with the same IP address (or other correlated metadata) to his full node. Or don’t communicate with the full node over the Internet.
Although both Monero’s RingCT upgrade and Zerocash could have an undetected increase in the money supply if elliptic curve cryptography (ECC)* is cracked (also for Zerocash in the case that the systemic private key setup is compromised by human trust), it is very important to note that the cryptographic anonymity algorithm (not to be conflated with validation of total coin supply) of Zerocash can’t be cracked unless SHA256 is cracked; whereas, Monero/Cryptonote’s anonymity algorithm will be cracked if ECC is.
I would much rather my anonymity can only break if hash functions break which is the case for Zerocash but not Monero, RingCT, or Cryptonote, because if hash functions fail then all cryptocurrency and blockchains are F.U.B.A.R..
Whereas, if only ECC fails, in carefully designed cryptocurrencies such as Bitcoin remain secure (and Zerocash remains anonymous), but unfortunately Cryptonote/Monero will not remain secure nor anonymous because explicit mixing of ring signatures doesn’t allow the public key addresses to remain protected as hashed until spent!
Satoshi was so much more confident about the robustness of the security of hash functions (such as SHA256) versus ECC w.r.t. to future cryptanalysis, quantum computing, and potential for planted back doors, that he did a genius level design for Bitcoin emphasizing reliance on hash functions for security.
All currently known non-centralized anonymity set mixing technologies other than Zerocash—including Cryptonote, CoinJoin/Dash, and Shadowcash—are vulnerable to the Sybil attack explained in the Divide and Conquer section because they require explicit anonymity mix subsets. Zerocash is not susceptible because the anonymity set is always implicitly all of the UTXO (even those already spent), so spamming the UTXO gains the Sybil attacker no probabilistic advantage.
This is fundamental and the explicit anonymity set technologies can not be “fixed”, because moving to an implicit anonymity set requires something like zk-SNARKS to lift the abelian group from low-level homomorphic properties such as public key signatures and homomorphic value hiding, to proving general arithmetic circuits. Arithmetic circuits are required because cryptographic trapdoors are required for creating zero knowledge protocols, trapdoors require an abelian group, and trapdoors can not be created directly on cryptographic hash functions given that hash functions are not an abelian group, yet cryptographic hash functions are required to create the commitments needed for implicit anonymity sets and hash functions can be modeled as arithmetic circuits. Thus, Cryptonote can’t be improved to use implicit anonymity sets and have its anonymity algorithm rely on the conditional security of hash functions instead of that of the presumed math theoretic intractability of “factoring” the elliptic curve discrete logarithm problem (ECDLP)1.†
. . .
* Note for those who need a layman’s primer on ECC, I suggest chapter 7 The Cryptography Behind Bitcoin in the book Bitcoin for the Befuddled which can be previewed for free on Google Books.
† Which is conjectured to increase exponentially with the bit width.2 Even unconditional security’s reliance on proven assumptions of prohibitive cost is not equivalent to information-theoretic security—the inability to break security even with unlimited computing power, due to unavailable information.
Civilization Requires Anonymity and Privacy
Civilization collapses into totalitarianism without anonymity and privacy; and they are also essential for the analogous reasons to enable functioning families and relationships.
Privacy is the hiding of data or actions. Anonymity is privacy of the associated identities. And pseudonymity is the use of a faux identity.
For example, an IP address can be a pseudonym of the human utilizing a WiFi hotspot that’s not logging user identities. An example of anonymity is cryptographically mixing known identities (in an anonymity set) such that all are equiprobable. With Stealth address, all payees are mathematically equiprobable. Hiding the transferred token amount in Zerocash or RingCT’s transactions is an example of cryptographic privacy. Analogs in the non-computerized world are trusted agents and locks—i.e. without cryptography anonymity can’t be independently verified and relies on trust.
Fungibility and Conclusions
In a prior very rough draft of this blog, I explained in more detail that even without the anonymity set mixing (e.g. the ring signatures aspect of Cryptonote), the separate unlinkability technology of Cryptonote (aka BIP-63 Stealth addresses) provides the downstream anonymity that is required for those who are diligent at obscuring their metadata (such as IP address, which includes the necessity to run a full node).
The untraceability of anonymity set mixing is only needed for obscuring UTXO that is not already anonymous with the use of Stealth addresses for unlinkable payments.
Other than that need, the argument for needing untraceability on every transaction is to prevent tainting of UTXO with a claim that this is necessary to maintain fungibility. I thoroughly debunked that argument in my prior rough draft. The irony I pointed out is that the explicit anonymity mix subsets of Cryptonote/Monero are inferior for resisting deleterious implications of taint than Zerocash’s superior untraceability with no explicit anonymity subsets. And (especially in a cryptocurrency with very high velocity of money transaction volume) the natural downstream mixing of transactions will spread the taint out analogous to the employment of explicit anonymity sets (and possibility faster, presuming that Cryptonote can never realistically be a blockchain design that supports real-time confirmations and noting 0-confirmations is incentives incompatible as transaction fee revenues increase).
The bottom line is that the most popular medium-of-exchange unit will require high, real-time transaction volume and be incompatible with users running full nodes and using powerful untraceability. Zerocash is the superior untraceability mixer but anonymity set mixers will be relegated to a special case optional mixer not employed on most transactions. And as an optional mixer then the threat of creating undetected money supply is mitigated by the fact that the amount of tokens that can be spent out of the mixer must can't be greater than the tokens that were spent into it.
Yet the implication is that to avoid exchange rate delays and fluctuation risk when running coins through the optional mixer, the untraceability mixer should be denominated in the same token units as the popular medium-of-exchange. Thus dedicated anonymity cryptocurrencies have no future—not even as side-chains, because side-chains are irreparably flawed (even if not merged mined and using a different consensus algorithm).
So my (not near-term, but over medium-term) conclusion is goodbye Monero and Dash. Eventually perhaps goodbye Zerocash as a dedicated anonymity token, but being the best technology by far eventually as an optional mixer on a popular medium-of-exchange cryptocurrency* that can handle high, real-time transaction volume.
. . .
* Such as my coming Bitnet altcoin.
1 I. Chatzigiannakis, A. Pyrgelis, P. Spirakis, Y. Stamatiou, Elliptic Curve Based Zero Knowledge Proofs and Their Applicability on Resource Constrained Devices, §3. Zero Knowledge Protocols Based on the ECDLP, 8 July 2011.
2 Andrea Corbellini, Elliptic Curve Cryptography: breaking security and a comparison with RSA, 8 June 2014.