CoinJoin is an anonymization method for bitcoin transactions proposed by Gregory Maxwell. It is based on the following idea: “When you want to make a payment, find someone else who also wants to make a payment and make a joint payment together.” In a joint payment, an observer cannot reliably relate inputs to outputs within the single bitcoin transaction, and thus the exact direction of money movement remains unknown to third parties.
CoinJoin-based mixing methods increase privacy for all users, even those not using mixing, because it can no longer be assumed that all inputs to a transaction come from a single wallet, and hence those inputs can no longer be reliably associated with a single user.
As we all remember, Satoshi defined Bitcoin as “electronic cash”. What distinguishes ordinary cash from, say, e-money? The fact that cash is anonymous and decentralized. For example, you might meet a stranger in a dark alley: he hands you some goods, say a Rottweiler puppy, and you hand him $100. Neither of you has seen the other before, and neither could tell anyone about the deal even if he wanted to, simply because he does not know who the other party was. Due in part to this feature, cash is still popular and is likely to remain popular for a very long time. Cash finances all the shady dealings in this world, yet for some reason no one seeks to prohibit it.
The pseudonymity of Bitcoin is not a particular problem in itself, because 99% of the population is not engaged in anything reprehensible. However, the principle of privacy should still hold: “I have nothing to hide, it's just not your business.” Hence, many cryptographers and security specialists began working on how to give Bitcoin back the anonymity it originally promised, and with it the right to be called “cash”.
As usual, there was more than one way. One example is Zerocoin with its zero-knowledge proofs, which was rejected by the Bitcoin core developers because it would greatly increase the size of the already bloated blockchain.
A security specialist named Gregory Maxwell proposed "CoinJoin". This approach was simpler than the one used in Zerocoin and did not require any modification of the protocol or the blockchain. Maxwell reminded everyone that a regular transaction in the Bitcoin system can have not only several outputs but also more than one input. If two different addresses want to make a transfer, they can create one transaction for the two of them and agree on the outputs, and the network will accept such a transaction only if it is signed by both addresses.
The proposed approach was adopted immediately. Blockchain.info opened a separate service, SharedCoin, and the developers of DarkWallet announced support for CoinJoin from the beginning. The principle of operation in both cases is very simple. Users who want to use CoinJoin connect to a kind of “chat”, where they wait for others who also want to carry out transactions. They then agree to synchronize their actions and create one common transaction with many inputs and outputs, which they all sign. The point of such a transaction is that an outside observer cannot determine exactly which input corresponds to which output. The funds seem to merge into one pile from different sources (just as happens in mixer services) and then go to a variety of completely different addresses. Tracing the path does not work even under the assumption that each input must match an output of exactly the same amount (minus the fee): users simply create additional change addresses, so who sent what to whom becomes entirely unclear. And it goes without saying that CoinJoin can be repeated many times.
- Multiple inputs and only two outputs in the right transaction
- Multiple inputs and only two outputs in the second transaction
CoinJoin is good, but still not fully anonymous. First, users need to meet somewhere, and a centralized "chat" can keep records. Tor saves you from this problem, but another, unsolved problem is that users who sign a group CoinJoin transaction know too much about each other. They know, for example, which address sends how much and where, and which address receives the change. This data may also be recorded, which may result in some users being deanonymized.
Despite this shortcoming, CoinJoin was recognized as “safe enough”, and there was even a discussion about building this functionality into the official Bitcoin Core client. We might well have seen it in the standard distribution before long, had something better not come along.
CoinJoin requires that users negotiate transactions they wish to join. The first services to handle this (such as blockchain.info's SharedCoin) used centralized servers and required users to trust the operator of the service not to steal the bitcoins or allow others to do so. Centralized services may also compromise participants' privacy by keeping logs of the transactions they negotiate. Decentralized implementations of CoinJoin such as JoinMarket attempt to circumvent various issues related to centralization.
The level of anonymity offered by CoinJoin can also be diminished if the protocol is implemented incorrectly. One such flaw was identified in blockchain.info's SharedCoin mixing service by security consultant Kristov Atlas, who states that “the SharedCoin service should be used only as a light protective measure for financial privacy”. As part of his research, he developed a tool called 'CoinJoin Sudoku' that could identify SharedCoin transactions and detect relationships between specific payments and payees.
Bitcoin is often promoted as a tool for privacy but the only privacy that exists in Bitcoin comes from pseudonymous addresses which are fragile and easily compromised through reuse, "taint" analysis, tracking payments, IP address monitoring nodes, web-spidering, and many other mechanisms. Once broken this privacy is difficult and sometimes costly to recover.
Traditional banking provides a fair amount of privacy by default. Your in-laws don't see that you're buying birth control that deprives them of grandchildren, your employer doesn't learn about the non-profits you support with money from your paycheck, and thieves don't see your latest purchases or how wealthy you are to help them target and scam you. Poor privacy in Bitcoin can be a major practical disadvantage for both individuals and businesses.
Privacy errors can also create externalized costs: You might have good practices but when you trade with people who don't (say ones using "green addresses") you and everyone you trade with loses some privacy. A loss of privacy also presents a grave systemic risk for Bitcoin: If degraded privacy allows people to assemble centralized lists of good and bad coins you may find Bitcoin's fungibility destroyed when your honestly accepted coin is later not honored by others, and its decentralization along with it when people feel forced to enforce popular blacklists on their own coin.
The idea is very simple, but first some quick background.
A Bitcoin transaction consumes one or more inputs and creates one or more outputs with specified values. Each input is an output from a past transaction. For each input there is a distinct signature (scriptsig) which is created in accordance with the rules specified in the past-output that it is consuming (scriptpubkey).
The Bitcoin system is charged with making sure the signatures are correct, that the inputs exist and are spendable, and that the sum of the output values is less than or equal to the sum of the input values (any excess becomes fees paid to miners for including the transaction). It is normal for a transaction to spend many inputs in order to get enough value to pay its intended payment, often also creating an additional 'change' output to receive the unspent (and non-fee) excess.
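The value rule described above can be sketched in a few lines of Python. This is only an illustration: the `TxOut` type, the `fee` helper, and the short address labels are hypothetical stand-ins, and real validation also checks signatures and that each input is unspent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TxOut:
    address: str  # hypothetical stand-in for a scriptpubkey
    value: int    # amount in satoshis


def fee(inputs: list[TxOut], outputs: list[TxOut]) -> int:
    """Return the implied miner fee, or raise if outputs exceed inputs."""
    total_in = sum(o.value for o in inputs)
    total_out = sum(o.value for o in outputs)
    if total_out > total_in:
        raise ValueError("outputs exceed inputs")
    return total_in - total_out


# Two past outputs are spent to fund one payment plus a change output.
spent = [TxOut("1A1", 60_000), TxOut("1A2", 50_000)]
created = [TxOut("1B1", 100_000), TxOut("1A3", 9_000)]
print(fee(spent, created))  # the excess that goes to the miner
```

Here neither past output alone covers the 100,000 satoshi payment, so both are spent and the unspent remainder, less the fee, returns as change.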
There is no requirement that the scriptpubkeys of the inputs used be the same; i.e., no requirement that they be payments to the same address. And, in fact, when Bitcoin is correctly used with one address per payment, none of them will be the same. When considering the history of Bitcoin ownership one could look at transactions which spend from multiple distinct scriptpubkeys as co-joining their ownership and make an assumption: How else could the transaction spend from multiple addresses unless a common party controlled those addresses?
In the illustration 'transaction 2' spends coins which were assigned to 1A1 and 1C3. So 1A1 and 1C3 are necessarily the same party? This assumption is incorrect. Usage in a single transaction does not prove common control (though it's currently pretty suggestive), and this is what makes CoinJoin possible:
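The common-control assumption that analysts apply, and that CoinJoin undermines, can be sketched as a simple clustering pass. This is a minimal illustration, not any particular analysis tool: the function name and the address labels other than 1A1 and 1C3 are hypothetical, and each transaction is reduced to the list of addresses its inputs spend from.

```python
def cluster_by_common_inputs(transactions):
    """Group addresses that ever appear together as inputs of one transaction."""
    parent = {}

    def find(a):
        parent.setdefault(a, a)
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    for input_addresses in transactions:
        if not input_addresses:
            continue
        first = input_addresses[0]
        find(first)  # register single-input transactions too
        for other in input_addresses[1:]:
            union(first, other)

    clusters = {}
    for addr in list(parent):
        clusters.setdefault(find(addr), set()).add(addr)
    return sorted(sorted(c) for c in clusters.values())


# 'transaction 2' spends from 1A1 and 1C3, so the heuristic merges them.
history = [["1A1", "1C3"], ["1C3", "1F9"], ["1Z7"]]
print(cluster_by_common_inputs(history))
```

If the transaction spending 1A1 and 1C3 was in fact a joint payment by two people, the merged cluster is simply wrong, and every later inference built on it inherits the error.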
The signatures, one per input, inside a transaction are completely independent of each other. This means that it's possible for Bitcoin users to agree on a set of inputs to spend, and a set of outputs to pay to, and then to individually and separately sign a transaction and later merge their signatures. The transaction is not valid and won't be accepted by the network until all signatures are provided, and no one will sign a transaction which is not to their liking.
To use this to increase privacy, the N users would agree on a uniform output size and provide inputs amounting to at least that size. The transaction would have N outputs of that size and potentially N more change outputs if some of the users provided input in excess of the target. All would sign the transaction, and then the transaction could be transmitted. No risk of theft at any point.
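As a rough sketch of the construction just described, assuming each user has already shared their inputs, a fresh output address, and a change address: the dictionary layout, the function names, and the placeholder outpoints and addresses below are hypothetical, and signing is left out entirely.

```python
import random


def build_coinjoin(participants, denomination, fee_per_user=0):
    """Assemble an unsigned joint transaction with uniform outputs.

    Each participant is a dict with 'inputs' as a list of (outpoint, value)
    pairs, an 'output' address, and a 'change' address.
    """
    inputs, outputs = [], []
    for p in participants:
        total = sum(value for _, value in p["inputs"])
        change = total - denomination - fee_per_user
        if change < 0:
            raise ValueError("participant cannot fund the agreed output size")
        inputs.extend(p["inputs"])
        outputs.append((p["output"], denomination))
        if change > 0:
            outputs.append((p["change"], change))
    # Shuffle so that ordering does not reveal who contributed what.
    random.shuffle(inputs)
    random.shuffle(outputs)
    return {"inputs": inputs, "outputs": outputs}


def approves(tx, participant, denomination):
    """A participant signs only if their own agreed output is present."""
    return (participant["output"], denomination) in tx["outputs"]


alice = {"inputs": [("txA:0", 150_000)], "output": "1D", "change": "1A-change"}
charlie = {"inputs": [("txC:1", 100_000)], "output": "1E", "change": "1C-change"}
tx = build_coinjoin([alice, charlie], denomination=100_000)
print(all(approves(tx, p, 100_000) for p in (alice, charlie)))
```

Because every user checks the transaction before signing, and the network rejects it until all signatures are present, no participant has to trust whoever assembled it.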
In the illustration 'transaction 2' has inputs from 1A1 and 1C3. Say we believe 1A1 is an address used for Alice and 1C3 is an address used for Charlie. Which of Alice and Charlie owns which of the 1D and 1E outputs?
The idea can also be used more casually. When you want to make a payment, find someone else who also wants to make a payment and make a joint payment together. Doing so doesn't increase privacy much, but it actually makes your transaction smaller and thus easier on the network (and lower in fees); the extra privacy is a perk. 
Such a transaction is externally indistinguishable from a transaction created through conventional use. Because of this, if these transactions become widespread they improve the privacy even of people who do not use them, because no longer will input co-joining be strong evidence of common control. There are many variations of this idea possible, and all can coexist because the idea requires no changes to the Bitcoin system. Let a thousand flowers bloom: we can have diversity in ways of accomplishing this and learn the best.
CoinShuffle was proposed by researchers from Saarland University in 2014. It further develops the CoinJoin concept and increases privacy by not requiring any trusted third party to be involved in the creation of mixed transactions. In the original research paper presented at the 19th European Symposium on Research in Computer Security, CoinShuffle is described as a completely decentralized coin-mixing protocol “inspired by CoinJoin to ensure security against theft and by the accountable anonymous group communication protocol Dissent to ensure anonymity as well as robustness against DoS attacks”.
Only a proof-of-concept implementation of the CoinShuffle protocol was made available, written purely to evaluate feasibility and performance. It was followed by further research leading to the CoinShuffle++, ValueShuffle and PathShuffle proposals.
An example 2-party coinjoin transaction. https://chain.localbitcoins.com/tx/c38aac9910f327700e0f199972eed8ea7c6b1920e965f9cb48a92973e7325046
The outputs to addresses
1Fufjpf9RM2aQsGedhSpbSCGRHrmLMJ7yY are coinjoined because they are both of value 0.01btc.
Another example is this 3-party coinjoin. https://chain.localbitcoins.com/tx/92a78def188053081187b847b267f0bfabf28368e9a7a642780ce46a78f551ba
Don't you need tor or something to prevent everyone from learning everyone's IP?
Any transaction privacy system that hopes to hide users' addresses should start with some kind of anonymity network. This is no different. Fortunately networks like Tor, I2P, Bitmessage, and Freenet all already exist and could all be used for this. (Freenet would result in rather slow transactions, however.)
However, gumming up "taint analysis" and reducing transaction sizes doesn't even require that the users be private from each other. So even without things like tor this would be no worse than regular transactions.
Don't the users learn which inputs match up to which outputs?
In the simplest possible implementation where users meet up on IRC over tor or the like, yes they do. The next simplest implementation is where the users send their input and output information to some meeting point server, and the server creates the transaction and asks people to sign it. The server learns the mapping, but no one else does, and the server still can't steal the coins.
More complicated implementations are possible where even the server doesn't learn the mapping.
E.g. Using chaum blind signatures: The users connect and provide inputs (and change addresses) and a cryptographically-blinded version of the address they want their private coins to go to; the server signs the tokens and returns them. The users anonymously reconnect, unblind their output addresses, and return them to the server. The server can see that all the outputs were signed by it and so all the outputs had to come from valid participants. Later people reconnect and sign.
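The blind-signing step can be illustrated with textbook RSA blinding. This is a toy sketch under loud assumptions: the key is built from two small Mersenne primes and is far too weak for real use, there is no padding, and the function names and the example address are hypothetical. It shows only why the server can later recognize its own signature without having seen the address it signed.

```python
import hashlib
import secrets
from math import gcd

# Toy RSA key for the meeting-point server (assumption: illustration only).
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent, needs Python 3.8+


def address_to_int(address: str) -> int:
    digest = hashlib.sha256(address.encode()).digest()
    return int.from_bytes(digest, "big") % N


def blind(address: str):
    """User side: hide the output address behind a random factor r."""
    m = address_to_int(address)
    while True:
        r = secrets.randbelow(N - 2) + 2
        if gcd(r, N) == 1:
            return (m * pow(r, E, N)) % N, r


def sign_blinded(blinded: int) -> int:
    """Server side: sign the token without learning the address."""
    return pow(blinded, D, N)


def unblind(blind_signature: int, r: int) -> int:
    """User side: strip r, leaving an ordinary signature on the address."""
    return (blind_signature * pow(r, -1, N)) % N


def verify(address: str, signature: int) -> bool:
    """Server side, after the user reconnects anonymously."""
    return pow(signature, E, N) == address_to_int(address)


blinded, r = blind("1ExampleOutputAddress")
signature = unblind(sign_blinded(blinded), r)
print(verify("1ExampleOutputAddress", signature))
```

The server sees `blinded` in the first connection and the address with `signature` in the second; without `r` it cannot link the two, yet the valid signature proves the output came from an admitted participant.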
Similar things can be accomplished with various zero-knowledge proof systems.
Does the totally private version need to have a server at all? What if it gets shut down?
No. The same privacy can be achieved in a decentralized manner where all users act as blind-signing servers. This ends up needing n^2 signatures, and distributed systems are generally a lot harder to create. I don't know if there is, or ever would be, a reason to bother with a fully distributed version with full privacy, but it's certainly possible.
What about DOS attacks? Can't someone refuse to sign even if the transaction is valid?
Yes, this can be DOS attacked in two different ways: someone can refuse to sign a valid joint transaction, or someone can spend their input out from under the joint transaction before it completes.
However, if all the signatures don't come in within some time limit, or a conflicting transaction is created, you can simply leave the bad parties and try again. With an automated process any retries would be invisible to the user. So the only real risk is a persistent DOS attacker.
In the non-decentralized (or decentralized but non-private to participants) case, gaining some immunity to DOS attackers is easy: if someone fails to sign for an input, you blacklist that input from further rounds. They are then naturally rate-limited by their ability to create more confirmed Bitcoin transactions.
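A minimal sketch of this input-blacklisting idea, assuming a coordinator that can see which inputs were signed for in each round. The function name, the `will_sign` callback, and the outpoint labels are hypothetical.

```python
def join_with_retries(candidates, will_sign, max_rounds=5):
    """Retry a join, excluding inputs whose owners failed to sign.

    candidates: outpoint identifiers offered for the join.
    will_sign:  callable(outpoint) -> bool, standing in for whether a
                signature for that input arrived before the time limit.
    Returns (inputs of the completed join or None, blacklist, rounds used).
    """
    blacklist = set()
    for round_no in range(1, max_rounds + 1):
        eligible = [c for c in candidates if c not in blacklist]
        unsigned = [c for c in eligible if not will_sign(c)]
        if not unsigned:
            return eligible, blacklist, round_no
        # Ban the offending inputs from every later round.
        blacklist.update(unsigned)
    return None, blacklist, max_rounds


offered = ["a:0", "b:1", "evil:0"]
joined, banned, rounds = join_with_retries(
    offered, will_sign=lambda outpoint: not outpoint.startswith("evil")
)
print(joined, banned, rounds)
```

An attacker who wants to disrupt another round must bring a fresh confirmed input, which is what rate-limits them.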
Gaining DOS immunity in a decentralized system is considerably harder, because it's hard to tell which user actually broke the rules. One solution is to have users perform their activity under a zero-knowledge proof system, so you could be confident which user is the cheater and then agree to ignore them.
In all cases you could supplement anti-DOS mechanisms with proof of work, a fidelity bond, or other scarce resource usage. But I suspect that it's better to adapt to actual attacks as they arise, as we don't have to commit to a single security mechanism in advance and for all users. I also believe that bad input exclusion provides enough protection to get started.
Isn't the anonymity set size limited by how many parties you can get in a single transaction?
Not quite. The anonymity set size of a single transaction is limited by the number of parties in it, obviously. And transaction size limits as well as failure (retry) risk mean that really huge joint transactions would not be wise. But because these transactions are cheap, there is no limit to the number of transactions you can cascade.
In particular, if you can build transactions with m participants per transaction you can create a sequence of m*3 transactions which form a three-stage switching network that permits any of m^2 final outputs to have come from any of m^2 original inputs (e.g. using three stages of 32 transactions with 32 inputs each 1024 users can be joined with a total of 96 transactions). This allows the anonymity set to be any size, limited only by participation.
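The arithmetic of the cascade is easy to check. The helper below is a hypothetical illustration that only counts users and transactions; it does not route any coins.

```python
def cascade_size(m: int, stages: int = 3):
    """Users joined and transactions needed for a cascade of m-party joins."""
    users = m * m              # m^2 inputs can reach any of m^2 outputs
    transactions = stages * m  # m transactions in each stage
    return users, transactions


for m in (4, 16, 32):
    users, transactions = cascade_size(m)
    print(f"m={m}: {users} users joined with {transactions} transactions")
```

For m = 32 this reproduces the figures in the text: 1024 users and 96 transactions.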
In practice I expect most users only want to prevent nosy friends (and thieves) from prying into their financial lives, and to recover some of the privacy they lost due to bad practices like address reuse. These users will likely be happy with only a single pass; other people will just operate opportunistically, while others may work to achieve many passes and big anonymity sets. All can coexist.
How does this compare to Zerocoin?
As a crypto and computer science geek I'm super excited by Zerocoin: the technology behind it is fascinating and important. But as a Bitcoin user and developer the promotion of it as the solution to improved privacy disappoints me.
Zerocoin has a number of serious limitations:
- It uses cutting-edge cryptography which may turn out to be insecure, and which is understood by relatively few people (compared to ECDSA, for example).
- It produces large (20kbyte) signatures that would bloat the blockchain (or create risk if stuffed in external storage).
- It requires a trusted party to initiate its accumulator. If that party cheats, they can steal coin. (Perhaps fixable with more cutting-edge crypto.)
- Validation is very slow (can process about 2tx per second on a fast CPU), which is a major barrier to deployment in Bitcoin as each full node must validate every transaction.
- The large transactions and slow validation also mean costly transactions, which will reduce the anonymity set size and potentially make ZC usage unavailable to random members of the public who are merely casually concerned about their privacy.
- Uses an accumulator which grows forever and has no pruning. In practice this means we'd need to switch accumulators periodically to reduce the working set size, reducing the anonymity set size. And potentially creating big UTXO bloat problems if the horizon on an accumulator isn't set in advance.
Some of these things may improve significantly with better math and software engineering over time.
But above all: Zerocoin requires a soft-forking change to the Bitcoin protocol, which all full nodes must adopt, which would commit Bitcoin to a particular version of the Zerocoin protocol. This cannot happen fast—probably not within years, especially considering that there is so much potential for further refinement to the algorithm to lower costs. It would be politically contentious, as some developers and Bitcoin businesses are very concerned about being overly associated with "anonymity". Network-wide rule changes are something of a suicide pact: we shouldn't, and don't, take them lightly.
CoinJoin transactions work today, and they've worked since the first day of Bitcoin. They are indistinguishable from normal transactions and thus cannot be blocked or inhibited except to the extent that any other Bitcoin transaction could be blocked.
(As an aside: ZC could potentially be used externally to Bitcoin in a decentralized CoinJoin as a method of mutually blinding the users in a DOS attack resistant way. This would allow ZC to mature under live fire without taking its costs or committing to a specific protocol network-wide.)
The primary argument I can make for ZC over CoinJoin, beyond it stoking my crypto-geek desires, is that it may potentially offer a larger anonymity set. But with the performance and scaling limits of ZC, and the possibility to construct sorting network transactions with CJ, or just the ability to use hundreds of CJ transactions with the storage and processing required for one ZC transaction, I don't know which would actually produce bigger anonymity sets in practice. E.g. To join 1024 users, just the ZC redemptions would involve 20k * 1024 bytes of data compared to less than 3% of that for a complete three-stage cascade of 32 32-way joint transactions. Though the ZC anonymity set could more easily cross larger spans of time.
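To make the size comparison concrete, the figures in the paragraph above can be worked through. This sketch assumes "20k" means 20,000 bytes per ZC signature and treats "3%" as an upper bound. It derives only the byte budget that bound implies and makes no claim about actual CoinJoin transaction sizes.

```python
ZC_SIGNATURE_BYTES = 20_000  # assumption: "20k" read as 20,000 bytes
USERS = 1024
PARTIES_PER_TX = 32
CASCADE_TXS = 3 * PARTIES_PER_TX  # three stages of 32 transactions

zc_total = ZC_SIGNATURE_BYTES * USERS      # bytes for the ZC redemptions alone
cj_budget = zc_total * 3 // 100            # 3% of that, for the whole cascade
per_tx_budget = cj_budget // CASCADE_TXS   # bytes available per joint transaction
per_party_budget = per_tx_budget // PARTIES_PER_TX

print(zc_total, cj_budget, per_tx_budget, per_party_budget)
```

Under these assumptions the 3% bound allows 6,400 bytes per 32-way transaction, or 200 bytes per participant per stage.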
The anonymity sets of CoinJoin transactions could easily be big enough for common users to regain some of their casual privacy and that's what I think is most interesting.
How does this compare to CoinWitness?
CoinWitness is even more rocket-sciency than Zerocoin, and it shares many of the same weaknesses as a privacy-improver: novel crypto, computational cost, and the huge point of requiring a soft fork and not being available today. It may have some scaling advantages if it is used as more than just a privacy tool. But it really is overkill for this problem, and won't be available anytime real soon.
Sounds great! Where is it?
There's the rub: there exists no ready-made, easy-to-use software for doing this. You can make the transactions by hand using bitcoin-qt and the raw transactions API, as we did in that "taint rich" thread, but to make this into a practical reality we need easy-to-use automated tools.
Luke has written up some sketches of a protocol which would enable establishing joint transactions over the regular Bitcoin network.
The Bitcoin-qt RPC system provides everything someone needs to write a side-car applet (including the ability to lock txouts to prevent them from being spent out from under it) that participates in such a system. But the fact that so many users use centralized webwallets today which can spy on them will ultimately limit the userbase for these tools.
Personally, most of my coding brain capacity is spent on other things which are even more important to me. And what I could spare on Bitcoin is spent on more core and security things— if I work on anything wallet related anytime soon it will likely be improving the privacy behavior of coin selection... But moreover:
Anyone who builds this is going to be accused of enabling criminal activity, it doesn't matter if any actual criminals use this or not: Criminal activity sells headlines. Being a Bitcoin core developer already fills my quota for accusations of this kind, especially my quota for risk that I'm not even paid for. :)
In reality, real criminals don't need CoinJoin if they have even the slightest clue: They can afford to buy privacy in a way that regular users cannot, it's just a cost of their (often lucrative) business.
Joe-criminal can go out and buy 120% PPS mining to get brand new coins, or run his money through a series of semi-sham high cashflow gambling businesses for a 50% cut; criminals can afford the cost of seeking out and interfacing with these seedy services... Joe and Jane Doe? Their names are up in neon on blockchain.info. It might not seem great to them, but if there is a high cost of fixing it they simply won't, because the cost of fixing it is very concrete and the cost of privacy loss is speculative and distant. They might just need to give up bitcoin and switch to something almost totally private: cash... Regular users need efficient and inexpensive privacy if it is to help them at all.
I know that making such a tool doesn't fit into the get-rich-quick mold of many Bitcoin businesses, but the importance is self-apparent and the simplest versions of this don't require very deep technical wizardry. I think the "political" risk of improving people's privacy is a real one that you should carefully consider, but around these parts I see people sticking their names on some rather outrageously risky stuff. I'd hoped the "taint rich" thread would be enough to inspire some community action, but perhaps this will be.
See Also on BitcoinWiki
- CoinJoin: Bitcoin privacy for the real world
- A Taxonomy of Bitcoin Mixing Services for Policymakers
- CoinJoin Sudoku - Weaknesses in SharedCoin, and CoinJoin research.
- Blockchain's SharedCoin Users Can Be Identified, Says Security Expert
- ‘Dark Wallet’ Is About to Make Bitcoin Money Laundering Easier Than Ever
- CoinShuffle: Practical Decentralized Coin Mixing for Bitcoin
- Shared Coin - Free Trustless Private Bitcoin Transactions