Tornado.cash: A story of anonymity and zk-SNARKs

What is Tornado.cash, how to use it and the future

With the recent Yearn vault v1 hack from just a few days ago, we can see a new pattern of hacks emerging:

Get anonymous ETH via tornado.cash.
Use the ETH to pay for the hack transaction(s).
Use a flash loan to decrease capital requirements.
Create some imbalances given the large capital and arbitrage the imbalances.
Repeat as often as possible.
Pay back the flash loan and keep the profits.

Flash loans and creating imbalances certainly deserve their own blog post. But this week we will take a closer look at tornado.cash.

So what is it?

ELI5: Tornado.cash

A room full of money

Imagine we have a room with a single door and a guard in front of the room. Anyone is allowed to walk up to the guard and give him a 100$ note. The guard takes the note and puts it inside the locked room. Then he asks the person giving the money to think of a very large number. Instead of giving him the number directly, the person computes the hash of the number, writes it down and hands it to the guard. The paper with the hash is thrown into a big bowl.

Now imagine that over time hundreds or thousands of people do the same. The room will then hold thousands of 100$ notes, and the guard will have a bowl with thousands of papers containing the hashes.

If someone wants his 100$ back, he can walk up to the guard. The easy solution would be to simply show the guard the random number from before. The guard can compute its hash and check the bowl for a paper with that hash. If he finds one, he destroys that paper and hands back 100$.

But if we do it like this, what if the guard is malicious? He could secretly track which exact 100$ note belongs to which hash. Once we reveal our number, he can link it to our deposit, and we would effectively receive the same 100$ note we put in the room. A pretty pointless transaction, achieving nothing.
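The naive scheme above fits in a few lines of Python. This is only a toy model: SHA-256 stands in for the hash function, and the names NaiveGuard and commit are made up for illustration. Note that withdraw receives the secret itself, which is exactly what lets a malicious guard link a withdrawal to its deposit.

```python
import hashlib
import secrets


def commit(secret: int) -> str:
    """Hash of the secret number; this is what gets written on the paper."""
    return hashlib.sha256(secret.to_bytes(32, "big")).hexdigest()


class NaiveGuard:
    """Toy guard: keeps a bowl of hashes and pays out when shown a matching secret."""

    def __init__(self):
        self.bowl = set()  # hashes handed over at deposit time
        self.notes = 0     # number of 100$ notes in the room

    def deposit(self, commitment: str) -> None:
        self.bowl.add(commitment)
        self.notes += 1

    def withdraw(self, secret: int) -> bool:
        commitment = commit(secret)
        if commitment not in self.bowl:
            return False
        self.bowl.remove(commitment)  # destroy the paper
        self.notes -= 1
        return True


guard = NaiveGuard()
secret = secrets.randbits(248)       # "a very large number"
guard.deposit(commit(secret))
first = guard.withdraw(secret)       # succeeds
second = guard.withdraw(secret)      # fails: the paper is gone
```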

An experiment with balls

Now what if we could prove to the guard that we know a secret number that hashes to a commitment inside the bowl without revealing the actual number? Well we can do that with a Zero-knowledge Proof. Zero-knowledge Proofs are complicated, but an easy way to conceptualize them is the following example (thank you to Lucas for giving me this example at ETH.Berlin):

Imagine you are blind and I give you two balls that feel exactly the same and have the same weight. Now I tell you those two balls have different colors. There's no one else nearby. How can you find out if I'm telling the truth?

You can put one ball in each hand and show them to me. Then you put them behind your back and either swap the balls between your hands or leave them as they are. You show them to me again and ask: 'Did I swap the balls, yes or no?' If both balls were the same color, I could only guess, with a 50% chance of guessing correctly. That's why you do this 15 times in a row. If I answer correctly every time, you can say with near certainty (99.997%) that the balls are indeed of different colors. Otherwise I must have guessed correctly 15 times in a row, which is pretty unlikely.
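The 99.997% figure comes from 1 - 0.5^15. A small sketch of the game, with function names invented for this example, shows both the arithmetic and the protocol itself:

```python
import random


def cheat_probability(rounds: int) -> float:
    """Chance that a prover who has to guess is right in every round."""
    return 0.5 ** rounds


def prover_passes(balls_differ: bool, rounds: int, rng: random.Random) -> bool:
    """Play the swap game and report whether the prover answered every round correctly."""
    for _ in range(rounds):
        swapped = rng.random() < 0.5          # the blind person's secret choice
        if balls_differ:
            answer = swapped                  # the prover can see the colors
        else:
            answer = rng.random() < 0.5       # the prover can only guess
        if answer != swapped:
            return False
    return True


rng = random.Random(0)
confidence = 1 - cheat_probability(15)
honest_ok = prover_passes(True, 15, rng)
print(f"{confidence:.3%}")  # 99.997%
```

Each extra round halves the chance that a cheating prover gets away with it, so a handful of additional rounds pushes the confidence as close to 100% as you like.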

I have now proven to you that the balls are of different color without ever revealing the actual colors, hence the name 'zero knowledge' proof. You have no idea if the balls are green, black, orange or something else.

Getting your money back

Now someone can prove to the guard that he in fact knows a number that hashes to one of the hashes in the bowl. That proof generates a specific signature. The guard writes down the signature (which is not known to the guard when the 100$ are added) and stores it. Whenever someone provides a new proof, he can check whether its signature was already used. If it was, someone is trying to get multiple 100$ notes with the same random number.

So even though the hash of that number stays in our bowl, and the guard has no idea which of the hashes in it have already been used to take 100$ out again, he can still tell when someone is trying to use the same proof multiple times.

Now, after getting back the 100$, the note cannot be directly traced back to the original deposit, even if the guard is malicious.
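The bookkeeping on the guard's side can be sketched as follows. This is a simplified model under loud assumptions: the proof check is replaced by a plain boolean placeholder (no real zero-knowledge proof is generated or verified), SHA-256 stands in for the hash, and the split of the note into a nullifier and a secret is chosen here for illustration.

```python
import hashlib
import secrets


def h(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


class Guard:
    def __init__(self):
        self.bowl = set()  # commitments; papers are never removed
        self.used = set()  # signatures of proofs that already paid out

    def deposit(self, commitment: str) -> None:
        self.bowl.add(commitment)

    def withdraw(self, signature: str, proof_is_valid: bool) -> bool:
        # proof_is_valid is a placeholder for verifying a real zero-knowledge
        # proof that the signature belongs to some commitment in the bowl.
        if not proof_is_valid:
            return False
        if signature in self.used:
            return False  # same proof presented twice
        self.used.add(signature)
        return True


nullifier = secrets.token_bytes(31)    # assumed: first half of the note
secret = secrets.token_bytes(31)       # assumed: second half of the note
guard = Guard()
guard.deposit(h(nullifier + secret))   # the paper in the bowl
signature = h(nullifier)               # revealed only at withdrawal
first = guard.withdraw(signature, proof_is_valid=True)
second = guard.withdraw(signature, proof_is_valid=True)
```

The guard never learns which paper in the bowl the signature belongs to, yet the second attempt is rejected because the signature is already in the used set.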

From Zero-knowledge to zk-SNARKs

There's one problem with the normal Zero-knowledge proofs in the world of blockchains: asking many times in a row and waiting for the answer would require several transactions to go back and forth. That's not very efficient at all. With zk-SNARKs, or non-interactive zero-knowledge proofs, we can do this in one round. Essentially the questions are pre-determined based on a random oracle model. The prover can then send all answers in one single transaction.
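One common way to pre-determine the questions is to derive them from a hash of the prover's own first message, a trick usually called the Fiat-Shamir heuristic. The sketch below only illustrates that idea for the 15 swap questions from the ball game; it is not a description of the proof system tornado.cash actually uses, and the function name is invented for this example.

```python
import hashlib
from typing import List


def derive_challenges(first_message: bytes, rounds: int) -> List[int]:
    """Turn the prover's first message into a fixed list of yes/no challenges."""
    if rounds > 256:
        raise ValueError("SHA-256 only provides 256 challenge bits")
    digest = hashlib.sha256(first_message).digest()
    bits = []
    for i in range(rounds):
        byte = digest[i // 8]
        bits.append((byte >> (i % 8)) & 1)  # take one bit of the hash per round
    return bits


challenges = derive_challenges(b"prover's first message", 15)
```

Because the hash is deterministic, the verifier can recompute the same challenges from the submitted message, and the prover cannot pick the questions without also changing the message they depend on.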

The whole concept of zk-SNARKs is a very interesting topic. Vitalik posted a beginner-friendly introduction a few days ago (linked below). Well, as beginner-friendly as possible. You'll see what I mean. If you want to go really deep into the math behind it, it will be anything but easy. I certainly haven't figured out everything in the article myself, but here's my high-level understanding if you just want to know the basics:

zk-SNARKs let you prove statements about very heavy computations, like computing a hash 100 million times.
Verifying a proof itself doesn't require running the heavy computation.
Actual data is represented by polynomials, e.g., x² − 4x + 7.
Using the factor theorem, a polynomial that evaluates to zero at certain points can be written as a multiple of the lowest-degree polynomial that is zero at those points.
Then using polynomial commitments and the Schwartz–Zippel lemma, we can verify a proof for such polynomials by just randomly checking some coordinates.
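The random-checking idea in the last point can be tried out directly. The sketch below tests whether a polynomial P equals the product Z · H by comparing both sides at a few random points modulo a large prime. The polynomials, the prime and the function names are chosen for this example only, and polynomial commitments are left out entirely.

```python
import random

PRIME = 2**61 - 1  # a Mersenne prime; all arithmetic is done modulo this


def evaluate(coeffs, x):
    """Evaluate a polynomial (coefficients lowest degree first) with Horner's rule."""
    result = 0
    for c in reversed(coeffs):
        result = (result * x + c) % PRIME
    return result


def probably_is_product(p, z, h, trials=5, rng=None):
    """Check P(r) == Z(r) * H(r) at random points r instead of comparing coefficients."""
    if rng is None:
        rng = random.Random()
    for _ in range(trials):
        r = rng.randrange(PRIME)
        if evaluate(p, r) != evaluate(z, r) * evaluate(h, r) % PRIME:
            return False
    return True


p = [3, -4, 1]      # x^2 - 4x + 3, which is zero at x = 1
z = [-1, 1]         # x - 1
good_h = [-3, 1]    # x - 3, so z * good_h equals p
bad_h = [-2, 1]     # x - 2, so z * bad_h does not equal p

rng = random.Random(42)
good_ok = probably_is_product(p, z, good_h, rng=rng)
bad_ok = probably_is_product(p, z, bad_h, rng=rng)
```

Two different low-degree polynomials can only agree at a handful of points, so a wrong H slips through only if a random point happens to hit one of them, which is astronomically unlikely in a field this large.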

To understand this properly, read the post. I'll link it once again: https://vitalik.ca/general/2021/01/26/snarks.html.

From theory into practice: tornado.cash

Using zk-SNARKs, tornado.cash allows you to deposit a fixed amount of ETH, DAI, cDAI, USDC, cUSDC or USDT into the contract. On deposit you'll receive a backup code that you'll need to withdraw the funds later on.

Why fixed amounts? Basically every fixed amount is its own anonymity set. You can see in the screenshot above that at that time the anonymity set for 0.1 ETH had a size of 426, meaning 426 other people currently have access to 0.1 ETH. And since deposits are public information, when you deposit 0.1 ETH, those 0.1 ETH can later be traced to this group of 427 people, but not to you directly.

How secure is this? The anonymity is only as good as the anonymity set is large and as deposits and withdrawals are frequent. Say a set contains 30,000 people but has seen no deposits or withdrawals for months. If you now deposit, wait one day and withdraw, it will be pretty easy to trace the funds back to you. So keep an eye on the stats page.

How does it work? It uses a Pedersen hash function, which efficiently computes a hash onto an elliptic curve, inside the zk-SNARK. The initial setup and the automatically generated Solidity verifier contract were created with snarkjs.

All details can be found in the whitepaper.

Governance for tornado.cash

Adding governance to the protocol is planned. This will include its own TORN token, which will be allocated to the treasury (55%), paid out to the team and investors (30%), airdropped to early users of the service (5%), and used for a new concept called anonymity mining (10%).

Since the service is only secure when a lot of people are using it, the idea is to further incentivize people to leave funds in the contracts by paying them TORN for it. This will be done in a way that fully preserves the anonymity of the miner.

Deanonymizing tornado.cash

People have already started trying to de-anonymize users. This is possible using three metrics:

time of the day of sending the deposit/withdrawal
gas price distribution
transaction graph analysis

1. When most users of the service live in Europe and you're living in New Zealand and interact with the contract around 4pm your time, it will be 4am in Europe. So it will be extremely easy to identify you.

2. Most wallets, including MetaMask, set gas prices automatically and thus provide some information that helps identify a user.

3. Users with multiple addresses might interact with the same services from those addresses. In the worst case there is even a direct link between the addresses because funds were sent between them somewhere in the transaction graph. This makes it possible to link certain addresses to each other.

All these issues are preventable if you follow strict rules:

use a random number generator from 0 to 23 to pick the hour of the day at which you send the deposit/withdrawal
set your gas price manually with the help of a random number generator or use multiple wallets
use a fresh address for withdrawals, and never use that address with services you also use from your other addresses
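The first two rules can be automated with a few lines of Python. This is a minimal sketch: the function names and the gas price bounds are hypothetical, and you would need to pick bounds that match current network conditions yourself.

```python
import secrets


def random_hour() -> int:
    """Pick an hour of the day (0 to 23) at which to send the transaction."""
    return secrets.randbelow(24)


def random_gas_price_gwei(low: int, high: int) -> int:
    """Pick a gas price between low and high (both inclusive), in gwei."""
    return low + secrets.randbelow(high - low + 1)


hour = random_hour()
gas_price = random_gas_price_gwei(80, 120)  # hypothetical bounds
```

The secrets module is used instead of random so the choices cannot be reproduced by someone who guesses a seed.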

The future of tornado.cash

It's not clear at this point where the future is headed for tornado.cash. The TORN token that launched a few days ago has already increased significantly in price, but what the treasury funds will be used for is not clear yet. Some people have questioned whether a service like tornado.cash needs a token at all.

What do you think of tornado.cash? Let me know in the comments.

Markus Waas

Solidity Developer