Wallets: Securing storage

Securing wallets comes up in almost every discussion of blockchain applications, and rightly so. With the mention of wallets, the discussion usually turns to cold and hot wallets.

While cold storage wallets are significantly more secure, they are also more cumbersome to use, especially if used frequently. Hot wallets provide a significantly better user experience and ease of use.

Regardless of whether it is a cold or a hot wallet, storage of the data (private keys) must be made secure.

Side note: This article explores how storage of your private keys (secrets) is made secure, and how to future-proof it against advances in technology. Many of the other security concerns are not discussed in this article. It is a very wide subject, and must be studied as such, in that context. This article restricts its scope to: how to make it harder to access the private keys even if a cracker had the wallet’s data storage.

The article explores basics of securing storage (cryptography practices), and borrows learnings from blockchain. It then applies these learnings to the existing practices to attempt future-proofing storage security against technology advancement.

The marvel of Proof-Of-Work

The block interval on the Bitcoin network is 10 mins. That means a new block is expected, on average, every 10 mins. It should not be possible for miners to sustain a faster rate than that and create coins more quickly.

Over time hardware and software will surely become more efficient, so how does Bitcoin cope with this? Would it not be possible to mine a block in less than 10 mins? The Bitcoin protocol is designed such that improvements in hardware and software cannot decrease the average time needed to mine a block. This is achieved by adjusting the difficulty of mining a block. As mining systems get faster, it gets harder to mine a block. Thus the protocol is resilient to improvements in technology (both hardware and software).

Just like the network protocol of Bitcoin, it may be possible to make it difficult to “hack” wallets, even as hardware / software technology advances. Wallets must catch up with technology improvements and increase the difficulty of accessing securely stored data.

A few key concepts must be understood before delving deeper into the topic.

The human factor

Private keys used for signing transactions on a blockchain are typically stored together in a wallet. The keys are most likely stored on the device used for signing the transactions (disk, SSD, USB drive or phone storage). The stored data (private keys) must be encrypted; if it is not, any malware can steal it.

A key is needed to encrypt the data and this key cannot be stored on the same device. Otherwise it is as good as locking the door and keeping the key beside it.

The key usually gets generated from a password or passphrase. Hence, ultimately the security of such systems depends upon humans.

Unfortunately, the harder the password / passphrase, the poorer the overall user experience. The user has to remember such a difficult password / passphrase and, just in case, record it somewhere else, like a piece of paper! If the wallet holds something very valuable, then users need to store such papers in a safe or risk losing the secret and the coins forever.

Making a system secure and making it user friendly turn out to be conflicting needs.

Cracking passwords is getting easier with simple brute force and evolving hardware and software capabilities. Thus it is very important to choose the right password / passphrase. Humans are not good at that either. To address this, passphrases are generated by computers. This is described later in the article.

Entropy, password and passphrase

In cryptography the term entropy is used often. It is very important to understand this concept, as the security of a system directly depends upon it. Entropy is a measure of randomness. The more randomness in a password, the higher the password’s entropy. The higher the entropy, the harder it is to crack the password.

Entropy is much easier to understand with a hypothetical system. Let us consider a system in which the user’s first name is the username and the password is one of the following:

  • birthplace or
  • father’s name or
  • mother’s name or
  • last name

(i.e. only 4 possible values for the password). Note that this is just a hypothetical system. In this case, if each value is equally likely, the probability of a single guess by the attacker being right is 25%. The attacker can get these names by spying on the user’s letters, looking up public records etc. With at most 4 guesses, the attacker gets through. As 2 bits can represent 4 possibilities (00, 01, 10, 11), the entropy of the system is 2 bits.
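The arithmetic behind the 2-bit figure can be sketched in a few lines. This is a minimal illustration that assumes every candidate password is equally likely; the function name entropy_bits is made up for this example.

```python
import math


def entropy_bits(equally_likely_choices: int) -> float:
    """Entropy, in bits, of a secret picked uniformly from N possibilities."""
    if equally_likely_choices < 1:
        raise ValueError("need at least one possible value")
    return math.log2(equally_likely_choices)


# The hypothetical system above has only four candidate passwords.
print(entropy_bits(4))  # 2.0
```

Each extra bit of entropy doubles the number of guesses an attacker has to be prepared to make.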

As claimed in this book (regarded as one of the best books on the subject), typical passwords contribute about 2 bits of entropy per character. This means an 8 character password gives an expected entropy of 16 bits. For secure systems we need at least 128 bit security, which would require people to have 64 character passwords. That is impractical.
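The same estimate, written out as a small calculation. The 2 bits per character figure is the one quoted above; the helper names are hypothetical and only serve the example.

```python
import math

BITS_PER_CHAR = 2  # per-character estimate quoted in the text above


def password_entropy_bits(length: int, bits_per_char: int = BITS_PER_CHAR) -> int:
    """Estimated entropy of a human-chosen password of the given length."""
    return length * bits_per_char


def chars_needed(target_bits: int, bits_per_char: int = BITS_PER_CHAR) -> int:
    """Password length needed to reach the target security level."""
    return math.ceil(target_bits / bits_per_char)


print(password_entropy_bits(8))  # 16
print(chars_needed(128))         # 64
```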

To overcome the problem of low entropy, secure wallets ask users to have a series of words (12 or 24) or a passphrase. Whether these are generated randomly, or users are asked to create their own passphrase depends upon the wallet. This is done to increase the entropy.

BIP-39 is used by many wallets including MetaMask, Trezor, Ledger etc. The English-language wordlist for the BIP39 standard has 2048 words, so a phrase of 12 random words would have 2048^12 = 2^132 possible combinations, i.e. 132 bits of security. However, some of the data in a BIP39 phrase is not random,[2] (the last 4 bits of a 12-word phrase are a checksum), so the actual security of a 12-word BIP39 seed phrase is only 128 bits. Wallets can generate a key from this passphrase, which is used to encrypt the data stored in the wallet.
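The 132 versus 128 bit split can be checked with a short calculation. The sketch below relies on two facts from the BIP39 specification: each word encodes 11 bits, and the checksum is one bit for every 32 bits of entropy. The function name bip39_bits is an assumption made for this example.

```python
WORDLIST_SIZE = 2048
BITS_PER_WORD = (WORDLIST_SIZE - 1).bit_length()  # 11 bits select one of 2048 words


def bip39_bits(word_count: int) -> tuple[int, int, int]:
    """Return (total bits, checksum bits, entropy bits) for a BIP39 phrase."""
    total = word_count * BITS_PER_WORD
    # total = entropy + entropy / 32, hence checksum = total / 33
    checksum = total // 33
    return total, checksum, total - checksum


print(bip39_bits(12))  # (132, 4, 128)
print(bip39_bits(24))  # (264, 8, 256)
```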

How much entropy is good enough depends upon how much technology has evolved!

Side note: Brainwallet is a concept in which you don’t store the private keys in any wallet; instead you remember a passphrase and generate the private key from it. The concept is explained here. Beware, it is very insecure, due to the low entropy of such systems. Many software wallets which use a passphrase differ from a Brainwallet in that a salt is stored along with the encrypted data file. The details are described below. In short, avoid Brainwallets or any other wallet which depends solely on humans to come up with a passphrase.

An interesting talk was presented at DEFCON 23 on wallet security. Here’s the deck for the talk.

Brute force attacks

If a hacker got hold of an encrypted file, and does not have the key to decrypt it, there are two approaches the hacker could take:

  1. Get the key by stealing or look for clues elsewhere.
  2. Guess the key, try it out, if not successful make another guess.

The second approach is called brute force. With modern encryption systems, the number of guesses the hacker has to make is extremely large (up to 2^256 for a 256 bit key, and about half that on average). Of course humans are not guessing; computers do the guess work, or rather simply try the next key from the set.
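To get a feel for the numbers, here is a rough estimate of how long an exhaustive search would take. It is only a sketch: the guess rate of one trillion keys per second is an arbitrary assumption, and on average the attacker is expected to search half of the key space.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def expected_years(key_bits: int, guesses_per_second: float) -> float:
    """Average time to find a key by trying every candidate in turn."""
    expected_guesses = 2 ** key_bits // 2  # half the key space on average
    return expected_guesses / guesses_per_second / SECONDS_PER_YEAR


RATE = 1e12  # assumed attacker speed: one trillion guesses per second
for bits in (16, 64, 128, 256):
    print(f"{bits:3d} bit key: about {expected_years(bits, RATE):.3g} years")
```

A 16 bit secret falls instantly at this rate, while every additional bit doubles the attacker's work.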

Side note: A study of the birthday attack helps us understand why some attacks need only about the square root of the number of possible values: a collision among n bit values is expected after roughly 2^(n/2) tries. Generally it is recommended to have 128 bit security in today’s context. To keep that margin against such attacks, your wallets should use at least 256 bit keys.

Modern GPU rigs can be created by engineers to crack passwords by brute force (how to article). To aid this, software is available freely.

Brute force resilience

Even if a user had a decent entropy for the passphrase, a cracker can deploy a number of computers to use brute force and leak the secrets out or in future simply use more powerful hardware. So it is important to make it more and more difficult to reveal secrets from a wallet.

Bitcoin relies upon the time needed to “compute” a block as a way of measuring how easy or difficult mining is. For this it looks at how long the last 2016 blocks took to mine. If the average over that window falls below the intended 10 mins, the network increases the difficulty; if it rises above, the network decreases it. As hardware improvements are made, or distributed mining farms are created (software improvements), the average tends to fall below the 10 min mark, and the network responds by increasing the difficulty. This link shows average block time for Bitcoin, from 2010 until now.
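The adjustment rule can be sketched as follows. This is a simplification and not the actual client code: Bitcoin adjusts a numeric target rather than a difficulty number, and the function name retarget is made up here. The clamp to a factor of 4 in either direction mirrors the limit in the Bitcoin protocol.

```python
TARGET_BLOCK_SECONDS = 10 * 60
RETARGET_INTERVAL = 2016  # blocks between difficulty adjustments
EXPECTED_TIMESPAN = TARGET_BLOCK_SECONDS * RETARGET_INTERVAL  # two weeks


def retarget(old_difficulty: float, actual_timespan_seconds: int) -> float:
    """Scale the difficulty so the next 2016 blocks take about two weeks."""
    # A single adjustment is limited to a factor of 4 in either direction.
    clamped = min(max(actual_timespan_seconds, EXPECTED_TIMESPAN // 4),
                  EXPECTED_TIMESPAN * 4)
    return old_difficulty * EXPECTED_TIMESPAN / clamped


# Miners became twice as fast: the window took one week instead of two.
print(retarget(100.0, EXPECTED_TIMESPAN // 2))  # 200.0
```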

Side note: One of my earlier articles “Bitcoin, Blockchain and the design elements explained” throws light on technical details of hashing, difficulty, transactions, blocks, consensus etc and how it all ties together.

With increased difficulty, the miners need to try out many more “nonce” values to arrive at a valid hash. The nonce is explained in the article in the note above. The miner needs to run many iterations of hash generation. This, in a way, is the penalty for increasing the Bitcoin network hash-rate. At the time of writing this article, the bitcoin network is operating at roughly 84 million-trillion hashes per second (about 84 exahashes per second).

Lesson: To increase the difficulty of cracking passwords, get the computer to perform many more hash computations!
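A toy version of the nonce search shows how the work grows with difficulty. It is a sketch only: real Bitcoin mining hashes an 80-byte block header with double SHA-256, whereas this example hashes arbitrary bytes once, and the function name mine is hypothetical.

```python
import hashlib


def mine(block_data: bytes, difficulty_bits: int) -> tuple[int, bytes]:
    """Find a nonce whose SHA-256 digest starts with `difficulty_bits` zero bits."""
    target = 1 << (256 - difficulty_bits)  # valid digests are below this number
    nonce = 0
    while True:
        digest = hashlib.sha256(block_data + nonce.to_bytes(8, "big")).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce, digest
        nonce += 1


# Each extra difficulty bit doubles the expected number of hashes.
for bits in (8, 12, 16):
    nonce, digest = mine(b"toy block", bits)
    print(f"{bits} bits: nonce {nonce}, digest {digest.hex()[:16]}...")
```

There is no shortcut to a valid nonce; the only way to find one is to keep hashing.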

Salting and Stretching

In the current scenario, no password or passphrase is strong enough on its own.

In order to make the wallet secure, wallets must not rely on passwords or passphrases alone. The password / passphrase is hashed along with an extremely large random integer (512 bit integer preferably). This large cryptographically secure pseudo random number is called salt. The salt makes the key derivation of every wallet unique, so an attacker cannot reuse a precomputed table of hashed passwords and has to attack each wallet separately.
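A minimal sketch of generating a salt and mixing it with a passphrase, using Python's standard library. The helper names are invented for illustration, and the single hash shown here is not a complete key derivation.

```python
import hashlib
import secrets

SALT_BYTES = 64  # 512 bits, as suggested in the text above


def new_salt() -> bytes:
    """Draw a salt from the operating system's cryptographically secure generator."""
    return secrets.token_bytes(SALT_BYTES)


def salted_hash(passphrase: str, salt: bytes) -> bytes:
    """Hash the passphrase together with the salt (one round only)."""
    return hashlib.sha256(passphrase.encode("utf-8") + salt).digest()


# The same passphrase under two different salts gives unrelated hashes.
first, second = new_salt(), new_salt()
print(salted_hash("correct horse", first).hex())
print(salted_hash("correct horse", second).hex())
```

The salt is not a secret. It is stored next to the encrypted data, because it is needed again to regenerate the key.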

Side note: In cryptographic systems, random number generation is taken very seriously, as it forms the basis for various cryptographic algorithms. Software on its own is deterministic and cannot produce truly random numbers. Hence, these random number generators are called Pseudo Random Number Generators, and the cryptographic ones are seeded from unpredictable physical sources. An interesting research paper can be found here.

The password concatenated with salt is hashed multiple times to generate a key (see the algorithm below). This key is then used for encrypting the “private keys store” (data stored in the wallet). The private keys for the addresses on the blockchain are now assumed to be secure. This technique is used in cryptography for storing and securing secrets. It relies on the fact that Cryptographic Hashes have the following properties:

  1. It is infeasible to predict the generated hash value for a stream of bytes. To get the hash value, one has to run the algorithm; there is no shortcut.
  2. For a given input it always generates the same hash value.
  3. It is infeasible to deduce the input based on the hash value. That means it is an irreversible mathematical function.

Side note: Not all wallets use salt. Some wallets provide a feature to recover your wallet in case a user loses the computer or mobile phone etc. This feature does not allow for storing salt on the device. Also, it does not allow for storing the data (private keys for coins) on the device. How it is achieved deserves an article of its own. This article focuses on how to secure the storage of private keys. The storage could be anywhere, on the wallet provider’s cloud or the user’s device. In some other wallets, there is no data stored at all; only a key generation algorithm is used.

Key derivation from password / passphrase

K => Generated Key
h => Cryptographic hash function
r => Stretch factor (number of iterations to generate the key)
p => passphrase
s => salt (a very large pseudo random number)

   x_0 := 0
   x_i := h(x_{i-1} || p || s) for i = 1..r
   K := x_r
*credit
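The recurrence above translates almost line for line into Python. This is a sketch for understanding only: SHA-256 stands in for h, and x_0 is taken to be the empty byte string, both of which are assumptions made for the example. Real wallets should use a vetted key derivation function instead.

```python
import hashlib


def stretch(passphrase: bytes, salt: bytes, rounds: int) -> bytes:
    """Derive K = x_r where x_i = h(x_{i-1} || p || s)."""
    if rounds < 1:
        raise ValueError("rounds must be at least 1")
    x = b""  # x_0 (assumed empty for this sketch)
    for _ in range(rounds):
        x = hashlib.sha256(x + passphrase + salt).digest()
    return x


key = stretch(b"correct horse battery staple", b"example salt", 100_000)
print(key.hex())
```

Every guess an attacker tries has to go through the same number of rounds, so the cost of a brute force run grows in step with the number of rounds.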

As the number of iterations increases, more computation is needed by an attacker who is using brute force. By increasing the computation required for each “guess” by a factor of “r”, wallets make brute force very difficult when “r” is reasonably large. However, the user suffers a slightly longer response time each time the wallet is used. As long as it is only a couple of seconds, it should be acceptable to the user, I “guess”.

Lesson: As hardware and software capabilities increase, increasing the hashing work needed to generate the key can be a deterrent for brute force attacks.

Side note: Interesting read: NIST standards on how to implement Password Based Key Derivation Functions (PBKDF). It is recommended for an application developer to use PBKDF2 functions from open source libraries, rather than implementing your own. This is OpenSSL’s PBKDF2 function description.

The algorithm described above is only the gist of PBKDF. In industrial strength implementations, expected keyLength etc are taken into account.
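For comparison, this is how a key could be derived with the PBKDF2 implementation that ships with Python's standard library (hashlib.pbkdf2_hmac). The wrapper name derive_key and the chosen parameters are assumptions for the sake of the example, not a recommendation for any particular wallet.

```python
import hashlib
import secrets


def derive_key(passphrase: str, salt: bytes, iterations: int,
               key_length: int = 32) -> bytes:
    """Derive an encryption key from a passphrase using PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", passphrase.encode("utf-8"),
                               salt, iterations, dklen=key_length)


salt = secrets.token_bytes(64)  # stored alongside the encrypted data
key = derive_key("correct horse battery staple", salt, iterations=100_000)
print(len(key), "byte key:", key.hex())
```

The key_length parameter is where the expected key length mentioned above comes in: 32 bytes gives a 256 bit key.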

A large number of wallets opt for only double hashes as a way to improve user experience. A double hash is computed practically without any delay for the user. This makes it very tempting for the user to pick such a wallet. Nobody likes slow wallets after all. A few posts on Quora and Reddit claim that “using such double hash wallets is no big issue, because they are reasonably secure.” This is a misconception. In no time technology will evolve to break such wallets even by brute force techniques, or should we say technology has already evolved. See this how to article.

Do not use fast wallets which use double hashing as a way to generate key to secure your crypto-assets, your hard earned money!

It is recommended that an iteration count (“r”) of at least 1000 be used, and in practice a count as large as the user’s hardware can tolerate. Even at 1000, trying each guess becomes 1000 times slower: a big deterrent against brute force.

Building resilience to technology advancement

There is a possibility of future-proofing the salting and stretching. A reasonable value for number of iterations is based on how long it takes the current hardware to compute the hashes. To future proof this, the number of iterations must increase, as the wallet detects bigger hash rates.
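One way a wallet could pick the number of iterations is to time a short probe on the current hardware and scale it to a delay the user will accept. The sketch below assumes PBKDF2-HMAC-SHA256 as the key derivation function; the function name and its default values are hypothetical.

```python
import hashlib
import time


def calibrate_iterations(target_seconds: float = 1.0,
                         probe_iterations: int = 20_000,
                         minimum: int = 1000) -> int:
    """Estimate how many PBKDF2 iterations fit in `target_seconds` on this machine."""
    start = time.perf_counter()
    hashlib.pbkdf2_hmac("sha256", b"probe passphrase", b"probe salt",
                        probe_iterations)
    elapsed = max(time.perf_counter() - start, 1e-9)  # guard against a zero reading
    estimated = int(probe_iterations * target_seconds / elapsed)
    return max(minimum, estimated)


print("iterations for about one second:", calibrate_iterations())
```

Run on faster hardware, the same probe returns a larger count, which is exactly the behaviour wanted here.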

One of the possibilities is to store the number of iterations as an integer along with the salt. Whenever the wallet detects that it is increasingly becoming faster to generate the key, the wallet must increase this value and re-write the encrypted data. The network’s hash rate could be one of the indicators. As the hash rate increases by a certain magnitude, the wallet must decide to up the difficulty.
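A sketch of what storing the iteration count next to the salt might look like, together with a re-keying step. Everything here is illustrative: the header layout and function names are invented, and the encrypt and decrypt arguments stand for whatever authenticated cipher the wallet already uses, which is outside the scope of this example.

```python
import hashlib
import secrets


def derive_key(passphrase: str, salt: bytes, iterations: int) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", passphrase.encode("utf-8"),
                               salt, iterations, dklen=32)


def new_header(iterations: int) -> dict:
    """Metadata stored in the clear next to the encrypted private keys."""
    return {"salt": secrets.token_bytes(64).hex(), "iterations": iterations}


def rekey(header: dict, passphrase: str, ciphertext: bytes,
          new_iterations: int, decrypt, encrypt) -> tuple[dict, bytes]:
    """Re-encrypt the store under a key derived with a higher iteration count.

    `decrypt` and `encrypt` are caller-supplied functions taking (key, data).
    """
    if new_iterations <= header["iterations"]:
        return header, ciphertext  # never lower the difficulty
    old_key = derive_key(passphrase, bytes.fromhex(header["salt"]),
                         header["iterations"])
    plaintext = decrypt(old_key, ciphertext)
    upgraded = new_header(new_iterations)  # fresh salt for the new key
    new_key = derive_key(passphrase, bytes.fromhex(upgraded["salt"]),
                         new_iterations)
    return upgraded, encrypt(new_key, plaintext)
```

The user's passphrase is needed for this step, so a wallet would naturally perform it right after a successful unlock.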

The flip side of this approach is that the user could experience a slower response time. However, it is in user’s benefit to accept the slower response time, or upgrade the hardware.

Remember the generated key is only encrypting the data stored on the storage. The private keys for addresses of bitcoins / ether will not be altered. In time, this data must be re-encrypted with a newer more resilient key.

Data written on disks is stored in blocks. These are not the same as the blocks in a blockchain; disks simply happen to be divided into sectors and blocks.

While the newly encrypted data is being rewritten, care must be taken to overwrite the same blocks on the storage or at least destroy the old blocks by overwriting garbage on it.

Side note: Cryptographers believe that destroying secrets is as important as storing secrets securely, and rightly so. Deleting a file does not necessarily delete the contents on the storage; it merely removes the links to it (see the unlink system call).

This leaves the space on the storage available for the OS to write a new file. However, if the space is not consumed, your secrets are still there for hackers to read!
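A minimal sketch of overwriting a file with garbage before unlinking it. The function name is hypothetical. Note the assumption it rests on: the file system must write the new bytes over the same blocks. On SSDs with wear levelling, and on journaling or copy-on-write file systems, this is not guaranteed, so treat it as an illustration of the idea rather than a dependable erase.

```python
import os
import secrets


def overwrite_and_delete(path: str) -> None:
    """Overwrite a file's contents with random bytes, then remove it."""
    length = os.path.getsize(path)
    with open(path, "r+b") as f:  # open in place, without truncating
        f.write(secrets.token_bytes(length))
        f.flush()
        os.fsync(f.fileno())      # ask the OS to push the bytes to the device
    os.remove(path)               # only now drop the link to the file
```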

Conclusion

Beware of using wallets without understanding how the wallets are securing secrets. If you have a good understanding of how secrets can be stored securely, then you have a better chance of choosing the right wallets. Finally, as they say in Cryptography, do not trust anybody especially yourself (more so, if you are an application developer) on designing secure systems, hence, do not design your own wallet!
