Introduction

A hacker recently released 32 million passwords obtained through an attack on a "legacy" server at rockyou.com. The folks at Imperva analyzed this password list and came to the usual conclusions and offered the usual well-meaning advice.

I think most of their conclusions are wrong or irrelevant. They are traditional password complaints, and I have made some of these recommendations myself, but the threats have changed, and people haven't, and it is time to fix this.

The size of the data spill is unprecedented. It reminds me of password sniffing successes discovered on ISP monitoring computers in 1996. At that time, traffic loads were light enough that a Sun workstation positioned on a network backbone could sniff all the cleartext passwords that passed it. A number of these maintenance machines were hacked and sniffers installed. The malware was discovered when disk partitions mysteriously filled up. Investigation revealed millions of captured passwords. (It is no surprise that attacks on .gov and .mil sites increased at that time.) These passwords were not published, but they were analyzed.

Research over three decades has shown that people do not pick passwords that are resistant to dictionary attacks, and they don't like being forced to do so.

The Imperva folks used NASA password advice to analyze the data. The analysis is fine, but NASA's rules are not. Let's look at NASA's list:

  1. It should contain at least eight characters.
  2. It should contain a mix of four different types of characters
  3. It should not be a name, a slang word, or any word in the dictionary [etc., elided]

I call these eye-of-newt password rules, because the recipes are reminiscent of magical potions. The rules vary per authentication service, which means we have to create different passwords in different ways for different sites. The sites don't usually list their rules for those logging in, so users have to consult notes.

Imperva adds their own advice for users echoing security maven Bruce Schneier:

  1. Choose a strong password for sites you care for the privacy of the information you store. Bruce Schneir's [sic] advice is useful: "take a sentence and turn it into a password. Something like "This little piggy went to market" might become "tlpWENT2m". That nine-character password won't be in anyone's dictionary."
  2. Use a different password for all sites—even for the ones where privacy isn't an issue. To help remember the passwords, again, following Bruce Schneier's advice is recommended: "If you can't remember your passwords, write them down and put the paper in your wallet. But just write the sentence—or better yet—a hint that will help you remember your sentence."
  3. Never trust a 3rd party with your important passwords (webmail, banking, medical etc.)

Rule 3 is generally good advice, but it depends on the third party. Spouses who share bill-paying may need to share the banking password or other authentication token. But it is a bad idea to give one web service access to another on your behalf: you are adding potentially weak links to the chain.

Bruce suggests in rule 2 that it is okay to write down passwords, or better, write a reminder about them, and I agree. Previous admonitions against writing down passwords contemplated local attacks—people reading your Post-it notes on your terminal in the office for example. Most attacks come from distant malefactors, and they will never see your terminal. But do beware of leaks to family members, like curious teenagers or a divorcing spouse. Of course, it would be better if the password were memorable, or generated by a token.

Item one has a reasonable suggestion for creating a password resistant to dictionary attacks. But the result is not easy to type, not easy to associate with a particular service, and may not contain enough types of characters (see NASA rule 2) to be accepted.

But most importantly, dictionary attacks are no longer a common threat to most Internet users. Passwords are usually obtained by keyboard sniffing software or phishing websites. Under these threat models, all the eye-of-newt rules are what many users suspected: annoying, tedious bureaucratic rules that don't actually help security!

Consider: it didn't matter what kind of password the 32 million people chose: the bad guys got all of them, strong and weak, for reasons beyond the users' control. It was the site administrators who screwed up and succumbed to an old and well-known SQL injection attack.

(The folks at RockYou have explained the measures they have taken since the attack at http://www.rockyou.com/help/securityMessage.php. It is interesting to note the URL of this security notice: the message was generated by PHP, a popular and easy language to use. According to my sources, PHP has fundamental security problems, and is easily misconfigured. A large percentage of the break-ins to Linux and *BSD servers are through PHP.)

Let's look at Imperva's further advice to administrators:

  1. Enforce strong password policy—if you give the users a choice, it is very likely that they would choose weak passwords.

Users will choose weak passwords, but trying to get them to do otherwise is simply poor engineering. It doesn't work very well, is inconvenient as hell, and doesn't stop the Bad Guys.

  2. Make sure passwords are not transmitted in clear text. Always use HTTPS on login.

This is fine advice. There is certainly no point in leaking cleartext passwords: we have SSL and ssh. This doesn't address shoulder-surfing, keyboard sniffing, and phishing, but it does raise the bar. Most sites have been getting this right for a long time.

  3. Make sure passwords are not kept in clear text. Always digest password before storing to DB.

It's not a bad idea to digest passwords for database storage. This means that theft of the database comes with the added need for dictionary attacks on the individual passwords. This was probably not a bad idea for Unix passwords in the late 1970s, but it was proved less than useful a long time ago. A much more useful alternative rule: Make your authentication server highly resistant to attacks. If you can't or won't, use the server of a third party who will. Authentication service is not a job for amateurs.
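For sites that do digest passwords before storage, the idea can be sketched as follows. This is a minimal illustration, not a description of what RockYou or Imperva actually run: it assumes Python's standard hashlib with PBKDF2, and the function names, salt size, and iteration count are hypothetical choices made for the example. The per-user salt is what forces an attacker to run a separate dictionary attack on each stolen entry.

```python
import hashlib
import hmac
import os

# Assumption for illustration: the iteration count is a tuning knob,
# not a recommendation taken from the article.
ITERATIONS = 200_000


def make_record(password: str) -> tuple:
    """Return (salt, digest) to store in place of the cleartext password."""
    salt = os.urandom(16)  # fresh random salt for every user
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def check_password(password: str, salt: bytes, digest: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return hmac.compare_digest(candidate, digest)
```

A login handler would call make_record once at enrollment and check_password on each attempt. None of this helps if the password is captured before it reaches the server, which is the larger point here.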

  4. Employ aggressive anti-brute force mechanisms to detect and mitigate brute force attacks on login credentials. Make these attacks too slow for any practical purposes even for shorter passwords. You should actively put obstacles in the way of a brute-force attacker, such as CAPTCHAs, those little reading tests that try to separate computers from humans.

This, finally, is excellent advice. See below for further discussion. However, it is not clear how well the CAPTCHA idea is going to work: there is an arms race there that we seem to be losing.

  5. Employ a password change policy. Trigger the policy either by time or when suspicion for a compromise arises.

Password change policies are over-used, and probably a bad idea in general. Good passphrases are hard to think up and remember, and trying to remember changes is a problem. Of course, it is a good idea to change your password when a breach is suspected.

  6. Allow and encourage passphrases instead of passwords. Although sentences may be longer, they may be easier to remember. With added characters, they become more difficult to break.

This is good. Passphrases are better than the eye-of-newt passwords: they are easier to remember and type, and perhaps harder to shoulder surf. On devices with spelling correction (like the iPhone when not in password mode), they can simplify password entry. Many authentication servers reject perfectly good high entropy passphrases because they fail the eye-of-newt tests.

Solution 1: One-time Passwords

How do we get out of this mess? We know that grandma is not going to pick the high-entropy authentication tokens we want. (And this is true for all values of grandma: my children's grandma wrote disk controller software for the UNIVAC I.)

I see two general solutions. The best is a hardware or software token that provides a one-time password on demand, given a PIN. The RSA time-based token seems to have won the battle. It is available as a small hardware device hung on a keychain, or as an app on a smart phone. There is no password picking, no eye-of-newt, no password changes, and even eavesdropping is probably okay.
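To make the idea of a time-based one-time password concrete, here is a minimal sketch. It is not RSA's algorithm, which I am not describing here; it follows the open HOTP/TOTP scheme (RFC 4226 and RFC 6238) as a stand-in. The function names, the 30-second time step, and the digit count are assumptions for illustration, and the PIN step is left out.

```python
import hashlib
import hmac
import struct
import time


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password (RFC 4226) for a given counter value."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation: low nibble picks 4 bytes
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, now=None, step: int = 30, digits: int = 6) -> str:
    """Time-based variant (RFC 6238): the counter is the current time step."""
    if now is None:
        now = time.time()
    return hotp(secret, int(now) // step, digits)
```

The token and the server share the secret and a clock, so both can compute the same short-lived code. An eavesdropper who captures one code gains little, because it expires with the time step.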

The problem is that you have to carry these around, and maybe carry many of them. The software solution isn't bad: most iPhone users always have the device with them. I am not sure if it handles multiple authenticators, but it certainly could. But many of us may have a dozen or more sites that require logins.

This would seem to be the perfect place for a trusted authentication service as a business. Why shouldn't sites like Facebook or RockYou trust some well-respected company offering this service? There might be a few competing services, like MasterCard, Visa, AT&T, and RSA. One would have to have tokens for each.

This has been tried, but it hasn't caught on. It seems to me that it is time to try again. Perhaps previous solutions were too expensive. I would suggest that the service be offered to small sites (fewer than 30 authentications a month, say) for free, through a well-documented interface similar to OpenID or OpenAuth.

Solution 2: The Don't Be A Moron Rule.

Since the early 1970s, the US banking system has used four-digit PINs to access ATMs. Eight-digit PINs are available in Europe, along with disparagement of the difference, but the US standard has stood up well.

The four-digit PIN is okay because the user only has a few tries to get it right. If the banks let you choose a PIN, it must not be a moronic one. With a limit on the number of tries, and the account locked after that, grandma can have a nearly hassle-free password. Here is one rule to rule them all:

Don't pick a moronic password: one that a friend or relative can guess in a few tries, or a shoulder surfer can easily spot as you type.

This gets rid of the useless eye-of-newt rules. Password length is not constrained. Dictionary words are okay, even encouraged. You might want to change the password occasionally, but pick another word in a dictionary if you'd like. Pet and spouse names are out. So are "1234", "password", and "facebook". Grandma can understand these rules, and the need for them. Humans get out of the high-entropy game.

The cost here is locked accounts. I suggest that this inconvenience is less likely to happen with simpler passwords. When it does, allow a few further attempts after increasing lengths of time. This scheme has passed the test of time. This should cut support costs, without reducing security.

A password hint is not a bad idea, but it should refer to the one true password, not a secondary password. Mother's-maiden-name passwords seldom pass the no-moron rule.

Counting password attempts and timed backoff require an authentication server, but not necessarily a heavyweight one. On Unix systems, there is a moribund PAM module named pam_tally that counts login attempts and has been around for over a decade. It should be dusted off and used widely.
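The attempt counting and timed backoff described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not pam_tally's implementation: the class name, the three free tries, the one-minute base delay, and the doubling schedule are all hypothetical choices. The caller passes in the current time, so the policy can be examined without a real clock.

```python
from dataclasses import dataclass


@dataclass
class AccountState:
    failures: int = 0          # consecutive failed attempts so far
    locked_until: float = 0.0  # attempts before this time are refused


class LoginThrottle:
    """Count failed logins per account and lock with growing delays."""

    def __init__(self, free_tries: int = 3, base_delay: float = 60.0,
                 max_delay: float = 86400.0) -> None:
        self.free_tries = free_tries
        self.base_delay = base_delay
        self.max_delay = max_delay
        self._accounts = {}  # maps user name to AccountState

    def allowed(self, user: str, now: float) -> bool:
        """True if this account may try a password at time `now`."""
        state = self._accounts.get(user)
        return state is None or now >= state.locked_until

    def record_failure(self, user: str, now: float) -> float:
        """Note a bad password; return the time the lock (if any) ends."""
        state = self._accounts.setdefault(user, AccountState())
        state.failures += 1
        if state.failures >= self.free_tries:
            extra = state.failures - self.free_tries
            delay = min(self.base_delay * 2 ** extra, self.max_delay)
            state.locked_until = now + delay
        return state.locked_until

    def record_success(self, user: str) -> None:
        """A correct password clears the count and any lock."""
        self._accounts.pop(user, None)
```

With these example settings, the third wrong guess locks the account for a minute, the fourth for two minutes, and so on up to a day. A guesser gets only a handful of tries per day, which is why a short, non-moronic password can be enough.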

Conclusion

The prevalence of repeated security problems has become boring. We don't seem to learn lessons taught long ago. Legacy solutions have legacy drawbacks. The solutions I have proposed above are not new, not radical, and not particularly expensive or inconvenient. Single sign-on can reduce the authentication load on the user in many cases.

Experts have shown the ability to run major Internet offerings without major catastrophes. Most of the published data leakages have come from smaller providers who may have had less experience with the security problems involved.

We can leverage solutions to actually make things better. This is how normal engineering works, and it has been hard to find in computer security.

Besides, we have bigger fish to fry. Threats like supply chain attacks are now real. We seem to be slowly losing the virus detection battle. Malicious users quickly analyze and exploit weaknesses derived from software patches. Nation states are involved in offensive operations. How are we going to solve these if we can't get something like authentication right?

Appendix - PINs in the Passwords

Almost 16% of the passwords released were PINs (i.e. all digits). Here's a quick analysis.

The top ten were clearly moronic:

  Rank    Count  PIN
     1  290,729  123456
     2   79,076  12345
     3   76,789  123456789
     4   21,725  1234567
     5   20,553  12345678
     6   13,984  654321
     7   13,272  111111
     8   13,028  000000
     9    9,516  123123
    10    8,676  1234567890

It took some brief thought to figure out some in the top 100. These are much less clear to me, and are probably less moronic than your anniversary or birthday:

  Rank    Count  PIN
    20    4,576  159753
    37    2,491  5201314
    69    1,345  14344
    85    1,076  1435254

The PIN lengths were distributed thus:

  Digits      Count
       1         55
       2        149
       3      1,048
       4     20,661
       5    213,543
       6  2,278,919
      >6  2,678,615

There seems to be little evidence that American banking PINs were used. None of the top 100 PINs had fewer than five digits, though that was clearly allowed.

My least important password appeared more than two dozen times. I could find no others that I recall using in the past twenty years.
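For readers who want to repeat this kind of tally on a password list of their own, here is a minimal sketch. It is not the script I used: the function name and the returned field names are hypothetical, and the input is assumed to be a list with one password per entry, duplicates included.

```python
from collections import Counter


def pin_stats(passwords, top_n=10):
    """Summarize the all-digit passwords (PINs) in a list of passwords."""
    # isascii() keeps out non-ASCII digit characters that isdigit() accepts
    pins = [p for p in passwords if p.isascii() and p.isdigit()]
    share = len(pins) / len(passwords) if passwords else 0.0
    # Bucket by length, lumping everything longer than six digits together
    lengths = Counter(
        str(len(p)) if len(p) <= 6 else ">6" for p in pins
    )
    return {
        "pin_count": len(pins),
        "pin_share": share,                        # fraction of all entries
        "top": Counter(pins).most_common(top_n),   # (PIN, count) pairs
        "lengths": dict(lengths),
    }
```

Calling pin_stats on the full list would give the share of PINs, the most common ones, and the length distribution in a single pass over the data.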