My password generator
How do you create a really good password that you don't need to remember but that you might occasionally need to write on paper or type into a keyboard? These days modern operating system provide really good sources of randomness, and one method that is often used is to read some randomness from the operating system PRNG located at /dev/random and run the data through the base64 encoding to get letters, numbers, + (plus) and / (slash). However, those passwords are not that conveinent and sometimes when I write them down people mistake my zeroes for capital o and things like that.
What I wanted was a password generator that could output a configurable length password using only easily distinguishable letters and numbers, so I wrote one. As usual I place this code in the public domain, feel free to use it any way you want.
Features:
- The entropy of the password is as good as the underlying operating system. If you use a recent Linux or OSX version, the data returned from /dev/random is quite good.
- The code is simple and it is easy to verify that the program actually uses the entropy that it reads.
- The resulting passwords are easy to type on keyboards and write on paper without confusing the reader with similar characters such as 1 (one) and l (lower case l).
- The length of the password is configurable.
#!/usr/bin/python import sys # alphanumeric chars minus l, I, O, 0, 1 alphabet = "abcdefghijkmnopqrstuvwxyzABCDEFGHJKLMNPQRSTUVWXYZ23456789" # Some expeimentation told me that 2 ** 5.8 = 55.7 BITS_PER_CHAR = 5.8 # The default password length has the capacity of a bit more # than 64 bits of entropy. DEFAULT_LEN = 12 def main(args): count = DEFAULT_LEN if len(args) > 1: if args[1] == '-h': usage() return elif args[1] == '-c' and len(args) > 2: count = int(sys.argv[2]) else: usage() return print(create_password(count)) def string_to_bignum(s): num = 0 for c in s: num = ord(c) + (num << 8); return num def create_password(length): random = open("/dev/random", "r") needed_bytes = (int)(length * BITS_PER_CHAR) / 8 + 1 n = string_to_bignum(random.read(needed_bytes)) random.close() s = "" for i in xrange(length): s = s + alphabet[n % len(alphabet)] n = n / len(alphabet) return s def usage(): print """mkpasswd [-h] [-c COUNT] Create a random password using the operating system's entropy pool using a 57 character alphabet of letters and numbers. The characters in the alphabet excludes characters and letters easily confusable such as I and 1. Each password character holds about 5.8 bits of entropy, so the standard 12 character password can hold a theroretical maximum of 69 bits of entropy. The actual entropy present in any generated password is a function of the entropy gathering algortihm present in the kernel of your operating system. -h display this help text -c COUNT create a password with COUNT characters.""" if __name__ == '__main__': main(sys.argv)Filed under Cryptography, Programming | Comment (1)
How to best protect your users’ passwords
It seems like this blog has turned into more and more of a programming blog, and here is yet another step in that direction. Perhaps somewhat boring to many, but hopefully useful to some.
Online services generally use usernames and passwords to identify their users. The simplest way of doing that is to save the information in clear text, perhaps in a database or in a text file. When a user logs in, look up the stored password and compare it with the one supplied by the user. If they match, authentication is successful. A simple solution, but a problematic one. If the username and password information gets in the hands of the wrong people. Since many user the same password over and over again it is likely that the username-email-password combinations can be used to log into other services.
This is a real problem. Writing secure web applications is difficult, and security problems that gives an attacker access to login data can be introduced not only by your own code but also by third party libraries and frameworks.
Because of this it is a good idea to make it somewhat more difficult to use the password information for an attacker using some sort of scrambling method. Many people call such methods password encryption, but encryption implies that its possible to decrypt the information with the right key, and keys can be lost so it is better to make the process one way. The basic idea is to take a plaintext password and change it into something called a hash that can not be reversed back into the plaintext. However, the scrambling process needs to be repeatable so that a plaintext password can be verified to match the password that was once used to create the hash.
To help in this process there is a family of cryptographic functions that called cryptographic hash functions. They work in a way that a variable length input is turned into a fixed length output in a way that it's very difficult to a) find two inputs that generate the same output and b) find the input given a specific output. Assuming that the cryptographic hash function works as advertised, shouldn't this should solve all our password storing problems? If we store a cryptographic hash of the user supplied password using for example the SHA-1 algorithm, an attacker that gets hold of our login data can't run SHA-1 backwards to get the passwords. All he has is a list of hash values, but at the same time we can repeat the SHA-1 function when a user needs to authenticate and compare the hashes of the new and old password. If they match they must be the same.
This would be an efficient method if it were not for one little detail: Users choose bad passwords. Some use password, some use their first name some a1. There are lists available with common passwords, and once an attacker puts a computer to the task of trying out passwords many of them gets broken. Worse yet, since the SHA-1 algorithm is such a common one, there is almost certainly hard drives out there filled with pre-calculated SHA-1 values for all common passords and even the ones for all possible password combinations shorter than for example 8 characters. With such pre-calculated values any short or simple password can be found in seconds.
To solve the second problem, the one with pre-calculated hash values, something called a salt value is used. A salt is a random number that is added to the plaintext password before it is encrypted. If the salt can have say a billion possible values, someone doing pre-calculated hash values need to do a billion times more pre-calculations and have a billion times more storage to store the hashes. The salt value can be stored in an unaltered form together with the hash value, and be used when verifying a password by simply doing the same hash operation with the same salt a second time. Another benefit of adding salt to the password is that if you have a large number of users, the attacker can't reuse each password guess with all the users, since their salt values differ.
The problem with easy to guess passwords is much more difficult to solve. The only thing that helps a bit, besides educating users in methods for choosing better passwords, is to make it more time consuming to do one hash calculation. That way it takes longer to try out millions of common passwords and password combination and guessing right will take longer, hopefully too long to be worth it. To make it take a bit longer to calculate the hash you can simply repeat the cryptographic hash function over and over again. Doing it once on my dual core desktop computer takes less than a millisecond. Doing it a thousand times increases the time spent calculating one hash value to about 200 milliseconds.
So, to take good care of your users' login information you should:
- pay attention to security on your servers. That includes operating system security, backup management as well as avoiding misstakes in your own code.
- encourage them to use good passwords.
- store hash values of the passwords instead of plaintext versions
- use a good cryptographic hash function to hash them
- use a large enough and random enough salt value when hashing
- repeat the hash function until you reach a reasonable tradeoff between efficiency and difficulty of repeating it by an attacker.
Soon, I'll publish some java code I've written that implements recommendations 3-6. But that's for another day, now I need to put this computer to sleep.
Update 090330: Since I wrote this post I have published the code for two implementations of these recommendations, in Java and Ruby.
Filed under Cryptography, Geeky | Comment (1)