Lately I have heard a bit about how length is more important than entropy in selecting passwords. The logic behind this is that an easy-to-remember, long, low-entropy password is still almost impossible to brute-force. I did a bit of research on password entropy and found it implied by several sources that a password with 40 bits of entropy will produce a hash with 40 bits of entropy if it is unsalted. This is the only source I could find where it was stated as fact. Is this correct? Thank you.
I don't know whether it's correct but from what I understand:
- a password's keyspace is determined by the size of the character set and the length of the password
- a password with low entropy may increase the chances of a guessing attack succeeding
- a longer password is harder to brute-force but may not be if the character set is small and known to the attacker
So when I generate passwords I use at least one character from each of the sets of upper case letters, lower case letters, digits and symbols. I ensure my password length is >= 10, and I don't reuse passwords, or the same format, for anything important.
Having a long, low-entropy password may take you outside the range of most rainbow tables, but a smart attacker may still be able to come up with an algorithm that has a better than average probability of guessing your password, especially if you reuse the format.
I don't understand this very well so I may not have been clear. On the wiki for password cracking the following is stated:
A user-selected eight-character password with numbers, mixed case, and symbols, reaches an estimated 30-bit strength, according to NIST. 2^30 is only one billion permutations and would take an average of 16 minutes to crack.
There are definitely more than a billion permutations with the above stated character set but if an attacker is able to determine from an obtained hash that the password only has 30 bits of entropy, he only needs to try the billion possible combinations to get the password. The following gentleman stated in the comments:
Hashing your password doesn’t add entropy to the password. If your password is 80-bits worth of entropy, and you hash it with SHA1, it’s still only worth 80-bits of entropy. If you’re talking about breaking the salted hash in /etc/shadow directly, then you’re breaking a message with 160-bits of entropy. BUT, you’re attacking the hash, not the password. This is why password cracking utilities, such as John the Ripper, use dictionaries to get to the password. It’s much easier to search a total space of 80-bits than 160. Take a password, hash it, and compare it to the stored hash. You’re attacking the password, not its hash.
Also, I made very clear in the post how long it takes to break 72-bits worth of entropy, as well as 64 and 56 given a sufficient attack. Entropy matters a great deal. There are more algorithms than brute force for getting at a password, you can count on that. So, take the post for what you will, but if your password doesn’t have a sufficient amount of entropy, don’t come crying to me when your account gets compromised
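The quoted point, that an attacker searches the password space rather than the hash space, can be illustrated with a minimal Python sketch. This is not from the thread: it assumes an unsalted SHA-1 hash and a deliberately tiny password space (three lower case letters), and the helper names sha1_hex and crack are made up for the illustration.

```python
import hashlib
from itertools import product
from string import ascii_lowercase


def sha1_hex(password: str) -> str:
    """Return the unsalted SHA-1 digest of a password as a hex string."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest()


def crack(target_hex: str, alphabet: str, length: int):
    """Hash every candidate of the given length and compare to the target.

    The work is bounded by len(alphabet) ** length guesses, however many
    bits the hash output itself has (160 for SHA-1).
    Returns (password, guesses_used), or (None, guesses_used) on failure.
    """
    attempts = 0
    for chars in product(alphabet, repeat=length):
        attempts += 1
        guess = "".join(chars)
        if sha1_hex(guess) == target_hex:
            return guess, attempts
    return None, attempts


if __name__ == "__main__":
    stored = sha1_hex("dog")  # illustrative password, 26**3 = 17576 candidates
    found, attempts = crack(stored, ascii_lowercase, 3)
    print(found, "recovered after", attempts, "guesses")
```

A salt stored next to the hash would stop precomputed tables from being reused, but for a single target it would not change the number of candidates this loop has to try.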
If the above stated is true, then the individuals claiming a long password with low entropy are safe because they are still difficult to brute-force are giving horrible advice. But I do not know if what I have read is true.
Last edited by amishtechie; 06-08-2011 at 10:13 AM.
There are definitely more than a billion permutations with the above stated character set but if an attacker is able to determine from an obtained hash that the password only has 30 bits of entropy, he only needs to try the billion possible combinations to get the password
.. the attacker still won't know what characters are in the set though.
then the individuals claiming a long password with low entropy are safe because they are still difficult to brute-force are giving horrible advice
They are still difficult to brute-force due to the length, but brute-force isn't the only method of attacking passwords.
where you can think of log(b) as the entropy per character and n as the length of a string, and then n log(b) is the total entropy of the string. (In bits, if you use logarithms base two.) The total entropy (of passwords of a given length generated in a specified way) should be related to the memory, time and effort required by an attacker to crack these passwords, but neither Johnson nor Taponce appears to have thought about what these relations might be. A second problem is that when b is the number of characters in the alphabet used to construct strings (passwords), log(b) can be identified with the entropy per character only in very special cases.
Taponce is correct that the paragraph he quotes from Johnson's essay is dangerously misleading, but his table is also seriously misleading! To see why, search for and study the original paper by Shannon introducing the entropy per character, which you can find on-line. Shannon explains very clearly (by way of amusing examples) a simple but effective model for the statistical character of "text" (such as passwords chosen in billions of runs of a specific password generating algorithm), how to define the entropy per character for such models, and why log(b) is almost always an overestimate of the true entropy per character. This is a crucial point because it means that Taponce's table suggests minimal password lengths which are much too small if you use (for example) a password generating algorithm which yields only "pronounceable" passwords.
A third possible misconception: entropy (as defined by Shannon) is not a property of a single string, but of "typical" strings in some space of alternative strings together with a probability distribution on that space.
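To make this concrete, here is a small Python sketch of Shannon's formula H = -sum(p log2 p) applied to two hypothetical password generators over the same set of 1024 possible passwords. The generators and their probabilities are invented for illustration; only the formula is Shannon's.

```python
import math


def shannon_entropy_bits(probabilities):
    """H = -sum(p * log2(p)), taken over outcomes with p > 0."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


# Hypothetical generator A: each of 1024 passwords is equally likely.
uniform = [1 / 1024] * 1024

# Hypothetical generator B: the same 1024 passwords, but one favourite
# is produced half the time and the other 1023 share the remaining half.
skewed = [0.5] + [0.5 / 1023] * 1023

print(shannon_entropy_bits(uniform))  # log2(1024) = 10 bits
print(shannon_entropy_bits(skewed))   # roughly 6 bits
```

Both generators can emit exactly the same strings, yet the second has far less entropy, which is why the number belongs to the method of choosing and not to any one password.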
The discussion of cryptographic hashes reveals further misconceptions, I think, but first things first.
Entropy is an important concept which is relevant almost everywhere in any discussion of security or anonymity, so it is important to understand how it is defined and what it means. One remarkable aspect of Shannon's paper is that he succeeded not only in defining a quantity which has turned out to be extremely useful in an amazingly broad variety of applications, but also in explaining what this quantity means, intuitively.
Originally Posted by some contributor to a Wikipedia article
2^30 is only one billion permutations
I think the author meant only that 2^30 is approximately 10^9, which is true. 26^8 is approximately 208 x 10^9; if you choose an 8 character password (using only lower case Roman letters), you obtain 37.6 bits worth of (total) entropy if the probability of choosing any one such password is 26^-8. Most likely, this will not hold true, so you will obtain fewer bits of entropy than you had hoped. If you use an algorithm which generates pronounceable passwords, the entropy might well be closer to 20 bits than 37 bits. Remember, the total entropy refers not to a single password but to the "typical" ones in the space of all 26^8 possible choices, given some method of choosing passwords.
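The arithmetic in this paragraph can be checked with a few lines of Python. The function name uniform_entropy_bits is an illustrative choice, and it only applies under the stated assumption that every string is equally likely.

```python
import math


def uniform_entropy_bits(alphabet_size: int, length: int) -> float:
    """Total entropy in bits when all alphabet_size ** length strings
    are equally likely: n * log2(b)."""
    return length * math.log2(alphabet_size)


print(2 ** 30)                                # 1073741824, about 1.07 x 10^9
print(26 ** 8)                                # 208827064576, about 208.8 x 10^9
print(round(uniform_entropy_bits(26, 8), 1))  # 37.6
```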
Taponce's point was that if you want to increase the total entropy in some scheme of choosing passwords, all other things being equal, you get more benefit from increasing n than b. If you want to choose pronounceable passwords, it may be better to restrict yourself to an alphabetic character set (b = 52) and to increase n until you are above some threshold number of bits, bearing in mind that the entropy per character of natural language is typically about half that of completely random text. For this reason, a long passphrase which does not resemble a quotation from any published work may be preferable to a short password constructed using a large alphabet. Unfortunately, many password storage algorithms simply truncate long passwords at eight characters, so be careful.
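As a rough sketch of that trade-off, the following Python computes how many characters are needed to reach a target number of bits for a given entropy per character. The 64-bit target is a hypothetical threshold picked only for the example, and the "half of random" figure for language-like text is the rule of thumb from the paragraph above, not a measured value.

```python
import math


def min_length(target_bits: float, bits_per_char: float) -> int:
    """Smallest n such that n * bits_per_char >= target_bits."""
    return math.ceil(target_bits / bits_per_char)


TARGET_BITS = 64  # hypothetical threshold, chosen for illustration

schemes = [
    ("random, 94 printable ASCII characters", math.log2(94)),
    ("random, 52 letters", math.log2(52)),
    ("language-like, 52 letters (half of random)", math.log2(52) / 2),
]

for label, bits_per_char in schemes:
    n = min_length(TARGET_BITS, bits_per_char)
    print(f"{label}: {bits_per_char:.2f} bits/char, length >= {n}")
```

Under these assumptions the three schemes need 10, 12 and 23 characters respectively, so a passphrase has to be considerably longer than a random string to offer the same total entropy.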
If the attacker knows that you used the password "entropy" at another website, he knows that you are choosing passwords nonalgorithmically, and that you are likely to simply reuse the same password with minor changes ("entropy2"); this would reduce the entropy to a handful of bits. So the attacker's (partial) knowledge of the method you use to choose passwords may also play a role. In cryptography it is common to assume that the attacker has complete knowledge of the method. It is not assumed that he knows the recent output of /dev/urandom on your PC; if he does, all hope is lost.
I think the most important thing in choosing a password is not to use dictionary words; these are easily cracked. I think the entropy of the RNG / seed is far more important than the password entropy.
Absolutely. One other thing to avoid is l33tifying your passwords. I've recently seen some results from an SSH brute-force scanner and many of the passwords that are being tried are simply l33t v3rsions of dictionary words. I suspect the same sort of words are used in rainbow tables as well.
Here is some more general advice on choosing passwords for website accounts.
Passwords alone never suffice to provide strong protection against unauthorized intrusions, but they do play an important role in the first line of defense. This thread is only about improving security of passwords against threats such as dictionary word and rainbow table cracking attacks, and doesn't consider other kinds of threats.
(I neglect some well known vulnerabilities which enable attackers to circumvent password cracking entirely. These can be easily fixed, but not by the website user, so it makes sense to focus on some things the user can do to improve resistance to some attacks.)
Of the 32 million passwords exposed in the Rockyou breach, which occurred sometime before 4 December 2009, the most common were
Consider using a password safe to store website passwords; if possible these should
- be as long as each website allows
- use alphanumeric characters, punctuation, brackets, and other symbols
- resemble as closely as possible "truly random" strings
Warning! Pseudorandom number generators produce nonrandom output which is predictable if an attacker can guess the "seed", and even if he cannot, their output is subtly non-random. Much safer is "random" string production using "truly random" bits gathered from arbitrary and unpredictable system events. For example, the gpg command quoted above works by coaxing your system into "gathering entropy" from unpredictable "random" events such as keystroke or disk access timings, mouse motions, etc.
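For comparison, here is one way to draw a random string from the operating system's entropy-backed generator in Python. This is a different technique from the gpg command the post refers to: it uses the standard library secrets module, which reads from the OS random source (the same source as os.urandom). The function name random_password and the default length of 20 are illustrative assumptions.

```python
import secrets
import string

# 52 letters + 10 digits + 32 punctuation marks = 94 symbols
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def random_password(length: int = 20, alphabet: str = ALPHABET) -> str:
    """Pick each character independently and uniformly from the alphabet,
    using the OS-provided random source rather than the 'random' module."""
    return "".join(secrets.choice(alphabet) for _ in range(length))


if __name__ == "__main__":
    print(random_password())
```

The general-purpose random module is seeded and predictable by design, so it is the kind of generator the warning above is about and should not be used for this.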
If you have some experimental data, you can try to estimate (with some bias) the entropies of competing password production processes. Try running this shell script which yields 50-character strings produced by three methods:
If you want to try to verify your intuition about which of these methods produces the greatest entropy per character, simply counting observed frequencies (see Shannon's paper) turns out to be an inefficient method of entropy estimation; you can find better methods by searching. It should be self-evident that the third method produces significantly smaller entropy per character; this is because pwgen attempts to produce pronounceable passwords.
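The simple frequency-counting estimate mentioned above can be sketched in Python as follows. This is only the naive "plug-in" estimator, and the function name is made up for the example. It treats characters as independent, so it ignores the dependence between neighbouring characters that makes pronounceable passwords more predictable, and with small samples it tends to underestimate the entropy of the single-character distribution.

```python
import math
from collections import Counter


def plugin_entropy_per_char(samples):
    """Estimate bits per character from observed character frequencies.

    samples: an iterable of generated strings from one method.
    Returns -sum(f * log2(f)) over the observed relative frequencies f.
    """
    counts = Counter("".join(samples))
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    # Two symbols used equally often give 1 bit per character.
    print(plugin_entropy_per_char(["abab", "baba"]))
```

Feeding it many strings from each generator gives a first comparison, but the better estimators alluded to above are needed before drawing conclusions.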
I stress again that simply maximizing total entropy within given constraints does not offer protection against some common attacks, much less all possible attacks. But all other things being equal, it makes sense to try to close as many doors as possible, irrespective of how plausible someone thinks a conceivable attack might be. For example, if possible, it is probably not a bad idea to try to ensure that pwgen and other generators are not writing your passwords in cleartext to temporary files, even if they will eventually be "deleted". (The gpg method is probably safer in this respect than most other "random string" generating methods.)
Depending upon how you intend to store and use your passwords, and what kinds of attacks you consider most likely, different methods will be preferable. There is much well-intentioned but partially contradictory advice out there, such as