Password Storage with Hash Functions
Websites store user information in a database. The database holds usernames, passwords, and other important information so that everything is as the user left it the next time they log in. However, this poses a security risk. The database has to store each user’s password so that a login attempt can be checked against the registered password. If the database were compromised, whether by outsiders or insiders (it happens quite frequently), every user’s password would be revealed. This is where password hashing comes in.
Instead of storing passwords in plaintext, the server puts them through a one-way hash function before storing them in the database, and puts the submitted password through the same function whenever the user tries to log in.
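The flow described above can be sketched in a few lines of Python. This is a minimal illustration rather than a production design: the in-memory dictionary standing in for the database, the function names, and the example user are all assumptions made for the sketch, and MD5 appears only because it is the hash function this article uses in its examples.

```python
import hashlib
import hmac

# Hypothetical in-memory stand-in for the user database: username -> stored hash.
database = {}

def hash_password(password: str) -> str:
    """Return the hex MD5 digest of a password."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()

def register(username: str, password: str) -> None:
    """Store only the hash of the password, never the plaintext."""
    database[username] = hash_password(password)

def login(username: str, password: str) -> bool:
    """Hash the submitted password and compare it with the stored hash."""
    stored = database.get(username)
    if stored is None:
        return False
    # compare_digest compares the two hex strings in constant time.
    return hmac.compare_digest(stored, hash_password(password))

register("alice", "correct horse")
print(login("alice", "correct horse"))  # same input, same hash
print(login("alice", "wrong horse"))    # different input, different hash
```

The point of the sketch is that the plaintext password never reaches the stored data: the database only ever holds the output of the hash function.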
Let’s say Bob registers with the password “hunter2”, and the server uses the MD5 hash function to store passwords in its database. The MD5 hash of “hunter2”, which is “2ab96390c7dbe3439de74d0c9b0b1767”, is stored in the database. When Bob logs in again, he types “hunter2” into the password field. To check whether he typed the password he registered with, the server puts the submitted value through the MD5 function again, yielding the same hash string and letting him log in successfully. If Bob had typed “hunter1”, the result of the MD5 function would be “726ad07bc398372b56a52e3de8693679”. Note that this is vastly different from the hash of the correct password, even though the two passwords differ by a single character. This is due to what is called the avalanche effect, where even small differences in the input result in vastly different outputs.
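One way to see the avalanche effect for yourself is to count how many of the 128 output bits differ between the two MD5 digests. The helper names below are made up for this sketch; only Python’s standard hashlib module is used.

```python
import hashlib

def md5_as_int(text: str) -> int:
    """Return the 128-bit MD5 digest of text as an integer."""
    digest = hashlib.md5(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")

def differing_bits(a: str, b: str) -> int:
    """Count the bit positions where the MD5 digests of a and b differ."""
    return bin(md5_as_int(a) ^ md5_as_int(b)).count("1")

# Print both digests, then the number of bits that changed.
print(hashlib.md5(b"hunter2").hexdigest())
print(hashlib.md5(b"hunter1").hexdigest())
print(differing_bits("hunter2", "hunter1"), "of 128 bits differ")
```

For a well-behaved hash function, changing one character of the input should flip about half of the output bits on average, which is why the two strings look unrelated.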
The hash of the password is now stored in the database, but no one with access to the database could simply log in as Bob, because they don’t actually have his password, just its hash. Hash functions are designed to be (hopefully) very difficult to reverse: the only options are a brute-force search for the original string, or finding a collision, that is, a different input that produces the same hash.
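A brute-force search of the kind mentioned above can be sketched as follows. The function names, the lowercase-only alphabet, and the three-character target are assumptions chosen so that the example finishes quickly; they are not taken from any real attack tool.

```python
import hashlib
import itertools
import string
from typing import Optional

def md5_hex(text: str) -> str:
    """Return the hex MD5 digest of text."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()

def brute_force(target_hash: str, alphabet: str, max_length: int) -> Optional[str]:
    """Try every string over alphabet up to max_length characters.

    Returns the first candidate whose hash matches target_hash,
    or None if the search space is exhausted without a match.
    """
    for length in range(1, max_length + 1):
        for chars in itertools.product(alphabet, repeat=length):
            candidate = "".join(chars)
            if md5_hex(candidate) == target_hash:
                return candidate
    return None

# A deliberately tiny search: lowercase letters, at most 3 characters.
target = md5_hex("cab")
print(brute_force(target, string.ascii_lowercase, 3))
```

The search space grows exponentially with password length. A seven-character password drawn from lowercase letters and digits, such as “hunter2”, already has 36^7 (roughly 78 billion) candidates of that length alone, so longer passwords and larger alphabets make this kind of search far more expensive.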