Partial passwords is an authentication system where users enter a random 2-3 characters of their password. While it makes random guesses easier, attackers must eavesdrop on more than one logons.
How to implement partial passwords correctly? The problem made us think and eventually realise why online banking systems using authentication with partial passwords ask for two or three letters only and they also limit the maximum length of passwords.
Note: Actually – the above is only true for well-designed systems.
Problem
If a system asks users for two letters, the database of the system will store all possible pairs of letters that appear in a user password – all “obfuscated” (e.g., with a cryptographic hash function). We say obfuscated because brute-force attack on data with about 10 bits of entropy (the number of all possible options being 1,000) is trivial.
Those systems can not ask for more than two letters because more letters would require a lot more database space. A two-letter partial password system (with eight characters’ passwords) requires 56 strings (hash values), while a three-letter system would require 336 strings – easily taking more than 5kB of data for a single password.
It appears that most companies indeed store all possible combination of letters a user may enter in a database. The second option is to store the password unencrypted or encrypted with a symmetric algorithm (e.g. 3DES, AES). Passwords cannot be stored as hash values as the system needs to have access to plaint text passwords to compare the letters entered by users.
Better Solution
We have suggested a solution based on Shamir secret sharing scheme. This saves database space and it is also fast as only a polynomial computation (in the simplest form) is needed for verification.
A. Global Parameters
At the beginning, someone has to define global parameters of the system. Well, there is actually just one – how many letters will users have to select. Let us call the parameter N. The maximum length of password (L) is important only from the database point of view – you will need to store a 32b long number for each character.
B. adding New User
- User chooses his/her password P. It consists of letters p1, p2, p3, … pk.
- The system will generate at least 32 bits’ long secret key – K – unique for each user. 32 bits is enough if N is 3, for larger N (the number of letters users have to correctly select each time) you may want to increase the length of K.
- The system will also generate N-1 32 bits’ long random numbers R1, R2, … R(N-1)
- The next step is a computation of k points (k being the length of the password) on a polynomial: y=K+R1*x + R2*x^2 + … + R(N-1)*x^(N-1), for x = 1, 2, … k. Let us denote the results as y1, y2, …, y(N-1).
- Values s1=(y1-p1), s2=(y2-p2), … sk=(yk-pk) is stored in the database. Each number takes 32 bits. One will also need to store K, or the hash of K.
C. Authentication
The next part of the system is user authentication, which is very simple and fast.
- The system selects N positions in the password – i1, i2, … iN.
- A user selects N letters from her/his password at specified positions so that we have pairs (p’1, i1), (p’2, i2), …, (p’N, iN).
- The system recovers yi values for indices i selected in step 1 – simply just adding stored values (see step 5 above) to values p’i entered by the users.
- Now we have to solve the polynomial equation to obtain K’. The equation for that looks horribly but it is quick to compute and can even be partially pre-computed as it uses indices (positions of letters): K’ = \sum_i [ yi * [ (\PI_j (j) ) / (\PI_j (i-j)) ] ], where i and j run over i1, i2, …, iN (step 1), and j skips the actual selected i.
Example: let’s say that user selected 2nd, 3rd, and 4th letter, the solution will be: K’=y2*( 3*4/[(2-3)*(2-4)] )+y3*( 2*4/[(3-2)*(3-4)] ) + y4*( 2*3/[(4-2)*(4-3)] ) - The last step is to compare K and K’. If they are equal, user entered correct values and is logged in.
One note here – the secret is not K, but values yi reconstructed when user enters his/her letters.
Pros and Cons
The highlights of this solution are:
- The number of letters for authentication and the length of passwords is not really limited.
- The database space that is needed to store all necessary information is linear with the length of a password, not quadratic.
- Secrets are not stored in plain-text and need certain computation.
- All the data can be still encrypted for storage if needed.
- Faster to compute – verification is several multiplications of 32 bit numbers that is much much faster than computing a hash value.
The low points are:
- It is not a straight forward solution.
- The security is still not increased beyond the difficulty of finding the correct K letters.
Improvements
The discussion actually continued and a stranger – Tom – showed that storing the constant K in plaintext allows for an attack. I have initially suggested that one can store this constant in plaintext or a hash value of it. Currently, we have to recommend using the hash value.
For those interested in detail, here is an excerpt from Tom’s email. (It assumes password letters come from a set of 100 characters – hence the magic constant of 100.)
For match (p1p2..pN) I think you can compute the whole password in “constant” time O(100*k), by using the function again for each next unknown password character pN+1..pk, as an equation with 1 variable. E.g. we’ve got a match (p1p2…pN), then to reveal pN+1 we solve the equation for set 1, 2, …, N-1, N+1 with just discovered (y1, y2, …, yN-1), and with an unknown yN+1:
K’ = \sum_i [ yi * [ (\PI_j (j) ) / (\PI_j (i-j)) ] ] would be like
K’ = yN+1 * [ (\PI_j (j) ) / (\PI_j (N+1-j)) ] ] + C
where C is constant (computed).
If K is in plain text, then it is a simple equation with 1 variable and can be solved in O(1). If K is stored as a hash, then iterate through every character again and compare hash(K’) with stored hash(K). Note, that in this case, computing each next character requires O(100) operations, in total of O(100*k) rather than O(100^k).
10^12/16!*10! was suggested as if you brute force N=6 characters with plain K (if using the trick with modulo it is be better to brute force last N chars than first N, due to the lower charset for p16 than p1).