🇬🇧🇺🇸 My favorite things in crypto(graphy)

Remember when crypto used to stand for cryptography and not cryptocurrencies? I do! I was scrolling around and thought: you don't read a lot of cryptography stuff anymore, or security protocols. Or maybe I am in the wrong social media bubbles, and I don't get that content pushed to my face (although I would like to).

Cryptography was one of my all-time favorite topics in faculty. It was fascinating to me to jumble bits around and get security out of that. That and also using math to secure messages, hide information from bad actors, or just assert the validity of some information, so that you are sure it has not been tampered with.

Let's discuss some high level concepts of cryptography, maybe it will spark somebody else's interest in the field

Hashes

Hashes are interesting. Hashing is a process where you take a piece of information, you take it through a specific algorithm which after a lot of repetitive well thought steps, you get a constant size output. The constant size output "hides" the original information completely, and there's no way "mathematically" to get to the original information starting from a hash.

Some examples of popular hashing algorithms are MD5, SHA-1 and SHA-256. For a hash algorithm, the output size is always the same, while the input can have any size. Due to this, the input domain is theoretically infinite, while the output domain is finite (2 to the power of whatever the output length is in bits). Due to this, for each possible output, you have multiple possible inputs that when hashes, they produce the same hash. They are called collisions.

Out in the wild, passwords should never be stored in plain text, or in any recoverable form, not even encrypted. Instead, their hash must be stored, so that in the case of a potential data leak, there would be no way for somebody to obtain the original passwords from the leaked data.

But even if we store it in a hashed form, there is still the human factor that we need to account for: people tend to use easy passwords, especially non-technical people. At best, they use some password managers, but let's be honest, 99% of the people use 1-2 common words plus some digits in the end. That's just human nature. Due to that, due to the way hashes work, the same "easy" password used by two accounts will have stored the same hash. So an attacker, can pre-build a list of the most common passwords, hash them, and then just do a lookup for the hashed value to discover the accounts in the data leak that use weak passwords.

To prevent this, the current standard for password storage is using a salted hashing algorith. It works like a regular hash, the only difference being that it adds a random value to the actual password, then hashes it thousands of times (rounds), and then in the end, a composite value is stored in the format alg$salt$rounds$hash (for example it might be something like md5$ef121dd$1000$1123affe35231ffaee1331313. The value isn't real, I put something together to portray the idea. And in the real world, the amount of "rounds" used for hashing is in the tens (if not hundreds) of thousands, so that for a possible attacker, it would take a lot of computing power to brute force it, or generate enough weak password hashes for their attack to be successful (due to the salt, they need to compute all possible salts and weak password combination).

Public key encryption

Public key encryption is a method for "locking" some information, usng some publicly available information, so that only the parties that have some secret information derived from the public information can access it. Basically, there are two keys: one public and one private, that are created at the same time, and starting from the public key, you can't get to the secret key using reasonable computing power and time.

The most common public key encryption algorithm is RSA, which uses multiple mathematical parameters, and the keys are formed from these parameters. The magic behind it is that in finite set algebra, some operations are not inversible (you can do an operation in a way, but trying to do the reverse is next to impossible, and the most efficient algorithm to do it is basically brute forcing your way to the answer). For example ab can be easily computed, but a-b can't. Cool math stuff, I won't get into details.

Some concrete use cases for this kind of encryption is in the SSL/TLS protocol, where the web server exposes their public key as the certificate, the clients (web browsers) then use that certificate to encrypt the traffic, and then the web server, having the secret key, can decrypt and read the traffic. So an eavesdropper that intercepts the web traffic won't be able to understand anything, because they don't have the secret key, and they can't obtain it through mathematical means.

Another cool use case is in the SSH protocol, where the process is basically the same, only that the purpose is to obtain a secure shell (Secure SHell, got it?) to the server, and then execute arbitarily commands. You can tell why security is an important factor when you're doing this, since you, as the client, gain complete access and control over the server.

The third interesting real use case are digital signatures. The difference here is that the private key is used to encrypt the message (called the signature), and then the public key is posted and attached to the identity of the signer, so that anyone can use it and validate that 1. the message has not been tampered with and 2. the message has been "validated" by the party whose public key belongs to.

Zero knowledge proofs

This is a bit more advanced theoretical topic which has a lot of math at base, but it being my research topic for the faculty diploma, I got the opportunity to dig into it a bit, and I can say that it is fascinating.

Long story short, the problem you address with zero knowledge proofs are how can a party A demonstrate to another party B that they know a certain piece of information, without offering ANY information about it. For example, a password. How do you offer proof that you know a certain password, without offering them any information. In some predefined exchange of messages, offering them some pieces of information but without offering them any piece that could lead to them "guessing" the original information.

Let me give you an example. You have a password and the server (the other party) needs to get convinced that indeed you know that password. In the usual flow of things, in a regular authentication flow, you send the password to the server, they hash it and compare it to the hash they have stored (assuming they don't do the ultimate sin of storing the password in plaintext). But by doing this, you are literally giving the whole information (the password to the other party).

There are multiple protocols of how zero knowledge proofs, but the overall idea is the same: you send some initial data, the other party computes some challenges using that data (to make sure they don't use some carefully crafted challenges to obtain information). The challenge is some mathematical problem that can be solved only by knowing the original information. The two parties do this exchange multiple times, and with each successful repetition, the confidence that indeed, the information is truly known by the first party increases. If any challenge is failed, the party rejects any further attempt and concludes that the information is not known by the first party.

The math behind it is based on finite set algebra or graph theory, so I won't go into the details.

Conclusion

I really hate how the term "crypto" got hijacked from something cool and useful to something speculative and of no value. I wanted to (maybe) spark the interest of some regarding the cool subject of cryptography, and give some high level information about its coolest areas. But these are only the tip of the (theoretical) iceberg, and there are many more theoretical areas I didn't cover, and when they are put into practice, a lot of cool security protocols come up (like how TLS works, Kerberos, etc).