AARON TOPONCE

Linux. GNU. Freedom.


CHECKSUMS IN PASSWORDS? UH, OKAY.

INTRODUCTION

As most of my readers know, I have a rather extensive yet easy-to-use web-based password generator. I've spent a lot of time doing password research (a couple ideas mine, most not), and have implemented most of these into the project. These include, but are not limited to:

* Expansive language support

* Verbal unambiguity

* Visual unambiguity

* Memorability

* Compact density

* Programmatic prediction

* Versatility

* Accommodating complex requirements

* Entertainment

* Checksums

That last idea was just recently committed to the project, and I think it might have some value, albeit with some tight controls and possibly little reward for the cost.

BUBBLE BABBLE

When I started developing my password generator, Tony Arcieri suggested on IRC that I implement Bubble Babble. I think he meant it mostly as a joke, but already being somewhat familiar with its use for SSH keys, I looked into it. Bubble Babble is an encoding specification designed for SSH fingerprints.

The goal is to make the fingerprints pronounceable, such as when comparing host keys on first use. Antti Huima designed the specification, and built in a checksum to detect transmission errors. But for a password generator, I initially skipped the checksum, and just implemented "xVCVC-CVCVC-...-CVCVC-CVCVx", where "C" is a random consonant from the spec and "V" is a random vowel also from the spec.

But then I got thinking: would it really be that big of a deal to implement Bubble Babble's checksum? Would it impact the character count for similar security margins? Would users even notice or care? It seemed obvious that the answer was to try it and see, so try it I did.

For example, here is a Bubble Babble password before I implemented the checksum: "xuduh-taren-rezyd-gefik-bixux", and here's one after implementing the checksum: "xuval-zoder-cykeh-lyvin-lyrax". The former has approximately 78 bits of security, but the check will fail, while the latter has 72 with a valid check. Both are five pronounceable pseudowords.

However, the check isn't easily identifiable, as it's integrated throughout the entire string, and calculating the check is rather involved. One noticeable identifier is whether Bubble Babble is encoding an odd number of bytes or an even number of bytes. If the number of bytes is even, then the format of the string will have a third "x": "xVCVC-CVCVC-...-CVCVC-CVxVx". That doesn't mean that every even-numbered byte-encoded Bubble Babble string that has three "x"s has a valid check, but even numbered bytes with a valid check will have three "x"s.

Regardless, stripping off the checksum isn't really a thing, because it is tightly integrated with the full string.

Okay. Now that it's implemented in Bubble Babble, is there any practical value to it? I could only think of one possible scenario.

A SCENARIO WITH TIGHTLY CONTROLLED AUTHENTICATION

Suppose an organization has a credential management system (CMS) that requires employees to use its built-in password manager and password generator. The password generator generates Bubble Babble passwords with valid checksums. When a new employee is hired and their account is set up, the CMS generates a random Bubble Babble password, hashes it, and stores it on disk for authentication. If at any point the employee wants to change their password, the CMS prevents them from supplying their own password, and they must use the built-in generator.

When staff authenticate, all client-side software checks the password for a valid checksum before sending it to the authentication server. If the check fails, the user has entered their password incorrectly, and must try again. If the check succeeds, the software sends the password to the authentication servers for hashing and verification. The employee could attempt to bypass the client-side check by sending the password to the authentication server directly, but it would be pointless: if verifying the password hash fails at the authentication server, the employee still has to retype their password.
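The client-side check in this scenario can be sketched in Python. This is a minimal illustration, not the code from my generator or from any real CMS: the function names `encode`, `decode`, and `has_valid_check` are made up for this example, and "valid" here simply means "the string is the Bubble Babble encoding of some byte string". The encoder follows the published Bubble Babble algorithm, where a running seed ties each vowel to all of the bytes before it.

```python
VOWELS = "aeiouy"
CONSONANTS = "bcdfghklmnprstvzx"


def _triple(byte: int, seed: int) -> list[str]:
    """Encode one byte as vowel-consonant-vowel, mixing in the running seed."""
    return [
        VOWELS[((byte >> 6) + seed) % 6],
        CONSONANTS[(byte >> 2) & 15],
        VOWELS[((byte & 3) + seed // 6) % 6],
    ]


def encode(data: bytes) -> str:
    """Bubble Babble encode `data`, including the integrated checksum."""
    seed, out = 1, ["x"]
    for i in range(0, len(data) - 1, 2):
        b1, b2 = data[i], data[i + 1]
        out += _triple(b1, seed)
        out += [CONSONANTS[b2 >> 4], "-", CONSONANTS[b2 & 15]]
        seed = (seed * 5 + b1 * 7 + b2) % 36
    if len(data) % 2:
        out += _triple(data[-1], seed)  # odd length: last byte carries the seed
    else:
        out += [VOWELS[seed % 6], "x", VOWELS[seed // 6]]  # even: bare check
    return "".join(out) + "x"


def _untriple(v1: str, c: str, v2: str, seed: int) -> int:
    """Invert _triple; raises ValueError if the letters cannot be a byte."""
    high = (VOWELS.index(v1) - seed) % 6
    mid = CONSONANTS.index(c)
    low = (VOWELS.index(v2) - seed // 6) % 6
    if high > 3 or mid > 15 or low > 3:
        raise ValueError("not a valid Bubble Babble triple")
    return (high << 6) | (mid << 2) | low


def decode(text: str) -> bytes:
    """Recover the bytes; the final check is left to has_valid_check."""
    chars = text.replace("-", "")
    if len(chars) % 5 != 0 or chars[:1] != "x" or chars[-1:] != "x":
        raise ValueError("malformed Bubble Babble string")
    body, seed, out = chars[1:-1], 1, bytearray()
    for i in range(0, len(body) - 3, 5):
        v1, c1, v2, c2, c3 = body[i:i + 5]
        b1 = _untriple(v1, c1, v2, seed)
        hi, lo = CONSONANTS.index(c2), CONSONANTS.index(c3)
        if hi > 15 or lo > 15:
            raise ValueError("'x' is not a data consonant")
        b2 = (hi << 4) | lo
        out += bytes([b1, b2])
        seed = (seed * 5 + b1 * 7 + b2) % 36
    v1, c, v2 = body[-3:]
    if c != "x":  # odd number of bytes: the tail holds one more byte
        out.append(_untriple(v1, c, v2, seed))
    return bytes(out)


def has_valid_check(password: str) -> bool:
    """True only if re-encoding the decoded bytes reproduces the input."""
    try:
        return encode(decode(password)) == password
    except ValueError:
        return False
```

A client would call `has_valid_check(typed_password)` and only contact the authentication server when it returns `True`. Round-tripping through `encode` keeps the sketch short at the cost of doing the work twice, which is negligible next to a password hash.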

Okay, but why? Well, assuming the organization is using a best practice password hashing function with an appropriate cost factor, then authentication is expensive. People frequently mistype passwords and that cost on the server can be mitigated with client-side checksum validation.

But shouldn't users just copy-paste their passwords from the password manager? Absolutely yes they should. The most secure password is the one you don't know. However, there may be scenarios where pasting the password out of the CMS isn't practical, such as hooking up a crash cart to an unresponsive server or logging into your workstation when first getting into the office.

MORE CHECKSUMS

So if there is value here, are there other places where I've implemented a binary-to-text encoding scheme as a password generator that has a formally defined checksum in its specification? Yes, Crockford's Base32 and Bitcoin's BIP39.

In the case of Crockford's Base32, the checksum extends the base-32 character set to 37 characters, and the checksum is calculated modulo 37 against the bytes. It's rather trivial. In the case of Bitcoin's BIP39, the bytes (which must be a multiple of 4 bytes) are hashed with SHA-256, and the leading bits of the digest are appended to the original entropy bytes to make the final bitstring a multiple of 11 bits, which is then converted to words and presented as a mnemonic, the final word being the "check word".
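Both checks are small enough to sketch in Python. This is an illustration under stated assumptions, not the code from my generator: the function names are invented for the example, the Crockford sketch treats the bytes as one big-endian integer (so leading zero bytes are not preserved), and the BIP39 sketch returns 11-bit word indices instead of words so that it does not need the word list.

```python
import hashlib

CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"
CHECK_SYMBOLS = CROCKFORD + "*~$=U"  # 37 symbols; the last 5 appear only as checks


def crockford_with_check(data: bytes) -> str:
    """Base-32 digits of the value, followed by one check symbol (value mod 37)."""
    number = int.from_bytes(data, "big")
    digits, n = "", number
    while True:
        n, remainder = divmod(n, 32)
        digits = CROCKFORD[remainder] + digits
        if n == 0:
            break
    return digits + CHECK_SYMBOLS[number % 37]


def bip39_word_indices(entropy: bytes) -> list[int]:
    """Append the leading SHA-256 bits, then split into 11-bit word indices."""
    if len(entropy) == 0 or len(entropy) % 4:
        raise ValueError("entropy must be a non-empty multiple of 4 bytes")
    entropy_bits = len(entropy) * 8
    check_bits = entropy_bits // 32  # one check bit per 4 bytes of entropy
    digest = int.from_bytes(hashlib.sha256(entropy).digest(), "big")
    bits = (int.from_bytes(entropy, "big") << check_bits) | (
        digest >> (256 - check_bits)
    )
    total = entropy_bits + check_bits  # always a multiple of 11
    return [(bits >> shift) & 0x7FF for shift in range(total - 11, -1, -11)]


print(crockford_with_check((1234).to_bytes(2, "big")))
print(bip39_word_indices(bytes(16)))
```

The last index in the BIP39 list is the one that mixes entropy bits with check bits, which is why the final word acts as the "check word".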

SCREENSHOTS

Below are some screenshots of the current state of affairs with the Bitcoin, Bubble Babble, and Base32 generators when at least 70 bits of security is required. The styling and such in each container might change as the project matures, such as "Integrated checksum" in the lower right-hand corner, but the checksum will remain.

[Screenshots: Bitcoin, Bubble Babble, and Base32 generators]

PRIOR WORK - LETTERBLOCK DICEWARE

While I did start thinking of this on my own, there is prior work that should be acknowledged. I just discovered it last night when doing web searches for password and passphrase generators with checksums. It was partly that discovery that led to the creation of this post.

On August 12, 2020 Arne Babenhauserheide created Letterblock Diceware as an approach to physically and practically carrying Diceware with you. Unfortunately, Diceware ships 7,776 words with indices, and at best, this is several pages of printed paper with 5 dice, which isn't practical to carry around. So he created a 6x6 table of "pronounceable" and "memorable" digits, letters, and bigrams that can fit on a business card.

Roll 2d6, one for the row and the other for the column, and record the intersection for your password character. Four 2d6 rolls create a "block" worth about 20 bits of security. Four blocks produce about 80 bits of security as a result. However, as a weak checksum, add the row numbers of two consecutive blocks modulo 4, and insert the resulting character between the two blocks. For example, if you rolled for two blocks as {col, row}:

* {2,1}: A

* {6,3}: t

* {2,1}: A

* {3,5}: U

* {1,4}: 48

* {2,1}: A

* {2,4}: FK

* {3,3}: N

Then your blocks are "AtAU" and "48AFKN" (or "4AFN" if you prefer). The rows, however, are "1, 3, 1, 5, 4, 1, 4, & 3". Adding these up modulo 4 returns (1+3+1+5+4+1+4+3) % 4 = 2, which yields "-" for the check. Thus, the resulting password would be "AtAU-4AFN". (I would have done modulo 6 instead. Then the check is uniform, and it could be printed as one more column on the card.)

He also mentions the same scenario that I did in this post (emphasis mine):

> Letterblock passwords use 55 letters that are unambiguous in
> handwriting and safe to use in URLs, grouped in blocks of four
> letters to make them easier to remember, WITH SEPARATORS THAT WORK
> AS WEAK CHECKSUM TO CATCH MANY TYPING ERRORS BEFORE EVEN SENDING THE
> PASSWORD TO THE SERVER, with weak optimization for legibility by
> creating 8 passwords and choosing the one with bigrams that are
> closest to regular prose.

It's worth noting that upstream Diceware also ships tables to be used with dice, although they're not designed for memorability.
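The Letterblock check arithmetic from the worked example above is easy to reproduce. A minimal Python sketch follows; it only computes the check index, because the only separator the example reveals is that index 2 maps to "-", and the function name `check_index` is mine, not Letterblock's.

```python
# The eight rolls from the example, written as (column, row).
ROLLS = [(2, 1), (6, 3), (2, 1), (3, 5), (1, 4), (2, 1), (2, 4), (3, 3)]


def check_index(rolls, modulus=4):
    """Sum the row numbers of two consecutive blocks, reduced by the modulus."""
    return sum(row for _column, row in rolls) % modulus


print(check_index(ROLLS))     # modulo 4, as Letterblock specifies
print(check_index(ROLLS, 6))  # the modulo 6 variant I would have preferred
```

With modulo 4 the sum of eight dice rows is not spread evenly over the four separators, which is the non-uniformity the modulo 6 suggestion is meant to address.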

CONCLUSION

There you have it. Three password generators in my web-based password generation project that now ship with checksums without reducing security or reducing the end user experience. I haven't made an actual release yet, as there is a bit more work I want to do prior to that. However, I'm sure there are other scenarios where passwords with checksums have value, as authentication is ubiquitous, and I couldn't possibly list every possible authentication scenario. Play around with it and let me know what you think. I would be very interested in your feedback.

2021 04 09 General

INTRODUCING DECKWARE - A 224-BIT ENTROPY EXTRACTOR

INTRODUCTION

I can't believe that it's been almost 3 years since my last blog post. Interestingly enough, that was on a deterministic card shuffle that I decided to call "Ouroboros". Well, this post is also about a deterministic algorithm with a deck of playing cards, but rather than shuffling the deck, we'll be extracting the entropy out of it.

The algorithm is called Deckware. I would have called it "Pokerware", but it was already taken by Chris Wellons. I could have called it "Solitaireware", and it does have a sort of ring to it, but I didn't want to confuse people with the Solitaire Cipher by Bruce Schneier. I debated calling it "Bridgeware", but I fear that the Bridge card game is a fringe game enjoyed only by old ladies in nursing homes drinking lemonade, and most people wouldn't get it. Ultimately, the randomness extractor is working through the whole deck, so it makes sense to call it "Deckware", even if it does sound a bit like a construction company.

The thing to understand about this algorithm, however, is that it is not a generic passphrase generator like Diceware or Pokerware. As such, there is no word list provided with Deckware. Instead, it's a randomness extractor. It's designed such that you use your deck of playing cards as a random number generator, and this algorithm uniformly returns a 224-bit random number from that shuffle. Once you have those 224 bits of entropy, they're yours to do with as you wish:

* Use it as a <= 224-bit cryptographic symmetric key.

* Use it as a seed for a CSPRNG, such as reseeding your kernel RNG.

* Use it for election auditing, lottery drawing, or randomized drug samples.

* Convert the hexadecimal to a 14-word Niceware passphrase.

When you need a lot of randomness, Deckware might work, although it's not particularly fast.

LEHMER CODE

The basis of Deckware is Lehmer code. Lehmer code is a factoradic algorithm for converting any specific permutation in a set to an integer. To understand how this works, let's look first at the standard positional notation that we're all familiar with. In decimal, which we use every day, there is the "ones" place, the "tens" place, "hundreds" place, "thousands" place, etc. So a number like "3481" is 3*1000 + 4*100 + 8*10 + 1*1, right? Simple enough.

Factoradic systems are a way to represent an integer as the sum of multiples of factorials. Instead of a decimal number system (or binary, octal, hexadecimal, etc.), it's a factorial number system. Taking my previous example of "3481", I know that 6! = 720 and 7! = 5040, so 6! is the largest factorial that fits. Thus, 3481/6! = 4 remainder 601, and 601/5! = 5 remainder 1. Thus, 3481 = 4*6! + 5*5! + 1*1!.

Okay, but how do you do that with a permutation? Let's say we have a box with numbered chits 1, 2, & 3. How many permutations (order matters) are there? Well, we know it's 3! = 6. We could list them all quite easily:

* 1, 2, 3

* 1, 3, 2

* 2, 1, 3

* 2, 3, 1

* 3, 1, 2

* 3, 2, 1
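The factoradic arithmetic and the six orderings listed above can be checked with a short Python sketch. The function name `factoradic` is just a label for this example; it returns the multipliers from the largest factorial that fits down to 1!.

```python
import itertools
import math


def factoradic(n: int) -> list[int]:
    """Multipliers of k!, (k-1)!, ..., 1! whose weighted sum equals n."""
    k = 1
    while math.factorial(k + 1) <= n:
        k += 1  # find the largest factorial that fits in n
    digits = []
    for place in range(k, 0, -1):
        digit, n = divmod(n, math.factorial(place))
        digits.append(digit)
    return digits


# 3481 = 4*6! + 5*5! + 0*4! + 0*3! + 0*2! + 1*1!
print(factoradic(3481))

# The 3! = 6 orderings of the chits, in the same order as the list above.
for ordering in itertools.permutations([1, 2, 3]):
    print(ordering)
```

Note that the zero multipliers for 4!, 3!, and 2! are simply omitted in the prose version of the sum.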

Lehmer code converts each unique sequence to an integer. It does this by starting with the left-most value, and counting the values less than it to its right. So, starting with the first permutation of "1, 2, 3", "1" is our left-most value, and there are no values to its right that are less than 1. So, for this factorial, its multiplier would be "0". Next we move to the second value, which is "2". Again, there are no values to its right that are less than 2. So for this factorial, its multiplier is also "0". Finally, on the last value, there are no values to its right, so its value is "0". This is always the case for the right-most value in Lehmer code. So for "1, 2, 3", our Lehmer code would be 0*2! + 0*1! + 0*0! = 0.

If we look at our second permutation of "1, 3, 2", applying Lehmer code, we get 0*2! + 1*1! + 0*0! = 1. Let's complete the list:

* 1, 2, 3 = 0*2! + 0*1! + 0*0! = 0

* 1, 3, 2 = 0*2! + 1*1! + 0*0! = 1

* 2, 1, 3 = 1*2! + 0*1! + 0*0! = 2

* 2, 3, 1 = 1*2! + 1*1! + 0*0! = 3

* 3, 1, 2 = 2*2! + 0*1! + 0*0! = 4

* 3, 2, 1 = 2*2! + 1*1! + 0*0! = 5

Deckware uses Lehmer code, but with 52! permutations instead of 3! like our example above.

PLAYING CARD PERMUTATIONS

Knowing that there are 52 unique cards in a standard Poker or Bridge deck of playing cards, we know there are 52! order permutations. 52! has 68 decimal digits. Converting to binary bits yields log2(52!) ~= 225.581. In case you forgot, it would take all the energy from a hypernova captured by a Dyson sphere to count from 0 to ~2^227. In all likelihood, the ordering of a sufficiently shuffled deck has never been seen before.

But how do you do the math? How do you compare the inequality of the Ace of Spades to the Ten of Diamonds, for example? To do this, we need to make some numerical assignments. We're going to use Bridge order for suits, and treat Ace as low, King as high. As such, we get:

* CLUBS: Ace - King = 1 - 13

* DIAMONDS: Ace - King = 14 - 26

* HEARTS: Ace - King = 27 - 39

* SPADES: Ace - King = 40 - 52

Now that we have these numerical assignments, we can trivially do our inequality comparisons to build our Lehmer code.

But we have a snag. Because the permutation space is larger than 225 bits but not quite 226 bits, we can't use the full space, or we'll end up with a biased extractor. As such, we need to discard anything larger than 2^225-1 (because we start counting with 0). So, when we compute our Lehmer code, if the value is 2^225 or greater, it's ignored, and the user needs to reshuffle the deck. Otherwise, we return the lower 224 bits of the extracted result to the user.

However, 2^225 is approximately 67% of 2^log2(52!). This means that on average, you will have to reshuffle the deck 33% of the time to prevent getting a biased result, or about 1 out of every 3 shuffles will be discarded. It's really unfortunate that it couldn't be better, but it is what it is.

DECKWARE VERSUS POKERWARE: FIGHT!

I think it's worth mentioning how Deckware compares to Pokerware and when you would want to choose one over the other, seeing as they both use a deck of playing cards as a source of randomness.

First off, as already mentioned, Deckware does not ship a word list. Technically speaking, Deckware is not a passphrase generator. It's an entropy extractor. This means that you need to bring your own word list to the Deckware table. By comparison, Pokerware provides both formal and slang word lists as part of the project.
Second, Pokerware can be executed trivially without any computing or calculating device. All you need is a deck of cards and a printed indexed word list. To be fair, I don't think anyone actually keeps a printed word list of Pokerware, or Diceware for that matter, with them, except for maybe the inventors themselves. I'm guessing most, if not all, are using a computer to generate the Diceware or Pokerware passphrase.

Deckware on the other hand _could be executed 100% with a pencil and paper_, but it would be painful and incredibly slow. That's something you would make inmates in prison do when they need something to do. I mean, this is essentially what it would take:

* Count inequalities for every card placement in the list.
* Find the Lehmer code using the factorial number system.
* Convert to base 16.

Yeah, no, I'll pass. I'll stick with the tool. However, Deckware has a couple of advantages over Pokerware that might be worth considering.

First, with Pokerware, after every draw, the deck needs to be reshuffled. As determined, this is at least 7 shuffles. That's 7 full deck shuffles for every passphrase word. At 6 words, that's a total of 42 shuffles you've performed on the deck. Deckware only requires 7 shuffles, and odds are 2 out of 3 you won't need to try again.

Second, for those 42 shuffles with Pokerware, you only returned 74 bits of security. For Deckware's 7 shuffles, you were able to extract 224 bits! That's a significant return for the cost, making it far more efficient.

In summary:

POKERWARE:

* Advantages:
  * Provides two word lists.
  * Simple and clean to execute.
  * Can be executed without a computer.
  * Stands on its own as a unique tool.
* Disadvantages:
  * Cumbersome shuffling per generated word.
  * More time costly for similar security margins.

DECKWARE:

* Advantages:
  * Maximizes deck entropy.
  * Small time commitment.
  * Can be used for security solutions other than passphrases.
* Disadvantages:
  * Does not provide a word list.
  * Might be difficult to independently audit.
  * Requires a computer.
  * Can be replaced with SHA-224.

I think that last disadvantage actually speaks volumes. In the past, I would shuffle the deck, record the results, and hash it with SHA-224. That's perfectly acceptable, and I won't blame you for that approach. Even though using SHA-224 to hash your deck order is technically biased, the bias isn't significant enough to reduce security in practical terms, and so long as SHA-2 remains secure, you can't distinguish a biased result from an unbiased one.

Deckware is elegant in that not only is it uniform, it doesn't rely on any cryptographic primitives. It's just factorial math. This means you can trivially audit it for correctness. For example, extract the 224-bit hexadecimal string from an ordered deck, and it should return 0x00000000000000000000000000000000000000000000000000000000. Swap the King of Spades with the Queen of Spades, and it will return 0x00000000000000000000000000000000000000000000000000000001. This sort of vetting isn't accessible for SHA-2, although there is little to no reason to not trust its correctness.

I'm not going to say one is better than the other (Pokerware or Deckware), because as outlined, they have their own strengths and weaknesses. I have personally used Pokerware, and truth be told, I was adding it to my password and passphrase generator (how did I miss it?!), and it got me thinking: "how would I design a playing card algorithm without relying on cryptography?"
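The audit described above is small enough to sketch in Python. This is not the Deckware source, just a minimal illustration that assumes the card numbering given earlier (Ace of Clubs = 1 through King of Spades = 52). The function names lehmer_code and extract are made up for this example.

```python
from math import factorial

def lehmer_code(deck):
    """Return the Lehmer code of a permutation as a single integer.

    For each position, count the values to its right that are smaller,
    then weight that count by the matching factorial.
    """
    n = len(deck)
    total = 0
    for i, card in enumerate(deck):
        smaller = sum(1 for other in deck[i + 1:] if other < card)
        total += smaller * factorial(n - 1 - i)
    return total

def extract(deck):
    """Hypothetical extractor: reject codes >= 2^225, keep the low 224 bits."""
    code = lehmer_code(deck)
    if code >= 2**225:
        return None  # biased region, the deck must be reshuffled
    return format(code % 2**224, "056x")  # 224 bits = 56 hex digits

ordered = list(range(1, 53))
swapped = ordered[:-2] + [52, 51]  # King and Queen of Spades swapped

print(extract(ordered))  # 56 zeros
print(extract(swapped))  # 55 zeros followed by a 1
```

A fully reversed deck has Lehmer code 52! - 1, which lands in the rejected region, so extract returns None for it.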

DECKWARE IN ACTION

Here are a couple of screenshots of an early release of the tool in action. Here, you can see the unshuffled deck on the "upper table". The suit symbols are emoji provided by OpenMoji. I added the text next to each suit using the DejaVu Serif font in Inkscape.

Here, I've dragged and dropped each card onto the lower table representing the shuffled deck. I'm not very good at JavaScript listening events, so I shamelessly took the code from W3Schools. No doubt it could use some polish, but it works.

Notice that I've clicked the "Calculate unique deck ID" button to extract the entropy (maybe I should change that button text now that I'm thinking about it). I got "b08bd2f0720ade917b842ee1e721fe1c6ad00429e1155f9201b50d82" returned. This gives me a 14-word Niceware passphrase of "random sporran ironclad tare lifeful cromwell trekked wrigglier imprudence amenable thai hajj affectionately barratry".

After extracting the entropy out of the deck, you should thoroughly reshuffle the deck or place it back in order to destroy the key, so the entropy cannot be re-extracted. You should also reload your web browser for the same reason. The tool is not using any persistent storage, but feel free to run the tool in a private browser window if you're paranoid.

CLOSING THOUGHTS

In practice, after shuffling the deck, I was able to record every card in the tool in 153 seconds, or around 2 - 3 minutes. That's not bad with drag-and-drop using the mouse, and I'm sure it can be improved with a keyboard listening event to type it in rather than using the mouse. Again though, I'm not proficient with JavaScript listening events, so maybe someone can help me out here.

However, whether you use this tool or SHA-224, the bulk of the time is taken recording the cards in the shuffled deck, so from my point of view, it's sixes. Pick your poison. You need a tool, one way or the other.

For the time being, I've got this opened in a tab in my browser. When I need a password generator, I give the hexadecimal string to Niceware. 14 words is generally overkill for my usual password needs. Even dividing it in half, at 7 words each, gives me two 112-bit passphrases. Still overkill. But 5 words yields 80 bits of security, which is right on the money. I can get two 80-bit passphrases and one 64-bit passphrase out of a single shuffle. I'll have to see how this goes.
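The word arithmetic above can be sketched as follows. Niceware encodes 16 bits per word, so the 224-bit string splits into 14 indices into a 65,536-word list. The helper name word_indices is invented for this sketch, and the word list itself is not reproduced here, so only the indices and bit counts are shown.

```python
def word_indices(hex_string, bits_per_word=16):
    """Split a hex string into integers of bits_per_word bits each."""
    digits = bits_per_word // 4  # hex digits per word
    return [int(hex_string[i:i + digits], 16)
            for i in range(0, len(hex_string), digits)]

deck_id = "b08bd2f0720ade917b842ee1e721fe1c6ad00429e1155f9201b50d82"
indices = word_indices(deck_id)

# One shuffle split into three passphrases of 5, 5, and 4 words.
groups = [indices[0:5], indices[5:10], indices[10:14]]
bits = [16 * len(group) for group in groups]

print(len(indices), bits)  # prints: 14 [80, 80, 64]
```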

2021 02 18 General

THE OUROBOROS CARD SHUFFLE

INTRODUCTION

For the most part, I don't play a lot of table games, and I _don't play party games_. But occasionally, I'll sit down with my family and play a board game or card game. When we play a card game though, I get teased about how I shuffle the deck of cards. I know that to maximize entropy in the deck, it should be riffle shuffled at least 7 times, but a better security margin would be around 10 shuffles, and after 12, I'm just wasting my time. But I don't always riffle shuffle. I'll also do various deterministic shuffles as well, such as the pile shuffle to separate cards from each other.

I'm familiar with a number of deterministic card shuffles, which I didn't even know had names:

* PILE SHUFFLE- Separate the cards into piles, one at a time, until exhausted, then collect the piles. I usually just do 4 or 5 piles, and pick up in order.
* MONGEAN SHUFFLE- Move the cards from one hand to the other, strictly alternating placing each discard on top then beneath the previously discarded cards.
* MEXICAN SPIRAL SHUFFLE- Discard the top card on the table, and the second card to the bottom of the deck in your hand. Continue discarding all odd cards to the table, all even cards beneath the deck in hand until exhausted. I never do this shuffle, because it takes too long to execute.

In practice, when I'm playing a card game with my family, I'll do something like 3 riffle shuffles, a pile shuffle, 3 more riffle shuffles, a Mongean shuffle, 3 more riffle shuffles, another pile shuffle, then one last riffle shuffle. I'll get teased about it, of course: "Dad, I'm sure the cards are shuffled just fine. Can we just play now?", but when we play the game, I'll never hear complaints about how poorly the deck was shuffled.

This got me thinking though- there aren't that many simple deterministic blind card shuffles (I say "blind", because any of the playing card ciphers would work, but that requires seeing the cards, which is generally frowned upon when playing competitive card games). I wonder what else is out there. Well, doing some web searches didn't turn up much. In fact, all I could find were variations of the above shuffles, such as random pile discarding and pickup, but nothing new.

So the question then turned into- could I create my own simple deterministic card shuffle? It didn't take me long before I came up with what I call the "Ouroboros shuffle".

THE OUROBOROS SHUFFLE

Before going any further, let me state that I _very much doubt_ I'm the first to come up with this idea, but I have searched and couldn't find where anyone else had documented it. If it does in fact exist, let me know, and I'll gladly give credit where credit is due. Until then, however, I'll stick with calling it the "Ouroboros Shuffle", named after the serpent or dragon eating its own tail.

The shuffle is simple:

1. Holding the deck in your hand, discard the first card from the bottom of the deck to the table.
2. Discard the top card of the deck to the discard pile on the table.
3. Repeat steps 1 and 2, strictly alternating bottom and top cards until the deck is exhausted.

If the playing cards are plastic-based, like those from Kem or Copag, then you could "pinch" the top and bottom cards simultaneously, and pull them out of the deck in your hand to the table. If you do this perfectly, you will pinch 2 cards only 26 times. If they're paper-based though, this may or may not work as efficiently due to cards having a tendency to stick together after heavy use.

If the deck was unshuffled as "1, 2, 3, ..., 50, 51, 52", then the first shuffle would look like this:

Step 0:
Unshuffled: 1, 2, 3, ..., 50, 51, 52
Shuffled:

Step 1:
Unshuffled: 1, 2, 3, ..., 49, 50, 51
Shuffled: 52

Step 2:
Unshuffled: 2, 3, 4, ..., 49, 50, 51
Shuffled: 1, 52

Step 3:
Unshuffled: 2, 3, 4, ..., 48, 49, 50
Shuffled: 51, 1, 52

Step 4:
Unshuffled: 3, 4, 5, ..., 48, 49, 50
Shuffled: 2, 51, 1, 52

Step 5:
Unshuffled: 3, 4, 5, ..., 47, 48, 49
Shuffled: 50, 2, 51, 1, 52

Step 6:
Unshuffled: 4, 5, 6, ..., 47, 48, 49
Shuffled: 3, 50, 2, 51, 1, 52

....

Step 50:
Unshuffled: 26, 27
Shuffled: 25, 28, 24, ..., 51, 1, 52

Step 51:
Unshuffled: 26
Shuffled: 27, 25, 28, ..., 51, 1, 52

Step 52:
Unshuffled:
Shuffled: 26, 27, 25, ..., 51, 1, 52

As you can see, the top and bottom cards are always paired together in the unshuffled deck, and discarded as a pair to the shuffled deck. The top and bottom cards could also be thought of as the head and tail of a list, which is why I called it the Ouroboros shuffle. If you execute this algorithm perfectly from an unshuffled deck, it will take 51 rounds before the deck is restored to its unshuffled state.

OBSERVATIONS

Almost immediately, I noticed a bias. It doesn't matter how many times I execute this algorithm, the bottom card will always remain on the bottom. In the above example, the King of Spades (if assigned the value of "52") will stay at the bottom of the deck, due to the nature of the shuffle of discarding the bottom card first. So I recognized that I would need to cut at least 1 card from the top of the deck to the bottom of the deck before the next round of the shuffle, to ensure the bottom card gets mixed in with the rest of the deck.

Other questions started popping up, specifically:

* How many perfect shuffles will it take to restore the deck to an unshuffled state now?
* Is there a different bias hidden after cutting the top card to the bottom?
* What if I cut 2 cards? 3 cards? 51 cards?

Whelp, time to code up some Python, and see what pops out. What I'm looking for is what the state of the deck looks like after each round. In other words, I want to know which card occupies which positions in the deck. For example, does the Seven of Clubs see all possible 52 positions in the deck? Without the cut, we know that's not possible, because the bottom card stubbornly stays in the bottom position.

Typing up a quick script and graphing with Gnuplot gave me the following images. The image on the left is the Ouroboros shuffle with no cuts, and the image on the right is the Ouroboros shuffle followed by cutting the top card to the bottom of the deck at the end of the round. Click to enlarge.

What you're looking at is the card position in the deck along the X-axis and the card value along the Y-axis. In the left image, where the Ouroboros shuffle is executed without any following cuts, the 52nd card in the deck is always the face value of 52. But in the right image, where the Ouroboros shuffle is followed by cutting one card from the top of the deck to the bottom, every card position sees every face value.

So what would happen if instead of cutting 1 card off the top to the bottom at the end of each round, I cut 2 cards, 3 cards, etc., all the way to cutting 51 cards off the top to the bottom? Well, more Python scripting, and I generated a total of 52 images showing every possible position a card occupies in the deck until the deck returns to its unshuffled state.

Interestingly enough, executing the Ouroboros shuffle followed by cutting 19 cards leads to a cycle length of 6,090 perfect shuffles before restoring the deck back to its unshuffled state. Awesome! Except, as you can see in the Imgur post above, it's extremely biased. Every shuffle-and-cut is listed here with its cycle length:

Cut: 0, iter: 51
Cut: 1, iter: 52
Cut: 2, iter: 51
Cut: 3, iter: 272
Cut: 4, iter: 168
Cut: 5, iter: 210
Cut: 6, iter: 217
Cut: 7, iter: 52
Cut: 8, iter: 418
Cut: 9, iter: 52
Cut: 10, iter: 24
Cut: 11, iter: 350
Cut: 12, iter: 387
Cut: 13, iter: 252
Cut: 14, iter: 1020
Cut: 15, iter: 144
Cut: 16, iter: 1972
Cut: 17, iter: 34
Cut: 18, iter: 651
Cut: 19, iter: 6090
Cut: 20, iter: 175
Cut: 21, iter: 90
Cut: 22, iter: 235
Cut: 23, iter: 60
Cut: 24, iter: 2002
Cut: 25, iter: 144
Cut: 26, iter: 12
Cut: 27, iter: 50
Cut: 28, iter: 24
Cut: 29, iter: 10
Cut: 30, iter: 44
Cut: 31, iter: 72
Cut: 32, iter: 297
Cut: 33, iter: 90
Cut: 34, iter: 45
Cut: 35, iter: 132
Cut: 36, iter: 12
Cut: 37, iter: 210
Cut: 38, iter: 207
Cut: 39, iter: 104
Cut: 40, iter: 420
Cut: 41, iter: 348
Cut: 42, iter: 30
Cut: 43, iter: 198
Cut: 44, iter: 35
Cut: 45, iter: 140
Cut: 46, iter: 390
Cut: 47, iter: 246
Cut: 48, iter: 28
Cut: 49, iter: 12
Cut: 50, iter: 36
Cut: 51, iter: 30

The only "shuffle then cut" rounds that are uniform appear to be cutting 1 card, 7 cards, and 9 cards. The other 49 shuffles are biased in one way or another, even if each of them has a different cycle length.
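One way to see where both the cycle lengths and the bias come from, without plotting anything, is to treat a shuffle-then-cut round as a fixed permutation of positions and break it into cycles. The sketch below is my own framing rather than the script used for the table: the names ouroboros, cycle_lengths, and period are hypothetical. The period is the least common multiple of the cycle lengths, and a card can only ever visit the positions in its own cycle, so a round is uniform in the sense above only when there is a single 52-card cycle.

```python
from functools import reduce
from math import gcd

def ouroboros(deck, cut=0):
    """One round: alternate bottom and top cards onto a pile, then cut."""
    deck = list(deck)
    pile = []
    while deck:
        pile.insert(0, deck.pop())       # bottom card first
        if deck:
            pile.insert(0, deck.pop(0))  # then the top card
    return pile[cut:] + pile[:cut]

def cycle_lengths(cut, size=52):
    """Cycle lengths of the position permutation for one round."""
    # shuffled[i] is the original position of the card now at position i
    shuffled = ouroboros(range(size), cut)
    seen = set()
    lengths = []
    for start in range(size):
        length = 0
        pos = start
        while pos not in seen:
            seen.add(pos)
            pos = shuffled[pos]
            length += 1
        if length:
            lengths.append(length)
    return sorted(lengths)

def period(cut, size=52):
    """Rounds needed to restore the deck: the lcm of the cycle lengths."""
    return reduce(lambda a, b: a * b // gcd(a, b), cycle_lengths(cut, size), 1)

print(cycle_lengths(0), period(0))  # [1, 51] 51 (the bottom card is stuck)
print(cycle_lengths(1), period(1))  # [52] 52 (every card visits every position)
```

Under this view, a long period such as the one for cutting 19 cards is not a sign of good mixing. It just means the deck splits into several cycles whose lengths have a large least common multiple.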

Here's the Python code I used to create the shuffled lists, each into their own comma-separated file:

    #!/usr/bin/python3

    def step_1(deck):
        # Ouroboros shuffle: alternately move the bottom card, then the
        # top card, onto the top of the discard pile.
        tmp = []
        for card in range(26):
            tmp.insert(0, deck[-1])
            deck.pop(-1)
            tmp.insert(0, deck[0])
            deck.pop(0)
        return tmp

    def step_2(deck, cut):
        # Cut the top "cut" cards to the bottom of the deck.
        return deck[cut:] + deck[:cut]

    orig = list(range(1, 53))
    deck = list(range(1, 53))

    for i in range(52):
        with open("cut_{}.csv".format(i), "w") as f:
            f.write(",".join(map(str, deck)) + "\n")
        deck = step_1(deck)
        deck = step_2(deck, i)
        n = 1
        while deck != orig:
            with open("cut_{}.csv".format(i), "a") as f:
                f.write(",".join(map(str, deck)) + "\n")
            deck = step_1(deck)
            deck = step_2(deck, i)
            n += 1
        print("Cut: {}, iter: {}".format(i, n))

CONCLUSION

This was a fun shuffle to play with and one that I'll incorporate into my card playing with my family. Now I could do something like: 3 riffle shuffles, Ouroboros shuffle with 1 card cut, 3 riffle shuffles, Mongean shuffle, 3 riffle shuffles, pile shuffle, 1 riffle shuffle, and I will be happy.

2018 10 05 General Personal

LATIN SQUARES, MATHEMATICS, AND CRYPTOGRAPHY

INTRODUCTION

Recently, I've been studying Latin squares and their role in classical cryptography including the one-time pad. Latin squares are NxN squares where no element in a row is duplicated in that same row, and no element in a column is duplicated in that column. The popular Sudoku game is a puzzle that requires building a Latin square.

As I delved deeper and deeper into the subject, I realized that there is a rich history here that I would like to introduce you to. Granted, this post is neither an expansive nor an exhaustive discussion on Latin squares. Rather, it's meant to introduce you to the topic, so you can look into it on your own if this interests you.

In each of the sections below, the "A" and "N" characters are highlighted in the table image to demonstrate that the table is indeed a Latin square. Further, you can click on any table image to enlarge.

TABULA RECTA

The Tabula Recta is the table most are probably familiar with, and recognize as the Vigenère table. However, the table was first used by German author and monk Johannes Trithemius in 1508, in his Trithemius polyalphabetic cipher. This was a good 15 years before Blaise de Vigenère was even born, 45 years before Giovan Battista Bellaso wrote about his cipher using the table in his 1553 book "La cifra del. Sig. Giovan Battista Bellaso", and 78 years before Blaise de Vigenère improved upon Bellaso's cipher.

Today, we know it as either the "tabula recta" or the "Vigenère table". Regardless, each row shifts the alphabet one character to the left, creating a series of 26 Caesar cipher shifts. This property of the shifted alphabets turns out to be a weakness with the Vigenère cipher, in that if a key repeats, we can take advantage of the Caesar shifts to discover the key length, then the key, and finally break the ciphertext.
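The left-shifted rows are easy to sketch in Python, along with a check of the Latin square property defined in the introduction. The function names here are illustrative only, and the encryption helper uses the usual Vigenère convention of key letter as row and plaintext letter as column.

```python
import string

ALPHABET = string.ascii_uppercase

def tabula_recta():
    """26 rows, each the alphabet shifted one more character to the left."""
    return [ALPHABET[i:] + ALPHABET[:i] for i in range(26)]

def is_latin_square(table):
    """True if every row and every column holds each symbol exactly once."""
    n = len(table)
    symbols = set(table[0])
    rows_ok = all(len(row) == n and set(row) == symbols for row in table)
    cols_ok = all(set(col) == symbols for col in zip(*table))
    return len(symbols) == n and rows_ok and cols_ok

def vigenere_encrypt(plaintext, key):
    """Look up each letter in the row chosen by the repeating key."""
    table = tabula_recta()
    out = []
    for i, p in enumerate(plaintext):
        k = key[i % len(key)]
        out.append(table[ALPHABET.index(k)][ALPHABET.index(p)])
    return "".join(out)

print(is_latin_square(tabula_recta()))            # True
print(vigenere_encrypt("ATTACKATDAWN", "LEMON"))  # LXFOPVEFRNHR
```

Because every row is just a Caesar shift of the first, two plaintext letters that fall under the same key letter are always shifted by the same amount, which is the regularity an attacker exploits when the key repeats.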

Jim Sanborn integrated a keyed tabula recta into his Kryptos sculpture in the 2nd and 4th panels. Even though the first 3 passages in the Kryptos sculpture have been cracked, the 4th passage remains a mystery.

BEAUFORT TABLE

More than 250 years later, Rear Admiral Sir Francis Beaufort modified the Vigenère cipher by using a reciprocal alphabet and changing the way messages were encrypted. Messages were still encrypted with a repeating key, similar to the Vigenère cipher, but the plaintext character was located in the first column and the key in the first row. The intersection was the ciphertext. This became the Beaufort cipher.

His reasoning for why he used a different table and changed the enciphering process isn't clear. It may have been as simple as knowing concepts about the Vigenère cipher without knowing the specific details. He may have had other reasons.

One thing to note, however, is that Vigenère-encrypted ciphertexts cannot be decrypted with a Beaufort table and vice versa. Even though the Beaufort cipher is vulnerable to the same cryptanalysis, the Caesar shifts are different, and the calculation if using numbers instead of letters is also different.

The Beaufort table was integrated into a hardware encryption machine called the Hagelin M-209. The M-209 was used by the United States military during WWII and through the Korean War. The machine itself was small and compact, coming in at about the size of a lunchbox and only weighing 6 pounds, which was remarkable for the time.

One thing to note is that the Beaufort table has "Z" in the upper-left corner, with the reciprocal alphabet in the first row and first column, as shown in the image above. Any other table that is not exactly as shown above that claims to be the Beaufort table is not correct.

NSA'S DIANA RECIPROCAL TABLE

Of course, the narcissistic NSA needs their own polyalphabetic table! We can't let everyone else be the only ones who have tables! I'm joking of course, as there is a strong argument for using this reciprocal table rather than the Beaufort.

Everyone is familiar with the one-time pad, a proven theoretically unbreakable cipher if used correctly. There are a few ways in which to use the one-time pad, such as using XOR or modular addition and subtraction. Another approach is to use a lookup table. The biggest problem with the tabula recta is that when using the one-time pad by hand, it's easy to look up the wrong row or column and introduce mistakes into the enciphering process.

However, due to the reciprocal properties of the "DIANA" table (don't you love all the NSA codenames?), encryption and decryption are identical, which means they require only a single column. A key "row" is no longer needed, and the order of plain, key and cipher letter doesn't matter (Vigenère vs Beaufort) and may even differ for sender and receiver.

Just like with Beaufort, this table is incompatible with Vigenère-encrypted ciphertexts. Further, it's also incompatible with Beaufort-encrypted ciphertexts, especially if it's a one-time pad. The Beaufort table shifts the alphabet to the right, while the DIANA table shifts the alphabet to the left. The tabula recta also shifts left.

Let's make one thing clear here- this table was created strictly for ease of use, _not for increased security_. When using the one-time pad, the key is at least the length of the message, which means it doesn't repeat. So it doesn't matter that the table is 26 Caesar-shifted alphabets. That property won't show itself in one-time pad ciphertexts.

E.J. WILLIAMS' BALANCED TABLES

Stepping away from cryptography for a moment, and entering the world of mathematics, and in this case, mathematical models applied to farming, we come across E.J. Williams' balanced tables.

Note how the "A" and "N" characters are populated throughout the table compared to what we've seen previously. The paper is modeling chemical treatments to crops over a span of time, and how to approach the most efficient means of applying those treatments. The effect of the previous treatment, called the "residual effect", is then analyzed. A method based on a "balanced" Latin square is discussed. It is then applied to multiple farming sites and analyzed.

Now, I know what you're thinking- "Let's use this for a cipher table!". Well, if you did, and your key repeated throughout the message, the ciphertext would not exhibit Caesar-shifted characteristics like Vigenère and Beaufort. However, the table is still deterministic, and as such, knowing how the table is built will give cryptanalysts the edge necessary to still break Williams-encrypted ciphertexts.

MICHAEL DAMM'S ANTI-SYMMETRIC QUASIGROUPS OF ORDER 26

Also in the world of mathematics are quasigroups. These are algebraic structures that, like groups, must be closed (the operation is defined for every pair of elements) and invertible (division is always possible), but not necessarily associative. Michael Damm researched quasigroups as the basis for an integrity checksum, such as in calculating the last digit of a credit card number. But not only did he research quasigroups, he researched anti-symmetric quasigroups.

Anti-symmetry here is a property of the quasigroup operation. A quasigroup is anti-symmetric if, for all "c", "x", and "y", "(c*x)*y = (c*y)*x" implies "x = y". Put the other way around, whenever "x != y", we get "(c*x)*y != (c*y)*x". This is what lets the checksum notice when two adjacent characters have been swapped.

Michael Damm, while researching checksums, introduced us to anti-symmetric quasigroups. One more property was required, and that was that the main diagonal was "0", or "A" in our case. The Damm algorithm creates a checksum, such that when verifying the check digit, the result places you on the main diagonal, and thus returns "0". Note that any quasigroup can be represented by a Latin square.

Due to the nature of the Damm algorithm as a checksum, this could be used to verify the integrity of a plaintext message before encrypting using a quasigroup of order 26, as shown above. The sender could calculate the checksum of his plaintext message, and append the final character to the plaintext before encrypting. The recipient, after decrypting the message, could then run the same Damm checksum algorithm against the full plaintext message. If the result is "A", the message wasn't modified.

Notice in my image above, that while "A" rests along the main diagonal, the rest of the alphabets are randomized, or at least shuffled. It really isn't important how the alphabets are created, so long as they meet the requirements of being an anti-symmetric quasigroup.
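The append-then-verify procedure can be sketched as follows. Since the order-26 letter table from the image isn't reproduced in the text, this example assumes the widely published order-10 Damm table for decimal digits as a stand-in. The mechanics are the same, with "0" playing the role of "A", and the function names are my own.

```python
# Assumed stand-in: the commonly published order-10 Damm quasigroup.
# Rows are indexed by the running interim value, columns by the next digit.
DAMM_TABLE = [
    [0, 3, 1, 7, 5, 9, 8, 6, 4, 2],
    [7, 0, 9, 2, 1, 5, 4, 8, 6, 3],
    [4, 2, 0, 6, 8, 7, 1, 3, 5, 9],
    [1, 7, 5, 0, 9, 8, 3, 4, 2, 6],
    [6, 1, 2, 3, 0, 4, 5, 9, 7, 8],
    [3, 6, 7, 4, 2, 0, 9, 5, 8, 1],
    [5, 8, 6, 9, 7, 2, 0, 1, 3, 4],
    [8, 9, 4, 5, 3, 6, 2, 0, 1, 7],
    [9, 4, 3, 8, 6, 1, 7, 2, 0, 5],
    [2, 5, 8, 1, 4, 3, 6, 7, 9, 0],
]

def damm_interim(digits):
    """Walk the table: start at 0, then look up (interim, next digit)."""
    interim = 0
    for ch in digits:
        interim = DAMM_TABLE[interim][int(ch)]
    return interim

def damm_check_digit(digits):
    """The check digit is the final interim value itself."""
    return str(damm_interim(digits))

def damm_valid(digits_with_check):
    """Valid input ends on the main diagonal, so the result is 0."""
    return damm_interim(digits_with_check) == 0

message = "572"
check = damm_check_digit(message)
print(check)                        # 4
print(damm_valid(message + check))  # True
print(damm_valid("5734"))           # False, one digit was altered
```

Appending the interim value works because the main diagonal is all zeros: looking up the interim value against itself always lands on "0".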

RANDOM TABLES

Finally, we have randomized Latin squares. These are still Latin squares, such that for any element in a row, it is not duplicated in that row, and for any element in a column, it is not duplicated in that column. Other than that, however, there is no relationship between rows, columns, or elements. Their use is interesting in a few areas.

First, suppose I give you a partially filled Latin square as a "public key", with instructions on how to encrypt with it. I could then use my fully filled Latin square "private key", of which the public key is a subset. Using this private key, with some other algorithm, I could then decrypt your message. It turns out that filling in a partially-filled Latin square is NP-complete, meaning that we don't currently know of any polynomial-time algorithm that can complete the task. As such, this builds a good foundation for public key cryptography, as briefly outlined here.

Further, because of the lack of any structure in a randomized Latin square, aside from the requirements of being a Latin square, these make good candidates for symmetric message authentication code (MAC) designs. For example, a question on the cryptography StackExchange asked if there was any humanly-verifiable way to add message authentication to the one-time pad. The best answer suggested using a circular buffer as a signature, which incorporates the key, the plaintext, modular addition, and the Latin square. By having a randomized Latin square as the foundation for a MAC tag, no structure is present in the authenticated signature itself. Note, the table can still be public.

Steve Gibson incorporated Latin squares into a deterministic password manager. Of course, as with all deterministic password managers, there are some fatal flaws in their design. Further, his approach, while "off the grid", is rather cumbersome in execution. But it is creative, and certainly worth mentioning here as a randomized Latin square.

CONCLUSION

Latin squares have fascinated mathematicians for centuries, and in this post, we have seen their use in cryptography, mathematical modeling, data integrity, message authentication, and even password generation. This only briefly shows their potential.

2018 09 20 Cryptology Security

GETTING UP TO 8 POSSIBILITIES FROM A SINGLE COIN TOSS

INTRODUCTION

Lately, I've been interested in pulling up some classical modes of generating randomness. That is, rather than relying on a computer to generate my random numbers for me, which is all too common and easy these days, I wanted to go offline, and generate random numbers the classical way- coin flips, dice rolls, card shuffling, roulette wheels, bingo ball cages, paper shredding, etc.

In fact, if randomness interests you, I recently secured the r/RNG subreddit, where we discuss everything from random number generators, to hashing functions, to quantum mechanics and chaos, to randomness extraction. I invite you to hang out with us.

Anyway, I was a bit bothered that all I could get out of a single coin toss was 2 outcomes- heads or tails. It seems like there just is no way around it. It's the most basic randomness mechanic, yet it seems so limiting. Then I came across 18th century mathematician Georges-Louis Leclerc, Comte de Buffon, who studied a common game of placing bets on tossing a coin onto a tiled floor, and whether the coin landed squarely in a tile without crossing any edges, or crossed a tile edge.

Then it hit me- I can extract more entropy out of a coin toss by printing a grid on a piece of paper. So, I put this up on GitHub as a simple specification. So far, I have papers you can print that are letter, ledger, A3, or A4 sizes, and coins for the United States, Canada, and the European Union.

THE THEORY

Assume a tile has a length of "l" and is square, and assume a coin has a diameter "d" such that "d < l". In other words, the tile is larger than the coin, and there is a place on the tile where the coin can sit without crossing any edges of the tile. This means that if we slide the coin around the inside of the tile with its edge tangent to the edge of the tile, the center of the coin traces out a smaller square inside our tile. This smaller square tells us that if the center of the coin lands anywhere inside of that smaller square, the edges of the coin will not cross the edges of the tile.

Now we know the coin diameter, but we would like to know the tile edge length, so we can draw our grid on a paper to toss the coin onto. As such, we need to know the ratio of the area of the smaller inner square drawn by the coin to the area of the tile. Know that the area of the tile with length "l" is:

A(tile) = l^2

The area of the smaller square inside the tile is determined by both the tile length "l" and the coin diameter "d":

A(inner_square) = (l-d)^2

So the ratio of the two is:

P = (l-d)^2/l^2

I use "P" for our variable as this is the probability that the center of the coin lands inside the inner square. We want our outcomes equally likely, so we want the center of the coin to land with 50% probability inside the inner square and 50% probability outside of the inner square.

1/2 = (l-d)^2/l^2

We know the diameter of the coin, so we just need to solve for the tile edge length. This is a simple quadratic equation:

1/2 = (l-d)^2/l^2
l^2 = 2*(l-d)^2
l^2 = 2*(l^2 - 2*l*d + d^2)
l^2 = 2*l^2 - 4*l*d + 2*d^2
0 = l^2 - 4*l*d + 2*d^2

If you remember from your college or high school days, the quadratic formula solution to the quadratic equation is:

x = (-b +/- Sqrt(b^2 - 4*a*c))/(2*a)

Plugging that in, we get:

l = (-(-4*d) +/- Sqrt((4*d)^2 - 4*1*(2*d^2)))/2
l = (4*d +/- Sqrt(16*d^2 - 8*d^2))/2
l = (4*d +/- Sqrt(8*d^2))/2
l = (4*d +/- 2*d*Sqrt(2))/2
l = d*(2 +/- Sqrt(2))

No surprise, we have 2 solutions. So, which one is the correct one? Well, we set the restriction that "d < l" earlier, and we can't break that. So, we can clearly see that:

l = d*(2 - Sqrt(2))
l = d*(Something less than 1)
l < d

So, this equation would mean that our coin diameter "d" is larger than our tile edge "l", which doesn't mean anything to us. As such, the solution to our problem for finding the tile edge length when we know the coin diameter is:

l = d*(2 + Sqrt(2))

GETTING 4 OUTCOMES FROM ONE COIN TOSS

Now that we have the theory knocked out, let's apply it. I know that a United States penny diameter is 1.905 centimeters. As such, my grid edge length needs to be:

l = 1.905*(2+Sqrt(2))
l ~= 6.504 centimeters

This means then that when I flip my coin onto the paper grid, there is now a 50% chance that the coin will cross a grid line and a 50% chance that it won't. But this result is independent of whether the coin lands on heads or tails. As such, I have the following uniformly distributed outcomes:

* Tails, does not cross an edge = 00
* Tails, crosses any edge(s) = 01
* Heads, does not cross an edge = 10
* Heads, crosses any edge(s) = 11

If we do a visual check, whenever the coin center lands in the white square of the following image, the edges of the coin will not cross the edge of the square. However, if the coin center lands in the red area, then the edge of the coin will cross an edge or edges. You can measure the ratios of the area of the red to that of the white to convince yourself they're equal, although careful- my image editor didn't allow me to calculate sub-pixel measurements when building this image.
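The quadratic above has a shortcut worth sketching: taking the positive square root of both sides of P = (l-d)^2/l^2 gives (l-d)/l = Sqrt(P), so l = d/(1 - Sqrt(P)). For P = 1/2 this is the same value as d*(2 + Sqrt(2)). The function name tile_edge below is made up for this example.

```python
from math import isclose, sqrt

def tile_edge(d, p):
    """Grid edge length l so that a coin of diameter d misses every
    grid line with probability p, i.e. ((l - d) / l)**2 == p."""
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    return d / (1 - sqrt(p))

penny = 1.905  # centimeters, as given above

edge = tile_edge(penny, 1 / 2)
print(round(edge, 3))                         # 6.504
print(isclose(edge, penny * (2 + sqrt(2))))   # True
```

The same function covers other target probabilities, which is handy when a grid has to be sized for a different coin or a different split of outcomes.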

GETTING 8 OUTCOMES FROM ONE COIN TOSS Remarkably, rather than treat the grid as a single dimensional object (does it cross any grid edge or not), I can use the grid an an X/Y plane, and treat crossing an x-axis edge differently than crossing a y-axis edge. However, that only gives me 6 outcomes, and it is possible that the coin will land in a corner, crossing both the x-axis and y-axis simultaneously. So, I need to treat that separately. Just like we calculated the odds of crossing an edge to be 50%, now I

have 4 outcomes:

* Does not cross an edge. * Crosses an x-axis edge only. * Crosses a y-axis edge only. * Crosses both the x-axis and y-axis edges (corner). As such, each outcome needs to be equally likely, or 25%. Thus, my problem now becomes:

1/4 = (l-d)^2/l^2

Just like we solved the quadratic equation for P=50%, we will do the same here. However, I'll leave the step-by-step as an exercise for the user. Our solution becomes:

l = 2*d

Simple. The length of the grid edge must be twice the diameter of the coin. No square roots or distributed multiplication. Double the diameter of the coin, and we're good to go.

However, this has a benefit and a drawback. The benefit is that the grid is more compact (2*d vs ~3.4*d). This reduces your ability to "aim" for a grid edge or not. The drawback is that you will have to make more ambiguous judgment calls on whether or not the coin crosses an edge (75% of the time vs 50% in the previous approach).

So, like the previous approach of getting 4 outcomes, we have a visual check. If the coin center lands anywhere in the white area of the following image, it won't cross an edge. If the coin center lands anywhere in the green area, the coin will cross an x-axis edge. If the coin center lands anywhere in the blue, it will cross a y-axis edge, and if the coin center lands anywhere in the red, it will cross both an x-axis and y-axis edge simultaneously. As with the previous 4 outcome approach, we can convince ourselves this holds. The area of the white square should equal the area of the blue, as well as the area of the green, as well as the area of the red.
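The equal-area claim can also be checked exactly rather than by measuring pixels. This is a small sketch under the same uniform-landing assumption; `region_probabilities` is a name I made up for illustration.

```python
from fractions import Fraction

def region_probabilities(l: int, d: int) -> dict:
    """Exact area fractions for where the coin center can land in an l x l tile."""
    inner = l - d          # side of the central "no crossing" square
    total = l * l
    return {
        "none": Fraction(inner * inner, total),
        "x_only": Fraction(d * inner, total),   # two strips, each d/2 by (l - d)
        "y_only": Fraction(d * inner, total),   # same strips along the other axis
        "corner": Fraction(d * d, total),       # four d/2 by d/2 corner squares
    }

probs = region_probabilities(l=2, d=1)   # any l = 2*d gives the same fractions
print(probs)
```

With l = 2*d every region comes out to exactly 1/4, which is what the 8-outcome scheme relies on.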

This means now we have 8 uniformly distributed outcomes:

* Tails, does not cross an edge = 000
* Tails, crosses the x-axis only = 001
* Tails, crosses the y-axis only = 010
* Tails, crosses both axes (corner) = 011
* Heads, does not cross an edge = 100
* Heads, crosses the x-axis only = 101
* Heads, crosses the y-axis only = 110
* Heads, crosses both axes (corner) = 111

How do you like them apples? 8 equally likely outcomes from a single coin toss.
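To make the mapping above concrete, here is a hedged Python sketch that turns one recorded toss into its 3-bit label. The function `toss_bits` and its coordinate convention are assumptions of mine: grid lines sit at multiples of l = 2*d, and "crossing the x-axis" is taken to mean touching a grid line that runs parallel to the x-axis.

```python
def toss_bits(heads: bool, cx: float, cy: float, d: float) -> str:
    """Map a toss (face plus coin-center position) to one of the 8 labels above."""
    l = 2 * d  # grid edge length for the 8-outcome scheme

    def near_line(c: float) -> bool:
        # Distance from the center coordinate to the nearest grid line.
        r = c % l
        return min(r, l - r) < d / 2

    crosses_x = near_line(cy)   # horizontal lines are at y = k*l
    crosses_y = near_line(cx)   # vertical lines are at x = k*l
    bits = (int(heads) << 2) | (int(crosses_y) << 1) | int(crosses_x)
    return format(bits, "03b")

print(toss_bits(False, 1.905, 1.905, 1.905))  # 000 (center of a tile)
print(toss_bits(True, 0.1, 0.1, 1.905))       # 111 (heads, in a corner)
```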

As mentioned, one big advantage, aside from getting 3 bits per coin toss, is the smaller grid printed on paper. You have less opportunity to try to cheat the system and influence the results, but you also may have a more difficult time deciding if the coin crossed a grid edge.

On the left is the paper grid used for 2-bit coin toss extraction, while the paper grid on the right is used for 3-bit coin toss extraction.

SOME THOUGHTS

Both fortunately and unfortunately, the coin will bounce when it lands on the paper. If the bounce causes the coin to bounce off the paper, all you can record is a single bit, or one of two outcomes (heads or tails), because it becomes ambiguous whether or not the coin would have crossed a grid edge, and which edge it would have crossed.

You could probably minimize the bouncing by putting the paper on something less hard than a table, such as a felt card table, a towel or rag, or even a carpeted floor. But this may also create unwanted bias in the result. For example, if the paper is placed on something too soft, such as a carpeted floor, the coin could damage the paper when it hits, thus possibly influencing future flips, and as a result, introducing bias into the system.

Surrounding the paper with "walls", such that the coin cannot pass the boundaries of the grid, may work, but the coin bouncing off the walls will also impact the outcomes. It seems clear that the walls would need to be placed immediately against the outer edge of the grid to prevent introducing any bias.

ADDITIONAL APPROACHES

This is hardly conclusive on extracting extra randomness from a coin flip. It is known that coins precess when flying through the air, and as part of that precession, the coin may be spinning like a wheel. Which means that the coin could be "facing north" or "facing south" in addition to heads or tails. However, it's not clear to me how much spin exists in a standard flip, and if this could be a reliable source of randomness.

Further, in our 8-outcome approach, we used the center of the coin as the basis for our extra 2 bits, in addition to heads and tails. However, we could have made the grid edge length "4d" instead of "2d", and ignored the toss if the coin crosses both edges. This means a larger grid, which could improve your chances of aiming the coin in an attempt to skew the results, and also means sticking with smaller coins, as larger coins just won't have many grids that can fit on a

paper.

Other ideas could be adding color to each grid. So not only do we identify edge crossing, but use color as a third dimension to get up to 4-bits of randomness. So maybe the x-axes have alternating black and white areas, as do the y-axis, and the corners. The center of the grid could be possibly alternating red/blue quadrants. Where the center of the coin lands determines the extracted bits. Of course, this would be a visually busy, and possibly confusing paper. None of these have been investigated, but I think each could be interesting approaches to how to extract more bits out of a coin toss.

CONCLUSION

I think this is a refreshing approach to an age-old problem: the coin toss. Extracting 3 bits from a single flip is extracting more entropy than a fair d6 die can produce (log2(6) ~= 2.58 bits). This means that practically speaking, the coin is more efficient at entropy extraction than dice. However, you can roll multiple dice simultaneously, where it's more difficult to toss multiple coins simultaneously.

ACKNOWLEDGMENTS

Thanks to Dr. Markku-Juhani O. Saarinen, Marsh Ray, and JV Roig for the discussions we had on Twitter, and for helping me flesh out the ideas.

2018 08 10 Cryptology, Personal, Security

MIDDLE SQUARE WEYL SEQUENCE PRNG

INTRODUCTION

The very first software algorithm to generate random numbers was supposedly written in 1946 by John von Neumann, and is called the Middle Square Method, and it's crazy simple. Enough so, you could execute it with a pencil, paper, and basic calculator. In this post, I'm going to cover the method, its drawbacks, and an approach called the Weyl Sequence.

MIDDLE SQUARE METHOD

The algorithm is to start with an n-digit seed. The seed is squared, producing a 2n-digit result, zero-padded as necessary. The middle n-digits are then extracted from the result for the next seed. See? Simple. Let's look at an example. Suppose my seed is 81 (2 digits). 81-squared is 6561 (4 digits). We then take the middle 2 digits out of the result, which is 56. We continue the process:

81^2 = 6561
56^2 = 3136
13^2 = 0169
16^2 = 0256
25^2 = 0625
62^2 = 3844
84^2 = 7056
5^2 = 0025
2^2 = 0004
0^2 = 0000
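The same walk can be reproduced in a few lines of Python. This is only an illustrative sketch; the generator name `middle_square` and the choice to stop at the first repeated value are mine.

```python
def middle_square(seed: int, n_digits: int = 2):
    """Yield successive middle-square values until a value is about to repeat."""
    seen = set()
    x = seed
    while x not in seen:
        seen.add(x)
        squared = str(x * x).zfill(2 * n_digits)    # zero-pad the square to 2n digits
        start = n_digits // 2
        x = int(squared[start:start + n_digits])    # keep the middle n digits
        yield x

print(list(middle_square(81)))  # [56, 13, 16, 25, 62, 84, 5, 2, 0, 0]
```

The trailing pair of zeros is the point where the sequence has collapsed onto a fixed value.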

And we've reached the core problem with the middle square method: it has a tendency to converge, most likely to zero, but other numbers are possible, and in some cases it falls into a short loop. Of course, John von Neumann was aware of this problem, but he also preferred it that way. When the middle square method fails, it's immediately noticeable. But, it's also horribly biased and fails most statistical tests for randomness.

MIDDLE SQUARE WEYL SEQUENCE

A modern approach to an old problem is known as the Middle Square Weyl Sequence, with the Weyl sequence named for Hermann Weyl. Basically, a number is added to the square, then the middle bits are extracted from the result for the next seed. Let's first look at the C code, then I'll explain it in detail.

    #include <stdint.h>

    uint64_t x = 0, w = 0;

    // Must be odd (least significant bit is "1"), and upper 32-bits non-zero
    uint64_t s = 0xb5ad4eceda1ce2a9; // qualifying seed

    // return 32-bit number
    inline static uint32_t msws() {
        x *= x; // square the number
        w += s; // the weyl sequence
        x += w; // apply to x
        return x = (x>>32) | (x<<32); // return the middle 32-bits
    }

EXPLANATION

Okay. Let's dive into the code. This is a 32-bit PRNG using John von Neumann's Middle Square Method, starting with a 64-bit seed "s". As the notes say, it must be an odd number, and the upper 32 bits must be non-zero. It must be odd, to ensure that "x" can be both odd and even. Recall: an odd plus an odd equals an even, and an odd plus an even equals an odd.

Note that at the start, "x" is zero, so squaring it is also zero. But that's not a problem, because we are adding a non-zero number. During that time, the "w" variable is assigned. It's dynamically changed on every iteration, although "s" remains static. Finally, our return is a 32-bit number (because of the function's "uint32_t" return type), but we're doing some bit-shifting. Supposedly, this is returning the middle 32 bits of our 64-bit "x", but that's not immediately clear. Let's look at it more closely.

EXAMPLE

Suppose "x = 0xace983fe671dbd09". Then "x" is a 64-bit number with the following bits:

    1010110011101001100000111111111001100111000111011011110100001001

When that number is squared, it becomes the 128-bit number 0x74ca9e5f63b6047f6a65456d9da04a51, or in binary:

    01110100110010101001111001011111011000111011011000000100011111110110101001100101010001010110110110011101101000000100101001010001

But remember, "x" is a 64-bit number, so in our C code, only the bottom 64 bits are returned from that 128-bit number. So "x" is really 0x6a65456d9da04a51, or in binary:

    0110101001100101010001010110110110011101101000000100101001010001

But the bits "01101010011001010100010101101101" are the 3rd 32 bits of the 128-bit number that was the result of squaring "x" (see above). They are the "middle" 32 bits that we're after. So, we're going to do something rather clever. We're going to swap the upper 32 bits with the lower, then return the lower 32 bits. Effectively, what we're doing is "ABCD" -> "CDAB", then returning "AB". We do this via bit-shifting. So, starting with:

    0110101001100101010001010110110110011101101000000100101001010001

First, we bitshift the 64-bit number right 32 bits:

    0110101001100101010001010110110110011101101000000100101001010001 >> 32
    = 0000000000000000000000000000000001101010011001010100010101101101

Then we bitshift "x" left 32 bits:

    0110101001100101010001010110110110011101101000000100101001010001 << 32
    = 1001110110100000010010100101000100000000000000000000000000000000

Now we logically "or" them together:

      0000000000000000000000000000000001101010011001010100010101101101
    | 1001110110100000010010100101000100000000000000000000000000000000
      ================================================================
      1001110110100000010010100101000101101010011001010100010101101101

See the swap? Now, due to the function return width, we return the lower 32 bits as our random number, which is 01101010011001010100010101101101, or 1785021805 in decimal. We've arrived at our goal.
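For readers who want to poke at the bit-swapping without a C compiler, here is a rough Python model of the same steps. Python integers are unbounded, so the 64-bit wraparound is emulated with a mask; the class name `MSWS` and the helper `swap_halves` are my own, not part of the published generator.

```python
MASK64 = (1 << 64) - 1

def swap_halves(v: int) -> int:
    """Swap the upper and lower 32 bits of a 64-bit value ("ABCD" -> "CDAB")."""
    return ((v >> 32) | (v << 32)) & MASK64

class MSWS:
    """Python model of the msws() C function shown earlier."""
    def __init__(self, s: int = 0xb5ad4eceda1ce2a9):
        self.x = 0
        self.w = 0
        self.s = s

    def next(self) -> int:
        self.x = (self.x * self.x) & MASK64   # square, keeping the low 64 bits
        self.w = (self.w + self.s) & MASK64   # advance the Weyl sequence
        self.x = (self.x + self.w) & MASK64   # add it to the square
        self.x = swap_halves(self.x)          # rotate the middle bits into place
        return self.x & 0xFFFFFFFF            # low 32 bits are the output

# The swap from the worked example:
print(swap_halves(0x6a65456d9da04a51) & 0xFFFFFFFF)  # 1785021805
```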

CONCLUSION

At the main website, the C source code is provided, along with 25,000 seeds, as well as C source code for the Big Crush randomness tests from TestU01. This approach passes Big Crush with flying colors on all 25,000 seeds. Something as simple as adding an odd 64-bit number to the square changes John von Neumann's approach so much, it becomes a notable PRNG. Who said you can't teach an old dog new tricks?

2018 07 30 General, Scripting

WHY THE "MULTIPLY AND FLOOR" RNG METHOD IS BIASED

I've been auditing a lot of JavaScript source code lately, and a common problem I'm seeing when generating random numbers is using the naive "multiply-and-floor" method. Because the "Math.random()" function call returns a number between 0 and 1, not including 1 itself, developers think that the "best practice" for generating a random number is as follows:

    function randNumber(range) {
        return Math.floor(Math.random() * range); // number in the interval [0, range).
    }

The problem with this approach is that it's biased. There are numbers returned that are more likely to occur than others. To understand this, you need to understand that Math.random() is a 32-bit RNG in Chrome and Safari, and a 53-bit RNG in Edge and Firefox. First, let's pretend every browser RNG is a 32-bit generator, then we'll extend it.

A 32-bit Math.random() means that there are only 2^32 = 4,294,967,296 possible decimal values in the range of [0, 1). This means that the interval [0, 1) is divided up every "1/2^32 = 0.00000000023283064365" decimal values. But that doesn't matter though, because if I wanted one of 100 possible numbers, 100 does not divide 4,294,967,296 evenly. I get 42,949,672 with 96 left over. What does this mean? It means that ...

    randNumber(100);

... will favor 96 numbers out of our 100. The 4 least likely results are 24, 49, 74, & 99. That's our bias.

It doesn't matter if it's a 53-bit RNG either. "2^53 = 9,007,199,254,740,992" is not a multiple of 100. Instead, dividing by 100, I get 90,071,992,547,409 with 92 left over. So, with a 53-bit RNG, we have the same problem where 92 results will be more likely to be generated than 8 others. Those unlucky 8 are 11, 22, 33, 45, 58, 66, 79, and 91.
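The 32-bit case can be counted exactly without looping over all 2^32 inputs. The sketch below is my own illustration (the helper `bucket_counts` is a made-up name) and models an idealized generator that emits every k/2^bits with equal probability.

```python
def bucket_counts(bits: int, rng_range: int) -> list:
    """How many of the 2**bits generator states land on each multiply-and-floor result."""
    total = 1 << bits

    def first_k(v: int) -> int:
        # Smallest k with floor(k * rng_range / total) >= v, i.e. ceil(v * total / rng_range).
        return -(-v * total // rng_range)

    return [first_k(v + 1) - first_k(v) for v in range(rng_range)]

counts = bucket_counts(32, 100)
least = [v for v, c in enumerate(counts) if c == min(counts)]
print(min(counts), max(counts))  # 42949672 42949673
print(least)                     # [24, 49, 74, 99]
```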

The only time this bias would not exhibit itself in the naive "multiply-and-floor" approach above, is if the random number requested is in the interval [0, 2^N), where "N" is any positive integer. 2^32, 2^53, and 2^X, where "X" is a positive integer, is always a multiple of 2^N (2^N divides 2^X evenly, when N ≤ X, N > 0).

So, what do we do? How do we improve the naive multiply-and-floor approach? Thankfully, it's not too difficult. All we need to do is essentially the following:

* Force the RNG into 32-bits (common denominator for all browsers).
* Create a range of values that is a multiple of our desired range (E.G.: 1-100).
* Loop over the range picking values until a value inside the range is generated.
* Output the generated value modulo our desired range.

Let's see this in practice. First the unbiased code, then the explanation:

    function uniformRandNumber(range) {
        var max = Math.floor(2**32/range) * range; // make "max" a multiple of "range"
        do {
            var x = Math.floor(Math.random() * 2**32); // pick a number of [0, 2^32).
        } while(x >= max); // try again if x is too big
        return(x % range); // uniformly picked in [0, range)
    }

I know what you're thinking: WAIT! YOU JUST DID THE "MULTIPLY AND FLOOR" METHOD!! HYPOCRITE!!! Hold on though. There are two subtle differences. See what they are?

The "max" variable is a multiple of "range" (step 2 above). So, if our range is [0, 100), then "max = 4294967200", which is a multiple of 100. This means that so long as "0 <= x < 4294967200", we can return "x % 100", and know that our number was uniformly chosen. However, if "x >= 4294967200", then we need to choose a new "x", and check if it falls within our range again (step 3 above). So long as "x" falls in [0, 4294967200), then we're good.

This extends to cryptographically secure random numbers too. In action, it's just:

    function uniformSecureRandNumber(range) {
        const crypto = window.crypto || window.msCrypto; // Microsoft vs everyone else
        var max = Math.floor(2**32/range) * range; // make "max" a multiple of "range"
        do {
            // getRandomValues() fills and returns the typed array, so read element 0
            var x = crypto.getRandomValues(new Uint32Array(1))[0]; // pick a number of [0, 2^32).
        } while(x >= max); // try again if x is too big
        return(x % range); // uniformly picked in [0, range)
    }

So it's not that "multiply and floor" is _wrong_ so long as you _use it correctly_.

One small caveat: these examples are not checking if "range" is larger than 32-bits. I deliberately ignored this to draw your attention to how to correctly generate uniform random numbers. You may or may not need to do various checks on the "range" argument. Is it an integer type? Is it a positive integer? Is it 32-bits or less? Etc.

As an exercise for the reader, how could you extend this uniform generator to pick a random number in the range of [100, 200)? Going further, how could you pick only a random _even number_ in the range of [250, 500)?
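One possible answer to the exercise, sketched in Python rather than JavaScript. The helpers `uniform_below` and `uniform_in` are hypothetical names of mine; the only library call is `secrets.randbits()` from the Python standard library, and the rejection loop mirrors the "max" trick from the code above.

```python
import secrets

def uniform_below(n: int) -> int:
    """Uniform integer in [0, n) by rejection sampling over 32-bit values."""
    if not 0 < n <= 2**32:
        raise ValueError("n must be in (0, 2**32]")
    limit = (2**32 // n) * n           # largest multiple of n not exceeding 2**32
    while True:
        x = secrets.randbits(32)       # uniform in [0, 2**32)
        if x < limit:                  # reject the biased tail
            return x % n

def uniform_in(lo: int, hi: int, step: int = 1) -> int:
    """Uniform pick from lo, lo+step, lo+2*step, ... strictly below hi."""
    count = (hi - lo + step - 1) // step   # how many candidates there are
    return lo + step * uniform_below(count)

print(uniform_in(100, 200))      # some integer in [100, 200)
print(uniform_in(250, 500, 2))   # some even integer in [250, 500)
```

The idea is to shift and scale a draw from [0, count) instead of sampling the target range directly, so the rejection step stays the same.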

2018 06 13 Cryptology, Scripting, Security

DO NOT USE SHA256CRYPT / SHA512CRYPT - THEY'RE DANGEROUS

INTRODUCTION

I'd like to demonstrate why I think using sha256crypt or sha512crypt on current GNU/Linux operating systems is dangerous, and why I think the developers of GLIBC should move to scrypt or Argon2, or at least bcrypt or PBKDF2.

HISTORY AND MD5CRYPT

In 1994, Poul-Henning Kamp (PHK) added md5crypt to FreeBSD to address the weaknesses of DES-crypt that was common on the Unix and BSD systems of the early 1990s. DES-Crypt has a core flaw in that, not only is DES reversible (which isn't necessarily a problem here) and incredibly fast, but it also limited password length to 8 characters (each of those limited to 7-bit ASCII to create a 56-bit DES key). When PHK created md5crypt, one of the things he made sure to implement as a feature was support for arbitrary-length passwords. In other words, unlike DES-Crypt, a user could have passwords of 9 or more characters.

This was "good enough" for 1994, but it had an interesting feature that I don't think PHK thought of at the time: md5crypt execution time is dependent on password length. To prove this, I wrote a simple Python script using passlib to hash passwords with md5crypt. I started with a single "a" character as my password, then increased the password length by appending more "a"s up until the password was 4,096 "a"s total.

    import time
    from passlib.hash import md5_crypt

    md5_results = [None] * 4096

    for i in range(0, 4096):
        print(i, end=" ")
        pw = "a" * (i+1)
        start = time.process_time()  # CPU time; time.clock() was removed in Python 3.8
        md5_crypt.hash(pw)
        end = time.process_time()
        md5_results[i] = end - start

    with open("md5crypt.txt", "w") as f:
        for i in range(0, 4096):
            f.write("{0} {1}\n".format(i+1, md5_results[i]))

Nothing fancy. Start the timer, hash one "a" with md5crypt, stop the timer, and record the results. Start the timer, hash two "a"s with md5crypt, stop the timer, and record the results. Wash, rinse, repeat, until the password is 4,096 "a"s in length.

What do the timing results look like? Below are scatter plots of timing md5crypt for passwords of 1-128, 1-512, and 1-4,096 characters in length:

[Figures: md5crypt 1-128 characters; md5crypt 1-512 characters; md5crypt 1-4,096 characters]

At first, you wouldn't think this is a big deal; in fact, you may even think you LIKE it (we're supposed to make things get slower, right? That's a good thing, right???). But, upon deeper inspection, this actually is a flaw in the algorithm's design for two reasons:

* Long passwords can create a denial-of-service on the CPU (larger concern).
* Passive observation of execution times can predict password length (smaller concern).

Now, to be fair, predicting password length based on execution time is ... meh. Let's be honest, the bulk of passwords will be between 7-10 characters. And because these algorithms operate in block sizes of 16, 32, or 64 bytes, an adversary learning "AHA! I know your password is between 1-16 characters" really isn't saying much. But, should this even exist in a cryptographic primitive? Probably not. Still, the larger concern would be users creating a DoS on the CPU, strictly by changing password length.

I know what you're thinking: it's 2018, so there should be no reason why any practical length password cannot be adequately hashed with md5crypt insanely quickly, and you're right. Except, md5crypt was invented in 1994, 24 years ago. According to PHK, he designed it to take about 36 milliseconds on the hardware he was testing, which would mean a speed of about 28 hashes per second. So, it doesn't take much to see that by increasing the password's length, you can increase execution time enough to affect a busy authentication server.

The question though, is why? Why is the execution time dependent on password length? This is because md5crypt processes the hash for every 16 bytes in the password. As a result, this creates the stepping behavior you see in the scatter plots above. A good password hashing design would not do this.

PHK eventually sunset md5crypt in 2012 with CVE-2012-3287. Jeremi Gosney, a professional password cracker, demonstrated with Hashcat and 8 clustered Nvidia GTX 1080Ti GPUs, that a password cracker could rip through 128.4 million md5crypt guesses per second.

You should no longer be implementing md5crypt for your password hashing.

SHA2CRYPT AND NIH SYNDROME

In 2007, Ulrich Drepper decided to improve things for GNU/Linux. He recognized the threat that GPU clusters, and even ASICs, posed on fast password cracking with md5crypt. One aspect of md5crypt was the hard-coded 1,000 iterations spent on the CPU, before the password hash was finalized. This cost was not configurable. Also, MD5 was already considered broken, with SHA-1 showing severe weaknesses, so he moved to SHA-2 for the core of his design.

The first thing addressed was to make the cost configurable, so as hardware improved, you could increase the iteration count, thus keeping the cost for calculating the final hash expensive for password crackers. However, he also made a couple core changes to his design that differed from md5crypt, which ended up having some rather drastic effects on its execution.

Using code similar to above with Python's passlib, but rather using the _sha256_crypt()_ and _sha512_crypt()_ functions, we can create scatter plots of sha256crypt and sha512crypt for passwords up to 128-characters, 512-characters, and 4,096-characters total, just like we did with md5crypt. How do they fall out? Take a look:

[Figures: sha256crypt 1-128 characters; sha256crypt 1-512 characters; sha256crypt 1-4,096 characters; sha512crypt 1-128 characters; sha512crypt 1-512 characters; sha512crypt 1-4,096 characters]

Curious. Not only do we see the same increasing execution time based on password length, but unlike md5crypt, that growth is polynomial. The changes Ulrich Drepper made from md5crypt are subtle, but critical.

Essentially, not only do we process the hash for every character in the password per round, like md5crypt, but we process every character in the password _three more times_. First, we take the binary representation of each bit in the password length, and update the hash based on if we see a "1" or a "0". Second, for every character in the password, we update the hash. Finally, again, for every character in the password, we update the hash.

For those familiar with big-O notation, we end up with an execution run time of O(pw_length^2 + pw_length*iterations). Now, while it is true that we want our password hashing functions to be slow, we also want the iterative cost to be the driving factor in that decision, but that isn't the case with md5crypt, and it's not the case with sha256crypt nor sha512crypt. In all three cases, the password length is the driving factor in the execution time, not the iteration count.

Again, why is this a problem? To remind you:

* Long passwords can create a denial-of-service on the CPU (larger concern).
* Passive observation of execution times can predict password length (smaller concern).

Now, granted, in practice, people aren't carrying around 4 kilobyte passwords. If you are a web service provider, you probably don't want people uploading 5 gigabyte "passwords" to your service, creating a network denial of service. So you would probably be interested in creating an adequate password maximum, such as what NIST recommends at 128 characters, to prevent that from occurring. However, if you have an adequate iterative cost (such as say, 640,000 rounds), then even moderately large passwords from staff, where such limits may not be imposed, could create a CPU denial of service on busy authentication servers.

As with md5crypt, we don't want this.

Now, here's what I find odd about Ulrich Drepper, and his design. In his post, he says about his specification (emphasis mine):

> Well, there is a problem. I can already hear everybody complaining
> that I suffer from the NIH syndrome BUT THIS IS NOT THE REASON. The
> same people who object to MD5 make their decisions on what to use
> also BASED ON NIST GUIDELINES. And Blowfish is not on the lists of
> the NIST. Therefore bcrypt() does not solve the problem.
>
> WHAT IS ON THE LIST IS AES AND THE VARIOUS SHA HASH FUNCTIONS. Both
> are viable options. The AES variant can be based upon bcrypt(), the
> SHA variant could be based on the MD5 variant currently implemented.
>
> Since I had to solve the problem and I consider both solutions
> equally secure I went with the one which involves less code. The
> solution we use is based on SHA. More precisely, on SHA-256 and
> SHA-512.

PBKDF2 was standardized as an IETF standard in September 2000, a full 7 years before Ulrich Drepper created his password hashing functions. While PBKDF2 as a whole would not be blessed by NIST until 3 years later, in December 2010 in SP 800-132, PBKDF2 can be based on functions that, as he mentioned, were already in the NIST standards. So, just like his special design that is based on SHA-2, PBKDF2 can be based on SHA-2. Where he said "I went with the one which involves less code", he should have gone with PBKDF2, as code had already long since existed in all sorts of cryptographic software, including OpenSSL.

This seems to be a _very clear case of NIH syndrome_. Sure, I understand not wanting to go with bcrypt, as it's not part of the NIST standards. But don't roll your own crypto either, when algorithms already exist for this very purpose, that _ARE_ based on designs that are part of NIST.

So, how does PBKDF2-HMAC-SHA512 perform? Using similar Python code with the passlib password hashing library, it was trivial to put together:

[Figures: PBKDF2-HMAC-SHA512 1-128 characters; PBKDF2-HMAC-SHA512 1-512 characters; PBKDF2-HMAC-SHA512 1-4,096 characters]

What this clearly demonstrates is that the _only factor_ driving execution time is the number of iterations you apply to the password before delivering the final password hash. This is what you want to achieve: not giving a user the opportunity to create a denial-of-service based on password length, nor letting an adversary learn the length of the user's password based on execution time.

_These are the sort of details that a cryptographer or cryptography expert would pay attention to, as opposed to an end-developer._

It's worth pointing out that PBKDF2-HMAC-SHA512 is the default password hashing function for Mac OS X, with a variable cost between 30,000 and 50,000 iterations (typical PBKDF2 default is 1,000).
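If you'd like to repeat this measurement without installing passlib, a rough sketch using only the Python standard library follows. It is not the script that produced the plots above; the helper `time_pbkdf2`, the fixed demo salt, and the 10,000 iteration count are all placeholder choices of mine, and actual timings will vary by machine.

```python
import hashlib
import time

def time_pbkdf2(pw_length: int, iterations: int = 10_000) -> float:
    """CPU seconds spent deriving one PBKDF2-HMAC-SHA512 key for an all-"a" password."""
    password = b"a" * pw_length
    salt = b"fixed-demo-salt"   # demo only; real systems use a random per-user salt
    start = time.process_time()
    hashlib.pbkdf2_hmac("sha512", password, salt, iterations)
    return time.process_time() - start

for n in (1, 128, 4096):
    print(n, time_pbkdf2(n))
```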

OPENBSD, USENIX, AND BCRYPT

Because Ulrich Drepper brought up bcrypt, it's worth mentioning in this post. First off, let's get something straight: bcrypt IS NOT Blowfish. While it's true that bcrypt _is based on_ Blowfish, they are two completely different cryptographic primitives. bcrypt is a one-way cryptographic password hashing function, whereas Blowfish is a two-way 64-bit block symmetric cipher.

At the 1999 USENIX conference, Niels Provos and David Mazières, of OpenBSD, introduced bcrypt to the world (it was actually in OpenBSD 2.1, June 1, 1997). They were critical of md5crypt, stating the following (emphasis mine):

> MD5 crypt hashes the password and salt in a number of different
> combinations to slow down the evaluation speed. Some steps in the
> algorithm make it DOUBTFUL THAT THE SCHEME WAS DESIGNED FROM A
> CRYPTOGRAPHIC POINT OF VIEW--for instance, the binary representation
> of the password length at some point determines which data is
> hashed, for every zero bit the first byte of the password and for
> every set bit the first byte of a previous hash computation.

PHK was slightly offended by their off-handed remark that cryptography was not his core consideration when designing md5crypt. However, Niels Provos was a graduate student in the Computer Science PhD program at the University of Michigan at the time. By August 2003, he had earned his PhD.

Since 1997, bcrypt has withstood the test of time, it has been considered "Best Practice" for hashing passwords, and is still well received today, even though better algorithms exist for hashing passwords.

bcrypt limits password input to 72 bytes. One way around the password limit is with pre-hashing. A common approach is to hash the password with SHA-256, encode the digest into base64, then feed the resulting ASCII string into bcrypt. However, make sure to salt the prehash, or you fall victim to breach correlation attacks. Using HMAC is a better option than generic cryptographic hashes, as it has a construction for properly handling secret keys. In this case, a site-wide secret known as a "pepper" is appropriate.
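The pre-hash step just described can be sketched with nothing but the Python standard library. The function name `prehash` is my own, and the final bcrypt call is deliberately left out: you would pass the returned bytes, along with bcrypt's own salt and cost, to whichever bcrypt library you use.

```python
import base64
import hashlib
import hmac

def prehash(password: str, pepper: bytes) -> bytes:
    """HMAC-SHA-256 the password with a site-wide pepper, then base64-encode the digest."""
    digest = hmac.new(pepper, password.encode("utf-8"), hashlib.sha256).digest()
    return base64.b64encode(digest)   # 32 raw bytes become 44 ASCII characters

short = prehash("a", b"site-wide-secret")
long_ = prehash("a" * 4096, b"site-wide-secret")
print(len(short), len(long_))  # 44 44
```

Whatever the password length, the output is always 44 bytes, comfortably inside bcrypt's 72-byte input limit.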

In pseudocode:

    pwhash = bcrypt(base64(hmac-sha-256(password, pepper, 256)), salt, cost)

This results in a 44-byte password (including the "=" padding) that is within the bounds of the 72 byte bcrypt limitation. This prehashing allows users to have any length password, while only ever sending 44 bytes to bcrypt. My implementation in this benchmark uses the passlib.hash.bcrypt_sha256.hash() method. How does bcrypt compare to md5crypt, sha256crypt, and sha512crypt in execution time based on password length?

[Figures: bcrypt 1-128 characters (prehashed); bcrypt 1-512 characters (prehashed); bcrypt 1-4,096 characters (prehashed)]

Now, to be fair, bcrypt is only ever hashing 44 byte passwords in the above results, because of my prehashing. So of course it's running in constant time. So, how does it look with hashing 1 to 72 character passwords without prehashing?

[Figure: bcrypt 1-72 characters (raw)]

Again, we see consistent execution, driven entirely by iteration cost, not by password length.

COLIN PERCIVAL, TARSNAP, AND SCRYPT

In May 2009, mathematician Dr. Colin Percival presented to BSDCan'09 about a new adaptive password hashing function called scrypt, that was not only CPU expensive, but RAM expensive as well. The motivation was that even though bcrypt and PBKDF2 are CPU-intensive, FPGAs or ASICs could be built to work through the password hashes much more quickly, due to not requiring much RAM, around 4 KB. By adding a memory cost, in addition to a CPU cost to the password hashing function, we can now require the FPGA and ASIC designers to onboard a specific amount of RAM, thus financially increasing the cost of production. scrypt recommends a default RAM cost of at least 16 MB. I like to think of these expensive functions as "security by obesity".

scrypt was initially created as an expensive KDF for his backup service Tarsnap. Tarsnap generates client-side encryption keys, and encrypts your data on the client, before shipping the encrypted payload off to Tarsnap's servers. If at any point your client is lost or stolen, generating the encryption keys requires knowing the password that created them, and attempting to discover that password, just like typical password hashing functions, should be slow.

It's now been 9 years as of this post since Dr. Percival introduced scrypt to the world, and like bcrypt, it has withstood the test of time. It has received, and continues to receive, extensive cryptanalysis, is not showing any critical flaws or weaknesses, and as such is among the top choices as a recommendation from security professionals for password hashing and key derivation.

How does it fare with its execution time per password length?

[Figures: scrypt 1-128 characters; scrypt 1-512 characters; scrypt 1-4,096 characters]

I'm seeing a trend here.

THE PASSWORD HASHING COMPETITION WINNER ARGON2

In 2013, an open public competition, in the spirit of AES and SHA-3, was held to create a password hashing function that approached password security from what we knew with modern cryptography and password security. There were many interesting designs submitted, including a favorite of mine by Dr. Thomas Pornin of StackExchange fame and BearSSL, that used delegation to reduce the work load on the honest, while still making it expensive for the password cracker.

In July 2015, the Argon2 algorithm was chosen as the winner of the competition. It comes with a clean approach of CPU and memory hardness, making the parameters easy to tweak, test, and benchmark. Even though the algorithm is relatively new, it has seen at least 5 years of analysis, as of this writing, and has quickly become the "Gold Standard" for password hashing. I fully recommend it for

production use.

Any bets on how it will execution times will be affected by password

length? Let's look:

Argon2 1-128 characters Argon2 1-512 characters Argon2 1-4,096 characters Execution time is not affected by password length. Imagine that. It's as if cryptographers know what they're doing when designing this

stuff.
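Returning to the bcrypt prehashing step described earlier, here is a minimal Python sketch of why the input handed to bcrypt never grows with the password. It only covers the HMAC-SHA-256 and base64 part, not bcrypt itself. The function name, the example pepper, and the choice of the pepper as the HMAC key are assumptions for illustration; this is not the passlib implementation.

```python
import base64
import hashlib
import hmac


def prehash(password: str, pepper: bytes) -> bytes:
    """Return base64(HMAC-SHA-256(key=pepper, msg=password)).

    HMAC-SHA-256 always yields 32 bytes, and base64 of 32 bytes is always
    44 characters (including one "=" of padding), whatever the password length.
    """
    digest = hmac.new(pepper, password.encode("utf-8"), hashlib.sha256).digest()
    return base64.b64encode(digest)


# Hypothetical pepper, for illustration only.
EXAMPLE_PEPPER = b"example-pepper"

# Passwords of very different lengths all map to a 44-byte bcrypt input.
prehash_lengths = [len(prehash("a" * n, EXAMPLE_PEPPER)) for n in (1, 72, 4096)]
```

Because the prehash output is fixed at 44 bytes, it stays under bcrypt's 72-byte limit no matter how long the user's password is.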

CONCLUSION

Ulrich Drepper tried creating something more secure than md5crypt, on par with bcrypt, and ended up creating something worse. Don't use sha256crypt or sha512crypt; they're dangerous.

For hashing passwords, use one of the following, in order of preference, with an appropriate cost:

* Argon2 or scrypt (CPU and RAM hard)
* bcrypt or PBKDF2 (CPU hard only)

Avoid practically everything else:

* md5crypt, sha256crypt, and sha512crypt
* Any generic cryptographic hashing function (MD5, SHA-1, SHA-2, SHA-3, BLAKE2, etc.)
* Any complex homebrew iterative design (10,000 iterations of salted SHA-256, etc.)
* Any encryption design (AES, Blowfish (ugh), ChaCha20, etc.)

UPDATE: 2020-12-28: Debian just pushed Linux PAM 1.4.0 into the unstable repository. This enables bcrypt password hashing for Debian and Debian-based systems by default without any 3rd party tools or custom source code compilation. It is strongly advised that you drop sha256crypt/sha512crypt in favor of bcrypt.

UPDATE: A note about PBKDF2 that was brought up in a Twitter thread from @solardiz. PBKDF2-HMAC-SHA512 isn't really an upgrade from sha512crypt (nor PBKDF2-HMAC-SHA256 an upgrade from sha256crypt), because PBKDF2 really isn't GPU resistant in the way bcrypt is. However, bcrypt can be implemented cheaply on ASICs with only 4 KB of memory. If your choice of password hashing is constrained to NIST standards, which include PBKDF2, then unfortunately, bcrypt, scrypt, and Argon2 are out of the question; just make sure to use PBKDF2 properly, which includes choosing a high iteration count based on your authentication load capacity. At that point, password storage is probably not the worst of your security concerns.

However, if you're not limited to NIST constraints, then use the others.
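For readers who are constrained to PBKDF2, the following is a minimal sketch of salted hashing and verification with Python's standard library. The function names and the iteration count are assumptions for illustration; the count should be tuned to your own authentication load, as described above.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

# Assumed value for illustration; benchmark on your own hardware and raise it
# as high as your authentication load allows.
DEFAULT_ITERATIONS = 600_000


def pbkdf2_hash(password: str,
                salt: Optional[bytes] = None,
                iterations: int = DEFAULT_ITERATIONS) -> Tuple[bytes, bytes]:
    """Return (salt, derived_key) using PBKDF2-HMAC-SHA512."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per password
    key = hashlib.pbkdf2_hmac("sha512", password.encode("utf-8"), salt, iterations)
    return salt, key


def pbkdf2_verify(password: str, salt: bytes, expected: bytes,
                  iterations: int = DEFAULT_ITERATIONS) -> bool:
    """Recompute the hash and compare in constant time."""
    _, key = pbkdf2_hash(password, salt, iterations)
    return hmac.compare_digest(key, expected)
```

Store the salt and the iteration count alongside the derived key, so the count can be raised later without invalidating existing hashes.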

ACKNOWLEDGEMENT

Thanks to Steve Thomas (@Sc00bzT) for our discussions on Twitter, which helped me see this quirky behavior with sha256crypt and sha512crypt.

2018 05 23 | Cryptology, Passwords, Python, Scripting, Security

USE A GOOD PASSWORD GENERATOR

INTRODUCTION

For the past several months now, I have been auditing password generators for the web browser, tracking the results in Google Sheets. It started by looking for creative ideas I could borrow or extend upon for my online password generator. Sure enough, I found some, ranging from using mouse movements as a source of entropy to flashy animations of rolling dice for a Diceware generator. Some were deterministic, using strong password hashing or key derivation functions, and some had very complex interfaces, allowing you to control everything from letter case to pronounceable passwords and unambiguity.

However, the deeper I got, the more I realized some were doing their generation securely and others weren't. I soon realized that I wanted to grade these online generators and sift out the good from the bad. So, I created a spreadsheet to keep track of what I was auditing, and it quickly grew from "online generators" to "password generators and passphrase generators", to "web password, web passphrase, bookmarklet, Chrome extensions, and Firefox extensions". When all was said and done, I had audited 300 total generators that can be used with the browser. Some were great while some were just downright horrible. So, what did I audit, why did I choose those criteria, and how did the generators fall out?

I audited:

* Software license
* Server vs. client generation
* RNG security, bias, and entropy
* Generation type
* Network security
* Mobile support
* Ads or tracker scripts
* Subresource integrity

No doubt this is a "version 1.0" of the spreadsheet. I'm sure those in the security community will mock me for my choices of audit categories and scoring. However, I wanted to be informed of how each generator was generating the passwords, so when I made recommendations about using a password generator, I had confidence that I was making a good recommendation.

USE A PASSWORD MANAGER

Before I go any further, the most important advice with passwords is to USE A PASSWORD MANAGER. There are a number of reasons for this:

* They encourage unique passwords for each account.
* They encourage passwords with sufficient entropy to withstand offline clustered attacks.
* They allow storage of other types of data, such as SSNs or credit card numbers.
* Many provide online synchronization support across devices, either internally or via Dropbox and Google Drive.
* Many ship additional application support, such as browser extensions.
* THEY SHIP PASSWORD GENERATORS.

So before you go any further, the Best Practice for passwords is "Use A Password Manager". As members of the security community, this is the advice we should be giving to our clients, whether they are family, friends, coworkers, or professional customers. But if they are already using a password manager, and discussions arise about password generation, then this audit is to inform members of the security community which generators are "great", which generators are "good", which generators are "okay", and finally, which generators are "bad".

So to be clear, I don't expect mom, dad, and Aunt Josephine to read this spreadsheet to make informed decisions about which password generator to use. I do hope, however, that security researchers, cryptographers, auditors, security contractors, and other members of the security community will take advantage of it. So with that said, let me explain the audit categories and scoring.

SOFTWARE LICENSE

In an ethical software development community, there is value granted when software is licensed under a permissive "copyleft" license. Not necessarily GPL, but any permissive license, from the 2-clause BSD to the GPL, from the Creative Commons to unlicensed public domain software. When the software is licensed liberally, it allows developers to extend, modify, and share the code with the larger community. There are a number of different generators I found in my audit where this happened; some making great changes in the behavior of the generator, others not-so-much.

License: Open Source | Proprietary

So when a liberal license was explicitly specified, I awarded one point for being "Open Source", and no points for being "Proprietary" when a license was either not explicitly specified or was a EULA or "All Rights Reserved". This is something to note: just because the code is on a public source code repository does not mean the code is licensed under a permissive license. United States copyright law states that unless explicitly stated, all works fall under a proprietary "All Rights Reserved" copyright to the creator. So if you didn't specify a license in your GitHub repository, it got labeled as "Proprietary". It's unfortunate, because I think a lot of generators missed getting awarded that point for a simple oversight.

SERVER VS. CLIENT GENERATION

Every generator should run in the browser client without any knowledge of the generation process by a different computer, even the web server. No one should have any knowledge whatsoever of what passwords were generated in the browser. Now, I recognize that this is a bit tricky. When you visit a password generator website such as my own, you are showing a level of trust that the JavaScript delivered to your browser is what you expect, and is not logging the generated passwords back to the server. Even with TLS, unless you're checking the source code on every page refresh and disconnecting your network, you just cannot guarantee that the web server did not deliver some sort of logging script.

Generator: Client | Server

With that said, you still should be able to check the source code on each page refresh, and check if the password is being generated in the client or on the server. I awarded one point for being a "Client" generator and no points for being a "Server" generator. Interestingly enough, I thought I would just deal with this for the website generators, and wouldn't have to worry about this with bookmarklets or browser extensions. But I was wrong. I'll expand on this more in the "Network Security" category, but suffice it to say, this is still a problem.

GENERATION TYPE

I think deterministic password generators are fundamentally flawed. Fatally so, even. Tony Arcieri wrote a beautiful blog post on this matter, and it should be internalized across the security community. The "four fatal flaws" of deterministic password generators are:

* Deterministic password generators cannot accommodate varying password policies without keeping state.
* Deterministic password generators cannot handle revocation of exposed passwords without keeping state.
* Deterministic password managers can’t store existing secrets.
* Exposure of the master password alone exposes all of your site passwords.

Number 4 in that list is the most fatal. We all know the advice that accounts should have unrelated randomized unique passwords. When one account is compromised, the exposed password does not compromise any of the other accounts. This is not the case with deterministic password generators. Every account that uses a password from a deterministic generator shares a common thread via the master secret. When that master secret is exposed, all online accounts are fatally vulnerable to compromise.

Proponents of deterministic generators will argue that discovery of the master password of an encrypted password manager database will also expose every online account to compromise. They're not wrong, but let's consider the attack scenarios. In the password manager scenario, a first compromise needs to happen in order to get access to the encrypted database file. Then a second compromise needs to happen in discovering the master password. But with deterministic generators, only one compromise needs to take place: using dictionary or brute force attacks to discover the master password that led to the password for the online account.

With password managers, two compromises are necessary. With deterministic generators, only one compromise is necessary. As such, the generator was awarded a point for being "Random" and no points for being "Deterministic".

Generator: Random | Unknown | Deterministic

RNG SECURITY

Getting random number generation right is one of the least understood concepts in software development, but ironically, developers seem to think they have a firm grasp of the basic principles. When generating passwords, never at any point should a developer choose anything but a cryptographic random number generator. I awarded one point for using a CRNG, and no points otherwise. In some cases, the generation is done on the server, so I don't know or can't verify its security, and in some cases, the code is so difficult to analyze that I cannot determine its security that way either.

CRNG: Yes | Maybe | Unknown | No

In JavaScript, using a CRNG primarily means using the Web Crypto API via "window.crypto.getRandomValues()", or "window.msCrypto.getRandomValues()" for Microsoft-based browsers. Never should I see "Math.random()". Even though it may be cryptographically secure in Opera, it likely isn't in any of the other browsers. Some developers shipped the Stanford JavaScript Cryptographic Library. Others shipped a JavaScript implementation of ISAAC, and others yet shipped some AES-based CSPRNG. While these are "fine", you really should consider ditching those userspace scripts in favor of just calling "window.crypto.getRandomValues()". It's less software the user has to download, and as a developer, you are less likely to introduce a vulnerability.
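The same principle applies outside the browser. As a rough Python analogue (not the Web Crypto API itself), the standard library's "secrets" module draws from the operating system's CSPRNG, while the "random" module plays the role of "Math.random()" and should not be used for passwords. The alphabet and the default length below are assumptions for illustration.

```python
import secrets
import string

# Assumed alphabet for illustration: the 94 printable, non-space ASCII characters.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_password(length: int = 20) -> str:
    """Build a password from the OS CSPRNG via secrets.choice().

    secrets.choice() picks each character uniformly, so no manual modulo
    arithmetic (and no modulo bias) is involved.
    """
    return "".join(secrets.choice(ALPHABET) for _ in range(length))
```

Calling generate_password() with no arguments yields a 20-character password, which at 94 symbols per character is a little over 131 bits of entropy.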

Also, RC4 is not a CSPRNG, neither is ChaCha20, SHA-256, or any other hash function, stream cipher, or block cipher. So if you were using some JavaScript library that is using a vanilla hashing function, stream cipher, or block cipher as your RNG, then I did not consider it as secure generation. The reason is that even though ChaCha20 or SHA-256 may be prediction resistant, it is not backtracking resistant. To be a CRNG, the generator must be both prediction and backtracking resistant.

However, in deterministic password generators that are based on a master password, the "RNG" (using this term loosely here) should be a dedicated password hashing function or password-based key derivation function with an appropriate cost. This really means using only:

* sha256crypt or sha512crypt with at least 5,000 rounds.
* PBKDF2 with at least 1,000 rounds.
* bcrypt with a cost of at least 5.
* scrypt with a cost of at least 16 MiB of RAM and at least 0.5s execution.
* Argon2 with sufficient cost of RAM and execution time.

Anything else, like hashing the master password or mouse entropy with MD5, SHA-1, SHA-2, or even SHA-3 will not suffice. The goal of those dedicated password hashing or key derivation functions is to slow down an offline attempt at discovering the password. Likely the master password does not contain sufficient entropy, so it remains the weakest link in the generation process. By using a dedicated password hashing or key derivation function with an appropriate cost, we can guarantee a certain "speed limit" with offline clustered password attacks, making it difficult to reconstruct the password.
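To make the scrypt memory figure above concrete, here is a minimal sketch using Python's hashlib.scrypt. scrypt needs roughly 128 * N * r bytes of RAM, so N = 2^14 with r = 8 works out to 16 MiB. The parameter choices, the function name, and the salt handling are assumptions for illustration; the 0.5s execution target has to be measured on your own hardware and N raised accordingly.

```python
import hashlib

# Assumed parameters for illustration: 128 * N * R bytes = 16 MiB of RAM.
SCRYPT_N = 2 ** 14
SCRYPT_R = 8
SCRYPT_P = 1
SCRYPT_MEMORY_BYTES = 128 * SCRYPT_N * SCRYPT_R


def derive_key(master_password: str, salt: bytes) -> bytes:
    """Derive 32 bytes from a master password with a 16 MiB scrypt cost."""
    return hashlib.scrypt(
        master_password.encode("utf-8"),
        salt=salt,
        n=SCRYPT_N,
        r=SCRYPT_R,
        p=SCRYPT_P,
        maxmem=64 * 1024 * 1024,  # headroom above the 16 MiB working set
        dklen=32,
    )
```

The same master password and salt always produce the same output, which is exactly why the cost parameters, and not secrecy of the algorithm, have to carry the burden of slowing an offline attack.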

RNG UNIFORMITY

Even though the developer may have chosen a CRNG, they may not be using the generator uniformly. This is probably the most difficult aspect of random number generation to grasp. It seems harmless enough to call "Math.floor(Math.random() * length)" or "window.crypto.getRandomValues(new Uint32Array(1))[0] % length". In both cases, unless "length" is a power of 2, the generator is biased. I awarded one point for being an unbiased generator, and zero points otherwise.

Uniform: Yes | Maybe | Unknown | No

To do random number generation in an unbiased manner, you need to find how many times the length divides the output range of the generator, and note the remainder. For example, if using "window.crypto.getRandomValues(new Uint32Array(1))", then the generator has an output range of 32 bits. If your length is 7,776, as in the case of Diceware, then 7,776 divides 2^32 552,336 times with a remainder of 2,560. This means that 7,776 divides the values 0 through 2^32 - 2,561 evenly. So if your random number is in the range 2^32 - 2,560 through 2^32 - 1, the value needs to be tossed out, and a new number generated.
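The rejection step described above can be sketched in Python as follows. This is an illustration of the technique rather than the code of any audited generator; the function names are hypothetical, and secrets.randbits() stands in for a 32-bit draw from "window.crypto.getRandomValues()".

```python
import secrets


def rejection_limit(length: int, bits: int = 32) -> int:
    """Largest multiple of `length` that fits in the 2**bits output range."""
    span = 2 ** bits
    return span - (span % length)


def uniform_below(length: int, bits: int = 32) -> int:
    """Return an unbiased integer in [0, length) by rejection sampling."""
    if not 0 < length <= 2 ** bits:
        raise ValueError("length must be between 1 and 2**bits")
    limit = rejection_limit(length, bits)
    while True:
        value = secrets.randbits(bits)  # CSPRNG draw in [0, 2**bits)
        if value < limit:               # toss values in the biased tail
            return value % length
```

For Diceware, rejection_limit(7776) is 2^32 - 2,560, so only 2,560 of the 4,294,967,296 possible draws are ever discarded, and the retry loop almost never runs more than once.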

Oddly enough, you could use an insecure RNG, such as SHA-256, but truncate the digest to a certain length. While the generator is not secure, the generator in this case is unbiased. More often than not, actually, deterministic generators seem to fall in this category, where a poor hashing function was chosen, but the digest was truncated.

RNG ENTROPY

I've blogged about this a number of times, so I won't repeat myself here. Search my blog for password entropy, and get caught up with the details. I awarded one point for generators with at least 70 bits of entropy, 0.5 points for 55 through 69 bits of entropy, and no points for entropy less than 55 bits.

Entropy (bits): 70 | 69 | 55 | 54

I will mention, however, that I chose the default value that was presented to me when generating the password. Worse, if I was forced to choose my length, and I could choose a password of one character, then I awarded it as such. When you're presenting password generators to people like Aunt Josephine, they'll do anything they can to get away with as little as possible. History has shown this is 6-8 characters in length. This shouldn't be possible. A few Nvidia GTX960 GPUs can crack every 8 character ASCII password hashed with SHA-1 in under a week. There is no reason why the password generator should not present minimum defaults that are outside the scope of practical hobbyist brute force searching. So while that may be harsh, if you developed one of the generators in my audit, and you were dinged for this, I'm calling you out. Stop it.
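As a quick sketch of the arithmetic behind the entropy score, a password of a given length drawn uniformly from an alphabet carries length * log2(alphabet size) bits. The function names are hypothetical, and treating fractional values between 69 and 70 bits as the 0.5-point tier is an assumption, since the scoring above is stated in whole bits.

```python
import math


def entropy_bits(length: int, alphabet_size: int) -> float:
    """Entropy of `length` symbols drawn uniformly from `alphabet_size` choices."""
    return length * math.log2(alphabet_size)


def entropy_score(bits: float) -> float:
    """Score per the audit: 1 for >= 70 bits, 0.5 for 55 up to 70, else 0."""
    if bits >= 70:
        return 1.0
    if bits >= 55:
        return 0.5
    return 0.0
```

An 8-character password over the 95 printable ASCII characters comes to about 52.6 bits and scores no points, while a 6-word Diceware passphrase (7,776 words per position) comes to about 77.5 bits and scores the full point.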

NETWORK SECURITY

When delivering the software for the password generation, it's critical that the software is delivered over TLS. There should be no room for a man-in-the-middle to inject malicious code to discover what passwords you're generating, send you a predetermined list of passwords, or attempt any other sort of compromise. This means, however, that I expect a green lock in the browser's address or status bars. The certificate should not be self-signed, it should not be expired, it should not be missing intermediate certificates, it should not be generated for a different CN, and it should not have any other certificate problems. Further, the site should be HTTPS _by default_.

HTTPS: Yes | Not Default | Expired | No

I awarded one point for "Yes", serving the software over secure and problem-free HTTPS, and zero points otherwise.

MOBILE VIEW SUPPORT

For the most part, due to the ubiquity of mobile devices, developers are designing websites that support smaller screens and touch interfaces. It's as simple as adding a viewport in the HTML header, and as complex as customized CSS and JavaScript rules for screen widths, user agents, and other tests. Ultimately, when I'm recommending a password generator to Aunt Josephine while she's using her mobile phone, she shouldn't have to pinch-zoom, scroll, or deal with other nuisances when generating and copying the password. As silly as that may sound, if the site doesn't support a mobile interface, then it wasn't awarded a point.

Mobile: Yes | No

ADS AND TRACKER SCRIPTS

I get it. I know that as a developer, you want easy access to analytics about who is visiting your site. Having that data can help you learn what is driving traffic to your site, and how you can make adjustments as necessary to improve that traffic. But when it comes to password generators, no one other than me and the server that sent the code to my browser should know that I'm about to generate passwords. I awarded a point for not shipping trackers, and zero points if the generator did.

Trackers: No | Yes

Google Analytics, social media scripts, ads, and other 3rd party browser scripts track me via fingerprinting, cookies, and other methods to learn more about who I am and what I visited. If you want to see how extensive this is, install the Lightbeam extension for Firefox. This shows the capability of companies to share data and learn more about who you are and where you've been on the web. Recently, Cambridge Analytica, a small unknown company, was able to mine data on millions of Facebook users, and the data mining exposed just how easy it was for Facebook to track your web behavior even when not on the site.

At first, I thought this would be just something with website generators, but when I started auditing browser extensions, I quickly saw that developers were shipping Google Analytics and other tracking scripts in the bundled extension as well.

OFFLINE

When I started auditing the bookmarklets and extensions, I pulled up my developer console, and watched for any network activity when generating the passwords or using the software. To my dismay, some do "call home" by either wrapping the generator around an -->
