Pinned toot

Hi, I'm Andreas. I work on Stripe's Reliability Tooling team, and I enjoy working with Rust, Ruby, Go, CL and many other programming languages.

I grew up using Linux but my screens are all driven by some macintosh software or another.

making ruby programmers unhappy 

Kubernetes: It's not just complex, it's military-industrial complex.

(cncf.io/case-study/dod/)

Wow, this instance is lighting up! Welcome, recursers!

no YOU just wrote a program to generate arbitrary unicode tag sequences stuck to arbitrary emoji, and it was you whose first instinct was to combine the 📼 emoji with the "every morning i wake up and open palm slam a vhs into the slot" copypasta

All that because I was researching how to represent the transgender flag (🏳️‍⚧️ - new in Unicode 13!) in cha(rs).

*That* flag uses ZWJ sequences, like any "reasonable" emojo should. But ... uh... that's not the only way emoji are represented in sequences, far from it!

Unicode is a fractal of weirdness, y'all.

unicode, ukpol commentary 

What does that look like, I wondered, and so, let's look at the Scottish flag!

It's the sequence 1F3F4 E0067 E0062 E0073 E0063 E0074 E007F.

E007F is the "cancel tag" codepoint, so that's where the flag ends. What are the others? Let's see:

1F3F4: WAVING BLACK FLAG
E0067: TAG LATIN SMALL LETTER G
E0062: TAG LATIN SMALL LETTER B
E0073: TAG LATIN SMALL LETTER S
E0063: TAG LATIN SMALL LETTER C
E0074: TAG LATIN SMALL LETTER T
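If you want to check the names yourself, here's a minimal Python sketch using the standard `unicodedata` module. The `describe` helper and the `SCOTLAND` constant are names I made up for illustration; the code points are the ones from the sequence above.

```python
import unicodedata

# Code points of the Scottish flag sequence quoted above.
SCOTLAND = [0x1F3F4, 0xE0067, 0xE0062, 0xE0073, 0xE0063, 0xE0074, 0xE007F]


def describe(code_points):
    """Return (hex, name) pairs for each code point (hypothetical helper)."""
    return [
        (f"{cp:04X}", unicodedata.name(chr(cp), "<unnamed>"))
        for cp in code_points
    ]


for hex_cp, name in describe(SCOTLAND):
    print(f"{hex_cp}: {name}")
```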

But what are they used for? That goes back to my original statement: They are specified (to my knowledge) for exactly and only three emoji: The flags of England, Scotland and Wales.

So how do tag sequences work, you ask? Well, that's easy: They're characters that spell out a tag name.

Are they the same letters that spell out the flag names? No, of course not. This is unicode, of course there is room for everyone (especially latin letters)... Meet U+E0020 (TAG SPACE) through U+E007E (TAG TILDE), plus U+E007F (cancel tag).

Yeah, they duplicated ASCII in "tag" space, including the at-sign and other hilarious bits.
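The duplication is a plain offset: each tag character sits 0xE0000 above its ASCII twin. Here's a rough Python sketch of building and reading a tag sequence under that assumption. `tag_sequence` and `read_tag` are hypothetical helper names, not anything from a library, and whether a font actually renders the result is a separate question.

```python
TAG_OFFSET = 0xE0000   # distance from an ASCII character to its tag twin
BLACK_FLAG = 0x1F3F4   # WAVING BLACK FLAG, the base emoji for these flags
CANCEL_TAG = 0xE007F   # terminates the tag sequence


def tag_sequence(base, tag):
    """Stick `tag` (printable ASCII) onto the base code point as tag characters."""
    for ch in tag:
        if not 0x20 <= ord(ch) <= 0x7E:
            raise ValueError(f"not printable ASCII: {ch!r}")
    tags = "".join(chr(ord(ch) + TAG_OFFSET) for ch in tag)
    return chr(base) + tags + chr(CANCEL_TAG)


def read_tag(sequence):
    """Recover the ASCII tag name spelled out inside a tag sequence."""
    return "".join(
        chr(ord(ch) - TAG_OFFSET)
        for ch in sequence
        if 0xE0020 <= ord(ch) <= 0xE007E
    )


scotland = tag_sequence(BLACK_FLAG, "gbsct")
print(" ".join(f"{ord(ch):04X}" for ch in scotland))
print(read_tag(scotland))
```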

So you know how flags are made up of two "regional indicator" letters that combined together spell out the country code? That's great, but those are available only to actual countries.
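For comparison, the two-letter trick can be sketched in a few lines of Python. This assumes the letters come from the block that starts at U+1F1E6 (REGIONAL INDICATOR SYMBOL LETTER A); `country_flag` is a made-up helper name, and it doesn't check whether the code is a real country.

```python
REGIONAL_INDICATOR_A = 0x1F1E6  # REGIONAL INDICATOR SYMBOL LETTER A


def country_flag(code):
    """Map a two-letter country code to its pair of regional indicator symbols."""
    code = code.upper()
    if len(code) != 2 or not all("A" <= ch <= "Z" for ch in code):
        raise ValueError(f"expected two ASCII letters, got {code!r}")
    return "".join(
        chr(REGIONAL_INDICATOR_A + ord(ch) - ord("A")) for ch in code
    )


flag = country_flag("gb")
print(" ".join(f"{ord(ch):04X}" for ch in flag))
```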

Well, there's a solution for you ("you", if you are a constituent country of the United Kingdom, that is): Unicode tag sequences.

I've just learned about unicode tag sequences and so now you have to learn about them too.

Holy cats they're weird.

Thanks, unnamed friend, for reminding me that I've been programming for nearly 25 years now.


recurse.social

A Mastodon server for Recursers.