#intro Hi, I'm Andreas. I work on Stripe's Reliability Tooling team, and I enjoy working with Rust, Ruby, Go, CL and many other programming languages.
I grew up using Linux but my screens are all driven by some macintosh software or another.
Wow, TIL google has a union now: https://alphabetworkersunion.org/ - well done, not a second too soon, etc etc.
life was easier when addresses were still backed by the RAM standard.
None of this "fiat" segmentation.
And now I wonder: are virtual memory allocations credit that userspace extends to the kernel?
And eventually, the kernel defaults (invokes the OOM killer). In that case, I would like to purchase virtual memory default insurance.
This post about demand paging has an interesting illustration: https://offlinemark.com/2020/10/14/demand-paging/
Kubernetes: It's not just complex, it's military-industrial complex.
https://github.com/antifuchs/unicode-tagger in case you want to experiment - it's quite silly.
All that because I was researching how to represent the transgender flag (🏳️‍⚧️ - new in Emoji 13.0!) in cha(rs).
*That* flag uses ZWJ sequences, like any "reasonable" emojo should. But ... uh... that's not the only way emoji are represented in sequences, far from it!
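For the curious, here is a minimal Python sketch of what "uses ZWJ sequences" means for that flag. The code point sequence is the one I understand the emoji data to list for the transgender flag; the helper name `zwj_components` is made up for illustration.

```python
# Assumption: the transgender flag is the ZWJ sequence
# 1F3F3 FE0F 200D 26A7 FE0F (white flag + VS16 + ZWJ + transgender symbol + VS16).
ZWJ = "\u200d"   # ZERO WIDTH JOINER
VS16 = "\ufe0f"  # VARIATION SELECTOR-16, requests emoji presentation

TRANSGENDER_FLAG = "\U0001F3F3\uFE0F\u200D\u26A7\uFE0F"


def zwj_components(sequence):
    """Split a ZWJ sequence into its base characters (hypothetical helper).

    Variation selectors are dropped so only the joined characters remain.
    """
    return [part.replace(VS16, "") for part in sequence.split(ZWJ)]


if __name__ == "__main__":
    for part in zwj_components(TRANSGENDER_FLAG):
        print(" ".join(f"U+{ord(ch):04X}" for ch in part))
```

So one visible flag is really two existing characters glued together with an invisible joiner.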
Unicode is a fractal of weirdness, y'all.
unicode, ukpol commentary
So what this is, is a pirate flag with "gbsct" glued to it. They didn't even use the upper-case letters... I guess, to prevent the commonwealth from thinking of themselves as too important?
So anyway, those are the only three flags that I could find that use tag sequences. And ... uh. What a weird-ass mess.
What does that look like, I wondered, and so, let's look at the Scottish flag!
It's the sequence 1F3F4 E0067 E0062 E0073 E0063 E0074 E007F.
E007F is the "cancel tag" codepoint, so that's where the flag ends. What are the others? Let's see:
1F3F4: WAVING BLACK FLAG
E0067: TAG LATIN SMALL LETTER G
E0062: TAG LATIN SMALL LETTER B
E0073: TAG LATIN SMALL LETTER S
E0063: TAG LATIN SMALL LETTER C
E0074: TAG LATIN SMALL LETTER T
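If you want to reproduce that listing yourself, this is roughly how it could be done in Python with the standard `unicodedata` module. The helper name `describe` is just something I made up for this sketch.

```python
import unicodedata

# The Scottish flag, as space-separated hex code points.
SCOTLAND = "1F3F4 E0067 E0062 E0073 E0063 E0074 E007F"


def describe(hex_sequence):
    """Return (hex code point, Unicode character name) pairs (hypothetical helper)."""
    return [
        (cp, unicodedata.name(chr(int(cp, 16))))
        for cp in hex_sequence.split()
    ]


if __name__ == "__main__":
    for cp, name in describe(SCOTLAND):
        print(f"{cp}: {name}")
```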
But what are they used for? That goes back to my original statement: They are specified (to my knowledge) for exactly and only three emoji: The flags of England, Scotland and Wales.
So how do tag sequences work, you ask? Well, that's easy: They're characters that spell out a tag name.
Are they the same letters that spell out the flag names? No, of course not. This is unicode, of course there is room for everyone (especially latin letters)... Meet U+E0020 (TAG SPACE) through U+E007E (TAG TILDE), plus U+E007F (cancel tag).
Yeah, they duplicated ASCII in "tag" space, including the at-sign and other hilarious bits.
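Since the tag block mirrors printable ASCII at a fixed offset of 0xE0000, converting between the two is plain arithmetic. A minimal Python sketch follows; the function names are mine and not from any library.

```python
TAG_OFFSET = 0xE0000        # tag characters are ASCII code points plus this offset
CANCEL_TAG = "\U000E007F"   # terminates a tag sequence
BLACK_FLAG = "\U0001F3F4"   # WAVING BLACK FLAG, the base of subdivision flags


def to_tag_chars(text):
    """Map printable ASCII (U+0020..U+007E) to the matching tag characters."""
    for ch in text:
        if not 0x20 <= ord(ch) <= 0x7E:
            raise ValueError(f"no tag character for {ch!r}")
    return "".join(chr(ord(ch) + TAG_OFFSET) for ch in text)


def from_tag_chars(tagged):
    """Recover the ASCII spelled out by tag characters, ignoring everything else."""
    return "".join(
        chr(ord(ch) - TAG_OFFSET)
        for ch in tagged
        if 0xE0020 <= ord(ch) <= 0xE007E
    )


def subdivision_flag(code):
    """Build black flag + tag characters + cancel tag for a code such as 'gbsct'."""
    return BLACK_FLAG + to_tag_chars(code) + CANCEL_TAG


if __name__ == "__main__":
    flag = subdivision_flag("gbsct")
    print(" ".join(f"{ord(ch):X}" for ch in flag))
    print(from_tag_chars(flag))
```

Whether a font actually draws anything for a code other than the three specified ones is up to the renderer; this only builds the code point sequence.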
So you know how flags are made up of two "regional indicator" letters that, combined, spell out the country code? That's great, but those are available only to actual countries.
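To make the regional indicator trick concrete: the 26 regional indicator symbols start at U+1F1E6 for "A", so a two-letter country code maps to a flag by simple offset arithmetic. A small Python sketch, with a function name I made up:

```python
REGIONAL_INDICATOR_A = 0x1F1E6  # REGIONAL INDICATOR SYMBOL LETTER A


def country_flag(code):
    """Turn a two-letter country code like 'GB' into its regional indicator pair."""
    if len(code) != 2 or not (code.isascii() and code.isalpha()):
        raise ValueError(f"expected two ASCII letters, got {code!r}")
    return "".join(
        chr(REGIONAL_INDICATOR_A + ord(ch) - ord("A"))
        for ch in code.upper()
    )


if __name__ == "__main__":
    flag = country_flag("GB")
    print(flag, " ".join(f"U+{ord(ch):X}" for ch in flag))
```

There is no slot in this scheme for anything longer than two letters, which is why codes like "gbsct" need a different mechanism.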
Well, there's a solution for you ("you", if you are a constituent country of the United Kingdom, that is): Unicode tag sequences.
Galaxy brain, as a verb.
I'm Andreas, this is my second home - my primary (often less-tech heavy) account is @antifuchs.
I love bots but #nobot