Sam Altman is one of the dullest, most incurious and least creative people to walk this earth. This is, after all, the person who once tweeted 'i am a stochastic parrot and so are u', in response to Emily Bender's (entirely incisive and absolutely brilliant) critique of what his large language models are *actually doing*.
ah but did you tell them in CP437 or something fancy (like any text encoding after 1996)? 🤨🤨🥹
Sadly, all my best text encoding stories would make me identifiable to coworkers, so I can’t share them here. There’s been some funny stuff over the years, though. Wait, where did I go wrong that I have multiple text encoding stories?
That said, I mostly just deal with normal stuff like UTF-8, UTF-16, Latin-1, and ASCII.
My favourite was a junior dev who was like, “When I read from this input file, the data is weirdly mangled and unreadable, so as the first processing step I’ll just remove all null bytes, which seems to leave me with ASCII text.”
(It was UTF-16.)
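For anyone wondering why the null-byte trick seems to work: here's a rough Python sketch. The sample strings and the helper names `strip_nulls` and `decode_utf16` are made up for illustration, since the actual input file isn't shown here, and it assumes a little-endian file with a byte order mark. In UTF-16, every ASCII-range character is stored as its ASCII byte plus a `0x00` byte, so deleting the nulls leaves something that looks like ASCII, right up until a non-ASCII character shows up.

```python
def strip_nulls(data: bytes) -> bytes:
    """The 'just remove all null bytes' approach from the story (lossy)."""
    return data.replace(b"\x00", b"")


def decode_utf16(data: bytes) -> str:
    """Decode as UTF-16; the 'utf-16' codec reads and drops the BOM."""
    return data.decode("utf-16")


# Hypothetical sample data: a little-endian BOM (FF FE) followed by the text.
BOM_LE = b"\xff\xfe"
ascii_raw = BOM_LE + "hello".encode("utf-16-le")
non_ascii_raw = BOM_LE + "\u0100".encode("utf-16-le")  # U+0100 is stored as 00 01

# ASCII-only input: stripping nulls leaves readable text, plus two stray BOM bytes.
print(strip_nulls(ascii_raw))      # b'\xff\xfehello'

# Non-ASCII input: the null was half of the character, so the character is destroyed.
print(strip_nulls(non_ascii_raw))  # b'\xff\xfe\x01'

# Decoding with the right codec handles both cases.
print(decode_utf16(ascii_raw))                  # hello
print(decode_utf16(non_ascii_raw) == "\u0100")  # True
```

So the stripped output only looks fine while the data happens to be pure ASCII, and even then the BOM bytes are still sitting at the front.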
You’ve got to make sure you’re not over-specializing. I’d recommend trying to roll your own time zone library next.