
I appreciate your insight, but I just want to expand on one point:

> Having actually worked on charset handling, when most people say "ASCII", they mean "ASCII" and not anything else.

Approximately zero people are referring to a true, packed, 7-bit encoding when they say "ASCII". They're nearly always talking about an 8-bit character set, and in such cases, something must happen when the high bit is 1. (I've never seen one that plain ignores or uses error glyphs for characters >127, although you likely have more experience with this than I do.) This is why I said people are referring to one of these encodings in practice: ASCII is 7-bit, and approximately everyone is talking about some 8-bit encoding of one form or another.

I would definitely agree that most wouldn't call KOI8-R "ASCII", but they may use the term "ASCII" to describe the first 128 characters of KOI8-R. (Setting aside encodings that use weird replacement characters, like Shift_JIS does with the backslash and the yen sign.) This is the reason for my comment about how the weird "ASCII + custom" encodings all just feel like ASCII to me: if you stay below 128, it literally is.

I'll modify my original statement thusly:

> This actually rules out nearly any character set that isn't compatible with ASCII.

And add an addendum that if you don't use UTF-8, you can't use Unicode and will be stuck in code page/locale hell.



> I've never seen one that plain ignores or uses error glyphs for characters >127

Reporting an error is the default behavior if you try to decode such a string with the ASCII codec in Python and .NET, at the very least.
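
To make that concrete, here is a minimal Python sketch (the sample bytes are made up for illustration). It shows the strict default, plus the two fallbacks you have to opt into if you want high-bit bytes dropped or swapped for an error glyph:

```python
# Sample bytes are an illustrative assumption: "caf" followed by 0xE9,
# which has the high bit set and so is not valid ASCII.
data = b"caf\xe9"

try:
    data.decode("ascii")  # errors="strict" is the default
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# Dropping the byte or substituting U+FFFD must be requested explicitly.
ignored = data.decode("ascii", errors="ignore")
replaced = data.decode("ascii", errors="replace")
```

So the "ignore" and "error glyph" behaviors do exist, but only as non-default options.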

The first 128 characters of KOI8-R are, of course, ASCII (the "weird replacement characters" are, in fact, explicitly allowed!). But a file encoded in KOI8-R is ASCII only if it contains nothing but those first 128 characters.
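
A rough Python sketch of that distinction, assuming Python's built-in `koi8_r` codec; `is_ascii` is a hypothetical helper name and the sample strings are arbitrary:

```python
# The low half of KOI8-R decodes exactly as ASCII does.
low_half = bytes(range(128))
same_low_half = low_half.decode("koi8_r") == low_half.decode("ascii")

def is_ascii(data: bytes) -> bool:
    """Hypothetical helper: True when every byte has the high bit clear."""
    return all(b < 128 for b in data)

# Both are KOI8-R encoded, but only the first is also valid ASCII.
english = "hello".encode("koi8_r")
russian = "привет".encode("koi8_r")

english_is_ascii = is_ascii(english)
russian_is_ascii = is_ascii(russian)
```

Here `english_is_ascii` comes out true and `russian_is_ascii` false, even though both byte strings were produced by the same KOI8-R encoder.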

> if you don't use UTF-8, you can't use Unicode and will be stuck in code page/locale hell.

UTF-7 was a thing. It just turned out that nobody really needed it.
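
For the curious, a small sketch using Python's built-in `utf-7` codec (the sample string is arbitrary): it carries arbitrary Unicode text using only bytes below 128, so the output is itself plain ASCII.

```python
# Sample string chosen for illustration: Latin with a diacritic plus Cyrillic.
text = "naïve привет"

encoded = text.encode("utf-7")

# Every byte of the encoded form has the high bit clear...
seven_bit_clean = all(b < 128 for b in encoded)

# ...and decoding recovers the original text.
round_trips = encoded.decode("utf-7") == text
```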



