Adam Hooper
1 min read · Mar 14, 2018


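Maybe I didn’t explain clearly enough in my original post. “utf8mb4” encodes every character that “utf8” can store in exactly the same number of bytes: one byte per ASCII character, which covers most text in Western languages, and two or three bytes for the rest. Only the characters that “utf8” can’t store at all take four bytes.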

MySQL gets the nomenclature backwards. MySQL’s “utf8mb4” takes up to four bytes per character, and it’s the standard that everybody, everybody, everybody uses everywhere, everywhere, everywhere: MySQL ought to call it “utf8.” MySQL’s “utf8” takes up to three bytes per character, and nobody, nobody, nobody uses it anywhere, anywhere, anywhere: it ought to be called “utf8mb3.”
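
To make the byte counts concrete, here is a minimal Python sketch. It doesn’t talk to MySQL at all: it only measures standard UTF-8, which is what MySQL calls “utf8mb4.” The function names `utf8_byte_lengths` and `fits_in_mysql_utf8` are made up for illustration, and the second one assumes that MySQL’s “utf8” is limited to characters in the Basic Multilingual Plane (code points up to U+FFFF), which is what a three-byte cap implies.

```python
def utf8_byte_lengths(text):
    """Return how many bytes each character occupies in standard UTF-8."""
    return [len(ch.encode("utf-8")) for ch in text]


def fits_in_mysql_utf8(text):
    """Hypothetical check: True if every character needs at most three bytes.

    Assumes MySQL's three-byte "utf8" covers exactly the code points
    up to U+FFFF (the Basic Multilingual Plane).
    """
    return all(ord(ch) <= 0xFFFF for ch in text)


# "a", "é", "€", and an emoji (U+1F600), written with escapes
sample = "a\u00e9\u20ac\U0001F600"

print(utf8_byte_lengths(sample))             # [1, 2, 3, 4]
print(fits_in_mysql_utf8("caf\u00e9"))       # True
print(fits_in_mysql_utf8("\U0001F600"))      # False
```

The first three characters cost the same under either name. Only the emoji needs the fourth byte, and that is the character MySQL’s “utf8” can’t hold.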

There may be no situation — across the entire Internet — in which MySQL’s “utf8” is appropriate. MySQL’s “utf8” only exists for backwards compatibility with previous versions of MySQL (which shipped it in error).
