Maybe I didn’t explain clearly enough in my original post. “utf8mb4” takes about the same amount of space as “utf8”: one byte per ASCII character, which covers most text in Western languages.

MySQL gets the nomenclature backwards. MySQL’s “utf8mb4” takes up to four bytes per character, and it’s the standard that everybody, everybody, everybody uses everywhere, everywhere, everywhere: MySQL ought to call it “utf8.” MySQL’s “utf8” takes up to three bytes per character, and nobody, nobody, nobody uses it anywhere, anywhere, anywhere: it ought to be called “utf8mb3.”
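To make the byte counts concrete, here is a minimal Python sketch. It uses Python's own UTF-8 encoder rather than MySQL, so it only illustrates how many bytes standard UTF-8 needs per character. The helper names `utf8_len` and `fits_in_three_bytes` are made up for this example.

```python
def utf8_len(ch: str) -> int:
    """Return how many bytes standard UTF-8 uses to encode one character."""
    return len(ch.encode("utf-8"))


def fits_in_three_bytes(text: str) -> bool:
    """Return True if every character needs at most three UTF-8 bytes.

    This is the limit the comment describes for MySQL's "utf8";
    the check itself is plain Python and never talks to MySQL.
    """
    return all(utf8_len(ch) <= 3 for ch in text)


# One sample character for each UTF-8 length.
samples = [
    "a",           # ASCII letter
    "\u00e9",      # e with acute accent
    "\u20ac",      # euro sign
    "\U0001F600",  # grinning face emoji
]

byte_lengths = {ch: utf8_len(ch) for ch in samples}

for ch, n in byte_lengths.items():
    print(f"U+{ord(ch):04X} needs {n} byte(s)")
```

ASCII text costs one byte per character under either name, so switching to “utf8mb4” should not make typical Western-language text any bigger. Only characters that need a fourth byte, such as emoji, fall outside the three-byte limit.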

There may be no situation — across the entire Internet — in which MySQL’s “utf8” is appropriate. MySQL’s “utf8” only exists for backwards compatibility with previous versions of MySQL (which shipped it in error).

Journalist, ex software engineer
