In MySQL, never use “utf8”. Use “utf8mb4”.

Incorrect string value: ‘\xF0\x9F\x98\x83 <…’ for column ‘summary’ at row 1
  • MySQL’s “utf8mb4” means “UTF-8”.
  • MySQL’s “utf8” means “a proprietary character encoding”. It stores at most three bytes per character, so it can’t encode many Unicode characters, including most emoji.
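To see why that error appears, here is a minimal Python sketch. It relies on one fact: MySQL’s “utf8” stores at most three bytes per character, while real UTF-8 needs four bytes for some characters. The bytes in the error message, \xF0\x9F\x98\x83, are the four-byte UTF-8 encoding of the emoji U+1F603. The helper name fits_in_mysql_utf8 is made up for this illustration; it is not part of MySQL or any library.

```python
# The character from the error message: U+1F603 (a smiling-face emoji).
emoji = "\U0001F603"
encoded = emoji.encode("utf-8")


def fits_in_mysql_utf8(text: str) -> bool:
    """Hypothetical helper: True if every character in `text` needs at
    most 3 bytes in UTF-8, which is the limit of MySQL's "utf8"."""
    return all(len(ch.encode("utf-8")) <= 3 for ch in text)


print(encoded.hex(" "))            # f0 9f 98 83
print(len(encoded))                # 4
print(fits_in_mysql_utf8("C"))     # True
print(fits_in_mysql_utf8(emoji))   # False
```

A check like this only tells you which strings would be rejected or truncated. The fix is still to switch the column to “utf8mb4”.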
  1. Your computer read “01000011” and determined that it’s the number 67. That’s because 67 was encoded as “01000011”.
  2. Your computer looked up character number 67 in the Unicode character set, and it found that 67 means “C”.

  1. My computer mapped “C” to 67 in the Unicode character set.
  2. My computer encoded 67, sending “01000011” to this web server.
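Both directions of that round trip can be sketched in a few lines of Python. The variable names are just for illustration. Note that the byte equals the character number only because “C” is in the ASCII range; for characters above 127, UTF-8 uses two to four bytes.

```python
# Writing side: character -> number -> bits.
char = "C"
code_point = ord(char)              # character set lookup: "C" -> 67
bits = format(code_point, "08b")    # encoding: 67 -> "01000011"
wire_bytes = char.encode("utf-8")   # the same single byte, as UTF-8 produces it

# Reading side: bits -> number -> character.
number = int(bits, 2)               # "01000011" -> 67
decoded = chr(number)               # character set lookup: 67 -> "C"

print(code_point, bits, wire_bytes, decoded)
```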
  1. Choose CHAR columns. (The CHAR format is a relic nowadays. Back then, MySQL was faster with CHAR columns. Ever since 2005, it’s not.)
  2. Choose to encode those CHAR columns as “utf8”.
  1. Database systems have subtle bugs and oddities, and you can avoid a lot of bugs by avoiding database systems.
  2. If you need a database, don’t use MySQL or MariaDB. Use PostgreSQL.
  3. If you need to use MySQL or MariaDB, never use “utf8”. Always use “utf8mb4” when you want UTF-8. Convert your database now to avoid headaches later.
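For the conversion itself, MySQL has an ALTER TABLE ... CONVERT TO CHARACTER SET statement. The Python sketch below only builds that statement as a string; the function name convert_table_sql, the table name “articles” and the choice of the utf8mb4_unicode_ci collation are assumptions for illustration, not a recommendation from this article. Back up first and try it on a copy: utf8mb4 characters can take four bytes, so column and index sizes may need attention.

```python
import re


def convert_table_sql(table: str, collation: str = "utf8mb4_unicode_ci") -> str:
    """Hypothetical helper: build the MySQL statement that converts one
    table's text columns to utf8mb4. It does not run anything."""
    # Collation names are plain identifiers; reject anything else.
    if not re.fullmatch(r"[A-Za-z0-9_]+", collation):
        raise ValueError(f"unexpected collation name: {collation!r}")
    # Quote the table name with backticks, doubling any backtick inside it.
    quoted = "`" + table.replace("`", "``") + "`"
    return (
        f"ALTER TABLE {quoted} "
        f"CONVERT TO CHARACTER SET utf8mb4 COLLATE {collation};"
    )


print(convert_table_sql("articles"))
```

You would run the resulting statement once per table, through whatever MySQL client you already use.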

Adam Hooper

Journalist, ex software engineer
