Encoding Errors Guide: Fixing Mojibake and Garbled Text
Mojibake (garbled text like é) happens when text is decoded with the wrong charset. Learn how to diagnose encoding mismatches and fix them in files, databases, and APIs.
Tags: encoding, unicode, debugging
Mojibake is the Japanese word for "character transformation": the garbled text you get when bytes are decoded with the wrong charset. If you've seen "Ã©" instead of "é", "â€™" instead of "’" (smart apostrophe), or a database full of "�" characters, this guide explains exactly what went wrong and how to fix it.

Diagnosing the Encoding Bug

Step 1: Look at the Pattern

Common mojibake patterns and their diagnosis:

| Garbled text | Actual char | What happened |
|---|---|---|
| Ã© | é | UTF-8 decoded as Latin-1 |
| Ã¨ | è | UTF-8 decoded as Latin-1 |
| Ã¼ | ü | UTF-8 decoded as Latin-1 |
| â€™ | ’ (right single quote) | UTF-8 decoded as Latin-1 (CP1252 smart quote) |
| â€œ | “ (left double quote) | UTF-8 decoded as Latin-1 |
| â€” | — (em dash) | UTF-8…
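
The patterns in the table above can be reproduced, and reversed, in a few lines of Python. This is a minimal sketch, not a general-purpose fixer: the helper names `garble` and `repair` and the `SAMPLES` list are hypothetical, and it assumes the wrong decoder was Windows-1252 (CP1252), which is what "Latin-1" usually means in practice when smart quotes and dashes are involved.

```python
# Intended characters: accented letters, right single quote, left double
# quote, and em dash (the last three written as escapes for clarity).
SAMPLES = ["é", "è", "ü", "\u2019", "\u201c", "\u2014"]


def garble(text: str) -> str:
    """Simulate the bug: encode as UTF-8, then decode with the wrong charset."""
    return text.encode("utf-8").decode("cp1252")


def repair(garbled: str) -> str:
    """Reverse the bug: recover the original bytes, then decode them as UTF-8.

    Raises UnicodeEncodeError or UnicodeDecodeError if the text was not
    produced by this exact UTF-8 -> CP1252 mix-up.
    """
    return garbled.encode("cp1252").decode("utf-8")


if __name__ == "__main__":
    for ch in SAMPLES:
        bad = garble(ch)
        print(f"{ch!r} garbles to {bad!r} and repairs to {repair(bad)!r}")
```

The round trip only works while every original byte is still recoverable. Once a character has been replaced with "�", the original bytes are gone and no re-decoding can bring them back.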