Punycode Converter: IDN Domains
Convert internationalized domain names (IDN) to and from Punycode — decode xn-- encoded hostnames.
Tags: Punycode converter, IDN domain converter, xn-- decode
Punycode encodes Unicode domain names into ASCII so they work in the DNS. Here is how the encoding works, why it was created, and how to use it in code.
All the tools discussed here are available for free at theproductguy.in: client-side, no sign-up required.
Why Punycode Exists
The Domain Name System (DNS) was designed in 1983 with only ASCII in mind. Domain labels can contain letters A–Z, digits 0–9, and hyphens, and nothing else. As the internet became global, demand grew for domain names in Arabic, Chinese, Cyrillic, Japanese, Korean, and other scripts. Two approaches were considered: extend DNS to support Unicode natively, or find an encoding that would translate Unicode to…
Frequently Asked Questions
What is Punycode?
Punycode is an encoding algorithm defined in RFC 3492 (2003) that converts strings containing Unicode characters into a form using only ASCII letters, digits, and hyphens. It was designed specifically for encoding internationalized domain names (IDNs) so they can be stored in the DNS, where hostnames are limited to ASCII.
How do I convert a domain to Punycode?
Each label (part between dots) in an IDN is processed separately. If a label contains only ASCII characters, it is left unchanged. If it contains Unicode characters, Punycode encodes it as 'xn--' followed by the Punycode-encoded form. For example, 'münchen.de' becomes 'xn--mnchen-3ya.de'.
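The per-label rule above can be sketched in Python with the standard library's `punycode` codec. The helper names `label_to_ascii` and `domain_to_ascii` are illustrative and not part of any library. The sketch assumes the input is already lowercase and Unicode-normalized; a full IDNA implementation also applies mapping and validation steps that are skipped here.

```python
ACE_PREFIX = "xn--"  # ASCII Compatible Encoding prefix


def label_to_ascii(label: str) -> str:
    """Encode one label; pure-ASCII labels pass through unchanged."""
    if label.isascii():
        return label
    # The "punycode" codec implements the RFC 3492 algorithm without the prefix.
    return ACE_PREFIX + label.encode("punycode").decode("ascii")


def domain_to_ascii(domain: str) -> str:
    """Process each dot-separated label independently and rejoin."""
    return ".".join(label_to_ascii(label) for label in domain.split("."))


if __name__ == "__main__":
    print(domain_to_ascii("münchen.de"))  # xn--mnchen-3ya.de
```

For production use, a dedicated IDNA library is likely a safer choice than this hand-rolled split, since it would also handle case folding and disallowed characters.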
What is xn-- in a domain name?
The 'xn--' prefix is the ACE (ASCII Compatible Encoding) prefix that marks a Punycode-encoded domain label. It tells browsers and other IDN-aware software that the rest of the label is the Punycode encoding of a Unicode string. DNS itself treats the label as ordinary ASCII; without the prefix, applications would have no way to tell an encoded label from a plain ASCII one.
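Decoding reverses the process: strip the 'xn--' prefix and run the remainder through a Punycode decoder. The following is a minimal sketch using Python's built-in `punycode` codec. The function names `label_to_unicode` and `domain_to_unicode` are hypothetical, and the sketch does not validate that the decoded result is a permitted IDN label.

```python
ACE_PREFIX = "xn--"


def label_to_unicode(label: str) -> str:
    """Decode one label if it carries the ACE prefix; otherwise return it as-is."""
    if label.lower().startswith(ACE_PREFIX):
        encoded = label[len(ACE_PREFIX):]
        return encoded.encode("ascii").decode("punycode")
    return label


def domain_to_unicode(domain: str) -> str:
    """Decode every dot-separated label independently."""
    return ".".join(label_to_unicode(label) for label in domain.split("."))


if __name__ == "__main__":
    print(domain_to_unicode("xn--mnchen-3ya.de"))  # münchen.de
```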
What are internationalized domain names?
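Internationalized Domain Names (IDNs) are domain names containing characters outside the ASCII range: Arabic, Chinese, Cyrillic, Korean, accented Latin characters, and, on the few registries that permit them, even emoji. The IDNA standard for handling such names in applications dates from 2003, and ICANN approved internationalized country-code top-level domains in 2009. IDNs allow domain names to be written in a user's native script.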
Why are IDN domains used for phishing?
Attackers register IDN domains using Unicode characters that look visually identical to ASCII characters. For example, Cyrillic 'а' (U+0430) looks like Latin 'a' (U+0061) in many fonts. A domain like 'аpple.com' using Cyrillic 'а' could fool users into thinking they are visiting the real apple.com. This is called an IDN homograph attack.
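One rough way to flag a possible homograph is to check whether a single label mixes letters from more than one script. The sketch below uses the first word of each character's Unicode name (for example "LATIN" or "CYRILLIC") as a stand-in for its script. That shortcut is an assumption made for illustration, not the official Unicode Script property, and the helper names are hypothetical. It will miss labels written entirely in one look-alike script and may flag legitimate mixed-script names.

```python
import unicodedata


def scripts_in_label(label: str) -> set[str]:
    """Collect a rough script tag for every letter in the label."""
    scripts = set()
    for ch in label:
        if ch.isalpha():  # skip digits and hyphens
            name = unicodedata.name(ch, "")
            if name:
                scripts.add(name.split(" ")[0])
    return scripts


def looks_mixed_script(label: str) -> bool:
    """True when letters from more than one script tag appear together."""
    return len(scripts_in_label(label)) > 1


if __name__ == "__main__":
    spoofed = "\u0430pple"  # Cyrillic 'а' (U+0430) followed by Latin "pple"
    print(looks_mixed_script(spoofed))  # True
    print(looks_mixed_script("apple"))  # False
```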