Image Format Detection from Magic Bytes
How to identify image format from binary magic bytes — JPEG FFD8, PNG 8950, GIF 4749 signatures.
Published:
Tags: image format magic bytes, detect image type binary, file signature bytes
Magic bytes are fixed byte sequences at the start of a binary file that identify its format without relying on the filename extension. JPEG starts with FF D8 FF, PNG with 89 50 4E 47, GIF with 47 49 46 38.
Why are file extensions not enough?
File extensions can be changed or spoofed. A file with a .jpg extension might actually be a PNG, a GIF, or a malicious executable renamed to look like an image. Magic bytes are embedded in the binary content and written by the encoder; altering them stops decoders from recognizing the file as that format. For image processing pipelines, security scanning, and content moderation systems, relying on the extension alone creates exploitable gaps. Even a CSRF-protected upload endpoint can be bypassed by renaming a malicious file if it only checks the extension.
What is magic…
Frequently Asked Questions
What are magic bytes in image files?
Magic bytes (also called magic numbers or file signatures) are specific byte sequences at the start of a file that identify its format. Unlike file extensions which can be renamed, magic bytes are embedded in the binary content and reliably identify the true format regardless of filename.
How do I identify an image format without an extension?
Read the first 12 bytes of the file and compare against known signatures. JPEG starts with FF D8 FF, PNG with 89 50 4E 47 0D 0A 1A 0A, GIF with 47 49 46 38, BMP with 42 4D, and WebP with 52 49 46 46 followed by 57 45 42 50 at offset 8. HEIC and AVIF are detected by checking for the ftyp box at offset 4.
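The comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not a complete detector: the function names `detect_image_format` and `detect_file` and the returned labels are assumptions made for this example, and only the signatures listed in the answer are covered.

```python
from typing import Optional


def detect_image_format(header: bytes) -> Optional[str]:
    """Return a format label for the given leading bytes, or None if unknown."""
    if header.startswith(b"\xff\xd8\xff"):          # JPEG: FF D8 FF
        return "jpeg"
    if header.startswith(b"\x89PNG\r\n\x1a\n"):     # PNG: 89 50 4E 47 0D 0A 1A 0A
        return "png"
    if header.startswith(b"GIF8"):                  # GIF: 47 49 46 38
        return "gif"
    if header.startswith(b"BM"):                    # BMP: 42 4D
        return "bmp"
    # WebP: "RIFF" at offset 0 and "WEBP" at offset 8
    if header[:4] == b"RIFF" and header[8:12] == b"WEBP":
        return "webp"
    # HEIC/AVIF and other ISOBMFF files: "ftyp" at offset 4
    if header[4:8] == b"ftyp":
        return "isobmff"
    return None


def detect_file(path: str) -> Optional[str]:
    """Read only the first 12 bytes of the file and classify them."""
    with open(path, "rb") as f:
        return detect_image_format(f.read(12))
```

Two bytes are a weak signature, so the BMP check is deliberately placed after the longer ones; a stricter detector would probably validate more of the BMP header before trusting it.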
What is the magic number for PNG?
The PNG magic number is 8 bytes: 89 50 4E 47 0D 0A 1A 0A (hex). Written as a byte string this is \x89PNG\r\n\x1a\n. The leading 0x89 byte is non-ASCII so that the file is not mistaken for text. The PNG specification documents this signature as a deliberate design choice to detect file corruption: the CR LF and LF bytes, for example, expose transfers that rewrite line endings.
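As a small illustration of why the signature contains line-ending bytes, the sketch below builds the 8-byte signature from its hex form and then simulates a text-mode transfer that rewrites CR LF as LF. The names `PNG_SIGNATURE`, `is_png`, and `damaged` are assumptions chosen for this example.

```python
# The 8-byte PNG signature, written as in the answer above.
PNG_SIGNATURE = bytes.fromhex("89 50 4E 47 0D 0A 1A 0A")


def is_png(header: bytes) -> bool:
    """True if the data begins with the full 8-byte PNG signature."""
    return header[:8] == PNG_SIGNATURE


# Simulate a transfer that converts CR LF line endings to LF.
# The signature loses its 0x0D byte, so the check fails.
damaged = PNG_SIGNATURE.replace(b"\r\n", b"\n")

print(is_png(PNG_SIGNATURE))  # True
print(is_png(damaged))        # False
```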
How do I detect HEIC from bytes?
HEIC files use the ISOBMFF container format. Read bytes 4–7; if they spell 'ftyp', it's an ISOBMFF container. Then read bytes 8–11 for the major brand: 'heic', 'heix', 'hevc', or 'mif1' indicate HEIC/HEIF. 'avif' or 'avis' indicate AVIF. 'M4V ' or 'isom' indicate MP4/video.
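A minimal sketch of the brand lookup, assuming the 'ftyp' tag occupies the four bytes starting at offset 4 and the major brand the four bytes after it. The function name `classify_isobmff` and the returned labels are illustrative, and only the brands named in the answer are handled.

```python
from typing import Optional

# Major brands listed in the answer above, grouped by family.
HEIF_BRANDS = {b"heic", b"heix", b"hevc", b"mif1"}
AVIF_BRANDS = {b"avif", b"avis"}
VIDEO_BRANDS = {b"M4V ", b"isom"}


def classify_isobmff(header: bytes) -> Optional[str]:
    """Classify the first 12 bytes of a file by its ISOBMFF major brand.

    Returns None when the data is not an ISOBMFF container at all.
    """
    if len(header) < 12 or header[4:8] != b"ftyp":
        return None
    brand = header[8:12]
    if brand in HEIF_BRANDS:
        return "heic/heif"
    if brand in AVIF_BRANDS:
        return "avif"
    if brand in VIDEO_BRANDS:
        return "mp4/video"
    return "unknown-isobmff"
```

The first four bytes hold the box size and are ignored here. Files can also list further compatible brands after the major brand; this sketch does not inspect them.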
What is the file(1) command for magic detection?
The Unix file(1) command reads magic byte signatures from /usr/share/misc/magic (or /etc/magic) and matches them against the file's binary content. Running `file photo.jpg` prints a description of the detected format; `file --mime-type photo.jpg` prints the MIME type instead. The libmagic library provides the same detection programmatically from C, Python (python-magic), and Node.js (mmmagic).
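For scripts that prefer to shell out rather than bind to libmagic, the command can be wrapped as below. This is a sketch under the assumption that a `file` binary supporting `--mime-type` and `-b` (brief output) is on the PATH; the helper name `mime_type_via_file` is made up for this example, and it returns None when the command is unavailable or fails.

```python
import shutil
import subprocess
from typing import Optional


def mime_type_via_file(path: str) -> Optional[str]:
    """Ask file(1) for the MIME type of `path`; None if file(1) is unusable."""
    if shutil.which("file") is None:
        return None  # file(1) is not installed on this system
    result = subprocess.run(
        ["file", "--mime-type", "-b", path],  # -b omits the filename prefix
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip()
```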