Heap-buffer-overflow in EXIF writer for extra IFD tags

Ruikai Peng

We recently found a cool four-byte heap-buffer-overflow in FFmpeg's avcodec/exif, triggered while it processes IFDs (Image File Directories). It affects .png, .jpg, .webp, .avif … the formats we use most often. The cause of this bug is very interesting, and I don’t want to spoil it here; I want you to find out.

It’s also a short but in-depth dive into how FFmpeg handles EXIF internally, something we rely on all the time. So even if you’re not really into memory bugs, this can be a cool way to learn how it works under the hood.

This bug wasn’t in FFmpeg for long. We happened to catch it about three days after it was introduced into the codebase. You can always trust FFmpeg.

// libavcodec/pngdec.c:763
static int decode_exif_chunk(AVCodecContext *avctx, PNGDecContext *s,
                             GetByteContext *gb)
{
    // ...
    s->exif_data = av_buffer_alloc(bytestream2_get_bytes_left(gb));
    if (!s->exif_data)
        return AVERROR(ENOMEM);
    bytestream2_get_buffer(gb, s->exif_data->data, s->exif_data->size);

    return 0;
}

To begin with (in the context of PNG), decode_exif_chunk at libavcodec/pngdec.c:763 handles the EXIF data in an image. This is where the PNG decoder allocates the EXIF store (s->exif_data) and copies into it the EXIF chunk it read from gb.

// libavcodec/pngdec.c:1758
    if (s->exif_data) {
        // we swap because ff_decode_exif_attach_buffer adds to p->metadata
        FFSWAP(AVDictionary *, p->metadata, s->frame_metadata);
        ret = ff_decode_exif_attach_buffer(avctx, p, &s->exif_data, AV_EXIF_TIFF_HEADER);
        FFSWAP(AVDictionary *, p->metadata, s->frame_metadata);
        if (ret < 0) {
            av_log(avctx, AV_LOG_WARNING, "unable to attach EXIF buffer\n");
            return ret;
        }
    }

During frame production, that decoder buffer becomes the EXIF payload attached to the output frame at libavcodec/pngdec.c:1761, via ff_decode_exif_attach_buffer.

// libavcodec/decode.c:2436
int ff_decode_exif_attach_buffer(AVCodecContext *avctx, AVFrame *frame, AVBufferRef **pbuf,
                                 enum AVExifHeaderMode header_mode)
{
    int ret;
    AVBufferRef *data = *pbuf;
    AVExifMetadata ifd = { 0 };

    ret = av_exif_parse_buffer(avctx, data->data, data->size, &ifd, header_mode);
    if (ret < 0)
        goto end;

    ret = exif_attach_ifd(avctx, frame, &ifd, pbuf);

ff_decode_exif_attach_buffer() then transcribes the raw EXIF bytes into FFmpeg’s native EXIF structure (AVExifMetadata) with av_exif_parse_buffer, reading through a GetByteContext (just as we saw the decoder do with gb). How parsing starts depends on header_mode (which is interesting, and the author also documented each AVExifHeaderMode):

AV_EXIF_EXIF00 expects an Exif\0\0 prefix and skips those six bytes.
AV_EXIF_T_OFF expects a 4-byte offset at the start of the buffer and uses it to find the TIFF header.
AV_EXIF_TIFF_HEADER expects the buffer to start with a TIFF header; it decodes the endianness and the first IFD offset via ff_tdecode_header.
AV_EXIF_ASSUME_LE/BE doesn’t decode a TIFF header; it just assumes little/big endian and starts at the beginning of the buffer.
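
To make the AV_EXIF_TIFF_HEADER case concrete: a TIFF header is eight bytes, holding a byte-order mark, the magic number 42, and the offset of the first IFD. The sketch below decodes exactly that. It is a standalone illustration, not FFmpeg's ff_tdecode_header; the function name sketch_decode_tiff_header and its return convention are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not FFmpeg API): decode an 8-byte TIFF header.
 * On success returns 0 and sets *le (1 = little endian) and *ifd_offset;
 * returns -1 if the buffer is too short or the header is malformed. */
int sketch_decode_tiff_header(const uint8_t *buf, size_t size,
                              int *le, uint32_t *ifd_offset)
{
    assert(buf && le && ifd_offset);
    if (size < 8)
        return -1;

    /* Bytes 0-1: byte-order mark, "II" (little) or "MM" (big). */
    if (buf[0] == 'I' && buf[1] == 'I')
        *le = 1;
    else if (buf[0] == 'M' && buf[1] == 'M')
        *le = 0;
    else
        return -1;

    /* Bytes 2-3: the magic number 42 in the declared byte order. */
    unsigned magic = *le ? (buf[2] | (unsigned)buf[3] << 8)
                         : ((unsigned)buf[2] << 8 | buf[3]);
    if (magic != 42)
        return -1;

    /* Bytes 4-7: offset of the first IFD, relative to the header start. */
    *ifd_offset = *le
        ? ((uint32_t)buf[4] | (uint32_t)buf[5] << 8 |
           (uint32_t)buf[6] << 16 | (uint32_t)buf[7] << 24)
        : ((uint32_t)buf[4] << 24 | (uint32_t)buf[5] << 16 |
           (uint32_t)buf[6] << 8 | (uint32_t)buf[7]);
    return 0;
}
```

Feeding it the bytes 'I' 'I' 42 0 8 0 0 0 yields little endian with the first IFD at offset 8, which is the usual layout when IFD0 directly follows the header.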

After the setup, it starts to parse the main IFD (Image File Directory), which stores a table of metadata entries. Each entry tells you how to interpret one piece of metadata: which tag it is, what type it has, and the offset at which to look for the data or the sub-IFD.

/* IFD tags */
    {"ExifIFD",                    0x8769}, // <- An IFD pointing to standard Exif metadata
    {"GPSInfo",                    0x8825}, // <- An IFD pointing to GPS Exif Metadata
    {"InteropIFD",                 0xA005}, // <- Table 13 Interoperability IFD Attribute Information
    {"GlobalParametersIFD",        0x0190},
    {"ProfileIFD",                 0xc6f5},
};

Yes, these are the things that sometimes accidentally leak where you live, and they also explain why, when a friend sends you a picture over iMessage and you click on it, it can show you where it was taken. But don’t worry too much about it; social media platforms now typically strip the EXIF data, thanks to these H1 reports.

These IFDs also include: orientation; camera and lens info; capture settings like ISO, shutter time, and focal length; and properties like resolution and pixel dimensions. Interestingly, some files also include a small JPEG thumbnail that tags point to (a picture in a picture).

After parsing IFD0 (the main IFD list), if there is more IFD data (ret > 0), it seeks to the next IFD offset and loops, for up to 16 more IFDs. (In the real world, these extra IFDs can be TIFF multi-page images, the thumbnails we mentioned…)

// libavcodec/exif.c:920
    /* cap at 16 extra IFDs for sanity/parse security */
    for (int extra_tag = 0xFFFCu; extra_tag > 0xFFECu; extra_tag--) {
        AVExifMetadata extra_ifd = { 0 };
        ret = exif_parse_ifd_list(logctx, &gbytes, le, 0, &extra_ifd, 1);
        if (ret < 0) {
            av_exif_free(&extra_ifd);
            break;
        }
        next = ret;
        av_log(logctx, AV_LOG_DEBUG, "found extra IFD: %04x with next=%d\n", extra_tag, ret);
        bytestream2_seek(&gbytes, next, SEEK_SET);
        ret = av_exif_set_entry(logctx, ifd, extra_tag, AV_TIFF_IFD, 1, NULL, 0, &extra_ifd);
        av_exif_free(&extra_ifd);
        if (ret < 0 || !next || bytestream2_get_bytes_left(&gbytes) <= 0)
            break;
    }

It stores these extra IFDs under the reserved synthetic tags from 0xFFFC down to 0xFFED, by adding them as AV_TIFF_IFD entries in the main IFD.
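
The mapping from "n-th extra IFD" to synthetic tag is simple enough to write down. Here is a minimal sketch of it, using the loop bounds from the snippet above; the helper names are made up for illustration and do not exist in FFmpeg.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_EXTRA_IFD_FIRST 0xFFFCu /* tag given to the first extra IFD */
#define SKETCH_EXTRA_IFD_MAX   16      /* cap implied by the loop bounds */

/* Hypothetical helper: synthetic tag for the n-th extra IFD (0-based),
 * counting down from 0xFFFC. */
uint16_t sketch_extra_ifd_tag(int n)
{
    assert(n >= 0 && n < SKETCH_EXTRA_IFD_MAX);
    return (uint16_t)(SKETCH_EXTRA_IFD_FIRST - (unsigned)n);
}

/* Hypothetical helper: is this id inside the reserved range
 * 0xFFEC < id <= 0xFFFC? */
int sketch_is_extra_ifd_tag(uint16_t id)
{
    return id > 0xFFECu && id <= 0xFFFCu;
}
```

The first extra IFD gets 0xFFFC and the sixteenth gets 0xFFED; 0xFFEC and 0xFFFD both fall outside the range.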

Once parsing finishes, the IFDs flow into exif_attach_ifd() (invoked by ff_decode_exif_attach_buffer, libavcodec/decode.c:2447), which uses the AVExifMetadata structure built from the stream to annotate the output frame.

// libavcodec/decode.c:2375
static int exif_attach_ifd(AVCodecContext *avctx, AVFrame *frame, const AVExifMetadata *ifd, AVBufferRef **pbuf)
{
    // ...
    for (size_t i = 0; i < ifd->count; i++) {
        const AVExifEntry *entry = &ifd->entries[i];
        if (entry->id == av_exif_get_tag_id("Orientation") &&
            entry->count > 0 && entry->type == AV_TIFF_SHORT) {
            orient = entry;
            break;
        }
    }
    // ...

By “annotating” in exif_attach_ifd, we mean:

processing orientation (finding it, then converting it into display-matrix side data so the rotation gets rendered; optionally removing it);
converting the remaining tags into frame->metadata, which holds strings in an AVDictionary;
(re)writing the EXIF back into a byte stream via av_exif_write, and attaching that raw EXIF blob to the frame.

The (re)writing step is the most interesting, since it re-serializes the extra IFDs: it takes them out of the main IFD’s entries into a temporary extra_ifds array, writes the (main) IFD0, then appends the extra IFDs as IFD1 and onward (placing each one after the previous IFD and pointing the previous IFD’s next pointer at it).

// exif_attach_ifd (libavcodec/decode.c:2411)
    if (cloned || !*pbuf) {
        av_buffer_unref(pbuf);
        ret = av_exif_write(avctx, ifd, pbuf, AV_EXIF_TIFF_HEADER);
        if (ret < 0)
            goto end;
    }

    ret = ff_frame_new_side_data_from_buf(avctx, frame, AV_FRAME_DATA_EXIF, pbuf);
    
// av_exif_write (libavcodec/exif.c:747)
int av_exif_write(void *logctx, const AVExifMetadata *ifd, AVBufferRef **buffer, enum AVExifHeaderMode header_mode)
{
    // ...
    size = exif_get_ifd_size(ifd);
    buf = av_buffer_alloc(size + off + headsize);
    // ...
    // av_exif_write (libavcodec/exif.c:802)
    int extras;
    for (extras = 0; extras < FF_ARRAY_ELEMS(extra_ifds); extras++) {
        AVExifEntry *extra_entry = NULL;
        uint16_t extra_tag = 0xFFFCu - extras;
        ret = av_exif_get_entry(logctx, (AVExifMetadata *) ifd, extra_tag, 0, &extra_entry);
        if (ret <= 0)
            break;
        av_log(logctx, AV_LOG_DEBUG, "found extra IFD tag: %04x\n", extra_tag);
        if (!ifd_new) {
            ifd_new = av_exif_clone_ifd(ifd);
            if (!ifd_new)
                break;
            ifd = ifd_new;
        }
        /* calling remove_entry will call av_exif_free on the original */
        AVExifMetadata *cloned = av_exif_clone_ifd(&extra_entry->value.ifd);
		
    // ...
    ret = exif_write_ifd(logctx, &pb, le, 0, ifd);

However, take note of how av_exif_write accounts for the extra IFDs from 0xFFFC to 0xFFED: it “peels” the extra IFDs off linearly, downwards from 0xFFFCu, and breaks on the first missing tag. Also note that the size of the output buffer is calculated before the peeling, by exif_get_ifd_size(ifd) on the full IFD list.
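
The peeling scan is easy to model outside FFmpeg. The sketch below is a simplified stand-in: it works on a plain array of tag ids rather than AVExifMetadata, and both function names are hypothetical. It counts how many extra IFD tags a downward scan from 0xFFFC would peel before hitting the first gap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if id is present in ids[0..count). */
int sketch_has_tag(const uint16_t *ids, size_t count, uint16_t id)
{
    for (size_t i = 0; i < count; i++)
        if (ids[i] == id)
            return 1;
    return 0;
}

/* Hypothetical model of the scan: walk down from 0xFFFC, one tag at a
 * time, and stop at the first tag that is not in the list. */
int sketch_count_peeled(const uint16_t *ids, size_t count)
{
    int peeled = 0;
    assert(ids || count == 0);
    for (unsigned n = 0; n < 16; n++) {
        if (!sketch_has_tag(ids, count, (uint16_t)(0xFFFCu - n)))
            break;
        peeled++;
    }
    return peeled;
}
```

For a list holding 0x0112, 0xFFFC and 0xFFFB the scan peels two tags. Keep the model in mind for lists where 0xFFFC itself is absent: the scan stops immediately, whatever else is in the list.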

static size_t exif_get_ifd_size(const AVExifMetadata *ifd)
{
    /* 6 == 4 + 2; 2-byte entry-count at the beginning */
    /* plus 4-byte next-IFD pointer at the end */
    size_t total_size = IFD_EXTRA_SIZE;
    for (size_t i = 0; i < ifd->count; i++) {
        const AVExifEntry *entry = &ifd->entries[i];
        // traverse the main IFD
        if (entry->type == AV_TIFF_IFD) {
            /* this is an extra IFD, not an entry, so we don't need to add base tag size */
            size_t base_size = entry->id > 0xFFECu && entry->id <= 0xFFFCu ? 0 : BASE_TAG_SIZE;
            total_size += base_size + exif_get_ifd_size(&entry->value.ifd) + entry->ifd_offset;

When sizing, exif_get_ifd_size treats extra IFDs (AV_TIFF_IFD entries with an id > 0xFFECu and <= 0xFFFCu) as having a base tag size of zero. This calculation makes sense because these tags are eventually going to be “peeled” out of the main IFD’s entries (libavcodec/exif.c:802), as we mentioned previously.

However, what made these few slices of code worth spending hours writing an analysis about is this reasonable-looking assumption: all extra IFD tags are eventually going to be removed.

You see, as we mentioned previously,

av_exif_write accounts for the extra IFDs from 0xFFFC to 0xFFED; it “peels” the extra IFDs off linearly downwards, and breaks on the first missing tag.

The “peeling” scans from 0xFFFCu, one step down at a time. Since av_exif_parse_buffer might not use all of the extra IFD tags (when there are fewer than 16 extra IFDs), the scan ends at the first tag that isn’t there (this is the av_exif_write loop shown just above the exif_get_ifd_size snippet). This makes sense as well, since av_exif_parse_buffer hands out the extra IFD tags contiguously.

But the question is, what if they are non-contiguous?

This question might sound stupid, since we just said that these extra tags are handed out linearly and contiguously, and that’s 100% true. There are no shenanigans you can pull off through the extra-IFD loop in av_exif_parse_buffer.

But what if we don’t take the expected route, and instead fake an extra AV_TIFF_IFD entry straight from the source?

exif_parse_ifd_list (via exif_decode_tag) is used to extract IFDs from the byte-stream context (GetByteContext *gb). We’ve seen it in av_exif_parse_buffer, where it extracts IFD0 (the main IFD entries) before the synthetic tags are dealt with.

// libavcodec/exif.c:73
static int exif_decode_tag(void *logctx, GetByteContext *gb, int le,
                           int depth, AVExifEntry *entry)
{
    // ...
    entry->id = ff_tget_short(gb, le);
    type = ff_tget_short(gb, le);
    count = ff_tget_long(gb, le);
    payload = ff_tget_long(gb, le);
    // ...
    /* AV_TIFF_IFD is the largest, numerically */
    if (type > AV_TIFF_IFD || count >= INT_MAX/8U)
        return AVERROR_INVALIDDATA;

    // ...
    // libavcodec/exif.c:907
    ret = exif_parse_ifd_list(logctx, &gbytes, le, 0, ifd, 0);
    if (ret < 0) {
        av_log(logctx, AV_LOG_ERROR, "error decoding EXIF data: %s\n", av_err2str(ret));
        return ret;
    }

If we take a closer look, there are only a few sanity checks on each entry: it is rejected if type > AV_TIFF_IFD or count >= INT_MAX/8U. Nothing stops us from placing the reserved synthetic tags (0xFFFC to 0xFFED) directly in the byte stream (the EXIF section of the file) so that they come in through IFD0 parsing, even though they’re meant to be used only internally to account for extra IFDs.
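
To see what such a forged entry looks like on the wire, here is a small standalone sketch that decodes one 12-byte little-endian directory entry in the same field order as exif_decode_tag (id, type, count, payload) and reports whether its id lands in the reserved range. The struct and function names are hypothetical, and the type code 13 used in the example is the TIFF IFD type, which we assume is what AV_TIFF_IFD corresponds to.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of the four fields read per directory entry. */
typedef struct SketchEntry {
    uint16_t id;
    uint16_t type;
    uint32_t count;
    uint32_t payload;
} SketchEntry;

uint16_t sketch_rd16le(const uint8_t *p)
{
    return (uint16_t)(p[0] | p[1] << 8);
}

uint32_t sketch_rd32le(const uint8_t *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Decode one 12-byte little-endian entry: id, type, count, payload. */
void sketch_decode_entry_le(const uint8_t *raw, SketchEntry *e)
{
    assert(raw && e);
    e->id      = sketch_rd16le(raw);
    e->type    = sketch_rd16le(raw + 2);
    e->count   = sketch_rd32le(raw + 4);
    e->payload = sketch_rd32le(raw + 8);
}

/* True if the id falls in the range reserved for synthetic extra IFDs. */
int sketch_id_is_reserved(uint16_t id)
{
    return id > 0xFFECu && id <= 0xFFFCu;
}
```

The bytes FB FF 0D 00 01 00 00 00 20 00 00 00 decode to id 0xFFFB, type 13, count 1, payload 0x20. That entry passes the type and count checks quoted above, yet sketch_id_is_reserved flags it, which is exactly the property the parser does not look at.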

This lets us add extra tags without going through the expected av_exif_parse_buffer → int extra_tag = 0xFFFCu; extra_tag > 0xFFECu; extra_tag-- path, forging non-contiguous extra IFD entries. The “peeling” loop then breaks on its first iteration, leaving in IFD0 extra tags that were expected to be peeled and that exif_get_ifd_size already sized at zero, with no 12-byte directory slot accounted for.
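
Put as arithmetic: every entry in the reserved range that survives the peeling costs one 12-byte directory slot that was never budgeted. The following is a rough model of that shortfall, not FFmpeg code. It assumes the ids in the list are unique and that every id in the reserved range is an AV_TIFF_IFD entry; all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_BASE_TAG_SIZE 12 /* one directory slot */

int sketch_in_extra_range(uint16_t id)
{
    return id > 0xFFECu && id <= 0xFFFCu;
}

int sketch_has_id(const uint16_t *ids, size_t count, uint16_t id)
{
    for (size_t i = 0; i < count; i++)
        if (ids[i] == id)
            return 1;
    return 0;
}

/* Directory bytes that were budgeted as zero but still get written:
 * entries in the reserved range, minus those the downward scan peels. */
size_t sketch_dir_shortfall(const uint16_t *ids, size_t count)
{
    size_t in_range = 0, peeled = 0;
    assert(ids || count == 0);

    for (size_t i = 0; i < count; i++)
        in_range += sketch_in_extra_range(ids[i]) ? 1 : 0;

    for (unsigned n = 0; n < 16; n++) {
        if (!sketch_has_id(ids, count, (uint16_t)(0xFFFCu - n)))
            break;
        peeled++;
    }
    return (in_range - peeled) * SKETCH_BASE_TAG_SIZE;
}
```

A well-formed list such as 0x0112, 0xFFFC, 0xFFFB has no shortfall in this model, while a forged list with 0xFFFB but no 0xFFFC comes up 12 bytes short.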

When the writer then hits an entry with a small payload (e.g., a SHORT), it executes AV_WN32(pb->buffer, 0) to zero the inline padding (libavcodec/exif.c:731). At that point pb->buffer is already at pb->buffer_end - 2, so the 4‑byte zero write spills past the end of the buffer, directly into the next heap chunk’s metadata.
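
The last step is plain pointer arithmetic, so here is a tiny model of it. sketch_spill computes how many bytes of a fixed-width write land past the end of a buffer, and sketch_put_zero32 shows what a bounds-checked version of the padding write could look like. Both are illustrations under our own naming; the checked variant is not the fix that was merged.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bytes that a write of `width` bytes at offset `pos` would place past
 * the end of a buffer of `size` bytes (0 if the write fits). */
size_t sketch_spill(size_t size, size_t pos, size_t width)
{
    assert(pos <= size);
    size_t room = size - pos;
    return width > room ? width - room : 0;
}

/* Hypothetical checked variant of the padding write: zero four bytes at
 * *pos and advance, or return -1 without touching anything if they do
 * not fit. */
int sketch_put_zero32(uint8_t *buf, size_t size, size_t *pos)
{
    assert(buf && pos && *pos <= size);
    if (size - *pos < 4)
        return -1;
    memset(buf + *pos, 0, 4);
    *pos += 4;
    return 0;
}
```

With the numbers from the text (cursor two bytes before the end, 4-byte write), sketch_spill(size, size - 2, 4) gives 2: in this model the write starts inside the allocation and its last two bytes land outside it.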

Repro

This bug can be triggered simply by running ./ffmpeg -i <file> on a default build, with no prerequisites.

poc/poc.png: ASAN heap-buffer-overflow (decoder thread av:png:df0)
poc/poc.webp: ASAN heap-buffer-overflow (T0)
poc/poc.jpg: ASAN heap-buffer-overflow (decoder thread dec0:0:mjpeg)
poc/poc.avif: ASAN heap-buffer-overflow (T0)
poc/poc.jxl: ASAN heap-buffer-overflow (decoder thread dec0:0:libjxl)

This gave us the following ASan trace:

heap-buffer-overflow at libavcodec/exif.c:731 in exif_write_ifd
exif_write_ifd
- av_exif_write
  - exif_attach_ifd
    - ff_decode_exif_attach_buffer
      - decode_frame_common
        - decode_frame_png

Matching the writeup.

Note that for .tiff:

build-asan-jxl/ffmpeg -v debug -i poc-exif/poc_exif.tiff -f null - logs:
writing IFD with 17 entries and initial offset 218
writing TIFF entry: id: 0xfffb ... offset value: 329
EXIF metadata: (323 bytes)
- The directory area is IFD_EXTRA_SIZE + BASE_TAG_SIZE * count = 6 + 12*17 = 210 bytes. With the TIFF header, the initial payload offset is 218 (matches the log).
- The under‑allocation from exif_get_ifd_size() is only 12 bytes (the skipped base tag size for the 0xfffb IFD entry) in libavcodec/exif.c.
- So even under‑allocated, the buffer still comfortably covers the directory area, where the only unguarded write happens (AV_WN32(pb->buffer, 0) for inline payloads in exif_write_ifd()).
- The shortfall hits the payload area instead. Those writes use bytestream2_put_*() / bytestream2_seek_p(), which clamp and set p->eof rather than writing out of bounds.
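
The directory arithmetic in the first bullet can be double-checked with a few lines of C. The constants come from the bullets above; the 8-byte TIFF header is the standard header size, which we assume is what accounts for the difference between 210 and the logged 218. The function names are ours.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_IFD_EXTRA_SIZE   6  /* 2-byte entry count + 4-byte next-IFD pointer */
#define SKETCH_BASE_TAG_SIZE    12 /* one directory slot */
#define SKETCH_TIFF_HEADER_SIZE 8  /* assumption: standard TIFF header */

/* Size of the directory area for an IFD with `entries` entries. */
size_t sketch_directory_size(size_t entries)
{
    assert(entries <= 0xFFFF); /* the entry count field is 16 bits */
    return SKETCH_IFD_EXTRA_SIZE + SKETCH_BASE_TAG_SIZE * entries;
}

/* Offset of the first payload byte when the IFD follows the header. */
size_t sketch_first_payload_offset(size_t entries)
{
    return SKETCH_TIFF_HEADER_SIZE + sketch_directory_size(entries);
}
```

For 17 entries this gives a 210-byte directory and a first payload offset of 218, matching the "initial offset 218" in the log.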

Timeline

Since the bug was disclosed during the Christmas holiday, we are not listing specific dates.

Dec 2025: Discovery of the potential issue, replication and validation.
Dec 2025: Disclosed to ffmpeg-security.
Dec 2025: avcodec/exif maintainer provided a patch.
Dec 2025: Patch merged.