postgresql/contrib
Heikki Linnakangas 00896ddaf4 Fix buffer overflows in pg_trgm due to lower-casing
The code made a subtle assumption that the lower-cased version of a
string never has more characters than the original. That is not always
true. For example, in a database with the latin9 encoding:

    latin9db=# select lower(U&'\00CC' COLLATE "lt-x-icu");
       lower
    -----------
     i\x1A\x1A
    (1 row)

In this example, lower-casing expands the single input character into
three characters.

The generate_trgm_only() function relied on that assumption in two
ways:

- It used "slen * pg_database_encoding_max_length() + 4" to allocate
  the buffer to hold the lower-cased and blank-padded string. That
  formula accounts for expansion if the lower-case characters are
  longer (in bytes) than the originals, but it's still not enough if
  the lower-cased string contains more *characters* than the original.

- Its callers sized the output array to hold the trigrams extracted
  from the input string with the formula "(slen / 2 + 1) * 3", where
  'slen' is the input string length in bytes. (The formula was
  generous to account for the possibility that RPADDING was set to 2.)
  That's also not enough if one input byte can turn into multiple
  characters.
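
As a rough illustration of why both formulas fall short, the arithmetic
can be written out in C. This is only a sketch, not code from the patch:
the helper names are hypothetical, and the padding values (two leading
blanks, one trailing blank) are an assumption about the default pg_trgm
configuration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed padding: two leading blanks and one trailing blank per word. */
#define LPADDING 2
#define RPADDING 1

/* Old buffer size in bytes: slen * pg_database_encoding_max_length() + 4. */
static size_t
old_buffer_bytes(size_t slen, size_t max_encoding_length)
{
    return slen * max_encoding_length + 4;
}

/* Old output array size, counted in trigrams: (slen / 2 + 1) * 3. */
static size_t
old_trigram_slots(size_t slen)
{
    return (slen / 2 + 1) * 3;
}

/* Number of characters in one word after blank padding. */
static size_t
padded_chars(size_t nchars)
{
    return LPADDING + nchars + RPADDING;
}

/* A padded word of n characters yields n - 2 overlapping trigrams. */
static size_t
trigrams_in_padded_word(size_t padded)
{
    assert(padded >= 3);
    return padded - 2;
}
```

With a one-byte input in a single-byte encoding, old_buffer_bytes(1, 1)
is 5 and old_trigram_slots(1) is 3. If the lower-cased word has three
single-byte characters and all of them are kept, padded_chars(3) is 6
bytes and trigrams_in_padded_word(6) is 4, so both the buffer and the
output array come up one short.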

To fix, introduce a growable trigram array and give up on trying to
choose the correct max buffer sizes ahead of time.
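
The general shape of a growable array can be sketched as follows. This
is not the code from the patch: the type and function names are
hypothetical, and it uses plain realloc() so that it stands alone,
whereas code inside PostgreSQL would presumably allocate through the
backend's own memory routines.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical three-byte trigram, for illustration only. */
typedef struct
{
    char        bytes[3];
} sketch_trgm;

/* Hypothetical growable array: capacity doubles whenever it fills up. */
typedef struct
{
    sketch_trgm *items;
    size_t      len;            /* number of trigrams stored */
    size_t      cap;            /* number of trigrams allocated */
} growable_trgm_array;

static void
gta_init(growable_trgm_array *a)
{
    assert(a != NULL);
    a->items = NULL;
    a->len = 0;
    a->cap = 0;
}

/* Append one trigram, enlarging the array first if it is full. */
static int
gta_append(growable_trgm_array *a, const char *three_bytes)
{
    assert(a != NULL && three_bytes != NULL);
    if (a->len == a->cap)
    {
        size_t      new_cap = a->cap ? a->cap * 2 : 16;
        sketch_trgm *p;

        /* Refuse to grow if the capacity or byte count would overflow. */
        if (new_cap <= a->cap || new_cap > SIZE_MAX / sizeof(sketch_trgm))
            return -1;
        p = realloc(a->items, new_cap * sizeof(sketch_trgm));
        if (p == NULL)
            return -1;
        a->items = p;
        a->cap = new_cap;
    }
    memcpy(a->items[a->len].bytes, three_bytes, 3);
    a->len++;
    return 0;
}

static void
gta_free(growable_trgm_array *a)
{
    assert(a != NULL);
    free(a->items);
    gta_init(a);
}
```

Because every append checks the remaining capacity, the caller no longer
needs an upper bound on how many trigrams a given input can produce.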

Backpatch to v18, but no further. In previous versions lower-casing was
done character by character, and thus the assumption that lower-casing
doesn't change the character length was valid. That was changed in v18,
commit fb1a18810f.

Security: CVE-2026-2007
Reviewed-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Jeff Davis <pgsql@j-davis.com>
2026-02-09 12:08:58 +13:00

The PostgreSQL contrib tree
---------------------------

This subtree contains porting tools, analysis utilities, and plug-in
features that are not part of the core PostgreSQL system, mainly
because they address a limited audience or are too experimental to be
part of the main source tree.  This does not preclude their
usefulness.

User documentation for each module appears in the main SGML
documentation.

When building from the source distribution, these modules are not
built automatically, unless you build the "world" target.  You can
also build and install them all by running "make all" and "make
install" in this directory; or to build and install just one selected
module, do the same in that module's subdirectory.

Some directories supply new user-defined functions, operators, or
types.  To make use of one of these modules, after you have installed
the code you need to register the new SQL objects in the database
system by executing a CREATE EXTENSION command.  In a fresh database,
you can simply do

    CREATE EXTENSION module_name;

See the PostgreSQL documentation for more information about this
procedure.