postgresql

mirror of https://git.postgresql.org/git/postgresql.git synced 2026-02-09 16:17:39 +08:00

John Naylor 5bc429aacb Extend collection of Unicode combining characters to beyond the BMP

The former limit was perhaps a carryover from an older hand-coded
table. Since commit bab982161 we have enough space in mbinterval to
store larger codepoints, so collect all combining characters.

Discussion: https://www.postgresql.org/message-id/49ad1fa0-174e-c901-b14c-c484b60907f1%40enterprisedb.com

2021-08-26 13:07:34 -04:00

.gitignore

Update display widths as part of updating Unicode

2021-08-26 10:53:56 -04:00

generate-norm_test_table.pl

Update copyright for 2021

2021-01-02 13:06:25 -05:00

generate-unicode_combining_table.pl

Extend collection of Unicode combining characters to beyond the BMP

2021-08-26 13:07:34 -04:00

generate-unicode_east_asian_fw_table.pl

Update display widths as part of updating Unicode

2021-08-26 10:53:56 -04:00

generate-unicode_norm_table.pl

Update copyright for 2021

2021-01-02 13:06:25 -05:00

generate-unicode_normprops_table.pl

Update copyright for 2021

2021-01-02 13:06:25 -05:00

Makefile

Update display widths as part of updating Unicode

2021-08-26 10:53:56 -04:00

norm_test.c

2021-01-02 13:06:25 -05:00

README

Add support for automatically updating Unicode derived files

2020-01-09 10:08:14 +01:00

README

This directory contains tools to generate the tables in
src/include/common/unicode_norm_table.h, used for Unicode normalization. The
generated .h file is included in the source tree, so these tools are normally
not needed to build PostgreSQL. They are needed only to re-generate the .h
file from the Unicode data files, e.g. to update to a new version of Unicode.

Generating unicode_norm_table.h
-------------------------------

Run

    make update-unicode

from the top level of the source tree and commit the result.

Tests
-----

The Unicode Consortium publishes a comprehensive test suite for the
normalization algorithm in a file called NormalizationTest.txt. This
directory also contains a Perl script and some C code that run our
normalization code against all the test strings in NormalizationTest.txt.
To download NormalizationTest.txt and run the tests:

    make normalization-check

This is also run as part of the update-unicode target.
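
For orientation, here is a minimal sketch of what a conformance check
against NormalizationTest.txt looks like. It is not the logic of the Perl
script or C code in this directory: it uses Python's standard unicodedata
module as a stand-in for PostgreSQL's normalization code, and the helper
names parse_test_line and check_line are hypothetical. Each data line of
the test file has five semicolon-separated columns of space-separated
hexadecimal code points, followed by a comment. The sketch checks only the
NFC and NFD invariants described in UAX #15.

```python
import unicodedata


def parse_test_line(line):
    """Decode the five columns of one data line into Python strings."""
    data = line.split("#", 1)[0].strip()   # drop the trailing comment
    fields = data.split(";")[:5]           # ignore the empty field after the last ';'
    return ["".join(chr(int(cp, 16)) for cp in field.split())
            for field in fields]


def check_line(line):
    """Return True if the NFC and NFD invariants hold for this line."""
    c1, c2, c3, c4, c5 = parse_test_line(line)

    def nfc(s):
        return unicodedata.normalize("NFC", s)

    def nfd(s):
        return unicodedata.normalize("NFD", s)

    return (
        c2 == nfc(c1) == nfc(c2) == nfc(c3)
        and c4 == nfc(c4) == nfc(c5)
        and c3 == nfd(c1) == nfd(c2) == nfd(c3)
        and c5 == nfd(c4) == nfd(c5)
    )


# A sample line in the test file's format (comment shortened for illustration).
SAMPLE = "1E0A;1E0A;0044 0307;1E0A;0044 0307; # LATIN CAPITAL LETTER D WITH DOT ABOVE"

if __name__ == "__main__":
    print(check_line(SAMPLE))
```

A full run would also skip the "@Part" header lines and comment-only lines,
and would check the NFKC and NFKD invariants using columns 4 and 5.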