When querying NUMA status of pages in shared memory, we need to touch
the memory first to get valid results. This may trigger valgrind
reports, because some of the memory (e.g. unpinned buffers) may be
marked as noaccess.
Solved by adding a valgrind suppresion. An alternative would be to
adjust the access/noaccess status before touching the memory, but that
seems far too invasive. It would require all those places to have
detailed knowledge of what the shared memory stores.
The pg_numa_touch_mem_if_required() macro is replaced with a function.
Macros are invisible to suppressions, so it'd have to suppress reports
for the caller - e.g. pg_get_shmem_allocations_numa(). So we'd suppress
reports for the whole function, and that seems to heavy-handed. It might
easily hide other valid issues.
Reviewed-by: Christoph Berg <myon@debian.org>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/aEtDozLmtZddARdB@msg.df7cb.de
Backpatch-through: 18
Make sure that function declarations use names that exactly match the
corresponding names from function definitions in a few places. These
inconsistencies were all introduced during Postgres 18 development.
This commit was written with help from clang-tidy, by mechanically
applying the same rules as similar clean-up commits (the earliest such
commit was commit 035ce1fe).
This moves/renames some of the functions defined in pg_numa.c:
* pg_numa_get_pagesize() is renamed to pg_get_shmem_pagesize(), and
moved to src/backend/storage/ipc/shmem.c. The new name better reflects
that the page size is not related to NUMA, and it's specifically about
the page size used for the main shared memory segment.
* move pg_numa_available() to src/backend/storage/ipc/shmem.c, i.e. into
the backend (which more appropriate for functions callable from SQL).
While at it, improve the comment to explain what page size it returns.
* remove unnecessary includes from src/port/pg_numa.c, adding
unnecessary dependencies (src/port should be suitable for frontent).
These were either leftovers or unnecessary thanks to the other changes
in this commit.
This eliminates unnecessary dependencies on backend symbols, which we
don't want in src/port.
Reported-by: Kirill Reshke <reshkekirill@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
https://postgr.es/m/CALdSSPi5fj0a7UG7Fmw2cUD1uWuckU_e8dJ+6x-bJEokcSXzqA@mail.gmail.com
Add basic NUMA awareness routines, using a minimal src/port/pg_numa.c
portability wrapper and an optional build dependency, enabled by
--with-libnuma configure option. For now this is Linux-only, other
platforms may be supported later.
A built-in SQL function pg_numa_available() allows checking NUMA
support, i.e. that the server was built/linked with the NUMA library.
The main function introduced is pg_numa_query_pages(), which allows
determining the NUMA node for individual memory pages. Internally the
function uses move_pages(2) syscall, as it allows batching, and is more
efficient than get_mempolicy(2).
Author: Jakub Wartak <jakub.wartak@enterprisedb.com>
Co-authored-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CAKZiRmxh6KWo0aqRqvmcoaX2jUxZYb4kGp3N%3Dq1w%2BDiH-696Xw%40mail.gmail.com
The previous implementation of CRC32C on x86 relied on the native
CRC32 instruction from the SSE 4.2 extension, which operates on
up to 8 bytes at a time. We can get a substantial speedup by using
carryless multiplication on SIMD registers, processing 64 bytes per
loop iteration. Shorter inputs fall back to ordinary CRC instructions.
On Intel Tiger Lake hardware (2020), CRC is now 50% faster for inputs
between 64 and 112 bytes, and 3x faster for 256 bytes.
The VPCLMULQDQ instruction on 512-bit registers has been available
on Intel hardware since 2019 and AMD since 2022. There is an older
variant for 128-bit registers, but at least on Zen 2 it performs worse
than normal CRC instructions for short inputs.
We must now do a runtime check, even for builds that target SSE
4.2. This doesn't matter in practice for WAL (arguably the most
critical case), because since commit e2809e3a1 the final computation
with the 20-byte WAL header is inlined and unrolled when targeting
that extension. Compared with two direct function calls, testing
showed equal or slightly faster performance in performing an indirect
function call on several dozen bytes followed by inlined instructions
on constant input of 20 bytes.
The MIT-licensed implementation was generated with the "generate"
program from
https://github.com/corsix/fast-crc32/
Based on: "Fast CRC Computation for Generic Polynomials Using PCLMULQDQ
Instruction" V. Gopal, E. Ozturk, et al., 2009
Co-authored-by: Raghuveer Devulapalli <raghuveer.devulapalli@intel.com>
Co-authored-by: Paul Amonson <paul.d.amonson@intel.com>
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de> (earlier version)
Reviewed-by: Matthew Sterrett <matthewsterrett2@gmail.com> (earlier version)
Tested-by: Raghuveer Devulapalli <raghuveer.devulapalli@intel.com>
Tested-by: David Rowley <<dgrowleyml@gmail.com>> (earlier version)
Discussion: https://postgr.es/m/BL1PR11MB530401FA7E9B1CA432CF9DC3DC192@BL1PR11MB5304.namprd11.prod.outlook.com
Discussion: https://postgr.es/m/PH8PR11MB82869FF741DFA4E9A029FF13FBF72@PH8PR11MB8286.namprd11.prod.outlook.com
On Red Hat 9 systems (or similar), the packaged gcc targets x86-64-v2,
but clang does not. This has caused build failures in the wake of
commit e2809e3a1 when building --with-llvm.
The most expedient fix is to use the same function attributes for
the inlined function as we do for the global function.
Reported-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> (plus members skimmer and bumblebee)
Diagnosed-by: Tom Lane <tgl@sss.pgh.pa.us>
Tested-by: Todd Cook <cookt@blackduck.com>
Discussion: https://postgr.es/m/CANWCAZZSxs3a1YRKehkgk2OHKbrVn+xZ+AWW8Co2R_f70NqqmA@mail.gmail.com
pg_crc32c.h now has a simplified copy of the loop in pg_crc32c_sse42.c
suitable for inlining where possible.
This may slightly reduce contention for the WAL insertion lock,
but that hasn't been tested. The motivation for this change is avoid
regressing for a future commit that will use a function pointer for
non-constant input in all x86 builds.
While it's technically possible to make a similar change for Arm and
LoongArch, there are some questions about how inlining should work
since those platforms prefer stricter alignment. There are also no
immediate plans to add additional implementations for them.
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Raghuveer Devulapalli <raghuveer.devulapalli@intel.com>
Discussion: https://postgr.es/m/CANWCAZZEiTzhZcuwTiJ2=opiNpAUn1vuDRu1N02z61AthwRZLA@mail.gmail.com
Discussion: https://postgr.es/m/CANWCAZYRhLHArpyfV4uRK-Rw9N5oV5HMkkKtBehcuTjNOMwCZg@mail.gmail.com
Remove setlocale() and _configthreadlocal() as fallback strategy on
systems that don't have uselocale(), where ECPG tries to control
LC_NUMERIC formatting on input and output of floating point numbers. It
was probably broken on some systems (NetBSD), and the code was also
quite messy and complicated, with obsolete configure tests (Windows).
It was also arguably broken, or at least had unstated environmental
requirements, if pgtypeslib code was called directly.
Instead, introduce PG_C_LOCALE to refer to the "C" locale as a locale_t
value. It maps to the special constant LC_C_LOCALE when defined by libc
(macOS, NetBSD), or otherwise uses a process-lifetime locale_t that is
allocated on first use, just as ECPG previously did itself. The new
replacement might be more widely useful. Then change the float parsing
and printing code to pass that to _l() functions where appropriate.
Unfortunately the portability of those functions is a bit complicated.
First, many obvious and useful _l() functions are missing from POSIX,
though most standard libraries define some of them anyway. Second,
although the thread-safe save/restore technique can be used to replace
the missing ones, Windows and NetBSD refused to implement standard
uselocale(). They might have a point: "wide scope" uselocale() is hard
to combine with other code and error-prone, especially in library code.
Luckily they have the _l() functions we want so far anyway. So we have
to be prepared for both ways of doing things:
1. In ECPG, use strtod_l() for parsing, and supply a port.h replacement
using uselocale() over a limited scope if missing.
2. Inside our own snprintf.c, use three different approaches to format
floats. For frontend code, call libc's snprintf_l(), or wrap libc's
snprintf() in uselocale() if it's missing. For backend code, snprintf.c
can keep assuming that the global locale's LC_NUMERIC is "C" and call
libc's snprintf() without change, for now.
(It might eventually be possible to call our in-tree Ryū routines to
display floats in snprintf.c, given the C-locale-always remit of our
in-tree snprintf(), but this patch doesn't risk changing anything that
complicated.)
Author: Thomas Munro <thomas.munro@gmail.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Tristan Partin <tristan@partin.io>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/CWZBBRR6YA8D.8EHMDRGLCKCD%40neon.tech
The default of 128kB is unchanged, but the upper limit is changed from
32 blocks to 128 blocks, unless the operating system's IOV_MAX is too
low. Some other RDBMSes seem to cap their multi-block buffer pool I/O
around this number, and it seems useful to allow experimentation.
The concrete change is to our definition of PG_IOV_MAX, which provides
the maximum for io_combine_limit and io_max_combine_limit. It also
affects a couple of other places that work with arrays of struct iovec
or smaller objects on the stack, so we still don't want to use the
system IOV_MAX directly without a clamp: it is not under our control and
likely to be 1024. 128 seems acceptable for our current usage.
For Windows, we can't use real scatter/gather yet, so we continue to
define our own IOV_MAX value of 16 and emulate preadv()/pwritev() with
loops. Someone would need to research the trade-offs of raising that
number.
NB if trying to see this working: you might temporarily need to hack
BAS_BULKREAD to be bigger, since otherwise the obvious way of "a very
big SELECT" is limited by that for now.
Suggested-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CA%2BhUKG%2B2T9p-%2BzM6Eeou-RAJjTML6eit1qn26f9twznX59qtCA%40mail.gmail.com
Since commit f78667bd91, we've used __attribute__((target(...)))
instead of extra compiler flags for AVX-512 support, but this
comment still says that we put the code in a separate file because
it might require extra compiler flags. Let's just remove that part
of the comment.
It is not clear why these were originally added. One hypothesis is
that an ancient version of MinGW didn't define them. In any case,
they appear to now be superfluous, so let's remove them. If
nothing else, the buildfarm might offer us clues to their origins.
Reviewed-by: Thomas Munro
Discussion: https://postgr.es/m/Z4chOKfnthRH71mw%40nathan
Add various widely useful "IWYU pragma" annotations, such as
- Common header files such as c.h, postgres.h should be "always_keep".
- System headers included in c.h, postgres.h etc. should be considered
"export".
- Some portability headers such as getopt_long.h should be
"always_keep", so they are not considered superfluous on some
platforms.
- Certain system headers included from portability headers should be
considered "export" because the purpose of the portability header is
to wrap them.
- Superfluous includes marked as "for backward compatibility" get a
formal IWYU annotation.
- Generated header included in utils/syscache.h is marked exported.
This is a very commonly used include and this avoids lots of
complaints.
Discussion: https://www.postgresql.org/message-id/flat/9395d484-eff4-47c2-b276-8e228526c8ae@eisentraut.org
Redefine our exact width types with standard C99 types and macros,
including int64_t, INT64_MAX, INT64_C(), PRId64 etc. We were already
using <stdint.h> types in a few places.
One complication is that Windows' <inttypes.h> uses format strings like
"%I64d", "%I32", "%I" for PRI*64, PRI*32, PTR*PTR, instead of mapping to
other standardized format strings like "%lld" etc as seen on other known
systems. Teach our snprintf.c to understand them.
This removes a lot of configure clutter, and should also allow 64-bit
numbers and other standard types to be used in localized messages
without casting.
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/ME3P282MB3166F9D1F71F787929C0C7E7B6312%40ME3P282MB3166.AUSP282.PROD.OUTLOOK.COM
The code in question (pg_preadv() and pg_pwritev()) has been around
for a while, but commit 15c9ac3629 moved it to a header file. If
third-party code that includes this header file is built with
-Wsign-compare on a system without preadv() or pwritev(), warnings
ensue. This commit fixes said warnings by casting the result of
pg_pread()/pg_pwrite() to size_t, which should be safe because we
will have already checked for a negative value.
Author: Wolfgang Walther
Discussion: https://postgr.es/m/16989737-1aa8-48fd-8dfe-b7ada06509ab%40technowledgy.de
Backpatch-through: 17
Use gmtime_r() and localtime_r() instead of gmtime() and localtime(),
for thread-safety.
There are a few affected calls in libpq and ecpg's libpgtypes, which
are probably effectively bugs, because those libraries already claim
to be thread-safe.
There is one affected call in the backend. Most of the backend
otherwise uses the custom functions pg_gmtime() and pg_localtime(),
which are implemented differently.
While we're here, change the call in the backend to gmtime*() instead
of localtime*(), since for that use time zone behavior is irrelevant,
and this side-steps any questions about when time zones are
initialized by localtime_r() vs localtime().
Portability: gmtime_r() and localtime_r() are in POSIX but are not
available on Windows. Windows has functions gmtime_s() and
localtime_s() that can fulfill the same purpose, so we add some small
wrappers around them. (Note that these *_s() functions are also
different from the *_s() functions in the bounds-checking extension of
C11. We are not using those here.)
On MinGW, you can get the POSIX-style *_r() functions by defining
_POSIX_C_SOURCE appropriately before including <time.h>. This leads
to a conflict at least in plpython because apparently _POSIX_C_SOURCE
gets defined in some header there, and then our replacement
definitions conflict with the system definitions. To avoid that sort
of thing, we now always define _POSIX_C_SOURCE on MinGW and use the
POSIX-style functions here.
Reviewed-by: Stepan Neretin <sncfmgg@gmail.com>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/eba1dc75-298e-4c46-8869-48ba8aad7d70@eisentraut.org
Previously we had a fallback implementation that made a harmless system
call, based on the assumption that system calls must contain a memory
barrier. That shouldn't be reached on any current system, and it seems
highly likely that we can easily find out how to request explicit memory
barriers, if we've already had to find out how to do atomics on a
hypothetical new system.
Removed comments and a function name referred to a spinlock used for
fallback memory barriers, but that changed in 1b468a13, which left some
misleading words behind in a few places.
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Suggested-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/721bf39a-ed8a-44b0-8b8e-be3bd81db748%40technowledgy.de
Discussion: https://postgr.es/m/3351991.1697728588%40sss.pgh.pa.us
Previously we had a fallback implementation of pg_compiler_barrier()
that called an empty function across a translation unit boundary so the
compiler couldn't see what it did. That shouldn't be needed on any
current systems, and might not even work with a link time optimizer.
Since we now require compiler-specific knowledge of how to implement
atomics, we should also know how to implement compiler barriers on a
hypothetical new system.
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Suggested-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/721bf39a-ed8a-44b0-8b8e-be3bd81db748%40technowledgy.de
Discussion: https://postgr.es/m/3351991.1697728588%40sss.pgh.pa.us
Modern versions of all relevant architectures and tool chains have
atomics support. Since edadeb07, there is no remaining reason to carry
code that simulates atomic flags and uint32 imperfectly with spinlocks.
64 bit atomics are still emulated with spinlocks, if needed, for now.
Any modern compiler capable of implementing C11 <stdatomic.h> must have
the underlying operations we need, though we don't require C11 yet. We
detect certain compilers and architectures, so hypothetical new systems
might need adjustments here.
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> (concept, not the patch)
Reviewed-by: Andres Freund <andres@anarazel.de> (concept, not the patch)
Discussion: https://postgr.es/m/3351991.1697728588%40sss.pgh.pa.us
This code wanted to ensure that the 'exchange' variable passed to
pg_atomic_compare_exchange_u64 has correct alignment, but apparently
platforms don't actually require anything that doesn't come naturally.
While messing with pg_atomic_monotonic_advance_u64: instead of using
Max() to determine the value to return, just use
pg_atomic_compare_exchange_u64()'s return value to decide; also, use
pg_atomic_compare_exchange_u64 instead of the _impl version; also remove
the unnecessary underscore at the end of variable name "target".
Backpatch to 17, where this code was introduced by commit bf3ff7bf83bc.
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/36796438-a718-cf9b-2071-b2c1b947c1b5@gmail.com
This old CPU architecture hasn't been produced in decades, and
whatever instances might still survive are surely too underpowered
for anyone to consider running Postgres on in production. We'd
nonetheless continued to carry code support for it (largely at my
insistence), because its unique implementation of spinlocks seemed
like a good edge case for our spinlock infrastructure. However,
our last buildfarm animal of this type was retired last year, and
it seems quite unlikely that another will emerge. Without the ability
to run tests, the argument that this is useful test code fails to
hold water. Furthermore, carrying code support for an untestable
architecture has costs not to be ignored. So, remove HPPA-specific
code, in the same vein as commits 718aa43a4 and 92d70b77e.
Discussion: https://postgr.es/m/3351991.1697728588@sss.pgh.pa.us
This tracks the position of WAL that's been fully copied into WAL
buffers by all processes emitting WAL. (For some reason we call that
"WAL insertion"). This is updated using atomic monotonic advance during
WaitXLogInsertionsToFinish, which is not when the insertions actually
occur, but it's the only place where we know where have all the
insertions have completed.
This value is useful in WALReadFromBuffers, which can verify that
callers don't try to read past what has been inserted. (However, more
infrastructure is needed in order to actually use WAL after the flush
point, since it could be lost.)
The value is also useful in WaitXLogInsertionsToFinish() itself, since
we can now exit quickly when all WAL has been already inserted, without
even having to take any locks.
Commit 792752af4e added infrastructure for using AVX-512 intrinsic
functions, and this commit uses that infrastructure to optimize
visibilitymap_count(). Specificially, a new pg_popcount_masked()
function is introduced that applies a bitmask to every byte in the
buffer prior to calculating the population count, which is used to
filter out the all-visible or all-frozen bits as needed. Platforms
without AVX-512 support should also see a nice speedup due to the
reduced number of calls to a function pointer.
Co-authored-by: Ants Aasma
Discussion: https://postgr.es/m/BL1PR11MB5304097DF7EA81D04C33F3D1DCA6A%40BL1PR11MB5304.namprd11.prod.outlook.com
Presently, pg_popcount() processes data in 32-bit or 64-bit chunks
when possible. Newer hardware that supports AVX-512 instructions
can use 512-bit chunks, which provides a nice speedup, especially
for larger buffers. This commit introduces the infrastructure
required to detect compiler and CPU support for the required
AVX-512 intrinsic functions, and it adds a new pg_popcount()
implementation that uses these functions. If CPU support for this
optimized implementation is detected at runtime, a function pointer
is updated so that it is used by subsequent calls to pg_popcount().
Most of the existing in-tree calls to pg_popcount() should benefit
from these instructions, and calls with smaller buffers should at
least not regress compared to v16. The new infrastructure
introduced by this commit can also be used to optimize
visibilitymap_count(), but that is left for a follow-up commit.
Co-authored-by: Paul Amonson, Ants Aasma
Reviewed-by: Matthias van de Meent, Tom Lane, Noah Misch, Akash Shankaran, Alvaro Herrera, Andres Freund, David Rowley
Discussion: https://postgr.es/m/BL1PR11MB5304097DF7EA81D04C33F3D1DCA6A%40BL1PR11MB5304.namprd11.prod.outlook.com
If there aren't many bytes to process, the function call overhead
of the optimized implementation isn't worth taking, so instead we
inline a loop that consults pg_number_of_ones in that case. If
there are many bytes to process, we accept the function call
overhead because the optimized versions are likely to be faster.
The threshold at which we use the optimized implementation is set
to the smallest amount of data required to use special popcount
instructions.
Reviewed-by: Alvaro Herrera, Tom Lane
Discussion: https://postgr.es/m/20240402155301.GA2750455%40nathanxps13
The newly-introduced "one_by_one" label produces -Wunused-label
warnings when building without SIMD support. To fix, move the
label into the SIMD section of this function.
Oversight in commit 7644a7340c.
Reported-by: Tom Lane
Discussion: https://postgr.es/m/3189995.1711495704%40sss.pgh.pa.us
This commit improves the performance of pg_lfind32() in many cases
by modifying it to process the remaining "tail" of elements with
SIMD instructions instead of processing them one-by-one. Since the
SIMD code processes a large block of elements, this means that we
will process a subset of elements more than once, but that won't
affect the correctness of the result, and testing has shown that
this helps more cases than it regresses. With this change, the
standard one-by-one linear search code is only used for small
arrays and for platforms without SIMD support.
Suggested-by: John Naylor
Reviewed-by: John Naylor
Discussion: https://postgr.es/m/20231129171526.GA857928%40nathanxps13
On some systems, calls to pg_popcount{32,64} are indirected through
a function pointer. This commit converts pg_popcount() to a
function pointer on those systems so that we can inline the
appropriate pg_popcount{32,64} implementations into each of the
pg_popcount() implementations. Since pg_popcount() may call
pg_popcount{32,64} several times, this can significantly improve
its performance.
Suggested-by: David Rowley
Reviewed-by: David Rowley
Discussion: https://postgr.es/m/CAApHDvrb7MJRB6JuKLDEY2x_LKdFHwVbogpjZBCX547i5%2BrXOQ%40mail.gmail.com
The first argument of vshrq_n_s8 needs to be a signed vector type,
but it was passed unsigned. Clang is more lax with conversion, but
gcc needs a cast.
Fix by me, tested by Masahiko Sawada
Per buildfarm members splitfin, batta, widowbird, snakefly, parula,
massasauga
Discussion: https://postgr.es/m/20240306074106.mg6w4koohdlworbs%40alap3.anarazel.de
Writing correct code using atomic variables is often difficult due
to the memory barrier semantics (or lack thereof) of the underlying
operations. This commit introduces atomic read/write functions
with full barrier semantics to ease this cognitive load. For
example, some spinlocks protect a single value, and these new
functions make it easy to convert the value to an atomic variable
(thus eliminating the need for the spinlock) without modifying the
barrier semantics previously provided by the spinlock. Since these
functions may be less performant than the other atomic reads and
writes, they are not suitable for every use-case. However, using a
single atomic operation with full barrier semantics may be more
performant in cases where a separate explicit barrier would
otherwise be required.
The base implementations for these new functions are atomic
exchanges (for writes) and atomic fetch/adds with 0 (for reads).
These implementations can be overwritten with better architecture-
specific versions as they are discovered.
This commit leaves converting existing code to use these new
functions as a future exercise.
Reviewed-by: Andres Freund, Yong Li, Jeff Davis
Discussion: https://postgr.es/m/20231110205128.GB1315705%40nathanxps13
There isn't a lot of user demand for AIX support, we have a bunch of
hacks to work around AIX-specific compiler bugs and idiosyncrasies,
and no one has stepped up to the plate to properly maintain it.
Remove support for AIX to get rid of that maintenance overhead. It's
still supported for stable versions.
The acute issue that triggered this decision was that after commit
8af2565248, the AIX buildfarm members have been hitting this
assertion:
TRAP: failed Assert("(uintptr_t) buffer == TYPEALIGN(PG_IO_ALIGN_SIZE, buffer)"), File: "md.c", Line: 472, PID: 2949728
Apperently the "pg_attribute_aligned(a)" attribute doesn't work on AIX
for values larger than PG_IO_ALIGN_SIZE, for a static const variable.
That could be worked around, but we decided to just drop the AIX support
instead.
Discussion: https://www.postgresql.org/message-id/20240224172345.32@rfd.leadboat.com
Reviewed-by: Andres Freund, Noah Misch, Thomas Munro
Commit 5579388d removed code that supplied a fallback implementation of
getaddrinfo(), which was dead code on modern systems. One tiny piece of
the removed code was still doing something useful on Windows, though:
that OS's own gai_strerror()/gai_strerrorA() function returns a pointer
to a static buffer that it overwrites each time, so it's not
thread-safe. In rare circumstances, a multi-threaded client program
could get an incorrect or corrupted error message.
Restore the replacement gai_strerror() function, though now that it's
only for Windows we can put it into a win32-specific file and cut it
down to the errors that Windows documents. The error messages here are
taken from FreeBSD, because Windows' own messages seemed too verbose.
Back-patch to 16.
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKGKz%2BF9d2PTiXwfYV7qJw%2BWg2jzACgSDgPizUw7UG%3Di58A%40mail.gmail.com
Presently, all platforms implement atomic exchanges by performing
an atomic compare-and-swap in a loop until it succeeds. This can
be especially expensive when there is contention on the atomic
variable. This commit optimizes atomic exchanges on many platforms
by using compiler intrinsics, which should compile into something
much less expensive than a compare-and-swap loop. Since these
intrinsics have been available for some time, the inline assembly
implementations are omitted.
Suggested-by: Andres Freund
Reviewed-by: Andres Freund
Discussion: https://postgr.es/m/20231129212905.GA1258737%40nathanxps13
For the trivial case of iovcnt == 1, kernels are measurably slower at
dealing with the more complex arguments of preadv/pwritev than the
equivalent plain old pread/pwrite. The overheads are worth it for
iovcnt > 1, but for 1 let's just redirect to the cheaper calls. While
we could leave it to callers to worry about that, we already have to
have our own pg_ wrappers for portability reasons so it seems
reasonable to centralize this knowledge there (thanks to Heikki for this
suggestion). Try to avoid function call overheads by making them
inlinable, which might also allow the compiler to avoid the branch in
some cases. For systems that don't have preadv and pwritev (currently:
Windows and [closed] Solaris), we might as well pull the replacement
functions up into the static inline functions too.
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/CA+hUKGJkOiOCa+mag4BF+zHo7qo=o9CFheB8=g6uT5TUm2gkvA@mail.gmail.com