Commit Graph

797 Commits

Author SHA1 Message Date
5cbfce562f Initial pgindent and pgperltidy run for v13.
Includes some manual cleanup of places that pgindent messed up,
most of which weren't per project style anyway.

Notably, it seems some people didn't absorb the style rules of
commit c9d297751, because there were a bunch of new occurrences
of function calls with a newline just after the left paren, all
with faulty expectations about how the rest of the call would get
indented.
2020-05-14 13:06:50 -04:00
7a9c9ce641 Translation updates
Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash: 80d8f54b3c5533ec036404bd3c3b24ff4825d037
2020-05-11 13:14:32 +02:00
7666ef313d Unify find_other_exec() error messages
There were a few different ways to line-wrap the error messages.  Make
them all the same, and use placeholders for the actual program names,
to save translation work.
2020-05-08 13:34:53 +02:00
6c5f916168 Update Windows timezone name list to include currently-known zones.
Thanks to Juan José Santamaría Flecha.

Discussion: https://postgr.es/m/5752.1587740484@sss.pgh.pa.us
2020-04-24 17:53:23 -04:00
bd8c5cee96 Improve placement of "display name" comment in win32_tzmap[] entries.
Sticking this comment at the end of the last line was a bad idea: it's
not particularly readable, and it tempts pgindent to mess with line
breaks within the comment, which in turn reveals that win32tzlist.pl's
clean_displayname() does the wrong thing to clean up such line breaks.
While that's not hard to fix, there's basically no excuse for this
arrangement to begin with, especially since it makes the table layout
needlessly vary across back branches with different pgindent rules.
Let's just put the comment inside the braces, instead.

This commit just moves and reformats the comments, and updates
win32tzlist.pl to match; there's no actual data change.

Per odd-looking results from Juan José Santamaría Flecha.
Back-patch, since the point is to make win32_tzmap[] look the
same in all supported branches again.

Discussion: https://postgr.es/m/5752.1587740484@sss.pgh.pa.us
2020-04-24 17:21:44 -04:00
1933ae629e Add PostgreSQL home page to --help output
Per emerging standard in GNU programs and elsewhere.  Autoconf already
has support for specifying a home page, so we can just that.

Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://www.postgresql.org/message-id/flat/8d389c5f-7fb5-8e48-9a4a-68cec44786fa%402ndquadrant.com
2020-02-28 13:12:21 +01:00
864934131e Refer to bug report address by symbol rather than hardcoding
Use the PACKAGE_BUGREPORT macro that is created by Autoconf for
referring to the bug reporting address rather than hardcoding it
everywhere.  This makes it easier to change the address and it reduces
translation work.

Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://www.postgresql.org/message-id/flat/8d389c5f-7fb5-8e48-9a4a-68cec44786fa%402ndquadrant.com
2020-02-28 13:12:21 +01:00
e2e02191e2 Clean up some code, comments and docs referring to Windows 2000 and older
This fixes and updates a couple of comments related to outdated Windows
versions.  Particularly, src/common/exec.c had a fallback implementation
to read a file's line from a pipe because stdin/stdout/stderr does not
exist in Windows 2000 that is removed to simplify src/common/ as there
are unlikely versions of Postgres running on such platforms.

Author: Michael Paquier
Reviewed-by: Kyotaro Horiguchi, Juan José Santamaría Flecha
Discussion: https://postgr.es/m/20191219021526.GC4202@paquier.xyz
2020-02-19 13:20:33 +09:00
7aaefadaac Remove separate files for the initial contents of pg_(sh)description
This data was only in separate files because it was the most convenient
way to handle it with a shell script. Now that we use a general-purpose
programming language, it's easy to assemble the data into the same format
as the rest of the catalogs and output it into postgres.bki. This allows
removal of some special-purpose code from initdb.c.

Discussion: https://www.postgresql.org/message-id/CACPNZCtVFtjHre6hg9dput0qRPp39pzuyA2A6BT8wdgrRy%2BQdA%40mail.gmail.com
Author: John Naylor
2020-01-19 13:54:58 +02:00
e6afa8918c Move wchar.c and encnames.c to src/common/.
Formerly, various frontend directories symlinked these two sources
and then built them locally.  That's an ancient, ugly hack, and
we now have a much better way: put them into libpgcommon.
So do that.  (The immediate motivation for this is the prospect
of having to introduce still more symlinking if we don't.)

This commit moves these two files absolutely verbatim, for ease of
reviewing the git history.  There's some follow-on work to be done
that will modify them a bit.

Robert Haas, Tom Lane

Discussion: https://postgr.es/m/CA+TgmoYO8oq-iy8E02rD8eX25T-9SmyxKWqqks5OMHxKvGXpXQ@mail.gmail.com
2020-01-16 15:58:55 -05:00
7559d8ebfa Update copyrights for 2020
Backpatch-through: update all files in master, backpatch legal files through 9.4
2020-01-01 12:21:45 -05:00
2e4db241bf Remove configure --disable-float4-byval
This build option was only useful to maintain compatibility for
version-0 functions, but those are no longer supported, so this option
can be removed.

float4 is now always pass-by-value; the pass-by-reference code path is
completely removed.

Discussion: https://www.postgresql.org/message-id/flat/f3e1e576-2749-bbd7-2d57-3f9dcf75255a@2ndquadrant.com
2019-11-21 18:29:21 +01:00
01368e5d9d Split all OBJS style lines in makefiles into one-line-per-entry style.
When maintaining or merging patches, one of the most common sources
for conflicts are the list of objects in makefiles. Especially when
the split across lines has been changed on both sides, which is
somewhat common due to attempting to stay below 80 columns, those
conflicts are unnecessarily laborious to resolve.

By splitting, and alphabetically sorting, OBJS style lines into one
object per line, conflicts should be less frequent, and easier to
resolve when they still occur.

Author: Andres Freund
Discussion: https://postgr.es/m/20191029200901.vww4idgcxv74cwes@alap3.anarazel.de
2019-11-05 14:41:07 -08:00
fded4773eb initdb: Remove obsolete locale handling
The method of passing LC_COLLATE and LC_CTYPE to the backend during
initdb is obsolete as of 61d967498802ab86d8897cb3c61740d7e9d712f6.
This can all be removed.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/eeaf2f99-a1a6-8aca-3f43-9ab0b2fb112a%402ndquadrant.com
2019-08-14 06:51:13 +02:00
43211c2a02 initdb: Use varargs macro for PG_CMD_PRINTF
I left PG_CMD_PUTS around even though it could be handled by
PG_CMD_PRINTF since PG_CMD_PUTS is sometimes called with non-literal
arguments, and so that would create a potential problem if such a
string contained percent signs.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
2019-08-08 08:47:55 +02:00
8548ddc61b Fix inconsistencies and typos in the tree, take 9
This addresses more issues with code comments, variable names and
unreferenced variables.

Author: Alexander Lakhin
Discussion: https://postgr.es/m/7ab243e0-116d-3e44-d120-76b3df7abefd@gmail.com
2019-08-05 12:14:58 +09:00
eb43f3d193 Fix inconsistencies and typos in the tree
This is numbered take 8, and addresses again a set of issues with code
comments, variable names and unreferenced variables.

Author: Alexander Lakhin
Discussion: https://postgr.es/m/b137b5eb-9c95-9c2f-586e-38aba7d59788@gmail.com
2019-07-29 12:28:30 +09:00
3754113f33 Avoid choosing "localtime" or "posixrules" as TimeZone during initdb.
Some platforms create a file named "localtime" in the system
timezone directory, making it a copy or link to the active time
zone file.  If Postgres is built with --with-system-tzdata, initdb
will see that file as an exact match to localtime(3)'s behavior,
and it may decide that "localtime" is the most preferred spelling of
the active zone.  That's a very bad choice though, because it's
neither informative, nor portable, nor stable if someone changes
the system timezone setting.  Extend the preference logic added by
commit e3846a00c so that we will prefer any other zone file that
matches localtime's behavior over "localtime".

On the same logic, also discriminate against "posixrules", which
is another not-really-a-zone file that is often present in the
timezone directory.  (Since we install "posixrules" but not
"localtime", this change can affect the behavior of Postgres
with or without --with-system-tzdata.)

Note that this change doesn't prevent anyone from choosing these
pseudo-zones if they really want to (i.e., by setting TZ for initdb,
or modifying the timezone GUC later on).  It just prevents initdb
from preferring these zone names when there are multiple matches to
localtime's behavior.

Since we generally prefer to keep timezone-related behavior the
same in all branches, and since this is arguably a bug fix,
back-patch to all supported branches.

Discussion: https://postgr.es/m/CADT4RqCCnj6FKLisvT8tTPfTP4azPhhDFJqDF1JfBbOH5w4oyQ@mail.gmail.com
Discussion: https://postgr.es/m/27991.1560984458@sss.pgh.pa.us
2019-07-26 12:45:32 -04:00
7961886580 Revert "initdb: Change authentication defaults"
This reverts commit 09f08930f0f6fd4a7350ac02f29124b919727198.

The buildfarm client needs some adjustments first.
2019-07-22 19:28:25 +02:00
09f08930f0 initdb: Change authentication defaults
Change the defaults for the pg_hba.conf generated by initdb to "peer"
for local (if supported, else "md5") and "md5" for host.

(Changing from "md5" to SCRAM is left as a separate exercise.)

"peer" is currently not supported on AIX, HP-UX, and Windows.  Users
on those operating systems will now either have to provide a password
to initdb or choose a different authentication method when running
initdb.

Reviewed-by: Julien Rouhaud <rjuju123@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/bec17f0a-ddb1-8b95-5e69-368d9d0a3390%40postgresql.org
2019-07-22 15:14:27 +02:00
e435c1e7d9 Message style improvements 2019-07-09 15:47:09 +02:00
7b925e1270 Sync our Snowball stemmer dictionaries with current upstream
The main change is a new stemmer for Greek.  There are minor changes
in the Danish and French stemmers.

Author: Panagiotis Mavrogiorgos <pmav99@gmail.com>
2019-07-04 13:26:48 +02:00
84c41ae81b Fix accidentally swapped error message arguments
Author: Alexey Kondratov <a.kondratov@postgrespro.ru>
2019-07-02 23:44:30 +01:00
9e1c9f9594 pgindent run prior to branching v12.
pgperltidy and reformat-dat-files too, though the latter didn't
find anything to change.
2019-07-01 12:37:52 -04:00
95bbe5d82e Convert some stragglers to new frontend logging API 2019-07-01 13:34:31 +02:00
91acff7a53 Translation updates
Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash: 1a710c413ce4c4cd081843e563cde256bb95f490
2019-06-17 15:30:20 +02:00
e3846a00c2 Prefer timezone name "UTC" over alternative spellings.
tzdb 2019a made "UCT" a link to the "UTC" zone rather than a separate
zone with its own abbreviation. Unfortunately, our code for choosing a
timezone in initdb has an arbitrary preference for names earlier in
the alphabet, and so it would choose the spelling "UCT" over "UTC"
when the system is running on a UTC zone.

Commit 23bd3cec6 was backpatched in order to address this issue, but
that code helps only when /etc/localtime exists as a symlink, and does
nothing to help on systems where /etc/localtime is a copy of a zone
file (as is the standard setup on FreeBSD and probably some other
platforms too) or when /etc/localtime is simply absent (giving UTC as
the default).

Accordingly, add a preference for the spelling "UTC", such that if
multiple zone names have equally good content matches, we prefer that
name before applying the existing arbitrary rules. Also add a slightly
lower preference for "Etc/UTC"; lower because that preserves the
previous behaviour of choosing the shorter name, but letting us still
choose "Etc/UTC" over "Etc/UCT" when both exist but "UTC" does
not (not common, but I've seen it happen).

Backpatch all the way, because the tzdb change that sparked this issue
is in those branches too.
2019-06-15 18:15:23 +01:00
db6e2b4c52 Initial pgperltidy run for v12.
Make all the perl code look nice, too (for some value of "nice").
2019-05-22 13:36:19 -04:00
8255c7a5ee Phase 2 pgindent run for v12.
Switch to 2.1 version of pg_bsd_indent.  This formats
multiline function declarations "correctly", that is with
additional lines of parameter declarations indented to match
where the first line's left parenthesis is.

Discussion: https://postgr.es/m/CAEepm=0P3FeTXRcU5B2W3jv3PgRVZ-kGUXLGfd42FFhUROO3ug@mail.gmail.com
2019-05-22 13:04:48 -04:00
be76af171c Initial pgindent run for v12.
This is still using the 2.0 version of pg_bsd_indent.
I thought it would be good to commit this separately,
so as to document the differences between 2.0 and 2.1 behavior.

Discussion: https://postgr.es/m/16296.1558103386@sss.pgh.pa.us
2019-05-22 12:55:34 -04:00
3c439a58df Translation updates
Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash: a20bf6b8a5b4e32450967055eb5b07cee4704edd
2019-05-20 16:00:53 +02:00
fc9a62af3f Move logging.h and logging.c from src/fe_utils/ to src/common/.
The original placement of this module in src/fe_utils/ is ill-considered,
because several src/common/ modules have dependencies on it, meaning that
libpgcommon and libpgfeutils now have mutual dependencies.  That makes it
pointless to have distinct libraries at all.  The intended design is that
libpgcommon is lower-level than libpgfeutils, so only dependencies from
the latter to the former are acceptable.

We already have the precedent that fe_memutils and a couple of other
modules in src/common/ are frontend-only, so it's not stretching anything
out of whack to treat logging.c as a frontend-only module in src/common/.
To the extent that such modules help provide a common frontend/backend
environment for the rest of common/ to use, it's a reasonable design.
(logging.c does not yet provide an ereport() emulation, but one can
dream.)

Hence, move these files over, and revert basically all of the build-system
changes made by commit cc8d41511.  There are no places that need to grow
new dependencies on libpgcommon, further reinforcing the idea that this
is the right solution.

Discussion: https://postgr.es/m/a912ffff-f6e4-778a-c86a-cf5c47a12933@2ndquadrant.com
2019-05-14 14:20:10 -04:00
cc8d415117 Unified logging system for command-line programs
This unifies the various ad hoc logging (message printing, error
printing) systems used throughout the command-line programs.

Features:

- Program name is automatically prefixed.

- Message string does not end with newline.  This removes a common
  source of inconsistencies and omissions.

- Additionally, a final newline is automatically stripped, simplifying
  use of PQerrorMessage() etc., another common source of mistakes.

- I converted error message strings to use %m where possible.

- As a result of the above several points, more translatable message
  strings can be shared between different components and between
  frontends and backend, without gratuitous punctuation or whitespace
  differences.

- There is support for setting a "log level".  This is not meant to be
  user-facing, but can be used internally to implement debug or
  verbose modes.

- Lazy argument evaluation, so no significant overhead if logging at
  some level is disabled.

- Some color in the messages, similar to gcc and clang.  Set
  PG_COLOR=auto to try it out.  Some colors are predefined, but can be
  customized by setting PG_COLORS.

- Common files (common/, fe_utils/, etc.) can handle logging much more
  simply by just using one API without worrying too much about the
  context of the calling program, requiring callbacks, or having to
  pass "progname" around everywhere.

- Some programs called setvbuf() to make sure that stderr is
  unbuffered, even on Windows.  But not all programs did that.  This
  is now done centrally.

Soft goals:

- Reduces vertical space use and visual complexity of error reporting
  in the source code.

- Encourages more deliberate classification of messages.  For example,
  in some cases it wasn't clear without analyzing the surrounding code
  whether a message was meant as an error or just an info.

- Concepts and terms are vaguely aligned with popular logging
  frameworks such as log4j and Python logging.

This is all just about printing stuff out.  Nothing affects program
flow (e.g., fatal exits).  The uses are just too varied to do that.
Some existing code had wrappers that do some kind of print-and-exit,
and I adapted those.

I tried to keep the output mostly the same, but there is a lot of
historical baggage to unwind and special cases to consider, and I
might not always have succeeded.  One significant change is that
pg_rewind used to write all error messages to stdout.  That is now
changed to stderr.

Reviewed-by: Donald Dong <xdong@csumb.edu>
Reviewed-by: Arthur Zakirov <a.zakirov@postgrespro.ru>
Discussion: https://www.postgresql.org/message-id/flat/6a609b43-4f57-7348-6480-bd022f924310@2ndquadrant.com
2019-04-01 20:01:35 +02:00
5e1963fb76 Collations with nondeterministic comparison
This adds a flag "deterministic" to collations.  If that is false,
such a collation disables various optimizations that assume that
strings are equal only if they are byte-wise equal.  That then allows
use cases such as case-insensitive or accent-insensitive comparisons
or handling of strings with different Unicode normal forms.

This functionality is only supported with the ICU provider.  At least
glibc doesn't appear to have any locales that work in a
nondeterministic way, so it's not worth supporting this for the libc
provider.

The term "deterministic comparison" in this context is from Unicode
Technical Standard #10
(https://unicode.org/reports/tr10/#Deterministic_Comparison).

This patch makes changes in three areas:

- CREATE COLLATION DDL changes and system catalog changes to support
  this new flag.

- Many executor nodes and auxiliary code are extended to track
  collations.  Previously, this code would just throw away collation
  information, because the eventually-called user-defined functions
  didn't use it since they only cared about equality, which didn't
  need collation information.

- String data type functions that do equality comparisons and hashing
  are changed to take the (non-)deterministic flag into account.  For
  comparison, this just means skipping various shortcuts and tie
  breakers that use byte-wise comparison.  For hashing, we first need
  to convert the input string to a canonical "sort key" using the ICU
  analogue of strxfrm().

Reviewed-by: Daniel Verite <daniel@manitou-mail.org>
Reviewed-by: Peter Geoghegan <pg@bowt.ie>
Discussion: https://www.postgresql.org/message-id/flat/1ccc668f-4cbc-0bef-af67-450b47cdfee7@2ndquadrant.com
2019-03-22 12:12:43 +01:00
6dd263cfaa Rename pg_verify_checksums to pg_checksums
The current tool name is too restrictive and focuses only on verifying
checksums.  As more options to control checksums for an offline cluster
are planned to be added, switch to a more generic name.  Documentation
as well as all past references to the tool are updated.

Author: Michael Paquier
Reviewed-by: Michael Banck, Fabien Coelho, Seigei Kornilov
Discussion: https://postgr.es/m/20181221201616.GD4974@nighthawk.caipicrew.dd-dns.de
2019-03-13 10:43:20 +09:00
0301db623d Replace @postgresql.org with @lists.postgresql.org for mailinglists
Commit c0d0e54084 replaced the ones in the documentation, but missed out
on the ones in the code. Replace those as well, but unlike c0d0e54084,
don't backpatch the code changes to avoid breaking translations.
2019-01-19 19:06:35 +01:00
3913a40ff1 initdb: Use atexit()
Replace exit_nicely() calls with standard exit() and register the
cleanup actions using atexit().  The coding pattern used here mirrors
existing use in pg_basebackup.c.

Reviewed-by: Alvaro Herrera <alvherre@2ndquadrant.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://www.postgresql.org/message-id/flat/ec4135ba-84e9-28bf-b584-0e78d47448d5@2ndquadrant.com/
2019-01-07 16:24:50 +01:00
d33faa285b Move the built-in conversions into the initial catalog data.
Instead of running a SQL script to create the standard conversion
functions and pg_conversion entries, put those entries into the
initial data in postgres.bki.

This shaves a few percent off the runtime of initdb, and also allows
accurate comments to be attached to the conversion functions; the
previous script labeled them with machine-generated comments that
were not quite right for multi-purpose conversion functions.
Also, we can get rid of the duplicative Makefile and MSVC perl
implementations of the generation code for that SQL script.

A functional change is that these pg_proc and pg_conversion entries
are now "pinned" by initdb.  Leaving them unpinned was perhaps a
good thing back while the conversions feature was under development,
but there seems no valid reason for it now.

Also, the conversion functions are now marked as immutable, where
before they were volatile by virtue of lacking any explicit
specification.  That seems like it was just an oversight.

To avoid using magic constants in pg_conversion.dat, extend
genbki.pl to allow encoding names to be converted, much as it
does for language, access method, etc names.

John Naylor

Discussion: https://postgr.es/m/CAJVSVGWtUqxpfAaxS88vEGvi+jKzWZb2EStu5io-UPc4p9rSJg@mail.gmail.com
2019-01-03 19:47:53 -05:00
97c39498e5 Update copyright for 2019
Backpatch-through: certain files through 9.4
2019-01-02 12:44:25 -05:00
f4eabaf3e0 Fix ancient compiler warnings and typos in !HAVE_SYMLINK code
This has never been correct since this code was introduced.
2018-12-22 07:21:40 +01:00
578b229718 Remove WITH OIDS support, change oid catalog column visibility.
Previously tables declared WITH OIDS, including a significant fraction
of the catalog tables, stored the oid column not as a normal column,
but as part of the tuple header.

This special column was not shown by default, which was somewhat odd,
as it's often (consider e.g. pg_class.oid) one of the more important
parts of a row.  Neither pg_dump nor COPY included the contents of the
oid column by default.

The fact that the oid column was not an ordinary column necessitated a
significant amount of special case code to support oid columns. That
already was painful for the existing, but upcoming work aiming to make
table storage pluggable, would have required expanding and duplicating
that "specialness" significantly.

WITH OIDS has been deprecated since 2005 (commit ff02d0a05280e0).
Remove it.

Removing includes:
- CREATE TABLE and ALTER TABLE syntax for declaring the table to be
  WITH OIDS has been removed (WITH (oids[ = true]) will error out)
- pg_dump does not support dumping tables declared WITH OIDS and will
  issue a warning when dumping one (and ignore the oid column).
- restoring an pg_dump archive with pg_restore will warn when
  restoring a table with oid contents (and ignore the oid column)
- COPY will refuse to load binary dump that includes oids.
- pg_upgrade will error out when encountering tables declared WITH
  OIDS, they have to be altered to remove the oid column first.
- Functionality to access the oid of the last inserted row (like
  plpgsql's RESULT_OID, spi's SPI_lastoid, ...) has been removed.

The syntax for declaring a table WITHOUT OIDS (or WITH (oids = false)
for CREATE TABLE) is still supported. While that requires a bit of
support code, it seems unnecessary to break applications / dumps that
do not use oids, and are explicit about not using them.

The biggest user of WITH OID columns was postgres' catalog. This
commit changes all 'magic' oid columns to be columns that are normally
declared and stored. To reduce unnecessary query breakage all the
newly added columns are still named 'oid', even if a table's column
naming scheme would indicate 'reloid' or such.  This obviously
requires adapting a lot code, mostly replacing oid access via
HeapTupleGetOid() with access to the underlying Form_pg_*->oid column.

The bootstrap process now assigns oids for all oid columns in
genbki.pl that do not have an explicit value (starting at the largest
oid previously used), only oids assigned later by oids will be above
FirstBootstrapObjectId. As the oid column now is a normal column the
special bootstrap syntax for oids has been removed.

Oids are not automatically assigned during insertion anymore, all
backend code explicitly assigns oids with GetNewOidWithIndex(). For
the rare case that insertions into the catalog via SQL are called for
the new pg_nextoid() function can be used (which only works on catalog
tables).

The fact that oid columns on system tables are now normal columns
means that they will be included in the set of columns expanded
by * (i.e. SELECT * FROM pg_class will now include the table's oid,
previously it did not). It'd not technically be hard to hide oid
column by default, but that'd mean confusing behavior would either
have to be carried forward forever, or it'd cause breakage down the
line.

While it's not unlikely that further adjustments are needed, the
scope/invasiveness of the patch makes it worthwhile to get merge this
now. It's painful to maintain externally, too complicated to commit
after the code code freeze, and a dependency of a number of other
patches.

Catversion bump, for obvious reasons.

Author: Andres Freund, with contributions by John Naylor
Discussion: https://postgr.es/m/20180930034810.ywp2c7awz7opzcfr@alap3.anarazel.de
2018-11-20 16:00:17 -08:00
b34e84f160 Add TAP tests for pg_verify_checksums
All options available in the utility get coverage:
- Tests with disabled page checksums.
- Tests with enabled test checksums.
- Emulation of corruption and broken checksums with a full scan and
single relfilenode scan.

This patch has been contributed mainly by Michael Banck and Magnus
Hagander with things presented on various threads, and I have gathered
all the contents into a single patch.

Author: Michael Banck, Magnus Hagander, Michael Paquier
Reviewed-by: Peter Eisentraut
Discussion: https://postgr.es/m/20181005012645.GE1629@paquier.xyz
2018-10-12 09:12:31 +09:00
fd582317e1 Sync our Snowball stemmer dictionaries with current upstream.
We haven't touched these since text search functionality landed in core
in 2007 :-(.  While the upstream project isn't a beehive of activity,
they do make additions and bug fixes from time to time.  Update our
copies of these files.

Also update our documentation about how to keep things in sync, since
they're not making distribution tarballs these days.  Fortunately,
their source code turns out to be a breeze to build.

Notable changes:

* The non-UTF8 version of the hungarian stemmer now works in LATIN2
not LATIN1.

* New stemmers have appeared for arabic, indonesian, irish, lithuanian,
nepali, and tamil.  These all work in UTF8, and the indonesian and
irish ones also work in LATIN1.

(There are some new stemmers that I did not incorporate, mainly because
their names don't match the underlying languages, suggesting that they're
not to be considered mainstream.)

Worth noting: the upstream Nepali dictionary was contributed by
Arthur Zakirov.

initdb forced because the contents of snowball_create.sql have
changed.

Still TODO: see about updating the stopword lists.

Arthur Zakirov, minor mods and doc work by me

Discussion: https://postgr.es/m/20180626122025.GA12647@zakirov.localdomain
Discussion: https://postgr.es/m/20180219140849.GA9050@zakirov.localdomain
2018-09-24 17:29:38 -04:00
d18f6674bd Initialize random() in bootstrap/stand-alone postgres and in initdb.
This removes a difference between the standard IsUnderPostmaster
execution environment and that of --boot and --single.  In a stand-alone
backend, "SELECT random()" always started at the same seed.

On a system capable of using posix shared memory, initdb could still
conclude "selecting dynamic shared memory implementation ... sysv".
Crashed --boot or --single postgres processes orphaned shared memory
objects having names that collided with the not-actually-random names
that initdb probed.  The sysv fallback appeared after ten crashes of
--boot or --single postgres.  Since --boot and --single are rare in
production use, systems used for PostgreSQL development are the
principal candidate to notice this symptom.

Back-patch to 9.3 (all supported versions).  PostgreSQL 9.4 introduced
dynamic shared memory, but 9.3 does share the "SELECT random()" problem.

Reviewed by Tom Lane and Kyotaro HORIGUCHI.

Discussion: https://postgr.es/m/20180915221546.GA3159382@rfd.leadboat.com
2018-09-23 22:56:39 -07:00
925673f27b Remove special handling for open() in initdb for Windows
40cfe86 enforces the translation mode to text for all frontends, so this
special handling in initdb is not needed anymore.
2018-09-21 06:41:44 +09:00
0ba06e0bfb Allow concurrent-safe open() and fopen() in frontend code for Windows
PostgreSQL uses a custom wrapper for open() and fopen() which is
concurrent-safe, allowing multiple processes to open and work on the
same file.  This has a couple of advantages:
- pg_test_fsync does not handle O_DSYNC correctly otherwise, leading to
false claims that disks are unsafe.
- TAP tests can run into race conditions when a postmaster and pg_ctl
open postmaster.pid, fixing some random failures in the buildfam.

pg_upgrade is one frontend tool using workarounds to bypass file locking
issues with the log files it generates, however the interactions with
pg_ctl are proving to be tedious to get rid of, so this is left for
later.

Author: Laurenz Albe
Reviewed-by: Michael Paquier, Kuntal Ghosh
Discussion: https://postgr.es/m/1527846213.2475.31.camel@cybertec.at
Discussion: https://postgr.es/m/16922.1520722108@sss.pgh.pa.us
2018-09-14 10:04:14 +09:00
23bd3cec6e Attempt to identify system timezone by reading /etc/localtime symlink.
On many modern platforms, /etc/localtime is a symlink to a file within the
IANA database.  Reading the symlink lets us find out the name of the system
timezone directly, without going through the brute-force search embodied in
scan_available_timezones().  This shortens the runtime of initdb by some
tens of ms, which is helpful for the buildfarm, and it also allows us to
reliably select the same zone name the system was actually configured for,
rather than possibly choosing one of IANA's many zone aliases.  (For
example, in a system configured for "Asia/Tokyo", the brute-force search
would not choose that name but its alias "Japan", on the grounds of the
latter string being shorter.  More surprisingly, "Navajo" is preferred
to either "America/Denver" or "US/Mountain", as seen in an old complaint
from Josh Berkus.)

If /etc/localtime doesn't exist, or isn't a symlink, or we can't make
sense of its contents, or the contents match a zone we know but that
zone doesn't match the observed behavior of localtime(), fall back to
the brute-force search.

Also, tweak initdb so that it prints the zone name it selected.

In passing, replace the last few references to the "Olson" database in
code comments with "IANA", as that's been our preferred term since
commit b2cbced9e.

Patch by me, per a suggestion from Robert Haas; review by Michael Paquier

Discussion: https://postgr.es/m/7408.1525812528@sss.pgh.pa.us
2018-09-13 12:36:21 -04:00
d4a900458e Cosmetic cleanups in initdb.c.
Commit bcbd94080 didn't bother to fix up a comment it had falsified.

Also, const-ify choose_dsm_implementation(), since what it's returning
is always a constant string; pickier compilers would warn about this.
2018-08-10 14:14:27 -04:00
6b387179ba Fix misc typos, mostly in comments.
A collection of typos I happened to spot while reading code, as well as
grepping for common mistakes.

Backpatch to all supported versions, as applicable, to avoid conflicts
when backporting other commits in the future.
2018-07-18 16:17:32 +03:00
bcbd940806 Remove dynamic_shared_memory_type=none
PostgreSQL nowadays offers some kind of dynamic shared memory feature on
all supported platforms.  Having the choice of "none" prevents us from
relying on DSM in core features.  So this patch removes the choice of
"none".

Author: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
2018-07-10 18:35:24 +02:00