postgresql

mirror of https://git.postgresql.org/git/postgresql.git synced 2026-02-09 07:57:33 +08:00

Author	SHA1	Message	Date
Peter Eisentraut	3d9e73ea5f	Update copyright years in some recently added files	2017-01-25 12:32:05 -05:00
Peter Eisentraut	b480086760	Add more includes so header files are self-contained	2017-01-21 15:49:53 -05:00
Peter Eisentraut	665d1fad99	Logical replication - Add PUBLICATION catalogs and DDL - Add SUBSCRIPTION catalog and DDL - Define logical replication protocol and output plugin - Add logical replication workers From: Petr Jelinek <petr@2ndquadrant.com> Reviewed-by: Steve Singer <steve@ssinger.info> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Erik Rijkers <er@xs4all.nl> Reviewed-by: Peter Eisentraut <peter.eisentraut@2ndquadrant.com>	2017-01-20 09:04:49 -05:00
Peter Eisentraut	352a24a1f9	Generate fmgr prototypes automatically Gen_fmgrtab.pl creates a new file fmgrprotos.h, which contains prototypes for all functions registered in pg_proc.h. This avoids having to manually maintain these prototypes across a random variety of header files. It also automatically enforces a correct function signature, and since there are warnings about missing prototypes, it will detect functions that are defined but not registered in pg_proc.h (or otherwise used). Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>	2017-01-17 14:06:07 -05:00
Bruce Momjian	1d25779284	Update copyright via script for 2017	2017-01-03 13:48:53 -05:00
Fujii Masao	3901fd70cc	Support quorum-based synchronous replication. This feature is also known as "quorum commit" especially in discussion on pgsql-hackers. This commit adds the following new syntaxes into synchronous_standby_names GUC. By using FIRST and ANY keywords, users can specify the method to choose synchronous standbys from the listed servers. FIRST num_sync (standby_name [, ...]) ANY num_sync (standby_name [, ...]) The keyword FIRST specifies a priority-based synchronous replication which was available also in 9.6 or before. This method makes transaction commits wait until their WAL records are replicated to num_sync synchronous standbys chosen based on their priorities. The keyword ANY specifies a quorum-based synchronous replication and makes transaction commits wait until their WAL records are replicated to at least num_sync listed standbys. In this method, the values of sync_state.pg_stat_replication for the listed standbys are reported as "quorum". The priority is still assigned to each standby, but not used in this method. The existing syntaxes having neither FIRST nor ANY keyword are still supported. They are the same as new syntax with FIRST keyword, i.e., a priorirty-based synchronous replication. Author: Masahiko Sawada Reviewed-By: Michael Paquier, Amit Kapila and me Discussion: <CAD21AoAACi9NeC_ecm+Vahm+MMA6nYh=Kqs3KB3np+MBOS_gZg@mail.gmail.com> Many thanks to the various individuals who were involved in discussing and developing this feature.	2016-12-19 21:15:30 +09:00
Peter Eisentraut	a924c327e2	Add support for temporary replication slots This allows creating temporary replication slots that are removed automatically at the end of the session or on error. From: Petr Jelinek <petr.jelinek@2ndquadrant.com>	2016-12-12 08:38:17 -05:00
Alvaro Herrera	5714931b07	Fix Windows build for 78c8c814390f Author: Petr Jelínek	2016-12-02 09:40:36 -03:00
Peter Eisentraut	78c8c81439	Refactor libpqwalreceiver The whole walreceiver API is now wrapped into a struct, like most of our other loadable module APIs. The libpq connection is no longer a global variable in libpqwalreceiver. Instead, it is encapsulated into a struct that is passed around the functions. This allows multiple walreceivers to run at the same time. Add some rudimentary support for logical replication connections to libpqwalreceiver. These changes are mostly cosmetic and are going to be useful for the future logical replication patches. From: Petr Jelinek <petr@2ndquadrant.com>	2016-12-01 20:23:28 -05:00
Peter Eisentraut	597a87ccc9	Use latch instead of select() in walreceiver Replace use of poll()/select() by WaitLatchOrSocket(), which is more portable and flexible. Also change walreceiver to use its procLatch instead of a custom latch. From: Petr Jelinek <petr@2ndquadrant.com>	2016-12-01 20:23:28 -05:00
Alvaro Herrera	9ed551e0a4	Add conninfo to pg_stat_wal_receiver Commit b1a9bad9e744 introduced a stats view to provide insight into the running WAL receiver, but neglected to include the connection string in it, as reported by Michaël Paquier. This commit fixes that omission. (Any security-sensitive information is not disclosed). While at it, close the mild security hole that we were exposing the password in the connection string in shared memory. This isn't user-accessible, but it still looks like a good idea to avoid having the cleartext password in memory. Author: Michaël Paquier, Álvaro Herrera Review by: Vik Fearing Discussion: https://www.postgresql.org/message-id/CAB7nPqStg4M561obo7ryZ5G+fUydG4v1Ajs1xZT1ujtu+woRag@mail.gmail.com	2016-06-29 16:57:17 -04:00
Robert Haas	4bc424b968	pgindent run for 9.6	2016-06-09 18:02:36 -04:00
Alvaro Herrera	6b60916829	Fix thinko in comment Pointed out by Andres Freund	2016-05-02 16:46:42 -03:00
Alvaro Herrera	234a266066	Fix code comments regarding logical decoding Back in 3b02ea4f0780 I added some comments in various places to explain how logical decoding and other things worked. Not all of the changes were welcome, because they were misleading or wrong. This changes them a little bit to make them more accurate. Some other comments are also changed to be more accurate. Also, fix a bunch of typos. Author: Álvaro Herrera, Craig Ringer Andres Freund reviewed some parts of this.	2016-05-02 16:04:29 -03:00
Tom Lane	4c804fbdfb	Clean up parsing of synchronous_standby_names GUC variable. Commit 989be0810dffd08b added a flex/bison lexer/parser to interpret synchronous_standby_names. It was done in a pretty crufty way, though, making assorted end-use sites responsible for calling the parser at the right times. That was not only vulnerable to errors of omission, but made it possible that lexer/parser errors occur at very undesirable times, and created memory leakages even if there was no error. Instead, perform the parsing once during check_synchronous_standby_names and let guc.c manage the resulting data. To do that, we have to flatten the parsed representation into a single hunk of malloc'd memory, but that is not very hard. While at it, work a little harder on making useful error reports for parsing problems; the previous code felt that "synchronous_standby_names parser returned 1" was an appropriate user-facing error message. (To be fair, it did also log a syntax error message, but separately from the GUC problem report, which is at best confusing.) It had some outright bugs in the face of invalid input, too. I (tgl) also concluded that we need to restrict unquoted names in synchronous_standby_names to be just SQL identifiers. The previous coding would accept darn near anything, which (1) makes the quoting convention both nearly-unnecessary and formally ambiguous, (2) makes it very hard to understand what is a syntax error and what is a creative interpretation of the input as a standby name, and (3) makes it impossible to further extend the syntax in future without a compatibility break. I presume that we're intending future extensions of the syntax, else this parsing infrastructure is massive overkill, so (3) is an important objection. Since we've taken a compatibility hit for non-identifier names with this change anyway, we might as well lock things down now and insist that users use double quotes for standby names that aren't identifiers. Kyotaro Horiguchi and Tom Lane	2016-04-27 17:55:25 -04:00
Andres Freund	c6ff84b06a	Emit invalidations to standby for transactions without xid. So far, when a transaction with pending invalidations, but without an assigned xid, committed, we simply ignored those invalidation messages. That's problematic, because those are actually sent for a reason. Known symptoms of this include that existing sessions on a hot-standby replica sometimes fail to notice new concurrently built indexes and visibility map updates. The solution is to WAL log such invalidations in transactions without an xid. We considered to alternatively force-assign an xid, but that'd be problematic for vacuum, which might be run in systems with few xids. Important: This adds a new WAL record, but as the patch has to be back-patched, we can't bump the WAL page magic. This means that standbys have to be updated before primaries; otherwise "PANIC: standby_redo: unknown op code 32" errors can be encountered. XXX: Reported-By: Васильев Дмитрий, Masahiko Sawada Discussion: CAB-SwXY6oH=9twBkXJtgR4UC1NqT-vpYAtxCseME62ADwyK5OA@mail.gmail.com CAD21AoDpZ6Xjg=gFrGPnSn4oTRRcwK1EBrWCq9OqOHuAcMMC=w@mail.gmail.com	2016-04-26 20:21:54 -07:00
Tom Lane	c2dc194bdb	Adjust signature of walrcv_receive hook. Commit 314cbfc5da988eff redefined the signature of this hook as typedef int (walrcv_receive_type) (char buffer, int wait_fd); But in fact the type of the "wait_fd" variable ought to be pgsocket, which is what WaitLatchOrSocket expects, and which is necessary if we want to be able to assign PGINVALID_SOCKET to it on Windows. So fix that.	2016-04-14 13:49:37 -04:00
Andres Freund	be65eddd80	Add required database and origin filtering for logical messages. Logical messages, added in 3fe3511d05, during decoding failed to filter messages emitted in other databases and messages emitted "under" a replication origin the output plugin isn't interested in. Add tests to verify that both types of filtering actually work. While touching message.sql remove hunk obsoleted by d25379e. Bump XLOG_PAGE_MAGIC because xl_logical_message changed and because 3fe3511d05 had omitted doing so. 3fe3511d05 additionally didn't bump catversion, but 7a542700d has done so since. Author: Petr Jelinek Reported-By: Andres Freund Discussion: 20160406142513.wotqy3ba3kanr423@alap3.anarazel.de	2016-04-13 17:38:54 -07:00
Simon Riggs	3fe3511d05	Generic Messages for Logical Decoding API and mechanism to allow generic messages to be inserted into WAL that are intended to be read by logical decoding plugins. This commit adds an optional new callback to the logical decoding API. Messages are either text or bytea. Messages can be transactional, or not, and are identified by a prefix to allow multiple concurrent decoding plugins. (Not to be confused with Generic WAL records, which are intended to allow crash recovery of extensible objects.) Author: Petr Jelinek and Andres Freund Reviewers: Artur Zakirov, Tomas Vondra, Simon Riggs Discussion: 5685F999.6010202@2ndquadrant.com	2016-04-06 10:05:41 +01:00
Fujii Masao	989be0810d	Support multiple synchronous standby servers. Previously synchronous replication offered only the ability to confirm that all changes made by a transaction had been transferred to at most one synchronous standby server. This commit extends synchronous replication so that it supports multiple synchronous standby servers. It enables users to consider one or more standby servers as synchronous, and increase the level of transaction durability by ensuring that transaction commits wait for replies from all of those synchronous standbys. Multiple synchronous standby servers are configured in synchronous_standby_names which is extended to support new syntax of 'num_sync ( standby_name [ , ... ] )', where num_sync specifies the number of synchronous standbys that transaction commits need to wait for replies from and standby_name is the name of a standby server. The syntax of 'standby_name [ , ... ]' which was used in 9.5 or before is also still supported. It's the same as new syntax with num_sync=1. This commit doesn't include "quorum commit" feature which was discussed in pgsql-hackers. Synchronous standbys are chosen based on their priorities. synchronous_standby_names determines the priority of each standby for being chosen as a synchronous standby. The standbys whose names appear earlier in the list are given higher priority and will be considered as synchronous. Other standby servers appearing later in this list represent potential synchronous standbys. The regression test for multiple synchronous standbys is not included in this commit. It should come later. Authors: Sawada Masahiko, Beena Emerson, Michael Paquier, Fujii Masao Reviewed-By: Kyotaro Horiguchi, Amit Kapila, Robert Haas, Simon Riggs, Amit Langote, Thomas Munro, Sameer Thakur, Suraj Kharage, Abhijit Menon-Sen, Rajeev Rastogi Many thanks to the various individuals who were involved in discussing and developing this feature.	2016-04-06 17:18:25 +09:00
Alvaro Herrera	3dd0792ae0	Blind attempt at fixing Win32 issue on 24c5f1a103c As best as I can tell, MyReplicationSlot needs to be PGDLLIMPORT in order for the new test_slot_timelines test module to compile. Per buildfarm	2016-03-30 23:12:20 -03:00
Robert Haas	314cbfc5da	Add new replication mode synchronous_commit = 'remote_apply'. In this mode, the master waits for the transaction to be applied on the remote side, not just written to disk. That means that you can count on a transaction started on the standby to see all commits previously acknowledged by the master. To make this work, the standby sends a reply after replaying each commit record generated with synchronous_commit >= 'remote_apply'. This introduces a small inefficiency: the extra replies will be sent even by standbys that aren't the current synchronous standby. But previously-existing synchronous_commit levels make no attempt at all to optimize which replies are sent based on what the primary cares about, so this is no worse, and at least avoids any extra replies for people not using the feature at all. Thomas Munro, reviewed by Michael Paquier and by me. Some additional tweaks by me.	2016-03-29 21:29:49 -04:00
Andres Freund	c8f621c43a	logical decoding: Fix handling of large old tuples with replica identity full. When decoding the old version of an UPDATE or DELETE change, and if that tuple was bigger than MaxHeapTupleSize, we either Assert'ed out, or failed in more subtle ways in non-assert builds. Normally individual tuples aren't bigger than MaxHeapTupleSize, with big datums toasted. But that's not the case for the old version of a tuple for logical decoding; the replica identity is logged as one piece. With the default replica identity btree limits that to small tuples, but that's not the case for FULL. Change the tuple buffer infrastructure to separate allocate over-large tuples, instead of always going through the slab cache. This unfortunately requires changing the ReorderBufferTupleBuf definition, we need to store the allocated size someplace. To avoid requiring output plugins to recompile, don't store HeapTupleHeaderData directly after HeapTupleData, but point to it via t_data; that leaves rooms for the allocated size. As there's no reason for an output plugin to look at ReorderBufferTupleBuf->t_data.header, remove the field. It was just a minor convenience having it directly accessible. Reported-By: Adam Dratwiński Discussion: CAKg6ypLd7773AOX4DiOGRwQk1TVOQKhNwjYiVjJnpq8Wo+i62Q@mail.gmail.com	2016-03-05 18:02:20 -08:00
Andres Freund	d9e903f3cb	logical decoding: Tell reorderbuffer about all xids. Logical decoding's reorderbuffer keeps transactions in an LSN ordered list for efficiency. To make that's efficiently possible upper-level xids are forced to be logged before nested subtransaction xids. That only works though if these records are all looked at: Unfortunately we didn't do so for e.g. row level locks, which are otherwise uninteresting for logical decoding. This could lead to errors like: "ERROR: subxact logged without previous toplevel record". It's not sufficient to just look at row locking records, the xid could appear first due to a lot of other types of records (which will trigger the transaction to be marked logged with MarkCurrentTransactionIdLoggedIfAny). So invent infrastructure to tell reorderbuffer about xids seen, when they'd otherwise not pass through reorderbuffer.c. Reported-By: Jarred Ward Bug: #13844 Discussion: 20160105033249.1087.66040@wrigleys.postgresql.org Backpatch: 9.4, where logical decoding was added	2016-03-05 18:02:20 -08:00
Alvaro Herrera	343f709c06	Fix typos Backpatch to: 9.4	2016-02-25 20:50:20 -03:00
Robert Haas	2251179e6a	Migrate replication slot I/O locks into a separate tranche. This is following in a long train of similar changes and for the same reasons - see b319356f0e94a6482c726cf4af96597c211d8d6e and fe702a7b3f9f2bc5bf6d173166d7d55226af82c8 inter alia. Author: Amit Kapila Reviewed-by: Alexander Korotkov, Robert Haas	2016-01-29 09:45:38 -05:00
Alvaro Herrera	b1a9bad9e7	pgstat: add WAL receiver status view & SRF This new view provides insight into the state of a running WAL receiver in a HOT standby node. The information returned includes the PID of the WAL receiver process, its status (stopped, starting, streaming, etc), start LSN and TLI, last received LSN and TLI, timestamp of last message send and receipt, latest end-of-WAL LSN and time, and the name of the slot (if any). Access to the detailed data is only granted to superusers; others only get the PID. Author: Michael Paquier Reviewer: Haribabu Kommi	2016-01-07 16:21:19 -03:00
Bruce Momjian	ee94300446	Update copyright for 2016 Backpatch certain files through 9.1	2016-01-02 13:33:40 -05:00
Alvaro Herrera	17f5831c81	Fix "sesssion" typo It was introduced alongside replication origins, by commit 5aa2350426c, so backpatch to 9.5. Pointed out by Fujii Masao	2015-09-28 19:13:42 -03:00
Alvaro Herrera	4901b2f495	Don't include rel.h when relcache.h is sufficient Trivial change to reduce exposure of rel.h.	2015-08-11 13:03:14 -03:00
Andres Freund	6fcd88511f	Allow pg_create_physical_replication_slot() to reserve WAL. When creating a physical slot it's often useful to immediately reserve the current WAL position instead of only doing after the first feedback message arrives. That e.g. allows slots to guarantee that all the WAL for a base backup will be available afterwards. Logical slots already have to reserve WAL during creation, so generalize that logic into being usable for both physical and logical slots. Catversion bump because of the new parameter. Author: Gurjeet Singh Reviewed-By: Andres Freund Discussion: CABwTF4Wh_dBCzTU=49pFXR6coR4NW1ynb+vBqT+Po=7fuq5iCw@mail.gmail.com	2015-08-11 12:34:31 +02:00
Andres Freund	093d0c83c1	Introduce macros determining if a replication slot is physical or logical. These make the code a bit easier to read, and make it easier to add a more explicit notion of a slot's type at some point in the future. Author: Gurjeet Singh Discussion: CABwTF4Wh_dBCzTU=49pFXR6coR4NW1ynb+vBqT+Po=7fuq5iCw@mail.gmail.com	2015-08-11 12:32:48 +02:00
Andres Freund	3a145757a0	Improve includes introduced in the replication origins patch. pg_resetxlog.h contained two superfluous includes, origin.h superfluously depended on logical.h, and pg_xlogdump's rmgrdesc.h only indirectly included origin.h. Backpatch: 9.5, where replication origins were introduced.	2015-08-06 12:41:46 +02:00
Bruce Momjian	807b9e0dff	pgindent run for 9.5	2015-05-23 21:35:49 -04:00
Heikki Linnakangas	4fc72cc7bb	Collection of typo fixes. Use "a" and "an" correctly, mostly in comments. Two error messages were also fixed (they were just elogs, so no translation work required). Two function comments in pg_proc.h were also fixed. Etsuro Fujita reported one of these, but I found a lot more with grep. Also fix a few other typos spotted while grepping for the a/an typos. For example, "consists out of ..." -> "consists of ...". Plus a "though"/ "through" mixup reported by Euler Taveira. Many of these typos were in old code, which would be nice to backpatch to make future backpatching easier. But much of the code was new, and I didn't feel like crafting separate patches for each branch. So no backpatching.	2015-05-20 16:56:22 +03:00
Andrew Dunstan	72d422a522	Map basebackup tablespaces using a tablespace_map file Windows can't reliably restore symbolic links from a tar format, so instead during backup start we create a tablespace_map file, which is used by the restoring postgres to create the correct links in pg_tblspc. The backup protocol also now has an option to request this file to be included in the backup stream, and this is used by pg_basebackup when operating in tar mode. This is done on all platforms, not just Windows. This means that pg_basebackup will not not work in tar mode against 9.4 and older servers, as this protocol option isn't implemented there. Amit Kapila, reviewed by Dilip Kumar, with a little editing from me.	2015-05-12 09:29:10 -04:00
Andres Freund	168d5805e4	Add support for INSERT ... ON CONFLICT DO NOTHING/UPDATE. The newly added ON CONFLICT clause allows to specify an alternative to raising a unique or exclusion constraint violation error when inserting. ON CONFLICT refers to constraints that can either be specified using a inference clause (by specifying the columns of a unique constraint) or by naming a unique or exclusion constraint. DO NOTHING avoids the constraint violation, without touching the pre-existing row. DO UPDATE SET ... [WHERE ...] updates the pre-existing tuple, and has access to both the tuple proposed for insertion and the existing tuple; the optional WHERE clause can be used to prevent an update from being executed. The UPDATE SET and WHERE clauses have access to the tuple proposed for insertion using the "magic" EXCLUDED alias, and to the pre-existing tuple using the table name or its alias. This feature is often referred to as upsert. This is implemented using a new infrastructure called "speculative insertion". It is an optimistic variant of regular insertion that first does a pre-check for existing tuples and then attempts an insert. If a violating tuple was inserted concurrently, the speculatively inserted tuple is deleted and a new attempt is made. If the pre-check finds a matching tuple the alternative DO NOTHING or DO UPDATE action is taken. If the insertion succeeds without detecting a conflict, the tuple is deemed inserted. To handle the possible ambiguity between the excluded alias and a table named excluded, and for convenience with long relation names, INSERT INTO now can alias its target table. Bumps catversion as stored rules change. Author: Peter Geoghegan, with significant contributions from Heikki Linnakangas and Andres Freund. Testing infrastructure by Jeff Janes. Reviewed-By: Heikki Linnakangas, Andres Freund, Robert Haas, Simon Riggs, Dean Rasheed, Stephen Frost and many others.	2015-05-08 05:43:10 +02:00
Andres Freund	e0f26fc765	Correct replication origin's use of UINT16_MAX to PG_UINT16_MAX. We can't rely on UINT16_MAX being present, which is why we introduced PG_UINT16_MAX... Buildfarm animal bowerbird via Andrew Gierth.	2015-04-30 00:19:36 +02:00
Andres Freund	5aa2350426	Introduce replication progress tracking infrastructure. When implementing a replication solution ontop of logical decoding, two related problems exist: * How to safely keep track of replication progress * How to change replication behavior, based on the origin of a row; e.g. to avoid loops in bi-directional replication setups The solution to these problems, as implemented here, consist out of three parts: 1) 'replication origins', which identify nodes in a replication setup. 2) 'replication progress tracking', which remembers, for each replication origin, how far replay has progressed in a efficient and crash safe manner. 3) The ability to filter out changes performed on the behest of a replication origin during logical decoding; this allows complex replication topologies. E.g. by filtering all replayed changes out. Most of this could also be implemented in "userspace", e.g. by inserting additional rows contain origin information, but that ends up being much less efficient and more complicated. We don't want to require various replication solutions to reimplement logic for this independently. The infrastructure is intended to be generic enough to be reusable. This infrastructure also replaces the 'nodeid' infrastructure of commit timestamps. It is intended to provide all the former capabilities, except that there's only 2^16 different origins; but now they integrate with logical decoding. Additionally more functionality is accessible via SQL. Since the commit timestamp infrastructure has also been introduced in 9.5 (commit 73c986add) changing the API is not a problem. For now the number of origins for which the replication progress can be tracked simultaneously is determined by the max_replication_slots GUC. That GUC is not a perfect match to configure this, but there doesn't seem to be sufficient reason to introduce a separate new one. Bumps both catversion and wal page magic. Author: Andres Freund, with contributions from Petr Jelinek and Craig Ringer Reviewed-By: Heikki Linnakangas, Petr Jelinek, Robert Haas, Steve Singer Discussion: 20150216002155.GI15326@awork2.anarazel.de, 20140923182422.GA15776@alap3.anarazel.de, 20131114172632.GE7522@alap2.anarazel.de	2015-04-29 19:30:53 +02:00
Andres Freund	d811c037ce	Add 'active_in' column to pg_replication_slots. Right now it is visible whether a replication slot is active in any session, but not in which. Adding the active_in column, containing the pid of the backend having acquired the slot, makes it much easier to associate pg_replication_slots entries with the corresponding pg_stat_replication/pg_stat_activity row. This should have been done from the start, but I (Andres) dropped the ball there somehow. Author: Craig Ringer, revised by me Discussion: CAMsr+YFKgZca5_7_ouaMWxA5PneJC9LNViPzpDHusaPhU9pA7g@mail.gmail.com	2015-04-21 11:51:06 +02:00
Tom Lane	785941cdc3	Tweak __attribute__-wrapping macros for better pgindent results. This improves on commit bbfd7edae5aa5ad5553d3c7e102f2e450d4380d4 by making two simple changes: * pg_attribute_noreturn now takes parentheses, ie pg_attribute_noreturn(). Likewise pg_attribute_unused(), pg_attribute_packed(). This reduces pgindent's tendency to misformat declarations involving them. * attributes are now always attached to function declarations, not definitions. Previously some places were taking creative shortcuts, which were not merely candidates for bad misformatting by pgindent but often were outright wrong anyway. (It does little good to put a noreturn annotation where callers can't see it.) In any case, if we would like to believe that these macros can be used with non-gcc compilers, we should avoid gratuitous variance in usage patterns. I also went through and manually improved the formatting of a lot of declarations, and got rid of excessively repetitive (and now obsolete anyway) comments informing the reader what pg_attribute_printf is for.	2015-03-26 14:03:25 -04:00
Tatsuo Ishii	656ea810e5	Make SyncRepWakeQueue to a static function It is only used in src/backend/replication/syncrep.c. Back-patch to all supported branches except 9.1 which declares the function as static.	2015-03-26 10:34:08 +09:00
Andres Freund	bbfd7edae5	Add macros wrapping all usage of gcc's __attribute__. Until now __attribute__() was defined to be empty for all compilers but gcc. That's problematic because it prevents using it in other compilers; which is necessary e.g. for atomics portability. It's also just generally dubious to do so in a header as widely included as c.h. Instead add pg_attribute_format_arg, pg_attribute_printf, pg_attribute_noreturn macros which are implemented in the compilers that understand them. Also add pg_attribute_noreturn and pg_attribute_packed, but don't provide fallbacks, since they can affect functionality. This means that external code that, possibly unwittingly, relied on __attribute__ defined to be empty on !gcc compilers may now run into warnings or errors on those compilers. But there shouldn't be many occurances of that and it's hard to work around... Discussion: 54B58BA3.8040302@ohmu.fi Author: Oskari Saarenmaa, with some minor changes by me.	2015-03-11 14:30:01 +01:00
Tom Lane	e1a11d9311	Use FLEXIBLE_ARRAY_MEMBER for HeapTupleHeaderData.t_bits[]. This requires changing quite a few places that were depending on sizeof(HeapTupleHeaderData), but it seems for the best. Michael Paquier, some adjustments by me	2015-02-21 15:13:06 -05:00
Tom Lane	f2874feb7c	Some more FLEXIBLE_ARRAY_MEMBER fixes.	2015-02-21 01:46:43 -05:00
Tom Lane	09d8d110a6	Use FLEXIBLE_ARRAY_MEMBER in a bunch more places. Replace some bogus "x[1]" declarations with "x[FLEXIBLE_ARRAY_MEMBER]". Aside from being more self-documenting, this should help prevent bogus warnings from static code analyzers and perhaps compiler misoptimizations. This patch is just a down payment on eliminating the whole problem, but it gets rid of a lot of easy-to-fix cases. Note that the main problem with doing this is that one must no longer rely on computing sizeof(the containing struct), since the result would be compiler-dependent. Instead use offsetof(struct, lastfield). Autoconf also warns against spelling that offsetof(struct, lastfield[0]). Michael Paquier, review and additional fixes by me.	2015-02-20 00:11:42 -05:00
Andres Freund	ff44fba46c	Replace walsender's latch with the general shared latch. Relying on the normal shared latch simplifies interrupt/signal handling because we can rely on all signal handlers setting the proc latch. That in turn allows us to avoid the use of ImmediateInterruptOK, which arguably isn't correct because WaitLatchOrSocket isn't declared to be immediately interruptible. Also change sections that wait on the walsender's latch to notice interrupts quicker/more reliably and make them more consistent with each other. This is part of a larger "get rid of ImmediateInterruptOK" series. Discussion: 20150115020335.GZ5245@awork2.anarazel.de	2015-01-17 13:00:42 +01:00
Bruce Momjian	4baaf863ec	Update copyright for 2015 Backpatch certain files through 9.0	2015-01-06 11:43:47 -05:00
Heikki Linnakangas	b1332e98c4	Put the logic to decide which synchronous standby is active into a function. This avoids duplicating the code. Michael Paquier, reviewed by Simon Riggs and me	2014-12-12 14:26:42 +02:00
Heikki Linnakangas	2c03216d83	Revamp the WAL record format. Each WAL record now carries information about the modified relation and block(s) in a standardized format. That makes it easier to write tools that need that information, like pg_rewind, prefetching the blocks to speed up recovery, etc. There's a whole new API for building WAL records, replacing the XLogRecData chains used previously. The new API consists of XLogRegister* functions, which are called for each buffer and chunk of data that is added to the record. The new API also gives more control over when a full-page image is written, by passing flags to the XLogRegisterBuffer function. This also simplifies the XLogReadBufferForRedo() calls. The function can dig the relation and block number from the WAL record, so they no longer need to be passed as arguments. For the convenience of redo routines, XLogReader now disects each WAL record after reading it, copying the main data part and the per-block data into MAXALIGNed buffers. The data chunks are not aligned within the WAL record, but the redo routines can assume that the pointers returned by XLogRecGet* functions are. Redo routines are now passed the XLogReaderState, which contains the record in the already-disected format, instead of the plain XLogRecord. The new record format also makes the fixed size XLogRecord header smaller, by removing the xl_len field. The length of the "main data" portion is now stored at the end of the WAL record, and there's a separate header after XLogRecord for it. The alignment padding at the end of XLogRecord is also removed. This compansates for the fact that the new format would otherwise be more bulky than the old format. Reviewed by Andres Freund, Amit Kapila, Michael Paquier, Alvaro Herrera, Fujii Masao.	2014-11-20 18:46:41 +02:00

1 2 3

133 Commits