This includes two new kinds of postmaster child processes: walsenders and
the walreceiver. The walreceiver is responsible for connecting to the primary
server and streaming the WAL it receives to disk, while a walsender runs in
the primary server and streams WAL from disk to the client.
Documentation still needs work, but the basics are there. We will probably
pull the replication section to a new chapter later on, as well as the
sections describing file-based replication. But let's do that as a separate
patch, so that it's easier to see what has been added/changed. This patch
also adds a new section to the chapter about FE/BE protocol, documenting the
protocol used by walsender/walreceiver.
Bump catalog version because of two new functions,
pg_last_xlog_receive_location() and pg_last_xlog_replay_location(), for
monitoring the progress of replication.
Fujii Masao, with additional hacking by me
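For monitoring, a client can simply poll the two new functions with SQL;
here is a minimal libpq sketch (the connection parameters and surrounding
program are illustrative only, not part of the patch):

    #include <stdio.h>
    #include <libpq-fe.h>

    int
    main(void)
    {
        /* connection string is a placeholder */
        PGconn     *conn = PQconnectdb("host=standby dbname=postgres");
        PGresult   *res;

        if (PQstatus(conn) != CONNECTION_OK)
        {
            fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
            PQfinish(conn);
            return 1;
        }

        res = PQexec(conn,
                     "SELECT pg_last_xlog_receive_location(), "
                     "pg_last_xlog_replay_location()");
        if (res && PQresultStatus(res) == PGRES_TUPLES_OK)
            printf("received up to %s, replayed up to %s\n",
                   PQgetvalue(res, 0, 0), PQgetvalue(res, 0, 1));
        else
            fprintf(stderr, "query failed: %s", PQerrorMessage(conn));

        if (res)
            PQclear(res);
        PQfinish(conn);
        return 0;
    }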
This silences some warnings on Win64. Not using the proper SOCKET datatype
was actually wrong on Win32 as well, but didn't cause any warnings there.
Also add a PGINVALID_SOCKET define to indicate an invalid/nonexistent
socket, instead of using a hardcoded -1 value.
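As a sketch of the idea (these exact definitions are illustrative, not
copied from the patch):

    #ifdef WIN32
    #include <winsock2.h>
    typedef SOCKET pgsocket;        /* native socket type on Win32/Win64 */
    #define PGINVALID_SOCKET INVALID_SOCKET
    #else
    typedef int pgsocket;           /* plain file descriptor on Unix */
    #define PGINVALID_SOCKET (-1)
    #endif

    /* callers compare against the sentinel rather than a bare -1 */
    static int
    socket_is_valid(pgsocket sock)
    {
        return sock != PGINVALID_SOCKET;
    }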
decisions about when to auto-analyze.
The previous code depended on n_live_tuples + n_dead_tuples - last_anl_tuples,
where all three of these numbers could be bad estimates from ANALYZE itself.
Even worse, in the presence of a steady flow of HOT updates and matching
HOT-tuple reclamations, auto-analyze might never trigger at all, even if all
three numbers are exactly right, because n_dead_tuples could hold steady.
To fix, replace last_anl_tuples with an accurately tracked count of the total
number of committed tuple inserts + updates + deletes since the last ANALYZE
on the table. This can still be compared to the same threshold as before, but
it's much more trustworthy than the old computation. Tracking this requires
one more intra-transaction counter per modified table within backends, but no
additional memory space in the stats collector. There probably isn't any
measurable speed difference; if anything it might be a bit faster than before,
since I was able to eliminate some per-tuple arithmetic operations in favor of
adding sums once per (sub)transaction.
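In rough terms, the new trigger test looks like this (names are paraphrased
for illustration; the real logic lives in autovacuum.c and pgstat.c):

    #include <stdbool.h>

    /* paraphrased stats entry, showing only the field relevant here */
    typedef struct StatTabEntrySketch
    {
        long    changes_since_analyze;  /* committed ins+upd+del since
                                         * last ANALYZE */
    } StatTabEntrySketch;

    static bool
    needs_autoanalyze(const StatTabEntrySketch *tabentry,
                      double reltuples,         /* pg_class.reltuples */
                      int anl_base_thresh,      /* analyze_threshold */
                      double anl_scale_factor)  /* analyze_scale_factor */
    {
        double  anlthresh = anl_base_thresh + anl_scale_factor * reltuples;

        /*
         * The old left-hand side was n_live_tuples + n_dead_tuples -
         * last_anl_tuples, all three of which could be bad estimates from
         * ANALYZE itself.  The new counter is tracked exactly, so the same
         * threshold is far more trustworthy.
         */
        return tabentry->changes_since_analyze > anlthresh;
    }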
Also, simplify the logic around pgstat vacuum and analyze reporting messages
by not trying to fold VACUUM ANALYZE into a single pgstat message.
The original thought behind this patch was to allow scheduling of analyzes
on parent tables by artificially inflating their changes_since_analyze count.
I've left that for a separate patch since this change seems to stand on its
own merit.
The temporary hash tables made by pgstat_collect_oids should be allocated
in a short-term memory context, which is not the default behavior of
hash_create. Noted while looking through hash_create calls in connection
with Robert Haas' recent complaint.
This is a pre-existing bug, but it doesn't seem important enough to
back-patch. The hash table is not so large that it would matter unless this
happened many times within a session, which seems quite unlikely.
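The fix follows the usual dynahash recipe for putting a table into a
caller-chosen context; roughly like this (the setup shown is condensed, but
the flag and field names are the standard hash_create ones):

    /* fragment: assumes utils/hsearch.h et al are included */
    HASHCTL     hash_ctl;
    HTAB       *htab;

    MemSet(&hash_ctl, 0, sizeof(hash_ctl));
    hash_ctl.keysize = sizeof(Oid);
    hash_ctl.entrysize = sizeof(Oid);
    hash_ctl.hash = oid_hash;
    hash_ctl.hcxt = CurrentMemoryContext;  /* caller's short-term context */

    htab = hash_create("Temporary table of OIDs",
                       64,      /* initial size guess */
                       &hash_ctl,
                       HASH_ELEM | HASH_FUNCTION | HASH_CONTEXT);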
Enabled by recovery_connections = on (default) and forcing archive recovery
using a recovery.conf. Recovery processing now emulates the original
transactions as they are replayed, providing full locking and MVCC behaviour
for read-only queries. Recovery must enter a consistent state before
connections are allowed, so there is a delay, typically short, before
connections succeed. Replay of recovering transactions can conflict with,
and in some cases deadlock with, queries during recovery; these result in
query cancellation after max_standby_delay seconds have expired.
Infrastructure changes have minor effects on normal running, though they
introduce four new types of WAL record.
New test mode "make standbycheck" allows regression tests of static command
behaviour on a standby server while in recovery. Typical and extreme dynamic
behaviours have been checked via code inspection and manual testing. Few
port-specific behaviours have been used, though primary testing has so far
been on Linux only.
This commit is the basic patch. Additional changes will follow in this
release to enhance some aspects of behaviour, notably improved handling of
conflicts, deadlock detection, and query cancellation. Changes to VACUUM
FULL are also required.
Simon Riggs, with significant and lengthy review by Heikki Linnakangas,
including a streamlined redesign of snapshot creation and two-phase commit.
Important contributions from Florian Pflug, Mark Kirkwood, Merlin Moncure,
Greg Stark, Gianni Ciolli, Gabriele Bartolini, Hannu Krosing, Robert Haas,
Tatsuo Ishii, and Hiroyuki Yamada, plus support and feedback from many other
community members.
output filename if CSV logging was enabled and only one of the two possible
output files got rotated during a particular call (which would, in fact,
typically be the case during a size-based rotation). This would amount to
about MAXPGPATH (1KB) per rotation, and it's been there since the CSV
code was put in, so it's surprising that nobody noticed it before.
Per bug #5196 from Thomas Poindessous.
adopted for EXPLAIN. This will allow additional options to be implemented
in future without having to make them fully-reserved keywords. The old syntax
remains available for existing options, however.
Itagaki Takahiro
build actually attempts to advertise itself via Bonjour. Formerly it always
did so, which meant that packagers had to decide for their users whether
this behavior was wanted or not. The default is "off" to be on the safe
side, though this represents a change in the default behavior of a
Bonjour-enabled build. Per discussion.
with the not-so-deprecated DNSServiceRegister. This patch shouldn't change
any user-visible behavior; it just gets rid of a deprecation warning in
--with-bonjour builds. The new code will fail on OS X releases before 10.3,
but it seems unlikely that anyone will want to run Postgres 8.5 on 10.2.
Formerly, these message types would be discarded unless there was already
a stats hash table entry for the target table. However, the intent of
saving hash table space for unused tables was subverted by the fact that
the physical I/O done by the vacuum or analyze would result in an immediately
following tabstat message, which would create the hash table entry anyway.
All that we had left was surprising loss of statistical data, as in a recent
complaint from Jaime Casanova.
It seems unlikely that a real database would have many tables that go totally
untouched over the long haul, so the consensus is that this "optimization"
serves little purpose anyhow. Remove it, and just create the hash table
entry on demand in all cases.
via the "flat files" facility. This requires making it enough like a backend
to be able to run transactions; it's no longer an "auxiliary process" but
more like the autovacuum worker processes. Also, its signal handling has
to be brought into line with backends/workers. In particular, since it
now has to handle procsignal.c processing, the special autovac-launcher-only
signal conditions are moved to SIGUSR2.
Alvaro, with some cleanup from Tom
(That flat file is now completely useless, but removal will come later.)
To do this, postpone client authentication into the startup transaction
that's run by InitPostgres. We still collect the startup packet and do
SSL initialization (if needed) at the same time we did before. The
AuthenticationTimeout is applied separately to startup packet collection
and the actual authentication cycle. (This is a bit annoying, since it
means a couple extra syscalls; but the signal handling requirements inside
and outside a transaction are sufficiently different that it seems best
to treat the timeouts as completely independent.)
A small security disadvantage is that if the given database name is invalid,
this will be reported to the client before any authentication happens.
We could work around that by connecting to database "postgres" instead,
but consensus seems to be that it's not worth introducing such surprising
behavior.
Processing of all command-line switches and GUC options received from the
client is now postponed until after authentication. This means that
PostAuthDelay is much less useful than it used to be --- if you need to
investigate problems during InitPostgres you'll have to set PreAuthDelay
instead. However, allowing an unauthenticated user to set any GUC options
whatever seems a bit too risky, so we'll live with that.
PostgresMain switch. In point of fact, FrontendProtocol is already set
in a backend process, since ProcessStartupPacket() is executed inside
the backend --- it hasn't been run by the postmaster for many years.
And if it were, we'd still certainly want FrontendProtocol to be set before
we get as far as PostgresMain, so that startup errors get reported in the
right protocol.
-v might have some future use in standalone backends, so I didn't go so
far as to remove the switch outright.
Also, initialize FrontendProtocol to 0 not PG_PROTOCOL_LATEST. The only
likely result of presetting it like that is to mask failure-to-set-it
mistakes.
In the original coding, setting a single reloption would cause default
values to be used for all the other reloptions. This is a problem
particularly for autovacuum reloptions.
Itagaki Takahiro
Instead of sending stdout/stderr to /dev/null after forking away from the
terminal, send them to postmaster.log within the data directory. Since
this opens the door to indefinite logfile bloat, recommend even more
strongly that log output be redirected when using silent_mode.
Move the postmaster's initial calls of load_hba() and load_ident() down
to after we have started the log collector, if we are going to. This
is so that errors reported by them will appear in the "usual" place.
Reclassify silent_mode as a LOGGING_WHERE, not LOGGING_WHEN, parameter,
since it's got absolutely nothing to do with the latter category.
In passing, fix some obsolete references to -S ... this option hasn't
had that switch letter for a long time.
Back-patch to 8.4, since as of 8.4 load_hba() and load_ident() are more
picky (and thus more likely to fail) than they used to be. This entire
change was driven by a complaint about those errors disappearing into
the bit bucket.
This causes problems when the system load is high, per report from Zdenek
Kotala in <1250860954.1239.114.camel@localhost>; instead of calling kill
directly, have the signal handler set a flag which is checked in ServerLoop.
This way, the handler can return before being called again by a subsequent
signal sent from the autovacuum launcher. Also, increase the sleep in the
launcher in this failure path to 1 second.
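The pattern is the usual one of deferring work out of the signal handler;
a condensed sketch with paraphrased names (the real code is in
postmaster.c):

    #include <signal.h>
    #include <stdbool.h>

    static volatile sig_atomic_t start_autovac_worker = false;

    /* handler: only record the request; never call kill() from here */
    static void
    autovac_request_handler(int postgres_signal_arg)
    {
        start_autovac_worker = true;
    }

    /* checked once per iteration of the postmaster's ServerLoop */
    static void
    check_autovac_request(void)
    {
        if (start_autovac_worker)
        {
            start_autovac_worker = false;
            /* fork an autovacuum worker here */
        }
    }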
Backpatch to 8.3, which is when the signalling between autovacuum
launcher/postmaster was introduced.
Also, add a couple of ReleasePostmasterChildSlot calls in error paths; this
part backpatched to 8.4, which is when the child slot stuff was introduced.
To make this work in the base case, pg_database now has a nailed-in-cache
relation descriptor that is initialized using hardwired knowledge in
relcache.c. This means pg_database is added to the set of relations that
need to have a Schema_pg_xxx macro maintained in pg_attribute.h. When this
path is taken, we'll have to do a seqscan of pg_database to find the row
we need.
In the normal case, we are able to do an indexscan to find the database's row
by name. This is made possible by storing a global relcache init file that
describes only the shared catalogs and their indexes (and therefore is usable
by all backends in any database). A new backend loads this cache file,
finds its database OID after an indexscan on pg_database, and then loads
the local relcache init file for that database.
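In outline, backend startup now proceeds roughly as follows (the function
names here are paraphrased, not the actual relcache.c/postinit.c entry
points):

    /* 1. read the global init file: shared catalogs and their indexes */
    load_shared_relcache_init_file();

    /*
     * 2. with those descriptors available, indexscan pg_database by name;
     * if the global init file is missing, fall back to a seqscan using
     * the nailed-in-cache pg_database descriptor
     */
    MyDatabaseId = get_database_oid_by_name(dbname);

    /* 3. switch to the database directory and load its own init file */
    SetDatabasePath(GetDatabasePath(MyDatabaseId, MyDatabaseTableSpace));
    load_local_relcache_init_file();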
This change should effectively eliminate the number of databases as a factor
in backend startup time, even with large numbers of databases. However,
the real reason for doing it is as a first step towards getting rid of
the flat files altogether. There are still several other sub-projects
to be tackled before that can happen.
if a smart shutdown is already in progress. Backpatch to 8.3; this was broken
in the patch that introduced "dead-end backends".
Per report by Itagaki Takahiro, patch by Fujii Masao.
This patch gets us out from under the Unix limitation of two user-defined
signal types. We already had done something similar for signals directed to
the postmaster process; this adds multiplexing for signals directed to
backends and auxiliary processes (so long as they're connected to shared
memory).
As proof of concept, replace the former usage of SIGUSR1 and SIGUSR2
for backends with use of the multiplexing mechanism. There are still some
hard-wired definitions of SIGUSR1 and SIGUSR2 for other process types,
but getting rid of those doesn't seem interesting at the moment.
Fujii Masao
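Usage of the multiplexing facility on the sending side looks roughly like
this (fragment; the reason code shown is one of the existing ones, and error
handling is abbreviated):

    #include "storage/procsignal.h"

    /*
     * Ask another backend to process a catchup interrupt.  All such
     * requests now share one real signal; the recipient looks up the
     * reason flags in its ProcSignal slot.
     */
    if (SendProcSignal(target_pid, PROCSIG_CATCHUP_INTERRUPT,
                       InvalidBackendId) < 0)
        elog(DEBUG3, "could not signal backend with PID %d",
             (int) target_pid);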
that memory allocated by starting third-party DLLs doesn't end up
conflicting with it.
Hopefully this solves the long-time issue with "could not reattach
to shared memory" errors on Win32.
Patch from Tsutomu Yamada and me, based on an idea from Trevor Talbot.
LC_CTYPE settings to children via BackendParameters. Per discussion,
the postmaster is now just using system defaults anyway, so we might as
well save a few cycles during backend startup.
archive recovery. Invent a separate state variable and inquiry function
for XLogInsertAllowed() to clarify some tests and make the management of
writing the end-of-recovery checkpoint less klugy. Fix several places
that were incorrectly testing InRecovery when they should be looking at
RecoveryInProgress or XLogInsertAllowed (because they will now be executed
in the bgwriter not startup process). Clarify handling of bad LSNs passed
to XLogFlush during recovery. Use a spinlock for setting/testing
SharedRecoveryInProgress. Improve quite a lot of comments.
Heikki and Tom
during it:
When bgwriter is active, the startup process can't perform mdsync() correctly
because it won't see the fsync requests accumulated in bgwriter's private
pendingOpsTable. Therefore make bgwriter responsible for the end-of-recovery
checkpoint as well, when it's active.
When bgwriter is active (= archive recovery), the startup process must not
accumulate fsync requests to its own pendingOpsTable, since bgwriter won't
see them there when it performs restartpoints. Make startup process drop its
pendingOpsTable when bgwriter is launched to avoid that.
Update minimum recovery point one last time when leaving archive recovery.
It won't be updated by the end-of-recovery checkpoint because XLogFlush()
sees us as out of recovery already.
This fixes bug #4879 reported by Fujii Masao.
behavior in cases where we don't know the heap tuple count accurately; in
particular partial vacuum, but this also makes the API a bit more useful
for ANALYZE. This patch adds "estimated_count" flags to both structs so
that an approximate count can be flagged as such, and adjusts the logic
so that approximate counts are not used for updating pg_class.reltuples.
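Schematically, the two structs gain a flag of this shape (trimmed to the
relevant members; see genam.h for the full definitions):

    typedef struct IndexVacuumInfo
    {
        /* ... other fields omitted ... */
        bool        estimated_count;   /* num_heap_tuples is an estimate */
        double      num_heap_tuples;   /* tuples remaining in the heap */
    } IndexVacuumInfo;

    typedef struct IndexBulkDeleteResult
    {
        /* ... other fields omitted ... */
        bool        estimated_count;   /* num_index_tuples is an estimate */
        double      num_index_tuples;  /* tuples remaining in the index */
    } IndexBulkDeleteResult;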
This fixes my previous complaint that VACUUM was putting ridiculous values
into pg_class.reltuples for indexes. The actual impact of that bug is
limited, because the planner only pays attention to reltuples for an index
if the index is partial; which probably explains why beta testers hadn't
noticed a degradation in plan quality from it. But it needs to be fixed.
The whole thing is a bit messy and should be redesigned in future, because
reltuples now has the potential to drift quite far away from reality when
a long period elapses with no non-partial vacuums. But this is as good as
it's going to get for 8.4.
by extending the ereport() API to cater for pluralization directly. This
is better than the original method of calling ngettext outside the elog.c
code because (1) it avoids double translation, which wastes cycles and in
the worst case could give a wrong result; and (2) it avoids having to use
a different coding method in PL code than in the core backend. The
client-side uses of ngettext are not touched since neither of these concerns
is very pressing in the client environment. Per my proposal of yesterday.
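With the extension, a count-dependent message can be written in a single
call, for instance (the message text here is made up for illustration):

    ereport(NOTICE,
            (errmsg_plural("%d index row removed",
                           "%d index rows removed",
                           nremoved,
                           nremoved)));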
copies?) to ensure they really don't run proc_exit/shmem_exit callbacks,
as was intended. I broke this behavior recently by installing atexit
callbacks without thinking about the one case where we truly don't want
to run those callback functions. Noted in an example from Dave Page.
a backend has done exit(0) or exit(1) without having disengaged itself
from shared memory. We are at risk for this whenever third-party code is
loaded into a backend, since such code might not know it's supposed to go
through proc_exit() instead. Also, it is reported that under Windows
there are ways to externally kill a process that cause the status code
returned to the postmaster to be indistinguishable from a voluntary exit
(thank you, Microsoft). If this does happen then the system is probably
hosed --- for instance, the dead session might still be holding locks.
So the best recovery method is to treat this like a backend crash.
The dead man switch is armed for a particular child process when it
acquires a regular PGPROC, and disarmed when the PGPROC is released;
these should be the first and last touches of shared memory resources
in a backend, or close enough anyway. This choice means there is no
coverage for auxiliary processes, but I doubt we need that, since they
shouldn't be executing any user-provided code anyway.
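In sketch form, assuming the arm/disarm helpers live in pmsignal.c as
MarkPostmasterChildActive/Inactive (the surrounding calls are abbreviated):

    /* in InitProcess(), just after a regular PGPROC has been acquired */
    MarkPostmasterChildActive();    /* arm the dead man switch */

    /* ... the backend runs normally ... */

    /* in ProcKill(), as the PGPROC is released during proc_exit() */
    MarkPostmasterChildInactive();  /* disarm it; a child that exits
                                     * without getting here is treated
                                     * as a crash by the postmaster */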
This patch also improves the management of the EXEC_BACKEND
ShmemBackendArray array a bit, by reducing search costs.
Although this problem is of long standing, the lack of field complaints
seems to mean it's not critical enough to risk back-patching; at least
not till we get some more testing of this mechanism.
error message if the installation directory layout is messed up (or at least,
something more useful than the behavior exhibited in bug #4787). During
postmaster startup, check that get_pkglib_path resolves as a readable
directory; and if ParseTzFile() fails to open the expected timezone
abbreviation file, check the possibility that the directory is missing rather
than just the specified file. In case of either failure, issue a hint
suggesting that the installation is broken. These two checks cover the lib/
and share/ trees of a full installation, which should take care of most
scenarios where a sysadmin decides to get cute.
are using our own ports of getopt or getopt_long, those will define
the variable for themselves; and if not, we don't need these, because
we never touch the variable anyway.
temp relations; this is no more expensive than before, now that we have
pg_class.relistemp. Insert tests into bufmgr.c to prevent attempting
to fetch pages from nonlocal temp relations. This provides a low-level
defense against bugs-of-omission allowing temp pages to be loaded into shared
buffers, as in the contrib/pgstattuple problem reported by Stuart Bishop.
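The bufmgr.c defense amounts to a check of roughly this shape (condensed;
the macro is one of the new relcache tests mentioned just below):

    /* refuse to pull another session's temp pages into shared buffers */
    if (RELATION_IS_OTHER_TEMP(reln))
        ereport(ERROR,
                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                 errmsg("cannot access temporary tables of other sessions")));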
While at it, tweak a bunch of places to use new relcache tests (instead of
expensive probes into pg_namespace) to detect local or nonlocal temp tables.
In the backend, I changed only a handful of exemplary or important-looking
instances to make use of the plural support; there is probably more work
there. For the rest of the source, this should cover all relevant cases.
the cause of the "could not write to log file: Bad file descriptor"
errors reported at
http://archives.postgresql.org//pgsql-general/2008-06/msg00193.php
Backpatch to 8.3; the race condition was introduced by the CSV logging
patch.
Analysis and patch by Gurjeet Singh.
recovery: if the background writer or pgstat process dies during recovery
(or any other child process, but those two are the only ones running), send
SIGQUIT to the startup process using the correct pid.