postgresql

mirror of https://git.postgresql.org/git/postgresql.git synced 2026-02-09 02:27:36 +08:00

Author	SHA1	Message	Date
Tom Lane	dc4e8c1016	Fix low-probability memory leak in regex execution. After an internal failure in shortest() or longest() while pinning down the exact location of a match, find() forgot to free the DFA structure before returning. This is pretty unlikely to occur, since we just successfully ran the "search" variant of the DFA; but it could happen, and it would result in a session-lifespan memory leak since this code uses malloc() directly. Problem seems to have been aboriginal in Spencer's library, so back-patch all the way. In passing, correct a thinko in a comment I added awhile back about the meaning of the "ntree" field. I happened across these issues while comparing our code to Tcl's version of the library.	2015-09-18 13:55:17 -04:00
Fujii Masao	67518a1415	Remove files signaling a standby promotion request at postmaster startup This commit makes postmaster forcibly remove the files signaling a standby promotion request. Otherwise, the existence of those files can trigger a promotion too early, whether a user wants that or not. This removal of files is usually unnecessary because they can exist only during a few moments during a standby promotion. However there is a race condition: if pg_ctl promote is executed and creates the files during a promotion, the files can stay around even after the server is brought up to new master. Then, if new standby starts by using the backup taken from that master, the files can exist at the server startup and should be removed in order to avoid an unexpected promotion. Back-patch to 9.1 where promote signal file was introduced. Problem reported by Feike Steenbergen. Original patch by Michael Paquier, modified by me. Discussion: 20150528100705.4686.91426@wrigleys.postgresql.org	2015-09-09 23:01:10 +09:00
Tom Lane	39ebb64669	Fix subtransaction cleanup after an outer-subtransaction portal fails. Formerly, we treated only portals created in the current subtransaction as having failed during subtransaction abort. However, if the error occurred while running a portal created in an outer subtransaction (ie, a cursor declared before the last savepoint), that has to be considered broken too. To allow reliable detection of which ones those are, add a bookkeeping field to struct Portal that tracks the innermost subtransaction in which each portal has actually been executed. (Without this, we'd end up failing portals containing functions that had called the subtransaction, thereby breaking plpgsql exception blocks completely.) In addition, when we fail an outer-subtransaction Portal, transfer its resources into the subtransaction's resource owner, so that they're released early in cleanup of the subxact. This fixes a problem reported by Jim Nasby in which a function executed in an outer-subtransaction cursor could cause an Assert failure or crash by referencing a relation created within the inner subtransaction. The proximate cause of the Assert failure is that AtEOSubXact_RelationCache assumed it could blow away a relcache entry without first checking that the entry had zero refcount. That was a bad idea on its own terms, so add such a check there, and to the similar coding in AtEOXact_RelationCache. This provides an independent safety measure in case there are still ways to provoke the situation despite the Portal-level changes. This has been broken since subtransactions were invented, so back-patch to all supported branches. Tom Lane and Michael Paquier	2015-09-04 13:36:50 -04:00
Tom Lane	472680c577	Fix s_lock.h PPC assembly code to be compatible with native AIX assembler. On recent AIX it's necessary to configure gcc to use the native assembler (because the GNU assembler hasn't been updated to handle AIX 6+). This caused PG builds to fail with assembler syntax errors, because we'd try to compile s_lock.h's gcc asm fragment for PPC, and that assembly code relied on GNU-style local labels. We can't substitute normal labels because it would fail in any file containing more than one inlined use of tas(). Fortunately, that code is stable enough, and the PPC ISA is simple enough, that it doesn't seem like too much of a maintenance burden to just hand-code the branch offsets, removing the need for any labels. Note that the AIX assembler only accepts "$" for the location counter pseudo-symbol. The usual GNU convention is "."; but it appears that all versions of gas for PPC also accept "$", so in theory this patch will not break any other PPC platforms. This has been reported by a few people, but Steve Underwood gets the credit for being the first to pursue the problem far enough to understand why it was failing. Thanks also to Noah Misch for additional testing.	2015-08-29 16:09:25 -04:00
Tom Lane	0e933fdf94	Add a small cache of locks owned by a resource owner in ResourceOwner. Back-patch 9.3-era commit eeb6f37d89fc60c6449ca12ef9e91491069369cb, to improve the older branches' ability to cope with pg_dump dumping a large number of tables. I back-patched into 9.2 and 9.1, but not 9.0 as it would have required a significant amount of refactoring, thus negating the argument that this is by-now-well-tested code. Jeff Janes, reviewed by Amit Kapila and Heikki Linnakangas.	2015-08-27 12:22:10 -04:00
Tom Lane	fe460461d9	Accept alternate spellings of __sparcv7 and __sparcv8. Apparently some versions of gcc prefer __sparc_v7__ and __sparc_v8__. Per report from Waldemar Brodkorb.	2015-08-10 17:34:51 -04:00
Tom Lane	b6659a3b9e	Fix bogus "out of memory" reports in tuplestore.c. The tuplesort/tuplestore memory management logic assumed that the chunk allocation overhead for its memtuples array could not increase when increasing the array size. This is and always was true for tuplesort, but we (I, I think) blindly copied that logic into tuplestore.c without noticing that the assumption failed to hold for the much smaller array elements used by tuplestore. Given rather small work_mem, this could result in an improper complaint about "unexpected out-of-memory situation", as reported by Brent DeSpain in bug #13530. The easiest way to fix this is just to increase tuplestore's initial array size so that the assumption holds. Rather than relying on magic constants, though, let's export a #define from aset.c that represents the safe allocation threshold, and make tuplestore's calculation depend on that. Do the same in tuplesort.c to keep the logic looking parallel, even though tuplesort.c isn't actually at risk at present. This will keep us from breaking it if we ever muck with the allocation parameters in aset.c. Back-patch to all supported versions. The error message doesn't occur pre-9.3, not so much because the problem can't happen as because the pre-9.3 tuplestore code neglected to check for it. (The chance of trouble is a great deal larger as of 9.3, though, due to changes in the array-size-increasing strategy.) However, allowing LACKMEM() to become true unexpectedly could still result in less-than-desirable behavior, so let's patch it all the way back.	2015-08-04 18:18:46 -04:00
Heikki Linnakangas	f4297f8c5f	Fix handling of all-zero pages in SP-GiST vacuum. SP-GiST initialized an all-zeros page at vacuum, but that was not WAL-logged, which is not safe. You might get a torn page write, when it gets flushed to disk, and end-up with a half-initialized index page. To fix, leave it in the all-zeros state, and add it to the FSM. It will be initialized when reused. Also don't set the page-deleted flag when recycling an empty page. That was also not WAL-logged, and a torn write of that would cause the page to have an invalid checksum. Backpatch to 9.2, where SP-GiST indexes were added.	2015-07-27 12:32:48 +03:00
Heikki Linnakangas	84330d0c1f	Fix off-by-one error in calculating subtrans/multixact truncation point. If there were no subtransactions (or multixacts) active, we would calculate the oldestxid == next xid. That's correct, but if next XID happens to be on the next pg_subtrans (pg_multixact) page, the page does not exist yet, and SimpleLruTruncate will produce an "apparent wraparound" warning. The warning is harmless in this case, but looks very alarming to users. Backpatch to all supported versions. Patch and analysis by Thomas Munro.	2015-07-23 01:30:15 +03:00
Tom Lane	88fab18a4c	Fix the logic for putting relations into the relcache init file. Commit f3b5565dd4e59576be4c772da364704863e6a835 was a couple of bricks shy of a load; specifically, it missed putting pg_trigger_tgrelid_tgname_index into the relcache init file, because that index is not used by any syscache. However, we have historically nailed that index into cache for performance reasons. The upshot was that load_relcache_init_file always decided that the init file was busted and silently ignored it, resulting in a significant hit to backend startup speed. To fix, reinstantiate RelationIdIsInInitFile() as a wrapper around RelationSupportsSysCache(), which can know about additional relations that should be in the init file despite being unknown to syscache.c. Also install some guards against future mistakes of this type: make write_relcache_init_file Assert that all nailed relations get written to the init file, and make load_relcache_init_file emit a WARNING if it takes the "wrong number of nailed relations" exit path. Now that we remove the init files during postmaster startup, that case should never occur in the field, even if we are starting a minor-version update that added or removed rels from the nailed set. So the warning shouldn't ever be seen by end users, but it will show up in the regression tests if somebody breaks this logic. Back-patch to all supported branches, like the previous commit.	2015-06-25 14:39:05 -04:00
Tom Lane	582eff507e	Stamp 9.2.13.	2015-06-09 15:33:16 -04:00
Tom Lane	3e69a73b98	Use a safer method for determining whether relcache init file is stale. When we invalidate the relcache entry for a system catalog or index, we must also delete the relcache "init file" if the init file contains a copy of that rel's entry. The old way of doing this relied on a specially maintained list of the OIDs of relations present in the init file: we made the list either when reading the file in, or when writing the file out. The problem is that when writing the file out, we included only rels present in our local relcache, which might have already suffered some deletions due to relcache inval events. In such cases we correctly decided not to overwrite the real init file with incomplete data --- but we still used the incomplete initFileRelationIds list for the rest of the current session. This could result in wrong decisions about whether the session's own actions require deletion of the init file, potentially allowing an init file created by some other concurrent session to be left around even though it's been made stale. Since we don't support changing the schema of a system catalog at runtime, the only likely scenario in which this would cause a problem in the field involves a "vacuum full" on a catalog concurrently with other activity, and even then it's far from easy to provoke. Remarkably, this has been broken since 2002 (in commit 786340441706ac1957a031f11ad1c2e5b6e18314), but we had never seen a reproducible test case until recently. If it did happen in the field, the symptoms would probably involve unexpected "cache lookup failed" errors to begin with, then "could not open file" failures after the next checkpoint, as all accesses to the affected catalog stopped working. Recovery would require manually removing the stale "pg_internal.init" file. To fix, get rid of the initFileRelationIds list, and instead consult syscache.c's list of relations used in catalog caches to decide whether a relation is included in the init file. This should be a tad more efficient anyway, since we're replacing linear search of a list with ~100 entries with a binary search. It's a bit ugly that the init file contents are now so directly tied to the catalog caches, but in practice that won't make much difference. Back-patch to all supported branches.	2015-06-07 15:32:09 -04:00
Tom Lane	3f7928a289	Stamp 9.2.12.	2015-06-01 15:10:35 -04:00
Tom Lane	aa8377e64f	Fix fsync-at-startup code to not treat errors as fatal. Commit 2ce439f3379aed857517c8ce207485655000fc8e introduced a rather serious regression, namely that if its scan of the data directory came across any un-fsync-able files, it would fail and thereby prevent database startup. Worse yet, symlinks to such files also caused the problem, which meant that crash restart was guaranteed to fail on certain common installations such as older Debian. After discussion, we agreed that (1) failure to start is worse than any consequence of not fsync'ing is likely to be, therefore treat all errors in this code as nonfatal; (2) we should not chase symlinks other than those that are expected to exist, namely pg_xlog/ and tablespace links under pg_tblspc/. The latter restriction avoids possibly fsync'ing a much larger part of the filesystem than intended, if the user has left random symlinks hanging about in the data directory. This commit takes care of that and also does some code beautification, mainly moving the relevant code into fd.c, which seems a much better place for it than xlog.c, and making sure that the conditional compilation for the pre_sync_fname pass has something to do with whether pg_flush_data works. I also relocated the call site in xlog.c down a few lines; it seems a bit silly to be doing this before ValidateXLOGDirectoryStructure(). The similar logic in initdb.c ought to be made to match this, but that change is noncritical and will be dealt with separately. Back-patch to all active branches, like the prior commit. Abhijit Menon-Sen and Tom Lane	2015-05-28 17:33:03 -04:00
Tom Lane	221f7a9494	Revert error-throwing wrappers for the printf family of functions. This reverts commit 16304a013432931e61e623c8d85e9fe24709d9ba, except for its changes in src/port/snprintf.c; as well as commit cac18a76bb6b08f1ecc2a85e46c9d2ab82dd9d23 which is no longer needed. Fujii Masao reported that the previous commit caused failures in psql on OS X, since if one exits the pager program early while viewing a query result, psql sees an EPIPE error from fprintf --- and the wrapper function thought that was reason to panic. (It's a bit surprising that the same does not happen on Linux.) Further discussion among the security list concluded that the risk of other such failures was far too great, and that the one-size-fits-all approach to error handling embodied in the previous patch is unlikely to be workable. This leaves us again exposed to the possibility of the type of failure envisioned in CVE-2015-3166. However, that failure mode is strictly hypothetical at this point: there is no concrete reason to believe that an attacker could trigger information disclosure through the supposed mechanism. In the first place, the attack surface is fairly limited, since so much of what the backend does with format strings goes through stringinfo.c or psprintf(), and those already had adequate defenses. In the second place, even granting that an unprivileged attacker could control the occurrence of ENOMEM with some precision, it's a stretch to believe that he could induce it just where the target buffer contains some valuable information. So we concluded that the risk of non-hypothetical problems induced by the patch greatly outweighs the security risks. We will therefore revert, and instead undertake closer analysis to identify specific calls that may need hardening, rather than attempt a universal solution. We have kept the portion of the previous patch that improved snprintf.c's handling of errors when it calls the platform's sprintf(). That seems to be an unalloyed improvement. Security: CVE-2015-3166	2015-05-19 18:17:42 -04:00
Tom Lane	520fecfae5	Stamp 9.2.11.	2015-05-18 14:34:28 -04:00
Noah Misch	82b7393eb4	Add error-throwing wrappers for the printf family of functions. All known standard library implementations of these functions can fail with ENOMEM. A caller neglecting to check for failure would experience missing output, information exposure, or a crash. Check return values within wrappers and code, currently just snprintf.c, that bypasses the wrappers. The wrappers do not return after an error, so their callers need not check. Back-patch to 9.0 (all supported versions). Popular free software standard library implementations do take pains to bypass malloc() in simple cases, but they risk ENOMEM for floating point numbers, positional arguments, large field widths, and large precisions. No specification demands such caution, so this commit regards every call to a printf family function as a potential threat. Injecting the wrappers implicitly is a compromise between patch scope and design goals. I would prefer to edit each call site to name a wrapper explicitly. libpq and the ECPG libraries would, ideally, convey errors to the caller rather than abort(). All that would be painfully invasive for a back-patched security fix, hence this compromise. Security: CVE-2015-3166	2015-05-18 10:02:37 -04:00
Noah Misch	1e6652aea5	Permit use of vsprintf() in PostgreSQL code. The next commit needs it. Back-patch to 9.0 (all supported versions).	2015-05-18 10:02:37 -04:00
Robert Haas	2bc3397168	Recursively fsync() the data directory after a crash. Otherwise, if there's another crash, some writes from after the first crash might make it to disk while writes from before the crash fail to make it to disk. This could lead to data corruption. Back-patch to all supported versions. Abhijit Menon-Sen, reviewed by Andres Freund and slightly revised by me.	2015-05-04 12:41:53 -04:00
Tatsuo Ishii	e37c1090df	Make SyncRepWakeQueue to a static function It is only used in src/backend/replication/syncrep.c. Back-patch to all supported branches except 9.1 which declares the function as static.	2015-03-26 10:39:52 +09:00
Tom Lane	d068609b95	Remove code to match IPv4 pg_hba.conf entries to IPv4-in-IPv6 addresses. In investigating yesterday's crash report from Hugo Osvaldo Barrera, I only looked back as far as commit f3aec2c7f51904e7 where the breakage occurred (which is why I thought the IPv4-in-IPv6 business was undocumented). But actually the logic dates back to commit 3c9bb8886df7d56a and was simply broken by erroneous refactoring in the later commit. A bit of archives excavation shows that we added the whole business in response to a report that some 2003-era Linux kernels would report IPv4 connections as having IPv4-in-IPv6 addresses. The fact that we've had no complaints since 9.0 seems to be sufficient confirmation that no modern kernels do that, so let's just rip it all out rather than trying to fix it. Do this in the back branches too, thus essentially deciding that our effective behavior since 9.0 is correct. If there are any platforms on which the kernel reports IPv4-in-IPv6 addresses as such, yesterday's fix would have made for a subtle and potentially security-sensitive change in the effective meaning of IPv4 pg_hba.conf entries, which does not seem like a good thing to do in minor releases. So let's let the post-9.0 behavior stand, and change the documentation to match it. In passing, I failed to resist the temptation to wordsmith the description of pg_hba.conf IPv4 and IPv6 address entries a bit. A lot of this text hasn't been touched since we were IPv4-only.	2015-02-17 12:49:44 -05:00
Heikki Linnakangas	a0d84da1d4	Fix broken #ifdef for __sparcv8 Rob Rowan. Backpatch to all supported versions, like the patch that added the broken #ifdef.	2015-02-13 23:57:25 +02:00
Tom Lane	9f20f2fc43	Stamp 9.2.10.	2015-02-02 15:44:39 -05:00
Heikki Linnakangas	289592b23e	Be more careful to not lose sync in the FE/BE protocol. If any error occurred while we were in the middle of reading a protocol message from the client, we could lose sync, and incorrectly try to interpret a part of another message as a new protocol message. That will usually lead to an "invalid frontend message" error that terminates the connection. However, this is a security issue because an attacker might be able to deliberately cause an error, inject a Query message in what's supposed to be just user data, and have the server execute it. We were quite careful to not have CHECK_FOR_INTERRUPTS() calls or other operations that could ereport(ERROR) in the middle of processing a message, but a query cancel interrupt or statement timeout could nevertheless cause it to happen. Also, the V2 fastpath and COPY handling were not so careful. It's very difficult to recover in the V2 COPY protocol, so we will just terminate the connection on error. In practice, that's what happened previously anyway, as we lost protocol sync. To fix, add a new variable in pqcomm.c, PqCommReadingMsg, that is set whenever we're in the middle of reading a message. When it's set, we cannot safely ERROR out and continue running, because we might've read only part of a message. PqCommReadingMsg acts somewhat similarly to critical sections in that if an error occurs while it's set, the error handler will force the connection to be terminated, as if the error was FATAL. It's not implemented by promoting ERROR to FATAL in elog.c, like ERROR is promoted to PANIC in critical sections, because we want to be able to use PG_TRY/CATCH to recover and regain protocol sync. pq_getmessage() takes advantage of that to prevent an OOM error from terminating the connection. To prevent unnecessary connection terminations, add a holdoff mechanism similar to HOLD/RESUME_INTERRUPTS() that can be used hold off query cancel interrupts, but still allow die interrupts. The rules on which interrupts are processed when are now a bit more complicated, so refactor ProcessInterrupts() and the calls to it in signal handlers so that the signal handlers always call it if ImmediateInterruptOK is set, and ProcessInterrupts() can decide to not do anything if the other conditions are not met. Reported by Emil Lenngren. Patch reviewed by Noah Misch and Andres Freund. Backpatch to all supported versions. Security: CVE-2015-0244	2015-02-02 17:09:35 +02:00
Noah Misch	5ca4e444cb	On Darwin, detect and report a multithreaded postmaster. Darwin --enable-nls builds use a substitute setlocale() that may start a thread. Buildfarm member orangutan experienced BackendList corruption on account of different postmaster threads executing signal handlers simultaneously. Furthermore, a multithreaded postmaster risks undefined behavior from sigprocmask() and fork(). Emit LOG messages about the problem and its workaround. Back-patch to 9.0 (all supported versions).	2015-01-07 22:41:49 -05:00
Tom Lane	e92c67ddc0	Fix off-by-one loop count in MapArrayTypeName, and get rid of static array. MapArrayTypeName would copy up to NAMEDATALEN-1 bytes of the base type name, which of course is wrong: after prepending '_' there is only room for NAMEDATALEN-2 bytes. Aside from being the wrong result, this case would lead to overrunning the statically allocated work buffer. This would be a security bug if the function were ever used outside bootstrap mode, but it isn't, at least not in any currently supported branches. Aside from fixing the off-by-one loop logic, this patch gets rid of the static work buffer by having MapArrayTypeName pstrdup its result; the sole caller was already doing that, so this just requires moving the pstrdup call. This saves a few bytes but mainly it makes the API a lot cleaner. Back-patch on the off chance that there is some third-party code using MapArrayTypeName with less-secure input. Pushing pstrdup into the function should not cause any serious problems for such hypothetical code; at worst there might be a short term memory leak. Per Coverity scanning.	2014-12-16 15:35:43 -05:00
Andres Freund	86673a44a7	Backport "Expose fsync_fname as a public API". Backport commit cc52d5b33ff5df29de57dcae9322214cfe9c8464 back to 9.1 to allow backpatching some unlogged table fixes that use fsync_fname.	2014-11-15 01:21:30 +01:00
Heikki Linnakangas	7eab804c22	Fix race condition between hot standby and restoring a full-page image. There was a window in RestoreBackupBlock where a page would be zeroed out, but not yet locked. If a backend pinned and locked the page in that window, it saw the zeroed page instead of the old page or new page contents, which could lead to missing rows in a result set, or errors. To fix, replace RBM_ZERO with RBM_ZERO_AND_LOCK, which atomically pins, zeroes, and locks the page, if it's not in the buffer cache already. In stable branches, the old RBM_ZERO constant is renamed to RBM_DO_NOT_USE, to avoid breaking any 3rd party extensions that might use RBM_ZERO. More importantly, this avoids renumbering the other enum values, which would cause even bigger confusion in extensions that use ReadBufferExtended, but haven't been recompiled. Backpatch to all supported versions; this has been racy since hot standby was introduced.	2014-11-13 20:01:18 +02:00
Tom Lane	d47fff3d72	Explicitly support the case that a plancache's raw_parse_tree is NULL. This only happens if a client issues a Parse message with an empty query string, which is a bit odd; but since it is explicitly called out as legal by our FE/BE protocol spec, we'd probably better continue to allow it. Fix by adding tests everywhere that the raw_parse_tree field is passed to functions that don't or shouldn't accept NULL. Also make it clear in the relevant comments that NULL is an expected case. This reverts commits a73c9dbab0165b3395dfe8a44a7dfd16166963c4 and 2e9650cbcff8c8fb0d9ef807c73a44f241822eee, which fixed specific crash symptoms by hacking things at what now seems to be the wrong end, ie the callee functions. Making the callees allow NULL is superficially more robust, but it's not always true that there is a defensible thing for the callee to do in such cases. The caller has more context and is better able to decide what the empty-query case ought to do. Per followup discussion of bug #11335. Back-patch to 9.2. The code before that is sufficiently different that it would require development of a separate patch, which doesn't seem worthwhile for what is believed to be an essentially cosmetic change.	2014-11-12 15:58:47 -05:00
Tom Lane	19ccaf9d46	Ensure that whole-row Vars produce nonempty column names. At one time it wasn't terribly important what column names were associated with the fields of a composite Datum, but since the introduction of operations like row_to_json(), it's important that looking up the rowtype ID embedded in the Datum returns the column names that users would expect. However, that doesn't work terribly well: you could get the column names of the underlying table, or column aliases from any level of the query, depending on minor details of the plan tree. You could even get totally empty field names, which is disastrous for cases like row_to_json(). It seems unwise to change this behavior too much in stable branches, however, since users might not have noticed that they weren't getting the least-unintuitive choice of field names. Therefore, in the back branches, only change the results when the child plan has returned an actually-empty field name. (We assume that can't happen with a named rowtype, so this also dodges the issue of possibly producing RECORD-typed output from a Var with a named composite result type.) As in the sister patch for HEAD, we can get a better name to use from the Var's corresponding RTE. There is no need to touch the RowExpr code since it was already using a copy of the RTE's alias list for RECORD cases. Back-patch as far as 9.2. Before that we did not have row_to_json() so there were no core functions potentially affected by bogus field names. While 9.1 and earlier do have contrib's hstore(record) which is also affected, those versions don't seem to produce empty field names (at least not in the known problem cases), so we'll leave them alone.	2014-11-10 15:21:26 -05:00
Tom Lane	38cb8687a9	Test IsInTransactionChain, not IsTransactionBlock, in vac_update_relstats. As noted by Noah Misch, my initial cut at fixing bug #11638 didn't cover all cases where ANALYZE might be invoked in an unsafe context. We need to test the result of IsInTransactionChain not IsTransactionBlock; which is notationally a pain because IsInTransactionChain requires an isTopLevel flag, which would have to be passed down through several levels of callers. I chose to pass in_outer_xact (ie, the result of IsInTransactionChain) rather than isTopLevel per se, as that seemed marginally more apropos for the intermediate functions to know about.	2014-10-30 13:03:31 -04:00
Andres Freund	fd29810d16	Flush unlogged table's buffers when copying or moving databases. CREATE DATABASE and ALTER DATABASE .. SET TABLESPACE copy the source database directory on the filesystem level. To ensure the on disk state is consistent they block out users of the affected database and force a checkpoint to flush out all data to disk. Unfortunately, up to now, that checkpoint didn't flush out dirty buffers from unlogged relations. That bug means there could be leftover dirty buffers in either the template database, or the database in its old location. Leading to problems when accessing relations in an inconsistent state; and to possible problems during shutdown in the SET TABLESPACE case because buffers belonging files that don't exist anymore are flushed. This was reported in bug #10675 by Maxim Boguk. Fix by Pavan Deolasee, modified somewhat by me. Reviewed by MauMau and Fujii Masao. Backpatch to 9.1 where unlogged tables were introduced.	2014-10-20 23:47:00 +02:00
Tom Lane	3609c0d1f8	Declare mkdtemp() only if we're providing it. Follow our usual style of providing an "extern" for a standard library function only when we're also providing the implementation. This avoids issues when the system headers declare the function slightly differently than we do, as noted by Caleb Welton. We might have to go to the extent of probing to see if the system headers declare the function, but let's not do that until it's demonstrated to be necessary. Oversight in commit 9e6b1bf258170e62dac555fc82ff0536dfe01d29. Back-patch to all supported branches, as that was.	2014-10-17 22:55:30 -04:00
Tom Lane	7c67b93659	Support timezone abbreviations that sometimes change. Up to now, PG has assumed that any given timezone abbreviation (such as "EDT") represents a constant GMT offset in the usage of any particular region; we had a way to configure what that offset was, but not for it to be changeable over time. But, as with most things horological, this view of the world is too simplistic: there are numerous regions that have at one time or another switched to a different GMT offset but kept using the same timezone abbreviation. Almost the entire Russian Federation did that a few years ago, and later this month they're going to do it again. And there are similar examples all over the world. To cope with this, invent the notion of a "dynamic timezone abbreviation", which is one that is referenced to a particular underlying timezone (as defined in the IANA timezone database) and means whatever it currently means in that zone. For zones that use or have used daylight-savings time, the standard and DST abbreviations continue to have the property that you can specify standard or DST time and get that time offset whether or not DST was theoretically in effect at the time. However, the abbreviations mean what they meant at the time in question (or most recently before that time) rather than being absolutely fixed. The standard abbreviation-list files have been changed to use this behavior for abbreviations that have actually varied in meaning since 1970. The old simple-numeric definitions are kept for abbreviations that have not changed, since they are a bit faster to resolve. While this is clearly a new feature, it seems necessary to back-patch it into all active branches, because otherwise use of Russian zone abbreviations is going to become even more problematic than it already was. This change supersedes the changes in commit 513d06ded et al to modify the fixed meanings of the Russian abbreviations; since we've not shipped that yet, this will avoid an undesirably incompatible (not to mention incorrect) change in behavior for timestamps between 2011 and 2014. This patch makes some cosmetic changes in ecpglib to keep its usage of datetime lookup tables as similar as possible to the backend code, but doesn't do anything about the increasingly obsolete set of timezone abbreviation definitions that are hard-wired into ecpglib. Whatever we do about that will likely not be appropriate material for back-patching. Also, a potential free() of a garbage pointer after an out-of-memory failure in ecpglib has been fixed. This patch also fixes pre-existing bugs in DetermineTimeZoneOffset() that caused it to produce unexpected results near a timezone transition, if both the "before" and "after" states are marked as standard time. We'd only ever thought about or tested transitions between standard and DST time, but that's not what's happening when a zone simply redefines their base GMT offset. In passing, update the SGML documentation to refer to the Olson/zoneinfo/ zic timezone database as the "IANA" database, since it's now being maintained under the auspices of IANA.	2014-10-16 15:22:20 -04:00
Tom Lane	71b88cf52e	Fix some more problems with nested append relations. As of commit a87c72915 (which later got backpatched as far as 9.1), we're explicitly supporting the notion that append relations can be nested; this can occur when UNION ALL constructs are nested, or when a UNION ALL contains a table with inheritance children. Bug #11457 from Nelson Page, as well as an earlier report from Elvis Pranskevichus, showed that there were still nasty bugs associated with such cases: in particular the EquivalenceClass mechanism could try to generate "join" clauses connecting an appendrel child to some grandparent appendrel, which would result in assertion failures or bogus plans. Upon investigation I concluded that all current callers of find_childrel_appendrelinfo() need to be fixed to explicitly consider multiple levels of parent appendrels. The most complex fix was in processing of "broken" EquivalenceClasses, which are ECs for which we have been unable to generate all the derived equality clauses we would like to because of missing cross-type equality operators in the underlying btree operator family. That code path is more or less entirely untested by the regression tests to date, because no standard opfamilies have such holes in them. So I wrote a new regression test script to try to exercise it a bit, which turned out to be quite a worthwhile activity as it exposed existing bugs in all supported branches. The present patch is essentially the same as far back as 9.2, which is where parameterized paths were introduced. In 9.0 and 9.1, we only need to back-patch a small fragment of commit 5b7b5518d, which fixes failure to propagate out the original WHERE clauses when a broken EC contains constant members. (The regression test case results show that these older branches are noticeably stupider than 9.2+ in terms of the quality of the plans generated; but we don't really care about plan quality in such cases, only that the plan not be outright wrong. A more invasive fix in the older branches would not be a good idea anyway from a plan-stability standpoint.)	2014-10-01 19:30:34 -04:00
Andres Freund	8557a9f75f	Mark x86's memory barrier inline assembly as clobbering the cpu flags. x86's memory barrier assembly was marked as clobbering "memory" but not "cc" even though 'addl' sets various flags. As it turns out gcc on x86 implicitly assumes "cc" on every inline assembler statement, so it's not a bug. But as that's poorly documented and might get copied to architectures or compilers where that's not the case, it seems better to be precise. Discussion: 20140919100016.GH4277@alap3.anarazel.de To keep the code common, backpatch to 9.2 where explicit memory barriers were introduced.	2014-09-19 17:14:17 +02:00
Andres Freund	7679eff0f6	Fix typo in solaris spinlock fix. 07968dbfaad03 missed part of the S_UNLOCK define when building for sparcv8+.	2014-09-09 23:37:50 +02:00
Andres Freund	d0b7ffc0f6	Fix spinlock implementation for some !solaris sparc platforms. Some Sparc CPUs can be run in various coherence models, ranging from RMO (relaxed) over PSO (partial) to TSO (total). Solaris has always run CPUs in TSO mode while in userland, but linux didn't use to and the various *BSDs still don't. Unfortunately the sparc TAS/S_UNLOCK were only correct under TSO. Fix that by adding the necessary memory barrier instructions. On sparcv8+, which should be all relevant CPUs, these are treated as NOPs if the current consistency model doesn't require the barriers. Discussion: 20140630222854.GW26930@awork2.anarazel.de Will be backpatched to all released branches once a few buildfarm cycles haven't shown up problems. As I've no access to sparc, this is blindly written.	2014-09-09 23:37:50 +02:00
Heikki Linnakangas	1578d13dc7	Treat 2PC commit/abort the same as regular xacts in recovery. There were several oversights in recovery code where COMMIT/ABORT PREPARED records were ignored: * pg_last_xact_replay_timestamp() (wasn't updated for 2PC commits) * recovery_min_apply_delay (2PC commits were applied immediately) * recovery_target_xid (recovery would not stop if the XID used 2PC) The first of those was reported by Sergiy Zuban in bug #11032, analyzed by Tom Lane and Andres Freund. The bug was always there, but was masked before commit d19bd29f07aef9e508ff047d128a4046cc8bc1e2, because COMMIT PREPARED always created an extra regular transaction that was WAL-logged. Backpatch to all supported versions (older versions didn't have all the features and therefore didn't have all of the above bugs).	2014-07-29 11:58:01 +03:00
Tom Lane	e1ea61a301	Stamp 9.2.9.	2014-07-21 15:12:31 -04:00
Alvaro Herrera	b42f09fc80	Fix REASSIGN OWNED for text search objects Trying to reassign objects owned by a user that had text search dictionaries or configurations used to fail with: ERROR: unexpected classid 3600 or ERROR: unexpected classid 3602 Fix by adding cases for those object types in a switch in pg_shdepend.c. Both REASSIGN OWNED and text search objects go back all the way to 8.1, so backpatch to all supported branches. In 9.3 the alter-owner code was made generic, so the required change in recent branches is pretty simple; however, for 9.2 and older ones we need some additional reshuffling to enable specifying objects by OID rather than name. Text search templates and parsers are not owned objects, so there's no change required for them. Per bug #9749 reported by Michal Novotný	2014-07-15 13:24:07 -04:00
Tom Lane	b568d38361	Avoid leaking memory while evaluating arguments for a table function. ExecMakeTableFunctionResult evaluated the arguments for a function-in-FROM in the query-lifespan memory context. This is insignificant in simple cases where the function relation is scanned only once; but if the function is in a sub-SELECT or is on the inside of a nested loop, any memory consumed during argument evaluation can add up quickly. (The potential for trouble here had been foreseen long ago, per existing comments; but we'd not previously seen a complaint from the field about it.) To fix, create an additional temporary context just for this purpose. Per an example from MauMau. Back-patch to all active branches.	2014-06-19 22:13:51 -04:00
Noah Misch	a919937f11	Add mkdtemp() to libpgport. This function is pervasive on free software operating systems; import NetBSD's implementation. Back-patch to 8.4, like the commit that will harness it.	2014-06-14 09:41:17 -04:00
Tom Lane	9601cb7bcc	Fix unportable setvbuf() usage in initdb. In yesterday's commit 2dc4f011fd61501cce507be78c39a2677690d44b, I tried to force buffering of stdout/stderr in initdb to be what it is by default when the program is run interactively on Unix (since that's how most manual testing is done). This tripped over the fact that Windows doesn't support _IOLBF mode. We dealt with that a long time ago in syslogger.c by falling back to unbuffered mode on Windows. Export that solution in port.h and use it in initdb. Back-patch to 8.4, like the previous commit.	2014-05-15 15:58:01 -04:00
Heikki Linnakangas	d8dbeb0482	Fix race condition in preparing a transaction for two-phase commit. To lock a prepared transaction's shared memory entry, we used to mark it with the XID of the backend. When the XID was no longer active according to the proc array, the entry was implicitly considered as not locked anymore. However, when preparing a transaction, the backend's proc array entry was cleared before transfering the locks (and some other state) to the prepared transaction's dummy PGPROC entry, so there was a window where another backend could finish the transaction before it was in fact fully prepared. To fix, rewrite the locking mechanism of global transaction entries. Instead of an XID, just have simple locked-or-not flag in each entry (we store the locking backend's backend id rather than a simple boolean, but that's just for debugging purposes). The backend is responsible for explicitly unlocking the entry, and to make sure that that happens, install a callback to unlock it on abort or process exit. Backpatch to all supported versions.	2014-05-15 16:58:14 +03:00
Bruce Momjian	0b44914c21	Remove tabs after spaces in C comments This was not changed in HEAD, but will be done later as part of a pgindent run. Future pgindent runs will also do this. Report by Tom Lane Backpatch through all supported branches, but not HEAD	2014-05-06 11:26:27 -04:00
Tom Lane	8c43980a18	Fix failure to detoast fields in composite elements of structured types. If we have an array of records stored on disk, the individual record fields cannot contain out-of-line TOAST pointers: the tuptoaster.c mechanisms are only prepared to deal with TOAST pointers appearing in top-level fields of a stored row. The same applies for ranges over composite types, nested composites, etc. However, the existing code only took care of expanding sub-field TOAST pointers for the case of nested composites, not for other structured types containing composites. For example, given a command such as UPDATE tab SET arraycol = ARRAY[(ROW(x,42)::mycompositetype] ... where x is a direct reference to a field of an on-disk tuple, if that field is long enough to be toasted out-of-line then the TOAST pointer would be inserted as-is into the array column. If the source record for x is later deleted, the array field value would become a dangling pointer, leading to errors along the line of "missing chunk number 0 for toast value ..." when the value is referenced. A reproducible test case for this was provided by Jan Pecek, but it seems likely that some of the "missing chunk number" reports we've heard in the past were caused by similar issues. Code-wise, the problem is that PG_DETOAST_DATUM() is not adequate to produce a self-contained Datum value if the Datum is of composite type. Seen in this light, the problem is not just confined to arrays and ranges, but could also affect some other places where detoasting is done in that way, for example form_index_tuple(). I tried teaching the array code to apply toast_flatten_tuple_attribute() along with PG_DETOAST_DATUM() when the array element type is composite, but this was messy and imposed extra cache lookup costs whether or not any TOAST pointers were present, indeed sometimes when the array element type isn't even composite (since sometimes it takes a typcache lookup to find that out). The idea of extending that approach to all the places that currently use PG_DETOAST_DATUM() wasn't attractive at all. This patch instead solves the problem by decreeing that composite Datum values must not contain any out-of-line TOAST pointers in the first place; that is, we expand out-of-line fields at the point of constructing a composite Datum, not at the point where we're about to insert it into a larger tuple. This rule is applied only to true composite Datums, not to tuples that are being passed around the system as tuples, so it's not as invasive as it might sound at first. With this approach, the amount of code that has to be touched for a full solution is greatly reduced, and added cache lookup costs are avoided except when there actually is a TOAST pointer that needs to be inlined. The main drawback of this approach is that we might sometimes dereference a TOAST pointer that will never actually be used by the query, imposing a rather large cost that wasn't there before. On the other side of the coin, if the field value is used multiple times then we'll come out ahead by avoiding repeat detoastings. Experimentation suggests that common SQL coding patterns are unaffected either way, though. Applications that are very negatively affected could be advised to modify their code to not fetch columns they won't be using. In future, we might consider reverting this solution in favor of detoasting only at the point where data is about to be stored to disk, using some method that can drill down into multiple levels of nested structured types. That will require defining new APIs for structured types, though, so it doesn't seem feasible as a back-patchable fix. Note that this patch changes HeapTupleGetDatum() from a macro to a function call; this means that any third-party code using that macro will not get protection against creating TOAST-pointer-containing Datums until it's recompiled. The same applies to any uses of PG_RETURN_HEAPTUPLEHEADER(). It seems likely that this is not a big problem in practice: most of the tuple-returning functions in core and contrib produce outputs that could not possibly be toasted anyway, and the same probably holds for third-party extensions. This bug has existed since TOAST was invented, so back-patch to all supported branches.	2014-05-01 15:19:14 -04:00
Tom Lane	f9cd2b7824	Fix incorrect pg_proc.proallargtypes entries for two built-in functions. pg_sequence_parameters() and pg_identify_object() have had incorrect proallargtypes entries since 9.1 and 9.3 respectively. This was mostly masked by the correct information in proargtypes, but a few operations such as pg_get_function_arguments() (and thus psql's \df display) would show the wrong data types for these functions' input parameters. In HEAD, fix the wrong info, bump catversion, and add an opr_sanity regression test to catch future mistakes of this sort. In the back branches, just fix the wrong info so that installations initdb'd with future minor releases will have the right data. We can't force an initdb, and it doesn't seem like a good idea to add a regression test that will fail on existing installations. Andres Freund	2014-04-23 21:21:12 -04:00
Heikki Linnakangas	003a31a7c9	Avoid palloc in critical section in GiST WAL-logging. Memory allocation can fail if you run out of memory, and inside a critical section that will lead to a PANIC. Use conservatively-sized arrays in stack instead. There was previously no explicit limit on the number of pages a GiST split can produce, it was only limited by the number of LWLocks that can be held simultaneously (100 at the moment). This patch adds an explicit limit of 75 pages. That should be plenty, a typical split shouldn't produce more than 2-3 page halves. The bug has been there forever, but only backpatch down to 9.1. The code was changed significantly in 9.1, and it doesn't seem worth the risk or trouble to adapt this for 9.0 and 8.4.	2014-04-03 15:44:09 +03:00
Tom Lane	029decfec6	Fix assorted issues in client host name lookup. The code for matching clients to pg_hba.conf lines that specify host names (instead of IP address ranges) failed to complain if reverse DNS lookup failed; instead it silently didn't match, so that you might end up getting a surprising "no pg_hba.conf entry for ..." error, as seen in bug #9518 from Mike Blackwell. Since we don't want to make this a fatal error in situations where pg_hba.conf contains a mixture of host names and IP addresses (clients matching one of the numeric entries should not have to have rDNS data), remember the lookup failure and mention it as DETAIL if we get to "no pg_hba.conf entry". Apply the same approach to forward-DNS lookup failures, too, rather than treating them as immediate hard errors. Along the way, fix a couple of bugs that prevented us from detecting an rDNS lookup error reliably, and make sure that we make only one rDNS lookup attempt; formerly, if the lookup attempt failed, the code would try again for each host name entry in pg_hba.conf. Since more or less the whole point of this design is to ensure there's only one lookup attempt not one per entry, the latter point represents a performance bug that seems sufficient justification for back-patching. Also, adjust src/port/getaddrinfo.c so that it plays as well as it can with this code. Which is not all that well, since it does not have actual support for rDNS lookup, but at least it should return the expected (and required by spec) error codes so that the main code correctly perceives the lack of functionality as a lookup failure. It's unlikely that PG is still being used in production on any machines that require our getaddrinfo.c, so I'm not excited about working harder than this. To keep the code in the various branches similar, this includes back-patching commits c424d0d1052cb4053c8712ac44123f9b9a9aa3f2 and 1997f34db4687e671690ed054c8f30bb501b1168 into 9.2 and earlier. Back-patch to 9.1 where the facility for hostnames in pg_hba.conf was introduced.	2014-04-02 17:11:31 -04:00

1 2 3 4 5 ...

5958 Commits