Commit Graph

58 Commits

4603903d29 Allow json{b}_strip_nulls to remove null array elements
An additional parameter ("strip_in_arrays") is added to these functions.
It defaults to false. If true, then null array elements are removed as
well as null valued object fields. JSON that just consists of a single
null is not affected.
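
A minimal usage sketch (assuming the new parameter is passed as a second
boolean argument):

    SELECT jsonb_strip_nulls('{"a": null, "b": [1, null, 2]}', true);
    -- null object field "a" and the null array element are both removed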

Author: Florents Tselai <florents.tselai@gmail.com>

Discussion: https://postgr.es/m/4BCECCD5-4F40-4313-9E98-9E16BEB0B01D@gmail.com
2025-03-05 10:04:02 -05:00
842265631d Fix unique key checks in JSON object constructors
When building a JSON object, the code builds a hash table of keys, to
allow checking if the keys are unique. The uniqueness check and adding
the new key happens in json_unique_check_key(), but this assumes the
pointer to the key remains valid.

Unfortunately, two places passed pointers to keys in a buffer, while
also appending more data (additional key/value pairs) to the buffer.
With enough data the buffer is resized by enlargeStringInfo(), which
calls repalloc(), invalidating the earlier key pointers.

Due to this the uniqueness check may fail with both false negatives and
false positives, producing JSON objects with duplicate keys or failing
to produce a perfectly valid JSON object.

This affects multiple functions that enforce uniqueness of keys, all
introduced in PG16 with the new SQL/JSON:

- json_object_agg_unique / jsonb_object_agg_unique
- json_object / jsonb_objectagg

Existing regression tests did not detect the issue, simply because the
initial buffer size is 1024 and the objects were small enough not to
require the repalloc.

With a sufficiently large object, AddressSanitizer reported the access
to invalid memory immediately. So would valgrind, of course.

Fixed by copying the key into the hash table memory context, and adding
regression tests with enough data to repalloc the buffer. Backpatch to
16, where the functions were introduced.
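
For reference, a sketch of the kind of SQL/JSON constructor call whose
key-uniqueness check is affected (PG16 JSON_OBJECTAGG syntax):

    SELECT json_objectagg(k VALUE v WITH UNIQUE KEYS)
    FROM (VALUES ('k1', 1), ('k2', 2)) AS t(k, v);
    -- with enough (and long enough) keys to force the buffer repalloc,
    -- the duplicate-key detection could previously misfire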

Reported by Alexander Lakhin. Investigation and initial fix by Junwang
Zhao, with various improvements and tests by me.

Reported-by: Alexander Lakhin
Author: Junwang Zhao, Tomas Vondra
Backpatch-through: 16
Discussion: https://postgr.es/m/18598-3279ed972a2347c7@postgresql.org
Discussion: https://postgr.es/m/CAEG8a3JjH0ReJF2_O7-8LuEbO69BxPhYeXs95_x7+H9AMWF1gw@mail.gmail.com
2024-09-11 13:21:10 +02:00
ca6fde9225 Optimize JSON escaping using SIMD
Here we adjust escape_json_with_len() to make use of SIMD to allow
processing of up to 16-bytes at a time rather than processing a single
byte at a time.  This has been shown to speed up escaping of JSON
strings significantly.

Escaping is required for both JSON string properties and also the
property names themselves, so this should also help improve the speed of
the conversion from JSON into text for JSON objects that have property
names 16 or more bytes long.

Escaping JSON strings was often a significant bottleneck for longer
strings.  With these changes, some benchmarking has shown a query
performing nearly 4 times faster when escaping a JSON object with a 1MB
text property.  Tests with shorter text properties saw smaller but still
significant performance improvements.  For example, a test outputting 1024
JSON strings with a text property length ranging from 1 char to 1024 chars
became around 2 times faster.
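
A rough sketch of the kind of query that exercises this path (an
illustrative benchmark shape, not the exact one used above):

    SELECT row_to_json(t)
    FROM (SELECT repeat('text that needs "escaping"', 40000) AS big) t;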

Author: David Rowley
Reviewed-by: Melih Mutlu
Discussion: https://postgr.es/m/CAApHDvpLXwMZvbCKcdGfU9XQjGCDm7tFpRdTXuB9PVgpNUYfEQ@mail.gmail.com
2024-08-05 23:16:44 +12:00
b8da37b3ad Rework pg_input_error_message(), now renamed pg_input_error_info()
pg_input_error_info() is now a SQL function that returns a row with
more than just the error message generated for incorrect input to data
types whose input functions can handle soft failures. It returns more
of the contents of ErrorData, namely:
- The error message (same as before).
- The error detail, if set.
- The error hint, if set.
- SQL error code.
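
A usage sketch (assuming the output columns are message, detail, hint
and sql_error_code):

    SELECT * FROM pg_input_error_info('not-a-number', 'numeric');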

All the regression tests that relied on pg_input_error_message() are
updated to reflect the effects of the rename.

Per discussion with Tom Lane and Andrew Dunstan.

Author: Nathan Bossart
Discussion: https://postgr.es/m/139a68e1-bd1f-a9a7-b5fe-0be9845c6311@dunslane.net
2023-02-28 08:04:13 +09:00
c60c9badba Convert json_in and jsonb_in to report errors softly.
This requires a bit of further infrastructure-extension to allow
trapping errors reported by numeric_in and pg_unicode_to_server,
but otherwise it's pretty straightforward.

In the case of jsonb_in, we are only capturing errors reported
during the initial "parse" phase.  The value-construction phase
(JsonbValueToJsonb) can also throw errors if assorted implementation
limits are exceeded.  We should improve that, but it seems like a
separable project.
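
Soft error reporting is what lets validity checks avoid throwing; a
sketch using pg_input_is_valid(), added in the same development cycle:

    SELECT pg_input_is_valid('{"a": 1',  'json');   -- false, no error raised
    SELECT pg_input_is_valid('{"a": 1}', 'jsonb');  -- true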

Andrew Dunstan and Tom Lane

Discussion: https://postgr.es/m/3bac9841-fe07-713d-fa42-606c225567d6@dunslane.net
2022-12-11 11:28:15 -05:00
0a8de93a48 Speed up lexing of long JSON strings
Use optimized linear search when looking ahead for end quotes,
backslashes, and non-printable characters. This results in nearly 40%
faster JSON parsing on x86-64 when most values are long strings, and
all platforms should see some improvement.

Reviewed by Andres Freund and Nathan Bossart
Discussion: https://www.postgresql.org/message-id/CAFBsxsGhaR2KQ5eisaK%3D6Vm60t%3DaxhD8Ckj1qFoCH1pktZi%2B2w%40mail.gmail.com
Discussion: https://www.postgresql.org/message-id/CAFBsxsESLUyJ5spfOSyPrOvKUEYYNqsBosue9SV1j8ecgNXSKA%40mail.gmail.com
2022-09-02 09:36:22 +07:00
ffd3944ab9 Improve reporting for syntax errors in multi-line JSON data.
Point to the specific line where the error was detected; the
previous code tended to include several preceding lines as well.
Avoid re-scanning the entire input to recompute which line that
was.  Simplify the logic a bit.  Add test cases.
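
A sketch of the multi-line case this improves; the error context should
now point at the offending line only:

    SELECT '{
      "one": 1,
      oops,
      "two": 2
    }'::json;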

Simon Riggs and Hamid Akhtar, reviewed by Daniel Gustafsson and myself

Discussion: https://postgr.es/m/CANbhV-EPBnXm3MF_TTWBwwqgn1a1Ghmep9VHfqmNBQ8BT0f+_g@mail.gmail.com
2021-03-01 16:44:17 -05:00
bbd93667bd Remove unnecessary test dependency on the contents of pg_pltemplate.
Using pg_pltemplate as test data was probably not very forward-looking,
considering we've had many discussions around removing that catalog
altogether.  Use a nearby temp table instead, to make these two test
scripts more self-contained.  This is a better test case anyway, since
it exercises the scenario where the entries in the anyarray column
actually vary in type intra-query.
2019-08-21 10:43:23 -04:00
e136a0d8ca Restore json{b}_populate_record{set}'s ability to take type info from AS.
If the record argument is NULL and has no declared type more concrete
than RECORD, we can't extract useful information about the desired
rowtype from it.  In this case, see if we're in FROM with an AS clause,
and if so extract the needed rowtype info from AS.
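
The restored usage looks roughly like this: a NULL record argument with
the rowtype supplied by the AS clause.

    SELECT *
    FROM json_populate_record(NULL::record, '{"a": 1, "b": "x"}')
         AS t(a int, b text);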

It worked like this before v11, but commit 37a795a60 removed the
behavior, reasoning that it was undocumented, inefficient, and utterly
not self-consistent.  If you want to take type info from an AS clause,
you should be using the json_to_record() family of functions not the
json_populate_record() family.  Also, it was already the case that
the "populate" functions would fail for a null-valued RECORD input
(with an unfriendly "record type has not been registered" error)
when there wasn't an AS clause at hand, and it wasn't obvious that
that behavior wasn't OK when there was one.  However, it emerges
that some people were depending on this to work, and indeed the
rather off-point error message you got if you left off AS encouraged
slapping on AS without switching to the json_to_record() family.

Hence, put back the fallback behavior of looking for AS.  While at it,
improve the run-time error you get when there's no place to obtain type
info; we can do a lot better than "record type has not been registered".
(We can't, unfortunately, easily improve the parse-time error message
that leads people down this path in the first place.)

While at it, I refactored the code a bit to avoid duplicating the
same logic in several different places.

Per bug #15940 from Jaroslav Sivy.  Back-patch to v11 where the
current coding came in.  (The pre-v11 deficiencies in this area
aren't regressions, so we'll leave those branches alone.)

Patch by me, based on preliminary analysis by Dmitry Dolgov.

Discussion: https://postgr.es/m/15940-2ab76dc58ffb85b6@postgresql.org
2019-08-19 18:01:09 -04:00
6f34fcbbd5 Fix conversion of JSON strings to JSON output columns in json_to_record().
json_to_record(), when an output column is declared as type json or jsonb,
should emit the corresponding field of the input JSON object.  But it got
this slightly wrong when the field is just a string literal: it failed to
escape the contents of the string.  That typically resulted in syntax
errors if the string contained any double quotes or backslashes.
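
A sketch of the failing shape, a plain JSON string fed to an output
column declared as json:

    SELECT *
    FROM json_to_record('{"x": "has \"quotes\" inside"}') AS t(x json);
    -- x should come out as a correctly escaped JSON string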

jsonb_to_record() handles such cases correctly, but I added corresponding
test cases for it too, to prevent future backsliding.

Improve the documentation, as it provided only a very hand-wavy
description of the conversion rules used by these functions.

Per bug report from Robert Vollmert.  Back-patch to v10 where the
error was introduced (by commit cf35346e8).

Note that PG 9.4 - 9.6 also get this case wrong, but differently so:
they feed the de-escaped contents of the string literal to json[b]_in.
That behavior is less obviously wrong, so possibly it's being depended on
in the field, so I won't risk trying to make the older branches behave
like the newer ones.

Discussion: https://postgr.es/m/D6921B37-BD8E-4664-8D5F-DB3525765DCD@vllmrt.net
2019-06-11 13:33:22 -04:00
4e197bf195 Fix typo related to to_tsvector() in tests of json and jsonb
Author: Sho Kato
Reviewed-by: Kyotaro Horiguchi
Discussion: https://postgr.es/m/25C1C6B2E7BE044889E4FE8643A58BA963E1D03D@G01JPEXMBKW03
2019-03-15 16:20:11 +09:00
eba2ce1712 Fix another crash in json{b}_populate_recordset and json{b}_to_recordset.
populate_recordset_worker() failed to consider the possibility that the
supplied JSON data contains no rows, so that update_cached_tupdesc never
got called.  This led to a null-pointer dereference since commit 9a5e8ed28;
before that it led to a bogus "set-valued function called in context that
cannot accept a set" error.  Fix by forcing the update to happen.
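
The no-rows shape that triggered it, roughly (two_ints is an
illustrative composite type):

    CREATE TYPE two_ints AS (a int, b int);
    SELECT * FROM jsonb_populate_recordset(NULL::two_ints, '[]');
    -- zero rows, no crash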

Per bug #15514.  Back-patch to v11 as 9a5e8ed28 was.  (If we were excited
about the bogus error, we could perhaps go back further, but it'd take more
work to figure out how to fix it in older branches.  Given the lack of
field complaints about that aspect, I'm not excited.)

Discussion: https://postgr.es/m/15514-59d5b4c4065b178b@postgresql.org
2018-11-22 15:14:01 -05:00
4984784f83 Fix crash in json{b}_populate_recordset() and json{b}_to_recordset().
As of commit 37a795a60, populate_recordset_worker() tried to pass back
(as rsi.setDesc) a tupdesc that it also had cached in its fn_extra.
But the core executor would free the passed-back tupdesc, risking a
crash if the function were called again in the same query.  The safest
and least invasive way to fix that is to make an extra tupdesc copy
to pass back.

While at it, I failed to resist the temptation to get rid of unnecessary
get_fn_expr_argtype() calls here and in populate_record_worker().

Per report from Dmitry Dolgov; thanks to Michael Paquier and
Andrew Gierth for investigation and discussion.

Discussion: https://postgr.es/m/CA+q6zcWzN9ztCfR47ZwgTr1KLnuO6BAY6FurxXhovP4hxr+yOQ@mail.gmail.com
2018-07-13 14:16:54 -04:00
01bb85169a Make test of json(b)_to_tsvector language-independent
Missed in commit 1c1791e00065f6986f9d44a78ce7c28b2d1322dd.
2018-04-07 21:29:48 +03:00
1c1791e000 Add json(b)_to_tsvector function
Jsonb has a complex nature, so there is no best-for-everything way to convert
it to tsvector for full text search. The existing to_tsvector(json(b)) converts
only string values, but it is also possible to index keys, numeric and even
boolean values. To solve that, json(b)_to_tsvector takes a second required
argument containing a list of the desired types of json fields. For now the
second argument is a jsonb scalar or array, with the possibility of adding new
options in the future.

Bump catalog version

Author: Dmitry Dolgov with some editorization by me
Reviewed by: Teodor Sigaev
Discussion: https://www.postgresql.org/message-id/CA+q6zcXJQbS1b4kJ_HeAOoOc=unfnOrUEL=KGgE32QKDww7d8g@mail.gmail.com
2018-04-07 20:58:03 +03:00
b574228715 Add tests for json{b}_populate_recordset() crash case.
The problem reported as CVE-2017-15098 was already resolved in HEAD by
commit 37a795a60, but let's add the relevant test cases anyway.

Michael Paquier and Tom Lane, per a report from David Rowley.

Security: CVE-2017-15098
2017-11-06 10:29:37 -05:00
37a795a60b Support domains over composite types.
This is the last major omission in our domains feature: you can now
make a domain over anything that's not a pseudotype.
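
A minimal illustration (type and domain names are just examples):

    CREATE TYPE complex AS (r float8, i float8);
    CREATE DOMAIN positive_complex AS complex CHECK ((VALUE).r > 0);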

The major complication from an implementation standpoint is that places
that might be creating tuples of a domain type now need to be prepared
to apply domain_check().  It seems better that unprepared code fail
with an error like "<type> is not composite" than that it silently fail
to apply domain constraints.  Therefore, relevant infrastructure like
get_func_result_type() and lookup_rowtype_tupdesc() has been adjusted
to treat domain-over-composite as a distinct case that unprepared code
won't recognize, rather than just transparently treating it the same
as plain composite.  This isn't a 100% solution to the possibility of
overlooked domain checks, but it catches most places.

In passing, improve typcache.c's support for domains (it can now cache
the identity of a domain's base type), and rewrite the argument handling
logic in jsonfuncs.c's populate_record[set]_worker to reduce duplicative
per-call lookups.

I believe this is code-complete so far as the core and contrib code go.
The PLs need varying amounts of work, which will be tackled in followup
patches.

Discussion: https://postgr.es/m/4206.1499798337@sss.pgh.pa.us
2017-10-26 13:47:45 -04:00
18fc4ecf4a Process variadic arguments consistently in json functions
json_build_object and json_build_array and the jsonb equivalents did not
correctly process explicit VARIADIC arguments. They are modified to use
the new extract_variadic_args() utility function which abstracts away
the details of the call method.
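
The explicit-VARIADIC shape that was mishandled, roughly:

    SELECT json_build_array(VARIADIC ARRAY['a', 'b', 'c']);
    -- expected: ["a", "b", "c"]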

Michael Paquier, reviewed by Tom Lane and Dmitry Dolgov.

Backpatch to 9.5 for the jsonb fixes and 9.4 for the json fixes, as
that's where they originated.
2017-10-25 07:34:00 -04:00
cf35346e81 Make json_populate_record and friends operate recursively
With this change array fields are populated from json(b) arrays, and
composite fields are populated from json(b) objects.
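
Roughly, with an illustrative nested composite type:

    CREATE TYPE inner_t AS (x int, y int);
    CREATE TYPE outer_t AS (id int, pt inner_t, tags text[]);
    SELECT *
    FROM json_populate_record(NULL::outer_t,
        '{"id": 1, "pt": {"x": 2, "y": 3}, "tags": ["a", "b"]}');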

Along the way, some significant code refactoring is done to remove
redundancy in the way to populate_record[_set] and to_record[_set]
functions operate, and some significant efficiency gains are made by
caching tuple descriptors.

Nikita Glukhov, edited some by me.

Reviewed by Aleksander Alekseev and Tom Lane.
2017-04-06 22:22:13 -04:00
c281cd5fe1 Fix unstable regression test result.
Whoops, missed that same test was made for json as well as jsonb.
2017-03-31 20:29:30 -04:00
e306df7f9c Full Text Search support for json and jsonb
The new functions are ts_headline() and to_tsvector().
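
A usage sketch:

    SELECT to_tsvector('english', '{"a": "fat cats ate fat rats"}'::jsonb);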

Dmitry Dolgov, edited and documented by me.
2017-03-31 14:26:03 -04:00
502a3832cc Correctly handle array pseudotypes in to_json and to_jsonb
Columns with array pseudotypes have not been identified as arrays, so
they have been rendered as strings in the json and jsonb conversion
routines. This change allows them to be rendered as json arrays, making
it possible to deal correctly with the anyarray columns in pg_stats.
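
For instance, the pg_stats case mentioned above:

    SELECT to_json(histogram_bounds) FROM pg_stats LIMIT 1;
    -- now rendered as a JSON array rather than a quoted string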
2017-02-22 11:10:49 -05:00
a9d199f6d3 Fix json_to_record() bug with nested objects.
A thinko concerning nesting depth caused json_to_record() to produce bogus
output if a field of its input object contained a sub-object with a field
name matching one of the requested output column names.  Per bug #13996
from Johann Visagie.
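
The problematic shape, roughly:

    SELECT *
    FROM json_to_record('{"a": 1, "sub": {"a": 99}}') AS t(a int, sub json);
    -- "a" must be taken from the top level (1), not from the nested object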

I added a regression test case based on his example, plus parallel tests
for json_to_recordset, jsonb_to_record, jsonb_to_recordset.  The latter
three do not exhibit the same bug (which suggests that we may be missing
some opportunities to share code...) but testing seems like a good idea
in any case.

Back-patch to 9.4 where these functions were introduced.
2016-03-02 23:31:39 -05:00
94c745eb18 Fix two-argument jsonb_object when called with empty arrays
Some over-eager copy-and-pasting on my part resulted in a nonsense
result being returned in this case. I have adopted the same pattern for
handling this case as is used in the one argument form of the function,
i.e. we just skip over the code that adds values to the object.
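
The corner case in question:

    SELECT jsonb_object('{}'::text[], '{}'::text[]);
    -- expected: {}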

Diagnosis and patch from Michael Paquier, although not quite his
solution.

Fixes bug #13936.

Backpatch to 9.5 where jsonb_object was introduced.
2016-02-21 10:30:49 -05:00
d435542583 Fix incorrect translation of minus-infinity datetimes for json/jsonb.
Commit bda76c1c8cfb1d11751ba6be88f0242850481733 caused both plus and
minus infinity to be rendered as "infinity", which is not only wrong
but inconsistent with the pre-9.4 behavior of to_json().  Fix that by
duplicating the coding in date_out/timestamp_out/timestamptz_out more
closely.  Per bug #13687 from Stepan Perlov.  Back-patch to 9.4, like
the previous commit.
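
The corrected rendering, roughly:

    SELECT to_json('-infinity'::timestamptz);
    -- expected: "-infinity", not "infinity"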

In passing, also re-pgindent json.c, since it had gotten a bit messed up by
recent patches (and I was already annoyed by indentation-related problems
in back-patching this fix ...)
2015-10-20 11:07:04 -07:00
b6363772fd Factor out encoding specific tests for json
This lets us remove the large alternative results files for the main
json and jsonb tests, which makes modifying those tests simpler for
committers and patch submitters.

Backpatch to 9.4 for jsonb and 9.3 for json.
2015-10-07 22:18:27 -04:00
9e36c91b46 Fix insufficiently-portable regression test case.
Some of the buildfarm members are evidently miserly enough of stack space
to pass the originally-committed form of this test.  Increase the
requirement 10X to hopefully ensure that it fails as-expected everywhere.

Security: CVE-2015-5289
2015-10-05 12:19:14 -04:00
08fa47c485 Prevent stack overflow in json-related functions.
Sufficiently-deep recursion heretofore elicited a SIGSEGV.  If an
application constructs PostgreSQL json or jsonb values from arbitrary
user input, application users could have exploited this to terminate all
active database connections.  That applies to 9.3, where the json parser
adopted recursive descent, and later versions.  Only row_to_json() and
array_to_json() were at risk in 9.2, both in a non-security capacity.
Back-patch to 9.2, where the json type was introduced.
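
A sketch of the kind of pathological input involved; it should now fail
cleanly rather than crash the backend:

    SELECT (repeat('[', 100000) || repeat(']', 100000))::json;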

Oskari Saarenmaa, reviewed by Michael Paquier.

Security: CVE-2015-5289
2015-10-05 10:06:29 -04:00
d9a356ff2e Fix treatment of nulls in jsonb_agg and jsonb_object_agg
The wrong is_null flag was being passed to datum_to_json. Also, null
object key values are not permitted, and this was not being checked
for. Add regression tests covering these cases, and also add those tests
to the json set, even though it was doing the right thing.
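
Roughly the cases covered:

    SELECT jsonb_object_agg(k, v)
    FROM (VALUES ('a', 1), ('b', NULL)) AS t(k, v);  -- null values are fine
    -- a NULL key, by contrast, must be rejected with an error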

Fixes bug #13514, initially diagnosed by Tom Lane.
2015-07-24 09:40:46 -04:00
e02d44b8a7 Support JSON negative array subscripts everywhere
Previously, there was an inconsistency across json/jsonb operators that
operate on datums containing JSON arrays -- only some operators
supported negative array count-from-the-end subscripting.  Specifically,
only a new-to-9.5 jsonb deletion operator had support (the new "jsonb -
integer" operator).  This inconsistency seemed likely to be
counter-intuitive to users.  To fix, allow all places where the user can
supply an integer subscript to accept a negative subscript value,
including path-orientated operators and functions, as well as other
extraction operators.  This will need to be called out as an
incompatibility in the 9.5 release notes, since it's possible that users
are relying on certain established extraction operators changed here
yielding NULL in the event of a negative subscript.
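
For example:

    SELECT '["a", "b", "c"]'::jsonb -> -1;        -- "c"
    SELECT '["a", "b", "c"]'::jsonb #>> '{-2}';   -- b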

For the json type, this requires adding a way of cheaply getting the
total JSON array element count ahead of time when parsing arrays with a
negative subscript involved, necessitating an ad-hoc lex and parse.
This is followed by a "conversion" from a negative subscript to its
equivalent positive-wise value using the count.  From there on, it's as
if a positive-wise value was originally provided.

Note that there is still a minor inconsistency here across jsonb
deletion operators.  Unlike the aforementioned new "-" deletion operator
that accepts an integer on its right hand side, the new "#-" path
orientated deletion variant does not throw an error when it appears like
an array subscript (input that could be recognized as an integer
literal) is being used on an object, which is wrong-headed.  The reason
for not being stricter is that it could be the case that an object pair
happens to have a key value that looks like an integer; in general,
these two possibilities are impossible to differentiate with rhs path
text[] argument elements.  However, we still don't allow the "#-"
path-orientated deletion operator to perform array-style subscripting.
Rather, we just return the original left operand value in the event of a
negative subscript (which seems analogous to how the established
"jsonb/json #> text[]" path-orientated operator may yield NULL in the
event of an invalid subscript).

In passing, make SetArrayPath() stricter about not accepting cases where
there are trailing non-numeric garbage bytes rather than a clean NUL
byte.  This means, for example, that strings like "10e10" are now not
accepted as an array subscript of 10 by some new-to-9.5 path-orientated
jsonb operators (e.g. the new #- operator).  Finally, remove dead code
for jsonb subscript deletion; arguably, this should have been done in
commit b81c7b409.

Peter Geoghegan and Andrew Dunstan
2015-07-17 21:13:47 -04:00
bda76c1c8c Render infinite date/timestamps as 'infinity' for json/jsonb
Commit ab14a73a6c made these cases raise an error, and the behaviour
was later copied to jsonb. That matched the XML code (from which this
code was adopted), since the XSD types don't accept infinite values.
However, json dates and timestamps are just strings as far as json is
concerned, so there is no reason not to render these values as
'infinity'.

The json portion of this is backpatched to 9.4 where the behaviour was
introduced. The jsonb portion only affects the development branch.

Per gripe on pgsql-general.
2015-02-26 12:25:21 -05:00
451d280815 Fix jsonb Unicode escape processing, and in consequence disallow \u0000.
We've been trying to support \u0000 in JSON values since commit
78ed8e03c67d7333, and have introduced increasingly worse hacks to try to
make it work, such as commit 0ad1a816320a2b53.  However, it fundamentally
can't work in the way envisioned, because the stored representation looks
the same as for \\u0000 which is not the same thing at all.  It's also
entirely bogus to output \u0000 when de-escaped output is called for.

The right way to do this would be to store an actual 0x00 byte, and then
throw error only if asked to produce de-escaped textual output.  However,
getting to that point seems likely to take considerable work and may well
never be practical in the 9.4.x series.

To preserve our options for better behavior while getting rid of the nasty
side-effects of 0ad1a816320a2b53, revert that commit in toto and instead
throw error if \u0000 is used in a context where it needs to be de-escaped.
(These are the same contexts where non-ASCII Unicode escapes throw error
if the database encoding isn't UTF8, so this behavior is by no means
without precedent.)
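
In other words, roughly:

    SELECT '"\u0000"'::json;    -- accepted: json keeps the original text form
    SELECT '"\u0000"'::jsonb;   -- rejected: jsonb must de-escape at input time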

In passing, make both the \u0000 case and the non-ASCII Unicode case report
ERRCODE_UNTRANSLATABLE_CHARACTER / "unsupported Unicode escape sequence"
rather than claiming there's something wrong with the input syntax.

Back-patch to 9.4, where we have to do something because 0ad1a816320a2b53
broke things for many cases having nothing to do with \u0000.  9.3 also has
bogus behavior, but only for that specific escape value, so given the lack
of field complaints it seems better to leave 9.3 alone.
2015-01-30 14:44:56 -05:00
237a882443 Add json_strip_nulls and jsonb_strip_nulls functions.
The functions remove object fields, including in nested objects, that
have null as a value. In certain cases this can lead to considerably
smaller datums, with no loss of semantic information.
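
For example:

    SELECT json_strip_nulls('{"a": 1, "b": null, "c": {"d": null, "e": 2}}');
    -- the null-valued fields "b" and "d" are removed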

Andrew Dunstan, reviewed by Pavel Stehule.
2014-12-12 09:00:43 -05:00
c8a026e4f1 Revert 95d737ff to add 'ignore_nulls'
Per discussion, revert the commit which added 'ignore_nulls' to
row_to_json.  This capability would be better added as an independent
function rather than being bolted on to row_to_json.  Additionally,
the implementation didn't address complex JSON objects, and so was
incomplete anyway.

Pointed out by Tom and discussed with Andrew and Robert.
2014-09-29 13:32:22 -04:00
95d737ff45 Add 'ignore_nulls' option to row_to_json
Provide an option to skip NULL values in a row when generating a JSON
object from that row with row_to_json.  This can reduce the size of the
JSON object in cases where columns are NULL without really reducing the
information in the JSON object.

This also makes row_to_json into a single function with default values,
rather than having multiple functions.  In passing, change array_to_json
to also be a single function with default values (we don't add an
'ignore_nulls' option yet- it's not clear that there is a sensible
use-case there, and it hasn't been asked for in any case).

Pavel Stehule
2014-09-11 21:23:51 -04:00
41dd50e84d Fix corner-case behaviors in JSON/JSONB field extraction operators.
Cause the path extraction operators to return their lefthand input,
not NULL, if the path array has no elements.  This seems more consistent
since the case ought to correspond to applying the simple extraction
operator (->) zero times.

Cause other corner cases in field/element/path extraction to return NULL
rather than failing.  This behavior is arguably more useful than throwing
an error, since it allows an expression index using these operators to be
built even when not all values in the column are suitable for the
extraction being indexed.  Moreover, we already had multiple
inconsistencies between the path extraction operators and the simple
extraction operators, as well as inconsistencies between the JSON and
JSONB code paths.  Adopt a uniform rule of returning NULL rather than
throwing an error when the JSON input does not have a structure that
permits the request to be satisfied.
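
For example:

    SELECT '{"a": 1}'::jsonb #> '{}';        -- returns the input itself
    SELECT '{"a": 1}'::jsonb -> 'missing';   -- NULL
    SELECT '[1, 2]'::jsonb -> 5;             -- NULL rather than an error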

Back-patch to 9.4.  Update the release notes to list this as a behavior
change since 9.3.
2014-08-22 13:17:58 -04:00
fa069822f5 More regression test cases for json/jsonb extraction operators.
Cover some cases I omitted before, such as null and empty-string
elements in the path array.  This exposes another inconsistency:
json_extract_path complains about empty path elements but
jsonb_extract_path does not.
2014-08-20 19:05:05 -04:00
9bac66020d Fix core dump in jsonb #> operator, and add regression test cases.
jsonb's #> operator segfaulted (dereferencing a null pointer) if the RHS
was a zero-length array, as reported in bug #11207 from Justin Van Winkle.
json's #> operator returns NULL in such cases, so for the moment let's
make jsonb act likewise.

Also add a bunch of regression test queries memorializing the -> and #>
operators' behavior for this and other corner cases.

There is a good argument for changing some of these behaviors, as they
are not very consistent with each other, and throwing an error isn't
necessarily a desirable behavior for operators that are likely to be
used in indexes.  However, everybody can agree that a core dump is the
Wrong Thing, and we need test cases even if we decide to change their
expected output later.
2014-08-20 16:48:53 -04:00
4ebe3519e1 Allow empty string object keys in json_object().
This makes the behaviour consistent with the json parser, other
json-generating functions, and the JSON standards.
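
That is, roughly:

    SELECT json_object(ARRAY['', 'value']);
    -- an empty-string key is now accepted rather than raising an error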
2014-07-22 11:27:31 -04:00
a749a23d7a Remove use_json_as_text options from json_to_record/json_populate_record.
The "false" case was really quite useless since all it did was to throw
an error; a definition not helped in the least by making it the default.
Instead let's just have the "true" case, which emits nested objects and
arrays in JSON syntax.  We might later want to provide the ability to
emit sub-objects in Postgres record or array syntax, but we'd be best off
to drive that off a check of the target field datatype, not a separate
argument.

For the functions newly added in 9.4, we can just remove the flag arguments
outright.  We can't do that for json_populate_record[set], which already
existed in 9.3, but we can ignore the argument and always behave as if it
were "true".  It helps that the flag arguments were optional and not
documented in any useful fashion anyway.
2014-06-29 13:50:58 -04:00
57d8c1270e Fix handling of nested JSON objects in json_populate_recordset and friends.
populate_recordset_object_start() improperly created a new hash table
(overwriting the link to the existing one) if called at nest levels
greater than one.  This resulted in previous fields not appearing in
the final output, as reported by Matti Hameister in bug #10728.
In 9.4 the problem also affects json_to_recordset.

This perhaps missed detection earlier because the default behavior is to
throw an error for nested objects: you have to pass use_json_as_text = true
to see the problem.

In addition, fix query-lifespan leakage of the hashtable created by
json_populate_record().  This is pretty much the same problem recently
fixed in dblink: creating an intended-to-be-temporary context underneath
the executor's per-tuple context isn't enough to make it go away at the
end of the tuple cycle, because MemoryContextReset is not
MemoryContextResetAndDeleteChildren.

Michael Paquier and Tom Lane
2014-06-24 21:22:40 -07:00
0ad1a81632 Do not escape a unicode sequence when escaping JSON text.
Previously, any backslash in text being escaped for JSON was doubled so
that the result was still valid JSON. However, this led to some perverse
results in the case of Unicode sequences. These are now detected and the
initial backslash is no longer escaped. All other backslashes are
still escaped. No validity check is performed; all that is looked for is
\uXXXX where X is a hexadecimal digit.

This is a change from the 9.2 and 9.3 behaviour as noted in the Release
notes.

Per complaint from Teodor Sigaev.
2014-06-03 16:11:31 -04:00
f30015b6d7 Output timestamps in ISO 8601 format when rendering JSON.
Many JSON processors require timestamp strings in ISO 8601 format in
order to convert the strings. When converting a timestamp, with or
without timezone, to a JSON datum we therefore now use such a format
rather than the type's default text output, in functions such as
to_json().
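
For example:

    SELECT to_json(now());
    -- timestamp rendered in ISO 8601 form, independent of DateStyle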

This is a change in behaviour from 9.2 and 9.3, as noted in the release
notes.
2014-06-03 13:56:53 -04:00
d9134d0a35 Introduce jsonb, a structured format for storing json.
The new format accepts exactly the same data as the json type. However, it is
stored in a format that does not require reparsing the original text in order
to process it, making it much more suitable for indexing and other operations.
Insignificant whitespace is discarded, and the order of object keys is not
preserved. Neither are duplicate object keys kept - the later value for a given
key is the only one stored.
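
For instance:

    SELECT '{"b": 1, "a": 2, "a": 3}'::jsonb;
    -- the duplicate key "a" keeps only the later value; key order and
    -- insignificant whitespace from the input are not preserved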

The new type has all the functions and operators that the json type has,
with the exception of the json generation functions (to_json, json_agg etc.)
and with identical semantics. In addition, there are operator classes for
hash and btree indexing, and two classes for GIN indexing, that have no
equivalent in the json type.

This feature grew out of previous work by Oleg Bartunov and Teodor Sigaev, which
was intended to provide similar facilities to a nested hstore type, but which
in the end proved to have some significant compatibility issues.

Authors: Oleg Bartunov,  Teodor Sigaev, Peter Geoghegan and Andrew Dunstan.
Review: Andres Freund
2014-03-23 16:40:19 -04:00
5264d91541 Add json_array_elements_text function.
This was a notable omission from the json functions added in 9.3 and
there have been numerous complaints about its absence.
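
A usage sketch:

    SELECT * FROM json_array_elements_text('["x", "y", null]');
    -- three rows of text: x, y, NULL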

Laurence Rowe.
2014-01-29 15:39:01 -05:00
105639900b New json functions.
json_build_array() and json_build_object() allow for the construction of
arbitrarily complex json trees. json_object() turns a one- or two-
dimensional array, or two separate arrays, into a json object of
name/value pairs, similarly to the hstore() function.
json_object_agg() aggregates its two arguments into a single json object
as name/value pairs.
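
Usage sketches:

    SELECT json_build_object('name', 'x', 'tags', json_build_array(1, 2));
    SELECT json_object(ARRAY['a', 'b'], ARRAY['1', '2']);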

Catalog version bumped.

Andrew Dunstan, reviewed by Marko Tiikkaja.
2014-01-28 17:48:21 -05:00
001e114b8d Fix whitespace issues found by git diff --check, add gitattributes
Set per file type attributes in .gitattributes to fine-tune whitespace
checks.  With the associated cleanups, the tree is now clean for git
2013-11-10 14:48:29 -05:00
4d212bac17 json_typeof function.
Andrew Tipton.
2013-10-10 12:21:59 -04:00
78ed8e03c6 Fix unescaping of JSON Unicode escapes, especially for non-UTF8.
Per discussion  on -hackers. We treat Unicode escapes when unescaping
them similarly to the way we treat them in PostgreSQL string literals.
Escapes in the ASCII range are always accepted, no matter what the
database encoding. Escapes for higher code points are only processed in
UTF8 databases, and attempts to process them in other databases will
result in an error. \u0000 is never unescaped, since it would result in
an impermissible null byte.
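
Roughly:

    SELECT '{"a": "\u0041"}'::json ->> 'a';  -- 'A': ASCII escapes always work
    SELECT '{"a": "\u00e9"}'::json ->> 'a';  -- UTF8 databases only; errors elsewhere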
2013-06-12 13:35:24 -04:00
94e3311b97 Handle Unicode surrogate pairs correctly when processing JSON.
In 9.2, Unicode escape sequences are not analysed at all other than
to make sure that they are in the form \uXXXX. But in 9.3 many of the
new operators and functions try to turn JSON text values into text in
the server encoding, and this includes de-escaping Unicode escape
sequences. This processing had not taken into account the possibility
that this might contain a surrogate pair to designate a character
outside the BMP. That is now handled correctly.
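
For example, in a UTF8 database:

    SELECT '{"emoji": "\ud83d\ude00"}'::json ->> 'emoji';
    -- the surrogate pair is decoded to a single character outside the BMP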

This also enforces correct use of surrogate pairs, something that is not
done by the type's input routines. This fact is noted in the docs.
2013-06-08 09:12:48 -04:00