mirror of
https://git.postgresql.org/git/postgresql.git
synced 2026-02-16 19:37:00 +08:00
Up to now, the parser's reporting of a statement's stmt_location included any preceding whitespace or comments. This isn't really desirable but was done to avoid accounting honestly for nonterminals that reduce to empty. It causes problems for pg_stat_statements, which partially compensates by manually stripping whitespace, but is not bright enough to strip /*-style comments. There will be more problems with an upcoming patch to improve reporting of errors in extension scripts, so it's time to do something about this. The thing we have to do to make it work right is to adjust YYLLOC_DEFAULT to scan the inputs of each production to find the first one that has a valid location (i.e., did not reduce to empty). In theory this adds a little bit of per-reduction overhead, but in practice it's negligible. I checked by measuring the time to run raw_parser() on the contents of information_schema.sql, and there was basically no change. Having done that, we can rely on any nonterminal that didn't reduce to completely empty to have a correct starting location, and we don't need the kluges the stmtmulti production formerly used. This should have a side benefit of allowing parse error reports to include an error position in some cases where they formerly failed to do so, due to trying to report the position of an empty nonterminal. I did not go looking for an example though. The one previously known case where that could happen (OptSchemaEltList) no longer needs the kluge it had; but I rather doubt that that was the only case. Discussion: https://postgr.es/m/ZvV1ClhnbJLCz7Sm@msg.df7cb.de
src/backend/parser/README Parser ====== This directory does more than tokenize and parse SQL queries. It also creates Query structures for the various complex queries that are passed to the optimizer and then executor. parser.c things start here scan.l break query into tokens scansup.c handle escapes in input strings gram.y parse the tokens and produce a "raw" parse tree analyze.c top level of parse analysis for optimizable queries parse_agg.c handle aggregates, like SUM(col1), AVG(col2), ... parse_clause.c handle clauses like WHERE, ORDER BY, GROUP BY, ... parse_coerce.c handle coercing expressions to different data types parse_collate.c assign collation information in completed expressions parse_cte.c handle Common Table Expressions (WITH clauses) parse_expr.c handle expressions like col, col + 3, x = 3 or x = 4 parse_enr.c handle ephemeral named rels (trigger transition tables, ...) parse_func.c handle functions, table.column and column identifiers parse_merge.c handle MERGE parse_node.c create nodes for various structures parse_oper.c handle operators in expressions parse_param.c handle Params (for the cases used in the core backend) parse_relation.c support routines for tables and column handling parse_target.c handle the result list of the query parse_type.c support routines for data type handling parse_utilcmd.c parse analysis for utility commands (done at execution time) See also src/common/keywords.c, which contains the table of standard keywords and the keyword lookup function. We separated that out because various frontend code wants to use it too.