* If column stats are unknown, do not use the DPhyp optimizer.
With this change, TPC-DS query64 is optimized even when no stats are available.
At sf500, query64 improved from 15s to 7s on HDFS, and from 4s to 3.85s on OLAP tables.
In the previous logic, when restoring columns in predicates pushed down to JdbcScanNode based on the logical syntax tree, we added escape characters to avoid query errors caused by keywords such as `key`. However, only binary predicates were processed, which was incomplete. We should add escape characters to all columns that appear in a predicate to avoid errors with keywords or illegal characters.
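A hypothetical illustration (the table `t`, the column name `key`, and MySQL-style backtick quoting are assumptions for the example), showing why every column in a pushed-down predicate needs escaping, not only those in binary predicates:

```
-- Before: only columns in binary predicates were escaped, so a
-- non-binary predicate on a keyword-named column broke the query.
SELECT * FROM t WHERE `key` = 1 AND key IN (2, 3);  -- IN-list column unescaped

-- After: every column appearing in a predicate is escaped.
SELECT * FROM t WHERE `key` = 1 AND `key` IN (2, 3);
```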
Follow-up to #28890.
Make HttpSqlConverterPlugin and AuditLoader Doris' builtin plugins,
to make it simple for users to enable SQL dialect conversion and use the audit loader.
HttpSqlConverterPlugin
By default, nothing is changed.
There is a new global variable `sql_converter_service`, which defaults to empty. If set, the HttpSqlConverterPlugin will be enabled:
set global sql_converter_service = "http://127.0.0.1:5001/api/v1/convert"
AuditLoader
By default, nothing is changed.
There is a new global variable `enable_audit_plugin`, which defaults to false. If set to true, the audit loader plugin will be enabled.
Doris will create an `audit_log` table in `__internal_schema` at startup.
If `enable_audit_plugin` is true, audit logs will be inserted into the `audit_log` table.
3 other global variables relate to this plugin (see the example after the list):
audit_plugin_max_batch_interval_sec: the max interval at which the audit loader inserts a batch of audit logs.
audit_plugin_max_batch_bytes: the max batch size for one insert of audit logs.
audit_plugin_max_sql_length: the max length of a statement kept in the audit log.
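A minimal sketch of enabling and tuning the plugin (the variable names come from this change; the values are placeholders):

```
-- Enable the audit loader; audit logs start flowing into
-- __internal_schema.audit_log.
set global enable_audit_plugin = true;

-- Optional tuning (placeholder values):
set global audit_plugin_max_batch_interval_sec = 60;
set global audit_plugin_max_batch_bytes = 52428800;  -- 50 MB
set global audit_plugin_max_sql_length = 4096;
```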
Estimate column stats for `cast(col as XXXType)`.
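For instance (a hypothetical query shape; the table and column names are made up), the planner can now derive stats for the cast expression from the source column's stats instead of treating them as unknown:

```
-- Stats for cast(o_orderdate as datetime) are now estimated
-- from the column stats of o_orderdate.
select count(*) from orders
where cast(o_orderdate as datetime) < '1995-01-01 00:00:00';
```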
```
----- cast-est -----
query      cold   hot1   hot2   best hot (ms)
query4    41169  40335  40267  40267
query58     463    361    401    361
Total cold run time: 41632 ms
Total hot run time:  40628 ms

----- master -----
query      cold   hot1   hot2   best hot (ms)
query4    40624  40180  40299  40180
query58     487    389    420    389
Total cold run time: 41111 ms
Total hot run time:  40569 ms
```
When a varchar literal contains Chinese characters, the length of the varchar should not be the character count; it should be the actual length in bytes.
Chinese characters are encoded in Unicode, and one Chinese character occupies at most 4 bytes, so when a varchar literal contains Chinese we set the length to 4 * the character count.
For example:
```
CREATE MATERIALIZED VIEW test_varchar_literal_mv
BUILD IMMEDIATE REFRESH AUTO ON MANUAL
DISTRIBUTED BY RANDOM BUCKETS 2
PROPERTIES ('replication_num' = '1')
AS
select case when l_orderkey > 1 then "一二三四" else "五六七八" end as field_1 from lineitem;
```
The resulting definition of the materialized view is as follows ("一二三四" is 4 characters, and 4 characters * 4 bytes = 16, hence VARCHAR(16)):

```
mysql> desc test_varchar_literal_mv;
+---------+-------------+------+-------+---------+-------+
| Field   | Type        | Null | Key   | Default | Extra |
+---------+-------------+------+-------+---------+-------+
| field_1 | VARCHAR(16) | No   | false | NULL    | NONE  |
+---------+-------------+------+-------+---------+-------+
```
Taking the idea further from PR #24853:
Column statistics already analyzed by Spark and available in HMS can be reused. This PR proposes to reuse the analyzed stats from the external source when the `WITH SQL` clause of the analyze command is executed.
Spark analyzes and stores the statistics in table properties instead of HiveColumnStatistics. In this PR, we try to get the statistics from these properties and make them available to Doris.
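A minimal sketch of the intended usage (the catalog and table names are placeholders; `WITH SQL` is the clause referenced above):

```
-- Reuse the Spark-analyzed stats stored in the Hive table's
-- properties instead of re-scanning the data.
analyze table hive_catalog.tpch.lineitem with sql;
```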
If no matching table can be found, the task will loop forever and no data will be loaded. To avoid such invalid scheduled tasks, it is better to pause the job rather than keep running it.
Only the root user can operate on the `__internal_schema` database.
The scope of impact includes (see the sketch after the list):
create database
drop database
alter database
create table
drop table
alter table
truncate table
insert overwrite
insert
delete
update
load (not allowed even for root)
Also, delete now supports auth checks.
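A hypothetical illustration (run as a non-root user; the source table name is a placeholder):

```
-- Both statements should now fail with a permission error
-- for any user other than root.
drop table __internal_schema.audit_log;
insert into __internal_schema.audit_log select * from some_source_table;
```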
Fix CTE queries being wrongly rewritten by a materialized view when the query has a scalar aggregate but the view does not.
For example, the following query should not be rewritten by the materialized view:
```
// materialized view definition
def mv20_1 = """
    select
        l_shipmode,
        l_shipinstruct,
        sum(l_extendedprice),
        count(*)
    from lineitem
    left join orders
        on lineitem.L_ORDERKEY = orders.O_ORDERKEY
    group by
        l_shipmode,
        l_shipinstruct;
"""
```
```
// query sql
def query20_1 = """
    select
        sum(l_extendedprice),
        count(*)
    from lineitem
    left join orders
        on lineitem.L_ORDERKEY = orders.O_ORDERKEY
"""
```
Fix wrong predicate compensation.
For example, the following now returns the right result; previously it was handled wrongly:
```
// materialized view definition
def mv7_1 = """
    select l_shipdate, o_orderdate, l_partkey, l_suppkey
    from lineitem
    left join orders
        on lineitem.l_orderkey = orders.o_orderkey
    where l_shipdate = '2023-12-08' and o_orderdate = '2023-12-08';
"""
```
```
// query sql
def query7_1 = """
    select l_shipdate, o_orderdate, l_partkey, l_suppkey
    from (select * from lineitem where l_shipdate = '2023-10-17') t1
    left join orders
        on t1.l_orderkey = orders.o_orderkey;
"""
```
Also optimize some code usage and add more comments for methods.
Currently, union runtime filter (RF) push down only supports RFs from the parent join, not from an ancestor join.
This PR fixes the RF push-down check on project/distribute nodes so RFs from ancestor joins can pass through.
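A hypothetical query shape this affects (table names are made up): in the plan, project/distribute nodes sit between the join and the union, making the join an ancestor rather than the union's direct parent, yet its RF should still reach both union branches:

```
-- The RF built from `dim.k = v.k` should be pushed through the
-- intervening project/distribute nodes into the scans of t1 and t2.
select *
from dim
join (
    select k from t1
    union all
    select k from t2
) v on dim.k = v.k;
```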
Merging profiles requires the profiles themselves to be correct. However, when merging is used to troubleshoot correctness issues through profiles, the merge itself may error out.
Moreover, the try-catch does not catch exceptions related to profile merging, so if merging fails, even the normal (unmerged) profile cannot be obtained.
Add an `information_schema` database for every catalog.
This is useful when using BI tools to connect to Doris:
the tools can get meta info from `information_schema`.
This PR mainly changes:
1. There will be an `information_schema` db in each catalog.
2. Each `information_schema` db only stores the meta info of the catalog it belongs to.
3. For `information_schema`, the `TABLE_SCHEMA` column's value is the database name.
4. There is a new global variable `show_full_dbname_in_info_schema_db`, default false. If set to true,
the `TABLE_SCHEMA` column's value takes the form `ctl.db`, because:
when connecting to Doris, the `database` info in the connection URL will be `xxx?db=ctl.db`,
and then some BI tools will query `information_schema` with SQL like
`select * from information_schema.columns where TABLE_SCHEMA = "ctl.db"`,
so the value has to be formatted as `ctl.db`.
5. For example, the `information_schema.columns` table in the external catalog `doris` looks like:
```
mysql> select * from information_schema.columns limit 1\G
*************************** 1. row ***************************
TABLE_CATALOG: doris
TABLE_SCHEMA: doris.__internal_schema
TABLE_NAME: column_statistics
COLUMN_NAME: id
ORDINAL_POSITION: 1
COLUMN_DEFAULT: NULL
IS_NULLABLE: NO
DATA_TYPE: varchar
CHARACTER_MAXIMUM_LENGTH: 4096
CHARACTER_OCTET_LENGTH: 16384
NUMERIC_PRECISION: NULL
NUMERIC_SCALE: NULL
DATETIME_PRECISION: NULL
CHARACTER_SET_NAME: NULL
COLLATION_NAME: NULL
COLUMN_TYPE: varchar(4096)
COLUMN_KEY:
EXTRA:
PRIVILEGES:
COLUMN_COMMENT:
COLUMN_SIZE: 4096
DECIMAL_DIGITS: NULL
GENERATION_EXPRESSION: NULL
SRS_ID: NULL
```
6. Modify the behavior of:
- show tables
- show databases
- show columns
- show table status
The above statements may query the `information_schema` db when a `where` predicate follows them (see the sketch below).
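A minimal example of such a statement (the table name is a placeholder):

```
-- With a `where` predicate, this may be answered by querying the
-- `information_schema` db of the current catalog.
show table status where Name = 'lineitem';
```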
1. expand_runtime_filter_by_inner_join may create some redundant RFs (e.g., in TPC-H q5 and q9); we need to remove the redundant ones.
2. Hive: prune an RF if its target is only used as the probe side.
case:
```
MySQL root@127.0.0.1:test> select cast(12 as decimalv3(2,1))
+-----------------------------+
| cast(12 as DECIMALV3(2, 1)) |
+-----------------------------+
| 12.0 |
+-----------------------------+
```
The result above is wrong: 12 overflows DECIMALV3(2,1), whose maximum value is 9.9, so the cast should not silently return 12.0.
decimalv2 literals generate wrong results too, but those are not only
planner bugs; there are also bugs in the executor, so the executor
bug will be fixed in another PR.