How to reproduce?
Add `export CMAKE_BUILD_TYPE=DEBUG` to custom_env.sh, then build the third-party libraries on macOS.
There are two problems:
Building vectorscan with the DEBUG type fails with an unused-but-set-variable error:
doris/thirdparty/src/vectorscan-vectorscan-5.4.7/src/nfa/mcclellancompile.cpp:1485:13: error: variable 'total_daddy' set but not used [-Werror,-Wunused-but-set-variable]
u16 total_daddy = 0;
gflags outputs libgflags_debug.a instead of libgflags.a when built with the DEBUG type, which then causes a "can not find library gflags" error.
To avoid these errors, we explicitly set CMAKE_BUILD_TYPE when building vectorscan and gflags.
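For context, here is a minimal C++ snippet (hypothetical, not the vectorscan source) that trips the same warning class; with `-Werror` it becomes a hard error in DEBUG builds:

```cpp
// Minimal illustration of -Wunused-but-set-variable (hypothetical code):
// `total` is assigned but its value is never read afterwards.
// Compile with: clang++ -Wall -Werror -c example.cpp
int sum_to(int n) {
    int total = 0;           // set here ...
    for (int i = 0; i < n; ++i) {
        total += i;          // ... and accumulated here ...
    }
    return n;                // ... but never read: the warning fires
}
```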
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
Co-authored-by: Adonis Ling <adonis0147@gmail.com>
The original scan pools live in exec_env.
But after enabling new_load_scan_node by default, the scan pool in exec_env is no longer used.
All scan tasks are now submitted to the scan pool in scanner_scheduler.
BTW, reorganize the scan pools into 3 kinds (see the sketch below):
1. local scan pool: for the olap scan node
2. remote scan pool: for the file scan node
3. limited scan pool: for queries that set a cpu resource limit or carry a small limit clause
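A rough sketch of how the scheduler might route a task to one of these pools; `choose_pool` and the member names are hypothetical, only the three pool kinds and their intended users come from this PR:

```cpp
// Hypothetical pool-selection sketch; only the local / remote / limited
// pool kinds and their uses are taken from this PR.
class ThreadPool; // stand-in for the real Doris thread pool type

class ScannerScheduler {
public:
    ThreadPool* choose_pool(bool is_file_scan, bool has_cpu_limit,
                            bool has_small_limit) const {
        // Queries with a cpu resource limit or a small LIMIT clause
        // are routed to the limited scan pool.
        if (has_cpu_limit || has_small_limit) return _limited_scan_pool;
        // File scan nodes read remote storage -> remote scan pool.
        if (is_file_scan) return _remote_scan_pool;
        // Olap scan nodes read local storage -> local scan pool.
        return _local_scan_pool;
    }

private:
    ThreadPool* _local_scan_pool = nullptr;
    ThreadPool* _remote_scan_pool = nullptr;
    ThreadPool* _limited_scan_pool = nullptr;
};
```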
TODO:
Use bthread to unify all IO tasks.
Some trivial issues:
Fix a bug where the memtable flush size printed in the log was incorrect.
Add a RuntimeProfile parameter to VScanner.
In #15037, I modified the build script of libgsasl to enable GSSAPI,
but it is still broken, because PATH does not include `thirdparty/installed/bin`,
so building libgsasl reports the error:
`WARNING: MIT Kerberos krb5-config not found, disabling GSSAPI`
even though `krb5-config` is in `thirdparty/installed/bin`.
Without GSSAPI, libhdfs3 cannot access HDFS with Kerberos authentication.
On macOS, we need some extra libraries to build the codebase,
so two packages, `binutils` and `gettext`, were introduced to the project.
Building these packages completely takes a lot of time. This PR introduces a way to build only the needed libraries
and skip everything else, which saves time when building the third-party libraries on macOS.
* [regression-test](mtmv) add mtmv write data regression test
Support new table value function `iceberg_meta("table" = "ctl.db.tbl", "query_type" = "snapshots")`
We can use the SQL `select * from iceberg_meta("table" = "ctl.db.tbl", "query_type" = "snapshots")` to get the snapshot info of a table. Other Iceberg metadata will be supported later as needed.
One usage example:
when we time travel with the following SQL:
`select * from ice_table FOR TIME AS OF "2022-10-10 11:11:11";`
`select * from ice_table FOR VERSION AS OF "snapshot_id";`
we can use the snapshots metadata to look up the `committed time` or `snapshot_id`,
and then use it as the time or version in the time travel clause.
1. Fix a bug in the estimation of the min/max of Year.
2. Remove Utils.getLocalDatetimeFromLong(Long). This method throws an exception if the input parameter is too big, and it is no longer used once the above bug is fixed.
First, the HAVING clause of MySQL is really complex and it is hard for us to follow all its rules, so we revert #15143 to keep the logic the same as before.
Second, the original implementation has a problem when the HAVING clause contains multiple conditions.
For example:
Case 1: here v2 inside the HAVING clause refers to the table column test_having_alias_tb.v2:
SELECT id, v1-2 as v, sum(v2) v2 FROM test_having_alias_tb GROUP BY id,v having(v2>1);
ERROR 1105 (HY000): errCode = 2, detailMessage = HAVING clause not produced by aggregation output (missing from GROUP BY clause?): (`v2` > 1)
Case 2: here v2 inside the HAVING clause refers to the alias v2 = sum(test_having_alias_tb.v2); the extra condition changes how v2 is resolved:
SELECT id, v1-2 as v, sum(v2) v2 FROM test_having_alias_tb GROUP BY id,v having(v>0 AND v2>1) ORDER BY id,v;
+------+------+------+
| id | v | v2 |
+------+------+------+
| 2 | 1 | 3 |
+------+------+------+
So here we try to make the HAVING clause rules simple:
Rule 1: if an alias inside the HAVING clause has the same name as a column, we use the column, not the alias;
Rule 2: if an alias inside the HAVING clause does not share a name with any column, we use the alias.
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
The main purpose of this PR is to introduce a `fileCache` for lakehouse reads of remote files.
The local disk is used as a cache for remote file reads, so the next time a file is read,
the data can be served directly from the local disk.
In addition, this PR includes a few other minor changes.
Import File Cache:
1. The imported `fileCache` is called `block_file_cache`, which uses an LRU replacement policy.
2. Implement a new FileReader, `CachedRemoteFileReader`, so that the `file cache` logic is hidden inside `CachedRemoteFileReader` (see the sketch below).
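A simplified sketch of the read-through idea behind `CachedRemoteFileReader`; the method and member names here are illustrative, not the actual Doris API. Only `CachedRemoteFileReader`, `block_file_cache`, and the LRU policy come from this PR:

```cpp
// Read-through cache sketch with hypothetical names.
#include <cstddef>
#include <cstdint>

struct Status {
    static Status OK() { return {}; }
};

class FileReader {
public:
    virtual ~FileReader() = default;
    virtual Status read_at(int64_t offset, char* buf, size_t len) = 0;
};

// Stand-in for block_file_cache: stores file blocks on local disk and
// evicts with an LRU policy when full.
class BlockFileCache {
public:
    bool lookup(int64_t /*offset*/, char* /*buf*/, size_t /*len*/) {
        return false; // the real cache checks local disk blocks
    }
    void insert(int64_t /*offset*/, const char* /*buf*/, size_t /*len*/) {}
};

class CachedRemoteFileReader : public FileReader {
public:
    CachedRemoteFileReader(FileReader* remote, BlockFileCache* cache)
            : _remote(remote), _cache(cache) {}

    Status read_at(int64_t offset, char* buf, size_t len) override {
        // 1. Serve the range from the local disk cache when possible.
        if (_cache->lookup(offset, buf, len)) return Status::OK();
        // 2. Miss: read from remote storage ...
        Status st = _remote->read_at(offset, buf, len);
        // 3. ... and populate the cache for later reads.
        _cache->insert(offset, buf, len);
        return st;
    }

private:
    FileReader* _remote;
    BlockFileCache* _cache;
};
```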
Other changes:
1. Add a new interface `fs()` to `FileReader`.
2. `IOContext` now carries some statistics that track how the `FileCache` is used (a rough shape is sketched below).
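The statistics in item 2 could look roughly like this; the field names are guesses, not the actual `IOContext` layout:

```cpp
#include <cstdint>

// Hypothetical shape of the FileCache counters carried by IOContext;
// the real field names in Doris may differ.
struct FileCacheStatistics {
    int64_t num_local_io_total = 0;    // reads served by the cache
    int64_t num_remote_io_total = 0;   // reads that hit remote storage
    int64_t bytes_read_from_local = 0;
    int64_t bytes_read_from_remote = 0;
};
```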
Co-authored-by: Lightman <31928846+Lchangliang@users.noreply.github.com>
1. Generate lots of aggregate functions.
2. Support the `group_concat(columns ORDER BY order_columns)` grammar.
3. Support and generate array aggregate/scalar functions, like `array_union`. We should support array literal grammar in the future, e.g. `select [1, 2, 3]`.
4. Add `checkLegalityBeforeTypeCoercion` and `checkLegalityAfterRewrite` functions to check the legality of an expression before type coercion and after rewrite; copy the semantic checks of `FunctionCallExpr` into these legality checks; remove `ForbiddenMetricTypeArguments`; move the check of the aes/sm4 crypto functions from the translator into `checkLegalityBeforeTypeCoercion`.
5. Refactor `NullableAggregateFunction`: distinct is the first parameter, alwaysNullable is the second; fix some wrong initialization orders, since some functions invoked super(distinct, alwaysNullable) while others invoked super(alwaysNullable, distinct) (see the sketch after this list).
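Item 5 fixes a classic hazard: both constructor parameters are booleans, so a swapped call compiles silently. A minimal C++ analog (the real code is the Java `NullableAggregateFunction` hierarchy; these types are hypothetical):

```cpp
// Hypothetical C++ analog of the bug in item 5: two boolean parameters
// give the compiler no way to catch swapped arguments.
struct NullableAggregateFunction {
    NullableAggregateFunction(bool distinct, bool always_nullable)
            : _distinct(distinct), _always_nullable(always_nullable) {}
    bool _distinct;
    bool _always_nullable;
};

struct SomeAggFunction : NullableAggregateFunction {
    SomeAggFunction(bool distinct, bool always_nullable)
            // BUG: arguments swapped, yet this compiles cleanly.
            : NullableAggregateFunction(always_nullable, distinct) {}
};
```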
All conjuncts should be added before HashJoinNode init. Otherwise, some slots referenced by the conjuncts are linked to tuples that are not in the HashJoinNode's intermediate tuple.
For outer join / right outer join / right semi join, when HashJoinNode::pull->process_data_in_hashtable outputs a block, it emits all rows of a key in the hash table into the block, and only after a key's output is complete does it check whether the block size exceeds the batch size; if it does, the output is terminated.
If a key has 20 million+ rows, memory overflow occurs when subsequent block operations are performed on those rows.
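A sketch of the fix's shape; the loop structure and names are illustrative, the real logic lives in HashJoinNode::pull -> process_data_in_hashtable. Checking the batch size only after a key is fully emitted lets one huge key blow past the limit, so the check has to move inside the per-row loop:

```cpp
// Illustrative sketch, not the actual HashJoinNode code.
#include <cstddef>
#include <vector>

struct Row {};
using Block = std::vector<Row>;

// Before: the batch-size check ran only after a whole key was emitted,
// so one key with 20M+ rows still produced a single enormous block.
// After (shown here): check while emitting, so the block is handed off
// as soon as it reaches batch_size, even in the middle of a key.
template <typename FlushFn>
void emit_unmatched(const std::vector<std::vector<Row>>& rows_by_key,
                    size_t batch_size, FlushFn flush) {
    Block block;
    for (const auto& key_rows : rows_by_key) {
        for (const auto& row : key_rows) {
            block.push_back(row);
            if (block.size() >= batch_size) {
                flush(block);   // hand the block off mid-key
                block.clear();  // continue with a fresh block
            }
        }
    }
    if (!block.empty()) flush(block);
}
```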
Tablet::version_for_delete_predicate has to traverse all rowset metas in the tablet meta, which is O(N); instead, we can directly determine whether a rowset is a delete rowset via RowsetMeta::has_delete_predicate, which is O(1).
Since we no longer call Tablet::version_for_delete_predicate when picking input rowsets for compaction, we can shrink the critical section of Tablet::_meta_lock.
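A simplified sketch of the change; Tablet::version_for_delete_predicate and RowsetMeta::has_delete_predicate are the real names from this PR, everything else here is illustrative:

```cpp
// Simplified sketch with illustrative surrounding types.
#include <memory>
#include <vector>

struct RowsetMeta {
    bool has_delete_predicate() const { return _has_delete_predicate; }
    bool _has_delete_predicate = false;
};

struct Rowset {
    std::shared_ptr<RowsetMeta> rowset_meta;
};

// Before: deciding whether a rowset is a delete rowset meant scanning
// every rowset meta in the tablet meta (O(N)), under Tablet::_meta_lock.
// After: ask the rowset's own meta directly, O(1) and without widening
// the critical section.
bool is_delete_rowset(const Rowset& rs) {
    return rs.rowset_meta->has_delete_predicate();
}
```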