Commit Graph

6916 Commits

Author SHA1 Message Date
2b9e1878a2 [fix](hashjoin) return error if in progress of upgrade (#13753) 2022-10-31 09:41:20 +08:00
f5761c658f [Fix]Fix the extension mysql_to_doris bug (#13723)
* Fix the extension mysql_to_doris  BUG

e_mysql_to_doris.sh: command error,This error causes script execution errors.  :ERROR 1103 (42000) at line 1: Incorrect table name ''.
 ` ` symbol position error

* Update extension/mysql_to_doris/bin/e_mysql_to_doris.sh


Co-authored-by: Adonis Ling <adonis0147@gmail.com>
2022-10-31 08:45:34 +08:00
Pxl
711dad28fb [Chore](unused) remove QSorter #13769 2022-10-31 08:44:39 +08:00
6159e1cc3a [enhancement](tpch-tools) git ignore tpch tool gen file #13789 2022-10-31 08:39:54 +08:00
b15e0a9fb5 [Bug](function) fix bug of if function of nullable column process (#13779) 2022-10-31 08:38:53 +08:00
753c2ccfd1 [fix](test) drop table before create it (#13791) 2022-10-30 22:45:08 +08:00
9f7c76a0d6 [fix](memtracker) Fix the usage of bthread mem tracker (#13708)
bthead context init has performance loss, temporarily delete it first, it will be completely refactored in #13585.
2022-10-30 19:51:00 +08:00
98cc32aa0e [BugFix](regression-test) add order by in left/right join test case (#13774)
The result of right join is unordered, so we need add order by to guarantee results consistent.
2022-10-30 18:00:08 +08:00
efe813ba60 [fix](test)(explain) add full qualified name for scan node explain string (#13777)
1.
In the "explain" result of SQL, the table name in `ScanNode` should be full qualified with dbname.
And for olap scan node, the selected index name should not be "null".

2.
Remove `tpch_sf1_p1/tpch_sf1/nereids/` in regression test, it will be fixed later.
2022-10-30 13:24:48 +08:00
2a5d3dbb6e feat(nereids): draw hyper graph by graphviz (#13749) 2022-10-28 17:23:35 +08:00
e0667b297f [feature-wip](multi-catalog) reuse hdfsFs and decode parquet values in batch (#13688)
PR(https://github.com/apache/doris/pull/13404) introduced that ParquetReader
will break up batch insertion when encountering null values, which leads to the bad performance
compared to OrcReader.
So this PR has pushed null map into decode function, reduce the time of virtual function call
when encountering null values.

Further more, reuse hdfsFS among file readers to reduce the time of building connection to hdfs.
2022-10-28 15:52:52 +08:00
eab8876abc [Feature](remote) Using heavy schema change if the table is not enable light weight schema change (#13487) 2022-10-28 15:48:22 +08:00
f325119362 [fix](regression-test) update table name string in tpch_sf1 explain case (#13724) 2022-10-28 11:13:29 +08:00
Pxl
2fab0c45c7 [Feature](runtime-filter) add runtime filter breaking change adapt (#13246)
add runtime filter breaking change adapt
2022-10-28 10:59:28 +08:00
5805011629 [Feature](string-function) Add function mask/mask_first_n/mask_last_n (#13694)
Implementation of mask function from hive.
2022-10-28 10:43:56 +08:00
d6b72d9b89 [Bug](update) support to check optional value of agg_sort_infos (#13732) 2022-10-28 10:37:13 +08:00
a8a91a827a [fix] Fix the variable of boost_ROOT ,BOOST_ROOT will not work (#13450)
When execute shell command bash build.sh --be to build the backend, the cmake tool will show can't find the boost library, because the variable of BOOST_ROOT has some spelling mistake.

OS: Ubuntu 22.04 x86_64
CMake: 3.22.1
compiler: gcc (Ubuntu 11.2.0-7ubuntu2) 11.2.0
2022-10-28 08:46:35 +08:00
2ef8f3f6f4 [enhancement](java-udf) Support loading libjvm at runtime (#13660) 2022-10-28 08:45:12 +08:00
20363edc73 [BugFix](function) fix reverse function dynamic buffer overflow due to illegal character (#13671)
Previous logic of reverse function might not be strong enough to handle illegal character. For example, one one byte size character would be mistaken as one utf-8 character which occupies more than one byte space. And unfortunately exceeding the buffer space during future process.
2022-10-28 08:44:08 +08:00
859ffa6304 [bugfix](concat) be crash caused by function concat(ifnull) (#13693) 2022-10-28 08:42:51 +08:00
c108554f14 [function](date function) add new date function 'to_monday' #13707 2022-10-28 08:41:16 +08:00
f51464af59 [chore](macOS) Support Java UDF (#13714) 2022-10-28 08:40:56 +08:00
5dd052d386 [Function](array) support array_range function (#13547)
* array_range with 3 impl

* [Function](array) support array_range function

* update

* update code
2022-10-28 08:40:24 +08:00
43c6428aea [Function](string) support sub_replace function (#13736)
* [Function](string) support sub_replace function

* remove conf
2022-10-28 08:40:08 +08:00
36053d2419 [fix](array-type) fix the be core dump when select the invalid array format (#13514)
1. this pr is used to fix the be core dump when select the invalid array.
2. before the change, we run "select array_intersect([1, 2, 3, 1, 2, 3], '1[3, 2, 5]');" will cause be core dump.
MySQL [example_db]> select array_intersect([1, 2, 3, 1, 2, 3], '1[3, 2, 5]');
ERROR 1105 (HY000): RpcException, msg: io.grpc.StatusRuntimeException: UNAVAILABLE: Network closed for unknown reason
3. after the change, we run "select array_intersect([1, 2, 3, 1, 2, 3], '1[3, 2, 5]');" will get error message.
MySQL [example_db]> select array_intersect([1, 2, 3, 1, 2, 3], '1[3, 2, 5]');
errCode = 2, detailMessage = No matching function with signature: array_intersect(array<tinyint(4)>, varchar(-1))"
Co-authored-by: hucheng01 <hucheng01@baidu.com>
2022-10-27 23:11:12 +08:00
45b31506c7 [improvement](delete) support delete from partitioned table without partition specified (#13533)
Support delete from partitioned table without partition specified in [DELETE] stmt.

## Usage
If it is a partitioned table, you can specify a partition.
If not specified, Doris will infer partition from the given conditions.
In two cases, Doris cannot infer the partition from conditions:
1) the conditions do not contain partition columns;
2) The operator of the partition column is `not in`.
When a partition table does not specify the partition,
or the partition cannot be inferred from the conditions,
the session variable `delete_without_partition` needs to be `true`
to make delete statement be applied to all partitions.

## Test case
Test case is added in `regression-test/suites/delete_p0/test_delete_from_partition.groovy`,
user can delete from partitioned table without partition specified now.
2022-10-27 21:32:45 +08:00
578d956a6b [typo](doc):Correct spelling mistakes UDAF. (#13711) 2022-10-27 21:21:29 +08:00
4bfa95f669 [enhancement](tools) opt tpch q21: change join order (#13699) 2022-10-27 16:55:23 +08:00
bad950136d [chore](build) Pass the compile flag -Wno-unused-but-set-variable on demand (#13716)
There are some issues with the compile flag `-Wno-unused-but-set-variable` for clang.
1. `-Wno-unused-but-set-variable` should be set when building source by clang-15 on Linux. (#13000 #13016)
2. On macOS Monterey, Apple Clang 13 may treat it as a unknown warning option and the compilation process may interrupt.

This PR introduces a better way to make this compile flag more portable.
1. Test whether the compiler recognizes this flag.
2. Add this flag if the compiler recognizes it.
2022-10-27 15:18:28 +08:00
738da0b139 [bugfix](join) inner join return wrong result (#13608)
* bug fix for vhash join

* add regression test

Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
2022-10-27 11:48:41 +08:00
ec86e9c9b2 [feature-wip][MTMV] The schedule framework for the MTMV (#13147)
Design document: https://github.com/apache/doris/issues/13146
2022-10-27 11:37:24 +08:00
0e70d681d9 [feature](Nereids): Construct join graph (#13679)
* feat: add hypergraph and its api

* feat: add visulization api

Signed-off-by: xiejiann <jianxie0@gmail.com>

* remove unused code

Signed-off-by: xiejiann <jianxie0@gmail.com>

* fix format

Signed-off-by: xiejiann <jianxie0@gmail.com>

* remove unused test

Signed-off-by: xiejiann <jianxie0@gmail.com>

* remove unused tests

Signed-off-by: xiejiann <jianxie0@gmail.com>

* format

Signed-off-by: xiejiann <jianxie0@gmail.com>

Signed-off-by: xiejiann <jianxie0@gmail.com>
2022-10-27 11:32:31 +08:00
d388de6c11 [Enhancement](threadpool) print thread pool name on error (#13706) 2022-10-27 10:49:18 +08:00
c874931ac8 [fix](join)output all value from no-null side of outer join (#13655)
* [fix](joinoutput all value from no-null side of outer join

* add regression test
2022-10-27 10:48:36 +08:00
2697f72d77 [Improvement][SET-PROPERTY] Support for set query_timeout property (#13444) 2022-10-27 10:03:39 +08:00
7557980d64 [improvement](regression-test) avoid query empty result after loading finished (#13682)
When running regression test, we always found that the query return empty result after loading finished,
even if we call "sync" before the query.
This is because for `stream load`, the load task result will be returned immediately after the txn's status changed to VISIBLE,
but before writing the edit log.
So if we do the query right after we got the load task result, it is possible that we can not see the latest loaded data.

Same issue with `insert` operation
2022-10-27 09:47:18 +08:00
ffcb2f8525 [opt](exec) Replace get_utf8_byte_length function by array (#13664) 2022-10-27 09:46:41 +08:00
5bd66243ee [minor](log) remove some unused logs (#13689)
1. When running regression test with specific suites or group, do not print other suite name or file name
2. Remove unused alter table job log.
2022-10-27 09:37:32 +08:00
3e8cd0c669 [typo](doc) Add the description of json HDFS broker load (#13683)
Add the instruction of HDFS broker load with json format file.
2022-10-27 09:36:57 +08:00
d2262bc8fb [docs]fix 404 (#13695)
[docs]fix 404
2022-10-27 08:49:36 +08:00
3c95106d45 [Bug](jdbc) Fix memory leak for JDBC datasource (#13657) 2022-10-27 00:02:25 +08:00
0134e9d2f4 [Improvement](runtime filter) Reduce merging time for bloom filter (#13668) 2022-10-27 00:02:05 +08:00
06e433e14a [fix](cmake)fix cmake error (#13637)
fix cmake error if variables(${LIB_JVM}) is ""
2022-10-26 21:38:50 +08:00
ddb27b9c3f nereids use decimal(27,9) (#13678) 2022-10-26 21:37:24 +08:00
f4c8d4ce85 [feature](nereids) estimate plan cost by column ndv and table row count (#13375)
In this version, we use column ndv information to estimate plan cost.

This is the first version, covers TPCH queries.
2022-10-26 20:35:10 +08:00
bed759b3f5 [Fix](array-type) support CTAS for ARRAY column from collect_list and collect_set (#13627)
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
2022-10-26 19:42:15 +08:00
c5559877b4 [typo](docs)fix docs 404 link (#13677) 2022-10-26 14:56:47 +08:00
0841c5bf28 [Bugfix](manager) fix query profile key incompatible with old versions (#13596) 2022-10-26 14:27:58 +08:00
65aa863dcf [Bugfix](bitmap) Fix to_bitmap_with_check function symbol is incorrect (#13667)
* [Bugfix](bitmap) Fix to_bitmap_with_check function symbol is incorrect
2022-10-26 14:27:38 +08:00
3548d0b824 [fix](statistics) fix cross join statistics exception (#13645) 2022-10-26 14:10:57 +08:00