Commit Graph

8666 Commits

Author SHA1 Message Date
8317c4a752 [Bug](cooldown) set new replica id when early exit in doing clone when no missed versions (#16644)
* set new replica id

* reduce lock

* reset when replica id is different
2023-02-13 14:39:03 +08:00
be9385d40a [improvement](lock raii) use raii to lock and unlock (#16652)
* [improvement](lock raii) use raii to lock and unlock

This is part of the exception safety work: #16366.

---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-02-13 14:06:36 +08:00
890a51838e [typo](docs) fix doc (#16651) 2023-02-13 11:51:08 +08:00
2a2ffacb5b retention function add 1.2 tags (#16465) 2023-02-13 11:33:06 +08:00
a2b9b9edd7 [fix](planner) fix bug in agg on constant column (#16442)
For performance reasons, we want to remove constant columns from groupingExprs.
For example:
                `select sum(T.A) from T group by T.B, 'xyz'` is equivalent to `select sum(T.A) from T group by T.B`
We can remove the constant column `'xyz'` from groupingExprs.

But there is an exception when all groupingExprs are constant.
For example:

                sql1: `select 'abc' from t group by 'abc'`
                 is not equivalent to
                sql2: `select 'abc' from t`

                sql3: `select 'abc', sum(a) from t group by 'abc'`
                 is not equivalent to
                sql4: `select 1, sum(a) from t`
                (when t is empty, sql3 returns 0 tuples, sql4 returns 1 tuple)

We need to keep some constant columns if all groupingExprs are constant.

Consider sql5 `select a from (select "abc" as a, 'def' as b) T group by b, a;`
If the constant column `a` is in the select list, this column should not be removed.
sql5 is transformed to
sql6 `select a from (select "abc" as a, 'def' as b) T group by a;`
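
The pruning rule described above can be sketched in a few lines of Python. This is only an illustration of the reasoning, not the planner code: `GroupingExpr` and `prune_grouping_exprs` are hypothetical names, and keeping the first constant when every grouping expression would otherwise be dropped is an assumption (the message only says "some" constants must be kept).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GroupingExpr:
    """Hypothetical stand-in for a planner grouping expression."""
    text: str
    is_constant: bool


def prune_grouping_exprs(grouping_exprs, select_list):
    """Drop constant grouping expressions that are safe to remove.

    A constant is kept when it also appears in the select list (the sql5 case).
    """
    kept = [
        expr for expr in grouping_exprs
        if not expr.is_constant or expr.text in select_list
    ]
    # Assumption: if every grouping expression was constant and none survived,
    # keep one so the query remains a grouped aggregation (0 rows on empty input).
    if not kept and grouping_exprs:
        kept = [grouping_exprs[0]]
    return kept


if __name__ == "__main__":
    a = GroupingExpr("a", True)
    b = GroupingExpr("b", True)
    # sql5: group by b, a with `a` in the select list
    print([e.text for e in prune_grouping_exprs([b, a], {"a"})])
```

With the sql5 inputs, only `a` remains in the grouping list, which matches sql6.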
2023-02-13 11:26:08 +08:00
46dd887ae2 [fix](nereids) make slot binding compatible to original planner (#16612)
SELECT a,2 as a FROM (SELECT '1' as a) b HAVING a=1

In the original planner, binding this HAVING clause fails, so make Nereids fail too.
2023-02-13 11:14:17 +08:00
91c4d1cade [Feature-WIP](inverted index) step 1 for supporting range predicate pushing down to inverted index (#16615) 2023-02-13 10:30:51 +08:00
f41a2055d3 [feature](Load)Remove user/password in properties for mysql load to avoid double auth. (#16073)
Use the FE cluster token to authenticate stream load.
This auth is only open for BE; FE auth still only supports HTTP basic auth.

I will use this auth for MySQL load to build a no-auth stream load from FE to BE.
This avoids double auth in MySQL load.
For more information, see the design doc.
2023-02-13 10:00:08 +08:00
80c1a99ef6 [enhance](Nereids): refactor JoinReorder code. (#16477)
* [enhance](Nereids): refactor JoinReorder code.

* apply nullable

* checkstyle

* set enableDPHypOptimizer default false
2023-02-13 09:08:58 +08:00
1de4e312cc [fix](metric) Fix be core when set enable_system_metrics to false in be (#16646)
When enable_system_metrics is false, we should not use system_metrics any more.

Co-authored-by: caiconghui1 <caiconghui1@jd.com>
2023-02-12 23:01:41 +08:00
cf739e7496 [Enhancement](Stmt) Set insert_into timeout session variable separately (#16343) 2023-02-12 16:56:10 +08:00
78a958467f [improvement](Load) Make broker load support the properties of trim_double_quotes and skip_lines (#16622)
`trim_double_quotes` and `skip_lines` were already supported in stream load.
This change makes broker load support them too.
2023-02-12 16:52:59 +08:00
6a8fc35b78 [Bug](Cooldown) fix load balance causing no cooldown replica (#16641) 2023-02-12 16:47:38 +08:00
4e814a7bbc [enhance](community): polish PULL_REQUEST_TEMPLATE.md (#16499) 2023-02-12 16:39:02 +08:00
0701ce1d71 [docs](docs)Fix Docker documentation description (#16643)
* change sh_checker_exclude

* add broker Dockerfile and init_broker.sh

* add docker docs

* Adjust the field naming rules when creating tables

* Fix Docker documentation description

---------

Co-authored-by: Yijia Su <suyijia@selectdb.com>
2023-02-12 16:37:51 +08:00
4350c98b02 [improve](dynamic-table) change addColumns RPC interface fields from required to optional and add config doc (#16632) 2023-02-11 20:57:10 +08:00
274016f50e [fix](docker)Fix Docker init_be script (#16629)
docker_process_sql function have error output.
2023-02-11 16:15:48 +08:00
09b7c22f6b [Opt](exec) remove useless null key when no split in convert key range (#16624) 2023-02-11 15:44:35 +08:00
aba843bb2b [Improvement](inverted index) inverted index query match bitmap cache (#16578)
Add a cache for inverted index query match bitmaps to accelerate common query keywords, especially keywords matching many rows.

Test results:
- large result: matching 99% out of 247 million rows shows 8x speed up.
- small result: matching 0.1% out of 247 million rows shows 2x speed up.
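
A minimal sketch of the caching idea, assuming an LRU eviction policy and a key made of the index file, column and keyword. `QueryBitmapCache` and `get_or_compute` are hypothetical names, not the BE API, and a Python set stands in for the real bitmap.

```python
from collections import OrderedDict


class QueryBitmapCache:
    """Hypothetical LRU cache from a query key to its matching row-id bitmap."""

    def __init__(self, capacity=1024):
        self._capacity = capacity
        self._entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get_or_compute(self, key, compute_bitmap):
        """Return the cached bitmap for key, computing and storing it on a miss."""
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._entries[key]
        self.misses += 1
        bitmap = compute_bitmap()
        self._entries[key] = bitmap
        if len(self._entries) > self._capacity:
            self._entries.popitem(last=False)  # evict the least recently used entry
        return bitmap


if __name__ == "__main__":
    cache = QueryBitmapCache(capacity=2)
    key = ("segment_0.idx", "comment", "doris")  # assumed key layout
    cache.get_or_compute(key, lambda: {1, 5, 9})
    cache.get_or_compute(key, lambda: {1, 5, 9})
    print(cache.hits, cache.misses)
```

The second lookup of the same keyword skips the index search entirely, which is where a keyword matching many rows would save the most work.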
2023-02-11 13:38:58 +08:00
37d1519316 [WIP](dynamic-table) support dynamic schema table (#16335)
Issue Number: close #16351

A dynamic schema table is a special type of table whose schema changes during the loading procedure. This feature is currently implemented mainly for semi-structured data such as JSON: since JSON is self-describing, we can extract schema info from the original documents and infer the final type information. This special table reduces manual schema change operations, makes it easy to import semi-structured data, and extends its schema automatically.
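
The idea of extending a schema from incoming JSON documents can be illustrated with a small Python sketch. The type names and the widening rules (`bigint` with `double` becomes `double`, any other conflict falls back to `string`) are assumptions made for the example, not the rules implemented in this PR; `infer_type`, `merge_type` and `extend_schema` are hypothetical helpers.

```python
import json


def infer_type(value):
    """Map a decoded JSON value to an assumed column type name."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "bigint"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "string"
    return "json"  # nested objects and arrays


def merge_type(old, new):
    """Widen two types to one that can hold both (assumed rules)."""
    if old == new:
        return old
    if {old, new} == {"bigint", "double"}:
        return "double"
    return "string"


def extend_schema(schema, document):
    """Add new columns and widen existing ones based on one JSON document."""
    for column, value in json.loads(document).items():
        if value is None:
            continue  # a null carries no type information
        new_type = infer_type(value)
        schema[column] = merge_type(schema[column], new_type) if column in schema else new_type
    return schema


if __name__ == "__main__":
    schema = {}
    extend_schema(schema, '{"id": 1, "name": "a"}')
    extend_schema(schema, '{"id": 2.5, "tag": true}')
    print(schema)
```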
2023-02-11 13:37:50 +08:00
e99202754e [UT-Fix](MTMV) Fix MTMV FE UT bugs (#16513) 2023-02-11 11:00:20 +08:00
b155fc07f6 [fix](fragment thread) fix thread in fragment thread pool hang (#16608)
process the return status for exec_state->execute() in function FragmentMgr::_exec_actual
2023-02-11 09:05:10 +08:00
171ae2892f [improvement](batch size) pass batch size of exec engine to storage engine (#16614)
Currently batch_size is not passed on to SegmentIterator; the SegmentIterator uses the hard-coded value 4096 - 32 as the max row count of a block.


* fix bug
2023-02-11 09:01:44 +08:00
8749aedbae [Bug](point query) make get_rowset thread safe (#16609)
Calling `get_rowset` from `lookup_row_data` without a lock will lead to a core dump if _rs_version_map or _stale_rs_version_map changes.
2023-02-10 23:54:56 +08:00
c3110f8153 [fix](merge-on-write) fix that the query result has duplicate keys when load with sequence column (#16587) 2023-02-10 22:31:05 +08:00
e6abfed6d1 [fix](dlf) Support DLF by catalog properties and update the doc (#16573)
1. Add default credential provider list
2. Support create DLF catalog from catalog properties
3. Update the doc
2023-02-10 20:43:58 +08:00
75847f7f6a [bugfix](exchange node) should not depend on eos to judge the ending of stream receiver (#16600)
Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-02-10 20:35:49 +08:00
f95dc28719 [fix](auth)(meta) fix auth info missing when upgrading from 1.1 to 1.2 (#16595)
When upgrading from 1.1.x to 1.2.x, the ADMIN_PRIV of normal users may be missing.
This PR fixes it.
2023-02-10 20:34:56 +08:00
3c3110b253 [Fix](Jdbc Catalog) jdbc catalog support to connect to doris database (#16527)
Doris can use mysql-jdbc-jar to connect to a Doris database, but Doris has some data types that MySQL lacks,
such as DecimalV3 and Date/DatetimeV2.
This PR adds some case judgments in `Mysql Catalog` so that the JDBC catalog can identify Doris data types.
2023-02-10 20:24:40 +08:00
3929e8214d [improvement](filecache) Use consistent hash to assign the same scan range into the same backend among different queries (#16574)
When file cache is enabled, running the same query a second time may still be slow, because `FE` assigns the same
scan range to different backends across queries, and the data previously cached in `BE` is useless if the scan range assignment changes.

So this PR introduces consistent hashing to assign the same scan range to the same backend across different queries.
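
The following Python sketch shows how a consistent hash ring keeps the scan-range-to-backend mapping stable between queries. It is an illustration under stated assumptions, not the FE implementation: `ConsistentHashRing`, the MD5-based hash, the 64 virtual nodes per backend, and the scan range key format are all chosen for the example.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Stable 64-bit hash (Python's built-in hash() is salted per process)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class ConsistentHashRing:
    """Hypothetical ring mapping scan range keys to backend names."""

    def __init__(self, backends, virtual_nodes=64):
        if not backends:
            raise ValueError("at least one backend is required")
        # Each backend owns several points on the ring to even out the load.
        self._ring = sorted(
            (_hash(f"{backend}#{i}"), backend)
            for backend in backends
            for i in range(virtual_nodes)
        )
        self._points = [point for point, _ in self._ring]

    def backend_for(self, scan_range: str) -> str:
        """Pick the first ring point clockwise from the key's hash."""
        idx = bisect.bisect_right(self._points, _hash(scan_range)) % len(self._ring)
        return self._ring[idx][1]


if __name__ == "__main__":
    ring = ConsistentHashRing(["be1", "be2", "be3"])
    key = "hdfs://path/to/file.parquet:0:134217728"  # assumed key format
    # Two independent lookups agree, so a second query reuses the same cache.
    print(ring.backend_for(key) == ConsistentHashRing(["be1", "be2", "be3"]).backend_for(key))
```

A useful property of this design is that removing one backend only remaps the scan ranges that were assigned to it; ranges cached on the remaining backends keep their placement.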
2023-02-10 19:49:33 +08:00
1cc735f20b [feature](docker)Refactor Image build script (#16528)
Co-authored-by: Yijia Su <suyijia@selectdb.com>
2023-02-10 18:30:54 +08:00
ad141747b4 [fix](inverted index) fix array type inverted index query error (#16582) 2023-02-10 17:57:15 +08:00
43eca4f209 [Feature-WIP](inverted index) Implementation for alter inverted index. (#16371)
implementation for add/drop inverted index.
2023-02-10 17:56:17 +08:00
6a5277b391 [fix](sequence-column) MergeIterator does not use the correct seq column for comparison (#16494) 2023-02-10 17:51:15 +08:00
861f31205a [fix](window function) invalid order_by_start in VAnalyticEvalNode (#16589) 2023-02-10 17:40:40 +08:00
32188855ef [improve](topn) separate multiget rpc to ThreadPool (#16598)
multiget_data runs in a bthread and may block the whole worker pthread of the BRPC framework and affect other bthreads, so the work is moved into a separate task pool.
2023-02-10 17:39:31 +08:00
05103d88b2 [feature](docker)Add Doris Docker Build Script (#16522)
Add 3FE & 3BE Build Script
2023-02-10 17:18:26 +08:00
1f631c388d [enhance](cooldown)accelerate cooldown task produce efficiency (#16089) 2023-02-10 16:58:27 +08:00
c08c643ca0 [fix](test) disable failed ut 'SelectRollupIndexTest#testPreAggHint' temporarily (#16593)
UT 'SelectRollupIndexTest#testPreAggHint' failed because of #16286.
Disable it temporarily to avoid blocking CI/CD.
2023-02-10 16:36:15 +08:00
b99e2dc727 [bug](jdbc) fix jdbc can't get object of PGobject (#16496)
When a pg table has unsupported column types such as point, polygon, or jsonb,
the jdbc catalog converts them to string type in Doris, but the object obtained from the result set in Java is org.postgresql.util.PGobject.

Some tests need this PR: #16442
2023-02-10 16:19:02 +08:00
06788bc2d0 [Bug](pipeline) Fix projection on streaming operator (#16592) 2023-02-10 15:57:26 +08:00
da753d6e26 [typo](docs)delete char and varchar in java udf when create (#16566) 2023-02-10 14:25:28 +08:00
ae325f546a [refactor](Nereids): mv AggregateStrategies to implementation rules (#16551) 2023-02-10 14:10:59 +08:00
a06baad7d7 [docs](docker) Add Run Docker cluster docs (#16520) 2023-02-10 14:07:07 +08:00
d9924c9b8e [Improvement](topn) add limit threshold session variable and fuzzy for topn optimizations (#16514)
1. add limit threshold for topn runtime pushdown and key topn optimization
2. use unified session variable topn_opt_limit_threshold for all topn optimizations
3. add fuzzy support for topn_opt_limit_threshold
2023-02-10 12:56:33 +08:00
8758cd412f [feature](auth)Implementing privilege management with rbac model (#16091)
Change the implementation of auth to RBAC.

Each user has one default role, which cannot be dropped.

If you grant a privilege to a user, it is granted to the user's default role.

In the current PR, a user can still have only one role other than the default role, but in the future, users and roles will be many-to-many.

Rename PaloRole, PaloAuth, PaloPrivilege to Role, Auth, Privilege.
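
The role model described above can be sketched as follows. This is a toy model of the stated rules only; the class `Auth` here, its methods, and the `default_role_<user>` naming are hypothetical and do not mirror the FE classes.

```python
class Auth:
    """Toy RBAC model: privileges belong to roles, users hold roles."""

    def __init__(self):
        self.role_privs = {}   # role name -> set of privileges
        self.user_extra = {}   # user -> the single non-default role, or None

    @staticmethod
    def default_role(user):
        return f"default_role_{user}"  # assumed naming scheme

    def create_role(self, role):
        self.role_privs.setdefault(role, set())

    def create_user(self, user):
        self.create_role(self.default_role(user))
        self.user_extra[user] = None

    def grant_to_role(self, role, priv):
        self.role_privs[role].add(priv)

    def grant_to_user(self, user, priv):
        # Granting to a user is really a grant to the user's default role.
        self.grant_to_role(self.default_role(user), priv)

    def assign_role(self, user, role):
        # Current limitation: only one role besides the default role.
        self.user_extra[user] = role

    def drop_role(self, role):
        if any(role == self.default_role(user) for user in self.user_extra):
            raise ValueError("a default role cannot be dropped")
        del self.role_privs[role]
        for user, extra in self.user_extra.items():
            if extra == role:
                self.user_extra[user] = None

    def privileges_of(self, user):
        privs = set(self.role_privs[self.default_role(user)])
        extra = self.user_extra[user]
        if extra is not None:
            privs |= self.role_privs[extra]
        return privs


if __name__ == "__main__":
    auth = Auth()
    auth.create_user("alice")
    auth.grant_to_user("alice", "SELECT_PRIV")
    print(auth.privileges_of("alice"))
```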
2023-02-10 12:30:49 +08:00
379bef598d [fix-core](block) clear block row_same_bit when block reuse (#16172) 2023-02-10 12:21:27 +08:00
e9cd1d64ed (fix)[multi-catalog][nereids] Reset ExternalFileScanNode required slots after Nereids planner do projection. #16549
The new Nereids planner does column projection after creating the scan node. For ExternalFileScanNode, this may cause the columns in required_slots to mismatch the slots after projection. This PR resets the required_slots after projection.
2023-02-10 11:28:01 +08:00
1b3902baa2 [Feature](Complex-type) Add struct and map type to Doris (#16444)
This commit supports:
1. Insert + select for struct/map type
2. JSON stream load for struct type
3. m[key] function for map type

How to use:
Set the FE config to allow creating tables with struct and map types:
1. admin set frontend config("enable_struct_type" = "true");
2. admin set frontend config("enable_map_type" = "true");

#16547

Co-authored-by: xy720 <xuyang25@baidu.com>
Co-authored-by: amory <wangqiannan@selectdb.com>
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
Co-authored-by: hucheng01 <hucheng01@baidu.com>
2023-02-10 11:00:33 +08:00
0c20c607b2 fix stats (#16556) 2023-02-10 11:00:01 +08:00