Commit Graph

17 Commits

Author SHA1 Message Date
e61d296486 [Refactor] Replace '#ifndef' with '#pragma once' (#9456)
* Replace '#ifndef' with '#pragma once'
2022-05-10 09:25:59 +08:00
869fdff2f0 [refactor] add reference path for source file from impala (#9115)
According to the requirements of the APLv2, the referenced code needs to be marked with the path of the source code.
2022-04-20 12:29:57 +08:00
6c6380969b [refactor] replace boost smart ptr with stl (#6856)
1. replace all boost::shared_ptr to std::shared_ptr
2. replace all boost::scopted_ptr to std::unique_ptr
3. replace all boost::scoped_array to std::unique<T[]>
4. replace all boost:thread to std::thread
2021-11-17 10:18:35 +08:00
Pxl
29ca77622f [Refactor] Refactor part of RuntimeFilter's code (#6998)
#6997
2021-11-07 17:40:45 +08:00
b2f1e21a3b [Bugs] Fix some bugs (#6586)
* fix regex lazy

* fix result file core

* fix dynamic partition replica and table name length bug

* fix replicanum 0

* fix delete bug

* renew proxy

Co-authored-by: morningman <chenmingyu@baidu.com>
2021-09-10 09:53:30 +08:00
149def9e42 [Feature] Support RuntimeFilter in Doris (BE Implement) (#6077)
1. support in/bloomfilter/minmax
2. support broadcast/shuffle/bucket shuffle/colocate join
3. opt memory use and cpu cache miss while build runtime filter
4. opt memory use in left semi join (works well on tpcds-95)
2021-07-04 20:59:05 +08:00
a803ceea86 [refactor] Remove boost mutex, use std::mutex instead (#5684)
* Remove boost mutex, use std::mutex instead

* replace shared_mutex
2021-04-22 11:29:36 +08:00
a91888a68b [BUG] fix memory limit failure and optimize memory usage in join stage (#5514)
This patch works well on tpcds-1T query-24
2021-03-21 11:32:51 +08:00
6fedf5881b [CodeFormat] Clang-format cpp sources (#4965)
Clang-format all c++ source files.
2020-11-28 18:36:49 +08:00
f0db9272dd [Performance] Improve performence of hash join in some case (#3148)
improve performent of hash join  when build table has to many duplicated rows, this will cause hash table collisions and slow down the probe performence.
In this pr when join type is  semi join or anti join, we will build a hash table without duplicated rows.
benchmark:
dataset: tpcds dataset  `store_sales` and `catalog_sales`
```
mysql> select count(*) from catalog_sales;
+----------+
| count(*) |
+----------+
| 14401261 |
+----------+
1 row in set (0.44 sec)

mysql> select count(distinct cs_bill_cdemo_sk) from catalog_sales;
+------------------------------------+
| count(DISTINCT `cs_bill_cdemo_sk`) |
+------------------------------------+
|                            1085080 |
+------------------------------------+
1 row in set (2.46 sec)

mysql> select count(*) from store_sales;
+----------+
| count(*) |
+----------+
| 28800991 |
+----------+
1 row in set (0.84 sec)

mysql> select count(distinct ss_addr_sk) from store_sales;
+------------------------------+
| count(DISTINCT `ss_addr_sk`) |
+------------------------------+
|                       249978 |
+------------------------------+
1 row in set (2.57 sec)
```

test querys:
query1: `select count(*) from (select store_sales.ss_addr_sk  from store_sales left semi join catalog_sales  on catalog_sales.cs_bill_cdemo_sk = store_sales.ss_addr_sk) a;`

query2: `select count(*) from (select catalog_sales.cs_bill_cdemo_sk from catalog_sales left semi join store_sales on catalog_sales.cs_bill_cdemo_sk = store_sales.ss_addr_sk) a;`

benchmark result:


||query1|query2|
|:--:|:--:|:--:|
|before|14.76 sec|3 min 16.52 sec|
|after|12.64 sec|10.34 sec|
2020-03-20 10:31:14 +08:00
839ec45197 Remove llvm relative code from be/src/exec (#2955)
Remove unused LLVM related codes of directory:be/src/exec (#2910)

there are many LLVM related codes in code base, but these codes are not really used.
The higher version of GCC is not compatible with the LLVM 3.4.2 version currently used by Doris.
The PR delete all LLVM related code of directory: be/src/exec.
2020-02-20 20:43:26 +08:00
42395d2455 Change Null-safe equal operator from cross join to hash join (#2156)
* Change Null-safe equal operator from cross join to hash join
ISSUE-2136

This commit change the join method from cross join to hash join when the equal operator is Null-safe '<=>'.
It will improve the speed of query which has the Null-safe equal operator.
The finds_nulls field is used to save if there is Null-safe operator.
The finds_nulls[i] is true means that the i-th equal operator is Null-safe.
The equal function in hash table will return true, if both val and loc are NULL when finds_nulls[i] is true.
2019-11-08 12:43:48 +08:00
0e5b193243 Add cpu and io indicates to audit log (#531) 2019-01-17 12:43:15 +08:00
37b4cafe87 Change variable and namespace name in BE (#268)
Change 'palo' to 'doris'
2018-11-02 10:22:32 +08:00
2868793b6b Change license to Apache License 2.0 (#262) 2018-11-01 09:06:01 +08:00
2419384e8a push 3.3.19 to github (#193)
* push 3.3.19 to github

* merge to 20ed420122a8283200aa37b0a6179b6a571d2837
2018-05-15 20:38:22 +08:00
e2311f656e baidu palo 2017-08-11 17:51:21 +08:00