Commit Graph

13721 Commits

Author SHA1 Message Date
Pxl
c4cee5122b [Chore](brpc) make error messages more verbose when brpc pool offer failed (#22558) 2023-08-03 22:02:37 +08:00
a6f6b351fe [feature](profile) add DORIS_BUILD_SHORT_HASH in profile #22516 2023-08-03 21:25:26 +08:00
151120c907 [Improvement](statistics)Improve show analyze performance. #22484 2023-08-03 21:22:37 +08:00
86e6f5d039 [FIX](decimal)fix decimal precision (#22364)
Previously we parsed decimals from strings incorrectly: if the string's total digit count exceeded the defined decimal precision, we returned an overflow error. An overflow error should be returned only when the integer-digit part is longer than the type's integer-digit length, checked while traversing the string into a decimal value.
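The corrected check can be sketched as follows (a hypothetical Python illustration, not the actual BE code; the function name and signature are assumptions):

```python
def overflows(s: str, precision: int, scale: int) -> bool:
    # Hypothetical sketch of the corrected behavior: overflow depends
    # only on the integer-digit count, not on the total number of
    # digits in the input string.
    digits = s.lstrip("+-")
    int_part, _, _frac = digits.partition(".")
    int_digits = len(int_part.lstrip("0"))
    # A DECIMAL(precision, scale) holds at most precision - scale
    # integer digits; extra fractional digits can be truncated
    # instead of rejected.
    return int_digits > precision - scale
```

Under this sketch, `"1.23456789"` fits a DECIMAL(5, 2) (extra fractional digits are not an overflow), while `"123456.7"` does not.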
2023-08-03 21:13:58 +08:00
e7e73a618c [exec](join) Print join type in profile (#22567) 2023-08-03 20:46:15 +08:00
Pxl
098bab7b30 [Bug](exchange) disable implicit conversion of block to bool (#22534)
disable implicit conversion of block to bool
2023-08-03 20:37:14 +08:00
ec187662be use correct bool value (#22507) 2023-08-03 20:09:57 +08:00
ab0d01d2b4 [fix](case) add sync, test_range_partition.groovy (#22556)
add sync, test_range_partition.groovy
2023-08-03 19:41:54 +08:00
469886eb4e [FIX](array)fix if function for array() #22553
2023-08-03 19:40:45 +08:00
96a46302e8 [fix](stacktrace) Fix Jemalloc enable profile fail to run BE after rewrites dl_iterate_phdr (#22549)
Jemalloc heap profiling follows libgcc's way of backtracing by default.
Rewriting dl_iterate_phdr causes Jemalloc to fail to run once profiling is enabled.

TODO, two possible solutions:

- Make Jemalloc use GNU libunwind as the profiling backtrace method, but my test failed:
--enable-prof-libunwind does not work, see jemalloc/jemalloc#2504

- Use ClickHouse/libunwind, which solves Jemalloc profile backtracing, but that branch
has diverged from both GNU libunwind and LLVM libunwind, which leaves its fate to others.
2023-08-03 19:32:36 +08:00
23a69e860d [fix](regression) fix flaky regression test delete_mow_partial_update (#22548) 2023-08-03 19:26:42 +08:00
d02b45e847 [chore](cmake) Refactor be CMakeLists option (#22499)
Signed-off-by: Jack Drogon <jack.xsuperman@gmail.com>
2023-08-03 19:25:04 +08:00
e90f95dfda [config](merge-on-write) use separate config to control primary key index cache (#22538) 2023-08-03 17:11:19 +08:00
60ca5b0bad [Improvement](statistics)Return meaningful error message when show column stats column name doesn't exist (#22458)
The error message was unhelpful when SHOW COLUMN STATS was given a column that does not exist:
```
MySQL [hive.tpch100]> show column stats `lineitem` (l_extendedpric);
ERROR 1105 (HY000): errCode = 2, detailMessage = Unexpected exception: null
```

This PR shows a meaningful message:
```
mysql> show column stats `lineitem` (l_extendedpric);
ERROR 1105 (HY000): errCode = 2, detailMessage = Column: l_extendedpric not exists
```
2023-08-03 16:35:14 +08:00
c63e3e6959 [fix](regression) fix test_table_level_compaction_policy
2023-08-03 15:24:17 +08:00
22344d6e4a [test](pipeline) exclude fail case (#22546)
exclude fail case
2023-08-03 15:18:26 +08:00
27f6e4649e [improvement](stats) Catch exception properly #22503
Catch exceptions instead of throwing to the caller directly, to avoid unexpected interruption of upper-level logic.
2023-08-03 15:16:55 +08:00
3961b8df76 [refactor](Nereids) mv top-n two phase read rule from post processor to rewriter (#22487)
Use three new plan nodes to represent deferred materialization of TopN.
Example:

```
-- SQL
select * from t1 order by c1 limit 10;

-- PLAN
+------------------------------------------+
| Explain String                           |
+------------------------------------------+
| PhysicalDeferMaterializeResultSink       |
| --PhysicalDeferMaterializeTopN           |
| ----PhysicalDistribute                   |
| ------PhysicalDeferMaterializeTopN       |
| --------PhysicalDeferMaterializeOlapScan |
+------------------------------------------+
```
2023-08-03 14:28:13 +08:00
4f9969ce1e [feature](show-frontends-disk) Add Show frontend disks (#22040)
Co-authored-by: yuxianbing <yuxianbing@yy.com>
Co-authored-by: yuxianbing <iloveqaz123>
2023-08-03 14:04:48 +08:00
4322fdc96d [feature](Nereids): add OR expansion in CBO (#22465) 2023-08-03 13:29:33 +08:00
85a95e206e [bugfix](profile) not output some variables correctly (#22537) 2023-08-03 13:17:02 +08:00
e670d84b72 [feature](executor) using max_instance_num to limit automatically instance (#22521) 2023-08-03 13:12:32 +08:00
596fd4d86d [improvement](file-scan) reduce the min size of file split (#22412)
Reduce the minimum file split size from 128MB to 8MB,
so that users can set `file_split_size` more flexibly.
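A rough sketch of what the lower floor allows (hypothetical Python, not the actual scanner code; names and constants other than the 8MB/128MB values are assumptions):

```python
MIN_FILE_SPLIT_SIZE = 8 * 1024 * 1024  # new floor; previously 128 MB

def make_splits(file_size: int, file_split_size: int):
    # Clamp the user-set split size to the minimum, then cut the
    # file into (offset, length) byte ranges.
    size = max(file_split_size, MIN_FILE_SPLIT_SIZE)
    return [(off, min(size, file_size - off))
            for off in range(0, file_size, size)]
```

With a 64MB file and `file_split_size` = 16MB this yields 4 splits; under the old 128MB floor the same file would have been a single split.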
2023-08-03 11:42:00 +08:00
f7755aa538 [exec](set_operation) Support one child node in set operation (#22463)
Support one child node in set operation
2023-08-03 10:35:59 +08:00
9f0a9e6fd6 [bug](distinct-agg) fix limit value not effective in some case (#22517)
fix limit value not effective in some case
2023-08-03 10:35:36 +08:00
fb644ad691 [improvement](stats) Add more logs and config options (#22436)
1. Add more logs and make error messages clearer
2. Sleep a while between analyze retries
3. Make the concurrency of sync analyze configurable
4. Ignore internal columns like the delete sign to save resources
2023-08-03 09:55:29 +08:00
205a0793e9 [fix](regression) fix flaky test test_partial_update_schema_change (#22500)
2023-08-03 09:32:48 +08:00
17f4776b0f [typo](docs) fix get start zh doc (#22524) 2023-08-02 23:32:07 +08:00
5aeea985e6 [typo](docs) Replace invalid mysql-connector-java download package. (#21954) 2023-08-02 22:58:08 +08:00
c2db01037a [refactor](config) rename segcompaction_max_threads (#22468) 2023-08-02 22:35:14 +08:00
938f768aba [fix](parquet) resolve offset check failed in parquet map type (#22510)
Fix an error when reading empty map values in Parquet: `offsets.back()` did not equal the number of elements in the map's key column.

### How does this happen
A map in Parquet is stored as a repeated group, and `repeated_parent_def_level` was set incorrectly when parsing the map node in the Parquet schema.
```
the map definition in parquet:
 optional group <name> (MAP) {
   repeated group map (MAP_KEY_VALUE) {
     required <type> key;
     optional <type> value;
   }
}
```

### How to fix
Set the `repeated_parent_def_level` of the key/value nodes to the definition level of the map node.

`repeated_parent_def_level` is the definition level of the first ancestor node whose `repetition_type` equals `REPEATED`. Empty array/map values are not stored in the Doris column, so we have to use `repeated_parent_def_level` to skip the empty or null values in the ancestor node.

For instance, consider an array of strings with 3 rows like the following:
`null, [], [a, b, c]`
We can store four elements in the data column: `null, a, b, c`,
with the offsets column: `1, 1, 4`
and the null map: `1, 0, 0`.
For the `i`-th row of the array column, the range from `offsets[i - 1]` to `offsets[i]` represents the elements of that row, so we can't store empty array/map values in the Doris data column. By comparison, Spark does not require `repeated_parent_def_level`, because a Spark column stores empty array/map values and uses another length column to indicate them. Please refer to: https://github.com/apache/spark/blob/master/sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/ParquetColumnVector.java

Furthermore, we can also avoid storing null array/map values in the Doris data column. For the same three rows as above, we can store only three elements in the data column: `a, b, c`,
with the offsets column: `0, 0, 3`
and the null map: `1, 0, 0`.
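The offsets/null-map layout described above can be sketched in Python (an illustration of the column layout only, not Doris code; the function name is an assumption):

```python
def rows_from_columns(data, offsets, null_map):
    # Reconstruct per-row values from a flat data column.
    # offsets[i] is the exclusive end of row i in the data column;
    # null_map[i] == 1 marks a null row.
    rows, prev = [], 0
    for off, is_null in zip(offsets, null_map):
        rows.append(None if is_null else data[prev:off])
        prev = off
    return rows
```

Both layouts from the message decode to the same rows `null, [], [a, b, c]`: the first stores a placeholder for the null row (`offsets = 1, 1, 4`), the second stores nothing for it (`offsets = 0, 0, 3`).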
2023-08-02 22:33:10 +08:00
Pxl
3d0d7a427b [Chore](brpc) display pool name when try offer failed (#22514) 2023-08-02 22:31:33 +08:00
876bd1c747 [typo](Docs) Capitalize the Title of Files in Data Operation - Import Category (#22456) 2023-08-02 22:17:15 +08:00
bbbefc4b6f [typo](docs) Capitalize and Rename Title of Files in Data Operation-Export (#22457) 2023-08-02 21:56:38 +08:00
d5bf00583f [typo](docs) Capitalize and Rename Table Design Files (#22453) 2023-08-02 21:51:58 +08:00
76108bac2f [typo](docs) Capitalize and Rename Title of Install and Deployment Files (#22451) 2023-08-02 21:51:36 +08:00
a7a5e14d52 [typo](docs) Merge Doris Introductions File into Getting Started Category (#22449) 2023-08-02 21:50:55 +08:00
ec4fc1f9ef [typo](doc) fix be java env faq (#22462) 2023-08-02 21:50:33 +08:00
57e0fa448c [typo](docs) Change the jdk version on the macOS to 11 (#22522) 2023-08-02 21:47:14 +08:00
e5028314bc [Feature](Job)Support scheduler job (#21916) 2023-08-02 21:34:43 +08:00
6f575cf4b3 [typo](doc)Add a description of whether one of the dynamic partitioning parameters must be required. (#22422) 2023-08-02 21:28:18 +08:00
2ff4e9d79d [typo](doc)modify some sql syntax description errors (#22420) 2023-08-02 21:28:02 +08:00
498c0124e8 [typo](doc)modify some sql syntax and example description errors (#22460) 2023-08-02 21:27:34 +08:00
9d3f1dcf44 [improvement](vectorized) Deserialized elements of count distinct aggregation directly inserted into target hashset (#21888)
The original logic first deserializes the ColumnString into a HashSet (inserting the deserialized elements into that hashset), and then copies all of its elements into the target HashSet during the merge phase.
After optimization, elements are inserted directly into the target HashSet while deserializing, thereby removing the unnecessary intermediate hashset inserts.

In one of our internal query tests, 30 hashsets were merged in the second aggregation phase (average cardinality 1,400,000), giving a cardinality of 42,000,000 after merging. After this optimization, MergeTime dropped from 5s965ms to 3s375ms.
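The change can be sketched as follows (hypothetical Python standing in for the C++ aggregation code; names are assumptions):

```python
def merge_via_temp(serialized_inputs):
    # Old path: deserialize each input into its own temporary set,
    # then copy every element into the target (two inserts per element).
    target = set()
    for blob in serialized_inputs:
        temp = set(blob)       # first insert, into the temporary set
        target.update(temp)    # second insert, into the target set
    return target

def merge_direct(serialized_inputs):
    # New path: insert each deserialized element straight into the
    # target set, skipping the temporary set entirely.
    target = set()
    for blob in serialized_inputs:
        for element in blob:
            target.add(element)
    return target
```

Both produce the same merged set; the second simply halves the number of hashset inserts per element.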
2023-08-02 21:19:56 +08:00
781c1d5238 [log](load) add debug logs for potential duplicate tablet ids (#22485) 2023-08-02 20:38:41 +08:00
3a787b6684 [improvement](regression) syncer regression test (#22490) 2023-08-02 20:09:27 +08:00
8cac8df40c [Fix](Planner) fix create view tosql not include partition (#22482)
Problem:
When creating a view that joins tables with explicit partitions, an error like "Unknown column" would arise.

Example:
CREATE VIEW my_view AS SELECT t1.* FROM t1 PARTITION(p1) JOIN t2 PARTITION(p2) ON t1.k1 = t2.k1;
select * from my_view ==> errCode = 2, detailMessage = Unknown column 'k1' in 't2'

Reason:
When creating a view, we run toSql first in order to persist the view SQL. When doing toSql of a table reference, the PARTITION
keyword was removed to keep the SQL string neat. But once the keyword is removed, the partition name is regarded as an alias,
so the "PARTITION" keyword cannot be removed.

Solution:
Add the "PARTITION" keyword back to the toSql string.
2023-08-02 20:04:59 +08:00
4d9f4c7a68 [typo](docs) Capitalize Title of Files in Data Operation - Update and Delete (#22459) 2023-08-02 19:16:11 +08:00
0cd5183556 [Refactor](inverted index) refactor tokenize function for inverted index (#22313) 2023-08-02 19:12:22 +08:00
4bc65aa921 [fix](load) fix PrefetchBufferedReader crash caused by updating a counter with an invalid runtime profile (#22464) 2023-08-02 18:19:48 +08:00