doris

Author	SHA1	Message	Date
zhannngchen	205a0793e9	[fix](regression) fix flaky test test_partial_update_schema_change (#22500 )	2023-08-03 09:32:48 +08:00
zy-kkk	17f4776b0f	[typo](docs) fix get start zh doc (#22524 )	2023-08-02 23:32:07 +08:00
ZhenchaoXu	5aeea985e6	[typo](docs) Replace invalid mysql-connector-java download package. (#21954 )	2023-08-02 22:58:08 +08:00
Kaijie Chen	c2db01037a	[refactor](config) rename segcompaction_max_threads (#22468 )	2023-08-02 22:35:14 +08:00
Ashin Gau	938f768aba	[fix](parquet) resolve offset check failed in parquet map type (#22510 ) Fix error when reading empty map values in parquet. The `offsets.back()` does not equal the number of elements in the map's key column. ### How does this happen A map in parquet is stored as a repeated group, and `repeated_parent_def_level` is set incorrectly when parsing the map node in the parquet schema. ``` the map definition in parquet: optional group <name> (MAP) { repeated group map (MAP_KEY_VALUE) { required <type> key; optional <type> value; } } ``` ### How to fix Set the `repeated_parent_def_level` of the key/value node to the definition level of the map node. `repeated_parent_def_level` is the definition level of the first ancestor node whose `repetition_type` equals `REPEATED`. Empty array/map values are not stored in the doris column, so we have to use `repeated_parent_def_level` to skip the empty or null values in the ancestor node. For instance, consider an array of strings with 3 rows like the following: `null, [], [a, b, c]` We can store four elements in the data column: `null, a, b, c` and the offsets column is: `1, 1, 4` and the null map is: `1, 0, 0` For the `i-th` row in the array column, the range from `offsets[i - 1]` until `offsets[i]` represents the elements in this row, so we can't store empty array/map values in the doris data column. As a comparison, spark does not require `repeated_parent_def_level`, because the spark column stores empty array/map values and uses another length column to indicate empty values. See: https://github.com/apache/spark/blob/master/sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/ParquetColumnVector.java Furthermore, we can also avoid storing null array/map values in the doris data column. For the same three rows as above, we can store only three elements in the data column: `a, b, c` and the offsets column is: `0, 0, 3` and the null map is: `1, 0, 0`	2023-08-02 22:33:10 +08:00
Pxl	3d0d7a427b	[Chore](brpc) display pool name when try offer failed (#22514 )	2023-08-02 22:31:33 +08:00
KassieZ	876bd1c747	[typo](Docs) Capitalize the Title of Files in Data Operation - Import Category (#22456 )	2023-08-02 22:17:15 +08:00
KassieZ	bbbefc4b6f	[typo](docs) Capitalize and Rename Title of Files in Data Operation-Export (#22457 )	2023-08-02 21:56:38 +08:00
KassieZ	d5bf00583f	[typo](docs) Capitalize and Rename Table Design Files (#22453 )	2023-08-02 21:51:58 +08:00
KassieZ	76108bac2f	[typo](docs) Capitalize and Rename Title of Install and Deployment Files (#22451 )	2023-08-02 21:51:36 +08:00
KassieZ	a7a5e14d52	[typo](docs) Merge Doris Introductions File into Getting Started Category (#22449 )	2023-08-02 21:50:55 +08:00
gnehil	ec4fc1f9ef	[typo](doc) fix be java env faq (#22462 )	2023-08-02 21:50:33 +08:00
zy-kkk	57e0fa448c	[typo](docs) Change the jdk version on the macOS to 11 (#22522 )	2023-08-02 21:47:14 +08:00
Calvin Kirs	e5028314bc	[Feature](Job)Support scheduler job (#21916 )	2023-08-02 21:34:43 +08:00
ZhenchaoXu	6f575cf4b3	[typo](doc)Add a description of whether one of the dynamic partitioning parameters must be required. (#22422 )	2023-08-02 21:28:18 +08:00
ZhenchaoXu	2ff4e9d79d	[typo](doc)modify some sql syntax description errors (#22420 )	2023-08-02 21:28:02 +08:00
ZhenchaoXu	498c0124e8	[typo](doc)modify some sql syntax and example description errors (#22460 )	2023-08-02 21:27:34 +08:00
ZenoYang	9d3f1dcf44	[improvement](vectorized) Deserialized elements of count distinct aggregation directly inserted into target hashset (#21888 ) The original logic first deserializes the ColumnString into a HashSet (inserting the deserialized elements into that hashset), and then, during the merge phase, traverses all of its elements and inserts them into the target HashSet. After optimization, elements are inserted directly into the target HashSet while deserializing, which removes the unnecessary hashset insert overhead. In one of our internal query tests, 30 hashsets were merged in the second-phase aggregation (the average cardinality was 1,400,000), and the cardinality after merging was 42,000,000. After optimization, the MergeTime dropped from 5s965ms to 3s375ms.	2023-08-02 21:19:56 +08:00
Kaijie Chen	781c1d5238	[log](load) add debug logs for potential duplicate tablet ids (#22485 )	2023-08-02 20:38:41 +08:00
DeadlineFen	3a787b6684	[improvement](regression) syncer regression test (#22490 )	2023-08-02 20:09:27 +08:00
LiBinfeng	8cac8df40c	[Fix](Planner) fix create view tosql not include partition (#22482 ) Problem: When creating a view with a join on table partitions, an error like "Unknown column" is raised. Example: CREATE VIEW my_view AS SELECT t1.* FROM t1 PARTITION(p1) JOIN t2 PARTITION(p2) ON t1.k1 = t2.k1; select * from my_view ==> errCode = 2, detailMessage = Unknown column 'k1' in 't2' Reason: When creating a view, we call tosql first in order to persist the view SQL. When generating the SQL for a table reference, the PARTITION keyword was removed to keep the SQL string neat. But once the keyword is removed, what remains is treated as an alias, so the "PARTITION" keyword cannot be removed. Solution: Add the "PARTITION" keyword back to the tosql string.	2023-08-02 20:04:59 +08:00
KassieZ	4d9f4c7a68	[typo](docs) Capitalize Title of Files in Data Operation - Update and Delete (#22459 )	2023-08-02 19:16:11 +08:00
airborne12	0cd5183556	[Refactor](inverted index) refactor tokenize function for inverted index (#22313 )	2023-08-02 19:12:22 +08:00
Jerry Hu	4bc65aa921	[fix](load) PrefetchBufferedReader crash caused by updating counter with an invalid runtime profile (#22464 )	2023-08-02 18:19:48 +08:00
Pxl	751a7680c5	[Bug](exchange) fix core dump on send_local_block (#22494 )	2023-08-02 18:12:34 +08:00
starocean999	527782f3d3	[fix](nereids) move RecomputeLogicalPropertiesProcessor rule before topn optimization (#22488 ) The topn optimization changes MutableState, so the RecomputeLogicalPropertiesProcessor rule needs to be moved before it.	2023-08-02 17:36:56 +08:00
zgxme	a4ef340777	[test](pipeline) adjust mem limit to 90 & exclude some cases (#22445 )	2023-08-02 15:11:22 +08:00
Mryange	ddd90855a9	[vectorized](udaf) java udaf support with map type (#22397 ) * test * remove some unused * update * add case	2023-08-02 15:03:44 +08:00
jakevin	16461fdc1c	[feature](Nereids): pushdown COUNT through join (#22455 )	2023-08-02 14:55:25 +08:00
amory	18692b2a7c	fixed (#22481 ) [FIX](array) fix array-dcheck-contains_null	2023-08-02 14:22:16 +08:00
wangbo	e2ed2e99e2	exclude workload group test default (#22483 )	2023-08-02 12:45:08 +08:00
Siyang Tang	e991f607d5	[fix](string-column) fix unescape length error (#22411 )	2023-08-02 12:18:05 +08:00
Pxl	f5e3cd2737	[Improvement](aggregation) optimization for aggregation hash_table_lazy_emplace (#22327 )	2023-08-02 11:50:21 +08:00
AKIRA	41f984bb39	[fix](fe) Fix stmt forward #22469 The call to String.format() contains an orphan %s that causes an error. Introduced in #21205	2023-08-02 10:34:04 +08:00
Xinyi Zou	bc87002028	[opt](conf) remote scanner thread num is changed to core num * 10 (#22427 )	2023-08-01 23:09:49 +08:00
Chenyang Sun	19d1f49fbe	[improvement](compaction) compaction policy and options in the properties of a table (#22461 )	2023-08-01 22:02:23 +08:00
starocean999	809f67e478	[fix](nereids)fix bug of cast expr to decimalv3 without any check (#22466 )	2023-08-01 21:59:47 +08:00
slothever	94dee833cd	[fix](multi-catalog)fix compatible with hdfs HA empty prefix (#22424 )	2023-08-01 21:48:16 +08:00
Mryange	bf50f9fa7f	[fix](decimal) fix cast rounding half up with negative number (#22450 )	2023-08-01 21:47:42 +08:00
qiye	b8399148ef	[fix](DOE) es catalog not working with pipeline,datetimev2, array and esquery (#22046 )	2023-08-01 21:45:16 +08:00
yiguolei	ff0fda460c	[be](parameter) change default fragment_pool_thread_num_max from 512 to 2048 (#22448 ) Change the default values of some parameters: brpc_num_threads from -1 to 256; compaction_task_num_per_disk from 2 to 4; compaction_task_num_per_fast_disk from 4 to 8; fragment_pool_thread_num_max from 512 to 2048; fragment_pool_queue_size from 2048 to 4096. Co-authored-by: yiguolei <yiguolei@gmail.com>	2023-08-01 20:33:41 +08:00
HHoflittlefish777	4d3e56e2e7	[fix][regression-test] change lazy open regression test name (#22404 )	2023-08-01 20:26:10 +08:00
wangbo	e93b2aadfc	Add regression test for workload group (#22452 )	2023-08-01 19:56:39 +08:00
minghong	d5d82b7c31	[stats](nereids) fix bug for avg-size (#22421 )	2023-08-01 17:13:00 +08:00
谢健	d4a6ef3f8c	[fix](Nereids) fix test framework of hypergraph (#22434 )	2023-08-01 16:20:07 +08:00
Mryange	f16a39aea1	[feature](time) using timev2 type to replace the old time type. (#22269 )	2023-08-01 15:59:07 +08:00
gnehil	58c62431d1	[typo](doc) illustrate storage policy is not supported on merge-on-write table (#22194 )	2023-08-01 15:57:51 +08:00
Pxl	8d16f1bb09	[Chore](materialized-view) update documentation about materialized-view and update test (#22350 )	2023-08-01 15:13:34 +08:00
huanghaibin	43d783ae21	[fix](vertical compaction) compaction block reader should return error when reading next block failed (#22431 )	2023-08-01 14:09:18 +08:00
Xin Liao	f842067354	[fix](merge-on-write) fix duplicate keys occur when be restart (#22437 ) For a merge-on-write (mow) table, the delete bitmap of stale rowsets is not persisted. After a BE restart, duplicate keys occur if stale rowsets are read. Therefore, for mow tables, reading stale rowsets is not allowed. Although this may result in a VERSION_ALREADY_MERGED error when querying after a BE restart, the probability of that happening is relatively low.	2023-08-01 14:07:04 +08:00
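
Note on commit 938f768aba: the offsets / null map layout described in that message can be sketched in a few lines. This is a minimal Python illustration, not Doris code; the function names `encode_with_null_slot`, `encode_without_null_slot` and `decode` are hypothetical and exist only for this sketch. It reproduces the two layouts from the commit message for the rows `null, [], [a, b, c]`.

```python
from typing import List, Optional, Tuple

Row = Optional[List[str]]
Encoded = Tuple[list, List[int], List[int]]


def encode_with_null_slot(rows: List[Row]) -> Encoded:
    """First layout in the commit message: a null row takes one placeholder slot."""
    data, offsets, null_map = [], [], []
    for row in rows:
        if row is None:
            data.append(None)      # placeholder element for the null row
            null_map.append(1)
        else:
            data.extend(row)       # an empty row adds nothing to the data column
            null_map.append(0)
        offsets.append(len(data))  # offsets[i] is the end position of row i
    return data, offsets, null_map


def encode_without_null_slot(rows: List[Row]) -> Encoded:
    """Second layout: null rows store nothing in the data column."""
    data, offsets, null_map = [], [], []
    for row in rows:
        if row is None:
            null_map.append(1)
        else:
            data.extend(row)
            null_map.append(0)
        offsets.append(len(data))
    return data, offsets, null_map


def decode(data: list, offsets: List[int], null_map: List[int]) -> List[Row]:
    """Row i spans data[offsets[i - 1]:offsets[i]], with an implicit 0 before row 0."""
    rows, start = [], 0
    for end, is_null in zip(offsets, null_map):
        rows.append(None if is_null else data[start:end])
        start = end
    return rows


rows = [None, [], ["a", "b", "c"]]
layout_a = encode_with_null_slot(rows)
layout_b = encode_without_null_slot(rows)
```

Both layouts decode back to the same rows; the empty row is visible only as two equal consecutive offsets, which is why empty values never appear in the data column.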

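Note on commit 9d3f1dcf44: the change described there (inserting deserialized elements straight into the target hashset instead of going through a temporary one) can be illustrated with a toy model. The sketch below is plain Python rather than the Doris C++ implementation; the function names and the list-of-lists stand-in for serialized states are assumptions made for illustration. It counts set insertions to show where the saving comes from.

```python
from typing import Iterable, List, Set


def merge_via_temporary_sets(target: Set[int], states: Iterable[List[int]]) -> int:
    """Old flow: deserialize each state into its own set, then merge it into target."""
    inserts = 0
    for payload in states:
        temp: Set[int] = set()
        for item in payload:   # first insert: into the temporary set
            temp.add(item)
            inserts += 1
        for item in temp:      # second insert: into the target set
            target.add(item)
            inserts += 1
    return inserts


def merge_directly(target: Set[int], states: Iterable[List[int]]) -> int:
    """New flow: insert each deserialized element straight into target."""
    inserts = 0
    for payload in states:
        for item in payload:
            target.add(item)
            inserts += 1
    return inserts


# Hypothetical partial states of a count distinct aggregation.
states = [[1, 2, 3], [3, 4], [4, 5, 6]]
old_target: Set[int] = set()
new_target: Set[int] = set()
old_inserts = merge_via_temporary_sets(old_target, states)
new_inserts = merge_directly(new_target, states)
```

The two flows produce the same distinct set, but the direct flow performs half as many insertions in this toy case. The timings quoted in the commit message come from the author's own test and are not reproduced by this sketch.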

12295 Commits