doris

Author	SHA1	Message	Date
ShowCode	f565f60bc3	[refactor](standard)BE:Initialize pointer variables in the class to nullptr by default (#27587 )	2023-11-28 13:02:30 +08:00
yiguolei	6ed0be8e3c	[refactor](profilev2) unify the counter name in shuffle operator and normal operator (#27267 ) Use blocksproduced and rowsproduced to unify the counter names in DataStreamSender and other exec nodes, and in the exchange operator and other operators. Blocks produced and rows produced are easier to understand. --------- Co-authored-by: yiguolei <yiguolei@gmail.com>	2023-11-20 14:21:39 +08:00
yiguolei	836cda65d8	[refactor](profilev2) split merged profile to a single runtime profile to make the logic more clear (#27184 )	2023-11-19 13:21:50 +08:00
amory	2f41e0c823	[FIX](complextype)fix information schema for complex type (#27203 ) Previously, selecting from information_schema did not show complex type information.	2023-11-18 11:32:32 +08:00
Kaijie Chen	e29d8cb110	[feature](move-memtable) support pipelineX in sink v2 (#27067 )	2023-11-16 15:00:55 +08:00
caiconghui	83edcdead9	[enhancement](random_sink) change tablet search algorithm from random to round-robin for random distribution table (#26611 ) 1. Fix a race condition when getting the tablet load index. 2. Change the tablet search algorithm from random to round-robin for random distribution tables when load_to_single_tablet is set to false.	2023-11-15 19:55:31 +08:00
zhiqiang	d3fd923447	[opt](pipeline) Return InternalError to FE instead of doing a useless DCHECK in ExecNode #27035 Effect: the client will see an error message like the one below when BE meets a plan logical error. ERROR 1105 (HY000): errCode = 2, detailMessage = ([xxx]())[CANCELLED]Logical error during processing VNewOlapScanNode(dr_case_tag), output of projections 2 mismatches with exec node output 3	2023-11-15 18:15:21 +08:00
zhiqiang	a5565f68b2	[Refactor](opentelemetry) Remove opentelemetry (#26605 )	2023-11-09 18:05:34 +08:00
daidai	baae7bf339	[fix](information_schema)fix bug that metadata_name_ids error tableid and append information_schema case. (#26238 ) Fixes a bug from #24059 . Added information_schema scanner tests for: files, schema_privileges, table_privileges, partitions, rowsets, statistics, table_constraints. With infodb_support_ext_catalog=false, the tests now cover all tables under the information_schema database.	2023-11-09 14:07:12 +08:00
amory	95f74f1544	[FIX](complextype)fix shrink in topN for complex type #26609	2023-11-09 10:56:14 +08:00
TengJianPing	a3666aa87e	[feature](decimal) support decimal256 when creating table (#26308 )	2023-11-08 15:21:01 +08:00
zclllyybb	16644eff7f	[opt](load) optimize the performance of row distribution (#25546 ) For non-pipeline, non-sinkv2: before 14s, now 6s. For pipeline + sinkv2 (48 instances): before 230ms, now 38ms.	2023-11-07 10:04:59 +08:00
Gabriel	dd8bcc831c	[keyword](decimalv2) Add DecimalV2 keyword (#26283 )	2023-11-02 16:27:12 +08:00
Mingyu Chen	e20cab64f4	[improvement](scan) avoid too many scanners for file scan node (#25727 ) Previously, when using a file scan node (e.g., querying a Hive table), the max number of scanners for each scan node was `doris_scanner_thread_pool_thread_num` (default is 48). If the query parallelism is N, the total number of scanners would be 48 * N, which is too many. This PR changes the logic so that the max number of scanners for each scan node is `doris_scanner_thread_pool_thread_num / query parallelism`, so the total number of scanners is at most `doris_scanner_thread_pool_thread_num`. Reducing the number of scanners can significantly reduce the memory usage of a query.	2023-10-29 17:41:31 +08:00
zhiqiang	d6c64d305f	[chore](log) Add log to trace query execution #25739	2023-10-26 14:09:25 +08:00
TengJianPing	693982fd1a	[feature](decimal) support decimal256 (#25386 )	2023-10-25 15:47:51 +08:00
Mryange	6b2eed779c	[feature](AuditLog) add scanRows scanBytes in auditlog (#25435 )	2023-10-25 10:00:35 +08:00
Pxl	2972daaed9	[Bug](status) process error status on es_scroll_parser and compaction_action (#25745 )	2023-10-24 15:51:01 +08:00
Kaijie Chen	a4c9beba85	[fix](move-memtable) fallback if partial update (#25801 )	2023-10-24 10:29:59 +08:00
daidai	0e0f8090f7	[refactor](text_convert)Use serde to replace text_convert. (#25543 ) Remove text_convert and use serde to replace it.	2023-10-24 09:52:43 +08:00
Jerry Hu	b5ee4a9dbb	[enhancement](profilev2) add some fields for profile v2 (#25611 ) Add 3 counters for ExecNode: ExecTime - Total execution time(excluding the execution time of children). OutputBytes - The total number of bytes output to parent. BlockCount - The total count of blocks output to parent.	2023-10-23 15:55:40 +08:00
Pxl	2e2d5bcba2	[Improvements](status) catch some error status (#25677 )	2023-10-23 10:19:08 +08:00
Pxl	642c149e6a	remove datetime_value and move vecdatetime_value to doris namespace (#25695 )	2023-10-20 22:08:17 +08:00
Kaijie Chen	2e97044706	[fix](move-memtable) fix inverted index condition (#25684 )	2023-10-20 17:37:39 +08:00
meiyi	d0cd535cb9	[improvement](insert) refactor group commit stream load (#25560 )	2023-10-20 13:27:30 +08:00
Lightman	159be51ea6	[bugfix](schema_change) Fix the coredump when doubly write during schema change (#22557 )	2023-10-19 14:43:18 +08:00
Kaijie Chen	11fecafb74	[fix](move-memtable) fallback if target table contains inverted index (#25498 )	2023-10-18 22:11:59 +08:00
lihangyu	b0e0a0569a	[Fix](row store) Real default value should be used instead of default… (#25230 ) Before this PR the default value was incorrect; the default value from the Frontend schema should be used.	2023-10-18 10:13:44 +08:00
qiye	b2e3ecb81d	[opt](load)change `load_to_single_tablet` tablet search algorithm from random to round-robin (#25256 ) At present, the `load_to_single_tablet` import implementation uses a simple random-number remainder, which cannot achieve true averaging. This leads to uneven disk IO and uneven use of cluster resources. To solve this problem, we implement round-robin over the tablets of each partition on every import, in order to spread the load evenly across tablets. When generating the load query plan, the tablet index record currently imported is passed to BE. Add a daemon task in FE to regularly clean up the `loadTabletRecordMap`. The map will get the bucket_number of the partition and update the `load_tablet_index` when `getCurrentLoadTabletIndex` is called.	2023-10-16 16:43:25 +08:00
Guangdong Liu	2014e16cfb	[fix](es catalog)fix es http timeout (#25273 )	2023-10-12 10:21:55 +08:00
qiye	7e9ffad933	[fix](ES catalog)Doris cannot parse ES date field without time zone (#24864 ) 1. Add support for Doris to parse ES date fields without time zone info, e.g. `2023-04-17T23:01:18.151`. Such a time is treated as UTC, since ES assumes UTC for time fields without a time zone. 2. Change the local time zone conversion from the system local time zone to the session variable time zone.	2023-10-08 19:28:08 +08:00
Mryange	0631ed61b0	[feature](profilev2) Preliminary support for profilev2. (#24881 ) You can set the level of counters on the backend using ADD_COUNTER_WITH_LEVEL/ADD_TIMER_WITH_LEVEL. The profile can then merge counters with level 1. set profile_level = 1; such as sql select count() from customer join item on c_customer_sk = i_item_sk profile Simple profile PLAN FRAGMENT 0 OUTPUT EXPRS: count() PARTITION: UNPARTITIONED VRESULT SINK MYSQL_PROTOCAL 7:VAGGREGATE (merge finalize) \| output: count(partial_count())[#44] \| group by: \| cardinality=1 \| TotalTime: avg 725.608us, max 725.608us, min 725.608us \| RowsReturned: 1 \| 6:VEXCHANGE offset: 0 TotalTime: avg 52.411us, max 52.411us, min 52.411us RowsReturned: 8 PLAN FRAGMENT 1 PARTITION: HASH_PARTITIONED: c_customer_sk STREAM DATA SINK EXCHANGE ID: 06 UNPARTITIONED TotalTime: avg 106.263us, max 118.38us, min 81.403us BlocksSent: 8 5:VAGGREGATE (update serialize) \| output: partial_count()[#43] \| group by: \| cardinality=1 \| TotalTime: avg 679.296us, max 739.395us, min 554.904us \| BuildTime: avg 33.198us, max 48.387us, min 28.880us \| ExecTime: avg 27.633us, max 40.278us, min 24.537us \| RowsReturned: 8 \| 4:VHASH JOIN \| join op: INNER JOIN(PARTITIONED)[] \| equal join conjunct: c_customer_sk = i_item_sk \| runtime filters: RF000[bloom] <- i_item_sk(18000/16384/1048576) \| cardinality=17,740 \| vec output tuple id: 3 \| vIntermediate tuple ids: 2 \| hash output slot ids: 22 \| RowsReturned: 18.0K (18000) \| ProbeRows: 18.0K (18000) \| ProbeTime: avg 862.308us, max 1.576ms, min 666.28us \| BuildRows: 18.0K (18000) \| BuildTime: avg 3.8ms, max 3.860ms, min 2.317ms \| \|----1:VEXCHANGE \| offset: 0 \| TotalTime: avg 48.822us, max 67.459us, min 30.380us \| RowsReturned: 18.0K (18000) \| 3:VEXCHANGE offset: 0 TotalTime: avg 33.162us, max 39.480us, min 28.854us RowsReturned: 18.0K (18000) PLAN FRAGMENT 2 PARTITION: HASH_PARTITIONED: c_customer_id STREAM DATA SINK EXCHANGE ID: 03 HASH_PARTITIONED: c_customer_sk 
TotalTime: avg 753.954us, max 1.210ms, min 499.470us BlocksSent: 64 2:VOlapScanNode TABLE: default_cluster:tpcds.customer(customer), PREAGGREGATION: ON runtime filters: RF000[bloom] -> c_customer_sk partitions=1/1, tablets=12/12, tabletList=1550745,1550747,1550749 ... cardinality=100000, avgRowSize=0.0, numNodes=1 pushAggOp=NONE TotalTime: avg 18.417us, max 41.319us, min 10.189us RowsReturned: 18.0K (18000) --------- Co-authored-by: yiguolei <676222867@qq.com>	2023-10-07 11:16:53 +08:00
bobhan1	642e5cdb69	[Fix](Status) Make `Status` `[[nodiscard]]` and handle returned `Status` correctly (#23395 )	2023-09-29 22:38:52 +08:00
Lijia Liu	864a0f9bcb	[opt](pipeline) Make pipeline fragment context send_report asynchronized (#23142 )	2023-09-28 17:55:53 +08:00
Gabriel	3b4d8b4ac8	[pipelineX](feature) Support schema scan operator (#24850 )	2023-09-25 14:42:25 +08:00
lihangyu	ac55d45f79	[Fix](topn opt) fix heap use after free when shrink in fetch phase (#24774 )	2023-09-22 19:48:05 +08:00
zhiqqqq	09e03247ec	[chore](readability) Better readability of ExecNode.cpp #24733	2023-09-22 08:54:57 +08:00
bobhan1	58ab25ccaa	Revert "[Feature](merge-on-write)Support ignore mode for merge-on-write unique table (#21773 )" (#24731 ) This reverts commit 3ee89aea35726197cb7e94bb4f2c36bc9d50da84.	2023-09-21 21:01:28 +08:00
HappenLee	dc9fa1a4f1	[Refactor](Sink) convert to tablet sink to tablet writer (#24474 )	2023-09-20 14:47:18 +08:00
plat1ko	b9ddcbf729	[feature](merge-cloud) Rewrite code related to IOContext (#24269 )	2023-09-15 19:57:58 +08:00
bobhan1	3ee89aea35	[Feature](merge-on-write)Support ignore mode for merge-on-write unique table (#21773 )	2023-09-14 18:03:51 +08:00
qiye	11afd321cb	[fix](es catalog) fix issue with select and insert from es catalog core (#24318 ) Issue Number: close #24315 The root cause of this issue is that Elasticsearch's long type allows inserting floats and strings. Doris did not handle these cases when doing type conversion. The current strategy is to take the integer before the decimal point if a float or string is found.	2023-09-13 23:07:31 +08:00
daidai	e30c3f3a65	[fix](csv_reader)fix bug that Read garbled files caused be crash. (#24164 ) Fix a BE crash caused by reading garbled files.	2023-09-13 14:12:55 +08:00
zclllyybb	d3f1388717	[Feature](partitions) Support auto-partition (#24153 ) Co-authored-by: zhangstar333 <2561612514@qq.com>	2023-09-12 15:23:15 +08:00
meiyi	82dc970916	[feature](insert) Support group commit insert (#22829 )	2023-09-08 15:51:03 +08:00
zclllyybb	fdb7a44f57	Revert "[Feature](partitions) Support auto partition" (#24024 ) * Revert "[Feature](partitions) Support auto partition (#23236)" This reverts commit 6c544dd2011d731b8c9c51384c77bcf19c017981. * Update config.h	2023-09-07 17:08:26 +08:00
zclllyybb	6c544dd201	[Feature](partitions) Support auto partition (#23236 ) Co-authored-by: zhangstar333 <2561612514@qq.com>	2023-09-06 16:26:45 +08:00
HappenLee	c74ca15753	[pipeline](sink) Support Async Writer Sink of result file sink and memory scratch sink (#23589 )	2023-08-31 22:44:25 +08:00
daidai	e680d42fe7	[feature](information_schema)add metadata_name_ids for quickly get catalogs,db,table and add profiling table in order to Compatible with mysql (#22702 ) Add information_schema.metadata_name_ids to quickly get catalogs, databases, and tables. 1. Table structure: ```mysql mysql> desc internal.information_schema.metadata_name_ids; +---------------+--------------+------+-------+---------+-------+ \| Field \| Type \| Null \| Key \| Default \| Extra \| +---------------+--------------+------+-------+---------+-------+ \| CATALOG_ID \| BIGINT \| Yes \| false \| NULL \| \| \| CATALOG_NAME \| VARCHAR(512) \| Yes \| false \| NULL \| \| \| DATABASE_ID \| BIGINT \| Yes \| false \| NULL \| \| \| DATABASE_NAME \| VARCHAR(64) \| Yes \| false \| NULL \| \| \| TABLE_ID \| BIGINT \| Yes \| false \| NULL \| \| \| TABLE_NAME \| VARCHAR(64) \| Yes \| false \| NULL \| \| +---------------+--------------+------+-------+---------+-------+ 6 rows in set (0.00 sec) mysql> select * from internal.information_schema.metadata_name_ids where CATALOG_NAME="hive1" limit 1 \G; ************************* 1. row ************************* CATALOG_ID: 113008 CATALOG_NAME: hive1 DATABASE_ID: 113042 DATABASE_NAME: ssb1_parquet TABLE_ID: 114009 TABLE_NAME: dates 1 row in set (0.07 sec) ``` 2. When you create / drop a catalog, there is no need to refresh the catalog. ```mysql mysql> select count() from internal.information_schema.metadata_name_ids\G; ************************ 1. row ************************* count(): 21301 1 row in set (0.34 sec) mysql> drop catalog hive2; Query OK, 0 rows affected (0.01 sec) mysql> select count() from internal.information_schema.metadata_name_ids\G; ************************* 1. row ************************* count(): 10665 1 row in set (0.04 sec) mysql> create catalog hive3 ... mysql> select count() from internal.information_schema.metadata_name_ids\G; ************************* 1. row ************************* count(): 21301 1 row in set (0.32 sec) ``` 3. 
create / drop table , need not refresh catalog . ```mysql mysql> CREATE TABLE IF NOT EXISTS demo.example_tbl ... ; mysql> select count() from internal.information_schema.metadata_name_ids\G; ************************* 1. row ************************* count(): 10666 1 row in set (0.04 sec) mysql> drop table demo.example_tbl; Query OK, 0 rows affected (0.01 sec) mysql> select count() from internal.information_schema.metadata_name_ids\G; ************************* 1. row ************************* count(): 10665 1 row in set (0.04 sec) ``` 4. you can set query time , prevent queries from taking too long . ``` fe.conf : query_metadata_name_ids_timeout the time used to obtain all tables in one database ``` 5. add information_schema.profiling in order to Compatible with mysql ```mysql mysql> select from information_schema.profiling; Empty set (0.07 sec) mysql> set profiling=1; Query OK, 0 rows affected (0.01 sec) ```	2023-08-31 21:22:26 +08:00
TengJianPing	62c075bf7e	[improvement](Block) Replace Block(const PBlock&) with deserialize because it has heavy operations in ctor (#23672 )	2023-08-31 14:44:17 +08:00
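
The scanner limit described in the e20cab64f4 entry above (#25727) comes down to simple arithmetic, sketched below in Python. This is only an illustration of the numbers in that commit message, not the BE implementation: the function names are hypothetical, and the use of integer division with a floor of one scanner per node is an assumption the message does not state.

```python
# Illustrative sketch of the scanner cap described in commit e20cab64f4 (#25727).
# All names here are hypothetical; this is not Doris BE code.

DEFAULT_THREAD_POOL_SIZE = 48  # default of doris_scanner_thread_pool_thread_num per the commit message


def scanners_per_node(thread_pool_size: int, parallelism: int) -> int:
    """Per-scan-node scanner cap after the change: pool size / query parallelism.

    Assumption: integer division, and at least one scanner per node.
    """
    if thread_pool_size < 1 or parallelism < 1:
        raise ValueError("thread_pool_size and parallelism must be positive")
    return max(1, thread_pool_size // parallelism)


def total_scanners_before(thread_pool_size: int, parallelism: int) -> int:
    """Old behaviour: every scan node could use the whole pool size."""
    return thread_pool_size * parallelism


def total_scanners_after(thread_pool_size: int, parallelism: int) -> int:
    """New behaviour: the per-node cap times the number of parallel scan nodes."""
    return scanners_per_node(thread_pool_size, parallelism) * parallelism


if __name__ == "__main__":
    for n in (1, 4, 8, 48):
        print(
            f"parallelism={n}: before={total_scanners_before(DEFAULT_THREAD_POOL_SIZE, n)}, "
            f"per-node cap={scanners_per_node(DEFAULT_THREAD_POOL_SIZE, n)}, "
            f"after={total_scanners_after(DEFAULT_THREAD_POOL_SIZE, n)}"
        )
```

With the default pool size of 48 and a parallelism of 8, the old logic allows up to 384 scanners in total, while the sketch caps each node at 6 so the total stays at 48.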


960 Commits