Commit Graph

1001 Commits

ef2151ae66 [Feature-WIP](multi-catalog) Add Hive sink on BE side. (#32306) (#32364)
bp #32306
Co-authored-by: Qi Chen <kaka11.chen@gmail.com>
2024-03-18 11:23:01 +08:00
83ab61ad22 Add QUEUE_START_TIME/QUEUE_END_TIME/QUERY_STATUS column for active_queries (#32259) 2024-03-16 20:53:46 +08:00
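A minimal sketch of inspecting the new columns, assuming `active_queries` is exposed under `information_schema` as the related commits below suggest (only the columns named in the commit title are used; nothing else is assumed):

```
-- Queue timing and status of currently running queries.
SELECT QUEUE_START_TIME, QUEUE_END_TIME, QUERY_STATUS
FROM information_schema.active_queries;
```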
258dcfca97 [Refactor](executor)Add information_schema.workload_groups (#32195) (#32314) 2024-03-15 20:46:54 +08:00
df5ec16d7c [Refactor](executor) Add schema table active_queries (#32057)
* Add schema table active_queries
2024-03-15 17:57:28 +08:00
b031c95324 [Opt](exec) use libbase64 to replace base64 code in doris (#32078)
* [Opt](exec) use libbase64 to replace base64 code in doris
2024-03-14 09:20:50 +08:00
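The swap is internal to the BE; a sketch of exercising it from SQL through the existing base64 functions (behaviour should be unchanged, only faster):

```
-- Round-trip through the functions whose BE implementation moved to libbase64.
SELECT to_base64('doris');                 -- 'ZG9yaXM='
SELECT from_base64(to_base64('doris'));    -- 'doris'
```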
c5390d00bb [Improvement]Add schema table backend_active_tasks (#31945) 2024-03-09 19:55:48 +08:00
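A sketch of reading the new table (the commit does not list its columns, so none are assumed):

```
-- Per-BE view of currently running tasks.
SELECT * FROM information_schema.backend_active_tasks;
```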
7c30cb20fd [Fix](partial update) Fix partial update load failure when schema includes auto-increment column (#31725)
Problem:
When partially updating columns without specifying the auto-increment column, and the imported data contains new keys, an error occurs stating that the auto-increment column could not be found.

Reason:
The logic for partial column updates does not account for new keys in auto-increment columns. Since auto-increment columns can be generated by the system, it's possible to omit this column data during import. However, partial column updates treat this as a regular column, expecting it to be nullable or have a default value for automatic filling, overlooking the fact that auto-increment columns can also be auto-filled. This oversight leads to the error.

Solution:
Incorporate a check for auto-increment columns into the partial column update logic, and include the logic for generating auto-increment column values in the process of completing partial updates.
2024-03-06 13:06:27 +08:00
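A hedged sketch of the failing scenario, with illustrative table/column names and the commonly documented partial-update session variable:

```
-- Merge-on-write unique-key table with an auto-increment column.
CREATE TABLE t (
    k  BIGINT NOT NULL,
    id BIGINT NOT NULL AUTO_INCREMENT,
    v  INT
)
UNIQUE KEY(k)
DISTRIBUTED BY HASH(k) BUCKETS 1
PROPERTIES ("enable_unique_key_merge_on_write" = "true", "replication_num" = "1");

-- Partial update that omits `id`; before this fix, a brand-new key here
-- failed with "auto-increment column could not be found".
SET enable_unique_key_partial_update = true;
INSERT INTO t (k, v) VALUES (1, 10);
```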
3777ffb43f [enhancement](nereids)support null partition for list partition (#31613) 2024-03-06 13:05:22 +08:00
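A sketch of the newly allowed null partition (names illustrative; grammar hedged against the usual Doris list-partition syntax):

```
CREATE TABLE t_list (
    c1 INT,
    v  INT
)
DUPLICATE KEY(c1)
PARTITION BY LIST (c1) (
    PARTITION p1     VALUES IN ("1", "2"),
    PARTITION p_null VALUES IN (NULL)   -- null partition, newly supported
)
DISTRIBUTED BY HASH(c1) BUCKETS 1;
```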
d8b9909675 [Fix](Status) Handle returned Status correctly #31434 2024-03-01 04:25:43 +08:00
92e3b31f50 [feature](invert index) match_phrase_edge feature added (#31142) 2024-02-29 19:51:18 +08:00
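A sketch of the new predicate, modeled on the match_phrase_prefix example that appears later in this log (table/column names illustrative):

```
-- match_phrase_edge lets the first and last tokens of the phrase be partial,
-- e.g. 'ogle sea' can hit rows containing 'google search'.
SELECT count() FROM test_index_match_phrase_edge
WHERE request MATCH_PHRASE_EDGE 'ogle sea';
```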
b177b26d39 [branch-2.1](tracing) Pick pipeline tracing and related bugfixes (#31367)
* [Feature](pipeline) Trace pipeline scheduling (part I) (#31027)

* [fix](compile) Fix performance compile fail #31305

* [fix](compile) Fix macOS compilation issues for PURE macro and CPU core identification (#31357)

* [fix](compile) Correct PURE macro definition to fix compilation on macOS

---------

Co-authored-by: zy-kkk <zhongyk10@gmail.com>
2024-02-29 08:42:35 +08:00
8fc9d80479 [compatibility](MySQL) update charset to utf8mb4, collation to utf8mb4_0900_bin (#31046)
Doris's behaviour is closer to utf8mb4 and utf8mb4_0900_bin than to utf8 and utf8_general_ci.
2024-02-21 17:01:39 +08:00
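A quick check of the advertised defaults after this change (standard MySQL-protocol introspection; expected values per the commit):

```
SHOW VARIABLES LIKE 'character_set%';   -- expected: utf8mb4
SHOW VARIABLES LIKE 'collation%';       -- expected: utf8mb4_0900_bin
```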
eaaab33f0a [Fix](Top-N opt) evict querying rowsets prior to correcting use_count (#102) (#30904)
This addresses the scenario where a rowset cannot be removed.
2024-02-16 10:16:40 +08:00
0d32aeeaf6 [improvement](load) Enable lzo & Remove dependency on Markus F.X.J. Oberhumer's lzo library (#30573)
Issue Number: close #29406

1. Raise the supported lzop version to 0x1040.
    It is set to 0x1040 only for decompressing lzo files compressed by higher versions of lzop;
    the decompression logic itself is unchanged.
    Strictly, 0x1040 should have the "F_H_FILTER" feature,
    but that is mainly for audio and image data, so we do not support it.
2. Use orc::lzoDecompress() instead of lzo1x_decompress_safe() to decompress lzo data.
3. Use crc32c::Extend() instead of lzo_crc32().
4. Use olap_adler32() instead of lzo_adler32().
5. Thus, the dependency on Markus F.X.J. Oberhumer's lzo library is removed.
6. Remove DORIS_WITH_LZO, so lzo files are supported by stream and broker load by default.
7. Add some regression tests.
2024-02-05 22:00:24 +08:00
1ac5b45180 [fix](invert index) fixed insufficient index (idx) file generation during partial column updates. (#30678) 2024-02-01 19:01:08 +08:00
221308f78a [fix](datatype) fix bugs for IPv4/v6 datatype and add some basic regression test cases (#30261) 2024-01-31 23:53:39 +08:00
0433b8730d [Feature](profile)add shuffle send rows/bytes #30456 2024-01-28 18:25:08 +08:00
d191809372 [fix](pipeline) Fix non-prepared execute of UnionOperator (#30355) 2024-01-27 09:11:44 +08:00
ce5ba61640 [refactor](close)Full refactor async writer (#30082)
---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-01-23 13:22:15 +08:00
24ed3e4103 [Fix](Expr&code-style) check prepare&open before every VExpr execute (#26673) 2024-01-23 10:09:54 +08:00
1b1e088e83 [fix](exec_node) fix crash caused by cancelled query in ExecNode (#30192) 2024-01-23 10:09:54 +08:00
f66f6b2a82 [refactor](close) refactor is_pending_finish logic and close logic to close more quickly (#30021) 2024-01-23 10:06:05 +08:00
9e30a67a2a [Improve](topn opt) avoid crash when rpc returned row contains duplicated row entry (#29872)
1. Add more info to trace potential bug and avoid crash
2. use correct permutation size to do `column->permute`
2024-01-16 18:40:31 +08:00
ebfbe0c8dd [opt](information_schema) support information_schema in external catalog (#28919)
Add an `information_schema` database for all catalogs.
This is useful when using BI tools to connect to Doris:
the tools can get meta info from `information_schema`.

This PR mainly changes:

1. There will be an `information_schema` db in each catalog.
2. Each `information_schema` db only stores the meta info of the catalog it belongs to.
3. For `information_schema`, the `TABLE_SCHEMA` column's value is the database name.
4. There is a new global variable `show_full_dbname_in_info_schema_db`, default false. If set to true,
    the `TABLE_SCHEMA` column's value is formatted like `ctl.db`, because:

	when connecting to Doris, the `database` info in the connection url will be `xxx?db=ctl.db`,

	and some BI tools will then query `information_schema` with sql like:

	`select * from information_schema.columns where TABLE_SCHEMA = "ctl.db"`

	so it has to be formatted as `ctl.db`.
	
	For example, the `information_schema.columns` table in the external catalog `doris` looks like:
	
	```
	mysql> select * from information_schema.columns limit 1\G
	*************************** 1. row ***************************
	           TABLE_CATALOG: doris
	            TABLE_SCHEMA: doris.__internal_schema
	              TABLE_NAME: column_statistics
	             COLUMN_NAME: id
	        ORDINAL_POSITION: 1
	          COLUMN_DEFAULT: NULL
	             IS_NULLABLE: NO
	               DATA_TYPE: varchar
	CHARACTER_MAXIMUM_LENGTH: 4096
	  CHARACTER_OCTET_LENGTH: 16384
	       NUMERIC_PRECISION: NULL
	           NUMERIC_SCALE: NULL
	      DATETIME_PRECISION: NULL
	      CHARACTER_SET_NAME: NULL
	          COLLATION_NAME: NULL
	             COLUMN_TYPE: varchar(4096)
	              COLUMN_KEY:
	                   EXTRA:
	              PRIVILEGES:
	          COLUMN_COMMENT:
	             COLUMN_SIZE: 4096
	          DECIMAL_DIGITS: NULL
	   GENERATION_EXPRESSION: NULL
	                  SRS_ID: NULL
	```
	
5. Modify the behavior of

	- show tables
	- show databases
	- show columns
	- show table status

	The above statements may query the `information_schema` db if there is a `where` predicate after them.
2024-01-12 13:58:19 +08:00
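A sketch of point 4 above in action (catalog/db names taken from the commit's own example):

```
-- Default (false): TABLE_SCHEMA holds the bare database name.
SET GLOBAL show_full_dbname_in_info_schema_db = true;

-- Now TABLE_SCHEMA is formatted as `ctl.db`, matching BI connection urls.
SELECT TABLE_SCHEMA, TABLE_NAME
FROM information_schema.columns
WHERE TABLE_SCHEMA = "doris.__internal_schema"
LIMIT 1;
```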
0d691c638b [Feature](profile)Support report runtime workload statistics #29591 2024-01-12 11:59:27 +08:00
bd8113f424 [bugfix](scannerscheduler) should decrement num_of_scanners before checking whether to schedule #28926 (#29331)
---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-01-03 20:47:35 +08:00
2c4e52e44e [fix](es catalog) only es_query function can push down to ES (#29320)
Issue Number: close #29318 
1. Only push down the `es_query` function to ES.
2. Add a null check for when the ES query result does not have `_source` or `fields` fields.
2023-12-30 09:33:26 +08:00
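A sketch of the only predicate that still pushes down, assuming the `esquery(field, dsl)` form used for ES catalogs (table/column names illustrative):

```
-- Pushed down to ES as-is; all other predicates are evaluated in Doris.
SELECT * FROM es_table
WHERE esquery(k, '{"match": {"k": "doris"}}');
```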
a525d5c5a3 [refactor](decimal) change type name Decimal128 to Decimal128V2, Decimal128I to Decimal128V3 to avoid confusion (#29265)
2023-12-29 10:11:44 +08:00
c75e63a2a5 [Improvement](scan) Use scanner to do projection of scan node (#29124) 2023-12-27 16:00:52 +08:00
7081139bdc [fix](block) fix BE core when mutable block merge may cause different row sizes between columns in the origin block (#27943) 2023-12-25 20:35:22 +08:00
1545c36d16 Revert "[bugfix](scannercore) scanner will core in destructor during collect profile (#28727)" (#28931)
This reverts commit 4066de375efe6ff8e156a61df4f9316b3d9eaa4e.
2023-12-24 20:37:33 +08:00
4066de375e [bugfix](scannercore) scanner will core in destructor during collect profile (#28727) 2023-12-23 11:09:46 +08:00
0b9b1be1f1 [fix](function) Fix from_second functions overflow and wrong result (#28685) 2023-12-22 10:22:49 +08:00
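A sketch of the fixed family, assuming these functions take offsets since the Unix epoch:

```
-- Previously, out-of-range inputs could overflow and produce wrong results.
SELECT from_second(1703123456);
```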
bcc32b5b26 [feature](invert index) match_regexp feature added (#28257) 2023-12-20 14:30:35 +08:00
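A sketch of the new predicate (table/column names illustrative, following the match_phrase_prefix example below):

```
-- Regexp matching served by the inverted index.
SELECT count() FROM test_index_match_regexp
WHERE request MATCH_REGEXP '^doris';
```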
b142ade69e [refactor](renamefile) rename some files according to the class names (#28606) 2023-12-19 14:10:11 +08:00
73f7b61019 [refactor](scanner) use weak ptr to lock task execution context to avoid core in scanner dctor (#28493)
Use a weak_ptr as a lock between the fragment execution thread and the scanner thread, to solve the core in the scanner's dtor when it accesses the scan node's profile.
2023-12-18 14:09:32 +08:00
1e5ff40e17 [refactor](group commit) remove future block (#27720)
Co-authored-by: huanghaibin <284824253@qq.com>
2023-12-11 08:41:51 +08:00
05adbfdb3d [feature](inverted index) match_phrase_prefix feature added (#27404)
select count() from test_index_match_phrase_prefix where request match_phrase_prefix 'xxx';
2023-12-05 20:15:13 +08:00
e62d19d90d [improve](partition) support auto list partition with more columns (#27817)
Previously, auto list partitioning supported only one partition column.
That limit is now removed, so multiple columns can be used.
2023-12-04 11:33:18 +08:00
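A sketch of the newly supported shape, hedged against the Doris auto-partition grammar (names illustrative):

```
-- Multiple partition columns are now accepted (previously only one).
CREATE TABLE t_auto (
    c1 INT NOT NULL,
    c2 VARCHAR(32) NOT NULL,
    v  INT
)
AUTO PARTITION BY LIST (c1, c2) ()
DISTRIBUTED BY HASH(c1) BUCKETS 1;
```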
10483ea12c [fix](profile) fix peak_memory_usage being set incorrectly in pipeline #27749 2023-12-02 14:12:38 +08:00
ce271ff382 [fix](parquet)fix can not read parquet lz4 compress. (#27383)
Fixed being unable to read the parquet lz4 compressed format. By default, data is decompressed as Hadoop-framed lz4; if that fails, decompression falls back to the standard lz4 format.
2023-11-29 19:04:53 +08:00
f565f60bc3 [refactor](standard)BE:Initialize pointer variables in the class to nullptr by default (#27587) 2023-11-28 13:02:30 +08:00
6ed0be8e3c [refactor](profilev2) unify the counter name in shuffle operator and normal operator (#27267)
Use BlocksProduced and RowsProduced to unify the counter names across DataStreamSender and other exec nodes, and across the exchange operator and other operators.
"Blocks produced" and "rows produced" are easier to understand.
---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-11-20 14:21:39 +08:00
836cda65d8 [refactor](profilev2) split merged profile to a single runtime profile to make the logic more clear (#27184) 2023-11-19 13:21:50 +08:00
2f41e0c823 [FIX](complextype)fix information schema for complex type (#27203)
Previously, a select from information_schema did not show complex type information.
2023-11-18 11:32:32 +08:00
e29d8cb110 [feature](move-memtable) support pipelineX in sink v2 (#27067) 2023-11-16 15:00:55 +08:00
83edcdead9 [enhancement](random_sink) change tablet search algorithm from random to round-robin for random distribution table (#26611)
1. Fix a race condition when getting the tablet load index.
2. Change the tablet search algorithm from random to round-robin for random-distribution tables when load_to_single_tablet is set to false.
2023-11-15 19:55:31 +08:00
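A sketch of the table shape this affects (illustrative; `load_to_single_tablet` is a load-time option, shown here only as context):

```
-- Random-distribution table: with load_to_single_tablet=false, each load now
-- fills tablets round-robin instead of picking one at random.
CREATE TABLE t_rand (
    k INT,
    v INT
)
DISTRIBUTED BY RANDOM BUCKETS 8;
```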
d3fd923447 [opt](pipeline) Return InternalError to FE instead of doing a useless DCHECK in ExecNode #27035
Effect: the client will see an error message like the one below when the BE hits a plan logical error.

ERROR 1105 (HY000): errCode = 2, detailMessage = ([xxx]())[CANCELLED]Logical error during processing VNewOlapScanNode(dr_case_tag), output of projections 2 mismatches with exec node output 3
2023-11-15 18:15:21 +08:00
a5565f68b2 [Refactor](opentelemetry) Remove opentelemetry (#26605) 2023-11-09 18:05:34 +08:00
baae7bf339 [fix](information_schema) fix bug that metadata_name_ids gets a wrong table id, and append information_schema cases. (#26238)
Fixes the bug from #24059.
Added some information_schema scanner tests:
- files
- schema_privileges
- table_privileges
- partitions
- rowsets
- statistics
- table_constraints

Based on infodb_support_ext_catalog=false, this currently includes tests for all tables under the information_schema database.
2023-11-09 14:07:12 +08:00