doris

Author	SHA1	Message	Date
Pxl	027b06059a	[Feature](materialized-view) support count(1) on materialized view (#28135 ) Support count(1) on materialized views. Fix match failure for queries like select k1, sum(k1) from t group by k1	2023-12-09 01:36:46 +08:00
Yulei-Yang	b6e72d57c5	[Improvement](hms catalog) support show_create_database for hms catalog (#28145 ) * [Improvement](hms catalog) support show_create_database for hms catalog * update	2023-12-09 01:34:21 +08:00
airborne12	055b3885c9	[Fix](inverted index) fix compound directory flush buffer error (#28191 )	2023-12-09 00:57:35 +08:00
yiguolei	abc802b5ba	[bugfix](core) child block is shared between operator and node, it should be shared ptr (#28106 ) _child_block in nested loop join, table value function, and repeat node is shared between ExecNode and the related operator, so it should not be a unique ptr in the operator; it belongs to the exec node. The block will be double freed if the operator's close method is not called correctly. It should be a shared ptr, so that it will not core even if the operator's close method is not called.	2023-12-09 00:18:14 +08:00
Mingyu Chen	8eed760704	[fix](planner) separate table's isPartitioned() method (#28163 ) PR #27515 changed the logic of Table's `isPartitioned()` method. But this method has 2 usages: 1. To check whether a table is range or list partitioned, for some DML operations such as Alter and Export. For this case, it should return true if the table is range or list partitioned, even if it has only one partition and one bucket. 2. To check whether the data is distributed (either by partitions or by buckets), for the query planner. For this case, it should return true if the table has more than one bucket, even if the table is not range or list partitioned. So we should separate this method into 2, for the different usages. Otherwise, it may cause some unreasonable plan shapes.	2023-12-08 23:15:45 +08:00
Mingyu Chen	baf85547ae	[feature](jdbc) support call function to pass sql directly to jdbc catalog #26492 Support a new statement in Nereids: `CALL EXECUTE_STMT("jdbc", "stmt")`, so that the original statement can be passed directly to the data source of a JDBC catalog. Example: ``` mysql> select * from mysql_catalog.db1.tbl1; +------+------+ \| k1 \| k2 \| +------+------+ \| 111 \| 222 \| +------+------+ 1 row in set (0.63 sec) mysql> call execute("mysql_catalog", "insert into db1.tbl1 values(1,'abc')"); Query OK, 0 rows affected (0.01 sec) mysql> select * from mysql_catalog.db1.tbl1; +------+------+ \| k1 \| k2 \| +------+------+ \| 111 \| 222 \| \| 1 \| abc \| +------+------+ 2 rows in set (0.03 sec) mysql> call execute_stmt("mysql_catalog", "delete from db1.tbl1 where k1=111"); Query OK, 0 rows affected (0.01 sec) mysql> select * from mysql_catalog.db1.tbl1; +------+------+ \| k1 \| k2 \| +------+------+ \| 1 \| abc \| +------+------+ 1 row in set (0.03 sec) ```	2023-12-08 23:06:05 +08:00
minghong	2b914aebb6	[opt](nereids)improve partition prune when Date function is used (#27960 ) date func in partition prune	2023-12-08 21:53:39 +08:00
Kaijie Chen	18ef131410	[fix](load) select more active memtables at once in memtable limiter (#28171 )	2023-12-08 21:45:35 +08:00
lihangyu	06404114f1	[Fix](point query) fix memleak by increasing `scanReplicaIds` when using prepared statement (#28184 ) OlapScanNode should release memory for `scanReplicaIds`	2023-12-08 21:02:01 +08:00
Jibing-Li	5e7afa768e	[fix](statistics)Avoid potential NPE #28147	2023-12-08 20:42:17 +08:00
Sun Chenyang	573b594df3	[improvement](Variant Type) Support displaying subcolumns expanded for the variant column (#27764 )	2023-12-08 20:34:58 +08:00
zhangstar333	51f320a606	[bug](function) fix array_apply function return wrong result (#28133 )	2023-12-08 20:14:54 +08:00
zhiqiang	0931eb536c	Revert "[Improvement](auditlog) add column catalog for audit log and audit log table (#26403 )" (#28177 ) This reverts commit daea751a986823bf5858704663d58f49fd5dfb39.	2023-12-08 18:46:59 +08:00
zhangstar333	75b55f8f2f	[enhance](session)check invalid value when set parallel instance variables (#28141 ) In some cases, an incorrect setting causes a BE core dump: 10:18:19 *** SIGFPE integer divide by zero (@0x564853c204c8) received by PID 2132555 int max_scanners = config::doris_scanner_thread_pool_thread_num / state->query_parallel_instance_num();	2023-12-08 17:38:48 +08:00
Xinyi Zou	226a0c3b1d	[chore](memory) Warning in log when turning on THP (#28122 )	2023-12-08 17:38:38 +08:00
minghong	bc40025631	[opt](Nereids)Join cluster connectivity (#27833 ) * estimation join stats by connectivity	2023-12-08 14:55:10 +08:00
plat1ko	6da36e1077	[feature](merge-cloud) Refactor write path code by abstract base class (#26537 ) Whether to use `StorageEngine` or `CloudStorageEngine` is determined at compile time instead of by the runtime `config::cloud_mode`, to avoid unexpected null pointer or undefined behavior issues caused by merging code. Classes that depend on `StorageEngine` but are shared with the cloud mode need an abstract base class. Common code should be extracted into the base class, while code that depends on `StorageEngine` should be implemented in a `StorageEngine` mix-in class of the base class.	2023-12-08 14:50:36 +08:00
Xiangyu Wang	16230b5ebd	[Enhance](multi-catalog) parse hive view ddl first to avoid NPE. (#28067 )	2023-12-08 13:54:50 +08:00
minghong	61d556c718	[fix](nereids)runtime filter translator failed on set operator (#28102 ) * runtime filter translator failed on set operator	2023-12-08 12:58:42 +08:00
lihangyu	341822ec05	[regression-test](Variant) add compaction case for variant and fix bugs (#28066 )	2023-12-08 12:18:46 +08:00
wangbo	59ec3da899	open workload group in PR pipeline (#27744 )	2023-12-08 11:56:03 +08:00
yujun	ebed055d2b	[chore](clone) rename clone request field (#27591 )	2023-12-08 11:53:57 +08:00
zclllyybb	d534cdf027	[compile](BE) let arm gcc know some function no return (#28157 ) Let ARM GCC know that some functions do not return.	2023-12-08 11:32:08 +08:00
Calvin Kirs	cd108688c1	[Chore](docs)Fix job error docs (#28127 )	2023-12-08 10:24:21 +08:00
zhiqiang	0947bf4e97	[opt](mysql serde) Avoid core dump when converting invalid block to mysql result (#28069 ) BE will core dump if the result block is invalid when doing result serialization. An existing bug case is described in #28030, so we add a check branch to avoid a BE core dump caused by out-of-range problems.	2023-12-08 10:21:09 +08:00
zclllyybb	25b90eb782	[Feature](function) support random int from specific range (#28076 ) mysql> select rand(-20, -10); +------------------+ \| random(-20, -10) \| +------------------+ \| -13 \| +------------------+ 1 row in set (0.10 sec)	2023-12-08 10:15:25 +08:00
lihangyu	e75d91c91b	[regression-test](Variant) Add more cases related to schema changes (#27958 ) Also fix a schema change bug for variant: a crash when doing a schema change on a tablet schema that contains extracted columns.	2023-12-08 10:15:12 +08:00
qiye	1d345877ce	[fix](regression-test) load_to_single_tablet assertEquals usage (#28128 )	2023-12-08 10:09:44 +08:00
julic20s	d8d8f15bf3	[improvement](vectorization) Use requires instead of specialization for doris::vectorized::Decimal (#28027 ) Use requires instead of specialization for doris::vectorized::Decimal	2023-12-08 09:59:52 +08:00
Gabriel	9461e86b10	[pipelineX](debug) add debug string (#28137 ) * [pipelineX](debug) add debug string * update	2023-12-07 23:21:10 +08:00
jakevin	66ed093410	[test](Nereids): fix test push_down_top_n (#26937 )	2023-12-07 23:07:32 +08:00
walter	cbb238a0ff	[improve](env) Add disk usage in not ready msg (#28125 )	2023-12-07 22:49:52 +08:00
HHoflittlefish777	f9d4690023	[improve](stack_trace) avoid print stack trace in csv and json reader #28129	2023-12-07 22:45:18 +08:00
zclllyybb	81a0f8c041	[Feature](function) support generating const values from tvf numbers (#28051 ) If `const_value` is specified, the result is a column of that constant; otherwise it is an incremental series, as before. mysql> select * from numbers("number" = "5", "const_value" = "-123"); +--------+ \| number \| +--------+ \| -123 \| \| -123 \| \| -123 \| \| -123 \| \| -123 \| +--------+ 5 rows in set (0.11 sec)	2023-12-07 22:26:43 +08:00
Xinyi Zou	397a401241	[fix](arrow-flight) Modify FE Arrow version to 14.0.1 #28093 Arrow was previously upgraded temporarily to the dev version 15.0.0-SNAPSHOT, because the latest release, Arrow 14.0.1, has a bug that prevents jdbc:arrow-flight-sql from being used normally; see apache/arrow#38785. But Arrow 15.0.0-SNAPSHOT was not published to the Maven central repository, and the network could not be connected sometimes, so this goes back to Arrow 14.0.1. jdbc:arrow-flight-sql will be supported after upgrading to the Arrow 15.0.0 release version.	2023-12-07 22:25:08 +08:00
Xinyi Zou	a2d66911cd	[chore](docs) Fix partition cache design principles #28110	2023-12-07 22:23:46 +08:00
Jibing-Li	b1c5519aa8	[doc](statistics)Update external catalog statistics doc (#28123 )	2023-12-07 21:33:05 +08:00
HappenLee	104a822a2f	[Refactor](RuntimeFilter) refactor rf code to improve performance (#28094 )	2023-12-07 20:32:30 +08:00
seawinde	be81eb1a9b	[feature](nereids) Support inner join query rewrite by materialized view (#27922 ) Work in progress. Support inner join query rewrite by materialized view in some scenarios. For example: > mv = "select lineitem.L_LINENUMBER, orders.O_CUSTKEY " + > "from orders " + > "inner join lineitem on lineitem.L_ORDERKEY = orders.O_ORDERKEY " > query = "select lineitem.L_LINENUMBER " + > "from lineitem " + > "inner join orders on lineitem.L_ORDERKEY = orders.O_ORDERKEY "	2023-12-07 20:29:51 +08:00
morrySnow	f37215a32a	[fix](Nereids) insert into target table lock should include finalize (#28085 )	2023-12-07 20:15:12 +08:00
morrySnow	65fc2e0438	[fix](Nereids) forbid two TVF in one fragment since the limit of coordinator (#28114 )	2023-12-07 19:58:31 +08:00
lihangyu	cc9b4bcddb	[Fix](variant) fallback to none partial update for mow table (#28116 )	2023-12-07 19:30:24 +08:00
lihangyu	942450a2e5	[Fix](Variant) ColumnObject need to be finalized when doing ColumnObject::update_hash_with_value (#28119 ) Otherwise accessing rows at `n` will lead to heap buffer overflow ``` 5# SipHash::update(char const*, unsigned long) at /home/zcp/repo_center/doris_master/doris/be/src/vec/common/sip_hash.h:132 6# doris::vectorized::ColumnString::update_hash_with_value(unsigned long, SipHash&) const at /home/zcp/repo_center/doris_master/doris/be/src/vec/columns/column_string.h:452 7# doris::vectorized::ColumnObject::update_hash_with_value(unsigned long, SipHash&) const at /home/zcp/repo_center/doris_master/doris/be/src/vec/columns/column_object.cpp:1433 8# doris::vectorized::Block::update_hash(SipHash&) const at /home/zcp/repo_center/doris_master/doris/be/src/vec/core/block.cpp:721 9# doris::EngineChecksumTask::_compute_checksum() at ```	2023-12-07 18:48:05 +08:00
Mingyu Chen	34642781c2	[fix](meta) fix ConcurrentModificationException when dump image (#28072 ) ``` Caused by: java.util.ConcurrentModificationException at java.util.HashMap$HashIterator.nextNode(HashMap.java:1437) ~[?:1.8.0_131] at java.util.HashMap$EntryIterator.next(HashMap.java:1471) ~[?:1.8.0_131] at java.util.HashMap$EntryIterator.next(HashMap.java:1469) ~[?:1.8.0_131] at org.apache.doris.catalog.CatalogRecycleBin.write(CatalogRecycleBin.java:1047) ~[doris-fe.jar:1.2-SNAPSHOT] at org.apache.doris.catalog.Env.saveRecycleBin(Env.java:2298) ~[doris-fe.jar:1.2-SNAPSHOT] ``` When calling the `/dump` API to dump the image, a ConcurrentModificationException may be thrown because no lock protects `CatalogRecycleBin`.	2023-12-07 18:26:02 +08:00
Tiewei Fang	3dcbf16404	[Fix](Outfile) The Struct type data exported from select outfile to the csv file format should contain a column name #28068 If the original data is: ```sql +-----------------------------------------------------+ \| s_info \| +-----------------------------------------------------+ \| {"s_id": 2, "s_name": "nereids", "s_address": "20"} \| \| {"s_id": 1, "s_name": "doris", "s_address": "18"} \| +-----------------------------------------------------+ ``` In the original logic, the struct type data exported to a csv file format did not contain column names, like ``` {2, "nereids", "20"} {1, "doris", "18"} ``` This PR does not need to be merged into branch-2.0	2023-12-07 18:23:36 +08:00
airborne12	394b420180	[Update](inverted index) use session variable for inverted index try query threshold (#28052 ) * [Update](inverted index) use session variable for inverted index try query threshold * remove unused config * update clucene	2023-12-07 17:54:44 +08:00
minghong	172747669e	[fix](Nereids)fix regression case: nereids_rules_p0/transposeJoin/transposeSemiJoinAgg #28111	2023-12-07 17:41:08 +08:00
Kaijie Chen	a27c068a9d	[improve](move-memtable) make StreamWait time configurable (#28086 )	2023-12-07 17:27:43 +08:00
Kaijie Chen	84a651d976	[improve](load) rewrite memtable memory limiter rules (#27759 )	2023-12-07 17:26:26 +08:00
minghong	bc12a05915	[fix](Nereids) explain graph insert-select NPE (#28007 )	2023-12-07 17:25:44 +08:00

18429 Commits