Commit Graph

15601 Commits

Author SHA1 Message Date
6e855dd198 [feature](sql-dialect) support convert sql use sql convertor service (#27581)
Add a new FE config `sql_convertor_service`.
If this config is set and the session variable `sql_dialect` is set,
Doris will try to use a standalone SQL converter service to convert user input SQL
written in the specified SQL dialect, e.g.:

```
mysql> set sql_dialect="presto";
Query OK, 0 rows affected (0.02 sec)

Database changed
mysql> select * from db1.tbl1 where "k1" = 1;  # will be converted to select * from db1.tbl1 where `k1` = 1;
+------+------+
| k1   | k2   |
+------+------+
|    1 |    2 |
+------+------+
1 row in set (0.08 sec)
```

The SQL converter service should be an HTTP service.
The request and response body formats can be found in `SQLDialectUtils.java`.
2023-12-18 10:32:52 +08:00
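For the `sql_convertor_service` commit above (6e855dd198), the exchange with the converter service can be pictured roughly as follows. This is only a sketch: the real request and response bodies are defined in `SQLDialectUtils.java`, and every JSON field name here (`sql_query`, `from`, `to`, `code`, `data`) is a hypothetical placeholder, as is the fall-back-to-original-SQL behaviour.

```python
import json


def build_convert_request(sql, source_dialect):
    """Serialize a conversion request. All field names are hypothetical."""
    payload = {"sql_query": sql, "from": source_dialect, "to": "doris"}
    return json.dumps(payload).encode("utf-8")


def parse_convert_response(body, original_sql):
    """Return the converted SQL, or original_sql if the reply is unusable.

    Assumes (hypothetically) a reply shaped like {"code": 0, "data": "<sql>"}.
    """
    try:
        reply = json.loads(body)
    except (ValueError, TypeError):
        return original_sql
    if not isinstance(reply, dict):
        return original_sql
    if reply.get("code") == 0 and isinstance(reply.get("data"), str):
        return reply["data"]
    return original_sql
```

The request bytes would be sent with an HTTP POST to whatever URL `sql_convertor_service` holds; the network call is left out so the sketch stays self-contained.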
d11365da9c [Fix](memtable) fix shrink_memtable_by_agg should also update _row_in_blocks (#28536)
Otherwise, using the stale `_row_in_blocks` will result in a heap-buffer-overflow:

```
==2695213==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x62900122e210 at pc 0x56524744aecf bp 0x7f62c595ef70 sp 0x7f62c595ef68
READ of size 8 at 0x62900122e210 thread T1627 (MemTableFlushTh)
    #0 0x56524744aece in doris::vectorized::ColumnVector<long>::insert_indices_from(doris::vectorized::IColumn const&, unsigned int const*, unsigned int const*) /mnt/disk2/lihangyu/doris/be/src/vec/columns/column_vector.cpp:378:33
    #1 0x5652472a7538 in doris::vectorized::ColumnNullable::insert_indices_from(doris::vectorized::IColumn const&, unsigned int const*, unsigned int const*) /mnt/disk2/lihangyu/doris/be/src/vec/columns/column_nullable.cpp:310:25
    #2 0x56524782a62a in doris::vectorized::MutableBlock::add_rows(doris::vectorized::Block const*, unsigned int const*, unsigned int const*) /mnt/disk2/lihangyu/doris/be/src/vec/core/block.cpp:961:14
    #3 0x565233f187ae in doris::MemTable::_put_into_output(doris::vectorized::Block&) /mnt/disk2/lihangyu/doris/be/src/olap/memtable.cpp:248:27
    #4 0x565233f1db66 in doris::MemTable::to_block() /mnt/disk2/lihangyu/doris/be/src/olap/memtable.cpp:496:13
    #5 0x565233efae60 in doris::FlushToken::_do_flush_memtable(doris::MemTable*, int, long*) /mnt/disk2/lihangyu/doris/be/src/olap/memtable_flush_executor.cpp:121:62
    #6 0x565233efc8d6 in doris::FlushToken::_flush_memtable(doris::MemTable*, int, long) /mnt/disk2/lihangyu/doris/be/src/olap/memtable_flush_executor.cpp:150:16
    #7 0x565233f0c5eb in doris::MemtableFlushTask::run() /mnt/disk2/lihangyu/doris/be/src/olap/memtable_flush_executor.cpp:58:23
```
2023-12-18 10:31:16 +08:00
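The bug class behind commit d11365da9c can be illustrated with a toy model. This is not the Doris implementation: `shrink_by_agg` and the list-of-tuples "block" are hypothetical stand-ins, chosen only to show why row indices must be rebuilt whenever the block they point into is shrunk.

```python
def shrink_by_agg(block):
    """Merge rows that share a key by summing their values.

    Returns the smaller block together with a fresh index list, so that
    callers never keep indices that pointed into the old, larger block.
    """
    merged = {}
    for key, value in block:
        merged[key] = merged.get(key, 0) + value
    new_block = sorted(merged.items())
    new_row_in_blocks = list(range(len(new_block)))
    return new_block, new_row_in_blocks


block = [(1, 10), (1, 5), (2, 7), (2, 1)]
stale_row_in_blocks = [0, 1, 2, 3]  # valid only for the unshrunk block

block, row_in_blocks = shrink_by_agg(block)
# Indices 2 and 3 in stale_row_in_blocks would now be out of range.
```

In Python a stale index raises `IndexError`; in C++ the same mistake reads past the end of the buffer, which is what the AddressSanitizer report above shows.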
b06f3edcab [fix](meta) fix meta replay issue when upgrading from v2.0 to master (#28532)
Introduced from #27861

The `dbName` saved in `CreateTableInfo` has the `default_cluster` prefix, which should be removed.

Also modify the entry point of `getDb` in the internal catalog. This is a fallback in case
db names with the `default_cluster` prefix still exist.
2023-12-17 22:16:42 +08:00
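The fix in commit b06f3edcab amounts to normalizing database names before lookup. A minimal sketch of that idea, assuming the prefix is written as `default_cluster:` (the separator is an assumption) and using a hypothetical `get_db` helper over a plain dictionary rather than the real internal catalog:

```python
CLUSTER_PREFIX = "default_cluster:"  # assumed spelling of the legacy prefix


def strip_cluster_prefix(db_name):
    """Drop the legacy cluster prefix if present; otherwise return as-is."""
    if db_name.startswith(CLUSTER_PREFIX):
        return db_name[len(CLUSTER_PREFIX):]
    return db_name


def get_db(catalog, db_name):
    """Hypothetical lookup that tolerates names replayed from old metadata."""
    return catalog.get(strip_cluster_prefix(db_name))
```

Normalizing at the lookup entry point means a prefixed name that slips through from an old edit log still resolves to the same database.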
9b3d4bb5bc [fix](Export) Fix an export error when lower_case_table_names=1 (#28389) 2023-12-17 20:45:43 +08:00
0f3c544260 [feature](mtmv)mtmv support partition (#28144)
- create MTMV supports partitioning and the `AUTO` refresh method
- refresh MTMV supports specified partitions
- MTMV supports incremental updates
- add property `EXCLUDED_TRIGGER_TABLES` for mv
- maintain MTMVCache after a successful task refresh, for plan rewrite (`MTMV.getOrGenerateCache`)
- show partitions adds "SyncWithBaseTables"
- drop the job before dropping the MTMV
- task tvf adds "MvId,MvDatabaseId,ErrorMsg,TaskContext,RefreshMode,RefreshPartitions"
- add `NotAllowFallback` so that mtmv does not fall back to the old planner
- add `MTMVUtils.getMTMVCanRewritePartitions()` and `Env.getCurrentEnv().getMtmvService().getRelationManager().getAvailableMTMVs()` for plan rewrite
2023-12-17 18:28:03 +08:00
03e989b342 [Doc] Update flink-doris-connector.md (#27329) 2023-12-17 10:39:21 +08:00
27a3884060 [refactor](docker)Dockerfile shell refactor (#27569) 2023-12-16 23:45:06 +08:00
683c173480 [fix](test) fix index change testcases #28298 2023-12-16 23:42:02 +08:00
2f775260d7 [bugfix](jdbc catalog) refresh catalog close jdbcclient (#28300) 2023-12-16 23:38:24 +08:00
61ad3b8dc4 [fix](nereids)LogicalCTEConsumer's output lost column info in SlotReference (#28452) 2023-12-16 23:35:09 +08:00
e4585db32d [enhancement](err-msg) log out datadir path pattern when disk exceed capacity limit #28320 2023-12-16 23:07:02 +08:00
1e08845fc5 [regression test](broker load) add case for sequence col (#27583) 2023-12-16 22:47:20 +08:00
61de49c727 [case](regression) Test duplicated load id (#28251)
Co-authored-by: qinhao <qinhao@newland.com.cn>
2023-12-16 22:41:51 +08:00
74c0a3060f [feature](jdk) Using G1 as default garbage collector in FE (#28263) 2023-12-16 22:40:11 +08:00
894bae4ebf [improvement](publish version) publish txn fail retry do not wait (#28441) 2023-12-16 22:31:10 +08:00
8ab63a9434 [regression-test][memtable] test memtable flush is high priority for vtable writerV2 (#28503) 2023-12-16 22:29:06 +08:00
8c05f7a784 [refactor](cluster)(step-4) remove cluster related to Database (#27861)
Issue Number: #19897

Remove the `default_cluster` prefix from database names.
When upgrading, all such prefixes will be removed.
2023-12-16 18:28:53 +08:00
608baae001 [docker][regression]update routine load cases #28450
Co-authored-by: 胥剑旭 <xujianxu@xujianxudeMacBook-Pro.local>
2023-12-16 17:57:39 +08:00
ad8faedac4 [fix](txn_manager) Fix wrong use of std::map::erase in TxnManager::delete_txn #28507 2023-12-16 14:50:46 +08:00
a3e2c6affe [fix](jdbc catalog) fix JdbcScanNode NOT CompoundPredicate filter expr handling errors (#28497) 2023-12-16 12:54:55 +08:00
92a4a9770c [improvement](hint) query fail print tablet detail info (#28476) 2023-12-16 12:54:25 +08:00
b11b76e778 [fix](full compaction) Full compaction should hold meta lock when modifying tablet's meta data (#28449) 2023-12-16 12:37:29 +08:00
469edbdd3d [feature](executor)make scan task wait timeout config #28467 2023-12-16 11:36:15 +08:00
920c75c870 [fix](ci)tpch pipeline add check (#28370)
Co-authored-by: stephen <hello-stephen@qq.com>
2023-12-16 11:11:28 +08:00
3ea68d576b [improve](group commit) Fix select tablet policy for random partition and remove some log (#28498)
This PR contains 2 improvements:

- For random partition tables, select the tablet in the original way for load balancing.
- Skip the audit log for execute statements, since it is expensive in CPU.
2023-12-16 11:02:52 +08:00
f12a225844 [fix](session variables) Make default value of max_execution_time same to query_timeout #28474
The current problem: `UNSET global VARIABLE ALL` writes an oplog, which sets query_timeout = 0 when it is replayed at a later time. So we change the default value of max_execution_time to 90000, which is consistent with the default value of query_timeout.
2023-12-16 10:59:05 +08:00
4538f1ba8f [feature](pipelineX) add local_shuffle in nested loop join #28428 2023-12-16 10:53:13 +08:00
f741ce5b7b [fix](iterator) Fix mem leak when initial iterator failed (#28480) 2023-12-16 10:49:05 +08:00
f770403cca [enhancement](pipeline) add bvar for pipeline fragment instance and task (#28500) 2023-12-16 10:47:53 +08:00
20d815f0e7 [refactor](style) Using C++style and changing to smart pointers (#28454) 2023-12-16 10:44:43 +08:00
fb925bdd08 [Bug](memory) Fix exception-unsafe in aggregation node (#28483)
The alloc function may throw a std::bad_alloc exception when the process memory exceeds the limit.

be.INFO:

W1214 09:14:17.434849 771103 mem_tracker_limiter.cpp:204] Memory limit exceeded:<consuming tracker:<Load#Id=28448230da1f432e-8a66597e10329235>, process memory used 20.41 GB exceed limit 18.76 GB or sys mem available 9.04 GB less than low water mark 1.60 GB, failed alloc size 1.86 MB>, executing msg:<execute:<>>. backend xx.x.x.xxx process memory used 20.41 GB, limit 18.76 GB. If query tracker exceed, set exec_mem_limit=8G to change limit, details see be.INFO.
Process Memory Summary:
    OS physical memory 31.26 GB. Process memory usage 20.41 GB, limit 18.76 GB, soft limit 16.88 GB. Sys available memory 9.04 GB, low water mark 1.60 GB, warning water mark 3.20 GB. Refresh interval memory growth 0 B
Alloc Stacktrace:
    @     0x555cd858bee9  doris::MemTrackerLimiter::print_log_usage()
    @     0x555cd859a384  doris::ThreadMemTrackerMgr::exceeded()
    @     0x555cd85a0ac4  malloc
    @     0x555cd8fcf368  Allocator<>::alloc()
    @     0x555cd8fdbdaf  doris::vectorized::Arena::add_chunk()
    @     0x555cd96dc0ab  doris::vectorized::AggregateDataContainer::_expand()
    @     0x555cd96aded8  (unknown)
    @     0x555cd969fa2c  doris::vectorized::AggregationNode::_pre_agg_with_serialized_key()
    @     0x555cd96d1d61  std::_Function_handler<>::_M_invoke()
    @     0x555cd967ab0b  doris::vectorized::AggregationNode::get_next()
    @     0x555cd81282a6  doris::ExecNode::get_next_after_projects()
    @     0x555cd8452968  doris::PlanFragmentExecutor::get_vectorized_internal()
    @     0x555cd845553b  doris::PlanFragmentExecutor::open_vectorized_internal()
    @     0x555cd8456a9e  doris::PlanFragmentExecutor::open()
    @     0x555cd842f200  doris::FragmentExecState::execute()
    @     0x555cd843280e  doris::FragmentMgr::_exec_actual()
    @     0x555cd8432d42  _ZNSt17_Function_handlerIFvvEZN5doris11FragmentMgr18exec_plan_fragmentERKNS1_23TExecPlanFragmentParamsESt8functionIFvPNS1_20PlanFragmentExecutorEEEEUlvE_E9_M_invokeERKSt9_Any_data
    @     0x555cd86ead05  doris::ThreadPool::dispatch_thread()
    @     0x555cd86e015f  doris::Thread::supervise_thread()
    @     0x7f3321593ea5  start_thread
    @     0x7f33218a69fd  __clone
    @              (nil)  (unknown)
2023-12-15 19:17:18 +08:00
0f93ee8793 [fix](Nereids): TransposeSemiJoinAgg can't apply in Scalar Agg (#28434)
A scalar Agg shouldn't be pushed down; doing so causes wrong results.
2023-12-15 16:18:16 +08:00
8986bb6bb4 [fix](Planner): parse more Punctuation Date/DateTime (#28432)
Parse more punctuation characters as separators, e.g. `2021@01@01 00/00/00`.
2023-12-15 16:17:44 +08:00
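The separator handling described in commit 8986bb6bb4 can be approximated by treating every run of non-digit characters as a field boundary. The function below is an illustrative sketch only, not the planner's parser; `parse_punctuated_datetime` is a hypothetical name, and the sketch ignores fractional seconds, time zones and two-digit years.

```python
import re
from datetime import datetime


def parse_punctuated_datetime(text):
    """Parse 'Y?M?D' or 'Y?M?D h?m?s' where '?' is any non-digit separator."""
    parts = [p for p in re.split(r"[^0-9]+", text.strip()) if p]
    if len(parts) not in (3, 6):
        raise ValueError(f"expected 3 or 6 numeric fields, got {len(parts)}")
    # datetime() itself rejects out-of-range values such as month 13.
    return datetime(*(int(p) for p in parts))
```

With this sketch, `parse_punctuated_datetime("2021@01@01 00/00/00")` and `parse_punctuated_datetime("2021-01-01 00:00:00")` yield the same `datetime` value.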
0f25a4b3c6 [bug](json)Fix the problem of be down caused by json path ending with \ (#28180) 2023-12-15 15:57:08 +08:00
088bb80a9c [fix](test) fix case of test_unique_table_new_sequence (#28442)
A merge conflict between PRs #28105 and #28031 caused this case to fail.
2023-12-15 15:10:39 +08:00
501a79a45c [Fix](format) compatible run_clang_format format string with python2/3.6/higher (#28469) 2023-12-15 15:03:23 +08:00
97b033813a [perf](Nereids) add back canEliminate temporary (#28017) 2023-12-15 14:26:29 +08:00
2018ab23f0 [chore](build) Add MVN_OPT env variable to enrich building FE with extra options (#28375)
E.g., export it in the shell or add it to `custom_env.sh`:
```
export MVN_OPT="-o"
```
This builds FE with the Maven option "-o" (offline), which means Maven does
not need to download metadata from the Maven repo. It is useful for saving time
if the internet connection is unstable or unavailable.
2023-12-15 13:20:39 +08:00
e6b135c76a [improvement](fe) Add reason log when Env is not ready (#28286) 2023-12-15 12:22:06 +08:00
6f3fb81965 [fix](doc) spell errors fixes multi-tenant.md (#28436) 2023-12-15 12:21:46 +08:00
4c51558f6b [feature](nereids) Support basic aggregate rewrite and function rollup using materialized view (#28269)
Add aggregate materialized view rules for query rewrite.
They support query rewrites such as the following:

    def mv = "select lineitem.L_LINENUMBER, orders.O_CUSTKEY, sum(O_TOTALPRICE) as sum_alias " +
            "from lineitem " +
            "inner join orders on lineitem.L_ORDERKEY = orders.O_ORDERKEY " +
            "group by lineitem.L_LINENUMBER, orders.O_CUSTKEY "
    def query = "select lineitem.L_LINENUMBER, sum(O_TOTALPRICE) as sum_alias " +
            "from lineitem " +
            "inner join orders on lineitem.L_ORDERKEY = orders.O_ORDERKEY " +
            "group by lineitem.L_LINENUMBER"
2023-12-15 11:30:02 +08:00
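Why the `mv` in commit 4c51558f6b can answer the `query`: a `sum` grouped by `(L_LINENUMBER, O_CUSTKEY)` can be rolled up to a `sum` grouped by `L_LINENUMBER` alone by summing the partial sums. The toy check below uses made-up rows standing in for the joined `lineitem`/`orders` data; it shows only the arithmetic, not how Nereids performs the rewrite.

```python
from collections import defaultdict


def group_sum(rows, key_of, value_of):
    """Sum value_of(row) per key_of(row), like GROUP BY ... SUM(...)."""
    totals = defaultdict(int)
    for row in rows:
        totals[key_of(row)] += value_of(row)
    return dict(totals)


# Made-up joined rows: (L_LINENUMBER, O_CUSTKEY, O_TOTALPRICE)
joined = [(1, 100, 50), (1, 100, 25), (1, 200, 10), (2, 100, 40)]

# Materialized view: group by (L_LINENUMBER, O_CUSTKEY)
mv = group_sum(joined, lambda r: (r[0], r[1]), lambda r: r[2])

# Query answered directly from the base rows: group by L_LINENUMBER
direct = group_sum(joined, lambda r: r[0], lambda r: r[2])

# Query answered from the view: roll up the partial sums
rolled_up = group_sum(mv.items(), lambda kv: kv[0][0], lambda kv: kv[1])
```

The rollup works because `sum` is decomposable and the query's group-by columns are a subset of the view's; aggregates without that property would need different handling.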
c4242ab69e [Chore](Job)Add the configuration of the maximum number of persistence tasks for the job (#28411) 2023-12-15 11:14:06 +08:00
Pxl
8661b5ec21 [Bug](function) fix npe on select http_stream directly (#28423)
fix npe on select http_stream directly
2023-12-15 11:09:45 +08:00
ce60064573 [regression-test](variant) fix unstable query result m… (#28390)
* [regression-test](variant) fix unstable query result for unique key model

* fix p2 case

* add msg
2023-12-15 10:52:50 +08:00
1877389f12 [fix](Nereids) set card to olap table break card block rule (#28417)
We have a cardinality block rule to avoid scanning too much data,
so the OLAP scan cardinality must be set from the scanned buckets only.
2023-12-15 10:28:05 +08:00
4d9b6c272d [Fix](vcompound pred) Corrected evaluation for compound predicates with constant columns (#28421) 2023-12-15 10:10:48 +08:00
eb99e4270d [Fix](parquet_reader) Fix dict filtering doesn't work with plain dict encoding in parquet reader. (#28290) 2023-12-15 09:27:02 +08:00
xy
eebedbc879 [optimize](cooldown)Reduce unnecessary sort operations for vector (#27147)
Co-authored-by: xingying01 <xingying01@corp.netease.com>
2023-12-15 00:13:56 +08:00
415c6d854d [fix](partial update) Fix some bugs about partial update (#28358) 2023-12-15 00:04:29 +08:00
8ca7bd8f98 [enhancement](bitmap)support bitmap type for non-key column in duplicate table (#28392) 2023-12-14 23:59:12 +08:00