Commit Graph

18823 Commits

Author SHA1 Message Date
c38c939b52 [bug](Fe) fix potential deadlock in show proc statement (#34988) 2024-05-28 13:12:03 +08:00
f8fcd17f33 [fix](memory) Fix nested scoped tracker and nested reserve memory (#35257)
SCOPED_ATTACH_TASK cannot be nested, but SCOPED_SWITCH_THREAD_MEM_TRACKER_LIMITER can continue to be called, so attach_limiter_tracker may be nested.
2024-05-28 13:12:03 +08:00
9d6b2d66ca [feature](metrics)support be jvm metrics. (#35023)
Support BE JVM metrics.
Running `curl http://be_host:webserver_port/metrics` returns:
```
doris_be_jvm_heap_size_bytes{type="max"} 8589934592
doris_be_jvm_heap_size_bytes{type="committed"} 8589934592
doris_be_jvm_heap_size_bytes{type="used"} 364159504

doris_be_jvm_non_heap_size_bytes{type="committed"} 117899264
doris_be_jvm_non_heap_size_bytes{type="used"} 115330424

doris_be_jvm_young_size_bytes{type="used"} 255852544
doris_be_jvm_young_size_bytes{type="peak_used"} 255852544
doris_be_jvm_young_size_bytes{type="max"} 8589934592

doris_be_jvm_old_size_bytes{type="used"} 94393344
doris_be_jvm_old_size_bytes{type="peak_used"} 94393344
doris_be_jvm_old_size_bytes{type="max"} 8589934592

doris_be_jvm_gc{name="G1 Young Generation Count", type="count"} 3
doris_be_jvm_gc{name="G1 Young Generation Time", type="time"} 33
doris_be_jvm_gc{name="G1 Old Generation Count", type="count"} 0
doris_be_jvm_gc{name="G1 Old Generation Time", type="time"} 0

doris_be_jvm_thread{type="count"} 147
doris_be_jvm_thread{type="peak_count"} 147
doris_be_jvm_thread{type="new_count"} 0
doris_be_jvm_thread{type="runnable_count"} 25
doris_be_jvm_thread{type="blocked_count"} 0
doris_be_jvm_thread{type="waiting_count"} 48
doris_be_jvm_thread{type="timed_waiting_count"} 74
doris_be_jvm_thread{type="terminated_count"} 0
```
2024-05-28 13:12:03 +08:00
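As an illustration of how the JVM metrics output from 9d6b2d66ca might be consumed, here is a minimal sketch that parses a few of the sample lines and derives a heap usage ratio. The names `parse_metrics` and `SAMPLE` are hypothetical and not part of Doris; the parser handles only the simple `name{labels} value` shape shown in the commit message, not the full Prometheus exposition format.

```python
import re

# Matches lines such as: doris_be_jvm_heap_size_bytes{type="used"} 364159504
_LINE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
_LABEL = re.compile(r'(\w+)="([^"]*)"')


def parse_metrics(text):
    """Return {(metric_name, sorted label tuple): float value}."""
    samples = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        match = _LINE.match(line)
        if match is None:
            continue  # ignore anything that is not a plain sample line
        labels = tuple(sorted(_LABEL.findall(match.group("labels") or "")))
        samples[(match.group("name"), labels)] = float(match.group("value"))
    return samples


# A few lines copied from the commit message above.
SAMPLE = '''
doris_be_jvm_heap_size_bytes{type="max"} 8589934592
doris_be_jvm_heap_size_bytes{type="used"} 364159504
doris_be_jvm_gc{name="G1 Young Generation Count", type="count"} 3
'''

metrics = parse_metrics(SAMPLE)
used = metrics[("doris_be_jvm_heap_size_bytes", (("type", "used"),))]
maximum = metrics[("doris_be_jvm_heap_size_bytes", (("type", "max"),))]
heap_ratio = used / maximum  # fraction of the max heap currently in use
```

In practice the text would come from the `/metrics` endpoint rather than a string literal.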
79cd726132 [Fix](inverted index) fix race condition in index build (#35427)
Fix the race condition introduced by #35366, which causes a heap-use-after-free.
2024-05-28 13:12:03 +08:00
d8eefd0be8 [fix] fix wrong result of spill agg with limit (#35403) 2024-05-28 13:12:03 +08:00
7058b31edd [fix](move-memtable) clear load streams before shutdown SegmentFileWriterThreadPool (#35217) 2024-05-28 13:12:03 +08:00
f0e883c968 [Fix](executor)Fix backend_active_tasks only scan one be (#35490)
## Proposed changes
Fix `select * from backend_active_tasks` returning info for only one random BE.
2024-05-28 11:48:42 +08:00
238e218312 [fix](httpapi) restore compaction/run_status api can show be's overall compaction status and refactor code (#35409) 2024-05-28 09:43:43 +08:00
8ff95a00f3 [Fix](test) fix test case output for inverted_index_p0.test_tokenize (#35464) 2024-05-27 19:19:24 +08:00
8c4f5af708 [opt](Nereids) auto fallback when insert unsupport catalog (#33353) (#35453)
pick from master #33353
2024-05-27 16:58:35 +08:00
1a52e4f7db [chore](mtmv)Optimize mtmv logs and exception information (#34957) (#35446)
pick from master #34957

1. Change some logs to debug.
2. Change the error prompt wording from "MTMV" to "async materialized view".
2024-05-27 16:35:13 +08:00
a32db25070 [enhance](mtmv) allow add index for MTMV (#34225) (#35443)
Previously, whether an operation was allowed on a materialized view was decided by its `opType`.

Now each clause implements an `allowOpMTMV()` method instead.

This is needed because operations that share the same `opType` are not all treated alike: some are allowed and some are not.

For example, the `opType` for both `add column` and `create index` is `SCHEMA-CHANGE`, but `add column` is not allowed and `create index` is allowed.
2024-05-27 16:22:16 +08:00
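The per-clause check described in a32db25070 can be sketched roughly as follows. This is an illustrative Python analogue, not the Doris Java code: the class names, the `check_alter_on_mtmv` helper, and the choice to reject by default are all assumptions; only the `allowOpMTMV()` idea and the `add column` / `create index` example come from the commit message.

```python
class AlterClause:
    """Base clause: rejected on an MTMV unless a subclass opts in (assumed default)."""

    op_type = "UNKNOWN"

    def allow_op_mtmv(self) -> bool:
        return False


class AddColumnClause(AlterClause):
    # Shares its op type with CreateIndexClause but is not allowed on an MTMV.
    op_type = "SCHEMA-CHANGE"


class CreateIndexClause(AlterClause):
    # Same op type, but explicitly allowed on an MTMV.
    op_type = "SCHEMA-CHANGE"

    def allow_op_mtmv(self) -> bool:
        return True


def check_alter_on_mtmv(clauses):
    """Raise if any clause in the statement is not allowed on an MTMV."""
    for clause in clauses:
        if not clause.allow_op_mtmv():
            raise ValueError(
                f"{type(clause).__name__} is not allowed on an async materialized view"
            )
```

Deciding per clause rather than per `opType` is what lets two clauses with the same op type get different answers.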
d71e9d34fe [Bugfix] Fix mv column type is not changed when do schema change (#34598) 2024-05-27 15:28:12 +08:00
596fb6f327 [improve](ub) fix some runtime error of ubsan when downcast (#35343)
This code works, but it reports some runtime errors under UBSAN,
so refactor it so that it runs cleanly under UBSAN.
2024-05-27 15:27:43 +08:00
c44affb43f Add downgrade scan thread num by column num (#35351) 2024-05-27 15:27:12 +08:00
6d362c1061 [fix](hint) fix hint tests with different be instances (#35188)
Problem:
When testing distribute hints with multiple BEs, the results are unstable.
Solved:
Add an ordered hint to every distribute hint, and change some leading hint cases to check for the presence of hint information.
2024-05-27 15:27:05 +08:00
68eda58a8c [Fix](multi-catalog) Fix string dict filtering when use null related function in parquet and orc reader. (#35335)
For the following SQL, when a null-related function is applied to a dictionary column, the results are incorrect.
```
select * from ( select IF(o_orderpriority IS NULL, 'null', o_orderpriority) AS o_orderpriority from test_string_dict_filter_orc ) as A where o_orderpriority = 'null';
```
```
select * from ( select IFNULL(o_orderpriority, 'null') AS o_orderpriority from test_string_dict_filter_parquet ) as A where o_orderpriority = 'null'
```
```
select * from ( select COALESCE(o_orderpriority, 'null') AS o_orderpriority from test_string_dict_filter_parquet ) as A where o_orderpriority = 'null';
```
2024-05-27 15:25:29 +08:00
Pxl
82ff29faea [Chore](materialized-view) forbid create mv on row store table (#35360)
forbid create mv on row store table
2024-05-27 15:25:16 +08:00
7284b6959f [Configurations](multi-catalog)Fix enable_orc_filter_by_min_max functionality, the mistake for #35012. (#35320)
Fix the bug introduced by #35012.
2024-05-27 15:25:07 +08:00
09f9012817 [Fix](hive-writer) Fix hive partition update core. (#35311)
Issue: #31442
```
/home/zcp/repo_center/doris_master/doris/be/src/common/signal_handler.h:421
1# PosixSignals::chained_handler(int, siginfo*, void*) [clone .part.0] in /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
2# JVM_handle_linux_signal in /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
3# 0x00007F963FA9D090 in /lib/x86_64-linux-gnu/libc.so.6
4# doris::vectorized::VHivePartitionWriter::_build_partition_update() at /home/zcp/repo_center/doris_master/doris/be/src/vec/sink/writer/vhive_partition_writer.cpp:215
5# doris::vectorized::VHivePartitionWriter::close(doris::Status const&) at /home/zcp/repo_center/doris_master/doris/be/src/vec/sink/writer/vhive_partition_writer.cpp:164
6# doris::vectorized::VHiveTableWriter::close(doris::Status) at /home/zcp/repo_center/doris_master/doris/be/src/vec/sink/writer/vhive_table_writer.cpp:209
7# doris::vectorized::AsyncResultWriter::process_block(doris::RuntimeState*, doris::RuntimeProfile*) at /home/zcp/repo_center/doris_master/doris/be/src/vec/sink/writer/async_result_writer.cpp:184
8# doris::vectorized::AsyncResultWriter::start_writer(doris::RuntimeState*, doris::RuntimeProfile*)::$_0::operator()() const at
```
2024-05-27 15:24:53 +08:00
f98ed4e4c5 [bugfix](hive)Misspelling of class names (#34981) 2024-05-27 15:24:38 +08:00
b1795d44ec [bugfix](hive)fix testcase for test_hive_write_different_path (#35209)
Hive's test environment uses docker, so when using 127.0.0.1,
BE writes the file to the docker on its own machine.
But if FE and BE are not on the same machine,
FE cannot read this file because it can only read the docker on its own machine.
Therefore, the address 127.0.0.1 cannot be used in the test environment.
2024-05-27 15:24:30 +08:00
5ab5ec3d0d [Fix](inverted index) fix build index wrong size for inverted index (#35366) 2024-05-27 15:24:17 +08:00
2422439e45 [Update](regression) add case for inverted index (#35305)
Co-authored-by: Kang <kxiao.tiger@gmail.com>
2024-05-27 15:24:09 +08:00
a82c6e869e [fix](Nereids) LogicalEmptyRelation type is wrong (#35382) 2024-05-27 15:23:46 +08:00
f99b2f0f82 [branch-2.1][hotfix](jdbc table) Restoring a table type that should not be deleted (#35434)
* [hotfix](jdbc table) Restoring a table type that should not be deleted

* add comment
2024-05-27 14:39:36 +08:00
2e20e38523 [improvement](jdbc catalog) remove useless jdbc catalog code (#34986) (#35418) 2024-05-27 14:25:26 +08:00
e3b4d4e630 Reset workload_group_max_num for regression test (#35430) 2024-05-27 14:10:25 +08:00
b6eaf95720 [fix](memory) Fix BE memory info compatible with Cgroup (#35412) (#35425)
1. `memory.usage_in_bytes ~= free.used + free.(buff/cache) - (buff)`. The cache can be reused,
   so change the calculation to cgroup_memory_usage = memory.usage_in_bytes - memory.meminfo["Cached"].
2. If the system is not configured with cgroup, finding the cgroup file path fails. Refactor the cgroup memory info refresh so that it tolerates this failure.
2024-05-27 12:31:44 +08:00
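The adjusted calculation from b6eaf95720 can be sketched as below. This is a Python illustration, not the BE implementation: `cgroup_memory_usage` and `read_cgroup_value` are hypothetical helper names, and both clamping at zero and returning `None` when the file is missing are assumptions about how a lookup failure might be tolerated.

```python
def cgroup_memory_usage(usage_in_bytes, meminfo):
    """Subtract the reusable page cache from the cgroup usage figure."""
    cached = meminfo.get("Cached", 0)
    # Clamp at zero (an assumption) so a stale reading cannot go negative.
    return max(usage_in_bytes - cached, 0)


def read_cgroup_value(path):
    """Return the integer stored in a cgroup file, or None if it is unavailable."""
    try:
        with open(path) as handle:
            return int(handle.read().strip())
    except (OSError, ValueError):
        # No cgroup configured or unreadable content: report "unknown"
        # to the caller instead of failing the whole refresh.
        return None
```

A caller would skip the cgroup adjustment when `read_cgroup_value` returns `None` and fall back to whatever system-wide figure it already has.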
af986c370b [feat](Nereids): Put the Child with Least Row Count in the First Position of Intersect (#34290) (#35339)
In this pull request, we optimize the ordering of children in the Intersect operator to improve query performance. The proposed change is to place the child with the least row count in the first position of the Intersect operator.

The rationale behind this optimization is that the Intersect operator works by first evaluating the leftmost child and then iterating through the results of the other children to find matching rows. By placing the child with the least row count first, we can minimize the number of iterations required to find the matching rows, thereby reducing the overall execution time of the query.
2024-05-27 11:52:35 +08:00
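The reordering described in af986c370b amounts to moving the child with the smallest estimated row count to the front. A minimal sketch, assuming each child carries a row count estimate; the `Child` type and `reorder_intersect_children` are hypothetical names, not Nereids classes:

```python
from dataclasses import dataclass


@dataclass
class Child:
    name: str
    row_count: float  # estimated rows produced by this child


def reorder_intersect_children(children):
    """Move the child with the least row count to position 0.

    The relative order of the remaining children is kept unchanged.
    """
    if not children:
        return []
    smallest = min(range(len(children)), key=lambda i: children[i].row_count)
    return [children[smallest]] + children[:smallest] + children[smallest + 1:]


plan = [Child("orders", 1000), Child("vip_users", 10), Child("recent", 500)]
reordered = reorder_intersect_children(plan)
```

Only the first position is changed here; whether the remaining children are also sorted is not stated in the commit message.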
a9bd98d65b [fix](nereids)AdjustNullable rule should handle union node with no children (#35323) 2024-05-27 10:06:20 +08:00
83cbb4e255 fix cloud mode 2024-05-27 09:56:26 +08:00
8f5deb10be [be](oom) add stacktrace in debugmode to find oom reason 2024-05-26 23:39:46 +08:00
ade1841a01 [fix](shuffle) Do not return error if local recvr is null (#35399) 2024-05-26 20:20:50 +08:00
6e17dc1e87 (cherry-pick)[branch-2.1] add calc tablet file crc and fix single compaction test #33076 #34915 (#35215)
* [fix](compaction test) show single replica compaction status and fix test (#33076)
* [improve](http action) add http interface to calculate the crc of all files in tablet (#34915)
2024-05-26 17:15:09 +08:00
a79b436b12 remove iscloud mode 2024-05-25 19:29:47 +08:00
65b9e5ab69 [fix](chore) fix DCHECK failure of BufferWritable if failed to alloc memory (#35345) 2024-05-25 17:48:04 +08:00
fff6ab933c [fix](clean trash) Add clean trash regression case (#35330) 2024-05-25 17:47:51 +08:00
952875b437 [chore](restore) Add logs about the restore table state (#35363) 2024-05-25 17:47:38 +08:00
806b7d68e4 [regression-test](fix) runtime_filter.groovy case bug (#35368) 2024-05-25 17:47:29 +08:00
Pxl
b143f0dfe2 [Improvement](date) shortcut for str to date parse (#35288)
shortcut for str to date parse
2024-05-25 17:47:20 +08:00
80ba873d84 [regression-test](fix) test_date_diff case bug (#35356) 2024-05-25 17:46:57 +08:00
34e5030702 [bugifx](core) fix logical error of status check in nestedloop join (#35365) 2024-05-25 17:46:44 +08:00
c6c90ff63e [chore](routine-load) make routine_load_consumer_pool_size can update using HTTP API (#35315) 2024-05-25 17:46:29 +08:00
41c3a27bce [minor](nereids): remove useless code (#35325) 2024-05-25 17:44:39 +08:00
5bcdc75283 fix compile 2024-05-25 09:00:48 +08:00
9c6a6893d9 [fix](mtmv) Fix npe when the id of base table in mv is lager than Integer.MAX_VALUE (#35294) (#35384)
This was introduced by #34768.
2024-05-24 23:27:08 +08:00
9af493f3f9 [fix](mtmv) Fix table id overturn and optimize get table qualifier method (#34768) (#35381)
commitid: 806e241
pr: #34768

Table ids may be the same even though they refer to different tables, so we optimize
org.apache.doris.nereids.rules.exploration.mv.mapping.RelationMapping#getTableQualifier with the following code:

Objects.hash(table.getDatabase().getCatalog().getId(), table.getDatabase().getId(), table.getId())

Table ids are long, but the tables used in mv rewrite are identified with a bitSet, and the bitSet can only use int, so we map each long id to an int id in every query during mv rewrite.
2024-05-24 21:19:15 +08:00
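The per-query mapping from long table ids to small int ids mentioned in 9af493f3f9 could look roughly like this. It is a Python sketch under assumptions: `TableIdMapper` and `to_bitset` are hypothetical names, and assigning dense ids in first-seen order is one possible scheme, not necessarily the one Doris uses.

```python
class TableIdMapper:
    """Assigns a small dense int to each (catalog, database, table) id triple.

    One mapper would be created per query, so the ints stay small enough
    to be used as bit positions.
    """

    def __init__(self):
        self._ids = {}

    def to_int(self, catalog_id, db_id, table_id):
        key = (catalog_id, db_id, table_id)
        if key not in self._ids:
            self._ids[key] = len(self._ids)  # next unused small int
        return self._ids[key]


def to_bitset(mapper, tables):
    """Encode a collection of tables as an int used as a bit set."""
    bits = 0
    for catalog_id, db_id, table_id in tables:
        bits |= 1 << mapper.to_int(catalog_id, db_id, table_id)
    return bits
```

Keying on the full triple keeps two tables with the same table id in different databases apart, and the table id itself may exceed the int range without affecting the bit positions.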
62998719df [opt](mtmv) Add threshold for relation mapping num when query rewrite (#34694) (#35378)
If the query and mv definitions are as follows:

    def mv1_1 = """
        select  t1.L_LINENUMBER,t2.l_extendedprice, t2.L_ORDERKEY
        from lineitem t1
        inner join lineitem t2 on t1.L_ORDERKEY = t2.L_ORDERKEY;
    """
    def query1_1 = """
        select  t1.L_LINENUMBER, t2.L_ORDERKEY
        from lineitem t1
        inner join lineitem t2 on t1.L_ORDERKEY = t2.L_ORDERKEY;
    """

this generates relation mappings by Cartesian product. If there are too many self joins, this causes a performance problem,
so we add the `materialized_view_relation_mapping_max_count` session variable, default 8. If the actual number is greater than this value, the excess relation mappings are discarded.
2024-05-24 20:36:29 +08:00
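To see why the threshold in 62998719df matters, the sketch below enumerates the ways the aliases of one self-joined table in a query can be mapped onto the aliases in the mv, and stops at a cap. `relation_mappings` is a hypothetical helper, and modelling the mappings as permutations of aliases is a simplification of what the optimizer does; only the default of 8 comes from the commit message.

```python
from itertools import islice, permutations


def relation_mappings(query_aliases, mv_aliases, max_count=8):
    """Return at most max_count alias mappings from query to mv.

    With n aliases of the same table on both sides there are n! candidate
    mappings, so the generator is cut off instead of fully materialized.
    """
    candidates = (
        dict(zip(query_aliases, perm))
        for perm in permutations(mv_aliases, len(query_aliases))
    )
    return list(islice(candidates, max_count))


two_way = relation_mappings(["q.t1", "q.t2"], ["mv.t1", "mv.t2"])
four_way = relation_mappings(
    ["q.t1", "q.t2", "q.t3", "q.t4"],
    ["mv.t1", "mv.t2", "mv.t3", "mv.t4"],
)
```

With two aliases both mappings are kept; with four aliases there are 24 candidates and everything past the first 8 is discarded.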
0f550aeda7 [fix](compression) handle exception to reuse compression context (#35338) (#35380)
* [fix](compression) handle exception to reuse compression context

Otherwise, there is a memory leak and a new context is allocated, and the resulting TLB flush
consumes a lot of sys CPU.
2024-05-24 19:56:27 +08:00
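The idea behind 0f550aeda7, returning the compression context to its pool even when compression throws, can be illustrated with a small Python sketch. `ContextPool` and `compress_with_pool` are hypothetical names and this is not the BE code; a real implementation may also need to reset the context before reuse.

```python
import queue


class ContextPool:
    """Hands out reusable contexts, creating one only when none is free."""

    def __init__(self, factory):
        self._factory = factory
        self._free = queue.SimpleQueue()
        self.created = 0  # number of contexts ever allocated

    def acquire(self):
        try:
            return self._free.get_nowait()
        except queue.Empty:
            self.created += 1
            return self._factory()

    def release(self, ctx):
        self._free.put(ctx)


def compress_with_pool(pool, compress, data):
    """Run compress(ctx, data), always giving the context back to the pool."""
    ctx = pool.acquire()
    try:
        return compress(ctx, data)
    finally:
        # Without this, a failed call would drop the context and force
        # a fresh allocation on the next call.
        pool.release(ctx)
```

After a failing call followed by a successful one, `pool.created` stays at 1, which is the reuse the fix is after.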