Commit Graph

10965 Commits

Author SHA1 Message Date
32b0013a35 [fix](memory) Fix query memory tracking #20253
Memory released when a query ends is now recorded in the query mem tracker; this is mainly the memory held in _runtime_state.
Fix page no-cache memory tracking.
The main remaining cause of inaccurate query memory tracking is that the virtual memory used by a query is sometimes much larger than its actual memory, and the mem hook counts virtual memory.
2023-06-05 08:33:38 +08:00
50ce237a24 [fix](regression) exclude test_analyze_stats_p1 suite (#20366)
test_analyze_stats_p1 fails consistently in the regression tests; @morrySnow suggests excluding it for now.

http://43.132.222.7:8111/test/-5693062769677098407?currentProjectId=Doris_DorisRegression&expandTestHistoryChartSection=true&expandedTest=build%3A%28id%3A155592%29%2Cid%3A9944
2023-06-05 08:21:46 +08:00
Pxl
8e39f0cf6b [Enchancement](Agg State) storage function name and result is nullable in agg state type (#20298)
Store the function name and whether the result is nullable in the agg state type.
2023-06-04 22:44:48 +08:00
e6f395c9da [build](scripts) modify build-for-release.sh (#20398) 2023-06-04 20:47:42 +08:00
b0bbff0fd1 [performance](load) improve memtable sort performance (#20392) 2023-06-04 20:33:15 +08:00
34a1b7599f [Fix](lazy_open) fix lazy open commit info lose (#20404) 2023-06-04 19:08:36 +08:00
e1dbee5e90 [pipeline](opt) Opt fragment instance prepare performance by thread pool (#20399) 2023-06-04 12:10:35 +08:00
efb3b0f3ef [fix](community) fix PR template (#20400) 2023-06-04 08:44:32 +08:00
5b026df60d [typo](doc) Fixed typos in hive.md (#19457) 2023-06-03 21:24:07 +08:00
dd958b9745 [typo](doc)Remove the description of the BE configuration 'serialize_batch' which has been removed (#20163) 2023-06-03 21:23:25 +08:00
4c9b94a726 [typo](doc)Update runtime-filter.md (#20292) 2023-06-03 21:22:28 +08:00
282e3a3e85 [typo](doc)Update compilation-general.md (#20261)
Add some explanations about docker run parameter
2023-06-03 17:32:12 +08:00
ebc12ac55e [typo](doc)Update compilation-general.md (#20262)
Add some explanations about docker run parameter
2023-06-03 17:31:41 +08:00
4c2de40c17 [typo](doc)Update stream-load-manual.md (#20277)
Modify the sequential label
2023-06-03 17:31:21 +08:00
bc830d611a [docs](auth) forbid 127.0.0.1 passwd free login (#19096) 2023-06-03 17:30:21 +08:00
ad5e34ab9c [Doc](statistics) supplement stats doc (regression test and automatic collection) (#20071) 2023-06-03 17:25:33 +08:00
ce3be0c0a7 [docs](load-balancing):delete duplicate sentences and improve the documentation description (#20297) 2023-06-03 17:02:16 +08:00
160eae80c0 [docs](workload-group) add user binding workload group docs (#20382) 2023-06-03 17:01:31 +08:00
997f7ecc07 [typo](doc)Add a demo of export minio (#20323) 2023-06-03 16:59:32 +08:00
77855fcd43 [fix](inverted index) fix transaction id changed when light index change (#20302) 2023-06-03 16:05:02 +08:00
ffadaa4935 [improvement](inverted index) skip write index on load and generate index on compaction (#20325) 2023-06-03 16:03:21 +08:00
3e186a8821 [opt](MergedIO) optimize merge small IO, prevent amplified read (#20305)
Optimize the strategy for merging small IO to prevent severe read amplification, and turn off merged IO when the file cache is enabled.
Adjustable parameters:
```
// the max amplified read ratio when merging small IO
max_amplified_read_ratio=0.8
// the min segment size
file_cache_min_file_segment_size = 1048576
```
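A minimal Java sketch of how a ratio like max_amplified_read_ratio could bound merging. It is an illustration under assumptions, not the Doris implementation: the class name, the method name, and the reading of the ratio as "unrequested gap bytes divided by the merged read size" are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; not the Doris MergedIO implementation.
class MergedIoSketch {
    // Merges sorted, non-overlapping byte ranges given as {start, end} pairs
    // (end exclusive). A range joins the current merged read only while the
    // share of unrequested "gap" bytes stays at or below maxAmplifiedReadRatio
    // (assumed meaning of the ratio).
    static List<long[]> merge(List<long[]> sortedRanges, double maxAmplifiedReadRatio) {
        List<long[]> merged = new ArrayList<>();
        if (sortedRanges.isEmpty()) {
            return merged;
        }
        long start = sortedRanges.get(0)[0];
        long end = sortedRanges.get(0)[1];
        long requested = end - start; // bytes the caller actually asked for
        for (int i = 1; i < sortedRanges.size(); i++) {
            long[] r = sortedRanges.get(i);
            long newRequested = requested + (r[1] - r[0]);
            long span = r[1] - start;
            double amplified = (double) (span - newRequested) / span;
            if (amplified <= maxAmplifiedReadRatio) {
                // Cheap enough: extend the current merged read.
                end = r[1];
                requested = newRequested;
            } else {
                // Too much wasted read: close the current read, start a new one.
                merged.add(new long[] {start, end});
                start = r[0];
                end = r[1];
                requested = end - start;
            }
        }
        merged.add(new long[] {start, end});
        return merged;
    }

    public static void main(String[] args) {
        List<long[]> ranges = new ArrayList<>();
        ranges.add(new long[] {0, 100});
        ranges.add(new long[] {150, 250});
        ranges.add(new long[] {10_000, 10_100});
        for (long[] r : merge(ranges, 0.8)) {
            System.out.println("[" + r[0] + ", " + r[1] + ")");
        }
    }
}
```

In this sketch the first two ranges merge because only 50 of 250 bytes are wasted, while the distant third range stays a separate read.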
2023-06-03 10:51:24 +08:00
6958a8f92f [fix](dynamic_partition) fix dead lock when modify dynamic partition property for olap table (#20390)
Co-authored-by: caiconghui1 <caiconghui1@jd.com>
2023-06-03 08:25:20 +08:00
a389a51a61 [typo](docs)clearly describe the rename syntax (#20335) 2023-06-02 22:48:25 +08:00
0e06f15c87 [DOCS](data-types) remove old types (#20375) 2023-06-02 22:47:22 +08:00
4e4972d311 [fix](regression) test_partial_update_with_row_column (#20279) 2023-06-02 21:51:33 +08:00
Pxl
c2e96c7fa6 [Bug](schema-change) make test_dup_mv_schema_change more stable #20379
make test_dup_mv_schema_change more stable
2023-06-02 21:25:27 +08:00
Pxl
90d710e83d [Enchancement](function) optimize for padding function && add string length check on string op (#20363) 2023-06-02 21:24:41 +08:00
b62c5a70c7 [fix](match query) fix array column match query failed without inverted index (#20344) 2023-06-02 21:10:12 +08:00
299c3dc396 [fix](Nereids) should not inherit child's limit and offset when generate exchange node (#20373)
In the legacy planner, a newly created exchange node inherits its child's limit and offset.
Nereids should not do this, because it sets the limit or offset explicitly whenever one is needed.
This PR uses a new ExchangeNode constructor to ensure the limit and offset are not set unexpectedly.
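The difference between the two constructors can be sketched roughly as follows. The class names, fields, and the boolean flag are hypothetical and chosen for illustration; the real ExchangeNode signature may differ.

```java
// Hypothetical stand-in for a plan node; -1 means "no limit" in this sketch.
class PlanNodeSketch {
    long limit = -1;
    long offset = 0;
}

// Hypothetical stand-in for an exchange node with two construction styles.
class ExchangeNodeSketch extends PlanNodeSketch {
    final PlanNodeSketch child;

    // Legacy-planner style: copy the child's limit and offset.
    ExchangeNodeSketch(PlanNodeSketch child) {
        this(child, true);
    }

    // Nereids style passes inheritLimitAndOffset = false and sets the
    // limit or offset explicitly later if it needs one.
    ExchangeNodeSketch(PlanNodeSketch child, boolean inheritLimitAndOffset) {
        this.child = child;
        if (inheritLimitAndOffset) {
            this.limit = child.limit;
            this.offset = child.offset;
        }
    }
}
```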
2023-06-02 19:55:33 +08:00
a8e0841ef1 [fix](workload-group) fix incorrect memoryLimitPercent value (#20377) 2023-06-02 18:57:57 +08:00
adc3acb283 [fix](match) fix match query with compound predicates return -6003 (#20361) 2023-06-02 18:25:37 +08:00
0b86d07c0c [Docs](docs) Update BE http documents (#17604) 2023-06-02 18:01:44 +08:00
a20a6d2bea [refactor](jdbc catalog) Refactor the JdbcClient code (#20109)
This PR does the following:

1. This PR substantially refactors the JDBC client architecture. The previous monolithic JDBC client is now an abstract base class `JdbcClient` with a set of database-specific subclasses (e.g., `JdbcMySQLClient`, `JdbcOracleClient`), and the configuration that `JdbcClient` requires is abstracted into an object. This improves modularity, makes it easier to add support for new databases, and yields cleaner, more maintainable code. The change is backward-compatible and does not affect existing functionality.
2. As a result of the client refactoring, OceanBaseClient automatically recognizes whether it is running in MySQL or Oracle mode, so the oceanbase_mode property is removed from the Jdbc Catalog. Because the property is removed, when creating a single OceanBase Jdbc Table the table type must be set to oceanbase (MySQL mode) or oceanbase_oracle (Oracle mode). Note that this changes usage behavior.
3. For the PostgreSQL Jdbc Catalog, I did two things:

      1.   Added support for MATERIALIZED VIEW and FOREIGN TABLE
      2.   Fixed reading jsonb, which had been incorrectly changed to json in a previous PR

4. Fix some JDBC catalog test cases
5. Update the OceanBase JDBC doc

Thanks to @wolfboys for the guidance.
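The shape of the refactor in item 1 (an abstract base class, database-specific subclasses, and a config object) might look roughly like the Java sketch below. Every name here carries a "Sketch" suffix to mark it as hypothetical; the fields, the `databaseTypeName` method, and the URL-prefix factory are assumptions for illustration, not the actual Doris classes.

```java
import java.util.Locale;

// Hypothetical config object; the field names are illustrative only.
final class JdbcClientConfigSketch {
    final String jdbcUrl;
    final String user;

    JdbcClientConfigSketch(String jdbcUrl, String user) {
        this.jdbcUrl = jdbcUrl;
        this.user = user;
    }
}

// Hypothetical abstract base: shared state lives here, while
// database-specific behavior is left to the subclasses.
abstract class JdbcClientSketch {
    protected final JdbcClientConfigSketch config;

    protected JdbcClientSketch(JdbcClientConfigSketch config) {
        this.config = config;
    }

    abstract String databaseTypeName();

    // Assumed factory: choose the subclass from the JDBC URL prefix.
    static JdbcClientSketch create(JdbcClientConfigSketch config) {
        String url = config.jdbcUrl.toLowerCase(Locale.ROOT);
        if (url.startsWith("jdbc:mysql:")) {
            return new MySqlClientSketch(config);
        }
        if (url.startsWith("jdbc:oracle:")) {
            return new OracleClientSketch(config);
        }
        throw new IllegalArgumentException("Unsupported JDBC URL: " + config.jdbcUrl);
    }
}

final class MySqlClientSketch extends JdbcClientSketch {
    MySqlClientSketch(JdbcClientConfigSketch config) {
        super(config);
    }

    @Override
    String databaseTypeName() {
        return "MYSQL";
    }
}

final class OracleClientSketch extends JdbcClientSketch {
    OracleClientSketch(JdbcClientConfigSketch config) {
        super(config);
    }

    @Override
    String databaseTypeName() {
        return "ORACLE";
    }
}
```

With this layout, supporting another database means adding one subclass and one branch in the factory, without touching the existing subclasses.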
2023-06-02 17:58:10 +08:00
c2121c831a [typo](docs) Update the help create command display (#20357) 2023-06-02 17:57:23 +08:00
4395fb70c4 [Enhancement](tvf) Backends tvf supports authentication (#20333)
Add authentication for backends tvf.
2023-06-02 17:53:44 +08:00
9c9f5fec0f [chore](function) Refactor FunctionSet Initialization for Better Maintainability and Compilation Success (#20285)
In this PR, I have refactored the initialization of the FunctionSet. Previously, all the functions were in one large method which led to the generation of Java code that was too long. This posed a problem for the compiler, as the length of the method exceeded the limit imposed by the Java compiler.

To resolve this issue and improve the readability and manageability of our code, I have categorized these functions by type, and created dedicated initialization methods for each type. As such, our code is now not only more readable and understandable, but also each method is of a length that is acceptable to the compiler and can be compiled successfully.

Moreover, this change makes it easier for us to add new functions as we can directly locate the right category and add new functions there.

This is a significant change aimed at enhancing the maintainability and scalability of our code, while ensuring that our code can be successfully compiled.
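For context, the JVM class file format caps a single method's bytecode at 65535 bytes, and javac rejects longer methods with a "code too large" error. The split described above can be sketched as follows; the class, the method names, and the registered functions are hypothetical placeholders, not the real FunctionSet contents.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical registry illustrating per-category initialization methods.
class FunctionRegistrySketch {
    // Maps a function name to its category (illustrative payload only).
    private final Map<String, String> functions = new HashMap<>();

    FunctionRegistrySketch() {
        // Each category has its own method, so no single method grows
        // past the compiler's per-method size limit.
        initMathFunctions();
        initStringFunctions();
    }

    private void initMathFunctions() {
        functions.put("abs", "math");
        functions.put("ceil", "math");
    }

    private void initStringFunctions() {
        functions.put("lpad", "string");
        functions.put("rpad", "string");
    }

    String categoryOf(String name) {
        return functions.get(name);
    }

    int size() {
        return functions.size();
    }
}
```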
2023-06-02 17:50:47 +08:00
fb730fb653 [chore](third-party) Bump the version of hadoop_libs (#20369)
Bump the version of hadoop_libs to build HDFS related libraries only.
2023-06-02 17:18:36 +08:00
386a4a0b43 [fix](nereids) add fragment id on all PhysicalRelation (#20371)
fix "cannot find fragment id for scan" exception
2023-06-02 17:13:09 +08:00
78c37b5244 [Optimize](Function) Add fast path for col like '%%' or col like '%' or regexp '\\.*' (#20143)
Add fast path for col like '%%' or col like '%' or regexp '\\.*'
(1) LIKE: about 34% faster in a count(*) test
supports col like '%%', col like '%', col not like '%%', col not like '%'

(2) REGEXP: about 37% faster in a count(*) test
supports col regexp '\\.*', col not regexp '\\.*'

Q1: select count(*) from hits where url like '%';
Q2: select count(*) from hits where url regexp '\\.*';
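The idea behind the LIKE fast path can be sketched in Java: a non-empty pattern made only of '%' matches every non-NULL string, so the general matcher can be skipped. This is an illustration under assumptions, not the BE implementation; the class and method names are hypothetical, and the slow path is a simplified translation that ignores escape characters.

```java
import java.util.regex.Pattern;

// Illustrative sketch only; not the Doris implementation.
class LikeFastPathSketch {
    // True when the pattern is one or more '%' characters and nothing else.
    static boolean isMatchAllPattern(String pattern) {
        if (pattern.isEmpty()) {
            return false; // LIKE '' matches only the empty string
        }
        for (int i = 0; i < pattern.length(); i++) {
            if (pattern.charAt(i) != '%') {
                return false;
            }
        }
        return true;
    }

    // Returns null for a SQL NULL input, mirroring three-valued logic.
    static Boolean like(String value, String pattern) {
        if (value == null) {
            return null;
        }
        if (isMatchAllPattern(pattern)) {
            return Boolean.TRUE; // fast path: no pattern matching needed
        }
        return slowLike(value, pattern);
    }

    // Simplified general path: '%' matches any run, '_' matches one character.
    static boolean slowLike(String value, String pattern) {
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < pattern.length(); i++) {
            char c = pattern.charAt(i);
            if (c == '%') {
                regex.append(".*");
            } else if (c == '_') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL).matcher(value).matches();
    }
}
```

NOT LIKE can reuse the same check by negating the non-NULL result.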
2023-06-02 16:26:56 +08:00
422fcd6377 [fix](Nereids) forbid unexpected expression on filter and fix two more bugs (#20331)
fix below bugs:
1. Filter expressions were not checked: aggregate functions, grouping scalar functions and window expressions should not appear in a filter
2. should not change the nullable property of an aggregate function when it is used as a window function in a window expression
3. bitmap and other metric types should not appear in the ORDER BY or PARTITION BY of a window expression
2023-06-02 16:19:50 +08:00
b1e6c6ffe5 [enhancement](txn) print commit backends when commit fails (#20367)
Print commit backends when a commit fails.
2023-06-02 15:10:38 +08:00
06e7c14320 [Improve](json-array) Support json array with nereids bool (#20248)
Support json array with nereids bool
Current behavior:

```
set enable_nereids_planner=true;
mysql> SELECT json_array(1, "abc", NULL, TRUE, '10:00:00');
+----------------------------------------------+
| json_array(1, 'abc', NULL, TRUE, '10:00:00') |
+----------------------------------------------+
| [1,"abc",null,false,"10:00:00"]              |
+----------------------------------------------+
1 row in set (0.02 sec)
```
 
A Nereids boolean literal is rendered as "true"/"false" rather than '0'/'1', so the result was always false.
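The mismatch can be illustrated with a minimal Java sketch. It assumes, hypothetically, that the boolean argument arrives as text and that the old code compared it only against "1"; the class and method names are invented for illustration and are not taken from Doris.

```java
// Illustrative sketch only; not the Doris implementation.
class BoolLiteralSketch {
    // Assumed old behavior: only "1" counts as true, so "true" becomes false.
    static boolean parseNumericOnly(String text) {
        return "1".equals(text);
    }

    // Tolerant variant: accepts both the '0'/'1' and "true"/"false" spellings.
    static boolean parseTolerant(String text) {
        return "1".equals(text) || "true".equalsIgnoreCase(text);
    }
}
```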
2023-06-02 14:47:24 +08:00
098c735064 [pipeline](fix) rm github_token, no need for it (#20360) 2023-06-02 14:11:21 +08:00
d68f3f3b3d [Feature](array-functions)improve array functions for array_last_index (#20294)
Currently only array_first_index is supported for lambda input; array_last_index is not.
2023-06-02 13:54:03 +08:00
8ff8705b3f [fix](olap) deletion statement with space conditions did not take effect (#20349)
With a deletion statement like this:

delete from tb where k1 = '  ';
the rows whose k1 value is ' ' were not deleted.
2023-06-02 13:52:57 +08:00
a869056567 [performance](load) support parallel memtable flush for unique key tables (#20308) 2023-06-02 13:49:53 +08:00
e32eba8fdf [refactor](stats) Persist status of analyze task to FE meta data (#20264)
1. Previously, a BE table named `analysis_jobs` persisted the status of analyze jobs/tasks. This had many flaws; for example, if a BE crashed, the analyze job/task would fail but its status could not be updated.
2. Support `DROP ANALYZE JOB [job_id]` to delete an analyze job
3. Support `SHOW ANALYZE TASK STATUS [job_id]` to get the task status of a specific job
4. Restrict when auto analyze can run: it is executed again only after the last auto analyze job finished a while ago
5. Support analyzing a whole DB
2023-06-02 12:33:31 +08:00
62c188d9a2 [typo](docs) fix release note 2.0 zh url (#20320) 2023-06-02 11:45:24 +08:00
dc43e65d06 [Bug](pipeline) Fix memory leak if query is canceled caused by memory limit (#20316) 2023-06-02 11:42:52 +08:00