Commit Graph

10153 Commits

Author SHA1 Message Date
45874bbf62 [refactor](fs)(step2)separate the storage and filesystem methods (#19012)
Co-authored-by: jinzhe <jinzhe@selectdb.com>
2023-04-26 15:06:31 +08:00
6356146274 [Fix](Nereids) fix nereids fold failed by be return null exception (#19013)
```sql
select if(
    date_format(CONCAT_WS('', '9999-07', '-26'), '%Y-%m') = DATE_FORMAT(curdate(), '%Y-%m'),
    curdate(),
    DATE_FORMAT(DATE_SUB(month_ceil(CONCAT_WS('', '9999-07', '-26')), 1), '%Y-%m-%d')
) 
```
The query above returns null. When constructing the new children of if(), entries in the result map at indexes greater than 0 do not replace the corresponding entries in the const map, because of an incorrect value assignment in the code.
2023-04-26 14:57:45 +08:00
39cf393874 [fix](stats) Fix potential NPE when loading Histogram (#19078)
Return Histogram.UNKNOWN as the default when an error occurs during loading
2023-04-26 14:24:01 +08:00
d3a0b94602 [feature](stats) Support to kill analyze #18901
1. Report an error if analyze jobs are submitted while the stats table is not available
2. Support kill analyze
3. Support cancel sync analyze
2023-04-26 14:23:44 +08:00
50d9f35f63 [fix](planner) NPE when use ctas to create table (#18973)
This is caused by the exprs in orderbyelements not being analyzed.
2023-04-26 14:12:28 +08:00
7a786c3b09 [fix](Nerieds) fix bucket shuffle plan and cost model bugs and add new function add_months (#18836)
fix:
1. fix varchar(1) compare to varchar(2) bug
2. fix bucket shuffle join's cost model bug

feature:
1. support add_months function
2023-04-26 13:52:44 +08:00
ca19b972cc [doc](update-key)add update key doc (#18899) 2023-04-26 13:41:14 +08:00
270be55c4c [feat](stats) Add option to config file to enable or disable analyze function (#19062)
Add this option in conf:

    /**
     * If set false, user couldn't submit analyze SQL and FE won't allocate any related resources.
     */
    @ConfField
    public static boolean enable_stats = true;

It is checked during analysis of analyze-related statements and when the analyze manager is initialized.
2023-04-26 13:37:08 +08:00
aa88083c1e [fix](Nereids) dead loop in FillUpMissingSlots (#18902)
FillUpMissingSlots doesn't handle some corner cases. Sometimes no fill-up is needed, and in that case we should return null.
2023-04-26 13:31:51 +08:00
a7773d16d6 [fix](Nereids): UT shouldn't contains slotId (#19082) 2023-04-26 13:23:21 +08:00
94b11af17c [fixbug](json-reader) fix memory leak of new_json_reader #19067 2023-04-26 12:54:47 +08:00
5a7a96f317 [doc](fix)fix doc link error (#19083) 2023-04-26 12:33:13 +08:00
5bd4a3897e [optimize](multi-catalog) Skip whole row group in lazy_read if data has been filtered. (#19039)
We found that qt_q11 in the regression test test_external_catalog_hive is very slow.
The result is only one record, so the other data should be filtered out in the parquet lazy read path.
We then found that the parquet reader currently reads many records because it can only skip individual parquet pages. To skip a page, it must first read the page header, which triggers data prefetching. Prefetching may therefore be counterproductive in this case.

So there are two issues:

1. Skip the whole row group in this case.
2. Prefetching data in this case may not be beneficial and needs to be improved.

This PR resolves issue 1.
2023-04-26 12:10:14 +08:00
375789d345 [enhancement](JNI) Provide default environment variables if it is unset (#19041) 2023-04-26 12:06:38 +08:00
1c8b70a48c [refactor](config) Do not let set enable_vectorized_engine throw an error (#19002)
* update

* Update fe/fe-core/src/main/java/org/apache/doris/qe/SessionVariable.java

Co-authored-by: Mingyu Chen <morningman.cmy@gmail.com>

---------

Co-authored-by: Mingyu Chen <morningman.cmy@gmail.com>
2023-04-26 12:03:32 +08:00
8864266a42 [fix](Jdbc Catalog) fix Druid Pool parameter and set testWhileIdle = true (#19049)
Set `testWhileIdle` for the druid pool to true
2023-04-26 11:44:45 +08:00
d037938a4c [vectorzied](function) fix year_floor get result is incorrectly (#19006) 2023-04-26 11:39:22 +08:00
ca80617bfe [chore](CI)Regularly check project quality with Sonar (#18998)
Since forked repositories cannot obtain the token (for security reasons),
run it as a scheduled check instead.
2023-04-26 10:18:49 +08:00
5fd6d8ebd4 [fix](function) Support more behaviors of cast time in MySQL 2023-04-26 07:49:54 +08:00
c993964a88 [Bug](delete) fix the delete ignore char case (#18714) 2023-04-26 07:30:44 +08:00
2c836251b2 [Fix](schema scanner) Fixed the problem of overflow when multiplying two INT 2023-04-25 23:58:47 +08:00
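The kind of overflow named in 2c836251b2 can be illustrated with a minimal sketch. This is not the code from that commit: `widen_multiply` and `checked_multiply` are hypothetical helpers, and the sketch only assumes that two 32-bit operands were multiplied in 32-bit arithmetic. Widening both operands to 64 bits before multiplying keeps the product exact.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper (not from the Doris source): multiply two 32-bit
// values in 64-bit arithmetic so the product cannot overflow.
inline int64_t widen_multiply(int32_t a, int32_t b) {
    // Cast BEFORE multiplying; casting the 32-bit product would be too late.
    return static_cast<int64_t>(a) * static_cast<int64_t>(b);
}

// Hypothetical helper: returns true and stores the product in *out only
// when it fits in int32_t; otherwise returns false and leaves *out alone.
inline bool checked_multiply(int32_t a, int32_t b, int32_t* out) {
    const int64_t wide = widen_multiply(a, b);
    if (wide > std::numeric_limits<int32_t>::max() ||
        wide < std::numeric_limits<int32_t>::min()) {
        return false;
    }
    *out = static_cast<int32_t>(wide);
    return true;
}
```

Signed overflow is undefined behavior in C++, so the check has to happen on the widened value rather than on an already-overflowed 32-bit result.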
1be5dac036 [improve] Refactor file cache and Improve the file cache strategy (#18652)
1. Refactor file cache. Before the refactor, the file cache config format was "[{"path":"/path/to/file_cache","normal":21474836480,"persistent":10737418240,"query_limit":10737418240}]"; it is now "[{"path":"/mnt/disk3/selectdb_cloud/file_cache","total_size":21474836480,"query_limit":10737418240}]", which is simpler than before.
2. Support more strategies. Support file cache priority. The file cache has three queues, named 'index'/'normal'/'disposable'. This prevents higher-priority data from being evicted by lower-priority data.
2023-04-25 23:14:28 +08:00
c93d6ba3be [chore](third-party) Fix the checksums of mysql (#19047)
The checksum of MySQL changed, which makes the workflows fail.

See https://github.com/apache/doris-thirdparty/actions/runs/4794208534/jobs/8527425262.
2023-04-25 23:13:53 +08:00
bc154f7a71 Fix 404 links in README.md (#19040) 2023-04-25 22:31:34 +08:00
9c25b514f5 [fix](doc) fix jsonb_extract doc (#19059)
This will cause FE to fail to start.

1. Docs under sql-manual need a strict format.
2. Change the GitHub checks rule to run FE UT when docs under sql-manual are changed.
2023-04-25 20:01:51 +08:00
17b59df8dd [fix](function) Array_map compared offset rows one by one (#18406)
Array_map's multi-column comparison requires not only that the nested data rows are equal, but also that the offsets data are equal to each other.
2023-04-25 19:12:19 +08:00
41fbe711b0 [typo][samples](docs)(java) add read bitmap sample and update document. (#19005) 2023-04-25 19:07:51 +08:00
8ea69ca11c [refactor](nereids) do not use in_filter in pipeline mode (#19028)
1. In pipeline mode, in_or_bloom filter is replaced by bloom filter
2. do not set broadcast row limit
2023-04-25 19:02:12 +08:00
fa0f3a2859 [fix](planner) vdatetime_value.cpp:1585 Array access may overflow. (#18872)

    int64_t months = _year * 12 + _month - 1 + sign * (12 * interval.year + interval.month);
    _year = months / 12;
    if (_year > 9999) {
        return false;
    }
    _month = (months % 12) + 1;
    if (_day > s_days_in_month[_month]) {
        _day = s_days_in_month[_month];
        if (_month == 2 && doris::is_leap(_year)) {
            _day++;
        }
    }

The variable `months` may be negative. In that case `months % 12` can also be negative, so `_month` can end up outside the range 1..12, and indexing `s_days_in_month[_month]` can access the array out of bounds.
2023-04-25 17:57:21 +08:00
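The out-of-bounds risk described in fa0f3a2859 comes from C++ integer division and remainder truncating toward zero, so `-1 % 12` is `-1`, not `11`. The following is a minimal sketch of one way to normalize such a value, not the actual Doris fix. `YearMonth` and `split_months` are hypothetical names, and the sketch only assumes that `months` counts months from year 0 with January as 0.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type: a year plus a month in the range 1..12.
struct YearMonth {
    int64_t year;
    int64_t month;
};

// Hypothetical helper: split a zero-based month count into year and month
// using floor semantics, so a negative count never yields a negative month.
inline YearMonth split_months(int64_t months) {
    int64_t year = months / 12;
    int64_t rem = months % 12;  // truncates toward zero: -1 % 12 == -1
    if (rem < 0) {
        // Borrow one year so that the remainder lands in 0..11.
        rem += 12;
        year -= 1;
    }
    assert(rem >= 0 && rem < 12);
    return {year, rem + 1};
}
```

A caller can then reject a negative `year` explicitly (alongside the existing `> 9999` check) before using `month` as an array index.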
d5c82b2ea0 [optimize](regression case) Optimizing some regression case of inverted index (#19032) 2023-04-25 15:35:56 +08:00
8d21f20753 [enhancement](javaudf) not depend on parent will cause deconstructor core (#18948)
Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-04-25 15:26:54 +08:00
61b7a52444 [Enhancement](multi-catalogs) Use decimal V3 type in multi-catalogs module. (#18926)
1. Use decimal V3 type in JDBC and Iceberg tables.
2. Fix hdfs TVF decimal V3 type and regression test.
2023-04-25 14:49:40 +08:00
a4a85f2476 [feat](stats) Return job id for async analyze stmt (#18800)
1. Return job id from async analysis
2. Sync analysis jobs don't save to analysis_jobs anymore
2023-04-25 14:43:54 +08:00
339d804ec4 [Refactor](exceptionsafe) add factory creator to some class (#19000) 2023-04-25 14:33:47 +08:00
39d66ca2c6 [fix](parquet) hasn't initialize select vector when number of nested values equals zero (#18953)
Fix bug when reading array type in parquet file:
```
ERROR 1105 (HY000): errCode = 2, detailMessage = [INTERNAL_ERROR]Read parquet file xxx failed,
reason = [IO_ERROR]Decode too many values in current page
```
When reading normal columns, `ScalarColumnReader::_read_values` still calls `ColumnSelectVector::set_run_length_null_map` to initialize the select vector, but `ScalarColumnReader::_read_nested_column` doesn't do this, which makes the number of values wrong.
The situation where this error occurs is particularly extreme: The column pages have remaining values to be read,
but all of them are null values at ancestor level, so there's no actual read operation, just skipping null values at ancestor level.
2023-04-25 14:21:33 +08:00
7c9e6e6ad5 [typo](docs) format function Syntax desc (#19019) 2023-04-25 13:31:32 +08:00
a836a6a4fe [refactor](multi catalog)Refactor FileQueryScanNode init and finalize mothods(#18954)
Refactor FileQueryScanNode init and finalize methods.
Handle schema related initialization in init method, handle scan range generation in finalize method.
2023-04-25 11:18:21 +08:00
228cc90e4e [fix](session-var) ignore exception when setting global only var in non master FE (#18949)
Introduced from #18609.

When setting global variables from a Non Master FE, there will be an error like:

`Variable 'password_history' is a GLOBAL variable and should be set with SET GLOBAL`

This happens because, when setting global variables from a Non Master FE, Doris performs the following steps:

1. forward this SetStmt to Master FE to execute.
2. Change this SetStmt to "SESSION" level, and execute it again on this Non Master FE.

But a "GLOBAL only" variable, such as "password_history", cannot be set at the SESSION level.
So in step 2, "set password_history=xxx" without the "GLOBAL" keyword throws an exception.
In this case, we should just ignore the exception and return.
2023-04-25 11:05:09 +08:00
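The two-step flow described in 228cc90e4e can be sketched roughly as follows. The real logic lives in the Java FE. Every name here (`SetScope`, `GlobalOnlyError`, `global_only_vars`, `apply_locally`, `replay_on_non_master`) is hypothetical and only illustrates the idea of ignoring the GLOBAL-only rejection during the local SESSION-level replay.

```cpp
#include <set>
#include <stdexcept>
#include <string>

// Hypothetical scope of a SET statement.
enum class SetScope { Session, Global };

// Hypothetical error raised when a GLOBAL-only variable is set at SESSION level.
struct GlobalOnlyError : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Hypothetical registry of GLOBAL-only variables.
inline const std::set<std::string>& global_only_vars() {
    static const std::set<std::string> vars = {"password_history"};
    return vars;
}

// Hypothetical local apply: rejects GLOBAL-only variables at SESSION level.
inline void apply_locally(const std::string& name, SetScope scope) {
    if (scope == SetScope::Session && global_only_vars().count(name) > 0) {
        throw GlobalOnlyError("Variable '" + name +
                              "' is a GLOBAL variable and should be set with SET GLOBAL");
    }
    // A real implementation would store the new value here.
}

// Step 2 on a non-master FE: replay the statement at SESSION level.
// Returns true if it was applied locally, false if the GLOBAL-only
// rejection was ignored (the master already applied it in step 1).
inline bool replay_on_non_master(const std::string& name) {
    try {
        apply_locally(name, SetScope::Session);
        return true;
    } catch (const GlobalOnlyError&) {
        return false;
    }
}
```

Catching only the dedicated error type keeps any other failure in the local replay visible to the caller.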
e2afa07271 [fix](nereids) disable_join_reorder does not work with semi/anti #18898
semi/anti push rules should not work if disable_join_reorder = true;
2023-04-25 10:57:40 +08:00
8b27d42b9b [bugfix](MOW) fix core in set_txn_related_delete_bitmap (#18956)
FE clears the transaction info when a transaction times out, but the logic that
calculates the delete bitmap in DeltaWriter::close_wait continues to run. In
set_txn_related_delete_bitmap, we now return directly in that case.
2023-04-25 10:57:26 +08:00
d555bae290 [Bug](serde) fix serialize column to jsonb when meet boolean and decimal_v3 (#19011)
* [Bug](serde) fix serialize column to jsonb when meet boolean and decimal_v3

* add comment to explain why use uint8
2023-04-25 10:48:13 +08:00
4ef43f5374 [improvement](docs) Add sync statement docs (#18972) 2023-04-25 10:35:20 +08:00
fd4576e420 [Fix](auth) fix some problem of skip_localhost_auth_check in FE config #18996 2023-04-25 09:10:01 +08:00
171a194070 [minor](regression) fix unstable test case (#19018)
* [minor](regression) fix unstable test case

* update
2023-04-25 09:09:24 +08:00
93c48f2bb0 [fix](regression) fix show create table in_memory = false test result error #19022 2023-04-25 09:04:59 +08:00
72632b1e32 [improvement](regression-test) add max_failure_num to skip tests when too much failure #19003 2023-04-25 09:03:36 +08:00
bf75e74065 [typo](docs) add oceanbase jdbc catalog doc (#18994)
* [typo](docs) add oceanbase jdbc catalog doc

* fix
2023-04-25 08:50:31 +08:00
8e808abbd4 [doc](remove-useless-code)remove useless doc description #18957
Co-authored-by: journeychen <journeychen@tencent.com>
2023-04-25 08:49:24 +08:00
207c827cdb [fix](test) fix result of CHARACTER_OCTET_LENGTH in . (#18896) 2023-04-25 08:42:54 +08:00
4e9b32d622 [bugfix](exception) remove fmt code to test if there still exist core (#19009) 2023-04-25 07:24:14 +08:00