16886 Commits

Author SHA1 Message Date
847db2c015 [Enhancement](group commit) Add retry message for group commit load while schema changing (#30391) 2024-02-16 10:12:23 +08:00
4701dd49c3 (selectdb-cloud) Use info level in recordCreatePartitionFailedMsg() because intersections happen all the time (#30448) 2024-02-06 08:35:54 +08:00
bc2e8ac8f9 [fix](kerberos) fix kerberos ugi login method (#30766) 2024-02-06 08:35:54 +08:00
8e147f4c93 [BugFix](MultiCatalog) Fix oss file location is not available in iceberg hadoop catalog (#30761)
1. Create an iceberg hadoop catalog like below:
CREATE CATALOG iceberg_catalog PROPERTIES (
"warehouse" = "s3a://xxx/xxx",
"type" = "iceberg",
"s3.secret_key" = "*XXX",
"s3.region" = "region",
"s3.endpoint" = "http://xxx.jd.local",
"s3.bucket" = "xxx-test",
"s3.access_key" = "xxxxx",
"iceberg.catalog.type" = "hadoop",
"fs.s3a.impl" = "org.apache.hadoop.fs.s3a.S3AFileSystem",
"create_time" = "2024-02-02 11:15:28.570"
);

2. Run select * from iceberg_catalog.table limit 1;

Actual: errCode = 2, detailMessage = Unknown file location nullnulls3a:/xxxx

Expected:
OK

This also needs to be backported to branch-2.0.
2024-02-06 08:35:54 +08:00
92226c986a [fix](catalog) fix date_sub/date_add func pushdown in jdbcscan (#30807) 2024-02-06 08:35:54 +08:00
06ed5780e4 [opt](catalog) cache the converted properties (#30668)
Converting properties may be a heavy operation, so we cache the result.
2024-02-06 08:35:54 +08:00
cde74cbfee [feature](test)Add mtmv agg under join cases (#30793) 2024-02-05 22:23:16 +08:00
73940f96d3 [opt](string_to_unsigned_int) performance opt (#30825) 2024-02-05 22:23:16 +08:00
1ed24117ac [function](url_decode)add url_decode function (#30667) 2024-02-05 22:23:00 +08:00
4e8c94ef14 [config](move-memtable) set LOAD_STREAM_PER_NODE default to 2 (#30830) 2024-02-05 22:23:00 +08:00
bff3b04029 [fix](cosn) use s3 client to read cosn on BE side (#30835) 2024-02-05 22:22:59 +08:00
09ef78402e fix build error 2024-02-05 22:15:17 +08:00
d123abc903 disable check segment when build rowset meta by default (#30857) 2024-02-05 22:00:36 +08:00
e9f9fdf9af Fix unstable analyze mv case. (#30859) 2024-02-05 22:00:36 +08:00
501ece3123 Collect index row count for MTMV. (#30855) 2024-02-05 22:00:36 +08:00
2c99c53812 [refactor](taskqueue) remove old task scheduler based wg (#30832)
---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-02-05 22:00:27 +08:00
cffe79feba open workload group for broker load regression test (#30797) 2024-02-05 22:00:26 +08:00
0d32aeeaf6 [improvement](load) Enable lzo & Remove dependency on Markus F.X.J. Oberhumer's lzo library (#30573)
Issue Number: close #29406

1. Increase the lzop version to 0x1040.
    This is only so that lzo files compressed by a higher version of lzop can be decompressed;
    the decompression logic is unchanged.
    Strictly, 0x1040 should include the "F_H_FILTER" feature,
    but it is mainly for audio and image data, so we do not support it.
2. Use orc::lzoDecompress() instead of lzo1x_decompress_safe() to decompress lzo data.
3. Use crc32c::Extend() instead of lzo_crc32().
4. Use olap_adler32() instead of lzo_adler32().
5. Thus, remove the dependency on Markus F.X.J. Oberhumer's lzo library.
6. Remove DORIS_WITH_LZO, so lzo files are supported by stream and broker load by default.
7. Add some regression tests.
2024-02-05 22:00:24 +08:00
499fd27ed0 [config](move-memtable) set StreamWait timeout default to 10min (#30831) 2024-02-05 21:59:55 +08:00
2344aaf337 [fix](join) JoinHashTable::pre_build_idxs should be const (#30837) 2024-02-05 21:59:55 +08:00
be31b8dc61 [Refactor](exchange) remove useless code in exchange and opt some code (#30813) 2024-02-05 21:59:52 +08:00
4b42156fc0 [chore](clang-tidy): add bugprone linters (#29521)
This PR introduces 4 bugprone linter rules to .clang-tidy; these linters found some bugs in #28965. This PR also adds some comments to mute false-positive reports.
2024-02-05 21:58:08 +08:00
7840e7cb81 [typo](doc) modify format (#30816) 2024-02-05 21:58:08 +08:00
3a752b758a [fix](Nereids) colocate node attr lost after merge fragment (#30818) 2024-02-05 21:58:08 +08:00
255ca143f8 [fix](chinese) fix the issue where the be crashes due to the missing chinese dict (#30712) 2024-02-05 21:57:29 +08:00
a5d9004974 [fix](Nereids) physical property deriver on some node is not right (#30819) 2024-02-05 21:57:29 +08:00
fc762f426b [enhance](mtmv) mtmv disable hive auto refresh (#30775)
- If the `related table` is `hive`, do not refresh automatically
- If the `related table` is `hive`, the partition col is allowed to be `null`. Otherwise, it must be `not null`
- add more `ut`
2024-02-05 21:56:57 +08:00
8ff8d94697 [fix](ip) change IPv6 to little-endian byte order storage (like IPv4) (#30730) 2024-02-05 21:56:57 +08:00
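As an illustration of the byte-order change named in the commit title above, here is a minimal Python sketch. It assumes "little-endian storage" means the 128-bit address value is written least-significant byte first; the function names `ipv6_to_little_endian` and `little_endian_to_ipv6` are hypothetical and are not taken from the Doris code base.

```python
import ipaddress


def ipv6_to_little_endian(addr):
    """Encode an IPv6 address as 16 bytes, least-significant byte first.

    Assumption: this mirrors what "little-endian byte order storage" means
    in the commit title; it is not the actual Doris implementation.
    """
    value = int(ipaddress.IPv6Address(addr))
    return value.to_bytes(16, "little")


def little_endian_to_ipv6(raw):
    """Decode 16 little-endian bytes back into the textual IPv6 form."""
    return str(ipaddress.IPv6Address(int.from_bytes(raw, "little")))
```

The network (big-endian) form of the same address is available as `ipaddress.IPv6Address(addr).packed`; the little-endian encoding is that byte string reversed.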
cd939fcca2 [Enhancement](group commit) Optimize group commit block sink wal disk space log #30811 2024-02-05 21:56:57 +08:00
d1bb63ed67 [fix](arrow-flight) Modify FE Arrow version to 15.0.0 #30824 2024-02-05 21:56:57 +08:00
48aaaa8005 [Enhancement](function) change function REPEAT nullable mode (#30743) 2024-02-04 22:21:36 +08:00
aed858a442 [improve](log) print query_id when fold constant on BE (#30802) 2024-02-04 22:21:36 +08:00
88ff9c06cf [test](mtmv)fix table name duplicate (#30808) 2024-02-04 22:21:36 +08:00
27f65f4463 [Feature](executor)Stream load support workload group (#30763)
* Stream load support workload group

* skip mysql load
2024-02-04 22:21:36 +08:00
25f6a733fe [fix](stats) keep threads in pool alive to maintain reasonable parallelism (#30451) 2024-02-04 22:21:16 +08:00
d32292b292 [regression-test][conf] add master_sync_policy = WRITE_NO_SYNC replica_sync_policy = WRITE_NO_SYNC (#30494)
There is no power-off scenario in regression-test, so adding these two configs has no side effects.
2024-02-04 22:21:16 +08:00
ccbcf879b5 [test](mtmv) Add materialized view availability regression test (#30769)
Add materialized view availability regression test

When the mv refresh_time is within the grace_period (unit: seconds), the materialized view is used for
query rewrite regardless of whether the base table has been updated.
When the mv refresh_time is outside the grace_period (unit: seconds), the base table is checked for updates;
if it has been updated, the materialized view is not used for query rewrite.
2024-02-04 22:21:16 +08:00
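The grace-period rule described in the commit message above (#30769) can be sketched as a small decision function. This is only an illustration in Python: the function name `can_use_mv_for_rewrite` and its parameters are hypothetical, and treating the boundary (age exactly equal to the grace period) as "within" is an assumption, since the message does not say.

```python
from datetime import datetime, timedelta


def can_use_mv_for_rewrite(refresh_time, now, grace_period_seconds, base_table_updated):
    """Return True if the materialized view may be used for query rewrite.

    Hypothetical helper; the inclusive boundary check is an assumption.
    """
    within_grace = (now - refresh_time) <= timedelta(seconds=grace_period_seconds)
    if within_grace:
        # Inside the grace period the view is used whether or not
        # the base table has been updated.
        return True
    # Outside the grace period the view is usable only if the
    # base table has not been updated since the refresh.
    return not base_table_updated
```

For example, with a 10-second grace period, a view refreshed 20 seconds ago is skipped when the base table has changed and used when it has not.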
9e76592297 Support analyze materialized view. (#30540) 2024-02-04 22:21:16 +08:00
e891a095e7 check segment num when build rowset meta (#30803) 2024-02-04 18:15:12 +08:00
91a669f5fd [chore](mac compile) remove using regex to avoid frequent mac compile failures #30783 2024-02-04 14:28:38 +08:00
Pxl
1d39e16eda [Bug](compaction) pass arena to function->add_batch_range (#30709) 2024-02-04 14:28:38 +08:00
121d52dd37 [test](mtmv) Add mtmv basic one and two dimensional test cases (#30651) 2024-02-04 14:28:38 +08:00
383850ef12 [Opt](multi-catalog) Opt split assignment to resolve uneven distribution. (#30390)
[Opt] (multi-catalog) Opt split assignment to resolve uneven distribution. Currently only for `FileQueryScanNode`.

Referring to the implementation of Trino:
- Local node soft affinity optimization. Prefer the local replication node.
- When the file cache is turned on, remote splits use the consistent hash algorithm. Because consistent hashing can be uneven, the splits are then re-adjusted so that the maximum and minimum split counts across hosts differ by at most `max_split_num_variance` splits.
- When the file cache is turned off, remote splits use the round-robin algorithm.
2024-02-04 14:28:38 +08:00
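The two remote-split strategies described in the commit message above (#30390) can be sketched roughly as follows. This is a simplified Python illustration, not the `FileQueryScanNode` code: the function names, the SHA-256 hash ring, and the `virtual_nodes` parameter are all assumptions, and the local-affinity step is left out. Only `max_split_num_variance` comes from the commit message; the sketch clamps it to at least 1 so the rebalancing loop always terminates.

```python
import hashlib
from bisect import bisect_right
from itertools import cycle


def assign_round_robin(splits, hosts):
    """File cache off: hand remote splits to hosts in rotation."""
    assignment = {h: [] for h in hosts}
    for split, host in zip(splits, cycle(hosts)):
        assignment[host].append(split)
    return assignment


def _hash(key):
    # Stable 64-bit hash of a string key (illustrative choice of hash).
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


def assign_consistent_hash(splits, hosts, max_split_num_variance=1, virtual_nodes=64):
    """File cache on: consistent hashing, then even out the per-host counts."""
    # Build the hash ring with several virtual nodes per host.
    ring = sorted((_hash(f"{h}#{i}"), h) for h in hosts for i in range(virtual_nodes))
    points = [p for p, _ in ring]
    assignment = {h: [] for h in hosts}
    for split in splits:
        # First ring point clockwise from the split's hash, wrapping around.
        idx = bisect_right(points, _hash(split)) % len(ring)
        assignment[ring[idx][1]].append(split)

    # Re-adjust: move splits from the busiest host to the idlest one until
    # the gap between them is within the allowed variance.
    limit = max(1, max_split_num_variance)
    while True:
        busiest = max(hosts, key=lambda h: len(assignment[h]))
        idlest = min(hosts, key=lambda h: len(assignment[h]))
        if len(assignment[busiest]) - len(assignment[idlest]) <= limit:
            break
        assignment[idlest].append(assignment[busiest].pop())
    return assignment
```

Consistent hashing keeps most splits on the same host across queries, which is what makes a file cache useful; the re-adjustment trades a little of that stability for an even load.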
b275cb0f44 [feature](mtmv) mtmv support workload group (#29595)
MTMV supports controlling the resource usage of refresh tasks by setting the name of a workload group.
About workload groups: https://doris.apache.org/zh-CN/docs/dev/admin-manual/workload-group
2024-02-04 14:28:38 +08:00
e10defeaba [enhancement](plugin) support json format and other options in logstash doris output plugin (#27318) 2024-02-04 14:28:38 +08:00
6442663735 [Function](exec) support atan2 math function (#30672)
Co-authored-by: Rohit Satardekar <rohitrs1983@gmail.com>
2024-02-04 14:28:38 +08:00
36b2712709 [chore](Nereids) turn on nereids dml when update to 2.1 (#30776) 2024-02-04 14:28:38 +08:00
c9ab243153 [feat-wip](join) support mark join for right semi join(without mark join conjunct) (#30767) 2024-02-04 14:28:38 +08:00
3cc409b14f [bug](function) fix date_sub function failed when arg type is datev2 (#30443)
* [bug](function) fix date_sub function failed when arg type is datev2

* update
2024-02-04 14:28:38 +08:00
d749fc3d27 [improvement](binlog) Change BinlogConfig default TTL_SECONDS to 86400 (1day) (#30771)
* Change BinlogConfig default TTL_SECONDS to 86400 (1day)

Signed-off-by: Jack Drogon <jack.xsuperman@gmail.com>

* Fix binlog.ttl_seconds in regression test

Signed-off-by: Jack Drogon <jack.xsuperman@gmail.com>

---------

Signed-off-by: Jack Drogon <jack.xsuperman@gmail.com>
2024-02-04 14:28:38 +08:00