bp #41956
This PR #40225 tries to pass time zone info from BE to JNI, using
`_state->timezone_obj().name()`
to get the time zone name.
But during a rolling upgrade of BE, this may core dump like:
```
*** SIGSEGV address not mapped to object (@0x610) received by PID 72661 (TID 73538 OR 0x7f2e898d1640) from PID 1552; stack trace: ***
0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /home/zcp/repo_center/doris_branch-2.1/doris/be/src/common/signal_handler.h:421
1# os::Linux::chained_handler(int, siginfo_t*, void*) in /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
2# JVM_handle_linux_signal in /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
3# signalHandler(int, siginfo_t*, void*) in /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
4# 0x00007F3070D3E520 in /lib/x86_64-linux-gnu/libc.so.6
5# cctz::time_zone::name[abi:cxx11]() const in /mnt/hdd01/ci/compatibility-deploy/be/lib/doris_be
6# doris::vectorized::JniConnector::open(doris::RuntimeState*, doris::RuntimeProfile*) at /home/zcp/repo_center/doris_branch-2.1/doris/be/src/vec/exec/jni_connector.cpp:87
7# doris::vectorized::AvroJNIReader::init_fetch_table_schema_reader() at /home/zcp/repo_center/doris_branch-2.1/doris/be/src/vec/exec/format/avro/avro_jni_reader.cpp:119
8# std::_Function_handler::_M_invoke(std::_Any_data const&) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:291
9# doris::WorkThreadPool::work_thread(int) at /home/zcp/repo_center/doris_branch-2.1/doris/be/src/util/work_thread_pool.hpp:159
10# execute_native_thread_routine at ../../../../../libstdc++-v3/src/c++11/thread.cc:84
11# start_thread at ./nptl/pthread_create.c:442
12# 0x00007F3070E22850 at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:83
172.20.50.206 last coredump sql: 2024-10-13 04:12:23,985 [query]
```
This PR uses another method, `_state->timezone()`, which just returns a
string instead of reading and initializing the time zone info file,
to avoid the potential core dump (sketched below).
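A minimal sketch of the safe path, with a stand-in `RuntimeState` and a hypothetical parameter key (the real code is in `be/src/vec/exec/jni_connector.cpp`):

```cpp
#include <map>
#include <string>

// Minimal stand-in for the relevant slice of doris::RuntimeState; the real
// class also exposes timezone_obj(), which returns a cctz::time_zone that is
// only valid once the time zone database has been read and initialized.
class RuntimeState {
public:
    const std::string& timezone() const { return _timezone; }

private:
    std::string _timezone = "Asia/Shanghai"; // set from the query options
};

// Sketch of the fixed path in JniConnector::open: forward the plain string.
// The key name "time_zone" is illustrative, not copied from the source.
void fill_scanner_params(const RuntimeState* state,
                         std::map<std::string, std::string>* params) {
    // Before the fix this read state->timezone_obj().name(), which can
    // dereference an uninitialized cctz::time_zone during a rolling upgrade.
    (*params)["time_zone"] = state->timezone();
}
```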
bp #40225, #40888, #41386
## Proposed changes
Among them, #40225 introduces the new MC API,
#40888 fixes a bug when reading null values between the new and old APIs,
and #41386 handles compatibility between the new and old versions.
Also move the analysis exception "Not support insert with partition
spec in hive catalog."
from the create-sink phase to the bind-sink phase,
so that with `set enable_fallback_to_original_planner=false;` the
returned error is correct.
backport https://github.com/apache/doris/pull/33836
1. Fix the issue with tvf reading empty compressed files (see the sketch after this list).
2. Move two test cases (`test_local_tvf_compression` and `test_s3_tvf_compression`) from p2 to p0.
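For item 1, a self-contained sketch of the guard; the function and buffer-based signature are illustrative stand-ins, not Doris's actual decompressor API:

```cpp
#include <cstddef>

// Illustrative only: when the compressed file is empty, report EOF with zero
// rows instead of handing a zero-length buffer to the decompressor, which is
// the kind of input the tvf path previously mishandled.
bool read_compressed_block(const unsigned char* data, size_t len, bool* eof) {
    if (len == 0) { // empty .gz/.bz2/.lz4 file: nothing to decompress
        *eof = true;
        return true; // success, zero rows produced
    }
    *eof = false;
    // ... normal decompression path using `data` ...
    return true;
}
```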
Fix the `test_hive_parquet_alter_column` p2 case.
Since this is a p2 case, the data is stored on EMR rather than in Docker, so there is no need to consider Hive 2 vs. Hive 3.
Following #25138, this unifies the schema change interface for the Parquet and ORC readers; it can be applied to other format readers as well.
The unified schema change interface for all format readers (sketched below):
- First, read the data into a source column according to the column type of the file;
- Second, convert the source column to the destination column with the type planned by FE.
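A toy sketch of the two-step interface; `Column` stands in for Doris's vectorized columns, and the real code dispatches to a converter keyed on the (source, destination) type pair:

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Toy column type; the real readers operate on doris::vectorized::IColumn.
struct Column {
    std::string type;          // logical type name
    std::vector<int64_t> data; // simplified payload
};

// Step 1: read the data exactly as the file stores it (file column type).
Column read_source_column(const std::string& file_type) {
    return Column{file_type, {1, 2, 3}};
}

// Step 2: convert the source column to the destination type planned by FE.
Column convert_to_dest_column(const Column& src, const std::string& dest_type) {
    return Column{dest_type, src.data}; // real code converts per type pair
}
```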
`isAdjustedToUTC` was interpreted exactly the opposite way in the Parquet reader (https://github.com/apache/parquet-format/blob/master/LogicalTypes.md), so times with `isAdjustedToUTC=true` were increased by eight hours (UTC+8); see the sketch after the configurations below.
Parquet files with `isAdjustedToUTC=true` can be produced by spark-sql with the following configuration:
```
--conf spark.sql.session.timeZone=UTC
--conf spark.sql.parquet.outputTimestampType=TIMESTAMP_MICROS
```
However, with the following configuration there is no logical or converted type in the Parquet metadata, so the time read by Doris will still be increased by eight hours (UTC+8). Users need to set the UTC time zone in Doris themselves (https://doris.apache.org/docs/dev/advanced/time-zone/):
```
--conf spark.sql.session.timeZone=UTC
--conf spark.sql.parquet.outputTimestampType=INT96
```
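A sketch of the intended semantics from the Parquet spec; `session_offset_seconds` stands in for the session time zone's UTC offset (e.g. +8h for UTC+8), and this is not Doris's actual conversion code:

```cpp
#include <cstdint>

// isAdjustedToUTC=true: the stored value is a UTC instant (micros since the
// epoch) and must be shifted into the session time zone for display.
// isAdjustedToUTC=false: the value is already local wall-clock time and is
// taken as-is. Reading the flag the opposite way shifts timestamps by the
// session offset (eight hours in UTC+8) in exactly the wrong cases.
int64_t to_wall_clock_micros(int64_t stored_micros, bool is_adjusted_to_utc,
                             int64_t session_offset_seconds) {
    if (is_adjusted_to_utc) {
        return stored_micros + session_offset_seconds * 1000000LL;
    }
    return stored_micros;
}
```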
1. Fix iceberg catalog bug
PR #30198 changed the logic of `IcebergHMSExternalCatalog.java`
to get the location URL by calling the Hive metastore's `getCatalog()` method.
But this method only exists in Hive 3+, so it fails when using Hive 2.x.
This logic is temporarily removed, because it is only used for Iceberg table writing,
which is still under development. We will rethink this logic later.
2. Fix test cases
Some of the P2 test cases were missing `order_qt`. And because the output format of the floating point
type has changed, some results in the `out` files need to be regenerated.
In order to support Paimon with Hive 2, we need to modify the original `HiveMetastoreClient.java`
to make it compatible with both Hive 2 and Hive 3.
This modified `HiveMetastoreClient` must be at the front of the CLASSPATH, so that
it overrides the `HiveMetastoreClient` in the Hadoop jar.
This PR mainly changes:
1. Copy `HiveMetastoreClient.java` in FE to BE's preload jar.
2. Split the original `preload-extensions-jar-with-dependencies.jar` into 2 jars:
    1. `preload-extensions-project.jar`, which contains the modified `HiveMetastoreClient`.
    2. `preload-extensions-jar-with-dependencies.jar`, which contains the other dependency jars.
3. Modify `start_be.sh` to let `preload-extensions-project.jar` be loaded first.
4. Change the way the JNI scanner jar is assembled:
only the project jar needs to be assembled, without other dependencies,
because we actually only use classes under the `org.apache.doris` package.
Removing the unused dependency jars also reduces the output size of BE.
5. Fix a bug: the prefix of Paimon properties should be `paimon.`, not `paimon`.
6. Support Paimon with Hive 2.
Users can set `hive.version` in the Paimon catalog properties to specify the Hive version.
PR #23026 supported partition pruning for Hive tables with `_HIVE_DEFAULT_PARTITION`,
but it always selected the partition with `_HIVE_DEFAULT_PARTITION`.
PR #31613 supported null partitions for OLAP tables' list partitions, so we can treat `_HIVE_DEFAULT_PARTITION`
as the null partition of a Hive table.
So this PR changes the partition prune logic accordingly (a sketch follows).
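A language-agnostic sketch of the new behaviour, written in C++ for illustration (the real change is in FE's Java partition pruner; the partition name literal follows this description):

```cpp
#include <optional>
#include <string>
#include <vector>

// _HIVE_DEFAULT_PARTITION is modelled as a null partition value, so it goes
// through the same predicate as every other partition and is selected only
// when the predicate can be satisfied by null, instead of unconditionally.
std::vector<std::optional<std::string>> prune_partitions(
        const std::vector<std::string>& partitions,
        bool (*predicate)(const std::optional<std::string>&)) {
    std::vector<std::optional<std::string>> kept;
    for (const std::string& p : partitions) {
        std::optional<std::string> value;
        if (p != "_HIVE_DEFAULT_PARTITION") value = p; // nullopt = null partition
        if (predicate(value)) kept.push_back(value);
    }
    return kept;
}
```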