doris

Author	SHA1	Message	Date
Mingyu Chen (Rayner)	5d2930e783	[fix](shellcheck) fix hive-metastore and enable shellcheck in docker (#46496 ) (#46574 ) cherry-pick (#46496) Co-authored-by: Socrates <suyiteng@selectdb.com>	2025-01-08 11:10:34 +08:00
github-actions[bot]	d8c94d6392	branch-2.1: [fix](regression)fix hive translation unstable case. #46385 (#46409 ) Cherry-picked from #46385 Co-authored-by: daidai <changyuwei@selectdb.com>	2025-01-04 08:59:56 +08:00
github-actions[bot]	02239e4fb2	branch-2.1: [chore](regression) do not hard code S3 bucket and endpoint of hive t… #46159 (#46169 ) Cherry-picked from #46159 Co-authored-by: zgxme <zhenggaoxiong@selectdb.com>	2024-12-31 11:44:36 +08:00
James	6dd92be33d	[feature](statistics)Support get row count for pg and sql server. (#42674 ) (#46131 ) backport: https://github.com/apache/doris/pull/42674	2024-12-29 19:37:21 +08:00
daidai	a380f5d222	[enchement](utf8)import enable_text_validate_utf8 session var (#45537 ) (#46070 ) bp #45537	2024-12-28 10:05:03 +08:00
FreeOnePlus	e8be023b35	[typo](docker) Adjust the indentation format of the init_be and entry_point scripts, as well as the duration of loop execution(Merge-2.1). (#45860 ) Master PR: #45308 Problem Summary: Adjust the indentation of the `init_be` and `entry_point` scripts so that the smallest unit of indentation is a single tab character, and shorten the loops that check the BE startup status from 300 seconds to 30 seconds to speed up the overall Docker startup time.	2024-12-25 12:16:42 +08:00
FreeOnePlus	980f28f9e2	[feat](docker)Modify the init_be and start_be scripts to meet the requirements for rapid Docker startup(Merge 2.1). (#45858 ) Related PR: #45267 Master PR: #45269 Problem Summary: To support rapid Docker startup, I adjusted two scripts in the Docker startup process. First, I added an env variable `SKIP_CHECK_ULIMIT` to the `start_be.sh` script, which skips the size checks for `swap`, `ulimit`, and `max_map_count`. I also start the process with `--console` so that logs are printed in the foreground. I did not use `--daemon` because running in the foreground with log output is the correct and reliable approach in a Docker container. Second, I added a check for a `be.conf` configuration item to the `init_be.sh` script: on the first start, it appends `export SKIP_CHECK_ULIMIT=true` to skip the `ulimit` value check in the BE process. Together, these adjustments meet the basic requirements for rapid Docker startup.	2024-12-25 12:11:14 +08:00
daidai	303557ac70	[fix](hive)fix hive insert only translaction table. (#45753 ) bp #44001 , but no hive4 acid table. Problem Summary: 1. Fixed the issue that there was no acid check when reading insert-only transactional tables, which caused data to be read multiple times (i.e., data from the previous base_n was read as well). 2. Creating, inserting data into, and deleting acid tables is now forbidden.	2024-12-22 21:23:21 +08:00
Mingyu Chen (Rayner)	19c0e89da7	[enchement](iceberg)support read iceberg partition evolution table. (#45367 ) (#45569 ) cherry-pick #45367 Co-authored-by: daidai <changyuwei@selectdb.com>	2024-12-20 08:56:51 +08:00
Socrates	7d32e4f71f	branch-2.1: [Fix](ORC) Not push down fixed char type in orc reader #45484 (#45525 ) cherry-pick #45484	2024-12-19 14:06:00 +08:00
MoanasDaddyXu	ea24410faf	[enhancement][docker] fix kafka docker issue (#45091 )	2024-12-06 14:36:57 +08:00
MoanasDaddyXu	11c517fe1e	[enhancement][docker]update routine docker file (#45048 )	2024-12-05 17:27:44 +08:00
daidai	702abbff0f	[Opt](orc)Optimize the merge io when orc reader read multiple tiny stripes. (#42004 ) (#44239 ) bp #42004 Co-authored-by: kaka11chen <kaka11.chen@gmail.com>	2024-11-22 11:01:41 +08:00
github-actions[bot]	3136fa48a6	branch-2.1: [chore](ci) adjust some invalid url #44261 (#44270 ) Cherry-picked from #44261 Co-authored-by: Dongyang Li <lidongyang@selectdb.com>	2024-11-19 19:28:04 +08:00
github-actions[bot]	83b74827aa	branch-2.1: [fix](iceberg)Fix count(*) error with dangling delete problem #44039 (#44101 ) Cherry-picked from #44039 Co-authored-by: wuwenchi <wuwenchi@selectdb.com>	2024-11-19 17:19:25 +08:00
Mingyu Chen (Rayner)	efb3bdd96e	[fix](test) fix clickhouse jdbc catalog func push down case #43196 (#44151 ) cherry pick from #43196 Co-authored-by: zy-kkk <zhongyk10@gmail.com>	2024-11-18 18:03:10 +08:00
github-actions[bot]	48e33bfb2a	branch-2.1: [fix](hive)Fixed the issue of reading hive table with empty lzo files #43979 (#44063 ) Cherry-picked from #43979 Co-authored-by: wuwenchi <wuwenchi@selectdb.com>	2024-11-16 16:14:50 +08:00
github-actions[bot]	4531cd86e3	branch-2.1: [fix](regression-test) add checks for existence and successful upload of data files in hive-metastore.sh #43853 (#43888 ) Cherry-picked from #43853 Co-authored-by: Socrates <suyiteng@selectdb.com>	2024-11-14 11:23:23 +08:00
github-actions[bot]	a1ff02288f	branch-2.1: [fix](hive) support query hive view created by spark (#43553 ) Cherry-picked from #43530 Co-authored-by: Mingyu Chen (Rayner) <morningman@163.com> Co-authored-by: morningman <yunyou@selectdb.com>	2024-11-11 23:28:53 +08:00
yujun	1ac00ea983	branch-2.1: [feat](doris compose) Copy lastest compose code from master branch (#43464 ) Copy the latest code from the master branch to support running docker suites without an external doris cluster, enable the jvm debug port, ..., etc.	2024-11-08 09:47:19 +08:00
Mingyu Chen (Rayner)	cdd32d9582	[enhance](hive) support reading hive table with OpenCSVSerde #42257 (#42940 ) cherry pick from #42257 Co-authored-by: Socrates <suxiaogang223@icloud.com>	2024-10-31 11:12:07 +08:00
Mingyu Chen (Rayner)	fce4695f37	[Configuration](transactional-hive) Add `skip_checking_acid_version_file` session var to skip checking acid version file in some hive envs. (#42111 )(#42225 ) (#42939 ) cherry-pick (#42111)(#42225) --------- Co-authored-by: Qi Chen <kaka11.chen@gmail.com>	2024-10-31 09:52:20 +08:00
qiye	2defa90be7	[test](ES Catalog)Add mapping _routing test case (#42074 ) (#42282 ) ## Proposed changes bp #42074	2024-10-23 10:14:12 +08:00
Rayner Chen	157d67e7ca	[enhance](hive) Add regression-test cases for hive text ddl and hive text insert and fix reading null string bug #42200 (#42273 ) cherry pick from #42200 Co-authored-by: Socrates <suxiaogang223@icloud.com>	2024-10-22 23:56:57 +08:00
Socrates	38e529cd29	[cherry-pick](branch-2.1) support decimal256 for parquet reader (#42241 ) ## Proposed changes pick pr: https://github.com/apache/doris/pull/41526	2024-10-22 19:42:09 +08:00
zy-kkk	c1d2b8d548	[2.1][improvement](jdbc catalog) Disallow non-constant type conversion pushdown and implicit conversion pushdown (#42242 ) pick (#42102) Add a variable `enable_jdbc_cast_predicate_push_down`, the default value is false, which prohibits the pushdown of non-constant predicates with type conversion and all predicates with implicit conversion. This change can prevent the wrong predicates from being pushed down to the Jdbc data source, resulting in query data errors, because the predicates with cast were not correctly pushed down to the data source before. If you find that the data is read correctly and the performance is better before this change, you can manually set this variable to true ``` \| Expression \| Can Push Down \| \|-----------------------------------------------------\|---------------\| \| column type equals const type \| Yes \| \| column type equals cast const type \| Yes \| \| cast column type equals const type \| No \| \| cast column type equals cast const type \| No \| \| column type not equals column type \| No \| \| column type not equals cast const type \| No \| \| cast column type not equals const type \| No \| \| cast column type not equals cast const type \| No \| ```	2024-10-22 17:27:29 +08:00
Socrates	a32ad0b1f7	[cherry-pick](branch-2.1) support reading brotli compressed parquet file (#42162 ) pick pr: https://github.com/apache/doris/pull/41875	2024-10-21 16:48:09 +08:00
Rayner Chen	a150d160ea	[fix](jdbc catalog) fix and add mysql and doris extremum test #41679 (#42122 ) cherry pick from #41679 --------- Co-authored-by: zy-kkk <zhongyk10@gmail.com>	2024-10-21 16:39:40 +08:00
Socrates	1b901f6fcc	[cherry-pick](branch-2.1) add parquet tvf cases and fix some parquet bug (#41931 ) ## Proposed changes pick pr: https://github.com/apache/doris/pull/41683 https://github.com/apache/doris/pull/41506 https://github.com/apache/doris/pull/41338 https://github.com/apache/doris/pull/39326 --------- Co-authored-by: morningman <morningman@163.com>	2024-10-17 14:20:58 +08:00
Socrates	4888c632f4	[cherry-pick](branch2.1) support escape.delim and serialization.null.format for hive text (#41684 ) ## Proposed changes pick from master: https://github.com/apache/doris/pull/40291	2024-10-15 00:08:23 +08:00
wuwenchi	4f81fc474c	[bugfix](paimon)Get the file format by file name (#41020 ) (#41487 ) bp #41020	2024-09-30 15:46:13 +08:00
Socrates	0b4552f74b	[cherry-pick](branch-2.1) pick hive text write from master (#40537 ) ## Proposed changes pick prs: https://github.com/apache/doris/pull/38549 https://github.com/apache/doris/pull/40183 https://github.com/apache/doris/pull/40315 --------- Co-authored-by: Calvin Kirs <kirs@apache.org>	2024-09-27 20:57:07 +08:00
daidai	c744eb87c5	[fix](regression)fix some regression test (#40928 ) (#41046 ) bp #40928	2024-09-20 18:17:44 +08:00
zy-kkk	5f583fa329	[branch-2.1][test](jdbc catalog) add oceanbase ce jdbc catalog test (#40978 ) pick #34972)	2024-09-19 22:11:24 +08:00
Socrates	7bb9ca91c8	[branch-2.1](fix) adjust data download url about hive docker (#40846 ) ## Proposed changes fix paimon regression test Co-authored-by: Dongyang Li <hello_stephen@qq.com> Co-authored-by: stephen <hello-stephen@qq.com>	2024-09-14 23:19:54 +08:00
qiye	8708fae420	[fix](ES Catalog)Support parse single value for array column (#40614 ) (#40660 ) bp #40614	2024-09-11 17:26:48 +08:00
qiye	8104b992d1	[fix](ES Catalog)Do not extract doc_values of field with ignore_above setting (#40314 ) (#40464 ) bp #40314	2024-09-06 16:25:30 +08:00
yiguolei	ca07a00c93	Revert "[branch-2.1](hive) support hive write text table (#38549 ) (#40063)" (#40157 ) This reverts commit c6df7c21a3c09ae1664deabacb88dfcea9d94b68. Co-authored-by: yiguolei <yiguolei@gmail.com>	2024-08-30 10:25:38 +08:00
Socrates	c6df7c21a3	[branch-2.1](hive) support hive write text table (#38549 ) (#40063 ) 1. Support write hive text table 2. Add SessionVariable `hive_text_compression` to write compressed hive text table 3. Supported compression type: gzip, bzip2, snappy, lz4, zstd pick from https://github.com/apache/doris/pull/38549	2024-08-29 16:50:40 +08:00
Mingyu Chen	b9da934b16	[fix](hive) report error with escape char and null format (#39700 ) (#39869 ) bp #39700 Co-authored-by: Socrates <suxiaogang223@icloud.com>	2024-08-24 09:23:03 +08:00
Mingyu Chen	cf698fb615	[fix](regression) fix some jdbc datasource docker health check (#39141 ) (#39872 )	2024-08-24 03:29:18 +08:00
Mingyu Chen	508c7a7040	[fix](hive)Modify the Hive notification event processing method when using meta cache and add parameters to the Hive catalog. (#39239 ) (#39865 ) bp #39239 Co-authored-by: daidai <2017501503@qq.com>	2024-08-23 23:21:02 +08:00
zy-kkk	40a58b9e42	[branch-2.1][regression test](jdbc catalog) Enable CLICKHOUSE_ALWAYS_RUN_INITDB_SCRIPTS for clickhouse docker (#39667 ) pick (#39425) #39693	2024-08-23 09:59:03 +08:00
daidai	27ba2542e2	[case](iceberg)append iceberg schema change case. (#38766 ) (#39630 ) bp #38766	2024-08-21 09:17:12 +08:00
zy-kkk	2948b5ea2b	[branch-2.1][fix](jdbc scan) Remove the `conjuncts.remove` call in JdbcScan (#39407 ) pick (#39180) In #37565, due to the change in the calling order of finalize, the final generated Plan will be missing the PREDICATES that have been pushed down in Jdbc. Although this behavior is correct, before perfectly handling the push down of various PREDICATES, we need to keep all conjuncts to ensure that we can still filter data normally when the data returned by Jdbc is a superset.	2024-08-16 19:01:40 +08:00
qiye	43cc8d648d	[fix](ES Catalog)Check isArray before parse json to array (#39104 ) (#39273 ) ## Proposed changes bp #39104	2024-08-13 15:13:40 +08:00
daidai	3da2d1c9d6	[bug](parquet)Fix the problem that the parquet reader reads the missing sub-columns of the struct and fails. (#38718 ) (#39192 ) bp #38718	2024-08-11 20:37:40 +08:00
daidai	607c0b82a9	[opt](serde)Optimize the filling of fixed values into block columns without repeated deserialization. (#37377 ) (#38245 ) (#38810 ) ## Proposed changes pick pr: #38575 and fix this pr bug : #38245	2024-08-05 09:13:08 +08:00
daidai	5d02c48715	[feature](hive)Support reading renamed Parquet Hive and Orc Hive tables. (#38432 ) (#38809 ) bp #38432 ## Proposed changes Add `hive_parquet_use_column_names` and `hive_orc_use_column_names` session variables to read the table after rename column in `Hive`. These two session variables are referenced from `parquet_use_column_names` and `orc_use_column_names` of `Trino` hive connector. By default, these two session variables are true. When they are set to false, reading orc/parquet will access the columns according to the ordinal position in the Hive table definition. For example: ```mysql in Hive : hive> create table tmp (a int , b string) stored as parquet; hive> insert into table tmp values(1,"2"); hive> alter table tmp change column a new_a int; hive> insert into table tmp values(2,"4"); in Doris : mysql> set hive_parquet_use_column_names=true; Query OK, 0 rows affected (0.00 sec) mysql> select * from tmp; +-------+------+ \| new_a \| b \| +-------+------+ \| NULL \| 2 \| \| 2 \| 4 \| +-------+------+ 2 rows in set (0.02 sec) mysql> set hive_parquet_use_column_names=false; Query OK, 0 rows affected (0.00 sec) mysql> select * from tmp; +-------+------+ \| new_a \| b \| +-------+------+ \| 1 \| 2 \| \| 2 \| 4 \| +-------+------+ 2 rows in set (0.02 sec) ``` You can use `set parquet.column.index.access/orc.force.positional.evolution = true/false` in hive 3 to control the results of reading the table like these two session variables. However, for the rename struct inside column parquet table, the effects of hive and doris are different.	2024-08-05 09:06:49 +08:00
qiye	c0caca7c55	[fix](ES Catalog)Fix unstable test test_es_query (#38801 ) (#38802 ) ## Proposed changes bp #38801	2024-08-03 23:49:00 +08:00
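
The column resolution described in commit 5d02c48715 (`hive_parquet_use_column_names` / `hive_orc_use_column_names`) can be illustrated with a small simulation. This is a minimal sketch, not Doris code: `read_rows`, `scan`, and the in-memory file tuples are hypothetical stand-ins for a Parquet/ORC reader, and it assumes each file records the column names that were in effect when it was written.

```python
from typing import Any, List, Optional, Tuple

# Hypothetical stand-in for a data file: (column names stored in the file, rows).
FileData = Tuple[List[str], List[Tuple[Any, ...]]]


def read_rows(table_columns: List[str], file: FileData,
              use_column_names: bool) -> List[Tuple[Optional[Any], ...]]:
    """Project one file's rows onto the current table schema."""
    file_columns, rows = file
    result = []
    for row in rows:
        if use_column_names:
            # Match by name: a column renamed after the file was written
            # is not found in the file, so it reads as NULL (None).
            by_name = dict(zip(file_columns, row))
            result.append(tuple(by_name.get(col) for col in table_columns))
        else:
            # Match by ordinal position in the table definition,
            # ignoring the names stored in the file.
            result.append(tuple(row[i] if i < len(row) else None
                                for i in range(len(table_columns))))
    return result


# Table `tmp` after `alter table tmp change column a new_a int`.
table_columns = ["new_a", "b"]
old_file: FileData = (["a", "b"], [(1, "2")])      # written before the rename
new_file: FileData = (["new_a", "b"], [(2, "4")])  # written after the rename


def scan(use_column_names: bool) -> List[Tuple[Optional[Any], ...]]:
    """Read every file of the table with the chosen resolution mode."""
    out: List[Tuple[Optional[Any], ...]] = []
    for f in (old_file, new_file):
        out.extend(read_rows(table_columns, f, use_column_names))
    return out


print(scan(True))   # [(None, '2'), (2, '4')]
print(scan(False))  # [(1, '2'), (2, '4')]
```

The two printed results correspond to the two `select * from tmp` outputs quoted in that commit message: name-based access returns NULL for `new_a` in the file written before the rename, while positional access returns the original value.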

302 Commits