doris

Author	SHA1	Message	Date
Mingyu Chen	ed96442b85	[fix](multi-catalog) fix persist issue about jdbc catalog and class loader issue #14794 Fix a bug that JDBC catalog/database/table should be added to GsonUtil. Fix a class loader issue that sometimes causes ClassNotFoundException. Fix regression test to use different catalog names. Comment out 2 regression tests: regression-test/suites/query_p0/system/test_query_sys.groovy regression-test/suites/statistics/alter_col_stats.groovy These need to be fixed later.	2022-12-05 09:05:13 +08:00
Kikyou1997	283b23f6da	[fix](planner) wrong results when select from view which has with clause (#14747 )	2022-12-02 18:10:52 +08:00
Tiewei Fang	7627defc88	[fix](regression-test) Add test data for test_mysql_jdbc_catalog and fix mysql-5.7.yaml about UTF8 (#14749 ) Fix two things: 1. The MySQL table displayed garbled text even when UTF8 was specified for the table. 2. `test_mysql_jdbc_catalog.out` lacked the returned data for table `ex_tb13`.	2022-12-02 11:58:11 +08:00
zhangguoqiang	07e8af7808	[regression](test) add external regression-test base on emr environment 1.0 11-29 (#14666 ) * add external regression-test base on emr environment 1.0 11-29 * delete ak sk info from regression-conf.groovy	2022-12-02 11:30:07 +08:00
lsy3993	ae6a007c4e	[test](jdbc)add new extremum case (#14692 )	2022-12-02 11:28:11 +08:00
yixiutt	94a6ffb906	[feature](compaction) support vertical_compaction & ordered_data_compaction (#14524 )	2022-12-01 22:15:41 +08:00
minghong	2be8235d95	[feature](nereids) support timestampdiff function (#14662 ) complete timeStampDiff supported timeunit: - YEAR - MONTH - WEEK - DAY - HOUR - MINUTE - SECOND	2022-12-01 22:11:55 +08:00
Gabriel	9dd1d989e8	[test](decimalv3) add regression test cases for decimalv3 (#14672 )	2022-12-01 15:18:40 +08:00
luozenglin	6c70d794f6	[fix](bitmapfilter) fix core dump caused by bitmap filter (#14702 )	2022-12-01 09:56:22 +08:00
AlexYue	738c36109f	rename tpch dir (#14668 )	2022-11-30 17:59:13 +08:00
Yongqiang YANG	11735043d6	[improvement](test) logging load result (#14694 ) When a load failed, we have to login to doris to investigate result.	2022-11-30 16:57:35 +08:00
Kang	79688a54d6	[bug](jsonb) fix be core at insert invalid json to JSONB column (#14686 )	2022-11-30 14:00:50 +08:00
Mingyu Chen	f3cf83a933	(fix)[test] add some logs (#14695 )	2022-11-30 12:45:12 +08:00
Tiewei Fang	9272680d00	[feature](multi-catalog) support Jdbc catalog (#14527 ) Issue Number: close #xxx I add jdbc catalog for doris multi-catalog feature. Currently, the jdbc catalog only supports MYSQL DBMS. TODO: support for postgre DB Support for other databases. Problem summary For jdbc catalog, we can create catalog like: CREATE CATALOG jdbc4 PROPERTIES ( "type"="jdbc", "jdbc.user"="root", "jdbc.password"="123456", "jdbc.jdbc_url" = "jdbc:mysql://127.0.0.1:13396/demo?yearIsDateType=false", "jdbc.driver_url" = "file:/mnt/disk2/ftw/tools/jar/mysql-connector-java-5.1.47/mysql-connector-java-5.1.47.jar", "jdbc.driver_class" = "com.mysql.jdbc.Driver" ); Note: yearIsDateType is a param of jdbc: If yearIsDateType configuration property is set to false, then the returned object type is java.sql.Short. If set to true (the default), then the returned object is of type java.sql.Date with the date set to January 1st, at midnight. To compat with mysql, we force the use of yearIsDateType=false in FE. if user sets yearIsDateType=true, doris FE will force to change yearIsDateType=false.	2022-11-30 11:28:08 +08:00
minghong	82f3980774	[feature](Nereids) estimation without column statistics (#14526 ) estimate plan cost without column statistics. change list: 1. remove original StatsCalculator, it is replaced by StatsCalculatorV2. rename StatsCalculatorV2 to StatsCalculator 2. remove FilterSelectivityCalculator, it is replaced by FilterEstimation 3. remove session var:ENABLE_NEREIDS_STATS_DERIVE_V2 4. add ColumnStatistics.isUnKnown, which means the column is not analyzed, and its stats is not accurate. 5. add estimatedRowCount() function for OLAP tables 6. add unit tests for FilterEstimation and StatsCalculator	2022-11-30 11:27:51 +08:00
starocean999	3a362fab76	[fix](fe)table function node use wrong info for projection (#14667 )	2022-11-30 10:41:32 +08:00
Mingyu Chen	ca90253b09	[config](storage-policy) add a FE config to disable storage policy by default (#14655 ) The cold-hot separation feature is still under development, and some problems remain unsolved. So I add a FE config enable_storage_policy, false by default, to disable the creation and usage of storage policies by default. This makes users aware that they are using an experimental feature at their own risk, and that it will not be formally released in v1.2.0. Disable storage policy by default; users cannot use or create storage policies. Configured by enable_storage_policy. Remove property remote_storage_policy, which duplicates storage_policy. Change the persisted field in DataProperty.java, and remove remoteCooldownTime from DataProperty, because it can be obtained from StoragePolicy.	2022-11-30 10:04:33 +08:00
Mingyu Chen	dd7ec8f4ca	[improvement](test) add tpch1 orc for hive catalog and refactor some test dir (#14669 ) Add tpch 1g orc test case in hive docker Refactor some suites dir of catalog test cases. And "-internal" for dlf endpoint, to support access oss with aliyun vpc.	2022-11-30 10:03:58 +08:00
Kang	4faca56819	[bug](jsonb) fix INSERT/CAST NULL to JSONB (#14682 ) Add NULL -> JSONB in implicitCastMap to support INSERT/CAST NULL to JSONB.	2022-11-30 09:53:16 +08:00
FreeOnePlus	d5ee721621	[improvement](planner)Adjust the field naming rules when creating tables (#14671 ) Adjust the field naming rules when creating tables. Originally, a field name had to start with a letter, an underscore, or an @ character, followed by at most 63 characters, so the total could not exceed 64 characters. However, in many industries, such as the financial industry, the names of derived fields often exceed 64 characters, so the limit is raised from 64 to 128 characters. Many users load data from Hive to Doris through external tables or BrokerLoad. A field name in a Hive table may start with an Arabic numeral, so the naming rule is adjusted to allow an Arabic numeral as the first character.	2022-11-30 09:45:27 +08:00
Dongyang Li	5a2e3869df	[regression](test) enable fe and be fuzzy test (#14673 )	2022-11-30 08:40:32 +08:00
Kikyou1997	33ad616839	[fix](statistics) Fix potential NPE in ShowStatisticsStmt #14679 When the required cache has not been loaded yet, the cache always returns ColumnStatistics.DEFAULT, which does not define the max/min literal expr; add a check for that.	2022-11-30 08:38:20 +08:00
Jerry Hu	a60490651f	[improvement](function) add timezone cache for convert_tz (#14616 )	2022-11-29 17:00:54 +08:00
lsy3993	1713af6cd6	[test](java udf)add new java udf case (#14653 )	2022-11-29 16:43:53 +08:00
Kang	fe95b84c34	[fix](jsonb)fix CAST String to JSONB nullable problem (#14626 ) Fix the CAST String to JSONB nullable problem in DEBUG mode.	2022-11-29 16:22:22 +08:00
Gabriel	3e8b3658c7	[feature-wip](decimalv3) Support basic agg and arithmetic operations for decimal v3 (#14513 )	2022-11-29 15:12:41 +08:00
lsy3993	f7a827c06b	[fix](new-scan) fix some bugs about new scan node and readers (#14504 ) json reader DCHECK fail because of missing TYPE_STRING fix bug that if no file is found, the tvf will throw NPE. The predicate conjuncts can not be pushed down to parquet reader if this is a load task. Because the predicate should be applied on column of dest table, not on column of source file. Add a temp property "use_new_load_scan_node" of broker load to make regression test happy. So that we can use new load scan node for a certain job and avoid setting global FE config.	2022-11-29 10:21:41 +08:00
Gabriel	7513c82431	[NLJoin](conjuncts) separate join conjuncts and general conjuncts (#14608 )	2022-11-29 08:55:54 +08:00
Mingyu Chen	c5eb8ab084	[fix](persiste) make ArithmeticExpr writable (#14615 ) Fix a bug where ArithmeticExpr's write method was not implemented, causing an FE crash when creating a function like: CREATE ALIAS FUNCTION IF NOT EXISTS mesh_udf_test1(INT,INT) WITH PARAMETER(n,d) AS ROUND(1+floor(n/d)); Add `if exists` and `if not exists` for drop and create function. Fix a minor bug where the hdfs() table valued function throws an NPE if the file does not exist.	2022-11-29 08:55:18 +08:00
mch_ucchi	a803e75438	[feature](Nereids) add rule: EliminateGroupByConstants (#14541 ) remove group by constants, like: before apply rule: select 1, k1, min(k2), max(k3) from t1 group by 1, 2; after apply rule: select 1, k1, min(k2), max(k3) from t1 group by k1;	2022-11-28 22:52:24 +08:00
Yongqiang YANG	c7da050da4	[fix](test) tpch_sf1_p1 and tpch_sf1_p1/tpch_sf1 are confusing (#14206 )	2022-11-28 19:30:32 +08:00
abmdocrt	529bdfb153	[Fix](function) Fix retention function return wrong value type (#14552 ) MySQL [db]> SELECT SUM(a.r[1]) as active_user_num, SUM(a.r[2]) as active_user_num_1day, SUM(a.r[3]) as active_user_num_3day, SUM(a.r[4]) as active_user_num_7day FROM ( SELECT user_id, retention( day = '2022-11-01', day = '2022-11-02', day = '2022-11-04', day = '2022-11-07') as r FROM login_event WHERE (day >= '2022-11-01') AND (day <= '2022-11-21') GROUP BY user_id ) a; ERROR 1105 (HY000): errCode = 2, detailMessage = sum requires a numeric parameter: sum(%element_extract%(a.r, 1))	2022-11-28 15:56:18 +08:00
yiguolei	d3cb79c629	[regressiontest](fuzzy) modify window function and schema change test to pass fuzzy (#14632 ) enable fe fuzzy mode for P0 set parallel instance num = 1 for window function sleep 1000ms for schema change test or the data result is wrong.	2022-11-28 14:58:27 +08:00
Kang	ed92a8f81e	[feature](jsonb function)change jsonb_extract_string behavior and doc (#14619 ) 1. change jsonb_extract_string behavior: convert to string instead of NULL if the type of json path is not string 2. move jsonb tutorial doc to JSONB data type	2022-11-28 11:36:54 +08:00
Kikyou1997	b6605b99aa	[ehancement](nereids) eliminate project in the post process phase (#14490 ) Remove those projects that used for column pruning only and don't do any expression calculation, So that we could avoid some redundant data copy in do_projection of BE side.	2022-11-28 00:39:36 +08:00
minghong	280f8be4bd	[test](regression) adjust nereids related regression cases under datev2 (#14578 ) 1. revert 14439, recovery dup&unique test cases 2. adjust nereids related case	2022-11-27 23:57:51 +08:00
lsy3993	93b940bc92	[test](jdbc)add new case for mysql jdbc table (#14581 )	2022-11-27 13:39:59 +08:00
HappenLee	38b4cbe253	[Bug](regression) regression fail random in fuzzy mode (#14614 )	2022-11-27 09:23:36 +08:00
lsy3993	a877c8e50d	[test](docker) delete show table (#14612 )	2022-11-26 23:44:29 +08:00
xiaojunjie	dd21056a4c	[fix](nereids) delete view in regression-test (#14607 )	2022-11-26 18:03:21 +08:00
lsy3993	4c60186e87	[test](jdbc)add new case for pg jdbc table (#14582 )	2022-11-26 13:02:05 +08:00
Mingyu Chen	064b8d2aa6	[fix](multi-catalog) fix coredump when querying partitioned hive table with text format (#14604 ) BE will crash when querying partitioned hive table with text format and put partition column at first of select items. 1. FE should use file slots to set the column mapping index of csv file. 2. BE should use `get_by_name` of block to get right column in a block in csv reader.	2022-11-26 11:42:40 +08:00
Kang	52c6ba051e	[feature](jsonb type)refactor JSONB type using column and add testcase (#13778 ) 1. Refactor JSONB type using ColumnString instead making a copy. 2. Add regression testcase for JSONB load and functions.	2022-11-26 10:06:15 +08:00
xiaojunjie	2ae7dae925	[feature](nereids) Support row policy (#13879 ) This pr did two things: 1. 【new logical plan】add LogicalCheckPolicy before UnboundRelation in LogicalPlanBuilder. 2. 【new rule】turn LogicalCheckPolicy to LogicalFilter if row policy exist, otherwise remove it.	2022-11-25 22:57:56 +08:00
chunping	d159a8d24b	[test](pipline) modify teamcity regression pipline fe conf to 4G (#14584 ) * adjust mem limit to 30% * [test](pipline) modify teamcity regression pipline fe conf to 4G	2022-11-25 22:32:51 +08:00
Dongyang Li	ef82139a37	[pipeline](conf) set fragment_pool_thread_num_max=5000 in be.conf (#14597 )	2022-11-25 22:30:04 +08:00
luozenglin	4728e75079	[feature](bitmap) Support in bitmap syntax and bitmap runtime filter (#14340 ) 1.Support in bitmap syntax, like 'where k1 in (select bitmap_column from tbl)'; 2.Support bitmap runtime filter. Generate a bitmap filter using the right table bitmap and push it down to the left table storage layer for filtering.	2022-11-25 15:22:44 +08:00
lihangyu	7ba4cd764a	[enhancement](array-function) `array_position`,`array_contains`,`countequal` which in `FunctionArrayIndex` handle target NULL (#14564 ) in the previous, the result is: ``` mysql> select array_position([1, null], null); +--------------------------------------+ \| array_position(ARRAY(1, NULL), NULL) \| +--------------------------------------+ \| NULL \| +--------------------------------------+ 1 row in set (0.02 sec) ``` but after this commit, the result become: ``` mysql> select array_position([1, null], null); +--------------------------------------+ \| array_position(ARRAY(1, NULL), NULL) \| +--------------------------------------+ \| 2 \| +--------------------------------------+ 1 row in set (0.02 sec) ```	2022-11-25 14:19:50 +08:00
zhangstar333	d5d356b17f	[vectorized](function) support order by field function (#14528 ) * [vectorized](function) support order by field function * update * update test	2022-11-25 14:00:46 +08:00
924060929	deef491e01	[fix](Nereids) refactor CTE and EliminateAliasNode and fix the bug that CTE reuse relationId (#14534 ) This pr contribute: - support explain CTE; - refine CTE, fix the bug: reuse the same analyzed plan which LogicalOlapScan has the same relationId; - change EliminateAliasNode to LogicalSubQueryAliasToLogicalProject and move to the top of rewrite stage, so we can simply observe the analyzed plan by the LogicalSubQueryAlias with alias; - job traverse left child first, so the ExprId growth from left child to right child.	2022-11-25 10:54:53 +08:00

692 Commits