doris

Author	SHA1	Message	Date
morrySnow	954c44db39	[enhancement](Nereids) compare LogicalProperties with output set instead of output list (#12743 ) We used the output list to compare two LogicalProperties before. Because join reorder changes the children order of a join plan, the output list changes too, so the two join plans are no longer equal in the memo although they should be. We therefore had to add a project on the new join to keep the LogicalProperties the same. This PR changes the equals and hashCode functions of LogicalProperties to compare two LogicalProperties by a set of outputs. Then we no longer need to add the top project. This helps us keep the memo simple and efficient.	2022-09-20 10:55:29 +08:00
slothever	d435f0de41	[feature-wip](parquet-reader) add page index row range (#12652 ) Add some utils and provide the candidate row range (generated with skipped row range of each column) to read for page index filter this version support binary operator filter todo: - use context instead of structures in close() - process complex type filter - use this instead of row group minmax filter - refactor _eval_binary() for row group filter and page index filter	2022-09-20 10:36:19 +08:00
starocean999	ca3e52a0bb	[fix](agg)the output of window function's nullability should be consistent with output slot (#12607 ) FE may force a window function to output a nullable value in some cases; BE should follow this and change the nullability accordingly.	2022-09-20 09:29:44 +08:00
starocean999	4f27692898	[fix](inlineview)the inlineview's slots' nullability property is not set correctly (#12681 ) The output slots of an inline view may come from the nullable-side table of an outer join, so they should be nullable.	2022-09-20 09:29:15 +08:00
Xin Liao	41cf94498d	[feature-wip](unique-key-merge-on-write) fix that incremental clone may lead to loss of delete bitmap (#12721 )	2022-09-20 09:08:06 +08:00
Jerry Hu	f5f6a852fe	[chore](regression-test) add order by in test_rollup_agg_date.groovy (#12737 )	2022-09-20 09:06:13 +08:00
ElvinWei	e1d2f82d8e	[feature](statistics) template for building internal query SQL statements (#12714 ) Template for building internal query SQL statements, mainly used for the statistics module. After the template is defined, the executable statement is built from the given parameters. For example, template and parameters: - template: `SELECT ${col} FROM ${table} WHERE id = ${id};`, - parameters: `{col=colName, table=tableName, id=1}` - result sql: `SELECT colName FROM tableName WHERE id = 1;` usage: ``` String template = "SELECT * FROM ${table} WHERE id = ${id};"; Map<String, String> params = new HashMap<>(); params.put("table", "table0"); params.put("id", "123"); // result: SELECT * FROM table0 WHERE id = 123; String result = InternalSqlTemplate.processTemplate(template, params); ```	2022-09-19 22:10:28 +08:00
luozenglin	43d6be8c4d	[docs](function) add a series of date function documents (#12713 ) * [docs](function) add a series of date function documents add docs for `hours_add`, `hours_sub`, `minutes_add`, `minutes_sub`, `seconds_add`, `seconds_sub`, `years_sub`, `years_add`, `months_add`, `months_sub`, `days_add`, `days_add`, `weeks_add`, `weeks_sub` functions.	2022-09-19 21:42:35 +08:00
TaoZex	a5d11dce3b	[typo](docs) Add docs of math function (#12532 ) * docs of math function	2022-09-19 21:41:59 +08:00
mch_ucchi	94d73abf2a	[test](Nereids) runtime filter unit cases not rely on NereidPlanner to generate PhysicalPlan anymore (#12740 ) This PR: 1. add rewrite and implement method to PlanChecker 2. improve unit tests of runtime filter	2022-09-19 19:53:55 +08:00
ElvinWei	1339eef33c	[fix](statistics) remove statistical task multiple times in one loop cycle (#12741 ) There is a problem with StatisticsTaskScheduler. The peek() method obtains a reference to the same task object, but the for-loop executes multiple removes.	2022-09-19 19:28:51 +08:00
jakevin	4b5cc62348	[refactor](Nereids) rename transform to applyExploration UT helper class PlanChecker (#12725 )	2022-09-19 16:49:56 +08:00
ElvinWei	08a71236a9	[feature](statistics) Internal-query, execute SQL query statement internally in FE (#9983 ) Execute SQL query statements internally(in FE). Internal-query mainly used for statistics module, FE obtains statistics by SQL from BE, such as column maximum value, minimum value, etc. This is a tool module as statistics, it will not affect the original code, also will not affect the use of users. The simple usage process is as follows(the following code does no exception handling): ``` String dbName = "test"; String sql = "SELECT * FROM table0"; InternalQuery query = new InternalQuery(dbName, sql); InternalQueryResult result = query.query(); List<ResultRow> resultRows = result.getResultRows(); for (ResultRow resultRow : resultRows) { List<String> columns = resultRow.getColumns(); for (int i = 0; i < resultRow.getColumns().size(); i++) { resultRow.getColumnIndex(columns.get(i)); resultRow.getColumnName(i); resultRow.getColumnType(columns.get(i)); resultRow.getColumnType(i); resultRow.getColumnValue(columns.get(i)); resultRow.getColumnValue(i); } } ```	2022-09-19 16:26:54 +08:00
jakevin	399af4572a	[improve](Nereids) improve join cost model (#12657 )	2022-09-19 16:25:30 +08:00
Jibing-Li	5978fd9647	[refactor](file scanner)Refactor file scanner. (#12602 ) Refactor the scanners for hms external catalog, work in progress. Use VFileScanner, will remove NewFileParquetScanner, NewFileOrcScanner and NewFileTextScanner after fully tested. Query for parquet file has been tested, still need to add readers for orc file, text file and load logic as well.	2022-09-19 15:23:51 +08:00
luozenglin	d68b8cce1a	[fix](intersect) fix intersect query failed in row storage code (#12712 )	2022-09-19 11:47:50 +08:00
jakevin	75d7de89a5	[improve](Nereids) Add all slots used by onClause to project when reorder and fix reorder mark (#12701 ) 1. Add all slots used by onClause in project ``` (A & B) & C like join(hash conjuncts: C.t2 = A.t2) \|---project(A.t2) \| +---join(hash conjuncts: A.t1 = B.t1) \| +---A \| +---B +---C transform to (A & C) & B join(hash conjuncts: A.t1 = B.t1) \|---project(A.t2) \| +---join(hash conjuncts: C.t2 = A.t2) \| +---A \| +---C +---B ``` But projection just include `A.t2`, can't find `A.t1`, we should add slots used by onClause when projection exist. 2. fix join reorder mark Add mark `LAsscom` when apply `LAsscom` 3. remove slotReference use `Slot` instead of `SlotReference` to avoid cast.	2022-09-19 11:01:25 +08:00
yiguolei	415721ef20	[enhancement](pred column) improve predicate column insert performance (#12690 ) Co-authored-by: yiguolei <yiguolei@gmail.com>	2022-09-19 10:53:48 +08:00
yinzhijian	fb9e48a34a	[fix](vstream load) Fix bug when load json with jsonpath (#12660 )	2022-09-19 10:13:18 +08:00
Liqf	1fa65708d7	[test](time_add or sub)add time_add and time_sub function case #12641	2022-09-19 09:22:53 +08:00
Yongqiang YANG	4669fa54cc	[enhancement](test) add tpch_sf100_unique p2 test (#12697 )	2022-09-19 09:19:17 +08:00
minghong	b608de668f	[fix](compile)compile error: open_telemetry_scop_wrapper.hpp cannot find 'UNLIKELY' (#12709 )	2022-09-19 09:18:04 +08:00
caoliang-web	6d3ae1e69c	[regression](left join)Add left join, the left table is empty, the query result is not empty case (#12344 ) Add left join, the left table is empty, the query result is not empty case	2022-09-19 08:53:50 +08:00
carlvinhust2012	fa8ed2bccc	[fix](array-type) fix the invalid format load for stream load (#12424 ) this pr is used to fix the invalid format load for stream load. before the change , we will get the error when we load the invalid array format. the origin file to load : 1 [1, 2, 3] 2 [4, 5, 6] 3 \N 4 [7, \N, 8] 5 10, 11, 12 [hugo@xafj-palo]$ sh curl_cmd.sh { "TxnId": 11035, "Label": "11c9f111-188e-4616-9a50-aec8b7814513", "TwoPhaseCommit": "false", "Status": "Fail", "Message": "Array does not start with '[' character, found '1'", "NumberTotalRows": 0, "NumberLoadedRows": 0, "NumberFilteredRows": 0, "NumberUnselectedRows": 0, "LoadBytes": 55, "LoadTimeMs": 7, "BeginTxnTimeMs": 0, "StreamLoadPutTimeMs": 2, "ReadDataTimeMs": 0, "WriteDataTimeMs": 3, "CommitAndPublishTimeMs": 0 } 3. after this change, we will get success and the error url which report the error line. [hugo@xafj-palo]$ sh curl_cmd.sh { "TxnId": 11046, "Label": "249808ee-55f4-4c08-b671-b3d82689d614", "TwoPhaseCommit": "false", "Status": "Success", "Message": "OK", "NumberTotalRows": 5, "NumberLoadedRows": 4, "NumberFilteredRows": 1, "NumberUnselectedRows": 0, "LoadBytes": 55, "LoadTimeMs": 39, "BeginTxnTimeMs": 0, "StreamLoadPutTimeMs": 2, "ReadDataTimeMs": 0, "WriteDataTimeMs": 19, "CommitAndPublishTimeMs": 16, "ErrorURL": "http://10.81.85.89:8502/api/_load_error_log?file=__shard_3/error_log_insert_stmt_8d4130f0c18aeb0a-ad7ffd4233c41893_8d4130f0c18aeb0a_ad7ffd4233c41893" } the sql select result: MySQL [example_db]> select * from array_test06; +------+--------------+ \| k1 \| k2 \| +------+--------------+ \| 1 \| [1, 2, 3] \| \| 2 \| [4, 5, 6] \| \| 3 \| NULL \| \| 4 \| [7, NULL, 8] \| +------+--------------+ 4 rows in set (0.019 sec) the url page show us: "Reason: Invalid format for array column(k2). src line [10, 11, 12]; " Issue Number: #7570	2022-09-19 08:52:59 +08:00
yixiutt	65cff8d40c	[enhancement](compaction) prevent quick_compaction&auto_compaction conflict (#12674 ) Co-authored-by: yixiutt <yixiu@selectdb.com>	2022-09-19 08:39:27 +08:00
Mingyu Chen	bc38b2fdfb	[improvement](new-scan) graceful quit scanner scheduler (#12715 )	2022-09-19 08:39:08 +08:00
Yongqiang YANG	625ac83f72	[enhancement](test) add opensky cases to p2 (#12693 )	2022-09-19 08:38:17 +08:00
Yongqiang YANG	fc8f4c787d	[enhancement](test) add yandex_metrica cases to p2 (#12692 )	2022-09-19 08:37:48 +08:00
starocean999	3b7a04ee8b	[fix](inpredicate)always use PredicateColumn<TYPE_STRING> for CHAR, VARCHAR and STRING type (#12637 ) The predicate column type for char, varchar and string is PredicateColumnType<TYPE_STRING>, so _base_evaluate method should convert the input column to PredicateColumnType<TYPE_STRING> always.	2022-09-19 08:37:06 +08:00
Mingyu Chen	a4ed023bad	[fix](colocation) fix decommission failure with 2 BEs and colocation table (#12644 ) This PR fix: 2 Backends. Create tables with colocation group, 1 replica. Decommission one of Backends. The tablet on decommissioned Backend is not reduced. This is a bug of ColocateTableCheckerAndBalancer.	2022-09-19 08:34:50 +08:00
HB	00dda79735	[fix](broker-load) Correction of kerberos authentication time determination rule (#11793 ) Every time a new broker load comes in, Doris updates the start time of Kerberos authentication, but this logic is wrong, because the authentication duration of Kerberos is calculated from the moment when the ticket is obtained. This PR changes the logic: 1. If it is Kerberos, check fs expiration by create time. 2. Otherwise, check fs expiration by access time.	2022-09-18 17:46:13 +08:00
luozenglin	cb06e67fba	[fix](tracing) Fix opentelemetry log output to be.out (#11856 )	2022-09-18 17:40:23 +08:00
abmdocrt	4f98146e83	[enhancement](tracing) Support forward to master tracing (#12290 )	2022-09-18 17:39:04 +08:00
Yongqiang YANG	e9f105aa1e	[enhancement](regression-test) add some p0 cases (#12243 )	2022-09-18 17:36:08 +08:00
Yongqiang YANG	c30453e9ab	[enhancement](regression-test) add ssb_sf100 to p2 cases (#12286 )	2022-09-18 17:35:16 +08:00
Xinyi Zou	a73b28789d	Fix memory leak by calling in mem hook (#12708 ) After the consume mem tracker exceeds the mem limit in the mem hook, the boost stacktrace will be printed. A query/load will only be printed once, and the process tracker will only be printed once per second. After the process memory reaches the upper limit, the boost stacktrace will be printed every second. The observed phenomena are as follows: After query/load is canceled, the memory increases instantly; tcmalloc profile total physical memory is less than perf process memory; The process mem tracker is smaller than the perf process memory;	2022-09-18 10:04:15 +08:00
morrySnow	2e41976b07	update tpch regression test (#12687 ) turn on all TPC-H sf1 test cases except Q2. Q2 caused dead loop in Join reorder. Will turn on Q2 after fix it.	2022-09-17 17:06:39 +08:00
Xin Liao	bac58a4774	[feature-wip](unique-key-merge-on-write) fix calculate delete bitmap when flush memtable (#12668 )	2022-09-17 17:04:03 +08:00
HappenLee	35b97a5af0	[Opt](hash) Speed up insert from dict data map and not datetime (#12670 ) Speed up dict data read and not datetime. same target #12636	2022-09-17 17:02:43 +08:00
luozenglin	3030a3606a	[fix](load) fix stream load fail when setting strict mode (#12684 )	2022-09-17 17:02:11 +08:00
Xinyi Zou	3bb042e45c	[fix](memtracker) Process physical mem check does not include tc/jemalloc allocator cache (#12688 ) tcmalloc/jemalloc allocator cache does not participate in the mem check as part of the process physical memory. because new/malloc will trigger mem hook when using tcmalloc/jemalloc allocator cache, but it may not actually alloc physical memory, which is not expected in mem hook fail. in addition: The value of tcmalloc/jemalloc allocator cache is used as a mem tracker, the parent is the process mem tracker, which is updated every 1s. Modify the process default mem_limit to 90%. expect mem tracker to effectively limit the memory usage of the process.	2022-09-17 11:31:01 +08:00
Lightman	e01986b8b9	[feature](light-schema-change) fix light-schema-change and add more cases (#12160 ) Fix _delete_sign_idx and _seq_col_idx when append_column or build_schema when load. Tablet schema cache support recycle when schema sptr use count equals 1. Add a http interface for flink-connector to sync ddl. Improve tablet->tablet_schema() by max_version_schema.	2022-09-17 11:29:36 +08:00
Xinyi Zou	942b31038f	[fix](memory) Fix BE OOM when load -238 fail (#12666 ) When the flush is triggered when the load channel exceeds the mem limit, if the flush fails, an error message is returned and the load is terminated. Usually flush failure is -238 error code. Because the memtable is frequently flushed after the load channel exceeds the mem limit, the number of segments exceeds the max value.	2022-09-17 00:17:53 +08:00
Xinyi Zou	42b6532131	remove gc and fix print (#12682 )	2022-09-17 00:16:15 +08:00
924060929	0a95ebf602	[feature](Nereids) Add scalar function code generator and some function trait (#12671 ) This pr did these things: 1. Change the nullable mode of 'from_unixtime' and 'parse_url' from DEPEND_ON_ARGUMENT to ALWAYS_NULLABLE, which nullable configuration was missing previously. 2. Add some new interfaces for origin NullableMode. This change inspired by the grammar of scala's mix-in trait, It help us to quickly understand the traits of function without read the lengthy procedural code and save the work to write some template code, like `class Substring extends ScalarFunction implements ImplicitCastInputTypes, PropagateNullable`. These are the interfaces: - PropagateNullable: equals to NullableMode.DEPEND_ON_ARGUMENT - AlwaysNullable: equals to NullableMode.ALWAYS_NULLABLE - AlwaysNotNullable: equals to NullableMode.ALWAYS_NOT_NULLABLE - others ComputeNullable: equals to NullableMode.CUSTOM 3. Add `GenerateScalarFunction` to generate nereids-style function code from legacy functions, but not actual generate any new function class yet, because the function's trait is not ready for use. I need add some traits for the legacy function's CompareMode and NonDeterministic, this thought is the same as ComputeNullable.	2022-09-16 21:27:30 +08:00
Zhengguo Yang	b733a23cf7	[Bugfix](stack_over_flow) fix be may core dump because of stack-buffer-overflow when TBrokerOpenReaderResponse too large (#12658 )	2022-09-16 20:57:22 +08:00
wudi	a3fee5afbb	[doc](variables) fix forward_to_master doc bug #12659 Co-authored-by: wudi <>	2022-09-16 20:56:55 +08:00
yongjinhou	6fc74def02	[fix](Broker load): fix bug for broker label has already been used (#12630 )	2022-09-16 20:46:01 +08:00
morrySnow	378acfa28f	[enhancement](Nereids) eliminate all unessential cross join in TPC-H benchmark (#12651 ) For eliminate all unessential cross join in TPC-H benchmark, this PR: 1. push all predicates that can be push down through join before do ReorderJoin rule. Then we could eliminate all cross join that can be eliminated in ReorderJoin rule since this rule need matching a LogicalFilter as a root pattern. (Q2, Q15, Q16, Q17, Q18) 2. enable expression optimization rule - extract common expression. (Q19) 3. fix cast translate failed. (Q19)	2022-09-16 19:09:58 +08:00
Yongqiang YANG	a4a5dae7dc	[enhancement](test) add tpcds_sf100 to p2 cases (#12296 )	2022-09-16 17:38:23 +08:00
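
Commit 954c44db39 in the list above describes comparing LogicalProperties by an output set rather than an output list, so that two join plans with reordered children compare as equal. The following is a minimal Java sketch of that idea only. The class `OutputSetProps` and its string-typed outputs are hypothetical stand-ins for illustration and are not the Nereids classes.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a plan's logical properties; not the Doris class.
public class OutputSetProps {
    private final List<String> output;

    public OutputSetProps(List<String> output) {
        this.output = output;
    }

    // Order-insensitive view of the outputs.
    private Set<String> outputSet() {
        return new HashSet<>(output);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof OutputSetProps)) {
            return false;
        }
        // Compare as sets, so [A.t1, B.t1] equals [B.t1, A.t1].
        return outputSet().equals(((OutputSetProps) o).outputSet());
    }

    @Override
    public int hashCode() {
        // Must agree with equals: a set's hash does not depend on order.
        return outputSet().hashCode();
    }

    public static void main(String[] args) {
        OutputSetProps ab = new OutputSetProps(Arrays.asList("A.t1", "B.t1"));
        OutputSetProps ba = new OutputSetProps(Arrays.asList("B.t1", "A.t1"));
        System.out.println(ab.equals(ba)); // prints "true"
    }
}
```

Keeping `hashCode` derived from the same set view is what lets both orderings land in the same hash bucket, which a memo-style lookup depends on.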


8276 Commits
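
Commit e1d2f82d8e in the list above shows `${name}` placeholders being replaced from a parameter map. As a rough illustration of how such substitution can work, here is a small Java sketch. The class `SqlTemplateSketch` and the method `fill` are hypothetical names, and this is not the code behind `InternalSqlTemplate.processTemplate`, whose actual behavior (for example on missing parameters) is not stated in the commit message.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of ${name} substitution; not the Doris implementation.
public class SqlTemplateSketch {
    // Matches ${identifier} and captures the identifier.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{(\\w+)\\}");

    static String fill(String template, Map<String, String> params) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String name = m.group(1);
            String value = params.get(name);
            if (value == null) {
                // Assumption: a missing parameter is treated as an error.
                throw new IllegalArgumentException("missing parameter: " + name);
            }
            // quoteReplacement keeps '$' and '\' in values literal.
            m.appendReplacement(sb, Matcher.quoteReplacement(value));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put("table", "table0");
        params.put("id", "123");
        // prints "SELECT * FROM table0 WHERE id = 123;"
        System.out.println(fill("SELECT * FROM ${table} WHERE id = ${id};", params));
    }
}
```

Values are pasted in verbatim with no escaping, so a sketch like this is only suitable for trusted, internally generated parameters.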