Create partitions in batch using:
```
PARTITION BY RANGE(event_day)(
FROM ("2000-11-14") TO ("2021-11-14") INTERVAL 1 YEAR,
FROM ("2021-11-14") TO ("2022-11-14") INTERVAL 1 MONTH,
FROM ("2022-11-14") TO ("2023-01-03") INTERVAL 1 WEEK,
FROM ("2023-01-03") TO ("2023-01-14") INTERVAL 1 DAY,
PARTITION p_20230114 VALUES [('2023-01-14'), ('2023-01-15'))
)
PARTITION BY RANGE(event_time)(
FROM ("2023-01-03 12") TO ("2023-01-14 22") INTERVAL 1 HOUR
)
```
This syntax creates year/month/week/day/hour date partitions in a batch, and it remains compatible with the existing single-partition syntax.
## Problem summary
This PR supports:
1. A `numbers` table-valued function for Nereids tests, e.g. `select * from numbers(number = 10, backend_num = 1)`
2. Bitmap/HLL aggregate functions
3. Finding variable-arity functions in the function registry, such as `coalesce`
4. A fix for a bug where printing the Nereids trace throws an exception because a RewriteRule (e.g. `AggregateDisassemble`) is used in ApplyRuleJob; introduced by #13957
ORC's NextStripeReader only supports reading columns by index, but it is hard to compute column indices for complex types. We patch the ORC adapter to support reading columns by column name.
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
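For illustration, a minimal sketch of by-name column selection using the Apache ORC C++ API, whose `orc::RowReaderOptions::include` has an overload that takes column names; the file path and column names below are hypothetical, and the actual Doris patch to the vendored adapter differs in detail.
```
#include <list>
#include <memory>
#include <string>

#include <orc/OrcFile.hh>

// Select columns by name instead of by index, so complex (nested)
// types do not require computing flattened column indices by hand.
std::unique_ptr<orc::RowReader> open_reader_by_names(const std::string& path) {
    orc::ReaderOptions reader_opts;
    std::unique_ptr<orc::Reader> reader =
        orc::createReader(orc::readLocalFile(path), reader_opts);

    orc::RowReaderOptions row_opts;
    // include() also accepts top-level column names ("user_id" and
    // "events" are made-up examples).
    row_opts.include(std::list<std::string>{"user_id", "events"});
    return reader->createRowReader(row_opts);
}
```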
For runtime filters, signal is called from a thread different from the awaiting thread, so there is a potential race on the variable is_ready.
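One standard way to remove such a race is to guard the flag with a mutex and condition variable so the signalling and awaiting threads synchronize; this is a minimal sketch with illustrative names, not Doris's actual runtime filter code.
```
#include <condition_variable>
#include <mutex>

class RuntimeFilterSignal {
public:
    void signal() {
        {
            std::lock_guard<std::mutex> lock(_mutex);
            _is_ready = true;  // write happens under the lock
        }
        _cv.notify_all();
    }

    void await() {
        std::unique_lock<std::mutex> lock(_mutex);
        // read happens under the same lock, so no race on _is_ready
        _cv.wait(lock, [this] { return _is_ready; });
    }

private:
    std::mutex _mutex;
    std::condition_variable _cv;
    bool _is_ready = false;
};
```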
To support queries like:
```
SELECT c1 + 1 as a, sum(c2) FROM t GROUP BY c1 + 1 ORDER BY c1 + 1
```
after the rewrite, the plan is equivalent to:
```
SELECT c1 + 1 as a, sum(c2) FROM t GROUP BY c1 + 1 ORDER BY a
```
1. Add a post-processor: the runtime filter pruner.
Doris generates RFs (runtime filters) on Join nodes to reduce the probe table at the scan stage, but some RFs have no effect because their selectivity is 100%. This PR removes them (see the sketch after this list).
An RF is effective if:
a. the build column's value range covers only part of the probe column's range, OR
b. the build column's NDV is less than the probe column's, OR
c. the build column's ColumnStats.selectivity < 1, OR
d. the build column is reduced by another RF that satisfies the criteria above.
2. Explain graph
a. Add RF info to Join and Scan nodes
b. Add a predicate count to Scan nodes
3. Rename the session variable `enable_remove_no_conjuncts_runtime_filter_policy` to `enable_runtime_filter_prune`
4. Fix a min/max column-stats derivation bug: for `select max(A) as X from T group by B`, X.min is A.min, not A.max.
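Item 1's effectiveness test, as a hedged sketch: the `ColumnStats` fields and function names below are illustrative stand-ins, not the actual Doris planner code.
```
// Simplified column statistics; the real stats are typed per column.
struct ColumnStats {
    double min_value;
    double max_value;
    double ndv;          // number of distinct values
    double selectivity;  // fraction of rows surviving earlier predicates
};

bool is_effective_rf(const ColumnStats& build, const ColumnStats& probe,
                     bool build_reduced_by_effective_rf) {
    // a. build range covers only part of the probe range
    bool narrower_range =
        build.min_value > probe.min_value || build.max_value < probe.max_value;
    // b. build side has fewer distinct values than the probe side
    bool smaller_ndv = build.ndv < probe.ndv;
    // c. build column is already filtered by other predicates
    bool filtered = build.selectivity < 1.0;
    // d. build side was shrunk by another effective RF upstream
    return narrower_range || smaller_ndv || filtered ||
           build_reduced_by_effective_rf;
}
```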
Currently, building BE from source in workflow environments (P0/P1) takes too much time, which hurts the efficiency of daily development.
We can measure the build time with the following command:
```
time EXTRA_CXX_FLAGS='-O3' BUILD_TYPE=ASAN ./build.sh --be --fe --clean -j "$(nproc)"
```
This PR reduces compilation time using the following methods:
1. Reduce codegen by removing some useless `std::visit` calls.
2. Disable optimization for some template functions that are instantiated conditionally via `std::visit`, except in the RELEASE build (see the sketch after this list).
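A minimal sketch of method 2, assuming GCC/Clang per-function attributes (`optnone` on Clang, `optimize("O0")` on GCC); the macro name and the `NDEBUG` gate are illustrative stand-ins for the real build-type check, not Doris's actual flags.
```
// In non-RELEASE builds (ASAN, DEBUG, ...), compile heavy template
// instantiations at -O0 to cut compile time; RELEASE keeps them optimized.
#ifndef NDEBUG
#  if defined(__clang__)
#    define COMPILE_FAST __attribute__((optnone))
#  else
#    define COMPILE_FAST __attribute__((optimize("O0")))
#  endif
#else
#  define COMPILE_FAST
#endif

// A template reached conditionally via std::visit and instantiated for
// many column types; skipping optimization here is cheap at runtime
// but saves a lot of compiler work.
template <typename T>
COMPILE_FAST void process_column(const T& column) {
    // ... heavy templated body ...
}
```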
When upgrading from 1.1 to master, rolling back to 1.1, and then upgrading to master again, BE coredumps because some rowsets have a schema and some do not. On the first upgrade from 1.1, BE flushes the schema into every rowset; after the rollback to 1.1, compaction creates new rowsets without a schema; on the second upgrade, BE coredumps because some code paths assume that either all rowsets or none of them have a schema.
1. Support persisting collected statistics to a pre-built OLAP table named `column_statistics`.
2. Use a much simpler mechanism to collect statistics: all gauges are collected with a single SQL statement per partition and then for the whole column, as defined in the class `AnalysisJob`.
3. Implement a cache in FE to manage the statistics records.
TODO:
1. Use OpenTelemetry to monitor the execution time of each job
2. Format the internal analysis SQL
3. Split the SQL so that the IN expression's child count does not exceed the FE limit when generating SQL for deleting expired records
4. Implement SHOW statements
Read the predicate columns first and use VExprContext (pushed-down predicates) to generate the select vector, which is then applied when reading the non-predicate columns. Values in the non-predicate columns that the select vector skips never need to be decoded, which reduces value-decode time. If a whole page can be skipped, decompression time is also reduced.
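A self-contained sketch of the two-phase read under simplified assumptions; the vector types and the hard-coded predicate (`v > 10`) are stand-ins, not Doris's actual segment reader.
```
#include <cstdint>
#include <vector>

using SelectVector = std::vector<uint32_t>;  // surviving row ids

// Phase 1: scan only the predicate column and evaluate the pushed-down
// predicate to build the select vector.
SelectVector eval_predicate(const std::vector<int64_t>& pred_col) {
    SelectVector sel;
    for (uint32_t i = 0; i < pred_col.size(); ++i) {
        if (pred_col[i] > 10) sel.push_back(i);
    }
    return sel;
}

// Phase 2: materialize a non-predicate column only at the selected
// rows; skipped values are never decoded, and a page containing no
// selected row would never be decompressed at all.
std::vector<int64_t> read_selected(const std::vector<int64_t>& col,
                                   const SelectVector& sel) {
    std::vector<int64_t> out;
    out.reserve(sel.size());
    for (uint32_t row : sel) out.push_back(col[row]);
    return out;
}
```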