doris

Author	SHA1	Message	Date
Qi Chen	ef2fdb79bb	[Improvement](parquet-reader) Optimize and refactor parquet reader to improve performance. (#16818 ) - Improve 2x performance for small dict string by aligned copying. - Refactor code to decrease condition(if) checking. - Don't call skip(0). - Don't read page index if no condition. ssb-flat-100 (single-machine, single-thread), before opt / after opt: SELECT count(lo_revenue) FROM lineorder_flat: 9.23 / 9.12; SELECT count(lo_linenumber) FROM lineorder_flat: 4.50 / 4.36; SELECT count(c_name) FROM lineorder_flat: 18.22 / 17.88; SELECT count(lo_shipmode) FROM lineorder_flat: 10.09 / 6.15	2023-02-20 11:42:29 +08:00
Pxl	2bc014d83a	[Enhancement](function) remove unused params on aggregate function (#16886 ) remove unused params on aggregate function	2023-02-20 11:08:45 +08:00
Xin Liao	46d5cca661	[fix](merge-on-write) The delete bitmap of the currently imported rowset is not persistent (#16859 )	2023-02-20 11:02:41 +08:00
zhannngchen	b7d2bec8ea	[fix](merge-on-write) add check for segment num (#14032 )	2023-02-20 11:01:34 +08:00
ZhaoChangle	e958b13747	[Exec] Add conjection for union_node. (#16777 )	2023-02-20 10:48:58 +08:00
Mingyu Chen	97230a54fb	[Refactor](auth)(step-2) Add AccessController to support customized authorization (#16802 ) 1. Support specifying AccessControllerFactory when creating a catalog: create catalog hive properties( ... "access_controller.class" = "org.apache.doris.mysql.privilege.RangerAccessControllerFactory", "access_controller.properties.prop1" = "xxx", "access_controller.properties.prop2" = "yyy", ... ) so that users can specify their own access controller, such as RangerAccessController. 2. Add interface to check column level privilege: a new method of CatalogAccessController, checkColsPriv(), for checking column level privileges. TODO: Support grant column level privileges statements in Doris. 3. Add TestExternalCatalog/Database/Table/ScanNode: these classes are used for FE unit tests. In a unit test you can run create catalog test1 properties( "type" = "test" "catalog_provider.class" = "org.apache.doris.datasource.ColumnPrivTest$MockedCatalogProvider" "access_controller.class" = "org.apache.doris.mysql.privilege.TestAccessControllerFactory", "access_controller.properties.key1" = "val1", "access_controller.properties.key2" = "val2" ); to create a test catalog, and specify catalog_provider to mock database/table/schema metadata. 4. Set roles in current user identity in connection context: the roles can be used for authorization in the access controller.	2023-02-20 10:32:48 +08:00
zhangstar333	5291f14aff	[vectorized](udf) java udf support array type (#16841 )	2023-02-20 10:00:25 +08:00
Xinyi Zou	2074b83c67	[enhancement](third-party) Upgrade JEMalloc version from 5.2.1 to 5.3.0 (#14871 ) https://github.com/jemalloc/jemalloc/releases	2023-02-20 00:00:40 +08:00
Kang	58c51086ca	[bugfix](topn) fix topn read_orderby_key_columns nullptr (#16896 ) The SQL `SELECT nationkey FROM regression_test_query_p0_limit.tpch_tiny_nation ORDER BY nationkey DESC LIMIT 5` may core dump because it dereferences a nullptr `read_orderby_key_columns` in `VCollectIterator::_topn_next`, triggered by skipping the _colname_to_value_range init in #16818 . This PR makes two changes: 1. avoid a nullptr read_orderby_key_columns in TabletReader::_init_orderby_keys_param 2. return an error if read_orderby_key_columns is unexpectedly nullptr in VCollectIterator::_topn_next, to avoid a core dump	2023-02-19 23:28:33 +08:00
zhangdong	1c6c28b8fb	[Enhance](ComputeNode) K8sDeployManager support domain (#16897 ) Describe your changes. 1.DeployManager adds the ability to obtain domain names from third-party systems 2.When the DeployManager determines whether the node exists, add the domain name judgment logic 3.rename Backend.getHost() to getIp() 4.Delete the logic for handling UnknownHostException in FQDNManager, because there are two cases of UnknownHostException. If it occurs temporarily, it can wait for the next detection. If the node is deleted, the logic can be handed over to DeployManager for processing.	2023-02-19 21:30:18 +08:00
Mingyu Chen	cd3dbc33c9	[deps](be) update libhdfs3 and jemalloc (#16894 ) - Modified: libhdfs3 2.3.7 -> 2.3.8 - Modified: jemalloc 5.2.1 -> 5.3.0 (#14871)	2023-02-19 19:49:27 +08:00
xy720	73f7979b73	[fix](struct-type) forbid struct-type to be distributed key/aggregation key and add more tests (#16626 ) This commit forbids struct and map types from being used as a distributed key/aggregation key. SQL such as: select distinct stuct_col from struct_table will report an error.	2023-02-19 15:16:36 +08:00
amory	8b70bfdc31	[Feature](map-type) Support stream load and fix some bugs for map type (#16776 ) 1. support stream load with json and csv formats for map 2. fix olap convertor when compaction runs on a map column which has null 3. support select outToFile for map 4. add some regression-test	2023-02-19 15:11:54 +08:00
huangzhaowei	96a3c60d3b	[feature-wip](MTMV) Support alter statement (#16817 ) Steps: 1. drop the old MTMV jobs 2. clear the old task records and clean the running and pending tasks 3. set the new scheduler info in MTMV and replay it in followers. 4. create a job in the master node. Note that if you change the refresh info of MTMV, the old MTMV tasks will be cleaned.	2023-02-19 12:15:17 +08:00
jakevin	d4cebb39ba	[fix](Nereids): fix SemiJoinLogicalJoinTransposeProject. (#16883 )	2023-02-18 23:12:34 +08:00
zhengshengjun	e2e6a0dd83	[Feature](load) Support mutable property for partition (#16036 ) The background is described in this issue: #15723, where users used Apache Druid to satisfy such lambada requirements before. We will not make Doris automatically drop data that does not belong to the current time window like Druid does, which is not flexible. We need the ability to support mutable/immutable partitions. The PR works this way: 1. Support mutable property for a partition. 2. The mutable property of a partition is passed from FE to BE in a load procedure 3. If a record's partition is immutable, we mark this row as "un selected", which will not be included in the computation of 'max_filter_ratio', so that data written to an immutable partition will be ignored and will not cause load failure. Use Example: 1. Add an immutable partition or modify a partition to be immutable: - alter table test_tbl add [temporary] partition xxx values less than ('xxx') ('mutable' = 'true'); - alter table test_tbl modify partition xx set ('mutable' = 'false'); 2. Write 5 records into table, two of them belong to an immutable partition	2023-02-18 23:09:34 +08:00
liwei	1ac5b23e40	Update doris-join-optimization.md (#15818 ) Fix documentation errors	2023-02-18 22:24:51 +08:00
ZhaoChangle	d6a841409f	[Enhancement](func)Introduce non_nullable extraction function. #16621 Introduced a new function non_nullable to BE, which can extract concrete data column from a nullable column. If the input argument is already not a nullable column, raise an error.	2023-02-18 20:44:07 +08:00
xy720	45427b86be	[regression](struct-type) add more regression tests for struct and map type (#16790 ) This commit forbids struct and map columns in Materialized view and adds more regression tests.	2023-02-18 20:42:17 +08:00
catpineapple	45dbd4d872	[fix](dbt)fix dbt incremental #16840 fix dbt incremental :new ideas for no rollback and support incremental data rerun . add snapshot use 'mysql-connector-python' mysql driver to replace 'MysqlDb' driver	2023-02-18 20:40:56 +08:00
AKIRA	861e4bc64a	[fix](planner) Nullable of slot descriptor is mistaken and cause BE crash #16862	2023-02-18 20:39:56 +08:00
jiafeng.zhang	4bf778c6cd	[typo](docs)fix dynamic Table version label (#16895 )	2023-02-18 20:39:14 +08:00
zhangguoqiang	a4e42b1e94	[improvement](pipeline) Added compatible code synchronization delay issues with failures and updates needed to trigger the pipeline (#16902 )	2023-02-18 20:26:23 +08:00
Mingyu Chen	2d7d8102c7	[fix](doc) fix mal-format doc #16898 We must write sql reference with guidance: https://doris.apache.org/zh-CN/community/how-to-contribute/contribute-doc/#%E5%A6%82%E4%BD%95%E7%BC%96%E5%86%99%E5%91%BD%E4%BB%A4%E5%B8%AE%E5%8A%A9%E6%89%8B%E5%86%8C	2023-02-18 14:30:54 +08:00
Stalary	070f42c463	[Enhancement](Es): Support config like whether push down to es (#16800 ) Support a config for whether like is pushed down to es, and refactor some code. Like is transformed to a wildcard query and pushed down to es, which increases the cpu consumption of es, so I add a switch to control it.	2023-02-17 21:56:11 +08:00
FreeOnePlus	d5c393f413	[docs](docs)Fix FE config max_running_txn_num_per_db default value (#16877 )	2023-02-17 20:55:52 +08:00
yagagagaga	90ae8dcf01	[typo](docs)supplement the document content (#16884 ) * [typo](docs)supplement the document content * Update grouping.md Add space before and after English letters in CN docs and keep the English case consistent. * Update grouping.md Change the Chinese title to English	2023-02-17 20:55:34 +08:00
yongkang.zhong	adc42600b4	[typo](docs)Modify some document label errors (#16866 ) * [typo](docs)Modify some document label errors * fix	2023-02-17 20:55:17 +08:00
HappenLee	fda4afecf5	[RegressionTest](Pipeline) Fix pipeline failed in regression test (#16880 ) regression-test/suites/inverted_index_p0/test_add_drop_index_with_data.groovy	2023-02-17 20:49:17 +08:00
caoliang-web	ea0e090a77	collect_set function documentation added 1.2 label (#16868 )	2023-02-17 19:05:44 +08:00
谢健	fd5d7d6097	[refactor](Nereids) remove local sort (#16819 ) After adding phase in sort, the local sort is no longer needed. Change the order of sortPhase in the constructor.	2023-02-17 18:52:41 +08:00
morrySnow	9b94729c87	Revert "[test](pipeline) Run nereids cases in p1/p2 (#16130 )" (#16792 ) This reverts commit b480db2e119ac0516e8621ea3d53c40f250c1d24.	2023-02-17 18:48:27 +08:00
pengxiangyu	6a1e3d3435	[fix](cooldown)Fix bug for single cooldown compaction, add remote meta (#16812 ) * fix bug, add remote meta for compaction	2023-02-17 15:13:06 +08:00
Pxl	da147f1d1c	[Chore](build) remove memory_copy and remove some wno build check (#16831 ) * remove memory_copy and remove some wno cbuild check	2023-02-17 14:43:24 +08:00
TengJianPing	ef2130de57	[improvement](memory) fix possible memory leak of vcollect iterator (#16822 ) Logic in function VCollectIterator::build_heap is not robust, which may cause a memory leak: Level1Iterator* cumu_iter = new Level1Iterator( cumu_children, _reader, cumu_children.size() > 1, _is_reverse, _skip_same); RETURN_IF_NOT_EOF_AND_OK(cumu_iter->init()); std::list<LevelIterator> children; children.push_back(base_reader_child); children.push_back(cumu_iter); _inner_iter.reset( new Level1Iterator(children, _reader, _merge, _is_reverse, _skip_same)); cumu_iter will be leaked if cumu_iter->init() does not succeed.	2023-02-17 14:40:15 +08:00
YueW	30dafd6a44	[improve](inverted index) Add element count limit for inverted index searcher cache (#16758 ) Each element in InvertedIndexSearcherCache is an inverted index searcher, which is a file descriptor of an inverted index file, so InvertedIndexSearcherCache actually caches file descriptors of inverted index files. If the open file descriptor limit of the Linux system is set too small and the config inverted_index_searcher_cache_limit is too big, high-pressure load may cause "Too many open files". So, when inserting an inverted index searcher into InvertedIndexSearcherCache, we also need to check whether the file_descriptor_number limit for inverted index files has been reached.	2023-02-17 11:53:07 +08:00
airborne12	1a9eefebd4	[Fix](inverted index) fix array inverted index error match result when doing schema change add index (#16839 ) There is a bug in inverted_index_writer when adding multiple lines array values' index. This problem can cause error result when doing schema change adding index.	2023-02-17 11:50:39 +08:00
lihangyu	6acee1ce88	[Fix](topn opt) double check plan From OriginalPlanner to make sure optimized SQL is a general topn query (#16848 ) In the original logic, a query like `select * from a where exists (select * from b order by 1) order by 1 limit 1` contains a subquery, but the top query will pass `checkEnableTwoPhaseRead` and set `isTwoPhaseOptEnabled=true`. So a double check that the plan is a general topn query plan is needed, and the needMaterialize flag set by the previous `analyze` is rolled back.	2023-02-17 10:59:35 +08:00
Adonis Ling	630865a32f	[chore](workflow) Fix the BE UT (Clang) workflow (#16847 ) Fix the BE UT (Clang) workflow	2023-02-17 10:34:57 +08:00
lihangyu	5dfd6d2390	[improve](dynamic table) refine SegmentWriter columns writer generate (#16816 ) * [improve](dynamic table) refine SegmentWriter columns writer generate ``` Dynamic Block consists of two parts, dynamic part of columns and static part of columns static dynamic \| ----- \| ------- \| the static ones are original _tablet_schema columns the dynamic ones are auto generated and extended from file scan. ``` We should only consider using Block info to generate columns when it's a dynamic table load procedure, and separate the static ones and dynamic ones * test	2023-02-17 10:24:33 +08:00
lihangyu	2426d8e6e8	[chore](be-config) set disable_storage_row_cache default true to default disable row cache (#16827 )	2023-02-17 10:21:28 +08:00
Gabriel	3d6077efe0	[pipeline](profile) Support real-time profile report in pipeline (#16772 )	2023-02-17 10:01:34 +08:00
Yulei-Yang	fe4ef23489	[fix](doc) add essential property for hive catalog on Kerberized hms (#16781 ) The property `hive.metastore.kerberos.principal` is essential when the principal of the hms you are connecting to is not the default value: hive-metastore/_HOST@your_realms. Otherwise, you will get the error: Failure unspecified at GSS-API level (Mechanism level: Checksum failed)	2023-02-17 09:54:29 +08:00
zhangdong	1fc5023d97	[Enhance](ComputeNode) K8sDeployManager support computeNode (#16789 ) 1.allow have no ELECTABLE or BACKEND 2.add cn NodeType 3.delete deprecated code	2023-02-17 09:08:14 +08:00
Gabriel	b35998a3b7	[Bug](datetimev2) Support cast datetimev2 to datetimev2 with different precision #16826	2023-02-17 08:42:36 +08:00
FreeOnePlus	6012fc3605	[feature](docker)Fe docker init script add new interface option (#16846 ) add interface BUILD_TYPE, Values only one "k8s". e.g. docker run -itd \ --name=fe-02 \ --env BUILD_TYPE="k8s" -p 8032:8030 \ -p 9032:9030 \ --network=doris-network \ --ip=172.20.80.4 \ freeoneplus/doris:1.2.2-fe-x86_64 add interface group FE_MASTER_IP & FE_MASTER_PORT & FE_CURRENT_IP & FE_CURRENT_PORT docker run -itd \ --name=fe-02 \ --env FE_MASTER_IP="172.20.80.2" \ --env FE_MASTER_PORT=9010 \ --env FE_CURRENT_IP="172.20.80.4" \ --env FE_CURRENT_PORT=9010 \ -p 8032:8030 \ -p 9032:9030 \ --network=doris-network \ --ip=172.20.80.4 \ freeoneplus/doris:1.2.2-fe-x86_64 --------- Co-authored-by: Yijia Su <suyijia@selectdb.com>	2023-02-17 08:41:38 +08:00
HappenLee	24ef60b491	[Opt](exec) opt aggregate function performance in nullable column	2023-02-16 22:26:12 +08:00
starocean999	4c7f19ab02	[enhancement](nereids) add eliminate left nullaware anti join rule (#16774 ) if no join conjunct is nullable, the left null aware anti join can be converted to left anti join	2023-02-16 21:54:14 +08:00
mch_ucchi	407ccaaff7	[Fix](planner) create table as select with null_type select item cause be core bug (#16778 ) sql: create table t as select null as k will sometimes cause a BE core. Now we change null_type to tinyint nullable to avoid it.	2023-02-16 20:01:13 +08:00
zhangguoqiang	f86e8ec395	[enhancement] change the teamcity pipeline trigger : triggered by github pull request comment (#16836 ) Optimize some code, reduce invalid code, and fix syntax errors	2023-02-16 19:51:58 +08:00

8795 Commits