doris

Author	SHA1	Message	Date
Xinyi Zou	c784fb3ddd	[fix] (mem tracker) Fix core dump during transmit_block (#10133 ) In some cases, the query mem tracker does not exist in the BE when a block is transmitted, so getting the query mem tracker in the brpc transmit_block call yields a null pointer.	2022-06-17 00:01:30 +08:00
HappenLee	8d98c17c4e	[Bug][Vectorized] Fix DCHECK failed in VExchangeNode close twice (#10184 ) Co-authored-by: lihaopeng <lihaopeng@baidu.com>	2022-06-16 23:56:49 +08:00
Jibing-Li	f1c9105af1	[feature] Support hive on s3 (#10128 ) Support querying Hive tables on S3. Pass the AK/SK, region, and S3 endpoint to the Hive table when creating the external table. Example CREATE TABLE statement: ``` CREATE TABLE `region_s3` ( `r_regionkey` integer NOT NULL, `r_name` char(25) NOT NULL, `r_comment` varchar(152) ) engine=hive properties ("database"="default", "table"="region_s3", "hive.metastore.uris"="thrift://127.0.0.1:9083", "AWS_ACCESS_KEY"="YOUR_ACCESS_KEY", "AWS_SECRET_KEY"="YOUR_SECRET_KEY", "AWS_ENDPOINT"="s3.us-east-1.amazonaws.com", "AWS_REGION"="us-east-1"); ```	2022-06-16 19:15:46 +08:00
smallhibiscus	41b693e1df	[test] Add window cast bitmap digital_masking function regression test. (#9924 )	2022-06-16 19:14:51 +08:00
Dongyang Li	ac2be958b3	[tpch tools]set exec_mem_limit=8G for tpch queries (#10119 ) Co-authored-by: Jerry <root@localhost.localdomain>	2022-06-16 18:19:11 +08:00
yinzhijian	75a7e72402	[Refactor] Use iequal to replace boost::iequals (#10146 ) * [Refactor] Use iequal to replace boost::iequals * remove unused include	2022-06-16 18:18:38 +08:00
camby	14d21edf65	[fix] croaringbitmap compile support USE_AVX2=0 (#10140 ) * If we disable AVX2 by config USE_AVX2=0, we need to build croaringbitmap with ROARING_DISABLE_AVX=ON * update to trigger regression test again Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>	2022-06-16 18:17:46 +08:00
Pxl	ae9c231925	[Enhancement][Storage] refactor InListPredicate/NotInListPredicate (#10139 ) * refactor in_list_pred * update	2022-06-16 18:09:29 +08:00
lihangyu	f49a4535c4	[Fix] fix vjson_scanner heap use after free when meet object or array type (#10179 ) quick merge. It is a serious bug in 1.1.	2022-06-16 16:01:18 +08:00
HappenLee	33921c5e75	[Bug] Fix _add_block_closure do not delete in ~VNodeChannel() (#10180 ) Co-authored-by: lihaopeng <lihaopeng@baidu.com>	2022-06-16 15:56:07 +08:00
wangyongfeng	dad953bc08	[doc](website)fix SSR bug and add algolia search (#10178 ) * fix ssr bug and add algolia search	2022-06-16 14:25:46 +08:00
lihangyu	3f9436c6a8	[compile]fix simdjson compile flags (#10054 )	2022-06-16 11:28:51 +08:00
Gabriel	28e8effc52	[Refactor] Refactor vectorized scan node (#9968 )	2022-06-16 11:10:56 +08:00
Jerry Hu	4b9d500425	[improvement](profile) Add table name and predicates (#10093 )	2022-06-16 10:59:31 +08:00
Pxl	3b6451273b	[regression test]fix test_outfile to use user regression conf (#10123 )	2022-06-16 10:58:36 +08:00
Pxl	5805f8077f	[Feature] [Vectorized] Some pre-refactorings or interface additions for schema change part2 (#10003 )	2022-06-16 10:50:08 +08:00
yiguolei	90f229c038	[refactor] remove useless plugin test code (#10061 ) * remove plugin test code * remove plugin test Co-authored-by: yiguolei <yiguolei@gmail.com>	2022-06-16 10:43:28 +08:00
yinzhijian	bc431f2806	[typo] Fix typos in comments (#10142 )	2022-06-16 10:13:59 +08:00
smallhibiscus	9217223cc5	[doc] update sequence en and zh-CN doc. (#10164 ) * update sequence en and zh-CN doc.	2022-06-16 09:32:52 +08:00
wangyongfeng	dff1f09406	[doc](website)update Chinese home page text (#10168 ) update Chinese home page text	2022-06-16 08:04:21 +08:00
camby	ca88f258d9	[improvement] remove unused codes and docs for `SHOW USER` (#10107 ) * remove unused codes and docs for `SHOW USER`	2022-06-15 21:49:08 +08:00
chenlinzhong	4dfebb9852	[Feature] compaction quickly for small data import (#9804 ) * compaction quickly for small data import #9791 1. Merge small rowset versions as soon as possible to increase the import frequency of small-version data. 2. A small version is one whose row count is less than config::small_compaction_rowset_rows (default 1000).	2022-06-15 21:48:34 +08:00
wangyongfeng	c4871fb306	[doc](website)remove translate warning from Chinese docs (#10157 ) * modify home page text	2022-06-15 18:17:37 +08:00
jiafeng.zhang	4005b34a52	[doc] add tpc-h benchmark (#10150 ) [doc] add tpc-h benchmark	2022-06-15 16:43:10 +08:00
ccoffline	49f4437396	[fix] Fix disk used pct only consider the data that used by Doris (#9705 )	2022-06-15 16:28:56 +08:00
HappenLee	f1d0c231b9	[Opt][Vectorized] Opt vectorized the unique_table in storage vectorized (#10132 ) Co-authored-by: lihaopeng <lihaopeng@baidu.com>	2022-06-15 15:32:15 +08:00
wangyongfeng	606c32cc30	[doc](website)add translate warning in docs (#10152 ) * fix docs bugs with sidebar can not display and some style problems	2022-06-15 14:51:53 +08:00
Adonis Ling	983cdc7b0d	[feature-wip](array-type) Support loading data in vectorized format (#10065 )	2022-06-15 14:40:28 +08:00
wangyongfeng	96b54dd1d5	[doc](website)modify home page text and navbar (#10148 ) * fix docs bugs with sidebar can not display and some style problems	2022-06-15 12:21:40 +08:00
924060929	76a968d1dd	[Enhancement][Refactor](Nereids) generate pattern by operator and refactor Plan and NODE_TYPE generic type (#10019 ) This pr support 1. remove the generic type from operator, remove some NODE_TYPE from plan and expression 2. refactor Plan and NODE_TYPE generic type 3. support child class matching by TypePattern 4. analyze the code of operator and generate pattern makes it easy to create rules. e.g. ```java class LogicalJoin extends LogicalBinaryOperator; class PhysicalFilter extends PhysicalUnaryOperator; ``` will generate the code ```java interface GeneratedPatterns extends Patterns { default PatternDescriptor<LogicalBinaryPlan<LogicalJoin, Plan, Plan>, Plan> logicalJoin() { return new PatternDescriptor<LogicalBinaryPlan<LogicalJoin, Plan, Plan>, Plan>( new TypePattern(LogicalJoin.class, Pattern.FIXED, Pattern.FIXED), defaultPromise() ); } default <C1 extends Plan, C2 extends Plan> PatternDescriptor<LogicalBinaryPlan<LogicalJoin, C1, C2>, Plan> logicalJoin(PatternDescriptor<C1, Plan> child1, PatternDescriptor<C2, Plan> child2) { return new PatternDescriptor<LogicalBinaryPlan<LogicalJoin, C1, C2>, Plan>( new TypePattern(LogicalJoin.class, child1.pattern, child2.pattern), defaultPromise() ); } default PatternDescriptor<PhysicalUnaryPlan<PhysicalFilter, Plan>, Plan> physicalFilter() { return new PatternDescriptor<PhysicalUnaryPlan<PhysicalFilter, Plan>, Plan>( new TypePattern(PhysicalFilter.class, Pattern.FIXED), defaultPromise() ); } default <C1 extends Plan> PatternDescriptor<PhysicalUnaryPlan<PhysicalFilter, C1>, Plan> physicalFilter(PatternDescriptor<C1, Plan> child1) { return new PatternDescriptor<PhysicalUnaryPlan<PhysicalFilter, C1>, Plan>( new TypePattern(PhysicalFilter.class, child1.pattern), defaultPromise() ); } } ``` and then we don't have to add pattern for new operators. this function utilizing jsr269 to do something in compile time, and utilizing antlr4 to analyze the code of `Operator`, then we can generate corresponding pattern. 
Pattern generation steps: 1. The maven-compiler-plugin configured in pom.xml compiles fe-core in three passes. The first pass compiles `PatternDescribable.java` and `PatternDescribableProcessor.java`. 2. The second pass compiles `PatternDescribableProcessPoint.java` with the annotation processor `PatternDescribableProcessor` enabled; the processor receives the event and learns that the `PatternDescribableProcessPoint` class carries the `PatternDescribable` annotation. 3. `PatternDescribableProcessor` does not process `PatternDescribableProcessPoint` itself; instead it finds all Java files under the `operatorPath` specified in pom.xml and parses them into Java ASTs (abstract syntax trees). 4. `PatternDescribableProcessor` collects the ASTs, uses `PatternGeneratorAnalyzer` to analyze them and find the child classes of `PlanOperator`, then generates `GeneratedPatterns.java` from the ASTs. 5. The third pass compiles `GeneratedPatterns.java` and the remaining Java files.	2022-06-15 11:44:54 +08:00
yinzhijian	c9f33fa051	[test] add cast array regression test (#10069 ) * [test] add cast array regression test	2022-06-15 11:29:28 +08:00
pengxiangyu	c4d0fba713	Add storage policy for remote storage migration (#9997 )	2022-06-15 11:00:06 +08:00
zhangstar333	4c24586865	[Vectorized][UDF] support java-udaf (#9930 )	2022-06-15 10:53:44 +08:00
plat1ko	f4e2f78a1a	[fix] Fix the bug that data balance causes tablet loss (#9971 ) 1. Provide an FE conf to test reliability in the single-replica case when tablet scheduling is frequent. 2. Following #6063, apply nearly the same fix to the current code.	2022-06-15 09:52:56 +08:00
wangyongfeng	7ab64f9155	[doc][website]update home page content and add slack button (#10091 ) * fix docs bugs with sidebar can not display and some style problems	2022-06-15 09:31:40 +08:00
wudi	02b1908ce4	[modify default config]add be 2pc config enable default (#10110 ) Co-authored-by: wudi <>	2022-06-15 09:08:28 +08:00
jiafeng.zhang	34ea6ce850	[doc]Added be enable_stream_load_record configuration description (#10130 )	2022-06-15 08:14:47 +08:00
jakevin	be3aa2aa37	[enhancement](community): polish doc to reformat (#10137 )	2022-06-15 08:14:13 +08:00
Xinyi Zou	85362a907e	[fix](mem tracker) Fix some memory leaks, inaccurate statistics, core dump, deadlock bugs (#10072 ) 1. Fix the memory leak. When the load task is canceled, the `IndexChannel` and `NodeChannel` mem trackers cannot be destructed in time. 2. Fix Load task being frequently canceled by oom and inaccurate `LoadChannel` mem tracker limit, and rewrite the variable name of `mem limit` in `LoadChannel`. 3. Fix core dump, when logout task mem tracker, phmap erase fails, resulting in repeated logout of the same tracker. 4. Fix the deadlock, when add_child_tracker mem limit exceeds, calling log_usage causes `_child_trackers_lock` deadlock. 5. Fix frequent log printing when thread mem tracker limit exceeds, which will affect readability and performance. 6. Optimize some details of mem tracker display.	2022-06-14 21:38:37 +08:00
gtchaos	f7b5f36da4	[feature] Support read hive external table and outfile into HDFS that authenticated by kerberos (#9579 ) At present, Doris can only access a Hadoop cluster with Kerberos authentication enabled through the broker; Doris BE itself does not support access to Kerberos-authenticated HDFS files. This PR aims to solve that problem. When creating a Hive external table, users only need to specify the following properties to access HDFS data with Kerberos authentication enabled: ```sql CREATE EXTERNAL TABLE t_hive ( k1 int NOT NULL COMMENT "", k2 char(10) NOT NULL COMMENT "", k3 datetime NOT NULL COMMENT "", k5 varchar(20) NOT NULL COMMENT "", k6 double NOT NULL COMMENT "" ) ENGINE=HIVE COMMENT "HIVE" PROPERTIES ( 'hive.metastore.uris' = 'thrift://192.168.0.1:9083', 'database' = 'hive_db', 'table' = 'hive_table', 'dfs.nameservices'='hacluster', 'dfs.ha.namenodes.hacluster'='n1,n2', 'dfs.namenode.rpc-address.hacluster.n1'='192.168.0.1:8020', 'dfs.namenode.rpc-address.hacluster.n2'='192.168.0.2:8020', 'dfs.client.failover.proxy.provider.hacluster'='org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider', 'dfs.namenode.kerberos.principal'='hadoop/_HOST@REALM.COM', 'hadoop.security.authentication'='kerberos', 'hadoop.kerberos.principal'='doris_test@REALM.COM', 'hadoop.kerberos.keytab'='/path/to/doris_test.keytab' ); ``` To `select into outfile` on an HDFS cluster with Kerberos authentication enabled, refer to the following SQL statement: ```sql select * from test into outfile "hdfs://tmp/outfile1" format as csv properties ( 'fs.defaultFS'='hdfs://hacluster/', 'dfs.nameservices'='hacluster', 'dfs.ha.namenodes.hacluster'='n1,n2', 'dfs.namenode.rpc-address.hacluster.n1'='192.168.0.1:8020', 'dfs.namenode.rpc-address.hacluster.n2'='192.168.0.2:8020', 'dfs.client.failover.proxy.provider.hacluster'='org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider', 'dfs.namenode.kerberos.principal'='hadoop/_HOST@REALM.COM', 
'hadoop.security.authentication'='kerberos', 'hadoop.kerberos.principal'='doris_test@REALM.COM', 'hadoop.kerberos.keytab'='/path/to/doris_test.keytab' ); ```	2022-06-14 20:07:03 +08:00
Kikyou1997	25b9d6eba2	[feature](nereids) Plan Translator (#9993 ) Issue Number: close #9621 Add following physical operator: PhysicalAgg PhysicalSort PhysicalHashJoin Add basic logic of plan translator 1. add new agg phase enum for nereids 2. remove the Analyzer from PlanContext.java 3. implement PlanTranslator::visitPhysicalFilter	2022-06-14 19:39:55 +08:00
minghong	15e1bb448f	[test] tpch q3 rewrite, change join order, make lineitem on left side (#10055 ) rewrite the sql in tpch test tools	2022-06-14 17:16:33 +08:00
HappenLee	c2af14fc61	[Bug] return type is not always nullable of function (#10116 ) Co-authored-by: lihaopeng <lihaopeng@baidu.com>	2022-06-14 16:32:35 +08:00
shee	2fadaddda0	[Enhancement] (Nereids) scalar expression rewrite framework (#9942 ) Issue Number: close #9633 Scalar expressions are rewritten by traversing them with the visitor pattern. The abstract class ExpressionVisitor contains all the predicates to rewrite. A rewrite-rule interface, ExpressionRewriteRule, is provided; the AbstractExpressionRewriteRule class implements that interface and extends ExpressionVisitor. To implement an expression rewrite rule, extend AbstractExpressionRewriteRule and override the methods it provides for traversing predicates. Two rules serve as references: NormalizeExpressionRule and SimplifyNotExprRule	2022-06-14 16:20:48 +08:00
HappenLee	14bc971159	[Bug] Fix bug push value predicate of unique table when have sequence column (#10060 ) Co-authored-by: lihaopeng <lihaopeng@baidu.com>	2022-06-14 15:35:31 +08:00
morrySnow	59b3023adf	[fix](regression)bucket shuffle join with collocate table should use order_qt (#10082 )	2022-06-14 15:34:39 +08:00
Mingyu Chen	81e0a348a7	[fix] fix bug that show proc "/cluster_balance/history_tablets" return malformat error (#10073 )	2022-06-14 15:34:16 +08:00
Pxl	5d624dfe6c	[bugfix]fix segmentation fault at unalign address cast to int128 (#10094 )	2022-06-14 15:32:58 +08:00
camby	eb4d0f508a	[doc] Add docs for SHOW TABLETS (#10105 ) * add docs for SHOW TABLETS * update * add more examples for SHOW TABLETS Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>	2022-06-14 15:29:46 +08:00
yinzhijian	2a96d7ffde	[spell] Fix spell error in row_batch.h (#10109 )	2022-06-14 15:28:29 +08:00

4919 Commits