doris

Author	SHA1	Message	Date
Mingyu Chen	67b842ce04	[License] Organize and modify the license of the code (#4371 ) 1. Disable the MySQL client and LZO library by default when building the Doris. MySQL client library is used for MySQL external table feature. This feature will be replaced by the new ODBC external table soon. LZO library is used to compress/decompress data of some old data format of Doris, which is no longer used anymore. 2. Add missing license to some files. 3. For all non-Apache-License code, all are explained in NOTICE file and the corresponding license is declared. 4. Remove the js source code from webroot, it will be downloaded as thirdparty	2020-08-24 21:51:55 +08:00
Mingyu Chen	976820ba20	[SegmentV2] Change the default storage format to SegmentV2 (#4387 ) Since the Segment V2 has been released for a long time, we should make it as default storage format for newly created table. This CL mainly changes: 1. For all newly created tables, their default storage format is Segment V2. 2. For all already exist tablets, their storage format remain unchanged. 3. Fix bugs described in Fix #4384 and Fix #4385	2020-08-24 21:51:17 +08:00
Zhengguo Yang	af2b749a87	make some readFields Deprecated (#4399 ) We have changed most of our serialization methods to json. In order to be compatible with previous data, these classes still retain the readFields method. Some prs that involve modifying metadata often modify the readFields method. To avoid this, we should Mark these methods as Deprecated #4398	2020-08-21 22:58:08 +08:00
Zhengguo Yang	d61c10b761	[Delete] Support batch delete [part 1] (#4310 ) * Implements the grammar of the batch delete #4051 * Process create, alter table when table has delete sign column * Support the syntax for enabling the delete column * Automatically filtered deleted data in the select statement. * Automatically add delete sign when create rollup table TODO: * Optimize the reading and compaction logic on the be side, so that the data marked as deleted will be completely deleted during base compaction	2020-08-21 22:57:16 +08:00
EmmyMiao87	76a04de6c4	[MV] Input correct keys type of index meta when `Add Partition` (#4408 ) Define Expr will not be serialized in Column `toThrift`. 1. When adding partition, different indexes should use their own keys type instead of using the keys type of base table uniformly. 2. There are two kinds of define expr in Column: one is analyzed, and the other is not analyzed. Currently, analyzed define expr is only used when creating materialized views, so the define expr in RollupJob must be analyzed. In other cases, such as define expr in `MaterializedIndexMeta`, it may not be analyzed after being relayed. When executing the load, the analyzed define expr (such as to_bitmap(cast(k1, varchar))) will not be analyzed again. Only a cast function will be added to the inner layer (such as to_bitmap(cast(cast(k1 ,int), varchar))) which is analyzed too. The define expr that has not been analyzed (such as cast(k1, varchar)) will be analyzed when executing the load.	2020-08-21 10:42:41 +08:00
EmmyMiao87	09b1965499	[MV] Fix errors when altering a materialized view based on a dup table (#4375 ) 1. Input the correct keys type when mv is updated. The keys type of the mv should be used in the schema change job rather than the keys type of the base table. Otherwise, the BE will core dump and throw the exception "Create replicas failed". 2. Forbid adding a non-key column directly on an agg mv when the base table uses the duplicate model. If a dup table has an agg mv, the user cannot add a non-key column to the mv. The non-key column can only be added to the dup index.	2020-08-21 10:36:03 +08:00
EmmyMiao87	6bb111b42c	Modify mv rewrite rule on 'Count distinct' (#4382 ) The rewrite rule named `CountToSum` does not distinguish between `Count` and `Count distinct`, which causes `Count distinct` to be incorrectly rewritten as `Sum`. So this commit modifies the matching rule. When the function is `Count distinct`, the rewrite rule will not take effect. Fixed #4381	2020-08-20 09:30:35 +08:00
xinghuayu007	bfb39a2826	[SQL][Function] Add replace() function (#4347 ) replace is an user defined function, which is to replace all old substrings with a new substring in a string, as follow: mysql> select replace("http://www.baidu.com:9090", "9090", ""); +------------------------------------------------------+ \| replace('http://www.baidu.com:9090', '9090', '') \| +------------------------------------------------------+ \| http://www.baidu.com: \| +------------------------------------------------------+	2020-08-20 09:28:53 +08:00
Mingyu Chen	38a2a7a269	[Bug] Fix bug that modification of global variable can not be persisted. (#4324 ) When setting global variables, such as `set global default_rowset_type=beta`, the operation is not correctly persisted. This CL changes the fe meta version to 90. --------------- The main reason for this problem is that for the modification of a global variable, we directly use Java's reflection mechanism to modify static member variables in the `GlobalVariable` class. But in the persistence method of the `set` operation, we only persist the value stored in the `globalSessionVariable` variable, and this variable does not contain Global Variable. So I added a new OperationType: `OP_GLOBAL_VARIABLE_V2`, and added a `GlobalVarPersistInfo` class to record all changes.	2020-08-18 16:54:35 +08:00
Mingyu Chen	3359467b9a	[Tablet][Recovery] Support using empty tablet to repair the damaged or missing tablet (#4255 ) In some very special circumstances, such as code bugs, or human misoperation, etc., all replicas of some tablets may be lost. In this case, the data has been substantially lost. However, in some scenarios, the business still hopes to ensure that the query will not report errors even if there is data loss, and reduce the perception of the user layer. At this point, we can use the blank Tablet to fill the missing replica to ensure that the query can be executed normally. Add a new FE config `recover_with_empty_tablet`. default is false. true means to use empty tablet to fill the missing one. Also fix a bug in Fix #4274	2020-08-18 06:13:53 +00:00
caoyang10	53d00d92cc	[Doris On ES][Bug-Fix] ES queries always route at same 3 BE nodes (#4351 ) (#4352 ) resolve the problem of querying ES table always route at same 3 BE nodes because of random strategy	2020-08-18 10:36:18 +08:00
xueyan.li	e69496feaf	[MysqlCompatibility] Support collate field option in expr (#4365 ) Support SQL like: ``` select collation_name, character_set_name, is_default collate utf8_general_ci = 'Yes' as is_default from information_schema.collations ```	2020-08-17 22:52:57 +08:00
EmmyMiao87	38921d4343	[MV]Forbidden aggregated partition key column on mv (#4343 ) The partition column of table also must be the key in materialized view. If not, when user wants to add partition of table, the be will core. The materialized view could not create partition correctly when partition column has been aggregated.	2020-08-15 11:38:50 +08:00
HangyuanLiu	4fa35c9f39	[Bug][RoutineLoad] Fix routine load timezone property invalid (#4339 )	2020-08-13 23:40:54 +08:00
xueyan.li	ac9c7741e9	[SQL]Support datagrip show database information (#4332 ) Support show schema()	2020-08-13 23:39:05 +08:00
wangbo	790779fb6f	[SparkLoad] Remove unnecessary conversion from dataframe to rdd (#4304 )	2020-08-13 23:37:38 +08:00
gengjun-git	48d89e06c3	[Bug fix]fix query id assign bug (#4291 )	2020-08-12 22:42:36 +08:00
EmmyMiao87	98fe80dd5a	[MV]Forbidden no grouping mv on aggregation table (#4317 ) If a user wants to create a no grouping mv on an aggregation table, Doris will throw an exception. The correct approach is to explicitly declare the grouping columns. For example: Agg table: k1, k2, sum(k3) Create materialized view stmt: select k1, k2 from agg_table group by k1, k2. Fixed #4316	2020-08-12 20:57:25 +08:00
HappenLee	3354645c77	[BugFix][ColocateJoin] Fix bug of issue 4305 (#4306 ) This PR uses fragmentIdToSeqToAddressMap to replace seqtoAddresss, because the SeqBucket to Address mapping should be bound to the fragment	2020-08-12 12:11:47 +08:00
hexiang55	48f3ba35ec	[Doris On ES][Bug-Fix] Resolve NullPointerException when multi fields with `text` type (#4300 )	2020-08-11 12:09:17 +08:00
HangyuanLiu	493c88c1d6	[BUG] Fix NPE when distinct in predicate push down (#4294 ) Describe the bug: Predicate push down where sub query has distinct may throw NPE. To Reproduce: Steps to reproduce the behavior: 1. create table like ``` +--------------+--------------+------+-------+---------+---------+ \| Field \| Type \| Null \| Key \| Default \| Extra \| +--------------+--------------+------+-------+---------+---------+ \| event_day \| DATETIME \| No \| true \| NULL \| \| \| title \| VARCHAR(600) \| No \| true \| NULL \| \| \| report_value \| VARCHAR(50) \| No \| false \| NULL \| REPLACE \| +--------------+--------------+------+-------+---------+---------+ ``` 2. exec query ``` SELECT * FROM ( SELECT DISTINCT event_day, title FROM click_show_window ) a WHERE a.title IS NOT NULL ``` 3. See error ``` ERROR 1064 (HY000): errCode = 2, detailMessage = Unexpected exception: null ``` This is because DISTINCT generates grouping exprs in agginfo, but this clause does not have a group by clause	2020-08-11 11:07:51 +08:00
Lijia Liu	a480dec7a4	Do not wrap NULL type tuple (#4245 ) Do not wrap NULL type expr to IF(TupleIsNull(tids), NULL, expr)	2020-08-11 09:38:42 +08:00
HangyuanLiu	6abb374d0c	Fix duplicate table export fail (#4293 )	2020-08-11 09:37:43 +08:00
HaiBo Li	4ad943e45d	[Feature][Cache] Cache proxy and coordinator #2581 (#4248 ) * [Feature][Cache] Cache proxy and coordinator #2581 1. Cache's abstract proxy class and BE's Cache implementation 2. Cache coordinator implemented by consistent hashing * Adjusted the formatting code, naming and variables according to the comments	2020-08-10 16:40:25 +08:00
xinghuayu007	411ced5715	Secure singleton mode (#4257 ) Co-authored-by: wangxixu <wangxixu@xiaomi.com>	2020-08-10 11:26:56 +08:00
kangkaisen	f516172f23	Fix window function with limit zero bug 2 (#4235 )	2020-08-10 10:29:05 +08:00
HappenLee	47fff6841b	[Bug][ColocateJoin] Fix bug of #4287 and #4285 of Colocatejoin (#4289 ) 1.Table join itself should have same single partition to valid colocate join. 2.Check eqjoinConjuncts column order to valid colocate join.	2020-08-09 20:48:36 +08:00
gengjun-git	a54b0eab0c	[Bug]fix cancel query bug (#4275 ) ConnectContext.kill() use executor to cancel query, but executor has never been set.	2020-08-08 20:29:32 +08:00
EmmyMiao87	d5909ae503	[MaterializedView]Change the type of slot when mv is selected (#4272 ) The column types of the materialized view and the base table are different. When mv is selected in query plan, the type of slot should be changed by mv column type. For example: base table: k1 int, k2 int mv table: k1 int, k2 bigint sum The k2 type of slot ref should be changed from int to bigint. Closed. #4271	2020-08-08 20:29:07 +08:00
caiconghui	eefad13107	[Feature] Support InPredicate in delete statement (#4006 ) This PR is to add inPredicate support to delete statement, and add max_allowed_in_element_num_of_delete variable to limit element num of InPredicate in delete statement.	2020-08-06 23:19:40 +08:00
EmmyMiao87	4c05eddc10	[SQL] Support approx_count_distinct rewrite to hll union in mv rewriter (#4239 ) The new function approx_count_distinct is the alias of function ndv. So Doris also need to rewrite approx_count_distinct to hll function when it is possible to match the hll materialized view.	2020-08-06 23:16:15 +08:00
Mingyu Chen	237c0807a4	[RoutineLoad] Support modify routine load job (#4158 ) Support ALTER ROUTINE LOAD JOB stmt, for example: ``` alter routine load db1.label1 properties ( "desired_concurrent_number"="3", "max_batch_interval" = "5", "max_batch_rows" = "300000", "max_batch_size" = "209715200", "strict_mode" = "false", "timezone" = "+08:00" ) ``` Details can be found in `alter-routine-load.md`	2020-08-06 23:11:02 +08:00
EmmyMiao87	c98b411500	[Bug] Revert part of #4199 to avoid BE crash(#4269 ) Revert “Change type of sum, min, max function column in mv” This pr is revert pr #4199 . The daily test is cored when the type of mv column has been changed. So I revert the pr. The daily core will be fixed in the future. After that, the pr#4199 will be enable. Change-Id: Ie04fcfacfcd38480121addc5e454093d4ae75181	2020-08-06 19:06:00 +08:00
EmmyMiao87	173bc09833	[Alter]Analyze define expr before replay Rollup job (#4236 ) The define expr should be analyzed after replay RollupJob. The slot desc of the define expr is transformed to thrift and sent to the backend.	2020-08-05 21:47:18 +08:00
gengjun-git	a4f3d43e15	fix version check bug (#4244 ) Co-authored-by: gengjun <gengjun@dorisdb.com>	2020-08-05 21:45:36 +08:00
Zhengguo Yang	1b341601fe	Generate Java files using Maven (#4133 ) Generate the generated-java files using Maven instead of build.sh	2020-08-05 15:20:39 +08:00
HappenLee	5caa347e86	[ColocateJoin] ColocateJoin support table join itself (#4230 ) (#4231 ) if left table and right table is same table, they are naturally colocate relationship.	2020-08-02 22:05:45 +08:00
HangyuanLiu	85e0a68783	[SQL][Bug] Fix multi predicate in correlation subquery analyze fail (#4211 )	2020-08-02 22:05:23 +08:00
WangCong	d64d65322b	[Bug][DynamicPartition]Fix bug that Modify a dynamic partition property in a non-dynamic partition table will throw a Exception (#4127 )	2020-08-02 22:03:57 +08:00
Lijia Liu	bdaef84a10	[FE] [HttpServer] Config netty param in HttpServer (#4225 ) Now, if the length of URL is longer than 4096 bytes, netty will refuse. The case can be reproduced by constructing a very long URL(longer than 4096bytes) Add 2 http server params: 1. http_max_line_length 2. http_max_header_size	2020-08-01 17:59:01 +08:00
HangyuanLiu	116d7ffa3c	[SQL][Function] Add approx_count_distinct() function (#4221 ) Add approx_count_distinct() function to replace the ndv() function	2020-08-01 17:54:19 +08:00
kangkaisen	c32ddce0b5	[SQL][BUG]Fix window function with limit zero bug (#4207 )	2020-08-01 17:43:47 +08:00
EmmyMiao87	25f3420855	[MaterializedView] Change type of sum, min, max function column in mv (#4199 ) If the agg function is sum, the type of mv column will be bigint. The only exception is that if the base column is largeint, the type of mv column will be largeint. If the agg function is min or max, the type of mv column will be same as the type of base column. For example, the type of mv column is smallint when the agg function is min.	2020-08-01 17:43:23 +08:00
HappenLee	f412f99511	[Bug][ColocateJoin] Make a wrong choice of colocate join (#4216 ) If table1 and table2 are colocated using columns k1, k2, a query must contain all of k1, k2 to apply the colocation algorithm. A query like select * from table1 inner join table2 where t1.k1 = t2.k1 cannot use colocation. We add the rule to avoid the problem.	2020-07-31 15:18:00 +08:00
HaiBo Li	1ebd156b99	[Feature]Add fetch/update/clear proto of fe&be for cache (#4190 )	2020-07-31 13:23:24 +08:00
HaiBo Li	b4cb8fb9b2	[Feature][Cache]Add interface, metric, variable and config for query cache (#4159 )	2020-07-30 11:24:20 +08:00
worker24h	fdcc223ad2	[Bug][Json] Refactor the json load logic to fix some bug 1. Add `json_root` for nest json data. 2. Remove `_jmap` to make the logic reasonable.	2020-07-30 10:36:34 +08:00
caiconghui	237271c764	[Bug] Fix fe meta version problem, make drop meta check code easy to read and add doc content for drop meta check (#4205 ) This PR is mainly do three things: 1. Fix fe meta version bug introduced by #4029 , when fix conflict with #4086 2. Make drop check code easy to read 3. Add doc content for drop meta check	2020-07-30 09:54:20 +08:00
Mingyu Chen	8a169981cf	[Bug][TabletRepair] Fix bug that too many replicas generated when decommission BE (#4148 ) Try to select the BE with an existing replicas as the destination BE for REPLICA_RELOCATING clone task. Fix #4147 Also add 2 new FE configs `max_clone_task_timeout_sec` and `min_clone_task_timeout_sec`	2020-07-30 09:46:33 +08:00
HangyuanLiu	abeb25d2a9	Fix large int literal (#4168 )	2020-07-30 00:53:50 +08:00


5755 Commits