doris

Author	SHA1	Message	Date
wangbo	efef067f2d	[Bug] Fix mem_pool npe (#4045 ) Fix mem_pool NPE in column reader. Add a safe allocation method.	2020-07-09 21:50:22 +08:00
yangzhg	ebaa0c7137	[Bug][SQL]Fix predicate pushdown may incorrect when groupby with grouping sets (#4041 ) Fixes #4040 Fix predicate pushdown may incorrect when groupby with grouping sets	2020-07-09 21:49:37 +08:00
xy720	d2ab38a5e0	[Feature] Batch update partition's property in one command (#3981 ) Support following command. ``` alter table tbl_name modify partition (p1, p2, p3) set ("replication_num" = "3"); ```	2020-07-09 21:48:43 +08:00
HappenLee	fafc7e406e	[Spill]Fix the problem of mem exec, when analytic eval node need to spill to disk with a low mem limit (#3991 ) [Bug] Fix the problem of mem exec, when analytic eval node need to spill to disk with a low mem limit. And clear_reservations of Analytic node reservation of block manager. [Running Profile] Add Spilled flag in Running Profile, when Analytic eval node and sort node spill to Disk.	2020-07-09 09:30:22 +08:00
caiconghui	5a27981e49	[Config] Add thrift_client_retry_interval_ms config in be for thrift client to avoid avalanche disaster in fe thrift server (#4022 ) This PR is mainly to add `thrift_client_retry_interval_ms` config in be for thrift client to avoid avalanche disaster in fe thrift server and fix some typo and some rpc setting problems at the same time.	2020-07-08 21:07:00 +08:00
wutiangan	fb0ecb70fd	[SQL]fix inline view join mysql choose shuffle join bug (#4048 ) Fixes #4047. #3886 is related to this case. The SQL: `bigtable t1 join mysqltable t2 join mysqltable t3 on t1.k1 = t3.k1` 1. After reorder: t1, t2, t3. 2. Join t1 with t2: there are no join conditions, so Doris chooses a cross join. 3. Join (t1 join t2) with t3: in the old code, t2 is a MySQL table, so its cardinality is zero. The cardinality of the cross join of t1 and t2 is t1.cardinality multiplied by t2.cardinality, so it is also zero. t3 is a MySQL table too, so its cardinality is zero. When both sides of a join have zero cardinality, the shuffle join is chosen. This change sets the cardinality of a MySQL table to 1 instead of 0, so the cross join's cardinality is no longer zero.	2020-07-08 20:56:24 +08:00
lichaoyong	413d6d2f22	[Bug] Fix core when modifying char to varchar and loading boolean with replace aggregation (#4042 ) 1. Doris supports modifying char to varchar. There is a bug in the use of a two-level pointer when converting the date. 2. Boolean can be used as a metric value with the REPLACE and REPLACE_IF_NOT_NULL aggregation functions. The aggregation function should be added to the aggregation map.	2020-07-08 11:12:42 +08:00
Yingchun Lai	6d4fd25815	[shell] Fix BE unit test directory not match bug (#4028 )	2020-07-08 10:01:42 +08:00
Mingyu Chen	7715a84d4d	[Config] Enable some features by default (#4031 ) It's time to enable some features by default. 1. Enable FE plugins by setting `plugin_enable=true` 2. Enable dynamic partition by setting `dynamic_partition_enable=true` 3. Enable nio mysql server by setting `mysql_service_nio_enabled=true` Also modify the installation doc and add a download link for the MySQL client.	2020-07-08 09:59:10 +08:00
caiconghui	b7051d0971	[Config]Make it easier for users to find configuration items needed (#3957 ) This PR is to make config items ordered by key and support like predicate for admin show config stmt	2020-07-07 23:12:21 +08:00
Yingchun Lai	ab8851f7aa	[webserver] Make BE webserver handle static files (#4021 ) Make the BE webserver handle static files, e.g. css, js, ico, so that the BE website can be made to look better.	2020-07-07 23:08:29 +08:00
Lijia Liu	1aa148da7f	[Bug]Fix mini load NPE (#4026 ) for #4025	2020-07-07 23:08:08 +08:00
Mingyu Chen	c3d9feed75	[Load][Json] Refactor json load logic to make it more reasonable (#4020 ) This CL mainly changes: 1. Reorganized the code logic to limit the supported json format to two, and the import behavior is more consistent. 2. Modified the statistical behavior of the number of error rows when loading in json format, so that the error rows can be counted correctly. 3. See `load-json-format.md` to get details of loading json format.	2020-07-07 23:07:28 +08:00
yangzhg	5c42514a8f	[Bug][SQL]Fix except node child not order correctly (#4003 ) Fixes #3995 ## Why does it happen When SetOperations finds that the previous node needs an Aggregate, the AggregationNode is added at the wrong time. The AggregationNode should be added first, before the other children. ## Why doesn't intersect and union have this problem intersect and union are commutative, so it doesn't matter if the order is wrong ## Why this problem has not been tested before The previous test cases did not cover the case where the previous node needs an AggregationNode	2020-07-07 23:06:36 +08:00
Yunfeng,Wu	1cc9e1606f	[Doris On ES] Add UT test for all search phase (#4035 ) I forgot to push some unit tests in PR #4012. Also remove the `_cluster/state` resource because DOE does not rely on the full ES cluster state meta.	2020-07-07 23:05:02 +08:00
lichaoyong	c9a7c373a7	[Bug] Return actual json for ConnectionAction (#4016 )	2020-07-07 20:14:55 +08:00
funyeah	d396408861	Correct typos (#4024 )	2020-07-07 13:33:46 +08:00
Yunfeng,Wu	3ba38e3381	[Doris On ES][Refactor] refactor and enhancement ES sync meta logic (#4012 ) After PR #3454 was merged, we should refactor and reorganize some logic for long-term sustainable iteration of Doris On ES. To facilitate code review, I will divide this work into multiple PRs (there is other WIP work that I also need to think through carefully). This PR includes: 1. introduce SearchContext for all the state we need 2. divide the meta-sync logic into three phases 3. modify some processing logic 4. introduce version detection logic for future use	2020-07-07 09:04:05 +08:00
WingC	913b2caac4	[Dynamic Partition]Support set replication number (#3965 ) This CL mainly supports setting the replication_num property in dynamic partition tables. If dynamic_partition.replication_num is not set, the value is the same as the table's default replication_num.	2020-07-05 16:28:38 +08:00
Mingyu Chen	2e111c05ac	[Bug] Fix bug that BE crash when doing alter table task (#4015 ) Need to check delete condition first	2020-07-05 16:28:03 +08:00
WingC	fa338fb6d9	[Bug][Memory Leak]Fix bug TransactionState is not clear from idToFinalStatusTransactionState (#4013 ) This CL includes: 1. Fix the memory leak caused by transactionState not being removed. 2. Extract the clear logic into a method so that it is not forgotten.	2020-07-05 16:27:41 +08:00
Mingyu Chen	ba120292ab	[ShowIndex] Make Show Index stmt act same as MySQL behavior (#4010 ) `SHOW INDEX FROM db2.tbl1 FROM db1;` will be same as `SHOW INDEX FROM db1.tbl1;`	2020-07-05 16:26:54 +08:00
WingC	1fc82cd6e4	[Code Cleanup]Use ThreadPoolManager to manage some native thread (#3997 ) FE now uses ThreadPoolManager to manage and monitor all threads, but some threads are still not managed. FE also uses the `Timer` class for some scheduled tasks, but `Timer` has some problems and is out of date; it should be replaced by ScheduledThreadPool.	2020-07-05 16:26:22 +08:00
yangzhg	6699be2ac8	[Bug] Keep order of read from segment consist with the write order (#3993 ) Fixes #3989 Add segment id to the comparator when merging the rows read from UNIQUE key table.	2020-07-05 16:25:28 +08:00
WingC	7351f7c237	[Config]Allow user to config different thrift server model (#3986 ) Doris only supports the TThreadPoolServer model in the thrift server, but this server model is not effective in some high-concurrency scenarios, so this PR introduces a new config that lets users choose a server model suited to their scenario. Add new FE config: `thrift_server_type`	2020-07-05 16:24:29 +08:00
wutiangan	f521507a46	[SQL] Explain verbose stmt to print tupleDesc/slotDesc information (#3970 )	2020-07-05 16:22:43 +08:00
Mingyu Chen	725ebafd99	[Bug] Cancel the query if OlapScanner prepare failed (#4002 )	2020-07-03 21:33:07 +08:00
Mingyu Chen	bbb7782702	[Bug] Fix bug that linked schema change for alpha rowset will cause BE to crash (#3983 ) Co-authored-by: chenmingyu <chenmingyu@baidu.com>	2020-07-03 21:19:31 +08:00
xy720	64f7a1fd1e	[Log] Add log for loading image (#3996 ) When fe load image failed, more logs should be printed to help users analyze errors.	2020-07-03 21:19:08 +08:00
HuangWei	9bb7e5d208	Fix some code & comments (#3999 ) TPlanExecParams::volume_id is never used, so delete the print_volume_ids() function. Fix log, and log if PlanFragmentExecutor::open() returns error. Fix some comments	2020-07-03 21:18:47 +08:00
Yingchun Lai	ab325f5bfd	[shell] Fix BUILD_TYPE not used bug (#3990 ) We can build unit tests by specifying BUILD_TYPE as DEBUG/RELEASE/LSAN/ASAN, and the outputs of each mode are placed in different directories, which saves time when rebuilding in the same mode.	2020-07-03 10:00:05 +08:00
Yingchun Lai	a16236f22f	[refactor] Remove useless return value of class RowsetGraph (#3977 )	2020-07-03 09:59:51 +08:00
Yunfeng,Wu	1e813df3fd	[Doris On ES] [Bug-Fix][Refactor] Fix potential null pointer exception and refactor function process logic (#3985 ) fix: https://github.com/apache/incubator-doris/issues/3984 1. add `conjunct.size` checking and `slot_desc nullptr` checking logic 2. For historical reasons, the function predicates are added one by one; I just refactored the processing to make the logic for function predicate processing clearer	2020-07-02 22:32:16 +08:00
yangzhg	5ade21b55d	[Load] Support load true or false as boolean value (#3898 ) Fixes #3831 After this PR insert into: `1/"1" -> 1, 0/"0"->0, true/"true"->1, false/"false" -> 0, "10"->null, "xxxx" -> null` load: `1/true -> 1, 0/false -> 0` other -> null	2020-07-02 13:58:24 +08:00
yangzhg	707d03cbde	[SQL] Remove order by for subquery in set operation clause (#3806 ) Implements #3803 Support disabling some meaningless order by clauses. The default limit of 65535 will not be disabled because it is added at the plan node; after we support spill to disk we can move this limit to analyze.	2020-07-02 13:56:53 +08:00
yangzhg	d3d835844f	[Performance] Improve performance of unique table read (#3974 ) Implements #3971 The test table is as follows: ``` mysql> desc test; +------------+---------+------+-------+---------+---------+ \| Field \| Type \| Null \| Key \| Default \| Extra \| +------------+---------+------+-------+---------+---------+ \| rid \| BIGINT \| No \| true \| 0 \| \| \| qid \| BIGINT \| No \| true \| 0 \| \| \| qidDeleted \| TINYINT \| No \| false \| 0 \| REPLACE \| \| type \| TINYINT \| No \| false \| 0 \| REPLACE \| \| uid \| BIGINT \| No \| false \| 0 \| REPLACE \| \| toUid \| BIGINT \| No \| false \| 0 \| REPLACE \| \| status \| INT \| No \| false \| 0 \| REPLACE \| \| createTime \| INT \| No \| false \| 0 \| REPLACE \| \| source \| INT \| No \| false \| 0 \| REPLACE \| \| misFlag \| INT \| No \| false \| 0 \| REPLACE \| \| anonymous \| TINYINT \| No \| false \| 0 \| REPLACE \| \| uv \| TINYINT \| No \| false \| 1 \| REPLACE \| +------------+---------+------+-------+---------+---------+ 12 rows in set (0.00 sec) mysql> select count() from test; +----------+ \| count() \| +----------+ \| 1093760 \| +----------+ 1 row in set (1.00 sec) ``` There are 29 versions at present ![image](https://user-images.githubusercontent.com/9098473/85992244-2aa26c80-ba27-11ea-918a-04701a58dbdf.png) I ran the query `select sum(uv) from test` 10 times; the average ScanTime was reduced from `9s277ms` to `8s206ms`	2020-07-02 13:56:08 +08:00
Mingyu Chen	6a7583bb08	[Doc] Add doc for setting dev env of FE in Eclipse (#3952 ) Also fix some doc bugs	2020-07-02 13:54:36 +08:00
caiconghui	9785e103ea	[Bug] Fix bug that delete stmt with filter condition delete all data from table on segment v2 (#3943 ) When we get different columns' row ranges by column_delete_conditions, we should use the union operation instead of the intersection operation to get the final row ranges. The root cause is that we lost the relationship between the two delete conditions in the same delete stmt. Base data: ``` k1, k2 1, 2 1, 3 case 1: delete from tbl where k1=1 and k2=2; case 2: delete from tbl where k1=1; delete from tbl where k2=2; ``` We treated the above 2 cases as the same, which is incorrect. So we need to process every rowset of delete conditions separately.	2020-07-02 11:07:23 +08:00
Yunfeng,Wu	2362500e77	[Doris On ES] Support create table with wildcard or aliase index (#3968 )	2020-07-01 22:08:06 +08:00
Dayue Gao	fdcbea480d	[Enhancement] DO NOT increase report version for publish task (#3894 ) Fixes #3893 In a cluster with frequent load activities, FE will ignore most tablet reports from BE because currently it only handles reports whose version >= BE's latest report version (which is increased each time a transaction is published). This can be observed in FE's log, which contains many entries like `out of date report version 15919277405765 from backend[177969252]. current report version[15919277405766]`. However, many system functionalities rely on TabletReport processing to work properly. For example 1. bad or version-missing replicas are detected and repaired during TabletReport 2. storage medium migration decisions and actions are made based on TabletReport 3. BE's old transactions are cleared/republished during TabletReport In fact, it is not necessary to update the report version after the publish task; this is a legacy problem. In the reporting logic of the current version, we no longer decrease the version information of the replica in the FE metadata according to the report. So even if we receive a stale version of the report, it does not matter. This CL contains mainly two changes 1. do not increase report version for publish task 2. populate `tabletWithoutPartitionId` out of read lock of TabletInvertedIndex	2020-07-01 09:23:40 +08:00
Mingyu Chen	1bfb105ec1	[Bug] Fix bug that routine load task throw exception when calling afterVisible() (#3979 )	2020-07-01 09:22:33 +08:00
wangbo	210ee9664f	[SparkLoad]add user doc for build global dict (#3938 ) describe global dict and how to use it in spark load	2020-06-30 19:12:35 +08:00
Dayue Gao	f9a52f5db4	[Bug] Insert may leak DeltaWriter when re-analyzed (#3973 )	2020-06-30 11:09:53 +08:00
Yunfeng,Wu	3ac459f0ca	[UT] resolve metric ut fails (#3975 )	2020-06-29 21:54:41 +08:00
caiconghui	48398232e7	[Bug] Fix bug that default_rowset_type have a session variable (#3953 ) This PR is mainly for fixing bug that `default_rowset_type` have a session variable	2020-06-29 19:16:42 +08:00
caiconghui	48d947edf4	Support rpc_timeout property in stream load request to cancel request in fe in time when stream load request is timeout (#3948 ) This PR enables FE to cancel a stream load request promptly when the request times out, making stream load more robust.	2020-06-29 19:16:16 +08:00
Mingyu Chen	2c96d27fdc	[Enhance] Add MetaUrl and CompactionUrl for "show tablet" stmt (#3962 ) * [Enhance] Add MetaUrl and CompactionUrl for "show tablet" stmt Add MetaUrl and CompactionUrl in result of following stmt: `show tablet 10010`; * fix ut * add doc Co-authored-by: chenmingyu <chenmingyu@baidu.com>	2020-06-29 19:15:38 +08:00
Mingyu Chen	af1beb6ce4	[Enhance] Add prepare phase for some timestamp functions (#3947 ) Fix: #3946 CL: 1. Add a prepare phase for the `from_unixtime()`, `date_format()` and `convert_tz()` functions, to handle the format string once and for all. 2. Find the cctz timezone when initializing `runtime state`, so that there is no need to look up the timezone for each row. 3. Add a constant rewrite rule for `utc_timestamp()` 4. Add doc for `to_date()` 5. Comment out the `push_handler_test`; it cannot run in DEBUG mode and will be fixed later. 6. Remove `timezone_db.h/cpp` and add `timezone_utils.h/cpp` The performance is shown below: 11,000,000 rows SQL1: `select count(from_unixtime(k1)) from tbl1;` Before: 8.85s After: 2.85s SQL2: `select count(from_unixtime(k1, '%Y-%m-%d %H:%i:%s')) from tbl1 limit 1;` Before: 10.73s After: 4.85s The date string format still seems slow; we may need a further enhancement for it.	2020-06-29 19:15:09 +08:00
xy720	9671394015	[BUG]Make segment V1 and V2 share same file cache (#3945 ) This commit makes segment V1 and V2 share the same file cache, so that segment V2's file descriptors stored in the cache can be cleaned up as V1's are.	2020-06-29 19:13:24 +08:00
Mingyu Chen	0cbacaf01d	[Refactor] Replace some boost to std in OlapScanNode (#3934 ) Replace some boost with std in OlapScanNode. This refactor seems to solve the problem described in #3929, because I found that BE would crash when calling `boost::condition_variable.notify_all()`. After this change, BE does not crash any more.	2020-06-29 19:13:03 +08:00
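
Note on #4048: the cardinality arithmetic described in that commit message can be illustrated with a small sketch. This is not Doris planner code. The function names and the 1,000,000-row figure for t1 are hypothetical, and only the two rules stated in the message are modeled: a cross join's cardinality is the product of both sides, and the shuffle join is chosen when both sides of a join have zero cardinality.

```python
def cross_join_cardinality(left: int, right: int) -> int:
    """Cardinality of a cross join, as described in the commit: the product."""
    return left * right


def both_sides_zero(left: int, right: int) -> bool:
    """The condition under which the commit says shuffle join is chosen."""
    return left == 0 and right == 0


T1_CARDINALITY = 1_000_000  # hypothetical size of the big table t1

# Compare the old MySQL-table cardinality (0) with the new one (1).
for mysql_cardinality in (0, 1):
    t1_x_t2 = cross_join_cardinality(T1_CARDINALITY, mysql_cardinality)
    # Joining (t1 x t2) with t3, where t3 is also a MySQL table.
    falls_back = both_sides_zero(t1_x_t2, mysql_cardinality)
    print(mysql_cardinality, t1_x_t2, falls_back)
```

With a MySQL cardinality of 0 the product collapses to 0 and both sides of the final join are zero. With 1, the product keeps t1's cardinality and the condition no longer holds.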

2096 Commits
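
Note on #3943: the difference between the two delete cases in that commit message can be sketched as follows. This models only the intended semantics (predicates inside one delete statement are ANDed, and separate statements are applied independently). It is not the segment v2 row-range code, and every name in it is hypothetical.

```python
from typing import Callable, Dict, List

Row = Dict[str, int]
Predicate = Callable[[Row], bool]
# One delete statement is a list of predicates that must ALL match.
DeleteStmt = List[Predicate]


def surviving_rows(rows: List[Row], stmts: List[DeleteStmt]) -> List[Row]:
    """Keep the rows that no delete statement fully matches."""
    def is_deleted(row: Row) -> bool:
        return any(all(pred(row) for pred in stmt) for stmt in stmts)
    return [row for row in rows if not is_deleted(row)]


# Base data from the commit message: (k1, k2) = (1, 2) and (1, 3).
base = [{"k1": 1, "k2": 2}, {"k1": 1, "k2": 3}]


def k1_is_1(row: Row) -> bool:
    return row["k1"] == 1


def k2_is_2(row: Row) -> bool:
    return row["k2"] == 2


# case 1: delete from tbl where k1=1 and k2=2;
case1 = surviving_rows(base, [[k1_is_1, k2_is_2]])
# case 2: delete from tbl where k1=1; delete from tbl where k2=2;
case2 = surviving_rows(base, [[k1_is_1], [k2_is_2]])
```

In this model case 1 leaves the row (1, 3) in place, while case 2 removes both rows, which is why the two cases cannot be treated as the same.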