This PR is the first step to make Doris stream load more robust with higher concurrent
performance (#3368). The main work is to support transaction management with database-level isolation
and to use an ArrayDeque to store txns in final status.
#3479
Here I try to explain the cause of the problem and how to fix it.
**The Cause of the Problem**
Take the case in issue #3479 as an example:
The general results are as follows:
```
GET table/_doc/_search
{"query":{"match_all":{}},"stored_fields":"_none_","docvalue_fields":["k1"],"sort":["_doc"],"size":100}
{
  "took": 6,
  "timed_out": false,
  "_shards": {
    ……
  },
  "hits": {
    "total": 3,
    "max_score": null,
    "hits": [
      {
        "_index": "table",
        "_score": null,
        "sort": [
          0
        ]
      },
      {
        "_index": "table",
        "_score": null,
        "fields": {
          "k1": [
            "kkk1"
          ]
        },
        "sort": [
          0
        ]
      },
      {
        "_index": "table",
        "_score": null,
        "sort": [
          0
        ]
      }
    ]
  }
}
```
But in Doris on ES, the BE fetches data from all shards in parallel and uses `filter_path` to reduce network cost. The process is as follows:
```
GET table/_doc/_search?preference=_shards:1&filter_path=_scroll_id,hits.hits._source,hits.total,_id,hits.hits._source.fields,hits.hits.fields
{"query":{"match_all":{}},"stored_fields":"_none_","docvalue_fields":["k1"],"sort":["_doc"],"size":100}
{
  "hits": {
    "total": 0
  }
}
GET table/_doc/_search?preference=_shards:2&filter_path=_scroll_id,hits.hits._source,hits.total,_id,hits.hits._source.fields,hits.hits.fields
{"query":{"match_all":{}},"stored_fields":"_none_","docvalue_fields":["k1"],"sort":["_doc"],"size":100}
{
  "hits": {
    "total": 1
  }
}
GET table/_doc/_search?preference=_shards:3&filter_path=_scroll_id,hits.hits._source,hits.total,_id,hits.hits._source.fields,hits.hits.fields
{"query":{"match_all":{}},"stored_fields":"_none_","docvalue_fields":["k1"],"sort":["_doc"],"size":100}
{
  "hits": {
    "total": 1,
    "hits": [
      {
        "fields": {
          "k1": [
            "kkk1"
          ]
        }
      }
    ]
  }
}
```
*The scan worker on BE that processes the result of shard 2 will fail.*
**The reasons are as follows:**
1. `filter_path` causes the hits.hits object to be absent from the response.
2. In the current implementation, if there are data rows (total > 0), the hits.hits object must be an array.
**How To Fix It**
Two methods:
1. Modify `filter_path` so that it keeps the hits objects.
Pros: the fix is very simple.
Cons: more network cost.
2. Handle the case where fields are missing in a batch.
Pros: no loss of performance.
Cons: the code is more complex.
Since performance comes first, Method 2 is used.
**Design**
1. Add a variable "_doc_value_mode" to class "EsScrollParser" to indicate whether the data processed by this parser is in doc_value mode or not.
2. "_doc_value_mode" is passed from ESScollReader <- ESScanner <- ScrollQueryBuilder::build(), which determines whether the DSL enables doc_value mode.
3. When hits.hits in the response from ES is empty and total > 0, we know there are data rows but the corresponding fields do not exist. EsScrollParser will use "_doc_value_mode" and _total to construct _total rows whose fields are set to NULL (see the sketch below).
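A minimal sketch of step 3, with types and names simplified (this is not the actual EsScrollParser API): when the parser runs in doc_value mode and the response reports total > 0 but carries no hits.hits array, it emits `total` rows whose requested fields are all NULL.
```
#include <string>
#include <vector>

struct Row {
    // One slot per requested field; "NULL" marks a missing doc_value.
    std::vector<std::string> fields;
};

// Build `total` placeholder rows when filter_path stripped hits.hits
// even though the shard reported matching documents.
std::vector<Row> fill_missing_docvalue_rows(bool doc_value_mode, int total,
                                            size_t num_fields,
                                            bool has_hits_array) {
    std::vector<Row> rows;
    if (doc_value_mode && !has_hits_array && total > 0) {
        rows.assign(total, Row{std::vector<std::string>(num_fields, "NULL")});
    }
    return rows;
}
```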
LSAN-detected errors have been fixed by a prior patch (#3326), but
there are still some ASAN-detected errors.
This patch tries to fix these errors to make Doris BE more robust.
Then we can add a CI run in LSAN/ASAN mode to detect memory errors
as early as possible.
Each test case in ExternalScanContextMgrTest may cost 1 minute,
which is too long, so we'd better disable the background scan context
gc to speed up the unit test.
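One way to do this, as a hypothetical sketch (the constructor flag and names are illustrative, not the real ExternalScanContextMgr API): let tests construct the manager without starting the background gc thread.
```
#include <thread>

class ExternalScanContextMgr {
public:
    // Hypothetical flag: unit tests pass false so no background gc thread
    // is started and each test case returns immediately.
    explicit ExternalScanContextMgr(bool start_background_gc = true) {
        if (start_background_gc) {
            _gc_thread = std::thread([this] { gc_expired_context(); });
        }
    }
    ~ExternalScanContextMgr() {
        if (_gc_thread.joinable()) {
            _gc_thread.join();
        }
    }

private:
    void gc_expired_context() {
        // The real implementation would loop and reclaim expired scan
        // contexts periodically; omitted here.
    }
    std::thread _gc_thread;
};

// In the unit test:
//   ExternalScanContextMgr mgr(/*start_background_gc=*/false);
```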
The callback added to the CallbackFactory should not be removed until the
transaction is aborted or visible. Otherwise, some callback methods may fail
to be called.
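A minimal sketch of the intended lifecycle (names and types are illustrative): the callback stays registered through prepare/commit and is only removed once the transaction becomes visible or is aborted, so later notifications can still find it.
```
#include <cstdint>
#include <memory>
#include <mutex>
#include <unordered_map>

struct TxnCallback {
    virtual ~TxnCallback() = default;
    virtual void on_finished() {}
};

class CallbackFactory {
public:
    void add(int64_t txn_id, std::shared_ptr<TxnCallback> cb) {
        std::lock_guard<std::mutex> l(_lock);
        _callbacks[txn_id] = std::move(cb);
    }
    std::shared_ptr<TxnCallback> get(int64_t txn_id) {
        std::lock_guard<std::mutex> l(_lock);
        auto it = _callbacks.find(txn_id);
        return it == _callbacks.end() ? nullptr : it->second;
    }
    void remove(int64_t txn_id) {
        std::lock_guard<std::mutex> l(_lock);
        _callbacks.erase(txn_id);
    }

private:
    std::mutex _lock;
    std::unordered_map<int64_t, std::shared_ptr<TxnCallback>> _callbacks;
};

// Only when the txn becomes VISIBLE or ABORTED: invoke the callback, then drop it.
void on_txn_finished(CallbackFactory* factory, int64_t txn_id) {
    if (auto cb = factory->get(txn_id)) {
        cb->on_finished();
    }
    factory->remove(txn_id);
}
```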
When BE sets `ignore_broken_disk` to true, it is expected that a non-existent path in storage_root_path will not prevent BE from launching, but in 0.12 BE fails to launch in this scenario.
```
W0506 14:46:11.039953 17040 options.cpp:64] path can not be canonicalized. may be not exist. path=/data11/olap
W0506 14:46:11.040014 17040 options.cpp:141] failed to parse store path /data11/olap, res=-203
```
The reason is that #2861 adds a path existence check in `parse_root_path` which precedes the usage of `ignore_broken_disk` in the main method.
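A sketch of the intended behavior, with simplified signatures (assumed names, not the actual parse_root_path code): when `ignore_broken_disk` is set, a path that cannot be canonicalized should be tolerated instead of failing the whole parse.
```
#include <filesystem>
#include <iostream>
#include <string>

// Returns false only when the path is unusable AND broken disks are not
// ignored; with ignore_broken_disk the raw path is kept and BE can launch.
bool parse_root_path(const std::string& root_path, bool ignore_broken_disk,
                     std::string* out_path) {
    std::error_code ec;
    auto canonical = std::filesystem::canonical(root_path, ec);
    if (ec) {
        if (ignore_broken_disk) {
            std::cerr << "ignore broken disk, path=" << root_path << std::endl;
            *out_path = root_path;
            return true;
        }
        return false;  // old behavior: res=-203, BE refuses to start
    }
    *out_path = canonical.string();
    return true;
}
```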
Implementation Notes
NodeChannel
_cur_batch -> _pending_batches: when _cur_batch is filled up, it is moved into _pending_batches.
add_row() only produces batches (a sketch follows below).
try_send_and_fetch_status() tries to consume one pending batch. If there is an in-flight packet, sending is skipped in this round.
So we can add one sender thread that is in charge of try_send for all node channels.
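A minimal sketch of the producer side, with simplified types (RowBatch and the batch size here are illustrative): add_row() fills _cur_batch and, once it is full, moves it into _pending_batches for the sender thread to pick up.
```
#include <memory>
#include <mutex>
#include <queue>

struct RowBatch {
    int rows = 0;
    bool is_full() const { return rows >= 1024; }  // batch size is illustrative
    void add_row() { ++rows; }
};

class NodeChannel {
public:
    void add_row() {
        _cur_batch->add_row();
        if (_cur_batch->is_full()) {
            // Hand the full batch over to the sender thread and start a new one.
            std::lock_guard<std::mutex> l(_lock);
            _pending_batches.push(std::move(_cur_batch));
            _cur_batch = std::make_unique<RowBatch>();
        }
    }

private:
    std::unique_ptr<RowBatch> _cur_batch = std::make_unique<RowBatch>();
    std::queue<std::unique_ptr<RowBatch>> _pending_batches;
    std::mutex _lock;
};
```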
IndexChannel
init(), open() stay the same.
Use for_each_node_channel() to expose the detailed changes of NodeChannel. (This makes it easier to read & modify.)
Sender thread
See func OlapTableSink::_send_batch_process()
Why use polling?
If we used wait/notify, the notification would fire when a new batch is generated. We can't skip sending that batch, because it won't be notified for the same batch again, so wait/notify can't simply avoid blocking.
So I chose polling.
It's wasteful to call try_send() continuously, but it's difficult to choose a suitable polling interval. Thus, I add std::this_thread::yield() to give up the time slice and give priority to other processes/threads (if there are any waiting in the queue); see the sketch below.
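A minimal sketch of the polling sender loop described above (the NodeChannel interface is reduced to the one call that matters here, and its body is a stand-in):
```
#include <atomic>
#include <thread>
#include <vector>

struct NodeChannel {
    int pending = 0;
    // Returns 1 while the channel still has pending work (or skipped this
    // round because a packet is in flight), 0 once everything is sent.
    int try_send_and_fetch_status() {
        if (pending == 0) return 0;
        --pending;  // stand-in for actually sending one batch
        return 1;
    }
};

void send_batch_process(std::vector<NodeChannel*>& channels,
                        std::atomic<bool>& stop) {
    while (!stop.load()) {
        int running = 0;
        for (auto* ch : channels) {
            running += ch->try_send_and_fetch_status();
        }
        if (running == 0) {
            break;  // all node channels have finished
        }
        // Give up the time slice instead of spinning hard, so other
        // threads waiting in the run queue get priority.
        std::this_thread::yield();
    }
}
```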
RandomAccessFileOptions, WritableFileOptions, and RandomRWFileOptions are
defined as structs but were previously forward-declared as classes; this is valid,
but it produces a compile warning or error under the clang compiler.
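For illustration, the kind of mismatch clang flags (e.g. via -Wmismatched-tags); the member shown is made up:
```
// Forward declaration uses `class` ...
class RandomAccessFileOptions;

// ... but the definition uses `struct`: valid C++, yet clang warns about the
// mismatched class-key (-Wmismatched-tags), which breaks -Werror builds.
struct RandomAccessFileOptions {
    bool verify_checksum = true;  // illustrative member only
};
```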
For SQL like:
```
select * from
join1 left semi join join2
on join1.id = join2.id and join2.id > 1;
```
the predicate `join2.id > 1` cannot be pushed down to table join1.
1. Reserve `SegmentWriter::_column_writers` before writing to it.
2. Remove some condition branches in SegmentWriter::init.
3. Fix hard-coded library names in build-thirdparty.sh.
Add a new config `drop_backend_after_decommission` in FE. If this config
is false, the BE will not be dropped after finishing the decommission operation.
This new config tries to solve the problem described in ISSUE #3460.
TODO:
This method will generate a lot of data migration, so it is only a temporary solution.
After that, we should try to solve the problem of data balancing within the BE.
This CL also adds documents for the FE and BE configuration.
These documents are incomplete and can be completed later.
When there is a subquery in the where clause, the query will be rewritten to a join operation,
and some auxiliary binary predicates will be generated. These binary predicates
will not go through the ExprRewriteRule, so they are not normalized into the
"column on the left and constant on the right" format.
We need to take this case into account so that the `canPushDownPredicate()` judgement
will not throw an exception.
The current implementation is very simple and conservative, because our query planner is error-prone.
After we implement the new query planner, we can do this work with a `Predicate Equivalence Class` and a `PredicatePushDown` rule like Presto.
The resource pool in the runtime state will be automatically unregistered
when the RuntimeState is destructed. So there is no need to unregister it when
closing the plan fragment executor.
* [Fix] Fix bug that rowset meta is deleted after compaction
After compaction, the tablet rowset meta will be modified by
adding the new output rowsets and deleting the old input rowsets.
The output version may equal the input version.
So we should delete the "input" version from _rs_version_map
before adding the "output" version to _rs_version_map. Otherwise,
the new "output" version will be lost from _rs_version_map.