doris

Author	SHA1	Message	Date
GoGoWen	a8a5c0a6a8	[improvement](load) memory usage optimization for load job (#7454 ) Reduce memory usage when loading unqualified data	2021-12-24 21:30:28 +08:00
Mingyu Chen	6c4aeab06f	[fix](broker-load) BE may crash when using preceding filter in broker or routine load (#7193 ) The broker scan node has two tuple descriptors: One is dest tuple and the other is src tuple. The src tuple is used to read the lines of the original file, and the dest tuple is used to save the converted lines. The preceding filter is executed on the src tuple, so src tuple descriptor should be used to initialize the filter expression	2021-11-30 22:04:05 +08:00
HappenLee	9216735cfa	[New Featrue] Support Vectorization Execution Engine Interface For Doris (#6329 ) 1. FE vectorized plan code 2. Function register vec function 3. Diff function nullable type 4. New thirdparty code and new thrift struct	2021-08-11 14:54:06 +08:00
stdpain	7eae3e280a	[optimization] use inline optimize ExprContext::get_value (#5385 )	2021-02-16 22:35:14 +08:00
Mingyu Chen	780900ac9c	[Feature] Support preceding filter original data when loading (#5338 ) Support conditional filtering of original data in broker load and routine load eg: ``` LOAD LABEL `label1` ( DATA INFILE ('bos://cmy-repo/1.csv') INTO TABLE tbl2 COLUMNS TERMINATED BY '\t' (event_day, product_id, ocpc_stage, user_id) SET ( ocpc_stage = ocpc_stage + 100 ) PRECEDING FILTER user_id = 1381035 WHERE ocpc_stage > 30 ) ... ```	2021-02-07 22:37:48 +08:00
sduzh	6fedf5881b	[CodeFormat] Clang-format cpp sources (#4965 ) Clang-format all c++ source files.	2020-11-28 18:36:49 +08:00
Zhengguo Yang	75e0ba32a1	Fixes some be typo (#4714 )	2020-10-13 09:37:15 +08:00
HappenLee	e4e9af4577	This PR contain three things (#4448 ) 1. Fix core bug wild pointer in PlanFragmentExecutor, fix issue #4447 2. Fix core bug wild pointer json load, fix issue #4452 3. Change the declare order of ODBC type in thrift for compatibility	2020-08-26 10:53:53 +08:00
Zhengguo Yang	d61c10b761	[Delete] Support batch delete [part 1] (#4310 ) * Implements the grammar of the batch delete #4051 * Process create, alter table when table has delete sign column * Support the syntax for enabling the delete column * Automatically filtered deleted data in the select statement. * Automatically add delete sign when create rollup table TODO: * Optimize the reading and compaction logic on the be side, so that the data marked as deleted will be completely deleted during base compaction	2020-08-21 22:57:16 +08:00
HuangWei	10f822eb43	[MemTracker] make all MemTrackers shared (#4135 ) We make all MemTrackers shared, in order to show MemTracker real-time consumptions on the web. As follows: 1. nearly all MemTracker raw ptr -> shared_ptr 2. Use CreateTracker() to create new MemTracker(in order to add itself to its parent) 3. RowBatch & MemPool still use raw ptrs of MemTracker, it's easy to ensure RowBatch & MemPool destructor exec before MemTracker's destructor. So we don't change these code. 4. MemTracker can use RuntimeProfile's counter to calc consumption. So RuntimeProfile's counter need to be shared too. We add a shared counter pool to store the shared counter, don't change other counters of RuntimeProfile. Note that, this PR doesn't change the MemTracker tree structure. So there still have some orphan trackers, e.g. RowBlockV2's MemTracker. If you find some shared MemTrackers are little memory consumption & too time-consuming, you could make them be the orphan, then it's fine to use the raw ptr.	2020-07-31 21:57:21 +08:00
HuangWei	9bb7e5d208	Fix some code & comments (#3999 ) TPlanExecParams::volume_id is never used, so delete the print_volume_ids() function. Fix log, and log if PlanFragmentExecutor::open() returns error. Fix some comments	2020-07-03 21:18:47 +08:00
worker24h	ef8fd1fcbe	[Load] Support load json-data into Doris by RoutineLoad or StreamLoad (#3553 ) Doris support load json-data by RoutineLoad or StreamLoad	2020-05-21 13:00:49 +08:00
Yingchun Lai	b58b1b3953	[metrics] Make DorisMetrics to be a real singleton (#3417 )	2020-05-04 09:20:53 +08:00
HuangWei	a80e9bf229	Fix broker scan node mem limit check (#3123 )	2020-03-16 20:36:46 +08:00
HangyuanLiu	2326b478b6	Support load orc format in Apache Doris (#2554 ) Support load orc format in Apache Doris	2020-01-07 14:22:43 +08:00
Mingyu Chen	3c12af4dcc	Limit the memory consumption of broker scan node (#1996 ) If memory exceed limit, no more row batch will be pushed to batch queue	2019-10-17 14:40:16 +08:00
Mingyu Chen	0a27ef030b	Reduce the number of partition info in BrokerScanNode param (#1675 ) And we should reduce the number of partition info in BrokerScanNode param if user already set target partitions to load, instead of adding all partitions' info. It will cause the size of RPC packet too large.	2019-08-20 19:30:57 +08:00
kangkaisen	cd2b8373c2	Fix Stream load double NumberTotalRows (#1664 )	2019-08-19 12:23:43 +08:00
worker24h	7eab12a40e	Support reading Parquet file when loading data (#1173 )	2019-07-01 18:39:27 +08:00
ZHAO Chun	ba44249f80	Remove unused code (#1320 )	2019-06-15 20:41:48 +08:00
ZHAO Chun	30028bc35b	Deny specify partition for unpartitioned table (#1319 )	2019-06-15 18:19:56 +08:00
ZHAO Chun	9d03ba236b	Uniform Status (#1317 )	2019-06-14 23:38:31 +08:00
EmmyMiao87	79ab7f4413	Change label of broker load txn (#1134 ) * Change label of broker load txn 1. put broker load label into txn label 2. fix the bug of `label is already used` 3. fix partition error of new broker load * Fix count error in mini load and broker load There are three params (num_rows_load_total, num_rows_load_filtered, num_rows_load_unselected) which are used to count dpp.norm.ALL and dpp.abnorm.ALL. num_rows_load_total is the number rows of source file. num_rows_load_unselected is the not satisfied (where conjuncts) rows of num_rows_load_total num_rows_load_filtered is the rows (quality not good enough) of (num_rows_load_total-num_rows_load_unselected)	2019-05-10 16:53:46 +08:00
Mingyu Chen	b7b66527ce	Fix some load bugs (#961 ) 1. Use load job's timeout as its txn timeout 2. Add a new session variable 'forward_to_master' for SHOW PROC and ADMIN stmt	2019-04-28 10:33:50 +08:00
Mingyu Chen	9d08be3c5f	Add metrics for routine load (#795 ) * Add metrics for routine load * limit the max number of routine load task in backend to 10 * Fix bug that some partitions will no be assigned	2019-04-28 10:33:50 +08:00
Zhao Chun	a2b299e3b9	Reduce UT binary size (#314 ) * Reduce UT binary size Almost every module depend on ExecEnv, and ExecEnv contains all singleton, which make UT binary contains all object files. This patch seperate ExecEnv's initial and destory to anthor file to avoid other file's dependence. And status.cc include debug_util.h which depend tuple.h tuple_row.h, and I move get_stack_trace() to stack_util.cpp to reduce status.cc's dependence. I add USE_RTTI=1 to build rocksdb to avoid linking librocksdb.a Issue: #292 * Update	2018-11-15 16:17:23 +08:00
chenhao7253886	37b4cafe87	Change variable and namespace name in BE (#268 ) Change 'palo' to 'doris'	2018-11-02 10:22:32 +08:00
morningman	2868793b6b	Change license to Apache License 2.0 (#262 )	2018-11-01 09:06:01 +08:00
morningman	2419384e8a	push 3.3.19 to github (#193 ) * push 3.3.19 to github * merge to 20ed420122a8283200aa37b0a6179b6a571d2837	2018-05-15 20:38:22 +08:00
morningman	5de798fdd6	Merge code to github (#187 ) * merge to 95787f8be1fd0ff215708fb0f49997b632876586 * Bugs fixed	2018-03-23 14:04:55 +08:00
李超勇	6486be64c3	fix license statement (#29 ) * change picture to word * change picture to word * SHOW FULL TABLES WHERE Table_type != VIEW sql can not execute * change license description	2017-08-18 19:16:23 +08:00
cyongli	e2311f656e	baidu palo	2017-08-11 17:51:21 +08:00

32 Commits