doris

Author	SHA1	Message	Date
EmmyMiao87	7550b2f09b	Convert mini load to streaming mini load (#1323 ) * This commit converts mini load to streaming mini load. Streaming mini load is operated the same way as before, and users can still check the load through the frontend. The difference is that streaming mini load finishes the task before the REST API replies, while non-streaming mini load only registers a load. * When updating Doris: updating either FE or BE first is supported. After both FE and BE are updated, streaming mini load takes effect. * For multi mini load: multi mini load still uses non-streaming mini load, and its behavior is unchanged. * Add an interface named isSupportedFunction: this function protects the correctness of new features that span BE and FE during updating.	2019-06-21 19:34:50 +08:00
kangpinghuang	6afedb88a8	Add bitshuffle page (#1304 )	2019-06-19 21:57:06 +08:00
kangpinghuang	7f1720b632	Add rle encoding (#1326 )	2019-06-18 14:48:33 +08:00
ZHAO Chun	a0294b8f40	Add Env for file operation (#1321 )	2019-06-17 10:18:16 +08:00
ZHAO Chun	9d03ba236b	Uniform Status (#1317 )	2019-06-14 23:38:31 +08:00
kangpinghuang	e9b2d30c6a	Add faststring and cpu util (#1281 )	2019-06-12 14:00:50 +08:00
ZHAO Chun	84632cd062	Add BitMapIterator (#1277 )	2019-06-11 09:23:02 +08:00
ZHAO Chun	d1b1fce92f	Change LICENSE file (#1265 )	2019-06-09 15:55:46 +08:00
ZHAO Chun	3e1c70d1b7	Add coding function (#1264 )	2019-06-08 21:02:31 +08:00
ZHAO Chun	934ca2481a	Make MySQL support optional (#1248 )	2019-06-05 12:28:15 +08:00
EmmyMiao87	77a1b31baa	Add show load of loadv2 (#1113 ) This change includes the show load of loadv2 and some bug fixes for loadv2. Firstly, show load will display both load and loadv2 info. For loadv2, the ETL progress is N/A during the loading period. Secondly, a loadv2 job is created when the version property is v2. This is a temporary property which does not influence the old broker load. After loadv2 is finished, the default load will be changed to loadv2. Finally, this change fixes some bugs in LoadingTaskPlanner.	2019-05-09 10:27:30 +08:00
Mingyu Chen	a08170fd50	Enhance the usabilities (#1100 ) * Enhance the usabilities 1. Add metrics to monitor transactions and the streaming load process in BE. 2. Modify BE config 'result_buffer_cancelled_interval_time' to 300s. 3. Modify FE config 'enable_metric_calculator' to true. 4. Add more logs for tracing the broker load process. 5. Modify the query report process to cancel the query immediately if some instance failed. * Fix bugs 1. Avoid NullPointer when enabling colocation join with broker load 2. Return immediately when pull load task coordinator execution failed	2019-05-07 15:55:04 +08:00
Mingyu Chen	afa3aa9069	Add some pre-calculated metrics (#1079 ) 1. max io util of disks 2. max network send/receive bytes rate of all network devices 3. base/cumulative compaction request counter and failure counter	2019-04-30 11:12:23 +08:00
lide	9c82d41981	Support Doris query ES by HTTP way (#925 )	2019-04-28 17:14:44 +08:00
Mingyu Chen	400d8a906f	Optimize the consumer assignment of Kafka routine load job (#870 ) 1. Use a data consumer group to share a single stream load pipe among multiple data consumers. This increases the consuming speed of Kafka messages and reduces the number of tasks of a routine load job. Test results: * 1 consumer, 1 partition: consume time: 4.469s, rows: 990140, bytes: 128737139. 221557 rows/s, 28M/s * 1 consumer, 3 partitions: consume time: 12.765s, rows: 2000143, bytes: 258631271. 156689 rows/s, 20M/s blocking get time(us): 12268241, blocking put time(us): 1886431 * 3 consumers, 3 partitions: consume time(all 3): 6.095s, rows: 2000503, bytes: 258631576. 328220 rows/s, 42M/s blocking get time(us): 1041639, blocking put time(us): 10356581 The last 2 cases show that we can achieve higher speed by adding more consumers. But the bottleneck then moves from the Kafka consumer to Doris ingestion, so 3 consumers in a group is enough. I also add a Backend config `max_consumer_num_per_group` to change the number of consumers in a data consumer group; the default value is 3. In my test (1 Backend, 2 tablets, 1 replica), 1 routine load task can achieve 10M/s, which is the same as raw stream load. 2. Add OFFSET_BEGINNING and OFFSET_END support for Kafka routine load	2019-04-28 10:33:50 +08:00
EmmyMiao87	8b52787114	Stream load with no data will abort txn (#735 ) 1. The stream load executor will abort the txn when there is no correct data in the task 2. Change the txn label to DebugUtil.print(UUID), which is the same as the task id printed by BE 3. Change print uuid to hi-lo	2019-04-28 10:33:50 +08:00
Mingyu Chen	0820a29b8d	Implement the routine load process of Kafka on Backend (#671 )	2019-04-28 10:33:50 +08:00
Mingyu Chen	95a06dcd2a	Change the buffer length for FloatToBuffer() method (#1019 )	2019-04-24 20:11:06 +08:00
Mingyu Chen	d3251a19f7	Modify the method to obtain some metrics (#904 )	2019-04-10 19:37:48 +08:00
lide	c34b306b4f	Decimal optimize branch #695 (#727 )	2019-03-22 17:22:16 +08:00
lichaoyong	a9e9aef3ca	Fix sync and trash bug (#570 ) 1. no need to save header when header has no incremental delta 2. make fsync tablet_meta configurable 3. add metric for meta operation	2019-01-22 20:13:09 +08:00
lide	0dd4c6e0a0	Fix ASAN compilation issue (#561 )	2019-01-21 13:09:45 +08:00
lide	79dc521893	Fix UnixMicros function (#544 )	2019-01-16 17:26:43 +08:00
Mingyu Chen	a51ce03595	Enhance the usability of Load operation (#490 ) 1. Add broker load error hub A broker load error hub collects error messages in the load process and saves them as a file to the specified remote storage via broker, because in a broker/mini/streaming load process the user may not be able to access the error log file in the Backend directly. We also add a new header option, 'enable_hub', in the streaming load request; the default is false. Enabling the broker load error hub significantly slows down streaming load processing, due to the access to remote storage via broker, so users can disable the load error hub with this header option to avoid slowing down the load. 2. Show load error logs by using SHOW LOAD WARNINGS stmt We also provide an easier way to get load error logs. We implement the 'SHOW LOAD WARNINGS ON 'url'' stmt to show load error logs directly. The 'url' in the stmt is provided by the 'SHOW LOAD' stmt. eg: show load warnings on "http://192.168.1.1:8040/api/_load_error_log?file=__shard_2/error_log_xxx"; 3. Support now() function in broker load Users can map a column to now() in a broker load stmt, which means this column will be filled with the time when the ETL started. 4. Support more types of wildcard in broker load Currently, we only support wildcard '' to match the file names. A wildcard like '/path/to/20190[1-4]' is not supported.	2019-01-03 19:07:27 +08:00
Mingyu Chen	ff7d3e5878	Unify the print method of TUniqueId (#487 )	2018-12-29 16:22:38 +08:00
lide	c74c915441	Fix license issue of gutil (#471 )	2018-12-27 13:53:27 +08:00
lichaoyong	b037466d56	Get rid of choosing one tablet by compaction (#433 ) 1. Get rid of choosing one tablet by compaction. 2. Change PREFER_READER to PREFER_WRITING from _tablet_map_lock. 3. Change license of murmur_hash	2018-12-24 16:55:39 +08:00
ZHAO Chun	f1db289934	Fix compile error (#461 )	2018-12-24 10:26:03 +08:00
ZHAO Chun	90d71508ff	Add UserFunctionCache to cache UDF's library (#453 ) * Add UserFunctionCache to cache UDF's library This patch replaces LibCache with UserFunctionCache. LibCache uses an HDFS URL to identify a UDF's library, and when the BE process restarts, all downloaded libraries have to be downloaded again. We now use the function id to identify a library, so when the process restarts, all downloaded libraries can be loaded without downloading them again. * update	2018-12-21 22:07:21 +08:00
chenhao	e177c23787	Update the increased frequency of priority for remaining tasks in BlockingPriorityQueue (#434 )	2018-12-17 21:05:57 +08:00
ZHAO Chun	e2bb86cf78	Add Md5Digest to util (#420 )	2018-12-12 20:06:35 +08:00
李超勇	6b4049e21c	Unify Slice code path (#380 )	2018-12-03 18:11:47 +08:00
Mingyu Chen	9a2ad18428	Add path info of replica in catalog (#327 ) Add path info of replica in catalog Also fix a bug that when calling check_none_row_oriented_table, store is null, it cannot be used to create table. Instead, OLAPHeader can be used to get storage type information.	2018-11-19 17:42:46 +08:00
Zhao Chun	a2b299e3b9	Reduce UT binary size (#314 ) * Reduce UT binary size Almost every module depends on ExecEnv, and ExecEnv contains all singletons, which makes the UT binary contain all object files. This patch separates ExecEnv's init and destroy into another file to avoid other files' dependence on them. Also, status.cc includes debug_util.h, which depends on tuple.h and tuple_row.h, so I move get_stack_trace() to stack_util.cpp to reduce status.cc's dependence. I add USE_RTTI=1 to build rocksdb to avoid linking librocksdb.a Issue: #292 * Update	2018-11-15 16:17:23 +08:00
李超勇	063f7d7a9a	Fix code LICENSE for file modified from LevelDB. (#300 )	2018-11-12 16:09:40 +08:00
Zhao Chun	2081b7fea5	Be compatible with old RPC (#296 ) Add palo.PInternalService, which can serve clients of old Palo versions. Issue: #293	2018-11-10 15:46:45 +08:00
kangpinghuang	c877b43013	Remove my aes and fix palo ns to doris (#277 )	2018-11-02 17:05:48 +08:00
kangpinghuang	d57e91db6e	Rewrite aes encryption (#264 ) Resolve #257	2018-11-02 15:26:31 +08:00
chenhao7253886	37b4cafe87	Change variable and namespace name in BE (#268 ) Change 'palo' to 'doris'	2018-11-02 10:22:32 +08:00
morningman	2868793b6b	Change license to Apache License 2.0 (#262 )	2018-11-01 09:06:01 +08:00
morningman	051aced48d	Missing many files in last commit In the last commit, a lot of files were missed	2018-10-31 16:19:21 +08:00
morningman	5d3fc80067	Added: * Add streaming load feature. You can execute 'help stream load;' to see more information. Changed: * The loading phase of a certain table can be parallelized, to reduce the load job execution time when multiple load jobs target a single table. * Use RocksDB to save the header info of tablets in Backends, to reduce IO operations and increase the speed of restarting. Fixed: * A lot of bugs fixed.	2018-10-31 14:46:22 +08:00
zhaochun	765c91bbc2	Added: change Doris build.sh to get environment variables from custom_env.sh, and add run-ut.sh and run-fe-ut.sh	2018-10-30 23:42:05 +08:00
imay	ae9ce81453	Changed: change build.sh to use environment variable to get thirdparty's path, and change PALO_HOME to DORIS_HOME	2018-10-30 16:29:06 +08:00
morningman	4f6f8572de	Added: Add 3 new metrics of Backends: host_fd_metrics, process_fd_metrics and process_thread_metrics, to monitor open file number and thread number. Added: Support getting column size and precision info of table or view using JDBC. Updated: Change the Prometheus type name GAUGE to lowercase, to fit the latest Prometheus version. Updated: Backend ip saved in FE will be compared with BE's local ip when doing heartbeat, to avoid false positive heartbeat response. Updated: Using version_num of tablet instead of calculating nice value to select cumulative compaction candidates. Fixed: Predicates should not be pushed down to subquery which contains limit clause. Fixed: Fix the formula of calculating BE load score. Fixed: Fix a bug that in some edge cases, non-master Frontend may wait for an unnecessarily long timeout after forwarding cmd to Master FE. Fixed: A bug that granting privs on more than one table does not work. Fixed: Support 'Insert into' table which contains HLL columns. Fixed: ExportStmt's toSql() method may throw NullPointer Exception if table does not exist. Fixed: Remove unnecessary 'get capacity' operation to avoid IO impact. Internal commit id: merge to c16bd603a53dfe2089ff95704c698a738c317792	2018-10-26 14:48:21 +08:00
morningman	cc74efb3c5	merge to ddb65b69f9c788e359e191889cb31f15279c41ec (#224 ) 1. Apache HDFS broker support HDFS HA and Hadoop kerberos authentication. 2. New Backup and Restore function. Use Fs Broker to backup your data to HDFS or restore them from HDFS. 3. Table-Level Privileges. Grant fine-grained privileges on table-level to specified user. 4. A lot of bugs fixed. 5. Performance improvement.	2018-08-24 17:12:26 +08:00
morningman	19997510a6	merge to 9625ef157dd44c58802d63cb7547f037b75fd710 (#208 ) 1. Implement Backend http server using libevent instead of mongoose. 2. Remove old Hypertable rpc framework, use brpc instead. 3. Change rpc from FE to BE to brpc. 4. Fs broker supports HDFS HA. 5. Add more metrics to monitor. 6. Lots of bugs fixed.	2018-07-17 09:20:30 +08:00
lide-reed	bfe1bc7cf3	remove mysql_dtoa	2018-06-13 08:33:48 +08:00
李超勇	7e2a3aa1b3	modify the license (#203 ) some licenses were not replaced correctly.	2018-06-09 19:12:16 +08:00
morningman	2419384e8a	push 3.3.19 to github (#193 ) * push 3.3.19 to github * merge to 20ed420122a8283200aa37b0a6179b6a571d2837	2018-05-15 20:38:22 +08:00

57 Commits