doris

Author	SHA1	Message	Date
kangpinghuang	9d7f99a669	Add new file format design markdown (#1267 )	2019-06-11 09:34:06 +08:00
EmmyMiao87	53062122ea	Change strategy of incorrect data (#1255 ) This change adds a load property named strict_mode which is used to reject incorrect data. When it is set to false, incorrect data will be loaded as NULL, just like before. When it is set to true, incorrect data which belongs to a column without expr will be filtered. The strict_mode is supported in broker load v2 now. It will be supported in stream load later.	2019-06-10 20:39:45 +08:00
Mingyu Chen	ff0dd0d2da	Support SSL authentication with Kafka in routine load job (#1235 )	2019-06-07 16:29:01 +08:00
kevin	cb91e15f1e	Modify UDF docs (#1260 )	2019-06-06 15:47:10 +08:00
ZHAO Chun	7cdaba66dc	Add spatial func (#1213 ) Support some spatial functions, such as ST_Contains.	2019-05-31 14:23:09 +08:00
HangyuanLiu	9d19c6c315	Support arbitrary kafka properties (#1204 )	2019-05-28 10:03:50 +08:00
HangyuanLiu	5ca2805701	Add some date time function doc (#1206 )	2019-05-27 17:36:09 +08:00
EmmyMiao87	85b4619d54	Change insert into to streaming (#1191 ) The non-streaming hint of insert into will use the streaming plan, which is the same as the plan of stream insert. It will also record the load info and return the label of the insert stmt. Partitions are supported in the insert into stmt: only results which match the target partitions will be loaded. The introduction of the examples has been changed, especially for non-streaming insert. Also, the partition_names param is added to the SQL syntax to declare the target partitions in the target table. Change META_VERSION to 50	2019-05-23 20:53:30 +08:00
HangyuanLiu	cde315c9e9	Add date-function doc (#1190 )	2019-05-23 15:29:08 +08:00
Mingyu Chen	722a9e71c7	Optimize json functions (#1177 ) 1. get_json_xxx() now supports using quotes to escape dots 2. Implement json_path_prepare() function to preprocess json_path. The running time of get_json_string() on 1000000 rows drops from 2.27s to 0.27s	2019-05-21 09:13:12 +08:00
EmmyMiao87	398055ef3e	Add logic of cancel job (#1154 )	2019-05-14 17:26:45 +08:00
Yunfeng,Wu	76a8093c70	Add documentation for doris on es (#1151 )	2019-05-13 21:58:05 +08:00
ZHAO Chun	debb58c278	Add SHOW FUNCTION and update docs for UDF (#1140 )	2019-05-11 21:46:37 +08:00
Mingyu Chen	4039985729	Fix some bugs about decommission (#1138 ) 1. Print the last few tablets of the decommissioning backend in fe.log for debugging. 2. OlapTableSink should get replicas on alive Backends, not only available Backends. 3. When decommissioning multiple Backends, we should drop the redundant replicas before creating a new one. 4. Replicas on decommissioning Backends should not be added to the catalog again. 5. Decommissioning Backends should not be chosen as the destination of tablet repairing.	2019-05-10 17:41:48 +08:00
EmmyMiao87	79ab7f4413	Change label of broker load txn (#1134 ) * Change label of broker load txn 1. Put broker load label into txn label 2. Fix the bug of `label is already used` 3. Fix partition error of new broker load * Fix count error in mini load and broker load There are three params (num_rows_load_total, num_rows_load_filtered, num_rows_load_unselected) which are used to count dpp.norm.ALL and dpp.abnorm.ALL. num_rows_load_total is the number of rows in the source file. num_rows_load_unselected is the number of rows in num_rows_load_total that do not satisfy the where conjuncts. num_rows_load_filtered is the number of rows in (num_rows_load_total - num_rows_load_unselected) whose quality is not good enough.	2019-05-10 16:53:46 +08:00
mengqinghuan	e5a5201626	Update routine-load-manual.md (#1133 ) edit some descriptions about “max_error_number”	2019-05-10 14:38:28 +08:00
mengqinghuan	4aa41a4e3b	Update admin_stmt.md (#1131 )	2019-05-10 11:49:29 +08:00
Mingyu Chen	ba78adae94	Fix bugs when using function in both stream load request and routine load job (#1091 )	2019-05-05 20:51:30 +08:00
kangkaisen	b2a022b348	Add money_format function (#1064 )	2019-04-29 18:31:24 +08:00
ZHAO Chun	9a570af9a3	Add insert statement document (#1069 )	2019-04-29 14:22:20 +08:00
Mingyu Chen	310a375aec	Fix bug that null value is not correctly handled when loading data (#1070 ) When the partition column's value is NULL, it should be loaded into the partition which includes MIN VALUE	2019-04-29 13:55:28 +08:00
EmmyMiao87	1662d91877	Change the logic of RoutineLoadTaskScheduler (#1061 ) 1. TaskScheduler will process one task per round 2. TaskScheduler will be blocked till queue tasks a new task 3. TaskScheduler will submit tasks when queue is empty 4. Add an example of creating a broker table by BOS 5. Change syntax of show routine load job	2019-04-28 20:05:48 +08:00
Mingyu Chen	5e36a769a0	Change the way to calculate task num (#1049 )	2019-04-28 10:33:50 +08:00
Mingyu Chen	9cd090c96a	Modify routine load doc (#1016 ) Add config specification	2019-04-28 10:33:50 +08:00
EmmyMiao87	a79bd0c771	Add doc of auto creator of kafka topic (#985 ) * Add annotation of show routine load	2019-04-28 10:33:50 +08:00
Mingyu Chen	1b5643c6fb	Fix some bugs (#979 ) 1. Add Config.max_routine_load_concurrent_task_num instead of the old one 2. Fix a bug that SHOW ALTER TABLE COLUMN may throw a NullPointerException 3. Fix some misspellings in docs	2019-04-28 10:33:50 +08:00
Mingyu Chen	56bec6f22a	Add routine load manual (#967 )	2019-04-28 10:33:50 +08:00
Mingyu Chen	b7b66527ce	Fix some load bugs (#961 ) 1. Use load job's timeout as its txn timeout 2. Add a new session variable 'forward_to_master' for SHOW PROC and ADMIN stmt	2019-04-28 10:33:50 +08:00
Mingyu Chen	400d8a906f	Optimize the consumer assignment of Kafka routine load job (#870 ) 1. Use a data consumer group to share a single stream load pipe with multiple data consumers. This will increase the consuming speed of Kafka messages, as well as reduce the task number of a routine load job. Test results: * 1 consumer, 1 partition: consume time: 4.469s, rows: 990140, bytes: 128737139. 221557 rows/s, 28M/s * 1 consumer, 3 partitions: consume time: 12.765s, rows: 2000143, bytes: 258631271. 156689 rows/s, 20M/s blocking get time(us): 12268241, blocking put time(us): 1886431 * 3 consumers, 3 partitions: consume time(all 3): 6.095s, rows: 2000503, bytes: 258631576. 328220 rows/s, 42M/s blocking get time(us): 1041639, blocking put time(us): 10356581 The last 2 cases show that we can achieve higher speed by adding more consumers. But the bottleneck moves from the Kafka consumer to Doris ingestion, so 3 consumers in a group is enough. I also add a Backend config `max_consumer_num_per_group` to change the number of consumers in a data consumer group; the default value is 3. In my test (1 Backend, 2 tablets, 1 replica), 1 routine load task can achieve 10M/s, which is the same as raw stream load. 2. Add OFFSET_BEGINNING and OFFSET_END support for Kafka routine load	2019-04-28 10:33:50 +08:00
Mingyu Chen	c577b9397e	Add help doc of routine load (#811 )	2019-04-28 10:33:50 +08:00
HangyuanLiu	a5494372b8	Fix some errors in doc (#998 )	2019-04-23 13:45:04 +08:00
HangyuanLiu	22f93b5d7a	Fix doc in alter bloom filter (#984 )	2019-04-22 14:07:12 +08:00
ZHAO Chun	22dc6119b9	Add some string functions doc (#965 )	2019-04-19 09:50:06 +08:00
ZHAO Chun	cdd613c5a4	Add some string functions document (#928 )	2019-04-16 10:14:49 +08:00
ZHAO Chun	de1d1f715a	Add readme for new documentation framework (#919 )	2019-04-15 11:01:26 +08:00
HangyuanLiu	61e356b8a5	Fix docs 'CANCEL DECOMMISSION' (#879 )	2019-04-04 17:38:29 +08:00
zhaidongbo	a1bfc90320	Support hll_raw_agg in Aggregate Function (#832 ) The hll_raw_agg function aggregates HLL type values and returns an HLL type value	2019-04-01 16:17:56 +08:00
HangyuanLiu	67314f07f3	Fix syntax error lable -> label (#817 )	2019-03-26 19:29:11 +08:00
Mingyu Chen	d47600ed84	Modify the logic of setting password (#798 ) * Modify the logic of setting password 1. User can set password for current_user() or if it has GRANT priv 2. Add USER() function support	2019-03-25 09:27:40 +08:00
Mingyu Chen	c11e78c6e6	Fix bug of invalid replica last failed version (#746 ) 1. Some previous Doris versions may cause an invalid replica last failed version. 2. Also modify the CREATE TABLE help doc, remove row storage type and random distribution.	2019-03-14 12:35:29 +08:00
Mingyu Chen	4dbbd32a72	Remove sensitive info (#692 )	2019-03-06 17:29:11 +08:00
Mingyu Chen	1b96af5e3e	Add some images for wiki pages (#670 )	2019-02-27 15:06:15 +08:00
Mingyu Chen	9252beca99	Simplify the delete stmt (#668 ) Remove the restriction that the delete stmt must specify a partition even for an unpartitioned table	2019-02-27 12:46:36 +08:00
lide	e135e3d41e	Add an example of help load (#584 )	2019-01-25 11:18:10 +08:00
Mingyu Chen	079141e14a	Add disk usage percent in SHOW BACKEND stmt (#571 )	2019-01-23 14:08:33 +08:00
Mingyu Chen	54e98f6964	Auto fix missing version replica (#560 )	2019-01-21 08:56:43 +08:00
Mingyu Chen	d15bc83de0	Fix some bugs of alter table operation (#550 ) 1. Fix bug that failed to query restored table after schema change. 2. Fix bug that failed to add rollup to restored table. 3. Optimize the info of SHOW ALTER TABLE stmt. 4. Optimize the info of some PROCs. 5. Optimize the tablet checker to avoid adding too many tasks to the scheduler.	2019-01-17 15:17:51 +08:00
Mingyu Chen	798a66e6a0	Implement new tablet repair and balance framework (#336 ) More detail, see issue #540	2019-01-16 13:29:17 +08:00
ZHAO Chun	9bfd8d818a	Add md5 property for UDF create statement (#500 )	2019-01-06 19:45:04 +08:00
Mingyu Chen	a51ce03595	Enhance the usability of Load operation (#490 ) 1. Add broker load error hub A broker load error hub will collect error messages in the load process and save them as a file to the specified remote storage via broker. This covers the case where, in a broker/mini/streaming load process, the user may not be able to access the error log file in the Backend directly. We also add a new header option: 'enable_hub' in the streaming load request, and the default is false. If we enable the broker load error hub, it will significantly slow down the processing speed of streaming load, due to the visit of remote storage via broker. So users can disable the broker load error hub using this header option, to avoid slowing down the load speed. 2. Show load error logs by using SHOW LOAD WARNINGS stmt We also provide an easier way to get load error logs. We implement the 'SHOW LOAD WARNINGS ON 'url'' stmt to show load error logs directly. The 'url' in the stmt is provided in the 'SHOW LOAD' stmt. eg: show load warnings on "http://192.168.1.1:8040/api/_load_error_log?file=__shard_2/error_log_xxx"; 3. Support now() function in broker load Users can map a column to now() in the broker load stmt, which means this column will be filled with the time when the ETL started. 4. Support more types of wildcard in broker load Currently, we only support wildcard '' to match the file names. A wildcard like '/path/to/20190[1-4]' is not supported.	2019-01-03 19:07:27 +08:00
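
The throughput figures quoted in commit 400d8a906f (#870) can be re-derived from the consume time, row count and byte count given in the same message. The sketch below is only a sanity check, not code from the repository: the `ConsumeRun` class and its method names are hypothetical, and reading "M/s" as 10^6 bytes per second, rounded down, is an assumption that happens to match all three quoted results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConsumeRun:
    """One test result from the commit message (hypothetical container)."""
    label: str
    seconds: float      # "consume time"
    rows: int           # "rows"
    bytes_read: int     # "bytes"

    def rows_per_second(self) -> int:
        # Rounded down, matching the integers in the commit message.
        return int(self.rows / self.seconds)

    def megabytes_per_second(self) -> int:
        # Assumption: "M" means 10^6 bytes, rounded down.
        return int(self.bytes_read / self.seconds / 1_000_000)


# Figures copied from the commit message of 400d8a906f.
RUNS = [
    ConsumeRun("1 consumer, 1 partition", 4.469, 990140, 128737139),
    ConsumeRun("1 consumer, 3 partitions", 12.765, 2000143, 258631271),
    ConsumeRun("3 consumers, 3 partitions", 6.095, 2000503, 258631576),
]


def summarize(run: ConsumeRun) -> str:
    """Format one run the way the commit message reports it."""
    return (f"{run.label}: {run.rows_per_second()} rows/s, "
            f"{run.megabytes_per_second()}M/s")


if __name__ == "__main__":
    for run in RUNS:
        print(summarize(run))
```

Under these assumptions the three runs come out at the same rows/s and M/s values the commit message reports, so the quoted rates are consistent with the raw counts.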

100 Commits