283 Commits

Author SHA1 Message Date
ccf2e5bb9e Add page api for new format segment (#1270) 2019-06-11 10:37:16 +08:00
922fa28097 Add common and options for new format segment (#1269) 2019-06-11 09:34:58 +08:00
84632cd062 Add BitMapIterator (#1277) 2019-06-11 09:23:02 +08:00
53062122ea Change strategy of incorrect data (#1255)
This change adds a load property named strict_mode, which controls how incorrect data is handled.
When it is set to false, incorrect data is loaded as NULL, just like before.
When it is set to true, incorrect data that belongs to a column without an expr is filtered out.
strict_mode is currently supported in broker load v2. It will be supported in stream load later.
2019-06-10 20:39:45 +08:00
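The strict_mode behavior described in 53062122ea can be sketched roughly as follows. This is only an illustration under stated assumptions, not the loader's real code: the `BAD` marker, the `load_rows` function and its `expr_columns` argument are hypothetical names, and the handling of incorrect data in columns that do have an expr under strict mode (loaded as NULL here) is an assumption the commit message does not spell out.

```python
BAD = object()  # hypothetical marker for a value that failed type conversion


def load_rows(rows, strict_mode, expr_columns=frozenset()):
    """Return (loaded_rows, filtered_count) for a list of dict rows.

    strict_mode=False: every incorrect value is loaded as NULL (None).
    strict_mode=True:  a row is filtered if it has an incorrect value in a
                       column that is not derived from an expr.
    Assumption: incorrect values in expr columns are still loaded as NULL.
    """
    loaded, filtered = [], 0
    for row in rows:
        bad_cols = [col for col, val in row.items() if val is BAD]
        if strict_mode and any(col not in expr_columns for col in bad_cols):
            filtered += 1
            continue
        loaded.append({col: (None if val is BAD else val) for col, val in row.items()})
    return loaded, filtered


rows = [{"k": 1, "v": BAD}, {"k": 2, "v": 3}]
print(load_rows(rows, strict_mode=False))
print(load_rows(rows, strict_mode=True))
```

With `strict_mode=False` both rows are kept and the bad value becomes `None`; with `strict_mode=True` the first row is counted as filtered.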
8cd29d194e Fix += decimal error (#1272)
This change fixes the += decimal error that occurs when the integer part or the fraction part is zero.
In this situation, the += operator produced an incorrect result.
2019-06-10 16:30:57 +08:00
d1b1fce92f Change LICENSE file (#1265) 2019-06-09 15:55:46 +08:00
3e1c70d1b7 Add coding function (#1264) 2019-06-08 21:02:31 +08:00
e4e04e8203 Make LZO support optional (#1263) 2019-06-07 22:26:54 +08:00
ff0dd0d2da Support SSL authentication with Kafka in routine load job (#1235) 2019-06-07 16:29:01 +08:00
934ca2481a Make MySQL support optional (#1248) 2019-06-05 12:28:15 +08:00
ece34fb838 Make hll function backward compatible (#1251) 2019-06-05 11:12:36 +08:00
6ce8087916 Fix bug that RowCursor does NOT match RowBlock's layout (#1249) 2019-06-04 22:20:10 +08:00
9f5f44ec48 Reduce memory RowBlock needed (#1238)
Previously, RowBlock reserved memory for all columns in the schema, even
those that were not queried, which caused poor performance when querying
a wide table.

In this patch, RowBlock reserves memory only for the needed columns. In
one case, this reduces ConvertBatchTime from 10s to 60ms when querying a
wide table with 178 columns.

 #1236
2019-06-04 12:58:41 +08:00
bedd94dca2 Upgrade brpc to 0.9.5 (#1243)
Change some ut
2019-06-04 11:13:23 +08:00
ae75e44e05 fixup leak memory (#1244)
When building with BUILD_TYPE=LSAN, LeakSanitizer reported a memory leak after running Doris.

be.out:
Direct leak of 32816 byte(s) in 1 object(s) allocated from:
    #0 0x1089666 in __interceptor_malloc ../../../../libsanitizer/lsan/lsan_interceptors.cc:53
    #1 0x7ff459547280 in __alloc_dir (/lib64/libc.so.6+0xc0280)

SUMMARY: LeakSanitizer: 32816 byte(s) leaked in 1 allocation(s).
2019-06-04 11:07:37 +08:00
6231fe0abc Fix FragmentMgrTest crash sometimes (#1232) 2019-06-01 18:10:24 +08:00
741539de91 Release udf headers & lib (#1231)
remove internal headers from udf.h
release udf headers & lib
2019-05-31 17:47:41 +08:00
7cdaba66dc Add spatial func (#1213)
Support some spatial functions, such as ST_Contains.
2019-05-31 14:23:09 +08:00
c20d62679e Add negative load from StreamLoad (#1227) 2019-05-31 07:14:06 +08:00
dc0cd5fd67 Fix the bug of += decimal in olap engine (#1226)
* Fix the bug of += decimal in olap engine
[ISSUE-1225] This change fixes the olap engine bug in decimal aggregation. It uses ^ instead of * to judge whether the result is less than zero.
The result of * can be less than zero when the multiplication overflows, so the answer of += was incorrect.
2019-05-31 07:12:22 +08:00
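The sign check mentioned in dc0cd5fd67 can be illustrated with a small sketch. It is not the olap engine's code: the function names are hypothetical, the operands stand in for whatever values the engine compares, and `wrap_int64` only simulates 64-bit signed wraparound, because Python integers do not overflow.

```python
def wrap_int64(x):
    """Simulate two's-complement wraparound of a signed 64-bit integer."""
    x &= (1 << 64) - 1
    return x - (1 << 64) if x >= (1 << 63) else x


def signs_differ_by_product(a, b):
    """Sign test via multiplication; wrong once the product overflows 64 bits."""
    return wrap_int64(a * b) < 0


def signs_differ_by_xor(a, b):
    """Sign test via XOR; looks only at the sign bits, so it cannot overflow."""
    return (a ^ b) < 0


big = 3_037_000_500  # big * big is slightly larger than 2**63 - 1
print(signs_differ_by_product(big, big))  # the wrapped product is negative
print(signs_differ_by_xor(big, big))      # both operands are positive
```

For nonzero operands whose product fits in 64 bits the two tests agree. They diverge when the product overflows, which matches the failure the commit describes. Note that they also differ when one operand is zero and the other is negative, so the two are not drop-in replacements in every context.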
180d8e5cbd Modify some thirdparties (#1228)
1. Change Kafka java client from 2.0.0 to 0.10.1.1, because a higher-version client may not support a lower-version server.
2. Enable SSL in librdkafka
2019-05-30 21:23:37 +08:00
9d19c6c315 Support arbitrary kafka properties (#1204) 2019-05-28 10:03:50 +08:00
08c8caeacf Add max cache size to ClientCache in BE (#1202)
Currently, an unlimited client cache pool may cause too many connections in FE.
2019-05-24 22:02:09 +08:00
85b4619d54 Change insert into to streaming (#1191)
Insert into with the non-streaming hint now uses the streaming plan, which is the same as the plan of streaming insert.
It also records the load info and returns the label of the insert stmt.
Partitions are supported in the insert into stmt. Only results that fall into the target partitions are loaded.
The example documentation has been changed, especially for non-streaming insert.
Also, the partition_names param is added to the SQL syntax to declare the target partitions in the target table.

Change META_VERSION to 50
2019-05-23 20:53:30 +08:00
488e3825f7 Fix bug that restore process in BE causes BE crash (#1193)
When calling SnapshotLoader.move(), all files should be revoked if they
are in the GC queue; otherwise a file may be deleted after move() succeeds.
2019-05-23 19:22:29 +08:00
5d1457c0b6 Add check to create tablet upon alter tablet task (#1187)
When creating a new tablet by an alter tablet task,
next_unique_id increases from that of the old base tablet.
If next_unique_id is equal to zero, ColumnDataMessage
will not match the tablet meta.
2019-05-23 14:17:22 +08:00
d42409cc35 Fix short key not fill up all space (#1183) 2019-05-22 11:32:46 +08:00
592c2c24d9 Fix revoke files bug (#1181) 2019-05-22 11:06:31 +08:00
c5bf1a8da1 Fix prefix index comparison (#1180)
1. Upon prefix index comparison, it should only compare the fixed length of prefix index
2019-05-21 20:17:24 +08:00
b132f4ac0c Add a configuration to force seek for block (#1179) 2019-05-21 14:53:37 +08:00
722a9e71c7 Optimize json functions (#1177)
1. get_json_xxx() now supports using quotes to escape dots
2. Implement json_path_prepare() function to preprocess json_path

Running time of get_json_string() on 1000000 rows drops from 2.27s to 0.27s
2019-05-21 09:13:12 +08:00
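As an illustration of the quote escaping described in 722a9e71c7, the sketch below splits a json path on dots while treating dots inside double quotes as part of the key. It is an assumption about the path syntax (for example `$."k1.key".k2`) and not the actual json_path_prepare() implementation; `split_json_path` is a hypothetical name.

```python
def split_json_path(path):
    """Split a json path on '.' while keeping dots inside double quotes."""
    parts, current, in_quotes = [], [], False
    for ch in path:
        if ch == '"':
            in_quotes = not in_quotes      # quotes delimit a key; they are not kept
        elif ch == "." and not in_quotes:
            parts.append("".join(current))  # an unquoted dot ends the current step
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


print(split_json_path('$."k1.key".k2'))
```

Parsing the path once and reusing the resulting steps for every row is the kind of preprocessing that avoids repeating the same work per row.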
ff2746157e Remove log info from decimalv2_value to avoid performance degradation (#1175) 2019-05-20 14:26:14 +08:00
7f8a1bcdb6 Threadpool should be shutdown before join() (#1171) 2019-05-17 19:10:22 +08:00
18e06d2e67 Fix position seek bug for varchar short key (#1167) 2019-05-16 17:29:43 +08:00
b2e63910a6 Fix bug that routine load task may be blocked due to premature deconstruction (#1166)
The data consumer group should wait for all data consumers to finish before returning.
2019-05-16 16:15:00 +08:00
b24fab48cd Add some logs for compaction process (#1163) 2019-05-15 18:47:08 +08:00
758adce761 Change the compaction thread number based on disks number (#1161)
Add 2 BE configs: base_compaction_num_threads_per_disk and cumulative_compaction_num_threads_per_disk to control the number of threads per disk.
2019-05-15 14:17:11 +08:00
47fb206cdf Skip tablet under compaction when choose candidate (#1160) 2019-05-15 13:05:32 +08:00
ad88741b69 Fix bug that bad tablet blocking compaction of other tablets (#1158)
A bad tablet is always chosen for compaction and fails again
and again, which may block compaction of other tablets.
Add a BE config 'min_compaction_failure_interval_ms' to avoid choosing
a bad tablet again within a certain interval, so that other tablets have
a chance to do the compaction.

Also fix a bug that using the avg() function on a varchar column returns
an unexpected exception.
2019-05-15 12:44:38 +08:00
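The effect of 'min_compaction_failure_interval_ms' from ad88741b69 can be sketched as a candidate filter. This is a simplified model and not the BE's selection code: the tablet dictionaries, the `score` field and the `pick_compaction_candidate` function are hypothetical.

```python
def pick_compaction_candidate(tablets, now_ms, min_compaction_failure_interval_ms):
    """Pick the highest-score tablet, skipping ones that failed too recently.

    Each tablet is a dict with hypothetical keys:
      "id", "score", and "last_failure_ms" (None if it never failed).
    """
    eligible = [
        t for t in tablets
        if t["last_failure_ms"] is None
        or now_ms - t["last_failure_ms"] >= min_compaction_failure_interval_ms
    ]
    return max(eligible, key=lambda t: t["score"], default=None)


tablets = [
    {"id": "bad", "score": 100, "last_failure_ms": 8_000},
    {"id": "healthy", "score": 50, "last_failure_ms": None},
]
print(pick_compaction_candidate(tablets, now_ms=10_000,
                                min_compaction_failure_interval_ms=5_000)["id"])
```

Without the interval check the tablet with the highest score would be picked every round even though it keeps failing; with it, other tablets get a turn until the interval has elapsed.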
910f16af81 Fix bug that using wrong capacity in trash sweep policy (#1156)
We should use total disk usage capacity instead of data used capacity;
otherwise, the config 'disk_capacity_insufficient_percentage' will not work.
2019-05-14 11:30:17 +08:00
02f36c23ed Set tablet as bad when loading index failed (#1146)
A bad tablet will be reported to FE and handled there.

Also add a config auto_recover_index_loading_failure to control how index loading failures are processed.
2019-05-13 10:22:04 +08:00
79ab7f4413 Change label of broker load txn (#1134)
* Change label of broker load txn

1. put broker load label into txn label
2. fix the bug of `label is already used`
3. fix partition error of new broker load

* Fix count error in mini load and broker load

There are three params (num_rows_load_total, num_rows_load_filtered, num_rows_load_unselected) which are used to count dpp.norm.ALL and dpp.abnorm.ALL.
num_rows_load_total is the number of rows in the source file.
num_rows_load_unselected is the number of rows in num_rows_load_total that do not satisfy the where conjuncts.
num_rows_load_filtered is the number of rows in (num_rows_load_total - num_rows_load_unselected) whose quality is not good enough.
2019-05-10 16:53:46 +08:00
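The relationship between the three counters in 79ab7f4413 can be written out as a short calculation. The mapping to dpp.norm.ALL and dpp.abnorm.ALL used here (normal rows are the selected rows that were not filtered, abnormal rows are the filtered rows) is an assumption, since the commit message only says the counters are used to compute them; `summarize_load_counts` is a hypothetical helper.

```python
def summarize_load_counts(num_rows_load_total,
                          num_rows_load_unselected,
                          num_rows_load_filtered):
    """Derive the dpp counters from the three row counts (assumed mapping)."""
    if not 0 <= num_rows_load_unselected <= num_rows_load_total:
        raise ValueError("unselected rows must be a subset of total rows")
    selected = num_rows_load_total - num_rows_load_unselected
    if not 0 <= num_rows_load_filtered <= selected:
        raise ValueError("filtered rows must be a subset of selected rows")
    return {
        "dpp.norm.ALL": selected - num_rows_load_filtered,
        "dpp.abnorm.ALL": num_rows_load_filtered,
    }


# 100 source rows, 10 rejected by the where conjuncts, 5 of the rest are bad.
print(summarize_load_counts(100, 10, 5))
```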
77a1b31baa Add show load of loadv2 (#1113)
This change includes the show load of loadv2 and some bug fixes for loadv2.
Firstly, show load displays the info of both load and loadv2. For loadv2, the ETL progress is N/A during the loading period.
Secondly, a loadv2 job is created when the version property is v2.
This is a temporary property which does not influence the old broker load.
After loadv2 is finished, the default load will be changed to loadv2.
Finally, some bugs in LoadingTaskPlanner are fixed by this change.
2019-05-09 10:27:30 +08:00
7699c76df2 Fix Nullpointer exception encountered in transaction process (#1112)
* Fix Nullpointer exception encountered in transaction process
* Do not choose unavailable BE when repair tablet
2019-05-08 20:30:34 +08:00
a08170fd50 Enhance the usabilities (#1100)
* Enhance the usabilities

1. Add metrics to monitor transactions and the streaming load process in BE.
2. Modify BE config 'result_buffer_cancelled_interval_time' to 300s.
3. Modify FE config 'enable_metric_calculator' to true.
4. Add more logs for tracing the broker load process.
5. Modify the query report process to cancel the query immediately if some instance failed.

* Fix bugs
1. Avoid NullPointer when enabling colocation join with broker load
2. Return immediately when pull load task coordinator execution failed
2019-05-07 15:55:04 +08:00
53ae591183 Fix pending_delta has no pending_segment_group (#1089) 2019-05-05 19:53:45 +08:00
6cc0457eaf Eliminate add transaction many times when pending_delta has many segmentgroups (#1081) 2019-04-30 18:40:41 +08:00
afa3aa9069 Add some pre-calculated metrics (#1079)
1. max io util of disks
2. max network send/receive bytes rate of all network devices
3. base/cumulative compaction request counter and failure counter
2019-04-30 11:12:23 +08:00
b2a022b348 Add money_format function (#1064) 2019-04-29 18:31:24 +08:00
310a375aec Fix bug that null value is not correctly handled when loading data (#1070)
When the partition column's value is NULL, it should be loaded into
    the partition which includes MIN VALUE
2019-04-29 13:55:28 +08:00
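The NULL handling described in 310a375aec can be sketched as a partition lookup. This is a toy model under stated assumptions, not the loader's code: partitions are modelled as half-open numeric ranges, MIN VALUE is modelled as negative infinity, and `find_partition` is a hypothetical function.

```python
MIN_VALUE = float("-inf")  # stand-in for the MIN VALUE lower bound


def find_partition(partitions, value):
    """Return the name of the range partition that should receive value.

    partitions: list of (name, lower, upper) with half-open ranges [lower, upper).
    A NULL (None) partition value is treated as MIN VALUE, so it lands in the
    partition whose range includes MIN VALUE, if there is one.
    """
    key = MIN_VALUE if value is None else value
    for name, lower, upper in partitions:
        if lower <= key < upper:
            return name
    return None  # no partition covers this value


partitions = [("p1", MIN_VALUE, 10), ("p2", 10, 20)]
print(find_partition(partitions, None))
print(find_partition(partitions, 15))
```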