Commit Graph

583 Commits

Author SHA1 Message Date
d8cfbbedf7 Support bitmap_empty function (#2227) 2019-11-18 20:37:00 +08:00
626001fae4 Fix bug on IS NOT NULL predicate with a batch without null data (#2219)
After a column is added, a DefaultValueReader is created for that column.
With an IS NOT NULL predicate, BE will core dump because of a null pointer.
2019-11-17 22:54:29 +08:00
59e9027f76 Fix bug that timeout does not take effect in stream load (#2217) 2019-11-16 22:29:55 +08:00
c5ce72215d Optimize tablet report with expired transaction. (#2215)
When there are lots of expired transactions on BE and a large
number of tablets, the report thread may become too slow, because it
has to iterate over the whole transaction map for each tablet.

But this is unnecessary. We should first build an expired transaction
map with 'tablet id' as the key. Then, for each tablet, we only need to
look up the expired transaction map once by tablet id, instead of
traversing the whole transaction map.
2019-11-15 23:03:21 +08:00
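A minimal sketch of the idea in c5ce72215d, assuming expired transactions can be represented as (txn_id, tablet_id) pairs. The function names and data shapes are hypothetical and are not taken from the Doris code base; the point is only that one pass builds the per-tablet map, after which each tablet needs a single lookup.

```python
from collections import defaultdict


def group_expired_by_tablet(expired_txns):
    """Group expired transaction ids by tablet id in one pass.

    expired_txns: iterable of (txn_id, tablet_id) pairs (hypothetical shape).
    """
    by_tablet = defaultdict(list)
    for txn_id, tablet_id in expired_txns:
        by_tablet[tablet_id].append(txn_id)
    return by_tablet


def build_report(tablet_ids, expired_txns):
    """Return {tablet_id: [expired txn ids]} with one lookup per tablet."""
    by_tablet = group_expired_by_tablet(expired_txns)
    return {tablet_id: by_tablet.get(tablet_id, []) for tablet_id in tablet_ids}
```

With T tablets and N expired transactions, this does O(N + T) work instead of scanning all N transactions once per tablet.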
22d0987e97 Set cumulative point upon no suitable rowsets (#2204) 2019-11-14 22:16:42 +08:00
8b7735f7e9 Fix npe when segment writer init failed (#2202) 2019-11-14 21:40:18 +08:00
20e0344033 Fix condition npe bug (#2200) 2019-11-14 14:13:49 +08:00
b187c0881c Fix bug of null safe equal join (#2193) 2019-11-14 08:52:48 +08:00
3dcb8c991c Make RowBatch compatible with old version (#2190)
The len field of StringValue was changed from int to int64. This causes
an invalid StringValue length when deserializing a RowBatch sent from
Doris 0.10, which in turn leads to a memory allocation failure and makes
BE crash.
2019-11-13 23:26:26 +08:00
35b2800542 Keep num_of_columns_from_file compatible with 0.10 protocol (#2187)
After checking, I found that broker load in 0.11 added the num_of_columns_from_file parameter in thrift. BE does not handle this parameter in a compatible way,
so broker load could cause BE to crash during the upgrade.
2019-11-13 22:04:15 +08:00
8063353429 Fix failure when concurrently creating the same directory (#2176) (#2185) 2019-11-13 20:59:25 +08:00
11872d5cf6 Send clear txn task explicitly after a transaction is aborted (#2182) 2019-11-13 11:22:45 +08:00
b9c7f6e5ac Fix create path bug (#2177) 2019-11-12 12:46:54 +08:00
d0316d158d Refactor and reorganize the file utils (#2089) 2019-11-11 20:25:41 +08:00
068eed8eb0 Add delete state of row block v2 for performance (#2055) 2019-11-11 20:07:37 +08:00
c92de36bec Add ext_unix_timestamp for date < 1970-01-01 and > 2038-01-19 (#2161) 2019-11-08 21:19:26 +08:00
9ea14b83bb Remove failed UT (#2165) 2019-11-08 16:48:32 +08:00
42395d2455 Change Null-safe equal operator from cross join to hash join (#2156)
* Change Null-safe equal operator from cross join to hash join
ISSUE-2136

This commit changes the join method from cross join to hash join when the equal operator is the Null-safe '<=>'.
It improves the speed of queries that use the Null-safe equal operator.
The finds_nulls field records whether each equal operator is Null-safe.
finds_nulls[i] being true means that the i-th equal operator is Null-safe.
The equal function in the hash table returns true if both val and loc are NULL and finds_nulls[i] is true.
2019-11-08 12:43:48 +08:00
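The comparison rule described in 42395d2455 can be sketched as follows. This is an illustration only, assuming join keys are tuples and NULL is modeled as None. Only the finds_nulls flag list comes from the commit message; keys_equal and hash_join are hypothetical names and do not mirror the BE hash table code.

```python
def keys_equal(build_key, probe_key, finds_nulls):
    """Compare two join keys column by column.

    finds_nulls[i] is True when the i-th equality is the null-safe '<=>'.
    NULL (None) matches NULL only for null-safe columns.
    """
    for build_val, probe_val, null_safe in zip(build_key, probe_key, finds_nulls):
        if build_val is None or probe_val is None:
            if not (null_safe and build_val is None and probe_val is None):
                return False
        elif build_val != probe_val:
            return False
    return True


def hash_join(build_rows, probe_rows, finds_nulls):
    """Hash the build side once, then probe each bucket with keys_equal."""
    buckets = {}
    for row in build_rows:
        buckets.setdefault(hash(row), []).append(row)
    matches = []
    for probe in probe_rows:
        for build in buckets.get(hash(probe), []):
            if keys_equal(build, probe, finds_nulls):
                matches.append((build, probe))
    return matches
```

Each probe row is compared only against rows in its own bucket, which is where the speedup over a cross join comes from.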
89dc461f91 Fix UT and remove unused code (#2160) 2019-11-08 08:47:48 +08:00
c25e826dce Fix default value column bug (#2134) 2019-11-07 19:06:24 +08:00
cfc98e3571 Fix string type column zone map bug (#2144)
The segment zone map for string type columns is wrong, so segments are filtered incorrectly.
2019-11-07 15:57:38 +08:00
188d97c215 Add null bit verification for row_batch transformation (#2139) 2019-11-07 14:05:23 +08:00
f14cdacfd1 Fix single column read bug (#2122) 2019-11-07 10:24:02 +08:00
78a4270457 Fix in predicate bug (#2132) 2019-11-05 20:27:22 +08:00
0046eecb0a Refactor OwnedSlice (#2126) 2019-11-05 20:09:17 +08:00
65c3b0907a Support aggregation type of REPLACE_IF_NOT_NULL (#2127)
Some users require that only some columns are updated in one load
operation, while the others retain their original values. However, Doris
can't handle this situation, because the user must specify a value for
all columns. So if a column's aggregation method is REPLACE, the user
must query the original value in order to overwrite it. This often
takes extra work on the user's side.

If this CL is applied, the user can use REPLACE_IF_NOT_NULL instead of
REPLACE. Then, when loading data into a table, if the user doesn't intend
to change the value of this column, the user can specify NULL for it.
Doris will retain the original value for this column.
2019-11-05 18:08:34 +08:00
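The difference between the two aggregation types in 65c3b0907a can be sketched as below, assuming NULL is modeled as None and a table is a plain dict from key to a list of value columns. The helper names are hypothetical; this is not how Doris implements aggregation internally.

```python
def replace(old, new):
    """REPLACE: the newly loaded value always wins, even if it is NULL."""
    return new


def replace_if_not_null(old, new):
    """REPLACE_IF_NOT_NULL: keep the stored value when the new one is NULL."""
    return old if new is None else new


def load_row(table, key, values, aggregators):
    """Merge one loaded row into the table, one aggregator per value column."""
    stored = table.get(key)
    if stored is None:
        table[key] = list(values)
    else:
        table[key] = [agg(old, new) for agg, old, new in zip(aggregators, stored, values)]
```

For example, with aggregators [replace_if_not_null, replace], loading ["beijing", 1] and then [None, 2] under the same key leaves ["beijing", 2] stored, so the first column did not have to be queried and resent.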
ccc1b9d98c Optimize percentile_approx through radix sort (#2102) (#2107) 2019-11-05 09:25:47 +08:00
e1a8f9d30f Segment v2 stream load core dump (#2037) (#2075)
[STORAGE]
1 Fix a memory leak when calling string builder.get_dictionary_page;
2 Fix deletion of an invalid memory address in bitshuffleBuilder when no array growth happens.
When bitshuffleBuilder did not grow its array, a data page that was not allocated with new is
returned to ColumnWriter.
When ColumnWriter destructs, the data page is deleted, which causes a core dump.
2019-11-01 22:52:58 +08:00
713e04624f Modify the lower bound of percentile_approx compression param to 2048 (#2111) 2019-11-01 13:07:39 +08:00
45df6aae08 Fix some routine load bugs (#2093)
Mainly fix the following issues:

1. A null pointer exception is raised when a database or table is dropped. The expected behavior is that the routine load job is stopped.

2. Memory leaks. Batch routine load task submissions are no longer performed, and modifications are submitted separately for each task.

3. Unreasonable task timeout.
    Routine load tasks should not be queued in the BE thread pool for execution. A task sent to the BE should be executed immediately; otherwise the task in the FE will time out first, which eventually leads to constant timeouts for all subsequent tasks.

4. Every routine load job should be scheduled as soon as it is submitted, without waiting for an available BE slot. Otherwise, jobs submitted later may never be scheduled.
2019-10-31 21:53:03 +08:00
95a3b4ccfe Add object type (#1948)
Add a new type: Object. Currently, it's mainly for complex aggregate metrics (HLL, Bitmap).

The Object type has the following constraints:
1 Object type cannot be used as a key column type
2 Object type doesn't support any index (BloomFilter, short key, zone map, inverted index)
3 Object type doesn't support filter and group by

In the implementation:

The Object type reuses StringValue and StringVal, because in the storage engine the Object type is binary: it has a pointer and a length.
2019-10-31 21:42:58 +08:00
5e8c96f28b Optimize FE start logic (#2052) 2019-10-31 11:11:50 +08:00
f53f188c5d Add arrow IPC serialization for Doris-Spark-Connector (#2013) 2019-10-31 10:32:06 +08:00
6b4ef34162 Fix AlphaRowsetTest by removing StorageEngine #2078 (#2091) 2019-10-30 19:39:41 +08:00
0a0da8292f Fix BE could not start (#2104) 2019-10-30 18:53:39 +08:00
b006d58f5c Fix SegmentIterator lost data when there are multiple RowRanges (#2092) 2019-10-30 12:27:50 +08:00
2ae54250e7 Fix null stats when beta rowset schema change (#2085)
BetaRowsetReader's _context->stats is null when schema change calls next_block
2019-10-28 22:15:33 +08:00
ebdcfc21df Multi distinct + no group by + big data is stuck (#2079)
ISSUE-2069: This kind of query could get stuck.
The sender failed to send the last packet to the receiver.
Also, the failure is not reported to FE, so the query is not cancelled.
The error log looks like "body_size=xxxx from xxx:xxx is too large".
The cause is that the query's packet is too big: it is larger than the max_body_size of brpc.

This commit adds a config named brpc_max_body_size which is used to change the max_body_size of brpc.
Also, the user can change max_body_size directly on the fly via "http://host:brpc_port/flags".
2019-10-28 18:51:05 +08:00
9408ad67e9 Fix predicate error when reading BetaRowset (#2067) 2019-10-27 12:12:41 +08:00
13fde9fce3 Add stats to BetaRowsetReader (#2074) 2019-10-27 12:06:39 +08:00
52a176b229 Remove stats in SchemaChange (#2071) 2019-10-25 19:25:18 +08:00
b6e3725c5d Fix bug that tablet failed to be committed when no data is loaded (#2064) 2019-10-25 16:36:35 +08:00
189e08faa5 Replace NewStatus with Status (#2046) 2019-10-24 22:48:59 +08:00
78bf825e73 Optimize the convert of row block v2 to v1 #2011 (#2058)
Use MemPool exchange to avoid string copy
Use batch convert to replace row by row
2019-10-24 22:36:30 +08:00
0bcfddab92 Remove clear_alter_task (#2056)
Alter task has been refactored and clear_alter_task is not necessary.
2019-10-24 18:57:14 +08:00
e3c39a192c Fix schema change core dump because of null stats (#2049) 2019-10-23 23:06:29 +08:00
d33e1693b0 Initialize DeltaWriter lazily (#2044)
The delta writer is initialized only when load data is actually passed to
it. Otherwise, there will be lots of unnecessary transaction adding
and removing on BE.
2019-10-23 18:51:38 +08:00
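A rough sketch of the lazy initialization described in d33e1693b0, assuming that initializing a writer is what registers a transaction. TxnRegistry and LazyDeltaWriter are hypothetical stand-ins, not the real DeltaWriter interface.

```python
class TxnRegistry:
    """Hypothetical stand-in that counts transaction registrations."""

    def __init__(self):
        self.added = 0


class LazyDeltaWriter:
    """Defers the costly setup until the first row arrives."""

    def __init__(self, registry):
        self._registry = registry
        self._initialized = False
        self._rows = []

    def _init(self):
        # The expensive part: only reached when there is data to write.
        self._registry.added += 1
        self._initialized = True

    def write(self, row):
        if not self._initialized:
            self._init()
        self._rows.append(row)

    def close(self):
        """Return the number of rows written; a writer that saw no data did no setup."""
        return len(self._rows) if self._initialized else 0
```

A writer that is opened and closed without receiving any rows never touches the registry, which is the saving the commit message describes.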
9bc2325c6a Fix incorrect scan bytes in metrics (#2034) 2019-10-23 18:13:40 +08:00
e6bd1855e2 fix default compaction rowset type bug (#2042) 2019-10-23 11:08:14 +08:00
d25f0ba69a Make ColumnReader load lazily (#2026)
[Storage][SegmentV2]
Currently `segment_v2::Segment::open` will eagerly initialize all column readers, regardless of whether the column is queried or not. Initializing `segment_v2::ColumnReader` incurs additional I/O cost to read ordinal index and zonemap index and should be delayed to the time it's needed.
2019-10-23 10:25:28 +08:00
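The lazy loading in d25f0ba69a can be illustrated with a per-column cache, assuming that constructing a reader is what triggers the index I/O. The Python classes below are hypothetical and only echo the names in the commit message; io_log stands in for the ordinal index and zonemap index reads.

```python
class ColumnReader:
    """Hypothetical reader whose construction stands in for index I/O."""

    def __init__(self, name, io_log):
        io_log.append(name)  # records that this column's indexes were read
        self.name = name


class Segment:
    """Opens without creating readers; each reader is built on first use."""

    def __init__(self, column_names):
        self._column_names = set(column_names)
        self._readers = {}
        self.io_log = []

    def reader(self, name):
        if name not in self._column_names:
            raise KeyError(name)
        if name not in self._readers:
            self._readers[name] = ColumnReader(name, self.io_log)
        return self._readers[name]
```

Opening a segment with columns a, b, and c and querying only b reads the indexes for b once and never touches a or c.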