18263 Commits

Author SHA1 Message Date
78d7a8f315 Add Apache license header in config files (#2081) (#2110) 2019-10-31 17:21:32 +08:00
e7d6bbd336 Fix explain InsertStmt NPE in FOLLOWER node (#2097) 2019-10-31 14:10:43 +08:00
5618baebbc Modify Copyright in NOTICE file to 2018-2019 (#2080) (#2109) 2019-10-31 14:05:40 +08:00
5e8c96f28b Optimize FE start logic (#2052) 2019-10-31 11:11:50 +08:00
03d384ac51 Add .rat_excludes file, and modify related documents (#2031) (#2105) 2019-10-31 10:34:22 +08:00
f53f188c5d Add arrow IPC serialization for Doris-Spark-Connector (#2013) 2019-10-31 10:32:06 +08:00
6b4ef34162 Fix AlphaRowsetTest by removing StorageEngine #2078 (#2091) 2019-10-30 19:39:41 +08:00
5287bc2231 Replace DISCLAIMER with DISCLAIMER-WIP (#2100) 2019-10-30 19:06:21 +08:00
0a0da8292f Fix BE could not start (#2104) 2019-10-30 18:53:39 +08:00
8d2cc71934 Format markdown of docker section (#2098)
[DOC]
This change corrects the format so that it's easier to view.
2019-10-30 16:52:45 +08:00
b006d58f5c Fix SegmentIterator lost data when there are multiple RowRanges (#2092) 2019-10-30 12:27:50 +08:00
6fd63a8f3c Add the cast function for if function in outer join (#2087)
[QUERY]
If the type of the function differs from the type of the expr, the query returns an incorrect result.

Example:
  the type of expr is date
  the type of function is int
  So, the upper fragment will receive an int value instead of a date, while the result expr is date.
  If there is no cast function, the result of the query will be incorrect.
2019-10-29 11:07:17 +08:00
2ae54250e7 Fix null stats when beta rowset schema change (#2085)
BetaRowsetReader's _context->stats is null when schema change calls next_block
2019-10-28 22:15:33 +08:00
5e3ba03b52 Awareness of Backend down when loading data (#2076) 2019-10-28 20:18:44 +08:00
ebdcfc21df Multi distinct + no group by + big data is stuck (#2079)
ISSUE-2069: This kind of query could get stuck.
The sender failed to send the last packet to the receiver.
Also, the failure is not reported to FE, so the query is not cancelled.
The error log looks like "body_size=xxxx from xxx:xxx is too large".
The cause is that the packet of the query is too big: it is larger than the max_body_size of brpc.

This commit adds a config named brpc_max_body_size which is used to change the max_body_size of brpc.
Also, users can change the max_body_size directly on-the-fly via "http://host:brpc_port/flags".
2019-10-28 18:51:05 +08:00
9408ad67e9 Fix predicate error when reading BetaRowset (#2067) 2019-10-27 12:12:41 +08:00
13fde9fce3 Add stats to BetaRowsetReader (#2074) 2019-10-27 12:06:39 +08:00
1859819aa7 Update doc for FE metadata recover (#2073) 2019-10-25 22:27:41 +08:00
52a176b229 Remove stats in SchemaChange (#2071) 2019-10-25 19:25:18 +08:00
b6e3725c5d Fix bug that tablet failed to be committed when no data is loaded (#2064) 2019-10-25 16:36:35 +08:00
189e08faa5 Replace NewStatus with Status (#2046) 2019-10-24 22:48:59 +08:00
78bf825e73 Optimize the convert of row block v2 to v1 #2011 (#2058)
Use MemPool exchange to avoid string copy
Use batch convert to replace row by row
2019-10-24 22:36:30 +08:00
78a5a84e06 Remove drop repository name toLowerCase (#2060)
Repository's name is case sensitive
2019-10-24 20:06:13 +08:00
0bcfddab92 Remove clear_alter_task (#2056)
Alter task has been refactored and clear_alter_task is not necessary.
2019-10-24 18:57:14 +08:00
4848c94262 Fix bug that unable to add bloom filter columns (#2054) 2019-10-24 14:08:52 +08:00
e3c39a192c Fix schema change core dump because of null stats (#2049) 2019-10-23 23:06:29 +08:00
d33e1693b0 Initialize DeltaWriter lazily (#2044)
The delta writer is initialized only when loading data is actually passed
to it. Otherwise, there will be lots of unnecessary transaction adding
and removing on BE.
2019-10-23 18:51:38 +08:00
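The lazy initialization described in d33e1693b0 can be sketched as follows. This is only an illustration in Python, not the Doris C++ code: `LazyDeltaWriter`, `open_txn`, `write` and `close` are hypothetical names, and the sketch assumes the costly setup is a single callable.

```python
class LazyDeltaWriter:
    """Hypothetical stand-in for a delta writer; not the Doris class."""

    def __init__(self, open_txn):
        # open_txn is an assumed callable that performs the costly setup
        # (for example, registering a transaction).
        self._open_txn = open_txn
        self._initialized = False
        self.rows = []

    def write(self, row):
        # Run the setup on the first write only, so a writer that never
        # receives data never triggers it.
        if not self._initialized:
            self._open_txn()
            self._initialized = True
        self.rows.append(row)

    def close(self):
        # Report whether setup ever ran; if it did not, there is
        # nothing to tear down.
        return self._initialized
```

A writer that is created and closed without any `write` call never invokes `open_txn`, which is the saving the commit message describes.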
9bc2325c6a Fix incorrect scan bytes in metrics (#2034) 2019-10-23 18:13:40 +08:00
06fe8579d2 Update release process documents (#2008) 2019-10-23 16:20:46 +08:00
e6bd1855e2 fix default compaction rowset type bug (#2042) 2019-10-23 11:08:14 +08:00
d25f0ba69a Make ColumnReader load lazily (#2026)
[Storage][SegmentV2]
Currently `segment_v2::Segment::open` will eagerly initialize all column readers, regardless of whether the column is queried or not. Initializing `segment_v2::ColumnReader` incurs additional I/O cost to read ordinal index and zonemap index and should be delayed to the time it's needed.
2019-10-23 10:25:28 +08:00
0f94b685ab Add ES7.x compatibility for doris on es (#2033) 2019-10-22 17:23:33 +08:00
9c2d149c36 add profile for segment v2 (#2015) 2019-10-22 09:43:16 +08:00
6634051359 Make default rowset type to config (#2020) 2019-10-21 21:44:00 +08:00
8aa2cbe12d Load Rowset only once in a thread-safe manner (#2022)
[Storage]
This PR implements thread-safe `Rowset::load()` for both AlphaRowset and BetaRowset. The main changes are:

1. Introduce `DorisCallOnce<ReturnType>` to be the replacement for `DorisInitOnce`. It works for both Status and OLAPStatus.
2. `segment_v2::ColumnReader::init()` is now implemented by DorisCallOnce.
3. `segment_v2::Segment` is now created by a factory open() method. This guarantees all Segment instances are in opened state.
4. `segment_v2::Segment::_load_index()` is now implemented by DorisCallOnce.
5. Implement thread-safe load() for AlphaRowset and BetaRowset
2019-10-21 16:05:12 +08:00
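The "run once, thread-safely" idea behind 8aa2cbe12d can be sketched as follows. This is an assumption-laden illustration in Python, not the `DorisCallOnce` implementation: the `CallOnce` class, its `call` method, and the choice to cache and return the first result are all hypothetical.

```python
import threading


class CallOnce:
    """Hypothetical call-once helper; not the Doris DorisCallOnce class."""

    def __init__(self):
        self._lock = threading.Lock()
        self._done = False
        self._result = None

    def call(self, fn):
        # Fast path: once the first call has finished, skip the lock.
        if self._done:
            return self._result
        with self._lock:
            # Re-check under the lock so only one thread runs fn.
            if not self._done:
                self._result = fn()
                self._done = True
        return self._result
```

Every caller, whether it ran `fn` or arrived later, gets the result of the single execution, so a status returned by the first load can be handed to all threads.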
58c882fa2a Remove SchemaChangeV1 (#2014) 2019-10-21 15:07:28 +08:00
751a219f0a Add the unchecked cast from date literal to others (#2021)
Fix the ISSUE:2017
This commit enables the cast function for date.
A date literal can be cast to any target type that is implicitly castable, such as int, bigint, largeint.
2019-10-21 13:57:50 +08:00
109eb79f19 Add help doc for debug tool (#2019) 2019-10-20 22:58:03 +08:00
05643dc403 Replace Arena with MemPool (#2012)
After replacing Arena with MemPool, we can achieve one copy for string
values read from segment v2. We can exchange MemPool's chunk between
RowBlockV2 and RowBlock. This change only replaces Arena; the exchange
will be done in another change list.
2019-10-19 15:53:24 +08:00
292273be2e Fix string bug in segment v2 (#2005) 2019-10-18 15:53:01 +08:00
05c05cfc83 Add apache arrow IPC module for Spark-Doris-Connector usage (#1958)
Add FLATBUFFERS to TP archive
2019-10-18 14:27:19 +08:00
3f325e001a Change the priority of different type in function (#2003)
This commit fixes the issue [ISSUE-2002].
It changes the priority of the coalesce, ifnull, nullif functions, etc.
The priority of decimal is higher than varchar in the IS_SUPERTYPE_OF compare mode.

Example:
select coalesce(decimal_column, 1) from table;
    the return type of coalesce should be decimal instead of varchar.

Add supertype about datetime and date
The supertype of datetime is bigint, largeint etc.
In IS_SUPERTYPE_OF compare mode, the function(bigint, bigint, bigint) is a supertype of function(datetime, bigint, int).

Example:
select coalesce(now(), 1) from web_returns;
    the return type of coalesce should be bigint instead of varchar.
2019-10-18 09:35:49 +08:00
c3b5046940 Fix bug of invalid stream load task rollback (#1999)
If a stream load is committed with result PUBLISH_TIMEOUT, the transaction should
not be rolled back; the message should only be returned to the user.
2019-10-17 21:08:29 +08:00
d9bb494d7f Fix bug that insert stmt with label return label already exist. (#2006) 2019-10-17 20:00:12 +08:00
4f7cc7e033 add predicate filter(#1652) (#1775) 2019-10-17 19:20:00 +08:00
3bca253fb3 Fix beta rowset read slow (#1994)
[Bug][BetaRowset] Fix beta rowset reading slowly with limit

Beta rowset does not update raw_rows_read in statistics and will read all
data in the tablet when querying with limit, which leads to a long query time.
2019-10-17 19:19:46 +08:00
3c12af4dcc Limit the memory consumption of broker scan node (#1996)
If memory exceeds the limit, no more row batches will be pushed to the batch queue.
2019-10-17 14:40:16 +08:00
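The memory cap described in 3c12af4dcc can be sketched as a queue that refuses new batches once a byte budget is used up. This is a hedged illustration in Python, not the broker scan node code: `BatchQueue`, `try_push`, `pop` and `mem_limit_bytes` are hypothetical names, and the sketch assumes a push is refused when it would take usage past the limit.

```python
from collections import deque


class BatchQueue:
    """Hypothetical memory-capped batch queue; not the Doris class."""

    def __init__(self, mem_limit_bytes):
        self.mem_limit_bytes = mem_limit_bytes
        self.used_bytes = 0
        self._batches = deque()

    def try_push(self, batch: bytes) -> bool:
        # Refuse the batch if accepting it would exceed the budget;
        # the producer is expected to retry after the consumer drains.
        if self.used_bytes + len(batch) > self.mem_limit_bytes:
            return False
        self._batches.append(batch)
        self.used_bytes += len(batch)
        return True

    def pop(self) -> bytes:
        # Consuming a batch frees its share of the budget.
        batch = self._batches.popleft()
        self.used_bytes -= len(batch)
        return batch
```

Rejecting the push rather than blocking keeps the sketch single-threaded; a real scanner would more likely wait until the consumer frees memory.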
ac16318c9b [Bug-fix][Broker-load] Fix the bug of the label already exists when the txn has been finished (#1992)
If FE is restarted between txn committed and visible, the load job will be rescheduled and will fail with "label already exists".
The reason is that the transaction of the load job and the meta of the load job are inconsistent.
So, the replay of the txn attachment needs to be done in the function replayOnCommitted.
The load job state and progress are correct after that.
2019-10-16 16:35:18 +08:00
d2bc47d2cc Add introduction of label_keep_max_second (#1993)
[Docs]
2019-10-16 16:05:13 +08:00
41e55cfca9 Modify fixed partition feature (#1989)
1. MAXVALUE is not supported in multi partition column.
2. Fix the incorrect show create table stmt.
2019-10-16 16:03:46 +08:00