Commit Graph

458 Commits

Author SHA1 Message Date
e8da855cd2 Support setting timezone for stream load and routine load (#1831) 2019-09-20 07:55:05 +08:00
720808fda5 Remove config::max_file_descriptor_number (#1833) 2019-09-20 07:50:57 +08:00
315f762523 Seek block when starts a ScanKey (#1828)
In Doris, one block has 1024 rows.
If the block is not sought, the position of the prefix short-key columns is wrong when both of these conditions hold:
1. The previous ScanKey scans rows across multiple blocks,
   and the final block holds exactly 1024 rows.
2. The current ScanKey scans fewer rows than one block.
2019-09-19 20:08:03 +08:00
aaabf97471 Split channel close operation into two phase (#1830)
In this change, channel close is split into two phases, so we can
close channels in parallel, which can make queries faster.
2019-09-19 18:14:30 +08:00
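
The two-phase close in #1830 can be illustrated with a minimal sketch. It is written in Python rather than Doris's C++, and the `Channel` class with its `close_start` and `close_wait` methods is a hypothetical stand-in, not Doris's API. Phase one asks every channel to begin closing without blocking; phase two waits for each one, so the waits overlap instead of running back to back.

```python
import threading
import time


class Channel:
    """Hypothetical channel whose close takes some time to complete."""

    def __init__(self, delay):
        self.delay = delay
        self.closed = False
        self._worker = None

    def _do_close(self):
        time.sleep(self.delay)  # stands in for flushing and the final RPC
        self.closed = True

    def close_start(self):
        # Phase 1: kick off the close and return immediately.
        self._worker = threading.Thread(target=self._do_close)
        self._worker.start()

    def close_wait(self):
        # Phase 2: block until this channel's close has finished.
        self._worker.join()


def close_all(channels):
    for ch in channels:
        ch.close_start()
    for ch in channels:
        ch.close_wait()


channels = [Channel(0.05) for _ in range(4)]
close_all(channels)
```

With a single-phase close, each channel would be started and waited on in turn, so the total time would be roughly the sum of the individual delays.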
17e52a4bac Improve LRUCache to get better performance (#1826)
In this CL, I move the entry's deleter out of LRUCache's mutex block,
so that others can access the cache without waiting for cache entries to be freed.
2019-09-19 17:37:02 +08:00
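
The idea behind #1826 can be sketched as follows. This is a toy Python cache, not Doris's C++ LRUCache; the class layout and the `deleter` callback signature are assumptions made for illustration. Evicted values are collected while the lock is held and handed to the deleter only after the lock is released.

```python
import threading
from collections import OrderedDict


class LRUCache:
    """Toy LRU cache; `deleter` is called for every evicted value."""

    def __init__(self, capacity, deleter):
        self._capacity = capacity
        self._deleter = deleter
        self._entries = OrderedDict()
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:
            if key not in self._entries:
                return None
            self._entries.move_to_end(key)  # mark as most recently used
            return self._entries[key]

    def put(self, key, value):
        evicted = []
        with self._lock:
            if key in self._entries:
                evicted.append(self._entries.pop(key))
            self._entries[key] = value
            while len(self._entries) > self._capacity:
                _, old = self._entries.popitem(last=False)  # least recent
                evicted.append(old)
        # The lock is released here, so a slow deleter does not block
        # other threads that want to read or write the cache.
        for old in evicted:
            self._deleter(old)


freed = []
cache = LRUCache(capacity=2, deleter=freed.append)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")      # "b" is now the least recently used entry
cache.put("c", 3)   # evicts "b"; its deleter runs outside the lock
```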
d1676c3c3d Check file descriptor number is larger than 65536 upon start (#1819) 2019-09-19 12:48:36 +08:00
dc813e6c61 Limit the max version to cumulative compaction (#1813) 2019-09-17 14:10:05 +08:00
054a3f48bc Add where expr in broker load (#1812)
The where predicate in broker load is responsible for filtering transformed data.
The help and operator docs have been updated.
2019-09-17 11:32:40 +08:00
11eafe524f Add ChunkAllocator to accelerate chunk allocation (#1792)
I add ChunkAllocator in this CL to put unused memory chunks into a chunk
pool rather than return them to the system allocator. For now, only
MemPool's chunk allocation and free are switched to it.

Two configurations are introduced as well. 'chunk_reserved_bytes_limit'
is the limit on how many bytes this chunk pool can reserve in total, and
its default value is 2147483648 (2GB). 'use_mmap_allocate_chunk' controls
whether chunks are allocated via mmap, and its default value is false.

In my test case with the default configuration and a simple query like
"select * from table limit 10", this improves throughput from 280 QPS
to 650 QPS. When I set 'chunk_reserved_bytes_limit' to 0,
which disables the pool, the throughput is the same as the original's.
2019-09-13 08:27:24 +08:00
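
A minimal sketch of the pooling idea described in #1792, in Python rather than Doris's C++. The `ChunkPool` class and its methods are hypothetical; only the meaning of the reserve limit (mirroring 'chunk_reserved_bytes_limit', with 0 disabling the pool) is taken from the commit message. How the real ChunkAllocator sizes or partitions its pool is not modelled here.

```python
class ChunkPool:
    """Toy chunk pool keyed by chunk size."""

    def __init__(self, reserved_bytes_limit):
        self._limit = reserved_bytes_limit
        self._reserved = 0
        self._free = {}  # chunk size -> list of idle chunks

    def allocate(self, size):
        idle = self._free.get(size)
        if idle:
            self._reserved -= size
            return idle.pop()      # reuse a pooled chunk
        return bytearray(size)     # fall back to the system allocator

    def free(self, chunk):
        size = len(chunk)
        if self._reserved + size > self._limit:
            return                 # pool is full: let the chunk be released
        self._free.setdefault(size, []).append(chunk)
        self._reserved += size

    @property
    def reserved_bytes(self):
        return self._reserved


pool = ChunkPool(reserved_bytes_limit=8192)
first = pool.allocate(4096)
pool.free(first)
second = pool.allocate(4096)      # served from the pool

disabled = ChunkPool(reserved_bytes_limit=0)
chunk = disabled.allocate(4096)
disabled.free(chunk)              # nothing is retained when the limit is 0
```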
9aa2045987 Refactor alter job (#1695) 2019-09-12 16:31:29 +08:00
dad4def708 Support estimate size for v2 segment writer (#1787) 2019-09-12 15:15:39 +08:00
f58a222da7 Fix bug that the calculation of disk usage percent is wrong (#1791)
This bug may make it impossible to load data.
2019-09-12 14:37:20 +08:00
348e2129b7 Initialize tablet uid not using default constructor for performance reason (#1795) 2019-09-12 12:59:16 +08:00
b327643132 Fix bug that failed to limit the mem usage of HLL column when loading (#1778)
Should use arena to allocate mem for HyperLogLog column.
2019-09-11 10:20:46 +08:00
5a12a1d7df Fix compile error (#1780) 2019-09-10 23:48:42 +08:00
bf373758b2 Make CpuInfo::get_current_core work (#1773) 2019-09-10 19:35:55 +08:00
235cdb0ecd Commit kafka offset (#1734)
Commit kafka offset in routine load

Kafka will decide whether to delete data based on whether all consumer groups have committed their offsets. If no offset is committed, the Kafka server's disk may fill up.
2019-09-10 14:27:06 +08:00
40a11c41a9 Fix BE crash when schema changing with HLL column (#1772) 2019-09-10 13:59:38 +08:00
5653822298 Write magic number in footer instead of header (#1771) 2019-09-10 09:54:13 +08:00
044489b92f Optimize some kinds of load jobs (#1762)
1. Support specifying label to Insert Into stmt.

    INSERT INTO tbl1 WITH LABEL label1 ...;

2. Return the job's state corresponding to the existing label in the result of stream load.

    ...
    "Status": "Label Already Exists",
    "ExistingJobStatus": "FINISHED"
    ...

3. Return the 2000 most recent transactions in SHOW PROC '/transactions'
2019-09-09 22:11:12 +08:00
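
A client could use the two fields quoted in #1762 to decide whether a duplicate-label stream load needs to be resubmitted. The sketch below is an assumption about client-side handling, not part of Doris; the `load_already_done` helper is hypothetical, and the "RUNNING" value in the second sample is an assumed state used only for contrast.

```python
import json


def load_already_done(response_text):
    """Return True if the label was reused and its earlier job finished."""
    result = json.loads(response_text)
    return (
        result.get("Status") == "Label Already Exists"
        and result.get("ExistingJobStatus") == "FINISHED"
    )


# Only the two fields quoted in the commit message are included here.
sample = '{"Status": "Label Already Exists", "ExistingJobStatus": "FINISHED"}'
duplicate_ok = load_already_done(sample)

# "RUNNING" is an assumed state value, used only for illustration.
still_running = load_already_done(
    '{"Status": "Label Already Exists", "ExistingJobStatus": "RUNNING"}'
)
```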
0f44ce99ce Fix segment v2 comment (#1769) 2019-09-09 18:26:48 +08:00
cd5cfea5cc Encapsulate HLL logic (#1756) 2019-09-09 15:52:10 +08:00
a349409838 Move compare from RowCursor to row (#1764) 2019-09-09 14:51:13 +08:00
ca23b7a511 Should create init rowset for alter task v2 (#1767) 2019-09-09 13:59:19 +08:00
5acdeee4d2 Assign schema_size from other Schema (#1768) 2019-09-09 13:50:14 +08:00
fd2937360c Get rid of external_sorting when rowsets have already been filtered (#1760) 2019-09-08 21:31:12 +08:00
981e0feb99 Check rowset is useful atomically (#1750)
* Check rowset is useful atomically

* Only release rowset id when it is added to unused rowset

* Remove release of rowset id when saving rowset meta
2019-09-06 17:21:42 +08:00
65dcabf1df Use crc32c checksum for segment v2 (#1753) 2019-09-06 15:23:57 +08:00
54fd3652e6 Fix bug in BetaRowsetReader which results in empty result (#1754) 2019-09-06 15:07:23 +08:00
da69812f65 Fix compile error (#1749) 2019-09-05 22:30:41 +08:00
3f22238012 Add check for to_bitmap function argument (#1747) 2019-09-05 18:11:38 +08:00
85940a292b RowsetFactory as a single entry for Rowset creation (#1748) 2019-09-05 18:10:18 +08:00
0dc0dadad1 Reduce unnecessary memory allocation and copy in OlapScanNode (#1742) 2019-09-04 21:05:12 +08:00
726509e9b9 Add MIN/MAX aggregate function compatible with char/varchar (#1739) 2019-09-04 17:28:27 +08:00
a63989cc61 Use RowsetFactory to create and init RowsetWriter (#1740) 2019-09-04 17:02:43 +08:00
03b399150e Not add alter task to tablet in alter tablet request v2 (#1741) 2019-09-03 21:34:52 +08:00
f76dad289e Basic implementation for BetaRowsetReader (#1718) 2019-09-03 13:52:16 +08:00
9f5e5717d4 Unify the msg of 'Memory exceed limit' (#1737)
The new message when the limit is exceeded: "Memory exceed limit. %msg, Backend:%ip, fragment:%id Used:% , Limit:%. xxx".
This commit unifies the 'Memory exceed limit' messages emitted by check_query_state, RETURN_IF_LIMIT_EXCEEDED and LIMIT_EXCEEDED.
2019-09-03 10:42:16 +08:00
a80e9996a6 Move version to high 8 bit (#1736) 2019-09-02 19:43:04 +08:00
b4f6f755f1 Add exchange in MemPool to reduce alloc/free operation (#1732)
Reuse allocated chunks during storage read operations.
2019-09-02 19:29:30 +08:00
8034d83e20 Add scroll keepalive and http timeout configuration (#1731) 2019-09-02 19:04:30 +08:00
6f4feca3dc Add rowset id generator to FE and BE (#1678) 2019-09-02 18:51:31 +08:00
81ca3e3abf Free olap scanner out of lock (#1733)
Close the scanner outside OlapScanner's batch lock;
closing it inside the lock makes all scanners wait for one scanner to finish.
2019-09-02 16:49:28 +08:00
76987275b9 Fix result of unix_timestamp() (#1727) 2019-08-30 21:39:16 +08:00
206f5394ee Limit V2 segment file row count (#1647) (#1705) 2019-08-30 18:54:37 +08:00
3a33f3d350 Make bitmap_union agg column support insert into and broker load (#1721) 2019-08-30 14:44:51 +08:00
378ce8ca04 Use double when converting TIME type value (#1722)
TIME type value is saved in DOUBLE, so using int64 can extend the time range.
2019-08-29 21:19:19 +08:00
ecbdfc2cee Avoid consistency problem when has no more data (#1716) 2019-08-29 18:57:49 +08:00
056a9fada3 fix delete bug (#1720) 2019-08-29 10:42:04 +08:00
c541c3fd59 Fix bug that failed to get enough normal replica because path hash is not set. (#1714)
The path hash of a replica in metadata should be set immediately after the replica is created.
We should also not depend on the path hash to find replicas, because it may be set
late.
2019-08-28 19:37:38 +08:00