doris

Author	SHA1	Message	Date
kangkaisen	b246d93128	Avoid SerDe for aggregation query with object pool (#1854 )	2019-09-26 13:51:13 +08:00
lichaoyong	7df1418ff4	Check transaction_id in TClearTransactionTaskRequest (#1872 )	2019-09-26 10:15:43 +08:00
yangzhg	5d1165fad2	Fix direct compilation failed #1862 (#1875 ) Fix direct compilation failures: (1) compiling thirdparty on Ubuntu installs libs to the lib dir instead of lib64; (2) compile error in gcc5 due to a C++11 defect (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60970); (3) the gcc version check does not work on some OSes.	2019-09-26 09:34:41 +08:00
Mingyu Chen	f3bbdfe7d3	Fix bug that load statistic in show load result is incorrect (#1871 ) Each load job has several load tasks, and each task is a query plan with several plan fragments. Each plan fragment reports its query profile independently, so we need to collect each plan fragment's report separately.	2019-09-25 22:56:59 +08:00
EmmyMiao87	ce6fb1cfba	Fix bug: broker load not support inline function in hll_hash (#1873 ) hll_hash should support the inline function in broker load and should not support the inline function in hadoop load.	2019-09-25 22:00:02 +08:00
lichaoyong	09482c9f52	Take segments in singleton rowset into consideration upon cumulative compaction (#1866 ) Previously, compaction took only rowsets into consideration. During streaming load, a singleton rowset may be made up of many overlapping segments. Scanning these overlapping segments results in read amplification. To address this problem, overlapping segments should be taken into consideration when doing cumulative compaction to reduce read amplification.	2019-09-25 15:27:44 +08:00
wubiao	e43f1a2766	Fix NPE error when creating table with bool column (#1864 )	2019-09-25 14:40:13 +08:00
wubiao	eb840ecca8	Support boolean/date/datetime/decimal types in segment V2 (#1863 )	2019-09-25 13:53:00 +08:00
Mingyu Chen	c643cbd30c	Optimize the load performance for large files (#1798 ) The current load process is: Tablet Sink -> Tablet Channel Mgr -> Tablets Channel -> Delta Writer -> MemTable -> Flush to disk. In the path Tablets Channel -> DeltaWriter -> MemTable -> Flush to disk, the following operations are performed: (1) tuples are inserted into different memtables according to tablet ID; (2) when a memtable's size reaches the threshold, it is written to disk. For a single load task, these operations amount to single-threaded execution. In fact, memtable insertion and memtable flush can run concurrently, and running them on separate threads prevents insertion from being delayed by slow disk writes. In the new implementation, I added a MemTableFlushExecutor class with a set of flush queues and corresponding worker threads. By default, each data directory uses two worker threads for flush, which can be modified by the BE parameter flush_thread_num_per_store. DeltaWriter pushes the full memtable to MemTableFlushExecutor for flushing and creates a new memtable to receive new data. This design improves the performance of loading large files. In single-host testing, the time to load a 1GB text file dropped from 48 seconds to 29 seconds.	2019-09-25 13:49:32 +08:00
xionglei0	dd02382abd	Check buckets limit: buckets > 0 when adding partition (#1855 )	2019-09-25 13:02:09 +08:00
shgxwxl	c2de62d6a1	Collect scanner's status when es_http_scan_node close (#1861 )	2019-09-25 12:20:13 +08:00
HangyuanLiu	40b9c3571b	Support hll_empty function (#1825 )	2019-09-25 09:28:02 +08:00
kangpinghuang	533a2e0f94	Optimize memory usage in wrapper field #1852 (#1853 )	2019-09-25 09:25:54 +08:00
wubiao	0b15d26b6c	Fix segment V2 estimate size inaccuracy (#1858 )	2019-09-24 20:13:15 +08:00
kangpinghuang	8d0fee7e64	Add default value column iterator #1834 (#1835 )	2019-09-24 14:39:10 +08:00
kangpinghuang	fe27969978	Add delete predicate filter (#1636 ) (#1745 ) Delete predicates can be used to prune data by zone map.	2019-09-24 14:38:19 +08:00
xionglei0	b756dfd90b	Fix bug: compare column with equals rather than == (#1850 )	2019-09-24 09:40:11 +08:00
ZHAO Chun	c3fccb7a49	Support cast datetime to decimal (#1849 )	2019-09-23 19:56:20 +08:00
EmmyMiao87	fded13e3cd	Fix bug: Enable StringLiteral cast to Varchar (#1846 ) StringLiteral could be cast to VARCHAR or CHAR. The default value of lead and lag function could be 'String' when the column type is CHAR or VARCHAR.	2019-09-23 18:42:25 +08:00
EmmyMiao87	4c7b52d077	Fix bug: Remove conjuncts for empty set node (#1840 ) The function named assign conjuncts is invoked before the aggregation plan node is created. If the empty set node is the child of the aggregation node, the conjuncts are assigned to the empty set node, which cannot be executed correctly in the Backend. It throws the exception "couldn't resolve slot descriptor" for a query that has both an empty set node and an aggregation node. For example: select sum(pv) from test where type != 1 and 1=0 group by type; This commit fixes the bug by removing conjuncts for the empty set node.	2019-09-23 15:09:04 +08:00
ZHAO Chun	93fe10a268	Reduce size of HyperLogLog struct (#1845 ) The HyperLogLog struct is currently so large that rowsets become too small when ingesting data. In this CL, registers in HyperLogLog are created only when they are needed. When ingesting data, it is normal for a HyperLogLog to hold only a few values.	2019-09-21 14:38:58 +08:00
HangyuanLiu	74d6d04e01	Fix two digit year bug in to_days function (#1839 )	2019-09-20 22:59:05 +08:00
kangkaisen	9036014954	Add schema change check for DUPLICATE KEY table (#1844 )	2019-09-20 22:33:08 +08:00
wubiao	cc36905aea	Fix write file crash when using segment V2 in debug mode (#1841 )	2019-09-20 20:37:29 +08:00
ZHAO Chun	abd27dfcca	Remove unused debug (#1836 )	2019-09-20 09:31:56 +08:00
Mingyu Chen	e8da855cd2	Support setting timezone for stream load and routine load (#1831 )	2019-09-20 07:55:05 +08:00
Mingyu Chen	7bf02d0ae7	Fix bug that routine load may mistakenly skip some data (#1832 ) Reproduce: 1. Start a routine load and send a routine load task to BE. 2. BE executes the task successfully and commits to FE. 3. The commit request fails on FE because the database was renamed (a db not found exception is thrown). 4. After the commit fails, BE sends a rollback request to FE. 5. FE receives this rollback request and mistakenly updates the routine load progress, because the number of loaded rows in the rollback request's attachment is larger than 0.	2019-09-20 07:54:11 +08:00
lichaoyong	720808fda5	Remove config::max_file_descriptor_number (#1833 )	2019-09-20 07:50:57 +08:00
lichaoyong	315f762523	Seek block when starting a ScanKey (#1828 ) In Doris, one block has 1024 rows. 1. The previous ScanKey scans rows across multiple blocks, and the final block has exactly 1024 rows. 2. The current ScanKey scans fewer rows than one block. When both conditions hold and the block is not sought, the position of the prefix shortkey columns is wrong.	2019-09-19 20:08:03 +08:00
ZHAO Chun	aaabf97471	Split channel close operation into two phases (#1830 ) In this change, channel close is split into two phases, so channels can be closed in parallel, which makes queries faster.	2019-09-19 18:14:30 +08:00
ZHAO Chun	17e52a4bac	Improve LRUCache to get better performance (#1826 ) In this CL, I move the entry's deleter out of LRUCache's mutex block, which lets other threads access the cache without waiting for cache entries to be freed.	2019-09-19 17:37:02 +08:00
xy720	e516eba940	Remove the "author" tag (#1829 )	2019-09-19 16:59:08 +08:00
lichaoyong	d1676c3c3d	Check file descriptor number is larger than 65536 upon start (#1819 )	2019-09-19 12:48:36 +08:00
Mingyu Chen	e70e48c01e	Add an ALTER operation to change distribution type from RANDOM to HASH (#1823 ) Random distribution is no longer supported since version 0.9, so we need a way to convert random distribution to hash distribution: ALTER TABLE db.tbl SET ("distribution_type" = "hash");	2019-09-18 14:16:26 +08:00
Mingyu Chen	714dca8699	Support table comment and column comment for view (#1799 )	2019-09-18 09:45:28 +08:00
Mingyu Chen	3f63bde5cb	Fix 'Invalid Column Name' error when loading parquet file (#1820 )	2019-09-17 21:17:55 +08:00
Mingyu Chen	c4e28f0d13	Update FeConstants meta version to VERSION_62 (#1822 ) This should be modified along with commit a232a56c0	2019-09-17 17:30:22 +08:00
lichaoyong	dc813e6c61	Limit the max version to cumulative compaction (#1813 )	2019-09-17 14:10:05 +08:00
EmmyMiao87	054a3f48bc	Add where expr in broker load (#1812 ) The where predicate in broker load is responsible for filtering transformed data. The help and operator docs have been changed accordingly.	2019-09-17 11:32:40 +08:00
ZHAO Chun	ede51da777	Resolve reduce/reduce conflict in our syntax (#1811 )	2019-09-16 20:25:05 +08:00
WingC	973eff26cd	Fix tablet meta tool command argument bug (#1810 )	2019-09-16 17:40:23 +08:00
xionglei0	a232a56c06	Add parallel_exchange_instance_num to set parallel after exchange (#1788 )	2019-09-16 16:41:14 +08:00
Mingyu Chen	86feddb5d7	Fix bug that deadlock may happen when dropping a table during the alter table process (#1800 ) The cancel() function tries to get the database's write lock, while its caller may already hold the database's read lock.	2019-09-16 00:12:00 +08:00
kangkaisen	dcea6daf4f	Fix Cluster meta write error (#1802 )	2019-09-13 22:06:55 +08:00
ZHAO Chun	11eafe524f	Add ChunkAllocator to accelerate chunk allocation (#1792 ) In this CL, I add ChunkAllocator to put unused memory chunks into a chunk pool rather than returning them to the system allocator. For now, only MemPool's chunk allocation and free are changed to use it. Two configurations are introduced as well. 'chunk_reserved_bytes_limit' is the limit on how many bytes the chunk pool can reserve in total; its default value is 2147483648 (2GB). 'use_mmap_allocate_chunk' controls whether chunks are allocated via mmap; its default value is false. In my test case with the default configuration and a simple query like "select * from table limit 10", this improves throughput from 280 QPS to 650 QPS. When I set 'chunk_reserved_bytes_limit' to 0, which disables the pool, the throughput is the same as the original.	2019-09-13 08:27:24 +08:00
Mingyu Chen	9aa2045987	Refactor alter job (#1695 )	2019-09-12 16:31:29 +08:00
wubiao	dad4def708	Support estimate size for v2 segment writer (#1787 )	2019-09-12 15:15:39 +08:00
Mingyu Chen	f58a222da7	Fix bug that the calculation of disk usage percent is wrong (#1791 ) This bug may prevent data from being loaded.	2019-09-12 14:37:20 +08:00
EmmyMiao87	c354f30767	Fix mistake in docs (#1796 )	2019-09-12 14:15:06 +08:00
yiguolei	348e2129b7	Initialize tablet uid not using default constructor for performance reason (#1795 )	2019-09-12 12:59:16 +08:00
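
Note on c643cbd30c: the commit message describes DeltaWriter handing each full memtable to a MemTableFlushExecutor backed by flush queues and worker threads, so that insertion is not blocked by disk writes. The following is only a minimal Python sketch of that hand-off, not the Doris implementation (which lives in the BE). The class bodies, the `memtable_limit` parameter, and the in-memory `flushed` list standing in for disk are all assumptions made for illustration; only the class names and the default of two worker threads come from the message.

```python
import queue
import threading


class MemTableFlushExecutor:
    """Hypothetical sketch: one flush queue drained by worker threads."""

    def __init__(self, num_threads=2):  # the message cites two threads per data directory
        self._queue = queue.Queue()
        self._lock = threading.Lock()
        self.flushed = []  # stands in for segments written to disk
        self._workers = [
            threading.Thread(target=self._run, daemon=True)
            for _ in range(num_threads)
        ]
        for worker in self._workers:
            worker.start()

    def submit(self, memtable):
        # Called by the writer; returns immediately without waiting for the flush.
        self._queue.put(memtable)

    def _run(self):
        while True:
            memtable = self._queue.get()
            if memtable is None:  # shutdown sentinel
                self._queue.task_done()
                return
            with self._lock:
                self.flushed.append(sorted(memtable))  # "flush" = sort and persist
            self._queue.task_done()

    def close(self):
        # One sentinel per worker, then wait for all queued flushes to finish.
        for _ in self._workers:
            self._queue.put(None)
        self._queue.join()
        for worker in self._workers:
            worker.join()


class DeltaWriter:
    """Hypothetical sketch: buffers rows and hands full memtables to the executor."""

    def __init__(self, executor, memtable_limit=3):
        self._executor = executor
        self._limit = memtable_limit
        self._memtable = []

    def write(self, row):
        self._memtable.append(row)
        if len(self._memtable) >= self._limit:
            self._executor.submit(self._memtable)
            self._memtable = []  # fresh memtable receives new data

    def close(self):
        if self._memtable:
            self._executor.submit(self._memtable)
            self._memtable = []


executor = MemTableFlushExecutor(num_threads=2)
writer = DeltaWriter(executor, memtable_limit=3)
for row in [5, 1, 4, 2, 3, 9, 7]:
    writer.write(row)
writer.close()
executor.close()
total_rows = sum(len(m) for m in executor.flushed)
```

The writer never waits on a flush; only `executor.close()` blocks until every queued memtable has been processed. The order in which memtables land in `flushed` depends on thread scheduling.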
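
Note on 17e52a4bac: the commit message says the entry's deleter was moved out of LRUCache's mutex block so that other threads do not wait while entries are freed. The following is a minimal Python sketch of that pattern only, not the Doris code. The `LRUCache` class shown here, its `insert`/`lookup` methods, and the `deleter` callback signature are hypothetical names chosen for illustration.

```python
import threading
from collections import OrderedDict


class LRUCache:
    """Hypothetical sketch: evicted entries are freed after the mutex is released."""

    def __init__(self, capacity, deleter):
        self._capacity = capacity
        self._deleter = deleter  # potentially slow cleanup callback
        self._entries = OrderedDict()
        self._mutex = threading.Lock()

    def insert(self, key, value):
        evicted = []
        with self._mutex:
            # Only bookkeeping happens while the mutex is held.
            if key in self._entries:
                evicted.append((key, self._entries.pop(key)))
            self._entries[key] = value
            while len(self._entries) > self._capacity:
                evicted.append(self._entries.popitem(last=False))
        # The deleter runs outside the mutex, so lookups are not blocked by it.
        for old_key, old_value in evicted:
            self._deleter(old_key, old_value)

    def lookup(self, key):
        with self._mutex:
            if key not in self._entries:
                return None
            self._entries.move_to_end(key)  # mark as most recently used
            return self._entries[key]


freed = []
cache = LRUCache(
    capacity=2,
    # Record whether the mutex was held at the time the deleter ran.
    deleter=lambda k, v: freed.append((k, v, cache._mutex.locked())),
)
cache.insert("a", 1)
cache.insert("b", 2)
cache.lookup("a")     # "a" becomes most recently used
cache.insert("c", 3)  # evicts "b", the least recently used entry
```

Collecting evicted entries into a local list under the lock and freeing them afterwards keeps the critical section short, at the cost of entries outliving their removal from the map by a brief moment.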

17549 Commits