Commit Graph

529 Commits

Author SHA1 Message Date
ec7c8a2c6f Support adding fixed range partition
e.g.: ALTER TABLE test_table ADD PARTITION p0125 VALUES [("20190125"), ("20190126"));
2019-10-15 09:50:30 +08:00
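The bracket notation in the example for commit ec7c8a2c6f, `[("20190125"), ("20190126"))`, reads as a left-closed, right-open interval. The following Python sketch shows that reading only. The function name `in_fixed_range` is hypothetical, and comparing keys as tuples of strings is an assumption made for illustration, not how Doris compares partition keys internally.

```python
def in_fixed_range(key, lower, upper):
    """Return True if key falls in the left-closed, right-open range [lower, upper).

    Hypothetical helper: keys are modelled as tuples of strings (an assumption).
    """
    return lower <= key < upper


# Bounds taken from the example statement for partition p0125.
lower, upper = ("20190125",), ("20190126",)

print(in_fixed_range(("20190125",), lower, upper))  # lower bound is included
print(in_fixed_range(("20190126",), lower, upper))  # upper bound is excluded
```

Under this reading, adjacent daily partitions can share a boundary value without overlapping.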
62acf5d098 Limit the memory usage of Loading process (#1954) 2019-10-15 09:26:20 +08:00
ccc236484b Fix bug that failed to add KEY column to DUPLICATE KEY table (#1973) 2019-10-14 16:40:34 +08:00
463b462b8d Add create_time to information_schema.tables 2019-10-12 21:45:14 +08:00
ce236bfcd4 add alter table modify limit: Cannot change DATETIME to DATE (#1963) 2019-10-12 19:11:17 +08:00
bbb3fdef8c Fix bug that OlapTableSink use invalid column as distribution column for RANDOM distribution table. (#1956)
RANDOM distribution was deprecated long ago; this change is only for compatibility and a bug fix.
2019-10-11 20:07:25 +08:00
4a17152f40 Add tdigest compression param for percentile_approx function (#1939) 2019-10-11 18:56:59 +08:00
e267d031bb Enhance the speed of avg function (#1889)
This commit enables the avg operator in FE instead of converting the avg function into sum/count.
It also fixes a decimalv2 avg bug that caused a core dump in BE:
the int128 value could not be assigned directly.

After this enhancement, the speed of the avg function is similar to that of the sum function.
2019-10-10 22:43:46 +08:00
d46fc59cc3 Add send_clear_alter_tasks operation
ALTER TABLE tbl SET ("send_clear_alter_tasks" = "true");
2019-10-09 22:32:48 +08:00
1c99e88fc0 Invalid hash value of DateLiteral (#1933) 2019-10-08 11:07:02 +08:00
0d729b1191 Filter empty strings of properties in file fe.conf (#1932) 2019-10-07 23:21:05 +08:00
0072712c80 Add address reuse option for http server (#1915)
Avoid accidental test failures caused by BindException (Address already in use).
2019-09-30 20:34:10 +08:00
c8abdf8989 Fix length equal restrict in schema change (#1921) 2019-09-30 20:32:32 +08:00
8f016d3ab2 Make HLL be able to handle invalid data (#1908)
In this change list:
1. Validate the HLL column when loading data; if the data is invalid, the row
will be filtered.
2. Treat an invalid type of HLL data as empty HLL when serializing; with
this change, all ingested data will be valid.
3. Treat nullptr or an invalid type of HLL data as empty HLL when deserializing.
With this change, dirty data can be handled normally.
4. Rename function empty_hll to hll_empty.
5. Disable memtable_flush_execute_test because it sometimes fails.
During teardown, some threads are not joined, and they may
access destroyed resources, which is invalid.
2019-09-29 10:55:23 +08:00
58f1d79597 Make batchEndId default value to zero instead (#1907) 2019-09-28 23:12:59 +08:00
bdd9c31766 Remove default value for HLL column (#1901)
1. Change HLL columns to have no default value (#1901)
2. Don't allow insert stmt to insert default values into Doris, except hll_empty
2019-09-28 11:19:25 +08:00
de8f273217 Add hardware info in fe httpserver home page #1894 (#1896) 2019-09-28 11:17:08 +08:00
e67b398916 Fix bug that backup may create an empty file on remote storage. (#1869)
Sometimes the broker writer fails to close, but this failure was not handled.
This may create an empty file on remote storage that is treated as normal.

Also improve usability:
1. Get the latest 2000 transactions instead of the earliest.
2. Show the backend on which download and upload tasks are being executed.
2019-09-28 00:11:43 +08:00
b970290ae4 Reduce memory usage of View object (#1878) 2019-09-26 14:57:46 +08:00
f3bbdfe7d3 Fix bug that load statistic in show load result is incorrect (#1871)
Each load job has several load tasks, and each task is a query plan
with several plan fragments. Each plan fragment reports its query profile
independently.
So we need to collect each plan fragment's report separately.
2019-09-25 22:56:59 +08:00
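The idea in commit f3bbdfe7d3 of collecting each plan fragment's report separately can be sketched in Python as follows. This is an illustration, not the FE code: the class `LoadStatistic`, its methods, and the `scanned_rows` counter are hypothetical names, and the sketch assumes each fragment reports a cumulative counter so that a later report replaces the earlier one.

```python
class LoadStatistic:
    """Hypothetical per-job statistic keyed by (task_id, fragment_id)."""

    def __init__(self):
        # (task_id, fragment_id) -> latest cumulative row count (assumed cumulative)
        self._rows = {}

    def update(self, task_id, fragment_id, scanned_rows):
        # Keep one entry per fragment so reports from different fragments
        # never overwrite each other.
        self._rows[(task_id, fragment_id)] = scanned_rows

    def total(self):
        # The job-level number is the sum over all fragments of all tasks.
        return sum(self._rows.values())


stat = LoadStatistic()
stat.update("task1", "frag1", 100)
stat.update("task1", "frag2", 50)
stat.update("task1", "frag1", 150)  # newer report from the same fragment
print(stat.total())  # 200
```

Keying by task alone would let one fragment's report overwrite another's, which is one way a job-level statistic can end up incorrect.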
ce6fb1cfba Fix bug: broker load not support inline function in hll_hash (#1873)
hll_hash should support the inline function in broker load and should not support the inline function in hadoop load.
2019-09-25 22:00:02 +08:00
e43f1a2766 Fix NPE error when creating table with bool column (#1864) 2019-09-25 14:40:13 +08:00
c643cbd30c Optimize the load performance for large file (#1798)
The current load process is:

Tablet Sink -> Tablet Channel Mgr -> Tablets Channel -> Delta Writer -> MemTable -> Flush to disk

In the path of Tablets Channel -> DeltaWriter -> MemTable -> Flush to disk, the following operations are performed:

1. Insert tuples into different memtables according to tablet ID.
2. When a memtable's size reaches the threshold, write it to disk.

For a single load task, the above operations are effectively executed in a single thread.
In fact, inserting into a memtable and flushing a memtable can be executed asynchronously.
Performing these operations in separate threads prevents memtable insertion from being delayed by slow disk writes.

In the new implementation, I added a MemTableFlushExecutor class with a set of flush queues and corresponding worker threads.
By default, each data directory uses two worker threads for flush, which can be modified by the parameter flush_thread_num_per_store of BE.
DeltaWriter will push the full memtable to MemTableFlushExecutor for flush operation and generate a new memtable for receiving new data.

This design can improve the performance of loading large files.
In single host testing, the time to load a 1GB text file is reduced from 48 seconds to 29 seconds.
2019-09-25 13:49:32 +08:00
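The design described in commit c643cbd30c can be illustrated with a minimal Python sketch. It is not the BE implementation. The class names mirror the commit message, while `push`, `shutdown`, `flush_fn`, `max_rows`, the row-count threshold, and the round-robin choice of queue are assumptions made for illustration.

```python
import queue
import threading


class MemTableFlushExecutor:
    """Illustrative only: a set of flush queues, each drained by one worker thread."""

    def __init__(self, num_stores, threads_per_store=2):
        self._threads_per_store = threads_per_store
        self._queues = [queue.Queue() for _ in range(num_stores * threads_per_store)]
        self._next_slot = [0] * num_stores
        self._lock = threading.Lock()
        self._workers = [
            threading.Thread(target=self._drain, args=(q,), daemon=True)
            for q in self._queues
        ]
        for worker in self._workers:
            worker.start()

    @staticmethod
    def _drain(q):
        while True:
            task = q.get()
            if task is None:  # shutdown marker
                return
            memtable, flush_fn = task
            flush_fn(memtable)  # the slow "write to disk" step runs here

    def push(self, store_id, memtable, flush_fn):
        # Round-robin over the queues of one store (an assumed policy).
        with self._lock:
            slot = self._next_slot[store_id]
            self._next_slot[store_id] = (slot + 1) % self._threads_per_store
        self._queues[store_id * self._threads_per_store + slot].put((memtable, flush_fn))

    def shutdown(self):
        # Markers are queued behind pending tasks, so all flushes finish first.
        for q in self._queues:
            q.put(None)
        for worker in self._workers:
            worker.join()


class DeltaWriter:
    """Illustrative only: hands full memtables to the executor and keeps writing."""

    def __init__(self, executor, store_id, flush_fn, max_rows):
        self._executor = executor
        self._store_id = store_id
        self._flush_fn = flush_fn
        self._max_rows = max_rows  # row count stands in for the size threshold
        self._memtable = []

    def write(self, row):
        self._memtable.append(row)
        if len(self._memtable) >= self._max_rows:
            self._hand_off()

    def close(self):
        if self._memtable:
            self._hand_off()

    def _hand_off(self):
        # Push the full memtable and start a new one without waiting for the flush.
        self._executor.push(self._store_id, self._memtable, self._flush_fn)
        self._memtable = []


flushed = []
executor = MemTableFlushExecutor(num_stores=1)
writer = DeltaWriter(executor, store_id=0, flush_fn=flushed.append, max_rows=3)
for i in range(7):
    writer.write(i)
writer.close()
executor.shutdown()
print(len(flushed))  # 3 memtables were flushed
```

In this sketch, `write` returns as soon as the full memtable is queued, so insertion is not blocked while a worker thread performs the flush.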
dd02382abd Check buckets limit: buckets > 0 when adding partition (#1855) 2019-09-25 13:02:09 +08:00
40b9c3571b Support hll_empty function (#1825) 2019-09-25 09:28:02 +08:00
b756dfd90b Fix bug: compare column with equals rather than == (#1850) 2019-09-24 09:40:11 +08:00
c3fccb7a49 Support cast datetime to decimal (#1849) 2019-09-23 19:56:20 +08:00
fded13e3cd Fix bug: Enable StringLiteral cast to Varchar (#1846)
StringLiteral could be cast to VARCHAR or CHAR.
The default value of lead and lag function could be 'String' when the column type is CHAR or VARCHAR.
2019-09-23 18:42:25 +08:00
4c7b52d077 Fix bug: Remove conjuncts for empty set node (#1840)
The function that assigns conjuncts is invoked before the aggregation plan node is created.
If the empty set node is the child of the aggregation node, the conjuncts are assigned to the empty set node, which cannot be executed correctly in Backend.
This throws the exception "couldn't resolve slot descriptor" for a query that has both an empty set node and an aggregation node.
For example: select sum(pv) from test where type != 1 and 1=0 group by type;

This commit fixes the bug by removing the conjuncts for the empty set node.
2019-09-23 15:09:04 +08:00
74d6d04e01 Fix two digit year bug in to_days function (#1839) 2019-09-20 22:59:05 +08:00
9036014954 Add schema change check for DUPLICATE KEY table (#1844) 2019-09-20 22:33:08 +08:00
e8da855cd2 Support setting timezone for stream load and routine load (#1831) 2019-09-20 07:55:05 +08:00
7bf02d0ae7 Fix bug that routine load may mistakenly skip some data (#1832)
Reproduce:
1. Start a routine load and send a routine load task to BE.
2. BE executes the task successfully and commits to FE.
3. The commit request fails on FE because the database has been renamed (a db not found exception is thrown).
4. After the commit fails, BE sends a rollback request to FE.
5. FE receives this rollback request and mistakenly updates the routine load progress,
   because the number of loaded rows in this rollback request's attachment is larger than 0.
2019-09-20 07:54:11 +08:00
e516eba940 Remove the "author" tag (#1829) 2019-09-19 16:59:08 +08:00
e70e48c01e Add an ALTER operation to change distribution type from RANDOM to HASH (#1823)
Random distribution is no longer supported since version 0.9.
And we need a way to convert the random distribution to hash distribution.

    ALTER TABLE db.tbl SET ("distribution_type" = "hash");
2019-09-18 14:16:26 +08:00
714dca8699 Support table comment and column comment for view (#1799) 2019-09-18 09:45:28 +08:00
3f63bde5cb Fix 'Invalid Column Name' error when loading parquet file (#1820) 2019-09-17 21:17:55 +08:00
c4e28f0d13 Update FeConstants meta version to VERSION_62 (#1822)
This should be modified along with commit a232a56c0
2019-09-17 17:30:22 +08:00
054a3f48bc Add where expr in broker load (#1812)
The where predicate in broker load is responsible for filtering transformed data.
The help and operator docs have been updated accordingly.
2019-09-17 11:32:40 +08:00
ede51da777 Resolve reduce/reduce conflict in our syntax (#1811) 2019-09-16 20:25:05 +08:00
a232a56c06 Add parallel_exchange_instance_num to set parallel after exchange (#1788) 2019-09-16 16:41:14 +08:00
86feddb5d7 Fix bug that deadlock may happen when dropping table during alter table process (#1800)
The cancel() function tries to acquire the database's write lock, while its caller may already
hold the database's read lock.
2019-09-16 00:12:00 +08:00
dcea6daf4f Fix Cluster meta write error (#1802) 2019-09-13 22:06:55 +08:00
9aa2045987 Refactor alter job (#1695) 2019-09-12 16:31:29 +08:00
a85ffa1c2a Fix FE log error (#1785) 2019-09-11 16:13:34 +08:00
044489b92f Optimize some kinds of load jobs (#1762)
1. Support specifying label to Insert Into stmt.

    INSERT INTO tbl1 WITH LABEL label1 ...;

2. Return the job's state corresponding to the existing label in the result of stream load.

    ...
    "Status": "Label Already Exists",
    "ExistingJobStatus": "FINISHED"
    ...

3. Return the recent 2000 transactions in SHOW PROC '/transactions'
2019-09-09 22:11:12 +08:00
8b663bf416 Fix bug: unknown column from the inline view (#1770)
Revert code from PR-1617. Columns that belong to an inline view need to be initialized by alias.
2019-09-09 20:57:42 +08:00
cd5cfea5cc Encapsulate HLL logic (#1756) 2019-09-09 15:52:10 +08:00
b85cb0071b Bug-fix: error result of union stmt (#1758)
ISSUES-1725: The result of a union stmt whose child is an outer join stmt is incorrect.

Example:
sql: (select k1 from empty) union all (select b.k1 k1 from left_table a left join empty b on a.k2 = b.k2);
context: the empty table has no data.
wrong result: 0
expected result: null

Reason:
The check for whether column k1 of the union tuple is nullable is incorrect.
It cannot be determined from the slot attribute of the children when the slot is produced by an outer join.
Slot A itself is not nullable, but the outer join output that corresponds to slot A is nullable.
So the check needs to consider whether the slot comes from an outer join.
2019-09-08 21:26:31 +08:00
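The reasoning in commit b85cb0071b can be sketched in Python as follows. This is an illustration, not the planner code: the `Slot` dataclass, the `from_outer_join` flag, and `union_slot_nullable` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Slot:
    name: str
    nullable: bool         # nullability declared on the source column
    from_outer_join: bool  # True if the slot is produced by the nullable side of an outer join


def union_slot_nullable(child_slots):
    """A union output slot is nullable if any child slot can yield NULL.

    A slot produced by an outer join can yield NULL even when its own
    nullable attribute is False, so both conditions are checked.
    """
    return any(s.nullable or s.from_outer_join for s in child_slots)


# Mirrors the example: k1 from `empty`, and b.k1 from the left join.
children = [
    Slot("k1", nullable=False, from_outer_join=False),
    Slot("b.k1", nullable=False, from_outer_join=True),
]
print(union_slot_nullable(children))  # True
```

Checking only the `nullable` attribute would report the union slot as not nullable, which matches the wrong result of 0 instead of null described in the commit message.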
f23ac0eadd Planner support push down predicates past agg, win and sort (#1471) 2019-09-08 09:30:46 +08:00