89 Commits

Author SHA1 Message Date
a323a190a2 Update monitor-alert.md (#1975) 2019-10-14 12:22:51 +08:00
4a17152f40 Add tdigest compression param for percentile_approx function (#1939) 2019-10-11 18:56:59 +08:00
024348d74b Enable automatic end-of-line conversion on check-in (#1926)
Leverage gitattributes to automatically convert end-of-line characters to LF
when checking in. Convert existing CRLF line endings to LF by removing all
files and checking them out again with the new .gitattributes file. Except for
.gitattributes itself, files are modified only at their line endings.
2019-10-09 22:31:27 +08:00
ec3aa03c45 Add more routine load example (#1902) 2019-09-27 20:42:52 +08:00
2ea7de8b5e Update some docs (#1882) 2019-09-26 14:43:55 +08:00
40b9c3571b Support hll_empty function (#1825) 2019-09-25 09:28:02 +08:00
e8da855cd2 Support setting timezone for stream load and routine load (#1831) 2019-09-20 07:55:05 +08:00
d1676c3c3d Check file descriptor number is larger than 65536 upon start (#1819) 2019-09-19 12:48:36 +08:00
e70e48c01e Add an ALTER operation to change distribution type from RANDOM to HASH (#1823)
Random distribution is no longer supported since version 0.9.
And we need a way to convert the random distribution to hash distribution.

    ALTER TABLE db.tbl SET ("distribution_type" = "hash");
2019-09-18 14:16:26 +08:00
714dca8699 Support table comment and column comment for view (#1799) 2019-09-18 09:45:28 +08:00
054a3f48bc Add where expr in broker load (#1812)
The where predicate in broker load is responsible for filtering transformed data.
The help and operator docs have been updated accordingly.
2019-09-17 11:32:40 +08:00
973eff26cd Fix tablet meta tool command argument bug (#1810) 2019-09-16 17:40:23 +08:00
9aa2045987 Refactor alter job (#1695) 2019-09-12 16:31:29 +08:00
c354f30767 Fix mistake in docs (#1796) 2019-09-12 14:15:06 +08:00
b327643132 Fix bug that failed to limit the mem usage of HLL column when loading (#1778)
Should use arena to allocate mem for HyperLogLog column.
2019-09-11 10:20:46 +08:00
044489b92f Optimize some kinds of load jobs (#1762)
1. Support specifying label to Insert Into stmt.

    INSERT INTO tbl1 WITH LABEL label1 ...;

2. Return the state of the job corresponding to the existing label in the result of stream load.

    ...
    "Status": "Label Already Exists",
    "ExistingJobStatus": "FINISHED"
    ...

3. Return the recent 2000 transactions in SHOW PROC '/transactions'
2019-09-09 22:11:12 +08:00
f87abd93c8 Modify the website (#1730)
1. Add Apache incubator disclaimer.
2. Add "Edit on Github" button on every page.
3. Add Committer Ling Miao.
4. Modify some English documents.
2019-08-31 19:49:44 +08:00
1164264e9d Add English version Doris website (#1729) 2019-08-30 22:07:24 +08:00
76987275b9 Fix result of unix_timestamp() (#1727) 2019-08-30 21:39:16 +08:00
3a33f3d350 Make bitmap_union agg column support insert into and broker load (#1721) 2019-08-30 14:44:51 +08:00
378ce8ca04 Use double when converting TIME type value (#1722)
TIME type value is saved in DOUBLE, so using int64 can extend the time range.
2019-08-29 21:19:19 +08:00
7a0c7f45b2 Add English documents for Doris (#1719)
The English documents were machine translated, so they may contain mistakes.
We will fix them later.
2019-08-29 13:47:15 +08:00
6865f4238b Add limit to show tablet stmt (#1547)
Also add some where predicates for filtering results
ISSUE #1687
2019-08-28 16:25:12 +08:00
7e981b2b14 Limit the disk usage to avoid running out of disk capacity (#1702)
Set high watermark and flood stage of disk used capacity.
And forbid some operations if disk usage is too high.
2019-08-27 22:18:17 +08:00
a1b92768dd Add a loaded rows in SHOW LOAD result (#1686)
Loaded rows are updated periodically by the query report, so that
users can see whether a load job is still running or is blocked.
2019-08-27 14:13:47 +08:00
1e4dd77d2a Add bitmap agg type and udaf (#1610) 2019-08-26 14:24:42 +08:00
b28f4242c3 Add config max_concurrent_task_num_per_be (#1693)
This config controls the maximum number of concurrent tasks per BE.
The cluster-wide maximum number of concurrent tasks = max_concurrent_task_num_per_be * number of BEs.
2019-08-24 00:56:40 +08:00
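As a rough illustration of the arithmetic stated in b28f4242c3, the sketch below multiplies the per-BE limit by the number of BEs. The function name and the sample values are hypothetical and are not taken from Doris.

```python
def cluster_max_concurrent_tasks(max_concurrent_task_num_per_be: int, be_count: int) -> int:
    """Cluster-wide limit = per-BE limit * number of BEs (formula from the commit message)."""
    if max_concurrent_task_num_per_be < 0 or be_count < 0:
        raise ValueError("both values must be non-negative")
    return max_concurrent_task_num_per_be * be_count


# Hypothetical values: a limit of 10 tasks per BE in a 3-BE cluster.
print(cluster_max_concurrent_tasks(10, 3))  # 30
```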
c73b3f15a4 Update tablet-repair-and-balance doc (#1692) 2019-08-22 21:31:56 +08:00
978b1ee1af Add strict mode in Routine load, Stream load and Mini load (#1677) 2019-08-20 21:56:45 +08:00
176e185e18 Add broker doc (#1662)
This broker document introduces the properties for different broker types.
2019-08-20 17:18:54 +08:00
8e6814cfcd Support setting timeout for stream load (#1670) 2019-08-20 15:43:03 +08:00
ccaf39c48f Fix spelling mistake (#1676) 2019-08-20 12:16:55 +08:00
ba6d728f26 Enable parsing columns from file path for Broker Load (#1582) (#1635)
Currently, we do not support parsing columns encoded in the file path, e.g. extracting column k1 from the file path /path/to/dir/k1=1/xxx.csv

This patch makes it possible to parse columns from the file path, as in Spark (Partition Discovery).

This patch parses partition columns in BrokerScanNode.java and saves the parsing result of each file path as a property of TBrokerRangeDesc, so that the broker reader of BE can read the value of the specified partition column.
2019-08-19 09:39:21 +08:00
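The path convention described in ba6d728f26 can be sketched as follows. This is only an illustration of extracting `key=value` directory segments from a path; it is not the code in BrokerScanNode.java, and the function name and error handling are assumptions.

```python
from typing import Dict, List


def parse_columns_from_path(file_path: str, wanted: List[str]) -> Dict[str, str]:
    """Extract `key=value` directory segments from a file path.

    Hypothetical helper: raises KeyError if a wanted column is absent.
    """
    found: Dict[str, str] = {}
    # Skip the last segment, which is the file name itself.
    for segment in file_path.split("/")[:-1]:
        key, sep, value = segment.partition("=")
        if sep and key in wanted:
            found[key] = value
    missing = [c for c in wanted if c not in found]
    if missing:
        raise KeyError(f"columns not found in path: {missing}")
    return found


print(parse_columns_from_path("/path/to/dir/k1=1/xxx.csv", ["k1"]))  # {'k1': '1'}
```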
6d73658207 Support checking error data row when doing INSERT (#1597)
If strict mode is true and at least one row is filtered, the insert operation fails and a URL is returned for retrieving the error rows.

```
ERROR 1064 (HY000): all partitions have no load data. url: http://host:ip/api/_load_error_log?file=__shard_2/error_log_insert_stmt_e0a620e93dc54461-b89ec64768367d25_e0a620e93dc54461_b89ec64768367d25
```

If all rows are good, the insert returns OK with the number of affected rows:

```
Query OK, 1 row affected (0.26 sec)
```

If strict mode is false and at least one row is good, the insert operation returns OK with affected rows and warnings. If there are error rows, a label is also returned:

```
Query OK, 1 row affected, 1 warning (0.32 sec)
{'label':'7d66c457-658b-4a3e-bdcf-8beee872ef2c'}
```
2019-08-16 21:40:29 +08:00
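The three outcomes described in 6d73658207 can be summarized in a small decision function. This is a sketch of the rules as worded in the commit message, not Doris code; the function name and the returned strings are hypothetical, and cases the message does not cover are reported as such rather than guessed.

```python
def insert_outcome(strict_mode: bool, good_rows: int, filtered_rows: int) -> str:
    """Summarize the INSERT result according to the commit message."""
    if strict_mode and filtered_rows > 0:
        # Strict mode: any filtered row fails the whole insert.
        return "ERROR (URL to error rows)"
    if filtered_rows == 0 and good_rows > 0:
        # All rows are good, regardless of strict mode.
        return f"OK, {good_rows} affected"
    if not strict_mode and good_rows > 0:
        # Non-strict mode with some filtered rows.
        return f"OK, {good_rows} affected, with warnings and a label"
    # E.g. no good rows at all: not spelled out in the commit message.
    return "not described in the commit message"


print(insert_outcome(True, 1, 1))
print(insert_outcome(False, 1, 0))
print(insert_outcome(False, 1, 1))
```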
82d0afc1ba FROM_UNIXTIME should only convert timestamp from 0 to 253402271999 (#1658)
This range corresponds to 1970-01-01 00:00:00 through 9999-12-31 23:59:59. For any other value, NULL is returned.
2019-08-16 18:29:57 +08:00
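The range check in 82d0afc1ba can be sketched as below. The bounds come from the commit title; formatting the result in UTC and ignoring any session time zone is an assumption made to keep the example self-contained, and the function name is hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional

MAX_TS = 253402271999  # upper bound quoted in the commit title
EPOCH = datetime(1970, 1, 1)


def from_unixtime_sketch(ts: int) -> Optional[str]:
    """Return a formatted datetime for in-range timestamps, else None (NULL)."""
    if ts < 0 or ts > MAX_TS:
        return None
    # Assumption: format in UTC, ignoring any session time zone.
    return (EPOCH + timedelta(seconds=ts)).strftime("%Y-%m-%d %H:%M:%S")


print(from_unixtime_sketch(0))           # 1970-01-01 00:00:00
print(from_unixtime_sketch(-1))          # None
print(from_unixtime_sketch(MAX_TS + 1))  # None
```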
0e6560ceca Fix document typo (#1657) 2019-08-16 14:52:32 +08:00
1ed25ad83d Add kafka_default_offsets for when no partition is specified
Support reading Kafka partitions from the start (#1642)
2019-08-16 13:30:26 +08:00
a551abba58 Modify timediff documents (#1600) 2019-08-15 12:45:53 +08:00
199ff968dc Fix time zone compatibility (#1631) 2019-08-13 18:44:35 +08:00
1e2a4c3b9b Fix tablet restore api in BE (#1623) (#1624) 2019-08-13 09:34:24 +08:00
69af50aa8c Time zone related BE function (#1598)
Details can be found in time-zone.md document
2019-08-12 20:57:59 +08:00
add6266c71 Broker load supports function (#1592)
This commit adds support for column functions in broker load.
The grammar of LoadStmt has not been changed.
Example:
columns terminated by ',' (tmp_c1, tmp_c2) set (c1=tmp_c1+tmp_c2)

The old functions, such as default_value and strftime, remain compatible.
After this commit, there is no difference in column functions between stream load and broker load, except for the old functions.
2019-08-09 13:27:31 +08:00
326d765c64 Add doc of modify replication num upon partition (#1611) 2019-08-08 16:47:32 +08:00
fd2accbcf9 Modify some docs' format to make it work with document website (#1604) 2019-08-08 14:47:38 +08:00
4c2a3d6da4 Merge Help document to documentation (#1586)
Consolidate the help documents into the main documentation.
2019-08-07 21:31:53 +08:00
93a3577baa Support multi partition column when creating table (#1574)
When creating a table with the OLAP engine, users can specify multiple partition columns.
e.g.:

PARTITION BY RANGE(`date`, `id`)
(
    PARTITION `p201701_1000` VALUES LESS THAN ("2017-02-01", "1000"),
    PARTITION `p201702_2000` VALUES LESS THAN ("2017-03-01", "2000"),
    PARTITION `p201703_all`  VALUES LESS THAN ("2017-04-01")
)

Note that loading via a Hadoop cluster does not support tables with multiple partition columns.
2019-08-05 16:16:43 +08:00
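To illustrate how a row might be routed with the multi-column partitions from 93a3577baa, the sketch below compares `(date, id)` keys against the upper bounds as tuples. The lexicographic comparison, and treating the omitted second bound of `p201703_all` as a minimum value, are assumptions about how the bounds are interpreted; the function name is hypothetical.

```python
from typing import List, Optional, Tuple

# Assumption: an omitted trailing bound is treated as the minimum value.
MIN_ID = float("-inf")

# Upper bounds copied from the PARTITION BY RANGE example, in order.
PARTITIONS: List[Tuple[str, Tuple[str, float]]] = [
    ("p201701_1000", ("2017-02-01", 1000)),
    ("p201702_2000", ("2017-03-01", 2000)),
    ("p201703_all", ("2017-04-01", MIN_ID)),
]


def find_partition(date: str, id_: int) -> Optional[str]:
    """Return the first partition whose upper bound is greater than the key."""
    key = (date, id_)
    for name, upper in PARTITIONS:
        # Tuples compare column by column; ISO dates compare correctly as strings.
        if key < upper:
            return name
    return None  # no partition covers this key


print(find_partition("2017-02-01", 999))   # p201701_1000
print(find_partition("2017-02-01", 1000))  # p201702_2000
print(find_partition("2017-03-15", 1))     # p201703_all
```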
cefe1794d4 Fix bug that replicas of a tablet may be located on same host (#1517)
Doris supports deploying multiple BEs on one host. So when allocating BEs for the
replicas of a tablet, we should select different hosts. But there is a bug in the
tablet scheduler: the same host may be selected more than once for one tablet. This patch fixes the problem.

There are some places related to this problem:

1. Create Table
    There is no bug in Create Table process.

2. Tablet Scheduler
    Fixed when selecting BE for REPLICA_MISSING and REPLICA_RELOCATING.
    Fixed when balancing the tablet.

3. Colocate Table Balancer
    Fixed when selecting BE for repairing colocate backend sequence.
    Not fixed in colocate group balance. This is left to colocate repairing.

4. Tablet report
    Tablet report may add a replica to the catalog. The host is not checked here;
    Tablet Scheduler will fix it.
2019-08-01 10:26:06 +08:00
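The constraint behind cefe1794d4, that replicas of one tablet should land on different hosts even when several BEs share a host, can be sketched as follows. This is not the tablet scheduler's actual selection logic; the function name, the BE ids, and the first-fit strategy are assumptions for illustration.

```python
from typing import Dict, List


def pick_backends(backends: Dict[str, str], replica_num: int) -> List[str]:
    """Pick `replica_num` BEs such that no two share a host.

    `backends` maps a (hypothetical) BE id to its host name.
    """
    if replica_num <= 0:
        raise ValueError("replica_num must be positive")
    chosen: List[str] = []
    used_hosts = set()
    for be_id, host in backends.items():
        if host in used_hosts:
            continue  # another replica already lives on this host
        chosen.append(be_id)
        used_hosts.add(host)
        if len(chosen) == replica_num:
            return chosen
    raise RuntimeError("not enough distinct hosts for the requested replica count")


# Two BEs share host h1, so only one of them may be chosen.
bes = {"be1": "h1", "be2": "h1", "be3": "h2", "be4": "h3"}
print(pick_backends(bes, 3))  # ['be1', 'be3', 'be4']
```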
99836f0d7c Modify load docs (#1558)
Make it work with documentation website
2019-07-29 15:48:59 +08:00
000e9cf53c Add administrator guide of load (#1488)
The catalogue of load docs:
---- load-manual.md
---- broker-load-manual.md
---- insert-into-manual.md
---- stream-load-manual.md

This commit also renames max/min_stream_load_timeout to max/min_load_timeout.
The old stream_load_timeout configs actually define the timeout limits for all types of load,
so the names have been changed to reflect that.
2019-07-25 21:02:32 +08:00
473d69e8f8 Fix the mistake in docs of rollup (#1551) 2019-07-25 20:53:41 +08:00