76 Commits

Author SHA1 Message Date
008e59476d Add curdate function doc (#2520) 2019-12-20 21:24:56 +08:00
6815979ba5 Fix invalid to_bitmap input leading to BE core (#2510) 2019-12-19 21:28:00 +08:00
c81b1db406 Support convert VARCHAR type to DATE type (#2489) 2019-12-18 12:58:47 +08:00
89003b774b Support Convert Varchar to INT (#2481) 2019-12-17 22:02:28 +08:00
55cb1cd1f1 Update date_format.md (#2476) 2019-12-16 20:43:55 +08:00
b20a76163b Update from_unixtime.md (#2475) 2019-12-16 19:39:54 +08:00
9244db40f7 Update bitmap doc (#2467) 2019-12-16 18:56:53 +08:00
bf31bd238b Change default storage model from aggregate to duplicate (#2318) (#2412)
Change the default storage model from aggregate to duplicate.
For the SQL statement `create table t (k1 int) DISTRIBUTED BY HASH(k1) BUCKETS 10 PROPERTIES("replication_num" = "1");`
Before:
```
 CREATE TABLE `t` (
  `k1` int(11) NULL COMMENT ""
) ENGINE=OLAP
AGGREGATE KEY(`k1`)
COMMENT "OLAP"
DISTRIBUTED BY HASH(`k1`) BUCKETS 10
PROPERTIES (
"storage_type" = "COLUMN"
);
```
After:
```
CREATE TABLE `t` (
  `k1` int(11) NULL COMMENT ""
) ENGINE=OLAP
DUPLICATE KEY(`k1`)
COMMENT "OLAP"
DISTRIBUTED BY HASH(`k1`) BUCKETS 10
PROPERTIES (
"storage_type" = "COLUMN"
);
```

#2318
2019-12-12 14:30:30 +08:00
5951a0eaea Add more schema change docs (#2411)
Add explanations about converting:

DATE -> DATETIME
DATETIME -> DATE
INT -> DATE
2019-12-10 16:46:41 +08:00
a46bf1ada3 [Authorization] Modify the authorization checking logic (#2372)
**Authorization checking logic**

The current password and permission checking logic has some problems. For example:
First, we create a user with:
`create user cmy@"%" identified by "12345";`

Now 'cmy' can log in with password '12345' from any host.

Second, we create another user with:
`create user cmy@"192.168.%" identified by "abcde";`

"192.168.%" has a higher priority in the permission table than "%". So when "cmy" tries
to log in with password "12345" from host "192.168.1.1", the login should match the second permission
entry and be rejected because the password is invalid.
But in the current implementation, Doris continues to check the password against the first entry and then lets it pass. So we should change it.

**Permission checking logic**

After a user logs in, it should have a unique identity, which is taken from the permission table. For example,
when "cmy" logs in from host "192.168.1.1", its identity should be `cmy@"192.168.%"`. Doris
should use this identity to check other permissions, rather than the user's real identity, which is
`cmy@"192.168.1.1"`.

**Black list**
Functionally speaking, Doris only supports adding a WHITE LIST, which allows a user to log in from
the hosts in the white list. But in some cases, we do need a BLACK LIST function.
Fortunately, by changing the logic described above, we can simulate the effect of a BLACK LIST.

For example, first we add a user with:
`create user cmy@'%' identified by '12345';`

Now user 'cmy' can log in from any host. If we don't want 'cmy' to log in from host A, we
can add a new user with:
`create user cmy@'A' identified by 'other_passwd';`

Because "A" has a higher priority in the permission table than "%", if 'cmy' tries to log in from A using password '12345', the login will be rejected.
2019-12-06 17:45:56 +08:00
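The matching rule described in commit a46bf1ada3 can be sketched in Python. This is a minimal illustration, not Doris code: `PermEntry`, `host_matches`, `specificity` and `login` are hypothetical names, and ranking entries by the number of literal characters in the host pattern is an assumption standing in for the real priority order of the permission table.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class PermEntry:
    user: str
    host_pattern: str  # '%' matches any run of characters
    password: str


def host_matches(pattern: str, host: str) -> bool:
    """Match a host against a pattern in which '%' is the only wildcard."""
    regex = "".join(".*" if ch == "%" else re.escape(ch) for ch in pattern)
    return re.fullmatch(regex, host) is not None


def specificity(entry: PermEntry) -> int:
    # Assumption: more literal characters in the pattern means higher priority.
    return len(entry.host_pattern.replace("%", ""))


def login(entries: List[PermEntry], user: str, host: str, password: str) -> Optional[str]:
    """Return the matched identity, or None if the login is rejected."""
    candidates = [e for e in entries if e.user == user and host_matches(e.host_pattern, host)]
    if not candidates:
        return None
    best = max(candidates, key=specificity)
    # Only the highest-priority entry is checked; there is no fallback
    # to a lower-priority entry when the password does not match.
    if best.password != password:
        return None
    return f'{best.user}@"{best.host_pattern}"'
```

With the two users from the commit message, `login(entries, "cmy", "192.168.1.1", "12345")` is rejected because only the `"192.168.%"` entry is consulted, which also gives the black-list effect described above.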
9fbc1c7ee6 Support where/orderby/limit after "SHOW ALTER TABLE COLUMN" syntax (#2380)
Features:
1. Support WHERE/ORDER BY/LIMIT
2. Columns: TableName, CreateTime, FinishTime, State
3. Only "AND" between conditions
4. TableName and State columns only support the "=" operator
5. CreateTime and FinishTime columns support the "=", ">=", "<=", ">", "<", "!=" operators
6. CreateTime and FinishTime columns support Date and DateTime strings, e.g. "2019-12-04" or "2019-12-04 17:18:00"

TestCase:
MySQL [haibotest]> show alter table column where State='FINISHED' and CreateTime > '2019-12-03' order by FinishTime desc limit 0,2;
+-------+---------------+---------------------+---------------------+---------------+---------+---------------+---------------+---------------+----------+------+----------+---------+
| JobId | TableName | CreateTime | FinishTime | IndexName | IndexId | OriginIndexId | SchemaVersion | TransactionId | State | Msg | Progress | Timeout |
+-------+---------------+---------------------+---------------------+---------------+---------+---------------+---------------+---------------+----------+------+----------+---------+
| 11134 | test_schema_2 | 2019-12-03 19:21:42 | 2019-12-03 19:22:11 | test_schema_2 | 11135 | 11059 | 1:192010000 | 3 | FINISHED | | N/A | 86400 |
| 11096 | test_schema_3 | 2019-12-03 19:21:31 | 2019-12-03 19:21:51 | test_schema_3 | 11097 | 11018 | 1:2063361382 | 2 | FINISHED | | N/A | 86400 |
+-------+---------------+---------------------+---------------------+---------------+---------+---------------+---------------+---------------+----------+------+----------+---------+
2 rows in set (0.00 sec)
2019-12-06 16:24:44 +08:00
e7b05f7eb3 Date format support java date style "yyyy-MM-dd HH:mm:ss" (#2309) 2019-11-28 14:34:31 +08:00
46181c0880 Fix some bugs about load label (#2241) 2019-11-23 00:04:45 +08:00
d8cfbbedf7 Support bitmap_empty function (#2227) 2019-11-18 20:37:00 +08:00
6759e83a07 Add license header for md files and fix some translation errors (#2137) 2019-11-06 21:35:07 +08:00
65c3b0907a Support aggregation type of REPLACE_IF_NOT_NULL (#2127)
Some users have the requirement that only some columns are updated in
one load operation, while the others retain their original values. However,
Doris can't handle this situation, because the user must specify a value for all
columns. So if a column's aggregation method is REPLACE, the user must query
the original value to overwrite it. This often requires extra work from the
user.

If this CL is applied, the user can use REPLACE_IF_NOT_NULL instead of
REPLACE. Then when loading data into a table, if the user doesn't intend to change
the value of this column, the user can specify NULL for this column. Doris will
retain the original value for this column.
2019-11-05 18:08:34 +08:00
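The difference between REPLACE and REPLACE_IF_NOT_NULL in commit 65c3b0907a can be illustrated with a small Python model. This is only a sketch of the semantics as described in the commit message, with `None` standing in for NULL; `merge_row` and the column names are hypothetical and do not reflect how the storage engine is implemented.

```python
from typing import Callable, Dict, Optional

Value = Optional[int]
Row = Dict[str, Value]


def replace(old: Value, new: Value) -> Value:
    # REPLACE: the loaded value always wins, even when it is NULL.
    return new


def replace_if_not_null(old: Value, new: Value) -> Value:
    # REPLACE_IF_NOT_NULL: a NULL in the load keeps the original value.
    return old if new is None else new


def merge_row(existing: Row, loaded: Row, agg: Dict[str, Callable[[Value, Value], Value]]) -> Row:
    """Merge a loaded row into an existing row, column by column."""
    return {col: fn(existing.get(col), loaded.get(col)) for col, fn in agg.items()}


existing = {"v1": 10, "v2": 20}
loaded = {"v1": 11, "v2": None}  # the user only wants to update v1

kept = merge_row(existing, loaded, {"v1": replace_if_not_null, "v2": replace_if_not_null})
overwritten = merge_row(existing, loaded, {"v1": replace, "v2": replace})
# kept == {"v1": 11, "v2": 20}; overwritten == {"v1": 11, "v2": None}
```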
713e04624f Modify the lower bound of percentile_approx compression param to 2048 (#2111) 2019-11-01 13:07:39 +08:00
95a3b4ccfe Add object type (#1948)
Add a new type: Object. Currently, it's mainly for complex aggregate metrics (HLL, Bitmap).

The Object type has the following constraints:
1. Object type cannot be used as a key column type
2. Object type doesn't support any index (BloomFilter, short key, zone map, inverted index)
3. Object type doesn't support filtering or GROUP BY

In the implementation:

The Object type reuses StringValue and StringVal, because in the storage engine the Object type is binary: it has a pointer and a length.
2019-10-31 21:42:58 +08:00
41e55cfca9 Modify fixed partition feature (#1989)
1. Do not support MAXVALUE with multiple partition columns.
2. Fix the incorrect show create table stmt.
2019-10-16 16:03:46 +08:00
63fa260d3f Support prepare/close in UDF (#1985)
The prepare/close step of scalar functions is already supported in the execution framework. We only need to support it in the syntax and metadata in the frontend.

In addition, the 'Hive' binary type of scalar function does NOT support the prepare/close step; we need to make it support it.
2019-10-16 07:19:20 +08:00
ec7c8a2c6f Support adding fixed range partition
eg: ALTER TABLE test_table ADD PARTITION p0125 VALUES [("20190125"), ("20190126"));
2019-10-15 09:50:30 +08:00
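The `[ ... )` notation in the ADD PARTITION example of commit ec7c8a2c6f denotes a range that includes its lower bound and excludes its upper bound. A minimal Python sketch of that lookup follows; `find_partition` is a hypothetical helper, and comparing the keys as plain strings is an assumption that only holds for fixed-width values such as the ones in the example.

```python
from typing import List, Optional, Tuple

# (partition name, lower bound inclusive, upper bound exclusive)
Partition = Tuple[str, str, str]


def find_partition(partitions: List[Partition], key: str) -> Optional[str]:
    """Return the name of the partition whose [lower, upper) range holds key."""
    for name, lower, upper in partitions:
        if lower <= key < upper:
            return name
    return None


partitions = [("p0125", "20190125", "20190126")]
hit = find_partition(partitions, "20190125")   # lower bound is inside the range
miss = find_partition(partitions, "20190126")  # upper bound is outside the range
```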
62acf5d098 Limit the memory usage of Loading process (#1954) 2019-10-15 09:26:20 +08:00
b84ef013eb Fix the mistake for HLL in mini load (#1981)
[Docs] Fix mistakes for HLL column in mini load
2019-10-14 19:46:23 +08:00
ccc236484b Fix bug that failed to add KEY column to DUPLICATE KEY table (#1973) 2019-10-14 16:40:34 +08:00
4a17152f40 Add tdigest compression param for percentile_approx function (#1939) 2019-10-11 18:56:59 +08:00
ec3aa03c45 Add more routine load example (#1902) 2019-09-27 20:42:52 +08:00
40b9c3571b Support hll_empty function (#1825) 2019-09-25 09:28:02 +08:00
e8da855cd2 Support setting timezone for stream load and routine load (#1831) 2019-09-20 07:55:05 +08:00
e70e48c01e Add an ALTER operation to change distribution type from RANDOM to HASH (#1823)
Random distribution is no longer supported since version 0.9.
And we need a way to convert the random distribution to hash distribution.

    ALTER TABLE db.tbl SET ("distribution_type" = "hash");
2019-09-18 14:16:26 +08:00
714dca8699 Support table comment and column comment for view (#1799) 2019-09-18 09:45:28 +08:00
054a3f48bc Add where expr in broker load (#1812)
The where predicate in broker load is responsible for filtering transformed data.
The help and operator docs have been changed.
2019-09-17 11:32:40 +08:00
9aa2045987 Refactor alter job (#1695) 2019-09-12 16:31:29 +08:00
c354f30767 Fix mistake in docs (#1796) 2019-09-12 14:15:06 +08:00
b327643132 Fix bug that failed to limit the mem usage of HLL column when loading (#1778)
Should use arena to allocate mem for HyperLogLog column.
2019-09-11 10:20:46 +08:00
044489b92f Optimize some kinds of load jobs (#1762)
1. Support specifying label to Insert Into stmt.

    INSERT INTO tbl1 WITH LABEL label1 ...;

2. Return the job's state corresponding to the existing label in the result of stream load.

    ...
    "Status": "Label Already Exists",
    "ExistingJobStatus": "FINISHED"
    ...

3. Return the recent 2000 transactions in SHOW PROC '/transactions'
2019-09-09 22:11:12 +08:00
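A client could use the two fields shown in commit 044489b92f to decide whether a rejected label belongs to a load that already completed. The Python sketch below assumes the stream load result is a JSON object; `label_already_finished` is a hypothetical helper, and only the `Status` and `ExistingJobStatus` values quoted in the commit message are relied on.

```python
import json


def label_already_finished(response_text: str) -> bool:
    """True if the label exists and the job that owns it has finished."""
    result = json.loads(response_text)
    if result.get("Status") != "Label Already Exists":
        return False
    # Any other ExistingJobStatus value is treated as "not known to be finished".
    return result.get("ExistingJobStatus") == "FINISHED"
```

When this returns `True`, a retried request with the same label can be treated as a duplicate of a load that has already succeeded rather than as a failure.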
76987275b9 Fix result of unix_timestamp() (#1727) 2019-08-30 21:39:16 +08:00
3a33f3d350 Make bitmap_union agg column support insert into and broker load (#1721) 2019-08-30 14:44:51 +08:00
378ce8ca04 Use double when converting TIME type value (#1722)
TIME type value is saved in DOUBLE, so using int64 can extend the time range.
2019-08-29 21:19:19 +08:00
6865f4238b Add limit to show tablet stmt (#1547)
Also add some where predicates for filtering results
ISSUE #1687
2019-08-28 16:25:12 +08:00
1e4dd77d2a Add bitmap agg type and udaf (#1610) 2019-08-26 14:24:42 +08:00
978b1ee1af Add strict mode in Routine load, Stream load and Mini load (#1677) 2019-08-20 21:56:45 +08:00
176e185e18 Add broker doc (#1662)
This broker document introduces the properties for different broker types.
2019-08-20 17:18:54 +08:00
8e6814cfcd Support setting timeout for stream load (#1670) 2019-08-20 15:43:03 +08:00
ba6d728f26 Enable parsing columns from file path for Broker Load (#1582) (#1635)
Currently, we do not support parsing encoded/compressed columns in the file path, e.g. extracting column k1 from the file path /path/to/dir/k1=1/xxx.csv

This patch makes it possible to parse columns from the file path, as in Spark (Partition Discovery).

This patch parses partition columns in BrokerScanNode.java and saves the parsing result of each file path as a property of TBrokerRangeDesc, so the broker reader of BE can read the value of the specified partition column.
2019-08-19 09:39:21 +08:00
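The path parsing described in commit ba6d728f26 can be sketched as follows. This is a Python illustration of the idea only, not the logic in BrokerScanNode.java: `parse_columns_from_path` is a hypothetical name, and raising an error when a requested column is missing from the path is an assumption.

```python
from typing import Dict, List


def parse_columns_from_path(path: str, columns: List[str]) -> Dict[str, str]:
    """Extract `name=value` directory segments for the requested columns."""
    found: Dict[str, str] = {}
    # The last segment is the file name, so only directories are inspected.
    for segment in path.split("/")[:-1]:
        name, sep, value = segment.partition("=")
        if sep and name in columns:
            found[name] = value
    missing = [c for c in columns if c not in found]
    if missing:
        raise ValueError(f"columns not found in path: {missing}")
    return found


parsed = parse_columns_from_path("/path/to/dir/k1=1/xxx.csv", ["k1"])
# parsed == {"k1": "1"}
```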
82d0afc1ba FROM_UNIXTIME should only convert timestamp from 0 to 253402271999 (#1658)
which corresponds to 1970-01-01 00:00:00 ~ 9999-12-31 23:59:59; otherwise, return null
2019-08-16 18:29:57 +08:00
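The range check in commit 82d0afc1ba can be sketched in Python. The bounds come from the commit title; the fixed UTC+8 offset is an assumption, chosen because 253402271999 seconds after the epoch is exactly 9999-12-31 23:59:59 at that offset. `from_unixtime` here is a stand-in for illustration, not the BE implementation.

```python
from datetime import datetime, timedelta
from typing import Optional

MAX_TIMESTAMP = 253402271999      # upper bound from the commit title
UTC_OFFSET = timedelta(hours=8)   # assumption: session time zone is +08:00
EPOCH = datetime(1970, 1, 1)


def from_unixtime(ts: int) -> Optional[str]:
    """Format a Unix timestamp, or return None when it is out of range."""
    if ts < 0 or ts > MAX_TIMESTAMP:
        return None
    local = EPOCH + timedelta(seconds=ts) + UTC_OFFSET
    return local.strftime("%Y-%m-%d %H:%M:%S")


upper = from_unixtime(MAX_TIMESTAMP)       # "9999-12-31 23:59:59"
too_big = from_unixtime(MAX_TIMESTAMP + 1)  # None
negative = from_unixtime(-1)                # None
```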
0e6560ceca Fix document typo (#1657) 2019-08-16 14:52:32 +08:00
1ed25ad83d Add kafka_default_offsets when no partition is specified
Support reading Kafka partitions from the start (#1642)
2019-08-16 13:30:26 +08:00
a551abba58 Modify timediff documents (#1600) 2019-08-15 12:45:53 +08:00
199ff968dc Fix time zone compatibility (#1631) 2019-08-13 18:44:35 +08:00
69af50aa8c Time zone related BE function (#1598)
Details can be found in time-zone.md document
2019-08-12 20:57:59 +08:00