Commit Graph

66 Commits

Author SHA1 Message Date
af1beb6ce4 [Enhance] Add prepare phase for some timestamp functions (#3947)
Fix: #3946 

CL:
1. Add prepare phase for `from_unixtime()`, `date_format()` and `convert_tz()` functions, to handle the format string once for all.
2. Find the cctz timezone when init `runtime state`, so that don't need to find timezone for each rows.
3. Add constant rewrite rule for `utc_timestamp()`
4. Add doc for `to_date()`
5. Comment out the `push_handler_test`, it can not run in DEBUG mode, will be fixed later.
6. Remove `timezone_db.h/cpp` and add `timezone_utils.h/cpp`

The performance shows bellow:

11,000,000 rows

SQL1: `select count(from_unixtime(k1)) from tbl1;`
Before: 8.85s
After: 2.85s

SQL2: `select count(from_unixtime(k1, '%Y-%m-%d %H:%i:%s')) from tbl1 limit 1;`
Before: 10.73s
After: 4.85s

The date string format seems still slow, we may need a further enhancement about it.
2020-06-29 19:15:09 +08:00
838c1e9212 Modify HLL functions return type (#3656)
1、Modify hll_hash function return type to HLL
2、Make HLL_RAW_AGG is alias of HLL_UNION
2020-05-24 21:22:43 +08:00
a656a7ddd4 Support append_trailing_char_if_absent function (#3439) 2020-05-09 08:59:34 +08:00
94b3a2bd50 [Bug] Fix string functions not support multibyte string (#3345)
Let string functions support utf8 encoding
2020-05-08 12:52:46 +08:00
8276c6d7f8 Show BE version in 'show backends;' (#3074)
In a large scale cluster, we may rolling upgrade BEs, this patch add a
column named 'Version' for command 'show backends;', as well as website
'/system?path=//backends', to provide a method to check whether there
is any BE missing upgraded.
2020-03-12 22:15:13 +08:00
fdcbfbb793 [Bug] Fix bug that coalesce() function return null when there is constant value in parameter. (#3062)
select coalesce(1, null);

RETURNS:    NULL
EXPECTED:   1
2020-03-09 16:38:50 +08:00
c8054ebe13 [Function] ifnull function supports new args (date,datetime) and (datetime, date) (#3043) 2020-03-09 09:37:26 +08:00
0d1e28746e [Function] Support null_or_empty function (#2977)
It returns true if the string is empty or NULL. Otherwise it returns false.
2020-03-01 17:35:45 +08:00
35b09ecd66 [JDK] Support OpenJDK (#2804)
Support compile and running Frontend process and Broker process with OpenJDK.
OpenJDK 13 is tested.
2020-02-20 23:47:02 +08:00
ece8740c1b Fix some function DATE type priority (#2952)
1. Fix the bug introduced by https://github.com/apache/incubator-doris/pull/2947.  
The following sql result is 0000, which is wrong. The result should be 1601
```
select date_format('2020-02-19 16:01:12','%H%i');
```

2. Add constant Express plan test, ensure the FE constant Express compute result is right.

3. Remove the `castToInt ` function in `FEFunctions`, which is duplicated with `CastExpr::getResultValue`

4. Implement `getNodeExplainString` method for `UnionNode`
2020-02-20 20:45:45 +08:00
147953f09e Fix some function with date type bug (#2947)
The logic chain is following:
1. `date_format(if(, NULL, `dt`), '%Y%m%d')` as HASH_PARTITIONED exprs,which is not right, we should use Agg  intermediate materialized slot
2. we don't use Agg  intermediate materialized slot as  HASH_PARTITIONED exprs, becasue
```
            // the parent fragment is partitioned on the grouping exprs;
            // substitute grouping exprs to reference the *output* of the agg, not the input
            partitionExprs = Expr.substituteList(partitionExprs,
                    node.getAggInfo().getIntermediateSmap(), ctx_.getRootAnalyzer(), false);
            parentPartition = DataPartition.hashPartitioned(partitionExprs);
```
the partitionExprs substitute failed。
3. partitionExprs substitute failed because partitionExprs  has a casttodate child,but agg info getIntermediateSmap has a cast in datetime child.
4. The cast to date or cast to datetime child exist because `TupleIsNullPredicate` insert a `if` Expr.   we don't have `if date` fn, so Doris use `if int` Expr.
5. the `date` in the `catstodate` depend on slot dt date type. the `datetime` in the `catstodatetime` depend on datetime arg type in `date_format` function.

So we could fix this issue by make if fn support date type or make date_format fn support date type
2020-02-19 20:16:44 +08:00
0fb52c514b [UDF] Fix bug that UDF can't handle constant null value (#2914)
This CL modify the `evalExpr()` of ExpressionFunctions, so that it won't change the
`FunctionCallExpr` to `NullLiteral` when there is null parameter in UDF. Which will fix the
problem described in ISSUE: #2913
2020-02-17 22:13:50 +08:00
3a8e783444 Compatible with python3 in build (#2876) 2020-02-10 21:50:42 +08:00
c6090965e2 [Unused Code] Remove unused file in gensrc/scripts (#2863)
gensrc/script/doris_functions.py
gensrc/script/gen_opcodes.py

These 2 files are useless. Remove them.
2020-02-10 09:45:01 +08:00
89c7234c1c Support starts_with (str, prefix) function (#2813)
Support starts_with function
2020-01-21 14:09:08 +08:00
fc55423032 [SQL] Support Grouping Sets, Rollup and Cube to extend group by statement
Support Grouping Sets, Rollup and Cube to extend group by statement
support GROUPING SETS syntax 
```
SELECT a, b, SUM( c ) FROM tab1 GROUP BY GROUPING SETS ( (a, b), (a), (b), ( ) );
```
cube  or rollup like 
```
SELECT a, b,c, SUM( d ) FROM tab1 GROUP BY ROLLUP|CUBE(a,b,c)
```

[ADD] support grouping functions in expr like grouping(a) + grouping(b) (#2039)
[FIX] fix analyzer error in window function(#2039)
2020-01-17 16:24:02 +08:00
0ddca59d36 Add timestampadd/timestampdiff function (#2725) 2020-01-15 21:47:07 +08:00
7768629f08 Add bitmap_contains and bitmap_has_any functions (#2752) 2020-01-15 14:31:44 +08:00
f071d5a307 Support ends_with function (#2746) 2020-01-14 22:37:20 +08:00
a99a49a444 Add bitamp_to_string function (#2731)
This CL changes:

1. add function bitmap_to_string and bitmap_from_string, which will
 convert a bitmap to/from string which contains all bit in bitmap
2. add function murmur_hash3_32, which will compute murmur hash for
input strings
3. make the function cast float to string the same with user result
logic
2020-01-13 12:31:37 +08:00
ccaa97a5ac Make bitmap functions accept any expression that returns bitmap (#2728)
This CL make bitmap_count, bitmap_union, and bitmap_union_count accept any expression whose return type is bitmap as input so that we can support flexible bitmap expression such as bitmap_count(bitmap_and(to_bitmap(1), to_bitmap(2))).

This CL also create separate documentation for each bitmap UDF to conform with other functions.
2020-01-11 14:02:12 +08:00
a028c52edd Add BE function bitmap_or and bitmap_and (#2707) 2020-01-08 19:59:44 +08:00
7d2610d091 Change bitmap functions return type to BITMAP (#2690) 2020-01-07 19:27:21 +08:00
5b9b0a84d5 Add curdate function (#2521) 2019-12-20 21:23:16 +08:00
a5f52f80df Add bitmap_hash function (#2439)
Add a bitmap_hash function.
Add a murmur_hash3_32 hash function.
2019-12-12 16:55:07 +08:00
8787a59cb4 Make FE compile support Mac (#2334) 2019-11-29 20:50:35 +08:00
d8cfbbedf7 Support bitmap_empty function (#2227) 2019-11-18 20:37:00 +08:00
4f56698db4 Change ifnull syntax support date type (#2123) (#2206) 2019-11-14 21:33:12 +08:00
3f325e001a Change the priority of different type in function (#2003)
This commit fix the issue [ISSUE-2002].
It changes the priority of coalesce, ifnull, nullif function etc.
The priority of decimal is higher then varchar in the IS_SUPERTYPE_OF compare mode.

Example:
select coalesce(decimal_column, 1) from table;
    the return type of coalesce should be decimal instead of varchar.

Add supertype about datetime and date
The supertype of datetime is bigint, largeint etc.
In IS_SUPERTYPE_OF compare mode, the function(bigint, bigint, bigint) is a supertype of function(datetime, bigint, int).

Example:
select coalesce(now(), 1)) from web_returns;
    the return type of coalesce should be bigint instead of varchar.
2019-10-18 09:35:49 +08:00
8f016d3ab2 Make HLL be able to handle invalid data (#1908)
In this change list
1. validate HLL column when loading data, if data is invalid, this row
will be filtered.
2. seems as empty HLL when serializing invalid type of HLL data, with
this change, all ingested data will be valid.
3. seems as empty HLL when deserializing nullptr or invalid type of HLL data.
With this change, dirty data can be handled normally.
4. rename function empty_hll to hll_empty.
5. disable memtable_flush_execute_test because this will fails
sometimes. When tearing down, some thread is not joined, and they will
visit destroyed resource, which is invalid.
2019-09-29 10:55:23 +08:00
40b9c3571b Support hll_empty function (#1825) 2019-09-25 09:28:02 +08:00
cd5cfea5cc Encapsulate HLL logic (#1756) 2019-09-09 15:52:10 +08:00
1e4dd77d2a Add bitmap agg type and udaf (#1610) 2019-08-26 14:24:42 +08:00
69af50aa8c Time zone related BE function (#1598)
Details can be found in time-zone.md document
2019-08-12 20:57:59 +08:00
4aedaea84e Support TIME type and timediff function (#1505) 2019-07-23 13:42:39 +08:00
941dec215b Add utc_timestamp function (#1456) 2019-07-11 11:09:08 +08:00
98bd4b4565 Add string function split_part (#1451) 2019-07-10 09:47:33 +08:00
8db97998ba Collect all documents to Doris code base (#1414) 2019-07-01 09:23:13 +08:00
adba5249c4 Add dayofweek function (#1376) 2019-06-25 21:37:42 +08:00
5cd4777bb4 Fix bug that if match wrong symbol (#1324) 2019-06-19 09:17:13 +08:00
7cdaba66dc Add spatial func (#1213)
Support some spatial functions, such as ST_Contains.
2019-05-31 14:23:09 +08:00
722a9e71c7 Optimize json functions (#1177)
1. get_json_xxx() now support using quoto to escape dot
2. Implement json_path_prepare() function to preprocess json_path

Performance of get_json_string() on 1000000 rows reduces from 2.27s to 0.27s
2019-05-21 09:13:12 +08:00
b2a022b348 Add money_format function (#1064) 2019-04-29 18:31:24 +08:00
101106d83e Add left/right string function (#1032) 2019-04-25 17:29:34 +08:00
a1bfc90320 Support hll_raw_agg in Aggregate Function (#832)
hll_raw_agg Function aggregates the HLL type value, and return the HLL type value
2019-04-01 16:17:56 +08:00
504fbc3627 Fix bug that Greatest get wrong function's symbol (#796) 2019-03-23 23:28:37 +08:00
c34b306b4f Decimal optimize branch #695 (#727) 2019-03-22 17:22:16 +08:00
daa9d975ca Fix bugs of Tablet Scheduler (#600) 2019-01-29 15:35:07 +08:00
4d5f92cce7 Add EsScanNode (#450) 2019-01-17 17:59:33 +08:00
1ba8a4ee4e Transform row-oriented table to columnar-oriented table (#311) 2018-11-16 16:03:56 +08:00