8276 Commits

Author SHA1 Message Date
95e6553d90 [feature-wip](nereids) Implement using join (#15311) 2022-12-28 19:22:20 +08:00
75aa00d3d0 [Feature](NGram BloomFilter Index) add new ngram bloom filter index to speed up like query (#11579)
This PR implements a new bloom filter index, the NGram bloom filter index, which was proposed in #10733.
The new index can greatly improve LIKE query performance; in some of our test cases the improvement is an order of magnitude.
For usage, see the docs in this PR. The index depends on ```enable_function_pushdown```;
set it to ```true``` for the index to take effect on LIKE queries.
2022-12-28 18:01:50 +08:00
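As a rough illustration of why an n-gram bloom filter can speed up LIKE queries (commit 75aa00d3d0 above), here is a minimal sketch. It is not the Doris implementation: the class name, the parameters `n`, `num_bits`, `num_hashes`, and the hashing scheme are all assumptions chosen for illustration. The idea is that every n-gram of the indexed strings is added to the filter, and a `LIKE '%literal%'` predicate can skip a data block when any n-gram of the literal is definitely absent.

```python
import hashlib


class NGramBloomFilter:
    """Toy bloom filter over character n-grams (illustrative only, not Doris code)."""

    def __init__(self, n=3, num_bits=1024, num_hashes=3):
        self.n = n
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit set

    def _ngrams(self, text):
        # All contiguous substrings of length n; empty if the text is shorter than n.
        return [text[i:i + self.n] for i in range(len(text) - self.n + 1)]

    def _positions(self, gram):
        # Derive num_hashes bit positions per n-gram from seeded SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{gram}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, text):
        for gram in self._ngrams(text):
            for pos in self._positions(gram):
                self.bits |= 1 << pos

    def may_match_like(self, literal):
        """Return False only if no indexed string can contain `literal`."""
        grams = self._ngrams(literal)
        if not grams:
            return True  # literal shorter than n: the filter cannot prune
        return all((self.bits >> pos) & 1
                   for gram in grams
                   for pos in self._positions(gram))


bf = NGramBloomFilter()
bf.add("apache doris")
print(bf.may_match_like("doris"))  # a substring of indexed data is never pruned
```

A `True` result only means the block has to be scanned; bloom filters can report false positives but never false negatives.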
0f8b15b902 [feature](nereids) support string alias in select list (#15369)
Support syntax such as: select '' as 'b', col1 from select_with_const
2022-12-28 17:26:48 +08:00
3aae27634a [doc](flink-connector) update flink connector faq (#15405) 2022-12-28 16:15:49 +08:00
69d95c857a [feature](remote)Add alter storage policy (#15381)
* add alter storage policy
2022-12-28 16:09:06 +08:00
8342691b62 [feature](remote)Add drop storage policy (#15364)
* add drop storage policy
2022-12-28 16:04:30 +08:00
f7988fad03 [improvement](string) set bigger limit for ColumnString chars length (#15426) 2022-12-28 15:41:01 +08:00
8ce62600dc [Bug] #14876 && #15225 have some bugs in rewrite or to in, revert them (#15420) 2022-12-28 13:30:09 +08:00
fe02b08e04 [Improvement](thirdparty)upgrade simdjson from 1.0.2 to 3.0.1 (#15412)
Upgrade simdjson from 1.0.2 to latest version 3.0.1 to avoid -mlzcnt compiler flag causing BE UT(macOS) failure.
simdjson is now only used by VJsonScanner and disabled by default. So the impact of upgrade is limited.
2022-12-28 12:24:16 +08:00
Pxl
121f00b6e2 [Bug](function) forbid hll_union input not hll type param (#15397)
Forbid hll_union input parameters that are not of HLL type.
2022-12-28 12:23:34 +08:00
d05f430ca2 [feature](nereids) support syntax: count(all *) (#15376) 2022-12-28 11:09:56 +08:00
f8bb8c7829 [fix](broker) fix be core dump caused by broker load (#15390)
2022-12-28 10:57:41 +08:00
2af831de33 [Fix](Nereids)fix group by binding error, resulting in incorrect results (#15328)
Original: group by was bound to the outputExpression of the current node.

Problem: When the new reference of an outputExpression has the same name as one of the child's output columns, the child's output column should be used for group by. Instead, the new reference of the node's outputExpression was used, producing an incorrect result.

Now: The child's output is given priority for group by binding. If the child has no corresponding column, the outputExpression of this node is used for binding.
2022-12-28 10:42:21 +08:00
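The binding priority described in commit 2af831de33 above can be sketched as a two-step lookup. This is a simplified model, not Nereids code: the function name `bind_group_by` and the dictionary representation of the child's output and the node's output expressions are hypothetical.

```python
def bind_group_by(name, child_output, output_expressions):
    """Resolve a GROUP BY name: child's output columns first, then this node's outputs.

    Both mappings go from a column or alias name to the expression it stands for.
    Hypothetical helper for illustration only.
    """
    if name in child_output:
        return ("child", child_output[name])
    if name in output_expressions:
        return ("output", output_expressions[name])
    raise KeyError(f"cannot bind group by key: {name}")


# SELECT a + 1 AS a FROM t GROUP BY a
child_output = {"a": "t.a"}
output_expressions = {"a": "t.a + 1"}
print(bind_group_by("a", child_output, output_expressions))  # ('child', 't.a')
```

With the old behaviour, the alias `a` (that is, `t.a + 1`) would have been chosen even though the child already exposes a column named `a`.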
9f9651b2f2 [Enhancement](Jemalloc): correct the variable name of malloc_conf & enable prof (#15382)
enable profile and correct the conf name in Jemalloc.
2022-12-28 09:50:59 +08:00
816e12db6a [Bench](mem) some benchmark over the query limit (#15408) 2022-12-28 09:29:53 +08:00
28bb13a026 [feature](light-schema-change) enable light schema change by default (#15344) 2022-12-28 09:29:26 +08:00
22b31e516c [Bug](decimalv3) select view of decimalv3 error (#15404) 2022-12-28 08:38:33 +08:00
5ac7b09765 [feature](Nereids) Support SchemaScan (#15411)
such as: select * from information_schema.backends;
2022-12-28 00:33:48 +08:00
aad53d37c7 [typo](docs)fix doris docs 404 link (#15400) 2022-12-27 22:57:40 +08:00
03aef7a8ac [fix](nereids) sender in union's child fragment has no destination (#15402)
1. always create an exchange node for set operation node's children
2. fix cast expr's nullability bug.
2022-12-27 22:36:54 +08:00
700f963571 [refactor](remove-non-vec) remove some non vec code in segment iterator and remove reuse schema opt since it is introduced in non-vec code (#15407)
Co-authored-by: yiguolei <yiguolei@gmail.com>
2022-12-27 22:30:02 +08:00
51b14c06d3 [enhancement](nereids) support approx_count_distinct function (#15406) 2022-12-27 22:25:21 +08:00
c63dda99db [test](Nereids) Disable some regression test for materialized index. (#15387)
When light schema change is enabled by default (#15344), regression tests that run SQL selecting data from the materialized index fail.
This PR disables the failing queries in the regression test. They will be added back once the Nereids planner produces the correct plan with light schema change enabled.
2022-12-27 19:24:03 +08:00
0550dfaeb2 [enhancement](rewrite) add OrToIn rule and fix ExtractCommonFactorsRule apply problems (#12872)
Co-authored-by: wuhangze <wuhangze@jd.com>
2022-12-27 18:39:53 +08:00
849adca225 [fix](Nereids) expressions is not be extracted when binding function (#15361) 2022-12-27 17:12:07 +08:00
a07ca41f8e [Fix](Nereids) fix repeat node nullable error bugs (#15251) 2022-12-27 17:01:33 +08:00
524208ab3a [Feature](bitmap/hll)Support return bitmap/hll data in select statement in vectorization (#15224)
Support returning bitmap data in select statements in vectorized mode.

When Bitmap is used for audience selection ("circling" users), users need to return the Bitmap results to the upper layer, which parses the Bitmap contents to handle high-QPS query scenarios.
2022-12-27 14:49:24 +08:00
5a8201320a [fix](nereids) group by constants produce wrong result (#15322)
SELECT 2 FROM tbl GROUP BY 1

This should produce 2 when the table is not empty. Before this PR, executing the Nereids-generated plan produced an empty result set.
2022-12-27 14:35:02 +08:00
b3f77a2e00 [feature](Show) add one show type cast command (#15137) 2022-12-27 14:19:04 +08:00
6d851b1fc9 [Doc](Flink) update flink connector doc add new version #15365
Co-authored-by: wudi <>
2022-12-27 14:15:49 +08:00
777b0b94bb [typo](docs) fix wrong date format (#15363)
2022-12-27 11:45:05 +08:00
69068f9835 [fix](planner) fix hll_union plan: Invalid Aggregate Operator: hll_union (#14931)
When using hll_union aggregate function, PREAGGREGATION is always OFF and Rollup cannot be hit.
2022-12-27 11:20:41 +08:00
99868f9929 [fix](planner) set limit to negative value when has offset in limit (#15367) 2022-12-27 11:16:19 +08:00
73957a028c [fix](mow-uniquekey) fix dereference to nullptr in Tablet::calc_delete_bitmap (#15375) 2022-12-27 11:14:25 +08:00
0f922a83c7 [fix](nereids) group_bit_xxx signature update (#15318) 2022-12-27 00:41:42 +08:00
325d247b92 [Feature](Nereids) Support hll and count for materialized index. (#15275) 2022-12-27 00:38:04 +08:00
650136c32e [Enhancement](fe): replace assertTrue(X.equals(X)) with assertEquals (#15356) 2022-12-27 00:37:24 +08:00
8879400419 [feature](nereids) Support query on specific partitions (#15243) 2022-12-27 00:32:14 +08:00
a1c6ea876f [fix](inbitmap) fix core dump caused by bitmap filter with union (#15333)
The join node needs a project operation to remove unnecessary columns from its output tuples.
For SetOperationNode, the output tuple and input tuple are consistent, so no projection is needed.
However, the children of a SetOperationNode may be join nodes, so those children
still need to perform the project operation.
2022-12-26 23:14:32 +08:00
fc8f6a0715 [fix](multi-catalog) throw NPE when reading data after EOF (#15358)
1. Fix a bug:  
A null pointer exception was thrown when reading data after the reader reached the end of file, so return directly when `_do_lazy_read` reads no data.

2. Optimize code:  
Remove unused parameters.

3. Fix regression test
2022-12-26 22:49:35 +08:00
aa0f38f864 [chore](gutil) remove some gutil files and use c++ stl instead (#15357)
2022-12-26 21:25:09 +08:00
301640d3c0 [fix](string) fix offsets over flow for extreme large String column (#15360)
2022-12-26 21:23:58 +08:00
c3d0e2931a [typo](docs) fix version tag for docs of s3 token (#15362) 2022-12-26 19:23:43 +08:00
72f0003753 [enhancement](regression) use sf0.1 data in datev2 and decimalv3 cases (#15342) 2022-12-26 19:15:49 +08:00
ae87415174 [Feature](Nereids) add simplify arithmetic rule (#15242)
Support a simplify-arithmetic rule.

For example:
a + 1 > 1
=> a > 0
2022-12-26 16:57:59 +08:00
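The rewrite in commit ae87415174 above (`a + 1 > 1 => a > 0`) amounts to moving the constant from the left side of a comparison to the right. The sketch below models only that one case; the `Comparison` class and the `simplify` and `render` functions are hypothetical names, not the Nereids rule, and the sketch assumes integer constants with no overflow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Comparison:
    """Models `column + offset <op> bound` (illustrative only)."""
    column: str
    offset: int  # constant added to the column on the left side
    op: str      # one of ">", ">=", "<", "<=", "="
    bound: int   # constant on the right side


def simplify(cmp):
    """Rewrite `col + c1 op c2` as `col op (c2 - c1)`.

    Subtracting the same constant from both sides keeps the comparison
    direction, so the operator is unchanged.
    """
    return Comparison(cmp.column, 0, cmp.op, cmp.bound - cmp.offset)


def render(cmp):
    left = cmp.column if cmp.offset == 0 else f"{cmp.column} + {cmp.offset}"
    return f"{left} {cmp.op} {cmp.bound}"


original = Comparison("a", 1, ">", 1)
print(render(original), "=>", render(simplify(original)))  # a + 1 > 1 => a > 0
```

Leaving a bare column on one side of the comparison is what makes such a predicate easier for later planning steps to work with.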
24a994eb9f [Feature-WIP](inverted) add inverted index writer api for be (#14207) 2022-12-26 15:02:12 +08:00
1400a89065 [Bug](Compile) fix compile error by using correct method name (#15355)
2022-12-26 14:58:01 +08:00
8b6e4e74e7 [improvement](jdbc) add default jdbc driver's dir (#15346)
Add a new config "jdbc_drivers_dir" for both FE and BE.
Users can put JDBC driver jar files in this dir and specify only the file name in the "driver_url" property
when creating a JDBC resource.
Doris will then look for the jar file in this dir.

Also modify the logic so that when a JDBC resource is modified, the corresponding JDBC table
gets the latest properties.
2022-12-26 11:51:12 +08:00
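The lookup described in commit 8b6e4e74e7 above can be pictured as follows. This is a hedged sketch, not the FE/BE code: `resolve_driver_url` is a hypothetical helper, and both the rule for what counts as a "bare file name" and the `file://` output form are assumptions made for illustration.

```python
import posixpath
from urllib.parse import urlparse


def resolve_driver_url(driver_url, jdbc_drivers_dir):
    """Expand a bare jar file name using jdbc_drivers_dir (hypothetical helper).

    Values that already carry a URL scheme or a directory component are
    returned unchanged; only a plain file name is looked up in the drivers dir.
    """
    if urlparse(driver_url).scheme or "/" in driver_url:
        return driver_url
    return "file://" + posixpath.join(jdbc_drivers_dir, driver_url)


print(resolve_driver_url("driver.jar", "/opt/doris/jdbc_drivers"))
print(resolve_driver_url("https://example.com/driver.jar", "/opt/doris/jdbc_drivers"))
```

The directory `/opt/doris/jdbc_drivers` and both driver URLs are placeholder values.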
bf71943605 [feature](load) stream load trim double quotes for csv (#15241) 2022-12-26 11:45:54 +08:00
7b5739e9a9 [Fix](Nereids) fix dup key for pull predicate from project children (#15292)
In InferPredicates, we need to pull predicates up from the project's children and then replace id1 with sid.
The code builds a map using the alias name as key and the expression as value. Since sid has two alias names (id1, id2), a Duplicate key exception is thrown.
2022-12-26 10:57:14 +08:00