Commit Graph

8276 Commits

Author SHA1 Message Date
3360bdf124 [feature-wip](statistics) update cache when analysis job finished (#14370)
1. Update cache when analysis job finished
2. Rename `StatisticsStorageInitializer` to `InternalSchemaInitializer`
2022-11-22 21:33:10 +08:00
e77151868d [Docs](table-valued-function) add docs for s3 and hdfs tvf (#14369) 2022-11-22 21:31:47 +08:00
1fe9bced25 [test](jdbc)add more mysql jdbc test case (#14475) 2022-11-22 21:14:10 +08:00
e78086a501 [chore](macOS) Fix the build for thirdparty (#14462) 2022-11-22 20:49:15 +08:00
b04ec41c1d [Vectorized](udaf) fix java-udaf couldn't get jar core dump (#14393)
fix java-udaf couldn't get jar core dump
2022-11-22 20:49:02 +08:00
45aeb1d40d [test](delete) Change delete case (#14483)
* kafka kerberos

* change delete where in case
2022-11-22 19:41:43 +08:00
d8b03db45a [typo](docs) add-kafka-kerberos-version #14489 2022-11-22 18:45:52 +08:00
ed17294d96 [improvement](config)Add the ShellCheck check-free project about Dockerfile (#14451)
Co-authored-by: Yijia Su <suyijia@selectdb.com>
2022-11-22 17:31:44 +08:00
30e1818724 [fix](tracing) fix tracing in the new scan node does not meet expectations (#14155)
Issue Number: close #14149

- Remove unexpected tracing, like 'vscanner::scan'
- Merge span vscannode::get_next
2022-11-22 16:44:02 +08:00
89c676e597 [Bug] fix bug for grouping set query which where condition is false (#14401) 2022-11-22 16:03:43 +08:00
663f7dddcc [improvement](planner) eliminating useless sort node (#14377) 2022-11-22 15:13:25 +08:00
b9f017ebb1 [typo](docs) kafka kerberos #14479 2022-11-22 14:42:16 +08:00
8cf971e32f [chore](workflow) set clickbench as required (#14476)
Co-authored-by: yiguolei <yiguolei@gmail.com>
2022-11-22 14:32:33 +08:00
f72c63e4bb [chore](error status) print error stack when rpc error (#14473)
Currently, BE prints "fail to get master client from cache. host=xxxxx, port=9228, code=THRIFT_RPC_ERROR", but we cannot tell which step generated this error. This change refactors the error status in BE and adds an error stack for RPC_ERROR.

W1122 10:19:21.130796 30405 utils.cpp:89] fail to get master client from cache. host=xxxx, port=9228, code=RPC error(error -1): Couldn't open transport for xxxx:9228 (open() timed out)/n @ 0x559af8f774ea doris::Status::ConstructErrorStatus()
@ 0x559af9aacbee _ZN5doris16ThriftClientImpl4openEv.cold
@ 0x559af97f563a doris::ClientCacheHelper::_create_client()
@ 0x559af97f78cd doris::ClientCacheHelper::get_client()
@ 0x559af934f38b doris::MasterServerClient::report()
@ 0x559af932e7a7 doris::TaskWorkerPool::_handle_report()
@ 0x559af932f07c doris::TaskWorkerPool::_report_task_worker_thread_callback()
@ 0x559af9b223c5 doris::ThreadPool::dispatch_thread()
@ 0x559af9b187af doris::Thread::supervise_thread()
@ 0x7f661bd8bea5 start_thread
@ 0x7f661c09eb0d __clone

Co-authored-by: yiguolei <yiguolei@gmail.com>
2022-11-22 14:29:28 +08:00
6e3716e0ea [enhancement](regression) split ssb sf1 to sf0.1 to get smaller test data size (#14437) 2022-11-22 10:36:12 +08:00
1ec7f45fb6 [Bug](avg) Fix avg for bigint (#14433) 2022-11-22 10:29:59 +08:00
63f4b35f95 [bugfix](short_key) fix short key coder for nullable key (#14298) 2022-11-22 09:27:22 +08:00
fea9966728 [fix](parquet-orc) fix that be core dump when some columns specified are not in the parquet or orc file (#14440)
When some of the columns specified in a broker load are not in the parquet or orc file, _batch->num_columns() will be less than _num_of_columns_from_file, which leads to a BE core dump.
To prevent the BE core dump, just return an error in this case.
2022-11-22 09:10:38 +08:00
16d8a1853a [Bug](array-function) array set function not handle all null value (#14318) 2022-11-22 09:07:43 +08:00
034aa20b0a [fix](regression)when using regression-conf-custom.groovy, properties in regression-conf.groovy are missing #14458 2022-11-22 08:44:50 +08:00
ca486cdfbc [Enhancement](storage) optimize segment compaction log (#14448) (#14449)
Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>
2022-11-22 08:43:51 +08:00
74f694753b Fix the en docs of benchmark (#14459) 2022-11-22 08:40:51 +08:00
e3d764aac5 [test](jdbc) add new jdbc case in other source (#14443) 2022-11-21 21:33:06 +08:00
7624c80d83 [Feature](Kafka) Add kerberos support for kafka (#14431)
Compile librdkafka with Kerberos SASL GSSAPI support.
2022-11-21 20:45:50 +08:00
730cd1a0c1 [Feature](Nereids) Simplify range of predicate (#14113)
Simplify range of predicate

for example:
1. `a > 1 or a > 2` => `a > 1`
2. `a in (1,2,3) or a in (3,4,5)` => `a in (1,2,3,4,5)`
2022-11-21 20:24:03 +08:00
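The two rewrites listed in the commit above can be illustrated with a minimal sketch. This is not the Nereids implementation; the function names are hypothetical and only cover the two example shapes from the commit message (a disjunction of `>` comparisons on one column, and a disjunction of `in` lists on one column).

```python
def simplify_or_greater_than(bounds):
    """`a > b1 or a > b2 or ...` holds exactly when `a > min(b1, b2, ...)`.

    Hypothetical helper: returns the single lower bound that replaces the disjunction.
    """
    return min(bounds)


def simplify_or_in_lists(*value_lists):
    """`a in L1 or a in L2` is equivalent to `a in (L1 union L2)`.

    Hypothetical helper: returns the merged, de-duplicated value list in sorted order.
    """
    merged = set()
    for values in value_lists:
        merged.update(values)
    return sorted(merged)


print(simplify_or_greater_than([1, 2]))            # 1, i.e. `a > 1`
print(simplify_or_in_lists([1, 2, 3], [3, 4, 5]))  # [1, 2, 3, 4, 5]
```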
91bd76a902 [enhancement](FE) use forEach() to replace stream().forEach() (#14039) 2022-11-21 15:40:43 +08:00
a91fe11b4d [feature](Nereids) Add random test framework (#14388) 2022-11-21 15:16:03 +08:00
b36f3d7e61 [typo](docs) fix typo in schema-change.md (#14311) 2022-11-21 13:38:47 +08:00
Pxl
bcd641877f [Enhancement](scan) disable build key range and filters when push down agg work (#14248)
disable build key range and filters when push down agg work
2022-11-21 12:47:57 +08:00
ff197b0fa5 [chore](macOS) Fix linker errors (#14410) 2022-11-21 10:38:36 +08:00
ce489cf723 [Feature](JDBC)support clickhouse jdbc external table (#14244) 2022-11-21 10:33:53 +08:00
41dae8b6bb [improvement](load) add a log when close OlapTableSink with error (#14257) 2022-11-21 10:33:37 +08:00
a9a6fdd8c3 [fix](insert) fix insert into table which contains column name prefix mv_ (#14361) 2022-11-21 10:31:01 +08:00
0613ccda74 [feature](tools)profile viewer (#14429)
Reading a profile is painful, especially when there are multiple parallel instances.
This tool helps us grasp the main information of a profile in a graphical view.

The profile is represented as a tree.
Each SQL operation node contains its operation type (join, scan...), its node id, and its fragment id. The number on an arrow is the number of rows output by the child node. The tool sums the output rows of the same node across parallel instances; that is, if there are 4 parallel instances and each ScanNode on the lineitem table outputs 10 rows, the label on the arrow starting from ScanNode(lineitem) is 40.

Here is a demo for tpch Q2
tpch q2 profile viewer

Issue Number: close #xxx
2022-11-21 10:29:54 +08:00
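The row aggregation described in the profile viewer commit above can be sketched as follows. This is an illustration only, not the tool's code: the function name and the `(node_label, rows)` input shape are assumptions made for the example.

```python
from collections import defaultdict


def sum_output_rows(instance_rows):
    """Sum the output rows of the same node across parallel instances.

    `instance_rows` is a hypothetical input: one (node_label, rows) pair
    per node per parallel instance.
    """
    totals = defaultdict(int)
    for node_label, rows in instance_rows:
        totals[node_label] += rows
    return dict(totals)


# 4 parallel instances, each ScanNode on lineitem outputs 10 rows.
rows = [("ScanNode(lineitem)", 10)] * 4
print(sum_output_rows(rows))  # {'ScanNode(lineitem)': 40}
```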
4976021bf7 [Enhancement] Doris broker support aliyun-oss #13665 (#14305) 2022-11-21 10:29:14 +08:00
Pxl
c18a471303 [Optimize](predicate) update inplace on VcompoundPred (#14402)
select count(*) from lineorder where lo_orderkey<100000000 OR lo_orderkey>100000000 AND lo_orderkey<200000000 OR lo_orderkey >200000000;

0.6s -> 0.5s
2022-11-21 09:12:30 +08:00
3f29e3bff6 [bug](test) fix regression test of jdbc postgresql table core (#14417) 2022-11-20 23:03:14 +08:00
98cea90950 [typo](docs)benchmark doc fix number (#14427) 2022-11-20 22:51:42 +08:00
c29975d347 [Docs](function) Add some function do not in sidebars (#14426) 2022-11-20 22:50:52 +08:00
71e80e8957 [typo](docs)Performance test documentation update (#14147)
* Performance test documentation update
2022-11-20 09:40:57 +08:00
2ccb5209a0 (improvement)[doc] add document version tag instruction (#14406) 2022-11-20 00:05:53 +08:00
3489f4826c [fix](test) sync conf used in pipeline and in repository (#14414) 2022-11-20 00:05:08 +08:00
3e1e8db173 [fix](exec) fix thread token shutdown (#14418)
Fix the "Thread pool token was shut down" error.
This happens because when a query has more than one fragment on one BE, the thread token may be
reset incorrectly, causing the thread token to be shut down too early.
cherry-pick from master
Introduced from #13021
2022-11-20 00:04:48 +08:00
5dfe5ef965 [test](hive catalog)add hive catalog test case (#14217) 2022-11-19 17:26:18 +08:00
2c42f0a905 [refactor](decimalv3) Refine code for DecimalV3 (#14394) 2022-11-19 16:57:17 +08:00
1482ab32b6 [tools](tpch)fix invalid download url (#14329) 2022-11-19 13:29:33 +08:00
1f2c06dd6e [enhancement](rewrite) Remove unused wide common factors to improve scan performance in ExtractCommonFactorsRule (#14381)
* [enhancement](sql) Remove unused wide common factors to improve scan performance in ExtractCommonFactorsRule

* fix regression test

Co-authored-by: caiconghui1 <caiconghui1@jd.com>
2022-11-19 13:23:49 +08:00
f5f2e84e31 [refactor](planner) remove the limit return rows of order by (#12478)
Originally, an ORDER BY query returned at most 65535 rows by default,
but this limit no longer suits many workloads.
Users had to append a larger LIMIT to the query statement to get the full result,
which is extremely inconvenient, so the default limit has been removed.

This change also adds the variable DEFAULT_ORDER_BY_LIMIT to SessionVariable,
with a default value of -1. If the user does not use the LIMIT keyword or the LIMIT value is a negative integer,
the query returns up to Long.MAX_VALUE rows by default. If a maximum query value is set,
the number of rows returned follows that maximum or the value given after the
LIMIT keyword.
2022-11-19 12:45:44 +08:00
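One possible reading of the limit resolution described in the commit above, as a minimal sketch. The function name is hypothetical, and the precedence is an assumption: the commit message does not say which value wins when both an explicit LIMIT and the session variable are set, so this sketch lets a non-negative explicit LIMIT take precedence.

```python
LONG_MAX = 2**63 - 1  # value of Java's Long.MAX_VALUE


def effective_order_by_limit(limit=None, default_order_by_limit=-1):
    """Resolve how many rows an ORDER BY query may return.

    Assumed precedence (not confirmed by the commit message):
    1. a non-negative explicit LIMIT,
    2. otherwise a non-negative session default,
    3. otherwise Long.MAX_VALUE.
    """
    if limit is not None and limit >= 0:
        return limit
    if default_order_by_limit >= 0:
        return default_order_by_limit
    return LONG_MAX


print(effective_order_by_limit() == LONG_MAX)                          # True
print(effective_order_by_limit(limit=10))                              # 10
print(effective_order_by_limit(limit=-1, default_order_by_limit=100))  # 100
```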
1b6e872a8a [improvement](common) table name length exceeds limit error message (#14368)
In the table name check, a regex mismatch and a name that exceeds the length limit both display the message "Incorrect table name 'xxx'. Table name regex is 'xxx'".
This message does not make clear which kind of error occurred,
so it is better to separate the two error messages.
2022-11-19 11:36:08 +08:00
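The separation of the two error messages described in the commit above can be sketched like this. The regex, the length limit, and the wording of the length message are all hypothetical placeholders chosen for the example; only the regex-mismatch message format is taken from the commit text.

```python
import re

TABLE_NAME_REGEX = r"^[a-zA-Z][a-zA-Z0-9_]*$"  # hypothetical pattern
MAX_TABLE_NAME_LENGTH = 64                     # hypothetical limit


def check_table_name(name):
    """Return None if the name is valid, else a message naming the specific failure."""
    # Check the length first so an over-long name gets its own message.
    if len(name) > MAX_TABLE_NAME_LENGTH:
        return (f"Table name '{name}' is too long: {len(name)} characters, "
                f"limit is {MAX_TABLE_NAME_LENGTH}.")
    if not re.match(TABLE_NAME_REGEX, name):
        return (f"Incorrect table name '{name}'. "
                f"Table name regex is '{TABLE_NAME_REGEX}'.")
    return None


print(check_table_name("orders"))    # None
print(check_table_name("1bad"))      # regex message
print(check_table_name("a" * 65))    # length message
```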
512b787559 [fix](parquet-reader) fix stack-use-after-return error (#14411) 2022-11-19 10:52:50 +08:00