Commit Graph

1932 Commits

Author SHA1 Message Date
Pxl
10c3712aa1 [fix](vectorized) fix arithmetic calculate get wrong result(#8226) 2022-03-09 13:03:57 +08:00
826467e116 [fix](replica) handle replica version missing info to avoid -214 error (#8209)
In the original tablet reporting information, the missing-version state was derived by combining
two pieces of information:

1. the maximum consecutive version number
2. the `version_miss` field

The logic of this approach is confusing and inconsistent with the logic of checking for missing versions when querying.

After the change, we directly use the version checking logic used in the query, and set `version_miss` to true
if a missing version is found.

On the FE processing side, originally only the **bad replica** information was synchronized among FEs,
but not the **version missing** information. As a result, a non-master FE was not aware of the missing version information.

In the new design, we deprecate the original log persistence class `BackendTabletsInfo` and use the new
`BackendReplicasInfo` to record replica reporting information and write both **bad** and **version missing**
information to metadata so that other FEs can synchronize this information.
2022-03-09 13:03:22 +08:00
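The version check described in 826467e116 can be pictured with a minimal Python sketch. It assumes a replica's data is a set of rowsets, each covering an inclusive version range starting from version 0, and it treats any gap up to the expected version as a missing version. The name `has_missing_version` and the tuple layout are hypothetical and are not taken from the Doris code base.

```python
from typing import List, Tuple


def has_missing_version(rowsets: List[Tuple[int, int]], expected_version: int) -> bool:
    """Return True if the ranges do not cover 0..expected_version without a gap.

    Hypothetical helper: each rowset is an inclusive (start, end) version range.
    """
    next_needed = 0  # assumption: versions start at 0
    for start, end in sorted(rowsets):
        if start > next_needed:
            return True  # gap before this rowset
        next_needed = max(next_needed, end + 1)
        if next_needed > expected_version:
            return False  # everything up to expected_version is covered
    return next_needed <= expected_version


# A replica report would then carry a single flag, mirroring `version_miss`.
version_miss = has_missing_version([(0, 1), (3, 5)], expected_version=5)
```

Under these assumptions the same function could serve both the query path and the report path, which is the consistency the commit message is after.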
22bafef875 [fix](broker-load) fix bug that a cancelled job's state is LOADING (#8363)
1. Before executing the LoadLoadingTask of a broker load, we should check whether the job has been cancelled.

2. Add a new column `runningTransactionNum` to `show proc "/transactions"`,
so that we can view all running txns in each db with one command.
2022-03-08 18:53:45 +08:00
1e70f992e7 [improvement][fix](insert)(replay) support SHOW LAST INSERT stmt and fix json replay bug (#8355)
1. support SHOW LAST INSERT
    In the current implementation, the insert operation returns a JSON string describing the result
    of the insert. But this information is carried in the session track field of the MySQL protocol,
    which makes it difficult to obtain programmatically.

    Therefore, I provide a new syntax `show last insert` to explicitly obtain the result of the latest insert operation
    as a normal query result set, which makes the insert result easier for users to retrieve.

2. the `ReturnRows` field in fe.audit.log of insert operation will be set to the loaded row num of the insert.

3.  Fix a bug described in #8354
2022-03-08 18:53:11 +08:00
50a59f3f86 [license] Organize third-party dependent licenses for binary releases (#8350) 2022-03-07 23:18:58 +08:00
477b87cb28 [feature](vec) Support update stmt in vec query engine (#8296) 2022-03-07 14:03:55 +08:00
22a0011403 [fix](planner) Convert format in RewriteFromUnixTimeRule (#8235)
SQL to reproduce:
```
SELECT * FROM table WHERE FROM_UNIXTIME(d_datekey,'%Y-%m-%d %H:%i:%s') != '1970-08-20 00:11:43';

org.apache.doris.common.AnalysisException: errCode = 2, detailMessage = Unexpected exception: Illegal pattern character 'i'
        at org.apache.doris.qe.StmtExecutor.analyze(StmtExecutor.java:584) ~[palo-fe.jar:3.4.0]
        at org.apache.doris.qe.StmtExecutor.execute(StmtExecutor.java:345) ~[palo-fe.jar:3.4.0]
        at org.apache.doris.qe.StmtExecutor.execute(StmtExecutor.java:318) ~[palo-fe.jar:3.4.0]
        at org.apache.doris.qe.ConnectProcessor.handleQuery(ConnectProcessor.java:221) ~[palo-fe.jar:3.4.0]
        at org.apache.doris.qe.ConnectProcessor.dispatch(ConnectProcessor.java:361) ~[palo-fe.jar:3.4.0]
        at org.apache.doris.qe.ConnectProcessor.processOnce(ConnectProcessor.java:562) ~[palo-fe.jar:3.4.0]
        at org.apache.doris.mysql.nio.ReadListener.lambda$handleEvent$0(ReadListener.java:50) ~[palo-fe.jar:3.4.0]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
        at java.lang.Thread.run(Thread.java:835) [?:?]
```


Only the following formats are supported:
yyyy-MM-dd HH:mm:ss
yyyy-MM-dd
yyyyMMdd
2022-03-05 15:26:37 +08:00
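The stack trace in 22a0011403 shows a MySQL-style format string (`%i` for minutes) reaching a Java date formatter, which rejects the character `i`. A minimal Python sketch of the conversion, assuming the rule translates only the three listed formats and leaves anything else alone, could look like this. The names `SUPPORTED_FORMATS` and `to_java_pattern` are hypothetical, and returning `None` for unsupported input is an assumption about how the rewrite would be skipped.

```python
from typing import Optional

# MySQL-style format strings mapped to the equivalent Java date patterns.
# Only the three patterns listed in the commit message are covered.
SUPPORTED_FORMATS = {
    "%Y-%m-%d %H:%i:%s": "yyyy-MM-dd HH:mm:ss",
    "%Y-%m-%d": "yyyy-MM-dd",
    "%Y%m%d": "yyyyMMdd",
}


def to_java_pattern(mysql_format: str) -> Optional[str]:
    """Return the Java pattern for a supported MySQL format, else None."""
    return SUPPORTED_FORMATS.get(mysql_format)


pattern = to_java_pattern("%Y-%m-%d %H:%i:%s")
```

A lookup table keeps the supported set explicit; a general specifier-by-specifier translator would cover more formats but is outside what the commit message describes.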
f57f02bbf2 [improvement] Support show tablets stmt (#7970)
change `show tablet from tbl` to `show tablets from tbl`
2022-03-05 15:25:57 +08:00
9961b2c860 [refactor] Remove mysql-connector and replace org.json with com.googlecode.json-simple (#8319)
1. mysql-connector-java
    mysql-connector-java is under the GPLv2 license, which is not compatible with the Apache License 2.0, and Doris does not use it.

2. org.json
    org.json is under the JSON license, which is not compatible with the Apache License 2.0. I use `json-simple` to replace it.
2022-03-05 14:41:04 +08:00
9cf2798fb3 [typo] fix error for PushTask (#8316) 2022-03-05 14:40:32 +08:00
0383001442 [Enhancement] Support skipping replicas with slower compaction when selecting a queryable replica, for better scan performance (#8146) 2022-03-05 09:51:50 +08:00
46ca23f216 [Feature] Support Changing the bucketing mode of the table from Hash Distribution to Random Distribution (#8259)
Co-authored-by: caiconghui1 <caiconghui1@jd.com>
2022-03-04 09:05:23 +08:00
80e88159d9 [improvement](restore) allow query on part of partitions when others are in RESTORE (#8245) 2022-03-03 22:34:10 +08:00
c56a372e06 [improvement][fix](grouping-set)(tablet-repair) optimize compaction too slow replica process (#8123)
1. Optimize the handling of replicas whose compaction is too slow: a replica is set back to ok once its compaction is done,
and replicas are not set to bad if more than half of them are too slow.
2. Fix a bug where a field in grouping_id() that does not have the same name causes an error like
`select list expression not produced by aggregation output (missing from GROUP BY clause?): id`
2022-03-03 22:30:35 +08:00
f5ab0553ff [chore] remove some ut temp files and add some file to .gitignore (#8309) 2022-03-03 13:23:27 +08:00
f622ce0497 [refactor] remove types_test (#8289)
* [refactor] remove types_test
1. remove types_test; it causes a core dump with newer versions of GCC or
   clang because of memory alignment, since some code is vectorized by
   newer GCC or clang
2. Change string type length to 2 GB instead of -1
3. modify inaccessible code
2022-03-03 09:31:35 +08:00
18098c5ceb [fix](fe-ut) Fix FE unit test (#8293)
Fix following ut:
1. GlobalTransactionMgrTest
2. BackupJobTest
3. ReplicaTest
4. SparkLoadJobTest

Also remove old FE Meta version
2022-03-03 09:30:17 +08:00
09bfb8b9d3 [fix] (rpc-udf) Fixed the problem that the query could not be interrupted (#8248)
if an error occurred in the rpc server during the execution of rpc-udf.
Add Java, C++, and Python demos of the rpc-udf server.
2022-03-03 09:30:03 +08:00
114eb19518 [fix](ut) query stmt test error (#8303) 2022-03-03 09:29:42 +08:00
f41316a3ec [fix](fold-constant)(hive) fix constant-folding in order by and optimize logs in hive client (#8268)
fix: #7509 
1. fix order by clause constant folding
2. optimize Hive exception message
3. change hive file status log type
2022-03-02 10:17:17 +08:00
315bfe2d0e Revert "[chore](dependency) upgrade-grpc-version (#8218)" (#8250)
This reverts commit df7e848cbbc8170c7bd83d812d7cac58b5574570.

Reverts apache/incubator-doris#8218

With grpc 1.44.1, the corresponding `protoc-gen-grpc-java` plugin
requires GLIBC_2.14, which is not available on CentOS 6.

So I suggest reverting this commit for now and considering upgrading this component
after most systems have reached glibc version 2.14.

For Mac M1, you may have to change this version manually for now.
2022-03-02 10:16:25 +08:00
d5b6428c6d [improvement] Upgrade MySQL version to 5.7.37 to reduce unnecessary CVE issues (#8247) 2022-03-02 10:16:02 +08:00
236105daa0 [feature][show-transaction] Support view transactions info for specified status by SHOW TRANSACTION stmt (#8156)
SHOW TRANSACTION WHERE STATUS = 'prepare/precommitted/committed/visible/aborted';
2022-03-02 10:14:42 +08:00
93c638f3a2 [fix][chore](insert)(fe) Fix analysis error of insert stmt and modify grpc-netty dependency (#8265)
This bug was introduced by #8112.

Also, I changed the `grpc-netty` dependency to `grpc-netty-shaded` to avoid a dependency conflict:
```
java.lang.NoSuchMethodError: io.netty.buffer.PooledByteBufAllocator.
```
2022-03-01 11:12:10 +08:00
0fce094080 [typo] fix listdb description error (#8257)
Co-authored-by: zhaolipan <zhaolipan@shizhuang-inc.com>
2022-03-01 11:08:34 +08:00
27d2e3e949 [refactor](fe) Remove old fe meta version (#8246)
Remove old FE meta version < 100.
2022-02-28 17:47:01 +08:00
385ccf7c8a [fix](routine-load) fix show routine load task error (#8195) 2022-02-26 17:04:39 +08:00
87b96cfcd6 [feature](iceberg) Step3: Support query iceberg external table (#8179)
1. Add Iceberg scan node 
2. Add Iceberg/Hive table type in thrift 
3. Support querying Iceberg tables of format types `parquet` and `orc`
2022-02-26 17:04:11 +08:00
83521a826a [Feature](create_table) Support create table with random distribution to avoid data skew (#8041)
In some scenarios, users cannot find a suitable hash key to avoid data skew, so we need to provide an additional data distribution mode for OLAP tables.

example:
CREATE TABLE random_table
(
siteid INT DEFAULT '10',
citycode SMALLINT,
username VARCHAR(32) DEFAULT '',
pv BIGINT SUM DEFAULT '0'
)
AGGREGATE KEY(siteid, citycode, username)
DISTRIBUTED BY random BUCKETS 10
PROPERTIES("replication_num" = "1");

Co-authored-by: caiconghui1 <caiconghui1@jd.com>
2022-02-26 10:38:55 +08:00
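The motivation in 83521a826a can be illustrated with a small Python sketch that compares bucket sizes under hash and random distribution for a skewed key. Everything here is an illustration: Python's built-in `hash` stands in for whatever hash function the engine uses, the seeded `random.Random` stands in for its bucket selection, and the function names are hypothetical.

```python
import random
from collections import Counter
from typing import Iterable, List


def hash_buckets(keys: Iterable[int], num_buckets: int) -> List[int]:
    """Rows per bucket when each row goes to hash(key) % num_buckets."""
    counts = Counter(hash(k) % num_buckets for k in keys)
    return [counts.get(b, 0) for b in range(num_buckets)]


def random_buckets(num_rows: int, num_buckets: int, seed: int = 0) -> List[int]:
    """Rows per bucket when each row goes to a randomly chosen bucket."""
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    counts = Counter(rng.randrange(num_buckets) for _ in range(num_rows))
    return [counts.get(b, 0) for b in range(num_buckets)]


# Skewed input: 900 of 1000 rows share siteid = 10.
siteids = [10] * 900 + list(range(100))

by_hash = hash_buckets(siteids, 10)
by_random = random_buckets(len(siteids), 10)
```

With hash distribution every row carrying the hot key lands in the same bucket, while random distribution spreads rows roughly evenly regardless of key values. The cost is that rows with the same key are no longer co-located in one bucket.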
a630e037b9 [Enhancement](routine_load) Support show routine load statement with like predicate (#8188)
* [Enhancement](routine_load) Support show routine load with like predicate

Co-authored-by: caiconghui1 <caiconghui1@jd.com>
2022-02-26 10:35:38 +08:00
40c1fa2335 [refactor] change mysql server version to avoid some cve issues (#8223)
5.1.0 -> 5.1.73
2022-02-25 11:14:57 +08:00
f7c18d300c [Improvement] Add minimum fe meta version check (#8203)
The FE contains a lot of code for old FE meta versions, such as `if (FeMetaVersion < VERSION_45) xxxxx`.
The latest FE meta version is 107, so this code may never be reached,
but we have not removed it because old code paths are "sometimes" still needed.

Add a minimum required version check so that we can remove this old code.
2022-02-25 11:14:00 +08:00
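As a rough sketch of the idea in f7c18d300c, a minimum-version guard can fail fast when metadata older than a chosen threshold is loaded, which makes every branch below that threshold unreachable and safe to delete. The names `MINIMUM_VERSION_REQUIRED`, `MetaVersionTooOldError`, and `check_meta_version` are hypothetical, and the threshold of 100 is only an illustrative value.

```python
CURRENT_META_VERSION = 107      # latest FE meta version per the commit message
MINIMUM_VERSION_REQUIRED = 100  # illustrative threshold, an assumption


class MetaVersionTooOldError(Exception):
    """Raised when persisted metadata is older than the supported minimum."""


def check_meta_version(image_version: int) -> None:
    """Reject metadata whose version is below the minimum required version."""
    if image_version < MINIMUM_VERSION_REQUIRED:
        raise MetaVersionTooOldError(
            f"meta version {image_version} is older than the minimum "
            f"required version {MINIMUM_VERSION_REQUIRED}"
        )


check_meta_version(CURRENT_META_VERSION)  # passes silently
```

Once such a check runs at startup, a condition like `FeMetaVersion < VERSION_45` can never be true for metadata that was accepted.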
ddf08cc207 [refactor](fe) Remove version hash on FE side (#8099)
version hash is not used any more
2022-02-25 11:08:29 +08:00
df7e848cbb [chore](dependency) upgrade-grpc-version (#8218)
Upgrade grpc.version so that macOS with the M1 chip can build the FE correctly.
1.30.0 -> 1.44.1
2022-02-24 23:17:32 +08:00
0dcbfbdde0 [fix](load) Fix InsertStmt prepareExpressions (#8112)
Use queryStmt.getResultExprs() instead of queryStmt.getBaseTblResultExprs() in InsertStmt prepareExpressions func.
2022-02-24 23:12:51 +08:00
9120de205e [refactor] fix some typos (#8159) 2022-02-23 11:42:00 +08:00
273ced0219 [Build] Fix build fe error caused by Inaccessible pentaho-aggdesigner-algorithm jar (#8175)
Co-authored-by: caiconghui1 <caiconghui1@jd.com>
2022-02-22 09:30:39 +08:00
f13fd13e1b [fix] (schema change) Fix BE crash after schema change int column to varchar column(#8073) (#8142)
Co-authored-by: jianping.teng <tengjp@outlook.com>
2022-02-22 09:22:00 +08:00
c47368f80c [fix] (udf) fix check_fn and fn_call function name not same (#8132) 2022-02-22 09:18:07 +08:00
5cc8cb1b93 [improvement](txn) Add PreCommitTime for the result of SHOW TRANSACTION stmt (#8124)
Add `PreCommitTime` to the result of `SHOW TRANSACTION;` and `SHOW PROC '/transactions/{DbId}/{state}';`.
2022-02-19 12:02:07 +08:00
9df5b2dfdc [fix](variables) Fix bug that execute showVariablesStmt with where expression return empty resultset (#8094)
This bug was introduced by PR #7936, which changed the key type of connectionMap from Long to Integer,
so that connectionMap could not find the connectContext by connectionId.
2022-02-19 11:58:17 +08:00
8892780091 [Vectorized][Feature] support agg function percentile&&percentile_approx (#8066) 2022-02-18 13:42:24 +08:00
Z6N
920a6db5a7 Fix authentication failure caused by username@cluster:password being modified to cluster: username:password (#8115)

Co-authored-by: z6n <ztmailgo@gmail.com>
2022-02-18 11:19:17 +08:00
b7e07ee472 [fix](cache) Throws ClassCastException when there are multiple EXCEPT, INTERSECT and UNION in the local view (#8083)
Issue Number: close #8082
Throws ClassCastException when there are multiple EXCEPT, INTERSECT and UNION in the local view.
2022-02-18 10:56:37 +08:00
Pxl
e0dbf48682 [Vectorized] [AggFunction] Support group_concat (#8086) 2022-02-17 14:19:07 +08:00
289aacb78c [improvement] enable check_java_version (#8034)
Enable checking the Java version when Doris starts, to prevent problems caused by an inconsistency
between the Java version used to compile and the Java version used to run.
If the two versions differ, Doris will not start, and a prompt message will be given.
2022-02-17 11:16:45 +08:00
26289c28b0 [fix](load)(compaction) Fix NodeChannel coredump bug and modify some compaction logic (#8072)
1. Fix the problem of BE crash caused by destruct sequence. (close #8058)
2. Add a new BE config `compaction_task_num_per_fast_disk`

    This config specifies the maximum number of concurrent compaction tasks on a fast disk (typically an SSD),
    so that on high-speed disks we can execute more compaction tasks at the same time
    and compact the data as soon as possible.

3. Avoid frequent selection of unqualified tablet to perform compaction.
4. Modify some log level to reduce the log size of BE.
5. Modify some clone logic to handle error correctly.
2022-02-17 10:52:08 +08:00
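The effect of `compaction_task_num_per_fast_disk` in 26289c28b0 can be sketched as a per-disk concurrency limit that depends on the disk type. This is only an illustration under assumptions: the function names are hypothetical, the `per_disk` limit for ordinary disks is an assumed counterpart, and the default values 2 and 4 are illustrative rather than taken from the commit.

```python
def max_compaction_tasks(is_fast_disk: bool, per_disk: int, per_fast_disk: int) -> int:
    """Pick the concurrency limit for one disk based on its type."""
    return per_fast_disk if is_fast_disk else per_disk


def can_start_compaction(
    running_tasks: int,
    is_fast_disk: bool,
    per_disk: int = 2,       # illustrative limit for ordinary disks
    per_fast_disk: int = 4,  # illustrative limit for fast disks such as SSDs
) -> bool:
    """Return True if one more compaction task may start on this disk."""
    limit = max_compaction_tasks(is_fast_disk, per_disk, per_fast_disk)
    return running_tasks < limit


# With two tasks already running, only the fast disk accepts another one.
hdd_ok = can_start_compaction(2, is_fast_disk=False)
ssd_ok = can_start_compaction(2, is_fast_disk=True)
```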
264f38471c [feature](spark-load) add Hive Bitmap UDFs (#8036)
Hive Bitmap UDF provides UDFs for generating bitmap and bitmap operations in hive tables.
The bitmap in Hive is exactly the same as the Doris bitmap.
The bitmap in Hive can be imported into Doris through spark bitmap load.
2022-02-17 10:45:20 +08:00
e6fedff68f [Refactor][heartbeat] Make get fe heart response by thrift (#8035)
* [Refactor] Make get fe heart response by thrift

Co-authored-by: caiconghui1 <caiconghui1@jd.com>
2022-02-17 10:25:51 +08:00
Pxl
143c4085ee [Feature][Vectorized] support aggregate function ndv()/approx_count_distinct() (#8044) 2022-02-16 14:30:13 +08:00