4660 Commits

Author SHA1 Message Date
0fc8d2e029 [Bug](decimal) fix variance_samp and avg_weighted #19861 2023-05-19 16:44:36 +08:00
9d54545bac [Fix](inverted index) add datev2/datetimev2 for inverted index column type (#19845)
When we try to query array of datetimev2 column by inverted index, it returns an error like this:

CREATE TABLE `nested` (
 `qid` bigint(20) NULL,
 `tag` array<text> NULL,
 `creationDate` datetime NULL,
 `title` text NULL,
 `user` text NULL,
 `answers.user` array<text> NULL,
 `answers.date` array<datetimev2(0)> NULL,
 INDEX tag_idx (`tag`) USING INVERTED PROPERTIES("parser" = "english") COMMENT '',
 INDEX creation_date_idx (`creationDate`) USING INVERTED COMMENT '',
 INDEX title_idx (`title`) USING INVERTED COMMENT '',
 INDEX user_idx (`user`) USING INVERTED COMMENT '',
 INDEX answers_user_idx (`answers.user`) USING INVERTED COMMENT '',
 INDEX answers_date_idx (`answers.date`) USING INVERTED COMMENT ''
) ENGINE=OLAP
DUPLICATE KEY(`qid`)
COMMENT 'OLAP'
DISTRIBUTED BY HASH(`qid`) BUCKETS 18
PROPERTIES (
"replication_allocation" = "tag.location.default: 1",
"storage_format" = "V2",
"compression" = "ZSTD",
"light_schema_change" = "true",
"dynamic_schema" = "true",
"disable_auto_compaction" = "false"
); 

mysql> select * from nested.nested where tag match 'java' and `answers.date` element_le '2012-04-08T21:15:33.873Z' limit 10;
ERROR 1105 (HY000): errCode = 2, detailMessage = no function found for MATCH_ELEMENT_LE,`answers.date` MA
2023-05-19 14:57:01 +08:00
f46f0c84b2 [Enhancement](meta) Show remote data usage via SHOW DATA #19533 (#19752)
* [Enhancement](meta) Show remote data usage via SHOW DATA #19533

* [fix] correct some unit test results
2023-05-19 14:23:50 +08:00
c4900eb658 [Bug](DecimalV3) fix decimalv3 functions (#19801) 2023-05-19 14:10:01 +08:00
fcffb1d3de [minor](Nereids): add toString() for LogicalProperties (#19851) 2023-05-19 13:46:47 +08:00
92c6a3c53b [fix](Nereids) normalize repeat generate push down project with error nullable (#19831) 2023-05-19 13:15:42 +08:00
9c86cad4ec [improvement](session variable) add max execution time session variable like mysql and add setter attributes in variables (#19759)
1. Add session variable max_execution_time as an alias of query timeout; if the user sets max_execution_time, the query timeout is modified too.
2. Add a setter attribute to session variables, so that logic can live in a setter method instead of relying on field reflection.
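A minimal sketch of the alias-through-setter idea, in Python rather than the project's own code. The class name, the default timeout, and the millisecond-to-second conversion are all assumptions made for illustration; the commit only states that setting max_execution_time also changes the query timeout.

```python
class SessionVariables:
    """Hypothetical stand-in for a session-variable holder."""

    def __init__(self, query_timeout_s=300):
        # Assumed default; not taken from the commit.
        self.query_timeout_s = query_timeout_s
        self._max_execution_time_ms = query_timeout_s * 1000

    @property
    def max_execution_time_ms(self):
        return self._max_execution_time_ms

    @max_execution_time_ms.setter
    def max_execution_time_ms(self, value_ms):
        # The setter is the place for extra logic that plain
        # field assignment could not express.
        if value_ms <= 0:
            raise ValueError("max_execution_time must be positive")
        self._max_execution_time_ms = value_ms
        # Assumed conversion: milliseconds in, whole seconds out,
        # rounded up and never below one second.
        self.query_timeout_s = max(1, -(-value_ms // 1000))


session = SessionVariables()
session.max_execution_time_ms = 2500
```

After the assignment, both values move together, which is the behavior the commit describes for the alias.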
2023-05-19 12:42:47 +08:00
cf7083d58b [explain](point query) modify explain for SHORT-CIRCUIT query (#19820) 2023-05-19 11:50:08 +08:00
609b20bd02 [Feature](planner) use partial update in update from & delete from (#19262) 2023-05-19 09:46:29 +08:00
84bad03ccb [feature](nereids) set proper min/max value for column stats when minExpr/maxExpr is not available #19673 2023-05-19 09:02:40 +08:00
0dd361dbf7 [fix](tracing) fix the issue that a trace may track multiple queries (#19804) 2023-05-19 08:58:53 +08:00
6f6d744a2a [fix](nereids) avoid 0 row count in stats derive #19640
The estimated row count of a join is at least 1, to reduce error propagation.
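Why a floor of 1 limits error propagation can be sketched as follows. The function name and the simple cardinality formula are assumptions for illustration, not the estimator used by Nereids.

```python
def estimate_join_rows(left_rows, right_rows, selectivity):
    """Hypothetical join estimate: cross product scaled by selectivity."""
    raw = left_rows * right_rows * selectivity
    # Clamp to at least one row so a zero estimate cannot wipe out
    # every estimate that is derived from this one.
    return max(1.0, raw)


# A badly underestimated first join (selectivity 0) feeds a second join.
first = estimate_join_rows(10, 10, 0.0)
second = estimate_join_rows(first, 1000, 0.5)
```

Without the clamp, `first` would be 0 and every join stacked on top of it would also be estimated at 0 rows, whatever its own inputs look like.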
2023-05-19 08:54:24 +08:00
14620a6766 [minor](log) add details for unqueryable replicas (#19792)
Add a new FE config: show_details_for_unaccessible_tablet.
The default is false. When set to true, if a query is unable to select a healthy replica,
detailed information about all replicas of the tablet, including the specific reason why each is unqueryable,
will be printed out.
2023-05-19 08:53:57 +08:00
dc8a992bba [improve](nereids) check be status when column stats is unknown #19742
When forbid_unknown_col_stats is enabled and some column stats are unknown,
we check the BE status with StatisticsUtil.statsTblAvailable() and report an error according to that status.
2023-05-19 08:53:34 +08:00
1e8eb1c756 [fix](profile) Fix pipeline load channel profile #19828 2023-05-19 08:51:02 +08:00
adc5522c9b [bug](MTMV) Fix the wrong interpretation for NEVER REFRESH (#19800) 2023-05-18 23:56:56 +08:00
dfc4432e83 [improvement](jdbc catalog) Add adaptation to Oracle special character / table names (#19809) 2023-05-18 22:58:33 +08:00
f2b2a568de [fix](jdbc catalog)fixed oceanbase catalog row limit bug (#19796) 2023-05-18 22:05:51 +08:00
40ab4ce305 fix select resource groups bug (#19808) 2023-05-18 21:54:31 +08:00
481e9aebdb [Refactor](spark load) remove parquet scanner (#19251) 2023-05-18 19:19:13 +08:00
f68d3a660e [improvement](opentelemetry) upgrade opentelemetry jar to v1.26.0 and opentelemetry-cpp to v1.8.3 (#19733)
Why upgrade? Is anything wrong?

This tries to fix the problem with opentelemetry::v1::ext::http::client::curl::HttpOperation::Send(). The PR info has been updated.
2023-05-18 18:46:20 +08:00
ed85a10a70 [Fix](multi-catalog) Fix sync hms event failed. (#19555)
This is similar to #19344. Because HMS meta info is sometimes newer than HMS events, invoking org.apache.doris.datasource.hive.PooledHiveMetaStoreClient#getTable on a table that does not exist throws an error, and the event cannot be handled.
2023-05-18 18:34:18 +08:00
e67872d391 [fix](Nereids) fallback not work when cannot parse after forward (#19790) 2023-05-18 18:24:59 +08:00
294599ee45 [feature](jsonb) rename JSONB type name and function name to JSON (#19774)
To be more compatible with MySQL, rename JSONB type name and function name to JSON.

The old JSONB type name and jsonb_xx functions can still be used for backward compatibility.

One function, jsonb_extract, keeps its name, since json_extract is already used by a JSON string function and changing it needs more work. It will be changed later.
2023-05-18 16:16:52 +08:00
160d2be0d8 [minimal](Nereids) add more comments for the rewriter (#19788)
Only add some comments to the rewriter, because it had few comments before and was hard for newcomers to understand.
2023-05-18 14:47:25 +08:00
e45bc160c9 [fix](mtmv) fix bug that should not write edit log when replaying alter mv (#19781) 2023-05-18 13:34:05 +08:00
50370dead9 [fix](load) fix unified load converted failed when forwarding to master (#19779) 2023-05-18 12:28:32 +08:00
18c1081659 [fix](nereids) fix some nereids bugs (#19711)
1. add json_unquote and json_extract functions
2. remove mv related code in visitPhysicalOlapScan
3. forbid bitmap and hll type for topn node's sort exprs
4. HashDistributionInfo of olap scan node should use the slots from output not the full schema
5. SelectMaterializedIndexWithoutAggregate should use the filter node's output together with the predicate to get the correct mv
6. forbid SimplifyArithmeticRule for decimal type
7. make DecimalLiteral's type and value consistent with each other if the value is decimalv2
8. json_array needs to support an empty argument list
2023-05-18 11:33:56 +08:00
a3f06e5fbd [git](Nereids): ignore apache parquet. (#19765) 2023-05-18 10:54:46 +08:00
88ca4f3e6b [feature](like) make like regexp used as a sql function (#19755) 2023-05-18 10:03:12 +08:00
098dac20c2 [log](Nereids): add more debug info when check logicalproperties. (#19763) 2023-05-18 08:37:10 +08:00
c80c4477cf [Enhancement](broker-load) broker load show stmt support display cluster name if specified (#19392) 2023-05-18 00:10:15 +08:00
97d4778ecf [enhancement](schema) dynamic_partition.time_unit support year (#19551)
2023-05-17 23:49:15 +08:00
8aa7f0e188 [fix](catalog) fix the include_database_list not in effect (#19589) 2023-05-17 22:56:21 +08:00
60d5c82f44 [fix](tvf) fix the inconsistency between tvf backends function and show backends result (#19697) 2023-05-17 22:55:46 +08:00
be47a27013 [Fix](multi catalog, nereids)Fix FileQueryScanNode couldn't filter partition in nereids planner bug (#19564)
The Nereids planner adds conjuncts to the ScanNode after calling finalize. This can cause an external table scan node to fail to filter out
useless partitions, because external tables do partition pruning in the finalize method.

This PR fixes the bug. In the rewrite stage, the conjuncts are passed to the LogicalFileScan object and eventually to
the ScanNode when it is created, so the ScanNode can use the conjuncts during finalize.

Why not do the partition pruning in LogicalFileScan, as LogicalOlapScan does?

Because the Iceberg API has no partition concept and just accepts a list of conjuncts,
it is easier to pass the conjuncts to the ScanNode (Hive, Iceberg, Hudi...) and do the partition pruning there.
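The ordering problem described above can be sketched in a few lines of Python. `FileScanNode`, its fields, and the sample partitions are hypothetical stand-ins; the real planner classes are Java and considerably more involved.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Partition = Dict[str, str]
Conjunct = Callable[[Partition], bool]


@dataclass
class FileScanNode:
    """Hypothetical scan node that prunes partitions inside finalize()."""
    partitions: List[Partition]
    conjuncts: List[Conjunct] = field(default_factory=list)
    selected: List[Partition] = field(default_factory=list)

    def finalize(self) -> None:
        # Pruning only sees the conjuncts that are present at this moment.
        self.selected = [p for p in self.partitions
                         if all(c(p) for c in self.conjuncts)]


def year_is_2023(partition: Partition) -> bool:
    return partition["year"] == "2023"


PARTS = [{"year": "2022"}, {"year": "2023"}]

# Before the fix: the conjunct arrives after finalize, so nothing is pruned.
late = FileScanNode(PARTS)
late.finalize()
late.conjuncts.append(year_is_2023)

# After the fix: the conjunct is passed at creation, so finalize can prune.
early = FileScanNode(PARTS, conjuncts=[year_is_2023])
early.finalize()
```

Here `late.selected` still holds both partitions while `early.selected` holds only the 2023 one, which mirrors the difference the commit describes.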
2023-05-17 21:39:59 +08:00
b5f60bde42 [fix](checkpoint)fix Checkpoint error when http server is not ready #19699 2023-05-17 21:33:56 +08:00
2993cdb36e [fix](multi-catalog)fix iceberg catalog display type #19728 2023-05-17 21:33:18 +08:00
1d05feea1b [Feature](Nereids) add executable function to support fold constant for functions (#18209)
1. Add date-time functions for fold constant for Nereids.
This is the list of executable date-time function nereids supports up to now:
- now()
- now(int)
- current_timestamp()
- current_timestamp(int)
- localtime()
- localtimestamp()
- curdate()
- current_date()
- curtime()
- current_time()
- date_{add/sub}(),{years/months/days/hours/minutes/seconds}_{add/sub}()
- datediff()
- {date/datev2}()
- {year/quarter/month/day/hour/minute/second}()
- dayof{year/month/week}()
- date_format()
- date_trunc()
- from_days()
- last_day()
- to_monday()
- from_unixtime()
- unix_timestamp()
- utc_timestamp()
- to_date()
- to_days()
- str_to_date()
- makedate()

2. Problems solved:
- enable datev2/datetimev2 by default.
- refactor Nereids foldConstantOnFE and support folding nested expressions.
- separate the executable functions into multiple files for easier reading and adding of new functions
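Folding nested expressions through a table of executable functions can be sketched as below. The tuple-based expression format and the three registered functions are assumptions for illustration; they only borrow names from the list above and do not reproduce the Nereids implementation.

```python
import datetime as dt

# Hypothetical registry of functions that can be evaluated at plan time.
EXECUTABLE = {
    "year": lambda d: d.year,
    "days_add": lambda d, n: d + dt.timedelta(days=n),
    "datediff": lambda a, b: (a - b).days,
}


def is_literal(expr):
    # Assumed encoding: calls and column references are tuples,
    # anything else is a literal value.
    return not isinstance(expr, tuple)


def fold(expr):
    """Fold constant sub-expressions bottom-up.

    expr is a literal, ("col", name), or ("call", func_name, [args]).
    """
    if is_literal(expr) or expr[0] == "col":
        return expr
    _, name, args = expr
    folded_args = [fold(a) for a in args]  # fold children first
    if name in EXECUTABLE and all(is_literal(a) for a in folded_args):
        return EXECUTABLE[name](*folded_args)
    # Something is not constant: keep the call, with folded children.
    return ("call", name, folded_args)


nested = ("call", "year",
          [("call", "days_add", [dt.date(2023, 5, 17), 365])])
partial = ("call", "datediff",
           [("col", "d"),
            ("call", "days_add", [dt.date(2023, 5, 17), 1])])
```

`fold(nested)` collapses to a single literal, while `fold(partial)` keeps the outer call because one argument is a column, yet still folds the inner `days_add`.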
2023-05-17 21:26:31 +08:00
1eb929e1ca [Bugfix](Jdbc Catalog) fix data type mapping of SQLServer Catalog (#19525)
We map the `money/smallmoney` types of SQL Server to the decimal type of Doris.
2023-05-17 21:02:42 +08:00
30c4f25cb3 [fix](multi-catalog) verify the precision of datetime types for each data source (#19544)
Fix three bugs in timestampv2 precision:
1. The Hive catalog doesn't set the precision of timestampv2 and can't get it from the Hive metastore, so use the largest precision for timestampv2.
2. The JDBC catalog uses datetimev1 to parse timestamps and then converts to timestampv2, so the precision is lost.
3. TVF doesn't use the precision from the metadata of the file format.
2023-05-17 20:50:15 +08:00
73be97f8d8 [fix](meta) fix upgrade failed on FE meta from 1.2 (#19674)
Introduced from #19355.
We need to keep OP_CREATE_CLUSTER so that Doris can be upgraded from 1.2.x.
This OP type should be removed after 3.0
2023-05-17 20:48:58 +08:00
131e77a816 [Fix](Nereids) fix minidump parameter name and double not a number serialize bug (#19635)
Change the Nereids minidump switch from "Dump_nereids" to "enable_minidump", which is more accurate and concise. Also fix a bug in serializing Double.NaN (not a number) during column statistics serialization.
2023-05-17 20:16:50 +08:00
5d5db157d0 [Fix](planner) fix literal type incompatible after fold constant by be. (#19190) 2023-05-17 19:54:29 +08:00
05d47d43bd [Fix](Nereids) check the tableName in catalog (#19695)
# Proposed changes
In Nereids, before this PR, accessing a nonexistent table reported the following exception:
```
mysql> select * from tt;
ERROR 1105 (HY000): errCode = 2, detailMessage = Unexpected exception: null
```

After this PR, the result is:
```
mysql> select * from tt;
ERROR 1105 (HY000): errCode = 2, detailMessage = Unexpected exception: Table [tt] does not exist in database [default_cluster:test].
```

## Problem summary
This happens because this [function](f5af07f7b2/fe/fe-core/src/main/java/org/apache/doris/nereids/CascadesContext.java (L328)) ignores the exception, so `tables` in `CascadesContext` is empty rather than null. As a result, `table = cascadesContext.getTableByName(tableName);` can only return null.
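The shape of the fix can be sketched as a lookup that raises a descriptive error instead of handing back nothing. This is a Python illustration only; `TableNotFound` and this `get_table_by_name` signature are hypothetical and are not the Java method named above.

```python
class TableNotFound(Exception):
    """Hypothetical error type for a missing table."""


def get_table_by_name(tables, database, name):
    table = tables.get(name)
    if table is None:
        # Fail here with the table and database names, rather than
        # letting a bare None surface later as "exception: null".
        raise TableNotFound(
            f"Table [{name}] does not exist in database [{database}]."
        )
    return table


try:
    get_table_by_name({}, "default_cluster:test", "tt")
    message = ""
except TableNotFound as exc:
    message = str(exc)
```

The resulting message carries the same information as the "after" output shown in this commit.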
2023-05-17 19:48:30 +08:00
bee2e2964f [refactor](Nereids) refactor adjust nullable rule as a custom rewriter (#19702)
Use a custom rewriter to adjust nullable, to avoid nullable being changed in an expression but not in the output.
2023-05-17 19:24:42 +08:00
ce12cf404c [bugfix](inverted index) Fix mv inheriting unexpectedly inverted index of base table (#19722) 2023-05-17 17:18:07 +08:00
6ba2f681af [fix](Nereids) result error when do agg with distinct under pipeline (#19735) 2023-05-17 17:08:42 +08:00
Pxl
800de168db [Chore](function) clean some unused function symbols (#19649)
2023-05-17 15:31:51 +08:00
cc9d340400 [Fix](Nereids) Fix minidump connect context loading and concurrency bug (#19578)
There are two problems with minidump:
1. minidump does not load the connect context into ThreadInfo, so it cannot be obtained easily.
2. minidump writes maps without concurrency protection, so a map's size can change while we iterate over it.

Solution:
1. Load the connect context into the minidump thread.
2. Use an immutable map to copy the map before actually iterating.
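The copy-before-iterate idea in the second point can be sketched in Python, where mutating a dict during iteration raises a comparable error. The function names are hypothetical, and the writer is simulated by a callback on the same thread instead of a real concurrent thread.

```python
from types import MappingProxyType


def snapshot(live_map):
    """Return a read-only copy that later writes to live_map cannot affect."""
    return MappingProxyType(dict(live_map))


def dump_entries(live_map, on_entry):
    """Iterate over a snapshot while on_entry is free to mutate live_map."""
    dumped = []
    for key, value in snapshot(live_map).items():
        on_entry(key)  # simulated writer touching the live map
        dumped.append((key, value))
    return dumped


live = {"a": 1, "b": 2}
result = dump_entries(live, lambda key: live.update({key + "_x": 0}))
```

Iterating over `live` directly with the same callback would fail with `RuntimeError: dictionary changed size during iteration`; the snapshot keeps the dump stable while the live map keeps growing.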
2023-05-17 15:09:00 +08:00