Split ExternalFileScanNode into FileQueryScanNode and FileLoadScanNode.
Remove some dead code in FileLoadScanNode.
Remove unused config items: enable_vectorized_load and enable_new_load_scan_node.
1. Remove TypeCoercion and CharacterLiteralTypeCoercion.
2. Nereids Cast no longer relies on the legacy planner's analyze().
3. Fix the following problems in the legacy planner; after this PR:
a. BOOLEAN can be cast to DECIMALV2 explicitly
b. comparing BOOLEAN with DATE casts both sides to DOUBLE
c. HLL cannot be implicitly cast to any other type
When the group-by keys do not contain a unique column:
1. without distinct: we prefer a two-phase aggregate over a one-phase aggregate
2. with distinct: we prefer a three-phase aggregate over a two-phase aggregate
The extra local phase pre-aggregates rows before the shuffle, which pays off when the group-by keys repeat, e.g. `select count(distinct c2) from t group by c1` where c1 is not unique.
steps to reproduce:
1. create any catalog re [OK]
2. switch re [OK]
3. show catalogs [OK]
4. drop catalog re [OK]
5. show catalogs [FAIL with "Current catalog is not exist, please switch catalog."]
expected:
`show catalogs` should always succeed; it should not depend on the current catalog.
When be_exec_version is less than 2, murmur hash is still used; otherwise crc32 is used. Once be_exec_version has been fully upgraded to 2, please remove this compatibility code.
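A minimal sketch of the version gate, with illustrative names (hash_row, murmur_hash3_32, and crc32_hash are not the actual BE API):

```cpp
#include <cstddef>
#include <cstdint>

// Stand-in declarations for the two hash implementations (illustrative only).
uint32_t murmur_hash3_32(const void* data, size_t len, uint32_t seed);
uint32_t crc32_hash(const void* data, size_t len, uint32_t seed);

// Gate the hash choice on be_exec_version so that, during a rolling upgrade,
// old and new BEs keep hashing rows identically.
uint32_t hash_row(const void* data, size_t len, int be_exec_version, uint32_t seed) {
    return be_exec_version < 2 ? murmur_hash3_32(data, len, seed)  // legacy path
                               : crc32_hash(data, len, seed);      // from version 2 on
}
```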
If an inner join is implemented by NLJ, the runtime filter generation phase terminates early and the node's children are not traversed. We fix it by adjusting the order in which we traverse the children and handle the node itself, so the children are always visited.
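A minimal sketch of the traversal fix, with illustrative names (PlanNode and generate_runtime_filters are not the actual planner API):

```cpp
#include <vector>

struct PlanNode {
    bool is_nlj_inner_join = false;
    std::vector<PlanNode*> children;
};

// Before the fix, returning early on an NLJ inner join also skipped the
// subtree, so no runtime filters were generated below it. Recursing into
// the children first guarantees the whole plan is visited.
void generate_runtime_filters(PlanNode* node) {
    for (PlanNode* child : node->children) {
        generate_runtime_filters(child);  // children are always traversed
    }
    if (node->is_nlj_inner_join) {
        return;  // this node itself produces no runtime filter
    }
    // ... generate runtime filters for join types that support them ...
}
```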
1. If the hadoop user property is set along with kerberos info, authentication will fail.
2. Fix some minor issues of the local fs, follow-up of #18397
3. Add KW_HOSTNAME to the keywords region, follow-up of #17329
4. Fix tvf not working with the pipeline engine, follow-up of #18376
Introduced by #17884.
When replaying a catalog from the image, we should not call `catalog.getProperties()`,
because it visits the resource mgr, which has not been replayed yet.
Currently, the AnalysisException thrown by matchSql is caught immediately;
however, the AnalysisException thrown by checkLimitations is caught as a UserException.
Sometimes, `show load profile` will only show part of the insert operation's profile.
This is because we assume that for every load operation (including insert), there is only one fragment in the plan.
But actually, a plan may contain more than one fragment, e.g.
`insert into tbl1 select * from tbl1 limit 1` has 2 fragments.
This PR mainly changes:
1. Modify the `show load profile` path format
Before: `show load profile "/queryid/taskid/instanceid";`
After: `show load profile "/queryid/taskid/fragmentid/instanceid";`
2. Modify the display of `ReadColumns` in OlapScanNode
Because for a wide table the `ReadColumns` line may be too long to show in the profile,
it is wrapped so that each line contains at most 10 column names, as in the sketch below.
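A minimal sketch of the wrapping idea (the actual code lives in the FE; the function name is illustrative):

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Join column names, breaking the line after every 10 names so that wide
// tables stay readable in the profile output.
std::string format_read_columns(const std::vector<std::string>& cols) {
    std::string out;
    for (std::size_t i = 0; i < cols.size(); ++i) {
        if (i > 0) out += (i % 10 == 0) ? ",\n" : ", ";
        out += cols[i];
    }
    return out;
}
```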
Optimize string functions: concat, convert_to, mask, initcap, lower, upper.
The concat function gets about a 29% speedup from memcpy_small_allow_read_write_overflow15.
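The helper copies in fixed 16-byte chunks with no tail handling, relying on padding in the buffers; a minimal sketch of the idea (not the exact implementation):

```cpp
#include <cstddef>
#include <cstring>

// Copy in 16-byte chunks without handling the tail: it may read and write
// up to 15 bytes past `n`, so both buffers need 15 bytes of padding.
// Each memcpy(_, _, 16) compiles down to one unaligned 16-byte move.
inline void memcpy_allow_overflow15(char* dst, const char* src, std::ptrdiff_t n) {
    while (n > 0) {
        std::memcpy(dst, src, 16);
        dst += 16;
        src += 16;
        n -= 16;
    }
}
```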
Optimize comparison against a constant empty string:
(1) When the constant is the empty string '' (size 0), we can compare the offsets directly with SIMD instead of touching the character data; see the sketch after the queries below.
q10: SELECT MobilePhoneModel, COUNT(DISTINCT UserID) AS u FROM hits WHERE MobilePhoneModel <> '' GROUP BY MobilePhoneModel ORDER BY u DESC LIMIT 10;
q11: SELECT MobilePhone, MobilePhoneModel, COUNT(DISTINCT UserID) AS u FROM hits WHERE MobilePhoneModel <> '' GROUP BY MobilePhone, MobilePhoneModel ORDER BY u DESC LIMIT 10;
q12: SELECT SearchPhrase, COUNT(*) AS c FROM hits WHERE SearchPhrase <> '' GROUP BY SearchPhrase ORDER BY c DESC LIMIT 10;
q13: SELECT SearchPhrase, COUNT(DISTINCT UserID) AS u FROM hits WHERE SearchPhrase <> '' GROUP BY SearchPhrase ORDER BY u DESC LIMIT 10;
q14: SELECT SearchEngineID, SearchPhrase, COUNT(*) AS c FROM hits WHERE SearchPhrase <> '' GROUP BY SearchEngineID, SearchPhrase ORDER BY c DESC LIMIT 10;
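This works because, in the offsets layout of a string column, string i is empty iff offsets[i] equals offsets[i-1]. A minimal sketch, assuming 32-bit offsets (the actual offset type may differ):

```cpp
#include <cstddef>
#include <cstdint>

// Evaluate `col <> ''` from the offsets alone: string i is non-empty iff
// its end offset differs from the previous one. The loop is branch-free
// over plain integers, so the compiler auto-vectorizes it with SIMD compares.
void not_equal_empty(const uint32_t* offsets, std::size_t rows, uint8_t* result) {
    if (rows == 0) return;
    result[0] = (offsets[0] != 0);  // the first string starts at offset 0
    for (std::size_t i = 1; i < rows; ++i) {
        result[i] = (offsets[i] != offsets[i - 1]);
    }
}
```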
Currently, if a catalog is modified externally, Doris is not dynamically aware of it.
So if a catalog is created with a refresh-time configuration, a timer is added to refresh the catalog regularly, as in the sketch below.
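The actual feature lives in the Java FE; this C++ sketch only illustrates the per-catalog timer idea (all names are illustrative):

```cpp
#include <atomic>
#include <chrono>
#include <functional>
#include <thread>

// Fire a refresh callback at a fixed interval until the catalog is dropped.
class CatalogRefreshTimer {
public:
    CatalogRefreshTimer(std::chrono::seconds interval, std::function<void()> refresh)
        : worker_([this, interval, refresh = std::move(refresh)] {
              while (!stopped_.load()) {
                  std::this_thread::sleep_for(interval);
                  if (!stopped_.load()) refresh();  // reload external metadata
              }
          }) {}

    ~CatalogRefreshTimer() {
        stopped_.store(true);  // the worker exits after its current sleep
        worker_.join();
    }

private:
    std::atomic<bool> stopped_{false};
    std::thread worker_;
};
```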