Commit Graph

13721 Commits

Author SHA1 Message Date
ed368d7f6c [chore](build) Ignore clucene checks (#19353) 2023-05-07 09:38:44 +08:00
6c21df6324 [tools](tpch) run mode like clickbench (#19339) 2023-05-06 23:33:26 +08:00
9203e0392f [typo](docs) add mac local dev docs (#19342)
2023-05-06 22:58:40 +08:00
5bf1396efe [enhancement](load) merge single-replica related services as non-standalone (#18421) 2023-05-06 22:54:56 +08:00
9edbfa37cd [Enhancement](Broker Load) New progress manager for showing loading progress status (#19170)
This work is at an early stage: the current progress is not accurate because the scan-range granularity is too coarse
for gathering information, and only the file scan node and import jobs support the new progress manager.

## How it works

For example, when we use the following load statement:
```
LOAD LABEL test_broker_load
(
	DATA INFILE("XXX")
	INTO TABLE `XXX`
        ......
)
```

Initial progress: the query calls `BrokerLoadJob` to create the job, then the `coordinator` is called to calculate the scan ranges and their locations.
Update progress: the BE reports `runtime_state` to the FE, and the FE updates the progress status according to the jobID and fragmentID.

We can use `show load` to see the progress:

PENDING:
```
         State: PENDING
      Progress: 0.00%
```

LOADING:
```
         State: LOADING
      Progress: 14.29% (1/7)
```

FINISHED:
```
         State: FINISHED
      Progress: 100.00% (7/7)
```
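
From the outputs above, the displayed percentage appears to be finished scan ranges over total scan ranges. A hypothetical Java sketch of the formatting (the method and parameter names are illustrative, not the actual FE code):

```java
// Format progress as "percent (finished/total)", e.g. formatProgress(1, 7) -> "14.29% (1/7)".
static String formatProgress(int finishedScanRanges, int totalScanRanges) {
    double pct = totalScanRanges == 0 ? 0.0 : 100.0 * finishedScanRanges / totalScanRanges;
    return String.format("%.2f%% (%d/%d)", pct, finishedScanRanges, totalScanRanges);
}
```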

Currently, the full output of `show load\G` looks like:

```
*************************** 1. row ***************************
         JobId: 25052
         Label: test_broker
         State: LOADING
      Progress: 0.00% (0/7)
          Type: BROKER
       EtlInfo: NULL
      TaskInfo: cluster:N/A; timeout(s):250000; max_filter_ratio:0.0
      ErrorMsg: NULL
    CreateTime: 2023-05-03 20:53:13
  EtlStartTime: 2023-05-03 20:53:15
 EtlFinishTime: 2023-05-03 20:53:15
 LoadStartTime: 2023-05-03 20:53:15
LoadFinishTime: NULL
           URL: NULL
    JobDetails: {"Unfinished backends":{"5a9a3ecd203049bc-85e39a765c043228":[10080]},"ScannedRows":39611808,"TaskNumber":1,"LoadBytes":7398908902,"All backends":{"5a9a3ecd203049bc-85e39a765c043228":[10080]},"FileNumber":1,"FileSize":7895697364}
 TransactionId: 14015
  ErrorTablets: {}
          User: root
       Comment: 
```

## TODO:

1. The current partition granularity of the scan range is too coarse, resulting in uneven progress during loading.
2. Only broker load supports the new progress manager; add progress support for other query types.
2023-05-06 22:44:40 +08:00
2fe9ba7c2a [fix](jdbc catalog) fix trino jdbc catalog varchar type err (#19298) 2023-05-06 17:16:28 +08:00
4c6ca88088 Revert "[refactor](function) ignore DST for function from_unixtime (#19151)" (#19333)
This reverts commit 9dd6c8f87b73db238bfd38fb1d76f3796910f398.
2023-05-06 16:33:58 +08:00
f584ad52ca [UDF](demo) add new demo code for java udf (#19276) 2023-05-06 16:17:54 +08:00
626a4c2ab0 [RegressionTest](pipeline) coredump when running regression tests in pipeline engine (#19306) 2023-05-06 14:54:17 +08:00
3f6e5118e6 [enhancement](statistics) support periodic collection of statistics (#19247)
This PR enables periodic collection of statistics and is a precursor to automatic statistics collection. It mainly includes the following:

1. Support periodic collection of statistics.
2. Change the Date type in the statistics p0 tests to DateV2 (see [Enhancement](data-type) add FE config to prohibit create date and decimalv2 type #19077) for local testing; complement cases (remove Chinese characters, optimize code, etc.) and improve stability.
3. Support setting whether to keep records of statistics synchronization job info, which is convenient for p0 testing.
4. The statistics job table was modified, and some auxiliary checks were added so that users do not perceive the modification; this logic will be removed once the table schema is stable.
2023-05-06 14:53:06 +08:00
ccd22c508a [chore](fe) Fix the build on Centos 6 (#19255) 2023-05-06 14:50:56 +08:00
3287f350de [feature](table) implement round-robin BE selection when creating tablets (#19167) 2023-05-06 14:46:48 +08:00
83040c8f25 [feature](S3FileWriter) Reduce network RTT for files that multipart are not applicable (#19135)
For files smaller than 5 MB, we don't need multipart upload, which takes at least 3 network round trips.
Instead, we can just call PutObject, which takes a single round trip.
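
Doris's S3FileWriter is C++; purely to illustrate the size-threshold idea (not the actual implementation), here is a Java sketch using the AWS SDK, where everything except the 5 MB threshold is an assumption:

```java
import java.io.File;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.transfer.TransferManagerBuilder;

class SmallFileUpload {
    // 5 MB threshold from the commit message; S3 multipart parts must be >= 5 MB anyway.
    private static final long MULTIPART_THRESHOLD = 5L * 1024 * 1024;

    static void upload(AmazonS3 s3, String bucket, String key, File file) throws InterruptedException {
        if (file.length() < MULTIPART_THRESHOLD) {
            // Small file: a single PutObject call, one network round trip.
            s3.putObject(bucket, key, file);
        } else {
            // Large file: multipart upload (initiate + upload parts + complete, >= 3 round trips).
            TransferManagerBuilder.standard().withS3Client(s3).build()
                    .upload(bucket, key, file)
                    .waitForCompletion();
        }
    }
}
```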
2023-05-06 14:46:18 +08:00
ff6e0d3943 [Improvement](meta) support return no partition info for show_create_table (#19030)
Some tables have a large number of partitions; when you run a show create table statement on them,
the result has so many lines that a whole screen cannot show them all, even if you scroll to the top.

show create table table2;
| table2 | CREATE TABLE `table2` (
  `k1` int(11) NULL COMMENT 'test column k1',
  `k2` int(11) NULL COMMENT 'test column k2'
) ENGINE=OLAP                                        
DUPLICATE KEY(`k1`, `k2`)
COMMENT 'test table1'          
PARTITION BY RANGE(`k1`)           
(PARTITION p01 VALUES [("-2147483648"), ("10")),
PARTITION p02 VALUES [("10"), ("100"))) 
 DISTRIBUTED BY HASH(`k1`) BUCKETS 1
PROPERTIES (                                                                                                                        
"replication_allocation" = "tag.location.default: 1",
"storage_format" = "V2",
"light_schema_change" = "true",
"disable_auto_compaction" = "false"
);


show brief create table table2;
| table2 | CREATE TABLE `table2` (  `k1` int(11) NULL COMMENT 'test column k1',
  `k2` int(11) NULL COMMENT 'test column k2'
) ENGINE=OLAP
DUPLICATE KEY(`k1`, `k2`)
COMMENT 'test table1'
DISTRIBUTED BY HASH(`k1`) BUCKETS 1
PROPERTIES (
"replication_allocation" = "tag.location.default: 1",
"storage_format" = "V2",
"light_schema_change" = "true",
"disable_auto_compaction" = "false"
); |
2023-05-06 14:45:08 +08:00
28b5ef436a [improvement](scripts) modify download scripts to shorten the dir name (#19330)
1. Rename the download dirs to short names: fe, be, dependencies
2. Remove the Tsinghua mirror source for 1.2.3
3. Change the 1.2.3 download link to point to the archive
2023-05-06 14:00:38 +08:00
bd23db762d [minor](stats) Add doc for stats framework (#19311) 2023-05-06 13:30:55 +08:00
1223f81228 [doc](flinkconnector) fix english doc #19315
Co-authored-by: wudi <>
2023-05-06 12:04:26 +08:00
dff669899a [Feature](generic-aggregation) add some type define for generic aggregate functions support (#19252)
2023-05-06 11:30:13 +08:00
cdfbfd1f6b [fix](replica) Fix inconsistent replica id between FE and BE (#18688) 2023-05-06 11:06:29 +08:00
a72eee24f1 [fix](nereids) fix merge project with window function bug (#19280)
1. don't merge projects if any window function exists
2. bypass SimplifyArithmeticRule for decimalV3 type
2023-05-06 10:38:14 +08:00
3ddedb676c [fix](status) do not capture stacktrace for META_KEY_NOT_FOUND (#19308)
Also handle PUSH_VERSION_ALREADY_EXIST.
2023-05-06 10:04:28 +08:00
811ba73ffe [refactor](scan) avoid unnecessary function call (#19299)
2023-05-06 10:03:51 +08:00
987a85c6e9 [FIX](regress_testcase) fix sync with stream load #19309
When querying from a non-master node, the stream load result may not be selectable even though the published version is visible, so add a sync SQL statement to make the result visible on that node.
2023-05-06 10:03:02 +08:00
c936810e83 [fix](compile) fix bug in build.sh (#19314)
fix path for $(dirname $0)/generated-source.sh to enable docker build
2023-05-06 10:00:20 +08:00
3ece5b801c [fix](FileReader) broker reader is not thread-safe and can't be prefetched (#19321)
Fix errors when using brokers to load csv/json files:

5# doris::ClientCacheHelper::reopen_client(std::function<doris::ThriftClientImpl* (doris::TNetworkAddress const&, void**)>&, void**, int) [clone .cold] at /root/doris/be/src/runtime/client_cache.cpp:84
6# doris::io::BrokerFileReader::read_at_impl(unsigned long, doris::Slice, unsigned long*, doris::io::IOContext const*) [clone .cold] at /root/doris/be/src/io/fs/broker_file_reader.cpp:104
7# doris::io::FileReader::read_at(unsigned long, doris::Slice, unsigned long*, doris::io::IOContext const*) at /root/doris/be/src/io/fs/file_reader.cpp:31
8# doris::io::PrefetchBuffer::prefetch_buffer() at /root/doris/be/src/io/fs/buffered_reader.cpp:71
2023-05-06 09:16:56 +08:00
153f42a873 [enhancement](exprcontext) make the get_output_block_after_execute_expr method clearer to avoid misuse (#19310)
The original method signature was Block VExprContext::get_output_block_after_execute_exprs(
const std::vector<vectorized::VExprContext*>& output_vexpr_ctxs, const Block& input_block,
Status& status).
It returned the error status as an out parameter and the block as the return value, so callers had to check block.rows() == 0 and then check the error status.
This does not conform to the convention.
Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-05-06 09:03:22 +08:00
42bac3343d [Refactor](StmtExecutor)(step-1) Extract profile logic from StmtExecutor and Coordinator (#19219)
Previously, we used the RuntimeProfile class directly, and because a profile has multiple levels, several RuntimeProfile instances had to be maintained.

I created several new classes for profile:

class Profile:
	The root profile of an execution task (query or load)
	
class SummaryProfile:
	The profile that contains summary info of an execution task,
	such as start time, end time, query id. etc.
	
class ExecutionProfile:
	The profile for a single Coordinator. Each Coordinator will
	have an ExecutionProfile.
The profile structure is as follows:

Profile:
	SummaryProfile:
	ExecutionProfile 1:
		Fragment 0:
			Instance 0:
			Instance 1:
			...
		Fragment 1:
		...
	ExecutionProfile 2:
		...
As you can see, each Profile has a SummaryProfile and one or more ExecutionProfiles.
For most kinds of jobs, such as query/insert, there is only one ExecutionProfile. But a broker load job may have more than one ExecutionProfile, one for each sub-task of the load job.

How to use
For query/insert, etc.:

Each StmtExecutor has a Profile instance.
Each Coordinator has an ExecutionProfile instance.
StmtExecutor is responsible for the SummaryProfile and updates it during execution.
Coordinator is responsible for the ExecutionProfile: it first adds the ExecutionProfile as a child of the Profile, then updates it periodically during execution.
For Load/Export, etc.:

Each job has a Profile instance.
For each Coordinator of the job, its ExecutionProfile is added to the children of the job's Profile.
Behavior Change
The columns of show load profile/show query profile and the QueryProfile Web UI have changed to:

| Profile ID | Task Type | Start Time | End Time | Total | Task State | User | Default Db | Sql Statement | Is Cached | Total Instances Num | Instances Num Per BE | Parallel Fragment Exec Instance Num | Trace ID |
The Query Id and Job Id columns are removed and replaced by Profile ID.
For load jobs, the profile ID is the job ID; for query/insert, it is the query ID.
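
A minimal Java sketch of the hierarchy described above (only the three class names come from this change; the fields and methods are assumptions):

```java
import java.util.ArrayList;
import java.util.List;

// Root profile of one execution task (query or load).
class Profile {
    private final SummaryProfile summaryProfile = new SummaryProfile();
    // query/insert: usually a single entry; broker load: one per sub-task.
    private final List<ExecutionProfile> executionProfiles = new ArrayList<>();

    void addExecutionProfile(ExecutionProfile profile) {
        executionProfiles.add(profile);
    }
}

// Summary info of the task: start time, end time, query id, etc.
class SummaryProfile {
}

// One per Coordinator; holds the per-fragment, per-instance runtime profiles.
class ExecutionProfile {
}
```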
2023-05-06 09:01:51 +08:00
5210c04241 [Refactor](ScanNode) Split interface refactor (#19133)
Move getSplits function to ScanNode, remove Splitter interface.
For each kind of data source, create a specific ScanNode and implement the getSplits interface. For example, HiveScanNode.
Remove FileScanProviderIf and move the code to each ScanNode.
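
A rough Java sketch of the resulting shape (the signatures and the Split type are assumptions; only ScanNode, HiveScanNode, and getSplits come from this change):

```java
import java.util.List;

// Each data source implements its own split computation on its ScanNode.
abstract class ScanNode {
    abstract List<Split> getSplits();
}

class HiveScanNode extends ScanNode {
    @Override
    List<Split> getSplits() {
        // e.g. enumerate the table's partition files and wrap them as splits
        return List.of(new Split("hdfs://warehouse/tbl/part-0"),
                       new Split("hdfs://warehouse/tbl/part-1"));
    }
}

record Split(String location) {}
```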
2023-05-05 23:20:29 +08:00
c9fa10ac10 [fix](doc) avoid generate config doc automatically (#19302)
After #19246, compiling the FE automatically generates the Config and Session Variables docs and overwrites the original ones.
This needs to be avoided because the feature is not ready to use yet.
2023-05-05 20:39:05 +08:00
8aa61eb8f4 [fix](compile) Add missing inclusion (#19199)
Co-authored-by: hugoluo <hugoluo@tencent.com>
2023-05-05 20:32:32 +08:00
159344792f [enhance](Nereids) make getExplorationRule static (#19278)
Make getExplorationRule static to avoid creating a new ArrayList on every call.
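A minimal sketch of the pattern (the rule names are placeholders; only getExplorationRule comes from this change):

```java
import java.util.List;

class ExplorationRules {
    // Built once at class-load time instead of allocating a new ArrayList per call.
    private static final List<String> EXPLORATION_RULES = List.of("JoinCommute", "JoinAssociate");

    static List<String> getExplorationRule() {
        return EXPLORATION_RULES;
    }
}
```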
2023-05-05 19:58:24 +08:00
5c8ecfbf9c [fix](thirdparty) fix opentelemetry error message compiling with ubsan (#18912) 2023-05-05 19:09:43 +08:00
34228ba805 [doc](release note) add 2.0.0 alpha1 release note (#19286) 2023-05-05 18:06:25 +08:00
3e3262361c [fix](fe)havingClause should be substituted the same way as resultExprs (#19261)
Substitute the havingClause in the same way as resultExprs to prevent the "HAVING clause not produced by aggregation output" error.
2023-05-05 18:03:43 +08:00
58cb404661 [fix](memory) Allocator throws Exception instead of std::bad_alloc (#19285)
W0505 01:31:25.840227 1727715 scanner_scheduler.cpp:340] Scan thread read VScanner failed: [MEM_LIMIT_EXCEEDED]PreCatch error code:11, [E11] Allocator sys memory check failed: Cannot alloc:16384, consuming tracker:<Orphan>, exec node:<>, process memory used 5.87 GB exceed limit 5.64 GB or sys mem available 252.17 GB less than low water mark 1.60 GB, failed alloc size 16.00 KB.
    @     0x555c19e0cca8  doris::Exception::Exception()
    @     0x555c1c3e0c3f  Allocator<>::sys_memory_check()
    @     0x555c1c3e1052  Allocator<>::memory_check()
    @     0x555c19e0a645  Allocator<>::alloc()
    @     0x555c1c34508b  COWHelper<>::create<>()
    @     0x555c1e23f574  doris::vectorized::ConvertThroughParsing<>::execute<>()
    @     0x555c1e23f209  doris::vectorized::FunctionConvertFromString<>::execute_impl()
    @     0x555c1e23f4aa  doris::vectorized::FunctionConvertFromString<>::execute_impl()
    @     0x555c1e15ac29  doris::vectorized::PreparedFunctionImpl::execute_without_low_cardinality_columns()
    @     0x555c1e15ac56  doris::vectorized::PreparedFunctionImpl::execute()
    @     0x555c1e245276  _ZNSt17_Function_handlerIFN5doris6StatusEPNS0_15FunctionContextERNS0_10vectorized5BlockERKSt6vectorImSaImEEmmEZNKS4_12FunctionCast14create_wrapperINS4_14DataTypeNumberIiEEEESt8functionISC_ERKSt10shared_ptrIKNS4_9IDataTypeEEPKT_bEUlS3_S6_SB_mmE_E9_M_invokeERKSt9_Any_dataOS3_S6_SB_OmSY_
    @     0x555c1e2a9341  _ZZNK5doris10vectorized12FunctionCast23prepare_remove_nullableEPNS_15FunctionContextERKSt10shared_ptrIKNS0_9IDataTypeEES9_bENKUlS3_RNS0_5BlockERKSt6vectorImSaImEEmmE_clES3_SB_SG_mm
    @     0x555c1e2a8d42  _ZNSt17_Function_handlerIFN5doris6StatusEPNS0_15FunctionContextERNS0_10vectorized5BlockERKSt6vectorImSaImEEmmEZNKS4_12FunctionCast23prepare_remove_nullableES3_RKSt10shared_ptrIKNS4_9IDataTypeEESJ_bEUlS3_S6_SB_mmE_E9_M_invokeERKSt9_Any_dataOS3_S6_SB_OmSQ_
    @     0x555c1e20e42b  doris::vectorized::PreparedFunctionCast::execute_impl()
    @     0x555c1e15ac29  doris::vectorized::PreparedFunctionImpl::execute_without_low_cardinality_columns()
    @     0x555c1e15ac56  doris::vectorized::PreparedFunctionImpl::execute()
    @     0x555c1d63e960  doris::vectorized::IFunctionBase::execute()
    @     0x555c1d628700  doris::vectorized::VCastExpr::execute()
    @     0x555c1d6163e5  doris::vectorized::VExprContext::execute()
    @     0x555c20a83fe1  doris::vectorized::VFileScanner::_convert_to_output_block()
    @     0x555c20a809af  doris::vectorized::VFileScanner::_get_block_impl()
    @     0x555c209b9bc4  doris::vectorized::VScanner::get_block()
    @     0x555c209b1a50  doris::vectorized::ScannerScheduler::_scanner_scan()
    @     0x555c209b2ac1  _ZNSt17_Function_handlerIFvvEZZN5doris10vectorized16ScannerScheduler18_schedule_scannersEPNS2_14ScannerContextEENK3$_0clEvEUlvE1_E9_M_invokeERKSt9_Any_data
    @     0x555c1a8378cf  doris::ThreadPool::dispatch_thread()
    @     0x555c1a830fac  doris::Thread::supervise_thread()
    @     0x7f461faa117a  start_thread
    @     0x7f462033bdf3  __GI___clone
    @              (nil)  (unknown)
2023-05-05 18:01:48 +08:00
0283039f90 [improvement](load) log time consumed by io and enlarge timeout in p0 (#19243) 2023-05-05 17:39:16 +08:00
96d729f719 [refactor](fs)(step3)use filesystem instead of old storage, new storage just access remote object storage (#19098)
see #18960

PR1: add a new storage file system template and move the old storage to a new package.
PR2: extract some methods from the old storage into the new file system.
PR3: use storages to access remote object storage, and use file systems to access files in local or remote locations. Unit tests will be added.

This is PR3.
2023-05-05 16:20:20 +08:00
70236adc1f [Refactor](doc)(config)(variable) use script to generate doc for FE config and session variables (#19246)
The documentation for configs (FE and BE) and session variables is hard to maintain,
because developers need to modify both the code and the document.
And you can see that some configs' documentation is missing.

So I plan to write the documentation for configs and variables directly in the code, and use a
script to generate the documents automatically.

How To
This CL mainly changes:

Add fields to the Config and Session Variables annotations:

description: The description of the config or variable item. It is a String array: the first element is in Chinese, the second in English.
options: the valid options if the config or variable is an enum.
Add a script docs/generate-config-and-variable-doc.sh

Simply run sh docs/generate-config-and-variable-doc.sh and it will generate the docs for FE configs and variables,
saving them under docs/admin-manual/config/fe-config.md and docs/advanced/variables.md,
in both Chinese and English.

There are template markdowns for this script to read and replace with the real doc content.
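
A hedged Java sketch of what such an annotation could look like (the annotation name ConfField and the retention/target settings are assumptions; only the description and options fields are from this change):

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
public @interface ConfField {
    // {Chinese description, English description}
    String[] description() default {};

    // Valid options when the item is an enum-like setting.
    String[] options() default {};
}
```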

TODO
1. Many descriptions still need to be filled in; I will finish them in the next PR. For now the original docs remain unchanged.
2. Find a way to check the description field of configs and variables, to make sure none are missing.
3. Generate docs for BE configs.
2023-05-05 14:42:43 +08:00
f2a34dde52 [fix](memory) Fix memory leak due to incorrect block reuse of AggregateFunctionSortData #19214 2023-05-05 14:29:34 +08:00
b6c7f3aeb8 [opt](FileCache) Add file cache metrics and management (#19177)
Add file cache metrics and management.
1. Get file cache metrics
> If the file cache performs poorly, there are currently no metrics to investigate the cause. In practice, the hit ratio, disk usage, and removed-segment status are very important information.

API: `http://be_host:be_webserver_port/metrics`
File cache metrics for each base path start with the `doris_be_file_cache_` prefix. `hits_ratio` is the hit ratio of the cache since BE startup; `removed_elements` is the number of removed segment files since BE startup. Every cache path has three queues: index, normal, and disposable. The capacity ratio of the three queues is 1:17:2.
```
doris_be_file_cache_hits_ratio{path="/mnt/datadisk1/gaoxin/file_cache"} 0.500000
doris_be_file_cache_hits_ratio{path="/mnt/datadisk1/gaoxin/small_file_cache"} 0.500000
doris_be_file_cache_removed_elements{path="/mnt/datadisk1/gaoxin/file_cache"} 0
doris_be_file_cache_removed_elements{path="/mnt/datadisk1/gaoxin/small_file_cache"} 0

doris_be_file_cache_normal_queue_max_size{path="/mnt/datadisk1/gaoxin/file_cache"} 912680550400
doris_be_file_cache_normal_queue_max_size{path="/mnt/datadisk1/gaoxin/small_file_cache"} 8500000000
doris_be_file_cache_normal_queue_max_elements{path="/mnt/datadisk1/gaoxin/file_cache"} 217600
doris_be_file_cache_normal_queue_max_elements{path="/mnt/datadisk1/gaoxin/small_file_cache"} 102400

doris_be_file_cache_normal_queue_curr_size{path="/mnt/datadisk1/gaoxin/file_cache"} 14129846
doris_be_file_cache_normal_queue_curr_size{path="/mnt/datadisk1/gaoxin/small_file_cache"} 14874904
doris_be_file_cache_normal_queue_curr_elements{path="/mnt/datadisk1/gaoxin/file_cache"} 18
doris_be_file_cache_normal_queue_curr_elements{path="/mnt/datadisk1/gaoxin/small_file_cache"} 22

...
```
2. Release file cache
> Frequent segment files swapping can seriously affect the performance of file cache. Adding a deletion interface helps users clean up the file cache.

API: `http://be_host:be_webserver_port/api/file_cache?op=release&base_path=${file_cache_base_path}`
Returns the number of released segment files. If `base_path` is not provided in the URL, all cache paths will be released.
Calling this API is thread-safe: only segment files that are not currently being read will be released.
```
{"released_elements":22}
```
3. Specify the base path to store cache data
> Currently, regression testing lacks test cases of file cache, which cannot guarantee the stability of file cache. This interface is generally used in regression testing scenarios. Different queries use different paths to verify different usage cases and performance.

Users can set the session variable `file_cache_base_path` to specify the base path for storing cache data. The default is `file_cache_base_path="random"`, which means choosing a random path from the cache paths to store cache data. If `file_cache_base_path` is not one of the base paths in the BE configuration, a random path is used.
2023-05-05 14:28:01 +08:00
817f3ce510 [fix](nereids) plan shape on tpch_sf1T q21 case #19291 2023-05-05 14:24:28 +08:00
525ede54cb [doc](fix) fix wrong array_map doc tag #19249 2023-05-05 12:44:46 +08:00
63602f9f06 [Chore](thrift) prevent many BE files from being recompiled #19272
Prevent many BE files from being recompiled:

When we execute build.sh, it cleans the generated thrift code, so many BE files are recompiled. This behavior was added by PR
#19217
We can use build.sh --clean to clean the thrift code instead; there is no need to clean it every time.
2023-05-05 12:28:00 +08:00
09b9aba243 [Bug](web) fix web of frontend meet error (#19279)
Upgrade the servlet API version to fix the error on the frontend web page.
2023-05-05 12:26:50 +08:00
8286098b19 [community](release) add download scripts for 2.0.0-alpha1 release #19289 2023-05-05 12:17:09 +08:00
9dd6c8f87b [refactor](function) ignore DST for function from_unixtime (#19151) 2023-05-05 11:51:49 +08:00
1a1aee3886 [fix](load) exclude canceled job when canceling load (#19268) 2023-05-05 10:31:16 +08:00
693a3651c1 [bugfix](rpc) fix read-after-free problem of DeleteClosure (#19250)
1. fix read-after-free problem of DeleteClosure.
2. modified fresh_exec_timer for operators
2023-05-05 09:57:54 +08:00
44d95aa3d9 [typo](docs) add a new note about the str_to_date function (#19264) 2023-05-05 09:40:06 +08:00
9813406757 [Enhancement](HttpServer) Add http interface authentication for BE (#17753) 2023-05-04 23:46:49 +08:00