1. Extend the semantics of the strict_mode variable to control stream load behavior: if strict_mode is true, a stream load can only update existing rows; if strict_mode is false, a stream load can also insert a new row when its key is not present in the table.
2. When a non-strict-mode stream load inserts a new row, every column not mentioned in the load must either have a default value or be nullable.
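The two rules above can be sketched as a small decision function. This is only an illustration, not the actual implementation: ColumnDesc, RowAction, and decide_row_action are hypothetical names, and treating a missing key in strict mode as a rejected row is an assumption (the description only says that strict mode can update existing rows).

```cpp
#include <string>
#include <vector>

// Hypothetical column description used only for this sketch.
struct ColumnDesc {
    std::string name;
    bool nullable;
    bool has_default;
};

// Hypothetical outcome for one incoming row.
enum class RowAction { Update, Insert, Reject };

// Decide what a stream load does with one incoming row.
// Assumption: a row that cannot be applied is rejected.
RowAction decide_row_action(bool strict_mode, bool key_exists,
                            const std::vector<ColumnDesc>& unmentioned_columns) {
    if (key_exists) {
        return RowAction::Update;  // existing rows are updated in both modes
    }
    if (strict_mode) {
        return RowAction::Reject;  // strict mode never inserts new rows
    }
    // Non-strict mode: every unmentioned column must be fillable.
    for (const auto& col : unmentioned_columns) {
        if (!col.nullable && !col.has_default) {
            return RowAction::Reject;
        }
    }
    return RowAction::Insert;
}
```

For instance, a new key in non-strict mode is inserted only when each column left out of the load is nullable or has a default value.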
* run-tpch-query shell: add analyze database with sync, and calculate the total time
Cost estimation can be more accurate if partition-level statistics are available. However, when testing with big data sets such as 1T, we cannot actually import the data.
So we now want to extend this by allowing partition statistics to be injected.
Syntax:
ALTER TABLE table_name MODIFY COLUMN column_name SET STATS ('stat_name' = 'stat_value', ...)
[ PARTITION (partition_name) ];
Explanation:
- table_name: The table whose statistics are set. It can be in db_name.table_name form.
- column_name: The target column. It must be a column that exists in table_name. Statistics can only be modified for one column at a time.
- stat_name and stat_value: The name of a statistic and its value. Multiple stats are comma separated. Statistics that can be modified include row_count, ndv, num_nulls, min_value, max_value, and data_size.
- partition_name: The target partition. It must be a partition that exists in table_name. Multiple partitions are separated by commas.
Here we calculate the delete bitmaps of all rowsets that are committed but not yet published, to reduce the calculation pressure in the publish phase.
Step 1: collect the delete bitmaps of all committed rowsets of this tablet.
Step 2: calculate the delete bitmaps of all rowsets that were published during compaction.
Step 3: write back the updated delete bitmap and tablet info.
Set enable_simdjson_reader=false on master, because master has enable_simdjson_reader=true by default.
Issue Number: close#21389
from rapidjson:
Query String
In addition to GetString(), the Value class also contains GetStringLength(). Here is why:
According to RFC 4627, JSON strings can contain the Unicode character U+0000, which must be escaped as "\u0000". The problem is that C/C++ often uses null-terminated strings, which treat \0 as the terminator symbol.
To conform with RFC 4627, RapidJSON supports strings containing the U+0000 character. If you need to handle this, you can use GetStringLength() to obtain the correct string length.
For example, after parsing the following JSON to Document d:
{ "s" : "a\u0000b" }
The correct length of the string "a\u0000b" is 3, as returned by GetStringLength(). But strlen() returns 1.
GetStringLength() can also improve performance, as users may often need to call strlen() to allocate a buffer.
Besides, std::string also supports a constructor:
string(const char* s, size_t count);
which accepts the length of string as parameter. This constructor supports storing null character within the string, and should also provide better performance.
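The difference between the two ways of building a std::string can be sketched without RapidJSON. In the sketch below, copy_with_length and copy_null_terminated are hypothetical helper names; with RapidJSON, the pointer and the length would come from GetString() and GetStringLength() respectively.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Copy using an explicit length, so embedded '\0' characters are preserved.
// This uses the std::string(const char* s, size_t count) constructor.
std::string copy_with_length(const char* data, std::size_t length) {
    return std::string(data, length);
}

// Copy by scanning for the terminator, as strlen() would.
// Everything after the first '\0' is silently dropped.
std::string copy_null_terminated(const char* data) {
    return std::string(data);
}
```

For the bytes 'a', '\0', 'b', the length-based copy keeps all three characters, while the null-terminated copy keeps only "a".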
Fix problem:
When a drop index request and a build index request run concurrently on the same column, the build index request may obtain the lock first and build a new index file. When the drop index request then executes, the linked files do not contain all index files for the column, so the new index file is lost.
To fix this, use the index id instead of the column unique id to determine whether a hard link is required when building an index.
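Why the index id is the safer key can be shown with a minimal sketch. It is not the actual code: IndexMeta and indexes_to_link are hypothetical names, and the selection rule (link every existing index file whose index id is not being dropped) is an assumption made for illustration.

```cpp
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical index metadata: several indexes may share one column.
struct IndexMeta {
    std::int64_t index_id;
    std::int32_t column_unique_id;
};

// Return the ids of the indexes whose files should be hard-linked.
// Keying on index_id keeps a newly built index on the same column,
// which a check keyed on column_unique_id would also skip.
std::vector<std::int64_t> indexes_to_link(
        const std::vector<IndexMeta>& existing,
        const std::unordered_set<std::int64_t>& dropped_index_ids) {
    std::vector<std::int64_t> result;
    for (const auto& idx : existing) {
        if (dropped_index_ids.count(idx.index_id) == 0) {
            result.push_back(idx.index_id);
        }
    }
    return result;
}
```

With two indexes on the same column and only one of them being dropped, the other index is still linked.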
After we forbid some cases of agg candidate plans,
all local-phase aggs require DistributionSpecAny for their child.
So we can enable parallel scan for them.
Refactor the interface of create_file_reader:
file_size and mtime are now part of FileDescription instead of FileReaderOptions.
The file handle cache can now get the file's correct modification time from FileDescription.
Add HdfsIO for the hdfs file reader.
Picked from [Enhancement](multi-catalog) Add hdfs read statistics profile. #21442
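Add an alias name for system variables to fix the case where the column name is the value of the system variable. The first output below shows the old behavior and the second shows the fixed behavior: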
```
mysql> select @@character_set_client;
+--------+
| 'utf8' |
+--------+
| utf8 |
+--------+
==================================
mysql> select @@character_set_client;
+------------------------+
| @@character_set_client |
+------------------------+
| utf8 |
+------------------------+
```