6656 Commits

Author SHA1 Message Date
ece5f8e86c [pipelineX](fix) Fix input data distribution for distinct streaming agg (#29980) 2024-01-16 18:42:09 +08:00
66513d57f9 [feature](function) support ip function named ipv6_cidr_to_range(addr, cidr) (#29812) 2024-01-16 18:42:09 +08:00
43597afe2c [bugfix](core) writer status is read and write concurrently and will core by use after free (#29982)
Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-01-16 18:42:09 +08:00
c9cf9ab841 [pipelineX](improvement) Improve data distribution for streaming agg (#29969) 2024-01-16 18:40:32 +08:00
9e30a67a2a [Improve](topn opt) avoid crash when rpc returned row contains duplicated row entry (#29872)
1. Add more info to trace potential bug and avoid crash
2. use correct permutation size to do `column->permute`
2024-01-16 18:40:31 +08:00
ffc6f58e85 [pipelineX](fix) Fix incorrect partition number (#29963) 2024-01-16 18:39:37 +08:00
be893d792c [fix](jni) fix jni_reader function name get_nex_block to get_next_block (#29943) 2024-01-16 18:39:00 +08:00
05a65b9f81 [improve](join) remove join probe dependency of wait rf publish finish #29792 2024-01-16 18:39:00 +08:00
e35b26f4fc [feature](auditlog)Add runtime cpu time/peak memory metric (#29925) 2024-01-16 18:39:00 +08:00
b7b8e59392 [opt](scanner) use buffered queue to avoid acquiring locks frequently (#29938) 2024-01-16 18:37:44 +08:00
c8845c9e07 [opt](scanner) Improve the efficiency of TOPN opt (#29937) 2024-01-16 18:37:44 +08:00
4b4fd1a290 [improvement](log) add txn log (#28875) 2024-01-16 18:37:06 +08:00
8ca807578f [fix](migrate disk) fix migrate disk lost data during publish version (#29887)
Co-authored-by: Yongqiang YANG <98214048+dataroaring@users.noreply.github.com>
2024-01-16 18:37:06 +08:00
74e4486c65 [fix](partition) Add more log for single replica load when partition id eq 0 (#28707) 2024-01-16 18:35:32 +08:00
615d94bbc7 [log](insert) add log in parse insert into values data (#29903) 2024-01-16 18:35:32 +08:00
7309061db4 [pipelineX](improvement) Adjust local exchange strategy (#29915) 2024-01-16 18:35:32 +08:00
25428bd7fb [fix](kerberos) fix BE kerberos ccache renew, optimize kerberos options (#29291)
1. We need to remove BE kinit and use JNI login with keytab, because kinit cannot renew the TGT for Doris in many complex cases.
> This pull request will support creating a new instance from a keytab: https://github.com/apache/doris-thirdparty/pull/173, so we no longer need the kinit command; we just log in with keytab and principal.

2. add `kerberos_ccache_path` to set kerberos credentials cache path manually.

3. add `max_hdfs_file_handle_cache_time_ms` to set hdfs fs handle cache time.
2024-01-16 18:35:29 +08:00
5e697990a8 [bugfix](timeout) serving_blocks_num may cause timeout, try to fix it (#29912)
Although serving_blocks_num is an atomic variable, its ++ and -- operations are not protected by the transfer lock.
I am not sure about the memory order of ++ and --.
I think it may be the root cause of the query timeout, so I removed the check and am testing it in the GitHub pipeline.
2024-01-16 18:34:19 +08:00
a836f41854 [enhance](serde)update slice reserve and deduce slice back usage #29879 2024-01-16 18:33:51 +08:00
620cfc3cd7 [fix](move-memtable) set idle timeout equal to load timeout (#29839) 2024-01-16 18:33:51 +08:00
c599cf311d [fix](migrate) migrate check old tablet had deleted (#29909) 2024-01-16 18:33:51 +08:00
e3a1138da7 [fix](migrate disk) fix tablet disk migration timeout too large (#29895) 2024-01-16 18:33:51 +08:00
e417128fb9 [bug](bitmap) should return error status when execute failed (#29841) 2024-01-16 18:30:23 +08:00
41875a0bf5 [fix](move-memtable) check segment id in add_segment (#29898) 2024-01-16 18:30:23 +08:00
de3fdc7d08 [chore](Fix) Fix uninitialized buffer in read_cluster_id() (#29949) 2024-01-14 15:56:19 +08:00
60f6436f26 [fix](schema cache) adjust the destruction order of _tablet_schema_cache and storage engine (#29923) 2024-01-13 23:36:15 +08:00
e4e57e9b05 [chore](removelogs) remove debug query timeout logs 2024-01-12 14:37:20 +08:00
99024ad7bd [fix](move-memtable) check eos for already closed streams (#29734) 2024-01-12 13:58:20 +08:00
3b25e69311 [bug](rf) fix invalid type for runtime filters when result column is const (#29851) 2024-01-12 13:58:20 +08:00
2a77858845 [fix](move-memtable) check all streams for failed reason (#29877) 2024-01-12 13:58:20 +08:00
a314491535 [Fix](inverted index) fix array inverted index builder error (#29869) 2024-01-12 13:58:19 +08:00
5ef8428345 [Refactor](executor)refactor workload group log from WARNING to INFO #29878 2024-01-12 13:58:19 +08:00
1718341051 [pipelineX](fix) Fix correctness problem due to local hash shuffle (#29881) 2024-01-12 13:58:19 +08:00
ad2c13e009 [Optimize](kill-query)Support the scanners exits as soon as possible when kill query #29803 2024-01-12 13:58:19 +08:00
d494674ff4 [opt](parquet-reader) Opt parquet decimal type reading. (#29825) 2024-01-12 13:58:19 +08:00
ad986a78ae [Fix](executor)Fix Grayscale upgrade be core dump when report statistics #29843 2024-01-12 13:58:19 +08:00
d525f576e1 [improve] Use lru cache to count the number of column in tablet schema to control memory (#29668) 2024-01-12 13:58:19 +08:00
cbffdbb8bf [bug](group_commit) fix relay wal problem on materialized-view (#29848) 2024-01-12 13:58:19 +08:00
a4f29193f6 [pipelineX](fix) Fix incorrect runtime filter (#29860) 2024-01-12 13:58:19 +08:00
407a4a285d [improve](load) reduce logs from memtable memory limiter (#29840) 2024-01-12 13:58:19 +08:00
ebfbe0c8dd [opt](information_schema) support information_schema in external catalog (#28919)
Add `information_schema` database for all catalogs.
This is useful when BI tools connect to Doris,
because the tools can get meta info from `information_schema`.

This PR mainly changes:

1. There will be an `information_schema` db in each catalog.
2. Each `information_schema` db only stores the meta info of the catalog it belongs to.
3. For `information_schema`, the `TABLE_SCHEMA` column's value is the database name.
4. There is a new global variable `show_full_dbname_in_info_schema_db` (default: false). If set to true,
    the `TABLE_SCHEMA` column's value is formatted like `ctl.db`, because:

	When connecting to Doris, the `database` info in the connection url will be: `xxx?db=ctl.db`.
	
	Some BI tools will then try to query `information_schema` with sql like:
	
	`select * from information_schema.columns where TABLE_SCHEMA = "ctl.db"`
	
	So it has to be formatted as `ctl.db`.
	
	E.g., the `information_schema.columns` table in external catalog `doris` is like:
	
	```
	mysql> select * from information_schema.columns limit 1\G
	*************************** 1. row ***************************
	           TABLE_CATALOG: doris
	            TABLE_SCHEMA: doris.__internal_schema
	              TABLE_NAME: column_statistics
	             COLUMN_NAME: id
	        ORDINAL_POSITION: 1
	          COLUMN_DEFAULT: NULL
	             IS_NULLABLE: NO
	               DATA_TYPE: varchar
	CHARACTER_MAXIMUM_LENGTH: 4096
	  CHARACTER_OCTET_LENGTH: 16384
	       NUMERIC_PRECISION: NULL
	           NUMERIC_SCALE: NULL
	      DATETIME_PRECISION: NULL
	      CHARACTER_SET_NAME: NULL
	          COLLATION_NAME: NULL
	             COLUMN_TYPE: varchar(4096)
	              COLUMN_KEY:
	                   EXTRA:
	              PRIVILEGES:
	          COLUMN_COMMENT:
	             COLUMN_SIZE: 4096
	          DECIMAL_DIGITS: NULL
	   GENERATION_EXPRESSION: NULL
	                  SRS_ID: NULL
	```
	
5. Modify the behavior of

	- show tables
	- show databases
	- show columns
	- show table status

	The above statements may query the `information_schema` db if a `where` predicate follows them.
2024-01-12 13:58:19 +08:00
4d97f8ea75 [enhance](function) support two special format for str_to_date (#29823) 2024-01-12 12:00:32 +08:00
22c134fa0a [fix](rowset-reader) direct mode shouldn't use merge iterator (#29678) 2024-01-12 11:59:52 +08:00
f02fb5d49e [fix](vec) wrong implementation of operator <=> of Field (#29743) 2024-01-12 11:59:52 +08:00
c9a949130b [Case](wal) Add wal group commit sink case with low disk space fault injection (#29731) 2024-01-12 11:59:52 +08:00
8c0b046ad4 [case](wal)Add wal backpressure case (#29725) 2024-01-12 11:59:52 +08:00
Pxl
068367063f [Improvement](function) optimization for substr with ascii string (#29799) 2024-01-12 11:59:52 +08:00
Pxl
3cf95d0fdf [Improvement](execute) optimize for ColumnNullable's serialize_vec/deserialize_vec (#28788)
optimize for ColumnNullable's serialize_vec/deserialize_vec
2024-01-12 11:59:52 +08:00
Pxl
33b8311d5f [Improvement](runtime-filter) build runtime filter before build hash table on join build probe (#29727)
build runtime filter before build hash table on join build probe
2024-01-12 11:59:52 +08:00
58f8994f5d [Fix](core) Fix initializing the WalManager could prevent the BE from starting (#29688) 2024-01-12 11:59:27 +08:00