Commit Graph

16373 Commits

Author SHA1 Message Date
7b30119537 [improve](multi-table-load) pause job when can not find table #29870
If none of the target tables can be found, the task will cycle forever and no data will be loaded. To avoid invalid scheduled tasks, it is better to pause the job than to keep running it.
2024-01-16 18:31:27 +08:00
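The pause-on-missing-table behavior described in commit 7b30119537 can be sketched roughly as follows. This is a minimal illustration, not the Doris implementation: `LoadJob`, `JobState`, and `schedule_job` are hypothetical names, and the set of existing tables is passed in directly instead of being looked up in a catalog.

```python
from dataclasses import dataclass
from enum import Enum


class JobState(Enum):
    RUNNING = "RUNNING"
    PAUSED = "PAUSED"


@dataclass
class LoadJob:
    # Hypothetical job record; the field names are assumptions for this sketch.
    name: str
    target_tables: list
    state: JobState = JobState.RUNNING
    pause_reason: str = ""


def schedule_job(job, existing_tables):
    """Return the target tables that exist; pause the job if there are none."""
    found = [t for t in job.target_tables if t in existing_tables]
    if not found:
        # Nothing can be loaded, so stop rescheduling instead of looping forever.
        job.state = JobState.PAUSED
        job.pause_reason = "no target table found"
    return found
```

A job whose tables are all missing ends up in `JobState.PAUSED` after one scheduling attempt, while a job with at least one existing table keeps running.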
6598b4f7c8 [fix](http) fix exception when querying map data through http #29686
The map type was mapped to MySQL type code 400, but 400 is an unknown type for MySQL.
When querying through the /api/query HTTP API or using the MariaDB JDBC driver, an exception occurs.
The MySQL JDBC driver converts the value into binary form, and the correct data can be read as a string.
Therefore, the custom MySQL type for map was removed and replaced with the string type, so that both the MariaDB and MySQL JDBC drivers work normally.
2024-01-16 18:31:27 +08:00
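The type-code change described in commit 6598b4f7c8 can be illustrated with a small lookup. This is only a sketch under stated assumptions: the table `TYPE_CODES` and the function `mysql_type_code` are hypothetical, the choice of `MYSQL_TYPE_STRING` (254) as the string code is an assumption (the commit only says "string type"), and the numeric constants are the standard MySQL protocol values.

```python
# Standard MySQL protocol column type codes.
MYSQL_TYPE_LONG = 3
MYSQL_TYPE_DOUBLE = 5
MYSQL_TYPE_STRING = 254

# Codes a client driver is assumed to recognize in this sketch.
KNOWN_MYSQL_CODES = {MYSQL_TYPE_LONG, MYSQL_TYPE_DOUBLE, MYSQL_TYPE_STRING}

# Custom code that the commit says was previously used for MAP.
LEGACY_MAP_CODE = 400

# Hypothetical mapping after the fix: MAP is reported as a string.
TYPE_CODES = {
    "INT": MYSQL_TYPE_LONG,
    "DOUBLE": MYSQL_TYPE_DOUBLE,
    "STRING": MYSQL_TYPE_STRING,
    "MAP": MYSQL_TYPE_STRING,
}


def mysql_type_code(doris_type):
    """Return the wire type code, falling back to string for unlisted types."""
    return TYPE_CODES.get(doris_type.upper(), MYSQL_TYPE_STRING)
```

Under these assumptions, `mysql_type_code("map")` yields a code that is in `KNOWN_MYSQL_CODES`, whereas `LEGACY_MAP_CODE` is not.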
e1a12cf222 [improvement](auth)Not allowed to operate internal_schema database (#29790)
Only the root user can operate on the __internal_schema database.
The affected operations include:
create database
drop database
alter database
create table
drop table
alter table
truncate table
insert overwrite
insert
delete
update
load (not allowed even for root)

Delete now supports the auth check.
2024-01-16 18:31:27 +08:00
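The rule in commit e1a12cf222 can be sketched as a simple permission check. This is an illustrative sketch, not Doris code: `is_allowed`, `ROOT_ONLY_OPS`, and `DENIED_FOR_ALL_OPS` are hypothetical names, and treating operations outside the listed set (for example a plain query) as allowed is an assumption.

```python
INTERNAL_DB = "__internal_schema"

# Operations the commit lists as restricted to the root user.
ROOT_ONLY_OPS = {
    "create database", "drop database", "alter database",
    "create table", "drop table", "alter table", "truncate table",
    "insert overwrite", "insert", "delete", "update",
}

# The commit notes that load is not allowed even for root.
DENIED_FOR_ALL_OPS = {"load"}


def is_allowed(user, database, operation):
    """Return True if `user` may run `operation` on `database` in this sketch."""
    if database != INTERNAL_DB:
        return True  # only the internal-database rule is modeled here
    if operation in DENIED_FOR_ALL_OPS:
        return False
    if operation in ROOT_ONLY_OPS:
        return user == "root"
    return True  # assumption: unlisted operations are unaffected
```

For example, `is_allowed("root", "__internal_schema", "drop table")` is true, while the same call with `"load"` is false for every user.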
8b4ffcc8f7 [typo](docs) fix typo of outfile and export md (#29804) 2024-01-16 18:31:27 +08:00
1dc0c74ad9 [improvement](statistics)Stop analyze quickly after user close auto analyze. #29809 2024-01-16 18:31:27 +08:00
9d3a017706 [fix](doriswriter)Fix the problem that specifying multiple loadurls does not take effect #29865 2024-01-16 18:31:27 +08:00
b3e37b3efa [unit test](statistics)Add unit test case for auto analyze. #29904
Add unit and p0 test cases for auto analyze.
2024-01-16 18:31:27 +08:00
d47adbb81f [Fix](nereids) Fix cte rewrite by mv failure and predicates compensation by mistake (#29820)
Fix incorrect CTE rewrite by materialized view when the query has a scalar aggregate but the view does not.
For example, the following query should not be rewritten by the materialized view:

// materialized view definition
def mv20_1 = """
select
l_shipmode,
l_shipinstruct,
sum(l_extendedprice),
count(*)
from lineitem
left join
orders on lineitem.L_ORDERKEY = orders.O_ORDERKEY
group by
l_shipmode,
l_shipinstruct;
"""
// query sql
def query20_1 =
"""
select
sum(l_extendedprice),
count(*)
from lineitem
left join
orders
on lineitem.L_ORDERKEY = orders.O_ORDERKEY
"""

Fix incorrect predicate compensation.
For example, the following case now returns the correct result; previously the result was wrong.

// materialized view definition
def mv7_1 = """
select l_shipdate, o_orderdate, l_partkey, l_suppkey
from lineitem
left join orders
on lineitem.l_orderkey = orders.o_orderkey
where l_shipdate = '2023-12-08' and o_orderdate = '2023-12-08';
"""
// query sql
def query7_1 = """
select l_shipdate, o_orderdate, l_partkey, l_suppkey
from (select * from lineitem where l_shipdate = '2023-10-17' ) t1
left join orders
on t1.l_orderkey = orders.o_orderkey;
"""

Also optimize some code usage and add more method comments.
2024-01-16 18:31:27 +08:00
e417128fb9 [bug](bitmap) should return error status when execute failed (#29841) 2024-01-16 18:30:23 +08:00
1e225b56ab [fix](doc)Added english translation for monitoring Metric description page (#28435)
Added English translation for the monitoring metrics description page.
2024-01-16 18:30:23 +08:00
12f936558e [fix](doc) spell errors fixes for debug-point-action (#28152) 2024-01-16 18:30:23 +08:00
1998735432 [Improvement](function) enable ipv6_num_to_string function to support handling of IPv6 type (#29886)
Enable ipv6_num_to_string function to handle IPv6 type normally in addition to handling 16 byte string types
2024-01-16 18:30:23 +08:00
e7b221ba66 [fix](be-ut) Fix unstable test cases (#29896)
The following cases are unstable.

1. LoadStreamMgrTest
2. TaskWorkerPoolTest.PriorTaskWorkerPool

Rationales

1. LoadStreamMgrTest
The instability is related to the timeout. In the BRPC examples, the timeout is usually set to 0 rather than a specific number.
2. TaskWorkerPoolTest.PriorTaskWorkerPool
The order in which the threads contend for the lock is nondeterministic.
2024-01-16 18:30:23 +08:00
88eab1b4b9 [doc](hight-concurrent-point-query) Improve and supplement hight-concurrent-point-query documentation (#29396) 2024-01-16 18:30:23 +08:00
41875a0bf5 [fix](move-memtable) check segment id in add_segment (#29898) 2024-01-16 18:30:23 +08:00
ee66f1563e [fix](Nereids) fix rf push down union (#29847)
Currently, union runtime filter (rf) push down only supports rf from the parent join, not from an ancestor join.
This PR fixes the problem in the rf push-down check on project/distribute nodes.
2024-01-16 18:30:22 +08:00
fd4795dace [opt](Nereids) add graph sql function and one arg truncate (#29864) 2024-01-16 18:30:22 +08:00
f79ec8ea7e [test](regression-test) fix case bug suites/export/test_array_export.groovy (#29783) 2024-01-16 18:30:22 +08:00
06a6477275 [test](regression-test) move test_alter_user.groovy to run nonConcurrent, for it has set global operation (#29772) 2024-01-16 18:30:22 +08:00
5f1b888a24 add 2.1.0-rc04 tag 2024-01-16 14:28:38 +08:00
a34ac7f73a [asf] remove some collaborators 2024-01-15 11:46:31 +08:00
de3fdc7d08 [chore](Fix) Fix uninitialized buffer in read_cluster_id() (#29949) 2024-01-14 15:56:19 +08:00
12af86176a [fix](class-loader) fix class loader conflict on BE side (#29942)
1. Mark `hadoop-common` in the BE Java extension as `provided`.
2. Load the BE Java extension jars before the Hadoop jars.
2024-01-14 15:53:33 +08:00
7a6475eeee [deps](hadoop) update hadoop on BE side to 3.3.6 (#29939)
Same as on FE side
2024-01-14 15:52:04 +08:00
60f6436f26 [fix](schema cache) adjust the destruction order of _tablet_schema_cache and storage engine (#29923) 2024-01-13 23:36:15 +08:00
e4e57e9b05 [chore](removelogs) remove debug query timeout logs 2024-01-12 14:37:20 +08:00
62064a86bf [test](ut) added UT cases for show create load stmt (#29564) 2024-01-12 13:58:20 +08:00
99024ad7bd [fix](move-memtable) check eos for already closed streams (#29734) 2024-01-12 13:58:20 +08:00
3b25e69311 [bug](rf) fix invalid type for runtime filters when result column is const (#29851) 2024-01-12 13:58:20 +08:00
2a77858845 [fix](move-memtable) check all streams for failed reason (#29877) 2024-01-12 13:58:20 +08:00
a314491535 [Fix](inverted index) fix array inverted index builder error (#29869) 2024-01-12 13:58:19 +08:00
5ef8428345 [Refactor](executor)refactor workload group log from WARNING to INFO #29878 2024-01-12 13:58:19 +08:00
1718341051 [pipelineX](fix) Fix correctness problem due to local hash shuffle (#29881) 2024-01-12 13:58:19 +08:00
acda8d2129 [feature](profile )merge of profiles can be disabled by profile level. #29861
The merging of profiles requires ensuring the correctness of the profiles themselves. However, if merging is intended for troubleshooting correctness issues through profiles, errors may occur.

Moreover, the 'try-catch' does not catch exceptions related to profile merging. If merging fails, even the normal profile cannot be obtained.
2024-01-12 13:58:19 +08:00
3ef1229635 [docs](query-accel) refine several statements in docs (#29716) 2024-01-12 13:58:19 +08:00
2a51750abd [fix](dynamic partition) fix dynamic partition storage medium not working (#29490) 2024-01-12 13:58:19 +08:00
ad2c13e009 [Optimize](kill-query)Support the scanners exits as soon as possible when kill query #29803 2024-01-12 13:58:19 +08:00
d494674ff4 [opt](parquet-reader) Opt parquet decimal type reading. (#29825) 2024-01-12 13:58:19 +08:00
ad986a78ae [Fix](executor)Fix Grayscale upgrade be code dump when report statistics #29843 2024-01-12 13:58:19 +08:00
d525f576e1 [improve] Use lru cache to count the number of column in tablet schema to control memory (#29668) 2024-01-12 13:58:19 +08:00
0d6ab3c68c [chore](regression test) check disk is good (#29740) 2024-01-12 13:58:19 +08:00
89995402dd [regression](conf) enable load_stream_fault_injection (#29829)
Co-authored-by: Yongqiang YANG <98214048+dataroaring@users.noreply.github.com>
2024-01-12 13:58:19 +08:00
53639a01fe [Fix] (schema change) fix the bug that non light schema change tables can rename column (#29850) 2024-01-12 13:58:19 +08:00
42c21f7eda [regression](conf) enable test_stream_stub_fault_injection (#29830) 2024-01-12 13:58:19 +08:00
fc5dc1c285 [config](move-memtable) set default load_stream_per_node to 20 (#29822) 2024-01-12 13:58:19 +08:00
cbffdbb8bf [bug](group_commit) fix relay wal problem on materialized-view (#29848) 2024-01-12 13:58:19 +08:00
a4f29193f6 [pipelineX](fix) Fix incorrect runtime filter (#29860) 2024-01-12 13:58:19 +08:00
407a4a285d [improve](load) reduce logs from memtable memory limiter (#29840) 2024-01-12 13:58:19 +08:00
f94ff21eef [ci](perf) adjust threshold (#29856)
Co-authored-by: stephen <hello-stephen@qq.com>
2024-01-12 13:58:19 +08:00
ebfbe0c8dd [opt](information_schema) support information_schema in external catalog (#28919)
Add an `information_schema` database to every catalog.
This is useful when BI tools connect to Doris,
because the tools can get meta info from `information_schema`.

This PR mainly changes:

1. There will be an `information_schema` db in each catalog.
2. Each `information_schema` db stores only the meta info of the catalog it belongs to.
3. For `information_schema`, the `TABLE_SCHEMA` column's value is the database name.
4. There is a new global variable `show_full_dbname_in_info_schema_db`, which defaults to false. If set to true,
    the `TABLE_SCHEMA` column's value has the form `ctl.db`, because:

	When connecting to Doris, the `database` info in the connection URL will be: `xxx?db=ctl.db`.
	
	Some BI tools will then query `information_schema` with SQL like:
	
	`select * from information_schema.columns where TABLE_SCHEMA = "ctl.db"`
	
	So the value has to be formatted as `ctl.db`.
	
	For example, the `information_schema.columns` table in the external catalog `doris` looks like:
	
	```
	mysql> select * from information_schema.columns limit 1\G
	*************************** 1. row ***************************
	           TABLE_CATALOG: doris
	            TABLE_SCHEMA: doris.__internal_schema
	              TABLE_NAME: column_statistics
	             COLUMN_NAME: id
	        ORDINAL_POSITION: 1
	          COLUMN_DEFAULT: NULL
	             IS_NULLABLE: NO
	               DATA_TYPE: varchar
	CHARACTER_MAXIMUM_LENGTH: 4096
	  CHARACTER_OCTET_LENGTH: 16384
	       NUMERIC_PRECISION: NULL
	           NUMERIC_SCALE: NULL
	      DATETIME_PRECISION: NULL
	      CHARACTER_SET_NAME: NULL
	          COLLATION_NAME: NULL
	             COLUMN_TYPE: varchar(4096)
	              COLUMN_KEY:
	                   EXTRA:
	              PRIVILEGES:
	          COLUMN_COMMENT:
	             COLUMN_SIZE: 4096
	          DECIMAL_DIGITS: NULL
	   GENERATION_EXPRESSION: NULL
	                  SRS_ID: NULL
	```
	
6. Modify the behavior of

	- show tables
	- show databases
	- show columns
	- show table status

	The above statements may query the `information_schema` db if a `where` predicate follows them.
2024-01-12 13:58:19 +08:00
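The `TABLE_SCHEMA` formatting rule described in commit ebfbe0c8dd can be sketched as a single function. This is a minimal illustration, not the Doris implementation: `table_schema_value` is a hypothetical name, and its boolean parameter stands in for the `show_full_dbname_in_info_schema_db` variable.

```python
def table_schema_value(catalog, database, show_full_dbname=False):
    """Return the TABLE_SCHEMA value for a database in the given catalog.

    `show_full_dbname` stands in for `show_full_dbname_in_info_schema_db`:
    when true, the value is qualified with the catalog name as `ctl.db`.
    """
    if show_full_dbname:
        return f"{catalog}.{database}"
    return database
```

With the flag enabled, `table_schema_value("doris", "__internal_schema", True)` gives `doris.__internal_schema`, matching the `TABLE_SCHEMA` value in the sample output of that commit; with the default it gives only `__internal_schema`.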