289d621faa
[improvement](information_schema)Show view definition in information_schema.views. ( #45857 ) ( #45930 )
...
backport: https://github.com/apache/doris/pull/45857
2024-12-26 10:11:13 +08:00
5d3f0a267a
[opt](scan) unify the local and remote scan bytes stats for all scanners for 2.1 ( #45167 )
...
pick part of #40493
TODO: not working with s3 reader
2024-12-10 14:19:19 +08:00
f0324e2a56
branch-2.1: [improvement](information_schema)Support show default value in information_schema. #44849 ( #45080 )
...
Cherry-picked from #44849
Co-authored-by: James <lijibing@selectdb.com>
2024-12-06 14:54:09 +08:00
78b6157aa9
[fix](ip/variant) fix information meta ( #41871 )
...
fix datatype information meta for ip/variant (#41666 )
2024-10-15 18:01:14 +08:00
8e860a26a7
[fix](systable) fix unstable case for partitions table ( #40553 ) ( #41043 )
...
bp #40553
2024-09-20 17:13:30 +08:00
3604d63184
[Branch 2.1] backport systable PR (#34384,#40153,#40456,#40455,#40568) ( #40687 )
...
backport
https://github.com/apache/doris/pull/40568
https://github.com/apache/doris/pull/40455
https://github.com/apache/doris/pull/40456
https://github.com/apache/doris/pull/40153
https://github.com/apache/doris/pull/34384
Test result:
2024-09-11 11:00:45.618 INFO [suite-thread-1] (SuiteContext.groovy:309)
- Recover original connection
2024-09-11 11:00:45.619 INFO [suite-thread-1] (Suite.groovy:359) -
Execute sql: REVOKE SELECT_PRIV ON
test_partitions_schema_db.duplicate_table FROM partitions_user
2024-09-11 11:00:45.625 INFO [suite-thread-1] (SuiteContext.groovy:299)
- Create new connection for user 'partitions_user'
2024-09-11 11:00:45.632 INFO [suite-thread-1] (Suite.groovy:1162) -
Execute tag: select_check_5, sql: select
TABLE_CATALOG,TABLE_SCHEMA,TABLE_NAME,PARTITION_NAME,SUBPARTITION_NAME,PARTITION_ORDINAL_POSITION,SUBPARTITION_ORDINAL_POSITION,PARTITION_METHOD,SUBPARTITION_METHOD,PARTITION_EXPRESSION,SUBPARTITION_EXPRESSION,PARTITION_DESCRIPTION,TABLE_ROWS,AVG_ROW_LENGTH,DATA_LENGTH,MAX_DATA_LENGTH,INDEX_LENGTH,DATA_FREE,CHECKSUM,PARTITION_COMMENT,NODEGROUP,TABLESPACE_NAME
from information_schema.partitions where
table_schema="test_partitions_schema_db" order by
TABLE_CATALOG,TABLE_SCHEMA,TABLE_NAME,PARTITION_NAME,SUBPARTITION_NAME,PARTITION_ORDINAL_POSITION,SUBPARTITION_ORDINAL_POSITION,PARTITION_METHOD,SUBPARTITION_METHOD,PARTITION_EXPRESSION,SUBPARTITION_EXPRESSION,PARTITION_DESCRIPTION,TABLE_ROWS,AVG_ROW_LENGTH,DATA_LENGTH,MAX_DATA_LENGTH,INDEX_LENGTH,DATA_FREE,CHECKSUM,PARTITION_COMMENT,NODEGROUP,TABLESPACE_NAME
2024-09-11 11:00:45.644 INFO [suite-thread-1] (SuiteContext.groovy:309)
- Recover original connection
2024-09-11 11:00:45.645 INFO [suite-thread-1] (ScriptContext.groovy:120)
- Run test_partitions_schema in
/root/doris/workspace/doris/regression-test/suites/query_p0/system/test_partitions_schema.groovy
succeed
2024-09-11 11:00:45.652 INFO [main] (RegressionTest.groovy:259) - Start
to run single scripts
2024-09-11 11:01:10.321 INFO [main] (RegressionTest.groovy:380) -
Success suites:
/root/doris/workspace/doris/regression-test/suites/query_p0/system/test_partitions_schema.groovy:
group=default,p0, name=test_partitions_schema
2024-09-11 11:01:10.322 INFO [main] (RegressionTest.groovy:459) - All
suites success.
____ _ ____ ____ _____ ____
| _ \ / \ / ___/ ___|| ____| _ \
| |_) / _ \ \___ \___ \| _| | | | |
| __/ ___ \ ___) |__) | |___| |_| |
|_| /_/ \_\____/____/|_____|____/
2024-09-11 11:01:10.322 INFO [main] (RegressionTest.groovy:410) - Test 1
suites, failed 0 suites, fatal 0 scripts, skipped 0 scripts
2024-09-11 11:01:10.322 INFO [main] (RegressionTest.groovy:119) - Test
finished
2024-09-11 11:03:00.712 INFO [suite-thread-1] (Suite.groovy:1162) -
Execute tag: select_check_5, sql: select * from
information_schema.table_options ORDER BY
TABLE_CATALOG,TABLE_SCHEMA,TABLE_NAME,TABLE_MODEL,TABLE_MODEL_KEY,DISTRIBUTE_KEY,DISTRIBUTE_TYPE,BUCKETS_NUM,PARTITION_NUM;
2024-09-11 11:03:00.729 INFO [suite-thread-1] (SuiteContext.groovy:309)
- Recover original connection
2024-09-11 11:03:00.731 INFO [suite-thread-1] (ScriptContext.groovy:120)
- Run test_table_options in
/root/doris/workspace/doris/regression-test/suites/query_p0/system/test_table_options.groovy
succeed
2024-09-11 11:03:04.817 INFO [main] (RegressionTest.groovy:259) - Start
to run single scripts
2024-09-11 11:03:28.741 INFO [main] (RegressionTest.groovy:380) -
Success suites:
/root/doris/workspace/doris/regression-test/suites/query_p0/system/test_table_options.groovy:
group=default,p0, name=test_table_options
2024-09-11 11:03:28.742 INFO [main] (RegressionTest.groovy:459) - All
suites success.
____ _ ____ ____ _____ ____
| _ \ / \ / ___/ ___|| ____| _ \
| |_) / _ \ \___ \___ \| _| | | | |
| __/ ___ \ ___) |__) | |___| |_| |
|_| /_/ \_\____/____/|_____|____/
2024-09-11 11:03:28.742 INFO [main] (RegressionTest.groovy:410) - Test 1
suites, failed 0 suites, fatal 0 scripts, skipped 0 scripts
2024-09-11 11:03:28.742 INFO [main] (RegressionTest.groovy:119) - Test
finished
*************************** 7. row ***************************
PartitionId: 18035
PartitionName: p100
VisibleVersion: 2
VisibleVersionTime: 2024-09-11 10:59:28
State: NORMAL
PartitionKey: col_1
Range: [types: [INT]; keys: [83647]; ..types: [INT]; keys: [2147483647];
)
DistributionKey: pk
Buckets: 10
ReplicationNum: 1
StorageMedium: HDD
CooldownTime: 9999-12-31 15:59:59
RemoteStoragePolicy:
LastConsistencyCheckTime: NULL
DataSize: 2.872 KB
IsInMemory: false
ReplicaAllocation: tag.location.default: 1
IsMutable: true
SyncWithBaseTables: true
UnsyncTables: NULL
CommittedVersion: 2
RowCount: 4
7 rows in set (0.01 sec)
---------
Co-authored-by: Mingyu Chen <morningman.cmy@gmail.com>
2024-09-12 11:50:09 +08:00
92752b90e7
[feature](metacache) add system table catalog_meta_cache_statistics #40155 ( #40210 )
...
bp #40155
2024-09-02 23:23:35 +08:00
131238ff71
[fix](file-cache) change metric_value column in file_cache_statistics table to string ( #40083 )
...
Make it more flexible
followup #39552
2024-08-29 16:39:22 +08:00
173aafc86f
[Enhancement] add information_schema.table_properties #38745 ( #38746 ) ( #39886 )
...
bp #38746
---------
Co-authored-by: Vallish Pai <vallishpai@gmail.com>
2024-08-27 17:22:19 +08:00
6ceb574aa0
[branch-2.1]Pick IO limit/workload group usage table ( #39839 )
2024-08-23 18:51:47 +08:00
a55e109e97
[pick][Improment]Add schema table workload_group_privileges ( #38436 ) ( #39708 )
...
pick #38436
2024-08-22 00:44:43 +08:00
0bfcee1251
[opt](file-cache) support system table file_cache_statistics ( #39552 )
...
1. Add new system table: `file_cache_statistics`
This table is used for viewing metrics related to file cache on BE side
```
mysql> select * from information_schema.file_cache_statistics limit 10;
+-------+---------------+----------------------------+--------------------------------+--------------------+
| BE_ID | BE_IP         | CACHE_PATH                 | METRIC_NAME                    | METRIC_VALUE       |
+-------+---------------+----------------------------+--------------------------------+--------------------+
| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ | disposable_queue_curr_elements | 0                  |
| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ | disposable_queue_curr_size     | 0                  |
| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ | disposable_queue_max_elements  | 102400             |
| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ | disposable_queue_max_size      | 21474836480        |
| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ | hits_ratio                     | 0.8539634687001242 |
| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ | hits_ratio_1h                  | 0                  |
| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ | hits_ratio_5m                  | 0                  |
| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ | index_queue_curr_elements      | 0                  |
| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ | index_queue_curr_size          | 0                  |
| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ | index_queue_max_elements       | 102400             |
+-------+---------------+----------------------------+--------------------------------+--------------------+
```
It will show metrics of file caches on each BE.
2. Add new metrics `hits_ratio_1h` and `hits_ratio_5m` for file cache
These two metrics show the file cache hit ratio over the most recent 1 hour and
the most recent 5 minutes, so that the recent hit ratio can be observed
instead of only the global historical hit ratio.
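The difference between a global hit ratio and a recent-window one can be sketched as follows. This is a minimal illustration only, not the BE implementation: the `CacheCounters` type and both function names are hypothetical, and it assumes the cache exposes cumulative hit and miss counters that can be sampled at two points in time.

```cpp
#include <cstdint>

// Cumulative counters sampled at one point in time (hypothetical type).
struct CacheCounters {
    std::uint64_t hits;
    std::uint64_t misses;
};

// Hit ratio over everything counted so far (the historical view).
inline double global_hit_ratio(const CacheCounters& now) {
    const std::uint64_t total = now.hits + now.misses;
    return total == 0 ? 0.0
                      : static_cast<double>(now.hits) / static_cast<double>(total);
}

// Hit ratio over the window between two samples, for example samples taken
// 5 minutes or 1 hour apart. Assumes the counters only ever grow, so `now`
// is never smaller than `earlier`. Returns 0 when the window saw no accesses.
inline double windowed_hit_ratio(const CacheCounters& earlier,
                                 const CacheCounters& now) {
    const std::uint64_t hits = now.hits - earlier.hits;
    const std::uint64_t misses = now.misses - earlier.misses;
    const std::uint64_t total = hits + misses;
    return total == 0 ? 0.0
                      : static_cast<double>(hits) / static_cast<double>(total);
}
```

With a cache that started cold and has warmed up since, the windowed value reflects the recent behaviour while the global value stays dominated by the early misses.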
2024-08-21 10:03:39 +08:00
fc0222a64c
[opt](info) processlist schema table support show all fe ( #38701 ) ( #38953 )
...
pick #38701
2024-08-07 11:01:46 +08:00
9d23ccf1f2
[Improvement](schema scan) Use async scanner for schema scanners (#38… ( #38666 )
...
…403)
2024-08-01 16:05:24 +08:00
dda25cceb6
[Bug](information-schema) fix some bug of information_schema.PROCESSLIST ( #36447 )
...
pick from #36409
2024-06-18 16:45:48 +08:00
75a6f28f2e
[cherry-pick]Add query type when report ( #35918 )
...
pick #34978
2024-06-11 10:51:59 +08:00
f38ecd349c
[enhancement](memory) return error if allocate memory failed during add rows method ( #35085 )
...
* return error when add rows failed
* f
---------
Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-05-22 00:53:34 +08:00
9b5028785d
[fix](prepare) fix datetimev2 return err when binary_row_format ( #34662 )
...
Fix datetimev2 returning an error when binary_row_format is used. Before this PR, the backend always returned datetimev2 via to_string.
Fix the datetimev2 return metadata losing its scale.
2024-05-18 18:37:41 +08:00
39fdc9ba0c
[refactor](executor)Rename workload schedule policy #34497
2024-05-08 08:35:20 +08:00
299d069da9
Fix alter policy failed ( #33910 )
2024-04-22 22:33:24 +08:00
03c3419265
[Refactor](executor)Add workload schedule policy table ( #33729 )
2024-04-20 20:06:34 +08:00
face7c42fd
[enhancement](plsql) Support select * from routines ( #32866 )
...
Support showing PL/SQL procedures using select * from routines.
2024-04-17 23:42:12 +08:00
1be753ed75
[enhancement](mysql compatible) add user and procs_priv tables to mysql db in all catalogs ( #33058 )
...
This PR aims to improve compatibility with BI tools (such as DBeaver and DataGrip) that connect to Doris through the MySQL connector, because some BI tools query tables in the mysql database. In our tests, the user and procs_priv tables were the ones mainly queried. This PR adds these two tables and populates the user table with actual data. Note, however, that most fields in the user table are in Doris's own format rather than MySQL format, so this only ensures that BI tools do not get errors when accessing these tables; it does not guarantee that the data is displayed completely. The tables under Doris's mysql database do not support data modification.
Thanks to @liujiwen-up for assisting in testing
2024-04-17 23:42:12 +08:00
97a2977f2a
[improvement](executor)Add tag property for workload group #32874
2024-04-10 11:34:29 +08:00
fae55e0e46
[Feature](information_schema) add processlist table for information_schema db ( #32511 )
2024-04-07 23:24:22 +08:00
326a264fcd
[Improvement](executor)Add spill property for workload group #32554
2024-03-22 16:38:19 +08:00
fd0bc720e9
[opt](information_schema) Add DEFAULT_ENCRYPTION column to schemata table ( #32501 )
2024-03-22 08:52:16 +08:00
83ab61ad22
Add QUEUE_START_TIME/QUEUE_END_TIME/QUERY_STATUS column for active_queries ( #32259 )
2024-03-16 20:53:46 +08:00
258dcfca97
[Refactor](executor)Add information_schema.workload_groups ( #32195 ) ( #32314 )
2024-03-15 20:46:54 +08:00
df5ec16d7c
[Refactor](exectuor)Add schema type table active_queries ( #32057 )
...
* Add schema type table active_queries
2024-03-15 17:57:28 +08:00
c5390d00bb
[Improvement]Add schema table backend_active_tasks ( #31945 )
2024-03-09 19:55:48 +08:00
8fc9d80479
[compatibility](MySQL) update charset to utf8mb4, collation to utf8mb4_0900_bin ( #31046 )
...
Doris's behaviour is more like utf8mb4 and utf8mb4_0900_bin than utf8 and utf8_general_ci
2024-02-21 17:01:39 +08:00
f565f60bc3
[refactor](standard)BE:Initialize pointer variables in the class to nullptr by default ( #27587 )
2023-11-28 13:02:30 +08:00
2f41e0c823
[FIX](complextype)fix information schema for complex type ( #27203 )
...
When selecting from information_schema, complex type information was not shown.
2023-11-18 11:32:32 +08:00
baae7bf339
[fix](information_schema)fix bug that metadata_name_ids error tableid and append information_schema case. ( #26238 )
...
Fix the bug from #24059.
Added some information_schema scanner tests.
files
schema_privileges
table_privileges
partitions
rowsets
statistics
table_constraints
With infodb_support_ext_catalog=false, the tests now cover all tables under the information_schema database.
2023-11-09 14:07:12 +08:00
693982fd1a
[feature](decimal) support decimal256 ( #25386 )
2023-10-25 15:47:51 +08:00
2e2d5bcba2
[Improvements](status) catch some error status ( #25677 )
...
catch some error status
2023-10-23 10:19:08 +08:00
642c149e6a
remove datetime_value and move vecdatetime_value to doris namespace ( #25695 )
...
remove datetime_value and move vecdatetime_value to doris namespace
2023-10-20 22:08:17 +08:00
642e5cdb69
[Fix](Status) Make Status [[nodiscard]] and handle returned Status correctly ( #23395 )
2023-09-29 22:38:52 +08:00
3b4d8b4ac8
[pipelineX](feature) Support schema scan operator ( #24850 )
2023-09-25 14:42:25 +08:00
e680d42fe7
[feature](information_schema)add metadata_name_ids for quickly get catlogs,db,table and add profiling table in order to Compatible with mysql ( #22702 )
...
Add information_schema.metadata_name_ids for quickly getting catalogs, databases, and tables.
1. Table structure:
```mysql
mysql> desc internal.information_schema.metadata_name_ids;
+---------------+--------------+------+-------+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------------+--------------+------+-------+---------+-------+
| CATALOG_ID | BIGINT | Yes | false | NULL | |
| CATALOG_NAME | VARCHAR(512) | Yes | false | NULL | |
| DATABASE_ID | BIGINT | Yes | false | NULL | |
| DATABASE_NAME | VARCHAR(64) | Yes | false | NULL | |
| TABLE_ID | BIGINT | Yes | false | NULL | |
| TABLE_NAME | VARCHAR(64) | Yes | false | NULL | |
+---------------+--------------+------+-------+---------+-------+
6 rows in set (0.00 sec)
mysql> select * from internal.information_schema.metadata_name_ids where CATALOG_NAME="hive1" limit 1 \G;
*************************** 1. row ***************************
CATALOG_ID: 113008
CATALOG_NAME: hive1
DATABASE_ID: 113042
DATABASE_NAME: ssb1_parquet
TABLE_ID: 114009
TABLE_NAME: dates
1 row in set (0.07 sec)
```
2. Creating or dropping a catalog does not require refreshing the catalog.
```mysql
mysql> select count(*) from internal.information_schema.metadata_name_ids\G;
*************************** 1. row ***************************
count(*): 21301
1 row in set (0.34 sec)
mysql> drop catalog hive2;
Query OK, 0 rows affected (0.01 sec)
mysql> select count(*) from internal.information_schema.metadata_name_ids\G;
*************************** 1. row ***************************
count(*): 10665
1 row in set (0.04 sec)
mysql> create catalog hive3 ...
mysql> select count(*) from internal.information_schema.metadata_name_ids\G;
*************************** 1. row ***************************
count(*): 21301
1 row in set (0.32 sec)
```
3. Creating or dropping a table does not require refreshing the catalog.
```mysql
mysql> CREATE TABLE IF NOT EXISTS demo.example_tbl ... ;
mysql> select count(*) from internal.information_schema.metadata_name_ids\G;
*************************** 1. row ***************************
count(*): 10666
1 row in set (0.04 sec)
mysql> drop table demo.example_tbl;
Query OK, 0 rows affected (0.01 sec)
mysql> select count(*) from internal.information_schema.metadata_name_ids\G;
*************************** 1. row ***************************
count(*): 10665
1 row in set (0.04 sec)
```
4. You can set a query timeout to prevent queries from taking too long.
```
fe.conf : query_metadata_name_ids_timeout
the time used to obtain all tables in one database
```
5. Add information_schema.profiling for compatibility with MySQL.
```mysql
mysql> select * from information_schema.profiling;
Empty set (0.07 sec)
mysql> set profiling=1;
Query OK, 0 rows affected (0.01 sec)
```
2023-08-31 21:22:26 +08:00
1d05feea1b
[Feature](Nereids) add executable function to support fold constant for functions ( #18209 )
...
1. Add date-time functions for fold constant for Nereids.
The executable date-time functions that Nereids supports so far:
- now()
- now(int)
- current_timestamp()
- current_timestamp(int)
- localtime()
- localtimestamp()
- curdate()
- current_date()
- curtime()
- current_time()
- date_{add/sub}(),{years/months/days/hours/minutes/seconds}_{add/sub}()
- datediff()
- {date/datev2}()
- {year/quarter/month/day/hour/minute/second}()
- dayof{year/month/week}()
- date_format()
- date_trunc()
- from_days()
- last_day()
- to_monday()
- from_unixtime()
- unix_timestamp()
- utc_timestamp()
- to_date()
- to_days()
- str_to_date()
- makedate()
2. Problems solved:
- enable datev2/datetimev2 by default.
- refactor Nereids foldConstantOnFE and support folding nested expressions.
- split the executable functions into multiple files so they are easier to read and extend
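Folding a nested expression can be sketched as a bottom-up rewrite: fold the children first, then evaluate the node if every child has become a literal. The sketch below is an illustration only and is written in C++ rather than the FE's Java; the `Expr` type and the helper names are hypothetical, and integer addition stands in for the executable functions listed above.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical, minimal expression tree: integer literals, column
// references, and a binary Add node.
struct Expr {
    enum class Kind { Literal, Column, Add };
    Kind kind = Kind::Literal;
    long long value = 0;             // used by Literal
    std::string name;                // used by Column
    std::shared_ptr<Expr> lhs, rhs;  // used by Add
};
using ExprPtr = std::shared_ptr<Expr>;

inline ExprPtr literal(long long v) {
    auto e = std::make_shared<Expr>();
    e->kind = Expr::Kind::Literal;
    e->value = v;
    return e;
}

inline ExprPtr column(std::string name) {
    auto e = std::make_shared<Expr>();
    e->kind = Expr::Kind::Column;
    e->name = std::move(name);
    return e;
}

inline ExprPtr add(ExprPtr l, ExprPtr r) {
    auto e = std::make_shared<Expr>();
    e->kind = Expr::Kind::Add;
    e->lhs = std::move(l);
    e->rhs = std::move(r);
    return e;
}

// Fold children first, then replace the node with a literal when both
// children are literals. Non-constant subtrees are kept, but any constant
// parts nested inside them are still folded.
inline ExprPtr fold(const ExprPtr& e) {
    if (e->kind != Expr::Kind::Add) {
        return e;  // literals and columns are already in their final form
    }
    ExprPtr l = fold(e->lhs);
    ExprPtr r = fold(e->rhs);
    if (l->kind == Expr::Kind::Literal && r->kind == Expr::Kind::Literal) {
        return literal(l->value + r->value);
    }
    return add(l, r);
}
```

For example, `fold(add(literal(1), add(literal(2), literal(3))))` collapses to a single literal, while `fold(add(column("c"), add(literal(2), literal(3))))` keeps the outer Add but folds its right child.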
2023-05-17 21:26:31 +08:00
2bdfaac609
[fix](ubsan) fix ubsan errors ( #19658 )
...
fix ubsan errors:
doris/be/src/util/string_parser.hpp:275:58: runtime error: signed integer overflow: 2147483647 + 1 cannot be represented in type 'int'
doris/be/src/vec/functions/functions_comparison.h:214:51: runtime error: addition of unsigned offset to 0x7fea6c6b7010 overflowed to 0x7fea6c6b700c
doris/be/src/vec/functions/multiply.cpp:67:50: runtime error: signed integer overflow: 1295699415680000000 * 0x0000000000015401d0a4cd4890a77700 cannot be represented in type '__int128'
doris/be/src/vec/aggregate_functions/aggregate_function_percentile_approx.h:445:73: runtime error: addition of unsigned offset to 0x7feca3343d10 overflowed to 0x7feca3343d08
doris/be/src/exec/schema_scanner/schema_tables_scanner.cpp:330:24: run
2023-05-17 09:32:03 +08:00
2c836251b2
[Fix](schema scanner) Fixed the problem of overflow when multiplying two INT
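The commit message does not show the fix itself. As a general illustration of this class of bug (not the actual change), multiplying two 32-bit `int` values is undefined behaviour in C++ when the product does not fit in 32 bits; one common remedy is to widen the operands to 64 bits before multiplying. The function names below are hypothetical.

```cpp
#include <cstdint>
#include <limits>

// Widen both operands before multiplying, so the multiplication itself is
// done in 64 bits. The product of two int32 values always fits in int64.
constexpr std::int64_t widened_product(std::int32_t a, std::int32_t b) {
    return static_cast<std::int64_t>(a) * static_cast<std::int64_t>(b);
}

// True when the product could be stored back into an int32 without loss.
constexpr bool product_fits_in_int32(std::int32_t a, std::int32_t b) {
    return widened_product(a, b) >= std::numeric_limits<std::int32_t>::min() &&
           widened_product(a, b) <= std::numeric_limits<std::int32_t>::max();
}
```

Casting only the result, as in `static_cast<std::int64_t>(a * b)`, does not help, because the overflow has already happened in the 32-bit multiplication.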
2023-04-25 23:58:47 +08:00
339d804ec4
[Refactor](exceptionsafe) add factory creator to some class ( #19000 )
2023-04-25 14:33:47 +08:00
e412dd12e8
[chore](build) Use include-what-you-use to optimize includes (PART II) ( #18761 )
...
Currently, there are some useless includes in the codebase. We can use a tool named include-what-you-use to optimize these includes. By using a strict include-what-you-use policy, we can get lots of benefits from it.
2023-04-19 23:11:48 +08:00
9e960f4c4f
[chore](build) Use include-what-you-use to optimize includes ( #18681 )
...
Currently, there are some useless includes in the codebase. We can use a tool named include-what-you-use to optimize these includes. By using a strict include-what-you-use policy, we can get lots of benefits from it.
2023-04-17 11:44:58 +08:00
759f1da32e
[Enhencement](Backends) add HostName filed in backends table and delete backends table in information_schema ( #18156 )
...
1. Add `HostName` field for `show backends` statement and `backends()` tvf.
2. delete the `backends` table in `information_schema` database
2023-04-07 08:30:42 +08:00
e77833bfa1
[Bug](materialized-view) fix where clause persistence replay incorrect ( #18228 )
...
fix where clause persistence replay incorrect
2023-04-03 12:49:01 +08:00
d9fe5f7b67
[enhancement](memory) Remove MemPool and replace it with Arena ( #17820 )
...
Arena can replace MemPool in most scenarios. The exception is memory reuse: MemPool can reuse previously allocated chunks after clear(), but Arena cannot.
Some comparisons between MemPool and Arena:
1. Expansion
Arena: for requests below 128M, chunk sizes grow exponentially (powers of 2); for requests above 128M, it allocates 128M * n > `size`, where n is the smallest value that satisfies the expression;
MemPool: for requests below 512K, chunk sizes grow exponentially (powers of 2); for requests above 512K, it allocates a separate chunk of exactly `size` bytes.
After Arena has allocated a chunk larger than 128M, every later chunk is at least 128M, which may waste memory. MemPool behaves similarly: after a 512K chunk has been allocated, every later chunk is at least 512K.
2. Alignment
MemPool defaults to 16 alignment, because memtable and other places that use int128 require 16 alignment;
Arena has no default alignment;
3. Memory reuse
Arena only supports `rollback`, which reuses memory in the current chunk, usually the most recently requested memory.
MemPool supports clear(), after which all chunks can be reused; alternatively, ReturnPartialAllocation() rolls back the most recently requested memory. If the last chunk has no free memory, MemPool searches for the chunk with the most free space to allocate from.
4. Realloc
Arena supports `realloc` of contiguous memory, and also extending contiguous memory from any position in the last allocation (`alloc_continue`). The differences between `alloc_continue` and `realloc` are:
1. `alloc_continue` does not need the old size to be specified; the old size defaults to head->pos - range_start
2. `alloc_continue` supports expansion from range_start when additional_bytes is between head and pos, which is equivalent to reusing part of the memory, whereas `realloc` always allocates new memory
MemPool does not support realloc, but supports transferring or absorbing chunks between two MemPools
5. Memory limit check
MemPool checks the mem limit itself, whereas Arena checks at the Allocator layer.
6. Support for ASAN
Arena does some extra work for ASAN
7. Error handling
MemPool can return the error message of an allocation failure directly through `Status`, whereas Arena throws an exception.
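The expansion rules in item 1 can be sketched as below. This is a simplified reading of the description above, not the actual Arena or MemPool code: the function and constant names are hypothetical, the exponential growth is modelled as "smallest power of two that holds the request", and the effect of previously allocated chunks on the minimum chunk size is ignored.

```cpp
#include <cstdint>

const std::uint64_t kArenaLinearThreshold = 128ULL * 1024 * 1024;  // 128M
const std::uint64_t kMemPoolThreshold = 512ULL * 1024;             // 512K

// Smallest power of two that is >= size.
inline std::uint64_t round_up_pow2(std::uint64_t size) {
    std::uint64_t p = 1;
    while (p < size) {
        p *= 2;
    }
    return p;
}

// Arena, as described: powers of two below 128M; otherwise the smallest
// multiple of 128M that is strictly greater than the requested size.
inline std::uint64_t arena_chunk_size(std::uint64_t size) {
    if (size < kArenaLinearThreshold) {
        return round_up_pow2(size);
    }
    return kArenaLinearThreshold * (size / kArenaLinearThreshold + 1);
}

// MemPool, as described: powers of two below 512K; otherwise a dedicated
// chunk of exactly the requested size.
inline std::uint64_t mempool_chunk_size(std::uint64_t size) {
    if (size < kMemPoolThreshold) {
        return round_up_pow2(size);
    }
    return size;
}
```

Under these assumptions a 300M request costs Arena a 384M chunk, while MemPool would allocate exactly 300M, which is the waste that the considerations below are concerned with.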
Tests that Arena can consider
1. After the last allocated chunk exceeds 128M, the minimum chunk size becomes 128M, which seems to waste memory;
2. Support clear() and memory reuse;
3. Add a large-allocation list: allocate requests larger than 128M with a size equal to `size`, so that the current chunk is not left partly unused, which is wasteful.
4. In some cases, it may be possible to allocate backwards to find chunks t
2023-03-29 20:56:49 +08:00