Metadata placeholder for statistics in version 2.1.x. Users can upgrade to this version, but rollback is not supported.
After this change, statistics-related functions no longer need to modify metadata in the 2.1 series.
In this PR, we control whether functions in filter conditions are pushed down to the external data source query, changing the FE config `enable_fun_pushdown` to the session variable `enable_ext_func_pred_pushdown`.
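A minimal usage sketch, assuming the new variable behaves like other Doris session variables:
```
set enable_ext_func_pred_pushdown = true;          -- for the current session
set global enable_ext_func_pred_pushdown = true;   -- as the cluster-wide default
```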
Followup #28890
Make HttpSqlConverterPlugin and AuditLoader Doris built-in plugins,
to make it simple for users to use SQL dialect conversion and the audit loader.
HttpSqlConverterPlugin
By default, nothing is changed.
There is a new global variable `sql_converter_service`, default empty. If set, the HttpSqlConverterPlugin will be enabled:
set global sql_converter_service = "http://127.0.0.1:5001/api/v1/convert"
AuditLoader
By default, nothing is changed.
There is a new global variable `enable_audit_plugin`, default false. If set to true, the audit loader plugin will be enabled.
Doris will create the `audit_log` table in `__internal_schema` on startup.
If `enable_audit_plugin` is true, audit logs will be inserted into the `audit_log` table.
Three other global variables are related to this plugin:
- `audit_plugin_max_batch_interval_sec`: the max interval at which the audit loader inserts a batch of audit logs.
- `audit_plugin_max_batch_bytes`: the max batch size for the audit loader to insert a batch of audit logs.
- `audit_plugin_max_sql_length`: the max length of a statement recorded in the audit log.
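A usage sketch, assuming these behave like ordinary global variables; the values below are illustrative, not the defaults:
```
set global enable_audit_plugin = true;
set global audit_plugin_max_batch_interval_sec = 60;   -- flush a batch at least every 60s
set global audit_plugin_max_batch_bytes = 52428800;    -- or once a batch reaches 50MB
set global audit_plugin_max_sql_length = 4096;         -- truncate longer statements
```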
The MySQL type code that the MAP type is mapped to is 400, but 400 is an unknown type code to MySQL.
When querying through the `/api/query` HTTP API or using the MariaDB JDBC driver, an exception occurs.
With the MySQL JDBC driver, the value is converted into binary form, and the correct data can be read through the string type.
Therefore, the custom MySQL type for MAP was removed and changed to the string type, so that both the MariaDB and MySQL JDBC drivers work normally.
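A sketch of the visible effect, with illustrative table and column names: clients now receive the MAP value with a string type code and can read it as plain text.
```
select properties from example_tbl limit 1;
-- properties is a MAP column; every MySQL-protocol client now reads it
-- as a string, e.g. {"color":"red", "size":"L"}
```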
Add an `information_schema` database for all catalogs.
This is useful when using BI tools to connect to Doris:
the tools can get meta info from `information_schema`.
This PR mainly changes:
1. There will be an `information_schema` db in each catalog.
2. Each `information_schema` db only stores the meta info of the catalog it belongs to.
3. For `information_schema`, the `TABLE_SCHEMA` column's value is the database name.
4. There is a new global variable `show_full_dbname_in_info_schema_db`, default false. If set to true,
the `TABLE_SCHEMA` column's value takes the form `ctl.db`, because:
when connecting to Doris, the `database` info in the connection URL will be `xxx?db=ctl.db`,
and some BI tools will then query `information_schema` with SQL like
`select * from information_schema.columns where TABLE_SCHEMA = "ctl.db"`,
so the value has to be formatted as `ctl.db`.
E.g., the `information_schema.columns` table in the external catalog `doris` looks like:
```
mysql> select * from information_schema.columns limit 1\G
*************************** 1. row ***************************
TABLE_CATALOG: doris
TABLE_SCHEMA: doris.__internal_schema
TABLE_NAME: column_statistics
COLUMN_NAME: id
ORDINAL_POSITION: 1
COLUMN_DEFAULT: NULL
IS_NULLABLE: NO
DATA_TYPE: varchar
CHARACTER_MAXIMUM_LENGTH: 4096
CHARACTER_OCTET_LENGTH: 16384
NUMERIC_PRECISION: NULL
NUMERIC_SCALE: NULL
DATETIME_PRECISION: NULL
CHARACTER_SET_NAME: NULL
COLLATION_NAME: NULL
COLUMN_TYPE: varchar(4096)
COLUMN_KEY:
EXTRA:
PRIVILEGES:
COLUMN_COMMENT:
COLUMN_SIZE: 4096
DECIMAL_DIGITS: NULL
GENERATION_EXPRESSION: NULL
SRS_ID: NULL
```
5. Modify the behavior of
- show tables
- show databases
- show columns
- show table status
The above statements may query the `information_schema` db if there is a `where` predicate after them; see the sketch below.
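A sketch of such statements (db and table names are illustrative); with a `where` predicate they are now answered from `information_schema`:
```
show tables where Tables_in_db1 like "tbl%";
show table status where Name = "tbl1";
```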
The current logic for SQL dialect conversion lives entirely in the `fe-core` module, which may lead to the following issues:
- Changes to the dialect conversion logic may occur frequently, requiring users to upgrade their Doris version frequently, leading to a long change cycle.
- The cost of customized development is high, requiring users to replace the fe-core JAR package.
Turning it into a plugin addresses both issues.
We change the memtable size from 200MB to 100MB to achieve smoother flush
performance. We change loadStreamPerNode from 20 to 60 to prevent stream
RPC from becoming the bottleneck when memtable_on_sink_node is enabled. We change
the default s3 & broker load parallelism to make the most of CPUs on modern
multi-core systems.
Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>
We integrate the new load job manager into the new job scheduling framework, so that the insert-into task can be scheduled after the broker load SQL is converted to an insert-into TVF (table value function) SQL.
issue: https://github.com/apache/doris/issues/24221
Now supports:
1. loading data via TVF insert-into SQL, but only simple loads (the columns need to be defined in the table); see the sketch after this list
2. show load stmt
- job id, label name, job state, time info
- simple progress
3. cancelling a load from a db
4. enabling the new load path through `Config.enable_nereids_load`
5. replaying jobs after restarting Doris
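A sketch of the converted form and the management statements, assuming the standard s3 TVF; the URI, label, credentials, and names are illustrative:
```
-- a broker/s3 load rewritten as insert-into over a table value function
insert into db1.tbl1 (k1, k2)
select c1, c2 from s3(
    "uri" = "s3://bucket/path/data.csv",
    "format" = "csv",
    "s3.endpoint" = "http://s3.example.com",
    "s3.access_key" = "ak",
    "s3.secret_key" = "sk"
);

-- inspect and cancel jobs managed by the new framework
show load where label = "example_label";
cancel load from db1 where label = "example_label";
```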
TODO:
- support partition insert jobs
- support showing statistics from BE
- support multiple tasks and collect task statistics
- support transactional tasks
- add UT cases
Add a new FE Config `sql_convertor_service`.
If this config is set, and the session variable `sql_dialect` is set,
Doris will try to use a standalone SQL converter service to convert the user's input SQL
from the specified dialect to Doris SQL. E.g.:
```
mysql> set sql_dialect="presto";
Query OK, 0 rows affected (0.02 sec)
mysql> select * from db1.tbl1 where "k1" = 1; # will be converted to select * from db1.tbl1 where `k1` = 1;
+------+------+
| k1 | k2 |
+------+------+
| 1 | 2 |
+------+------+
1 row in set (0.08 sec)
```
The SQL converter service should be an HTTP service.
The request and response body formats can be found in `SQLDialectUtils.java`.
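A minimal setup sketch; the URL is illustrative, and the FE config line belongs in fe.conf (shown here as a comment since it is not SQL):
```
-- fe.conf: sql_convertor_service = http://127.0.0.1:5001/api/v1/convert
set sql_dialect = "presto";
-- from here on, each statement is sent to the converter service before parsing
```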
Remove the EXPERIMENTAL tag for `enable_query_hive_views` and set `enable_query_hive_views` to true by default.
This feature has been used for several months on our cluster, which has more than a hundred thousand tables; I think it is fine to enable it by default.
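A sketch for opting out, assuming the variable can be set per session or globally like other Doris variables:
```
set enable_query_hive_views = false;          -- current session only
set global enable_query_hive_views = false;   -- cluster-wide
```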
- Running tasks can now be shown, and a cancellation failure is fixed
- When an insert task's scheduling cycle is reached while a previous run is still executing, the new run of this task is canceled
- Refactor the SQL for job status changes
- Fix the timer job window error
- Support cancelling tasks
Set the upload/download task number per BE, improving the overall speed of upload/download and enhancing the performance of backup and recovery.
---------
Signed-off-by: nextdreamblue <zxw520blue1@163.com>
Support where, group by, having, and order by clauses without a from clause in a query statement.
For example:
SELECT 1 AS a, COUNT(*), SUM(2), AVG(1), RANK() OVER() AS w_rank
WHERE 1 = 1
GROUP BY a, w_rank
HAVING COUNT(*) IN (1, 2) AND w_rank = 1
ORDER BY a;
This will return the result:
+------+----------+--------+--------+--------+
| a    | count(*) | sum(2) | avg(1) | w_rank |
+------+----------+--------+--------+--------+
|    1 |        1 |      2 |    1.0 |      1 |
+------+----------+--------+--------+--------+
Another example:
select 1 c1, 2 union (select "hell0", "") order by c1;
The second column's datatype will be varchar(65533); 65533 is the default varchar length.
This will return the result:
+-------+------+
| c1    | 2    |
+-------+------+
| 1     | 2    |
| hell0 |      |
+-------+------+