doris

Author	SHA1	Message	Date
wangbo	df5ec16d7c	[Refactor](exectuor)Add schema type table active_queries (#32057 ) * Add schema type table active_queries	2024-03-15 17:57:28 +08:00
yiguolei	c20567d088	change to 2.1.1-rc01	2024-03-15 12:59:42 +08:00
Mingyu Chen	6d2924668e	[fix](audit-loader) fix invalid token check logic (#32095 ) The check of the token should be forwarded to Master FE. I add a new RPC method `checkToken()` in Frontend for this logic. Otherwise, after the audit loader is enabled, logs from a non-master FE cannot be loaded into the audit table and fail with an `Invalid token` error.	2024-03-12 22:52:11 +08:00
HappenLee	2470634859	[RuntimeFilter] fix <=> runtime filter failed bug (#32003 )	2024-03-12 14:13:13 +08:00
TengJianPing	3358f76a7f	[feature](spill) Implement spill to disk for hash join, aggregation and sort for pipelineX (#31910 ) Co-authored-by: Jerry Hu <mrhhsg@gmail.com>	2024-03-12 14:12:09 +08:00
wangbo	c5390d00bb	[Improvement]Add schema table backend_active_tasks (#31945 )	2024-03-09 19:55:48 +08:00
Vallish Pai	4bfecac08a	[enhancement](plsql) Support show procedure and show create procedure (#31297 ) (#31763 )	2024-03-09 19:45:03 +08:00
Gabriel	b2de83f250	[agg](conf) Add a knob to control distinct agg (#31930 ) Add a knob to control distinct agg	2024-03-09 19:44:54 +08:00
starocean999	3b56c4bcfa	[enhancement](nereids)send is_nereids flag to be (#31752 )	2024-03-09 19:43:12 +08:00
wangbo	28f0b7eb32	[Improvement](profile)Add tvf active_be_tasks() #31815	2024-03-07 16:12:23 +08:00
abmdocrt	7c30cb20fd	[Fix](partial update) Fix partial update load false when schema includes auto increment column (#31725 ) Problem: When partially updating columns without specifying the auto-increment column, and the imported data contains new keys, an error stating the auto-increment column could not be found occurs. Reason: The logic for partial column updates does not account for new keys in auto-increment columns. Since auto-increment columns can be generated by the system, it's possible to omit this column data during import. However, partial column updates treat this as a regular column, expecting it to be nullable or have a default value for automatic filling, overlooking the fact that auto-increment columns can also be auto-filled. This oversight leads to the error. Solution: Incorporate a check for auto-increment columns into the partial column update logic, and include the logic for generating auto-increment column values in the process of completing partial updates.	2024-03-06 13:06:27 +08:00
HappenLee	231768db0d	[Performance](exec) Support runtime filter in <=> join (#31754 )	2024-03-06 13:06:26 +08:00
Pxl	25d1934289	[Feature](topn) support multiple topn filter on backend (#31665 ) support multiple topn filter on backend	2024-03-06 13:05:22 +08:00
yiguolei	792907ff89	doris-2.1.0-rc11	2024-03-04 18:15:17 +08:00
yiguolei	8ef3b634cc	2.1.0 release	2024-03-04 18:04:43 +08:00
HappenLee	b248d3a27e	[Refactor](rf) Refactor the rf code interface to remove update filter v1 (#31643 )	2024-03-02 17:12:49 +08:00
zy-kkk	07224686ef	[feature](jdbc catalog) support db2 jdbc catalog (#31627 )	2024-03-01 14:19:28 +08:00
zhangstar333	819ab6fc00	[feature](sink) support paritition tablet sink shuffle (#30914 ) Co-authored-by: morrySnow <morrysnow@126.com>	2024-03-01 04:25:43 +08:00
zzzxl	92e3b31f50	[feature](invert index) match_phrase_edge feature added (#31142 )	2024-02-29 19:51:18 +08:00
slothever	0b5b7175d6	[fix](multi-catalog) add max compute custom odps and tunnel url (#31390 ) add max compute custom odps and tunnel url	2024-02-29 16:44:40 +08:00
Guangdong Liu	9c4708ee74	[function](random_bytes)add random_bytes function (#31547 ) SELECT random_bytes(10); random_bytes(10) \| ----------------------+ 0x9b8ea00b7d1084bc5b26\|	2024-02-29 16:44:39 +08:00
wangbo	7f566f9365	Reset report_workload_runtime_status to optional (#31479 )	2024-02-28 13:07:47 +08:00
Xinyi Zou	c0754583cb	[opt](plsql) Fix procedure key compatibility (#31445 ) use dbId replace dbName, because dbName may be renamed by Alter. procedure key add package name (only reserved, currently no plans to support package) Optimize procedure create and exception	2024-02-28 13:07:47 +08:00
wangbo	c34639245e	[Improvement](executor)add remote scan thread pool (#31376 ) * add remote scan thread pool * +1	2024-02-27 10:12:33 +08:00
wangbo	1127b0065a	[Improment](executor)Add scanbytes/scanrows condition (#31364 ) * Add scanbytes/scanrows condition * fix reg	2024-02-27 10:12:33 +08:00
Chester	f163d56a98	[feature](function) support sequence function(alias of array_range), enhance both to handle datetimev2 (#30823 )	2024-02-27 10:12:19 +08:00
yiguolei	41a67fc218	change to 2.1.0-rc10	2024-02-26 01:36:54 +08:00
yangshijie	8f77e6363a	[Feature](function) Support xxhash function like murmur hash function (#31193 )	2024-02-23 19:03:28 +08:00
yiguolei	7bb276d071	change to rc09	2024-02-22 22:23:01 +08:00
yiguolei	4735c5b50f	2.1.0-rc08	2024-02-20 23:45:35 +08:00
wangbo	97c9d75af3	[Feature](executor)Add scan_thread_num property for workload group (#31106 )	2024-02-20 16:24:05 +08:00
koarz	6cf7468073	[enhancement](function) change some function nullable mode (#30991 ) change some function nullable mode	2024-02-18 14:45:25 +08:00
Tiewei Fang	f65844fae4	[Enhencement](Outfile/Export) Export data to csv file format with BOM (#30533 ) The UTF-8 format used on Windows systems includes a BOM. We add a new user property to `Outfile/Export`, so that when exporting Doris data, users can choose whether to include a BOM at the beginning of the CSV file. Usage: ```sql -- outfile: select * from demo.student into outfile "file:///xxx/export/exp_" format as csv properties( "column_separator" = ",", "with_bom" = "true" ); -- Export: EXPORT TABLE student TO "file:///xx/tmpdata/export/exp_" PROPERTIES( "format" = "csv", "with_bom" = "true" ); ```	2024-02-16 10:16:40 +08:00
wangbo	24e80b23a5	[Feature](executor)Support ShowProcessStmt Show all Fe connection (#30907 )	2024-02-16 10:16:39 +08:00
Siyang Tang	08c196f3dc	[enhancement](stmt-forward) record query result for proxy query to avoid EOF (#30536 )	2024-02-16 10:12:25 +08:00
Ashin Gau	366a6792bf	[refactor](scanner) refactoring and optimizing scanner scheduling (#30746 )	2024-02-16 10:12:24 +08:00
Sun Chenyang	0442d5dc0e	[fix](Variant Type) Add sparse columns meta to fix compaction (#28673 ) Co-authored-by: eldenmoon <15605149486@163.com>	2024-02-16 10:12:23 +08:00
Xinyi Zou	08508d65fd	[feature-wip](plsql)(step1) Support PL-SQL (#30817 ) # 1. Motivation PL-SQL (Stored procedure) is a collection of sql, which is defined and used similarly to functions. It supports conditional judgments, loops and other control statements, supports cursor processing of result sets, and can write business logic in SQL. Hive uses Hplsql to support PL-SQL and is largely compatible with Oracle, Impala, MySQL, Redshift, PostgreSQL, DB2, etc. We support PL-SQL in Doris based on Hplsql to achieve compatibility with Stored procedures of database systems such as Oracle and PostgreSQL. Reference documentation: Hive: http://mail.hplsql.org Oracle: https://docs.oracle.com/en/database/oracle/oracle-database/21/lnpls/plsql-language-fundamentals.html#GUID-640DB3AA-15AF-4825-BD6C-1D4EB5AB7715 Mysql: https://dev.mysql.com/doc/refman/8.0/en/create-procedure.html # 2. Implementation Take the following case as an example to explain the process of connecting Doris FE to execute stored procedures using the Mysql protocol. ``` CREATE OR REPLACE PROCEDURE A(IN name STRING, OUT result int) select count() from test; select count() into result from test where k = name; END declare result INT default = 0; call A('xxx', result); print result; ``` ![image](https://github.com/apache/doris/assets/13197424/0b78e039-0350-4ef1-bef3-0ebbf90274cd) 1. Add procedure and persist the Procedure Name and Source (raw SQL) into Doris FE metadata. 2. Call procedure, extract the actual parameter Value and Procedure Name in Call Stmt. Use Procedure Name to find the Source in the metadata, extract the Name and Type of the Procedure parameter, and match them with the actual parameter Value to form a complete variable <Name, Type, Value>. 3. Execute Doris Statement - Use Doris Logical Plan Builder to parse the Doris Statement syntax in Source, replace parameter variables, remove the into variable clause, and generate a Plan Tree that conforms to Doris syntax. 
- Use stmtExecutor to execute SQL and encapsulate the query result set iterator into QueryResult. - Output the query results to Mysql Channel, or write them into Cursor, parameters, and variables. - Stored Programs compatible with Mysql protocol support multiple statements. 4. Execute PL-SQL Statement - Use Plsql Logical Plan Builder to parse and execute PL-SQL Statement syntax in Source, including Loop, Cursor, IF, Declare, etc., and basically reuse HplSQL. # 3. TODO 1. Support drop procedure. 2. Create procedure only in `PlSqlOperation`. 3. Doris Parser supports declare variable. 4. Select Statement supports insert into variable. 5. Parameters and fields have the same name. 6. If Cursor exits halfway, will there be a memory leak? 7. Use getOriginSql(ctx) in syntax parsing LogicalPlanBuilder to obtain the original SQL. Is there any problem with special characters? 8. Supports complex types such as Map and Struct. 9. Test syntax such as Package. 10. Support UDF. 11. In Oracle, create procedure must have AS or IS after RIGHT_PAREN, but Mysql and Hive do not support AS or IS. Compatibility issues with Oracle will be discussed and resolved later. 12. Built-in functions require separate management. 13. Doris statement add stmt: begin_transaction_stmt, end_transaction_stmt, commit_stmt, rollback_stmt. 14. Add plsql stmt: cmp_stmt, copy_from_local_stmt, copy_stmt, create_local_temp_table_stmt, merge_stmt. # 4. Some questions 1. JDBC does not support the execution of stored procedures that return results. You can only select the execution results into a variable or write them into a table, because when multiple result sets are returned, JDBC needs to use the prepareCall statement to execute; otherwise the Statement of the returned result executes Finalize, and sending the EOF Packet reports an error. 2. Use PL-SQL Cursor to open multiple Query result set iterators at the same time. 
Doris BE will cache the intermediate status of these Queries (such as HashTable) and query results until the Query result set iteration is completed. If the Cursor is not used for a long time, a lot of memory is wasted. 3. In plsql/Var.defineType(), the corresponding Plsql Var type will be found through the Mysql type name string, and the corresponding relationship between Doris type and Plsql Var needs to be implemented. 4. Currently, PL-SQL Statement will be forwarded to Master FE for creation and calculation, which may affect other services on Doris FE and is limited by the performance of Doris FE. Consider moving it to Doris BE for execution. 5. The format of the result returned by Doris Statement is ```xxxx\n, xxxx\n, 2 rows affected (0.03 sec)```. PL-SQL uses Print to print variable values in an unformatted format, and JDBC cannot easily obtain the real results. # 5. Some thoughts The above execution of Doris Statement reuses Doris Logical Plan Builder for syntax parsing, parses it from top to bottom into a Plan Tree, and calls stmtExecutor for execution. PL-SQL replacement variables, removal of Into Variable and other operations are coupled in Doris syntax parsing. The advantage is that it can be made compatible with Doris grammar with only a few changes, but the disadvantage is that it will invade the Doris grammar parsing process. HplSQL performs a syntax parsing independently of Hive to implement variable substitution and other operations, and finally outputs a SQL that conforms to Hive syntax. The following is a simple syntax parsing process; the parsing of select, where, expression, table name, join, agg, order and other grammar must be re-implemented. The advantage is that it is completely independent from the original system, but the changes are too complicated. ![image](https://github.com/apache/doris/assets/13197424/7539e485-0161-44de-9100-1a01ebe6cc07)	2024-02-16 10:12:23 +08:00
zhangstar333	8d4e0c50c6	[feature](sink) support paritition tablet sink shuffle (#30821 )	2024-02-16 10:12:23 +08:00
Guangdong Liu	1ed24117ac	[function](url_decode)add url_decode function (#30667 )	2024-02-05 22:23:00 +08:00
koarz	48aaaa8005	[Enhancement](fuction) change function REPEAT nullable mode (#30743 )	2024-02-04 22:21:36 +08:00
Rohit Satardekar	6442663735	[Function](exec) upport atan2 math function (#30672 ) Co-authored-by: Rohit Satardekar <rohitrs1983@gmail.com>	2024-02-04 14:28:38 +08:00
zy-kkk	4f8730d092	[improvement](jdbc catalog) Optimize connection pool parameter settings (#30588 ) This PR makes the following changes to the connection pool of JDBC Catalog 1. Set the maximum connection survival time, the default is 30 minutes - Moreover, one-half of the maximum survival time is the recyclable time, - One-tenth is the check interval for recycling connections 2. Keepalive only takes effect on the connection pool on BE, and will be activated based on one-fifth of the maximum survival time. 3. The maximum number of existing connections is changed from 100 to 10 4. Add the connection cache recycling thread on BE, and add a parameter to control the recycling time, the default is 28800 (8 hours) 5. Add CatalogID to the key of the connection pool cache to achieve better isolation, requires refresh catalog to take effect 6. Upgrade druid connection pool to version 1.2.20 7. Added JdbcResource's setting of default parameters when upgrading the FE version to avoid errors due to unset parameters.	2024-02-03 20:26:03 +08:00
koarz	94eedd8ea4	[Enhancement](function)make SUBSTRING_INDEX function DEPEND_ON_ARGUMENT (#30392 )	2024-02-02 13:31:47 +08:00
wangbo	cd65a8c9a7	Remove useless statistics report path (#30687 )	2024-02-01 23:14:14 +08:00
Jibing-Li	7c7a423828	Sync stats cache while task finished, doesn't need to query column_statistics table. (#30609 )	2024-01-31 23:53:40 +08:00
Rohit Satardekar	19f57b544e	support cosh math function (#30602 ) Co-authored-by: Rohit Satardekar <rohitrs1983@gmail.com>	2024-01-31 23:53:39 +08:00
yangshijie	8b61b7c6cd	[exec](function) Add tanh func (#30555 )	2024-01-31 23:53:39 +08:00
wangbo	0433b8730d	[Feature](profile)add shuffle send rows/bytes #30456	2024-01-28 18:25:08 +08:00
yiguolei	f988686708	2.1.0-rc07	2024-01-27 10:55:03 +08:00


1155 Commits
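
Note on commit 4f8730d092 (#30588): the connection pool timings described in that message can be illustrated with a small Python sketch. This is only an arithmetic illustration, assuming the ratios stated in the commit message (one-half, one-tenth, one-fifth of the maximum survival time) and the 30-minute default. The function name and dictionary keys are hypothetical and are not Doris or Druid configuration names.

```python
def derive_pool_timings(max_lifetime_ms: int = 30 * 60 * 1000) -> dict:
    """Derive pool intervals from the maximum connection survival time.

    Hypothetical helper for illustration only; the ratios come from the
    commit message of #30588, the names are invented here.
    """
    if max_lifetime_ms <= 0:
        raise ValueError("max_lifetime_ms must be positive")
    return {
        # Maximum connection survival time (default: 30 minutes).
        "max_lifetime_ms": max_lifetime_ms,
        # One-half of the maximum survival time is the recyclable time.
        "recyclable_after_ms": max_lifetime_ms // 2,
        # One-tenth is the check interval for recycling connections.
        "recycle_check_interval_ms": max_lifetime_ms // 10,
        # Keepalive (BE only, per the commit) uses one-fifth.
        "keepalive_interval_ms": max_lifetime_ms // 5,
    }


if __name__ == "__main__":
    for name, value_ms in derive_pool_timings().items():
        print(f"{name}: {value_ms / 60000:g} min")
```

With the 30-minute default this gives a recyclable time of 15 minutes, a recycling check every 3 minutes, and a keepalive interval of 6 minutes.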