Commit Graph

708 Commits

Author SHA1 Message Date
7248420cfd [chore](session_variable) Add 'data_queue_max_blocks' to prevent the DataQueue from occupying too much memory. (#34017) (#34395) 2024-05-05 21:20:33 +08:00
35f8563a75 [feature](iceberg) support iceberg equality delete (#34223) (#34327)
bp #34223

Co-authored-by: Ashin Gau <AshinGau@users.noreply.github.com>
2024-04-30 11:51:29 +08:00
7cb00a8e54 [Feature](hive-writer) Implements s3 file committer. (#34307)
Backport #33937.
2024-04-29 19:56:49 +08:00
a050513c91 [Fix](clean trash) Fix clean trash use agent task (#33912) (#33972)
* [Fix](clean trash) Fix clean trash use agent task (#33912)

* add .h
2024-04-22 17:14:21 +08:00
7f61626c8d [fix](arrow_flight_sql) Fix ArrowSchema column alias (#33490)
run: select TABLE_SCHEMA as a, sum(TABLE_ROWS) as b from tables group by TABLE_SCHEMA limit 2;
old output:

   TABLE_SCHEMA                                             Nullable(Int64)_1
0  regression_test_mv_p0_sum_count                          9
1  regression_test_query_p0_sql_functions_string_functions  70414
new output:

   a                                                        b
0  regression_test_mv_p0_sum_count                          9
1  regression_test_query_p0_sql_functions_string_functions  70414
2024-04-22 11:28:22 +08:00
03c3419265 [Refactor](executor)Add workload schedule policy table (#33729) 2024-04-20 20:06:34 +08:00
20b37e7a18 Add workload group id in workload policy's property (#33483) 2024-04-17 23:42:14 +08:00
face7c42fd [enhancement](plsql) Support select * from routines (#32866)
Support listing PL/SQL procedures using select * from routines.
2024-04-17 23:42:12 +08:00
1be753ed75 [enhancement](mysql compatible) add user and procs_priv tables to mysql db in all catalogs (#33058)
Issue Number: close #xxx

This PR improves compatibility with BI tools (such as DBeaver and DataGrip) that connect to Doris through the MySQL connector, because some of these tools query tables in the mysql database. In our tests, the tables queried were mainly user and procs_priv. This PR adds these two tables and populates the user table with actual data. Note, however, that most fields in the user table use Doris's own format rather than the MySQL format. The change therefore only ensures that BI tools get no error when querying these tables; it does not guarantee that the data is displayed completely. Tables under Doris's mysql database also do not support data modification.
Thanks to @liujiwen-up for assisting in testing
2024-04-17 23:42:12 +08:00
b2b385a4ff [improve](fold) support complex type for constant folding (#32867) 2024-04-17 23:41:59 +08:00
Pxl
5f30463bb3 [Chore](descriptors) remove unused codes for descriptors (#33408)
remove unused codes for descriptors
2024-04-12 15:09:25 +08:00
716c146750 [fix](insert)fix hive external return msgs and exception and pass all columns to BE (#32824)
[fix](insert)fix hive external return msgs and exception and pass all columns to BE
2024-04-12 10:23:52 +08:00
3d66723214 [branch-2.1](auto-partition) pick auto partition and some more prs (#33523) 2024-04-11 17:12:17 +08:00
Pxl
3081fc584d [Improvement](runtime-filter) support sync join node build side's size to init bloom runtime filter (#32180)
support sync join node build side's size to init bloom runtime filter
2024-04-11 09:31:50 +08:00
67bb519613 [Fix](nereids) forward the user define variables to master (#33013) 2024-04-10 15:26:08 +08:00
8e19cdd745 [feature](expr) support common subexpression elimination be part (#32673) 2024-04-10 11:56:21 +08:00
96b995504c [enhancement](statistics) excluded delta rows num for rollup&mv tablets (#32568)
Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>
Co-authored-by: tsy <tangsiyang2001@foxmail.com>
2024-04-10 11:34:28 +08:00
326a264fcd [Improvement](executor)Add spill property for workload group #32554 2024-03-22 16:38:19 +08:00
baf3ae1a93 [refactor](nereids)unify outputTupleDesc and projection be part (#32439) 2024-03-22 16:35:43 +08:00
0cde0cbf19 (invert index) modify of time series compaction policy 2024-03-22 08:16:30 +08:00
e892774c9a [improvement](agg) streaming agg should not take too much memory when spilling enabled (#32426) 2024-03-21 14:07:24 +08:00
ecadb60bcd [Pick 2.1](inverted index) support inverted index format v2 (#30145) (#32418) 2024-03-19 08:11:33 +08:00
b82de68d7e [feature][insert]add hive table sink thrift (#32274) (#32360)
bp #32274
2024-03-18 10:46:17 +08:00
1645f2e0a7 [feature](insert)add hive table sink definition (#31662) (#32347)
bp #31662
Co-authored-by: slothever <18522955+wsjz@users.noreply.github.com>
2024-03-17 20:52:44 +08:00
258dcfca97 [Refactor](executor)Add information_schema.workload_groups (#32195) (#32314) 2024-03-15 20:46:54 +08:00
97b35d6830 [fix](nereids)AssertNumRow node's output should be nullable (#32136)
Co-authored-by: Co-Author Jerry Hu <mrhhsg@gmail.com>
2024-03-15 18:06:28 +08:00
9c1888e7ec [RuntimeFilter](exec) support min max runtime filter and do refactor (#32210) 2024-03-15 18:06:20 +08:00
Pxl
5e4da61df9 [Bug](top-n) do not get runtime predicate when predicate not initialized (#32208) 2024-03-15 18:06:15 +08:00
c8f3643890 [exec](runtimefilter) support null aware in runtime filter (#32152)
null aware in runtime filter
2024-03-15 18:05:13 +08:00
df5ec16d7c [Refactor](executor)Add schema type table active_queries (#32057)
* Add schema type table active_queries
2024-03-15 17:57:28 +08:00
6d2924668e [fix](audit-loader) fix invalid token check logic (#32095)
The token check should be forwarded to the Master FE.
I added a new RPC method `checkToken()` in Frontend for this logic.
Otherwise, after the audit loader is enabled, logs from non-master FEs cannot be loaded into the audit table
and fail with an `Invalid token` error.
2024-03-12 22:52:11 +08:00
3358f76a7f [feature](spill) Implement spill to disk for hash join, aggregation and sort for pipelineX (#31910)
Co-authored-by: Jerry Hu <mrhhsg@gmail.com>
2024-03-12 14:12:09 +08:00
c5390d00bb [Improvement]Add schema table backend_active_tasks (#31945) 2024-03-09 19:55:48 +08:00
4bfecac08a [enhancement](plsql) Support show procedure and show create procedure (#31297) (#31763) 2024-03-09 19:45:03 +08:00
b2de83f250 [agg](conf) Add a knob to control distinct agg (#31930)
Add a knob to control distinct agg
2024-03-09 19:44:54 +08:00
3b56c4bcfa [enhancement](nereids)send is_nereids flag to be (#31752) 2024-03-09 19:43:12 +08:00
28f0b7eb32 [Improvement](profile)Add tvf active_be_tasks() #31815 2024-03-07 16:12:23 +08:00
7c30cb20fd [Fix](partial update) Fix partial update load false when schema includes auto increment column (#31725)
Problem:
When partially updating columns without specifying the auto-increment column, and the imported data contains new keys, an error stating the auto-increment column could not be found occurs.

Reason:
The logic for partial column updates does not account for new keys in auto-increment columns. Since auto-increment columns can be generated by the system, it's possible to omit this column data during import. However, partial column updates treat this as a regular column, expecting it to be nullable or have a default value for automatic filling, overlooking the fact that auto-increment columns can also be auto-filled. This oversight leads to the error.

Solution:
Incorporate a check for auto-increment columns into the partial column update logic, and include the logic for generating auto-increment column values in the process of completing partial updates.
2024-03-06 13:06:27 +08:00
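The fix described in 7c30cb20fd can be illustrated with a toy model. The following is a minimal Python sketch, not Doris code: the dict-based row store, the `complete_row` helper, and the counter used as an ID generator are all hypothetical names introduced for illustration. It only shows the idea from the message above: when a partial update introduces a new key and omits the auto-increment column, that column is generated instead of being treated as a regular column that needs a default.

```python
import itertools


def complete_row(key, partial, existing, columns, defaults, auto_inc_col, next_id):
    """Fill in the columns that a partial update did not provide.

    Hypothetical helper: `existing` maps key -> full row, `defaults` maps
    column -> default value, and `next_id` is an iterator of generated IDs.
    """
    old = existing.get(key)
    row = {}
    for col in columns:
        if col in partial:
            row[col] = partial[col]       # value supplied by the load
        elif old is not None:
            row[col] = old[col]           # existing key: keep the stored value
        elif col == auto_inc_col:
            row[col] = next(next_id)      # new key: generate the value
        elif col in defaults:
            row[col] = defaults[col]      # new key: fall back to the default
        else:
            raise ValueError(f"column {col!r} cannot be filled for new key {key!r}")
    return row


columns = ["k", "id", "v"]
existing = {1: {"k": 1, "id": 100, "v": "a"}}
next_id = itertools.count(101)

# Existing key: the omitted auto-increment column keeps its stored value.
updated = complete_row(1, {"k": 1, "v": "b"}, existing, columns, {}, "id", next_id)
# New key: the omitted auto-increment column is generated rather than rejected.
inserted = complete_row(2, {"k": 2, "v": "c"}, existing, columns, {}, "id", next_id)
```

Without the `auto_inc_col` branch, the second call would raise the error for `id`, which mirrors the failure mode the commit message describes.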
231768db0d [Performance](exec) Support runtime filter in <=> join (#31754) 2024-03-06 13:06:26 +08:00
Pxl
25d1934289 [Feature](topn) support multiple topn filter on backend (#31665)
support multiple topn filter on backend
2024-03-06 13:05:22 +08:00
07224686ef [feature](jdbc catalog) support db2 jdbc catalog (#31627) 2024-03-01 14:19:28 +08:00
819ab6fc00 [feature](sink) support partition tablet sink shuffle (#30914)
Co-authored-by: morrySnow <morrysnow@126.com>
2024-03-01 04:25:43 +08:00
92e3b31f50 [feature](invert index) match_phrase_edge feature added (#31142) 2024-02-29 19:51:18 +08:00
0b5b7175d6 [fix](multi-catalog) add max compute custom odps and tunnel url (#31390)
add max compute custom odps and tunnel url
2024-02-29 16:44:40 +08:00
7f566f9365 Reset report_workload_runtime_status to optional (#31479) 2024-02-28 13:07:47 +08:00
c0754583cb [opt](plsql) Fix procedure key compatibility (#31445)
Use dbId instead of dbName, because dbName may be changed by ALTER.
Add the package name to the procedure key (reserved only; there are currently no plans to support packages).
Optimize procedure creation and exceptions.
2024-02-28 13:07:47 +08:00
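Why keying procedures by database ID survives a rename can be shown with a small sketch. This is an illustrative Python model under stated assumptions, not the Doris implementation: `NameKey`, `IdKey`, `rename_db`, and the sample IDs are hypothetical, and the `package` field is kept empty to reflect that it is only reserved.

```python
from typing import NamedTuple


class NameKey(NamedTuple):
    db_name: str
    proc_name: str


class IdKey(NamedTuple):
    db_id: int
    package: str      # reserved, left empty in this sketch
    proc_name: str


# Hypothetical catalog state: one database and one stored procedure.
db_ids = {"sales": 10001}
by_name = {NameKey("sales", "refresh"): "proc body"}
by_id = {IdKey(10001, "", "refresh"): "proc body"}


def rename_db(old, new):
    """Renaming changes the name but not the database ID."""
    db_ids[new] = db_ids.pop(old)


rename_db("sales", "sales_v2")

# The name-based key still holds the old name, so the lookup misses.
found_by_name = by_name.get(NameKey("sales_v2", "refresh"))
# The ID-based key is unaffected by the rename.
found_by_id = by_id.get(IdKey(db_ids["sales_v2"], "", "refresh"))
```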
c34639245e [Improvement](executor)add remote scan thread pool (#31376)
* add remote scan thread pool

* +1
2024-02-27 10:12:33 +08:00
1127b0065a [Improvement](executor)Add scanbytes/scanrows condition (#31364)
* Add scanbytes/scanrows condition

* fix reg
2024-02-27 10:12:33 +08:00
97c9d75af3 [Feature](executor)Add scan_thread_num property for workload group (#31106) 2024-02-20 16:24:05 +08:00
f65844fae4 [Enhancement](Outfile/Export) Export data to csv file format with BOM (#30533)
On Windows, UTF-8 files are commonly written with a BOM.

This change adds a new user property, `with_bom`, to `Outfile/Export`, so that when exporting Doris data, users can choose whether to include a BOM at the beginning of the CSV file.

**Usage:**
```sql
-- outfile:
select * from demo.student
into outfile "file:///xxx/export/exp_"
format as csv
properties(
    "column_separator" = ",",
    "with_bom" = "true"
);

-- Export:
EXPORT TABLE student TO "file:///xx/tmpdata/export/exp_"
PROPERTIES(
    "format" = "csv",
    "with_bom" = "true"
);
```
2024-02-16 10:16:40 +08:00
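For readers consuming such exports, the effect of a BOM can be sketched in Python using only the standard library. This is a minimal illustration, not part of the change: `write_csv` and `read_csv` are hypothetical helpers that stand in for an exported file, and the sketch assumes the export is UTF-8 encoded.

```python
import codecs
import csv
import io


def write_csv(rows, with_bom):
    """Serialize rows as UTF-8 CSV bytes, optionally prefixed with a BOM."""
    buf = io.StringIO(newline="")
    csv.writer(buf).writerows(rows)
    data = buf.getvalue().encode("utf-8")
    return codecs.BOM_UTF8 + data if with_bom else data


def read_csv(data):
    """Parse CSV bytes; 'utf-8-sig' drops a leading BOM if one is present."""
    text = data.decode("utf-8-sig")
    return list(csv.reader(io.StringIO(text, newline="")))


rows = [["id", "name"], ["1", "Zoë"]]
exported = write_csv(rows, with_bom=True)

# The file starts with the three BOM bytes EF BB BF.
has_bom = exported.startswith(codecs.BOM_UTF8)
# Decoding with plain "utf-8" would leave U+FEFF glued to the first field.
first_char = exported.decode("utf-8")[0]
# Decoding with "utf-8-sig" round-trips files with or without a BOM.
parsed = read_csv(exported)
```

The `utf-8-sig` codec is a reasonable default for readers that cannot know in advance whether `with_bom` was set, since it accepts both variants.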