This PR proposes to ignore tracking the following file and dir auto-generated by ANTLR4:
fe/fe-core/src/main/antlr4/org/apache/doris/nereids/DorisLexer.tokens
fe/fe-core/src/main/antlr4/org/apache/doris/nereids/gen/
This is an example of s3 hms_catalog:
```sql
CREATE CATALOG hms_catalog properties(
"type" = "hms",
"hive.metastore.uris"="thrift://localhost:9083",
"AWS_ACCESS_KEY" = "your access key",
"AWS_SECRET_KEY"="your secret key",
"AWS_ENDPOINT"="s3 endpoint",
"AWS_REGION"="s3-region",
"fs.s3a.paging.maximum"="1000");
```
All these params are necessary;
Define a new file scanner node for hms table in be.
This file scanner node is different from broker scan node as blow:
1. Broker scan node will define src slot and dest slot, there is two memory copy in it: first is from file to src slot
and second from src to dest slot. Otherwise FileScanNode only have one stemp memory copy just from file to dest slot.
2. Broker scan node will read all the filed in the file to src slot and FileScanNode only read the need filed.
3. Broker scan node will convert type into string type for src slot and then use cast to convert to dest slot type,
but FileScanNode will have the final type.
Now FileScanNode is a standalone code, but we will uniform the file scan and broker scan in the feature.
Add scalable regression testing framework(#7584)
contains
- Test framework making by groovy, and support built-in **readable DSL** named as `Action`
- Demo exists in `${DORIS_HOME}/regression-test/data/demo`
- Chinese doc exist in `${DORIS_HOME}/docs/zh-CN/developer-guide/regression-testing.md`
English document coming soon
* [improvement](show) Support that user can use show data skew statement instead of admin
This PR mainly do two things:
1. Support that user can use show data skew statement instead of admin
2. Fix fe ut failed caused by pr [improvement](rewrite) Make RewriteDateLiteralRule to be compatible with mysql #7876 and pr [feature-wip](iceberg) Step1: Support create Iceberg external table #7391
Co-authored-by: caiconghui1 <caiconghui1@jd.com>
When I use dev container feature in vscode, there is some config file that shouldn't be put in git.
So it's better to add config file into gitignore for convenience.
This PR is to reduce lock competition by supporting read and write lock in table level. When we modify or read table's meta, we don't need to get database lock, just get table write or read lock. And when we get database lock, that means meta directly in db cannot be modified by other thread. Database lock only protect meta in Database class, while table lock protect meta in Table class.
Co-authored-by: caiconghui [蔡聪辉] <caiconghui@xiaomi.com>
* Optimized the read performance of the table when have multi versions,
changed the merge method of the unique table,
merged the cumulative version data first, and then merged with the base version.
For the data with only one base version, read directly without merging
This is the last PR of proposal #4308
1. Add a new FE config `enable_http_server_v2` to enable new HTTP Server implementation. The default value is false.
2. Add a new FE config `http_api_extra_base_path` so that we can set base path for Frontend UI.
3. Refactor the new HTTP API response body. The return http status code is always 200, and using internal code in response body to indicate the certain error.
* Implements the grammar of the batch delete #4051
* Process create, alter table when table has delete sign column
* Support the syntax for enabling the delete column
* Automatically filtered deleted data in the select statement.
* Automatically add delete sign when create rollup table
TODO:
* Optimize the reading and compaction logic on the be side, so that the data marked as deleted will be completely deleted during base compaction
This CL mainly changes:
1. Add 2 new FE modules
1. fe-common
save all common classes for other modules, currently only `jmockit`
2. spark-dpp
The Spark DPP application for Spark Load. And I removed all dpp related classes to this module, including unit tests.
2. Change the `build.sh`
Add a new param `--spark-dpp` to compile the `spark-dpp` alone. And `--fe` will compile all FE modules.
the output of `spark-dpp` module is `spark-dpp-1.0.0-jar-with-dependencies.jar`, and it will be installed to `output/fe/spark-dpp/`.
3. Modify some bugs of spark load
This PR is to support grammar like the following: INSTALL PLUGIN FROM [source] [PROPERTIES("KEY"="VALUE", ...)]
user can set md5sum="xxxxxxx", so we don't need to provide a md5 uri.
When we get different columns's row ranges by column_delete_conditions, we should use union operation instead of intersection operation to get final get final row ranges.
The root cause is that we lost the relationship of the two delete conditions in same delete stmt.
Base data:
```
k1, k2
1, 2
1, 3
case 1:
delete from tbl where k1=1 and k2=2;
case 2:
delete from tbl where k1=1;
delete from tbl where k2=2;
```
We treat the above 2 cases as same, which is incorrect.
So we need to process every rowset of delete conditions separately.
This PR is the first step to make Doris stream load more robust with higher concurrent
performance(#3368),the main work is to support txn management in db level isolation
and use ArrayDeque to stored final status txns.
issue #2344
* Add install/unintall Plugin statement
* Add show plugin statement
* Support install plugin through two ways:
* Built-in Plugin: use PluginMgr's register method.
* Dynamic Plugin: install by SQL statement, and the process:
1. check Plugin has already install?
2. download Plugin file from remote source or copy from local source
3. extract Plugin's .zip
4. read Plugin's plugin.properties, and check Plugin's Value
5. dynamic load .jar and init Plugin's main Class
6. invoke Plugin's init method
7. register Plugin into PluginMgr.
8. update meta
* Support FE Plugin dynamic uninstall process
1. check Plugin has install?
2. invoke Plugin's close method
3. delete Plugin from PluginMgr
4. update meta
* Add audit plugin interface
* Add plugin enable flags in Config
* Add plugin install path in Config, default plugin will install in ${DORIS_FE_PATH}/plugins
* Add FE plugins project
* Add audit plugin demo
The usage:
```
// install plugin and show plugins;
mysql>
mysql> install plugin from "/home/users/seaven/auditplugin.zip";
Query OK, 0 rows affected (0.05 sec)
mysql>
mysql> show plugins;
+-------------------+-------+---------------+---------+-------------+------------------------+--------+---------------------------------------+
| Name | Type | Description | Version | JavaVersion | ClassName | SoName | Sources |
+-------------------+-------+---------------+---------+-------------+------------------------+--------+---------------------------------------+
| audit_plugin_demo | AUDIT | just for test | 0.11.0 | 1.8.31 | plugin.AuditPluginDemo | NULL | /home/users/hekai/auditplugindemo.zip |
+-------------------+-------+---------------+---------+-------------+------------------------+--------+---------------------------------------+
1 row in set (0.00 sec)
mysql> show plugins;
+-------------------+-------+---------------+---------+-------------+------------------------+--------+---------------------------------------+
| Name | Type | Description | Version | JavaVersion | ClassName | SoName | Sources |
+-------------------+-------+---------------+---------+-------------+------------------------+--------+---------------------------------------+
| audit_plugin_demo | AUDIT | just for test | 0.11.0 | 1.8.31 | plugin.AuditPluginDemo | NULL | /home/users/hekai/auditplugindemo.zip |
+-------------------+-------+---------------+---------+-------------+------------------------+--------+---------------------------------------+
1 row in set (0.00 sec)
mysql> uninstall plugin audit_plugin_demo;
Query OK, 0 rows affected (0.04 sec)
mysql> show plugins;
Empty set (0.00 sec)
```
TODO:
*Config.plugin_dir should be created if missing
Pure DocValue optimization for doris-on-es
Future todo:
Today, for every tuple scan we check if pure_docvalue is enabled, this is not reasonable, should check pure_docvalue enabled for one whole scan outside, I will add this todo in future
* Reduce UT binary size
Almost every module depend on ExecEnv, and ExecEnv contains all
singleton, which make UT binary contains all object files.
This patch seperate ExecEnv's initial and destory to anthor file to
avoid other file's dependence. And status.cc include debug_util.h which
depend tuple.h tuple_row.h, and I move get_stack_trace() to
stack_util.cpp to reduce status.cc's dependence.
I add USE_RTTI=1 to build rocksdb to avoid linking librocksdb.a
Issue: #292
* Update