doris

Author	SHA1	Message	Date
Mingyu Chen	d659167d6d	[Planner] Set MysqlScanNode's cardinality to avoid unexpected shuffle join (#3886 )	2020-06-17 10:53:36 +08:00
lichaoyong	ae7028bee4	[Enhancement] Replace N/A with NULL in ShowStmt result (#3851 )	2020-06-17 09:41:51 +08:00
lichaoyong	6c4d7c60dd	[Feature] Add QueryDetail to store query statistics. (#3744 ) 1. Store the query statistics in memory. 2. Supporting RESTFUL interface to get the statistics.	2020-06-15 18:16:54 +08:00
Mingyu Chen	2211cb0ee0	[Metrics] Add metrics document and 2 new metrics of TCP (#3835 )	2020-06-15 09:48:09 +08:00
Mingyu Chen	b3811f910f	[Spark load][Fe 4/6] Add hive external table and update hive table syntax in loadstmt (#3819 ) * Add hive external table and update hive table syntax in loadstmt * Move check hive table from SelectStmt to FromClause and update doc * Update hive external table en sql reference	2020-06-13 16:28:24 +08:00
WingC	414a0a35e5	[Dynamic Partition] Use ZonedDateTime to support set timezone (#3799 ) This CL mainly supports time zones in dynamic partition: 1. Use the new Java Time API to replace Calendar. 2. Support setting the time zone in dynamic partition parameters.	2020-06-13 16:27:09 +08:00
wyb	4c2e73a5fe	Add hive external table and update hive table syntax in loadstmt	2020-06-10 16:32:32 +08:00
Mingyu Chen	4fa9d8cbe9	[Spark load][Fe 3/5] Fe create job (#3715 ) * Add create spark load job * Remove unused import	2020-06-09 21:57:46 +08:00
Mingyu Chen	5b1589498a	[Bug] Fix SchemaChangeJobV2's meta persist bug (#3804 ) 1. Missing field `partitionIndexMap` in SchemaChangeJobV2 2. Pair in field `indexSchemaVersionAndHashMap` can not be persisted by GSON 3. Exit the FE process when replay edit log error. Fix: #3802	2020-06-09 21:55:46 +08:00
Yunfeng,Wu	acd7a58875	[Doris On ES] [1/3] Add ES QueryBuilders for debug mode (#3774 )	2020-06-09 16:45:16 +08:00
Xiang Wei	c51f20bb7a	Disable Bitmap or Hll type in keys or in values with incorrect agg-type (#3768 ) Bitmap and Hll types cannot be used with incorrect aggregate functions, which would cause BE to crash. Add some logical checks in FE's ColumnDef#analyze to avoid creating tables or changing schemas incorrectly. Keys can never be of bitmap or hll type; values of bitmap or hll type have to be associated with bitmap_union or hll_union.	2020-06-06 11:36:06 +08:00
Mingyu Chen	173dd3953d	[Code Refactor] Remove Catalog.getInstance() method (#3784 ) Use Catalog.getCurrentCatalog() instead, to avoid potential meta operation error.	2020-06-06 11:35:01 +08:00
Yunfeng,Wu	5abef19be4	[Doris On ES] Add more detailed error message when fail to create es table (#3758 )	2020-06-05 23:06:46 +08:00
EmmyMiao87	0a748661c1	Fix the error selectedIndexId when keysType of table is UNIQUE (#3772 ) The unique table also should be compensated candidate index. The reason is the same as the agg table type. Fixed #3771. Change-Id: Ic04b0360a0b178cb0b6ee635e56f48852092ec09	2020-06-04 19:26:50 +08:00
Mingyu Chen	fc33ee3618	[Plugin] Add timeout of connection when downloading the plugins from URL (#3755 ) If no timeout is set, the download process may be blocked forever.	2020-06-04 11:37:18 +08:00
yangzhg	a8c95e7369	[Bug] Fix BinaryPredicate's equals function ignoring op (#3753 ) BinaryPredicate's equals function compares by opcode, but the opcode may not be initialized yet, so it returns true whenever the children are the same. For example, `a>1` and `a<1` are treated as equal.	2020-06-04 09:29:19 +08:00
wyb	edfa6683fc	Add create spark load job	2020-06-03 21:27:27 +08:00
EmmyMiao87	30df9fcae9	Serialize origin stmt in Rollup Job and MV Meta (#3705 ) * Serialize origin stmt in Rollup Job and MV Meta In materialized view 2.0, the define expr is serialized in column. The method is that Doris serializes the origin stmt of the Create Materialized View Stmt in RollupJobV2 and MVMeta. The define expr will be extracted from the origin stmt after the meta is deserialized. The define expr is necessary for bitmap and hll materialized views. For example: MV meta: __doris_mv_bitmap_k1, bitmap_union, to_bitmap(k1) Origin stmt: select bitmap_union(to_bitmap(k1)) from table Deserialize meta: __doris_mv_bitmap_k1, bitmap_union, null After extract: the define expr `to_bitmap(k1)` from origin stmt should be extracted. __doris_mv_bitmap_v1, bitmap_union, to_bitmap(k1) (which comes from the origin stmt) Change-Id: Ic2da093188d8985f5e97be5bd094e5d60d82c9a7 * Add comment of read method Change-Id: I4e1e0f4ad0f6e76cdc43e49938de768ec3b0a0e8 * Fix ut Change-Id: I2be257d512bf541f00912a374a2e07a039fc42b4 * Change code style Change-Id: I3ab23f5c94ae781167f498fefde2d96e42e05bf9	2020-05-30 20:17:46 +08:00
Binglin Chang	5cb4063904	Fix UT ThreadPoolManagerTest failure (#3723 )	2020-05-30 10:35:07 +08:00
Binglin Chang	c967eaf496	[Memory Engine] Add TabletType to PartitionInfo and TabletMeta (#3668 )	2020-05-29 20:20:44 +08:00
Mingyu Chen	bc35f3a31f	[DynamicPartition] Optimize the rule of creating dynamic partition (#3679 ) Problem is described in ISSUE #3678 This CL mainly changes the rule of creating dynamic partitions. 1. If time unit is DAY, the logic remains unchanged. 2. If time unit is WEEK, the logic changes as follows: 1. Allow setting the start day of every week. The default is Monday, and any day from Monday to Sunday can be chosen. 2. Assuming that the starting day is a Tuesday, the range of the partition is Tuesday of the week to Monday of the next week. 3. If time unit is MONTH, the logic changes as follows: 1. Allow setting the start date of each month. The default is the 1st, and any date from the 1st to the 28th can be chosen. 2. Assuming that the starting date is the 2nd, the range of the partition is from the 2nd of this month to the 1st of the next month. 4. The `SHOW DYNAMIC PARTITION TABLES` statement adds a `StartOf` column to show the start day of week or month. It is recommended to refer to the example in `dynamic-partition.md` to understand. TODO: Better to support HOUR and YEAR time unit. Maybe in next PR. FIX: #3678	2020-05-27 16:42:41 +08:00
lichaoyong	1cc78fe69b	[Enhancement] Convert metric to Json format (#3635 ) Add a JSON format for existing metrics like this. ``` { "tags": { "metric":"thread_pool", "name":"thrift-server-pool", "type":"active_thread_num" }, "unit":"number", "value":3 } ``` I add a new JsonMetricVisitor to handle the transformation. It does not modify the existing PrometheusMetricVisitor and SimpleCoreMetricVisitor. Also I add 1. A unit item to describe the metric better 2. Cloning tablet statistics divided by database. 3. Use white space to replace newline in audit.log	2020-05-27 08:49:30 +08:00
wyb	4978bd6c81	[Spark load] Add resource manager (#3418 ) 1. User interface: 1.1 Spark resource management Spark is used as an external computing resource in Doris to do ETL work. In the future, other external resources may be used in Doris, for example, MapReduce for ETL, Spark/GPU for queries, and HDFS/S3 for external storage. We introduced resource management to manage these external resources used by Doris. ```sql -- create spark resource CREATE EXTERNAL RESOURCE resource_name PROPERTIES ( type = spark, spark_conf_key = spark_conf_value, working_dir = path, broker = broker_name, broker.property_key = property_value ) -- drop spark resource DROP RESOURCE resource_name -- show resources SHOW RESOURCES SHOW PROC "/resources" -- privileges GRANT USAGE_PRIV ON RESOURCE resource_name TO user_identity GRANT USAGE_PRIV ON RESOURCE resource_name TO ROLE role_name REVOKE USAGE_PRIV ON RESOURCE resource_name FROM user_identity REVOKE USAGE_PRIV ON RESOURCE resource_name FROM ROLE role_name ``` - CREATE EXTERNAL RESOURCE: FOR user_name is optional. If it is specified, the external resource belongs to this user. If not, the external resource belongs to the system and is available to all users. PROPERTIES: 1. type: resource type. Only spark is supported now. 2. spark configuration: follow the standard writing of Spark configurations, refer to: https://spark.apache.org/docs/latest/configuration.html. 3. working_dir: optional, used to store ETL intermediate results in spark ETL. 4. broker: optional, used in spark ETL. The ETL intermediate results need to be read with the broker when pushed into BE.
Example: ```sql CREATE EXTERNAL RESOURCE "spark0" PROPERTIES ( "type" = "spark", "spark.master" = "yarn", "spark.submit.deployMode" = "cluster", "spark.jars" = "xxx.jar,yyy.jar", "spark.files" = "/tmp/aaa,/tmp/bbb", "spark.yarn.queue" = "queue0", "spark.executor.memory" = "1g", "spark.hadoop.yarn.resourcemanager.address" = "127.0.0.1:9999", "spark.hadoop.fs.defaultFS" = "hdfs://127.0.0.1:10000", "working_dir" = "hdfs://127.0.0.1:10000/tmp/doris", "broker" = "broker0", "broker.username" = "user0", "broker.password" = "password0" ) ``` - SHOW RESOURCES: General users can only see their own resources. Admin and root users can show all resources. 1.2 Create spark load job ```sql LOAD LABEL db_name.label_name ( DATA INFILE ("/tmp/file1") INTO TABLE table_name, ... ) WITH RESOURCE resource_name [(key1 = value1, ...)] [PROPERTIES (key2 = value2, ... )] ``` Example: ```sql LOAD LABEL example_db.test_label ( DATA INFILE ("hdfs:/127.0.0.1:10000/tmp/file1") INTO TABLE example_table ) WITH RESOURCE "spark0" ( "spark.executor.memory" = "1g", "spark.files" = "/tmp/aaa,/tmp/bbb" ) PROPERTIES ("timeout" = "3600") ``` The spark configurations in load stmt can override the existing configuration in the resource for temporary use. #3010	2020-05-26 18:21:21 +08:00
Mingyu Chen	77b9acc242	[Stmt] Add rowCount column to SHOW DATA stmt (#3676 ) User can see the row count of all materialized indexes of a table. ``` mysql> show data from test; +-----------+-----------+-----------+--------------+----------+ \| TableName \| IndexName \| Size \| ReplicaCount \| RowCount \| +-----------+-----------+-----------+--------------+----------+ \| test2 \| r1 \| 10.000MB \| 30 \| 10000 \| \| \| r2 \| 20.000MB \| 30 \| 20000 \| \| \| test2 \| 50.000MB \| 30 \| 50000 \| \| \| Total \| 80.000 \| 90 \| \| +-----------+-----------+-----------+--------------+----------+ ``` Fix #3675	2020-05-26 15:53:38 +08:00
EmmyMiao87	aa4ac2d078	[Bug] Serialize storage format in rollup job (#3686 ) The segment v2 rollup job should set the storage format v2 and serialize it. If it is not serialized, the rollup of segment v2 may use the error format 'segment v1'.	2020-05-26 15:35:12 +08:00
Mingyu Chen	1124808fbc	[Enhancement] Add detail msg to show the reason of publish failure. (#3647 ) Add 2 new columns `PublishTime` and `ErrMsg` to show the publish version time and the errors that happen during the transaction process. They can be seen by executing: `SHOW PROC "/transactions/dbId/";` or `SHOW TRANSACTION WHERE ID=xx;` Currently, only errors that happen in the publish phase are recorded, which can help us find out which txn is blocked. Fix #3646	2020-05-22 22:59:53 +08:00
yangzhg	00d563d014	[SQL] Support more syntax in case when clause (#3625 ) Support more syntax in case-when clauses with subqueries. Support queries like ` case when k1 > subquery1 and k2 < subquery2 then ... else ... ` or `case when subquery in null then ...`	2020-05-22 10:22:59 +08:00
worker24h	ef8fd1fcbe	[Load] Support load json-data into Doris by RoutineLoad or StreamLoad (#3553 ) Doris support load json-data by RoutineLoad or StreamLoad	2020-05-21 13:00:49 +08:00
Mingyu Chen	4cbcae1574	[Spark on Doris] Shade and provide the thrift lib in spark-doris-connector (#3631 ) Mainly changes: 1. Shade and provide the thrift lib in spark-doris-connector 2. Add a `build.sh` for spark-doris-connector 3. Move the README.md of spark-doris-connector to `docs/` 4. Change the line delimiter of `fe/src/test/java/org/apache/doris/analysis/AggregateTest.java`	2020-05-19 14:20:21 +08:00
Mingyu Chen	9d72d1bb87	[Refactor] Refactor some redundant code && Replace some UT by UtFrameUtils This CL has no logic change; it just does some code refactoring and uses the new UtFrameWork to replace some old UT. NOTICE(#3622): This is a "revert of revert pull request". This pr is mainly used to synthesize the PRs whose commits were scattered and submitted due to the wrong merge method into a complete single commit.	2020-05-18 14:53:59 +08:00
Mingyu Chen	c2c81d58dc	[Fix]SlotRef.tosql() is the same as the SQL returned by different sql Fix: #3555 NOTICE(#3622): This is a "revert of revert pull request". This pr is mainly used to synthesize the PRs whose commits were scattered and submitted due to the wrong merge method into a complete single commit.	2020-05-18 14:47:48 +08:00
Mingyu Chen	7a83c5662d	[Bug] fix OrCompoundPredicate predicate fold bug #3596 Fix: #3596 NOTICE(#3622): This is a "revert of revert pull request". This pr is mainly used to synthesize the PRs whose commits were scattered and submitted due to the wrong merge method into a complete single commit.	2020-05-18 14:46:34 +08:00
Mingyu Chen	903592d82b	Revert "Refactor some redundant code && Replace some UT by UtFrameUtils" (#3613 ) This revert is used to correct the mess of the commit timeline caused by the wrong merge method.	2020-05-18 13:11:39 +08:00
Mingyu Chen	539efb3532	Revert "[Fix]SlotRef.tosql() is the same as the SQL returned by different sql" (#3610 ) This revert is used to correct the mess of the commit timeline caused by the wrong merge method.	2020-05-18 13:07:21 +08:00
Mingyu Chen	20f20239f2	Revert "[Bug] fix OrCompoundPredicate predicate fold bug #3596 " (#3609 ) This revert is used to correct the mess of the commit timeline caused by the wrong merge method.	2020-05-18 13:03:24 +08:00
Mingyu Chen	2f3b7b5b8e	[Refactor] Refactor some redundant code && Replace some UT by UtFrameUtils	2020-05-18 10:53:32 +08:00
Mingyu Chen	62f746fc87	[Fix] SlotRef.tosql() is the same as the SQL returned by different sql	2020-05-18 10:41:15 +08:00
Mingyu Chen	e6588981b4	[Bug] fix OrCompoundPredicate predicate fold bug #3596 (#3597 ) * [Bug] fix OrCompoundPredicate predicate fold bug * fix code style	2020-05-18 10:36:13 +08:00
wutiangan	5138197d57	[Bug] generate exceptions to avoid multiDistinctAggregation produces wrong results (#3561 ) When a query (#3492) contains "2 DistinctAggregation with one column" and "1 DistinctAggregation with two columns", it produces a wrong result. This pull request does not really solve the problem; it only throws exceptions to avoid returning wrong results. The problem needs a real fix in the future.	2020-05-16 21:36:43 +08:00
marising	4217db00d3	Tosql method returns slot index and column name	2020-05-15 17:31:25 +08:00
wutiangan	0919407092	[Bug] fix OrCompoundPredicate predicate fold bug	2020-05-15 10:20:13 +08:00
wutiangan	9f224cdd8a	[Bug] Fix bug of Partition prune of constant in predicate (#3476 ) 1. Phenomenon: The following two statements are equivalent, but one query returns results and the other returns none. mysql> select * from (select '积极' as kk1, sum(k2) from table_range where k1 = '2013-01-01' group by kk1)tt where kk1 = '积极'; +--------+-----------+ \| kk1 \| sum(`k2`) \| +--------+-----------+ \| 积极 \| 1 \| +--------+-----------+ 1 row in set (0.01 sec) mysql> select * from (select '积极' as kk1, sum(k2) from table_range where k1 = '2013-01-01' group by kk1)tt where kk1 in ('积极'); Empty set (0.01 sec) 2. Reason: In partition pruning, a constant in predicate ('积极' in '积极') is mistakenly considered to meet the partition prune conditions and is mistakenly regarded as a predicate on the partition column. As a result, no partition is considered to meet the requirements, so the query is planned with 0 partitions.	2020-05-14 11:46:13 +08:00
Mingyu Chen	ca7c0717cd	Fix compile bug (#3557 )	2020-05-12 10:24:37 +08:00
caiconghui	b648734441	[TxxMgr] Support txn management in db level and use ArrayDeque to improve txn task performance (#3369 ) This PR is the first step to make Doris stream load more robust with higher concurrent performance (#3368). The main work is to support txn management with db level isolation and to use ArrayDeque to store final status txns.	2020-05-11 23:32:43 +08:00
WingC	4294301c53	Throw DdlException when use `admin set frontend config` (#3539 ) If more than one config is set in a single set config stmt, an exception will be thrown to forbid the operation.	2020-05-11 23:29:38 +08:00
wangcong18	1b14dd4426	Refactor some redundant code && Replace some UT by UtFrameUtils This CL has no logic change; it just does some code refactoring and UT changes.	2020-05-08 12:07:33 +08:00
Mingyu Chen	084515317f	[Bug] Fix constant In Predicate result error (#3511 ) `select 1 not in (2, NULL, 1);` should return `0`	2020-05-08 11:30:11 +08:00
wangbo	d60bb81cb0	[SQL Function] Calculate 'case when expr' when possible (#3396 ) Calculate 'case when expr' when possible	2020-05-07 22:04:09 +08:00
Mingyu Chen	ca36dc697f	[Bug] Fix bug that push down logic error on semi join (#3481 ) For SQL like: ``` select * from join1 left semi join join2 on join1.id = join2.id and join2.id > 1; ``` the predicate `join2.id > 1` can not be pushed down to table join1.	2020-05-07 09:30:30 +08:00
Mingyu Chen	101628c813	[Bug] Fix bug of predicate pushdown logic (#3475 ) When there is subquery in where clause, the query will be rewritten to join operation. And some auxiliary binary predicates will be generated. These binary predicates will not go through the ExprRewriteRule, so they are not normalized as "column to the left and constant to the right" format. We need to take this case into account so that the `canPushDownPredicate()` judgement will not throw exception.	2020-05-06 15:15:37 +08:00
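
Note on the constant In Predicate fix (#3511) listed above: the expected result `0` for `select 1 not in (2, NULL, 1);` follows from SQL's three-valued logic, where a definite match in the list decides the result even if the list also contains NULL. The following is a minimal Python sketch of that rule, not the Doris FE implementation; the function names `sql_in` and `sql_not_in` are hypothetical, and Python's `None` stands in for SQL NULL.

```python
from typing import Any, Optional, Sequence


def sql_in(value: Any, items: Sequence[Any]) -> Optional[bool]:
    """Evaluate `value IN (items)` under SQL three-valued logic.

    Returns True, False, or None (None models SQL NULL / unknown).
    Hypothetical helper for illustration only.
    """
    if value is None:
        # A NULL left-hand side never compares as definitely equal or unequal.
        return None
    saw_null = False
    for item in items:
        if item is None:
            saw_null = True      # comparison with NULL is unknown
        elif item == value:
            return True          # one definite match decides the result
    # No match: the result is unknown if any comparison was unknown.
    return None if saw_null else False


def sql_not_in(value: Any, items: Sequence[Any]) -> Optional[bool]:
    """Evaluate `value NOT IN (items)`: NOT of unknown stays unknown."""
    result = sql_in(value, items)
    return None if result is None else not result


if __name__ == "__main__":
    print(sql_not_in(1, [2, None, 1]))  # the case from the commit message
    print(sql_not_in(1, [2, None]))     # no match, but a NULL is present
    print(sql_not_in(1, [2, 3]))        # no match and no NULL
```

In this sketch `False` corresponds to the `0` that the commit message expects, while a list that contains NULL and no match yields unknown rather than true.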


323 Commits