doris

Author	SHA1	Message	Date
Yingchun Lai	64ebea2e43	[Feature] Support gzip compression for http response (#4533 ) After tablet level metrics is supported, the http metrics API may response a very large body when a BE holds a large number of tablets, and cause heavy network traffic. This patch introduce http content compression to reduce network traffic.	2020-09-06 20:30:12 +08:00
ZhangYu0123	69bd91b617	[BUG] Tablet is not readable and delete handler report -1903 error, when condition value contains \n (#4531 )	2020-09-06 20:29:44 +08:00
xueyan.li	bcb443fb63	[Feature]Support SELECT Optimizer Hints SET_VAR (#4504 )	2020-09-06 20:27:53 +08:00
caoyang10	03d9f6d8b4	[Feature] support hour time unit with dynamic parition (#4514 ) Many tables are so large that need seperate partitions with "HOUR" time unit. But now dynamic partition doesn't support "HOUR" time unit and it was marked as "TODO". So I support the feature and it works.	2020-09-06 20:25:27 +08:00
ZhangYu0123	13e2cf172f	[Log] Add log for trace broker (#4505 ) Add tracing broker log. When fe get filestatus for distributing load task to broker, the broker maybe get empty files and not give correct error code. Add this log to easy track which broker process filestatus operation and we can get the error log.	2020-09-06 20:25:08 +08:00
Yingchun Lai	b780df697a	[refactor] Optimize threads usage mode in BE (#4440 ) BE can not graceful exit because some threads are running in endless loop. This patch do the following optimization: - Use the well encapsulated Thread and ThreadPool instead of std::thread and std::vector<std::thread> - Use CountDownLatch in thread's loop condition to avoid endless loop - Introduce a new class Daemon for daemon works, like tcmalloc_gc, memory_maintenance and calculate_metrics - Decouple statistics type TaskWorkerPool and StorageEngine notification by submit tasks to TaskWorkerPool's queue - Reorder objects' stop and deconstruct in main(), i.e. stop network services at first, then internal services - Use libevent in pthreads mode, by calling evthread_use_pthreads(), then EvHttpServer can exit gracefully in multi-threads - Call brpc::Server's Stop() and ClearServices() explicitly	2020-09-06 20:19:14 +08:00
Mingyu Chen	a390c9ee9f	[Bug] Fix bug that table ids is not set right for hadoop load job (#4535 ) We store all table ids involved in the Load Job in TransactionState. However, for Hadoop Load job, table ids are set incorrectly. This caused the WAITING_TXN phase to not correctly wait for the completion of the previous load transaction when doing the alter table, which caused some data version loss problems.	2020-09-04 17:39:37 +08:00
xy720	369e86bbe2	[Spark Load] [Bug] Load job's state will stay in ETL state all the time after FE restart (#4528 )	2020-09-04 15:56:54 +08:00
Youngwb	068707484d	Support sequence column for UNIQUE_KEYS Table (#4256 ) * add sequence col Co-authored-by: yangwenbo6 <yangwenbo3@jd.com>	2020-09-04 10:10:17 +08:00
Mingyu Chen	72f04ebdb8	[Bug] Fix bug that query meta fields has been sent twice (#4529 ) Fix bug that query meta fields has been sent twice. This bug is introduced by #4330 , and related PR is #4450	2020-09-04 09:31:14 +08:00
ZhangYu0123	8d60352737	[BUG] Fix segment group add zone map bug when schema change. (#4526 ) Fix segment group add zone map bug when schema change. (1) WrapperField null point check (2) in DUP_KEYS, let _zone_maps index consistent with _schema column index	2020-09-04 09:30:52 +08:00
Mingyu Chen	15f3e5a775	[Bug] Fix bug of core local value (#4523 ) When creating core local value from CoreDataAllocator, A lock is needed to protect the modification of _blocks.	2020-09-04 09:30:30 +08:00
HappenLee	a64c3a7acd	[ODBC SCAN NODE] 3/4 Add ODBC_TABLE and ODBC_SCAN NODE in FE. (#4430 ) we can create odbc_table use SQL like ``` CREATE EXTERNAL TABLE `baseall_oracle` ( `k1` decimal(9, 3) NOT NULL COMMENT "", `k2` char(10) NOT NULL COMMENT "", `k3` datetime NOT NULL COMMENT "", `k5` varchar(20) NOT NULL COMMENT "", `k6` double NOT NULL COMMENT "" ) ENGINE=ODBC PROPERTIES ( "host" = "192.168.0.1", "port" = "8086", "user" = "happenlee", "password" = "doris", "database" = "doris", "table" = "baseall", "driver" = "Oracle 19 ODBC driver", "type" = "oracle" ); ``` Now we only support Oracle and MySQL Database and this feature default turned off by conf enable_odbc_table.	2020-09-04 09:30:01 +08:00
Mingyu Chen	5166a6c6bc	[Bug] function str_to_date()'s behavior on BE and FE is inconsistent (#4495 ) Main CL: 1. Copy the code from BE to implement the `str_to_date()` function in FE. 2. `str_to_date("2020-08-08", "%Y-%m-%d %H:%i:%s")` will return `2020-08-08 00:00:00` instead of `2020-08-08`.	2020-09-03 17:16:19 +08:00
Zhengguo Yang	d0d394ad7e	[SQL][Bug]fix semi/anti join error when table has delete sign column (#4498 ) It is possible to report "Illegal column/field reference'table2.DORIS_DELETE_SIGN' of semi-/anti-join" when executing a semi/anti join statement on a table with hidden columns. This is because the filter conditions of semi/anti join cannot added in the where statement. Now we add delete flag related where predicate in OlapScanNode level.	2020-09-03 17:15:15 +08:00
xinghuayu007	1a30bcbf36	[SQL Function][Bug] Fix parse_url() bug (#4429 ) The parameter 'part' of parse_url function does not support lower case, and parse protocol not right. And This function does not support parse 'port'. This PR tries to make parse_url function case insensitive and support parse 'port'. The issue: #4451	2020-09-03 17:06:09 +08:00
ZhangYu0123	c29d41f675	[BUG] Fix recover persistent stale rowsets bug from multi-single version rowsets in stale rowsets (#4513 ) (1) fix recover persistent stale rowsets bug from multi-single version rowset in stale rowsets (2) delete_expired_inc_rowsets check consistent version convert to [0, max_version]	2020-09-03 16:59:18 +08:00
Mingyu Chen	d7ac44ac79	[Bug] Fix bug that BE will crash when querying information_schema.columns (#4511 ) This bug is introduced from #4364	2020-09-03 16:57:56 +08:00
Zhengguo Yang	ac3bbdd3ab	[BatchDelete] Add a configuration indicating whether to enable the batch delete function (#4493 )	2020-09-03 16:56:37 +08:00
HaiBo Li	c01954719f	[Feature][Cache] Sql cache and partition cache #2581 (#4330 ) 1. Analyze what mode of cache can be used by query 2. Query cache before executing query in StmtExecutor 3. Two cache mode, sqlcache and partitioncache, are implemented	2020-09-02 19:18:42 +08:00
HappenLee	e6367e9ba8	[Bug] Support disable colocate join where join clause has join hint (#4497 ) Fix issue:#4496	2020-09-02 10:54:45 +08:00
xy720	f5ee854b6f	[Spark load][Bug] Fix column terminator for spark load (#4491 ) Support specifying column separator without back slash.	2020-09-02 10:54:03 +08:00
xy720	f207036cad	[Spark load][Document] Add docs about spark and yarn client for spark load (#4489 ) Add docs about spark and yarn client for spark load	2020-09-02 10:52:49 +08:00
xinghuayu007	1a22f3b2ac	[SQL][Function] Validate the param of rand function in compile step (#4439 ) The param of rand() function should be literal, but current compiler ignore to validate the literal param of rand function, it is validated in execution step. This PR make it validated in compile step, and make it more earlier to find the usage error of rand() function.	2020-09-02 10:50:52 +08:00
Yingchun Lai	498b06fbe2	[Metrics] Support tablet level metrics (#4428 ) Sometimes we want to detect the hotspot of a cluster, for example, hot scanned tablet, hot wrote tablet, but we have no insight about tablets in the cluster. This patch introduce tablet level metrics to help to achieve this object, now support 4 metrics on tablets: `query_scan_bytes `, `query_scan_rows `, `flush_bytes `, `flush_count `. However, one BE may holds hundreds of thousands of tablets, so I add a parameter for the metrics HTTP request, and not return tablet level metrics by default.	2020-09-02 10:39:41 +08:00
Lijia Liu	f3a9f3f87c	Do not add exchange when table's distributioin satisfy the distribution requirements (#4482 ) In DistributedPlanner, do not add the unnecessary Exchanges. For case 1, we only need to judge that the table's distribute hash keys is a subset of the aggregate keys. For case 2, we should judge two conditions: - partition keys are also hash keys. - the table's distribute hash keys is a subset of the aggregate keys.	2020-09-01 11:34:53 +08:00
Mingyu Chen	d49566130b	[Bug] Fix bug of select @@sql_mode (#4484 ) Fix bug that `select @@sql_mode` throw error: Invalid number format.	2020-09-01 11:31:35 +08:00
Mingyu Chen	a864db03fe	[Bug] Fix bug of load error hub and schema change (#4486 ) 1. When WITH_MYSQL is off, load error hub does not suport MySQL load error hub, we should check its return value. 2. misjudge the return value of `change_row_block` in schema_change.cpp	2020-08-31 23:21:50 +08:00
Mingyu Chen	8bb65863f5	[Doc] Update doc of fe-idea-dev.md (#4485 )	2020-08-31 10:09:10 +08:00
xy720	7b67da30d2	[Spark Load] Redirect the spark launcher's log to a separated log file (#4470 )	2020-08-30 21:10:04 +08:00
ZhangYu0123	1d93ba027a	[Compaction] Compaction show policy type and disk format (#4466 ) Add more information in compaction show api 1、add cumulative policy type 2、format rowset total disk size	2020-08-30 21:09:47 +08:00
wyb	ffe696d17c	[Doc] Add spark load sql statement doc and update manual (#4463 ) 1. add sql statement in dml 2. update spark load manual	2020-08-30 21:09:17 +08:00
Yingchun Lai	65cacbff7c	[Bug] Fix bug that memory copy may overflow in MemIndex::load_segment (#4458 ) Segment index file content is not set as 0 when it is constructed in write procedure, so when load index from this file, and meet a null VARCHAR cell, the null field of this cell is 0, but the length field which is not initialized maybe a large random number, then memory copy may cause overflow. This patch fix this bug, and also skip useless memory copy to improve a bit of performance.	2020-08-30 21:08:55 +08:00
ZhangYu0123	123237afb7	[Compaction] Persistence stale rowsets meta (#4454 ) Persistence stale rowsets meta. When BE reboots, stale rowsets meta can resume and the stale version can also be readable before stale gc time. ISSUE: #4453	2020-08-30 21:05:48 +08:00
Zhengguo Yang	3b7614e174	[Refactor] Use camelCase in thrift generated java sources (#4443 ) Use camelCase in thrift generated java sources to make us fe's code style is more unified	2020-08-28 13:28:11 +08:00
Mingyu Chen	0db9194dc0	[Doc] Fix wrong doc name (#4477 ) Co-authored-by: morningman <chenmingyu@baidu.com>	2020-08-28 11:56:59 +08:00
ZhangYu0123	004b955ca4	[Bug] Fix a null pointer bug in PlanFragmentExecutor. (#4473 ) Fix a null pointer bug in PlanFragmentExecutor. Add null check operation before it is used. Detail: #4472	2020-08-28 09:28:23 +08:00
wyb	ec64789e89	[Bug][Colocation Join] Fix colocation balance endless loop bug (#4471 ) 1. Only one available backend. 2. All backends are checked but this round is not changed. For example, all backends are on the same host.	2020-08-28 09:27:57 +08:00
Zhengguo Yang	174c9f89ea	[DOCS] Add batch delete docs (#4435 ) update documents for batch delete #4051	2020-08-28 09:24:07 +08:00
wyb	82940a4905	[Spark Load] Fix spark load bugs (#4464 ) 1. fix write dpp result when dpp throw exception 2. boolean value：true, false(IgnoreCase), 0, 1 3. wrong dest column for source data check 4. support * in source file path 5. if job state is cancelled or finished, submitPushTasks would throw all partitions have no load data exception, because tableToLoadPartitions was already cleaned up #3433	2020-08-27 23:40:33 +08:00
HappenLee	84c63f1350	[Bug] replace libltdl.so when compile the unixodbc library (#4461 )	2020-08-27 20:53:28 +08:00
Youngwb	976e3bb219	[Bug][Compile] Add missing imports (#4468 ) Co-authored-by: yangwenbo6 <yangwenbo3@jd.com>	2020-08-27 18:14:11 +08:00
HangyuanLiu	ad738fa198	Add OLAP_ERR_DATE_QUALITY_ERR error status to display schema change failure (#4388 ) In the process of historical data transformation of materialized views, it may occur that the transformation fails due to data quality. Add an error status code ：OLAP_ERR_DATE_QUALITY_ERR to determine if a data problem is causing the failure #3344	2020-08-27 17:52:53 +08:00
gengjun-git	fe0c21bf93	[Bug] Fix mysql return bug (#4450 ) Send fields after first row arrived so that error packet can be send to client when exception thrown from coord.getNext(). Golang and Python can not identify error if fields packet arrived before error packet.	2020-08-27 12:17:24 +08:00
xueyan.li	3c784b9c90	[SQL] support StringLiteral try to cast BigInt (#4445 )	2020-08-27 12:15:28 +08:00
yiguolei	b85bb0e2e9	[Bug-Fix] Some deleted tablets are not recycled on BE (#4401 )	2020-08-27 12:09:19 +08:00
xy720	8c38c79104	[SparkLoad]Use the yarn command to get status and kill the application (#4383 ) This cl will use yarn command as follows to kill or get status of application running on YARN. ``` yarn --config confdir application <-kill \| -status> <Application ID> ```	2020-08-27 12:08:55 +08:00
Mingyu Chen	f218327dd9	[Mysql Compatibility] Support convert() and signed/unsigned interger cast (#4364 ) 1. Support convert(expr, target_type) function, which is same as CastExpr 2. Support cast (expr as signed/unsigned int) This is just for compatibility, the signed/unsigned specification is meaningless.	2020-08-27 12:07:58 +08:00
Mingyu Chen	8b0b120aca	[Profile] Add 2 Segment related metrics in query profile (#4348 ) Total number of segments and filterd number of segment	2020-08-27 12:07:21 +08:00
HappenLee	e4e9af4577	This PR contain three things (#4448 ) 1. Fix core bug wild pointer in PlanFragmentExecutor, fix issue #4447 2. Fix core bug wild pointer json load, fix issue #4452 3. Change the declare order of ODBC type in thrift for compatibility	2020-08-26 10:53:53 +08:00

1 2 3 4 5 ...

2318 Commits