Compaction policy optimization; see #4164 for the detailed problem description and design.
This PR adds two features:
1. Make the cumulative compaction policy configurable, and implement the original policy.
2. Implement the universal policy, the optimized version described in #4164.
Sometimes we want to detect hotspots in a cluster, for example, heavily scanned or heavily written tablets,
but we have no tablet-level insight into the cluster.
This patch introduces tablet-level metrics to help achieve this. Four tablet metrics are currently supported: `query_scan_bytes`, `query_scan_rows`, `flush_bytes`, `flush_count`.
However, one BE may hold hundreds of thousands of tablets, so a parameter is added to the metrics HTTP request,
and tablet-level metrics are not returned by default.
In some very special circumstances, such as code bugs or human misoperation,
all replicas of a tablet may be lost. In this case, the data is essentially lost.
However, in some scenarios the business still wants queries to run without
reporting errors even if there is data loss, to reduce the impact on the user layer.
In this case, we can fill the missing replica with an empty tablet to ensure that queries can still be executed normally.
Add a new FE config `recover_with_empty_tablet`. The default is false; true means an empty tablet is used to fill the missing one.
Also fix a bug in #4274.
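Assuming `recover_with_empty_tablet` is a runtime-mutable FE config (an assumption; otherwise it belongs in fe.conf and needs an FE restart), it could be enabled like this:
```
-- Assumption: the config is mutable at runtime; otherwise set it in
-- fe.conf and restart the FE.
ADMIN SET FRONTEND CONFIG ("recover_with_empty_tablet" = "true");
```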
A new feature has been added to acquire the tablet id and schema hash of all the tablets on a particular BE node
via a web page, so that more detailed information about each tablet can be obtained from its
tablet id and schema hash. Depending on the web request, there are two ways
(table and json) to show the acquired tablet ids and schema hashes on the web page.
This PR adds InPredicate support to the delete statement,
and adds a `max_allowed_in_element_num_of_delete` variable to
limit the number of InPredicate elements in a delete statement.
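A minimal sketch of the new capability; the table, partition, and values below are hypothetical:
```
-- Delete all rows whose key matches any element of the IN list.
DELETE FROM tbl1 PARTITION p1 WHERE k1 IN ("a", "b", "c");

-- Raise the limit on the number of IN elements (example value).
SET max_allowed_in_element_num_of_delete = 1024;
```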
Support ALTER ROUTINE LOAD JOB stmt, for example:
```
ALTER ROUTINE LOAD FOR db1.label1
PROPERTIES
(
    "desired_concurrent_number" = "3",
    "max_batch_interval" = "5",
    "max_batch_rows" = "300000",
    "max_batch_size" = "209715200",
    "strict_mode" = "false",
    "timezone" = "+08:00"
);
```
Details can be found in `alter-routine-load.md`
Stream load should read all of the data before parsing the JSON.
Also add a new BE config `streaming_load_max_batch_read_mb`
to limit the data size when loading JSON data.
Fix a bug when loading an empty JSON array `[]`.
Add documentation to explain certain cases of loading JSON-formatted data.
Fix: #4124
Currently, if a URL is longer than 4096 bytes, netty will refuse it.
The case can be reproduced by constructing a very long URL (longer than 4096 bytes).
Add 2 HTTP server params:
1. `http_max_line_length`
2. `http_max_header_size`
This PR mainly does three things:
1. Fix the FE meta version bug introduced by #4029 when resolving a conflict with #4086.
2. Make the drop-check code easier to read.
3. Add doc content for the drop meta check.
Try to select a BE with an existing replica as the destination BE for the
REPLICA_RELOCATING clone task.
Fix #4147
Also add 2 new FE configs, `max_clone_task_timeout_sec` and `min_clone_task_timeout_sec`.
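Assuming the two timeout configs are runtime-mutable (an assumption), they could be tuned as below; the values are examples only:
```
-- Hypothetical values bounding the computed clone task timeout.
ADMIN SET FRONTEND CONFIG ("min_clone_task_timeout_sec" = "180");
ADMIN SET FRONTEND CONFIG ("max_clone_task_timeout_sec" = "7200");
```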
This PR is to support grammar like the following: `INSTALL PLUGIN FROM [source] [PROPERTIES("KEY"="VALUE", ...)]`
The user can set `md5sum="xxxxxxx"`, so we don't need to provide a separate md5 URI.
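A usage sketch; the plugin URL and checksum below are placeholders:
```
-- Supplying the checksum inline avoids hosting a separate .md5 file
-- next to the plugin package.
INSTALL PLUGIN FROM "http://example.com/my_plugin.zip"
PROPERTIES ("md5sum" = "73877f6029216f4314d712086a146570");
```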
Currently, we only check the database's used data quota when creating or altering a table, or in some old-style load jobs, but not for routine load and stream load jobs. This PR provides a uniform solution that checks the database's used data quota whenever a data load job begins a new txn.
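For context, the quota being enforced is the per-database data quota, set for example with the statement below (the database name and quota value are examples; unit-suffix support is assumed):
```
-- Example only: give example_db a 10 TB data quota; once it is exceeded,
-- new load transactions are rejected by the quota check.
ALTER DATABASE example_db SET DATA QUOTA 10T;
```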
This PR mainly does the following three things:
1. Add thread names in the FE log to make tracing problems easier.
2. Add the `agent_task_resend_wait_time_ms` config to avoid sending duplicate agent tasks to BE.
3. Skip updating the replica version when the new version is lower than the replica version in FE.
Related issue: #4017. The main changes are as follows:
1. Add `expired_snapshot_rs_version_map` and `_expired_snapshot_rs_metas`.
2. Add `VersionedRowsetTracker` to record compacted path versions.
3. Record the path version when rowsets are compacted.
4. In the GC process, add expired snapshot rowsets to the unused set for removal.
[Bug] Fix some schema changes not working correctly
This CL mainly fixes schema changes to varchar type not working correctly
because of a missing logic check, and adds a `ConvertTypeResolver` that registers
the supported conversion types, to avoid forgetting the logic check in the future.
This PR mainly adds a `thrift_client_retry_interval_ms` config in BE for the thrift client,
to avoid an avalanche disaster in the FE thrift server, and fixes some typos and some RPC
setting problems at the same time.
It's time to enable some features by default:
1. Enable FE plugins by setting `plugin_enable=true`
2. Enable dynamic partition by setting `dynamic_partition_enable=true`
3. Enable the nio MySQL server by setting `mysql_service_nio_enabled=true`
Also modify the installation doc to add a download link for the MySQL client.
This CL mainly changes:
1. Reorganize the code logic to limit the supported JSON formats to two, making the import behavior more consistent.
2. Modify how error rows are counted when loading in JSON format, so that error rows are counted correctly.
3. See `load-json-format.md` for details on loading data in JSON format.
This CL mainly supports setting the `replication_num` property on a dynamic partition
table. If `dynamic_partition.replication_num` is not set, the value defaults to the
table's default `replication_num`.
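A sketch of a dynamic partition table using the new property; the schema and values are hypothetical:
```
CREATE TABLE example_db.tbl1
(
    k1 DATE,
    k2 INT
)
PARTITION BY RANGE (k1) ()
DISTRIBUTED BY HASH (k2) BUCKETS 32
PROPERTIES
(
    "dynamic_partition.enable" = "true",
    "dynamic_partition.time_unit" = "DAY",
    "dynamic_partition.end" = "3",
    "dynamic_partition.prefix" = "p",
    "dynamic_partition.buckets" = "32",
    -- New in this CL: if omitted, falls back to the table's
    -- default replication_num.
    "dynamic_partition.replication_num" = "3"
);
```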
Doris only supports the TThreadPoolServer model in its thrift server, but this
server model is not effective in some high-concurrency scenarios, so this
PR introduces a new config that allows users to choose a different server model
for their scenario.
Add new FE config: `thrift_server_type`
* [Enhance] Add MetaUrl and CompactionUrl for the "show tablet" stmt
Add MetaUrl and CompactionUrl to the result of the following stmt:
`show tablet 10010;`
* fix ut
* add doc
Co-authored-by: chenmingyu <chenmingyu@baidu.com>
Fix: #3946
CL:
1. Add a prepare phase for the `from_unixtime()`, `date_format()` and `convert_tz()` functions, to handle the format string once for all rows.
2. Find the cctz timezone when initializing the `runtime state`, so that we don't need to look up the timezone for each row.
3. Add a constant rewrite rule for `utc_timestamp()`.
4. Add doc for `to_date()`.
5. Comment out `push_handler_test`; it cannot run in DEBUG mode and will be fixed later.
6. Remove `timezone_db.h/cpp` and add `timezone_utils.h/cpp`.
The performance results are shown below:
11,000,000 rows
SQL1: `select count(from_unixtime(k1)) from tbl1;`
Before: 8.85s
After: 2.85s
SQL2: `select count(from_unixtime(k1, '%Y-%m-%d %H:%i:%s')) from tbl1 limit 1;`
Before: 10.73s
After: 4.85s
Date string formatting still seems slow; we may need a further enhancement for it.
Currently we choose a BE at random without checking whether a disk is available,
so creating a table fails only after the create-tablet task has been sent to the BE
and the BE has checked whether it has available capacity to create the tablet.
Checking backend disk availability by storage medium up front reduces these unnecessary RPC calls.
1. Split /_cluster/state into /_mapping and /_search_shards requests to reduce the required permissions and make the logic clearer.
2. Rename some ES-related objects to make their naming more accurate.
3. Provide simple support for docValue and fields in alias mode, taking the first one by default.
#3311