doris

Author	SHA1	Message	Date
caiconghui	8660bf69ff	[fix](select join) Make selected slotRef nullable when slotRef is from nullable tuple in outer join sql block (#7290 )	2021-12-06 16:17:10 +08:00
Mingyu Chen	164b27412c	[revert] "[improvement](bdbje) clean too many bdbje log (#7273 )" (#7312 ) Reverts #7273 Because there is no EnvironmentConfig.RESERVED_DISK.	2021-12-06 11:32:45 +08:00
Zhengguo Yang	200210e708	[fix](ut) Fix FE unit test failure caused by changing MAX_PHYSICAL_PACKET_LENGTH to 0xffffff	2021-12-06 11:13:01 +08:00
caiconghui	6e0664bdf8	[enhancement](audit) Enable fe audit plugin to audit more infos for query (#7300 )	2021-12-06 10:33:15 +08:00
caiconghui	bffc2836d7	[fix](show) Fix bug that AdminShowDataSkew operation may cause fe oom (#7297 )	2021-12-06 10:32:00 +08:00
thinker	f9be31d4bc	[refactor](rowbatch) make RowBatch better (#7286 ) 1. Add the const keyword to RowBatch's read-only member functions 2. Prefer member objects over member object pointers wherever possible	2021-12-06 10:31:43 +08:00
tianhui5	e080afa186	[typo] update comment of MasterDaemon (#7285 ) The comment of MasterDaemon is out of date and may mislead readers.	2021-12-06 10:30:48 +08:00
thinker	8a6528a2fb	[fix](executor) set the length of StringValue to 0 when it is null (#7284 ) The ptr and len of a tuple's String Slot are not assigned appropriately on the send side, so the receive side may crash in some situations. Detailed description: on the send side, when we call RowBatch::serialize(PRowBatch* output_batch) to pack a RowBatch, Tuple::deep_copy() is called. Only String Slots that are not null have ptr and len set to proper values; null String Slots keep their original state, so ptr may point to an arbitrary location and len may hold an unexpected value. On the receive side, unpacking is done by RowBatch::RowBatch(const RowDescriptor&, const PRowBatch&...), which converts each String Slot's offset into a valid string_val->ptr whether the String Slot is null or not. But some business logic depends on string_val->len=0, such as AggregateFuncTraits::init(), and HyperLogLog::deserialize() returns correctly if slice.size<=0. So if string_val->len is set to 0 on the send side, everything works; otherwise the server may crash. From a network communication viewpoint, it is the sender's responsibility to transfer correct data with proper values, without making any assumption about how the receive side will use it.	2021-12-06 10:30:26 +08:00
wei zhao	19a3c393a9	[Improvement](spark-connector) Add 'sink.batch.size' and 'sink.max-retries' options in spark-connector (#7281 ) Add `sink.batch.size` and `sink.max-retries` options to the `Doris Spark-connector`, consistent with the `flink-connector` options. eg: ```scala df.write .format("doris") // specify maximum number of lines in a single flushing .option("sink.batch.size",2048) // specify number of retries after writing failed .option("sink.max-retries",3) .save() ```	2021-12-06 10:29:33 +08:00
dh-cloud	974ab9b90c	[improvement](bdbje) clean too many bdbje log (#7273 ) In an HA environment, JE retains a large number of reserved files, so the bdbje log becomes too large. We should therefore limit the size of the reserved files; the default is set to 1GB	2021-12-06 10:28:36 +08:00
tinkerrrr	25b31e7d5e	[docs][typo] correct sql syntax in upgrade.md (#7271 ) correct sql syntax in upgrade.md Co-authored-by: 袁湘敏 <yuanxiangmin@corp.netease.com>	2021-12-06 10:28:01 +08:00
EmmyMiao87	4bfee42ba1	[feature-wip](lateral view) Support lateral view based on subquery (#7269 ) Support lateral view of the result column in subquery. For example: ``` select e1 from (select k2 as a from test_explode group by a) tmp1 lateral view explode_split(a, ",") tmp2 as e1; ``` The lateral view will parse the inline view column and put the table function node above the subquery.	2021-12-06 10:26:36 +08:00
renzhimin7	27f494dad3	[docs][typo] Update fe_config.md (#7252 ) Int type should be 4 bytes and decimal should be 16 bytes	2021-12-06 10:25:28 +08:00
HappenLee	d3316ff567	[performance](function) Support SIMD function in some string function (#7236 ) Support SIMD in some string functions: ltrim, rtrim, trim, reverse, hex	2021-12-06 10:24:26 +08:00
kezhenxu94	270bebe196	[chore](github) Add third-party GitHub Action as submodule to allow it to run (#7280 ) Add the 3rd-party GHA as submodule so that it can be run without asking to add it into allow list.	2021-12-04 19:43:30 +08:00
EmmyMiao87	845f931098	[fix](select outfile) Remove optional properties check of hdfs storage (#7272 )	2021-12-03 13:42:56 +08:00
sparklezzz	92020e6e85	[deps](librdkafka) set --enable-sasl option in rdkafka build to enable plain password auth at routine load (#7251 ) ``` create routine load rd_001 on tb1 with append COLUMNS(user_id, date) properties ( "desired_concurrent_number" = "3", "max_batch_interval" = "5", "max_batch_rows" = "300000", "max_batch_size" = "209715200", "max_error_number" = "100", "format" = "json" ) from KAFKA ( "kafka_broker_list" = "127.0.0.1:9092", "kafka_topic" = "topic1", "property.security.protocol" = "sasl_plaintext", "property.sasl.mechanism" = "PLAIN", "property.sasl.username" = "your-username", "property.sasl.password" = "your-password", "property.group.id" ="group1", "property.client.id" = "client-1", "property.kafka_default_offsets" = "OFFSET_BEGINNING" ); ```	2021-12-02 11:44:37 +08:00
Wei	5f7c4f903f	[refactor](log) Remove unused log instance creation (#7249 )	2021-12-02 11:43:29 +08:00
jakevin	f51448d60b	[community](github) add enhancement.yml (#7242 ) Add enhancement type of issue	2021-12-02 11:42:31 +08:00
Xinyi Zou	fc9e502b51	[improvement](brpc)(config) Support transfer RowBatch in Controller Attachment (#7164 ) Transfer RowBatch in Protobuf Request to Controller Attachment, when the maximum length of the RowBatch in the Protobuf Request is exceeded. This can avoid reaching the upper limit of the Protobuf Request length (2G), and it is expected that performance can be improved.	2021-12-02 11:41:38 +08:00
xinghuayu007	dd36ccc3bf	[feature](storage-format) Z-Order Implement (#7149 ) Support sort data by Z-Order: ``` CREATE TABLE table2 ( siteid int(11) NULL DEFAULT "10" COMMENT "", citycode int(11) NULL COMMENT "", username varchar(32) NULL DEFAULT "" COMMENT "", pv bigint(20) NULL DEFAULT "0" COMMENT "" ) ENGINE=OLAP DUPLICATE KEY(siteid, citycode) COMMENT "OLAP" DISTRIBUTED BY HASH(siteid) BUCKETS 1 PROPERTIES ( "replication_allocation" = "tag.location.default: 1", "data_sort.sort_type" = "ZORDER", "data_sort.col_num" = "2", "in_memory" = "false", "storage_format" = "V2" ); ```	2021-12-02 11:39:51 +08:00
Zhengguo Yang	d8ba6e3eb6	1. Fix an error where fetching a string type field may cause a malformed packet error. (#7262 ) This is because the constant MAX_PHYSICAL_PACKET_LENGTH in FE should be 2^24 -1, but it was set to 2^24 -2 by mistake. 2. Fix bitmap_to_string, which may fail when the result is larger than 2G	2021-12-01 10:02:34 +08:00
caiconghui	fbab8afe24	[feature] Support disable query and load for backend to make Doris more robust and set default value to 1 for max_query_retry_time (#7155 ) ALTER SYSTEM MODIFY BACKEND "host1:9050" SET ("disable_query" = "true"); ALTER SYSTEM MODIFY BACKEND "host1:9050" SET ("disable_load" = "true");	2021-11-30 22:08:32 +08:00
Mingyu Chen	6c4aeab06f	[fix](broker-load) BE may crash when using preceding filter in broker or routine load (#7193 ) The broker scan node has two tuple descriptors: One is dest tuple and the other is src tuple. The src tuple is used to read the lines of the original file, and the dest tuple is used to save the converted lines. The preceding filter is executed on the src tuple, so src tuple descriptor should be used to initialize the filter expression	2021-11-30 22:04:05 +08:00
董伟召	904a32c758	[docs] fix 0.14 release date in download page (#7253 ) The release date of 0.14 in download page is wrong	2021-11-30 15:00:36 +08:00
Mingyu Chen	9b3c834396	[docs](release) Update download page to add release 0.15 (#7244 ) Also modify some steps in release processing document	2021-11-29 16:06:32 +08:00
HappenLee	91a3150910	[fix](reader) Fix the bug that reader call _capture_rs_readers function twice (#7224 )	2021-11-26 10:17:33 +08:00
Mingyu Chen	baa5d6089f	[fix](alter) Fix bug that partition column of a unique key table can be modified (#7217 ) The partition columns can not be modified.	2021-11-26 10:16:01 +08:00
曹建华	948a2a738d	[performance] Improve DeltaWriter's performance. (#7216 ) 1. Support batch write for DeltaWriter. 2. Use mutex instead of SpinLock.	2021-11-26 10:15:27 +08:00
Shuo Wang	178fda593d	[docs] Refine documents for commit message tags. (#7215 )	2021-11-26 10:14:39 +08:00
GoGoWen	52cd12a1f9	[fix](planner) fix preaggregation reason error (#7205 ) This PR fixes #7204.	2021-11-26 10:13:53 +08:00
Hao Tan	a1bf2878c0	[feat-opt](json-function) optimize get_json_xx function (#7157 ) Avoid repeatedly parsing the JSON string if the first parameter of the function is constant.	2021-11-26 10:12:55 +08:00
EmmyMiao87	70670b5a42	[feat-wip](lateral-view) Pruning output slot of TableFunctionNode (#7148 ) Once the calculation of the lateral view function is completed, the result is returned directly to the upper layer, which causes a lot of memory copying and network transmission. The reason is that the original column that participates in the lateral view is very likely to hold a very long value. If Doris still retains this column after calculating the lateral view, it needs to perform a memory copy. However, in many cases the upper plan node does not need the original columns of the lateral view, so column pruning is performed after the calculation of the lateral view to avoid useless memory copying and network transmission. For example, the following query can prune the original column v1 ```select k1, e1 from table lateral view explode_split(v1, ",") tmp as e1;``` The `outputSlotIds` in TableFunctionNode is used to store the columns that should be retained after pruning. * Support scalar function in lateral view The child 0 of the explode_split function could be a scalar function such as: concat(k1, ",", k2) This PR mainly detects whether the lateral view with function satisfies the following semantic specifications. 1. The columns in the function must all belong to the original table 2. The function must be a scalar function	2021-11-26 10:10:05 +08:00
Pxl	2445f10868	[fix](bitmap-function) fix core dump at some bitmap function (#7221 )	2021-11-25 22:52:50 +08:00
Zhengguo Yang	c9e578032b	Optimize bitmap function count by using the roaring cardinality method, which is faster than the current version (#7151 )	2021-11-24 14:42:48 +08:00
yiguolei	b6a9207a25	[deps](brpc) fix compile bug that could not find protobuf lib during compile (#7197 )	2021-11-24 10:44:26 +08:00
HappenLee	fb5adaf18e	[fix](mem-tracker) Fix mem limit -1 in partition aggregate node (#7181 ) Make error message more clear.	2021-11-24 10:43:35 +08:00
Mingyu Chen	3fd8148100	[doc] Add build-dev image 1.4.2 to compilation document (#7174 ) Add build-dev image 1.4.2 to compilation document	2021-11-24 10:42:52 +08:00
Mingyu Chen	5a8591aaf0	[doc] add FAQ document (#7173 ) From the Apache Doris WeChat account, authorized.	2021-11-24 10:42:33 +08:00
Pxl	3fcb3db57a	[fix](vectorized-engine) fix core dump when enable_vectorized_engine is on (#7159 )	2021-11-24 10:42:12 +08:00
Mingyu Chen	e74bfea8e4	[chore](clang-format)(license-eye) Add Clang Format/Skywalking eyes github action (#7132 ) 1. The clang format action will be triggered when a PR is submitted. 2. Skywalking eyes actions will be triggered when a PR is submitted and after merging to master branch.	2021-11-24 10:41:02 +08:00
xu20160924	3b988204fc	[doc] Modify the wrong comment of the ScanTime (#7109 ) Modify the wrong comment of the ScanTime.	2021-11-24 10:40:00 +08:00
Pxl	a74fdf184c	[refactor](be) refactor predicate function creator (#7054 ) Refactor predicate function creator, make MinMaxFunction/HybridSet/BloomFilter use a unified interface through template to get function.	2021-11-24 10:39:29 +08:00
tianhui5	d3c020b3cb	[feat-opt](fe-config) Add tablets number limit to avoid wrong usage (#7025 ) 1. Add new FE config `default_db_replica_quota_size` 2. Check replica quota after creating a table/partition	2021-11-24 10:37:54 +08:00
luzhijing	4b45b806da	[doc] Created commit-format-specification.md (#7190 ) We found that many commit messages submitted at present contain ambiguous information. Clear commit messages help developers make pull requests more readable, help committers merge them easily, and make releases easier for the Release Manager. Therefore, we have put together a commit format specification. We hope that subsequent contributors will write their commit messages according to the specification when submitting a Pull Request.	2021-11-24 10:30:54 +08:00
Zhengguo Yang	d420ff0afd	Display current load bytes to show load progress (#7134 ) This value may be greater than the file size when loading a parquet or orc file, and less than the file size when loading a csv file.	2021-11-24 10:08:32 +08:00
Zhengguo Yang	e2d3d0134e	Add a method to get Doris's current memory usage (#6979 ) Add a check of total memory usage when TryConsume memory	2021-11-24 10:07:54 +08:00
Xinyi Zou	ad0d2b82ab	[fix](memory) fix bug that ~BitShufflePageDecoder destroys uninitialized chunk (#7172 ) Added a safe way to destroy Chunk.	2021-11-23 15:24:25 +08:00
renzhimin7	ce7fa5d6d9	[typo] Update multi-tenant.md (#7162 ) A double quote is missing	2021-11-22 14:47:00 +08:00
xy720	836c95c2ca	[feat](memory-track) Print peak memory use of all backend after query in audit log (#7030 ) Add a new field `peakMemoryBytes` in fe.audit.log	2021-11-22 14:46:08 +08:00

3582 Commits