doris

Author	SHA1	Message	Date
zy-kkk	ce489cf723	[Feature](JDBC)support clickhouse jdbc external table (#14244 )	2022-11-21 10:33:53 +08:00
zhannngchen	41dae8b6bb	[improvement](load) add a log when close OlapTableSink with error (#14257 )	2022-11-21 10:33:37 +08:00
xueweizhang	a9a6fdd8c3	[fix](insert) fix insert into table which contains column name prefix mv_ (#14361 )	2022-11-21 10:31:01 +08:00
minghong	0613ccda74	[feature](tools)profile viewer (#14429 ) Reading a profile is painful, especially when there are multiple parallel instances. This tool helps to grasp the main information of a profile in a graphical view. The profile is represented as a tree. Each SQL operator node shows its operation type (join, scan, ...), its node id, and its fragment id. The number on an arrow edge is the number of rows output by the child node. The tool sums the output rows of the same node across parallel instances: if there are 4 parallel instances and each ScanNode on the lineitem table outputs 10 rows, the label on the arrow starting from ScanNode(lineitem) is 40. A demo is provided for TPC-H Q2.	2022-11-21 10:29:54 +08:00
周翱	4976021bf7	[Enhancement] Doris broker support aliyun-oss #13665 (#14305 )	2022-11-21 10:29:14 +08:00
Pxl	c18a471303	[Optimize](predicate) update inplace on VcompoundPred (#14402 ) select count(*) from lineorder where lo_orderkey<100000000 OR lo_orderkey>100000000 AND lo_orderkey<200000000 OR lo_orderkey >200000000; 0.6s -> 0.5s	2022-11-21 09:12:30 +08:00
zhangstar333	3f29e3bff6	[bug](test) fix regression test of jdbc postgresql table core (#14417 )	2022-11-20 23:03:14 +08:00
jiafeng.zhang	98cea90950	[typo](docs)benchmark doc fix number (#14427 )	2022-11-20 22:51:42 +08:00
HappenLee	c29975d347	[Docs](function) Add some functions that are not in sidebars (#14426 )	2022-11-20 22:50:52 +08:00
jiafeng.zhang	71e80e8957	[typo](docs)Performance test documentation update (#14147 ) * Performance test documentation update	2022-11-20 09:40:57 +08:00
Mingyu Chen	2ccb5209a0	(improvement)[doc] add document version tag instruction (#14406 )	2022-11-20 00:05:53 +08:00
Yongqiang YANG	3489f4826c	[fix](test) sync conf used in pipeline and in repository (#14414 )	2022-11-20 00:05:08 +08:00
zhannngchen	3e1e8db173	[fix](exec) fix thread token shutdown (#14418 ) Fix the "Thread pool token was shut down" error. When a query has more than one fragment on a single BE, the thread token may be reset incorrectly, causing the thread token to shut down too early. Cherry-picked from master. Introduced by #13021.	2022-11-20 00:04:48 +08:00
lsy3993	5dfe5ef965	[test](hive catalog)add hive catalog test case (#14217 )	2022-11-19 17:26:18 +08:00
Gabriel	2c42f0a905	[refactor](decimalv3) Refine code for DecimalV3 (#14394 )	2022-11-19 16:57:17 +08:00
Dongyang Li	1482ab32b6	[tools](tpch)fix invalid download url (#14329 )	2022-11-19 13:29:33 +08:00
caiconghui	1f2c06dd6e	[enhancement](rewrite) Remove unused wide common factors to improve scan performance in ExtractCommonFactorsRule (#14381 ) * [enhancement](sql) Remove unused wide common factors to improve scan performance in ExtractCommonFactorsRule * fix regression test Co-authored-by: caiconghui1 <caiconghui1@jd.com>	2022-11-19 13:23:49 +08:00
FreeOnePlus	f5f2e84e31	[refactor](planner) remove the limit return rows of order by (#12478 ) Originally, Order By Limit returned at most 65535 rows by default, but many workloads no longer fit within this limit. Users had to append a larger limit to the query statement to get the full result, which is extremely inconvenient, so this has been adjusted. This change also adds the variable DEFAULT_ORDER_BY_LIMIT to SessionVariable, with a default value of -1. If the user does not use the LIMIT keyword, or the LIMIT value is a negative integer, the default number of returned rows is Long.MAX_VALUE. If a maximum query value is set, the number of rows returned follows that maximum value or the value following the LIMIT keyword.	2022-11-19 12:45:44 +08:00
gnehil	1b6e872a8a	[improvement](common) table name length exceeds limit error message (#14368 ) In the table name check, both a regex mismatch and a name that exceeds the length limit display the message "Incorrect table name 'xxx'. Table name regex is 'xxx'". This message does not make clear which kind of error occurred, so the two error messages are now separated.	2022-11-19 11:36:08 +08:00
Mingyu Chen	512b787559	[fix](parquet-reader) fix stack-use-after-return error (#14411 )	2022-11-19 10:52:50 +08:00
lihangyu	b4aef889f2	[feature-array](array-function) add array constructor function `array()` (#14250 ) * [feature-array](array-function) add array constructor function `array()` ``` mysql> select array(qid, creationDate) from nested_c_2 limit 10; +------------------------------+ \| array(`qid`, `creationDate`) \| +------------------------------+ \| [1000038, 20090616074056] \| \| [1000069, 20090616075005] \| \| [1000130, 20090616080918] \| \| [1000145, 20090616081545] \| +------------------------------+ 10 rows in set (0.01 sec) ```	2022-11-19 10:49:50 +08:00
lsy3993	02372ca2ea	[test](jdbc external table) add new jdbc mysql external table (#14323 )	2022-11-19 09:46:48 +08:00
Adonis Ling	eb76160b48	[chore](third-party) Use GNU official mirror to boost the download speed (#14358 ) According to the description in https://www.gnu.org/server/mirror.html, using the address http://ftpmirror.gnu.org/ to download GNU packages is recommended. It can boost the download speed worldwide.	2022-11-19 00:04:52 +08:00
924060929	63a2344e68	[Enhancement](Nereids) Refactor AggregateFunction and support explain plan (#14380 ) # Proposed changes - Refactor AggregateFunction 1. AggregateFunction implements ComputeSignature 2. Add a CustomSignature to compute the signature dynamically; the input types can be checked and the implicit cast types computed in the `customSignature` method 3. Add PartialAggType to record some type information before disassembling the aggregate 4. Refine and create a custom catalog function when translating AggregateFunction, without `finalizeForNereids` - Support explain plan 1. explain parsed plan select ... 2. explain analyzed plan select ... 3. explain rewritten/logical plan select ... 4. explain optimized/physical plan select ... 5. explain all plan select ...	2022-11-18 23:40:33 +08:00
minghong	c4bade71c8	[refactor](nereids) remove ColumnStatistics.UNKNOWN from StatsDerive (#14343 ) ColumnStatistics.UNKNOWN can be replaced by ColumnStatistics.DEFAULT	2022-11-18 23:40:00 +08:00
Xin Liao	a82896f420	[fix](broker-load) fix that broker load does not set be exec version and limit node channel memory (#14399 )	2022-11-18 23:38:37 +08:00
Xinyi Zou	21416f9947	[enhancement](memory) Support Jemalloc metrics and default allocator changed to Jemalloc (#14384 )	2022-11-18 21:02:54 +08:00
xueweizhang	68da6bccb7	[fix](type) fix DECIMAL scale when cast function on fe (#12877 ) before: MySQL [test]> select cast('135.759999999' as DECIMAL(10,3)); +----------------------------------------+ \| CAST('135.759999999' AS DECIMAL(10,3)) \| +----------------------------------------+ \| 135.759999999 \| +----------------------------------------+ 1 row in set (0.00 sec) now: MySQL [stage]> select cast('135.759999999' as DECIMAL(10,3)); +----------------------------------------+ \| CAST('135.759999999' AS DECIMAL(10,3)) \| +----------------------------------------+ \| 135.759 \| +----------------------------------------+ 1 row in set (0.01 sec)	2022-11-18 19:36:14 +08:00
carlvinhust2012	eab0af7afe	[optimization](array-type) optimize the export precision of floating point numbers (#14261 ) Co-authored-by: hucheng01 <hucheng01@baidu.com>	2022-11-18 18:24:11 +08:00
jiafeng.zhang	bd5882d08a	[fix](datax)doris writer write error (#14276 ) * doris writer write error	2022-11-18 18:20:13 +08:00
Pxl	734525de86	[Bug](runtime filter) fix minmax filter not copy rightly on shared hash join (#14367 ) fix minmax filter not copy rightly on shared hash join	2022-11-18 17:52:45 +08:00
Mingyu Chen	2c4236fd24	[improvement](ctas) use string type for varchar/char/string (#14382 ) When executing a create table as select statement, varchar/char/string columns in the created table are unified to the string type. When selecting from an external table (MySQL, PG, etc.), the length of varchar in the external database is measured in characters, not bytes. So a varchar(10) column in the external table would produce the same varchar(10) in the created table, but the byte length of the data in the external table may be larger than 10, causing the CTAS to fail. Changing to string does not impact performance or disk storage capacity. Note that if a string column is the first column, it is changed to varchar(65535), because a string column is not allowed as a sort key column.	2022-11-18 14:20:13 +08:00
Tiewei Fang	a1d02f36ac	[feature](table-valued-function) support `hdfs()` tvf (#14213 ) This pr does two things: 1. support `hdfs()` table valued function. 2. add regression test	2022-11-18 14:17:02 +08:00
starocean999	1f326fc0d6	[enhancement](be)limit mem cost to 16m when pre serialize keys in agg node (#14321 ) * [enhancement](be)limit mem cost to 16m when pre serialize keys in agg node * use only one chunk memory when serializing keys in agg node	2022-11-18 12:31:52 +08:00
morrySnow	7952bce03f	[compatibility](Nereids) process escape in string literal (#14294 )	2022-11-18 11:24:00 +08:00
谢健	9e25aa8d3e	[feature](Nereids): Add subgraph enumerator #14291 Add a subgraph enumerator to find the best plan. For DPHyp, we need an enumerator for all csg-cmp pairs to find the best plan.	2022-11-18 10:33:30 +08:00
Adonis Ling	2b6f85ab96	[chore](macOS) Fix BE UT (#14307 ) #13195 left some unresolved issues. One of them is that some BE unit tests fail. This PR fixes this issue. Now, we can run the command ./run-be-ut.sh --run successfully on macOS.	2022-11-18 10:13:38 +08:00
morrySnow	da0b09caea	[fix](Nereids) DateTimeType migrate to DateType is wrong when hour, minute and second all zero (#14327 ) 1. fix DateTimeType migrate to DateType is wrong when hour, minute and second all zero 2. add TPC-H regression test with DATEV2 type	2022-11-18 01:38:03 +08:00
Xinyi Zou	bd5a593403	[enhancement](memtracker) Use proc/meminfo MemAvailable to control memory and optimize MemTracker log printing (#14335 )	2022-11-17 22:46:07 +08:00
Xin Liao	fb140d0180	[Enhancement](sequence-column) optimize the use of sequence column (#13872 ) When you create the Uniq table, you can specify the mapping of sequence column to other columns. You no longer need to specify mapping column when importing.	2022-11-17 22:39:09 +08:00
spaces-x	1a035e2073	[fix](profile)(AggNode) fix the GetResultsTime is always zero (#14366 ) add scoped_timer in _serialize_with_serialized_key_result	2022-11-17 22:30:21 +08:00
Gabriel	50bfd99b59	[feature](join) support nested loop semi/anti join (#14227 )	2022-11-17 22:20:08 +08:00
HappenLee	d5af4f6558	[Nereids](Profile) Add projection timer for Nereids (#14286 )	2022-11-17 22:17:55 +08:00
Mingyu Chen	8fe5211df4	[improvement](multi-catalog)(cache) invalidate catalog cache when refresh (#14342 ) Invalidate catalog/db/table cache when doing refresh catalog/db/table. Tested table with 10000 partitions. The refresh operation will cost about 10-20 ms.	2022-11-17 20:47:46 +08:00
Jibing-Li	ccf4db394c	[feature-wip](multi-catalog) Collect external table statistics (#14160 ) Collect HMS external table statistic information through external metadata. Insert the result into __internal_schema.column_statistics using insert into SQL.	2022-11-17 20:41:09 +08:00
Ashin Gau	44ee4386f7	[test](multi-catalog)Regression test for external hive orc table (#13762 ) Add regression test for external hive orc table. This PR generates all basic types supported by hive orc and creates a hive external table to exercise them in a docker environment. Functions to be tested: 1. Ensure that all types are parsed correctly 2. Ensure that the null map of all types is parsed correctly 3. Ensure that the `SearchArgument` of `OrcReader` works well 4. Only select partition columns	2022-11-17 20:36:02 +08:00
Kikyou1997	98956dfa19	[fix](statistics) statistics inaccurate after analyze same table more than once (#14279 ) If a table has already been analyzed and is analyzed again, the new statistics are larger than expected: the increment includes the values from the table-level statistics, because the SQL lacks a predicate on the nullability of part_id.	2022-11-17 20:18:14 +08:00
TengJianPing	a382bb95e7	[fix](runtimefilter) fix heap-use-after-free of runtime filter merge (#14362 )	2022-11-17 19:38:45 +08:00
yiguolei	dba19e591c	[cherry-pick](scanner) using avg rowset to calculate batch size instead of using total_bytes since it costs a lot of cpu (#14345 ) Co-authored-by: yiguolei <yiguolei@gmail.com>	2022-11-17 18:57:21 +08:00
slothever	6da2948283	[feature-wip](multi-catalog) support iceberg v2(step 1) (#13867 ) Support position delete(part of).	2022-11-17 17:56:48 +08:00
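
Note on f5f2e84e31 (#12478): the limit resolution described in that commit message can be sketched roughly as below. This is an illustrative Python sketch, not the Doris planner code. The function name `effective_order_by_limit` and its parameters are hypothetical, and the precedence (an explicit non-negative LIMIT first, then a non-negative DEFAULT_ORDER_BY_LIMIT, then Long.MAX_VALUE) is an assumption drawn from the wording of the message.

```python
from typing import Optional

# Java's Long.MAX_VALUE, which the commit message names as the fallback.
LONG_MAX_VALUE = 2**63 - 1


def effective_order_by_limit(explicit_limit: Optional[int],
                             default_order_by_limit: int = -1) -> int:
    """Hypothetical helper: resolve the row limit for an ORDER BY query."""
    # Assumption: a non-negative LIMIT written in the statement takes precedence.
    if explicit_limit is not None and explicit_limit >= 0:
        return explicit_limit
    # Assumption: a non-negative session variable counts as "set".
    if default_order_by_limit >= 0:
        return default_order_by_limit
    # No LIMIT (or a negative one) and no session default: effectively unlimited.
    return LONG_MAX_VALUE
```

With the default session value of -1, a query without LIMIT resolves to `LONG_MAX_VALUE`. Setting the session value to 65535 in this sketch would reproduce the old cap for queries that carry no LIMIT.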

7246 Commits