* [Bug] Filter out unavailable backends when getting scan range locations
In the previous implementation, non-surviving BEs were eliminated in the Coordinator phase.
But the Spark and Flink Connectors have no such logic, so when a BE node is down,
queries issued through the Connector fail (a sketch of the filtering step follows below).
* fix ut
* fix compile
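To make the fix above concrete, here is a hypothetical sketch of the filtering step, assuming scan-range locations carry a backend id and the set of alive backends is known; the struct and function names are illustrative, not Doris's actual types:
```
#include <cstdint>
#include <string>
#include <unordered_set>
#include <vector>

// Illustrative stand-in for a scan range location entry.
struct ScanRangeLocation {
    int64_t backend_id;
    std::string host;
};

// Drop replicas that live on backends which are currently down, so a
// Connector never schedules a scan against an unreachable node.
std::vector<ScanRangeLocation> filter_alive(
        const std::vector<ScanRangeLocation>& locations,
        const std::unordered_set<int64_t>& alive_backends) {
    std::vector<ScanRangeLocation> out;
    for (const auto& loc : locations) {
        if (alive_backends.count(loc.backend_id)) out.push_back(loc);
    }
    return out;
}
```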
refactor runtime filter bloomfilter and eliminate some virtual function calls, which yields a performance improvement of about 5%
introduce a block Bloom filter; the AVX version yields about a 40% performance improvement (see the sketch after the list below)
before: BloomFilter size: default, ~20 million items cost about 1s400ms
after: BloomFilter size: 524288, ~20 million items cost about 400ms
1. support IN/BloomFilter/MinMax runtime filters
2. support broadcast/shuffle/bucket shuffle/colocate join
3. optimize memory use and CPU cache misses while building the runtime filter
4. optimize memory use in left semi join (works well on TPC-DS query 95)
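To illustrate the block Bloom filter idea above, here is a minimal sketch of the register-blocked design (following the classic cache-efficient Bloom filter layout also used by Impala); the class and salts below are illustrative, not Doris's actual code. Each key maps to one 32-byte block, so a lookup touches a single cache line, and with AVX2 the eight lanes can be probed in one instruction:
```
#include <cstddef>
#include <cstdint>
#include <vector>

class BlockBloomFilter {
public:
    explicit BlockBloomFilter(size_t num_blocks) : _blocks(num_blocks) {}

    void insert(uint32_t hash) {
        Block& b = _blocks[hash % _blocks.size()];
        // Set one bit per 32-bit lane, all within the same 32-byte block.
        for (int i = 0; i < 8; ++i) {
            b.lanes[i] |= 1u << ((hash * SALTS[i]) >> 27);
        }
    }

    bool find(uint32_t hash) const {
        const Block& b = _blocks[hash % _blocks.size()];
        for (int i = 0; i < 8; ++i) {
            if ((b.lanes[i] & (1u << ((hash * SALTS[i]) >> 27))) == 0) {
                return false;
            }
        }
        return true;
    }

private:
    // A block spans one 32-byte region: eight 32-bit lanes, each of which
    // gets exactly one bit per key, so a probe never crosses a cache line.
    struct alignas(32) Block {
        uint32_t lanes[8] = {0};
    };

    // Odd multiplicative salts from the classic block Bloom filter design.
    static constexpr uint32_t SALTS[8] = {
            0x47b6137bU, 0x44974d91U, 0x8824ad5bU, 0xa2b7289dU,
            0x705495c7U, 0x2df1424bU, 0x9efc4947U, 0x5c6bfb31U};

    std::vector<Block> _blocks;
};
```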
`random_shuffle` will generate the same random sequence when called multiple times;
although we shuffle twice, when the relative order of adjacent numbers does not change,
the result of the second shuffle will not change either.
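A minimal sketch of the usual remedy, assuming we control the call site: `std::random_shuffle` draws from a shared deterministic source (typically `rand()`), so repeated calls can replay the same permutation, while `std::shuffle` with a properly seeded engine does not. The function name here is hypothetical:
```
#include <algorithm>
#include <cstdint>
#include <random>
#include <vector>

// Shuffle a list of ids differently on every call by using a dedicated,
// properly seeded per-thread engine instead of std::random_shuffle.
void shuffle_ids(std::vector<int64_t>& ids) {
    static thread_local std::mt19937 gen{std::random_device{}()};
    std::shuffle(ids.begin(), ids.end(), gen);
}
```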
To avoid showing too many MemTrackers on the BE web pages,
the MemTracker hierarchy now has three levels: OVERVIEW, TASK, and VERBOSE.
OVERVIEW is mainly used for the main memory-consuming modules such as Query/Load/Metadata.
TASK is mainly used to record the memory overhead of a single task, such as a single query, load, or compaction task.
VERBOSE is used for other, more detailed MemTrackers.
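A hypothetical sketch of how level-based filtering could work, assuming trackers carry an ordered level enum; the names are illustrative, not Doris's actual MemTracker API:
```
#include <cstdint>
#include <string>
#include <vector>

enum class MemTrackerLevel { OVERVIEW, TASK, VERBOSE };

struct MemTrackerInfo {
    std::string label;
    MemTrackerLevel level;
    int64_t consumption;
};

// Render only trackers at or below the requested verbosity, so the BE web
// page can default to OVERVIEW and stay readable.
std::vector<MemTrackerInfo> visible_trackers(
        const std::vector<MemTrackerInfo>& all, MemTrackerLevel max_level) {
    std::vector<MemTrackerInfo> out;
    for (const auto& t : all) {
        if (t.level <= max_level) out.push_back(t);
    }
    return out;
}
```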
We create multiple rowset readers to read the data of one tablet;
after a rowset reader has reached EOF, it can be released to
reduce resource (typically memory) consumption.
Similarly, we can release a segment reader when it reaches EOF.
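A minimal sketch of the eager-release idea, assuming a reader interface with a `next_batch()` that returns false at EOF; the real Doris reader API differs:
```
#include <memory>
#include <vector>

struct RowsetReader {
    virtual ~RowsetReader() = default;
    virtual bool next_batch() = 0;  // returns false at EOF
};

void read_all(std::vector<std::unique_ptr<RowsetReader>>& readers) {
    for (auto& reader : readers) {
        while (reader->next_batch()) {
            // ... process the batch ...
        }
        // Release the reader as soon as it hits EOF so its buffers (and the
        // memory they pin) are freed before the whole scan finishes.
        reader.reset();
    }
}
```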
`SELECT ... INTO OUTFILE` currently only supports exporting data in CSV format.
This patch extends the feature to support the Parquet format.
Usage:
LocalFile:
```
SELECT citycode FROM table1 INTO OUTFILE "file:///root/doris/" FORMAT AS PARQUET PROPERTIES
("schema"="required,int32,siteid;", "parquet.compression"="snappy");
```
BrokerFile:
```
SELECT siteid FROM table1 INTO OUTFILE "hdfs://host/test_sql_prc_2019_02_19/" FORMAT AS PARQUET
PROPERTIES (
"broker.name" = "hdfs_broker",
"broker.hadoop.security.authentication" = "kerberos",
"broker.kerberos_principal" = "test",
"broker.kerberos_keytab_content" = "base64" ,
"schema"="required,int32,siteid;"
);
```
The `schema` property is required; it defines the schema of the Parquet file.
Properties with the `parquet.` prefix are Parquet file options, such as compression, version, and enable_dictionary.
The version information of a tablet is stored in memory
in an adjacency-graph data structure.
As new versions are written and old versions are deleted,
the graph begins to accumulate vertices with no associated edges (orphan vertices).
These orphan vertices should be removed.
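A minimal sketch of the pruning step, assuming the version graph is kept as a map from version number to its edge lists; Doris's actual structures differ:
```
#include <cstdint>
#include <map>
#include <vector>

struct Vertex {
    std::vector<int64_t> in_edges;
    std::vector<int64_t> out_edges;
};

// A vertex that no edge touches can never lie on a version path, so it is
// safe to drop it and reclaim the memory.
void prune_orphans(std::map<int64_t, Vertex>& graph) {
    for (auto it = graph.begin(); it != graph.end();) {
        if (it->second.in_edges.empty() && it->second.out_edges.empty()) {
            it = graph.erase(it);
        } else {
            ++it;
        }
    }
}
```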
1. The partitions set by the admin repair command are prioritized
to ensure that the tablets of these partitions can be repaired as soon as possible.
2. Add an FE metric `query_begin` to monitor the number of queries submitted to Doris.
Follow-up to PR #5792. This patch adds a new parameter `cache type` to distinguish the SQL cache from the partition cache.
When updating the SQL cache, we ensure that one SQL key has only one cached version.
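A hypothetical sketch of the single-version guarantee, assuming the SQL cache is keyed by the SQL text; the real cache key and value types are richer:
```
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>

struct CacheValue {
    int64_t version;
    std::string result;
};

// Overwriting via operator[] means one SQL key keeps exactly one cached
// version instead of accumulating stale ones.
void update_sql_cache(std::unordered_map<std::string, CacheValue>& cache,
                      const std::string& sql_key, CacheValue value) {
    cache[sql_key] = std::move(value);
}
```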
When parsing memory parameters in `ParseUtil::parse_mem_spec`, convert the percentage to `double` instead of `int`.
The currently affected parameters include `mem_limit` and `storage_page_cache_limit`.
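A minimal sketch of why the type matters, assuming a spec like `"1.5%"` scales the physical memory total; the real `parse_mem_spec` handles more formats:
```
#include <cstdint>
#include <string>

// Parsing into double keeps fractional percentages such as "1.5%" from
// being truncated before scaling; an int would round 1.5 down to 1.
int64_t parse_percent_spec(const std::string& spec, int64_t physical_mem) {
    double percent = std::stod(spec.substr(0, spec.size() - 1));  // drop '%'
    return static_cast<int64_t>(physical_mem * (percent / 100.0));
}
```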
According to LRU priority, the `lru list` is split into an `lru normal list` and an `lru durable list`,
and the two lists are traversed in sequence during LRU eviction, avoiding invalid scan cycles.
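A hypothetical sketch of the two-list eviction order, assuming entries are tagged normal or durable at insert time; the real cache tracks more state per entry:
```
#include <cstdint>
#include <list>

struct Entry {
    int64_t charge;
};

// Drain the normal list first; durable entries are touched only when
// evicting normal entries alone cannot free enough space, so the evictor
// no longer cycles uselessly past durable entries.
int64_t evict(std::list<Entry>& normal, std::list<Entry>& durable,
              int64_t bytes_needed) {
    int64_t freed = 0;
    for (auto* lru : {&normal, &durable}) {
        while (freed < bytes_needed && !lru->empty()) {
            freed += lru->front().charge;
            lru->pop_front();
        }
    }
    return freed;
}
```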
If a query exceeds its memory limit, detailed information about where the limit was exceeded is required.
However, it is not necessary to return the entire query stack to the end user;
the query stack only needs to be printed in the BE log.
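A hypothetical sketch of the reporting split, with `std::cerr` standing in for the BE log; the real code would use the logging framework:
```
#include <iostream>
#include <string>

// Log the full stack for operators, but hand the end user only a concise
// message about where the memory limit was exceeded.
std::string report_mem_exceeded(const std::string& detail,
                                const std::string& stack) {
    std::cerr << "query memory exceeded: " << detail << "\n" << stack << "\n";
    return "Memory limit exceeded: " + detail;
}
```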