PR #7936 changed some FE log levels to debug, so when an error happens it is not easy to find out
which SQL caused the error.
So I add the stmt id and query id to the error log, so that users can use these identifiers to find the SQL in fe.audit.log.
If a tableRef represents a CTE or a view,
the tableRef will be reset during semantic analysis.
The new tableRef needs to inherit the lateral view property of the original tableRef
to ensure that the lateral view is not accidentally lost during parsing.
Fix ltrim returning incorrect results in some cases.
According to https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html, for the built-in functions
`int __builtin_clz/__builtin_ctz (unsigned int x)`, the result is undefined if x is 0,
and gcc and clang actually return different values in that case.
So we handle the case of 0 separately (see the sketch below).
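A minimal sketch of the guard, using a hypothetical helper (not the actual Doris ltrim code):
```
#include <cstdint>

// Hypothetical helper for illustration: guard __builtin_ctz against the
// undefined x == 0 case before using it to locate the first non-matching
// byte in a bitmask produced by an ltrim-style scan.
inline int count_trailing_zeros_safe(uint32_t x) {
    // __builtin_ctz(0) is undefined behavior: gcc and clang may return
    // different values, which made the result differ between builds.
    if (x == 0) {
        return 32; // all 32 bits are zero
    }
    return __builtin_ctz(x);
}
```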
1. Reuse the Schema to avoid copying, because cloning a Schema generates a lot of sub Field objects (see the sketch below).
2. Call the interfaces provided by Block to reduce the number of code lines.
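For illustration, a minimal sketch of the first point, using hypothetical simplified types (not the actual Doris classes): sharing one immutable Schema instance instead of cloning it avoids re-creating every sub Field.
```
#include <memory>
#include <string>
#include <vector>

// Hypothetical simplified types, for illustration only.
struct Field { std::string name; };
struct Schema { std::vector<Field> fields; };

struct Reader {
    // Before: Schema schema_;           // clone: re-creates every sub Field
    // After:  share one immutable Schema across all readers.
    std::shared_ptr<const Schema> schema;
    explicit Reader(std::shared_ptr<const Schema> s) : schema(std::move(s)) {}
};
```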
This PR mainly changes:
1. Change the definition of PBlock
The new PBlock consists of a set of PColumnMeta and a binary buffer.
The PColumnMeta records the metadata of all columns in the Block,
while the buffer stores the serialized binary data of all columns.
2. Refactor the serialize/deserialize methods of the data types
Rewrite `serialize()/deserialize()` of IDataType, and add a new method
`get_uncompressed_serialized_bytes()` that returns the total length
of the uncompressed serialized data of a column.
3. Rewrite the serialize/deserialize methods of Block
Now, when serializing a Block to a PBlock, we first get the total length
of the uncompressed serialized data of all columns in the Block, then allocate
the buffer and write the serialized data into it (see the sketch after this list).
4. Use brpc attachment to transmit the serialized column data
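A minimal sketch of the two-pass scheme in point 3, using hypothetical simplified signatures (the real IDataType/Block/PBlock interfaces differ):
```
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical simplified column type, for illustration only.
struct Column {
    // Length the column needs in the output buffer, uncompressed.
    size_t get_uncompressed_serialized_bytes() const;
    // Writes the column at `dst` and returns the advanced pointer.
    char* serialize(char* dst) const;
};

// Two-pass serialization: size everything first, allocate once, then write.
std::string serialize_block(const std::vector<Column>& columns) {
    size_t total = 0;
    for (const auto& col : columns) {
        total += col.get_uncompressed_serialized_bytes(); // pass 1: sizing
    }
    std::string buffer(total, '\0'); // single allocation for all columns
    char* dst = buffer.data();
    for (const auto& col : columns) {
        dst = col.serialize(dst); // pass 2: write each column in order
    }
    return buffer;
}
```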
1. If the table or db has been dropped, acquiring the write lock fails and we either just skip or throw an exception.
2. If we recover a table or db, we must ensure that the dropped state is unmarked only after the recover journal has been written.
3. db.dropTable corresponds to db.createTable. I don't move the table.markDropped method into db.dropTable,
because all meta added to a db or catalog must come after the recover journal is written, so we must invoke the markDropped
and unmarkDropped methods outside the dropTable and createTable methods.
CMAKE_BUILD_DIR is set while building BE. "build.sh --clean" just
cleans and exits, but clean_be does not work without
CMAKE_BUILD_DIR set. This patch sets CMAKE_BUILD_DIR in clean_be
so that "build.sh --clean" works correctly.
Support implementing UDFs through the GRPC protocol. This brings several benefits:
1. The UDF implementation language is not limited to C++; users can use any language they are familiar with to implement a UDF.
2. The UDF is decoupled from Doris: a UDF cannot cause a Doris coredump, the UDF computing resources are separated from Doris, and the Doris services are not affected.
However, an RPC UDF has a fixed per-call overhead, so its performance is much slower than a C++ UDF, especially when the amount of data is large.
Create a function like:
```
CREATE FUNCTION rpc_add(INT, INT) RETURNS INT PROPERTIES (
"SYMBOL"="add_int",
"OBJECT_FILE"="127.0.0.1:9999",
"TYPE"="RPC"
);
```
The function service needs to implement the `check_fn` and `fn_call` methods, as in the sketch below.
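For illustration only, here is a minimal sketch of such a service in C++ with gRPC; the proto below is hypothetical, and the real Doris function-service proto and message types may differ:
```
// Hypothetical proto, for illustration (not the actual Doris definition):
//   service FunctionService {
//     rpc check_fn (CheckFnRequest) returns (CheckFnResponse);
//     rpc fn_call  (FnCallRequest)  returns (FnCallResponse);
//   }
#include <grpcpp/grpcpp.h>
#include "function_service.grpc.pb.h" // generated from the proto above

class FunctionServiceImpl final : public FunctionService::Service {
    // check_fn: verify the requested symbol exists with matching types.
    grpc::Status check_fn(grpc::ServerContext* ctx, const CheckFnRequest* req,
                          CheckFnResponse* resp) override {
        resp->set_ok(req->symbol() == "add_int");
        return grpc::Status::OK;
    }
    // fn_call: evaluate the function over a batch of argument values.
    grpc::Status fn_call(grpc::ServerContext* ctx, const FnCallRequest* req,
                         FnCallResponse* resp) override {
        for (int i = 0; i < req->lhs_size(); ++i) {
            resp->add_result(req->lhs(i) + req->rhs(i)); // add_int semantics
        }
        return grpc::Status::OK;
    }
};

int main() {
    FunctionServiceImpl service;
    grpc::ServerBuilder builder;
    // Matches the "OBJECT_FILE"="127.0.0.1:9999" address in the example above.
    builder.AddListeningPort("0.0.0.0:9999", grpc::InsecureServerCredentials());
    builder.RegisterService(&service);
    builder.BuildAndStart()->Wait();
}
```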
Note:
THIS IS AN EXPERIMENTAL FEATURE; THE INTERFACE AND DATA STRUCTURES MAY CHANGE IN THE FUTURE !!!
This PR mainly changes:
1. Fix a bug when `transfer_data_by_brpc_attachment` is enabled
In `data_stream_sender`, we send one serialized PRowBatch to multiple Channels.
If `transfer_data_by_brpc_attachment` is enabled, we mistakenly clear the data in the PRowBatch
after sending it to the first Channel.
As a result, the following Channels cannot receive the correct data, causing an error.
So I use a separate buffer instead of `tuple_data` in PRowBatch to store the serialized data
and reuse it across the channels.
2. Fix a bug where the offsets in a serialized row batch may overflow
Use int64 offsets instead of int32. For compatibility, add a new field `new_tuple_offsets` to PRowBatch (see the sketch below).
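A minimal sketch of the compatibility path on the receiving side, using a hypothetical simplified stand-in for the protobuf-generated PRowBatch:
```
#include <cstdint>
#include <vector>

// Hypothetical simplified stand-in for the protobuf-generated PRowBatch.
struct PRowBatchView {
    std::vector<int32_t> tuple_offsets;     // legacy 32-bit offsets
    std::vector<int64_t> new_tuple_offsets; // added to avoid int32 overflow
};

// Prefer the 64-bit offsets when the sender filled them in; fall back to
// the legacy 32-bit field so batches from old senders keep working.
std::vector<int64_t> read_offsets(const PRowBatchView& batch) {
    if (!batch.new_tuple_offsets.empty()) {
        return batch.new_tuple_offsets;
    }
    return {batch.tuple_offsets.begin(), batch.tuple_offsets.end()};
}
```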
This PR mainly includes the following two changes:
1. Shorten the FE single-measurement time
In Doris's FE unit tests, starting a Doris cluster is a time-consuming operation.
In this PR, the unit tests of some small features are merged into QueryPlanTest
so that they share one cluster,
avoiding the problem of the overall FE unit test time being too long.
2. Refine the logic of PR #7851
Although the feature is implemented correctly in PR #7851,
the logic is not concise enough.
This PR mainly simplifies the redundant code in terms of engineering implementation.
Change 1: Support an adaptive runtime filter: IN_OR_BLOOM_FILTER
The processing logic is (see the sketch below):
If the number of rows in the right table < runtime_filter_max_in_num, the IN predicate takes effect.
If the number of rows in the right table >= runtime_filter_max_in_num, the bloom filter takes effect.
Change 2: The default runtime filter is changed to IN_OR_BLOOM_FILTER
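A minimal sketch of the selection rule, using hypothetical names (the real runtime-filter code is more involved):
```
#include <cstddef>

// Hypothetical enum, for illustration only.
enum class RuntimeFilterType { IN_FILTER, BLOOM_FILTER };

// IN_OR_BLOOM_FILTER adapts on the build (right) side row count:
// small build sides get the cheap, exact IN predicate; large ones
// fall back to a bloom filter whose cost does not grow with the set size.
RuntimeFilterType choose_filter(size_t right_table_rows,
                                size_t runtime_filter_max_in_num) {
    if (right_table_rows < runtime_filter_max_in_num) {
        return RuntimeFilterType::IN_FILTER;
    }
    return RuntimeFilterType::BLOOM_FILTER;
}
```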
* [improvement](show) Support that users can use the show data skew statement instead of the admin one
This PR mainly does two things:
1. Support that users can use the show data skew statement instead of the admin one.
2. Fix the FE UT failures caused by PR [improvement](rewrite) Make RewriteDateLiteralRule to be compatible with mysql #7876 and PR [feature-wip](iceberg) Step1: Support create Iceberg external table #7391.
Co-authored-by: caiconghui1 <caiconghui1@jd.com>
The tuple ids of the empty set node must be exactly the same as the tuple ids of the original root node.
In the issue, we found that once the tree where the root node is located has a window function,
the tuple ids of the empty set node cannot be calculated correctly.
This PR mostly fixes that problem.
In order to calculate the correct tuple ids,
instead of taking the tuple ids from the SelectStmt.getMaterializedTupleIds() function as in the past,
we now directly use the tuple ids of the original root node.
Although we tried to fix #7929 by modifying the SelectStmt.getMaterializedTupleIds() function,
that method cannot get the tuple of the last correct window function.
So we use another way to construct the tuple ids of the empty set node.