doris

Author	SHA1	Message	Date
caiconghui	5d40218ae6	[Config] Support max_stream_load_timeout_second config in fe (#3902 ) This configuration limits the timeout that can be set for a stream load. It prevents failed stream load transactions from staying uncanceled for a long time because the user set a large timeout.	2020-06-19 17:09:27 +08:00
Mingyu Chen	51367abce7	[Bug] Fix bug that BE crash when doing Insert Operation (#3872 ) Main changes: 1. Fix the bug in `update_status(status)` of `PlanFragmentExecutor`. 2. When the FE Coordinator executes `execRemoteFragmentAsync()` and finds an RPC error, return a Future with an error code instead of throwing an exception. 3. Protect the `_status` in RuntimeState with a lock. 4. Move the `_runtime_profile` of RuntimeState before the `_obj_pool`, so that the profile is destructed after the object pool. 5. Remove the unused `ObjectPool` param in the RuntimeProfile constructor. If it is not removed, RuntimeProfile will depend on the `_obj_pool` in RuntimeState.	2020-06-19 17:09:04 +08:00
Yunfeng,Wu	355df127b7	[Doris On ES] Support fetching the `_id` field from ES (#3900 ) More information can be found: https://github.com/apache/incubator-doris/issues/3901 The created ES external table must contain an `_id` column if you want to fetch the Elasticsearch document `_id`. ``` CREATE EXTERNAL TABLE `doe_id2` ( `_id` varchar COMMENT "", `city` varchar COMMENT "" ) ENGINE=ELASTICSEARCH PROPERTIES ( "hosts" = "http://10.74.167.16:8200", "user" = "root", "password" = "root", "index" = "doe", "type" = "doc", "version" = "6.5.3", "enable_docvalue_scan" = "true", "transport" = "http" ); ``` Query: ``` mysql> select * from doe_id2 limit 10; +----------------------+------+ | _id | city | +----------------------+------+ | iRHNc3IB8XwmcbhB7lEB | gz | | jBHNc3IB8XwmcbhB71Ef | gz | | jRHNc3IB8XwmcbhB71GI | gz | | jhHNc3IB8XwmcbhB71Hx | gz | | ThHNc3IB8XwmcbhBkFHB | sh | | TxHNc3IB8XwmcbhBkFH9 | sh | | URHNc3IB8XwmcbhBklFA | sh | | ahHNc3IB8XwmcbhBxlFq | gz | | axHNc3IB8XwmcbhBxlHw | gz | | bxHNc3IB8XwmcbhByVFO | gz | +----------------------+------+ ``` NOTICE: This changes the column name format to support column names that start with "_".	2020-06-19 17:07:07 +08:00
lichaoyong	e0461cc7f4	[bug] Make compaction metrics values correct (#3903 ) Currently _input_rowsets is cleared when gc_used_rowsets() is called. Metrics calculated after that point are therefore wrong.	2020-06-19 11:22:06 +08:00
xy720	1d9fa5071d	[BUG][Broker] Fix broker read buffer size from input stream (#3881 ) This commit fixes a bug where the broker cannot read the full requested buffer length when the buffer size is set larger than 128K. The bug causes the data size returned by a pread request to always be less than 128K.	2020-06-19 09:33:09 +08:00
yangzhg	5a253bc2c6	[BE][Tool] Add segment v2 footer meta viewer (#3822 ) Add segment v2 footer meta viewer tool	2020-06-19 09:32:11 +08:00
Binglin Chang	ca96ea3056	[Memory Engine] MemTablet creation and compatibility handling in BE (#3762 )	2020-06-18 09:56:07 +08:00
ZhangYu0123	2f99f632e8	Modify docs format (#3896 )	2020-06-18 09:43:28 +08:00
EmmyMiao87	a62cebfccf	Forbid float columns in short key (#3812 ) * Forbid float columns in short key When the user does not specify the short key columns, Doris supplements them automatically. However, Doris does not support float or double as a short key column, so it must avoid choosing such columns when supplementing the short key. The short key may contain no more than 3 columns and no more than 36 bytes. CreateMaterializedView, AddRollup and CreateDuplicateTable must all forbid float columns in the short key. If a float or double column is encountered during the supplement process, it cannot join the short key, and all subsequent columns are value columns. At the same time, Doris requires at least one short key column, so the first column cannot be float or double. If a varchar column is a short key column, it can only be the last short key column. Fixed #3811 For a duplicate table without order by columns, the order by columns are the same as the short key columns. If the order by columns have been designated, the count of short key columns must be <= the count of order by columns.	2020-06-17 14:16:48 +08:00
lichaoyong	e9f7576b9d	[Enhancement] make metrics api more clear (#3891 )	2020-06-17 12:17:54 +08:00
Yunfeng,Wu	c6f2b5ef0d	[Doris On ES][Docs] Refactor documentation for doe (#3867 )	2020-06-17 10:54:28 +08:00
Mingyu Chen	d659167d6d	[Planner] Set MysqlScanNode's cardinality to avoid unexpected shuffle join (#3886 )	2020-06-17 10:53:36 +08:00
Mingyu Chen	a2df29efe9	[Bug][RoutineLoad] Fix bug that exception thrown when txn of a routineload task become visible (#3890 )	2020-06-17 10:52:51 +08:00
WingC	bfbe22526f	Show create table result with bitmap column should not return default value (#3882 )	2020-06-17 09:43:17 +08:00
lichaoyong	ae7028bee4	[Enhancement] Replace N/A with NULL in ShowStmt result (#3851 )	2020-06-17 09:41:51 +08:00
Mingyu Chen	0224d49842	[Fix][Bug] Fix compile bug (#3888 ) Co-authored-by: chenmingyu <chenmingyu@baidu.com>	2020-06-16 18:42:04 +08:00
lichaoyong	6c4d7c60dd	[Feature] Add QueryDetail to store query statistics. (#3744 ) 1. Store the query statistics in memory. 2. Support a RESTful interface to get the statistics.	2020-06-15 18:16:54 +08:00
Mingyu Chen	2211cb0ee0	[Metrics] Add metrics document and 2 new metrics of TCP (#3835 )	2020-06-15 09:48:09 +08:00
Yingchun Lai	3c09e1e1d8	[trace] Adapt trace util to compaction module (#3814 ) The trace util is helpful for diagnosing compaction performance problems. For base compaction, the trace log looks like: ``` W0610 11:26:33.804431 56452 storage_engine.cpp:552] Trace: 0610 11:23:03.727535 (+ 0us) storage_engine.cpp:554] start to perform base compaction 0610 11:23:03.728961 (+ 1426us) storage_engine.cpp:560] found best tablet 546859 0610 11:23:03.728963 (+ 2us) base_compaction.cpp:40] got base compaction lock 0610 11:23:03.729029 (+ 66us) base_compaction.cpp:44] rowsets picked 0610 11:24:51.784439 (+108055410us) compaction.cpp:46] got concurrency lock and start to do compaction 0610 11:24:51.784818 (+ 379us) compaction.cpp:74] prepare finished 0610 11:26:33.359265 (+101574447us) compaction.cpp:87] merge rowsets finished 0610 11:26:33.484481 (+125216us) compaction.cpp:102] output rowset built 0610 11:26:33.484482 (+ 1us) compaction.cpp:106] check correctness finished 0610 11:26:33.513197 (+ 28715us) compaction.cpp:110] modify rowsets finished 0610 11:26:33.513300 (+ 103us) base_compaction.cpp:49] compaction finished 0610 11:26:33.513441 (+ 141us) base_compaction.cpp:56] unused rowsets have been moved to GC queue Metrics: {"filtered_rows":0,"input_row_num":3346807,"input_rowsets_count":42,"input_rowsets_data_size":1256413170,"input_segments_num":44,"merge_rowsets_latency_us":101574444,"merged_rows":0,"output_row_num":3346807,"output_rowset_data_size":1228439659,"output_segments_num":6} ``` For cumulative compaction, it looks like: ``` W0610 11:14:18.714366 56468 storage_engine.cpp:518] Trace: 0610 11:14:08.068484 (+ 0us) storage_engine.cpp:520] start to perform cumulative compaction 0610 11:14:08.069844 (+ 1360us) storage_engine.cpp:526] found best tablet 547083 0610 11:14:08.069846 (+ 2us) cumulative_compaction.cpp:42] got cumulative compaction lock 0610 11:14:08.069947 (+ 101us) cumulative_compaction.cpp:46] calculated cumulative point 0610 11:14:08.070141 (+ 194us) cumulative_compaction.cpp:50] rowsets picked 0610 11:14:08.070143 (+ 2us) compaction.cpp:46] got concurrency lock and start to do compaction 0610 11:14:08.070518 (+ 375us) compaction.cpp:74] prepare finished 0610 11:14:15.389893 (+7319375us) compaction.cpp:87] merge rowsets finished 0610 11:14:15.390916 (+ 1023us) compaction.cpp:102] output rowset built 0610 11:14:15.390917 (+ 1us) compaction.cpp:106] check correctness finished 0610 11:14:15.409460 (+ 18543us) compaction.cpp:110] modify rowsets finished 0610 11:14:15.409496 (+ 36us) cumulative_compaction.cpp:55] compaction finished 0610 11:14:15.410138 (+ 642us) cumulative_compaction.cpp:65] unused rowsets have been moved to GC queue Metrics: {"filtered_rows":0,"input_row_num":136707,"input_rowsets_count":302,"input_rowsets_data_size":76617836,"input_segments_num":302,"merge_rowsets_latency_us":7319372,"merged_rows":0,"output_row_num":136707,"output_rowset_data_size":53893280,"output_segments_num":1} ```	2020-06-13 19:31:51 +08:00
Mingyu Chen	b3811f910f	[Spark load][Fe 4/6] Add hive external table and update hive table syntax in loadstmt (#3819 ) * Add hive external table and update hive table syntax in loadstmt * Move check hive table from SelectStmt to FromClause and update doc * Update hive external table en sql reference	2020-06-13 16:28:24 +08:00
WingC	414a0a35e5	[Dynamic Partition] Use ZonedDateTime to support set timezone (#3799 ) This CL mainly supports time zones in dynamic partition: 1. Use the new Java Time API to replace Calendar. 2. Support setting the time zone in dynamic partition parameters.	2020-06-13 16:27:09 +08:00
Mingyu Chen	b8ee84a120	[Doc] Add docs to OLAP_SCAN_NODE query profile (#3808 )	2020-06-13 16:25:40 +08:00
caiconghui	6928c72703	Optimize the logic for getting TabletMeta from TabletInvertedIndex to reduce frequency of getting read lock (#3815 ) This PR optimizes how TabletMeta is fetched from TabletInvertedIndex so that the read lock is acquired less often.	2020-06-13 12:46:59 +08:00
sdgshawn	61be7132a9	fix for be server crash which throwing syntax error when parse json … (#3846 ) Fix a BE server crash caused by a syntax error thrown when parsing JSON from a Kafka message.	2020-06-13 12:45:16 +08:00
HuangWei	38b6d291f1	[Bug] fix uninitialized member vars (#3848 ) This fix is based on the UBSAN unit tests. If class objects are created and used in a different way, runtime errors such as "load of value XX, which is not a valid value for type 'YYY'" may appear again. Unit tests should be built in DEBUG or XXSAN mode (at least DEBUG). RELEASE mode adds -DNDEBUG, which turns off dchecks/asserts/debug.	2020-06-13 12:44:49 +08:00
HuangWei	83d39ff9c9	Avoid passing NULL to memcmp() (#3844 ) Executing "StringVal(len=0, ptr="") == StringVal(len=0,ptr=NULL)" passes a NULL ptr to memcmp(). This should be avoided.	2020-06-13 12:43:41 +08:00
HappenLee	dac156b6b1	[Spill To Disk] Analytic_Eval_Node Support Spill Disk and Del Some Useless Code (#3820 ) * 1. Add enable spilling to the query options and support spilling to disk in Analytic_Eval_Node. FE can turn spilling on with set enable_spilling = true; now both Sort Node and Analytic_Eval_Node can spill to disk. 2. Delete the merge_sorter code that is no longer used. 3. Replace buffered_tuple_stream with buffered_tuple_stream2 in Analytic_Eval_Node to support spilling to disk, and delete the useless code of buffered_block_mgr and buffered_tuple_stream. 4. Add a DataStreamRecvr profile, and move the counters that belong to DataStreamRecvr from the fragment to the DataStreamRecvr profile to make the running profile clearer. * Change some hints in code * Replace disable_spill with enable_spill, which is more compatible with FE	2020-06-13 10:19:02 +08:00
wyb	44dbdf4986	Update hive external table en sql reference	2020-06-12 21:38:05 +08:00
HuangWei	88a5429165	[FE] Add db&tbl info in broker load log (#3837 ) The stream load log in FE has db & tbl info, so the broker load log should have it too.	2020-06-12 20:54:41 +08:00
Dayue Gao	7591527977	[Bug] Fix a bug that insert null bitmap crashes BE (#3830 ) INSERT INTO VALUES to_bitmap('xx') may insert null into bitmap column, which may cause dirty data to be written.	2020-06-12 18:03:02 +08:00
hffariel	75f4df400e	Stop travis building when an error occurred (#3838 ) Co-authored-by: fariel huang <farielclaire@gmail.com>	2020-06-12 09:16:01 +08:00
wyb	7f7ee63723	Move check hive table from SelectStmt to FromClause and update doc	2020-06-11 16:53:41 +08:00
EmmyMiao87	2ce2cf78ac	Remove unused import (#3826 ) Change-Id: Ic6ef5a0d372a9b17ffa21cffb9027d2d7e856474	2020-06-11 11:44:51 +08:00
Mingyu Chen	8d11ad3a16	[Doc] Fix website doc error (#3823 )	2020-06-11 10:01:54 +08:00
wfjcmcb	86d235a76a	[Extension] Logstash Doris output plugin (#3800 ) This plugin lets Logstash output data to Doris. It uses the HTTP protocol to interact with the Doris FE HTTP interface and loads data through Doris's stream load.	2020-06-11 08:54:51 +08:00
HuangWei	8caedadb67	use scoped_refptr to new HashIndex (#3818 )	2020-06-10 23:47:10 +08:00
yangzhg	cd402a6827	[Restore] Fix mismatched error message of restore job when job times out (#3798 ) In the current code, a restore job that times out is reported as canceled by the user. This error message is very misleading.	2020-06-10 23:12:04 +08:00
HaiBo Li	ef94c25773	[Bug] Fix the crash of checksum task #3735 (#3738 ) 1. The table includes a key column of double/float type. 2. When the checksum task runs, it uses all of the key columns to compare. 3. schema.column(idx) of a double/float type column is NULL. #3735	2020-06-10 22:59:15 +08:00
ChenXiaofei	4adc9d45c2	[Doc] Update ALTER TABLE.md	2020-06-10 22:58:29 +08:00
yangzhg	de91037d8c	[Doc]Add some routine load docs (#3796 ) Add some documentation about using routine load in the cloud environment	2020-06-10 22:57:00 +08:00
EmmyMiao87	4cb5f7a535	[Config]Remove max_user_connections from config (#3805 ) Update max_user_connections by user property: ``` set property `user` max_user_connections=100; ```	2020-06-10 22:56:05 +08:00
wyb	4c2e73a5fe	Add hive external table and update hive table syntax in loadstmt	2020-06-10 16:32:32 +08:00
Yunfeng,Wu	8c608bbad5	[Doris On ES] Skip function_call expr when process predicate (#3813 ) Fixed #3801 Do not push down function_call expressions such as split_xxx when processing predicates; Doris BE is responsible for processing these predicates. All rows in table: ``` +------+------+------+------------+------------+ | k1 | k2 | k3 | UpdateTime | ArriveTime | +------+------+------+------------+------------+ | NULL | NULL | kkk1 | 123456789 | NULL | | kkk1 | NULL | NULL | 123456789 | NULL | | NULL | kkk2 | NULL | 123456789 | NULL | +------+------+------+------------+------------+ ``` The following predicates cannot be pushed down to ES. ``` SQL 1: mysql> select * from (select split_part(k1, "1", 1) as kk from case_replay_for_milimin) t where t.kk is not null; +------+ | kk | +------+ | kkk | +------+ 1 row in set (0.02 sec) SQL 2: mysql> select * from (select split_part(k1, "1", 1) as kk from case_replay_for_milimin) t where t.kk > 'a'; +------+ | kk | +------+ | kkk | +------+ SQL 3: mysql> select * from (select split_part(k1, "1", 1) as kk from case_replay_for_milimin) t where t.kk > '2'; +------+ | kk | +------+ | kkk | +------+ 1 row in set (0.03 sec) ```	2020-06-10 11:22:53 +08:00
Mingyu Chen	4fa9d8cbe9	[Spark load][Fe 3/5] Fe create job (#3715 ) * Add create spark load job * Remove unused import	2020-06-09 21:57:46 +08:00
Mingyu Chen	5b1589498a	[Bug] Fix SchemaChangeJobV2's meta persist bug (#3804 ) 1. The field `partitionIndexMap` was missing in SchemaChangeJobV2. 2. The Pair in field `indexSchemaVersionAndHashMap` cannot be persisted by GSON. 3. Exit the FE process when replaying the edit log fails. Fix: #3802	2020-06-09 21:55:46 +08:00
Yunfeng,Wu	acd7a58875	[Doris On ES] [1/3] Add ES QueryBuilders for debug mode (#3774 )	2020-06-09 16:45:16 +08:00
Mingyu Chen	8ada2559b7	[Bug] Fix bug that checkpoint thread failed to start (#3795 ) 1. Set thread id before starting the checkpoint thread 2. Init the CHECKPOINT catalog instance before visiting it.	2020-06-08 23:00:36 +08:00
kangkaisen	559714f3d4	Fix largeint max min bug (#3793 )	2020-06-08 21:01:30 +08:00
Yingchun Lai	e4dc2ec440	[StorageEngine] Make StorageEngine::open return more detailed info (#3761 ) StorageEngine::open returns only a very vague status when it fails, so we have to check the logs to find the root cause, which is inconvenient when unit tests run in CI dockers. It is better to return more detailed failure info that points to the root cause; for example, it may return an error status with the message "file descriptors limit is too small".	2020-06-07 10:21:33 +08:00
kangkaisen	928379c5d8	[Bug] Fix colocate group replay NPE (#3790 ) Group id should also be persisted for replaying	2020-06-07 10:20:22 +08:00
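
The NULL-pointer problem described in commit 83d39ff9c9 (#3844) can be illustrated with a minimal sketch. It is not the Doris implementation: `StringValSketch` and `equals` are hypothetical names introduced here, standing in for a length-plus-pointer string value. The idea shown is only that an empty value is handled before memcmp() is reached, so a NULL pointer never gets passed to it.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for a length-plus-pointer string value.
// This is an illustration only, not the type used in Doris.
struct StringValSketch {
    std::size_t len;
    const char* ptr;
};

// Compare two values without ever handing a null pointer to memcmp().
// The C library requires valid pointers for memcmp() even when the
// length is 0, which is why sanitizers flag the unguarded call.
inline bool equals(const StringValSketch& a, const StringValSketch& b) {
    if (a.len != b.len) {
        return false;  // different lengths can never be equal
    }
    if (a.len == 0) {
        return true;   // both empty: the pointers are irrelevant
    }
    return std::memcmp(a.ptr, b.ptr, a.len) == 0;
}
```

With this guard, an empty value backed by `""` and an empty value backed by a NULL pointer compare equal, and memcmp() only runs when both lengths are non-zero.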

6608 Commits