2502 Commits

Author SHA1 Message Date
bba85fc352 Update routine-load-manual.md (#4911)
add key word for routine load
2020-11-17 10:21:53 +08:00
b48c768dc7 [ComplexType] Restructure storage type to support complex type expansion (#4905)
This CL includes:
* Change the column metadata to a tree structure.
* Refactor segment_v2.ColumnReader and segment_v2.ColumnWriter to support complex types.
* Implement the reading and writing of the array type.
2020-11-16 21:59:41 +08:00
448df42fb0 [Compatibility] Add table_privileges, schema_privileges and user_privileges tables (#4899)
Add privileges tables in information_schema database
2020-11-16 21:58:30 +08:00
55080ba888 [BUG] Fix colocate join memory limit problem (#4894)
In colocate join, the memory limit of each instance is usually less than the value of exec_mem_limit,
which could lead to query failure (Memory exceed limit).
Since the purpose of resetting colocate-join memory limit
(/fe/fe-core/src/main/java/org/apache/doris/qe/Coordinator.java) is unclear to me,
I just change the default value of query_colocate_join_memory_limit_penalty_factor from 8 to 1, as a hotfix.
2020-11-16 21:57:00 +08:00
c5e435146d [Refactor] Remove break label for readability (#4890)
Co-authored-by: tanhao <tanhao.0902@bytedance.com>
2020-11-16 21:56:10 +08:00
5aefd701cb [Improve] Modify the BE capacity calculation rule for isDecommissioned (#4889)
I use a containerized deployment of BE nodes, all sharing the same distributed disk.
When doing data migration, the current logic leads to errors.
For example, my distributed disk has a capacity of 10T, of which 9T is used by other services;
the current logic assumes that all 9T is used by BE nodes.
2020-11-16 21:55:35 +08:00
2af4bc294f [Bug] Java version of BitmapValue fails to deserialize when it contains only a 32-bit bitmap (#4884) 2020-11-16 21:54:07 +08:00
e706a6bca4 [Doc] Running Profile document add HASH_JOIN_NODE, etc. (#4878)
- Add `HASH_JOIN_NODE`, `CROSS_JOIN_NODE`, `UNION_NODE`, and `ANALYTIC_EVAL_NODE` to the Running Profile document.
- Add the `MaterializeExprsEvaluateTime` profile to `UNION_NODE`.
2020-11-16 21:53:25 +08:00
18a22bd347 [BUG] Fix field error in information_schema.columns (#4858) 2020-11-15 22:01:32 +08:00
aca9b2da82 [Bug] Fix bug introduced by split RowsDelFiltered profile (#4881)
A bug introduced by PR #4825 causes `schema_change` to report an error:
```
schema_change.cpp:1271] fail to check row num! source_rows=1, merged_rows=0, filtered_rows=0, new_index_rows=0
schema_change.cpp:1921] failed to process the version. version=2-2
schema_change.cpp:1615] failed to alter tablet. base_tablet=44643.1383650721.b140317f6662c1e0-65bcbc87db8d22bc, drop new_tablet=45680.1530531459.474e41f3dd538fb6-9284085daac24f83
```
2020-11-13 16:16:10 +08:00
69c422e31e [Bug] Fix bug #4886 and #4586 by refactoring code of method 'getDbs' (#4887)
fix issue #4886
2020-11-13 11:55:10 +08:00
e9923100f2 [Profile][UT] Fix UT and remove useless profile (#4879)
Fix the UT broken by #4825 and remove useless profile.
2020-11-12 16:28:57 +08:00
97867364e7 Revert "[FEATURE]Check date type to avoid scan all partitions (#4756)" (#4877)
This reverts commit c8df76a807b4856f71bcb6a3a023849f3bf294d7.

This commit has a problem when handling predicates like:
`k1 = "2020-10-10 10:00:00.000"`

This is a valid predicate, but FE Datetime does not support milliseconds or microseconds, so it treats the value as an invalid datetime.

So we revert it and may find a better solution later.
2020-11-12 13:52:10 +08:00
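The failure mode described in the revert above can be illustrated with a small sketch. This is not the FE code: the format strings and function names are assumptions made for illustration. It only shows how a second-precision validator rejects a valid literal that carries fractional seconds, while a lenient validator still rejects a truly invalid date.

```python
from datetime import datetime

# Hypothetical second-precision format (an assumption for this sketch).
STRICT_FORMAT = "%Y-%m-%d %H:%M:%S"
# Same format with an optional fractional-seconds suffix.
FRACTIONAL_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def is_valid_strict(literal: str) -> bool:
    """Accept only literals that match the second-precision format exactly."""
    try:
        datetime.strptime(literal, STRICT_FORMAT)
        return True
    except ValueError:
        return False


def is_valid_lenient(literal: str) -> bool:
    """Accept second-precision literals with or without fractional seconds."""
    for fmt in (STRICT_FORMAT, FRACTIONAL_FORMAT):
        try:
            datetime.strptime(literal, fmt)
            return True
        except ValueError:
            continue
    return False
```

With the strict check, `"2020-10-10 10:00:00.000"` would be classified as invalid even though it is a legitimate value, which is the kind of misclassification the revert message describes.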
796f44beac [Bug] Fix bug that routine load blocked with TOO_MANY_TASKS error (#4861)
When receiving an empty message from Kafka, the load process quits abnormally.
Fix #4860
2020-11-12 10:05:10 +08:00
1810f10497 [Bug] Fix bug that failed to create view with complex select stmt (#4840)
Fix bug that failed to create view with complex select stmt.
Fix #4839
2020-11-12 10:04:00 +08:00
a1ae399737 [Refactor] Refactor storage medium migration task process (#4475)
This CL refactors the storage medium migration task process in BE.
I did not modify the execution logic. I just extracted part of the logic
in the migration task and put it in task_work_pool.

In this way, the migration task is only used to process the migration
from the specified tablet to the specified data dir.

Later, we can use this task to migrate tablets between different disks. #4476
2020-11-12 10:00:43 +08:00
dd70653c91 [DOCS] Fix some docs typo (#4873) 2020-11-11 21:24:19 +08:00
1151a0063c [Bug] Make 'LastStartTime' in backends list as the actual BE start time (#4872)
We use 'LastStartTime' in the backends list to check whether there was an unexpected
restart of a BE, but it is reset to the BE's first heartbeat time after FE
restarts. It would be better to set it to the BE's actual start time.
2020-11-11 21:24:06 +08:00
4ccd7b84ad [Bug] Rename table logic error (#4870)
1. The rename table operation fails to drop the table with the old name in Catalog.
2. The rename table operation forgets to check rollup names.
2020-11-11 21:22:10 +08:00
74bc25ffe5 [Metrics] Add metric to monitor timeout canceled fragment count (#4862)
It would be helpful to monitor the count of fragments canceled due to timeout
when any issue causes fragments to fail or to stay queued for too
long.
2020-11-11 21:21:48 +08:00
f0e89395e6 [Bug] Fix DCHECK failed in group_concat (#4850)
issue:#4849
2020-11-11 21:21:37 +08:00
66132d2836 [Feature] Running Profile OLAP_SCAN_NODE layering and enhance readability (#4825)
Mainly includes:
- `OLAP_SCAN_NODE` profile layering: `OLAP_SCAN_NODE`, `OlapScanner`, and `SegmentIterator`.
- Delete meaningless statistical values, mainly in scan_node.cpp.
- Add the `RowsConditionsFiltered` statistic, split from `RowsDelFiltered`; it is the number of rows filtered by various column indexes, in segment V2 only.
- Modify the document based on the above, and enhance readability.
2020-11-11 21:21:25 +08:00
53c570a273 [Broker Load] Ignore empty file when file format is parquet or orc. (#4810)
We cannot read empty parquet or orc format files, so we should skip them
when doing broker load.
2020-11-10 10:55:48 +08:00
17af23edae 【Improvement】Avoid null host when forward to master (#4844)
Co-authored-by: wangxixu <wangxixu@xiaomi.com>
2020-11-10 10:54:29 +08:00
4204a878c8 [Bug] Fix some bugs of load job scheduler (#4869)
1. The fix load meta bug logic should be removed since 0.12.
2. The load task thread pool's waiting queue should be as long as the desired number of pending jobs.
3. Submit the load task outside the database lock to avoid holding the lock for a long time.
2020-11-10 10:50:31 +08:00
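Item 3 of the load job scheduler fix above (submitting outside the database lock) follows a common pattern that can be sketched as below. This is an illustrative analog, not the Doris FE code: `schedule_pending`, `run_load`, and the lock and list arguments are hypothetical names, and Python's `ThreadPoolExecutor` stands in for the load task thread pool.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_load(job):
    """Stand-in for the real load task (hypothetical)."""
    return f"{job}: done"


def schedule_pending(db_lock, pending_jobs, executor):
    """Snapshot pending jobs under the lock, then submit them after releasing it."""
    # Hold the lock only long enough to copy and clear the pending list.
    with db_lock:
        batch = list(pending_jobs)
        pending_jobs.clear()
    # Submission can block or be slow, so it happens outside the lock.
    return [executor.submit(run_load, job) for job in batch]
```

The lock protects only the shared list, so other threads that need the same lock are not stalled while tasks are handed to the pool.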
9aa3d61dc0 [refactor] change http server log level (#4853)
* Change some log levels
2020-11-08 20:53:36 +08:00
04cfcf6c36 Update fe-idea-dev.md (#4852)
* Update fe-idea-dev.md

use `brew install thrift@0.9` to install thrift 0.9.3.1
`brew edit thrift090 | head` shows thrift@0.9 uses thrift 0.9.3.1

* [Refactor] Remove the unnecessary if statement

Future<?> submit(Runnable task)
Submits a Runnable task for execution and returns a Future representing that task. The Future's get method will return null upon successful completion.
2020-11-08 20:52:15 +08:00
59c7d5021d [Bug][Load] Catch retry submit exception (#4796)
When the `Load Job Task Queue` is full, submitting more jobs to the queue causes a
`RejectedExecutionException`.
But the `callback.onTaskFailed` function does not catch the exception, so
re-submitting the job fails and its status is not updated to failed.
issue: #4795
2020-11-08 20:50:50 +08:00
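The retry problem in #4796 above can be sketched with a Python analog. This is an assumption-laden illustration, not the Java fix: `queue.Full` plays the role of `RejectedExecutionException`, and `LoadJob`, `submit`, and `on_task_failed` are hypothetical names.

```python
import queue


class LoadJob:
    """Minimal job record with a state field (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.state = "PENDING"


def submit(task_queue, job):
    # put_nowait raises queue.Full when the bounded queue has no free slot.
    task_queue.put_nowait(job)


def on_task_failed(task_queue, job):
    """Retry by re-submitting; mark the job failed if the queue rejects it."""
    try:
        submit(task_queue, job)
        job.state = "RETRYING"
    except queue.Full:
        # Without this handler the exception would escape the callback
        # and the job would never reach a terminal state.
        job.state = "FAILED"
```

Catching the rejection inside the callback is what lets the job status reach a failed state instead of being left stale.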
f40868a480 [Optimize] Improve LRU cache's performance (#4781)
When LRUCache inserts and evicts a large number of entries, there are
frequent calls to HandleTable::remove(e->key, e->hash), which looks up
the entry in the hash table. Since we already know the entry 'e' to
remove, we can remove it directly from the hash table's collision list
if it is a doubly linked list.
This patch refactors the collision list into a doubly linked list; the simple
benchmark CacheTest.SimpleBenchmark shows that the time cost is reduced by about
18% in my test environment.
2020-11-06 10:56:27 +08:00
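The idea behind #4781 above can be sketched as follows. This is a simplified Python model, not the BE's C++ implementation: the class layout, the sentinel head node per bucket, and the method names other than the concept of removing a known entry are assumptions. It assumes keys are unique within the table.

```python
class Entry:
    """Cache entry that is also a node in its bucket's doubly linked list."""

    def __init__(self, key, value):
        self.key = key
        self.value = value
        self.prev = None
        self.next = None


class HandleTable:
    def __init__(self, num_buckets=8):
        # Each bucket starts with a sentinel node, so real entries always have a prev.
        self.buckets = [Entry(None, None) for _ in range(num_buckets)]
        self.size = 0

    def _head(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def insert(self, entry):
        """Link the entry at the front of its bucket's collision list."""
        head = self._head(entry.key)
        entry.prev = head
        entry.next = head.next
        if head.next is not None:
            head.next.prev = entry
        head.next = entry
        self.size += 1

    def lookup(self, key):
        """Walk the collision list; cost grows with the bucket length."""
        node = self._head(key).next
        while node is not None:
            if node.key == key:
                return node
            node = node.next
        return None

    def remove_entry(self, entry):
        """Unlink a known entry in O(1), without walking the bucket."""
        entry.prev.next = entry.next
        if entry.next is not None:
            entry.next.prev = entry.prev
        entry.prev = entry.next = None
        self.size -= 1
```

With a singly linked collision list, removal needs the predecessor and therefore a walk from the bucket head; the back pointer is what makes the direct unlink possible.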
bde84e4ae5 [Bug][SQL] Fix bug that query failed when SQL contains Union and Colocate join (#4842)
SQL like:
`select a join b union select c join d`;

If a and b form a colocate join, and c and d also form a colocate join, the query may fail
with an error like:

`failed to get tablet. tablet_id=26846, with schema_hash=398972982, reason=tablet does not exist`
2020-11-05 20:57:11 +08:00
1d89e0670a [License] Add other license declare in NOTICE (#4831) 2020-11-05 20:30:49 +08:00
c5b034acc4 [FE UI] Fix some bugs about new FE UI (#4830)
1. Add a search box in the left tree view of Playground.
2. Fix some visual bugs of the UI.
3. Fix broken links in the QueryProfile view.
4. Fix the bug that the cookie is always invalid.
5. Set the cookie to HTTP_ONLY to make it safer.
2020-11-05 20:30:09 +08:00
f239f44b37 [Compaction][Bug-Fix] Fix bug that meta lock needs to be held when calculating compaction score (#4829)
* [Compaction][Bug] Fix bug that meta lock needs to be held when calculating compaction score

* fix

Co-authored-by: morningman <chenmingyu@baidu.com>
2020-11-05 20:29:01 +08:00
c8df76a807 [FEATURE]Check date type to avoid scan all partitions (#4756)
`select day from test where day='2020-10-32'`
Table 'test' is partitioned by day. In this case, '2020-10-32' will be taken as a CastExpr, not a LiteralExpr,
and the condition "day='2020-10-32'" will not be recognized as a partition filter.
This case will scan all partitions. To avoid scanning all partitions, it is better to filter out invalid date values.

issue: #4755
2020-11-05 20:25:53 +08:00
c53dd949c9 [Feature] Add CPU and Heap profile in BE webserver (#4632)
Add CPU Profile and Heap Profile in BE webserver.
This way we can more easily diagnose system performance bottlenecks through perf tools.
2020-11-05 20:25:07 +08:00
80d5f6e3d8 [LoadBalance] Make BeLoadRebalancer extend the base class Rebalancer (#4771) 2020-11-03 20:23:48 +08:00
d1c2b3ed0d [Optimize] Add an unordered_map for TabletSchema to speed up column name lookup (#4779)
Reduce column name lookup for TabletSchema and Tablet from O(N) to O(1).
2020-11-03 19:53:44 +08:00
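The lookup change in #4779 above amounts to keeping a name-to-index map next to the column list. The sketch below is a Python model under stated assumptions, not the C++ code: the `TabletSchema` class here is a stand-in, `field_index` and `field_index_linear` are hypothetical method names, and column names are assumed to be unique.

```python
class TabletSchema:
    """Toy schema holding column names plus a name -> position index."""

    def __init__(self, column_names):
        self.columns = list(column_names)
        # Built once at construction; a dict plays the role of unordered_map.
        self._index_by_name = {name: i for i, name in enumerate(self.columns)}

    def field_index_linear(self, name):
        """O(N) scan over the column list."""
        for i, col in enumerate(self.columns):
            if col == name:
                return i
        return -1

    def field_index(self, name):
        """O(1) average-case lookup through the hash map."""
        return self._index_by_name.get(name, -1)
```

The map costs extra memory per schema and must be rebuilt whenever the column list changes, which is the usual trade-off for constant-time lookup.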
b1c1ffda4a [Refactor] Refactor olap scan node code (#4823)
1. Remove meaningless code in Doris
2. Replace string copy by string reference
3. Simplified the implementation of some functions
2020-11-01 09:12:23 +08:00
d0b7286099 [optimize] optimize default value for thriftserver's config key "thrift_client_timeout_ms" (#4808)
* optimize  default value for thrift server's config key "thrift_client_timeout_ms"

Co-authored-by: liwei5 <liwei5@vipkid.com.cn>
2020-10-30 17:10:03 +08:00
c03808a4e4 [TabletScheduler] Fix some bugs where decommission operations cannot be completed (#4804)
1. When we decommission some BEs with SSD disks and there are no SSD disks on the remaining BEs,
it is impossible to select a suitable destination path.
In this case, we need to ignore the storage medium property and try to select the destination path again.
Setting `isSupplement` to true ignores the storage medium property.

2. When all BE nodes holding replicas of a tablet are being decommissioned
and the task is a VERSION_INCOMPLETE task, no suitable dest replica can be selected.
In this case, we should try to convert the task to a REPLICA_MISSING task and schedule it again.
2020-10-30 11:50:47 +08:00
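The first case in #4804 above (retrying path selection while ignoring the storage medium) can be sketched as a two-pass selection. This is a minimal illustration, not the TabletScheduler code: `DiskPath`, `pick_path`, and `select_destination` are hypothetical names, and the `ignore_medium` flag only mirrors the role the message gives to `isSupplement`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DiskPath:
    backend: str
    medium: str  # "SSD" or "HDD"
    available: bool = True


def pick_path(paths: List[DiskPath], medium: str, ignore_medium: bool) -> Optional[DiskPath]:
    """Return the first available path, optionally requiring a matching medium."""
    for p in paths:
        if p.available and (ignore_medium or p.medium == medium):
            return p
    return None


def select_destination(paths: List[DiskPath], medium: str) -> Optional[DiskPath]:
    """Prefer a path with the requested medium; fall back to any available path."""
    dest = pick_path(paths, medium, ignore_medium=False)
    if dest is None:
        # No path with the requested medium remains, so retry without the constraint.
        dest = pick_path(paths, medium, ignore_medium=True)
    return dest
```

Keeping the strict pass first preserves the medium preference whenever it can be satisfied, and the fallback only applies when it cannot.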
a291f4d285 [SQL][Bug] Fix union bug (#4772) (#4807) 2020-10-30 11:49:43 +08:00
44498a1ae2 [Compatibility] Add table "views" in information_schema database (#4778)
To support some tools like DBeaver
2020-10-30 11:44:44 +08:00
5933326503 Fix Clerical error (#4820) 2020-10-30 10:06:35 +08:00
9099191038 remove (#4818)
Co-authored-by: yangwenbo6 <yangwenbo3@jd.com>
2020-10-30 10:05:47 +08:00
32afb11458 [Doc] Add doc for sequence column (#4814) 2020-10-30 10:05:15 +08:00
427d7ec08a [Docs] Fix REPLACE TABLE syntax (#4819) 2020-10-30 10:04:53 +08:00
6b234fb2ce Fix create rollup may duplicate hidden column (#4816) 2020-10-30 10:04:21 +08:00
fe6ac26b74 [BUG] Cast int type to date type (#4806) 2020-10-29 20:36:45 +08:00
54fa76359b [ODBC] Support ODBC external table of PostgreSQL and revise the doc. (#4798) 2020-10-29 14:31:23 +08:00
d6497fedc4 [Config] Change config name 'streaming_load_max_batch_size_mb' to 'streaming_load_json_max_mb' (#4791)
The name is too close to another config name, and the two are hard to tell apart,
so this PR changes the name.
The document description has also been updated.
2020-10-28 23:27:33 +08:00