1887 Commits

Author SHA1 Message Date
fb02bb5cd9 [Load] Fix mem limit in NodeChannel (#3643) 2020-05-22 09:11:59 +08:00
4f79036a7e Add error code into error message (#3645) 2020-05-21 19:14:35 +08:00
f6b5c8839b [Bug] Ignore loading DELETE status tablet error when restarting BE (#3641)
Fix: #3640 

Also add a `batch delete meta` feature for `meta tool`
Fix #3639
2020-05-21 19:08:28 +08:00
ef8fd1fcbe [Load] Support load json-data into Doris by RoutineLoad or StreamLoad (#3553)
Doris supports loading JSON data via RoutineLoad or StreamLoad.
2020-05-21 13:00:49 +08:00
792307ae54 [CMake] Different cmake build directories for different build types (#3623) (#3629)
Add `CMAKE_BUILD_TYPE` as the suffix of the build directory.
2020-05-20 21:41:44 +08:00
0d66e6bd15 Support bitmap_intersect (#3571)
* Support bitmap_intersect

Support the aggregate function Bitmap Intersect, which is mainly used to take the intersection of grouped data.
The function 'bitmap_intersect(expr)' calculates the intersection of bitmap columns and returns a bitmap object.
The definition is as follows:
FunctionName: bitmap_intersect,
InputType: bitmap,
OutputType: bitmap

The scenario is as follows:
Find the users that have all three tags a, b, and c at the same time.

```
select bitmap_to_string(bitmap_intersect(user_id)) from
(
    select bitmap_union(user_id) user_id from bitmap_intersect_test
    where tag in ('a', 'b', 'c')
    group by tag
) a
```
Closed #3552.
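
As a rough illustration of what the query above computes, here is a minimal Python sketch that uses plain sets in place of bitmaps. The function name `users_with_all_tags` and the sample rows are hypothetical; only the "union per tag, then intersect across tags" shape is taken from the SQL.

```python
def users_with_all_tags(rows, tags):
    """Union user ids per tag, then intersect the per-tag sets."""
    per_tag = {}
    for tag, user_id in rows:
        if tag in tags:  # where tag in (...)
            # bitmap_union(user_id) ... group by tag
            per_tag.setdefault(tag, set()).add(user_id)
    if not per_tag:
        return set()
    # bitmap_intersect over the grouped rows
    return set.intersection(*per_tag.values())


# Hypothetical sample data: (tag, user_id)
rows = [("a", 1), ("a", 2), ("b", 2), ("b", 3), ("c", 2), ("c", 4), ("d", 2)]
print(sorted(users_with_all_tags(rows, {"a", "b", "c"})))
```

As in the SQL, a tag with no rows forms no group in this sketch, so it does not constrain the result.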

* Add docs of bitmap_union and bitmap_intersect

* Support null of bitmap_intersect
2020-05-20 21:12:02 +08:00
c54cb4b14e [Memory Engine] Add column reader/writer (#3580) 2020-05-20 11:09:30 +08:00
6be7a6232f [Config] Add ignore config to determine whether to continue to start be when load tablet from header failed. (#3632)
Add the config ignore_load_tablet_failure to determine whether BE should continue to start when loading a tablet from its header fails.
2020-05-20 09:40:50 +08:00
58a6628af2 [Bug] Fix first start error after upgrading Doris to support deleting duplicate table value columns (#3628) 2020-05-20 09:39:24 +08:00
9425f17d28 [Bug] instance mem tracker should have no limit (#3592) 2020-05-19 19:49:39 +08:00
8018b1c348 [Doris on ES] Fix bug where LIKE is not translated correctly (#3602)
Why this happened:
In the current implementation, a wildcard is translated into DSL only if it is not the first character.
Thus, when the SQL pattern is written like '%abc', the translation does not run.

How it is fixed:

Now, translation is triggered with the character '?' or '*':
if it is the first character, translate it directly;
otherwise, check whether the preceding character is an escape to determine whether to translate.
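
The rule above can be sketched as follows. This is a minimal Python illustration, not the BE code: it assumes that the SQL wildcards `%` and `_` map to `*` and `?`, and that a backslash is the escape character; the function name `like_to_wildcard` is hypothetical.

```python
def like_to_wildcard(pattern: str) -> str:
    """Translate SQL LIKE wildcards to '*' and '?' (assumed mapping)."""
    mapping = {"%": "*", "_": "?"}
    out = []
    for i, ch in enumerate(pattern):
        # Translate when the wildcard is the first character, or when the
        # preceding character is not the (assumed) escape character '\'.
        if ch in mapping and (i == 0 or pattern[i - 1] != "\\"):
            out.append(mapping[ch])
        else:
            out.append(ch)
    return "".join(out)


print(like_to_wildcard("%abc"))
```

The sketch looks at only one preceding character, so an escaped backslash followed by a wildcard is not handled.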
2020-05-19 17:06:46 +08:00
4cbcae1574 [Spark on Doris] Shade and provide the thrift lib in spark-doris-connector (#3631)
Main changes:
1. Shade and provide the thrift lib in spark-doris-connector
2. Add a `build.sh` for spark-doris-connector
3. Move the README.md of spark-doris-connector to `docs/`
4. Change the line delimiter of `fe/src/test/java/org/apache/doris/analysis/AggregateTest.java`
2020-05-19 14:20:21 +08:00
7fb74db0a1 [Trace] Introduce trace util to BE
Ref https://github.com/apache/incubator-doris/issues/3566
Introduce the trace utility from Kudu to BE. This utility is widely used in Kudu,
and Impala also imports it.
The trace util traces each phase in a thread and can be dumped to a
string to show each phase's time cost and diagnose which phase costs the most time.
The util stores a Trace object as a thread-local variable. We can add trace entries,
which record the current file name, line number, user-specified symbols, and
timestamp, to this object, and we can also add counters to this Trace
object. It can then be dumped to a human-readable string.
Some helpful macros are defined in trace.h. Here is a simple usage
example:
```
  scoped_refptr<Trace> t1(new Trace);            // New 2 traces
  scoped_refptr<Trace> t2(new Trace);
  t1->AddChildTrace("child_trace", t2.get());    // t1 add t2 as a child named "child_trace"

  TRACE_TO(t1, "step $0", 1);  // Explicitly trace to t1
  usleep(10);
  // ... do some work
  ADOPT_TRACE(t1.get());   // Explicitly adopt to trace to t1
  TRACE("step $0", 2);     // Implicitly trace to t1
  {
    // The time spent in this scope is added to counter t1.scope_time_cost
    TRACE_COUNTER_SCOPE_LATENCY_US("scope_time_cost");
    ADOPT_TRACE(t2.get());  // Adopt to trace to t2 for the duration of the current scope
    TRACE("sub start");     // Implicitly trace to t2
    usleep(10);
    // ... do some work
    TRACE("sub before loop");
    for (int i = 0; i < 10; ++i) {
      TRACE_COUNTER_INCREMENT("iterate_count", 1);  // Increase counter t2.iterate_count

      MicrosecondsInt64 start_time = GetMonoTimeMicros();
      usleep(10);
      // ... do some work
      MicrosecondsInt64 end_time = GetMonoTimeMicros();
      int64_t dur = end_time - start_time;
      // t2's simple histogram metric with name prefixed with "lbm_writes"
      const char* counter = BUCKETED_COUNTER_NAME("lbm_writes", dur);
      TRACE_COUNTER_INCREMENT(counter, 1);
    }
    TRACE("sub after loop");
  }
  TRACE("goodbye $0", "cruel world");     // Automatically restore to trace to t1
  std::cout << t1->DumpToString(Trace::INCLUDE_ALL) << std::endl;
```
The output looks like:
```
0514 02:16:07.988054 (+     0us) trace_test.cpp:76] step 1
0514 02:16:07.988112 (+    58us) trace_test.cpp:80] step 2
0514 02:16:07.988863 (+   751us) trace_test.cpp:103] goodbye cruel world
Related trace 'child_trace':
0514 02:16:07.988120 (+     0us) trace_test.cpp:85] sub start
0514 02:16:07.988188 (+    68us) trace_test.cpp:88] sub before loop
0514 02:16:07.988850 (+   662us) trace_test.cpp:101] sub after loop
Metrics: {"scope_time_cost":744,"child_traces":[["child_trace",{"iterate_count":10,"lbm_writes_lt_1ms":10}]]}
```
Apart from importing the original source code, this patch
does the following work to adapt it to Doris:
- Rename "kudu" namespace to "doris"
- Update some names to the existing function names in Doris, e.g. strings::internal::SubstituteArg::kNoArg -> strings::internal::SubstituteArg::NoArg
- Use doris::SpinLock instead of kudu::simple_spinlock which hasn't been imported
- Use manual malloc() and free() instead of kudu::Arena which hasn't been imported
- Use manual rapidjson::Writer instead of kudu::JsonWriter which hasn't been imported
- Remove all TRACE_EVENT related unit tests since TRACE_EVENT is not imported this time
- Update CMakeLists.txt

NOTICE(#3622):
This is a "revert of revert pull request".
This pr is mainly used to synthesize the PRs whose commits were
scattered and submitted due to the wrong merge method into a complete single commit.
2020-05-18 14:55:11 +08:00
9d72d1bb87 [Refactor] Refactor some redundant code && Replace some UT by UtFrameUtils
This CL changes no logic; it only refactors some code and uses the new UtFrameWork to replace some old UTs.

NOTICE(#3622):
This is a "revert of revert pull request".
This pr is mainly used to synthesize the PRs whose commits were
scattered and submitted due to the wrong merge method into a complete single commit.
2020-05-18 14:53:59 +08:00
fa27012da2 [Bug] Fix bug that ConcurrentModificationException thrown
Fix: #3588
When truncating the table, a ConcurrentModificationException may be thrown when there
are temp partitions in this table.
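
A minimal Python analogue of this class of bug, assuming the exception comes from changing the partition collection while iterating over it (the usual cause of a ConcurrentModificationException in Java). Python raises RuntimeError in the same situation. The function names and the `tmp_` prefix are hypothetical and are not taken from the FE code.

```python
def drop_temp_unsafe(partitions):
    # Deleting from the dict while iterating over it raises RuntimeError.
    for name in partitions:
        if name.startswith("tmp_"):
            del partitions[name]


def drop_temp_safe(partitions):
    # Iterate over a snapshot of the keys so the dict can be changed safely.
    for name in list(partitions):
        if name.startswith("tmp_"):
            del partitions[name]
    return partitions


try:
    drop_temp_unsafe({"p1": 1, "tmp_p2": 2, "p3": 3})
except RuntimeError as err:
    print("unsafe version failed:", err)

print(drop_temp_safe({"p1": 1, "tmp_p2": 2, "p3": 3}))
```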

NOTICE(#3622):
This is a "revert of revert pull request".
This pr is mainly used to synthesize the PRs whose commits were
scattered and submitted due to the wrong merge method into a complete single commit.
2020-05-18 14:51:25 +08:00
8398fa9b75 [Bug] Fix bug that DbTxnMgr does not create for db in CatalogRecycleBin
Fix #3589
The reason is that:
When loading meta from an image, we create `DatabaseTransactionMgr` for each database
loaded by the `loadDb()` method, but we forgot to create `DatabaseTransactionMgr` for
databases in the catalog recycle bin.

NOTICE(#3622):
This is a "revert of revert pull request".
This pr is mainly used to synthesize the PRs whose commits were
scattered and submitted due to the wrong merge method into a complete single commit.
2020-05-18 14:49:49 +08:00
c2c81d58dc [Fix]SlotRef.tosql() is the same as the SQL returned by different sql
Fix: #3555

NOTICE(#3622):
This is a "revert of revert pull request".
This pr is mainly used to synthesize the PRs whose commits were
scattered and submitted due to the wrong merge method into a complete single commit.
2020-05-18 14:47:48 +08:00
7a83c5662d [Bug] fix OrCompoundPredicate predicate fold bug #3596
Fix: #3596

NOTICE(#3622):
This is a "revert of revert pull request".
This pr is mainly used to synthesize the PRs whose commits were
scattered and submitted due to the wrong merge method into a complete single commit.
2020-05-18 14:46:34 +08:00
d8a32af59c [Bug] Fix bug that descriptor table is not reset before planning next routine load task
Before planning for next routine load task, the analyzer and descriptor table
in it should be reset. Otherwise, a lot of historical objects will
accumulate inside, causing memory leaks.
Fix: #3603
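
A minimal sketch of the leak pattern described above, in Python. The classes `DescriptorTable` and `RoutineLoadPlanner` and the `reset` flag are hypothetical stand-ins for illustration and do not mirror the FE classes.

```python
class DescriptorTable:
    """Hypothetical stand-in for a table of per-plan descriptors."""

    def __init__(self):
        self.descriptors = []


class RoutineLoadPlanner:
    """Hypothetical planner that is reused across routine load tasks."""

    def __init__(self):
        self.desc_table = DescriptorTable()

    def plan(self, task_id, reset=True):
        if reset:
            # Start from a fresh table so objects from old plans are dropped.
            self.desc_table = DescriptorTable()
        self.desc_table.descriptors.append(f"tuple-for-task-{task_id}")
        return len(self.desc_table.descriptors)


leaky = RoutineLoadPlanner()
print([leaky.plan(i, reset=False) for i in range(3)])  # table keeps growing

fixed = RoutineLoadPlanner()
print([fixed.plan(i) for i in range(3)])  # table size stays constant
```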

NOTICE(#3622):
This is a "revert of revert pull request".
This pr is mainly used to synthesize the PRs whose commits were
scattered and submitted due to the wrong merge method into a complete single commit.
2020-05-18 14:44:27 +08:00
87caa697a9 [Doc] Update table-restore-tool.md
Fix some format.

NOTICE(#3622):
This is a "revert of revert pull request".
This pr is mainly used to synthesize the PRs whose commits were
scattered and submitted due to the wrong merge method into a complete single commit.
2020-05-18 14:42:17 +08:00
69a63f6f53 Revert "[trace] Introduce trace util to BE" (#3614)
This revert is used to correct the mess of the commit
timeline caused by the wrong merge method.
2020-05-18 13:16:39 +08:00
903592d82b Revert "Refactor some redundant code && Replace some UT by UtFrameUtils" (#3613)
This revert is used to correct the mess of the commit
timeline caused by the wrong merge method.
2020-05-18 13:11:39 +08:00
d028a728e4 Revert "[Bug] Fix bug that ConcurrentModificationException thrown " (#3612)
This revert is used to correct the mess of the commit
timeline caused by the wrong merge method.
2020-05-18 13:10:33 +08:00
a37f5cb657 Revert "[Bug] Fix bug that DbTxnMgr does not create for db in CatalogRecycleBin" (#3611)
This revert is used to correct the mess of the commit
timeline caused by the wrong merge method.
2020-05-18 13:08:36 +08:00
539efb3532 Revert "[Fix]SlotRef.tosql() is the same as the SQL returned by different sql" (#3610)
This revert is used to correct the mess of the commit
timeline caused by the wrong merge method.
2020-05-18 13:07:21 +08:00
20f20239f2 Revert "[Bug] fix OrCompoundPredicate predicate fold bug #3596" (#3609)
This revert is used to correct the mess of the commit
timeline caused by the wrong merge method.
2020-05-18 13:03:24 +08:00
ed6548e27f Revert "[Bug] Fix bug that descriptor table is not reset before planning next routine load task (#3605)" (#3608)
This reverts commit 271f25f0a4e98c3d9130c0772bc386e7786cbae4.

This revert is used to correct the mess of the commit
timeline caused by the wrong merge method.
2020-05-18 13:00:20 +08:00
24ca937877 Revert "[Doc] Update table-restore-tool.md" (#3606) 2020-05-18 12:08:54 +08:00
0d76c78537 [Doc] Update table-restore-tool.md 2020-05-18 11:12:24 +08:00
bb7ae97845 [trace] Introduce trace util to BE
Ref https://github.com/apache/incubator-doris/issues/3566
Introduce the trace utility from Kudu to BE. This utility is widely used in Kudu,
and Impala also imports it.
The trace util traces each phase in a thread and can be dumped to a
string to show each phase's time cost and diagnose which phase costs the most time.
The util stores a Trace object as a thread-local variable. We can add trace entries,
which record the current file name, line number, user-specified symbols, and
timestamp, to this object, and we can also add counters to this Trace
object. It can then be dumped to a human-readable string.
Some helpful macros are defined in trace.h. Here is a simple usage
example:
```
  scoped_refptr<Trace> t1(new Trace);            // New 2 traces
  scoped_refptr<Trace> t2(new Trace);
  t1->AddChildTrace("child_trace", t2.get());    // t1 add t2 as a child named "child_trace"

  TRACE_TO(t1, "step $0", 1);  // Explicitly trace to t1
  usleep(10);
  // ... do some work
  ADOPT_TRACE(t1.get());   // Explicitly adopt to trace to t1
  TRACE("step $0", 2);     // Implicitly trace to t1
  {
    // The time spent in this scope is added to counter t1.scope_time_cost
    TRACE_COUNTER_SCOPE_LATENCY_US("scope_time_cost");
    ADOPT_TRACE(t2.get());  // Adopt to trace to t2 for the duration of the current scope
    TRACE("sub start");     // Implicitly trace to t2
    usleep(10);
    // ... do some work
    TRACE("sub before loop");
    for (int i = 0; i < 10; ++i) {
      TRACE_COUNTER_INCREMENT("iterate_count", 1);  // Increase counter t2.iterate_count

      MicrosecondsInt64 start_time = GetMonoTimeMicros();
      usleep(10);
      // ... do some work
      MicrosecondsInt64 end_time = GetMonoTimeMicros();
      int64_t dur = end_time - start_time;
      // t2's simple histogram metric with name prefixed with "lbm_writes"
      const char* counter = BUCKETED_COUNTER_NAME("lbm_writes", dur);
      TRACE_COUNTER_INCREMENT(counter, 1);
    }
    TRACE("sub after loop");
  }
  TRACE("goodbye $0", "cruel world");     // Automatically restore to trace to t1
  std::cout << t1->DumpToString(Trace::INCLUDE_ALL) << std::endl;
```
The output looks like:
```
0514 02:16:07.988054 (+     0us) trace_test.cpp:76] step 1
0514 02:16:07.988112 (+    58us) trace_test.cpp:80] step 2
0514 02:16:07.988863 (+   751us) trace_test.cpp:103] goodbye cruel world
Related trace 'child_trace':
0514 02:16:07.988120 (+     0us) trace_test.cpp:85] sub start
0514 02:16:07.988188 (+    68us) trace_test.cpp:88] sub before loop
0514 02:16:07.988850 (+   662us) trace_test.cpp:101] sub after loop
Metrics: {"scope_time_cost":744,"child_traces":[["child_trace",{"iterate_count":10,"lbm_writes_lt_1ms":10}]]}
```
Apart from importing the original source code, this patch
does the following work to adapt it to Doris:
- Rename "kudu" namespace to "doris"
- Update some names to the existing function names in Doris, e.g. strings::internal::SubstituteArg::kNoArg -> strings::internal::SubstituteArg::NoArg
- Use doris::SpinLock instead of kudu::simple_spinlock which hasn't been imported
- Use manual malloc() and free() instead of kudu::Arena which hasn't been imported
- Use manual rapidjson::Writer instead of kudu::JsonWriter which hasn't been imported
- Remove all TRACE_EVENT related unit tests since TRACE_EVENT is not imported this time
- Update CMakeLists.txt
2020-05-18 11:10:25 +08:00
d4ff6dcdd6 fix by review 2020-05-18 10:56:12 +08:00
2f3b7b5b8e [Refactor] Refactor some redundant code && Replace some UT by UtFrameUtils 2020-05-18 10:53:32 +08:00
bca9fb8551 [Bug] Fix bug that ConcurrentModificationException thrown
When truncating the table, a ConcurrentModificationException may be thrown when there
are temp partitions in this table.
2020-05-18 10:48:19 +08:00
5276b5a4a3 [Bug] Fix bug that DbTxnMgr does not create for db in CatalogRecycleBin
Fix #3589
The reason is that:
When loading meta from an image, we create `DatabaseTransactionMgr` for each database
loaded by the `loadDb()` method, but we forgot to create `DatabaseTransactionMgr` for
databases in the catalog recycle bin.
2020-05-18 10:42:17 +08:00
62f746fc87 [Fix] SlotRef.tosql() is the same as the SQL returned by different sql 2020-05-18 10:41:15 +08:00
e6588981b4 [Bug] fix OrCompoundPredicate predicate fold bug #3596 (#3597)
* [Bug] fix OrCompoundPredicate predicate fold bug

* fix code style
2020-05-18 10:36:13 +08:00
271f25f0a4 [Bug] Fix bug that descriptor table is not reset before planning next routine load task (#3605)
Before planning for next routine load task, the analyzer and descriptor table
in it should be reset. Otherwise, a lot of historical objects will
accumulate inside, causing memory leaks.
2020-05-18 10:34:21 +08:00
5138197d57 [Bug] Generate exceptions to avoid multiDistinctAggregation producing wrong results (#3561)
When a query (#3492) contains “2 DistinctAggregation with one column” and “1
DistinctAggregation with two columns”, it produces a wrong result.

This pull request does not really solve the problem; it only generates exceptions to avoid
returning wrong results.

The problem needs a real fix in the future.
2020-05-16 21:36:43 +08:00
7bf926eba8 [Profile] Improve the running profile
1. Delete the invalid counter in Data_Stream_Sender. (#3598)
2. Add counters for the PartitionHashTable of PartitionAggregationNode:
     * Hash Probe Method
     * Rows processed by Aggregation
     * HashFilledBuckets: counts the filled buckets in Aggregation
     * HTResize: counts the resizes of the HashTable
     * HashProbe: counts the probes of the HashTable
     * HashFailedProbe: counts the failed probes of the HashTable
     * HashTravelLength: total travel length for probes
     * HashCollisions: counts the hash collisions
3. Delete some unnecessary code in PartitionHashTable by using a template
2020-05-16 21:35:30 +08:00
8cb48161e3 change to current catalog 2020-05-16 21:12:46 +08:00
4217db00d3 Tosql method returns slot index and column name 2020-05-15 17:31:25 +08:00
c50b1a4d17 fix bug 2020-05-15 16:15:53 +08:00
0d457692bc [incubator-doris][thirdparty][glog][bug] Calculate file length at BE start (#3594) 2020-05-15 15:15:54 +08:00
a4e98953be [website] modify download links & remove some links' suffix _EN(master) (#3573)
modify download links & remove some links' suffix _EN
2020-05-15 14:03:28 +08:00
a7e1c08624 Report error when subquery in case-when returns empty set (#3558)
Doris rewrites the subquery in case-when into an inline view.
So the result differs between the subquery in case-when and the inline view.
We cannot support an empty set returned by a subquery in case-when.
This commit forbids this case.
2020-05-15 12:32:05 +08:00
8be10dca05 fix code style 2020-05-15 12:10:19 +08:00
805ecc9d4e fix 2020-05-15 11:23:01 +08:00
0919407092 [Bug] fix OrCompoundPredicate predicate fold bug 2020-05-15 10:20:13 +08:00
273aad6cf4 [Bug] Restore tablet action not working because tablet status is shutdown (#3551) 2020-05-15 10:11:17 +08:00
123e1394b1 [Delete] Allow delete duplicated non-key column using delete from (#3424) 2020-05-15 09:26:36 +08:00