Commit Graph

1385 Commits

Author SHA1 Message Date
4e2f01a9fa [Compaction] Fix a bug that CumulativeCompaction compares time of different precision (#2693)
time(NULL) returns a second-resolution timestamp, but all compaction-related times in Tablet are in millisecond resolution, so UnixMillis() should be used instead.
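A minimal sketch of the precision mismatch described above, not the actual Doris code. The helper names `unix_millis_now` and `interval_elapsed` are hypothetical and stand in for `UnixMillis()` and the compaction check.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <ctime>

// Hypothetical stand-in for UnixMillis(): wall-clock milliseconds since the epoch.
int64_t unix_millis_now() {
    using namespace std::chrono;
    return duration_cast<milliseconds>(system_clock::now().time_since_epoch()).count();
}

// Returns true once at least interval_ms has passed since last_ms.
// Both timestamps must be in milliseconds; passing a time(NULL) value
// (seconds) as now_ms makes the difference negative, so this never fires.
bool interval_elapsed(int64_t now_ms, int64_t last_ms, int64_t interval_ms) {
    assert(interval_ms >= 0);
    return now_ms - last_ms >= interval_ms;
}
```

Passing a seconds value where milliseconds are expected does not fail loudly; the comparison simply never becomes true, which is why the bug is easy to miss.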
2020-01-07 21:31:36 +08:00
844ccaafc9 Remove boost filesystem exception in FileUtils (#2692)
If `error_code` is provided, the `boost::filesystem` functions
will not throw an exception, so we do not need to catch it.
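The same pattern can be sketched with `std::filesystem`, whose `error_code` overloads follow the Boost design. This is an illustration using the standard library rather than Boost, and `remove_file_quietly` is a hypothetical helper, not a function from FileUtils.

```cpp
#include <cassert>
#include <filesystem>
#include <string>
#include <system_error>

// Hypothetical helper: removes a file without a try/catch block.
// Returns true only if a file was actually removed. On a real failure,
// *err receives the message; a missing file is not treated as an error.
bool remove_file_quietly(const std::string& path, std::string* err) {
    assert(err != nullptr);
    std::error_code ec;
    // The error_code overload reports failures through ec instead of throwing.
    bool removed = std::filesystem::remove(std::filesystem::path(path), ec);
    if (ec) {
        *err = ec.message();
        return false;
    }
    return removed;
}
```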
2020-01-07 07:29:05 -06:00
7d2610d091 Change bitmap functions return type to BITMAP (#2690) 2020-01-07 19:27:21 +08:00
852046de29 Fix incompatibility with arm architecture in olap #2645 (#2682) 2020-01-07 19:16:10 +08:00
72fd745086 [Load] Fix broker load's file format npe bug (#2689) 2020-01-07 16:50:02 +08:00
23e3149d70 [Variable] Fix default rowset type read (#2687)
Fix a bug in reading the default rowset type in HeartbeatFlags.
2020-01-07 15:24:03 +08:00
e2c174aecc Fix mysqlChannel npe in ConnectContext (#2684) 2020-01-07 14:54:08 +08:00
2326b478b6 Support load orc format in Apache Doris (#2554)
2020-01-07 14:22:43 +08:00
369b2c364f [Build] fix jemalloc download (#2681) 2020-01-07 11:33:55 +08:00
ec860c82c0 [Variable] Fix default rowset type variable (#2680)
Fix defaultRowsetType's type description in SessionVariable
2020-01-07 10:56:18 +08:00
f77171f85d Make bdbje lock timeout configurable (#2676) 2020-01-06 21:20:36 +08:00
40c8bddd24 Decrease transaction_clean_interval_second config (#2673) 2020-01-06 20:16:37 +08:00
9ca7fdfe1c Make MAX_SCHEDULING_TABLETS and MAX_BALANCING_TABLETS configurable (#2670) 2020-01-06 20:15:38 +08:00
4d6afdae4d Add nio support for mysql protocol implementation (#2603) 2020-01-06 18:56:21 +08:00
de4d1778c6 Fix incompatibility with arm architecture in util and gutil (#2650)
1. Upgrade the gutil code from Impala to the new version, including `cpuinfo`, `spinlock` and `linux_syscall_support`
2. Implement an ARM version of the UTF-8 check code
3. Remove incompatible code from stopwatch
2020-01-06 18:39:31 +08:00
c6badaec91 Fix bug: CreateIndexClause can be casted to AlterTableClause (#2667) 2020-01-06 18:31:40 +08:00
7f148c188e [Build]Make set target arch universal (#2660) 2020-01-06 14:46:07 +08:00
87a50070c4 Fix bug: parquet scanner don't seek (#2661) 2020-01-06 13:55:40 +08:00
8a5ee6ad21 Fix FE couldn't start (#2662) 2020-01-06 12:46:45 +08:00
220ed8436c [Unit Test]Fix Schema Change Test Case (#2659) 2020-01-05 20:08:23 +08:00
1648226927 Adapt arrow 0.15 API (#2657)
This CL supports arrow's zero-copy read interface, which makes the code
comply with arrow 0.15.
The schema change unit test has some problems, so I disabled it in run-ut.sh.
2020-01-04 15:54:29 +08:00
af9529a207 [Dynamic Partition] Support for automatically adding partitions
In some scenarios, when a user creates an olap table that is range-partitioned by time, the user needs to periodically add and remove partitions to keep the data valid. Adding and removing partitions dynamically can therefore be very useful for users.
2020-01-03 23:45:04 +08:00
1ca82122a8 Fix doris be compile error for Ubuntu14.04 (#2647) 2020-01-03 21:14:34 +08:00
42dfe1369b Add filter conditions for 'show partitions from table' syntax (#2553)
Add filter conditions to the `show partitions from table` syntax so that only the needed partitions are returned.
2020-01-03 19:52:25 +08:00
458ed55fa5 Fix BITMAP_UNION_COUNT couldn't hit rollup table (#2655) 2020-01-03 19:27:40 +08:00
5dff936243 Fix HLL_UNION_AGG AnalyticFn result in BE core by adding hll_get_value (#2653) 2020-01-03 19:23:56 +08:00
c098178f7a [Index] Implements create drop show index syntax for bitmap index [#2487] (#2573)
### create table with index 
```
CREATE TABLE table1
(
    siteid INT DEFAULT '10',
    citycode SMALLINT,
    username VARCHAR(32) DEFAULT '',
    pv BIGINT SUM DEFAULT '0',
    INDEX index_name [USING BITMAP] (siteid, citycode) COMMENT 'balabala'
)
AGGREGATE KEY(siteid, citycode, username)
DISTRIBUTED BY HASH(siteid) BUCKETS 10
PROPERTIES("replication_num" = "1");
```
### create index 
```
CREATE INDEX index_name ON table1 (siteid, citycode) [USING BITMAP] COMMENT 'balabala';
-- or
ALTER TABLE table1 ADD INDEX index_name [USING BITMAP] (siteid, citycode) COMMENT 'balabala';
```
### drop index
```
DROP INDEX index_name ON table1;
-- or
ALTER TABLE table1 DROP INDEX index_name;
```

### show index
```
SHOW INDEX[ES] FROM table1;
```
output
```
+---------+-------------+-----------------+------------+---------+
| Table   | Index_name  | Column_name     | Index_type | Comment |
+---------+-------------+-----------------+------------+---------+
| table1  | index_name  | siteid,citycode | BITMAP     | balabala|
+---------+-------------+-----------------+------------+---------+
```
2020-01-03 17:41:26 +08:00
559a8d0514 Bump up apache arrow to 0.15.1 (#2646) 2020-01-03 13:15:29 +08:00
7951e15208 Fix estimate_segment_size problem #2643 (#2644) 2020-01-03 11:11:34 +08:00
3b5f608df7 Delete documents of count_distinct function (#2642) 2020-01-02 23:35:10 +08:00
4220e09724 Change introduction document (#2640)
Change the FE default maximum memory in the English introduction document from 2G to 4G
2020-01-02 23:03:56 +08:00
b440ff286f Change introduction document (#2639)
Change the FE default maximum memory in the introduction document from 2G to 4G
2020-01-02 23:03:17 +08:00
9c90b09a3f [Alter Table] No need to check whether table is stable when doing some kinds of alter operation (#2617)
Not all alter table operations require the table to be stable, for example rename or modifying metadata.
2020-01-02 20:51:23 +08:00
d05768ffd4 Fix core when es_scanner_node exit (#2634) 2020-01-02 16:30:11 +08:00
6cab929d6d [Compaction] Limit the max concurrency of running compaction tasks (#2635)
A compaction task may sometimes consume a lot of memory and result in OOM.
Currently there is no good way to predict the memory consumption of
a compaction task, so I add a new BE config, max_compaction_concurrency,
to manually limit the max concurrency of running compaction tasks.
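One way to enforce such a cap is a counting limiter that each task acquires before running and releases afterwards. The sketch below is an assumption about the general technique, not the code in this commit; `ConcurrencyLimiter` and its methods are hypothetical names.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical limiter: at most max_concurrency holders at any time.
class ConcurrencyLimiter {
public:
    explicit ConcurrencyLimiter(int max_concurrency) : _available(max_concurrency) {
        assert(max_concurrency > 0);
    }

    // Blocks until a slot is free, then takes it.
    void acquire() {
        std::unique_lock<std::mutex> lock(_mutex);
        _cv.wait(lock, [this] { return _available > 0; });
        --_available;
    }

    // Takes a slot only if one is free right now; never blocks.
    bool try_acquire() {
        std::lock_guard<std::mutex> lock(_mutex);
        if (_available == 0) return false;
        --_available;
        return true;
    }

    // Returns a slot and wakes one waiting task.
    void release() {
        {
            std::lock_guard<std::mutex> lock(_mutex);
            ++_available;
        }
        _cv.notify_one();
    }

private:
    std::mutex _mutex;
    std::condition_variable _cv;
    int _available;
};
```

A scheduler could call `try_acquire()` and skip the round when it returns false, so that a waiting compaction does not hold up other work.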
2020-01-02 14:47:54 +08:00
cc924c9e6a [Rowset Reader] Improve the merge read efficiency of alpha rowsets (#2632)
When a merge reads from one rowset with multiple overlapping segments,
I introduce a priority queue (a min-heap) for the multi-way merge sort,
to replace the old algorithm with N*M time complexity.

This can significantly improve read efficiency when merging a large amount of
overlapping data.

In my test:
1. Compaction with 187 segments: time reduced from 75 seconds to 42 seconds.
2. Compaction with 3574 segments took 43 seconds; with the old version, I killed the
process after waiting more than 10 minutes.

This CL only changes the reads of alpha rowsets. Beta rowsets will be changed in another CL.

ISSUE: #2631
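The heap-based multi-way merge mentioned in this commit can be sketched as follows. It merges plain sorted integer vectors rather than rowset segments, and `merge_sorted_runs` is a hypothetical name. With N total values across M runs, each value costs one heap push and one pop, for O(N log M) overall.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <tuple>
#include <vector>

// Merges several individually sorted runs into one sorted output.
std::vector<int> merge_sorted_runs(const std::vector<std::vector<int>>& runs) {
    // Heap entry: (value, run index, offset within that run).
    using Entry = std::tuple<int, std::size_t, std::size_t>;
    // std::greater turns the default max-heap into a min-heap.
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> heap;

    std::size_t total = 0;
    for (std::size_t r = 0; r < runs.size(); ++r) {
        total += runs[r].size();
        if (!runs[r].empty()) heap.emplace(runs[r][0], r, std::size_t{0});
    }

    std::vector<int> out;
    out.reserve(total);
    while (!heap.empty()) {
        auto [value, r, i] = heap.top();
        heap.pop();
        out.push_back(value);
        // Refill the heap from the run that just supplied the smallest value.
        if (i + 1 < runs[r].size()) heap.emplace(runs[r][i + 1], r, i + 1);
    }
    assert(out.size() == total);
    return out;
}
```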
2020-01-02 14:10:05 +08:00
2a8e77d9cb Support arm atomicops (#2626) (#2627) 2019-12-31 22:39:22 +08:00
4c5b0b6dc9 Remove VersionHash used to comparison in BE (#2622) 2019-12-31 19:38:45 +08:00
13733d91e3 Fix the missing sync in SegmentWriter (#2623)
In the default configuration, `WritableFile` does not sync when closing the file.
We need to do it manually to ensure durability.
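The underlying idea, shown with raw POSIX calls instead of Doris's `WritableFile`: `close()` alone does not force data to stable storage, so `fsync()` is called first. `write_file_durably` is a hypothetical helper, and the sketch assumes a POSIX system. It does not sync the parent directory.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <fcntl.h>
#include <string>
#include <unistd.h>

// Hypothetical helper: writes data to path and syncs it before closing.
bool write_file_durably(const std::string& path, const std::string& data) {
    assert(!path.empty());
    int fd = ::open(path.c_str(), O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) return false;

    std::size_t written = 0;
    while (written < data.size()) {
        ssize_t n = ::write(fd, data.data() + written, data.size() - written);
        if (n < 0) {
            if (errno == EINTR) continue;  // interrupted by a signal, retry
            ::close(fd);
            return false;
        }
        written += static_cast<std::size_t>(n);
    }

    // Flush file contents to the storage device before closing.
    bool synced = (::fsync(fd) == 0);
    bool closed = (::close(fd) == 0);
    return synced && closed;
}
```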
2019-12-31 18:34:40 +08:00
9783fb7221 Fix: UDF version `GLIBCXX_3.4.21' not found (#2629) 2019-12-31 18:32:42 +08:00
9471c90451 Delete count_distinct function (#2621)
ISSUE #1553
This commit removes the function count_distinct().

We already have the function multi_distinct_count as an alternative for calculating "count distinct" over values of any type.
Besides, count_distinct() has the same symbol as the count() function, so it fails to express its meaning.
So I suggest removing the count_distinct() function.
2019-12-31 14:40:45 +08:00
5229ea24da Fix bloom filter statistics bug (#2609) 2019-12-30 23:23:39 +08:00
da2838e5fe Set AGG_KEYS upon upgrade from tablet if has_keys_type() is false (#2620)
Doris supports three storage models: AGG_KEYS, UNIQUE_KEYS, and DUP_KEYS.
Of these three models, UNIQUE_KEYS and DUP_KEYS were added after AGG_KEYS.
For historical tablets, the keys_type field that indicates the storage model
may be missing for AGG_KEYS.
So when upgrading from a historical tablet, this situation should be taken into
consideration and keys_type set to AGG_KEYS.
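The fallback can be illustrated with an optional field. The real code checks the protobuf accessor `has_keys_type()` named in the commit title; the types below (`KeysType`, `StoredSchema`, `resolve_keys_type`) are hypothetical stand-ins for this sketch.

```cpp
#include <cassert>
#include <optional>

enum class KeysType { DUP_KEYS, UNIQUE_KEYS, AGG_KEYS };

// Hypothetical stand-in for a persisted tablet schema. Tablets written
// before the other models existed may carry no keys_type at all.
struct StoredSchema {
    std::optional<KeysType> keys_type;
};

// A missing keys_type can only come from the original model, AGG_KEYS.
KeysType resolve_keys_type(const StoredSchema& schema) {
    return schema.keys_type.value_or(KeysType::AGG_KEYS);
}
```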
2019-12-30 23:17:16 +08:00
feda66f99f Spark return error to users when spark on doris query failed (#2531) 2019-12-30 21:58:13 +08:00
da8c9b4429 [Segment V2] refactor SegmentReaderWriterTest and add UT for lazy materialization (#2614) 2019-12-30 21:07:58 +08:00
368bbfd426 Fix linked schema change bug #2610 (#2613) 2019-12-30 15:48:52 +08:00
db698978da Make from_unixtime and date_format function support grayscale upgrade (#2612) 2019-12-30 13:55:23 +08:00
ffea3f8825 [env] Add CREATE_OR_OPEN and rename existing open modes (#2604)
The upcoming patch will use CREATE_OR_OPEN mode.

This patch also moves virtual dtors to the cpp file.


* Move the dtors back to env.h

Generally, whether to place a dtor in an `.h` file (inline) or in a `cpp` file
depends on the trade-off between code expansion and function call overhead.
The code expansion rate is closely related to the number of class members
and the depth of the inheritance hierarchy.

The classes here, `Env`, `ReadableFile`, and `WritableFile`,
have no members and sit at the top of the inheritance hierarchy. For now
I have no clear evidence that making their dtors inline
will cause serious code expansion and more instruction cache misses,
even if thousands of `ReadableFile` objects are continually created
and released at runtime.
2019-12-30 13:51:38 +08:00
f9cf8a1d65 Delete unused variable in the function of recordeTable (#2607) 2019-12-28 15:35:25 +08:00
7afbda803a Fix memory leak when compression fails in ColumnWriter (#2606)
Only the Pages in the linked list can be destructed in the
ColumnWriter dtor, but if something goes wrong, we
return directly, which causes a memory leak.
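A common way to make an early return safe is to hold the object in a `std::unique_ptr` until ownership is handed over. The sketch below illustrates that general technique and is not necessarily how this commit fixes the leak. `Page`, its `live` counter, and `append_page` are hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

struct Page {
    static int live;  // number of Page objects currently alive (demo only)
    std::vector<char> data;
    Page() { ++live; }
    ~Page() { --live; }
};
int Page::live = 0;

// Hypothetical: builds a page and keeps it only if compression succeeded.
bool append_page(std::vector<std::unique_ptr<Page>>& pages, bool compress_ok) {
    auto page = std::make_unique<Page>();
    if (!compress_ok) {
        return false;  // unique_ptr frees the page on this early return
    }
    pages.push_back(std::move(page));  // ownership moves into the container
    return true;
}
```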
2019-12-27 22:31:02 +08:00