Commit Graph

1444 Commits

Author SHA1 Message Date
92d8f6ae78 [Alter] Allow submitting alter jobs when table is unstable
The alter job will wait for the table to become stable before running.
2020-01-18 22:56:37 +08:00
ae018043b0 [Alter] Support replication_num setting for table level (#2737)
Support replication_num setting for table level, so that users no longer need to set replication_num in every alter table add partition statement.

eg:
`alter table tbl set ("default.replication_num" = "2");`
2020-01-18 21:17:22 +08:00
1550401d4b Support param exec_mem_limit for spark-doris-connector (#2775) 2020-01-18 00:14:39 +08:00
c71eefa2ac Add path util (#2747)
Note that the methods in path_util only deal with path processing
and do not involve any file or IO operations.

An upcoming patch will use these utility methods to move operations
such as concatenating directory strings out of the processing logic.
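
As a rough illustration of "path processing without IO", here is a minimal Python sketch. The function name `join_path_segments` is hypothetical and is not taken from path_util; the point is only that the helper manipulates strings and never touches the file system.

```python
def join_path_segments(base: str, relative: str) -> str:
    """Join two path strings with exactly one '/' between them.

    Pure string manipulation: no file or IO operation is performed.
    Hypothetical helper, not the actual path_util API.
    """
    if not base:
        return relative
    if not relative:
        return base
    # Drop duplicate separators at the seam before joining.
    return base.rstrip("/") + "/" + relative.lstrip("/")


print(join_path_segments("/data/doris/", "/tablet/10001"))
```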
2020-01-18 00:05:00 +08:00
a3789ab2af Refine .clang-format (#2791) 2020-01-18 00:00:49 +08:00
23f472903a [Routine Load] Fix a bug that show routine load will throw Unknown Exception
If we connect to a non-master FE and execute `show routine load;`, it may sometimes
throw an Unknown Exception, because some fields in the thrift result are not set.
2020-01-17 20:46:00 +08:00
6365a7d559 [FE Maven] Change maven repository url from http to https (#2786)
Since January 15th, 2020, requests to http://repo1.maven.org/maven2/ return a 501 HTTPS Required status,
so switch the central repository url from http to https.
2020-01-17 16:45:04 +08:00
fc55423032 [SQL] Support Grouping Sets, Rollup and Cube to extend group by statement
Support Grouping Sets, Rollup and Cube to extend the group by statement.
Support the GROUPING SETS syntax:
```
SELECT a, b, SUM( c ) FROM tab1 GROUP BY GROUPING SETS ( (a, b), (a), (b), ( ) );
```
CUBE or ROLLUP, for example:
```
SELECT a, b,c, SUM( d ) FROM tab1 GROUP BY ROLLUP|CUBE(a,b,c)
```
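
In standard SQL, ROLLUP and CUBE are shorthand for particular lists of grouping sets. The Python sketch below only enumerates those sets to make the relationship concrete; the function names are illustrative and this is not how Doris implements the feature.

```python
from itertools import combinations


def rollup_sets(cols):
    """ROLLUP(c1, ..., cn): every prefix of the column list, longest first."""
    return [tuple(cols[:i]) for i in range(len(cols), -1, -1)]


def cube_sets(cols):
    """CUBE(c1, ..., cn): every subset of the column list, largest first."""
    return [
        combo
        for size in range(len(cols), -1, -1)
        for combo in combinations(cols, size)
    ]


print(rollup_sets(["a", "b", "c"]))
print(cube_sets(["a", "b"]))
```

Under these semantics, `CUBE(a, b)` yields the same four sets as the `GROUPING SETS ( (a, b), (a), (b), ( ) )` example above.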

[ADD] support grouping functions in expr like grouping(a) + grouping(b) (#2039)
[FIX] fix analyzer error in window function(#2039)
2020-01-17 16:24:02 +08:00
3b24287251 Support 64 bits integers for BITMAP type (#2772)
Fixes #2771 

Main changes in this CL:
* RoaringBitmap is renamed to BitmapValue and moved into bitmap_value.h
* Roaring64Map is leveraged to support unsigned BIGINT for the BITMAP type
* Two new formats (SINGLE64 and BITMAP64) are introduced for the BITMAP type

So far we have three storage formats for the BITMAP type:

```
EMPTY := TypeCode(0x00)
SINGLE32 := TypeCode(0x01), UInt32LittleEndian
BITMAP32 := TypeCode(0x02), RoaringBitmap(defined by https://github.com/RoaringBitmap/RoaringFormatSpec/)
```

To support BIGINT elements while keeping backward compatibility, two new formats are introduced:

```
SINGLE64 := TypeCode(0x03), UInt64LittleEndian
BITMAP64 := TypeCode(0x04), CustomRoaringBitmap64
```

Please note that SINGLE64/BITMAP64 don't replace SINGLE32/BITMAP32. Doris automatically chooses the smaller type (in terms of space) during serialization. For example, BITMAP32 is preferred over BITMAP64 when the maximum element is <= UINT32_MAX. This also makes a BE rollback possible as long as the user hasn't written any element larger than UINT32_MAX into a bitmap column.
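
The selection rule described above can be sketched in Python as follows. The type codes come from the format tables in this message; the function name `choose_type_code` and the rule that a one-element bitmap uses a SINGLE format are assumptions made for illustration, not the actual BitmapValue code.

```python
UINT32_MAX = 0xFFFFFFFF

# Type codes as listed in the format tables of this commit message.
EMPTY, SINGLE32, BITMAP32, SINGLE64, BITMAP64 = 0x00, 0x01, 0x02, 0x03, 0x04


def choose_type_code(values):
    """Pick the smallest format able to hold the given unsigned integers.

    Assumption: a bitmap with exactly one element is stored as SINGLE32/64.
    """
    values = set(values)
    if not values:
        return EMPTY
    # The 32-bit formats are usable only if every element fits in 32 bits.
    fits_32 = max(values) <= UINT32_MAX
    if len(values) == 1:
        return SINGLE32 if fits_32 else SINGLE64
    return BITMAP32 if fits_32 else BITMAP64


print(hex(choose_type_code([1, 2, 3])))
print(hex(choose_type_code([1, 2**40])))
```

Because data whose elements all fit in 32 bits is always written in one of the pre-existing formats, an older BE can still read it, which is the rollback property mentioned above.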

Another important design decision is that we fork and maintain our own version of Roaring64Map instead of using the one in "roaring/roaring64map.hh". The reasons are

1. RoaringBitmap doesn't define a standard binary format for 64-bit bitmaps. As a result, different implementations of Roaring64Map use different formats. For example, the [C++ version](https://github.com/RoaringBitmap/CRoaring/blob/v0.2.60/cpp/roaring64map.hh#L545) differs from the [Java version](35104c564e/src/main/java/org/roaringbitmap/longlong/Roaring64NavigableMap.java (L1097)). Even for CRoaring, the format may change in future releases. However, Doris requires the serialized format to be stable across versions. Forking is a safe way to achieve this.
2. We may want to make some code changes to Roaring64Map according to our needs. For example, in order to use the BITMAP32 format when the maximum element can be represented in 32 bits, we may want to access the private members of Roaring64Map. Another example is that we may want to further customize and optimize the format for the BITMAP64 case, such as using vint64 instead of uint64 for the map size.
2020-01-17 14:13:38 +08:00
463c0e87ec Replace PowerMock/EasyMock by Jmockit (4/4) (#2784)
This commit replaces the PowerMock/EasyMock in our unit tests. (All)
2020-01-17 14:09:00 +08:00
8df63bc191 [Doc] Add en doc for dynamic partition feature (#2764) 2020-01-16 21:54:26 +08:00
d0e2fc3305 Remove resource_info related members from TaskWorkerPool (#2704)
The `TResourceInfo` was used to help `cgroups` to isolate resources,
but it is no longer used.

In fact, the `TResourceInfo` information is no longer carried in
the requests from FE to BE.
2020-01-16 14:39:08 +08:00
753a7dd73a Replace PowerMock/EasyMock by Jmockit (3/4) 2020-01-16 13:24:43 +08:00
0ddca59d36 Add timestampadd/timestampdiff function (#2725) 2020-01-15 21:47:07 +08:00
8ea5907252 Update arrow's version to 0.15.1 and shaded it in spark-doris-connector (#2769) 2020-01-15 21:08:34 +08:00
9bc306d17c Replace PowerMock/EasyMock by Jmockit (2/4) (#2749) 2020-01-15 20:31:30 +08:00
4496ebb632 [Alter View] Fix bug that alter view operation lost when replaying from image (#2773)
When replaying something, we should use Catalog.getCurrentCatalog() instead
of Catalog.getInstance(); otherwise, we may get the wrong Catalog instance.
2020-01-15 20:04:09 +08:00
7fe6431ac7 Fix delete handler init when schema change (#2767)
Delete handler init failed because some versions were missing. Schema change should return failure when getting a version fails.
2020-01-15 15:42:56 +08:00
54952a24ad Remove and comment some FE code (#2766) 2020-01-15 15:14:52 +08:00
9e54751098 [Snapshot] Modify the prefer snapshot version (#2748)
In this CL, the preferred snapshot version in a snapshot request is defined
in thrift, so that both FE and BE can use this version value.
2020-01-15 15:10:14 +08:00
7768629f08 Add bitmap_contains and bitmap_has_any functions (#2752) 2020-01-15 14:31:44 +08:00
1f0ea2d2e0 Merge pull request #2765 from morningman/routine_load_clean_label
[Routine Load] Fix bug that history routine load jobs are cleaned prematurely
2020-01-15 11:27:33 +08:00
70c7281bf2 [Routine Load] Fix bug that history routine load jobs are cleaned prematurely 2020-01-15 11:18:34 +08:00
e5717efc5a [Insert] Return more info of insert operation (#2718)
Standardize the return results of INSERT operations,
which is convenient for users to use and locate problems.

More details can be found in insert-into-manual.md
2020-01-15 10:39:53 +08:00
9bafcc99f6 Don't balance when Available BE num equals or less than tablet Replica num (#2740) 2020-01-15 10:39:18 +08:00
a36193dfab Support decimal and timestamp type in orc load (#2759) 2020-01-15 07:40:30 +08:00
64b2291347 Allow user to ignore the broken disk (#2755)
Add a BE config `ignore_broken_disk`.
2020-01-14 22:40:43 +08:00
f071d5a307 Support ends_with function (#2746) 2020-01-14 22:37:20 +08:00
ef6cd9ae25 Add files to gitignore (#2753) 2020-01-14 22:29:56 +08:00
e5197eff94 Update the doc of doris to fix some mistakes (#2758) 2020-01-14 22:26:49 +08:00
fafc684e0d [External Table] Fix bug that query external hdfs table throw NPE. (#2756) 2020-01-14 15:44:54 +08:00
1ccd377b33 [Colocate Table] Fix colocate table balance forever (#2744) 2020-01-13 23:13:58 +08:00
273edced77 Replace PowerMock/EasyMock by Jmockit (1/3) (#2732)
This commit replaces PowerMock/EasyMock in our unit tests, but not in all of them.
P.S. The tests relevant to DescribeStmt are ignored until I find a way to fix them.
2020-01-13 21:28:18 +08:00
a99a49a444 Add bitmap_to_string function (#2731)
This CL changes:

1. add functions bitmap_to_string and bitmap_from_string, which
 convert a bitmap to/from a string that contains all bits in the bitmap
2. add function murmur_hash3_32, which computes the murmur hash of
input strings
3. make the cast from float to string use the same logic as the result
returned to the user
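
The round trip in item 1 can be pictured with a small Python sketch. The comma-separated decimal text format and the helper names `bits_to_text` and `text_to_bits` are assumptions for illustration only; the commit message does not specify the exact string layout.

```python
def bits_to_text(bits):
    """Render a bitmap (modelled here as a set of ints) as text.

    Assumed format: ascending decimal values separated by commas.
    """
    return ",".join(str(b) for b in sorted(set(bits)))


def text_to_bits(text):
    """Parse the assumed comma-separated format back into a set of ints."""
    return {int(token) for token in text.split(",") if token.strip()}


encoded = bits_to_text({3, 1, 2})
print(encoded)
print(text_to_bits(encoded) == {1, 2, 3})
```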
2020-01-13 12:31:37 +08:00
4e868252fc Add .clang-format and docs (#2724)
Style inconsistency in the Doris code base is so widespread that it is hard to minimize modifications when reformatting code.
So the aim here is to settle the style rules and tune the config in .clang-format.

Note: I chose clang-format-8.0+ because it supports richer style options.
2020-01-11 20:54:20 +08:00
e00343b6ec Choose tablets in ConsistencyChecker in batch (#2736) 2020-01-11 20:45:06 +08:00
089b358dcd Skip dropped be when choose dest be in TabletScheduler (#2734) 2020-01-11 20:32:26 +08:00
e391fe1e70 [SQL] Ignore the null type when getCmpType (#2730)
In previous versions, if the children of the IN predicate included NULL, all child types would be converted to DOUBLE for calculation.
For example:
select * from t1 where k1 in ('TABLE', NULL);
But children like varchar cannot be converted to double, so the query cannot be executed.
The error is "TABLE is not a number"

In the current version, if NULL exists among the children, it does not take part in the calculation of the compatibility type.
For the above query, the compatibility type is varchar, so 'TABLE' is not converted to double, and the query can be executed.
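
The behaviour change can be modelled with a toy Python function. Type names are plain strings, `get_cmp_type` here is a simplified stand-in rather than the FE method, and the fallback to DOUBLE for mixed types is an assumption used only to contrast with the old behaviour.

```python
def get_cmp_type(child_types):
    """Toy compatibility-type calculation that ignores NULL children."""
    # NULL children no longer take part in the calculation.
    non_null = [t for t in child_types if t != "NULL"]
    if not non_null:
        return "NULL"
    if all(t == non_null[0] for t in non_null):
        return non_null[0]
    # Simplifying assumption: any mix of different types compares as DOUBLE.
    return "DOUBLE"


# k1 in ('TABLE', NULL): the children are a VARCHAR literal and a NULL literal.
print(get_cmp_type(["VARCHAR", "NULL"]))
```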

Also, JDBC converts 'show tables;' to:
```
SELECT
TABLE_SCHEMA AS TABLE_CAT, NULL AS TABLE_SCHEM, TABLE_NAME,
 CASE WHEN TABLE_TYPE='BASE TABLE'
 THEN CASE WHEN TABLE_SCHEMA = 'mysql' OR TABLE_SCHEMA = 'performance_schema'
 THEN 'SYSTEM TABLE' ELSE 'TABLE'END WHEN TABLE_TYPE='TEMPORARY'
 THEN 'LOCAL_TEMPORARY' ELSE TABLE_TYPE END AS TABLE_TYPE, TABLE_COMMENT AS REMARKS, NULL AS TYPE_CAT, NULL AS TYPE_SCHEM, NULL AS TYPE_NAME, NULL AS SELF_REFERENCING_COL_NAME, NULL AS REF_GENERATION
 FROM INFORMATION_SCHEMA.TABLES
 WHERE TABLE_SCHEMA LIKE 'test_db'
 AND TABLE_NAME LIKE '%'
 HAVING TABLE_TYPE IN ('TABLE','VIEW',null,null,null)
 ORDER BY TABLE_TYPE, TABLE_SCHEMA, TABLE_NAME
```
In previous versions, Doris could not return the correct tables to JDBC. It threw the error "'TABLE' is not a number".
After this commit, #2729 is fixed and Doris can return the table schemas through JDBC.
2020-01-11 14:03:50 +08:00
ccaa97a5ac Make bitmap functions accept any expression that returns bitmap (#2728)
This CL makes bitmap_count, bitmap_union, and bitmap_union_count accept any expression whose return type is bitmap as input, so that we can support flexible bitmap expressions such as bitmap_count(bitmap_and(to_bitmap(1), to_bitmap(2))).

This CL also creates separate documentation for each bitmap UDF to conform with other functions.
2020-01-11 14:02:12 +08:00
60dc7c394f Fix rowset state transition bug of release (#2726)
Add on_release to transfer state when release is called. When release is called, the state should transfer from unloading to unloaded, not from loaded.
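
A minimal Python sketch of the intended transition rule follows. The state names loaded, unloading, and unloaded and the `on_release` event come from this message; the class, the `on_load` and `on_close` events, and the error handling are hypothetical and do not mirror the BE code.

```python
from enum import Enum


class RowsetState(Enum):
    UNLOADED = "unloaded"
    LOADED = "loaded"
    UNLOADING = "unloading"


class RowsetStateMachine:
    """Hypothetical model of the rowset state transitions."""

    def __init__(self):
        self.state = RowsetState.UNLOADED

    def _transition(self, expected, target):
        # Reject any transition that does not start from the expected state.
        if self.state is not expected:
            raise ValueError(
                f"cannot move to {target.value} from {self.state.value}"
            )
        self.state = target

    def on_load(self):
        self._transition(RowsetState.UNLOADED, RowsetState.LOADED)

    def on_close(self):
        self._transition(RowsetState.LOADED, RowsetState.UNLOADING)

    def on_release(self):
        # The fix: release moves unloading -> unloaded, never loaded -> unloaded.
        self._transition(RowsetState.UNLOADING, RowsetState.UNLOADED)


machine = RowsetStateMachine()
machine.on_load()
machine.on_close()
machine.on_release()
print(machine.state.value)
```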
2020-01-10 18:29:54 +08:00
3690f3e917 Add rowset state (#2691)
1. add rowset state to rowset
2. add close api to rowset to release resources
issue: #2665
2020-01-10 14:17:57 +08:00
18a11f5663 Convert from arrow to rowbatch (#2723)
For #2722
In our test environment, the Doris cluster used 1 FE and 7 BEs (32C+128G). When using spark-doris-connector to query a table containing 67 columns, it took about 1 hour for the query to return 69 million rows of data. After the improvement, the same query took 2.5 minutes, a significant improvement in query performance.
2020-01-10 14:11:15 +08:00
81be684bae [FE Meta]fix schema change job write edit log error (#2721)
Fix the bug that an edit log is still written when there is no indexChange, and that this edit log will not be read.
2020-01-10 11:12:03 +08:00
4b8f7f9c32 Use cgroups memory limit and cpu cores in container (#2710) 2020-01-10 00:45:50 +08:00
fa4407cf4f Fix bug for cumulative compaction on singleton rowset with multiple segments (#2719)
Rows are scanned incorrectly after cumulative compaction on a singleton rowset.
Suppose there are three records: (1, 1), (2, 2), (3, 3).
When reading (1, 1), this bug makes the returned row (2, 2)
instead of (1, 1).
2020-01-09 21:08:21 +08:00
e7b763309d Skip missing version replica when getQueryableReplicas (#2715) 2020-01-09 17:19:23 +08:00
425b1cf29b Fix port already in use (#2716) 2020-01-09 16:01:17 +08:00
f7cea6dda5 CreateViewStmt/AlterViewStmt support cte and fix bug (#2641)
This commit contains the following changes:
1. Let create/alter view statements support CTE SQL. (Issue #2625 )

e.g.
```
Alter view test_tbl_view (h1, h2)
as
with testTbl_cte (w1, w2) as 
(
    select col1, col2 from testDb.testTbl
)
select w1 as c1, sum(w2) as c2 from testTbl_cte 
where w1 > 10 
group by w1 order by w1
```

2. Fix the bug that view's schema remains unchanged after replaying alter view. (Issue #2624 )
2020-01-08 23:11:38 +08:00
87b266cd11 Add cpp connect sample (#2685) 2020-01-08 23:10:58 +08:00
6bc54ef3f0 [Document] Add dynamic partition docs (#2711) 2020-01-08 23:08:48 +08:00