doris

Author	SHA1	Message	Date
lichaoyong	3f4e18633d	[util] Add Apache License 2.0 to Thread (#2928 )	2020-02-18 15:36:49 +08:00
LingBin	b3c5f0fac7	Remove unneeded headers included in agent-util (#2929 )	2020-02-18 13:18:56 +08:00
kangkaisen	625411bd28	Doris support in memory olap table (#2847 )	2020-02-18 10:45:54 +08:00
worker24h	1f844946e9	Fixbug: Invalid memory address in doris::memory_copy (#2919 ) (#2923 ) Changing a column's schema from char(20) to varchar(20) caused the BE to core dump.	2020-02-17 18:48:38 +08:00
LingBin	feef077520	Some refactors on `TabletManager` (#2918 ) 1. Add some comments to make the code easier to understand; 2. Make the metric `create_tablet_requests_failed` accurate; 3. Some internal methods use naked pointers directly instead of `shared_ptr`; 4. A `using` in a `.h` file is contagious when the file is included by other files, so we should only use it in `.cpp` files; 5. Some formatting changes, such as wrapping lines that are too long; 6. Parameters that need to be modified are passed as pointers instead of references. No functional changes in this patch.	2020-02-17 14:50:29 +08:00
lichaoyong	f20eb12457	[util] Import ThreadPool and Thread from KUDU (#2915 ) Thread pool design point: All tasks submitted directly to the thread pool enter a FIFO queue and are dispatched to a worker thread when one becomes free. Tasks may also be submitted via ThreadPoolTokens. The token wait() and shutdown() functions can then be used to block on logical groups of tasks. A token operates in one of two ExecutionModes, determined at token construction time: 1. SERIAL: submitted tasks are run one at a time. 2. CONCURRENT: submitted tasks may be run in parallel. This isn't unlike tasks submitted without a token, but the logical grouping that tokens impart can be useful when a pool is shared by many contexts (e.g. to safely shut down one context, to derive context-specific metrics, etc.). Tasks submitted without a token or via ExecutionMode::CONCURRENT tokens are processed in FIFO order. On the other hand, ExecutionMode::SERIAL tokens are processed in a round-robin fashion, one task at a time. This prevents them from starving one another. However, tokenless (and CONCURRENT token-based) tasks can starve SERIAL token-based tasks. Thread design point: 1. It is a thin wrapper around pthread that can register itself with the singleton ThreadMgr (a private class implemented in thread.cpp entirely, which tracks all live threads so that they may be monitored via the debug webpages). This class has a limited subset of boost::thread's API. Construction is almost the same, but clients must supply a category and a name for each thread so that they can be identified in the debug web UI. Otherwise, join() is the only supported method from boost::thread. 2. Each Thread object knows its operating system thread ID (TID), which can be used to attach debuggers to specific threads, to retrieve resource-usage statistics from the operating system, and to assign threads to resource control groups. 3. Threads are shared objects, but in a degenerate way. They may only have up to two referents: the caller that created the thread (parent), and the thread itself (child). Moreover, the only two methods to mutate state (join() and the destructor) are constrained: the child may not join() on itself, and the destructor is only run when there's one referent left. These constraints allow us to access thread internals without any locks.	2020-02-17 11:22:09 +08:00
HangyuanLiu	43583e7bd2	Fix orc load bug (#2912 )	2020-02-16 19:14:42 +08:00
kangkaisen	6c33f80544	Add disable_storage_page_cache config (#2890 ) 1. When reading a column data page: for compaction, schema_change, and check_sum, the page cache is not used; for queries, the page cache is used if config::disable_storage_page_cache is false. 2. When reading a column index page: the page cache is used if config::disable_storage_page_cache is false.	2020-02-16 19:13:30 +08:00
lichaoyong	9ee1704859	[util] Import util tools from KUDU (#2905 ) 1. MonoTime/MonoDelta MonoTime: The MonoTime represents a particular point in time, relative to some fixed but unspecified reference point. MonoDelta: The MonoDelta class represents an elapsed duration of time, the delta between two MonoTime instances. 2. CountDownLatch This is a C++ implementation of the Java CountDownLatch	2020-02-14 18:01:16 +08:00
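The CountDownLatch described in the entry above can be illustrated with a minimal sketch. This is not the class imported from Kudu; the class name, method names, and members below are assumptions made for illustration only, built on `std::mutex` and `std::condition_variable`.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Hypothetical count-down latch (illustrative only, not the Kudu/Doris class).
class CountDownLatchSketch {
public:
    explicit CountDownLatchSketch(uint64_t count) : _count(count) {}

    // Decrement the count; wake all waiters once it reaches zero.
    void count_down() {
        std::lock_guard<std::mutex> lock(_mutex);
        if (_count == 0) return;  // already released
        if (--_count == 0) _cond.notify_all();
    }

    // Block the caller until the count reaches zero.
    void wait() {
        std::unique_lock<std::mutex> lock(_mutex);
        _cond.wait(lock, [this] { return _count == 0; });
    }

    // Current count, mainly useful for tests and debugging.
    uint64_t count() {
        std::lock_guard<std::mutex> lock(_mutex);
        return _count;
    }

private:
    std::mutex _mutex;
    std::condition_variable _cond;
    uint64_t _count;
};
```

A typical use would be to construct the latch with the number of worker tasks, have each task call `count_down()` when it finishes, and have the coordinator call `wait()`.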
lichaoyong	09a4d3e50a	[gutil] import scoped_refptr smart pointer from KUDU (#2899 ) scoped_refptr is used to replace std::shared_ptr and is generally faster and smaller. Advantages: (1) it only requires a single allocation, and the ref count is on the same cache line as the object; (2) the pointer only requires 8 bytes (since the ref count is within the object); (3) you can manually increase or decrease reference counts when more control is required; (4) you can convert from a raw pointer back to a scoped_refptr safely without worrying about double freeing; (5) since we control the implementation, we can implement features such as debug builds that capture the stack trace of every referent to help debug leaks. Disadvantages: (1) the referred-to object must inherit from RefCounted; (2) it does not support the weak_ptr use cases.	2020-02-14 13:32:03 +08:00
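The trade-offs listed for scoped_refptr follow from keeping the reference count inside the object (intrusive counting). The sketch below shows the general technique only; `RefCountedSketch`, `IntrusivePtr`, and `Node` are hypothetical names and do not reproduce the gutil implementation.

```cpp
#include <atomic>
#include <cassert>
#include <utility>

// Hypothetical base class: the reference count lives inside the object itself.
class RefCountedSketch {
public:
    void add_ref() const { _refs.fetch_add(1, std::memory_order_relaxed); }
    // Returns true when the last reference was dropped.
    bool release() const { return _refs.fetch_sub(1, std::memory_order_acq_rel) == 1; }
    int ref_count() const { return _refs.load(std::memory_order_relaxed); }

protected:
    RefCountedSketch() = default;
    virtual ~RefCountedSketch() = default;

private:
    mutable std::atomic<int> _refs{0};
};

// Hypothetical smart pointer: holds only a raw pointer, so it is pointer-sized.
template <typename T>
class IntrusivePtr {
public:
    IntrusivePtr() = default;
    explicit IntrusivePtr(T* p) : _ptr(p) { if (_ptr) _ptr->add_ref(); }
    IntrusivePtr(const IntrusivePtr& o) : _ptr(o._ptr) { if (_ptr) _ptr->add_ref(); }
    IntrusivePtr(IntrusivePtr&& o) noexcept : _ptr(o._ptr) { o._ptr = nullptr; }
    IntrusivePtr& operator=(IntrusivePtr o) noexcept { std::swap(_ptr, o._ptr); return *this; }
    ~IntrusivePtr() { if (_ptr && _ptr->release()) delete _ptr; }

    T* get() const { return _ptr; }
    T* operator->() const { return _ptr; }

private:
    T* _ptr = nullptr;
};

// Example payload type; it must inherit from the ref-counted base.
struct Node : public RefCountedSketch {
    int value = 0;
};
```

Because the count is stored in the object, wrapping the same raw pointer a second time simply increments the existing count, which is why converting a raw pointer back to a smart pointer cannot cause a double free here.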
LingBin	d2625a26aa	[env] Add env-util class (#2898 ) The code submitted later will use this utility class. Currently only factory methods for various file types are provided. In the future, tool methods that are common to all Env types can be added here.	2020-02-14 10:04:51 +08:00
令狐少侠	fd492e3b6f	[Doris on ES] Support escape character (#2865 )	2020-02-13 11:32:48 +08:00
LingBin	3c539aac54	[Refactor] Some tiny refactor on streaming-load related code (#2891 ) Mainly contains the following modifications: 1. Use `std::unique_ptr` to replace some naked pointers; 2. Change some member methods into local static functions; 3. Make private some methods that do not need to be public; 4. Some formatting changes, such as wrapping lines that are too long; 5. Remove some useless variables; 6. Add or modify some comments for easier understanding. No functional changes in this patch.	2020-02-13 10:42:52 +08:00
yangzhg	3e160aeb66	[GroupingSet] Fix a bug when a grouping set item does not contain all columns (#2877 ) Fix a bug where a grouping set item that does not contain all columns produces wrong values. Also fix the grouping function check, which did not work in the GROUP BY clause.	2020-02-12 21:50:12 +08:00
LingBin	e9ff40f07f	Add `sync_dir` interface to Env (#2884 ) When we need to ensure that a newly created file is fully synchronized to disk, we should also call `fsync()` on its parent directory, that is, the directory containing the newly created file. In other words, in this situation we should call `fsync()` on both the newly created file and its parent directory. Unfortunately, Doris currently never fsyncs directories in any scenario. This patch first adds the `sync_dir()` interface, laying the groundwork for future fixes. It also removes the unneeded private method `dir_exists()`.	2020-02-12 13:55:17 +08:00
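Syncing a directory on POSIX systems amounts to opening the directory and calling `fsync()` on its descriptor. The helper below is a minimal sketch of that idea; its name, signature, and errno-style return value are assumptions and do not reflect the actual `Env::sync_dir()` interface.

```cpp
#include <cassert>
#include <cerrno>
#include <string>

#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper (not the Doris Env API).
// Returns 0 on success, or the errno value describing the failure.
int sync_dir_sketch(const std::string& dirname) {
    // O_DIRECTORY makes open() fail if the path is not a directory.
    int fd = ::open(dirname.c_str(), O_RDONLY | O_DIRECTORY);
    if (fd < 0) return errno;
    // Flush the directory entry metadata (e.g. a newly created file name).
    int rc = (::fsync(fd) == 0) ? 0 : errno;
    ::close(fd);
    return rc;
}
```

After creating and fsyncing a new file, a caller would invoke this helper on the file's parent directory so that the directory entry itself is durable.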
LingBin	5440e19d01	Improve the triggering strategy of BE report (#2881 ) Currently, the report from BE to FE is completed in the background threads of `AgentServer` (`report_tablet_thread` and `report_disk_stat_thread`). These two threads will sleep and be in a standby state after each report, if there is any need to report immediately, they will be notified and wake up immediately to report. For example, when background thread (`disk_monitor_thread`) in `StorageEngine` finds some tablets were deleted, it will notify `AgentServer` to trigger a report immediately. In the current implementation, in order to report ASAP, a local variable (`_is_drop_tables`) and two other flags are used to record whether reporting is needed, and then `StorageEngine::disk_monitor_thread` checks the value of this variable every time it runs, to determine whether it needs to be triggered Reporting. This is actually superfluous, and it may result in untimely notifications, as shown below: ``` (thread_1) (thread_2) disk-monitor disk-stat-reporter \| \| \| reporting \| \| notify_1 \| \| \| \| wait_for_notify(will wait until timeout or next notification) \| \| V V ``` When `report_tablet_thread` has not started waiting, `StorageEngine::disk_monitor_thread` triggers a notification, so this notification will not be received by `report_tablet_thread`, resulting in the BE not reporting to the FE until the lock times out or the next round of `disk_monitor_thread` detection. This change restructures the triggering implementation, and solves the above problem. This change also changes some methods(that do not need to be public) to private.	2020-02-11 20:38:44 +08:00
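The lost-notification problem in the entry above is commonly avoided by recording the notification in a flag protected by the same mutex as the condition variable, so that a notification arriving before the waiter starts waiting is still observed. The sketch below shows that general technique; it is not taken from the patch, and the class and method names are hypothetical.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical trigger shared by a monitor thread and a reporter thread.
class ReportTriggerSketch {
public:
    // Called by the monitor thread: remember the request, then wake the reporter.
    void notify() {
        {
            std::lock_guard<std::mutex> lock(_mutex);
            _pending = true;
        }
        _cond.notify_one();
    }

    // Called by the reporter thread between reports.
    // Returns true if a notification was pending or arrived, false on timeout.
    bool wait_for_notify(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(_mutex);
        bool notified = _cond.wait_for(lock, timeout, [this] { return _pending; });
        _pending = false;  // consume the notification
        return notified;
    }

private:
    std::mutex _mutex;
    std::condition_variable _cond;
    bool _pending = false;
};
```

With this shape, a `notify()` issued while the reporter is still busy reporting makes the next `wait_for_notify()` return immediately instead of sleeping until the timeout.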
HangyuanLiu	3a8e783444	Compatible with python3 in build (#2876 )	2020-02-10 21:50:42 +08:00
LingBin	4e151b1551	Remove boost exception when parse store path (#2861 )	2020-02-10 17:50:52 +08:00
LingBin	c89d0a090c	Fix bug that _min_percentage_of_error_disk was not initialized (#2867 ) In StorageEngine, the variable _min_percentage_of_error_disk was not initialized (so it defaults to 0), which causes the process to exit whenever one disk fails. What we expect is for the process to exit only when the number of failed disks reaches a certain percentage. Also, this variable should mean the maximum percentage of error disks allowed, not the minimum, so the configuration name is changed to max_percentage_of_error_disk.	2020-02-10 16:58:24 +08:00
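The effect of a zero threshold can be seen in a small sketch of a percentage check. The function name and the exact comparison are assumptions for illustration; the real check in StorageEngine may differ.

```cpp
#include <cassert>

// Hypothetical check: true when the share of failed disks exceeds the allowed maximum.
// `max_percentage_of_error_disk` is expressed as a whole-number percentage (0-100).
bool too_many_error_disks(int failed_disks, int total_disks, int max_percentage_of_error_disk) {
    if (total_disks <= 0) return false;  // nothing to evaluate
    // Compare failed/total > max/100 without floating point.
    return failed_disks * 100 > total_disks * max_percentage_of_error_disk;
}
```

With a threshold of 0, a single failed disk out of ten already exceeds the limit, which matches the behavior described above; with a threshold of 50, the process would tolerate up to half of the disks failing.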
Dayue Gao	7037754978	Fix a bug that TabletsChannel may be written after cancel (#2870 ) TabletsChannel may be written after cancellation, leading to a core dump at DeltaWriter::write. We should check the state of TabletsChannel at the beginning of each operation.	2020-02-10 14:49:00 +08:00
LingBin	77805e85d2	Fix lock type when clear trash (#2868 ) In `TabletManager::start_trash_sweep`, the modification of `_tablet_map` should be protected by the write lock of `_tablet_map_lock`.	2020-02-10 13:14:17 +08:00
yangzhg	502fa2eb50	[GroupingSet] Fix core when using grouping sets in large data (#2858 ) The memory size allocated for dst_tuples was wrong.	2020-02-07 21:40:29 +08:00
kangkaisen	e7817053cc	[Utils] ParseUtil::parse_mem_spec support K and T suffix (#2854 )	2020-02-07 09:31:35 +08:00
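Suffix handling of this kind can be sketched as follows. This is a simplified, hypothetical parser and not `ParseUtil::parse_mem_spec` itself: the function name, the `-1` error convention, the treatment of a missing suffix as bytes, and the absence of percentage and overflow handling are all assumptions.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical simplified parser: "<digits>[B|K|M|G|T]" (case-insensitive) -> bytes.
// Returns -1 on malformed input. No overflow checking is performed.
int64_t parse_mem_spec_sketch(const std::string& spec) {
    if (spec.empty()) return -1;
    int64_t multiplier = 1;
    size_t digits_len = spec.size();
    char last = static_cast<char>(std::toupper(static_cast<unsigned char>(spec.back())));
    switch (last) {
    case 'T': multiplier = int64_t{1} << 40; --digits_len; break;
    case 'G': multiplier = int64_t{1} << 30; --digits_len; break;
    case 'M': multiplier = int64_t{1} << 20; --digits_len; break;
    case 'K': multiplier = int64_t{1} << 10; --digits_len; break;
    case 'B': --digits_len; break;
    default: break;  // no suffix: assume the value is already in bytes
    }
    if (digits_len == 0) return -1;  // a suffix with no number
    int64_t value = 0;
    for (size_t i = 0; i < digits_len; ++i) {
        if (!std::isdigit(static_cast<unsigned char>(spec[i]))) return -1;
        value = value * 10 + (spec[i] - '0');
    }
    return value * multiplier;
}
```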
Yunfeng,Wu	b35e8153c0	[Doris on Es] Fix lte and gte error expression (#2851 ) LE should be LTE, and GE should be GTE.	2020-02-06 20:52:14 +08:00
Mingyu Chen	f77cfcdb61	[Compaction] Avoid unnecessary compaction (#2839 ) It is not necessary to perform compaction in the following cases: 1. A tablet has only 2 rowsets, with versions [0-1] and [2-x]. In this case, there is no need to perform base compaction because the [0-1] version is an empty version. Some tables are partitioned by day, and each partition loads only one batch of data each day, so a large number of tablets with rowsets [0-1][2-2] will appear. These tablets do not need base compaction. 2. The initial value of the `last successful execution time of compaction` is 0, so the first check of the time elapsed since the last successful compaction always meets the condition to trigger cumulative compaction.	2020-02-06 16:40:38 +08:00
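Case 1 above can be expressed as a small predicate. The struct and function names are hypothetical, and the sketch assumes the rowset list is already sorted by version; it is not the check used in the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical version range [first-second] of a rowset.
struct RowsetVersionSketch {
    int64_t first;
    int64_t second;
};

// True when the tablet has exactly two rowsets, [0-1] followed by [2-x].
// Assumes `rowsets` is sorted by version.
bool can_skip_base_compaction(const std::vector<RowsetVersionSketch>& rowsets) {
    return rowsets.size() == 2 &&
           rowsets[0].first == 0 && rowsets[0].second == 1 &&
           rowsets[1].first == 2;
}
```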
LingBin	14c772013b	Fix removing tablet bug from partition_map in TabletManager (#2842 ) When using an iterator of _tablet_map.tablet_arr(`std::list`) to remove a tablet, we should first remove tablet from _partition_map to avoid the iterator becoming invalid.	2020-02-06 09:57:12 +08:00
LingBin	e991b1300f	[Code Refactor] Refactor AgentServer to make it less error-prone and more readable (#2831 ) In `AgentServer`, each task type needs to be processed separately, which leads to very long code that is hard to read and in which errors are not easy to detect (for example, the processing of some task type may be missing, or the correspondence between task types and handlers may be wrong). Fortunately, the code for each task_type is very similar, so this is a good case for a `MACRO`, which can greatly reduce the repeated code and solve the above problems. This patch also fixes two small bugs: 1. The `_topic_subscriber` member was not released in the dtor; 2. In `submit_tasks()`, the `status_code` was not reset before each task was processed, resulting in wrong judgment. No functional changes in this patch.	2020-02-06 09:56:00 +08:00
ZHAO Chun	25a6d6abbe	Make cmake and maven configurable (#2837 )	2020-02-05 23:04:29 +08:00
LingBin	ee5323a6a0	[Code Refactor]Improve initialization flow of Schema (#2833 ) When constructing `Schema` objects, two similar `init` functions need to be called, and the call order is implicitly required, which is easy to misuse. At the same time, some of the existing comments are missing or out of date, which can be misleading. This patch unifies the initialization logic of `Schema`. No functional changes in this patch.	2020-02-05 11:48:54 +08:00
kangpinghuang	a27e89065b	Add file cache for v2 (#2782 ) Add a file descriptor cache for segment v2 to solve the problem of too many open files.	2020-02-04 00:16:01 +08:00
Lijia Liu	99ad56d1bf	Support bitmap index for more type (#2630 ) For #2589 1. date(uint24_t)/datetime(int64_t)/largeint(int128_t) use frame of reference code as dict. 2. decimal(decimal12_t) also uses frame of reference code as dict. 3. float/double use bitshuffle code as dict.	2020-01-31 21:09:29 +08:00
Lishi	89c7234c1c	Support starts_with (str, prefix) function (#2813 ) Support starts_with function	2020-01-21 14:09:08 +08:00
HangyuanLiu	64e99f29e6	Fix parquet arrow read batch bug (#2812 ) Fix parquet arrow read batch bug #2811 The original code determined the number of rows in a batch from the number of rows in the parquet RowGroup, but a batch now takes 65535 rows. So when a parquet RowGroup has more than 65535 rows, the number of batches does not match the number of RowGroups. The code uses the field "_current_line_of_group" as an array position, which can index past the end of the array and crash the BE.	2020-01-21 10:57:56 +08:00
LingBin	7c4149cf27	Improve comparison and printing of Version (#2796 ) * Improve comparison and printing of Version There are two members in `Version`: `first` and `second`. There are many places where we need to print one `Version` object or compare two `Version` objects, but in the current code these two members are accessed directly, which makes the code very tedious. This patch mainly does the following: 1. Adds an overload of `operator<<()` for `Version`, so we can directly print a `Version` object; 2. Adds the `contains()` method to determine whether one version contains another; 3. Uses `operator==()` to determine if two `Version` objects are equal. Because there are too many places that need to be modified, some code that accesses the members directly is still left and will be modified later. This patch also removes some unnecessary header file references. No functional changes in this patch.	2020-01-19 18:04:28 +08:00
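The three additions described above can be sketched roughly as below. The member names `first` and `second` come from the commit message; the struct name, the containment rule, and the `[first-second]` print format are assumptions and may differ from the actual Doris code.

```cpp
#include <cassert>
#include <cstdint>
#include <ostream>
#include <sstream>

// Hypothetical stand-in for the `Version` type.
struct VersionSketch {
    int64_t first;
    int64_t second;

    bool operator==(const VersionSketch& rhs) const {
        return first == rhs.first && second == rhs.second;
    }

    // True when the range [first, second] fully covers `other`.
    bool contains(const VersionSketch& other) const {
        return first <= other.first && second >= other.second;
    }
};

// Print a version as "[first-second]" (format assumed for illustration).
inline std::ostream& operator<<(std::ostream& os, const VersionSketch& v) {
    return os << "[" << v.first << "-" << v.second << "]";
}
```

Call sites can then write `LOG(INFO) << version` style code or `a.contains(b)` instead of spelling out comparisons on `first` and `second` each time.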
Youngwb	1550401d4b	Support param exec_mem_limit for spark-doris-connector (#2775 )	2020-01-18 00:14:39 +08:00
LingBin	c71eefa2ac	Add path util (#2747 ) Note that the methods in path_util are only related to path processing and do not involve any file or IO operations. Upcoming patches will use these util methods to extract operations such as concatenation of directory strings from the processing logic.	2020-01-18 00:05:00 +08:00
yangzhg	fc55423032	[SQL] Support Grouping Sets, Rollup and Cube to extend group by statement Support Grouping Sets, Rollup and Cube to extend group by statement support GROUPING SETS syntax ``` SELECT a, b, SUM( c ) FROM tab1 GROUP BY GROUPING SETS ( (a, b), (a), (b), ( ) ); ``` cube or rollup like ``` SELECT a, b,c, SUM( d ) FROM tab1 GROUP BY ROLLUP\|CUBE(a,b,c) ``` [ADD] support grouping functions in expr like grouping(a) + grouping(b) (#2039) [FIX] fix analyzer error in window function(#2039)	2020-01-17 16:24:02 +08:00
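In standard SQL, `ROLLUP(a, b, c)` is shorthand for the grouping sets `(a, b, c), (a, b), (a), ()`, and `CUBE(a, b, c)` is shorthand for all 2^3 subsets of the columns. The sketch below enumerates those sets; the function names are hypothetical and this is not how the Doris analyzer is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using GroupingSet = std::vector<std::string>;

// ROLLUP: every prefix of the column list, from the full list down to the empty set.
std::vector<GroupingSet> expand_rollup(const std::vector<std::string>& cols) {
    std::vector<GroupingSet> sets;
    for (size_t n = cols.size() + 1; n-- > 0;) {
        sets.emplace_back(cols.begin(), cols.begin() + static_cast<std::ptrdiff_t>(n));
    }
    return sets;
}

// CUBE: every subset of the column list (2^n sets), full set first, empty set last.
std::vector<GroupingSet> expand_cube(const std::vector<std::string>& cols) {
    std::vector<GroupingSet> sets;
    const size_t total = size_t{1} << cols.size();
    for (size_t mask = total; mask-- > 0;) {
        GroupingSet s;
        for (size_t i = 0; i < cols.size(); ++i) {
            // The highest bit of the mask corresponds to the first column.
            if (mask & (size_t{1} << (cols.size() - 1 - i))) s.push_back(cols[i]);
        }
        sets.push_back(s);
    }
    return sets;
}
```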
Dayue Gao	3b24287251	Support 64 bits integers for BITMAP type (#2772 ) Fixes #2771 Main changes in this CL * RoaringBitmap is renamed to BitmapValue and moved into bitmap_value.h * leveraging Roaring64Map to support unsigned BIGINT for BITMAP type * introduces two new formats (SINGLE64 and BITMAP64) for BITMAP type So far we have three storage formats for BITMAP type ``` EMPTY := TypeCode(0x00) SINGLE32 := TypeCode(0x01), UInt32LittleEndian BITMAP32 := TypeCode(0x02), RoaringBitmap(defined by https://github.com/RoaringBitmap/RoaringFormatSpec/) ``` In order to support BIGINT elements and keep backward compatibility, introduce two new formats ``` SINGLE64 := TypeCode(0x03), UInt64LittleEndian BITMAP64 := TypeCode(0x04), CustomRoaringBitmap64 ``` Please note that SINGLE64/BITMAP64 don't replace SINGLE32/BITMAP32. Doris will choose the smaller (in terms of space) type automatically during serialization. For example, BITMAP32 is preferred over BITMAP64 when the maximum element is <= UINT32_MAX. This also makes BE rollback possible as long as the user didn't write elements larger than UINT32_MAX into a bitmap column. Another important design decision is that we fork and maintain our own version of Roaring64Map instead of using the one in "roaring/roaring64map.hh". The reasons are 1. RoaringBitmap doesn't define a standard for the binary format of 64-bit bitmaps. As a result, different implementations of Roaring64Map use different formats. For example the [C++ version](https://github.com/RoaringBitmap/CRoaring/blob/v0.2.60/cpp/roaring64map.hh#L545) is different from the [Java version](`35104c564e/src/main/java/org/roaringbitmap/longlong/Roaring64NavigableMap.java (L1097)`). Even for CRoaring, the format may change in future releases. However, Doris requires the serialized format to be stable across versions. Forking is a safe way to achieve this. 2. We may want to make some code changes to Roaring64Map according to our needs. For example, in order to use the BITMAP32 format when the maximum element can be represented in 32 bits, we may want to access the private members of Roaring64Map. Another example is that we may want to further customize and optimize the format for the BITMAP64 case, such as using vint64 instead of uint64 for the map size.	2020-01-17 14:13:38 +08:00
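The rule "choose the smaller format automatically" can be sketched as a function of the bitmap's cardinality and maximum element. The type codes are taken from the commit message; the enum and function names and the exact selection logic are assumptions, not the BitmapValue implementation.

```cpp
#include <cassert>
#include <cstdint>

// Type codes as listed in the commit message.
enum class BitmapTypeCode : uint8_t {
    EMPTY = 0x00,
    SINGLE32 = 0x01,
    BITMAP32 = 0x02,
    SINGLE64 = 0x03,
    BITMAP64 = 0x04,
};

// Hypothetical selection rule: prefer the 32-bit formats whenever every element fits in 32 bits.
BitmapTypeCode choose_type_code(uint64_t cardinality, uint64_t max_element) {
    if (cardinality == 0) return BitmapTypeCode::EMPTY;
    const bool fits_in_32_bits = max_element <= UINT32_MAX;
    if (cardinality == 1) {
        return fits_in_32_bits ? BitmapTypeCode::SINGLE32 : BitmapTypeCode::SINGLE64;
    }
    return fits_in_32_bits ? BitmapTypeCode::BITMAP32 : BitmapTypeCode::BITMAP64;
}
```

Under this rule, data that never contains an element above UINT32_MAX is always written in the older 32-bit formats, which is what keeps a BE rollback possible.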
LingBin	d0e2fc3305	Remove resource_info related members from TaskWorkerPool (#2704 ) The `TResourceInfo` was used to help `cgroups` isolate resources, but it is no longer used. In fact, the `TResourceInfo` information is no longer carried in the requests from FE to BE.	2020-01-16 14:39:08 +08:00
HangyuanLiu	0ddca59d36	Add timestampadd/timestampdiff function (#2725 )	2020-01-15 21:47:07 +08:00
kangpinghuang	7fe6431ac7	Fix delete handler init when schema change (#2767 ) Delete handler init failed because some versions were missing. Schema change should return failure when getting a version fails.	2020-01-15 15:42:56 +08:00
Mingyu Chen	9e54751098	[Snapshot] Modify the prefer snapshot version (#2748 ) In this CL, prefer snapshot version in snapshot request is defined in thrift. So that both FE and BE can use this version value.	2020-01-15 15:10:14 +08:00
DanyBin	7768629f08	Add bitmap_contains and bitmap_has_any functions (#2752 )	2020-01-15 14:31:44 +08:00
HangyuanLiu	a36193dfab	Support decimal and timestamp type in orc load (#2759 )	2020-01-15 07:40:30 +08:00
kangkaisen	64b2291347	Allow user to ignore the broken disk (#2755 ) Add a BE config `ignore_broken_disk`.	2020-01-14 22:40:43 +08:00
frwrdt	f071d5a307	Support ends_with function (#2746 )	2020-01-14 22:37:20 +08:00
ZHAO Chun	a99a49a444	Add bitmap_to_string function (#2731 ) This CL changes: 1. add functions bitmap_to_string and bitmap_from_string, which convert a bitmap to/from a string that contains all bits in the bitmap 2. add function murmur_hash3_32, which computes the murmur hash of input strings 3. make the function cast float to string the same with user result logic	2020-01-13 12:31:37 +08:00
kangpinghuang	60dc7c394f	Fix rowset state transition bug of release (#2726 ) Add on_release to transfer state when release is called. When release is called, the state should transfer from unloading to unloaded, not from loaded.	2020-01-10 18:29:54 +08:00
kangpinghuang	3690f3e917	Add rowset state (#2691 ) 1. add rowset state to rowset 2. add close api to rowset to release resources issue: #2665	2020-01-10 14:17:57 +08:00
yangzhg	4b8f7f9c32	Use cgroups memory limit and cpu cores in container (#2710 )	2020-01-10 00:45:50 +08:00

725 Commits