doris

Author	SHA1	Message	Date
Xinyi Zou	97fcad76f8	[enhancement](memtracker) Improve readability (#15716 )	2023-01-16 16:30:35 +08:00
Pxl	b727033906	[Chore](build) enable -Wextra and remove some -Wno (#15760 ) enable -Wextra and remove some -Wno	2023-01-15 10:40:35 +08:00
pengxiangyu	58c520dbfd	[Feature](remote) Cooldown cold data to object storage only one replica (#15832 )	2023-01-14 23:58:00 +08:00
plat1ko	ad68764977	[enhancement](tablet) Unify redundant `create_rowset_writer` methods (#15519 ) * Remove redundant create_rowset_writer methods * Set resource id when setting FS in rowset meta * fix * fix ut	2022-12-30 22:57:12 +08:00
AlexYue	ffef81a6ab	[feature](BE)pad missed version with empty rowset (#15030 ) If all replicas of one tablet are broken, user can use this http api to pad the missed version with empty rowset.	2022-12-29 11:20:44 +08:00
spaces-x	a22ee89431	[Enhancement](jemalloc):support heap dump by http request at runtime (#15429 )	2022-12-28 20:10:50 +08:00
Xin Liao	bf71943605	[feature](load) stream load trim double quotes for csv (#15241 )	2022-12-26 11:45:54 +08:00
Tiewei Fang	ec055e1acb	[feature](new file reader) Integrate new file reader (#15175 )	2022-12-26 08:55:52 +08:00
Zhengguo Yang	a98636a970	[bugfix](from_unixtime) fix timezone not work for from_unixtime (#15298 ) * [bugfix](from_unixtime) fix timezone not work for from_unixtime	2022-12-23 19:05:09 +08:00
Pxl	1b07e3e18b	[Chore](refactor) some modify for pass c++20 standard (#15042 ) some modify for pass c++20 standard	2022-12-17 14:41:07 +08:00
Mingyu Chen	0e1e5a802b	[config](load) enable new load scan node by default (#14808 ) Set FE `enable_new_load_scan_node` to true by default. So that all load tasks(broker load, stream load, routine load, insert into) will use FileScanNode instead of BrokerScanNode to read data 1. Support loading parquet file in stream load with new load scan node. 2. Fix bug that new parquet reader can not read column without logical or converted type. 3. Change jsonb parser function to "jsonb_parse_error_to_null" So that if the input string is not a valid json string, it will return null for jsonb column in load task.	2022-12-16 09:41:43 +08:00
plat1ko	f3aea7f0f0	[Enhancement](status) Unify error code and enable customed err msg for BE internal errors (#14744 )	2022-12-11 23:33:18 +08:00
Pxl	82da071b45	[Chore](format) update clang-format version to 15 (#13036 ) update clang-format version to 15	2022-11-29 14:46:10 +08:00
Xinyi Zou	21416f9947	[enhancement](memory) Support Jemalloc metrics and default allocator changed to Jemalloc (#14384 )	2022-11-18 21:02:54 +08:00
Xinyi Zou	dd11d5c0a5	[enhancement](memory) Support try catch bad alloc (#14135 )	2022-11-13 11:22:56 +08:00
xy720	035657c5a1	[typo](comment) Fix a lot of spell errors in be comments (#14208 ) fix typos.	2022-11-12 16:06:15 +08:00
Xinyi Zou	0b945fe361	[enhancement](memtracker) Refactor mem tracker hierarchy (#13585 ) mem tracker can be logically divided into 4 layers: 1)process 2)type 3)query/load/compation task etc. 4)exec node etc. type includes enum Type { GLOBAL = 0, // Life cycle is the same as the process, e.g. Cache and default Orphan QUERY = 1, // Count the memory consumption of all Query tasks. LOAD = 2, // Count the memory consumption of all Load tasks. COMPACTION = 3, // Count the memory consumption of all Base and Cumulative tasks. SCHEMA_CHANGE = 4, // Count the memory consumption of all SchemaChange tasks. CLONE = 5, // Count the memory consumption of all EngineCloneTask. Note: Memory that does not contain make/release snapshots. BATCHLOAD = 6, // Count the memory consumption of all EngineBatchLoadTask. CONSISTENCY = 7 // Count the memory consumption of all EngineChecksumTask. } Object pointers are no longer saved between each layer, and the values of process and each type are periodically aggregated. other fix: In [fix](memtracker) Fix transmit_tracker null pointer because phamp is not thread safe #13528, I tried to separate the memory that was manually abandoned in the query from the orphan mem tracker. But in the actual test, the accuracy of this part of the memory cannot be guaranteed, so put it back to the orphan mem tracker again.	2022-11-08 09:52:33 +08:00
jiafeng.zhang	a19e6881c7	[chore](be web ui)upgrade jquery version to 3.6.0 (#13942 ) * upgrade jquery version to 3.6.0 * update license dist	2022-11-04 16:20:17 +08:00
Yongqiang YANG	54545c6446	[improvement](config) enlarge default value of create_table_timeout and remove disable_stream_load_2pc (#13520 ) Users do not need to set create_table_timeout, it is a ddl command and when encounter a timeout event users will set a lager timeout and retry. Stream load 2pc is used by default in flink connector, so we should not disable it by config, the config item is useless.	2022-10-24 11:51:18 +08:00
yixiutt	6d322f85ac	[improvement](compaction) delete num based compaction policy (#13409 ) Co-authored-by: yixiutt <yixiu@selectdb.com>	2022-10-18 16:13:28 +08:00
Mingyu Chen	dbf71ed3be	[feature-wip](new-scan) Support stream load with csv in new scan framework (#13354 ) 1. Refactor the file reader creation in FileFactory, for simplicity. Previously, FileFactory had too many `create_file_reader` interfaces. Now unified into two categories: the interface used by the previous BrokerScanNode, and the interface used by the new FileScanNode. And separate the creation methods of readers that read `StreamLoadPipe` and other readers that read files. 2. Modify the StreamLoadPlanner on FE side to support using ExternalFileScanNode 3. Now for generic reader, the file reader will be created inside the reader, not passed from the outside. 4. Add some test cases for csv stream load, the behavior is same as the old broker scanner.	2022-10-17 23:33:41 +08:00
pengxiangyu	af7b6524f2	add hide config to hide config in webserver for safety. (#13255 )	2022-10-12 10:27:09 +08:00
yixiutt	3dc4dc6d43	[compaction](http_action) enable be run manual compaction concurrently (#13219 ) In some case, we need to run manual compaction via http interface concurrently, so we remove the mutex and tablet's compaction lock is enough to prevent concurrent compaction in tablet. Co-authored-by: yixiutt <yixiu@selectdb.com>	2022-10-10 08:33:18 +08:00
xueweizhang	70ab9cb43e	[feature](http) refactor version info and add new http api for get version info (#12513 ) Refactor version info and add new http api for get version info	2022-09-22 10:53:04 +08:00
Xinyi Zou	3bb042e45c	[fix](memtracker) Process physical mem check does not include tc/jemalloc allocator cache (#12688 ) tcmalloc/jemalloc allocator cache does not participate in the mem check as part of the process physical memory. because new/malloc will trigger mem hook when using tcmalloc/jemalloc allocator cache, but it may not actually alloc physical memory, which is not expected in mem hook fail. in addition: The value of tcmalloc/jemalloc allocator cache is used as a mem tracker, the parent is the process mem tracker, which is updated every 1s. Modify the process default mem_limit to 90%. expect mem tracker to effectively limit the memory usage of the process.	2022-09-17 11:31:01 +08:00
Stalary	87439e227e	[Enhancement](DOE): Doe support object/nested use string (#12401 ) * MOD: doe support object/nested use string	2022-09-13 09:59:48 +08:00
zhannngchen	38937c15d7	[typo](streamload) fix typo and remove useless method declaration #12343	2022-09-05 19:16:36 +08:00
Mingyu Chen	22430cd7bb	[feature](stmt) add ADMIN COPY TABLET stmt for local debug (#12176 ) Add a new stmt ADMIN COPY TABLET for easy copy a tablet to local env to reproduce problem. See document for more details.	2022-08-31 09:06:49 +08:00
Pxl	67e94d2aea	[Enhancement](compaction) add compaction use time count (#12141 )	2022-08-30 09:18:02 +08:00
yixiutt	1b0b5b5f09	[Enhancement](load) add hidden_columns in stream load param (#11625 ) Stream load will ignore invisible columns if no http header columns specified, but in some case user cannot get all columns if columns changed frequently。 Add a hidden_columns header to support hidden columns import。User can set hidden_columns such as __DORIS_DELETE_SIGN__ and add this column in stream load data so we can delete this line. For example: curl -u root -v --location-trusted -H "hidden_columns: __DORIS_DELETE_SIGN__" -H "format: json" -H "strip_outer_array: true" -H "jsonpaths: [\"$.id\", \"$.name\",\"$.__DORIS_DELETE_SIGN__\"]" -T 1.json http://{beip}:{be_port}/api/test/test1/_stream_load Co-authored-by: yixiutt <yixiu@selectdb.com>	2022-08-19 14:57:11 +08:00
Xinyi Zou	7d836cf0c7	[fix](memtracker) Fix flush memtable to reduce load channel mem not executed (#11771 ) The memory value automatically tracked by the tcmalloc hook in the DeltaWriter is smaller than the value recorded manually in the memtable, because the first 4096-byte Chunk requested by each MemPool when the memtable is initialized is not tracked to the DeltaWriter by the hook. The values of the two are not equal, causing the mem_consumption() == _mem_table->memory_usage branch judgment to fail.	2022-08-16 14:30:45 +08:00
weizuo93	838fdc1354	[Bug](httpserver) Fix bug that http server should not be stoped in destructor if it not running Co-authored-by: weizuo <weizuo@xiaomi.com>	2022-08-03 19:44:46 +08:00
weizuo93	f730a048b1	[feature-wip](load) Support single replica load (#10298 ) During load process, the same operation are performed on all replicas such as sort and aggregation, which are resource-intensive. Concurrent data load would consume much CPU and memory resources. It's better to perform write process (writing data into MemTable and then data flush) on single replica and synchronize data files to other replicas before transaction finished.	2022-08-02 11:44:18 +08:00
Mingyu Chen	abbf75d302	[doc][refactor](metrics) Reorganize FE and BE metrics and add document (#11307 )	2022-08-02 11:34:06 +08:00
weizuo93	5c1cd058f2	[Feature] Add interface to check tablet segment lost (#10711 ) Co-authored-by: weizuo <weizuo@xiaomi.com>	2022-08-02 09:40:04 +08:00
Xinyi Zou	b6bdb3bdbc	[fix] (mem tracker) Fix MemTracker accuracy (#11190 )	2022-07-27 18:59:24 +08:00
Xinyi Zou	4960043f5e	[enhancement] Refactor to improve the usability of MemTracker (step2) (#10823 )	2022-07-21 17:11:28 +08:00
Xinyi Zou	d5fa66d9a3	[Enhancement] [Memory] Limit memory usage use process actual physical memory (#10924 )	2022-07-19 11:08:39 +08:00
lihangyu	b04a791895	[Enhancement] support compile with jemalloc (#10542 ) A test feature to use jemalloc as default malloc.	2022-07-11 12:15:35 +08:00
plat1ko	331fa50501	[feature](cold-data) move cold data to object storage without losing any feature(BE) (#10280 ) This PR supports rowset level data upload on the BE side, so that there can be both cold data and hot data in a tablet, and there is no necessary to prohibit loading new data to cooled tablets. Each rowset is bound to a `FileSystem`, so that the storage layer can read and write rowsets without perceiving the underlying filesystem. The abstracted `RemoteFileSystem` can try local caching strategies with different granularity, instead of caching segment files as before. To avoid conflicts with the code in be/src/io, we temporarily put the file system related code in the be/src/io/fs directory. In the future, `FileReader`s and `FileWriter`s should be unified.	2022-07-08 12:18:39 +08:00
Tiewei Fang	c9f86bc7e2	[refactor] Refactoring Status static methods to format message using fmt(#9533 )	2022-07-02 18:58:23 +08:00
yiguolei	aab7dc956f	[refactor](load) Remove mini load (#10520 )	2022-06-30 23:21:41 +08:00
Mingyu Chen	8a49c7ef04	[chore] Rename Doris binary output format	2022-06-24 15:30:05 +08:00
yinzhijian	75a7e72402	[Refactor] Use iequal to replace boost::iequals (#10146 ) * [Refactor] Use iequal to replace boost::iequals * remove unused include	2022-06-16 18:18:38 +08:00
yinzhijian	cbbda7857b	[feature-wip](parquet-orc) Support orc scanner in vectorized engine (#9541 )	2022-05-26 21:39:12 +08:00
jacktengg	9236c2efc9	[improvement] Show detail status code string for be http api (#9771 ) 1. move to_json method to common/status 2. modify related usage in http folder	2022-05-26 15:09:21 +08:00
Yongqiang YANG	defdae1e7d	[improvement](stream-load) adjust read unit of http to optimize stream load (#9154 )	2022-05-20 09:52:36 +08:00
yinzhijian	bee5c2f8aa	[feature-wip](parquet-vec) Support parquet scanner in vectorized engine (#9433 )	2022-05-17 09:37:17 +08:00
plat1ko	4cd579b155	[refactor] Check status precise_code instead of construct OLAPInternalError (#9514 ) * check status precise_code instead of construct OLAPInternalError * move is_io_error to Status	2022-05-12 15:39:29 +08:00
Adonis Ling	718a51a388	[refactor][style] Use clang-format to sort includes (#9483 )	2022-05-10 21:25:35 +08:00

1 2 3 4 5

225 Commits