doris

Author	SHA1	Message	Date
ZHAO Chun	89dc461f91	Fix UT and remove unused code (#2160)	2019-11-08 08:47:48 +08:00
Mingyu Chen	62acf5d098	Limit the memory usage of Loading process (#1954)	2019-10-15 09:26:20 +08:00
ZHAO Chun	024348d74b	Enable auto convert when check in (#1926) Leverage gitattributes to automatically convert end-of-line to LF when checking in. Convert existing CRLF to LF by removing all files and checking them out with the new .gitattributes file. Except for .gitattributes, all files are modified only at the end of line.	2019-10-09 22:31:27 +08:00
kangkaisen	1131f53420	Fix parquet_scanner_test in debug mode (#1900)	2019-09-28 01:15:33 +08:00
kangkaisen	1c229fbd92	Fix es_scan_reader_test in debug mode (#1905)	2019-09-28 00:02:30 +08:00
shgxwxl	c2de62d6a1	Collect scanner's status when es_http_scan_node close (#1861)	2019-09-25 12:20:13 +08:00
ZHAO Chun	58801c6ab0	Support converting RowBatch and RowBlockV2 to/from Arrow (#1699)	2019-08-27 11:30:00 +08:00
yuanli	ba6d728f26	Enable parsing columns from file path for Broker Load (#1582) (#1635) Currently, we do not support parsing encoded/compressed columns in a file path, e.g. extracting column k1 from the file path /path/to/dir/k1=1/xxx.csv. This patch parses columns from the file path as Spark does (Partition Discovery). It parses partition columns in BrokerScanNode.java and saves the parsing result of each file path as a property of TBrokerRangeDesc, so that the broker reader of BE can read the value of the specified partition column.	2019-08-19 09:39:21 +08:00
HangyuanLiu	69af50aa8c	Time zone related BE function (#1598) Details can be found in time-zone.md document	2019-08-12 20:57:59 +08:00
Mingyu Chen	0694b6a6fa	Fix bugs of Broker load (#1546) Use the same UUID as query ID and load ID of a load execution plan. Each load execution plan has a load ID, and as a plan, there is also a query ID. We can use the same UUID as query ID and load ID, to trace the load process more easily. Change the load ID when retrying a load execution plan. When a load execution plan is retried, the load ID should be changed, otherwise BE cannot distinguish the old and new load requests. Cancel the running loading task when cancelling the broker load. When a user cancels a broker load, the running loading task should also be cancelled, or it may occupy the worker thread for a long time. Remove the unnecessary query report when running a load execution plan. Only the last query report is needed. Add a new BE config tablet_writer_rpc_timeout_sec. It is used for the RPC of the tablet sink. The default is 600 seconds, which is long enough for flushing about 6GB of data. The long timeout will reduce the possibility of encountering the "fail to send batch" error when loading. Use streaming_load_max_mb instead of mini_load_max_mb in BE config. Add more logs to trace a broker load process easily.	2019-07-27 20:17:05 +08:00
lichaoyong	0d48a3961c	Refactor Storage Engine (#1478) NOTE: This patch modifies all Backend data, which makes restarting BE take a very long time. To avoid disrupting your production environment, upgrade the backends one by one. 1. The BE refactoring clarifies the structure of the code. 2. Use a unique id to identify a rowset. Naming a rowset with tablet_id and version leads to many conflicts among compaction, clone, and restore. 3. Extract a rowset interface to encapsulate rowsets with different formats.	2019-07-15 21:18:22 +08:00
worker24h	7eab12a40e	Support reading Parquet file when loading data (#1173)	2019-07-01 18:39:27 +08:00
ZHAO Chun	9d03ba236b	Uniform Status (#1317)	2019-06-14 23:38:31 +08:00
Mingyu Chen	e4e04e8203	Make LZO support optional (#1263)	2019-06-07 22:26:54 +08:00
lide	9c82d41981	Support Doris query ES by HTTP way (#925)	2019-04-28 17:14:44 +08:00
Mingyu Chen	0820a29b8d	Implement the routine load process of Kafka on Backend (#671)	2019-04-28 10:33:50 +08:00
Mingyu Chen	da308da17c	Fix bug that empty stream load return unexpected error msg (#1052)	2019-04-28 09:36:19 +08:00
ZHAO Chun	11307b23c8	Fix bug: stream load ignore last line with no-newline (#785) #783	2019-03-21 19:18:22 +08:00
Salieri1969	4d5f92cce7	Add EsScanNode (#450)	2019-01-17 17:59:33 +08:00
ZHAO Chun	90d71508ff	Add UserFunctionCache to cache UDF's library (#453) * Add UserFunctionCache to cache UDF's library This patch replaces LibCache with UserFunctionCache. LibCache uses an HDFS URL to identify a UDF's library, so when the BE process restarts, all downloaded libraries have to be loaded another time. We use the function id to identify a library, so when the process restarts, all downloaded libraries can be loaded without downloading them again. * update	2018-12-21 22:07:21 +08:00
Zhao Chun	a2b299e3b9	Reduce UT binary size (#314) * Reduce UT binary size Almost every module depends on ExecEnv, and ExecEnv contains all singletons, which makes the UT binary contain all object files. This patch separates ExecEnv's initialization and destruction into another file to avoid other files' dependence on them. Also, status.cc includes debug_util.h, which depends on tuple.h and tuple_row.h, so I move get_stack_trace() to stack_util.cpp to reduce status.cc's dependencies. I add USE_RTTI=1 to build rocksdb to avoid linking librocksdb.a Issue: #292 * Update	2018-11-15 16:17:23 +08:00
Zhao Chun	c6df9b1e2d	Make BE UT success (#304)	2018-11-13 19:17:51 +08:00
chenhao7253886	37b4cafe87	Change variable and namespace name in BE (#268) Change 'palo' to 'doris'	2018-11-02 10:22:32 +08:00
morningman	2868793b6b	Change license to Apache License 2.0 (#262)	2018-11-01 09:06:01 +08:00
morningman	051aced48d	Missing many files in last commit In the last commit, a lot of files were missed	2018-10-31 16:19:21 +08:00
morningman	5d3fc80067	Added: * Add streaming load feature. You can execute 'help stream load;' to see more information. Changed: * The loading phase of a table can be parallelized, to reduce the load job execution time when multiple load jobs target a single table. * Use RocksDB to save the header info of tablets in Backends, to reduce IO operations and increase the speed of restarting. Fixed: * A lot of bugs fixed.	2018-10-31 14:46:22 +08:00
imay	ae9ce81453	Changed: change build.sh to use environment variable to get thirdparty's path, and change PALO_HOME to DORIS_HOME	2018-10-30 16:29:06 +08:00
morningman	65fe7f65c1	Fixed: privilege logic error: 1. No one can set the root password except the root user itself. 2. NODE_PRIV cannot be granted. 3. ADMIN_PRIV and GRANT_PRIV can only be granted or revoked on . 4. No one can modify the privileges of the default roles 'operator' and 'admin'. 5. No user can be granted the role 'operator'. Fixed: the running load limit should not be applied to replay logic; otherwise it causes replay or image loading to fail. Changed: mitigate the problem of too many directories under the mini load directory. Fixed: missing password and auth check when handling mini load request in Frontend. Fixed: DomainResolver should start after Frontends transfer to a certain ROLE, not in Catalog construction methods. Fixed: a bug that no one can set the password for the root user; now only the root user can set the password for root. Fixed: read null data twice. When reading data with a null value, in some cases the same data is read twice by the storage engine, producing a wrong result. The reason is that when splitting, and the start key is the minimum value, the data with null is read. Fixed: add a flag to prevent the DomainResolver thread from starting twice. Fixed: a memory leak when using ByteBuf to parse the auth info of an HTTP request. Fixed: add a new config 'disable_hadoop_load', default is false; set it to true to disable hadoop load. Changed: add a detailed error message for submitting a hadoop load job to the show load result. Fixed: the Backend process should crash if it fails to save the header. Added: expose backend info to the user when an error is encountered on a Backend, to make debugging more convenient. Fixed: should remove fd from map when inputstream or outputstream is closed in Broker process. Fixed: change all files' line endings to unix format. Internal commit id: merge from dfcd0aca18eed9ff99d188eb3d01c60d419be1b8	2018-10-01 19:58:41 +08:00
morningman	cc74efb3c5	merge to ddb65b69f9c788e359e191889cb31f15279c41ec (#224) 1. Apache HDFS broker support HDFS HA and Hadoop kerberos authentication. 2. New Backup and Restore function. Use Fs Broker to backup your data to HDFS or restore them from HDFS. 3. Table-Level Privileges. Grant fine-grained privileges on table-level to specified user. 4. A lot of bugs fixed. 5. Performance improvement.	2018-08-24 17:12:26 +08:00
李超勇	6486be64c3	fix license statement (#29) * change picture to word * change picture to word * SHOW FULL TABLES WHERE Table_type != VIEW sql can not execute * change license description	2017-08-18 19:16:23 +08:00
cyongli	e2311f656e	baidu palo	2017-08-11 17:51:21 +08:00

31 Commits
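
Commit ba6d728f26 describes extracting column values such as k1 from a file path like /path/to/dir/k1=1/xxx.csv. The snippet below is only a rough sketch of that idea, not the code from the patch (which, per the commit message, lives in BrokerScanNode.java). The function name `parse_columns_from_path` and its error handling are hypothetical and chosen for illustration.

```python
from typing import Dict, List


def parse_columns_from_path(file_path: str, columns: List[str]) -> Dict[str, str]:
    """Extract key=value directory segments for the requested columns.

    Hypothetical helper: illustrates Spark-style partition discovery on a
    single path; it is not Doris's actual implementation.
    """
    # Drop the final segment: it is the file name, not a partition directory.
    segments = file_path.strip("/").split("/")[:-1]
    found: Dict[str, str] = {}
    for segment in segments:
        key, sep, value = segment.partition("=")
        # Only directories shaped like "name=value" for a requested column count.
        if sep and key in columns:
            found[key] = value
    missing = [c for c in columns if c not in found]
    if missing:
        raise ValueError(f"columns not found in path {file_path!r}: {missing}")
    return found


if __name__ == "__main__":
    print(parse_columns_from_path("/path/to/dir/k1=1/xxx.csv", ["k1"]))
```

The values are returned as strings; converting them to the column's declared type is assumed to happen in a later step.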