Record query consumption in the FE audit log. It works as follows: one instance of each parent plan is responsible for accumulating its sub-plans' consumption and sending the total to its own parent; the BE coordinator, being a single instance, therefore ends up with the total consumption.
Help locate big queries when the system is overloaded, by checking the consumption of the running parts of all current queries or of one specified query. It works as follows: first, trigger the BEs to report RuntimeProfiles and wait a moment; second, calculate consumption from the RuntimeProfiles reported by the BEs. The consumption it supports is the cost of the running ExecNodes in the query at the time it is called. A sketch of the accumulation scheme follows.
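As a minimal illustration of the bottom-up accumulation, here is a hedged Java sketch; the class and field names are hypothetical, not the actual Doris implementation:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: each plan instance reports its own cost plus the
// totals reported by its children, so the single coordinator instance
// ends up holding the whole query's consumption.
class PlanInstance {
    private final List<PlanInstance> children = new ArrayList<>();
    private long ownConsumption; // cost of this instance's ExecNodes

    long reportConsumption() {
        long total = ownConsumption;
        for (PlanInstance child : children) {
            total += child.reportConsumption();
        }
        return total; // sent upward to this instance's parent
    }
}
```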
1. Add broker load error hub
A broker load error hub collects error messages produced during the load process and saves them as a file to the specified remote storage via broker. This covers the case where, in a broker/mini/streaming load, the user may not be able to access the error log file on the Backend directly.
We also add a new header option 'enable_hub' to the streaming load request, which defaults to false. Enabling the broker load error hub significantly slows down streaming load because of the round trips to remote storage via broker, so users can leave the error hub disabled via this header option to avoid slowing down the load. See the sketch below.
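For illustration, a hedged Java sketch of sending a streaming load request with the header set; only the 'enable_hub' header name comes from this change, while the host, port, endpoint path, and payload are placeholders:

```java
import java.net.HttpURLConnection;
import java.net.URL;

public class StreamLoadHeaderExample {
    public static void main(String[] args) throws Exception {
        // Placeholder Backend address and load path; adjust to your deployment
        // and add authentication as required.
        URL url = new URL("http://127.0.0.1:8040/api/example_db/example_tbl/_stream_load");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("PUT");
        conn.setDoOutput(true);
        // Keep the error hub disabled (the default) to skip the broker round trip.
        conn.setRequestProperty("enable_hub", "false");
        conn.getOutputStream().write("1,hello\n".getBytes("UTF-8"));
        System.out.println("HTTP status: " + conn.getResponseCode());
    }
}
```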
2. Show load error logs by using SHOW LOAD WARNINGS stmt
We also provide an easier way to get load error logs: the 'SHOW LOAD WARNINGS ON 'url'' stmt shows load error logs directly. The 'url' in the stmt comes from the result of the 'SHOW LOAD' stmt.
e.g.:
show load warnings on "http://192.168.1.1:8040/api/_load_error_log?file=__shard_2/error_log_xxx";
3. Support now() function in broker load
Users can map a column to now() in a broker load stmt, which means the column will be filled with the time at which the ETL started. See the sketch below.
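A hypothetical example of such a mapping, submitted via JDBC; the label, paths, table, broker name, and the SET-style mapping syntax are assumptions for illustration, not verified syntax for this exact version:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class BrokerLoadNowExample {
    public static void main(String[] args) throws Exception {
        // Placeholder FE address and credentials.
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:mysql://127.0.0.1:9030/example_db", "root", "");
             Statement stmt = conn.createStatement()) {
            // 'etl_time' is filled with the time at which the ETL started.
            stmt.execute(
                "LOAD LABEL example_db.label1 "
                + "(DATA INFILE('hdfs://host:port/path/to/file') "
                + " INTO TABLE example_tbl (k1, v1) SET (etl_time = now())) "
                + "WITH BROKER 'hdfs_broker'");
        }
    }
}
```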
4. Support more types of wildcard in broker load
Previously, only the wildcard '*' was supported for matching file names; a pattern like '/path/to/20190[1-4]*' was not supported. Such patterns are now supported.
1. Print broker address for debugging.
2. Do not let a backup job be cancelled if it is already in state UPLOAD_INFO.
3. Cancel tasks on Backends when the job is cancelled.
4. Show detailed progress of backup and restore jobs.
5. Make the 'show snapshot' result more readable.
6. Change the upload and download thread number for backup and restore in the Backend to 1.
* Add UserFunctionCache to cache UDF libraries
This patch replaces LibCache with UserFunctionCache. LibCache used the HDFS
URL to identify a UDF's library, so when the BE process restarted, every
downloaded library had to be downloaded again. We now key each library by its
function id, so after a restart all downloaded libraries can be loaded without
downloading them again. A sketch of the idea follows.
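A minimal Java sketch of the keying idea; the names here are hypothetical (the real UserFunctionCache lives in the BE), and the download path scheme is an assumption:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: key UDF libraries by function id instead of HDFS URL,
// so files already downloaded can be reloaded after a process restart.
class UserFunctionCacheSketch {
    private final Map<Long, String> fnIdToLocalPath = new ConcurrentHashMap<>();

    // Returns the local path of the function's library, downloading only
    // when no local copy keyed by this function id exists yet.
    String getLibrary(long functionId, String hdfsUrl) {
        return fnIdToLocalPath.computeIfAbsent(functionId, id -> download(hdfsUrl, id));
    }

    private String download(String hdfsUrl, long id) {
        // Placeholder: fetch the library and store it under a path derived
        // from the function id, e.g. lib/<id>.so, so it survives restarts.
        return "lib/" + id + ".so";
    }
}
```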
* Update routine load task scheduling and transaction handling
1. Add a needSchedulerTasksQueue to LoadManager: the RoutineLoadTaskScheduler polls tasks from this queue and schedules them.
2. Add a frontend interface named rlTaskCommit: it commits the txn, updates the offset, and creates a new task for the same partitions.
3. Add an extra property to the transaction state: in rlTaskCommit, an extra property is attached which looks like {"job_id": xxx, "progress": xxx}.
When the FE initializes routine load job meta from logs, all txn states related to a routine load job are used to initialize the job's progress.
Add a TxnStateChangeListener interface for transactions (a sketch follows the list below)
1. onCommitted, onAborted, and beforeAborted are called for different types of txns.
2. RoutineLoadJob updates the job progress and creates a new task in the onCommitted callback.
3. Add TxnStateChangeListener into TransactionState.
4. Setting the transactionState to committed calls the onCommitted callback if the callback is not null.
5. Setting the transactionState to aborted calls beforeAborted and onAborted.
6. beforeAborted in RoutineLoadJob checks whether there is a related task when the TxnStatusChangeReason is TIMEOUT; if there is, it prevents the abort by throwing a TransactionException.
7. Other abort reasons do not prevent the abort; onAborted is called and the job state is changed to paused.
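A hedged sketch of the callback contract described above; the method signatures are illustrative and the stub types stand in for the real classes:

```java
// Illustrative stubs for the types referenced below.
class TransactionState {}
class TransactionException extends Exception {}

// Hedged sketch of the listener contract; actual signatures may differ.
interface TxnStateChangeListener {
    // Called after the transaction is committed; RoutineLoadJob uses this
    // to update the job progress and create a new task.
    void onCommitted(TransactionState txnState);

    // Called before aborting; may throw to veto the abort, e.g. when the
    // reason is TIMEOUT but a related routine load task still exists.
    void beforeAborted(TransactionState txnState, String reason) throws TransactionException;

    // Called after the abort takes effect; RoutineLoadJob pauses the job here.
    void onAborted(TransactionState txnState, String reason);
}
```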
Change extra to TxnCommitAttachment in TLoadTxnCommitRequest
1. The KAFKA value of TTxnSourceType means this is a routine load task commit, and the TRLTaskTxnCommitAttachment is the commit info of that task.
2. TRLTaskTxnCommitAttachment is converted to RLTaskTxnCommitAttachment, which includes the progress of the task, the task id, numOfErrorData, etc.
Add a TxnCommitAttachment param to commitTransaction
1. The TxnCommitAttachment is updated in commitTransaction.
Step 1: updateBeIdTaskMaps removes unavailable BEs and adds newly alive BEs.
Step 2: process timed-out tasks: if a task has been allocated to a BE but has not finished within DEFAULT_TASK_TIMEOUT, it is discarded.
At the same time, the partitions belonging to the old task are allocated to a new task, which is given a signature and added to the needSchedulerRoutineLoadTask queue.
Step 3: process all needSchedulerRoutineLoadTasks and allocate each task to a BE, which executes it. A sketch of this loop follows.
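A hypothetical Java sketch of the three steps; every name and the timeout value are illustrative, not the actual implementation:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch of the three scheduling steps described above.
class RoutineLoadSchedulerSketch {
    static final long DEFAULT_TASK_TIMEOUT_MS = 60_000; // assumed value

    static class Task {
        List<Integer> partitions = new ArrayList<>();
        long startMillis;
        boolean allocated;
    }

    final List<Task> runningTasks = new ArrayList<>();
    final Queue<Task> needSchedulerRoutineLoadTasks = new LinkedBlockingQueue<>();

    void scheduleOneRound(List<Long> aliveBackendIds) {
        // Step 1: refresh the BE-to-task maps: remove unavailable BEs and
        // add newly alive ones (elided in this sketch).

        // Step 2: discard timed-out tasks; their partitions move to a new
        // task, which is queued for rescheduling.
        long now = System.currentTimeMillis();
        for (Task task : new ArrayList<>(runningTasks)) {
            if (task.allocated && now - task.startMillis > DEFAULT_TASK_TIMEOUT_MS) {
                runningTasks.remove(task);
                Task renewed = new Task();
                renewed.partitions = task.partitions;
                needSchedulerRoutineLoadTasks.add(renewed);
            }
        }

        // Step 3: allocate every queued task to a BE, which then executes it.
        Task next;
        while ((next = needSchedulerRoutineLoadTasks.poll()) != null) {
            allocateToBackend(next, aliveBackendIds);
            runningTasks.add(next);
        }
    }

    void allocateToBackend(Task task, List<Long> beIds) {
        task.allocated = true;
        task.startMillis = System.currentTimeMillis();
        // Placeholder: pick a BE (e.g. round-robin) and send the task to it.
    }
}
```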
Add path info of replica in catalog
Also fix a bug: when check_none_row_oriented_table is called, the store is
null and cannot be used to create the table.
Instead, OLAPHeader can be used to get the storage type information.
1. Remove all design docs. They will be pushed again after modification.
2. Add streaming load and privilege help docs.
3. Rename palo*.py to doris*.py in gensrc/script/.
* Add streaming load feature. You can execute 'help stream load;' to see more information.
Changed:
* The loading phase of a given table can now be parallelized, to reduce load job execution time when multiple load jobs target a single table.
* Use RocksDB to save tablet header info in Backends, to reduce IO operations and speed up restarts (see the sketch below).
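The BE stores tablet headers via RocksDB in C++; purely for illustration, this Java-binding sketch shows the key-value idea with a made-up key scheme:

```java
import org.rocksdb.Options;
import org.rocksdb.RocksDB;
import org.rocksdb.RocksDBException;

public class TabletHeaderStoreSketch {
    public static void main(String[] args) throws RocksDBException {
        RocksDB.loadLibrary();
        try (Options options = new Options().setCreateIfMissing(true);
             RocksDB db = RocksDB.open(options, "/tmp/tablet_meta_sketch")) {
            // Hypothetical key scheme: one serialized header blob per tablet id.
            byte[] key = "tablet_header_10001".getBytes();
            db.put(key, "serialized-header-bytes".getBytes());
            byte[] header = db.get(key); // one point read instead of many file IOs
            System.out.println(new String(header));
        }
    }
}
```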
Fixed:
* A lot of bugs fixed.
Added: Support getting column size and precision info of a table or view via JDBC (example below).
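For instance, the standard JDBC metadata call now returns these fields; the host, database, and credentials below are placeholders:

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.DriverManager;
import java.sql.ResultSet;

public class ColumnMetaExample {
    public static void main(String[] args) throws Exception {
        // Placeholder FE address and credentials.
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:mysql://127.0.0.1:9030/example_db", "root", "")) {
            DatabaseMetaData meta = conn.getMetaData();
            try (ResultSet rs = meta.getColumns(null, "example_db", "example_tbl", "%")) {
                while (rs.next()) {
                    // COLUMN_SIZE is the size/precision, DECIMAL_DIGITS the scale.
                    System.out.printf("%s COLUMN_SIZE=%d DECIMAL_DIGITS=%d%n",
                            rs.getString("COLUMN_NAME"),
                            rs.getInt("COLUMN_SIZE"),
                            rs.getInt("DECIMAL_DIGITS"));
                }
            }
        }
    }
}
```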
Updated: Change the Prometheus type name GAUGE to lowercase, to fit the latest Prometheus version.
Updated: The Backend IP saved in the FE is compared with the BE's local IP during heartbeat, to avoid false positive heartbeat responses.
Updated: Use the tablet's version_num instead of a calculated nice value to select cumulative compaction candidates.
Fixed: Predicates should not be pushed down to a subquery that contains a limit clause.
Fixed: Fix the formula for calculating the BE load score.
Fixed: Fix a bug that, in some edge cases, a non-master Frontend may wait for an unnecessarily long timeout after forwarding a cmd to the Master FE.
Fixed: A bug that granting privs on more than one table does not work.
Fixed: Support 'INSERT INTO' a table which contains HLL columns.
Fixed: ExportStmt's toSql() method may throw a NullPointerException if the table does not exist.
Fixed: Remove an unnecessary 'get capacity' operation to avoid IO impact.
Internal commit id: merge to c16bd603a53dfe2089ff95704c698a738c317792
1. No one can set the root password except the root user itself.
2. NODE_PRIV cannot be granted.
3. ADMIN_PRIV and GRANT_PRIV can only be granted or revoked on *.*.
4. No one can modify the privs of the default roles 'operator' and 'admin'.
5. No user can be granted the role 'operator'.
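For illustration, a hedged JDBC example of grants these rules allow; the user name, host, and exact GRANT syntax are placeholders rather than verified syntax for this exact version:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class GrantExample {
    public static void main(String[] args) throws Exception {
        // Placeholder FE address; run as a user holding GRANT_PRIV.
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:mysql://127.0.0.1:9030/", "root", "");
             Statement stmt = conn.createStatement()) {
            // Table-level grant: allowed by the new table-level privileges.
            stmt.execute("GRANT SELECT_PRIV ON example_db.example_tbl TO 'jack'@'%'");
            // Per rule 3, ADMIN_PRIV may only be granted on *.*;
            // granting it on a single db or table would be rejected.
            stmt.execute("GRANT ADMIN_PRIV ON *.* TO 'jack'@'%'");
        }
    }
}
```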
Fixed: the running-load limit should not be applied to replay logic; it would cause replay or image loading to fail.
Changed: mitigate the problem of too many directories under the mini load directory.
Fixed: missing password and auth check when handling mini load requests in the Frontend.
Fixed: DomainResolver should start after the Frontend transfers to a certain ROLE, not in the Catalog constructor.
Fixed: a bug that no one could set a password for the root user; now only the root user can set root's password.
Fixed: null data read twice.
When reading data containing a null value, in some cases the same data is read twice
by the storage engine, resulting in a wrong result. The cause: when splitting,
if the start key is the minimum value, the data with null is read again.
Fixed: add a flag to prevent the DomainResolver thread from starting twice.
Fixed: fix a memory leak of ByteBuf when parsing the auth info of an HTTP request.
Fixed: add a new config 'disable_hadoop_load' (default false); set it to true to disable hadoop load.
Changed: add a detailed error msg for hadoop load job submission to the show load result.
Fixed: the Backend process should crash if saving the header fails.
Added: expose Backend info to the user when an error occurs on the Backend, to make debugging more convenient.
Fixed: remove the fd from the map when an input stream or output stream is closed in the Broker process.
Fixed: change all files' line endings to Unix format.
Internal commit id: merge from dfcd0aca18eed9ff99d188eb3d01c60d419be1b8
2. add two new proc nodes, '/current_queries' and '/current_backend_instances', to monitor the currently running queries (see the example after this list).
3. add a manual compaction API on the Backend to trigger cumulative or base compaction manually.
4. add a Frontend config 'max_bytes_per_broker_scanner' to limit the bytes per broker scanner. This limits the memory cost of a single broker load job.
5. add a Frontend config 'max_unfinished_load_job' to limit the load job number: if the number of running load jobs exceeds the limit, no more load jobs are allowed to be submitted.
6. a lot of bug fixes
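For example, the new proc node can be queried over the MySQL protocol; the host and credentials below are placeholders:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class CurrentQueriesExample {
    public static void main(String[] args) throws Exception {
        // Placeholder FE address and credentials.
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:mysql://127.0.0.1:9030/", "root", "");
             Statement stmt = conn.createStatement();
             // Inspect the currently running queries via the new proc node.
             ResultSet rs = stmt.executeQuery("SHOW PROC '/current_queries'")) {
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
        }
    }
}
```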
1. The Apache HDFS broker supports HDFS HA and Hadoop Kerberos authentication.
2. New backup and restore function. Use the FS Broker to back up your data to HDFS or restore it from HDFS.
3. Table-level privileges. Grant fine-grained privileges at the table level to a specified user.
4. A lot of bugs fixed.
5. Performance improvement.
1. Implement Backend http server using libevent instead of mongoose.
2. Remove the old Hypertable RPC framework; use brpc instead.
3. Change the RPC from FE to BE to brpc.
4. The FS broker supports HDFS HA.
5. Add more metrics for monitoring.
6. Lots of bugs fixed.
merge to 2d4cc9e1358c980b4f726e17d036639bc31127aa (#188)
contains:
Fix a rewrite error for first_value with a PRECEDING left bound and a non-PRECEDING right bound, and a count(*) SlotDescriptor materialization
error when referring to slots of the current query and a subquery simultaneously.
Fix a SlotDescriptor materialization error with join and count(*).
Fix a bug materializing the scan node's conjuncts.
Remove unused materialization work.
The order by in a subquery has to be evaluated because we limit the number of rows returned by the subquery.
The method of judging the limit was wrong.
User info was missing when retrying the load check call.
It is wrong to pass an aggregate function when its param is not materialized.
InsertStmt did not pass the session param to the observer.
Consider the case where we want to clone a replica with version X, but a replica with a stale version (less than X) already exists on the dest backend.
This leads the clone task to succeed directly, which is not what we expect.
Add a version check when doing the clone task and try to drop the replica with the stale version; a sketch of the check follows.
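A minimal Java sketch of the added check; the names are illustrative, not the actual implementation:

```java
// Hypothetical sketch of the version check added to the clone task.
class CloneCheckSketch {
    // Decide whether the replica already on the destination backend can
    // satisfy the clone, or whether it is stale and must be dropped so the
    // clone of version X really happens.
    static boolean canReuseExistingReplica(long existingVersion, long requiredVersion) {
        // Before the fix, any existing replica let the task "succeed" directly.
        return existingVersion >= requiredVersion;
    }
}
```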
* A cluster can acquire more backends on the same host
* Fix a compile failure
* Fix an exec error for count(*) with a subquery containing order by
* Modify the show processlist result and fix a name format check error
* Merge logs, and add cancel cluster for root and show backends for users in a cluster
* Return null when the param does not match
* Rename UpdateClusterAndBackends to BackendIdsUpdateInfo
* Correct licenses