Commit Graph

57 Commits

Author SHA1 Message Date
aba1b9e5d6 Reopen the thrift client when got exception (#610)
To avoid broken connection being reused.
2019-01-31 16:54:49 +08:00
4f3954fd77 Fix bug that recvr thread update sub plan's QueryStatistics when it is destructed (#573) 2019-01-23 17:00:40 +08:00
0dd4c6e0a0 Fix ASAN compilation issue (#561) 2019-01-21 13:09:45 +08:00
717285db1e Remove unused code about showing current queries (#552) 2019-01-18 09:53:40 +08:00
4d5f92cce7 Add EsScanNode (#450) 2019-01-17 17:59:33 +08:00
0e5b193243 Add cpu and io indicates to audit log (#531) 2019-01-17 12:43:15 +08:00
d372b04e42 Revert "Add cpu and io indicates to audit log (#513)" (#520)
This reverts commit 5192e2f010308eefffa5271b0bdc947dfd9168ae.
2019-01-10 12:44:09 +08:00
5192e2f010 Add cpu and io indicates to audit log (#513)
Record query consumption into fe audit log. Its basic mode of work is as follows, one of instance of parent plan is responsible for accumulating sub plan's consumption and send to it's parent, BE coordinator will get total consumption because it's a single instance.
2019-01-09 22:28:20 +08:00
92b138121b Support io and cpu indicates for current query (#497)
Help to locate big query when system overload, by checking consumptions of running parts of current all queries or specified one query. Its basic mode of work is as follows: firstly trigger BE to report RuntimeProfiles, and wait a moment. secondly caculate consumptions with RuntimeProfiles reported by BE. The consumptions supported by it are the cost of running ExecNode in query when call it.
2019-01-08 10:59:42 +08:00
a51ce03595 Enhance the usability of Load operation (#490)
1. Add broker load error hub
A broker load error hub will collect error messages in load process and saves them as a file to the specified remote storage via broker. In case that in broker/min/streaming load process, user may not be able to access the error log file in Backend directly.
We also add a new header option: 'enable_hub' in streaming load request, and default is false. Because if we enable the broker load error hub, it will significantly slow down the processing speed of streaming load, due to the visit of remote storage via broker. So use can disable the error load hub using this header option, to avoid slowing down the load speed.

2. Show load error logs by using SHOW LOAD WARNINGS stmt
We also provide a more easy way to get load error logs. We implement 'SHOW LOAD WARNINGS ON 'url'' stmt to show load error logs directly. The 'url' in stmt is provided in 'SHOW  LOAD' stmt.
eg:
show load warnings on "http://192.168.1.1:8040/api/_load_error_log?file=__shard_2/error_log_xxx";

3. Support now() function in broker load
User can mapping a column to now() in broker load stmt, which means this column will be filled with time when the ETL started.

4. Support more types of wildcard in broker load
Currently, we only support wildcard '*' to match the file names. wildcard like '/path/to/20190[1-4]*' is not support.
2019-01-03 19:07:27 +08:00
ff7d3e5878 Unify the print method of TUniqueId (#487) 2018-12-29 16:22:38 +08:00
c74c915441 Fix license issue of gutil (#471) 2018-12-27 13:53:27 +08:00
7d7934112f Fix fe ut (#469)
1. Fix StreamLoadScanNodeTest
2. Revert the fix of decimal value with scientific notation, this still need to fix it later
2018-12-25 20:07:03 +08:00
5b1e3d3f40 Optimize backup & restore process (#460)
1. Print broker address for debug.
2. Do not letting backup job cancelled if it already in state UPLOAD_INFO.
3. Cancel task on Backends when job is cancelled.
4. Show detail progress of backup and restore job.
5. Make 'show snapshot' result more readable.
6. Change upload and download thread num of backup and restore in Backend to 1.
2018-12-24 16:49:16 +08:00
90d71508ff Add UserFunctionCache to cache UDF's library (#453)
* Add UserFunctionCache to cache UDF's library

This patch replace LibCache with UserFunctionCache. LibCache use HDFS
URL to identify a UDF's Library, and when BE process restart all of
downloaded library should be loaded another time. We use function id
corresponding to a library, and when process restart, all downloaded
libraries can be loaded without another downloading.

* update
2018-12-21 22:07:21 +08:00
dc4cbab11e Report error when loading decimal value with scientific notation (#428)
Currently we do not support scientific notation of decimal value.
2018-12-17 21:04:18 +08:00
74cc5c5404 Write summary line in load error file anyway (#425)
Summary line should be wrote in spite of the limit error number
2018-12-13 12:35:19 +08:00
850150896a Fix bug that load error log is empty sometimes (#424)
_num_print_error_rows is not initialized
2018-12-13 11:23:32 +08:00
6b4049e21c Unify Slice code path (#380) 2018-12-03 18:11:47 +08:00
a2b299e3b9 Reduce UT binary size (#314)
* Reduce UT binary size

Almost every module depend on ExecEnv, and ExecEnv contains all
singleton, which make UT binary contains all object files.

This patch seperate ExecEnv's initial and destory to anthor file to
avoid other file's dependence. And status.cc include debug_util.h which
depend tuple.h tuple_row.h, and I move get_stack_trace() to
stack_util.cpp to reduce status.cc's dependence.

I add USE_RTTI=1 to build rocksdb to avoid linking librocksdb.a

Issue: #292

* Update
2018-11-15 16:17:23 +08:00
2081b7fea5 Be compatible with old RPC (#296)
Add palo.PInternalService which can server old version palo's client.

Issue: #293
2018-11-10 15:46:45 +08:00
0c4edc2b3c Fix BE can't be grayscale upgraded (#285) 2018-11-07 09:34:39 +08:00
37b4cafe87 Change variable and namespace name in BE (#268)
Change 'palo' to 'doris'
2018-11-02 10:22:32 +08:00
2868793b6b Change license to Apache License 2.0 (#262) 2018-11-01 09:06:01 +08:00
051aced48d Missing many files in last commit
In last commit, a lot of files has been missed
2018-10-31 16:19:21 +08:00
5d3fc80067 Added:
* Add streaming load feature. You can execute 'help stream load;' to see more information.

Changed:
* Loading phase of a certain table can be parallelized, to reduce the load job execution time when multi load jobs to a single table.
* Using RocksDB to save the header info of tablets in Backends, to reduce the IO operations and increate speeding of restarting.

Fixed:
* A lot of bugs fixed.
2018-10-31 14:46:22 +08:00
65fe7f65c1 Fixed: privilege logic error:
1. No one can set root password expect for root user itself
    2. NODE_PRIV cannot be granted.
    3. ADMIN_PRIV and GRANT_PRIV can only be granted or revoked on *.*
    4. No one can modifly privs of default role 'operator' and 'admin'.
    5. No user can be granted to role 'operator'.
Fixed: the running load limit should not be applied to replay logic. It will cause replay or loading image fail.
Changed: optimize the problem of too many directories under mini load directory.
Fixed: missing password and auth check when handling mini load request in Frontend.
Fixed: DomainResolver should start after Frontends transfer to a certain ROLE, not in Catalog construction methods.
Fixed: a stupid bug that no one can set password for root user... fix it: only root user can set password for root.
Fixed: read null data twice
    When reading data with a null value, in some cases, the same data will be read twice by the storage engine,
    resulting in a wrong result.The reason for this problem is that when splitting,
    and the start key is the minimum value, the data with null is read.
Fixed: add a flag to prevent DomainResovler thread start twice.
Fixed: fixed a mem leak of using ByteBuf when parsing auth info of http request.
Fixed: add a new config 'disable_hadoop_load', default is false, set to true to disable hadoop load.
Changed: add detail error msg of submitting hadoop load job in show load result.
Fixed: Backend process should be crashed if failed to saving header.
Added: exposure backend info to user when encounter error on Backend. for debugging it more convenient.
Fixed: Should remove fd from map when inputstream or outputstream is closed in Broker process.
Fixed: Change all files' LF to unix format.

Internal commit id: merge from dfcd0aca18eed9ff99d188eb3d01c60d419be1b8
2018-10-01 19:58:41 +08:00
e0d721f210 Merge pull request #216 from jimmycasey/master
Fixed Spelling.
2018-09-20 14:25:49 +08:00
bea10e4f06 1. hide password and other sensitive information in log and audit log
2. add 2 new proc '/current_queries' and '/current_backend_instances' to monitor the current running queries.
3. add a manual compaction api on Backend to trigger cumulative or base compaction manually.
4. add Frontend config 'max_bytes_per_broker_scanner' to limit to bytes per one broker scanner. This is to limit the memory cost of a single broker load job
5. add Frontend config 'max_unfinished_load_job' to limit load job number: if number of running load jobs exceed the limit, no more load job is allowed to be submmitted.
6. a log of bug fixed
2018-09-19 20:04:01 +08:00
cc74efb3c5 merge to ddb65b69f9c788e359e191889cb31f15279c41ec (#224)
1. Apache HDFS broker support HDFS HA and Hadoop kerberos authentication.
2. New Backup and Restore function. Use Fs Broker to backup your data to HDFS or restore them from HDFS.
3. Table-Level Privileges. Grant fine-grained privileges on table-level to specified user.
4. A lot of bugs fixed.
5. Performance improvement.
2018-08-24 17:12:26 +08:00
d4c2e56477 Fixed Spelling. 2018-07-29 21:50:56 +00:00
aad32ec8fc merge to 3eb59087fb98ed61655997de525331736c56256a (#215)
1. fixed a bug when cast float to string
2. delay result's cancelled time
3. update type check in CastTo
4. remove unused mysql channel log
5. update stop_be.sh, use kill -0 instead of flock -nx
6. update jprotobuf-rpc to 3.5.15 in Frontend, the previous version will cause too many threads in JVM
2018-07-26 19:22:37 +08:00
19997510a6 merge to 9625ef157dd44c58802d63cb7547f037b75fd710 (#208)
1. Implement Backend http server using libevent instead of mongoose.
2. Remove Old Hypertable rpc framework, use brpc instead.
3. Change rpc from FE to BE to brpc.
4. Fs broker support HDFS HA.
5. add more metrics to monitor.
6. Lots of bug fixed.
2018-07-17 09:20:30 +08:00
6beea22d7b remove unused files (#205)
* modify the license

* remove unused files
2018-06-19 21:32:54 +08:00
bfe1bc7cf3 remove mysql_dtoa 2018-06-13 08:33:48 +08:00
9f7b1ea6d4 merge to 87fd4ebd9977afb1e1193429dd75c7c82caab204 (#202)
1. ix bugs in query layer.
2. remove some redundant code in BE
3. support specify multi helper node when starting FE
4. add proc 'cluster_load_statistic' to show load balance situation of Palo
2018-06-08 08:42:23 +08:00
5065d634eb merge to 7ab21a103ba63af3bd27f45f7056b1ddbcc93a13 (#194)
1. call done->Run when stream receiver is closed
2. change the name of frontend in bdbje, use arbitrary name instead of ip_port
3. change sync backend report to async, to avoid timeout when backend report
4. fix: buf in WrapperField is not equal to field_buf minus one
2018-05-18 13:57:04 +08:00
2419384e8a push 3.3.19 to github (#193)
* push 3.3.19 to github

* merge to 20ed420122a8283200aa37b0a6179b6a571d2837
2018-05-15 20:38:22 +08:00
5de798fdd6 Merge code to github (#187)
* merge to 95787f8be1fd0ff215708fb0f49997b632876586
* Bugs fixed
2018-03-23 14:04:55 +08:00
ef15bf424e fix decimal bug (#181) 2018-01-23 19:11:06 +08:00
497292387c add export retry (#176) 2018-01-11 14:20:41 +08:00
1e951e5c1c fix ut compile. set timeout to pull load task. fix export sink bug (#175) 2018-01-09 10:42:57 +08:00
32c5570771 fix some bugs (#169)
* change broker's default log level to INFO. fix some log error

* change exporting data via broker in batch
2018-01-02 19:50:15 +08:00
0d3f401889 remove confusing & symbol (#168) 2017-12-27 21:42:03 +08:00
b981d70fe7 fix error file log path when load with broker (#167) 2017-12-27 21:31:09 +08:00
756f9bdb6c exchange check error when parent child is union (#158) 2017-12-18 11:20:39 +08:00
585c21fab4 add feature and fix bugs (#148)
Add new features:
1. plugins of Ambari and k8s deploy
2. specified config 'priority_network' to solve some ip problems

Fix bugs:
fix bugs that rebalance does not work in some case.
fix count(*) from union stmt bug
fix some union stmt bugs
fix bugs when try to schema change a clone replica
2017-11-30 16:31:12 +08:00
adb3213314 Do not show token and file path in load error url (#114) 2017-09-21 20:27:49 +08:00
51d5c727a7 make UUID to be authentication token (#107) 2017-09-20 21:25:10 +08:00
db8c40e5f0 add authentication to DownloadAction (#91)
* add authentication to DownloadAction

1. use cluster_id as token;
2. add dir limit, only files in data dir can be accessed.

* enable authentication in DownloadAction by default
2017-09-13 16:54:00 +08:00