Commit Graph

59 Commits

SHA1 Message Date
42a4fff562 Replace boost canonicalize (#2209) 2019-11-19 17:57:37 +08:00
59e9027f76 Fix bug that timeout does not take effect in stream load (#2217) 2019-11-16 22:29:55 +08:00
c3b5046940 Fix bug of invalid stream load task rollback (#1999)
If a stream load is committed with result PUBLISH_TIMEOUT, the transaction should not
be rolled back; only this message should be returned to the user.
2019-10-17 21:08:29 +08:00
62acf5d098 Limit the memory usage of Loading process (#1954) 2019-10-15 09:26:20 +08:00
f130bd3e7b Use Env function to operate directory (#1980)
Env now unifies all environment operations, such as file operations.
However, some of our old functions don't leverage it. This change makes
FileUtils::scan_dir use Env's functions.
2019-10-15 09:25:12 +08:00
2f0808137a Refactor FrontendHelper (#1888) 2019-09-27 13:21:14 +08:00
e8da855cd2 Support setting timezone for stream load and routine load (#1831) 2019-09-20 07:55:05 +08:00
00f8040bf3 Fix bug that 2 identical stream load jobs may both execute successfully (#1690)
This caused the 2 jobs to write to the same file, damaging the file.
2019-08-22 19:38:16 +08:00
978b1ee1af Add strict mode in Routine load, Stream load and Mini load (#1677) 2019-08-20 21:56:45 +08:00
8e6814cfcd Support setting timeout for stream load (#1670) 2019-08-20 15:43:03 +08:00
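For context, the sketch below shows one way a per-load timeout could be passed to a stream load over its HTTP interface. It is a minimal illustration, not code from this commit: the host, port, database/table names, label, and the 'timeout' header name are all assumptions.

```python
# Hypothetical sketch of a stream load request with a per-load timeout.
# Host, port, db/table, label and the 'timeout' header name are assumptions
# for illustration; they are not taken from the commit itself.
import requests

def stream_load(fe_host, db, table, data_path, user, password, timeout_sec=300):
    url = f"http://{fe_host}:8030/api/{db}/{table}/_stream_load"
    headers = {
        "label": "example_label_001",   # unique label, lets retried requests be deduplicated
        "timeout": str(timeout_sec),    # per-load timeout in seconds (assumed header name)
        "Expect": "100-continue",
    }
    with open(data_path, "rb") as f:
        body = f.read()                 # read the file into memory and send it as the request body
    # Note: in practice the FE may redirect the request to a BE; a real client must
    # follow that redirect while re-sending credentials (as curl --location-trusted does).
    resp = requests.put(url, data=body, headers=headers, auth=(user, password))
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    print(stream_load("127.0.0.1", "example_db", "example_tbl", "data.csv", "root", ""))
```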
1e2a4c3b9b Fix tablet restore api in BE(#1623) (#1624) 2019-08-13 09:34:24 +08:00
0694b6a6fa Fix bugs of Broker load (#1546)
Use the same UUID as the query ID and load ID of a load execution plan.
Each load execution plan has a load ID, and, as a plan, it also has a query ID.
Using the same UUID for both makes it easier to trace the load process.

Change the load ID when retrying a load execution plan.
When a load execution plan is retried, the load ID should be changed; otherwise the BE
cannot distinguish the old and new load requests.

Cancel the running loading task when cancelling the broker load.
When a user cancels a broker load, the running loading task should also be cancelled, or
it may occupy a worker thread for a long time.

Remove the unnecessary query reports when executing a load plan.
Only the last query report is needed.

Add a new BE config tablet_writer_rpc_timeout_sec.
It is used for the RPC of the tablet sink. The default is 600 seconds, which is long enough to flush
about 6GB of data. The long timeout reduces the possibility of hitting a "fail to send batch" error when loading.

Use streaming_load_max_mb instead of mini_load_max_mb in BE config.

Add more logs to make it easier to trace a broker load process.
2019-07-27 20:17:05 +08:00
4e043e66e2 Modify the result json format of mini load (#1487)
Mini load now uses the stream load framework, but the mini load return
behavior and result JSON format should stay the same as before.
So the PUBLISH_TIMEOUT error should be treated as OK in mini load.

Also add 2 counters to the OlapTableSink profile:
SerializeBatchTime: time spent serializing all row batches.
WaitInFlightPacketTime: time spent waiting for the last sent packet.
2019-07-16 19:15:41 +08:00
6c246418fb Add timeout in stream load planner (#1480)
The mini load timeout needs to be added to the plan options.
The timeout property has been added to the process put request;
otherwise, the mini load timeout has no effect.

Also add logging of the label, txn and query id in mini load.
2019-07-15 22:14:59 +08:00
0d48a3961c Refactor Storage Engine (#1478)
NOTE: This patch modifies all Backends' data,
and restarting a BE will take a very long time.
So if you don't want to interfere with your production environment,
you should upgrade Backends one by one.

1. Refactor the BE to clarify the structure of the code.
2. Use a unique id to identify a rowset.
   Naming a rowset with tablet_id and version leads to
   many conflicts among compaction, clone and restore.
3. Extract a rowset interface to encapsulate rowsets
   with different formats.
2019-07-15 21:18:22 +08:00
734032d917 Fix the error unit of create timestamp in mini load (#1460)
The unit of the old create timestamp was microseconds, while the unit of the create timestamp in FE is milliseconds.
2019-07-11 19:29:18 +08:00
b0af97d8aa Change error msg of mini load when PUBLISH_TIMEOUT (#1415) 2019-07-01 16:05:49 +08:00
4f416b7e21 Fix a core dump in mini load (#1375)
A new struct named MiniLoadCtx is used to save is_streaming and ctx.
2019-06-25 20:54:58 +08:00
7550b2f09b Convert mini load to streaming mini load (#1323)
* This commit introduces streaming mini load
The operation of streaming mini load is the same as before, and the user can still check the load via the frontend.
The difference is that streaming mini load finishes the task before replying to the REST API, while the non-streaming version only registers a load.

* When upgrading Doris
Upgrading the FE or BE first are both supported. After the FE and BE are both upgraded, streaming mini load takes effect.

* For multi mini load
Non-streaming mini load is still used by multi mini load. The behavior of multi mini load has not been changed.

* Add an interface named isSupportedFunction
This function is used to protect the correctness of the new feature, which spans BE and FE, during upgrading.
2019-06-21 19:34:50 +08:00
9d03ba236b Uniform Status (#1317) 2019-06-14 23:38:31 +08:00
c20d62679e Add negative load from StreamLoad (#1227) 2019-05-31 07:14:06 +08:00
afa3aa9069 Add some pre-calculated metrics (#1079)
1. max io util of disks
2. max network send/receive bytes rate of all network devices
3. base/cumulative compaction request counter and failure counter
2019-04-30 11:12:23 +08:00
9c82d41981 Support Doris querying ES via HTTP (#925) 2019-04-28 17:14:44 +08:00
0820a29b8d Implement the routine load process of Kafka on Backend (#671) 2019-04-28 10:33:50 +08:00
d3251a19f7 Modify the method to obtain some metrics (#904) 2019-04-10 19:37:48 +08:00
fb4e77d6d6 Add http post feature for HttpClient (#773) 2019-03-19 22:05:33 +08:00
aba1b9e5d6 Reopen the thrift client when an exception occurs (#610)
To avoid a broken connection being reused.
2019-01-31 16:54:49 +08:00
af445b6cc2 Optimize something (#607)
1. Unify the thrift rpc timeout from BE to FE.
    Add a BE config 'thrift_rpc_timeout_ms', default is 5000 ms.
2. Add hostname in "show proc '/frontends';" stmt result.
3. Fix a lock order bug in Load.java
2019-01-31 13:30:45 +08:00
33b133c6ff Fix bug that an internal retry of stream load returns the wrong result (#541)
Add an internally generated timestamp as a unique identifier to distinguish a request from its retry request.
2019-01-16 18:59:19 +08:00
a51ce03595 Enhance the usability of Load operation (#490)
1. Add broker load error hub
A broker load error hub collects error messages produced during the load process and saves them as a file to the specified remote storage via broker. This is for cases where, in the broker/mini/streaming load process, the user may not be able to access the error log file on the Backend directly.
We also add a new header option 'enable_hub' to the streaming load request; the default is false. Enabling the broker load error hub significantly slows down streaming load because of the visits to remote storage via broker, so the user can leave it disabled via this header option to avoid slowing down the load.

2. Show load error logs by using the SHOW LOAD WARNINGS stmt
We also provide an easier way to get load error logs: the 'SHOW LOAD WARNINGS ON 'url'' stmt shows load error logs directly. The 'url' in the stmt is provided by the 'SHOW LOAD' stmt (see the fetch sketch after this entry).
eg:
show load warnings on "http://192.168.1.1:8040/api/_load_error_log?file=__shard_2/error_log_xxx";

3. Support the now() function in broker load
The user can map a column to now() in the broker load stmt, which means this column will be filled with the time when the ETL started.

4. Support more types of wildcard in broker load
Previously, only the wildcard '*' was supported to match file names; wildcards like '/path/to/20190[1-4]*' were not supported.
2019-01-03 19:07:27 +08:00
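As referenced above, a minimal sketch (not part of the commit) of fetching the error log that 'SHOW LOAD' points to. The URL shape follows the example in the commit message; the concrete host and file values are placeholders.

```python
# Minimal sketch: download a load error log from the Backend's HTTP endpoint.
# The URL shape mirrors the example in the commit message above; the concrete
# host and file parameter are placeholders, not real values.
import requests

def fetch_load_error_log(error_log_url: str) -> str:
    """Fetch the error log text that the 'SHOW LOAD' url column points to."""
    resp = requests.get(error_log_url, timeout=10)
    resp.raise_for_status()
    return resp.text

if __name__ == "__main__":
    url = "http://192.168.1.1:8040/api/_load_error_log?file=__shard_2/error_log_xxx"
    print(fetch_load_error_log(url))
```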
46c70a16b1 Add more detail logs to debug streaming load (#484)
* Add more detailed logs to debug streaming load

* Fix bugs
2018-12-28 19:42:09 +08:00
90d71508ff Add UserFunctionCache to cache UDF's library (#453)
This patch replaces LibCache with UserFunctionCache. LibCache used an HDFS
URL to identify a UDF's library, so when the BE process restarted, all
downloaded libraries had to be loaded again. We now use a function id
corresponding to each library, so when the process restarts, all downloaded
libraries can be loaded without downloading them again.
2018-12-21 22:07:21 +08:00
e99468c387 Add HttpClient class (#441)
Replace FileDownloader with HttpClient; this patch changes clone_copy and
the pusher's download.
2018-12-18 09:55:11 +08:00
dedfccfaf5 Optimize the publish logic of streaming load (#350)
1. Only collect all error replicas if the publish task times out.
2. Add 2 metrics to monitor the success or failure of txns.
3. Change the publish timeout to Config.load_straggler_wait_second
2018-11-26 19:01:50 +08:00
be6e9c393d Fix bug of using symbolic link dir as storage path (#340)
* Fix bug of #307

There is a bug when using a symbolic link directory as the storage root path.
The problem is whether the path is canonical.

In DownloadAction, the check fails by comparing a canonical path with a non-canonical path.
So fix the bug by converting all paths to canonical paths before comparison.
2018-11-23 10:12:26 +08:00
a2b299e3b9 Reduce UT binary size (#314)
Almost every module depends on ExecEnv, and ExecEnv contains all the
singletons, which makes the UT binary contain all object files.

This patch separates ExecEnv's init and destroy into another file to
avoid other files depending on it. Also, status.cc includes debug_util.h, which
depends on tuple.h and tuple_row.h, so get_stack_trace() is moved to
stack_util.cpp to reduce status.cc's dependencies.

USE_RTTI=1 is added when building rocksdb to avoid problems linking librocksdb.a.

Issue: #292
2018-11-15 16:17:23 +08:00
37b4cafe87 Change variable and namespace name in BE (#268)
Change 'palo' to 'doris'
2018-11-02 10:22:32 +08:00
2868793b6b Change license to Apache License 2.0 (#262) 2018-11-01 09:06:01 +08:00
051aced48d Missing many files in last commit
In the last commit, a lot of files were missed.
2018-10-31 16:19:21 +08:00
5d3fc80067 Added:
* Add streaming load feature. You can execute 'help stream load;' to see more information.

Changed:
* The loading phase of a table can be parallelized, to reduce the load job execution time when multiple load jobs target a single table.
* Use RocksDB to save the header info of tablets in Backends, to reduce IO operations and speed up restarting.

Fixed:
* A lot of bugs fixed.
2018-10-31 14:46:22 +08:00
65fe7f65c1 Fixed: privilege logic errors:
1. No one can set the root password except the root user itself.
    2. NODE_PRIV cannot be granted.
    3. ADMIN_PRIV and GRANT_PRIV can only be granted or revoked on *.*
    4. No one can modify the privs of the default roles 'operator' and 'admin'.
    5. No user can be granted the role 'operator'.
Fixed: the running load limit should not be applied to the replay logic. It would cause replaying or loading the image to fail.
Changed: optimize the problem of too many directories under the mini load directory.
Fixed: missing password and auth check when handling mini load requests in the Frontend.
Fixed: DomainResolver should start after the Frontend transfers to a certain ROLE, not in the Catalog constructor.
Fixed: a stupid bug that no one could set the password for the root user... fix it: only the root user can set the password for root.
Fixed: read null data twice
    When reading data with a null value, in some cases the same data will be read twice by the storage engine,
    resulting in a wrong result. The reason for this problem is that when splitting,
    if the start key is the minimum value, the data with null is read.
Fixed: add a flag to prevent the DomainResolver thread from starting twice.
Fixed: a memory leak of using ByteBuf when parsing the auth info of an HTTP request.
Fixed: add a new config 'disable_hadoop_load', default is false; set it to true to disable hadoop load.
Changed: add a detailed error msg for submitting hadoop load jobs to the show load result.
Fixed: the Backend process should crash if it fails to save a header.
Added: expose Backend info to the user when an error is encountered on the Backend, making debugging more convenient.
Fixed: remove the fd from the map when an inputstream or outputstream is closed in the Broker process.
Fixed: change all files' line endings to unix format.

Internal commit id: merge from dfcd0aca18eed9ff99d188eb3d01c60d419be1b8
2018-10-01 19:58:41 +08:00
bea10e4f06 1. hide password and other sensitive information in log and audit log
2. add 2 new proc '/current_queries' and '/current_backend_instances' to monitor the current running queries.
3. add a manual compaction api on Backend to trigger cumulative or base compaction manually.
4. add Frontend config 'max_bytes_per_broker_scanner' to limit the bytes per broker scanner. This is to limit the memory cost of a single broker load job
5. add Frontend config 'max_unfinished_load_job' to limit the number of load jobs: if the number of running load jobs exceeds the limit, no more load jobs are allowed to be submitted.
6. a lot of bugs fixed
2018-09-19 20:04:01 +08:00
cc74efb3c5 merge to ddb65b69f9c788e359e191889cb31f15279c41ec (#224)
1. The Apache HDFS broker supports HDFS HA and Hadoop kerberos authentication.
2. New Backup and Restore functions. Use Fs Broker to back up your data to HDFS or restore it from HDFS.
3. Table-Level Privileges. Grant fine-grained privileges on the table level to a specified user.
4. A lot of bugs fixed.
5. Performance improvements.
2018-08-24 17:12:26 +08:00
19997510a6 merge to 9625ef157dd44c58802d63cb7547f037b75fd710 (#208)
1. Implement the Backend http server using libevent instead of mongoose.
2. Remove the old Hypertable rpc framework; use brpc instead.
3. Change the rpc from FE to BE to brpc.
4. The Fs broker supports HDFS HA.
5. Add more metrics to monitor.
6. Lots of bugs fixed.
2018-07-17 09:20:30 +08:00
6beea22d7b remove unused files (#205)
* modify the license

* remove unused files
2018-06-19 21:32:54 +08:00
81baef34f4 restore mongoose's license 2018-06-12 16:02:59 +08:00
7e2a3aa1b3 modify the license (#203)
Some licenses were replaced incorrectly.
2018-06-09 19:12:16 +08:00
611afcd125 restore license which was replaced incorrectly 2018-06-09 16:35:35 +08:00
2419384e8a push 3.3.19 to github (#193)
* push 3.3.19 to github

* merge to 20ed420122a8283200aa37b0a6179b6a571d2837
2018-05-15 20:38:22 +08:00
6cf2fb4d47 Bug fixes
merge to 2d4cc9e1358c980b4f726e17d036639bc31127aa (#188)
contains:
    fix first_value rewrite error with PRECEDING LEFT and NON-PRECEDING RIGHT, and count(*) materialized SlotDescriptor
    error when referring to slots of the current query and a subquery simultaneously.

    fix join and count(*) materialized SlotDescriptor error.

    fix a bug in materializing the scan node's conjuncts.

    remove unused materialization work.

    order by has to be evaluated in a subquery because we limit the number of rows returned by the subquery.

    the method of judging the limit was wrong.

    user info was missing when retrying the load check call.

    it's wrong to pass an aggregate function when its param is not materialized.

    InsertStmt did not pass the session param to the observer.
2018-04-11 16:05:33 +08:00