Commit Graph

367 Commits

Author SHA1 Message Date
d15bc83de0 Fix some bugs of alter table operation (#550)
1. Fix bug that failed to query restored table after schema change.
2. Fix bug that failed to add rollup to restored table.
3. Optimize the info of SHOW ALTER TABLE stmt.
4. Optimize the info of some PROCs.
5. Optimize the tablet checker to avoid adding too many tasks to the scheduler.
2019-01-17 15:17:51 +08:00
5cb1c161a4 Fix colocate join balance bug (#547) 2019-01-17 14:58:08 +08:00
6bef41633c Add DORIS_THIRDPARTY env in docker image (#539)
* Add param of specified thirdparty path
1. The thirdparty path can be specified on build.sh: ./build.sh --thirdparty /specified/path/to/thirdparty
2. If build.sh is given only the thirdparty param, it will build both fe and be
3. Add unit test of routine load stmt
4. Remove source code in docker image

* Add DORIS_THIRDPARTY env in docker image
1. Set DORIS_THIRDPARTY env in docker image. The build.sh will use /var/local/thirdparty instead of /source/code/thirdparty
2. Remove the --thirdparty param of build.sh

* Change image workdir to /root
2019-01-17 14:19:13 +08:00
0e5b193243 Add cpu and io indicates to audit log (#531) 2019-01-17 12:43:15 +08:00
33b133c6ff Fix bug that internal retry of stream load return wrong result (#541)
Add an internally generated timestamp as a unique identifier, so that a request and its retry can be identified.
2019-01-16 18:59:19 +08:00
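The retry handling described in 33b133c6ff can be pictured with a minimal Python sketch. All names here (`StreamLoadRegistry`, `handle`, the `label` key) are hypothetical and not taken from the Doris code base; the sketch only assumes that an internally generated timestamp lets the receiver recognise a retry of the same request and return the result of the first attempt.

```python
import time


class StreamLoadRegistry:
    """Hypothetical receiver that remembers requests by (label, timestamp)."""

    def __init__(self):
        self._results = {}  # (label, request_ts) -> result of the first attempt

    def handle(self, label, request_ts, execute):
        key = (label, request_ts)
        if key in self._results:
            # Same label and same internal timestamp: treat it as a retry
            # and return the stored result instead of executing again.
            return self._results[key]
        result = execute()
        self._results[key] = result
        return result


registry = StreamLoadRegistry()
calls = []


def do_load():
    calls.append(1)
    return "OK: 100 rows loaded"


ts = time.time_ns()  # generated once per logical request, reused on retry
first = registry.handle("label_a", ts, do_load)
retry = registry.handle("label_a", ts, do_load)
```

In this sketch the load body runs once, and the retry receives the same result as the original request.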
e8360f5eee Add counters to OlapScanNode (#538)
There is a non-negligible cost to convert VectorRowBatch to RowBatch.
When we seek a block, we read only one row from the engine to minimize
this conversion cost.

This patch can reduce the time of some queries from 5s to 2s
2019-01-16 18:57:04 +08:00
79dc521893 Fix UnixMicros function (#544) 2019-01-16 17:26:43 +08:00
798a66e6a0 Implement new tablet repair and balance framework (#336)
For more detail, see issue #540
2019-01-16 13:29:17 +08:00
f20c99fd09 Support storage migration (#534)
Add a migration lock to lock the data when doing storage migration.
2019-01-15 12:53:24 +08:00
b3b86731cb Remove useless check, not need lsb_release any more (#526) 2019-01-11 14:59:20 +08:00
0fcbe15280 Change the default bdbje sync policy to SYNC (#519) 2019-01-10 19:06:29 +08:00
65eed5fdc1 Fix inconsistency of three replicas belongs to one tablet (#523) (#525)
There are A, B, C replicas of one tablet.
A has 0 - 10 version.
B has 0 - 5, 6, 7, 9, 10 version.
1. B is missing versions, so it clones 0 - 10 from A and removes the overlapped versions in its header.
2. Coincidentally, 6 is a version for a delete predicate (delete where day = 20181221).
   When removing overlapped versions, version 6 is removed but the delete predicate is not removed.
3. Unfortunately, the 0-10 version cloned from A has data dated 20181221.
4. B performs compaction, and the data for 20181221 is wrongly removed.
2019-01-10 18:57:50 +08:00
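The failure sequence in 65eed5fdc1 can be replayed with a small Python model. The header layout, `remove_overlapped`, and `compact` are hypothetical simplifications, not Doris structures, and the `drop_predicates=True` branch is only one plausible remedy; the commit message describes the bug, not the exact fix.

```python
def make_header_b():
    """Replica B before the clone: version 6 carries the delete predicate."""
    return {
        "versions": [(0, 5), (6, 6), (7, 7), (9, 9), (10, 10)],
        "delete_predicates": {6: lambda row: row["day"] == 20181221},
    }


def remove_overlapped(header, new_version, drop_predicates):
    """Replace every version covered by new_version with new_version itself."""
    lo, hi = new_version
    overlapped = [v for v in header["versions"] if lo <= v[0] and v[1] <= hi]
    header["versions"] = [
        v for v in header["versions"] if v not in overlapped
    ] + [new_version]
    if drop_predicates:
        # Assumed remedy: drop the predicate together with its version.
        for start, _ in overlapped:
            header["delete_predicates"].pop(start, None)


def compact(rows, delete_predicates):
    """Drop every row matched by a delete predicate still kept in the header."""
    return [r for r in rows if not any(p(r) for p in delete_predicates)]


# Version 0-10 cloned from A still contains rows for day 20181221.
cloned_rows = [{"day": 20181220, "v": 1}, {"day": 20181221, "v": 2}]

buggy = make_header_b()
remove_overlapped(buggy, (0, 10), drop_predicates=False)
buggy_rows = compact(cloned_rows, buggy["delete_predicates"].values())

fixed = make_header_b()
remove_overlapped(fixed, (0, 10), drop_predicates=True)
fixed_rows = compact(cloned_rows, fixed["delete_predicates"].values())
```

With the stale predicate left in the header, compaction drops the 20181221 row; once the predicate goes away with its version, both rows survive.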
742dc796b2 Fix inconsistency of three replicas belongs to one tablet (#523)
There are A, B, C replicas of one tablet.
A has 0 - 10 version.
B has 0 - 5, 6, 7, 9, 10 version.
1. B is missing versions, so it clones 0 - 10 from A and removes the overlapped versions in its header.
2. Coincidentally, 6 is a version for a delete predicate (delete where day = 20181221).
   When removing overlapped versions, version 6 is removed but the delete predicate is not removed.
3. Unfortunately, the 0-10 version cloned from A has data dated 20181221.
4. B performs compaction, and the data for 20181221 is wrongly removed.
2019-01-10 18:51:12 +08:00
eae755f833 Fix bug that schema change does not set null value correctly (#524) 2019-01-10 18:35:03 +08:00
0b50617542 Fix BE core if WHEN expr is null in CASE-WHEN clause (#521)
#518
2019-01-10 13:40:28 +08:00
d372b04e42 Revert "Add cpu and io indicates to audit log (#513)" (#520)
This reverts commit 5192e2f010308eefffa5271b0bdc947dfd9168ae.
2019-01-10 12:44:09 +08:00
5192e2f010 Add cpu and io indicates to audit log (#513)
Record query consumption in the FE audit log. It basically works as follows: one instance of a parent plan is responsible for accumulating its sub plans' consumption and sending it to its own parent; the BE coordinator gets the total consumption because it is a single instance.
2019-01-09 22:28:20 +08:00
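The bottom-up accumulation described in 5192e2f010 can be sketched as a recursive sum over a plan tree. `PlanInstance` and its fields are hypothetical names for illustration; the real implementation passes the values between instances rather than walking an in-memory tree.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class PlanInstance:
    """Hypothetical plan instance with its own CPU and IO consumption."""
    name: str
    cpu_ns: int
    io_bytes: int
    children: List["PlanInstance"] = field(default_factory=list)

    def total(self) -> Tuple[int, int]:
        # Each parent adds the totals of its sub plans to its own consumption
        # and hands the sum upwards.
        cpu, io = self.cpu_ns, self.io_bytes
        for child in self.children:
            child_cpu, child_io = child.total()
            cpu += child_cpu
            io += child_io
        return cpu, io


scan_a = PlanInstance("scan_a", cpu_ns=10, io_bytes=100)
scan_b = PlanInstance("scan_b", cpu_ns=20, io_bytes=200)
join = PlanInstance("join", cpu_ns=5, io_bytes=0, children=[scan_a, scan_b])
coordinator = PlanInstance("coordinator", cpu_ns=1, io_bytes=0, children=[join])

total_cpu, total_io = coordinator.total()
```

Because the coordinator is the single root, the value it ends up with is the consumption of the whole query.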
69f9987abd EsTable without partition info (#511) 2019-01-09 11:14:19 +08:00
92b138121b Support io and cpu indicates for current query (#497)
Help to locate big queries when the system is overloaded, by checking the consumption of the running parts of all current queries or of one specified query. It basically works as follows: first, trigger BE to report RuntimeProfiles and wait a moment; second, calculate the consumption from the RuntimeProfiles reported by BE. The consumption covered is the cost of the ExecNodes running in the query at the time of the call.
2019-01-08 10:59:42 +08:00
cbf1f99a46 Fix parse es state failed in unit test (#502) 2019-01-07 14:13:26 +08:00
9bfd8d818a Add md5 property for UDF create statement (#500) 2019-01-06 19:45:04 +08:00
483c5a971e Add routine load statement (#456)
1. Add sql parser and sql scanner for routine load stmt such as KW_ROUTINE(routine), KW_PAUSE.
2. Create routine load statement like
      CREATE ROUTINE LOAD name ON database.table
      (properties of routine load)
      [PROPERTIES (key1=value1, )]
      FROM [KAFKA](type of routine load)
      (properties of this type)

      properties of routine load:
          The load properties of CreateRoutineLoadStmt are unordered: both 'LoadColumnsInfo, PartitionNames xxx' and 'PartitionNames, ColumnsInfo xxx' are valid.
          [COLUMNS TERMINATED BY separator ]
          [(col1, ...)]
          [SET (k1=f1(xx), k2=f2(xx))]
          WHERE
          [PARTITION (p1, p2)]

      type of routine load:
          KAFKA

      different type has different properties
      properties of this type:
          k1 = v1
          k2 = v2
3. Pause/Resume/Stop routine load statement like
      PAUSE/RESUME/STOP ROUTINE LOAD jobName
4. DdlExecutor supports CreateRoutineLoadStmt and Pause/Resume/StopRoutineLoadStmt
5. Pause/Stop routine load will immediately clear all tasks which belong to the job
      Tasks which have not been committed will be aborted.
6. Resume routine load will change the job state to need scheduler
      The RoutineLoadJobScheduler will schedule it later.
7. Show routine load statement like
      SHOW ROUTINE LOAD jobName
8. All load properties, such as LoadColumnsInfo, PartitionsNames etc., can implement LoadProperty
9. The sql of LoadColumnsInfo is Columns (c1, c2, c3) set (c1, c2, c3=c1+c2)
10. Add a check of routineLoadName: db.routineLoadName must be unique in a database while the job is not in a final state.
2019-01-04 13:49:49 +08:00
968364d4a6 Build boost with custom GCC (#499) 2019-01-04 12:20:06 +08:00
18c9527dc0 Change lzo-master to lzo-2.10 (#498) 2019-01-04 11:43:33 +08:00
a51ce03595 Enhance the usability of Load operation (#490)
1. Add broker load error hub
A broker load error hub collects error messages during the load process and saves them as a file to the specified remote storage via broker, because in a broker/mini/streaming load process the user may not be able to access the error log file on the Backend directly.
We also add a new header option, 'enable_hub', to the streaming load request; its default is false. Enabling the broker load error hub significantly slows down streaming load, because of the remote storage access via broker, so users can disable the error hub with this header option to avoid slowing down the load.

2. Show load error logs by using SHOW LOAD WARNINGS stmt
We also provide an easier way to get load error logs. We implement the 'SHOW LOAD WARNINGS ON 'url'' stmt to show load error logs directly. The 'url' in the stmt is provided by the 'SHOW LOAD' stmt.
eg:
show load warnings on "http://192.168.1.1:8040/api/_load_error_log?file=__shard_2/error_log_xxx";

3. Support now() function in broker load
Users can map a column to now() in a broker load stmt, which means this column will be filled with the time when the ETL started.

4. Support more types of wildcard in broker load
Currently, we only support the wildcard '*' to match file names. A wildcard like '/path/to/20190[1-4]*' is not supported.
2019-01-03 19:07:27 +08:00
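To illustrate what a pattern such as '/path/to/20190[1-4]*' from a51ce03595 is meant to select, here is a short Python sketch. It uses Python's own `fnmatch` module as a stand-in; the matching rules actually applied by the broker may differ, and the file names are made up for the example.

```python
from fnmatch import fnmatchcase

# Hypothetical file listing; only the pattern comes from the commit message.
files = [
    "/path/to/201901_a.csv",
    "/path/to/201904_b.csv",
    "/path/to/201905_c.csv",
    "/path/to/201812_d.csv",
]

pattern = "/path/to/20190[1-4]*"

# [1-4] matches exactly one character in the range 1..4,
# and * matches the rest of the name.
matched = [f for f in files if fnmatchcase(f, pattern)]
```

Under these glob rules the character class keeps the 201901 and 201904 files and excludes 201905 and 201812, which a bare '*' cannot express.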
7057db8442 Build libcurl with openssl (#496) 2019-01-03 18:22:08 +08:00
d1bdb55302 Fix bug that schema change on restored table will lose data (#489)
In the TabletInvertedIndex class, the instance of TabletMeta in
'tabletMetaMap' and 'tabletMetaTable' should be the same.
Otherwise, when we change the schema hash info of the TabletMeta
in 'tabletMetaMap', the TabletMeta in 'tabletMetaTable' is left
unchanged, which causes inconsistency of the meta.
2019-01-02 09:51:06 +08:00
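The aliasing problem behind d1bdb55302 can be shown with a minimal Python sketch. The `TabletMeta` class, the ids, and the key shape of the second container are simplified assumptions for illustration, not the actual Java definitions.

```python
class TabletMeta:
    """Simplified stand-in holding only the schema hash."""

    def __init__(self, schema_hash):
        self.schema_hash = schema_hash


# Inconsistent setup: each container holds its own TabletMeta instance.
tablet_meta_map = {10001: TabletMeta(schema_hash=111)}
tablet_meta_table = {(1, 2): TabletMeta(schema_hash=111)}  # hypothetical key

tablet_meta_map[10001].schema_hash = 222  # schema change updates one copy
stale_hash = tablet_meta_table[(1, 2)].schema_hash

# Consistent setup: both containers reference the same instance.
shared = TabletMeta(schema_hash=111)
shared_map = {10001: shared}
shared_table = {(1, 2): shared}

shared_map[10001].schema_hash = 222
fresh_hash = shared_table[(1, 2)].schema_hash
```

With separate instances the second container keeps the old schema hash; with one shared instance an update through either container is visible through both.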
ff7d3e5878 Unify the print method of TUniqueId (#487) 2018-12-29 16:22:38 +08:00
7380483394 Support UDF (#468)
Now users can create a UDF with the CREATE FUNCTION statement. Doris only
supports UDF in this version; it will support UDAF/UDTF later.
2018-12-29 09:13:04 +08:00
3faf443f52 Fixed: prometheus2.6 metrics (#478) 2018-12-29 09:08:57 +08:00
46c70a16b1 Add more detail logs to debug streaming load (#484)
* Add more detail logs to debug streaming load

* fix bugs

* fix bugs
2018-12-28 19:42:09 +08:00
d471ee5f37 Add md5sum check for third party packages (#480) 2018-12-28 17:33:45 +08:00
f88970a454 Fix get_tablet_stat data race and base_compaction deletion check bug. (#477)
1. It is wrong to use _tablet_map_lock to protect the critical region in the get_tablet_stat function.
   Add a _tablet_stat_mutex to protect the critical region.
2. When base_compaction finishes, it checks whether any version is missing in the tablet.
   If so, BE will core dump. Now the tablet's integrity is checked in advance.
2018-12-27 19:50:08 +08:00
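The first point of f88970a454 can be sketched in Python as a dedicated mutex around the statistics refresh. This assumes that the critical region is the update of a cached statistics map, which the commit message does not spell out; `TabletManager` and its fields are hypothetical, and only the two lock names echo the message.

```python
import threading


class TabletManager:
    """Hypothetical manager with separate locks for the map and the stats."""

    def __init__(self):
        self._tablet_map_lock = threading.Lock()    # guards the tablet map
        self._tablet_stat_mutex = threading.Lock()  # guards only the stat cache
        self._tablets = {}
        self._stat_cache = {}

    def add_tablet(self, tablet_id, row_count):
        with self._tablet_map_lock:
            self._tablets[tablet_id] = row_count

    def get_tablet_stat(self):
        # The stat cache is refreshed under its own mutex, so concurrent
        # callers cannot interleave their updates of the cache.
        with self._tablet_stat_mutex:
            with self._tablet_map_lock:
                snapshot = dict(self._tablets)
            self._stat_cache = snapshot
            return dict(self._stat_cache)


mgr = TabletManager()
mgr.add_tablet(1, 10)
mgr.add_tablet(2, 20)

results = []


def worker():
    results.append(mgr.get_tablet_stat())


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The map lock is held only long enough to take a snapshot, while the cache update is serialised separately.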
4655b96580 Fix bug that generating incorrect wSymbol in InPredicate (#472) 2018-12-27 19:43:15 +08:00
7b22f82957 Adjust format of README.md (#476) 2018-12-27 17:39:15 +08:00
99c912d884 Complete license and build script (#473)
1. Complete license and build script
2. Complete build script
3. Complete the README.md
2018-12-27 16:49:53 +08:00
c74c915441 Fix license issue of gutil (#471) 2018-12-27 13:53:27 +08:00
7d7934112f Fix fe ut (#469)
1. Fix StreamLoadScanNodeTest
2. Revert the fix for decimal values in scientific notation; this still needs to be fixed later
2018-12-25 20:07:03 +08:00
d2cd8cf180 Make column's name be case insensitive in load stmt (#464)
1. Make column's name be case insensitive in broker load
2. Make column name in stream load be case insensitive too
2018-12-25 16:57:41 +08:00
053bf62460 Change WriteLock to ReadLock when get_tablet_stat (#466) 2018-12-25 16:10:01 +08:00
b037466d56 Get rid of choosing one tablet by compaction (#433)
1. Get rid of choosing one tablet by compaction.
2. Change PREFER_READER to PREFER_WRITING for _tablet_map_lock.
3. Change license of murmur_hash
2018-12-24 16:55:39 +08:00
5b1e3d3f40 Optimize backup & restore process (#460)
1. Print broker address for debug.
2. Do not let a backup job be cancelled if it is already in state UPLOAD_INFO.
3. Cancel task on Backends when job is cancelled.
4. Show detail progress of backup and restore job.
5. Make 'show snapshot' result more readable.
6. Change upload and download thread num of backup and restore in Backend to 1.
2018-12-24 16:49:16 +08:00
945aaf8923 Fix core when release UDF cache entry (#462) 2018-12-24 14:34:31 +08:00
f1db289934 Fix compile error (#461) 2018-12-24 10:26:03 +08:00
90d71508ff Add UserFunctionCache to cache UDF's library (#453)
* Add UserFunctionCache to cache UDF's library

This patch replaces LibCache with UserFunctionCache. LibCache uses the HDFS
URL to identify a UDF's library, and when the BE process restarts, all
downloaded libraries have to be loaded again. We now use the function id
to identify a library, so when the process restarts, all downloaded
libraries can be loaded without downloading them again.

* update
2018-12-21 22:07:21 +08:00
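The idea behind 90d71508ff, keying cached libraries by function id so that they survive a restart, can be sketched in Python. The class below only borrows the name `UserFunctionCache`; its methods, the file naming scheme, and the `download` callback are hypothetical and do not mirror the BE implementation.

```python
import os
import tempfile


class UserFunctionCache:
    """Hypothetical on-disk cache: one file per function id."""

    def __init__(self, cache_dir, download):
        self.cache_dir = cache_dir
        self.download = download  # callable(url) -> bytes, supplied by the caller
        os.makedirs(cache_dir, exist_ok=True)

    def get_library(self, function_id, url):
        path = os.path.join(self.cache_dir, f"{function_id}.so")
        if not os.path.exists(path):
            # Download only when no file for this function id exists yet.
            with open(path, "wb") as f:
                f.write(self.download(url))
        return path


downloads = []


def fake_download(url):
    downloads.append(url)
    return b"library bytes"


cache_dir = tempfile.mkdtemp()
before_restart = UserFunctionCache(cache_dir, fake_download)
path1 = before_restart.get_library(42, "http://example.invalid/udf.so")

# A new object over the same directory stands in for a restarted process.
after_restart = UserFunctionCache(cache_dir, fake_download)
path2 = after_restart.get_library(42, "http://example.invalid/udf.so")
```

Since the file name depends only on the function id, the second instance finds the library on disk and does not download it again.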
0341ffde67 Revert commit 'Add log to detect empty load file' (#447)
It looks like we still need to send a push task without a file
to the Backend, or the load job will fail.
Fix it later.
2018-12-19 12:29:12 +08:00
5a6e5cfd07 Add log to detect empty load file (#445)
We found that a load file may not be generated for a rollup tablet;
add a log to observe this.
2018-12-18 12:44:36 +08:00
b9201ece0b Parse thrift port from cluster state (#443) 2018-12-18 11:28:42 +08:00
e99468c387 Add HttpClient class (#441)
Replace FileDownloader with HttpClient. This patch changes the downloads
of clone_copy and pusher.
2018-12-18 09:55:11 +08:00
e177c23787 Update the increased frequency of priority for remaining tasks in BlockingPriorityQueue (#434) 2018-12-17 21:05:57 +08:00