doris

Author	SHA1	Message	Date
Zhengguo Yang	4bdeef3b64	[chore][fix][doc](fe-plugin)(mysqldump) fix build auditlog plugin error (#7804) 1. fix problems when building fe_plugins 2. format 3. add docs about dumping data with mysqldump	2022-01-26 09:11:23 +08:00
Zhengguo Yang	738d2d2e07	[refactor] update parent pom version and optimize build scripts (#7548)	2022-01-05 10:45:11 +08:00
Zhengguo Yang	2872dbfeb8	[refactor] Standardize the writing of pom files, prepare for deployment to maven (#7477)	2021-12-30 10:16:37 +08:00
caiconghui	382351b0ee	[fix](ut) Fix run fe ut failed, be ut memory leak and build thirdparty failed (#7377)	2021-12-15 11:00:20 +08:00
Zhengguo Yang	926540c561	[feature] Support returning bitmap/hll data in select statement (#7276) Support returning bitmap/hll data in select statements; this can be used when show_object_data=true is set.	2021-12-15 09:48:27 +08:00
Zhengguo Yang	d420ff0afd	Display current load bytes to show load progress (#7134) This value may be greater than the file size when loading parquet or orc files, and less than the file size when loading csv files.	2021-11-24 10:08:32 +08:00
lihuigang	e9282205f1	[feat-opt](spark-load) support bitmap binary data from hive in spark load (#6883) Support loading the binary data of bitmap values from Hive into Doris. fix #6461	2021-11-20 21:38:38 +08:00
lihuigang	35da149ebe	[SparkDpp]Add not() and xor() methods to bitmapValue (#6885)	2021-11-12 10:38:15 +08:00
dohongdayi	ea17682d1f	[Typo] Correct misspellings in SparkDpp (#6789)	2021-10-10 23:07:39 +08:00
Xiang Wei	6ac0ab6b29	fix(sparkload): bitmap deep copy in `or` operator (#6480) Fix multiple rollups holding the same reference to a bitmapvalue, which may then be updated repeatedly. Co-authored-by: weixiang <weixiang06@meituan.com>	2021-09-02 12:15:02 +08:00
Xiang Wei	52f39e3fde	[Bug][SparkLoad]: bitmap value in `or` operator in spark load should be deep copied (#6453) Fix multiple rollups holding the same reference to a bitmapvalue, which may then be updated repeatedly. fix #6452	2021-08-19 14:17:31 +08:00
Xiang Wei	60ac4a9660	[Bug][SparkLoad] Fix bucket_hash_value for bool value (#6284) Co-authored-by: weixiang <weixiang06@meituan.com>	2021-07-27 13:38:42 +08:00
wangbo	ba84eacb8c	(#6009) fix bucket key distribution error when using spark load (#6087)	2021-06-29 12:30:08 +08:00
Xiang Wei	9f706848b9	[Bug] Fix some bugs about Spark Load (#5701) The distinct count result of bitmap/hll columns may be incorrect in spark load mode. Fix some bugs in spark load to solve this problem. 1. FE is big-endian but BE is little-endian; BitmapValues should be converted to little-endian in FE's serialization 2. BitmapUnionAggregator/HllUnionAggregator ignore `null` values 3. Make sure encodeVarint64 in FE is consistent with BE Co-authored-by: weixiang <weixiang06@meituan.com>	2021-05-07 11:18:23 +08:00
zh0122	18c2553ef8	[FE][Bug] Update Spark version to fix a security issue (#5593) Fix CVE-2020-9480: Apache Spark RCE vulnerability in auth-enabled standalone master https://spark.apache.org/security.html#CVE-2020-9480	2021-04-06 11:02:04 +08:00
copperybean	d8202ca9cc	[Enhancement] move common codes from fe-core to fe-common and remove log4j1 (#5317) (#5318) The IO-related code may be used by new modules, so it is better to move it to fe-common. fe-core is modified frequently, but the many java files generated by thrift slow down compilation, so it is better to move the thrift generation process to fe-common. Currently both log4j1 and log4j2 are used, which leads to logs being written to the wrong files. This change removes log4j1 from the dependencies and uses slf4j + slf4j -> log4j2 instead.	2021-02-04 13:41:03 +08:00
wangbo	41ef9ccda9	(#5224) some small fixes for spark load (#5233) 1. use yyyy-MM-dd instead of YYYY-MM-DD 2. unify lower case for bitmap column names	2021-01-27 11:16:59 +08:00
Dam1029	834834dc44	[SparkLoad] Avoid reading the whole hive table when a where clause is added (#5047) When using spark load from a hive table, the function loadDataFromHiveTable reads the whole hive table and then filters the data in process(). If the hive table has lots of partitions and historical data, the load costs too much time and resources. So the filtering can be done in loadDataFromHiveTable when reading from the hive table. Co-authored-by: 杜安明 <anming.du@mihoyo.com>	2020-12-15 09:26:42 +08:00
wangbo	2af4bc294f	[Bug] Java version of BitmapValue fails to deserialize when it only has a 32-bit bitmap (#4884)	2020-11-16 21:54:07 +08:00
wangbo	2c24fe80fa	[SparkDpp] Support complete types (#4524) For [Spark Load]: 1. support decimal and largeint 2. add validation logic for char/varchar/decimal 3. check data loaded from hive with strict mode 4. support decimal/date/datetime aggregator	2020-09-13 11:57:33 +08:00
xy720	aae942b982	[Spark Load][Bug] Keep the column splitting in spark load consistent with broker load / mini load (#4532)	2020-09-06 20:33:26 +08:00
xy720	f5ee854b6f	[Spark load][Bug] Fix column terminator for spark load (#4491) Support specifying the column separator without a backslash.	2020-09-02 10:54:03 +08:00
wyb	82940a4905	[Spark Load] Fix spark load bugs (#4464) 1. fix writing dpp result when dpp throws an exception 2. boolean value: true, false (IgnoreCase), 0, 1 3. wrong dest column for source data check 4. support * in source file path 5. if job state is cancelled or finished, submitPushTasks would throw an "all partitions have no load data" exception, because tableToLoadPartitions was already cleaned up #3433	2020-08-27 23:40:33 +08:00
wangbo	790779fb6f	[SparkLoad] remove unnecessary conversion from dataframe to rdd (#4304)	2020-08-13 23:37:38 +08:00
Mingyu Chen	0e79f6908b	[CodeRefactor] Modify FE modules (#4146) This CL mainly changes: 1. Add 2 new FE modules: (a) fe-common, which holds all common classes for other modules, currently only `jmockit`; (b) spark-dpp, the Spark DPP application for Spark Load. All dpp related classes are moved to this module, including unit tests. 2. Change `build.sh`: add a new param `--spark-dpp` to compile `spark-dpp` alone, while `--fe` compiles all FE modules. The output of the `spark-dpp` module is `spark-dpp-1.0.0-jar-with-dependencies.jar`, which is installed to `output/fe/spark-dpp/`. 3. Fix some bugs of spark load	2020-07-29 16:18:05 +08:00

25 Commits