We use the time wheel algorithm to schedule and trigger periodic tasks. The implementation is modeled on Netty's HashedWheelTimer.
Periodically (every 10 minutes by default), we put the events that need to fire in the upcoming cycle into the time wheel for scheduling. To keep triggering efficient and to prevent a blocked task from delaying subsequent scheduling, we use Disruptor to implement a producer-consumer model.
When a task expires and needs to be triggered, it is published to the Disruptor's RingBuffer, and a consumer thread then consumes it.
Consumers register for events, and event registration requires an event executor. An event executor is a functional interface with a single method that executes the event.
If it is a one-shot event, the event definition is deleted after scheduling completes; if it is a periodic event, it is put back into the time wheel according to its schedule once the current run completes.
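Below is a minimal, self-contained sketch of this flow using Netty's HashedWheelTimer and the LMAX Disruptor; the names TimerJobEvent, TimerJobExecutor, and jobId are placeholders for illustration, not the actual Doris classes.
```
// Minimal sketch of the time-wheel + Disruptor flow described above.
import com.lmax.disruptor.RingBuffer;
import com.lmax.disruptor.dsl.Disruptor;
import com.lmax.disruptor.util.DaemonThreadFactory;
import io.netty.util.HashedWheelTimer;

import java.util.concurrent.TimeUnit;

public class TimerSchedulerSketch {

    // Event executor: a functional interface with a single method that executes the event.
    @FunctionalInterface
    interface TimerJobExecutor {
        void execute(long jobId);
    }

    // Event object carried through the Disruptor RingBuffer.
    static class TimerJobEvent {
        long jobId;
        TimerJobExecutor executor;
    }

    public static void main(String[] args) throws Exception {
        // Consumer side: the registered handler executes expired tasks off the RingBuffer,
        // so a slow task never blocks the time wheel's tick thread.
        Disruptor<TimerJobEvent> disruptor = new Disruptor<>(
                TimerJobEvent::new, 1024, DaemonThreadFactory.INSTANCE);
        disruptor.handleEventsWith((event, sequence, endOfBatch) ->
                event.executor.execute(event.jobId));
        RingBuffer<TimerJobEvent> ringBuffer = disruptor.start();

        // Producer side: when the time wheel fires, the task is only published to the
        // RingBuffer; the actual execution happens on the consumer thread.
        HashedWheelTimer timer = new HashedWheelTimer(100, TimeUnit.MILLISECONDS, 512);
        TimerJobExecutor executor = jobId -> System.out.println("run job " + jobId);
        timer.newTimeout(timeout -> ringBuffer.publishEvent((event, seq) -> {
            event.jobId = 1L;
            event.executor = executor;
        }), 5, TimeUnit.SECONDS);

        Thread.sleep(6_000);
        timer.stop();
        disruptor.shutdown();
    }
}
```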
When compiling FunctionArrayEnumerateUniq::_execute_by_hash, AllocatorWithStackMemory::free(buf)
is called when the HashMapContainer is deleted. The GCC compiler concludes that size > N and that buf is not heap memory,
and reports the error "'void free(void*)' called on unallocated object 'hash_map'".
This only fails on doris docker + gcc 11.1; there is no problem on doris docker + clang 16.0.1,
nor on ldb_toolchain gcc 11.1 and clang 16.0.1.
Infer DISTINCT from a distinct SetOperator, and place the distinct above each child to reduce the data volume; for example, since `SELECT a FROM t1 UNION SELECT a FROM t2` deduplicates its result, each child can be deduplicated before the union.
tpcds_sf100 q14:
before: 100 rows in set (7.60 sec)
after: 100 rows in set (6.80 sec)
Add whether the Nereids planner and the pipeline engine are used to the query profile, for example:
Summary:
- Profile ID: 460e710601674438-9df2d685bdfc20f8
- Task Type: QUERY
...
- Is Nereids: Yes
- Is Pipeline: Yes
- Is Cached: No
This file is used when compiling Doris in the regression pipeline, and we can modify it to control the compile behavior.
I added BUILD_FS_BENCHMARK=ON so that fs_benchmark_tool will be built.
1. Fix a bug where a field of s3_file_write_bufferpool is not initialized, causing undefined behavior.
2. Add the fs_s3 benchmark tool; see https://github.com/apache/doris/pull/20770 for tool usage. Also improve the output:
`sh bin/run-fs-benchmark.sh --conf=conf/s3.conf --fs_type=s3 --operation=single_read --threads=1 --iterations=1`
```
------------------------------------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations UserCounters...
------------------------------------------------------------------------------------------------------------------------------
S3ReadBenchmark/iterations:1/repeats:3/manual_time/threads:1 7366 ms 123 ms 1 ReadRate(B/S)=12.1823M/s ReadTime(S)=7.36572 ReadTotal(B)=89.7314M
S3ReadBenchmark/iterations:1/repeats:3/manual_time/threads:1 6163 ms 116 ms 1 ReadRate(B/S)=14.5597M/s ReadTime(S)=6.16299 ReadTotal(B)=89.7314M
S3ReadBenchmark/iterations:1/repeats:3/manual_time/threads:1 6048 ms 110 ms 1 ReadRate(B/S)=14.8366M/s ReadTime(S)=6.04796 ReadTotal(B)=89.7314M
S3ReadBenchmark/iterations:1/repeats:3/manual_time/threads:1_mean 6526 ms 116 ms 3 ReadRate(B/S)=13.8596M/s ReadTime(S)=6.52556 ReadTotal(B)=89.7314M
S3ReadBenchmark/iterations:1/repeats:3/manual_time/threads:1_median 6163 ms 116 ms 3 ReadRate(B/S)=14.5597M/s ReadTime(S)=6.16299 ReadTotal(B)=89.7314M
S3ReadBenchmark/iterations:1/repeats:3/manual_time/threads:1_stddev 730 ms 6.68 ms 3 ReadRate(B/S)=1.45914M/s ReadTime(S)=0.729876 ReadTotal(B)=0
S3ReadBenchmark/iterations:1/repeats:3/manual_time/threads:1_cv 11.18 % 5.75 % 3 ReadRate(B/S)=10.53% ReadTime(S)=11.18% ReadTotal(B)=0.00%
S3ReadBenchmark/iterations:1/repeats:3/manual_time/threads:1_max 7366 ms 123 ms 3 ReadRate(B/S)=14.8366M/s ReadTime(S)=7.36572 ReadTotal(B)=89.7314M
S3ReadBenchmark/iterations:1/repeats:3/manual_time/threads:1_min 6048 ms 110 ms 3 ReadRate(B/S)=12.1823M/s ReadTime(S)=6.04796 ReadTotal(B)=89.7314M
```
* Workaround: when ingesting a binlog after backup/restore, local_tablet.partition_id is not correct, so use req.partition_id instead.
Signed-off-by: Jack Drogon <jack.xsuperman@gmail.com>
Agg stats estimation should use the largest group-by key's NDV as the base and multiply it by an expansion factor calculated from the other group-by keys' NDVs.
Before, we used the smallest NDV as the base.
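As a rough formalization of the new estimate (the exact expansion function is not spelled out here, so it is left abstract as $E$): for group-by keys $k_1, \dots, k_n$ with $i^{*} = \arg\max_i \mathrm{NDV}(k_i)$,

$$\mathrm{NDV}_{\mathrm{agg}} \approx \mathrm{NDV}(k_{i^{*}}) \times \prod_{j \ne i^{*}} E\big(\mathrm{NDV}(k_j)\big), \qquad E(\cdot) \ge 1,$$

whereas the previous estimate used $\min_i \mathrm{NDV}(k_i)$ as the base.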
Support binding an external relation outside the Doris FE environment, for example to analyze SQL in another Java application.
See BindRelationTest.bindExternalRelation.
This PR adds the function for collecting Hive statistics. When the CBO fetches Hive table statistics, the statistics cache will
first load from the internal stats OLAP table. If not found, it then uses this PR's function to fetch the statistics from the remote Hive metastore.
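A minimal sketch of that two-level lookup, using a Guava LoadingCache; loadFromInternalStatsTable and fetchFromHiveMetastore are hypothetical helpers standing in for the real implementations.
```
// Sketch of the two-level statistics lookup: internal stats table first, HMS as fallback.
import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;

import java.util.Optional;
import java.util.concurrent.TimeUnit;

public class HiveStatsCacheSketch {

    // Placeholder for a column-statistics value object.
    static class ColumnStatistic {
        final double ndv;
        ColumnStatistic(double ndv) { this.ndv = ndv; }
    }

    private final LoadingCache<String, Optional<ColumnStatistic>> cache =
            CacheBuilder.newBuilder()
                    .expireAfterWrite(10, TimeUnit.MINUTES)
                    .build(new CacheLoader<String, Optional<ColumnStatistic>>() {
                        @Override
                        public Optional<ColumnStatistic> load(String key) {
                            // 1. Prefer statistics persisted in the internal stats OLAP table.
                            Optional<ColumnStatistic> fromOlap = loadFromInternalStatsTable(key);
                            if (fromOlap.isPresent()) {
                                return fromOlap;
                            }
                            // 2. Fall back to the remote Hive metastore.
                            return fetchFromHiveMetastore(key);
                        }
                    });

    Optional<ColumnStatistic> getColumnStatistic(String key) {
        // Callers always go through the cache; the loader decides where the value comes from.
        return cache.getUnchecked(key);
    }

    Optional<ColumnStatistic> loadFromInternalStatsTable(String key) {
        return Optional.empty(); // hypothetical: query the internal stats OLAP table
    }

    Optional<ColumnStatistic> fetchFromHiveMetastore(String key) {
        return Optional.of(new ColumnStatistic(100)); // hypothetical: call the remote HMS
    }
}
```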
Keep hadoop-aliyun version consistent with hadoop main version (3.3.5)
upgrade jackson to 2.14.3
upgrade netty version to 4.1.94.Final
pin checkerframework version to 3.32.0
upgrade snappy-java to 1.1.10.1
upgrade hudi version to 0.13.1
upgrade spring version to 2.7.13
upgrade orc version to 1.8.4
revert nonsensical changes
In PR #21168, we refactored physical properties and the translator
to ensure no useless exchange is generated. An olap scan node
could be gather-distributed in Nereids but translated to hash partitioned.
Since the coordinator could not process a gather olap scan node,
we remove the candidate distribution spec of olap scan.
When creating a new Hive catalog or refreshing an existing one, the HiveMetaStore cache is refreshed,
which calls "FileInputFormat.setInputPaths()".
This method creates a new FileSystem instance and stores it in FileSystem's static cache.
So if the catalog is refreshed frequently, too many FileSystem instances accumulate in the cache, causing OOM.
This PR disables the FileSystem cache.
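For reference, a minimal illustration of the Hadoop FileSystem cache behavior; `fs.<scheme>.impl.disable.cache` is the standard Hadoop knob for this (shown here for HDFS with a placeholder URI), not necessarily the exact way this PR disables the cache.
```
// Demonstrates bypassing the static FileSystem cache via the standard Hadoop property.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import java.net.URI;

public class FsCacheDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // With the cache enabled (default), FileSystem.get() keeps one instance per
        // (scheme, authority, ugi) key alive in a static map, so repeated catalog
        // refreshes keep adding instances and can eventually cause OOM.
        // Disabling the per-scheme cache avoids that accumulation.
        conf.setBoolean("fs.hdfs.impl.disable.cache", true);

        // Placeholder URI; point this at a real cluster to run it.
        FileSystem fs = FileSystem.get(new URI("hdfs://nameservice1/"), conf);
        try {
            fs.exists(new Path("/tmp"));
        } finally {
            // Uncached instances must be closed explicitly by the caller.
            fs.close();
        }
    }
}
```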
Try to reuse an existing UGI in DFSFileSystem; otherwise, querying an HMS table with more than ten thousand partitions performs more than ten thousand login operations, and each login costs hundreds of milliseconds in my tests (see the sketch below).
Co-authored-by: 王翔宇 <wangxiangyu@360shuke.com>
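A minimal sketch of the reuse pattern referenced above; the cache key and helper name are placeholders, not the actual DFSFileSystem code.
```
// Reuse one logged-in UGI per (principal, keytab) instead of logging in on every call.
import org.apache.hadoop.security.UserGroupInformation;

import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class UgiCacheSketch {
    // Kerberos login costs hundreds of ms, so logging in once and reusing the UGI avoids
    // tens of thousands of logins when a table has tens of thousands of partitions.
    private static final Map<String, UserGroupInformation> UGI_CACHE = new ConcurrentHashMap<>();

    static UserGroupInformation getOrLogin(String principal, String keytab) {
        return UGI_CACHE.computeIfAbsent(principal + "|" + keytab, key -> {
            try {
                return UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytab);
            } catch (IOException e) {
                throw new RuntimeException("kerberos login failed", e);
            }
        });
    }
}
```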
testEliminatingSortNode needs to check whether a SortNode exists in the plan tree, so it should check plan1.contains("order by:") rather than plan1.contains("SORT INFO:") or plan1.contains("SORT LIMIT:").