Commit Graph

72 Commits

Author SHA1 Message Date
1d05feea1b [Feature](Nereids) add executable function to support fold constant for functions (#18209)
1. Add date-time functions for fold constant for Nereids.
This is the list of executable date-time function nereids supports up to now:
- now()
- now(int)
- current_timestamp()
- current_timestamp(int)
- localtime()
- localtimestamp()
- curdate()
- current_date()
- curtime()
- current_time()
- date_{add/sub}(),{years/months/days/hours/minutes/seconds}_{add/sub}()
- datediff()
- {date/datev2}()
- {year/quarter/month/day/hour/minute/second}()
- dayof{year/month/week}()
- date_format()
- date_trunc()
- from_days()
- last_day()
- to_monday()
- from_unixtime()
- unix_timestamp()
- utc_timestamp()
- to_date()
- to_days()
- str_to_date()
- makedate()

2. solved problem:
- enable datev2/datetimev2 default.
- refactor Nereids foldConstantOnFE and support fold nested expression.
- separate the executable into multi-files for easily-reading and adding new functions
2023-05-17 21:26:31 +08:00
2bdfaac609 [fix](ubsan) fix ubsan errors (#19658)
ixu ubsan errors:

doris/be/src/util/string_parser.hpp:275:58: runtime error: signed integer overflow: 2147483647 + 1 cannot be represented in type 'int'

doris/be/src/vec/functions/functions_comparison.h:214:51: runtime error: addition of unsigned offset to 0x7fea6c6b7010 overflowed to 0x7fea6c6b700c

doris/be/src/vec/functions/multiply.cpp:67:50: runtime error: signed integer overflow: 1295699415680000000 * 0x0000000000015401d0a4cd4890a77700 cannot be represented in type '__int128

doris/be/src/vec/aggregate_functions/aggregate_function_percentile_approx.h:445:73: runtime error: addition of unsigned offset to 0x7feca3343d10 overflowed to 0x7feca3343d08 

doris/be/src/exec/schema_scanner/schema_tables_scanner.cpp:330:24: run
2023-05-17 09:32:03 +08:00
2c836251b2 [Fix](schema scanner) Fixed the problem of overflow when multiplying two INT 2023-04-25 23:58:47 +08:00
339d804ec4 [Refactor](exceptionsafe) add factory creator to some class (#19000) 2023-04-25 14:33:47 +08:00
e412dd12e8 [chore](build) Use include-what-you-use to optimize includes (PART II) (#18761)
Currently, there are some useless includes in the codebase. We can use a tool named include-what-you-use to optimize these includes. By using a strict include-what-you-use policy, we can get lots of benefits from it.
2023-04-19 23:11:48 +08:00
9e960f4c4f [chore](build) Use include-what-you-use to optimize includes (#18681)
Currently, there are some useless includes in the codebase. We can use a tool named include-what-you-use to optimize these includes. By using a strict include-what-you-use policy, we can get lots of benefits from it.
2023-04-17 11:44:58 +08:00
759f1da32e [Enhencement](Backends) add HostName filed in backends table and delete backends table in information_schema (#18156)
1.  Add `HostName` field for `show backends` statement and `backends()` tvf.
2. delete the `backends` table in `information_schema` database
2023-04-07 08:30:42 +08:00
Pxl
e77833bfa1 [Bug](materialized-view) fix where clause persistence replay incorrect (#18228)
fix where clause persistence replay incorrect
2023-04-03 12:49:01 +08:00
d9fe5f7b67 [enhancement](memory) Remove MemPool and replace it with Arena (#17820)
Arena can replace MemPool in most scenarios. Except for memory reuse, MemPool supports reuse of previous memory chunks after clear, but Arena does not.

Some comparisons between MemPool and Arena:

 1. Expansion
     Arena is less than 128M index 2 alloc chunk; more than 128M memory, allocate 128M * n > `size`, n is equal to the minimum value that satisfies the expression;
     MemPool less than 512K index 2 alloc chunk, greater than 512K memory, separately apply for a `size` length chunk
     
     After Arena applied for a chunk larger than 128M last time, the minimum chunk applied for after that is 128M. Does this seem to be a waste of memory? MemPool is also similar. After the chunk of 512K was applied for last time, the minimum chunk of subsequent applications is 512K.

 2. Alignment
     MemPool defaults to 16 alignment, because memtable and other places that use int128 require 16 alignment;
     Arena has no default alignment;

 3. Memory reuse
     Arena only supports `rollback`, which reuses the memory of the current chunk, usually the memory requested last time.
     MemPool supports clear(), all chunks can be reused; or call ReturnPartialAllocation() to roll back the last requested memory; if the last chunk has no memory, search for the most free chunk for allocation

 4. Realloc
     Arena supports realloc contiguous memory; it also supports realloc contiguous memory from any position at the time of the last allocation. The difference between `alloc_continue` and `realloc` is:
         1. Alloc_continue does not need to specify the old size, but the default old size = head->pos - range_start
         2. alloc_continue supports expansion from range_start when additional_bytes is between head and pos, which is equivalent to reusing a part of memory, while realloc completely allocates a new memory
     MemPool does not support realloc, but supports transferring or absorbing chunks between two MemPools

 5. check mem limit
     MemPool checks the mem limit, and Arena checks at the Allocator layer.

 6. Support for ASAN
     Arena does something extra

 7. Error handling
     MemPool supports returning the error message of application failure directly through `Status`, and Arena throws Exception.
Tests that Arena can consider

 1. After the last applied chunk is larger than 128M, the minimum applied chunk is 128M, which seems to waste memory;

 2. Support clear, memory multiplexing;

 3. Increase the large list, alloc the memory larger than 128M, and the size is equal to `size`, so as to avoid the current chunk not being fully used, which is wasteful.

 4. In some cases, it may be possible to allocate backwards to find chunks t
2023-03-29 20:56:49 +08:00
bd8e3e6405 [refactor](date) unify DateTimeValue and VecDateTimeValue (#17670) 2023-03-20 16:27:08 +08:00
ee1be6edd7 [chore](fe) enhance_mysql_data_type (#17429) 2023-03-06 10:42:01 +08:00
a8f20eb4ac [Enhencement](schema_scanner) Optimize the performance of reading information schema tables (#17371)
batch fill block
batch call rpc from FE to get table desc
For 34w colunms

SELECT COUNT( * ) FROM information_schema.columns;
time: 10.3s --> 0.4s
2023-03-06 09:53:01 +08:00
68e9a66aa0 [Enchancement](schema scanner) add SchemaScanner profile (#17230)
Add some profile information to the schema scanner to facilitate performance optimization.

Example:

SchemaScanner:
      -  FillBlockTime:  9s131ms
      -  GetDbTime:  12.816ms
      -  GetDescribeTime:  1s645ms
      -  GetTableTime:  25.433ms
2023-03-01 08:34:27 +08:00
1f631c388d [enhance](cooldown)accelerate cooldown task produce efficiency (#16089) 2023-02-10 16:58:27 +08:00
125b60b4b9 [improvement](compatibility) add DATA_TYPE in information schema for new types #16391
Add DATA_TYPE in information schema for types: datev2, datatimev2, decimal, jsonb. It was 'unknown' for these types and cause problem for tools such as BI using information schema.
2023-02-03 22:28:42 +08:00
69e748b076 [fix](schema scanner)change schema_scanner::get_next_row to get_next_block (#15718) 2023-01-30 10:01:50 +08:00
199d7d3be8 [Refactor]Merged string_value into string_ref (#15925) 2023-01-22 16:39:23 +08:00
Pxl
b727033906 [Chore](build) enable -Wextra and remove some -Wno (#15760)
enable -Wextra and remove some -Wno
2023-01-15 10:40:35 +08:00
124c8662e8 [Bug](schema scanner) Fix wrong type in schema scanner (#15768) 2023-01-11 08:37:39 +08:00
1597afcd67 [fix](mutil-catalog) fix get many same name db/table when show where (#15076)
when show databases/tables/table status where xxx, it will change a selectStmt to select result from 
information_schema, it need catalog info to scan schema table, otherwise may get many
database or table info from multi catalog.

for example
mysql> show databases where schema_name='test';
+----------+
| Database |
+----------+
| test |
| test |
+----------+

MySQL [internal.test]> show tables from test where table_name='test_dc';
+----------------+
| Tables_in_test |
+----------------+
| test_dc |
| test_dc |
+----------------+
2022-12-19 14:27:48 +08:00
f3aea7f0f0 [Enhancement](status) Unify error code and enable customed err msg for BE internal errors (#14744) 2022-12-11 23:33:18 +08:00
826cfdaf93 [feature](information_schema) add backends information_schema table (#13086) 2022-11-08 22:15:10 +08:00
429ac929fb [chore](build) Support building from source on ubuntu-22.04 (aarch64) (#12813)
Support building from source on ubuntu-22.04
2022-09-27 10:29:13 +08:00
58508aea13 [enhance](information_schema) show hll type and bitmap type instead of unknown (#12519)
Before this pr, when querying data type of hll/bitmap column, 'unknown' would be returned instead of the correct data type of queried column.
2022-09-13 19:43:42 +08:00
169996d8e4 [feature](information_schema) add rowsets table into information_s… (#11266)
* [feature](information_schema) add 'segments' table into information_schema
2022-08-09 18:15:54 +08:00
4f3b4c7efc [Improvement] information_schema.columns support COLUMN KEY (#11228) 2022-07-27 12:22:17 +08:00
73ba806046 [feature-wip](multi-catalog) Add catalog to information_schema table "columns". (#10592) 2022-07-05 09:57:19 +08:00
7fe4b20da3 [feature-wip](multi-catalog) refactor catalog interface (#10320) 2022-06-25 21:51:54 +08:00
8abd00dcd5 [feature-wip](multi-catalog) Add catalog name to information schema. (#10349)
Information schema database need to show catalog name after multi-catalog is supported.
This part is step 1, add catalog name for schemata table.
2022-06-25 11:53:04 +08:00
Pxl
fd0bd395ac [Enhancement] Remove some unused include (#10035) 2022-06-17 10:47:25 +08:00
718a51a388 [refactor][style] Use clang-format to sort includes (#9483) 2022-05-10 21:25:35 +08:00
e61d296486 [Refactor] Replace '#ifndef' with '#pragma once' (#9456)
* Replace '#ifndef' with '#pragma once'
2022-05-10 09:25:59 +08:00
c9961c9bb9 [style] clang-format all c++ code (#9305)
- sh build-support/clang-format.sh  to  clang-format all c++ code
2022-04-29 16:14:22 +08:00
50864aca7d [refactor] fix warings when compile with clang (#8069) 2022-02-19 11:29:02 +08:00
52ebb3d8f5 [feat](mysql-compatibility) Increase compatibility with mysql (#7041)
Increase compatibility with mysql
  1. Added two system tables files and partitions
  2. Improved the return logic of mysql error code to make the error code more compatible with mysql
  3. Added lock/unlock tables statement and show columns statement for compatibility with mysql dump
  4. Compatible with mysqldump tool, now you can use mysql dump to dump data and table structure from doris

now use mysqldump may print error message like 
```
$ mysqldump -h127.0.0.1 -P9130 -uroot test_query_qa > a
mysqldump: Error: 'errCode = 2, detailMessage = select list expression not produced by aggregation output (missing from GROUP BY clause?): `EXTRA`' when trying to dump tablespaces
```

This error message not effect the export file, you can add `--no-tablespaces` to avoid this error
2021-11-20 21:39:37 +08:00
6c6380969b [refactor] replace boost smart ptr with stl (#6856)
1. replace all boost::shared_ptr to std::shared_ptr
2. replace all boost::scopted_ptr to std::unique_ptr
3. replace all boost::scoped_array to std::unique<T[]>
4. replace all boost:thread to std::thread
2021-11-17 10:18:35 +08:00
8738ce380b Add long text type STRING, with a maximum length of 2GB. Usage is similar to varchar, and there is no guarantee for the performance of storing extremely long data (#6391) 2021-08-18 09:05:40 +08:00
198ba78595 [Feature] Add update time to show table status (#6117)
Add update time to show table status

```
MySQL [test_query_qa]> show table status;
+----------+--------+---------+------------+------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+---------------------+-----------+----------+----------------+---------+
| Name     | Engine | Version | Row_format | Rows | Avg_row_length | Data_length | Max_data_length | Index_length | Data_free | Auto_increment | Create_time         | Update_time         | Check_time          | Collation | Checksum | Create_options | Comment |
+----------+--------+---------+------------+------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+---------------------+-----------+----------+----------------+---------+
| bigtable | Doris  |    NULL | NULL       | NULL |           NULL |        NULL |            NULL |         NULL |      NULL |           NULL | 2021-06-29 17:09:28 | 2021-06-29 17:17:28 | 1970-01-01 07:59:59 | utf-8     |     NULL | NULL           | OLAP    |
| test     | Doris  |    NULL | NULL       | NULL |           NULL |        NULL |            NULL |         NULL |      NULL |           NULL | 2021-06-29 17:09:26 | 2021-06-29 17:17:28 | 1970-01-01 07:59:59 | utf-8     |     NULL | NULL           | OLAP    |
| baseall  | Doris  |    NULL | NULL       | NULL |           NULL |        NULL |            NULL |         NULL |      NULL |           NULL | 2021-06-29 17:09:26 | 2021-06-29 17:17:26 | 1970-01-01 07:59:59 | utf-8     |     NULL | NULL           | OLAP    |
+----------+--------+---------+------------+------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+---------------------+-----------+----------+----------------+---------+
3 rows in set (0.002 sec)
```
2021-07-07 10:27:14 +08:00
739c0268ff [refactor] Remove decimal v1 related code from code base (#6079)
remove ALL DECIMAL V1 type code , this is a part of #6073
2021-07-07 10:26:32 +08:00
1ec615c562 [BUG] Fixed some uninitialized variables (#5850)
Fixed some potential bugs caused by uninitialized variables
2021-05-25 10:34:35 +08:00
98e80aa65e [refactor] Replace boost::function with std::function (#5700)
Replace boost::function with std::function
2021-05-09 22:00:48 +08:00
a803ceea86 [refactor] Remove boost mutex, use std::mutex instead (#5684)
* Remove boost mutex, use std::mutex instead

* replace shared_mutex
2021-04-22 11:29:36 +08:00
c4cc681d14 remove boost_foreach, using c++ foreach instead (#5611) 2021-04-15 10:52:29 +08:00
1ae6de7117 [Enhance] Add "statistics" meta table and fix some mysql compatibility problem (#4991)
1. Add metadata table 'statistics' to store index information;
2. In the header information returned by mysql, the data type length is returned according to the actual type.
2020-12-03 09:38:18 +08:00
6fedf5881b [CodeFormat] Clang-format cpp sources (#4965)
Clang-format all c++ source files.
2020-11-28 18:36:49 +08:00
448df42fb0 [Compatibility] Add table_privileges, schema_privileges and user_privileges tables(#4899)
Add privileges tables in information_schema database
2020-11-16 21:58:30 +08:00
18a22bd347 [BUG] Fix field error in information_schema.columns (#4858) 2020-11-15 22:01:32 +08:00
44498a1ae2 [Compatibility] Add table "views" in information_schema database (#4778)
To support some tools like DBeaver
2020-10-30 11:44:44 +08:00
09f97f8a05 [Refactor] Fixes some be typo part 2 (#4747) 2020-10-20 09:28:57 +08:00
75e0ba32a1 Fixes some be typo (#4714) 2020-10-13 09:37:15 +08:00