[improvement] Refactor type info for further optimizations. (#8786)

## Design:

Doris currently has two categories of types: scalar types (such as int and char) and composite types (such as array). Because the number of scalar types is limited, their type info can be cached globally as unique objects for the sake of performance. The type info of composite types is normally generated at runtime (a cache strategy could also be used to speed this up), so the memory allocated for it must be reclaimed once it is no longer needed.
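
To illustrate the global cache for scalar types, here is a minimal sketch. It is not the Doris implementation: the reduced `FieldType` enum, the two-field `TypeInfo` struct, and the sizes stored in it are assumptions made only for this example.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical, reduced enum; the real FieldType has many more members.
enum FieldType { TYPE_INT = 0, TYPE_CHAR = 1, TYPE_COUNT = 2 };

// Hypothetical stand-in for TypeInfo, holding only what the sketch needs.
struct TypeInfo {
    FieldType type;
    std::size_t size;
};

// One immutable object per scalar type, built once on first use and never
// freed, so callers can keep the returned pointer without owning it.
const TypeInfo* get_scalar_type_info(FieldType t) {
    static const std::array<TypeInfo, TYPE_COUNT> cache = {{
            {TYPE_INT, sizeof(std::int32_t)},
            {TYPE_CHAR, sizeof(char)},
    }};
    return &cache[t];
}
```

Every call with the same `FieldType` returns the same address, which is why the caller does not need to think about reclaiming the result.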

There are many interfaces for getting the type info of a specific type. I reorganized them as follows.
1. `const TypeInfo* get_scalar_type_info(FieldType field_type)`
    This function returns the type info of a scalar type. Because the result is cached, the caller can use it **WITHOUT** worrying about memory reclamation.
2. `const TypeInfo* get_collection_type_info(FieldType sub_type)`
    This function returns the type info of an array type that is only **ONE** level deep. Because the result is cached, the caller can use it **WITHOUT** worrying about memory reclamation.
3. `TypeInfoPtr get_type_info(segment_v2::ColumnMetaPB* column_meta_pb)`
4. `TypeInfoPtr get_type_info(const TabletColumn* col)`
    These functions return the type info of **BOTH** scalar types and composite types. The caller is responsible for managing the returned resource.

#### About the new type `TypeInfoPtr`
`TypeInfoPtr` is an alias for `unique_ptr` with a custom deleter.
1. For scalar types, the deleter does nothing.
2. For composite types, the deleter reclaims the memory.
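
The two deleter behaviours can be sketched as below. This is an assumption-laden illustration rather than the code in this commit: the function-pointer deleter type, the `noop_deleter` / `reclaim_deleter` helpers, the `make_*` factories, and the `g_reclaimed` counter are all hypothetical names. A function-pointer deleter is at least consistent with the `TypeInfoPtr(nullptr, nullptr)` construction that appears in the diff.

```cpp
#include <memory>

// Hypothetical stand-in for TypeInfo.
struct TypeInfo {
    bool is_scalar;
};

// A function-pointer deleter lets one alias cover both cases.
using TypeInfoDeleter = void (*)(const TypeInfo*);
using TypeInfoPtr = std::unique_ptr<const TypeInfo, TypeInfoDeleter>;

// Counts reclaimed objects; exists only to make the behaviour observable.
static int g_reclaimed = 0;

// Scalar case: the object lives in a global cache, so do nothing.
void noop_deleter(const TypeInfo*) {}

// Composite case: the object was heap-allocated, so free it.
void reclaim_deleter(const TypeInfo* p) {
    ++g_reclaimed;
    delete p;
}

TypeInfoPtr make_scalar_ptr() {
    static const TypeInfo cached{true};
    return TypeInfoPtr(&cached, noop_deleter);
}

TypeInfoPtr make_composite_ptr() {
    return TypeInfoPtr(new TypeInfo{false}, reclaim_deleter);
}
```

With this shape, callers treat both kinds of type info uniformly: they hold a `TypeInfoPtr`, and only the composite one releases memory when it goes out of scope.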

An analysis of the callers of `get_type_info` shows that these classes should hold a `TypeInfoPtr`:
1. `Field`
2. `ColumnReader`
3. `DefaultValueColumnIterator`

Other classes are either constructed by the foregoing classes or hold them, so they can use a raw `TypeInfo` pointer directly for the sake of performance.
1. `ScalarColumnWriter` - holds `Field`
    1. `ZoneMapIndexWriter` - created by `ScalarColumnWriter`, uses `type_info` from the field in `ScalarColumnWriter`
        1. `IndexedColumnWriter` - created by `ZoneMapIndexWriter`, only uses scalar types.
    2. `BitmapIndexWriter` - created by `ScalarColumnWriter`, uses `type_info` from the field in `ScalarColumnWriter`
        1. `IndexedColumnWriter` - created by `BitmapIndexWriter`, uses `type_info` in `BitmapIndexWriter`; `BitmapIndexWriter` doesn't support `ArrayType`.
    3. `BloomFilterIndexWriter` - created by `ScalarColumnWriter`, uses `type_info` from the field in `ScalarColumnWriter`
        1. `IndexedColumnWriter` - created by `BloomFilterIndexWriter`, only uses scalar types.
2. `IndexedColumnReader` initializes `type_info` by the field type in meta (only scalar types).
3. `ColumnVectorBatch`
    1. `ZoneMapIndexReader` creates `ColumnVectorBatch`; `ColumnVectorBatch` uses `type_info` in `IndexedColumnReader`
    2. `BitmapIndexReader` supports scalar types only and it creates `ColumnVectorBatch`, `ColumnVectorBatch` uses `type_info` in `BitmapIndexReader`
    3. `BloomFilterIndexWriter` supports scalar types only and it creates `ColumnVectorBatch`, `ColumnVectorBatch` uses `type_info` in `BloomFilterIndexWriter`
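
The split between owning and borrowing classes described above can be sketched as follows. `Owner` plays the role of a class such as `Field`, and `Borrower` the role of a class created by (or holding) the owner; both class names, the `reclaim` helper, and the reduced `TypeInfo` are hypothetical and exist only for this sketch.

```cpp
#include <memory>

// Hypothetical stand-in for TypeInfo.
struct TypeInfo {
    int type;
};

using TypeInfoPtr = std::unique_ptr<const TypeInfo, void (*)(const TypeInfo*)>;

void reclaim(const TypeInfo* p) {
    delete p;
}

// Owns the type info and controls its lifetime.
class Owner {
public:
    explicit Owner(int type) : _type_info(new TypeInfo{type}, reclaim) {}
    // Hand out a non-owning view, mirroring `_type_info.get()` in the diff.
    const TypeInfo* type_info() const { return _type_info.get(); }

private:
    TypeInfoPtr _type_info;
};

// Only borrows the pointer; it must not outlive the Owner it came from.
class Borrower {
public:
    explicit Borrower(const TypeInfo* type_info) : _type_info(type_info) {}
    int type() const { return _type_info->type; }

private:
    const TypeInfo* _type_info; // non-owning
};
```

The raw pointer avoids any ownership bookkeeping in the borrowing class, at the cost of relying on the owner staying alive for as long as the borrower is used.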
Adonis Ling
2022-04-20 14:47:29 +08:00
committed by GitHub
parent 1b4cd76847
commit bd126f0679
27 changed files with 343 additions and 310 deletions

View File

@ -31,8 +31,8 @@ Status ColumnVectorBatch::resize(size_t new_cap) {
return Status::OK();
}
Status ColumnVectorBatch::create(size_t init_capacity, bool is_nullable,
const TypeInfo* type_info, Field* field,
Status ColumnVectorBatch::create(size_t init_capacity, bool is_nullable, const TypeInfo* type_info,
Field* field,
std::unique_ptr<ColumnVectorBatch>* column_vector_batch) {
if (is_scalar_type(type_info->type())) {
std::unique_ptr<ColumnVectorBatch> local;
@ -164,8 +164,8 @@ Status ColumnVectorBatch::create(size_t init_capacity, bool is_nullable,
}
template <class ScalarType>
ScalarColumnVectorBatch<ScalarType>::ScalarColumnVectorBatch(
const TypeInfo* type_info, bool is_nullable)
ScalarColumnVectorBatch<ScalarType>::ScalarColumnVectorBatch(const TypeInfo* type_info,
bool is_nullable)
: ColumnVectorBatch(type_info, is_nullable), _data(0) {}
template <class ScalarType>
@ -180,8 +180,7 @@ Status ScalarColumnVectorBatch<ScalarType>::resize(size_t new_cap) {
return Status::OK();
}
ArrayColumnVectorBatch::ArrayColumnVectorBatch(const TypeInfo* type_info,
bool is_nullable,
ArrayColumnVectorBatch::ArrayColumnVectorBatch(const TypeInfo* type_info, bool is_nullable,
ScalarColumnVectorBatch<uint32_t>* offsets,
ColumnVectorBatch* elements)
: ColumnVectorBatch(type_info, is_nullable), _data(0) {
@ -222,11 +221,10 @@ void ArrayColumnVectorBatch::prepare_for_read(size_t start_idx, size_t size, boo
_data[start_idx + i] = CollectionValue(length);
} else {
_data[start_idx + i] = CollectionValue(
_elements->mutable_cell_ptr(offset),
length,
item_has_null,
_elements->is_nullable() ? const_cast<bool*>(&_elements->null_signs()[offset])
: nullptr);
_elements->mutable_cell_ptr(offset), length, item_has_null,
_elements->is_nullable()
? const_cast<bool*>(&_elements->null_signs()[offset])
: nullptr);
}
}
}

View File

@ -28,16 +28,14 @@
namespace doris {
typedef google::protobuf::RepeatedPtrField<DeletePredicatePB> DelPredicateArray;
using DelPredicateArray = google::protobuf::RepeatedPtrField<DeletePredicatePB>;
class Conditions;
class RowCursor;
class TabletReader;
class TabletSchema;
class DeleteConditionHandler {
public:
DeleteConditionHandler() {}
~DeleteConditionHandler() {}
// generated DeletePredicatePB by TCondition
Status generate_delete_predicate(const TabletSchema& schema,
const std::vector<TCondition>& conditions,

View File

@ -40,7 +40,7 @@ namespace doris {
// User can use this class to access or deal with column data in memory.
class Field {
public:
explicit Field() = default;
explicit Field() : _type_info(TypeInfoPtr(nullptr, nullptr)) {}
explicit Field(const TabletColumn& column)
: _type_info(get_type_info(&column)),
_length(column.length()),
@ -277,7 +277,7 @@ public:
FieldType type() const { return _type_info->type(); }
FieldAggregationMethod aggregation() const { return _agg_info->agg_method(); }
const TypeInfo* type_info() const { return _type_info; }
const TypeInfo* type_info() const { return _type_info.get(); }
bool is_nullable() const { return _is_nullable; }
// similar to `full_encode_ascending`, but only encode part (the first `index_size` bytes) of the value.
@ -301,7 +301,7 @@ public:
size_t get_sub_field_count() const { return _sub_fields.size(); }
protected:
const TypeInfo* _type_info;
TypeInfoPtr _type_info;
const AggregateInfo* _agg_info;
// unit : byte
// except for strings, other types have fixed lengths
@ -322,7 +322,7 @@ protected:
}
void clone(Field* other) const {
other->_type_info = this->_type_info;
other->_type_info = clone_type_info(this->_type_info.get());
other->_key_coder = this->_key_coder;
other->_name = this->_name;
other->_index_size = this->_index_size;

View File

@ -54,8 +54,8 @@ namespace doris {
// tablets, finally we will only push for current tablets. this is
// very useful in rollup action.
Status PushHandler::process_streaming_ingestion(TabletSharedPtr tablet, const TPushReq& request,
PushType push_type,
std::vector<TTabletInfo>* tablet_info_vec) {
PushType push_type,
std::vector<TTabletInfo>* tablet_info_vec) {
LOG(INFO) << "begin to realtime push. tablet=" << tablet->full_name()
<< ", transaction_id=" << request.transaction_id;
@ -78,9 +78,9 @@ Status PushHandler::process_streaming_ingestion(TabletSharedPtr tablet, const TP
}
Status PushHandler::_do_streaming_ingestion(TabletSharedPtr tablet, const TPushReq& request,
PushType push_type,
std::vector<TabletVars>* tablet_vars,
std::vector<TTabletInfo>* tablet_info_vec) {
PushType push_type,
std::vector<TabletVars>* tablet_vars,
std::vector<TTabletInfo>* tablet_info_vec) {
// add transaction in engine, then check sc status
// lock, prevent sc handler checking transaction concurrently
if (tablet == nullptr) {
@ -208,7 +208,7 @@ void PushHandler::_get_tablet_infos(const std::vector<TabletVars>& tablet_vars,
}
Status PushHandler::_convert_v2(TabletSharedPtr cur_tablet, TabletSharedPtr new_tablet,
RowsetSharedPtr* cur_rowset, RowsetSharedPtr* new_rowset) {
RowsetSharedPtr* cur_rowset, RowsetSharedPtr* new_rowset) {
Status res = Status::OK();
uint32_t num_rows = 0;
PUniqueId load_id;
@ -273,7 +273,8 @@ Status PushHandler::_convert_v2(TabletSharedPtr cur_tablet, TabletSharedPtr new_
}
// init Reader
if (!(res = reader->init(schema.get(), _request.broker_scan_range, _request.desc_tbl))) {
if (!(res = reader->init(schema.get(), _request.broker_scan_range,
_request.desc_tbl))) {
LOG(WARNING) << "fail to init reader. res=" << res
<< ", tablet=" << cur_tablet->full_name();
res = Status::OLAPInternalError(OLAP_ERR_PUSH_INIT_ERROR);
@ -349,7 +350,7 @@ Status PushHandler::_convert_v2(TabletSharedPtr cur_tablet, TabletSharedPtr new_
}
Status PushHandler::_convert(TabletSharedPtr cur_tablet, TabletSharedPtr new_tablet,
RowsetSharedPtr* cur_rowset, RowsetSharedPtr* new_rowset) {
RowsetSharedPtr* cur_rowset, RowsetSharedPtr* new_rowset) {
Status res = Status::OK();
RowCursor row;
BinaryFile raw_file;
@ -862,8 +863,8 @@ Status LzoBinaryReader::_next_block() {
size_t written_len = 0;
size_t block_header_size = 5;
if (!(res = olap_decompress(_row_compressed_buf + block_header_size,
compressed_size - block_header_size, _row_buf, _max_row_buf_size,
&written_len, OLAP_COMP_TRANSPORT))) {
compressed_size - block_header_size, _row_buf, _max_row_buf_size,
&written_len, OLAP_COMP_TRANSPORT))) {
LOG(WARNING) << "olap decompress fail. res=" << res;
return res;
}
@ -874,7 +875,7 @@ Status LzoBinaryReader::_next_block() {
}
Status PushBrokerReader::init(const Schema* schema, const TBrokerScanRange& t_scan_range,
const TDescriptorTable& t_desc_tbl) {
const TDescriptorTable& t_desc_tbl) {
// init schema
_schema = schema;
@ -950,7 +951,7 @@ Status PushBrokerReader::init(const Schema* schema, const TBrokerScanRange& t_sc
}
Status PushBrokerReader::fill_field_row(RowCursorCell* dst, const char* src, bool src_null,
MemPool* mem_pool, FieldType type) {
MemPool* mem_pool, FieldType type) {
switch (type) {
case OLAP_FIELD_TYPE_DECIMAL: {
dst->set_is_null(src_null);

View File

@ -38,10 +38,11 @@ class IndexedColumnIterator;
class BitmapIndexReader {
public:
explicit BitmapIndexReader(const FilePathDesc& path_desc, const BitmapIndexPB* bitmap_index_meta)
: _path_desc(path_desc), _bitmap_index_meta(bitmap_index_meta) {
_typeinfo = get_scalar_type_info<OLAP_FIELD_TYPE_VARCHAR>();
}
explicit BitmapIndexReader(const FilePathDesc& path_desc,
const BitmapIndexPB* bitmap_index_meta)
: _path_desc(path_desc),
_type_info(get_scalar_type_info<OLAP_FIELD_TYPE_VARCHAR>()),
_bitmap_index_meta(bitmap_index_meta) {}
Status load(bool use_page_cache, bool kept_in_memory);
@ -50,13 +51,13 @@ public:
int64_t bitmap_nums() { return _bitmap_column_reader->num_values(); }
const TypeInfo* type_info() { return _typeinfo; }
const TypeInfo* type_info() { return _type_info; }
private:
friend class BitmapIndexIterator;
FilePathDesc _path_desc;
const TypeInfo* _typeinfo;
const TypeInfo* _type_info;
const BitmapIndexPB* _bitmap_index_meta;
bool _has_null = false;
std::unique_ptr<IndexedColumnReader> _dict_column_reader;

View File

@ -63,12 +63,10 @@ public:
using CppType = typename CppTypeTraits<field_type>::CppType;
using MemoryIndexType = typename BitmapIndexTraits<CppType>::MemoryIndexType;
explicit BitmapIndexWriterImpl(const TypeInfo* typeinfo)
: _typeinfo(typeinfo),
_reverted_index_size(0),
_pool("BitmapIndexWriterImpl") {}
explicit BitmapIndexWriterImpl(const TypeInfo* type_info)
: _type_info(type_info), _reverted_index_size(0), _pool("BitmapIndexWriterImpl") {}
~BitmapIndexWriterImpl() = default;
~BitmapIndexWriterImpl() override = default;
void add_values(const void* values, size_t count) override {
auto p = reinterpret_cast<const CppType*>(values);
@ -88,7 +86,7 @@ public:
} else {
// new value, copy value and insert new key->bitmap pair
CppType new_value;
_typeinfo->deep_copy(&new_value, &value, &_pool);
_type_info->deep_copy(&new_value, &value, &_pool);
_mem_index.insert({new_value, roaring::Roaring::bitmapOf(1, _rid)});
it = _mem_index.find(new_value);
}
@ -112,10 +110,10 @@ public:
IndexedColumnWriterOptions options;
options.write_ordinal_index = false;
options.write_value_index = true;
options.encoding = EncodingInfo::get_default_encoding(_typeinfo, true);
options.encoding = EncodingInfo::get_default_encoding(_type_info, true);
options.compression = LZ4F;
IndexedColumnWriter dict_column_writer(options, _typeinfo, wblock);
IndexedColumnWriter dict_column_writer(options, _type_info, wblock);
RETURN_IF_ERROR(dict_column_writer.init());
for (auto const& it : _mem_index) {
RETURN_IF_ERROR(dict_column_writer.add(&(it.first)));
@ -142,16 +140,15 @@ public:
bitmap_sizes.push_back(bitmap_size);
}
const auto* bitmap_typeinfo = get_scalar_type_info<OLAP_FIELD_TYPE_OBJECT>();
const auto* bitmap_type_info = get_scalar_type_info<OLAP_FIELD_TYPE_OBJECT>();
IndexedColumnWriterOptions options;
options.write_ordinal_index = true;
options.write_value_index = false;
options.encoding = EncodingInfo::get_default_encoding(bitmap_typeinfo, false);
options.encoding = EncodingInfo::get_default_encoding(bitmap_type_info, false);
// we already store compressed bitmap, use NO_COMPRESSION to save some cpu
options.compression = NO_COMPRESSION;
IndexedColumnWriter bitmap_column_writer(options, bitmap_typeinfo, wblock);
IndexedColumnWriter bitmap_column_writer(options, bitmap_type_info, wblock);
RETURN_IF_ERROR(bitmap_column_writer.init());
faststring buf;
@ -177,7 +174,7 @@ public:
}
private:
const TypeInfo* _typeinfo;
const TypeInfo* _type_info;
uint64_t _reverted_index_size;
rowid_t _rid = 0;
// row id list for null value
@ -189,48 +186,48 @@ private:
} // namespace
Status BitmapIndexWriter::create(const TypeInfo* typeinfo,
Status BitmapIndexWriter::create(const TypeInfo* type_info,
std::unique_ptr<BitmapIndexWriter>* res) {
FieldType type = typeinfo->type();
FieldType type = type_info->type();
switch (type) {
case OLAP_FIELD_TYPE_TINYINT:
res->reset(new BitmapIndexWriterImpl<OLAP_FIELD_TYPE_TINYINT>(typeinfo));
res->reset(new BitmapIndexWriterImpl<OLAP_FIELD_TYPE_TINYINT>(type_info));
break;
case OLAP_FIELD_TYPE_SMALLINT:
res->reset(new BitmapIndexWriterImpl<OLAP_FIELD_TYPE_SMALLINT>(typeinfo));
res->reset(new BitmapIndexWriterImpl<OLAP_FIELD_TYPE_SMALLINT>(type_info));
break;
case OLAP_FIELD_TYPE_INT:
res->reset(new BitmapIndexWriterImpl<OLAP_FIELD_TYPE_INT>(typeinfo));
res->reset(new BitmapIndexWriterImpl<OLAP_FIELD_TYPE_INT>(type_info));
break;
case OLAP_FIELD_TYPE_UNSIGNED_INT:
res->reset(new BitmapIndexWriterImpl<OLAP_FIELD_TYPE_UNSIGNED_INT>(typeinfo));
res->reset(new BitmapIndexWriterImpl<OLAP_FIELD_TYPE_UNSIGNED_INT>(type_info));
break;
case OLAP_FIELD_TYPE_BIGINT:
res->reset(new BitmapIndexWriterImpl<OLAP_FIELD_TYPE_BIGINT>(typeinfo));
res->reset(new BitmapIndexWriterImpl<OLAP_FIELD_TYPE_BIGINT>(type_info));
break;
case OLAP_FIELD_TYPE_CHAR:
res->reset(new BitmapIndexWriterImpl<OLAP_FIELD_TYPE_CHAR>(typeinfo));
res->reset(new BitmapIndexWriterImpl<OLAP_FIELD_TYPE_CHAR>(type_info));
break;
case OLAP_FIELD_TYPE_VARCHAR:
res->reset(new BitmapIndexWriterImpl<OLAP_FIELD_TYPE_VARCHAR>(typeinfo));
res->reset(new BitmapIndexWriterImpl<OLAP_FIELD_TYPE_VARCHAR>(type_info));
break;
case OLAP_FIELD_TYPE_STRING:
res->reset(new BitmapIndexWriterImpl<OLAP_FIELD_TYPE_STRING>(typeinfo));
res->reset(new BitmapIndexWriterImpl<OLAP_FIELD_TYPE_STRING>(type_info));
break;
case OLAP_FIELD_TYPE_DATE:
res->reset(new BitmapIndexWriterImpl<OLAP_FIELD_TYPE_DATE>(typeinfo));
res->reset(new BitmapIndexWriterImpl<OLAP_FIELD_TYPE_DATE>(type_info));
break;
case OLAP_FIELD_TYPE_DATETIME:
res->reset(new BitmapIndexWriterImpl<OLAP_FIELD_TYPE_DATETIME>(typeinfo));
res->reset(new BitmapIndexWriterImpl<OLAP_FIELD_TYPE_DATETIME>(type_info));
break;
case OLAP_FIELD_TYPE_LARGEINT:
res->reset(new BitmapIndexWriterImpl<OLAP_FIELD_TYPE_LARGEINT>(typeinfo));
res->reset(new BitmapIndexWriterImpl<OLAP_FIELD_TYPE_LARGEINT>(type_info));
break;
case OLAP_FIELD_TYPE_DECIMAL:
res->reset(new BitmapIndexWriterImpl<OLAP_FIELD_TYPE_DECIMAL>(typeinfo));
res->reset(new BitmapIndexWriterImpl<OLAP_FIELD_TYPE_DECIMAL>(type_info));
break;
case OLAP_FIELD_TYPE_BOOL:
res->reset(new BitmapIndexWriterImpl<OLAP_FIELD_TYPE_BOOL>(typeinfo));
res->reset(new BitmapIndexWriterImpl<OLAP_FIELD_TYPE_BOOL>(type_info));
break;
default:
return Status::NotSupported("unsupported type for bitmap index: " + std::to_string(type));

View File

@ -36,7 +36,7 @@ namespace segment_v2 {
class BitmapIndexWriter {
public:
static Status create(const TypeInfo* typeinfo, std::unique_ptr<BitmapIndexWriter>* res);
static Status create(const TypeInfo* type_info, std::unique_ptr<BitmapIndexWriter>* res);
BitmapIndexWriter() = default;
virtual ~BitmapIndexWriter() = default;

View File

@ -43,22 +43,22 @@ class BloomFilterIndexReader {
public:
explicit BloomFilterIndexReader(const FilePathDesc& path_desc,
const BloomFilterIndexPB* bloom_filter_index_meta)
: _path_desc(path_desc), _bloom_filter_index_meta(bloom_filter_index_meta) {
_typeinfo = get_scalar_type_info<OLAP_FIELD_TYPE_VARCHAR>();
}
: _path_desc(path_desc),
_type_info(get_scalar_type_info<OLAP_FIELD_TYPE_VARCHAR>()),
_bloom_filter_index_meta(bloom_filter_index_meta) {}
Status load(bool use_page_cache, bool kept_in_memory);
// create a new column iterator.
Status new_iterator(std::unique_ptr<BloomFilterIndexIterator>* iterator);
const TypeInfo* type_info() const { return _typeinfo; }
const TypeInfo* type_info() const { return _type_info; }
private:
friend class BloomFilterIndexIterator;
FilePathDesc _path_desc;
const TypeInfo* _typeinfo;
const TypeInfo* _type_info;
const BloomFilterIndexPB* _bloom_filter_index_meta;
std::unique_ptr<IndexedColumnReader> _bloom_filter_reader;
};

View File

@ -68,14 +68,14 @@ public:
using ValueDict = typename BloomFilterTraits<CppType>::ValueDict;
explicit BloomFilterIndexWriterImpl(const BloomFilterOptions& bf_options,
const TypeInfo* typeinfo)
const TypeInfo* type_info)
: _bf_options(bf_options),
_typeinfo(typeinfo),
_type_info(type_info),
_pool("BloomFilterIndexWriterImpl"),
_has_null(false),
_bf_buffer_size(0) {}
~BloomFilterIndexWriterImpl() = default;
~BloomFilterIndexWriterImpl() override = default;
void add_values(const void* values, size_t count) override {
const CppType* v = (const CppType*)values;
@ -83,7 +83,7 @@ public:
if (_values.find(*v) == _values.end()) {
if constexpr (_is_slice_type()) {
CppType new_value;
_typeinfo->deep_copy(&new_value, v, &_pool);
_type_info->deep_copy(&new_value, v, &_pool);
_values.insert(new_value);
} else if constexpr (_is_int128()) {
int128_t new_value;
@ -129,12 +129,12 @@ public:
meta->set_algorithm(BLOCK_BLOOM_FILTER);
// write bloom filters
const auto* bf_typeinfo = get_scalar_type_info<OLAP_FIELD_TYPE_VARCHAR>();
const auto* bf_type_info = get_scalar_type_info<OLAP_FIELD_TYPE_VARCHAR>();
IndexedColumnWriterOptions options;
options.write_ordinal_index = true;
options.write_value_index = false;
options.encoding = PLAIN_ENCODING;
IndexedColumnWriter bf_writer(options, bf_typeinfo, wblock);
IndexedColumnWriter bf_writer(options, bf_type_info, wblock);
RETURN_IF_ERROR(bf_writer.init());
for (auto& bf : _bfs) {
Slice data(bf->data(), bf->size());
@ -153,14 +153,15 @@ public:
private:
// supported slice types are: OLAP_FIELD_TYPE_CHAR|OLAP_FIELD_TYPE_VARCHAR
static constexpr bool _is_slice_type() {
return field_type == OLAP_FIELD_TYPE_VARCHAR || field_type == OLAP_FIELD_TYPE_CHAR || field_type == OLAP_FIELD_TYPE_STRING;
return field_type == OLAP_FIELD_TYPE_VARCHAR || field_type == OLAP_FIELD_TYPE_CHAR ||
field_type == OLAP_FIELD_TYPE_STRING;
}
static constexpr bool _is_int128() { return field_type == OLAP_FIELD_TYPE_LARGEINT; }
private:
BloomFilterOptions _bf_options;
const TypeInfo* _typeinfo;
const TypeInfo* _type_info;
MemPool _pool;
bool _has_null;
uint64_t _bf_buffer_size;
@ -173,43 +174,43 @@ private:
// TODO currently we don't support bloom filter index for tinyint/hll/float/double
Status BloomFilterIndexWriter::create(const BloomFilterOptions& bf_options,
const TypeInfo* typeinfo,
const TypeInfo* type_info,
std::unique_ptr<BloomFilterIndexWriter>* res) {
FieldType type = typeinfo->type();
FieldType type = type_info->type();
switch (type) {
case OLAP_FIELD_TYPE_SMALLINT:
res->reset(new BloomFilterIndexWriterImpl<OLAP_FIELD_TYPE_SMALLINT>(bf_options, typeinfo));
res->reset(new BloomFilterIndexWriterImpl<OLAP_FIELD_TYPE_SMALLINT>(bf_options, type_info));
break;
case OLAP_FIELD_TYPE_INT:
res->reset(new BloomFilterIndexWriterImpl<OLAP_FIELD_TYPE_INT>(bf_options, typeinfo));
res->reset(new BloomFilterIndexWriterImpl<OLAP_FIELD_TYPE_INT>(bf_options, type_info));
break;
case OLAP_FIELD_TYPE_UNSIGNED_INT:
res->reset(
new BloomFilterIndexWriterImpl<OLAP_FIELD_TYPE_UNSIGNED_INT>(bf_options, typeinfo));
res->reset(new BloomFilterIndexWriterImpl<OLAP_FIELD_TYPE_UNSIGNED_INT>(bf_options,
type_info));
break;
case OLAP_FIELD_TYPE_BIGINT:
res->reset(new BloomFilterIndexWriterImpl<OLAP_FIELD_TYPE_BIGINT>(bf_options, typeinfo));
res->reset(new BloomFilterIndexWriterImpl<OLAP_FIELD_TYPE_BIGINT>(bf_options, type_info));
break;
case OLAP_FIELD_TYPE_LARGEINT:
res->reset(new BloomFilterIndexWriterImpl<OLAP_FIELD_TYPE_LARGEINT>(bf_options, typeinfo));
res->reset(new BloomFilterIndexWriterImpl<OLAP_FIELD_TYPE_LARGEINT>(bf_options, type_info));
break;
case OLAP_FIELD_TYPE_CHAR:
res->reset(new BloomFilterIndexWriterImpl<OLAP_FIELD_TYPE_CHAR>(bf_options, typeinfo));
res->reset(new BloomFilterIndexWriterImpl<OLAP_FIELD_TYPE_CHAR>(bf_options, type_info));
break;
case OLAP_FIELD_TYPE_VARCHAR:
res->reset(new BloomFilterIndexWriterImpl<OLAP_FIELD_TYPE_VARCHAR>(bf_options, typeinfo));
res->reset(new BloomFilterIndexWriterImpl<OLAP_FIELD_TYPE_VARCHAR>(bf_options, type_info));
break;
case OLAP_FIELD_TYPE_STRING:
res->reset(new BloomFilterIndexWriterImpl<OLAP_FIELD_TYPE_STRING>(bf_options, typeinfo));
res->reset(new BloomFilterIndexWriterImpl<OLAP_FIELD_TYPE_STRING>(bf_options, type_info));
break;
case OLAP_FIELD_TYPE_DATE:
res->reset(new BloomFilterIndexWriterImpl<OLAP_FIELD_TYPE_DATE>(bf_options, typeinfo));
res->reset(new BloomFilterIndexWriterImpl<OLAP_FIELD_TYPE_DATE>(bf_options, type_info));
break;
case OLAP_FIELD_TYPE_DATETIME:
res->reset(new BloomFilterIndexWriterImpl<OLAP_FIELD_TYPE_DATETIME>(bf_options, typeinfo));
res->reset(new BloomFilterIndexWriterImpl<OLAP_FIELD_TYPE_DATETIME>(bf_options, type_info));
break;
case OLAP_FIELD_TYPE_DECIMAL:
res->reset(new BloomFilterIndexWriterImpl<OLAP_FIELD_TYPE_DECIMAL>(bf_options, typeinfo));
res->reset(new BloomFilterIndexWriterImpl<OLAP_FIELD_TYPE_DECIMAL>(bf_options, type_info));
break;
default:
return Status::NotSupported("unsupported type for bitmap index: " + std::to_string(type));

View File

@ -38,7 +38,7 @@ struct BloomFilterOptions;
class BloomFilterIndexWriter {
public:
static Status create(const BloomFilterOptions& bf_options, const TypeInfo* typeinfo,
static Status create(const BloomFilterOptions& bf_options, const TypeInfo* type_info,
std::unique_ptr<BloomFilterIndexWriter>* res);
BloomFilterIndexWriter() = default;

View File

@ -103,7 +103,7 @@ Status ColumnReader::init() {
return Status::NotSupported(
strings::Substitute("unsupported typeinfo, type=$0", _meta.type()));
}
RETURN_IF_ERROR(EncodingInfo::get(_type_info, _meta.encoding(), &_encoding_info));
RETURN_IF_ERROR(EncodingInfo::get(_type_info.get(), _meta.encoding(), &_encoding_info));
RETURN_IF_ERROR(get_block_compression_codec(_meta.compression(), &_compress_codec));
for (int i = 0; i < _meta.indexes_size(); i++) {

View File

@ -171,7 +171,8 @@ private:
uint64_t _num_rows;
FilePathDesc _path_desc;
const TypeInfo* _type_info = nullptr; // initialized in init(), may changed by subclasses.
TypeInfoPtr _type_info =
TypeInfoPtr(nullptr, nullptr); // initialized in init(), may changed by subclasses.
const EncodingInfo* _encoding_info =
nullptr; // initialized in init(), used for create PageDecoder
const BlockCompressionCodec* _compress_codec = nullptr; // initialized in init()
@ -386,12 +387,11 @@ private:
class DefaultValueColumnIterator : public ColumnIterator {
public:
DefaultValueColumnIterator(bool has_default_value, const std::string& default_value,
bool is_nullable, const TypeInfo* type_info,
size_t schema_length)
bool is_nullable, TypeInfoPtr type_info, size_t schema_length)
: _has_default_value(has_default_value),
_default_value(default_value),
_is_nullable(is_nullable),
_type_info(type_info),
_type_info(std::move(type_info)),
_schema_length(schema_length),
_is_default_value_null(false),
_type_size(0),
@ -426,7 +426,7 @@ private:
bool _has_default_value;
std::string _default_value;
bool _is_nullable;
const TypeInfo* _type_info;
TypeInfoPtr _type_info;
size_t _schema_length;
bool _is_default_value_null;
size_t _type_size;

View File

@ -78,7 +78,8 @@ Status IndexedColumnReader::load_index_page(fs::ReadableBlock* rblock, const Pag
}
Status IndexedColumnReader::read_page(fs::ReadableBlock* rblock, const PagePointer& pp,
PageHandle* handle, Slice* body, PageFooterPB* footer, PageTypePB type) const {
PageHandle* handle, Slice* body, PageFooterPB* footer,
PageTypePB type) const {
PageReadOptions opts;
opts.rblock = rblock;
opts.page_pointer = pp;

View File

@ -46,7 +46,7 @@ class IndexedColumnIterator;
class IndexedColumnReader {
public:
explicit IndexedColumnReader(const FilePathDesc& path_desc, const IndexedColumnMetaPB& meta)
: _path_desc(path_desc), _meta(meta){};
: _path_desc(path_desc), _meta(meta) {};
Status load(bool use_page_cache, bool kept_in_memory);

View File

@ -37,23 +37,23 @@ namespace doris {
namespace segment_v2 {
IndexedColumnWriter::IndexedColumnWriter(const IndexedColumnWriterOptions& options,
const TypeInfo* typeinfo, fs::WritableBlock* wblock)
const TypeInfo* type_info, fs::WritableBlock* wblock)
: _options(options),
_typeinfo(typeinfo),
_type_info(type_info),
_wblock(wblock),
_mem_pool("IndexedColumnWriter"),
_num_values(0),
_num_data_pages(0),
_value_key_coder(nullptr),
_compress_codec(nullptr) {
_first_value.resize(_typeinfo->size());
_first_value.resize(_type_info->size());
}
IndexedColumnWriter::~IndexedColumnWriter() = default;
Status IndexedColumnWriter::init() {
const EncodingInfo* encoding_info;
RETURN_IF_ERROR(EncodingInfo::get(_typeinfo, _options.encoding, &encoding_info));
RETURN_IF_ERROR(EncodingInfo::get(_type_info, _options.encoding, &encoding_info));
_options.encoding = encoding_info->encoding();
// should store more concrete encoding type instead of DEFAULT_ENCODING
// because the default encoding of a data type can be changed in the future
@ -68,7 +68,7 @@ Status IndexedColumnWriter::init() {
}
if (_options.write_value_index) {
_value_index_builder.reset(new IndexPageBuilder(_options.index_page_size, true));
_value_key_coder = get_key_coder(_typeinfo->type());
_value_key_coder = get_key_coder(_type_info->type());
}
if (_options.compression != NO_COMPRESSION) {
@ -80,7 +80,7 @@ Status IndexedColumnWriter::init() {
Status IndexedColumnWriter::add(const void* value) {
if (_options.write_value_index && _data_page_builder->count() == 0) {
// remember page's first value because it's used to build value index
_typeinfo->deep_copy(_first_value.data(), value, &_mem_pool);
_type_info->deep_copy(_first_value.data(), value, &_mem_pool);
}
size_t num_to_write = 1;
RETURN_IF_ERROR(
@ -141,7 +141,7 @@ Status IndexedColumnWriter::finish(IndexedColumnMetaPB* meta) {
if (_options.write_value_index) {
RETURN_IF_ERROR(_flush_index(_value_index_builder.get(), meta->mutable_value_index_meta()));
}
meta->set_data_type(_typeinfo->type());
meta->set_data_type(_type_info->type());
meta->set_encoding(_options.encoding);
meta->set_num_values(_num_values);
meta->set_compression(_options.compression);

View File

@ -70,7 +70,7 @@ struct IndexedColumnWriterOptions {
class IndexedColumnWriter {
public:
explicit IndexedColumnWriter(const IndexedColumnWriterOptions& options,
const TypeInfo* typeinfo, fs::WritableBlock* wblock);
const TypeInfo* type_info, fs::WritableBlock* wblock);
~IndexedColumnWriter();
@ -87,7 +87,7 @@ private:
Status _flush_index(IndexPageBuilder* index_builder, BTreeMetaPB* meta);
IndexedColumnWriterOptions _options;
const TypeInfo* _typeinfo;
const TypeInfo* _type_info;
fs::WritableBlock* _wblock;
// only used for `_first_value`
MemPool _mem_pool;

View File

@ -212,7 +212,7 @@ Status Segment::new_column_iterator(uint32_t cid, ColumnIterator** iter) {
std::unique_ptr<DefaultValueColumnIterator> default_value_iter(
new DefaultValueColumnIterator(
tablet_column.has_default_value(), tablet_column.default_value(),
tablet_column.is_nullable(), type_info, tablet_column.length()));
tablet_column.is_nullable(), std::move(type_info), tablet_column.length()));
ColumnIteratorOptions iter_opts;
RETURN_IF_ERROR(default_value_iter->init(iter_opts));

View File

@ -30,8 +30,7 @@ namespace doris {
namespace segment_v2 {
ZoneMapIndexWriter::ZoneMapIndexWriter(Field* field)
: _field(field), _pool("ZoneMapIndexWriter") {
ZoneMapIndexWriter::ZoneMapIndexWriter(Field* field) : _field(field), _pool("ZoneMapIndexWriter") {
_page_zone_map.min_value = _field->allocate_zone_map_value(&_pool);
_page_zone_map.max_value = _field->allocate_zone_map_value(&_pool);
_reset_zone_map(&_page_zone_map);
@ -56,12 +55,12 @@ void ZoneMapIndexWriter::add_values(const void* values, size_t count) {
}
}
void ZoneMapIndexWriter::moidfy_index_before_flush(struct doris::segment_v2::ZoneMap & zone_map) {
void ZoneMapIndexWriter::moidfy_index_before_flush(struct doris::segment_v2::ZoneMap& zone_map) {
_field->modify_zone_map_index(zone_map.max_value);
}
void ZoneMapIndexWriter::reset_page_zone_map() {
_page_zone_map.pass_all = true;
_page_zone_map.pass_all = true;
}
void ZoneMapIndexWriter::reset_segment_zone_map() {
@ -106,14 +105,14 @@ Status ZoneMapIndexWriter::finish(fs::WritableBlock* wblock, ColumnIndexMetaPB*
_segment_zone_map.to_proto(meta->mutable_segment_zone_map(), _field);
// write out zone map for each data pages
const auto* typeinfo = get_scalar_type_info<OLAP_FIELD_TYPE_OBJECT>();
const auto* type_info = get_scalar_type_info<OLAP_FIELD_TYPE_OBJECT>();
IndexedColumnWriterOptions options;
options.write_ordinal_index = true;
options.write_value_index = false;
options.encoding = EncodingInfo::get_default_encoding(typeinfo, false);
options.encoding = EncodingInfo::get_default_encoding(type_info, false);
options.compression = NO_COMPRESSION; // currently not compressed
IndexedColumnWriter writer(options, typeinfo, wblock);
IndexedColumnWriter writer(options, type_info, wblock);
RETURN_IF_ERROR(writer.init());
for (auto& value : _values) {

View File

@ -178,7 +178,7 @@ ColumnMapping* RowBlockChanger::get_mutable_column_mapping(size_t column_index)
<< " origin_type=" \
<< ref_block->tablet_schema().column(ref_column).type() \
<< ", alter_type=" << mutable_block->tablet_schema().column(i).type(); \
return Status::OLAPInternalError(OLAP_ERR_SCHEMA_CHANGE_INFO_INVALID); \
return Status::OLAPInternalError(OLAP_ERR_SCHEMA_CHANGE_INFO_INVALID); \
} \
break; \
}
@ -431,8 +431,7 @@ bool count_field(RowCursor* read_helper, RowCursor* write_helper, const TabletCo
}
Status RowBlockChanger::change_row_block(const RowBlock* ref_block, int32_t data_version,
RowBlock* mutable_block,
uint64_t* filtered_rows) const {
RowBlock* mutable_block, uint64_t* filtered_rows) const {
if (mutable_block == nullptr) {
LOG(FATAL) << "mutable block is uninitialized.";
return Status::OLAPInternalError(OLAP_ERR_NOT_INITED);
@ -584,7 +583,8 @@ Status RowBlockChanger::change_row_block(const RowBlock* ref_block, int32_t data
write_helper.set_not_null(i);
const Field* ref_field = read_helper.column_schema(ref_column);
char* ref_value = read_helper.cell_ptr(ref_column);
Status st = write_helper.convert_from(i, ref_value, ref_field->type_info(), mem_pool);
Status st = write_helper.convert_from(i, ref_value, ref_field->type_info(),
mem_pool);
if (!st) {
LOG(WARNING)
<< "the column type which was altered from was unsupported."
@ -919,8 +919,8 @@ void RowBlockMerger::_pop_heap() {
}
Status LinkedSchemaChange::process(RowsetReaderSharedPtr rowset_reader,
RowsetWriter* new_rowset_writer, TabletSharedPtr new_tablet,
TabletSharedPtr base_tablet) {
RowsetWriter* new_rowset_writer, TabletSharedPtr new_tablet,
TabletSharedPtr base_tablet) {
// In some cases, there may be more than one type of rowset in a tablet,
// in which case the conversion cannot be done directly by linked schema change,
// but requires direct schema change to rewrite the data.
@ -969,7 +969,7 @@ bool SchemaChangeDirectly::_write_row_block(RowsetWriter* rowset_writer, RowBloc
}
Status reserve_block(std::unique_ptr<RowBlock, RowBlockDeleter>* block_handle_ptr, int row_num,
RowBlockAllocator* allocator) {
RowBlockAllocator* allocator) {
auto& block_handle = *block_handle_ptr;
if (block_handle == nullptr || block_handle->capacity() < row_num) {
// release old block and alloc new block
@ -987,8 +987,8 @@ Status reserve_block(std::unique_ptr<RowBlock, RowBlockDeleter>* block_handle_pt
}
Status SchemaChangeDirectly::process(RowsetReaderSharedPtr rowset_reader,
RowsetWriter* rowset_writer, TabletSharedPtr new_tablet,
TabletSharedPtr base_tablet) {
RowsetWriter* rowset_writer, TabletSharedPtr new_tablet,
TabletSharedPtr base_tablet) {
if (_row_block_allocator == nullptr) {
_row_block_allocator = new RowBlockAllocator(new_tablet->tablet_schema(), 0);
if (_row_block_allocator == nullptr) {
@ -1106,9 +1106,8 @@ SchemaChangeWithSorting::~SchemaChangeWithSorting() {
}
Status SchemaChangeWithSorting::process(RowsetReaderSharedPtr rowset_reader,
RowsetWriter* new_rowset_writer,
TabletSharedPtr new_tablet,
TabletSharedPtr base_tablet) {
RowsetWriter* new_rowset_writer, TabletSharedPtr new_tablet,
TabletSharedPtr base_tablet) {
if (_row_block_allocator == nullptr) {
_row_block_allocator =
new (nothrow) RowBlockAllocator(new_tablet->tablet_schema(), _memory_limitation);
@ -1167,9 +1166,8 @@ Status SchemaChangeWithSorting::process(RowsetReaderSharedPtr rowset_reader,
RowBlock* ref_row_block = nullptr;
rowset_reader->next_block(&ref_row_block);
while (ref_row_block != nullptr && ref_row_block->has_remaining()) {
if (!_row_block_allocator->allocate(&new_row_block,
ref_row_block->row_block_info().row_num,
true)) {
if (!_row_block_allocator->allocate(&new_row_block, ref_row_block->row_block_info().row_num,
true)) {
LOG(WARNING) << "failed to allocate RowBlock.";
return Status::OLAPInternalError(OLAP_ERR_INPUT_PARAMETER_ERROR);
} else {
@ -1385,7 +1383,8 @@ Status SchemaChangeHandler::process_alter_tablet_v2(const TAlterTabletReqV2& req
<< ", new_tablet_id=" << request.new_tablet_id
<< ", alter_version=" << request.alter_version;
TabletSharedPtr base_tablet = StorageEngine::instance()->tablet_manager()->get_tablet(request.base_tablet_id);
TabletSharedPtr base_tablet =
StorageEngine::instance()->tablet_manager()->get_tablet(request.base_tablet_id);
if (base_tablet == nullptr) {
LOG(WARNING) << "fail to find base tablet. base_tablet=" << request.base_tablet_id;
return Status::OLAPInternalError(OLAP_ERR_TABLE_NOT_FOUND);
@ -1411,14 +1410,16 @@ Status SchemaChangeHandler::process_alter_tablet_v2(const TAlterTabletReqV2& req
// Should delete the old code after upgrade finished.
Status SchemaChangeHandler::_do_process_alter_tablet_v2(const TAlterTabletReqV2& request) {
Status res = Status::OK();
TabletSharedPtr base_tablet = StorageEngine::instance()->tablet_manager()->get_tablet(request.base_tablet_id);
TabletSharedPtr base_tablet =
StorageEngine::instance()->tablet_manager()->get_tablet(request.base_tablet_id);
if (base_tablet == nullptr) {
LOG(WARNING) << "fail to find base tablet. base_tablet=" << request.base_tablet_id;
return Status::OLAPInternalError(OLAP_ERR_TABLE_NOT_FOUND);
}
// new tablet has to exist
TabletSharedPtr new_tablet = StorageEngine::instance()->tablet_manager()->get_tablet(request.new_tablet_id);
TabletSharedPtr new_tablet =
StorageEngine::instance()->tablet_manager()->get_tablet(request.new_tablet_id);
if (new_tablet == nullptr) {
LOG(WARNING) << "fail to find new tablet."
<< " new_tablet=" << request.new_tablet_id;
@ -1531,11 +1532,11 @@ Status SchemaChangeHandler::_do_process_alter_tablet_v2(const TAlterTabletReqV2&
}
}
res = delete_handler.init(base_tablet->tablet_schema(), base_tablet->delete_predicates(),
end_version);
res = delete_handler.init(base_tablet->tablet_schema(),
base_tablet->delete_predicates(), end_version);
if (!res.ok()) {
LOG(WARNING) << "init delete handler failed. base_tablet=" << base_tablet->full_name()
<< ", end_version=" << end_version;
LOG(WARNING) << "init delete handler failed. base_tablet="
<< base_tablet->full_name() << ", end_version=" << end_version;
// release delete handlers which have been inited successfully.
delete_handler.finalize();
@ -1634,9 +1635,9 @@ Status SchemaChangeHandler::_do_process_alter_tablet_v2(const TAlterTabletReqV2&
}
Status SchemaChangeHandler::schema_version_convert(TabletSharedPtr base_tablet,
TabletSharedPtr new_tablet,
RowsetSharedPtr* base_rowset,
RowsetSharedPtr* new_rowset) {
TabletSharedPtr new_tablet,
RowsetSharedPtr* base_rowset,
RowsetSharedPtr* new_rowset) {
Status res = Status::OK();
LOG(INFO) << "begin to convert delta version for schema changing. "
<< "base_tablet=" << base_tablet->full_name()
@ -1649,8 +1650,8 @@ Status SchemaChangeHandler::schema_version_convert(TabletSharedPtr base_tablet,
bool sc_directly = false;
const std::unordered_map<std::string, AlterMaterializedViewParam> materialized_function_map;
if (!(res = _parse_request(base_tablet, new_tablet, &rb_changer, &sc_sorting,
&sc_directly, materialized_function_map))) {
if (!(res = _parse_request(base_tablet, new_tablet, &rb_changer, &sc_sorting, &sc_directly,
materialized_function_map))) {
LOG(WARNING) << "failed to parse the request. res=" << res;
return res;
}
@ -1803,7 +1804,7 @@ Status SchemaChangeHandler::_convert_historical_rowsets(const SchemaChangeParams
// a.Parse the Alter request and convert it into an internal representation
Status res = _parse_request(sc_params.base_tablet, sc_params.new_tablet, &rb_changer,
&sc_sorting, &sc_directly, sc_params.materialized_params_map);
&sc_sorting, &sc_directly, sc_params.materialized_params_map);
if (!res.ok()) {
LOG(WARNING) << "failed to parse the request. res=" << res;
goto PROCESS_ALTER_EXIT;
@ -1993,7 +1994,7 @@ Status SchemaChangeHandler::_parse_request(
}
res = _init_column_mapping(column_mapping, new_column, new_column.default_value());
if (!res) {
return res;
return res;
}
VLOG_TRACE << "A column with default value will be added after schema changing. "
@ -2093,8 +2094,8 @@ Status SchemaChangeHandler::_parse_request(
}
Status SchemaChangeHandler::_init_column_mapping(ColumnMapping* column_mapping,
const TabletColumn& column_schema,
const std::string& value) {
const TabletColumn& column_schema,
const std::string& value) {
column_mapping->default_value = WrapperField::create(column_schema);
if (column_mapping->default_value == nullptr) {
@ -2111,7 +2112,7 @@ Status SchemaChangeHandler::_init_column_mapping(ColumnMapping* column_mapping,
}
Status SchemaChangeHandler::_validate_alter_result(TabletSharedPtr new_tablet,
const TAlterTabletReqV2& request) {
const TAlterTabletReqV2& request) {
Version max_continuous_version = {-1, 0};
new_tablet->max_continuous_version_from_beginning(&max_continuous_version);
LOG(INFO) << "find max continuous version of tablet=" << new_tablet->full_name()

@ -510,7 +510,8 @@ vectorized::Block TabletSchema::create_block(
for (int i = 0; i < return_columns.size(); ++i) {
const auto& col = _cols[return_columns[i]];
bool is_nullable = (tablet_columns_need_convert_null != nullptr &&
tablet_columns_need_convert_null->find(return_columns[i]) != tablet_columns_need_convert_null->end());
tablet_columns_need_convert_null->find(return_columns[i]) !=
tablet_columns_need_convert_null->end());
auto data_type = vectorized::DataTypeFactory::instance().create_data_type(col, is_nullable);
auto column = data_type->create_column();
block.insert({std::move(column), data_type, col.name()});

@ -19,10 +19,15 @@
#include <memory>
#include "gen_cpp/segment_v2.pb.h"
#include "olap/tablet_schema.h"
namespace doris {
void (*FieldTypeTraits<OLAP_FIELD_TYPE_CHAR>::set_to_max)(void*) = nullptr;
static TypeInfoPtr create_type_info_ptr(const TypeInfo* type_info, bool should_reclaim_memory);
bool is_scalar_type(FieldType field_type) {
switch (field_type) {
case OLAP_FIELD_TYPE_STRUCT:
@ -50,73 +55,63 @@ bool is_olap_string_type(FieldType field_type) {
const TypeInfo* get_scalar_type_info(FieldType field_type) {
// nullptr means that there is no TypeInfo implementation for the corresponding field_type
static const TypeInfo* field_type_array[] = {
nullptr,
get_scalar_type_info<OLAP_FIELD_TYPE_TINYINT>(),
nullptr,
get_scalar_type_info<OLAP_FIELD_TYPE_SMALLINT>(),
nullptr,
get_scalar_type_info<OLAP_FIELD_TYPE_INT>(),
get_scalar_type_info<OLAP_FIELD_TYPE_UNSIGNED_INT>(),
get_scalar_type_info<OLAP_FIELD_TYPE_BIGINT>(),
get_scalar_type_info<OLAP_FIELD_TYPE_UNSIGNED_BIGINT>(),
get_scalar_type_info<OLAP_FIELD_TYPE_LARGEINT>(),
get_scalar_type_info<OLAP_FIELD_TYPE_FLOAT>(),
get_scalar_type_info<OLAP_FIELD_TYPE_DOUBLE>(),
nullptr,
get_scalar_type_info<OLAP_FIELD_TYPE_CHAR>(),
get_scalar_type_info<OLAP_FIELD_TYPE_DATE>(),
get_scalar_type_info<OLAP_FIELD_TYPE_DATETIME>(),
get_scalar_type_info<OLAP_FIELD_TYPE_DECIMAL>(),
get_scalar_type_info<OLAP_FIELD_TYPE_VARCHAR>(),
nullptr,
nullptr,
nullptr,
nullptr,
nullptr,
get_scalar_type_info<OLAP_FIELD_TYPE_HLL>(),
get_scalar_type_info<OLAP_FIELD_TYPE_BOOL>(),
get_scalar_type_info<OLAP_FIELD_TYPE_OBJECT>(),
get_scalar_type_info<OLAP_FIELD_TYPE_STRING>(),
get_scalar_type_info<OLAP_FIELD_TYPE_QUANTILE_STATE>(),
nullptr,
get_scalar_type_info<OLAP_FIELD_TYPE_TINYINT>(),
nullptr,
get_scalar_type_info<OLAP_FIELD_TYPE_SMALLINT>(),
nullptr,
get_scalar_type_info<OLAP_FIELD_TYPE_INT>(),
get_scalar_type_info<OLAP_FIELD_TYPE_UNSIGNED_INT>(),
get_scalar_type_info<OLAP_FIELD_TYPE_BIGINT>(),
get_scalar_type_info<OLAP_FIELD_TYPE_UNSIGNED_BIGINT>(),
get_scalar_type_info<OLAP_FIELD_TYPE_LARGEINT>(),
get_scalar_type_info<OLAP_FIELD_TYPE_FLOAT>(),
get_scalar_type_info<OLAP_FIELD_TYPE_DOUBLE>(),
nullptr,
get_scalar_type_info<OLAP_FIELD_TYPE_CHAR>(),
get_scalar_type_info<OLAP_FIELD_TYPE_DATE>(),
get_scalar_type_info<OLAP_FIELD_TYPE_DATETIME>(),
get_scalar_type_info<OLAP_FIELD_TYPE_DECIMAL>(),
get_scalar_type_info<OLAP_FIELD_TYPE_VARCHAR>(),
nullptr,
nullptr,
nullptr,
nullptr,
nullptr,
get_scalar_type_info<OLAP_FIELD_TYPE_HLL>(),
get_scalar_type_info<OLAP_FIELD_TYPE_BOOL>(),
get_scalar_type_info<OLAP_FIELD_TYPE_OBJECT>(),
get_scalar_type_info<OLAP_FIELD_TYPE_STRING>(),
get_scalar_type_info<OLAP_FIELD_TYPE_QUANTILE_STATE>(),
};
return field_type_array[field_type];
}
#define INIT_ARRAY_TYPE_INFO_LIST(type) \
{ \
get_init_array_type_info<type>(0), \
get_init_array_type_info<type>(1), \
get_init_array_type_info<type>(2), \
get_init_array_type_info<type>(3), \
get_init_array_type_info<type>(4), \
get_init_array_type_info<type>(5), \
get_init_array_type_info<type>(6), \
get_init_array_type_info<type>(7), \
get_init_array_type_info<type>(8) \
#define INIT_ARRAY_TYPE_INFO_LIST(type) \
{ \
get_init_array_type_info<type>(0), get_init_array_type_info<type>(1), \
get_init_array_type_info<type>(2), get_init_array_type_info<type>(3), \
get_init_array_type_info<type>(4), get_init_array_type_info<type>(5), \
get_init_array_type_info<type>(6), get_init_array_type_info<type>(7), \
get_init_array_type_info<type>(8) \
}
template <FieldType field_type>
inline const ArrayTypeInfo* get_init_array_type_info(int32_t iterations) {
static ArrayTypeInfo nested_type_info_0(get_scalar_type_info<field_type>());
static ArrayTypeInfo nested_type_info_1(&nested_type_info_0);
static ArrayTypeInfo nested_type_info_2(&nested_type_info_1);
static ArrayTypeInfo nested_type_info_3(&nested_type_info_2);
static ArrayTypeInfo nested_type_info_4(&nested_type_info_3);
static ArrayTypeInfo nested_type_info_5(&nested_type_info_4);
static ArrayTypeInfo nested_type_info_6(&nested_type_info_5);
static ArrayTypeInfo nested_type_info_7(&nested_type_info_6);
static ArrayTypeInfo nested_type_info_8(&nested_type_info_7);
static ArrayTypeInfo nested_type_info_0(
create_static_type_info_ptr(get_scalar_type_info<field_type>()));
static ArrayTypeInfo nested_type_info_1(create_static_type_info_ptr(&nested_type_info_0));
static ArrayTypeInfo nested_type_info_2(create_static_type_info_ptr(&nested_type_info_1));
static ArrayTypeInfo nested_type_info_3(create_static_type_info_ptr(&nested_type_info_2));
static ArrayTypeInfo nested_type_info_4(create_static_type_info_ptr(&nested_type_info_3));
static ArrayTypeInfo nested_type_info_5(create_static_type_info_ptr(&nested_type_info_4));
static ArrayTypeInfo nested_type_info_6(create_static_type_info_ptr(&nested_type_info_5));
static ArrayTypeInfo nested_type_info_7(create_static_type_info_ptr(&nested_type_info_6));
static ArrayTypeInfo nested_type_info_8(create_static_type_info_ptr(&nested_type_info_7));
static ArrayTypeInfo* nested_type_info_array[] = {
&nested_type_info_0,
&nested_type_info_1,
&nested_type_info_2,
&nested_type_info_3,
&nested_type_info_4,
&nested_type_info_5,
&nested_type_info_6,
&nested_type_info_7,
&nested_type_info_8
};
&nested_type_info_0, &nested_type_info_1, &nested_type_info_2,
&nested_type_info_3, &nested_type_info_4, &nested_type_info_5,
&nested_type_info_6, &nested_type_info_7, &nested_type_info_8};
return nested_type_info_array[iterations];
}
@ -124,40 +119,41 @@ const TypeInfo* get_array_type_info(FieldType leaf_type, int32_t iterations) {
DCHECK(iterations <= 8) << "the depth of nested array type should not be larger than 8";
static constexpr int32_t depth = 9;
static const ArrayTypeInfo* array_type_Info_arr[][depth] = {
{ nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr },
INIT_ARRAY_TYPE_INFO_LIST(OLAP_FIELD_TYPE_TINYINT),
{ nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr },
INIT_ARRAY_TYPE_INFO_LIST(OLAP_FIELD_TYPE_SMALLINT),
{ nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr },
INIT_ARRAY_TYPE_INFO_LIST(OLAP_FIELD_TYPE_INT),
INIT_ARRAY_TYPE_INFO_LIST(OLAP_FIELD_TYPE_UNSIGNED_INT),
INIT_ARRAY_TYPE_INFO_LIST(OLAP_FIELD_TYPE_BIGINT),
INIT_ARRAY_TYPE_INFO_LIST(OLAP_FIELD_TYPE_UNSIGNED_BIGINT),
INIT_ARRAY_TYPE_INFO_LIST(OLAP_FIELD_TYPE_LARGEINT),
INIT_ARRAY_TYPE_INFO_LIST(OLAP_FIELD_TYPE_FLOAT),
INIT_ARRAY_TYPE_INFO_LIST(OLAP_FIELD_TYPE_DOUBLE),
{ nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr },
INIT_ARRAY_TYPE_INFO_LIST(OLAP_FIELD_TYPE_CHAR),
INIT_ARRAY_TYPE_INFO_LIST(OLAP_FIELD_TYPE_DATE),
INIT_ARRAY_TYPE_INFO_LIST(OLAP_FIELD_TYPE_DATETIME),
INIT_ARRAY_TYPE_INFO_LIST(OLAP_FIELD_TYPE_DECIMAL),
INIT_ARRAY_TYPE_INFO_LIST(OLAP_FIELD_TYPE_VARCHAR),
{ nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr },
{ nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr },
{ nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr },
{ nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr },
{ nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr },
INIT_ARRAY_TYPE_INFO_LIST(OLAP_FIELD_TYPE_HLL),
INIT_ARRAY_TYPE_INFO_LIST(OLAP_FIELD_TYPE_BOOL),
INIT_ARRAY_TYPE_INFO_LIST(OLAP_FIELD_TYPE_OBJECT),
INIT_ARRAY_TYPE_INFO_LIST(OLAP_FIELD_TYPE_STRING),
INIT_ARRAY_TYPE_INFO_LIST(OLAP_FIELD_TYPE_QUANTILE_STATE),
{nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr},
INIT_ARRAY_TYPE_INFO_LIST(OLAP_FIELD_TYPE_TINYINT),
{nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr},
INIT_ARRAY_TYPE_INFO_LIST(OLAP_FIELD_TYPE_SMALLINT),
{nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr},
INIT_ARRAY_TYPE_INFO_LIST(OLAP_FIELD_TYPE_INT),
INIT_ARRAY_TYPE_INFO_LIST(OLAP_FIELD_TYPE_UNSIGNED_INT),
INIT_ARRAY_TYPE_INFO_LIST(OLAP_FIELD_TYPE_BIGINT),
INIT_ARRAY_TYPE_INFO_LIST(OLAP_FIELD_TYPE_UNSIGNED_BIGINT),
INIT_ARRAY_TYPE_INFO_LIST(OLAP_FIELD_TYPE_LARGEINT),
INIT_ARRAY_TYPE_INFO_LIST(OLAP_FIELD_TYPE_FLOAT),
INIT_ARRAY_TYPE_INFO_LIST(OLAP_FIELD_TYPE_DOUBLE),
{nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr},
INIT_ARRAY_TYPE_INFO_LIST(OLAP_FIELD_TYPE_CHAR),
INIT_ARRAY_TYPE_INFO_LIST(OLAP_FIELD_TYPE_DATE),
INIT_ARRAY_TYPE_INFO_LIST(OLAP_FIELD_TYPE_DATETIME),
INIT_ARRAY_TYPE_INFO_LIST(OLAP_FIELD_TYPE_DECIMAL),
INIT_ARRAY_TYPE_INFO_LIST(OLAP_FIELD_TYPE_VARCHAR),
{nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr},
{nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr},
{nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr},
{nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr},
{nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr},
INIT_ARRAY_TYPE_INFO_LIST(OLAP_FIELD_TYPE_HLL),
INIT_ARRAY_TYPE_INFO_LIST(OLAP_FIELD_TYPE_BOOL),
INIT_ARRAY_TYPE_INFO_LIST(OLAP_FIELD_TYPE_OBJECT),
INIT_ARRAY_TYPE_INFO_LIST(OLAP_FIELD_TYPE_STRING),
INIT_ARRAY_TYPE_INFO_LIST(OLAP_FIELD_TYPE_QUANTILE_STATE),
};
return array_type_Info_arr[leaf_type][iterations];
}
const TypeInfo* get_type_info(segment_v2::ColumnMetaPB* column_meta_pb) {
FieldType type = (FieldType) column_meta_pb->type();
// TODO: Support the type info of the nested array with more than 9 depths.
TypeInfoPtr get_type_info(segment_v2::ColumnMetaPB* column_meta_pb) {
FieldType type = (FieldType)column_meta_pb->type();
if (UNLIKELY(type == OLAP_FIELD_TYPE_ARRAY)) {
int32_t iterations = 0;
const auto* child_column = &column_meta_pb->children_columns(0);
@ -165,13 +161,31 @@ const TypeInfo* get_type_info(segment_v2::ColumnMetaPB* column_meta_pb) {
iterations++;
child_column = &child_column->children_columns(0);
}
return get_array_type_info((FieldType) child_column->type(), iterations);
return create_static_type_info_ptr(
get_array_type_info((FieldType)child_column->type(), iterations));
} else {
return get_scalar_type_info(type);
return create_static_type_info_ptr(get_scalar_type_info(type));
}
}
const TypeInfo* get_type_info(const TabletColumn* col) {
TypeInfoPtr create_static_type_info_ptr(const TypeInfo* type_info) {
return create_type_info_ptr(type_info, false);
}
TypeInfoPtr create_dynamic_type_info_ptr(const TypeInfo* type_info) {
return create_type_info_ptr(type_info, true);
}
TypeInfoPtr create_type_info_ptr(const TypeInfo* type_info, bool should_reclaim_memory) {
if (!should_reclaim_memory) {
return TypeInfoPtr(type_info, [](const TypeInfo*) {});
} else {
return TypeInfoPtr(type_info, [](const TypeInfo* type_info) { delete type_info; });
}
}
// TODO: Support the type info of the nested array with more than 9 depths.
TypeInfoPtr get_type_info(const TabletColumn* col) {
auto type = col->type();
if (UNLIKELY(type == OLAP_FIELD_TYPE_ARRAY)) {
int32_t iterations = 0;
@ -180,9 +194,19 @@ const TypeInfo* get_type_info(const TabletColumn* col) {
iterations++;
child_column = &child_column->get_sub_column(0);
}
return get_array_type_info(child_column->type(), iterations);
return create_static_type_info_ptr(get_array_type_info(child_column->type(), iterations));
} else {
return get_scalar_type_info(type);
return create_static_type_info_ptr(get_scalar_type_info(type));
}
}
TypeInfoPtr clone_type_info(const TypeInfo* type_info) {
if (is_scalar_type(type_info->type())) {
return create_static_type_info_ptr(type_info);
} else {
const auto array_type_info = dynamic_cast<const ArrayTypeInfo*>(type_info);
return create_dynamic_type_info_ptr(
new ArrayTypeInfo(clone_type_info(array_type_info->item_type_info())));
}
}
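
Review note: the ownership scheme in this file (`create_static_type_info_ptr`, `create_dynamic_type_info_ptr`, `clone_type_info`) can be tried out in isolation with the minimal sketch below. Every `Sketch*` name, `make_static_ptr`, `make_dynamic_ptr`, `cached_scalar`, `clone` and `reclaimed_count` is hypothetical and exists only for illustration; none of them are the Doris classes. The sketch also uses `static_cast` where the diff uses `dynamic_cast`, and adds a counter that the real deleter does not have.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-in for doris::TypeInfo (illustration only).
struct SketchTypeInfo {
    virtual ~SketchTypeInfo() = default;
    virtual bool is_scalar() const = 0;
};

// Same shape as TypeInfoPtr in the diff: unique_ptr with a function-pointer deleter.
using SketchTypeInfoPtr =
        std::unique_ptr<const SketchTypeInfo, void (*)(const SketchTypeInfo*)>;

// Counts heap-allocated infos that were reclaimed (not part of the real code).
inline int& reclaimed_count() {
    static int count = 0;
    return count;
}

// Wraps a cached object; the deleter does nothing.
SketchTypeInfoPtr make_static_ptr(const SketchTypeInfo* info) {
    return SketchTypeInfoPtr(info, [](const SketchTypeInfo*) {});
}

// Wraps a heap object; the deleter reclaims it.
SketchTypeInfoPtr make_dynamic_ptr(const SketchTypeInfo* info) {
    return SketchTypeInfoPtr(info, [](const SketchTypeInfo* p) {
        ++reclaimed_count();
        delete p;
    });
}

struct SketchScalarInfo : SketchTypeInfo {
    bool is_scalar() const override { return true; }
};

// Owns its item type info, as ArrayTypeInfo does after this change.
struct SketchArrayInfo : SketchTypeInfo {
    explicit SketchArrayInfo(SketchTypeInfoPtr item) : _item(std::move(item)) {}
    bool is_scalar() const override { return false; }
    const SketchTypeInfo* item() const { return _item.get(); }

private:
    SketchTypeInfoPtr _item;
};

// A globally cached scalar info (function-local static).
const SketchTypeInfo* cached_scalar() {
    static SketchScalarInfo info;
    return &info;
}

// Scalars are shared; every array level is copied onto the heap.
SketchTypeInfoPtr clone(const SketchTypeInfo* info) {
    if (info->is_scalar()) {
        return make_static_ptr(info);
    }
    const auto* array = static_cast<const SketchArrayInfo*>(info);
    return make_dynamic_ptr(new SketchArrayInfo(clone(array->item())));
}
```

Because both deleters are captureless lambdas, they convert to the plain function pointer stored in the `unique_ptr`, so the static and dynamic cases share one pointer type.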

@ -21,30 +21,38 @@
#include <stdio.h>
#include <limits>
#include <memory>
#include <sstream>
#include <string>
#include "gen_cpp/segment_v2.pb.h" // for ColumnMetaPB
#include "gutil/strings/numbers.h"
#include "olap/decimal12.h"
#include "olap/olap_common.h"
#include "olap/olap_define.h"
#include "olap/tablet_schema.h" // for TabletColumn
#include "olap/uint24.h"
#include "runtime/collection_value.h"
#include "runtime/datetime_value.h"
#include "runtime/mem_pool.h"
#include "util/hash_util.hpp"
#include "util/mem_util.hpp"
#include "util/slice.h"
#include "util/string_parser.hpp"
#include "util/types.h"
namespace doris {
namespace segment_v2 {
class ColumnMetaPB;
}
class MemPool;
struct uint24_t;
struct decimal12_t;
class TabletColumn;
extern bool is_olap_string_type(FieldType field_type);
class TypeInfo;
using TypeInfoPtr = std::unique_ptr<const TypeInfo, void (*)(const TypeInfo*)>;
TypeInfoPtr create_static_type_info_ptr(const TypeInfo* type_info);
TypeInfoPtr create_dynamic_type_info_ptr(const TypeInfo* type_info);
class TypeInfo {
public:
virtual ~TypeInfo() = default;
@ -66,7 +74,7 @@ public:
// Convert and deep copy value from other type's source.
virtual Status convert_from(void* dest, const void* src, const TypeInfo* src_type,
MemPool* mem_pool, size_t variable_len = 0) const = 0;
MemPool* mem_pool, size_t variable_len = 0) const = 0;
virtual Status from_string(void* buf, const std::string& scan_key) const = 0;
@ -83,15 +91,11 @@ public:
class ScalarTypeInfo : public TypeInfo {
public:
bool equal(const void* left, const void* right) const override {
return _equal(left, right);
}
bool equal(const void* left, const void* right) const override { return _equal(left, right); }
int cmp(const void* left, const void* right) const override { return _cmp(left, right); }
void shallow_copy(void* dest, const void* src) const override {
_shallow_copy(dest, src);
}
void shallow_copy(void* dest, const void* src) const override { _shallow_copy(dest, src); }
void deep_copy(void* dest, const void* src, MemPool* mem_pool) const override {
_deep_copy(dest, src, mem_pool);
@ -110,8 +114,8 @@ public:
}
// Convert and deep copy value from other type's source.
Status convert_from(void* dest, const void* src, const TypeInfo* src_type,
MemPool* mem_pool, size_t variable_len = 0) const override {
Status convert_from(void* dest, const void* src, const TypeInfo* src_type, MemPool* mem_pool,
size_t variable_len = 0) const override {
return _convert_from(dest, src, src_type, mem_pool, variable_len);
}
@ -159,7 +163,7 @@ private:
void (*_direct_copy)(void* dest, const void* src);
void (*_direct_copy_may_cut)(void* dest, const void* src);
Status (*_convert_from)(void* dest, const void* src, const TypeInfo* src_type,
MemPool* mem_pool, size_t variable_len);
MemPool* mem_pool, size_t variable_len);
Status (*_from_string)(void* buf, const std::string& scan_key);
std::string (*_to_string)(const void* src);
@ -177,10 +181,11 @@ private:
class ArrayTypeInfo : public TypeInfo {
public:
explicit ArrayTypeInfo(const TypeInfo* item_type_info)
: _item_type_info(item_type_info), _item_size(item_type_info->size()) {}
~ArrayTypeInfo() = default;
bool equal(const void* left, const void* right) const override {
explicit ArrayTypeInfo(TypeInfoPtr item_type_info)
: _item_type_info(std::move(item_type_info)), _item_size(_item_type_info->size()) {}
~ArrayTypeInfo() override = default;
inline bool equal(const void* left, const void* right) const override {
auto l_value = reinterpret_cast<const CollectionValue*>(left);
auto r_value = reinterpret_cast<const CollectionValue*>(right);
if (l_value->length() != r_value->length()) {
@ -329,7 +334,7 @@ public:
if (_item_type_info->type() == OLAP_FIELD_TYPE_ARRAY) {
for (uint32_t i = 0; i < src_value->length(); ++i) {
if (dest_value->is_null_at(i)) continue;
dynamic_cast<const ArrayTypeInfo*>(_item_type_info)
dynamic_cast<const ArrayTypeInfo*>(_item_type_info.get())
->direct_copy(base, (uint8_t*)(dest_value->mutable_data()) + i * _item_size,
(uint8_t*)(src_value->data()) + i * _item_size);
}
@ -350,12 +355,10 @@ public:
}
}
void direct_copy_may_cut(void* dest, const void* src) const override {
direct_copy(dest, src);
}
void direct_copy_may_cut(void* dest, const void* src) const override { direct_copy(dest, src); }
Status convert_from(void* dest, const void* src, const TypeInfo* src_type,
MemPool* mem_pool, size_t variable_len = 0) const override {
Status convert_from(void* dest, const void* src, const TypeInfo* src_type, MemPool* mem_pool,
size_t variable_len = 0) const override {
return Status::OLAPInternalError(OLAP_ERR_FUNC_NOT_IMPLEMENTED);
}
@ -406,20 +409,22 @@ public:
FieldType type() const override { return OLAP_FIELD_TYPE_ARRAY; }
const TypeInfo* item_type_info() const { return _item_type_info; }
inline const TypeInfo* item_type_info() const { return _item_type_info.get(); }
private:
const TypeInfo* _item_type_info;
TypeInfoPtr _item_type_info;
const size_t _item_size;
};
extern bool is_scalar_type(FieldType field_type);
bool is_scalar_type(FieldType field_type);
extern const TypeInfo* get_scalar_type_info(FieldType field_type);
const TypeInfo* get_scalar_type_info(FieldType field_type);
extern const TypeInfo* get_type_info(segment_v2::ColumnMetaPB* column_meta_pb);
TypeInfoPtr get_type_info(segment_v2::ColumnMetaPB* column_meta_pb);
extern const TypeInfo* get_type_info(const TabletColumn* col);
TypeInfoPtr get_type_info(const TabletColumn* col);
TypeInfoPtr clone_type_info(const TypeInfo* type_info);
// support following formats when convert varchar to date
static const std::vector<std::string> DATE_FORMATS {
@ -562,7 +567,7 @@ struct BaseFieldtypeTraits : public CppTypeTraits<field_type> {
static inline void direct_copy_may_cut(void* dest, const void* src) { direct_copy(dest, src); }
static Status convert_from(void* dest, const void* src, const TypeInfo* src_type,
MemPool* mem_pool, size_t variable_len = 0) {
MemPool* mem_pool, size_t variable_len = 0) {
return Status::OLAPInternalError(OLAP_ERR_FUNC_NOT_IMPLEMENTED);
}
@ -645,7 +650,7 @@ struct NumericFieldtypeTraits : public BaseFieldtypeTraits<fieldType> {
}
static Status convert_from(void* dest, const void* src, const TypeInfo* src_type,
MemPool* mem_pool, size_t variable_len = 0) {
MemPool* mem_pool, size_t variable_len = 0) {
if (src_type->type() == OLAP_FIELD_TYPE_VARCHAR ||
src_type->type() == OLAP_FIELD_TYPE_STRING) {
return arithmetic_convert_from_varchar<CppType>(dest, src);
@ -824,7 +829,7 @@ struct FieldTypeTraits<OLAP_FIELD_TYPE_DOUBLE>
return std::string(buf);
}
static Status convert_from(void* dest, const void* src, const TypeInfo* src_type,
MemPool* mem_pool, size_t variable_len = 0) {
MemPool* mem_pool, size_t variable_len = 0) {
//only support float now
if (src_type->type() == OLAP_FIELD_TYPE_FLOAT) {
using SrcType = typename CppTypeTraits<OLAP_FIELD_TYPE_FLOAT>::CppType;
@ -894,7 +899,7 @@ struct FieldTypeTraits<OLAP_FIELD_TYPE_DATE> : public BaseFieldtypeTraits<OLAP_F
return reinterpret_cast<const CppType*>(src)->to_string();
}
static Status convert_from(void* dest, const void* src, const TypeInfo* src_type,
MemPool* mem_pool, size_t variable_len = 0) {
MemPool* mem_pool, size_t variable_len = 0) {
if (src_type->type() == FieldType::OLAP_FIELD_TYPE_DATETIME) {
using SrcType = typename CppTypeTraits<OLAP_FIELD_TYPE_DATETIME>::CppType;
SrcType src_value = *reinterpret_cast<const SrcType*>(src);
@ -992,7 +997,7 @@ struct FieldTypeTraits<OLAP_FIELD_TYPE_DATETIME>
return std::string(buf);
}
static Status convert_from(void* dest, const void* src, const TypeInfo* src_type,
MemPool* memPool, size_t variable_len = 0) {
MemPool* memPool, size_t variable_len = 0) {
// when convert date to datetime, automatic padding zero
if (src_type->type() == FieldType::OLAP_FIELD_TYPE_DATE) {
using SrcType = typename CppTypeTraits<OLAP_FIELD_TYPE_DATE>::CppType;
@ -1112,7 +1117,7 @@ struct FieldTypeTraits<OLAP_FIELD_TYPE_VARCHAR> : public FieldTypeTraits<OLAP_FI
}
static Status convert_from(void* dest, const void* src, const TypeInfo* src_type,
MemPool* mem_pool, size_t variable_len = 0) {
MemPool* mem_pool, size_t variable_len = 0) {
assert(variable_len > 0);
switch (src_type->type()) {
case OLAP_FIELD_TYPE_TINYINT:
@ -1124,7 +1129,8 @@ struct FieldTypeTraits<OLAP_FIELD_TYPE_VARCHAR> : public FieldTypeTraits<OLAP_FI
case OLAP_FIELD_TYPE_DOUBLE:
case OLAP_FIELD_TYPE_DECIMAL: {
auto result = src_type->to_string(src);
if (result.size() > variable_len) return Status::OLAPInternalError(OLAP_ERR_INPUT_PARAMETER_ERROR);
if (result.size() > variable_len)
return Status::OLAPInternalError(OLAP_ERR_INPUT_PARAMETER_ERROR);
auto slice = reinterpret_cast<Slice*>(dest);
slice->data = reinterpret_cast<char*>(mem_pool->allocate(result.size()));
memcpy(slice->data, result.c_str(), result.size());
@ -1163,7 +1169,7 @@ struct FieldTypeTraits<OLAP_FIELD_TYPE_STRING> : public FieldTypeTraits<OLAP_FIE
}
static Status convert_from(void* dest, const void* src, const TypeInfo* src_type,
MemPool* mem_pool, size_t variable_len = 0) {
MemPool* mem_pool, size_t variable_len = 0) {
switch (src_type->type()) {
case OLAP_FIELD_TYPE_TINYINT:
case OLAP_FIELD_TYPE_SMALLINT:
@ -1269,7 +1275,8 @@ inline const TypeInfo* get_scalar_type_info() {
template <FieldType field_type>
inline const TypeInfo* get_collection_type_info() {
static ArrayTypeInfo collection_type_info(get_scalar_type_info<field_type>());
static ArrayTypeInfo collection_type_info(
create_static_type_info_ptr(get_scalar_type_info<field_type>()));
return &collection_type_info;
}
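
Review note: the two `get_scalar_type_info` forms used across this change (the template form, and the runtime form backed by a lookup table in the `.cpp`) appear to follow the pattern sketched below. This is a minimal illustration under assumptions: `SketchFieldType`, `SketchInfo`, `SketchTraits` and `get_info` are hypothetical names, and the bounds check is an addition of the sketch that the table lookup in the diff does not perform.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical enum; slot 0 is deliberately unused, like the nullptr slots in the diff.
enum SketchFieldType { SKETCH_UNUSED = 0, SKETCH_INT = 1, SKETCH_DOUBLE = 2, SKETCH_TYPE_COUNT };

struct SketchInfo {
    SketchFieldType type;
    std::size_t size;
};

template <SketchFieldType T>
struct SketchTraits;
template <>
struct SketchTraits<SKETCH_INT> {
    using CppType = int;
};
template <>
struct SketchTraits<SKETCH_DOUBLE> {
    using CppType = double;
};

// Compile-time form: one function-local static per template instantiation,
// so each type has exactly one info object that is never freed.
template <SketchFieldType T>
const SketchInfo* get_info() {
    static const SketchInfo info{T, sizeof(typename SketchTraits<T>::CppType)};
    return &info;
}

// Runtime form: index a table of the cached objects by the enum value.
const SketchInfo* get_info(SketchFieldType type) {
    static const SketchInfo* table[] = {
            nullptr, // SKETCH_UNUSED has no implementation
            get_info<SKETCH_INT>(),
            get_info<SKETCH_DOUBLE>(),
    };
    const int index = static_cast<int>(type);
    if (index < 0 || index >= static_cast<int>(SKETCH_TYPE_COUNT)) {
        return nullptr; // guard added by this sketch only
    }
    return table[index];
}
```

When the field type is a template parameter, as in the updated tests, the template form skips the table lookup and returns the same cached object.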

@ -54,7 +54,7 @@ public:
template <FieldType type>
void write_index_file(std::string& filename, const void* values, size_t value_count,
size_t null_count, ColumnIndexMetaPB* meta) {
const auto* type_info = get_scalar_type_info(type);
const auto* type_info = get_scalar_type_info<type>();
{
std::unique_ptr<fs::WritableBlock> wblock;
fs::CreateBlockOptions opts(filename);

@ -345,7 +345,7 @@ void test_array_nullable_data(CollectionValue* src_data, uint8_t* src_is_null, i
MemTracker tracker;
MemPool pool(&tracker);
std::unique_ptr<ColumnVectorBatch> cvb;
ColumnVectorBatch::create(0, true, type_info, field, &cvb);
ColumnVectorBatch::create(0, true, type_info.get(), field, &cvb);
cvb->resize(1024);
ColumnBlock col(cvb.get(), &pool);
@ -372,7 +372,7 @@ void test_array_nullable_data(CollectionValue* src_data, uint8_t* src_is_null, i
MemTracker tracker;
MemPool pool(&tracker);
std::unique_ptr<ColumnVectorBatch> cvb;
ColumnVectorBatch::create(0, true, type_info, field, &cvb);
ColumnVectorBatch::create(0, true, type_info.get(), field, &cvb);
cvb->resize(1024);
ColumnBlock col(cvb.get(), &pool);
@ -462,13 +462,14 @@ TEST_F(ColumnReaderWriterTest, test_array_type) {
template <FieldType type>
void test_read_default_value(string value, void* result) {
using Type = typename TypeTraits<type>::CppType;
const auto* type_info = get_scalar_type_info(type);
const auto* scalar_type_info = get_scalar_type_info<type>();
// read and check
{
TabletColumn tablet_column = create_with_default_value<type>(value);
DefaultValueColumnIterator iter(tablet_column.has_default_value(),
tablet_column.default_value(), tablet_column.is_nullable(),
type_info, tablet_column.length());
create_static_type_info_ptr(scalar_type_info),
tablet_column.length());
ColumnIteratorOptions iter_opts;
auto st = iter.init(iter_opts);
EXPECT_TRUE(st.ok());
@ -480,7 +481,7 @@ void test_read_default_value(string value, void* result) {
auto tracker = std::make_shared<MemTracker>();
MemPool pool(tracker.get());
std::unique_ptr<ColumnVectorBatch> cvb;
ColumnVectorBatch::create(0, true, type_info, nullptr, &cvb);
ColumnVectorBatch::create(0, true, scalar_type_info, nullptr, &cvb);
cvb->resize(1024);
ColumnBlock col(cvb.get(), &pool);
@ -511,7 +512,7 @@ void test_read_default_value(string value, void* result) {
auto tracker = std::make_shared<MemTracker>();
MemPool pool(tracker.get());
std::unique_ptr<ColumnVectorBatch> cvb;
-ColumnVectorBatch::create(0, true, type_info, nullptr, &cvb);
+ColumnVectorBatch::create(0, true, scalar_type_info, nullptr, &cvb);
cvb->resize(1024);
ColumnBlock col(cvb.get(), &pool);
@ -573,13 +574,14 @@ static vectorized::MutableColumnPtr create_vectorized_column_ptr(FieldType type)
template <FieldType type>
void test_v_read_default_value(string value, void* result) {
using Type = typename TypeTraits<type>::CppType;
-const auto* type_info = get_scalar_type_info(type);
+const auto* scalar_type_info = get_scalar_type_info<type>();
// read and check
{
TabletColumn tablet_column = create_with_default_value<type>(value);
DefaultValueColumnIterator iter(tablet_column.has_default_value(),
tablet_column.default_value(), tablet_column.is_nullable(),
-type_info, tablet_column.length());
+create_static_type_info_ptr(scalar_type_info),
+tablet_column.length());
ColumnIteratorOptions iter_opts;
auto st = iter.init(iter_opts);
EXPECT_TRUE(st.ok());
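
The hunks above wrap a cached scalar `TypeInfo` with `create_static_type_info_ptr` before handing it to `DefaultValueColumnIterator`. The definition of `TypeInfoPtr` is not part of the diff shown here, so the following is only a minimal sketch of how a `unique_ptr` with a do-nothing deleter could behave, following the PR description. The `TypeInfo` struct, `live_count`, both deleter functions and `create_dynamic_type_info_ptr` are stand-ins assumed for illustration, not the Doris definitions.

```cpp
#include <memory>

// Counts live TypeInfo objects so the effect of each deleter is observable.
inline int& live_count() {
    static int n = 0;
    return n;
}

// Hypothetical stand-in for the real TypeInfo class.
struct TypeInfo {
    TypeInfo() { ++live_count(); }
    TypeInfo(const TypeInfo&) = delete;
    TypeInfo& operator=(const TypeInfo&) = delete;
    ~TypeInfo() { --live_count(); }
};

// Deleter for globally cached objects: intentionally does nothing.
inline void noop_deleter(const TypeInfo*) {}

// Deleter for objects created at runtime: reclaims the memory.
inline void heap_deleter(const TypeInfo* p) { delete p; }

// One handle type for both cases; the deleter is chosen at construction time.
using TypeInfoPtr = std::unique_ptr<const TypeInfo, void (*)(const TypeInfo*)>;

// Wraps a cached object; destroying the handle leaves the object alive.
inline TypeInfoPtr create_static_type_info_ptr(const TypeInfo* cached) {
    return TypeInfoPtr(cached, noop_deleter);
}

// Wraps a heap-allocated object; destroying the handle deletes it.
// (Hypothetical name, assumed here as the counterpart of the function above.)
inline TypeInfoPtr create_dynamic_type_info_ptr(const TypeInfo* owned) {
    return TypeInfoPtr(owned, heap_deleter);
}
```

With a handle like this, a class such as `DefaultValueColumnIterator` can store a single member type regardless of whether the type info came from the global cache or was built for a composite type.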

@ -155,8 +155,9 @@ void common_test_array(CollectionValue src_val) {
TabletColumn item_column(OLAP_FIELD_AGGREGATION_NONE, item_type, true, 0, item_length);
list_column.add_sub_column(item_column);
-const auto* array_type = dynamic_cast<const ArrayTypeInfo*>(get_type_info(&list_column));
-EXPECT_EQ(item_type, array_type->item_type_info()->type());
+auto array_type = get_type_info(&list_column);
+ASSERT_EQ(item_type,
+dynamic_cast<const ArrayTypeInfo*>(array_type.get())->item_type_info()->type());
{ // test deep copy
CollectionValue dst_val;

@ -64,7 +64,7 @@ ColumnPB create_column_pb(const std::string& type, const Ts&... sub_column_types
return column;
}
-const TypeInfo* get_type_info(const ColumnPB& column_pb) {
+TypeInfoPtr get_type_info(const ColumnPB& column_pb) {
TabletColumn tablet_column;
tablet_column.init_from_pb(column_pb);
return get_type_info(&tablet_column);
@ -250,8 +250,8 @@ private:
}
EXPECT_TRUE(st.ok());
} while (rows_read >= 1024);
-auto tuple_desc = get_tuple_descriptor(_object_pool, get_type_info(column_pb));
+auto type_info = get_type_info(column_pb);
+auto tuple_desc = get_tuple_descriptor(_object_pool, type_info.get());
block.set_selected_size(rows_read);
test_convert_to_vec_block(block, tuple_desc, field, arrays);
}
@ -383,9 +383,9 @@ const std::string ArrayTest::TEST_DIR = "./ut_dir/array_test";
TEST_F(ArrayTest, TestSimpleIntArrays) {
auto column_pb = create_column_pb("ARRAY", "INT");
-const auto* type_info = get_type_info(column_pb);
+auto type_info = get_type_info(column_pb);
auto field = create_field(column_pb);
-auto tuple_desc = get_tuple_descriptor(_object_pool, type_info);
+auto tuple_desc = get_tuple_descriptor(_object_pool, type_info.get());
EXPECT_EQ(tuple_desc->slots().size(), 1);
FunctionContext context;
ArrayUtils::prepare_context(context, *_mem_pool, column_pb);
@ -411,9 +411,9 @@ TEST_F(ArrayTest, TestSimpleIntArrays) {
TEST_F(ArrayTest, TestNestedIntArrays) {
// depth 2
auto column_pb = create_column_pb("ARRAY", "ARRAY", "INT");
-const auto* type_info = get_type_info(column_pb);
+auto type_info = get_type_info(column_pb);
auto field = create_field(column_pb);
-auto tuple_desc = get_tuple_descriptor(_object_pool, type_info);
+auto tuple_desc = get_tuple_descriptor(_object_pool, type_info.get());
EXPECT_EQ(tuple_desc->slots().size(), 1);
auto context = std::make_unique<FunctionContext>();
ArrayUtils::prepare_context(*context, *_mem_pool, column_pb);
@ -438,7 +438,7 @@ TEST_F(ArrayTest, TestNestedIntArrays) {
column_pb = create_column_pb("ARRAY", "ARRAY", "ARRAY", "INT");
type_info = get_type_info(column_pb);
field = create_field(column_pb);
-tuple_desc = get_tuple_descriptor(_object_pool, type_info);
+tuple_desc = get_tuple_descriptor(_object_pool, type_info.get());
EXPECT_EQ(tuple_desc->slots().size(), 1);
arrays.clear();
EXPECT_EQ(arrays.size(), 0);
@ -465,7 +465,7 @@ TEST_F(ArrayTest, TestSimpleStringArrays) {
auto column_pb = create_column_pb("ARRAY", "VARCHAR");
auto type_info = get_type_info(column_pb);
auto field = create_field(column_pb);
-auto tuple_desc = get_tuple_descriptor(_object_pool, type_info);
+auto tuple_desc = get_tuple_descriptor(_object_pool, type_info.get());
EXPECT_EQ(tuple_desc->slots().size(), 1);
FunctionContext context;
ArrayUtils::prepare_context(context, *_mem_pool, column_pb);
@ -491,9 +491,9 @@ TEST_F(ArrayTest, TestSimpleStringArrays) {
TEST_F(ArrayTest, TestNestedStringArrays) {
auto column_pb = create_column_pb("ARRAY", "ARRAY", "ARRAY", "VARCHAR");
-const auto* type_info = get_type_info(column_pb);
+auto type_info = get_type_info(column_pb);
auto field = create_field(column_pb);
-auto tuple_desc = get_tuple_descriptor(_object_pool, type_info);
+auto tuple_desc = get_tuple_descriptor(_object_pool, type_info.get());
EXPECT_EQ(tuple_desc->slots().size(), 1);
FunctionContext context;
ArrayUtils::prepare_context(context, *_mem_pool, column_pb);
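
Every array test above now binds the result of `get_type_info` to a named local and passes `type_info.get()` onward. A named owner matters whenever the callee keeps the pointer: calling `.get()` on a temporary handle would release a composite type info at the end of that statement. Whether `get_tuple_descriptor` retains the pointer is not visible in this diff, so the sketch below uses hypothetical `Info`, `Descriptor`, `make_info` and `describe` purely to show the borrowing pattern.

```cpp
#include <memory>

// Hypothetical stand-in for a runtime-created type info.
struct Info {
    int id;
};

// Hypothetical consumer that stores a borrowed, non-owning pointer.
struct Descriptor {
    const Info* info;
};

// Returns an owning handle, in the spirit of get_type_info() returning TypeInfoPtr.
inline std::unique_ptr<const Info> make_info(int id) {
    return std::unique_ptr<const Info>(new Info{id});
}

// Keeps the raw pointer; the caller must keep the owner alive.
inline Descriptor describe(const Info* info) {
    return Descriptor{info};
}

inline int read_through_descriptor() {
    // The named owner lives until the end of this function, so the
    // borrowed pointer inside the descriptor stays valid while it is used.
    auto owner = make_info(7);
    Descriptor desc = describe(owner.get());
    return desc.info->id;
}
```

Writing `describe(make_info(7).get())` instead would compile, but the descriptor would then hold a pointer to an object destroyed at the end of that statement.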

@ -20,6 +20,7 @@
#include <memory>
#include <string>
#include "olap/tablet_schema.h"
#include "olap/types.h"
#include "runtime/mem_tracker.h"
#include "runtime/string_value.h"
@ -43,7 +44,7 @@ ColumnPB create_column_pb(const std::string& type, const Ts&... sub_column_types
return column;
}
-static const TypeInfo* get_type_info(const ColumnPB& column_pb) {
+static TypeInfoPtr get_type_info(const ColumnPB& column_pb) {
TabletColumn tablet_column;
tablet_column.init_from_pb(column_pb);
return get_type_info(&tablet_column);