Use Case: Many users have already generated data in formats such as ORC, Parquet, JSON, and CSV. If they wanted to migrate to CarbonData for better performance or features (such as the SDK), there was previously no mechanism: all existing data had to be converted to CarbonData format first.
To remove this limitation, the add segment command is introduced, so users can add segments of other formats to a carbon table and run queries on them directly.
Example:
alter table test add segment options ('path'='hdfs://usr/oldtable','format'='parquet')
For more usage details, see:
https://github.com/apache/carbondata/blob/master/integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/addsegment/AddSegmentTestCase.scala
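A minimal end-to-end sketch of the workflow (the table name, path, and column are illustrative, not from the test suite):

```sql
-- Assuming an existing carbon table `test` whose schema matches the Parquet data
ALTER TABLE test ADD SEGMENT OPTIONS ('path'='hdfs://usr/oldtable', 'format'='parquet');

-- The added segment participates in queries like any native carbon segment
SELECT count(*) FROM test;

-- Segments of mixed formats can be listed with
SHOW SEGMENTS FOR TABLE test;
```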
9. Hive leverages the index for query performance improvement
Use Case: Hive filter expressions are pushed down to CarbonData so data is filtered at the storage layer, which improves query performance.
Usage/Example: When hive.optimize.index.filter is set to true, Hive expressions are pushed down to CarbonData to filter the data.
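A short sketch of enabling the pushdown in a Hive session (the table and column names are illustrative):

```sql
-- Enable filter pushdown to CarbonData for this session
SET hive.optimize.index.filter=true;

-- With pushdown enabled, the predicate below can be evaluated by CarbonData
-- using its index, instead of being applied by Hive after a full scan
SELECT * FROM hive_carbon_table WHERE stringField = 'abc';
```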
10. Hive Write support
Use Case: CarbonData now supports writing and reading through the Hive execution engine. This is helpful for users who want to try CarbonData without migrating to Spark. Users can also convert their existing Parquet/ORC tables directly to carbon format for ETL purposes.
Example:
CREATE TABLE hive_carbon_table(
  shortField SMALLINT,
  intField INT,
  bigintField BIGINT,
  doubleField DOUBLE,
  stringField STRING,
  timestampField TIMESTAMP,
  decimalField DECIMAL(18,2),
  dateField DATE,
  charField CHAR(5),
  floatField FLOAT
) STORED BY 'org.apache.carbondata.hive.CarbonStorageHandler'
For more usage details, see: https://github.com/apache/carbondata/blob/master/integration/hive/src/test/java/org/apache/carbondata/hive/HiveCarbonTest.java
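A sketch of writing to the table above through Hive (the literal values and the source table `old_parquet_table` are hypothetical):

```sql
-- Insert a row through the Hive execution engine; values follow the
-- column order of hive_carbon_table defined above
INSERT INTO hive_carbon_table
SELECT 1, 10, 100, 1.5, 'name', current_timestamp(), 12.34, current_date(), 'ch', 2.5;

-- Convert an existing parquet table to carbon format in one statement
-- (`old_parquet_table` must have a compatible schema)
INSERT INTO hive_carbon_table SELECT * FROM old_parquet_table;
```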
11. Support prestodb-0.217 and prestosql-316
Use Case: Presto currently has two communities, PrestoDB and PrestoSQL. To serve users of both, CarbonData now supports prestodb-0.217 and prestosql-316.