Apache CarbonData 2.0 Preview (an early look at key features)

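Author: 华为云折扣网 | Published: 2020-12-20
CarbonData is a high-performance big data storage solution. It has been deployed in production at more than 100 enterprises, with the largest single cluster reaching a data scale in the trillions.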

Use Case: Many users already have data in other formats such as ORC, Parquet, JSON, and CSV. Previously, a user who wanted to migrate to CarbonData for better performance or for features such as the SDK had no mechanism to do so incrementally: all existing data had to be converted to CarbonData first.

To remove this limitation, add segment is introduced, so that users can add segments of different formats to a carbon table and run queries on them.

Example:

alter table test add segment options ('path'='hdfs://usr/oldtable','format'='parquet')

For more usage examples, see:

https://github.com/apache/carbondata/blob/master/integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/addsegment/AddSegmentTestCase.scala
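
When several legacy directories have to be attached, the add segment statement shown in the example can be generated once per directory. The following is a minimal Python sketch under stated assumptions: it reuses the statement syntax from the example, the helper name add_segment_sql and the second directory path are hypothetical, and it only builds SQL text. It does not connect to Spark or check that a given format is accepted by add segment.

```python
def add_segment_sql(table: str, path: str, fmt: str) -> str:
    """Build an ALTER TABLE ... ADD SEGMENT statement (text only)."""
    # Reject single quotes so a value cannot break out of the quoted options.
    for value in (table, path, fmt):
        if "'" in value:
            raise ValueError(f"single quote not allowed in {value!r}")
    return (
        f"alter table {table} add segment "
        f"options ('path'='{path}','format'='{fmt.lower()}')"
    )


# Hypothetical legacy directories: one statement is produced per directory.
legacy_dirs = [
    ("hdfs://usr/oldtable", "parquet"),
    ("hdfs://usr/oldtable_orc", "orc"),
]
statements = [add_segment_sql("test", path, fmt) for path, fmt in legacy_dirs]
for stmt in statements:
    print(stmt)
```

Each printed line can then be submitted through whatever SQL entry point the deployment already uses.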

9. Hive leverages the index to improve query performance

Use Case: Hive filter expressions need to be pushed down to carbon so that carbon can filter the data itself, which improves query performance.

Usage/Example: When hive.optimize.index.filter is set to true, Hive expressions are pushed down to carbon to filter the data.
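
To illustrate why a pushed-down filter helps, here is a conceptual Python sketch of index-based pruning. It assumes a toy min/max index per data block and an equality filter; the Block type, the prune and scan functions, and the sample values are all hypothetical, and this is not CarbonData's actual implementation. The point is only that a storage layer that receives the filter can skip blocks whose value range cannot match, instead of returning every row to the engine.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    """A chunk of rows plus the min/max of one column (a toy index)."""
    name: str
    min_value: int
    max_value: int
    rows: List[int]


def prune(blocks: List[Block], target: int) -> List[Block]:
    """Keep only blocks whose [min, max] range can contain `target`."""
    return [b for b in blocks if b.min_value <= target <= b.max_value]


def scan(blocks: List[Block], target: int) -> List[int]:
    """Row-level filter, applied only to the blocks that are actually read."""
    return [v for b in blocks for v in b.rows if v == target]


blocks = [
    Block("block-0", 1, 10, [1, 4, 10]),
    Block("block-1", 11, 20, [11, 15, 20]),
    Block("block-2", 21, 30, [21, 25, 30]),
]

# With the filter available at the storage layer, only matching blocks are read.
candidates = prune(blocks, 15)
result = scan(candidates, 15)
print([b.name for b in candidates], result)
```

Scanning all three blocks returns the same rows; pruning only reduces how much data has to be read.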

 

10. Hive write support

Use Case: CarbonData now supports writing and reading through the Hive execution engine. This helps users who want to try carbon without migrating to Spark. Users can also convert their existing Parquet/ORC tables directly to carbon format for ETL purposes.

Example:

CREATE TABLE hive_carbon_table(shortField SMALLINT , intField INT, bigintField BIGINT , doubleField DOUBLE, stringField STRING, timestampField TIMESTAMP, decimalField DECIMAL(18,2), dateField DATE, charField CHAR(5), floatField FLOAT) stored by 'org.apache.carbondata.hive.CarbonStorageHandler'

For more usage examples, see:

https://github.com/apache/carbondata/blob/master/integration/hive/src/test/java/org/apache/carbondata/hive/HiveCarbonTest.java
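
The conversion of an existing Parquet/ORC table mentioned in the use case can be sketched as a CREATE TABLE followed by an INSERT ... SELECT. The following is a minimal Python sketch that only renders HiveQL text. The helper names, the table names events_carbon and events_parquet, and the column list are hypothetical; the storage handler class is the one from the example. Whether every column type of a given source table is accepted is not checked here.

```python
from typing import List, Tuple

# Storage handler class taken from the CREATE TABLE example in the text.
CARBON_HANDLER = "org.apache.carbondata.hive.CarbonStorageHandler"


def create_carbon_table_sql(table: str, columns: List[Tuple[str, str]]) -> str:
    """Render a CREATE TABLE statement that stores data through CarbonData."""
    cols = ", ".join(f"{name} {ctype}" for name, ctype in columns)
    return f"CREATE TABLE {table}({cols}) stored by '{CARBON_HANDLER}'"


def copy_table_sql(target: str, source: str) -> str:
    """Render a standard HiveQL INSERT ... SELECT that copies all rows."""
    return f"INSERT INTO TABLE {target} SELECT * FROM {source}"


# Hypothetical tables: events_parquet already exists, events_carbon is new.
columns = [("id", "BIGINT"), ("name", "STRING"), ("created", "TIMESTAMP")]
print(create_carbon_table_sql("events_carbon", columns))
print(copy_table_sql("events_carbon", "events_parquet"))
```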

11. Support prestodb-0.217 and prestosql-316

Use Case: Presto currently has two communities, PrestoDB and PrestoSQL. To serve users of both communities, carbon now supports prestodb-0.217 and prestosql-316.