
Apache CarbonData 2.0 Preview (An Early Look at Key Features)

Author: 华为云折扣网 | Published: 2020-12-20
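CarbonData is a high-performance big data storage solution that has been deployed in production at more than 100 enterprises, with the largest single cluster reaching a scale of several trillion records.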

Usage/Example:

Please refer to the link below for how to use pycarbon: https://github.com/apache/carbondata/blob/master/python/README.md

15) Materialized views on all tables, such as Parquet and ORC

Use Case: CarbonData's datamap interface can be used to improve the query performance of other formats such as Parquet and ORC. One implementation of the datamap interface is the MV table, which precomputes aggregation results based on a query the user defines. By creating an MV datamap on a Parquet or ORC table, the user gets the benefit of querying precomputed data instead of raw data, which results in better query performance.

This is possible because CarbonData redirects a matching query to the MV datamap instead of the Parquet table.
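The redirect idea can be illustrated with a toy model in plain Python. This is only a sketch of the concept, not CarbonData's implementation: the sample rows and the names `build_mv`, `MV_PARQUET`, and `avg_salary` are hypothetical, and a real engine rewrites the query plan rather than doing a dictionary lookup.

```python
from collections import defaultdict

# Hypothetical raw rows: (empname, designation, deptno, deptname, salary)
SOURCE = [
    ("alice", "engineer", 10, "rnd", 100),
    ("alice", "engineer", 10, "rnd", 300),
    ("bob", "analyst", 20, "sales", 200),
]


def build_mv(rows):
    """Precompute avg(salary) grouped by (empname, deptname)."""
    sums = defaultdict(lambda: [0, 0])  # key -> [salary_sum, row_count]
    for empname, _designation, _deptno, deptname, salary in rows:
        acc = sums[(empname, deptname)]
        acc[0] += salary
        acc[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


# Built once, ahead of query time, like a materialized view.
MV_PARQUET = build_mv(SOURCE)


def avg_salary(empname, deptname, use_mv=True):
    """Answer the aggregate query, from the precomputed view when allowed."""
    if use_mv:
        # "Redirected" path: a single lookup, no scan of the raw rows.
        return MV_PARQUET[(empname, deptname)]
    # Fallback path: scan and aggregate the raw rows on every query.
    matching = [r[4] for r in SOURCE if r[0] == empname and r[3] == deptname]
    return sum(matching) / len(matching)
```

Both paths return the same answer for the same group; the difference is that the redirected path skips the scan and aggregation at query time, which is where the performance gain comes from.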

Example:

spark.sql("""
  create table source(empname String, designation String, deptno int, deptname String, salary int)
  using parquet
""")
spark.sql("""
  create materialized view mv_parquet as
  select empname, deptname, avg(salary) from source group by empname, deptname
""")

For more usage examples, see: https://github.com/apache/carbondata/blob/master/integration/spark/src/test/scala/org/apache/carbondata/view/MVTest.scala
