Use Case: Many users have already generated data in formats such as ORC, Parquet, JSON, and CSV. If they wanted to migrate to CarbonData for better performance or features (such as the SDK), there was previously no mechanism: all existing data had to be converted to CarbonData format first.
To remove this limitation, the add segment command is introduced, so users can add segments of other formats to a carbon table and run queries on them directly.
Example:
alter table test add segment options ('path'='hdfs://usr/oldtable','format'='parquet')
For more usage details, see:
https://github.com/apache/carbondata/blob/master/integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/addsegment/AddSegmentTestCase.scala
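A minimal end-to-end sketch of the workflow (the table name, path, and column are illustrative, not from the test suite):

```sql
-- Assuming an existing carbon table `test` whose schema matches the Parquet data
ALTER TABLE test ADD SEGMENT OPTIONS ('path'='hdfs://usr/oldtable', 'format'='parquet');

-- The added segment participates in queries like any native carbon segment
SELECT count(*) FROM test;

-- Segments of mixed formats can be listed with
SHOW SEGMENTS FOR TABLE test;
```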
9. Hive leverages the index for query performance improvement
Use Case: Hive filter expressions are pushed down to CarbonData so data is filtered at the storage layer, which improves query performance.
Usage/Example: When hive.optimize.index.filter is set to true, Hive expressions are pushed down to CarbonData to filter the data.
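A short sketch of enabling the pushdown in a Hive session (the table and column names are illustrative):

```sql
-- Enable filter pushdown to CarbonData for this session
SET hive.optimize.index.filter=true;

-- With pushdown enabled, the predicate below can be evaluated by CarbonData
-- using its index, instead of being applied by Hive after a full scan
SELECT * FROM hive_carbon_table WHERE stringField = 'abc';
```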
10. Hive Write support
Use Case: CarbonData now supports writing and reading through the Hive execution engine. This is helpful for users who want to try CarbonData without migrating to Spark. Users can also convert their existing Parquet/ORC tables directly to carbon format for ETL purposes.
Example:
CREATE TABLE hive_carbon_table(
  shortField SMALLINT,
  intField INT,
  bigintField BIGINT,
  doubleField DOUBLE,
  stringField STRING,
  timestampField TIMESTAMP,
  decimalField DECIMAL(18,2),
  dateField DATE,
  charField CHAR(5),
  floatField FLOAT
) STORED BY 'org.apache.carbondata.hive.CarbonStorageHandler'
For more usage details, see: https://github.com/apache/carbondata/blob/master/integration/hive/src/test/java/org/apache/carbondata/hive/HiveCarbonTest.java
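A sketch of writing to the table above through Hive (the literal values and the source table `old_parquet_table` are hypothetical):

```sql
-- Insert a row through the Hive execution engine; values follow the
-- column order of hive_carbon_table defined above
INSERT INTO hive_carbon_table
SELECT 1, 10, 100, 1.5, 'name', current_timestamp(), 12.34, current_date(), 'ch', 2.5;

-- Convert an existing parquet table to carbon format in one statement
-- (`old_parquet_table` must have a compatible schema)
INSERT INTO hive_carbon_table SELECT * FROM old_parquet_table;
```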
11. Support prestodb-0.217 and prestosql-316
Use Case: Presto currently has two communities, PrestoDB and PrestoSQL. To serve users of both, CarbonData now supports prestodb-0.217 and prestosql-316.