Sankara's Big Data Notes: Partition and Bucketing in Hive

Monday, 10 December 2018

Partition and Bucketing in Hive

CREATE TABLE SALES_TRANSACTION
(
transaction_id string,
region_id string,
product_id string,
amount double,
t_date timestamp
)
partitioned by (region string) row format delimited fields terminated by ',' stored as textfile;

load data local inpath '/root/sales_transaction_california.csv' into table sales_transaction partition (region='California');
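Each partition value is stored in its own directory under the table location, so a filter on the partition column lets Hive read only the matching directory. A minimal sketch, assuming the table and the 'California' partition loaded above:

```sql
-- List the partitions Hive has registered for this table.
SHOW PARTITIONS sales_transaction;

-- Filtering on the partition column (region) lets Hive skip all other partitions.
SELECT product_id, SUM(amount) AS total_amount
FROM sales_transaction
WHERE region = 'California'
GROUP BY product_id;
```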

create table sales_transaction2
(
transaction_id string,
product_id string,
t_date timestamp,
amount double
)
partitioned by (country string) clustered by (product_id) into 10 buckets;
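A bucketed table is normally populated with INSERT ... SELECT rather than LOAD DATA, so that Hive can hash product_id and route each row to one of the 10 bucket files. The sketch below assumes the sales_transaction table above as the source; the partition value 'USA' is illustrative and does not come from the original notes.

```sql
-- Hive 1.x only: uncomment so that inserts honour the bucket definition.
-- Hive 2.0 and later always enforce bucketing.
-- SET hive.enforce.bucketing = true;

-- Static partition insert; 'USA' is an illustrative value.
INSERT INTO TABLE sales_transaction2 PARTITION (country = 'USA')
SELECT transaction_id, product_id, t_date, amount
FROM sales_transaction
WHERE region = 'California';

-- Sample a single bucket (one of the 10) instead of scanning the whole partition.
SELECT *
FROM sales_transaction2 TABLESAMPLE (BUCKET 1 OUT OF 10 ON product_id)
WHERE country = 'USA';
```

Bucketing on product_id can also help joins and sampling on that column, since rows with the same product_id always land in the same bucket.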

Hive SQL Data Types:

Numeric Types:
TINYINT, SMALLINT, INT, BIGINT, FLOAT, DOUBLE, DECIMAL

Date / Time Types:
Timestamp, Date

String Types:
String, Varchar, Char

Misc Types:
Boolean, Binary

Complex Data Types:
Arrays (ARRAY), Maps (MAP), Structs (STRUCT), Unions (UNIONTYPE)
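As a rough illustration of the complex types, the sketch below declares an array, a map, and a struct column and reads one element from each. The customer_profile table, its columns, and the delimiters are hypothetical and chosen only for this example.

```sql
-- customer_profile and its columns are hypothetical, for illustration only.
CREATE TABLE customer_profile
(
customer_id string,
phone_numbers array<string>,
preferences map<string, string>,
address struct<city:string, zip:string>
)
row format delimited
fields terminated by ','
collection items terminated by '|'
map keys terminated by ':'
stored as textfile;

-- Arrays are indexed from 0, maps are read by key, structs with dot notation.
SELECT customer_id,
       phone_numbers[0],
       preferences['language'],
       address.city
FROM customer_profile;
```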

use <db>;
show databases;
show tables;
describe <table>;
create database <db>;
drop database <db>;
