Monday, 10 December 2018

Partition and Bucketing in Hive

CREATE TABLE SALES_TRANSACTION
(
transaction_id string,
region_id string,
product_id string,
amount double,
t_date timestamp
)
partitioned by (region string) row format delimited fields terminated by ',' stored as textfile;

load data local inpath '/root/sales_transaction_california.csv' into table sales_transaction partition (region='California');
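Once the file is loaded, a quick way to check the partition is to list it and then query it with a filter on the partition column. This is a minimal sketch that uses only the table and partition value from the statements above. The aggregate is just an illustration, not part of the original example.

```sql
-- List the partitions Hive has registered for the table.
show partitions sales_transaction;

-- Filtering on the partition column (region) lets Hive read only the
-- files for that partition instead of scanning the whole table.
select product_id, sum(amount) as total_amount
from sales_transaction
where region = 'California'
group by product_id;
```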



create table sales_transaction2
(
transaction_id string,
product_id string,
t_date timestamp,
amount double
)
partitioned by (country string) clustered by (product_id) into 10 buckets;
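A bucketed table is normally filled with an insert ... select rather than load data, because load data only moves files and does not hash rows into buckets. The sketch below assumes the sales_transaction table created earlier as the source. The partition value 'USA' is a hypothetical choice made for illustration.

```sql
-- On Hive 1.x only, run this first: set hive.enforce.bucketing = true;
-- (Hive 2.0 and later always enforce bucketing on insert.)

-- Rows are hashed on product_id into the 10 buckets of the target partition.
insert into table sales_transaction2 partition (country = 'USA')
select transaction_id, product_id, t_date, amount
from sales_transaction
where region = 'California';

-- Sample a single bucket instead of reading the whole partition.
select *
from sales_transaction2 tablesample (bucket 1 out of 10 on product_id) s
where country = 'USA';
```

Bucketing on product_id keeps all rows for a given product in the same bucket file, which helps sampling and joins on that column.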

Hive SQL Data Types:

Numeric Types:
TINYINT, SMALLINT, INT, BIGINT, FLOAT, DOUBLE, DECIMAL

Date / Time Types:
Timestamp, Date

String Types:
String, Varchar, Char

Misc Types:
Boolean, Binary

Complex Data Types:
ARRAY, MAP, STRUCT, UNIONTYPE
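As an illustration of the complex types, here is a small sketch of a table that uses an array, a map, and a struct, and a query that reads one element from each. The table name customer_profile_example, its columns, and the delimiters are all hypothetical.

```sql
create table customer_profile_example
(
customer_id string,
phone_numbers array<string>,
preferences map<string,string>,
address struct<street:string, city:string, zip:string>
)
row format delimited
fields terminated by ','
collection items terminated by '|'
map keys terminated by ':'
stored as textfile;

-- Arrays are indexed from 0, maps are read by key, struct fields by dot notation.
select customer_id,
       phone_numbers[0],
       preferences['language'],
       address.city
from customer_profile_example;
```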


use <db>;
show databases;
show tables;
describe <table>;
create database <db>;
drop database <db>;

