Monday, 10 December 2018

MySQL to HDFS : SQOOP Import Examples

MYSQL

To log on to MySQL:
mysql -pcloudera
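If you are not already running as root, the username may need to be passed explicitly (assuming the same root/cloudera credentials used by the Sqoop commands later in this post):
mysql -u root -pcloudera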

create database my1st;
use my1st;

create table emp (id int, name varchar(50), deg varchar(50), salary int, dept varchar(50));

describe emp;

insert into emp values (1201,'Gopal','Manager',50000,'TP');
insert into emp values (1202, 'Manisha', 'Proof Reader', 50000,'TP'),(1203,'Khalil', 'PHP dev', 30000,'AC'), (1204, 'Prasanth', 'PHP dev', 30000,'AC'), (1205,'Raja','Admin',20000,'TP');

select * from emp;

create table emp_add (id int, hno varchar(10), street varchar(50), city varchar(50));
describe emp_add;

insert into emp_add values (1201,'288A','RajaRajeswariNagar','Bangalore');


insert into emp_add values(1202, '108I','Kaggadasapura','Bangalore'), (1203, '144Z','CV Raman Nagar','Bangalore'),(1204,'78B','Udhayanagar','Bangalore'),(1205,'720X','Dooravani Nagar','Bangalore');

create table emp_contact(id int, phno varchar(50), email varchar(50));

describe emp_contact;

insert into emp_contact values(1201,'11111111','gopal@tp.com'), (1202, '222222222','manisha@tp.com'),(1203,'3333333','khalil@ac.com'),(1204,'44444444','prasanth@ac.com'),(1205,'55555555','raja@tp.com');


Connect to the my1st database and display all of its tables:

sqoop list-tables --connect jdbc:mysql://localhost/my1st --username=root    --password=cloudera
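Assuming the three tables above were created successfully, the command should list something like:
emp
emp_add
emp_contact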

Import from MySQL:
emp, emp_add, emp_contact

Import a complete table (with its data) into the default HDFS folder:
sqoop import --connect jdbc:mysql://localhost/my1st --username=root    --password=cloudera --table emp -m 1
sqoop import --connect jdbc:mysql://localhost/my1st --username=root    --password=cloudera --table emp_add -m 1
sqoop import --connect jdbc:mysql://localhost/my1st --username=root    --password=cloudera --table emp_contact -m 1
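The -m 1 flag forces a single mapper, which produces one output file per table. As a sketch only (not part of the original demo), the same import could be split across multiple mappers by choosing a split column, for example:
sqoop import --connect jdbc:mysql://localhost/my1st --username=root --password=cloudera --table emp --split-by id -m 2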

All imported data will be placed under /user/root/; list it with:
hdfs dfs -ls /user/root/
/user/root/emp/
/user/root/emp_add
/user/root/emp_contact

To view the imported data in the HDFS folders:
hdfs dfs -cat /user/root/emp/p*
hdfs dfs -cat /user/root/emp_add/p*
hdfs dfs -cat /user/root/emp_contact/p*
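By default Sqoop writes comma-delimited text files, so the emp output should look roughly like this (one line per row inserted above):
1201,Gopal,Manager,50000,TP
1202,Manisha,Proof Reader,50000,TP
1203,Khalil,PHP dev,30000,AC
1204,Prasanth,PHP dev,30000,AC
1205,Raja,Admin,20000,TP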


Set Target Directory:
The imported data will be placed in the /qr folder instead of the default /user/root location:
sqoop import --connect jdbc:mysql://localhost/my1st --username=root    --password=cloudera --table emp_contact -m 1 --target-dir /qr
hdfs dfs -ls /qr
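A single-mapper import typically produces a _SUCCESS marker and one part file, so the listing should show something close to:
/qr/_SUCCESS
/qr/part-m-00000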

Import a subset of the table data (partial data, using a where condition):
--where "email='gopal@tp.com'" --target-dir /wherequery
sqoop import --connect jdbc:mysql://localhost/my1st --username=root    --password=cloudera --table emp_contact -m 1 --where "email='gopal@tp.com'" --target-dir /wherequery

hdfs dfs -cat /wherequery/p*
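Only the matching row should appear, roughly:
1201,11111111,gopal@tp.com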


Incremental import:

mysql -pcloudera
use my1st;
mysql> select * from emp;
+------+----------+--------------+--------+------+
| id   | name     | deg          | salary | dept |
+------+----------+--------------+--------+------+
| 1201 | Gopal    | Manager      |  50000 | TP   |
| 1202 | Manisha  | Proof Reader |  50000 | TP   |
| 1203 | Khalil   | PHP dev      |  30000 | AC   |
| 1204 | Prasanth | PHP dev      |  30000 | AC   |
| 1205 | Raja     | Admin        |  20000 | TP   |
+------+----------+--------------+--------+------+

insert into emp values(1206,'Satish','team leader',20000,'GR');

mysql> select * from emp;
+------+----------+--------------+--------+------+
| id   | name     | deg          | salary | dept |
+------+----------+--------------+--------+------+
| 1201 | Gopal    | Manager      |  50000 | TP   |
| 1202 | Manisha  | Proof Reader |  50000 | TP   |
| 1203 | Khalil   | PHP dev      |  30000 | AC   |
| 1204 | Prasanth | PHP dev      |  30000 | AC   |
| 1205 | Raja     | Admin        |  20000 | TP   |
| 1206 | Satish   | team leader  |  20000 | GR   |
+------+----------+--------------+--------+------+

Record 1206 is the newly created row in the MySQL table named emp, in the database named my1st.

Incremental import options:
--incremental <mode>
--check-column <column name>
--last-value <last check-column value>

Rows with id greater than 1205 will be appended:
sqoop import --connect jdbc:mysql://localhost/my1st --username=root    --password=cloudera --table emp -m 1 --incremental append --check-column id --last-value 1205
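Rather than tracking the last value by hand, the same incremental import can be saved as a Sqoop job, which records the last-value state between runs. A minimal sketch (the job name emp_incr is an arbitrary choice, not from the original demo):
sqoop job --create emp_incr -- import --connect jdbc:mysql://localhost/my1st --username=root --password=cloudera --table emp -m 1 --incremental append --check-column id --last-value 1205
sqoop job --exec emp_incr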

Display the contents imported from the emp table:
hdfs dfs -cat /user/root/emp/p*
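The appended record lands in a new part file in the same directory, so catting all the part files should now also show the new row after the five previously imported ones, roughly:
1206,Satish,team leader,20000,GR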

Delete a folder/directory that is not empty:
hdfs dfs -rm -r /user/root/emp
hdfs dfs -rm -r /user/root/emp_add
hdfs dfs -rm -r /user/root/emp_contact
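If HDFS trash is enabled on the cluster, the deleted directories are only moved to the user's .Trash folder; they can be removed immediately by adding -skipTrash, for example:
hdfs dfs -rm -r -skipTrash /user/root/emp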
