Monday, 3 August 2020

SQOOP Import Problem and Solution #2

#Problem : 2

Connect to a MySQL database using Sqoop and
import all customers who live in the 'CA' state.

Data description:
A MySQL instance is running on the localhost node.
In that instance, you will find customers table that contains
customer's data.

Installation : localhost
Database name : retail_db
Table Name : customers
User Name : root 
Password : cloudera

Output requirement:

Place the customer files in the HDFS directory
"/user/cloudera/problem1/customers_selected/avrodata"

Use the Avro file format with Snappy compression.

Select the columns: customer_id, customer_fname, customer_lname, customer_state.



$ sqoop import \
--connect "jdbc:mysql://localhost/retail_db" \
--username root \
--password cloudera \
--table customers \
--where "customer_state='CA'" \
--target-dir /user/cloudera/problem1/customers_selected/avrodata \
--columns "customer_id,customer_fname,customer_lname,customer_state" \
--compress \
--compression-codec snappy \
--as-avrodatafile
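
The same result can also be expressed as a free-form query import instead of `--table` plus `--where`. This is a hedged sketch, not part of the original solution: with `--query`, Sqoop requires the literal `$CONDITIONS` token in the WHERE clause and, when more than one mapper runs, an explicit `--split-by` column. The target directory name `avrodata_query` below is hypothetical, chosen so it does not collide with the original output path.

```shell
#!/bin/sh
# Sketch only: free-form query variant of the import above.
# Guarded so it is a no-op on machines without Sqoop installed.
if command -v sqoop >/dev/null 2>&1; then
  sqoop import \
    --connect "jdbc:mysql://localhost/retail_db" \
    --username root \
    --password cloudera \
    --query "SELECT customer_id, customer_fname, customer_lname, customer_state \
             FROM customers WHERE customer_state = 'CA' AND \$CONDITIONS" \
    --split-by customer_id \
    --target-dir /user/cloudera/problem1/customers_selected/avrodata_query \
    --compress \
    --compression-codec snappy \
    --as-avrodatafile
else
  echo "sqoop not found; skipping free-form query import"
fi
```

Note that `$CONDITIONS` is escaped (`\$CONDITIONS`) so the shell passes it through to Sqoop, which substitutes split-range predicates into it at runtime.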


$ hdfs dfs -ls /user/cloudera/problem1/customers_selected/avrodata
Found 5 items
-rw-r--r--   1 cloudera cloudera          0 2020-08-03 08:24 /user/cloudera/problem1/customers_selected/avrodata/_SUCCESS
-rw-r--r--   1 cloudera cloudera       6508 2020-08-03 08:24 /user/cloudera/problem1/customers_selected/avrodata/part-m-00000.avro
-rw-r--r--   1 cloudera cloudera       6427 2020-08-03 08:24 /user/cloudera/problem1/customers_selected/avrodata/part-m-00001.avro
-rw-r--r--   1 cloudera cloudera       6419 2020-08-03 08:24 /user/cloudera/problem1/customers_selected/avrodata/part-m-00002.avro
-rw-r--r--   1 cloudera cloudera       7093 2020-08-03 08:24 /user/cloudera/problem1/customers_selected/avrodata/part-m-00003.avro
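
Four part files (one per default mapper) plus `_SUCCESS` is the expected layout. As a sanity check, one could compare the imported row count against the source table with `sqoop eval`; this is a sketch assuming the same MySQL credentials as the import above.

```shell
#!/bin/sh
# Sketch: run the filter query against MySQL itself and eyeball
# that the count matches the records landed in HDFS.
if command -v sqoop >/dev/null 2>&1; then
  sqoop eval \
    --connect "jdbc:mysql://localhost/retail_db" \
    --username root \
    --password cloudera \
    --query "SELECT COUNT(*) FROM customers WHERE customer_state = 'CA'"
else
  echo "sqoop not found; skipping row-count check"
fi
```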


$ avro-tools tojson hdfs://localhost/user/cloudera/problem1/customers_selected/avrodata/part-m-00003.avro | head
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
{"customer_id":{"int":9328},"customer_fname":{"string":"Mary"},"customer_lname":{"string":"Perez"},"customer_state":{"string":"CA"}}
{"customer_id":{"int":9333},"customer_fname":{"string":"Angela"},"customer_lname":{"string":"Mills"},"customer_state":{"string":"CA"}}
{"customer_id":{"int":9336},"customer_fname":{"string":"Janice"},"customer_lname":{"string":"Guzman"},"customer_state":{"string":"CA"}}
{"customer_id":{"int":9338},"customer_fname":{"string":"James"},"customer_lname":{"string":"Davis"},"customer_state":{"string":"CA"}}
{"customer_id":{"int":9340},"customer_fname":{"string":"Mary"},"customer_lname":{"string":"Smith"},"customer_state":{"string":"CA"}}
{"customer_id":{"int":9347},"customer_fname":{"string":"Mary"},"customer_lname":{"string":"Fuentes"},"customer_state":{"string":"CA"}}
{"customer_id":{"int":9352},"customer_fname":{"string":"Mary"},"customer_lname":{"string":"Lewis"},"customer_state":{"string":"CA"}}
{"customer_id":{"int":9357},"customer_fname":{"string":"Katherine"},"customer_lname":{"string":"Spence"},"customer_state":{"string":"CA"}}
{"customer_id":{"int":9363},"customer_fname":{"string":"Mary"},"customer_lname":{"string":"Simmons"},"customer_state":{"string":"CA"}}
{"customer_id":{"int":9375},"customer_fname":{"string":"George"},"customer_lname":{"string":"Smith"},"customer_state":{"string":"CA"}}
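
The JSON dump confirms the selected columns and the 'CA' filter, but not the compression. To verify that Snappy was actually applied, one can read the Avro file header with `avro-tools getmeta`, which prints the file metadata including the `avro.codec` entry; this is a sketch assuming the same environment as the session above.

```shell
#!/bin/sh
# Sketch: inspect the Avro container's metadata; a Snappy-compressed
# file should report "avro.codec snappy" alongside avro.schema.
if command -v avro-tools >/dev/null 2>&1; then
  avro-tools getmeta \
    hdfs://localhost/user/cloudera/problem1/customers_selected/avrodata/part-m-00000.avro
else
  echo "avro-tools not found; skipping codec check"
fi
```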

