Core Data Abstractions in Spark:
Spark Core : RDD (unstructured data)
Spark SQL : DataFrame, Dataset (semi-structured and structured data)
Spark Streaming : DStream (streaming applications)
Spark MLlib : Vectors
Spark GraphX : Graph objects
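
Here is a minimal sketch of how some of these objects can be created in Scala. It assumes a local Spark 2.x or 3.x setup with the spark-sql, spark-mllib and spark-graphx modules on the classpath. The Employee case class, the sample values and the object name are hypothetical and used only for illustration. DStream is left out because it needs a StreamingContext and a live source.

```scala
import org.apache.spark.graphx.{Edge, Graph}
import org.apache.spark.ml.linalg.{Vector, Vectors}
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}

// Hypothetical record type, used only for this illustration
case class Employee(id: Long, name: String)

object SparkAbstractionsSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: run locally; replace master with your cluster setting
    val spark = SparkSession.builder()
      .appName("abstractions-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._
    val sc = spark.sparkContext

    // Spark Core : RDD holding plain lines of text (unstructured data)
    val lines: RDD[String] = sc.parallelize(Seq("hadoop in real world", "spark core"))
    val words: RDD[String] = lines.flatMap(_.split(" "))

    // Spark SQL : typed Dataset and its untyped DataFrame view (structured data)
    val ds: Dataset[Employee] = Seq(Employee(1L, "Asha"), Employee(2L, "Ravi")).toDS()
    val df: DataFrame = ds.toDF()

    // Spark MLlib : a dense feature vector
    val features: Vector = Vectors.dense(1.0, 0.0, 3.0)

    // Spark GraphX : a graph built from a vertex RDD and an edge RDD
    val vertices = sc.parallelize(Seq((1L, "Asha"), (2L, "Ravi")))
    val edges = sc.parallelize(Seq(Edge(1L, 2L, "knows")))
    val graph: Graph[String, String] = Graph(vertices, edges)

    println(s"words in RDD        : ${words.count()}")
    df.printSchema()
    println(s"vector size         : ${features.size}")
    println(s"graph vertices/edges: ${graph.vertices.count()}/${graph.edges.count()}")

    spark.stop()
  }
}
```

In spark-shell the SparkSession is already available as spark, so only the lines between the import of spark.implicits._ and spark.stop() are needed there.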