Monday, 10 August 2020

Filtering the Error Log using Spark RDD with Scala

log.txt:
---------
DEBUG | 2017-11-13 21:02:58 | [main] example.Log4jExample (Log4jExample.java:10) - Log4jExample: A Sample Debug Message
INFO  | 2017-11-13 21:02:58 | [main] example.Log4jExample (Log4jExample.java:11) - Log4jExample: A Sample Info  Message
WARN  | 2017-11-13 21:02:58 | [main] example.Log4jExample (Log4jExample.java:12) - Log4jExample: A Sample Warn  Message
ERROR | 2017-11-13 21:02:58 | [main] example.Log4jExample (Log4jExample.java:13) - Log4jExample: A Sample Error Message
FATAL | 2017-11-13 21:02:58 | [main] example.Log4jExample (Log4jExample.java:14) - Log4jExample: A Sample Fatal Message
DEBUG | 2017-11-13 21:02:58 | [main] example.Log4jExample (Log4jExample.java:10) - Log4jExample: A Sample Debug Message
INFO  | 2017-11-13 21:02:58 | [main] example.Log4jExample (Log4jExample.java:11) - Log4jExample: A Sample Info  Message
WARN  | 2017-11-13 21:02:58 | [main] example.Log4jExample (Log4jExample.java:12) - Log4jExample: A Sample Warn  Message
ERROR | 2017-11-13 21:02:58 | [main] example.Log4jExample (Log4jExample.java:13) - Log4jExample: A Sample Error Message
FATAL | 2017-11-13 21:02:58 | [main] example.Log4jExample (Log4jExample.java:14) - Log4jExample: A Sample Fatal Message

// Filter the Error Log
val sc = new SparkContext(conf)
val logRDD = sc.textFile("D:\\Ex\\log.txt")
val filteredRDD = logRDD.filter(x => x.startsWith("ERROR"))
filteredRDD.collect().foreach(println)
 
scala> val logRDD = sc.textFile("D:\\Ex\\log.txt")
logRDD: org.apache.spark.rdd.RDD[String] = D:\Ex\log.txt MapPartitionsRDD[1] at textFile at <console>:24

scala>   val filteredRDD = logRDD.filter(x => x.startsWith("ERROR"))
filteredRDD: org.apache.spark.rdd.RDD[String] = MapPartitionsRDD[2] at filter at <console>:25

scala>   filteredRDD.collect().foreach(println)
ERROR | 2017-11-13 21:02:58 | [main] example.Log4jExample (Log4jExample.java:13) - Log4jExample: A Sample Error Message
ERROR | 2017-11-13 21:02:58 | [main] example.Log4jExample (Log4jExample.java:13) - Log4jExample: A Sample Error Message

No comments:

Post a Comment

Flume - Simple Demo

// create a folder in hdfs : $ hdfs dfs -mkdir /user/flumeExa // Create a shell script which generates : Hadoop in real world <n>...