Sunday, 19 July 2020

How to read a multi-sheet Excel file using Python?

$ pip3 install pandas    # install pandas
$ pip3 install openpyxl  # Excel reading package for .xlsx files (xlrd 2.0+ reads only .xls)

>>> import pandas as pd
>>> xls = pd.ExcelFile('/home/hadoop/Downloads/multisheets.xlsx')  # open the Excel file
>>> df1 = pd.read_excel(xls, 'first')  # read a particular sheet by name
>>> df2 = pd.read_excel(xls, 'second')
>>> df3 = pd.read_excel(xls, 'third')

>>> print(df1)  # display the content of the pandas DataFrame
    Ravi  22    Chennai
    Rahul  23  Bengaluru

>>> print(df2)
  selvi  32 Chennai
  Usha  38  Singai

>>> print(df3)
  Ayush   5   Bengaluru
  Ram  50  Aranthangi
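
If the sheet names are not known in advance, all sheets can also be read in one call. The sketch below is only an illustration: it writes a throwaway workbook with the same three sheets so that it runs without the original file, and the column names (name, age, city) and the temporary path are assumptions, since the sheets above show no header row.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data mirroring the three sheets shown above.
# The column names are assumptions; the original sheets show none.
columns = ["name", "age", "city"]
sheets = {
    "first": pd.DataFrame(
        [["Ravi", 22, "Chennai"], ["Rahul", 23, "Bengaluru"]], columns=columns
    ),
    "second": pd.DataFrame(
        [["selvi", 32, "Chennai"], ["Usha", 38, "Singai"]], columns=columns
    ),
    "third": pd.DataFrame(
        [["Ayush", 5, "Bengaluru"], ["Ram", 50, "Aranthangi"]], columns=columns
    ),
}

# Write a throwaway workbook so the example does not need the original file.
path = os.path.join(tempfile.mkdtemp(), "multisheets.xlsx")
with pd.ExcelWriter(path) as writer:
    for sheet_name, frame in sheets.items():
        frame.to_excel(writer, sheet_name=sheet_name, index=False)

# List the sheet names without loading any data.
with pd.ExcelFile(path) as xls:
    sheet_names = xls.sheet_names

# sheet_name=None asks read_excel for every sheet at once;
# the result is a dict mapping sheet name -> DataFrame.
all_sheets = pd.read_excel(path, sheet_name=None)

for sheet_name, frame in all_sheets.items():
    print(sheet_name)
    print(frame)
```

For the real file, replace path with '/home/hadoop/Downloads/multisheets.xlsx' and skip the writing step.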


The content of the multi-sheet Excel file is given below:

Flume - Simple Demo

# Create a folder in HDFS:
$ hdfs dfs -mkdir /user/flumeExa

# Create a shell script which generates: Hadoop in real world <n>...