Working with Spreadsheets using Python (Part 2) – Hacker Noon

If you are here reading this article then you have come a long way in working with spreadsheets using Python. But if you are here directly then I would recommend you to go over part 1 of this article which will give you insight from the beginning on this topic.

Here we will discuss use of the Python package openpyxl for creating spreadsheets in the .xlsx format. As I discussed in part 1, that if you have lots of data to work with and many clients to report that data to, then knowing your way around both .xls and .xlsx (older and newer Microsoft Excel versions respectively) formats is a great skill that can come in handy lots of times.

The package openpyxl is a Python library for reading and writing Excel 2010 xlsx/xlsm/xltx/xltm files.

Yes, you read that right you can read, write and append data using only one single package. You can install this package using the following command on terminal or command prompt (Prerequisite: pip must be installed on you system. For install instructions of pip visit this link).

pip install openpyxl

After the installation is complete, open the text editor to create a new file and give it any name you like; I am giving it the name excelScript.py .

Inside the script write the following code:

import openpyxl

def main():
book = openpyxl.Workbook()
book.create_sheet('Sample Sheet')
book.save('Sample.xlsx')

if __name__ == '__main__':
main()

The code above creates a blank Excel file with the name “Sample.xlsx” , containing a single sheet named “Sheet 1”. This was an example of .xlsx file creation. Run this script by executing

python excelScript.py

on your command line. When you open this .xlsx file you will see that instead of one there are two sheets created in the file. This happens because when you make an object of the Workbook() method of the openpyxl package, it creates a sheet called “Sheet” by default on calling that method.

If the sheet names don’t matter to your work then you can start using this sheet directly and you don’t need to create another sheet. If your work requires only the use of one sheet and you can start writing data directly to this sheet.

Later, I will be showing you how to remove that default sheet Sheet from your .xlsx file. To write data to Excel change the code to the following:

import openpyxl

def main():
book = openpyxl.Workbook()
book.create_sheet('Sample')
    # Acquire a sheet by its name
sheet = book.get_sheet_by_name('Sample')
    # Writing to sheet
sheet.cell(row=1, column=1).value = 'sample'
    book.save('Sample.xlsx')

if __name__ == '__main__':
main()

The code above will write “sample” on the 1st row and the 1st column. Keep in mind that openpyxl reads row and column indices starting from one. This way you can write into your Excel file and create as many sheets you like. Run the script above using

python excelScript.py

in your command line. You can give it styling also.

The following code shows you how to do so:

import openpyxl
def main():
book = openpyxl.Workbook()
book.create_sheet('Sample')
    # Acquire a sheet by its name
sheet = book.get_sheet_by_name('Sample')
    # Writing to sheet
sheet.cell(row=1, column=1).value = 'sample'
    # To set alignment of text inside cell and text wrapping
sheet.cell(row=1, column=1).alignment = openpyxl.styles.Alignment(horizontal='center', vertical='center', wrap_text=True)
    # To make font bold or italic
sheet.cell(row=1, column=1).font = openpyxl.styles.Font(bold=True, italic=True)
    book.save('Sample.xlsx')
if __name__ == '__main__':
main()

This will write “sample” on the 1st row and 1st column. It will be both horizontally and vertically center aligned. For information on more methods provided by openpyxl you can read the documentation here.

You can also merge cells, just like we did using xlwt , in openpyxl. The following code shows how to achieve merged cells using openpyxl:

import openpyxl
def main():
book = openpyxl.Workbook()
book.create_sheet('Sample')
    # Acquire a sheet by its name
sheet = book.get_sheet_by_name('Sample')
    # Merging first 3 columns of 1st row
sheet.merge_cells('A1:C1')
    # Writing to sheet
sheet.cell(row=1, column=1).value = 'sample'
    book.save('Sample.xlsx')
if __name__ == '__main__':
main()

The code above demonstrates how to merge cells when you know the exact designations of the cells to merge. I knew the designation of the 1st column of the 1st row which is A1 , and the designation of the 3rd column of the 1st row which is C1. In this case, I merged them directly using the merge_cells method of openpyxl.

This was one of the two methods to merge cells using openpyxl. Here is the demonstration of the 2nd method:

import openpyxl
def main():
book = openpyxl.Workbook()
book.create_sheet('Sample')
    # Acquire a sheet by its name
sheet = book.get_sheet_by_name('Sample')
    # Merging first 3 columns of 1st 3 rows
r1 = 1
r2 = 3
c1 = 1
c2 = 3
sheet.merge_cells(start_row=r1, start_column=c1, end_row=r2, end_column=c2)
    # Writing to sheet
sheet.cell(row=1, column=1).value = 'sample'
book.save('Sample.xlsx')
if __name__ == '__main__':
main()

This method comes in handy when data is written dynamically to the Excel file. If you are not sure of the exact designations of the cells to merge then you can use this method. This will merge the 1st three columns of 1st three rows and make it into a single cell.

When merging cells always keep in mind that you need to write the value into the first column of the merged cell otherwise the value will not reflect in the excel file. You can still apply styling to a merged cell in the same way demonstrated.

Now coming to the part of deleting that extra sheet that is created by default on calling the Workbook() method of openpyxl. This is how it can be done:

import openpyxl

def main():
book = openpyxl.Workbook()
    book.create_sheet('Sample')
    extraSheet = book.get_sheet_by_name('Sheet')
book.remove_sheet(extraSheet)
    book.save('Sample.xlsx')

if __name__ == '__main__':
main()

Here we acquired the sheet named “Sheet” in object form using get_sheet_by_name() method and stored in the extraSheet variable and then removed the sheet by calling the remove_sheet() method on the extraSheet variable.

This was all about creating a new .xlsx file, writing to it, merging cells, and styling it. Now, you might be wondering how to read and append data into .xlsx file using openpyxl.

First of all let’s look at reading an .xlsx file using openpyxl. Just as writing, you can use openpyxl.load_workbook() to open an existing workbook:

import openpyxl
def main():
book = openpyxl.load_workbook('Sample.xlsx')
print book.get_sheet_names()
# ['Sheet2', 'New Title', 'Sheet1']
    # Get a sheet to read
sheet = book.get_sheet_by_name('Sheet1')
    # No of written Rows in sheet
r = sheet.max_row
    # No of written Columns in sheet
c = sheet.max_column
    # Reading each cell in excel
for i in xrange(1, r+1):
for j in xrange(1, c+1):
print sheet.cell(row=i, column=j).value

if __name__ == '__main__':
main()

The code above will print all the written cells in the Excel file. The for loop starts from 1 instead of 0 because openpyxl starts indexing from 1.

You can also read cells using the designations of the cells:

# prints 1st row, 1st column directly
print sheet['A1']
# get a range of cells
cells = sheet['A1:C1']
for cell in cells[0]:
print cell.value

I selected cells[0] because sheet['A1:C1'] creates a tuple of tuple and all the cell objects are stored at oth index of the tuple hence, cells[0].

Finally, let’s see how to append data to .xlsx file. The following code demonstrates this process:

import openpyxl

def main():
book = openpyxl.load_workbook('Sample.xlsx')
sheet = book.get_sheet_by_name('Sample')
    sheet.cell(row=5, column=1).value = 'Appended Data'
    book.save('Sample.xlsx')

if __name__ == '__main__':
main()

The process is very simple indeed. I just loaded the existing .xlsx file into an object and saved it into book variable by calling the load_workbook() method on “Sample.xlsx” file. I then grabbed the sheet “Sample” by calling the get_sheet_by_name() method on the book object.

Now, I can simply start appending data to the sheet in the same way when creating a new file. All those methods that are available at the time of creation of file are available now.

Finally, when you are done writing data to the file don’t forget to save the file by calling the save() method.

Additionally, don’t forget to use a code beautifier when coding and try to follow PEP8 standards to make the code more readable. Use PyLint before going live. This will help you maintain a coding standard in your project.

The journey does not end here. This was just the basics of how to read, write and append data to .xls or .xlsx files. You should explore these packages more and make your algorithms according to your requirements depending on the form in which you want your data to be represented inside Excel.

Thanks for reading!

read original article here