Flask Python Tutorial: Upload CSV file and Insert rows into MySQL Database
Table of Content
Python is a cool language, soon as you finish your first projects with it, you will be hocked.
It's a general purpose language and it's gaining popularity for the recent years among data scientists and data engineers.
Personally, I recommend Python to newbies especially teens, not just because it has a simple direct learning curve, but also because it has a large community as well as a rich ecosystem.
About this tutorial
I was helping a friend through how to solve problems by dissecting them into small tiny problems and solve them one-by-one, So here is the code assignment as the main problem we are trying to solve.
The Problem:
Use Python to upload and import CSV file into a pre-defined table in MySQL database.
However, there is a cool CSV package that you can use in a similar manner, it is called CleverCSV.
Steps
- Create file upload form
- Upload the CSV using Flask
- Parse CSV file data
- Connect to the database
- Insert rows into a specific table in the database
Stack we will use:
- Flask;
A Python Minimal (Micro) web frameworks. - MySQL;
If you use Windows you can use WAMP or XAMPP, if you are using Linux XAMPP is a good choice. If you are using macOS, XAMPP or MAMP are good solutions. - Pandas;
Pandas is an open-source python library for data analytics and manipulation. It's widely used among data scientists and data engineers. - mysql.connector;
The library that we will use to connect to the MySQL database and insert the data.
Requirements
- Install Flask;
Here I will install Python directly withpip
, but I strongly recommend usingpython-virtualenv
for best practice.
pip install Flask
- Install XAMPP;
The XAMPP package will include MySQL database and PHPMyAdmin to manage our MySQL databases. - Install the required libraries (Pandas, sql.connector)
pip install pandas
pip install sql.connector
- A sample CSV file, I will be using
addresses.csv
from here.
0 Get Flask ready
Now let's test if Flask works by starting to create our main file main.py
from flask import Flask
app = Flask(__name__)
@app.route("/")
def hello():
return "Flask CSV Filer Uploader and Parser"
if (__name__ == "__main__"):
app.run(port = 5000)
We can run the file by $ python main.py
then heading tohttp://127.0.0.1:5000/
.
If everything goes as planned, you will get the rendered message.
Next step is to create the HTML page which will be responsible for uploading the CSV files.
1- Create file upload form
Let's create templates
directory and our HTML upload form in it.
<!doctype html>
<html>
<head>
<title>FLASK CSV File Upload</title>
</head>
<body>
<h1>Upload your CSV file</h1>
<form method="POST" action="" enctype="multipart/form-data">
<p><input type="file" name="file"></p>
<p><input type="submit" value="Submit"></p>
</form>
</body>
</html>
2- Upload the files to the server
Now we need to accept the incoming files from this form, let's add the reasonable function in main.py
from flask import Flask, render_template, request, redirect, url_for
import os
from os.path import join, dirname, realpath
app = Flask(__name__)
# enable debugging mode
app.config["DEBUG"] = True
# Upload folder
UPLOAD_FOLDER = 'static/files'
app.config['UPLOAD_FOLDER'] = UPLOAD_FOLDER
# Root URL
@app.route('/')
def index():
# Set The upload HTML template '\templates\index.html'
return render_template('index.html')
# Get the uploaded files
@app.route("/", methods=['POST'])
def uploadFiles():
# get the uploaded file
uploaded_file = request.files['file']
if uploaded_file.filename != '':
file_path = os.path.join(app.config['UPLOAD_FOLDER'], uploaded_file.filename)
# set the file path
uploaded_file.save(file_path)
# save the file
return redirect(url_for('index'))
if (__name__ == "__main__"):
app.run(port = 5000)
As everything is set, we can test the file upload now at http://127.0.0.1:5000/
If everything is ok, you can see the uploaded files at /static/files/
3- Parse CSV file
To parse the CSV file data into rows, we will use pandas
the data analysis and manipulation library.
- Import the library
import pandas as pd
Parse CSV function
def parseCSV(filePath):
# CVS Column Names
col_names = ['first_name','last_name','address', 'street', 'state' , 'zip']
# Use Pandas to parse the CSV file
csvData = pd.read_csv(filePath,names=col_names, header=None)
# Loop through the Rows
for i,row in csvData.iterrows():
print(i,row['first_name'],row['last_name'],row['address'],row['street'],row['state'],row['zip'],)
Don't forget to add parseCSV(file_path)
function soon after the file is saved in uploadFiles
# Output
0 John Doe 120 jefferson st. Riverside NJ 8075
1 Jack McGinnis 220 hobo Av. Phila PA 9119
2 John "Da Man" Repici 120 Jefferson St. Riverside NJ 8075
3 Stephen Tyler 7452 Terrace "At the Plaza" road SomeTown SD 91234
4 nan Blankman nan SomeTown SD 298
5 Joan "the bone", Anne Jet 9th, at Terrace plc Desert City CO 123
4- Connect to the database (XAMPP/ MySQL)
- In this step we will connect to the database and make sure it works.
- Add
import mysql.connector
- Connect to the MySQL Server
import mysql.connector
mydb = mysql.connector.connect(
host="localhost",
user="root",
password="",
database="databaseName"
)
Make sure to replace the conditionals right.
- To test the Database Connection, let's list all available databases:
mycursor = mydb.cursor()
mycursor.execute("SHOW DATABASES")
# List All Databases
for x in mycursor:
print(x)
Create the database and the table you want to insert the rows into it with the correspondent column names.
5- Insert rows into the database
With this step we will just update our loop to insert every row into the database
def parseCSV(filePath):
# CVS Column Names
col_names = ['first_name','last_name','address', 'street', 'state' , 'zip']
# Use Pandas to parse the CSV file
csvData = pd.read_csv(filePath,names=col_names, header=None)
# Loop through the Rows
for i,row in csvData.iterrows():
sql = "INSERT INTO addresses (first_name, last_name, address, street, state, zip) VALUES (%s, %s, %s, %s, %s, %s)"
value = (row['first_name'],row['last_name'],row['address'],row['street'],row['state'],str(row['zip']))
mycursor.execute(sql, value, if_exists='append')
mydb.commit()
print(i,row['first_name'],row['last_name'],row['address'],row['street'],row['state'],row['zip'])
As everything goes as planned, check the new data entries at the database.
Troubleshooting and practice
As we are concluded this tutorial, here are some Troubleshooting and ideas to carry on and practice
NaN
; don't sendNaN
values to the database- Use Secure Form for best practice
- Create and insert table/ columns based on CSV structure
- Insert new values only
- Set upload options: accept only CSV files, limit file size
- Upload multiple files
- Move finished files to
done
directly as soon as they are inserted - Better error management
- List all done and uploaded files in
index.html
- View CSV table in
index.html
- Convert the
csv
intojson
Full code
main.py
from flask import Flask, render_template, request, redirect, url_for
import os
from os.path import join, dirname, realpath
import pandas as pd
import mysql.connector
app = Flask(__name__)
# enable debugging mode
app.config["DEBUG"] = True
# Upload folder
UPLOAD_FOLDER = 'static/files'
app.config['UPLOAD_FOLDER'] = UPLOAD_FOLDER
# Database
mydb = mysql.connector.connect(
host="localhost",
user="root",
password="",
database="csvdata"
)
mycursor = mydb.cursor()
mycursor.execute("SHOW DATABASES")
# View All Database
for x in mycursor:
print(x)
# Root URL
@app.route('/')
def index():
# Set The upload HTML template '\templates\index.html'
return render_template('index.html')
# Get the uploaded files
@app.route("/", methods=['POST'])
def uploadFiles():
# get the uploaded file
uploaded_file = request.files['file']
if uploaded_file.filename != '':
file_path = os.path.join(app.config['UPLOAD_FOLDER'], uploaded_file.filename)
# set the file path
uploaded_file.save(file_path)
parseCSV(file_path)
# save the file
return redirect(url_for('index'))
def parseCSV(filePath):
# CVS Column Names
col_names = ['first_name','last_name','address', 'street', 'state' , 'zip']
# Use Pandas to parse the CSV file
csvData = pd.read_csv(filePath,names=col_names, header=None)
# Loop through the Rows
for i,row in csvData.iterrows():
sql = "INSERT INTO addresses (first_name, last_name, address, street, state, zip) VALUES (%s, %s, %s, %s, %s, %s)"
value = (row['first_name'],row['last_name'],row['address'],row['street'],row['state'],str(row['zip']))
mycursor.execute(sql, value, if_exists='append')
mydb.commit()
print(i,row['first_name'],row['last_name'],row['address'],row['street'],row['state'],row['zip'])
if (__name__ == "__main__"):
app.run(port = 5000)
If you like this quick tutorial, please share it and if you have any comment please let us know.
Resources
Photo by Hitesh Choudhary on Unsplash