Automating Data Science Analysis with Flask APIs
Introduction
An API (Application Programming Interface) is a set of rules that allows different software applications to communicate with each other. It enables developers to access specific features or data of an application without exposing its internal code. APIs are widely used to integrate services, share data, and build scalable systems, making them essential in modern software development.
Why Are APIs Essential in Data Science?
APIs are invaluable in data science for integrating and sharing insights seamlessly across systems. They allow data scientists to deploy models, retrieve external data, and provide real-time predictions. Integrating APIs with tools like Excel makes advanced analytics accessible to non-technical users, enabling them to interact with models or datasets directly within spreadsheets for better decision-making.
Let’s explore how it all comes together.
Import Necessary Libraries
from flask import Flask, jsonify
import pandas as pd
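Library Name | Description |
---|---|
Flask | The core class used to create the web application. |
jsonify | A Flask utility that converts Python objects (like dictionaries) into JSON format, suitable for web APIs. |
pandas | The data analysis library used here to read the CSV file and compute the correlation matrix. |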
Initialize the Flask App
__name__ is a special Python variable that is set to "__main__" when the script is run directly. Flask uses this to determine where to find the resources for your app.
app = Flask(__name__)
Define the Home Route
We define a simple home route (/) that will display “Hello, world!” when you access the root URL of the application. This is just to verify that the Flask app is running correctly.
@app.route("/")
def home():
    return "Hello, world!"
@app.route("/") tells Flask to call this function when a user visits the root URL of the app (http://localhost:5039/).
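A route can also be checked without starting a server. The snippet below is a minimal sketch that uses Flask's built-in test client to request the root URL in-process. It rebuilds the same app and home route so that it runs on its own; the variable names status and body are only illustrative.

```python
from flask import Flask

app = Flask(__name__)


@app.route("/")
def home():
    return "Hello, world!"


# test_client() sends requests to the app in-process,
# so no running server or open port is required.
with app.test_client() as client:
    response = client.get("/")
    status = response.status_code
    body = response.get_data(as_text=True)

print(status, body)
```

This kind of check is handy while developing, because it confirms the route is registered before you open a browser.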
Create the Data Endpoint
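@app.route("/data")
def get_data():
    # Load the dataset from the working directory
    Bank_Loan = pd.read_csv('BANK LOAN.csv')
    # Pairwise correlation between the columns
    correlation_matrix = Bank_Loan.corr()
    # One dictionary per matrix row, serialized as JSON
    return jsonify(correlation_matrix.to_dict(orient='records'))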
Description of the Dataset
The BANK LOAN.csv file contains the following columns:
Column Name | Description |
---|---|
SN | Serial number, uniquely identifying each record. |
AGE | Age of the individual. |
EMPLOY | Years of employment. |
ADDRESS | Years at the current address. |
DEBTINC | Debt-to-income ratio. |
CREDDEBT | Credit card debt. |
OTHDEBT | Other forms of debt. |
DEFAULTER | Indicates whether the individual defaulted on a loan (1 for default, 0 for no default). |
- The get_data function is defined as the route handler for the /data endpoint. It reads the BANK LOAN.csv file, calculates its correlation matrix using Pandas, and converts the matrix into a JSON-compatible list of dictionaries (one per row of the matrix) with to_dict(orient='records').
- The jsonify() function then serializes that list into a JSON response, which is returned to the user when they access the endpoint.
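To see what to_dict(orient='records') actually produces, here is a small sketch on a synthetic DataFrame. The values are made up for illustration and are not taken from BANK LOAN.csv; only the column names are borrowed from the dataset. It also shows orient='index' for comparison, which is an alternative the tutorial itself does not use.

```python
import pandas as pd

# Synthetic values for illustration only; not taken from BANK LOAN.csv.
sample = pd.DataFrame({
    "AGE": [25, 35, 45, 55],
    "EMPLOY": [1, 5, 12, 20],
    "DEBTINC": [18.0, 9.5, 12.0, 6.0],
})

correlation_matrix = sample.corr()

# orient="records": a list with one dict per matrix row.
# The row labels (the index) are not included in the output.
records = correlation_matrix.to_dict(orient="records")

# orient="index": a dict keyed by row label, so each row stays identifiable.
by_row = correlation_matrix.to_dict(orient="index")

print(type(records).__name__, len(records))
print(sorted(by_row.keys()))
```

With orient='records', a consumer of the API has to rely on row order to know which variable each row belongs to. If that matters for your use case, a labeled orientation such as 'index' may be worth considering.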
Run the Flask App
if __name__ == '__main__':
    app.run(debug=True, port=5039, use_reloader=False)
* Serving Flask app '__main__'
* Debug mode: on
if __name__ == '__main__': ensures that the app runs only when the script is executed directly, not when it is imported as a module.
app.run(debug=True, port=5039, use_reloader=False): This starts the Flask development server at http://localhost:5039/ with debug mode on (you can choose a different port if needed). Setting use_reloader=False turns off the automatic reloader.
- Access the Home Route: Open a web browser and visit http://localhost:5039/. You should see the message “Hello, world!” displayed.
- Access the Data Endpoint: Visit http://localhost:5039/data. You should see a JSON response containing the correlation matrix of the dataset.
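If you want to check the data endpoint from code rather than a browser, the sketch below exercises an equivalent /data route through Flask's test client. It is an assumption-laden stand-in, not the real output: it writes a small synthetic CSV to a temporary folder (the file name sample_loans.csv and its values are hypothetical) instead of reading BANK LOAN.csv, so the numbers it returns will differ from yours.

```python
import os
import tempfile

import pandas as pd
from flask import Flask, jsonify

# Hypothetical stand-in for BANK LOAN.csv, filled with synthetic values.
tmp_dir = tempfile.mkdtemp()
csv_path = os.path.join(tmp_dir, "sample_loans.csv")
pd.DataFrame({
    "AGE": [25, 35, 45, 55],
    "CREDDEBT": [1.2, 0.4, 2.5, 0.9],
    "DEFAULTER": [1, 0, 1, 0],
}).to_csv(csv_path, index=False)

app = Flask(__name__)


@app.route("/data")
def get_data():
    # Same steps as the tutorial's handler, pointed at the sample file.
    correlation_matrix = pd.read_csv(csv_path).corr()
    return jsonify(correlation_matrix.to_dict(orient="records"))


# Request the endpoint in-process and decode the JSON body.
with app.test_client() as client:
    response = client.get("/data")
    payload = response.get_json()

print(response.status_code, len(payload))
```

The decoded payload is a list with one dictionary per column of the input data, each mapping column names to correlation values.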
Conclusion
Congratulations! You’ve built a simple Flask API that serves a correlation matrix from a CSV file, making it easy to share data insights. In this tutorial, you learned how to set up a Flask app, work with data using Pandas, and return data in JSON format. In the next blog, we’ll show you how to integrate this API with Excel using Macros.