Data gathering service using Flask: Structure and Data

As a big fan of World of Warcraft, I always wanted to do a little side project on auction house price prediction. When searching the web, I didn’t find any useful data source, so I decided to create one myself. With the help of the TradeSkillMaster API and some Flask code, I started gathering the data I need in a MySQL database… and today I will show you how to do something similar for your own needs.

This blog post, due to the amount of information, comes in a series:

TLDR: The full wow-data-gather project code is stored here, if you want to skip to the end.

Why Flask?

First of all, I don’t need anything fancy here: just a service that runs a function which connects to a remote API, collects the data and stores it in a database. This is supposed to be lightweight, so a heavier framework such as Django would be overkill. Secondly, I wanted to build it fast and make it maintenance-free. As you will soon see, very little code is required to run it.

And what exactly is Flask?

As taken from the official website:

Flask is a lightweight WSGI web application framework. It is designed to make getting started quick and easy, with the ability to scale up to complex applications. It began as a simple wrapper around Werkzeug and Jinja and has become one of the most popular Python web application frameworks. Flask offers suggestions, but doesn’t enforce any dependencies or project layout. It is up to the developer to choose the tools and libraries they want to use. There are many extensions provided by the community that make adding new functionality easy.

Seems like the ideal description of what we need!

Building our service

We will be using Python 3.7, and you need to install the following dependencies:

You can also install those from the requirements.txt file I provided.

Moreover, you need a MySQL database that we will be able to connect to.

We will start simple. Create a file main.py in your project directory. This file will be the entry point for our service (the # noqa comment is only there to stop flake8 from complaining about the unused import).

# main.py
from app import app  # noqa

Main app

Now that we have our great import, it would be cool to actually add the app to our project. Create a new directory app/ and add an __init__.py file. We will start by adding some imports:

# app/__init__.py

from flask import Flask
from flask_sqlalchemy import SQLAlchemy
from flask_migrate import Migrate
from apscheduler.schedulers.background import BackgroundScheduler

from .config import Config

Now that our imports are done, we can create the application. Mind that we still haven’t created Config, so you will get errors for now. We will fix that in a moment.

Below the imports, add the following code. It will create a Flask application, configure it, create a database connection and add model migration for our database. Quite a lot, isn’t it?

app = Flask(__name__)
app.config.from_object(Config)
db = SQLAlchemy(app)
migrate = Migrate(app, db)

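Moreover, add this line at the end of the file, so that our app knows about our routes and models. It has to come last, because both modules import from app themselves, and importing them earlier would cause a circular import. We will implement them soon!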

from app import routes, models  # noqa

Configuration

Create a config.py file in the app directory. We will create a class that will gather the ENV values for our project configuration and some initial settings for the Warcraft TSM API.

# app/config.py
import os


class Config(object):
    SQLALCHEMY_DATABASE_URI = os.environ['SQLALCHEMY_DATABASE_URI']
    SQLALCHEMY_TRACK_MODIFICATIONS = False
    TSM_API = os.environ['TSM_API']
    SQLALCHEMY_POOL_RECYCLE = 299
    SQLALCHEMY_POOL_TIMEOUT = 20

    ITEMS_FILTER = [
        168651,  # Greater Flask of the Currents
        168652,  # Greater Flask of Endless Fathoms
        168653,  # Greater Flask of the Vast Horizon
        # ... and any other
    ]

As you can see, we are using os.environ to read the environment variables for our project. When hosting on Heroku, we will add those to the application settings. When running on your computer, you need to set them by typing the following in a bash terminal:

export TSM_API=YOUR_API_KEY
export SQLALCHEMY_DATABASE_URI=mysql://user:pass@host/db
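One caveat: os.environ['NAME'] raises a bare KeyError when a variable is not set, which can be confusing on the first launch. Below is a minimal sketch of a friendlier lookup. The helper name require_env is hypothetical and not part of the original project, so treat it as one possible approach rather than the way the service does it.

```python
import os


def require_env(name, environ=os.environ):
    """Return an environment variable or fail with a readable message.

    `require_env` is a hypothetical helper, not part of the original project.
    """
    try:
        return environ[name]
    except KeyError:
        raise RuntimeError(
            f'Missing environment variable {name}. '
            f'Set it with: export {name}=...'
        ) from None


if __name__ == '__main__':
    # A plain dict keeps the demo independent of the real environment.
    fake_env = {'TSM_API': 'YOUR_API_KEY'}
    print(require_env('TSM_API', fake_env))
```

In config.py this could replace the direct lookups, for example TSM_API = require_env('TSM_API').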

But what exactly are those settings, you might ask?

Remember that the external API you use might need a different configuration. The config.py file is the best place to store that information.
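For instance, a list like ITEMS_FILTER can be used to keep only the items we care about from a larger API response. The sketch below is an assumption about how that might look: the function name filter_items and the 'Id' key are hypothetical, so check the field names your API actually returns.

```python
ITEMS_FILTER = [168651, 168652, 168653]


def filter_items(records, allowed_ids, id_key='Id'):
    """Keep only the records whose id is listed in allowed_ids.

    `id_key` is an assumed field name; adjust it to the real API response.
    """
    allowed = set(allowed_ids)  # set membership checks are fast
    return [record for record in records if record.get(id_key) in allowed]


if __name__ == '__main__':
    sample = [
        {'Id': 168651, 'Name': 'Greater Flask of the Currents'},
        {'Id': 1, 'Name': 'Some other item'},
    ]
    print(filter_items(sample, ITEMS_FILTER))
```

Keeping the ids in the config means the gathering code itself does not change when the list of tracked items does.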

Routing

We don’t really plan on sharing the service link anywhere, because we aren’t displaying any information. This is just a gathering service after all. Nevertheless, it is good to have a simple start page in case someone lands on our page somehow (can’t imagine how). Let’s create a simple routes.py file to deal with that:

# app/routes.py
from app import app


@app.route('/')
def hello():
    return (
        'Warcraft data gathering service by '
        '<a href="https://zmudzinski.me">Lukasz Zmudzinski</a>. '
        '<br/> Nothing to see here. Move along 👋.'
    )

This will ensure that every time someone goes to our service website directly, they will be greeted by the above message, returned by the hello() function.
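If you want to check the route without opening a browser, Flask’s built-in test client can request it directly. The following is a minimal, self-contained sketch: it defines a stand-in app with a shortened message instead of importing the real one, and the helper name fetch_root is hypothetical.

```python
from flask import Flask

# Stand-in application so that the example runs on its own.
app = Flask(__name__)


@app.route('/')
def hello():
    return 'Nothing to see here. Move along.'


def fetch_root(flask_app):
    """Request '/' through the test client and return (status, body)."""
    with flask_app.test_client() as client:
        response = client.get('/')
        return response.status_code, response.get_data(as_text=True)


if __name__ == '__main__':
    print(fetch_root(app))
```

In the real project you would pass the app object imported from the app package instead of the stand-in.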

Building our model

We need to tell SQLAlchemy how we want to store our data. To do this, we need to create a database model. Create a models.py file and add all the models in a similar fashion to mine, depending on the data you will be gathering.

You can preview other data types in the SQLAlchemy Documentation.

The __repr__ function is there only to make printing the model for debug purposes meaningful.

# app/models.py
from datetime import datetime  # needed for the created_at default below

from app import db


class TSMData(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    item_name = db.Column(db.String(255), nullable=False)
    item_subclass = db.Column(db.String(255), nullable=False)
    item_vendor_buy = db.Column(db.Integer, nullable=False)
    item_vendor_sell = db.Column(db.Integer, nullable=False)
    item_market_value = db.Column(db.Integer, nullable=False)
    item_min_buyout = db.Column(db.Integer, nullable=False)
    item_quantity = db.Column(db.Integer, nullable=False)
    item_num_auctions = db.Column(db.Integer, nullable=False)
    created_at = db.Column(
        db.DateTime,
        default=datetime.utcnow,
        nullable=False,
    )

    def __repr__(self):
        return f'{self.item_name} at {self.created_at}.'
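To fill this model, each record from the API has to be mapped onto the column names. Here is a minimal sketch of that step in plain Python. The keys on the left side of FIELD_MAP are assumptions about the API response and the helper name to_row is hypothetical, so adjust both to the data you actually receive.

```python
# Assumed API field names (left) mapped to the model columns (right).
FIELD_MAP = {
    'Name': 'item_name',
    'SubClass': 'item_subclass',
    'VendorBuy': 'item_vendor_buy',
    'VendorSell': 'item_vendor_sell',
    'MarketValue': 'item_market_value',
    'MinBuyout': 'item_min_buyout',
    'Quantity': 'item_quantity',
    'NumAuctions': 'item_num_auctions',
}


def to_row(record, field_map=FIELD_MAP):
    """Translate one API record into keyword arguments for the model."""
    missing = [key for key in field_map if key not in record]
    if missing:
        # All columns are nullable=False, so fail early on incomplete data.
        raise ValueError(f'Record is missing fields: {missing}')
    return {column: record[key] for key, column in field_map.items()}
```

The resulting dictionary could then be passed to the model as TSMData(**row), followed by db.session.add(...) and db.session.commit(). The created_at column is filled in automatically by its default.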

Now it would be cool to launch our app to see how we are doing. Run the following commands in a bash terminal to push your models to the database and launch our awesome service!

# Set the flask app env
export FLASK_APP=main.py

# Create a migration repository
flask db init

# Generate initial migration
flask db migrate

# Apply the migration to the database
flask db upgrade

# Run the server!
flask run

If everything goes according to plan, you should see something like this on your website root:

[Screenshot of the website root: example message you can get on your website]