Labs DS Guide

Last updated 2 years ago

Architecture

What technologies will you use? How will the pieces fit together? That's systems architecture. Some of the tech choices have already been made; however, you will still have almost full decision-making power in the app.

The following is a simplified view of one of the major architectural designs that apps follow.

Architecture patterns explained

  • Pattern #1: Batch predictions. You probably haven’t done this at BloomTech, but it can be a good, simple option for some forecasting problems. You make predictions in batches, ahead of time, and store them in a database.

  • Pattern #2: Predictions on demand, embedded into one app, “tightly coupled.” BloomTech examples include the Twitoff app in Unit 3, and Plotly Dash apps in Build Weeks.

  • Pattern #3: Predictions on demand, via a separate “microservice” API, “loosely coupled” with your main app.
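Pattern #1 may be the least familiar of the three, so here is a minimal sketch of it. It assumes a hypothetical `predict` function standing in for a trained model, made-up input data, and an in-memory SQLite database in place of a real one; none of these names come from a Labs project.

```python
import sqlite3
from typing import Dict, Optional


def predict(temperature_f: float) -> float:
    """Hypothetical stand-in for a trained forecasting model."""
    return round(20.0 + 1.5 * temperature_f, 2)


def run_batch(conn: sqlite3.Connection, inputs: Dict[str, float]) -> None:
    """Score every input ahead of time and store the results."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS predictions "
        "(day TEXT PRIMARY KEY, forecast REAL)"
    )
    rows = [(day, predict(temp)) for day, temp in inputs.items()]
    conn.executemany("INSERT OR REPLACE INTO predictions VALUES (?, ?)", rows)
    conn.commit()


def lookup(conn: sqlite3.Connection, day: str) -> Optional[float]:
    """At request time the app only reads the database; no model is called."""
    row = conn.execute(
        "SELECT forecast FROM predictions WHERE day = ?", (day,)
    ).fetchone()
    return row[0] if row else None


# Made-up sample inputs for illustration.
conn = sqlite3.connect(":memory:")
run_batch(conn, {"2024-01-01": 50.0, "2024-01-02": 60.0})
```

With this shape, the batch job can run on a schedule, and the app serving users never needs the model or its dependencies installed.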

Labs authentication/authorization

Your product will likely have user authentication, but the design and implementation will be your Web teammates’ responsibility, not yours. DS doesn’t need to do anything directly with Okta.

Labs Data Science API contracts checklist

Data visualization

  • Decide what inputs you will need. What options will the user have, to filter, sort, or customize the visualization? These inputs will usually come through a GET request, as path parameters or query parameters.

  • Decide what output you will provide. We need to send our Web colleagues the format they expect: JSON. Many visualization libraries can export a chart as JSON, and FastAPI converts a route's return value to JSON for you.

Modeling

  • First, decide what model/algorithm you will utilize. Do you need a machine learning model or just an algorithm?

    • Depending on the problem, you may be able to create an algorithm. No need for a mega-ton hammer, if a small mallet will do.

    • What questions do you want to answer? This depends on your product and users.

    • What questions can you answer? This depends on what data you have, and whether it's labeled.

    Follow this flowchart:

Pre-trained models are useful. If you have labeled data, you can fine-tune most pre-trained models for your problem, for the best of both worlds.
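To illustrate the "small mallet" point above: some product questions, such as "what should we show a brand-new user?", can be answered with a few lines of counting rather than a trained model. The sketch below uses made-up purchase data and a hypothetical `top_items` function.

```python
from collections import Counter
from typing import List, Tuple


def top_items(purchases: List[Tuple[str, str]], n: int = 2) -> List[str]:
    """Rank items by how often they were purchased, most popular first."""
    counts = Counter(item for _user, item in purchases)
    return [item for item, _count in counts.most_common(n)]


# Made-up (user, item) pairs for illustration.
purchases = [
    ("ana", "mug"), ("ben", "pen"), ("ana", "pen"),
    ("cy", "hat"), ("cy", "pen"), ("ben", "mug"),
]
recommended = top_items(purchases)
```

A baseline like this is also useful later as a yardstick: if a machine learning model can't beat it, the extra complexity probably isn't worth it.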

ML Ops

ML Ops is the glue that binds the DS portion of the app together. A few concepts are paramount in the world of an ML Ops Engineer, and one of them is component-based design.

Let's say we need to create an API for the DS side of an app. We don't want to define helper functions inside an API route, because that bloats the API.

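from typing import Dict

from fastapi import FastAPI

# User, default_user (the request model and its example value), and API.db
# (the database wrapper) are assumed to be defined elsewhere in the app.

API = FastAPI(
    title="DS API",
    version="0.0.1",
    docs_url="/",
)


@API.post("/create-user")
async def create_user(user: User = default_user):
    """ Creates one user
    @param user: User
    @return: Boolean Success """
    # Anti-pattern: database logic is defined inside the route body.
    def create(self, data: Dict) -> bool:
        """ Creates one record in the Collection
        @param data: Dict
        @return: Boolean Success """
        return self._collection().insert_one(dict(data)).acknowledged
    return API.db.create(user.dict(exclude_none=True))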

It's better practice to house your functions in a separate file. In the following example, we have moved the create() function to the data file in the main app directory and imported it into our API.

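from fastapi import FastAPI

from app.data import create

# User and default_user (the request model and its example value) are
# assumed to be defined elsewhere in the app.

API = FastAPI(
    title="DS API",
    version="0.0.1",
    docs_url="/",
)


@API.post("/create-user")
async def create_user(user: User = default_user):
    """ Creates one user
    @param user: User
    @return: Boolean Success """
    # The route only delegates to the imported function.
    return create(user.dict(exclude_none=True))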
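One payoff of housing create() in its own file is that it can be exercised without starting the API. The contents of the data file aren't shown in this guide, so the following is only a sketch: `InMemoryCollection` is a hypothetical stand-in for the real database, and this `create` signature is an assumption.

```python
from typing import Dict, List


class InMemoryCollection:
    """Hypothetical stand-in for a database collection (sketch only)."""

    def __init__(self) -> None:
        self.records: List[Dict] = []

    def insert(self, record: Dict) -> bool:
        """Store one record and report success."""
        self.records.append(record)
        return True


def create(collection: InMemoryCollection, data: Dict) -> bool:
    """Creates one record in the collection, dropping None values first."""
    cleaned = {key: value for key, value in data.items() if value is not None}
    return collection.insert(cleaned)


# The function can be checked on its own, with no web server running.
users = InMemoryCollection()
ok = create(users, {"name": "Ada", "email": None})
```

Because the route only calls `create`, swapping the in-memory stand-in for a real database later shouldn't require touching the route code.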

Here’s a good explanation of 3 different architecture patterns for how to serve machine learning models.

See also Emerging Architectures for Modern Data Infrastructure by Andreessen Horowitz.

If your DS API shouldn’t be public to the internet (for example, if you’re serving private data), then DS can configure CORS (cross-origin resource sharing) to only permit calls from your Web Backend. Refer to these great docs: https://fastapi.tiangolo.com/tutorial/cors/

FastAPI has other security options if you're curious. See https://fastapi.tiangolo.com/tutorial/security/
