We had a production system written by mathematicians, 50 different stakeholders with conflicting targets, five leadership changes during the last year, a dozen microservices, AWS costs of 10 thousand per week, a whole galaxy of legacy databases, cron jobs, Celery, greenlets, … Also, an unstable API as a dependency, 10 GB of text dumps as output, user input without validation, false alarms in monitoring, and two dozen unprotected public endpoints. Not that we needed all that for the work, but once you get locked into a serious agile development, the tendency is to push it as far as you can....

July 4, 2018 · SergeM

Bokeh in jupyter notebooks for interactive plots

Bokeh is a library for interactive visualization. One can use it in Jupyter notebooks. Here is an example. Let's say we have a pandas dataframe with timestamps and some values:

    import pandas as pd
    from io import StringIO

    df = pd.read_csv(StringIO("""timestamp,value
    2018-01-01T10:00:00,20
    2018-01-01T12:00:00,10
    2018-01-01T14:00:00,30
    2018-01-02T10:30:00,40
    2018-01-02T13:00:00,50
    2018-01-02T18:00:40,10
    """), parse_dates=["timestamp"])

You can visualize it as a nice graph with zoom, selection, and mouse-over tooltips using Bokeh:...

June 20, 2018 · SergeM

Comparison of click-based config parsers for python

Problem

The click module allows you to create command-line interfaces for your Python scripts. The advantages of click include a nice syntax:

    import click

    @click.command()
    @click.option('--count', default=1, help='Number of greetings.')
    @click.option('--name', prompt='Your name', help='The person to greet.')
    def hello(count, name):
        """Simple program that greets NAME for a total of COUNT times."""
        for x in range(count):
            click....

June 6, 2018 · SergeM

Vim cheat sheet

Some frequently used commands in Vim

File explorer
:Explore - opens the file explorer window.
:E - the same

Visual commands
> - shift right
< - shift left
y - yank (copy) marked text
d - delete marked text
~ - switch case

Cut and Paste
yy - yank (copy) a line
2yy - yank 2 lines
yw - yank word
y$ - yank to end of line
p - put (paste) the clipboard after cursor
P - put (paste) before cursor
dd - delete (cut) a line
dw - delete (cut) the current word
x - delete (cut) current character

Search/Replace
/pattern - search for pattern
?...

May 31, 2018 · SergeM

Select lines matching regular expression in python

Given a text file, we want to create another file containing only those lines that match a certain regular expression, using Python 3:

    import re

    with open("./in.txt", "r") as input_file, open("out.txt", "w") as output_file:
        for line in input_file:
            if re.match("(.*)import(.*)", line):
                # each line already ends with a newline, so suppress print's own
                print(line, end="", file=output_file)
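The same filtering can be wrapped in a small reusable function. This is only a sketch under a few assumptions: `filter_lines`, the temporary file names, and the sample input are hypothetical and not part of the original snippet. It uses a compiled pattern with `re.search`, which selects the same lines as `re.match` with the pattern wrapped in `(.*)`.

```python
import re
import tempfile
from pathlib import Path


def filter_lines(src_path, dst_path, pattern):
    """Copy the lines of src_path that contain a match for pattern to dst_path.

    Returns the number of lines written. (Hypothetical helper name.)
    """
    regex = re.compile(pattern)
    written = 0
    with open(src_path, "r") as src, open(dst_path, "w") as dst:
        for line in src:
            # search() finds the pattern anywhere in the line
            if regex.search(line):
                # the line keeps its own trailing newline
                dst.write(line)
                written += 1
    return written


# Small demo on made-up input in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    in_path = Path(tmp) / "in.txt"
    out_path = Path(tmp) / "out.txt"
    in_path.write_text("import os\nx = 1\nfrom re import match\n")
    count = filter_lines(in_path, out_path, r"import")
    result = out_path.read_text()
```

Compiling the pattern once avoids re-parsing it for every line, which may matter for large input files.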

February 26, 2018 · SergeM

Workflow management with Apache Airflow

Some useful resources about Airflow:

ETL best practices with Airflow

Series of articles about Airflow in production:
Part 1 - about use cases and alternatives
Part 2 - about alternatives (Luigi and Pinball)
Part 3 - key concepts
Part 4 - deployment, issues

More notes about production

About start_date: Why isn’t my task getting scheduled? One cannot specify datetime.now() as start_date. Instead, one has to provide a datetime object without timezone....

February 6, 2018 · SergeM

Datetime in Python

Conversion from calendar week to date

Sometimes one has to convert a date written as year, calendar week (CW), and day of week to an actual date with month and day. The behaviour at the beginning and end of a year may not be straightforward. For example, according to ISO 8601, the Monday of CW 1 of 2019 is 31 December 2018. As far as I can see there is no standard function for conversion in python....
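For newer Python versions, the conversion can be sketched with the standard library alone. The sketch below assumes Python 3.8 or later, where `datetime.date.fromisocalendar` is available, and also shows the `%G`/`%V`/`%u` directives that `strptime` accepts since Python 3.6. The helper name `iso_week_to_date` is hypothetical.

```python
from datetime import date, datetime


def iso_week_to_date(iso_year, iso_week, iso_weekday):
    """Convert ISO 8601 year, calendar week and weekday (1 = Monday) to a date.

    Hypothetical helper; relies on date.fromisocalendar (Python 3.8+).
    """
    return date.fromisocalendar(iso_year, iso_week, iso_weekday)


# Monday of CW 1 of 2019 falls in the previous calendar year
monday_cw1_2019 = iso_week_to_date(2019, 1, 1)

# Alternative for Python 3.6+: parse ISO year, ISO week and ISO weekday
same_day = datetime.strptime("2019-W01-1", "%G-W%V-%u").date()

# Round trip back to (ISO year, ISO week, ISO weekday)
round_trip = tuple(monday_cw1_2019.isocalendar())

print(monday_cw1_2019, same_day, round_trip)
```

Both variants follow ISO 8601 week numbering, so dates near the turn of the year can belong to a different ISO year than their calendar year.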

January 17, 2018 · SergeM

Flask and SQLAlchemy explained

Awesome explanation of SQLAlchemy with examples and comparison to Django by Armin Ronacher: SQLAlchemy and You

Flask-SQLAlchemy module

Flask-SQLAlchemy is an extension for Flask that adds support for SQLAlchemy to your application. How to add SQLAlchemy to a Flask application:

    from flask import Flask
    from flask_sqlalchemy import SQLAlchemy

    app = Flask(__name__)

    # configuration of the DB is read from flask configuration storage
    app.config['SQLALCHEMY_DATABASE_URI'] = 'sqlite:////tmp/test.db'

    # here we define db object that keeps track of sql interactions
    db = SQLAlchemy(app)

Now we are ready to define tables and objects using predefined db....

January 11, 2018 · SergeM

Ubuntu/linux settings

Some settings I find useful for a workstation

CPU monitoring on the main panel

Starting from Ubuntu 18.04, the default Ubuntu desktop finally seems convenient enough for me. Only a few tweaks are missing. A constantly available CPU/Mem/HDD/Network monitor is one of them. Here is how to install a small widget for the top panel in the default GNOME desktop environment.

    sudo apt-get install gir1.2-gtop-2.0 gir1.2-networkmanager-1.0 gir1.2-clutter-1.0

Go to Ubuntu Software and then search for system monitor extension....

January 11, 2018 · SergeM

Pytest cheatsheet

Pytest is a powerful tool for testing in Python. Here are some notes from hands-on experience.

Running tests in pytest with/without a specified mark

Execute

    pytest -m "integration"

if you want to run only tests that have the “@pytest.mark.integration” annotation. Similarly, you can run only tests that do not have a given mark:

    pytest -m "not your_mark"

That command will test everything that is not marked as “your_mark”.

How to verify exception message using pytest

One can use context manager pytest....

January 2, 2018 · SergeM