Data Structures And Algorithms in Python

Data structures. Sources: https://wiki.python.org/moin/TimeComplexity and https://www.ics.uci.edu/~pattis/ICS-33/lectures/complexitypython.txt. Here n is the number of elements in the container and k is either the value of a parameter or the number of elements in the parameter. list is actually implemented as an array. Copy: O(n) average, O(n) amortized worst case. Append: O(1) average, O(1) amortized worst case; if the array grows beyond the allocated space it must be copied, so the worst case is O(n). Pop last: O(1) average, O(1) amortized worst case. Pop intermediate: O(n) average, O(n) amortized worst case; popping an intermediate element requires shifting all elements after it by one slot to the left using memmove....
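A quick sketch of what this means in practice, comparing popping from the end (O(1)) with popping from the front (O(n) per pop). The sizes and timings here are illustrative, not from the post:

```python
import time

n = 20000
a = list(range(n))
b = list(range(n))

t0 = time.perf_counter()
while a:
    a.pop()        # pop last: O(1), no shifting needed
t_last = time.perf_counter() - t0

t0 = time.perf_counter()
while b:
    b.pop(0)       # pop first: O(n), shifts all remaining elements left
t_first = time.perf_counter() - t0

# Popping from the front is dramatically slower for large lists
print(t_first > t_last)  # True
```

For frequent pops from the front, collections.deque with its O(1) popleft is the usual alternative.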

January 29, 2021 · SergeM

Rounding in Python

In school we round numbers like 0.5, 1123.5 towards the bigger number. It's a "round half up" method. That introduces an undesired bias in some cases, for example if we have a large data set and we aggregate some column containing a lot of .5 fractions. To adjust for it, in many cases rounding 0.5 towards the nearest even number is applied instead. It's "rounding half to even", or "banker's rounding"....
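A quick illustration of the two methods using only the standard library (not the post's own code):

```python
from decimal import Decimal, ROUND_HALF_UP

# Python 3's built-in round() uses "round half to even" (banker's rounding):
print(round(0.5))  # 0  (nearest even integer)
print(round(1.5))  # 2
print(round(2.5))  # 2  (not 3!)

# The schoolbook "round half up" is available via the decimal module:
print(Decimal("2.5").quantize(Decimal("1"), rounding=ROUND_HALF_UP))  # 3
```

Over many .5 values, half-even rounds up and down equally often, which is why the aggregate bias disappears.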

January 10, 2021 · SergeM

Symbolic math and python

With the help of Python and the SymPy module one can do pretty neat computations. For example, when I took a course about Robotic Perception on Coursera I had to find the cross product of two vectors v1 x v2 represented in a generic form: v1 = (a, b, c), v2 = (d, e, 0). Normally I would write it down on a piece of paper and do the computations myself. Luckily Python can help with that....
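A minimal sketch of how SymPy handles this (assuming SymPy is installed; this is illustrative, not necessarily the post's exact code):

```python
import sympy

# Symbolic components of the two generic vectors
a, b, c, d, e = sympy.symbols("a b c d e")
v1 = sympy.Matrix([a, b, c])
v2 = sympy.Matrix([d, e, 0])

# Symbolic cross product; components: (-c*e, c*d, a*e - b*d)
print(v1.cross(v2))
```

SymPy carries the symbols through the computation, so the result is the general formula rather than a numeric value.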

December 31, 2020 · SergeM

Point cloud processing

ROS nodes:
- Point Cloud IO (https://github.com/ANYbotics/point_cloud_io): two nodes for reading and writing PointCloud2 from/to ply and pcd formats.
- point_cloud_assembler from laser_assembler (http://wiki.ros.org/laser_assembler): this node assembles a stream of sensor_msgs/PointCloud2 messages into larger point clouds; the aggregated point cloud can be accessed via a call to the assemble_scans service. See also https://github.com/ros-perception/laser_assembler and its tutorial.
- Octomap (http://octomap.github.io/): seems like a standard solution for converting point clouds to a map in several formats.
- pointcloud_to_laserscan (http://wiki.ros.org/pointcloud_to_laserscan)
- pcl_ros (http://wiki.ros.org/pcl_ros): this package provides interfaces and tools for bridging a running ROS system to the Point Cloud Library....

November 10, 2020 · SergeM

Python - Multiprocessing

Libraries: the standard multiprocessing module; Pebble, pretty close to the standard one but with a bit nicer interface; Dask, well maintained and an (almost) drop-in replacement for numpy and pandas:

```python
# Arrays implement the Numpy API
import dask.array as da
x = da.random.random(size=(10000, 10000), chunks=(1000, 1000))
x + x.T - x.mean(axis=0)

# Dataframes implement the Pandas API
import dask....
```
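For the standard library option, a minimal sketch of the usual pattern (the function names here are illustrative, not from the post):

```python
from multiprocessing import Pool

def square(x):
    return x * x

def run():
    # A pool of worker processes maps the function over the inputs in parallel
    with Pool(processes=2) as pool:
        return pool.map(square, range(5))

if __name__ == "__main__":
    print(run())  # [0, 1, 4, 9, 16]
```

Note the `if __name__ == "__main__"` guard: on platforms that spawn rather than fork, child processes re-import the module, and the guard prevents them from recursively creating pools.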

October 1, 2020 · SergeM

Parameters parsing for python applications

Command line arguments are a standard and one of the most common ways to pass parameters to a Python script. There is a list of Python libraries that help with that task. Here I am going to list some of them. argparse: the default choice for the Python developer. The module is included in the Python standard library and comes together with any Python distribution. Example of usage:....
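A minimal argparse sketch (the argument names are hypothetical, chosen for illustration):

```python
import argparse

parser = argparse.ArgumentParser(description="Example parameter parsing")
parser.add_argument("input", help="path to the input file")
parser.add_argument("--count", type=int, default=1, help="number of repetitions")
parser.add_argument("-v", "--verbose", action="store_true", help="verbose output")

# Passing a list explicitly; in a real script parse_args() reads sys.argv
args = parser.parse_args(["data.csv", "--count", "3", "-v"])
print(args.input, args.count, args.verbose)  # data.csv 3 True
```

argparse also generates `--help` output and usage errors automatically from these declarations.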

April 21, 2020 · SergeM

Jupyter notebooks on EMR

Exploratory data analysis requires interactive code execution. In the case of Spark and EMR it is very convenient to run the code from Jupyter notebooks on a remote cluster. EMR allows installing Jupyter on the Spark master. In order to do that, configure the "Applications" field of the EMR cluster to also contain JupyterHub. For example:

```json
"Applications": [
  { "Name": "Ganglia", "Version": "3.7.2" },
  { "Name": "Spark", "Version": "2.4.0" },
  { "Name": "Zeppelin", "Version": "0....
```

February 4, 2019 · SergeM

Spark in Docker with AWS credentials

Running Spark in a Docker container. Setting up Spark is tricky, therefore it is useful to try things out locally before deploying to the cluster. Docker is of good help here. There is a great Docker image for playing with Spark locally: gettyimages/docker-spark. Examples: running the SparkPi sample program (one of the examples from the Spark docs):

```shell
docker run --rm -it -p 4040:4040 gettyimages/spark bin/run-example SparkPi 10
```

Running a small example with Pyspark:....

July 29, 2018 · SergeM

Bokeh in jupyter notebooks for interactive plots

Bokeh is a library for interactive visualization. One can use it in Jupyter notebooks. Here is an example. Let's say we have a pandas dataframe with timestamps and some values:

```python
import pandas as pd
from io import StringIO

df = pd.read_csv(StringIO("""timestamp,value
2018-01-01T10:00:00,20
2018-01-01T12:00:00,10
2018-01-01T14:00:00,30
2018-01-02T10:30:00,40
2018-01-02T13:00:00,50
2018-01-02T18:00:40,10
"""), parse_dates=["timestamp"])
```

You can visualize it as a nice graph with zoom, selection, and mouse-over tooltips using Bokeh:....
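A minimal sketch of the Bokeh side (assuming the bokeh package is installed; the post's own plotting code is truncated above, so these calls are illustrative):

```python
import pandas as pd
from io import StringIO
from bokeh.plotting import figure

df = pd.read_csv(StringIO("""timestamp,value
2018-01-01T10:00:00,20
2018-01-01T12:00:00,10
2018-01-01T14:00:00,30
"""), parse_dates=["timestamp"])

# A datetime x-axis gives readable tick labels for the timestamps
p = figure(x_axis_type="datetime", title="values over time")
p.line(df["timestamp"], df["value"])
```

In a notebook you would then call `output_notebook()` once and `show(p)` (both from `bokeh.io`) to render the interactive plot inline.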

June 20, 2018 · SergeM

Comparison of click-based config parsers for python

Problem: there is the click module that allows you to create command line interfaces for your Python scripts. The advantages of click are a nice syntax:

```python
import click

@click.command()
@click.option('--count', default=1, help='Number of greetings.')
@click.option('--name', prompt='Your name', help='The person to greet.')
def hello(count, name):
    """Simple program that greets NAME for a total of COUNT times."""
    for x in range(count):
        click....
```

June 6, 2018 · SergeM