computer vision

Effect of techniques from Fast.ai

fast.ai is a brilliant library and course by Jeremy Howard and co. They use PyTorch as a base and explain deep learning from the foundations to a very decent level. In his course Jeremy Howard demonstrates a lot of interesting techniques that he finds in papers and that make NN training faster/better/cheaper. Here I want to reproduce some of the techniques in order to understand what effect they bring....

Self-supervised depth and ego motion estimation

3D Packing for Self-Supervised Monocular Depth Estimation
--------------------------------------------------------------

by Vitor Guizilini et al., pdf at arxiv, 2020

Learning

1. Depth estimator: :math:`f_D : I \rightarrow D`
2. Ego motion estimator: :math:`f_x : (I_t , I_S) \rightarrow x_{t \rightarrow S}`

Depth Estimator
=====================================

They predict an inverse depth and use a PackNet architecture. Inverse depth probably gives more stable results. Points far away from the camera have a small inverse depth that with low precision....
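The relation between depth and inverse depth can be sketched in a few lines. This is only an illustration and not code from the paper: the helper name `depth_from_inverse`, the `eps` guard, and the sample values are assumptions made for the example. It shows that the same small absolute error on the inverse depth translates into a much larger depth error for far points than for near ones.

```python
import numpy as np

def depth_from_inverse(inv_depth, eps=1e-6):
    """Convert inverse depth (1/m) to depth (m).

    eps is an assumed lower bound that avoids division by zero.
    """
    return 1.0 / np.maximum(inv_depth, eps)

# Three points at increasing distance from the camera (metres).
depths = np.array([1.0, 10.0, 100.0])
inv_depths = 1.0 / depths

# Apply the same absolute error to every inverse depth value.
noise = 0.001
depth_error = np.abs(depth_from_inverse(inv_depths + noise) - depths)

# The error grows with distance: far points are the most sensitive.
print(depth_error)
```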

Which pretrained backbone to choose

In 2020, which architecture should I use for my image classification/tracking/segmentation/… task? I was asked that in an interview and I didn’t have a prepared answer. I did a little research and want to write down some thoughts. Most of the architectures build upon ideas from the ResNet paper, Deep Residual Learning for Image Recognition, 2015. Here is an explanation of the ResNet family: An Overview of ResNet and its Variants by Vincent Fung, 2017....

Multistage NN training experiment

Ideas for multistage NN training. There is some research on continual learning without catastrophic forgetting. For example, ANML: Learning to Continually Learn (ECAI 2020): arxiv, code, video. The code for the paper is based on another one, OML (Online-aware Meta-learning), NeurIPS19: code, video. The OML paper derives some code from MAML: Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks: pdf, official tf code, which also includes some links to other implementations....

Image segmentation with unlabeled areas with fast.ai

The fast.ai library has easy-to-use yet powerful capabilities for semantic image segmentation. By default all classes are treated the same: the network is trained to predict all the labels. Sometimes it’s important to allow incomplete labeling, meaning that for some areas the label is undefined. Those areas should be excluded from the loss and accuracy computation. That allows the network to predict any other class in those areas....
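A minimal sketch of excluding unlabeled pixels from the loss and the accuracy, written in plain NumPy instead of fast.ai so that the masking logic is visible. The label id `VOID = 255` and the function names are assumptions for this illustration, not part of the fast.ai API.

```python
import numpy as np

VOID = 255  # hypothetical label id that marks "undefined" pixels

def masked_cross_entropy(logits, target, void=VOID):
    """Mean cross-entropy over labeled pixels only.

    logits: float array of shape (C, H, W)
    target: int array of shape (H, W), with `void` where the label is undefined
    """
    mask = target != void
    if not mask.any():
        return 0.0
    # Numerically stable log-softmax over the class axis.
    shifted = logits - logits.max(axis=0, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=0, keepdims=True))
    # Replace void ids with class 0 so indexing is valid; they are masked out below.
    safe_target = np.where(mask, target, 0)
    picked = np.take_along_axis(log_probs, safe_target[None], axis=0)[0]
    return float(-picked[mask].mean())

def masked_accuracy(logits, target, void=VOID):
    """Fraction of labeled pixels whose argmax prediction matches the label."""
    mask = target != void
    if not mask.any():
        return 0.0
    pred = logits.argmax(axis=0)
    return float((pred[mask] == target[mask]).mean())
```

Whatever the network predicts at a void pixel changes neither number, so it is free to output any class there. When training with PyTorch directly, `torch.nn.CrossEntropyLoss` offers an `ignore_index` argument that serves the same purpose for the loss.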

Computer vision libraries

So, what else is there besides OpenCV… CCV (CCV website, github): CCV 0.7 comes with a sub-10% image classifier and a decent face detector. It runs on Mac OSX, Linux, FreeBSD, Windows*, iPhone, iPad, Android, Raspberry Pi. In fact, anything that has a proper C compiler can probably run ccv. The majority of ccv (with the notable exception of convolutional networks, which require a BLAS library) will just work with no compilation flags or dependencies....

Excellent idea moves towards real life

A smartphone application for driver assistance. For example, it can measure the distance to the next vehicle, or detect the road lane and warn about leaving it. http://www.acodriver-shop.com/ Some time ago I also had such an idea. Very interesting.

Camera model and projective geometry

About camera models: OpenCV camera model and calibration. Difference between pinhole camera model and thin lens model. One more note, one more article. Pinhole camera model.
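As a quick reminder of what the pinhole model computes, here is a minimal sketch of projecting 3D points to pixels. The function name and the intrinsic values are assumptions for illustration. Lens distortion is ignored, and the points are assumed to lie in front of the camera (z > 0), with x pointing right, y down, and z forward.

```python
import numpy as np

def project_pinhole(points_cam, fx, fy, cx, cy):
    """Project 3D points in the camera frame, shape (N, 3), to pixels, shape (N, 2)."""
    points_cam = np.asarray(points_cam, dtype=float)
    # Intrinsic matrix: focal lengths on the diagonal, principal point in the last column.
    K = np.array([[fx, 0.0, cx],
                  [0.0, fy, cy],
                  [0.0, 0.0, 1.0]])
    homogeneous = points_cam @ K.T  # each row is (u * z, v * z, z)
    return homogeneous[:, :2] / homogeneous[:, 2:3]

# A point on the optical axis lands on the principal point.
pixels = project_pinhole([[0.0, 0.0, 2.0], [1.0, 0.5, 2.0]],
                         fx=500.0, fy=500.0, cx=320.0, cy=240.0)
print(pixels)
```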