Category: Machine Learning

Data science complexity and solutions in real industrial projects

As data scientists, we usually like to apply fancy machine learning models to well-groomed datasets. Everyone working on industrial problems eventually learns that this does not reflect reality. The time spent on modeling is small compared to the time spent on data gathering, warehousing, and cleaning. Even after training and deployment of the model, the work

Inverse Problem (Part 2)

In the last post, I wrote about inverse problems. I presented a simplified toy example that showed how to translate such a problem into an optimization problem. Optimization problems can be solved with a variety of algorithms, e.g. gradient descent or evolutionary algorithms. This article presents a more sophisticated inverse problem. We want to classify images

Inverse Problem (Part 1)

The process of calculating the causal factors from an observation is called an inverse problem. An inverse problem is much harder to solve than its forward counterpart, which is calculating the observation from the causal factors. Many problems in science and math are inverse problems. They can be found in optics, radar, acoustics, communication theory,

How to use pytest in automatic code generation

This notebook shows you how to write a plugin for pytest. This allows us to use pytest functionality, e.g. test discovery, in our automatic programming scenario. Another advantage is that many developers are already familiar with pytest. Therefore, it would be much easier for those people to apply this development technique. We have seen in

Abstract Syntax Tree

This blog post follows my introductory post on automatic programming. The syntax of a programming language is the set of rules that defines correctly structured code. There are many possible syntax errors. Some examples are: opened parentheses are not closed; indentation is not correct; variable names are written incorrectly. It is hard to

Automatic programming

Automatic programming is a type of computer programming whose goal is to generate a computer program. This allows the programmer to write the code in a more abstract way. It can be something like switching to a higher-level programming language that compiles to a lower-level language, or programming declaratively. Declarative programming is any

Auto-Encoding Variational Bayes

In the last post, I introduced probabilistic programming. The biggest problem with this idea is finding an efficient approximation of the posterior for arbitrary probabilistic models. Auto-Encoding Variational Bayes (AEVB) is a great step in the right direction. Consider a dataset consisting of i.i.d. continuous samples of dimension D. The data

Introduction to Probabilistic Programming

I love the idea of Probabilistic Programming. It is basically a language for describing probabilistic models and then performing automatic inference on those models. That means that you, as an expert, write a simulation of your problem in a Bayesian sense. For example, a text simulation could be to first sample the number of words,

Archetypal Analysis

I recently read about Archetypal Analysis. It is an unsupervised learning algorithm similar to cluster analysis and dimensionality reduction. It was introduced by Adele Cutler and Leo Breiman in 1994. In my opinion, this idea doesn't get enough attention, although there are good reasons to learn about it. Archetypal Analysis has nice