Publications | Patrick Koller

Patrick Koller, Amil v. Dravid, Prof. Dr. Guido Schuster, Prof. Dr. Aggelos Katsaggelos

February, 2023 OST

Robustness has become one of the most critical problems in machine learning (ML). The science of interpreting ML models to understand their behavior and improve their robustness is referred to as explainable artificial intelligence (XAI). One of the state-of-the-art XAI methods for computer vision problems is to generate saliency maps. A saliency map highlights the regions of an image's pixel space that excite the ML model the most. However, this property can be misleading if spurious and salient features are present in overlapping pixel regions. In this paper, we propose a caption-based XAI method, which integrates a standalone model to be explained into the contrastive language-image pre-training (CLIP) model using a novel network surgery approach. The resulting caption-based XAI model identifies the dominant concept that contributes the most to the model's prediction. This explanation minimizes the risk of the standalone model falling for a covariate shift and contributes significantly towards developing robust ML models.

Patrick Koller, Florian Merz, Hannes Badertscher

August, 2022 OST

Trade-that! A quantitative trading engine

There are numerous individuals and institutions in this world who earn more than is needed to cover their fixed costs and living expenses. Depending on their motivation, these parties try either to protect themselves against inflation at a small associated risk or to increase their wealth at a higher associated risk. The objective of this project is to increase the size of a portfolio by using an algorithmic trading approach to trade the world's first cryptocurrency, Bitcoin. Instead of trading Bitcoin by hand and driven by emotions, a quantitative trading engine is used to identify and capitalize on available trading opportunities for the asset according to a multi-label classification model. The core idea is not to predict future prices and execute trades accordingly, but to follow a more recent trend in the quantitative trading environment, where the state of the market is classified using a buy, hold or sell label. During the training phase, these labels are generated using future data. They serve as the target for training a classifier with an appropriate set of features. To find a set of distinct features that approximates the labels, a unique measure called the label separation power is used. This process is applied to generate multiple feature and label sets. Each feature and label set is used to train a separate classifier. The outputs of the classifiers are combined to form a trading strategy. The trading strategies are optimized based on a scorer, which penalizes undesired characteristics. The best-performing strategies end up in an ensemble, which makes the resulting ensemble trading strategy more robust. The ensemble can decide to buy, hold or sell discrete amounts of the portfolio value in an optimized fashion according to the data it has been trained on. Backtesting the trading engine over two periods spanning about one year, chosen according to the reference paper, results in a positive total return.
Over one period the total return is around 1.5% per month and over the other period it is about 20% per month, depending on the market trends. The average position size over both periods is about 50%, which enables the trading engine to adapt quickly to changes in the market at any time. “Trade-that!” demonstrates the feasibility of classifying the state of the market. Nevertheless, past profits do not guarantee future profits. Therefore, it is essential to keep improving the trading engine and to continuously adapt its properties to the most recent market conditions.
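
The labeling step described above, where buy, hold or sell labels are generated from future data during training, can be sketched as follows. This is only a minimal illustration and not the labeling rule used in the project: the function name `label_from_future`, the look-ahead `horizon`, and the return `threshold` are assumptions introduced for the example.

```python
def label_from_future(prices, horizon=1, threshold=0.05):
    """Label each time step by looking `horizon` steps into the future.

    Hypothetical rule (an assumption, not the project's definition):
    buy if the future return exceeds +threshold, sell if it falls
    below -threshold, hold otherwise.
    """
    labels = []
    # The last `horizon` steps have no future data and get no label.
    for t in range(len(prices) - horizon):
        future_return = prices[t + horizon] / prices[t] - 1.0
        if future_return > threshold:
            labels.append("buy")
        elif future_return < -threshold:
            labels.append("sell")
        else:
            labels.append("hold")
    return labels


prices = [100, 101, 102, 110, 100, 99, 98]
labels = label_from_future(prices, horizon=1, threshold=0.05)
```

Because such labels depend on future prices, they are only available for training. At trading time, a classifier has to approximate them from features computed on past data.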

Patrick Koller, Florian Merz, Hannes Badertscher

February, 2022 OST

AI in injection molding

The “Institut für Werkstofftechnik und Kunststoffverarbeitung” (IWK) is a leading Swiss institute in the area of materials technology and polymers processing. One area of interest is to optimize the yield of injection molding machines. Injection molding machines are typically commissioned by an experienced operator and then run for weeks or months, producing millions of identical goods. To lower the number of defective goods during this continuous manufacturing process, it needs to be monitored at all times. This way, anomalies can be detected and corrected at an early stage. To achieve this objective, the current state-of-the-art method monitors several measurement variables during the process and stores them as persistent data in memory. Based on these measurements, manually defined features are extracted and passed into an anomaly detector. One difficulty of this approach is the accessibility of the data, as there is no unified interface across the industry. Another is that the data quality fluctuates across manufacturers. The cavity pressure curve represents the pressure inside the tool over time during the injection molding process. The shape of this curve greatly affects the quality of the produced goods. The approach presented in this project focuses on using this cavity pressure curve as the only feature to predict an anomaly. Several modern machine learning anomaly detection models are introduced, evaluated and compared to a simple baseline model and the state-of-the-art method. The development of such a model strongly depends on the quality of the data. Even a non-domain expert is able to spot obvious problems with the original dataset. Its labels contain obvious anomalies that are labeled as normal and vice versa. Since the quality of the provided labels is not ideal, a well-defined anomaly definition based on injection molding theory has been introduced.
Given how the calculated labels are produced, one could argue that the labels have been adapted to fit the model assumptions. This is partly true, but the underlying assumption that normal cavity pressure curves are very similar to one another is backed by the known theory of the injection molding process. The original dataset could therefore give an advantage to the existing state-of-the-art models, while the calculated labels could give an advantage to the molding-molly-models. To balance these benefits and to prevent a covariate shift as far as possible, the two label sets have been combined into the fusion labels. All available models are then evaluated on all three label definitions and demonstrate that it is possible to detect significant differences in the cavity pressure curves and therefore potential anomalies. The potential of the presented approach lies in the massively reduced data volume, accessibility of the measurement variable and performance comparability with the current state-of-the-art method.
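
The assumption that normal cavity pressure curves are very similar to one another suggests a simple way to flag anomalies: measure how far a curve deviates from a reference built from normal curves. The sketch below illustrates that idea only. It is not the baseline model or any of the models evaluated in the project, and the function names, the RMS distance, and the threshold rule with factor `k` are all assumptions made for the example.

```python
import math
import statistics


def fit_reference(normal_curves):
    """Point-wise mean of equally sampled normal curves (assumed reference)."""
    n = len(normal_curves)
    length = len(normal_curves[0])
    return [sum(curve[i] for curve in normal_curves) / n for i in range(length)]


def rms_distance(curve, reference):
    """Root-mean-square deviation of a curve from the reference curve."""
    squared = sum((a - b) ** 2 for a, b in zip(curve, reference))
    return math.sqrt(squared / len(reference))


def fit_threshold(normal_curves, reference, k=3.0):
    """Hypothetical rule: mean + k * population std of the normal scores."""
    scores = [rms_distance(curve, reference) for curve in normal_curves]
    return statistics.mean(scores) + k * statistics.pstdev(scores)


def is_anomaly(curve, reference, threshold):
    """Flag a curve whose deviation from the reference exceeds the threshold."""
    return rms_distance(curve, reference) > threshold


# Toy pressure curves, invented for illustration (not project data).
normal_curves = [
    [0, 10, 20, 10, 0],
    [0, 11, 21, 11, 0],
    [0, 9, 19, 9, 0],
]
reference = fit_reference(normal_curves)
threshold = fit_threshold(normal_curves, reference)
flagged = is_anomaly([0, 10, 12, 10, 0], reference, threshold)
```

A detector of this kind needs only the cavity pressure curve as input, which matches the reduced data volume highlighted above.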