Deep Learning model explainability

Hi, In my first post, I looked into the explainability of classical machine learning models. As a next step, I’m interested in the explainability of neural networks. Model explainability is easy for simple models (linear regression, decision trees), and some tools exist for more complex algorithms (ensemble trees). Therefore, I highly recommend the book Interpretable Machine Learning by Christoph Molnar for a deeper theoretical understanding. All different approaches for model explanability are shown with a PyTorch model in this kaggle notebook....

December 8, 2023 · 7 min

Model explainability

Hi, Some months have passed since my last post. Model explainability is easy for simple models (linear regression, decision trees), and some tools exist for more complex algorithms (ensemble trees). I want to dig into the tools to interpret more complex models with this post. Therefore, I highly recommend the book Interpretable Machine Learning by Christoph Molnar for a deeper theoretical understanding. All different approaches for model explanability are shown with a RandomForest model in this kaggle notebook....

November 21, 2023 · 5 min

Implementing a Transformer Network from scratch

Hi, This post is about my implementation of an encoder transformer network from scratch as a follow-up of understanding the attention layer together with the colab implementation. I use a simplified dataset, where I don’t expect great results. My approach is building something from scratch to understand it in depth. I faced many challenges during my implementation, so I aligned my code to the BertSequenceClassifier from huggingface. My biggest challenge was to get the network to train....

August 28, 2023 · 6 min

Implementing a Transformer Network from scratch

Hi, This post is about my implementation of an encoder transformer network from scratch as a follow-up of understanding the attention layer together with the colab implementation. I use a simplified dataset, where I don’t expect great results. My approach is building something from scratch to understand it in depth. I faced many challenges during my implementation, so I aligned my code to the BertSequenceClassifier from huggingface. My biggest challenge was to get the network to train....

August 28, 2023 · 6 min

Learning about time-series analysis

Hi, Recently, I had to work on a simple time-series analysis. I performed poorly since I never worked with time-series before. I believe in a deterministic world, and in general, I prefer to find the causality of a specific data behavior prior to a simple way of empiristic modeling. However, I understand the need for time-series analysis as not enough data available, the underlying processes understood, the complexity bearable, or the time/need for a proper process understanding....

August 15, 2023 · 3 min

Learning about time-series analysis

Hi, Recently, I had to work on a simple time-series analysis. I performed poorly since I never worked with time-series before. I believe in a deterministic world, and in general, I prefer to find the causality of a specific data behavior prior to a simple way of empiristic modeling. However, I understand the need for time-series analysis as not enough data available, the underlying processes understood, the complexity bearable, or the time/need for a proper process understanding....

August 15, 2023 · 3 min

Training a language model from scratch

Hi, This post is a short overview over a work project, where I trained a language model for invoices. This so-called base model is then fine-tuned for text classification on customer data. Due to data privacy, a non-disclosure agreement, ISO 27001 and SOAP2, I’m not allowed to publish any results. Believe me, it works like 🚀✨🪐. A language model is trained on large amounts of textual data to understand the patterns and structure of language....

April 15, 2023 · 14 min

Training a language model from scratch

Hi, This post is a short overview over a work project, where I trained a language model for invoices. This so-called base model is then fine-tuned for text classification on customer data. Due to data privacy, a non-disclosure agreement, ISO 27001 and SOAP2, I’m not allowed to publish any results. Believe me, it works like 🚀✨🪐. A language model is trained on large amounts of textual data to understand the patterns and structure of language....

April 15, 2023 · 14 min

Cookie-cutter Problems

Hi, Recently, I started to put some scripts together and run them against a Kaggle dataset. I decided to train my skills on an unseen dataset. Training keeps me sharp, and I need it to complement my skill set. For the last 2,5 years, I struggled in a small team with NLP problems, where I worked mostly on engineering tasks. My understanding in this area is not where I wanted to be....

April 10, 2023 · 2 min

Cookie-cutter Problems

Hi, Recently, I started to put some scripts together and run them against a Kaggle dataset. I decided to train my skills on an unseen dataset. Training keeps me sharp, and I need it to complement my skill set. For the last 2,5 years, I struggled in a small team with NLP problems, where I worked mostly on engineering tasks. My understanding in this area is not where I wanted to be....

April 10, 2023 · 2 min