Hello, I'm Panu Alaluusua

AI Data Engineer

I am a Data Engineer transitioning into AI Data Engineering, with a background that blends Finance (M.Sc.) with strong technical expertise in data platforms and software engineering. I am a Databricks Certified GenAI Engineer specializing in RAG applications, Vector Search, and LLM-enabled solutions.

Databricks Certified GenAI Engineer | Azure Data Platforms | Specializing in RAG & LLMs | Finance Domain Expertise

My Projects

Finance & Investing Data Platform

Role: Data Engineer (End-to-End) | Jan 2023 - Aug 2025

The Challenge:

Modernize investment performance calculations and free dataflows from legacy systems by building a specialized, domain-specific data platform.

The Solution:

Built an isolated platform segment in close collaboration with the customer. Designed and implemented daily end-to-end pipelines for return, risk, and market data. Revamped cash flow reporting and orchestrated SQL warehouse migrations.

Azure Data Factory | Databricks | dbt | SQL Server | Bicep

General Data Platform

Role: Data Engineer / Platform Engineer | Jan 2023 - Aug 2025

The Challenge:

Create a modern, agile data platform empowering the organization to build dataflows swiftly based on business needs.

The Solution:

Streamlined processes and improved reporting capabilities through a comprehensive Azure-based platform. Created Azure data services, implemented robust user management and security, and optimized data streaming.

Azure Data Factory | Databricks | PySpark | Alation

Data Migration & GenAI-Accelerated Refactoring

Role: Data Engineer | Jan 2023 - Aug 2025

The Challenge:

Migrate 100+ legacy Databricks notebooks to new data sources and ensure 100% data consistency without stalling development.

The Solution:

Developed a custom test bench for automated regression testing and used GenAI-augmented workflows to refactor code quickly. Followed a systematic agentic development methodology to maintain strict quality control.

Databricks | Python | PySpark | Automated Testing
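
To illustrate the idea behind the regression test bench, here is a minimal sketch of comparing a legacy notebook's output with its migrated counterpart. It is not the project's actual code: the function name `compare_outputs`, the row format (plain dictionaries), and the key columns are all hypothetical, and a real test bench on Databricks would more likely compare Spark DataFrames.

```python
from typing import Any, Iterable

Row = dict[str, Any]


def compare_outputs(
    legacy: Iterable[Row],
    migrated: Iterable[Row],
    key: tuple[str, ...],
) -> dict[str, list]:
    """Compare two result sets row by row, matching rows on the `key` columns.

    Hypothetical helper: returns the keys that are missing from the migrated
    output, unexpected in it, or present in both with differing values.
    """

    def index(rows: Iterable[Row]) -> dict[tuple, Row]:
        # Build a lookup from the key-column values to the full row.
        return {tuple(row[col] for col in key): row for row in rows}

    old, new = index(legacy), index(migrated)
    return {
        "missing": sorted(old.keys() - new.keys()),
        "unexpected": sorted(new.keys() - old.keys()),
        "changed": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }
```

A migrated notebook would pass the regression check only when all three lists come back empty.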

GenAI-Augmented Migration Pipeline POC

Role: Lead Architect (POC)

The Challenge:

Explore the feasibility of automating the migration of complex legacy Informatica workflows to modern dbt models to reduce manual refactoring effort.

The Solution:

Designed and partially implemented a semi-automated migration pipeline utilizing GenAI to interpret transformation logic and generate compliant dbt code. Validated the feasibility of AI-driven legacy code modernization.

GenAI | dbt | Python | Legacy ETL

Regulatory Data Integration (EUDR Compliance)

Role: Senior Integration Specialist | 2025

The Challenge:

To comply with the new EU Deforestation Regulation (EUDR), the customer needed to attach specific traceability numbers to transport orders. The goal was to replace error-prone manual updates with a fully automated, auditable system.

The Solution:

Designed and implemented a production-ready `GET -> MODIFY -> POST` integration using Azure Web Apps Surface. Added automated daily scheduling, robust error handling, full audit logging, and secure secret management.

Azure Web Apps | Snowflake | Python (Flask) | Docker | Terraform
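
The `GET -> MODIFY -> POST` pattern with error handling and audit logging can be sketched roughly as follows. This is an illustration under stated assumptions, not the production integration: `sync_orders`, the `id` and `traceability_number` fields, and the injected `fetch` and `push` callables are hypothetical stand-ins for the real API calls.

```python
import logging
from typing import Any, Callable

logger = logging.getLogger("eudr_sync")  # hypothetical logger name

Order = dict[str, Any]


def sync_orders(
    fetch: Callable[[], list[Order]],
    push: Callable[[Order], None],
    reference_numbers: dict[str, str],
) -> dict[str, list[str]]:
    """Attach a traceability number to each order and send it back.

    `fetch` stands in for the GET call and `push` for the POST call, so the
    transport can be swapped or faked in tests. Returns an audit summary.
    """
    summary: dict[str, list[str]] = {"updated": [], "skipped": [], "failed": []}
    for order in fetch():  # GET
        order_id = order["id"]
        reference = reference_numbers.get(order_id)
        if reference is None:
            logger.warning("No traceability number for order %s", order_id)
            summary["skipped"].append(order_id)
            continue
        # MODIFY: copy the order so the fetched data is left untouched.
        updated = {**order, "traceability_number": reference}
        try:
            push(updated)  # POST
        except Exception:
            # One failing order is logged and recorded; the run continues.
            logger.exception("Failed to update order %s", order_id)
            summary["failed"].append(order_id)
        else:
            logger.info("Updated order %s", order_id)
            summary["updated"].append(order_id)
    return summary
```

Passing the transport in as callables keeps the update logic testable without network access, and the returned summary gives each scheduled run a compact audit record.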

Passion Projects & Hobbies

Beyond client work, I love building tools and exploring new tech. Here are some of my personal projects.

Portfolio Website

Frontend / Modern Stack

You are looking at it! Built with semantic HTML, CSS, and vanilla JS. Documented with `docs` and deployed via GitHub Pages.

View Source

MyWhoosh Race Results

Data Engineering / Visualization

Automated processing and visualization of e-cycling race results. Python scripts parse the data and generate graphics.

View Repository

Cycling Events (Hupiprojekti)

Streamlit App & Data

A fun project exploring cycling events data, built with Streamlit.

Live App | Repository | Read Docs

Streamlit E-SM

Esports Transparency & Analytics

E-Sports Championship transparency project.

Live App | Repository | Description | Learnings

InsightHub

Full Stack AI Platform / Architecture

An ambitious full-stack AI platform (SvelteKit + Python). It "failed due to its size" but served as a massive learning ground for architecture and complex orchestration.

Repository | Read Docs

Centralized Docs Sync

DevOps / Automation

Automated workflow to sync documentation from multiple projects into a central repository. Keeps knowledge organized and accessible.

Description | Learnings

More on GitHub

Explore my other repositories, experiments, and learnings.

Go to GitHub Profile

Latest Writings

Thoughts on Data Engineering, AI, and Tech.

View All Articles

My Skills

Core Competencies

Generative AI Engineering

RAG Applications

AI Data Engineering

Platform Engineering

MLOps & DataOps

Technologies & Tools

Vector Search & RAG

MLflow & Model Serving

Databricks & Spark

Azure (ADF, Synapse)

Python & PySpark

dbt & Unity Catalog

Docker & Terraform

Domain Knowledge (Finance)

Corporate Finance

Financial Risk Mgmt

Financial Engineering

Econometrics

Banking & Insurance

Certifications

Databricks Certified Generative AI Engineer Associate

Issued February 17, 2026 • Expires February 17, 2028

Design and implement LLM-enabled solutions, RAG applications, and LLM chains using Databricks Vector Search, Model Serving, MLflow, and Unity Catalog.

GenAI | RAG | Vector Search | MLflow | LLM Chains

Databricks Certified Data Engineer Associate

Issued October 31, 2022 • Expired October 31, 2024

Work Experience

Mar 2021 – Present

Data Engineer

Solita | Oulu, Finland
  • Working as a mix of Platform and Data Engineer.
  • Designing and maintaining modern data platforms on Azure.
  • Implementing DataOps practices, streaming solutions, and GDPR compliance workflows.
  • Stack: Databricks, Spark, dbt, Azure Data Factory, Azure DevOps.

Aug 2020 – Dec 2020

Doctoral Student / Researcher

University of Oulu
  • Research in Academic Analytics.
  • Explored Business Intelligence and Data Mining solutions for Higher Education.
  • Teaching assistant for Python basics course.

Jun 2019 – Aug 2020

Research Assistant

University of Oulu
  • Learning Analytics: Data cleaning, feature engineering, and exploratory data analysis.
  • Ad-hoc Analysis: Provided customized reports beyond standard BI tools.
  • Master's Thesis: "Simulation analysis of higher education teaching linked with financing model."

Apr 2018 – Jun 2019

Property Maintenance (Night Shift Street Cleaning)

ISS A/S | Oulu, Finland

Cleaned streets and emptied trash bins during night shifts while others partied. Winter specialty: snow shoveling at 4 AM. Early training for cleaning messy data – though data doesn't freeze. ❄️➡️📊

Get In Touch