Built and scaled data platforms for analytics and near real-time decisioning.
About
Aleksandr Andreev
Lead Data Engineer with 9+ years building streaming, analytics and lakehouse platforms at production scale.
I work across the full data platform surface area: streaming pipelines, batch compute, table formats, orchestration, data quality and the tooling that teams need to move faster without losing discipline.
In recent years I have also been building internal AI tooling, especially review and knowledge-assist systems grounded in real documentation rather than generic prompts.
This site is both a portfolio and a writing lab: essays, compact notes and case studies about systems that are interesting because they have trade-offs.
Highlights
Hands-on with Kafka, Flink, Spark, Airflow, dbt, Iceberg, Trino and Python.
Experienced in leading delivery, mentoring engineers and tightening engineering standards.
Experience
Lead Data Engineer
AlfaStrakhovanie
- Led design of real-time claims pipelines on Kafka, Flink and Iceberg for high-throughput workloads.
- Drove migration of analytical workloads toward open lakehouse patterns with Trino, dbt and Iceberg.
- Built an LLM-assisted merge request review workflow adopted by multiple teams.
Senior Data Engineer
Large financial services company
- Built and operated Spark-based ETL pipelines across multiple upstream systems.
- Standardized orchestration patterns in Airflow and improved observability for data SLAs.
- Improved query performance through better file layout, partitioning and columnar storage practices.
Data Engineer / Data Analyst
Earlier analytics and data roles
- Moved from SQL-heavy analytics toward Python and distributed data engineering.
- Built early event pipelines and learned how platform decisions affect downstream teams.
Selected skills
Kafka, Kafka Streams, Flink
Spark, Airflow, dbt
Iceberg, Parquet, ClickHouse, PostgreSQL
Trino, DuckDB, SQL optimization
Kubernetes, Docker, Terraform, GitLab CI
Claude API, RAG, Qdrant, internal developer tools
How I work