Daniel Siegel – Data Engineering Consultant
Available for Consulting

Dan Siegel

I build high-performance data systems and write efficient, custom catalog extensions. From managing complex pipelines to executing Redshift-to-Snowflake migrations that slash compute bills by 40%, I bridge engineering depth with business delivery.

HIGH-AVAILABILITY DATA SYSTEMS
ACTIVE SLA
SLA TARGET ACHIEVED
Uptime. Your data doesn't sleep, and neither do my pipelines.

Services

Cloud & Infrastructure Architecture

Production-grade cloud setups and automation. Designing highly secure, self-healing platforms with built-in cost optimization.

  • GitOps delivery (Flux, Helm, Kustomize)
  • Production Kubernetes (k8s) & cloud security
  • Snowflake warehouse resizing & cost audits

Resilient & High-Performance Pipelines

Developing low-latency event-driven streams, real-time message routing, and reliable anomaly triage. Building hardened, self-healing data delivery paths for mission-critical operations.

  • Low-latency Go WebSocket snipers & NATS JetStream
  • Fault-tolerant Apache Flink & Apache Kafka systems
  • Hardened, self-healing CDC (Debezium, Kafka)

Analytics Engineering & DS/MLOps

Bridging the gap from raw database schemas to production ML models. Transforming chaotic data pools into optimized Kimball star schemas, while engineering low-latency feature stores and model registries (MLflow) to deploy model predictions at scale.

  • Dimensional modeling & lakehouse design (Snowflake, Databricks)
  • MLOps feature engineering & model lifecycles (MLflow, Spark)
  • Metric stores, semantic layers, and automated BI enablement
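
As a rough illustration of the dimensional modeling step, the sketch below splits flat order rows into a customer dimension with surrogate keys and a fact list that references them. The column names (`customer_email`, `amount`) and the `build_star` helper are hypothetical, chosen only for this example; a real warehouse build would normally do this in SQL or dbt rather than in application code.

```python
from typing import Iterable


def build_star(rows: Iterable[dict]) -> tuple[dict[str, int], list[dict]]:
    """Split flat order rows into a customer dimension and an order fact list."""
    customer_dim: dict[str, int] = {}  # natural key (email) -> surrogate key
    order_facts: list[dict] = []
    for row in rows:
        email = row["customer_email"]
        # Assign the next surrogate key the first time a customer appears.
        customer_sk = customer_dim.setdefault(email, len(customer_dim) + 1)
        order_facts.append({"customer_sk": customer_sk, "amount": row["amount"]})
    return customer_dim, order_facts


if __name__ == "__main__":
    flat_rows = [
        {"customer_email": "a@example.com", "amount": 10.0},
        {"customer_email": "b@example.com", "amount": 5.0},
        {"customer_email": "a@example.com", "amount": 7.5},
    ]
    dim, facts = build_star(flat_rows)
    print(dim)
    print(facts)
```

Keeping the natural key only in the dimension means the fact rows stay narrow and a customer attribute change touches one dimension row instead of every order.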

Agile Program Delivery & Architecture Audits

Aligning data engineering strategy with business execution. Auditing cloud architectures, rescuing high-risk projects, and running end-to-end software procurement, vendor selection, and SOW scoping pipelines.

  • High-risk project rescue & migration recovery
  • End-to-end technology procurement & SOW/RFP pipelines
  • Enterprise architecture audits & cost-benefit reviews
  • Agile/Scrum team governance & roadmap velocity alignment

Projects & Writings

Case studies, C++ extensions, and real-time streaming architectures. Read my detailed engineering posts on Substack.

C++ · DuckDB Extension · Snowflake · DuckLake · PostgreSQL

DuckSync (DuckDB Snowflake Caching)

Why do dashboards query static Snowflake datasets hundreds of times a day and burn unnecessary compute? I built DuckSync, a C++ DuckDB community extension, to intercept SQL queries, rewrite AST paths, analyze Snowflake table metadata (last_altered), and cache rows locally as Parquet files managed by a Postgres catalog. This keeps the Snowflake data warehouse asleep, drastically slashing compute costs.

Core Integration: C++ AST rewrite, PostgreSQL, DuckLake Parquet store
Extension Command: INSTALL ducksync FROM community;
DuckDB shell
-- Install and load the community extension
INSTALL ducksync FROM community;
LOAD ducksync;
-- Query Snowflake tables with transparent caching
SELECT * FROM snowflake_db.sales.monthly_revenue;
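
The caching decision described above (compare Snowflake's `last_altered` against what was recorded when the table was cached) can be sketched roughly as follows. This is a minimal illustration in Python, not DuckSync's actual C++ implementation: the `CacheEntry` fields, the `plan_query` function, and the in-memory dict standing in for the Postgres catalog are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class CacheEntry:
    """Hypothetical catalog row describing one locally cached table."""
    table: str
    parquet_path: str
    source_last_altered: datetime  # last_altered value seen when the cache was written


def plan_query(table: str,
               current_last_altered: datetime,
               catalog: dict[str, CacheEntry]) -> str:
    """Return 'cache' if the local copy is still valid, otherwise 'refresh'."""
    entry: Optional[CacheEntry] = catalog.get(table)
    if entry is None:
        return "refresh"  # table has never been cached
    if current_last_altered > entry.source_last_altered:
        return "refresh"  # source table changed after the snapshot was taken
    return "cache"


if __name__ == "__main__":
    snapshot_time = datetime(2024, 1, 1, tzinfo=timezone.utc)
    catalog = {
        "sales.monthly_revenue": CacheEntry(
            table="sales.monthly_revenue",
            parquet_path="/tmp/monthly_revenue.parquet",  # hypothetical path
            source_last_altered=snapshot_time,
        )
    }
    # Unchanged source: serve from the local Parquet file.
    print(plan_query("sales.monthly_revenue", snapshot_time, catalog))
    # Source altered a day later: refresh from Snowflake first.
    later = datetime(2024, 1, 2, tzinfo=timezone.utc)
    print(plan_query("sales.monthly_revenue", later, catalog))
```

Under this scheme the warehouse only has to run a query when the cached copy is missing or stale, which is where the compute savings would come from. How the metadata lookup itself is performed is not shown here.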
Go · Python · NATS JetStream · PostgreSQL · Anomaly Detection

WhaleDoxer (Real-time Prediction Market Tracker)

A real-time paper-trading and suspicion engine designed to monitor information asymmetry and anomalies in Polymarket and Kalshi order books. It runs a sub-millisecond hot path using a lightweight Go sniper node (8µs p95 latency), streams raw events through NATS JetStream, and routes tripwires to Python forensics tasks executing wallet identity tracking and composite Brier scoring.

Hot-Path Latency: 8 microseconds (Go Sniper pod)
Data Architecture: PostgreSQL-first Kappa with NATS JetStream fanout
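
For readers unfamiliar with Brier scoring, the sketch below computes the standard binary Brier score: the mean squared difference between forecast probabilities and realized 0/1 outcomes, where lower is better. How WhaleDoxer combines scores into a "composite" is not described here, so this shows only the basic building block; the `brier_score` function name and the sample numbers are illustrative assumptions.

```python
def brier_score(forecasts: list[float], outcomes: list[int]) -> float:
    """Mean squared error between forecast probabilities and binary outcomes."""
    if not forecasts or len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must be non-empty and equal length")
    squared_errors = [(f - o) ** 2 for f, o in zip(forecasts, outcomes)]
    return sum(squared_errors) / len(squared_errors)


if __name__ == "__main__":
    # Three hypothetical market forecasts and how each market resolved.
    forecasts = [0.9, 0.2, 0.6]
    outcomes = [1, 0, 1]
    # (0.01 + 0.04 + 0.16) / 3, roughly 0.07
    print(brier_score(forecasts, outcomes))
```

A forecaster who always says 0.5 scores 0.25, so a wallet that consistently scores well below that on resolved markets is predicting better than a coin flip.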
dbt · Apache Airflow · Python · Salesforce API

Salesforce Reverse ETL Framework

A cost-efficient, open-source alternative to expensive commercial Reverse ETL tools. Built by extending Apache Airflow operators and embedding dbt SQL validation models to sync clean warehouse records directly to Salesforce accounts and leads, avoiding high SaaS licensing fees and platform lock-in.
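
The validate-then-sync idea can be sketched in plain Python as below. This is a simplified stand-in, not the framework's code: in the framework the validation lives in dbt SQL models and the sync runs inside Airflow operators, whereas here a hypothetical `is_valid` check and `batched_valid_records` generator play those roles. The required field names and the batch size of 200 are assumptions; check the limits of whichever Salesforce API you call.

```python
from typing import Iterable, Iterator

REQUIRED_FIELDS = ("Id", "Email")  # hypothetical required columns
BATCH_SIZE = 200  # assumed batch size, not taken from the framework


def is_valid(record: dict) -> bool:
    """A record is syncable only if every required field is present and non-empty."""
    return all(record.get(field) not in (None, "") for field in REQUIRED_FIELDS)


def batched_valid_records(records: Iterable[dict],
                          batch_size: int = BATCH_SIZE) -> Iterator[list[dict]]:
    """Drop invalid records and yield the rest in fixed-size batches."""
    batch: list[dict] = []
    for record in records:
        if not is_valid(record):
            continue  # skip rows that fail validation instead of sending them
        batch.append(record)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final partial batch


if __name__ == "__main__":
    rows = [
        {"Id": "001", "Email": "a@example.com"},
        {"Id": "002", "Email": ""},  # fails validation
        {"Id": "003", "Email": "c@example.com"},
    ]
    for batch in batched_valid_records(rows, batch_size=2):
        print(batch)  # each batch would be handed to the Salesforce client
```

Filtering before batching keeps bad rows out of the API call entirely, so one malformed record cannot fail a whole batch.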

Experience

Senior Data Engineer
Quanata
Martinsburg, WV (Remote) • 06/2022 – Present
  • Architected and optimized data models that reduced processing costs by 40% and increased query performance; the migration from Redshift to Snowflake enabled BI teams to deliver insights 30% faster.
  • Led data infrastructure modernization using Terraform and Kubernetes, cutting deployment time from days to hours and improving time-to-market by 50%.
  • Designed and implemented robust data pipelines that automated 95% of manual processes, ensuring 99.9% data reliability while reducing operational overhead by 60%.
  • Partnered with teams to build reverse ETL paths synchronizing analytics and operational systems, increasing customer acquisition metrics by 25%.
Senior / Lead Data Engineer
Consumer Affairs
Roanoke, VA (Remote) • 01/2021 – 06/2022
  • Managed a team of 3 data engineers, delivering 100% of roadmap deliverables on time and raising team velocity by 35%.
  • Automated 80% of manual pipelines and designed infrastructure for 3x scalability, generating $200K in annual cloud infrastructure cost savings.
  • Constructed data solutions leveraging AWS Athena, Glue, Talend, Python, Databricks, Spark, and dbt to process 15+ data sources, cutting ingestion latency by 60%.
Data Engineer
Carilion Clinic
Roanoke, VA • 10/2019 – 01/2021
  • Built robust ETL/ELT flows across 8+ data platforms (Netezza, Oracle, SQL Server, Snowflake, APIs), raising data validity to 99.5% and cutting integration times by 50%.
  • Optimized hospital analytical processes, automating 70% of manual tasks and generating $150K in annual operational savings.
Project Manager / Product Owner
Cities Digital
Portland, OR (Remote) • 04/2016 – 06/2018
  • Developed statement-of-work parameters and business requirements, and integrated automated BI dashboards backed by SQL warehouses.
  • Served as Agile coach and Scrum product owner for development sprints.
Implementation Consultant
ADP
Clackamas, OR • 06/2015 – 04/2016
  • Oversaw and supported implementations of SaaS HCM software for new clients and existing-client upsells.
  • Controlled project requirements, scope, and change management to ensure on-time achievement of project milestones and deliverables.
  • Ranked #1 among more than 500 peers within the market segment, at 177% of plan year-to-date at time of departure.
Project Specialist
Walmart eCommerce
San Bruno, CA • 10/2012 – 05/2015
  • Created and managed the contingent workforce process, including submissions, onboarding/offboarding, and vendor management policies, and played a key role in the MSP implementation.
  • Leveraged Agile and Kanban methodologies and developed recruitment analytics for risk evaluation, resource forecasting, and velocity optimization.
  • Controlled the document registry and project timelines by implementing collaboration platforms including Confluence and SharePoint.
🕰️ Early Career & Internships (2008 – 2012)
  • Human Resources Scheduler (Contract) • Ask.com, Oakland CA (5/2012 – 10/2012)
  • Recruiting Coordinator (Contract) • P.C.A.O.B., Washington DC (5/2011 – 8/2011)
  • Analyst (Contract) • Thomson Reuters, Washington DC (1/2011 – 5/2011)
  • Finance Assistant • Congressman Bill Foster, Batavia IL (5/2010 – 11/2010)
  • Policy Assistant (Contract) • GLSEN, Washington DC (1/2010 – 4/2010)
  • Intern • Steve Shannon for Attorney General, Fairfax VA (8/2009 – 11/2009)
  • Intern • The White House, CEQ, Washington DC (2/2009 – 5/2009)
  • Intern • Congresswoman Bordallo, Washington DC (9/2008 – 12/2008)

Technical Skills

Programming Languages

Python, C++, Go, SQL, JavaScript, C#, R, Scala, Shell Scripting, HTML/CSS

Data & Analytics Platforms

Snowflake, DuckDB, dbt, AWS (Glue/Athena/S3), Apache Airflow, NATS JetStream, Databricks, Spark, Redshift, PostgreSQL, Netezza, Oracle, PowerBI, Tableau, Salesforce

Data Science & DevOps

Pandas, NumPy, Scikit-Learn, Keras, NLTK, Kubernetes, Docker, Terraform, Git/GitHub, Agile/Scrum, PMP Methodology

Certifications & Education

MS in Data Science
Bellevue University • 2018 – 2019 • GPA: 4.0/4.0
BA in Political Science (Minor Psychology)
American University
Project Management Professional (PMP)
Project Management Institute (PMI), 2015
Certified Scrum Master (CSM)
Scrum Alliance, 2015
Certified Scrum Product Owner (CSPO)
Scrum Alliance, 2015
Business Process Management Specialist (BPMS)
AIIM, 2016
AWS Certified Cloud Practitioner
Amazon Web Services (AWS), 2018
SAFe® 4 Agilist
Scaled Agile Inc, 2019
AWS Solutions Architect - Associate
Amazon Web Services (AWS), 2019

Discussing a project or infrastructure review?

I combine backend engineering experience with project management delivery. Let's optimize your data systems, warehouse compute overhead, or real-time pipelines.

Get in Touch