AI & Data Engineer  ·  ENSIAS × Télécom Saint-Étienne

Building AI systems enterprises can actually deploy.

I design and ship enterprise RAG pipelines, LLM integrations, and agentic systems — built from the start to run air-gapped, on-prem, and RGPD-compliant.

Hamza OUADID

About Me

I'm an AI & Data Engineer specializing in enterprise AI systems that organizations can confidently deploy and maintain — not just prototype. My work is shaped by real production constraints: tight compliance requirements, existing infrastructure, and teams that need to own the result long-term.

My core focus is enterprise RAG — hybrid retrieval architectures combining BM25, dense embeddings, and reciprocal rank fusion on pgvector, Qdrant, or FAISS — deployed on Kubernetes with full audit trails. I wire LLMs into existing Spring Boot and FastAPI backends via streaming microservice brokers, and build stateful agentic pipelines with LangGraph for complex workflow automation. Evaluation is baked in from day one: RAGAS, TruLens, and custom LLM-as-judge suites run on every deployment so quality regressions never reach users silently.
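As an illustration of the fusion step in that hybrid retrieval setup, here is a minimal sketch of reciprocal rank fusion. It assumes each retriever (BM25 and dense) has already returned a ranked list of document IDs. The function name, the sample IDs, and the constant k = 60 (a commonly used default) are illustrative assumptions, not code from a client project.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed, commonly used default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists from a BM25 index and a dense-embedding index.
bm25_hits = ["doc_a", "doc_b", "doc_c"]
dense_hits = ["doc_c", "doc_a", "doc_d"]
fused = reciprocal_rank_fusion([bm25_hits, dense_hits])
```

Documents that rank well in both lists rise to the top, while a document found by only one retriever still gets a place in the fused ranking. Because the formula uses ranks rather than raw scores, BM25 and cosine-similarity scores never need to be normalized against each other.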

What sets my engagements apart is a focus on constrained environments: air-gapped clusters, Keycloak SSO, vLLM model serving, and zero external data egress, all fully RGPD-compliant. I work fluently in French, English, and Arabic. If you're evaluating production AI and need an engineer who has already solved the hard parts (compliance, integration, evaluation), let's talk. I take on a small number of engagements at a time.

  • 2+ years shipping enterprise AI
  • 12 production AI projects
  • 3 languages: FR · EN · AR
  • #1 at the Orange Pan-African Hackathon 2022

What I Do

  • 01

    Enterprise RAG Systems

    Hybrid BM25 + semantic + RRF pipelines on pgvector, Qdrant, or FAISS — on Kubernetes with auth and audit logging.

  • 02

    LLM Integration into Existing Backends

    LLMs wired into Spring Boot, FastAPI, or Django — streaming, structured output, vLLM — inside your existing auth chain. No rewrite.

  • 03

    Agentic & Multi-Agent Systems

    Stateful LangGraph agents, A2A coordination, and MCP integration for live tool access without context bloat.

  • 04

    LLM Evaluation & Quality Engineering

    RAGAS, TruLens, and custom LLM-as-judge pipelines. Regression suites on every deployment so quality drops are caught before users see them.

  • 05

    Automated Data Ingestion & ETL

    Web-scale document harvesting with delta detection. Structured knowledge bases ready for RAG indexing — no manual intervention.

  • 06

    Constrained & On-Prem Deployment

    Air-gapped Kubernetes, Keycloak SSO, vLLM model serving — RGPD-compliant with zero external data egress.

  • 07

    Multilingual AI Systems (FR / EN / AR)

    Cross-lingual RAG with language detection and multilingual-e5-large embeddings. I build and communicate natively in French, English, and Arabic.

  • 08

    PoC to Production

    From validated prototype to hardened production system — with CI/CD, monitoring, rollback strategies, and documentation a team can own long-term.
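The delta detection mentioned under Automated Data Ingestion & ETL can be sketched with content hashing. This is a simplified illustration, assuming documents are keyed by URL and compared by the SHA-256 of their text. The function name and the sample data are hypothetical, and a real pipeline would persist the hashes between runs.

```python
import hashlib


def detect_changes(previous_hashes, current_docs):
    """Compare freshly crawled documents against the hashes of the last run.

    previous_hashes: dict mapping URL -> SHA-256 hex digest from the last run.
    current_docs:    dict mapping URL -> document text from the current crawl.
    Returns (changed_or_new_urls, removed_urls, new_hashes).
    """
    new_hashes = {
        url: hashlib.sha256(text.encode("utf-8")).hexdigest()
        for url, text in current_docs.items()
    }
    # A URL needs re-indexing if it is new or its content hash differs.
    changed = sorted(u for u, h in new_hashes.items() if previous_hashes.get(u) != h)
    # A URL seen last time but absent now should be dropped from the index.
    removed = sorted(u for u in previous_hashes if u not in new_hashes)
    return changed, removed, new_hashes


# Hypothetical crawl results from two consecutive runs.
_, _, first_run = detect_changes({}, {"/a": "v1", "/b": "v1"})
changed, removed, second_run = detect_changes(first_run, {"/a": "v2", "/c": "v1"})
```

Only the URLs in `changed` need to be re-chunked and re-embedded, which is what keeps a recurring crawl cheap when most documents are unchanged.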

Resume

Experience

  1. AI Engineer | French Healthcare SaaS Company

    2023 — 2025

    Built a production RAG system using hybrid retrieval (BM25 + semantic embeddings + RRF), deployed on on-prem Kubernetes with pgvector and Qdrant: zero external API calls, fully RGPD-compliant.


    Integrated LLM capabilities into an existing Spring Boot backend via a FastAPI microservice broker, handling structured output parsing, streaming, and Keycloak-authenticated tool routing.


    Designed and delivered a RAGAS evaluation pipeline with custom LLM-as-judge metrics and regression test suites running on every deployment to prevent silent quality degradation.

  2. Data Engineer | French Public Institution

    2024

    Designed and implemented a data lake prototype, covering ingestion, transformation, and storage of heterogeneous public datasets.


    Developed a secure REST API and a metadata catalog compliant with the DCAT standard for interoperability.


    Participated in end-to-end validation testing to verify data pipeline correctness and API reliability.

  3. Data Scientist Intern (Computer Vision) | Software Consultancy

    2022

    Pre-processed and feature-engineered a dataset of 500,000 labeled product images for training a retail recognition model.


    Built a CNN capable of identifying supermarket products from a live camera feed, achieving production-grade accuracy on the client's inventory SKU set.


    Deployed the model as a web application (Flask) and a mobile application (Java).
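The regression suites described in the first role above follow a simple gating idea: compare the evaluation scores of a candidate deployment against a stored baseline and block the release if any metric drops too far. The sketch below is a minimal illustration under stated assumptions. The scores are made up, the 0.02 tolerance is arbitrary, and the metric names are only example labels; computing the scores themselves (for instance with RAGAS) is out of scope here.

```python
def quality_gate(baseline, candidate, max_drop=0.02):
    """Return the metrics that regressed by more than max_drop (absolute).

    baseline, candidate: dicts mapping metric name -> score in [0, 1].
    A metric missing from the candidate run is treated as a failure.
    """
    failures = {}
    for metric, base_score in baseline.items():
        new_score = candidate.get(metric)
        if new_score is None or base_score - new_score > max_drop:
            failures[metric] = (base_score, new_score)
    return failures


# Hypothetical scores on a golden test set, before and after a change.
baseline = {"faithfulness": 0.91, "answer_relevancy": 0.88}
candidate = {"faithfulness": 0.90, "answer_relevancy": 0.70}

failures = quality_gate(baseline, candidate)
deploy_allowed = not failures  # a CI job would fail the pipeline when False
```

Here the small dip in the first metric stays within tolerance, while the second metric's drop is reported and blocks the deployment.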

Education

  1. Télécom Saint-Étienne — Double Degree (France)

    2023 — 2025

    Engineering degree in Computer Science, specializing in machine learning, distributed systems, and applied AI. Double-degree program with ENSIAS (Morocco).

  2. ENSIAS — École Nationale Supérieure d'Informatique (Morocco)

    2021 — 2025

    Engineering degree in Smart Systems & Data Science — covering AI fundamentals, statistical learning, NLP, computer vision, and large-scale data architectures.

Achievements

  1. 1st Place — Orange Digital Center Pan-African Mega Hackathon

    2022

    Ranked 1st among competing teams from 20+ countries across Africa. Built and presented a functional AI-driven solution under competition constraints within 48 hours.

My Skills

  • RAG & Retrieval Engineering
    90%
  • Python
    90%
  • LLM Integration & Agentic Systems
    88%
  • Data Engineering & ETL Pipelines
    85%
  • LLM Evaluation (RAGAS / Custom Metrics)
    82%
  • Database Management (PostgreSQL / pgvector)
    80%
  • NLP & Multilingual AI
    80%
  • Docker / Kubernetes
    75%
  • Computer Vision
    75%
  • Deep Learning
    70%

Portfolio

12 production AI systems — from research prototype to hardened enterprise deployment.

  • Enterprise 2023 – 2025

    Production RAG System for Healthcare Knowledge Management

    Clinical staff needed instant access to 50,000+ documents, but no data could leave the premises. Delivered an air-gapped hybrid retrieval system on existing infrastructure — fully RGPD-compliant, zero cloud exposure. Staff now find verified answers in seconds; independent quality benchmarks show 9 in 10 responses are accurate and directly on-point.

    Python · pgvector · Qdrant · K8s · RAGAS · FastAPI
  • Enterprise 2024

    LLM Layer Added to Existing Java Enterprise Backend

    The client wanted AI capabilities but had a stable, years-old Java system that couldn't be rewritten. Delivered a thin microservice that adds a self-hosted AI model to the existing backend — not a single line of legacy code touched. AI features went live in weeks, not quarters, with response times users don't notice.

    FastAPI · vLLM · Spring Boot · Mistral · Keycloak · K8s
  • Engineering 2024

    Automated Web-to-Knowledge-Base Ingestion Pipeline

    A team was manually copying and updating 50,000+ documents per week — hours of analyst time, data always behind reality. Built an automated pipeline that detects only what changed and updates it in real time. Zero manual intervention for 3+ months; analyst time fully redirected to higher-value work.

    crawl4ai · Python · PostgreSQL · Airflow · Docker
  • Engineering 2024

    AI Evaluation System: Automated Quality Gates on Every Deployment

    An AI system had silently degraded by 23% over six weeks — no one knew until a user complained. Designed an automated quality gate that validates every deployment against a golden test set before it reaches users. The 23% regression was the first thing it caught; it has caught every regression since.

    RAGAS · TruLens · LangChain · Python · PostgreSQL
  • Engineering 2024

    Trilingual RAG Assistant — FR / EN / AR

    A multilingual organization needed their AI assistant to perform equally in French, English, and Arabic — most commercial solutions fail on Arabic retrieval entirely. Built language-aware chunking with multilingual embeddings designed for cross-lingual search. Retrieval accuracy above 87% across all three languages, a level most tools don't reach in a single language.

    LangChain · pgvector · multilingual-e5 · FastAPI · RAGAS
  • Engineering 2024

    AI Assistant with Live Access to Internal Developer Tools

    Developer teams were spending 30–60 minutes per task searching GitLab, Jira, and Confluence before they could start building. Delivered an AI assistant with live, role-scoped access to all three — answers reflect the current state of the system, not last night's index. Context retrieval now takes seconds; no stale data, no unauthorized access.

    MCP · Open WebUI · Python · GitLab API · Jira API
  • Engineering 2024

    Legacy ETL Modernization — 93% Failure Rate Reduction

    Pipelines failing 15% of the time were cascading into reporting errors and costing the analytics team days of debugging every week. Replaced fragile legacy scripts with modular, monitored pipelines built for reliability. Failure rate: 15% → under 1%. Onboarding a new data source: 3 days → 2 hours.

    Apache Hop · PostgreSQL · Python · Docker
  • Consulting 2024

    AI Model Selection Benchmark for a Six-Figure Infrastructure Decision

    A client needed to choose which open-source AI model to deploy on-premises — a significant infrastructure investment with no objective basis for comparison. Ran a structured benchmark across 500 domain-specific prompts, measuring output quality, response latency, and GPU cost simultaneously. Delivered a clear, defensible recommendation with a deployment-ready setup included — zero additional integration work required.

    vLLM · Python · K8s · Helm · RAGAS
  • Engineering 2024

    Multi-Tenant AI Platform — 12 Clients, Zero Cross-Tenant Data Risk

    A SaaS provider needed to serve 12 enterprise clients on one AI platform with absolute assurance that no client could see another's documents. Built logically isolated environments per tenant on shared hardware, enforced at the access layer — no separate infrastructure per client. 12 tenants live on modest hardware; new client onboarding documented at under 10 minutes.

    pgvector · Keycloak · K8s · LangChain · FastAPI
  • Engineering 2024

    Multi-Agent Research Synthesizer — 6h of Analyst Work in 8 Minutes

    Analyst teams were spending 6 hours per research synthesis task — gathering, cross-referencing, and summarizing sources by hand. Built a four-stage agent pipeline that mirrors the analyst's workflow: plan the inquiry, retrieve relevant evidence, critique gaps, then compose the output. Same task now completes in under 8 minutes; analyst time redirected to judgment and client communication.

    LangGraph · LangChain · Python · pgvector · FastAPI
  • Engineering 2024

    Autonomous Contract Processing — 4h Daily Workload to 20 Minutes

    A legal team's daily routine included 4 hours of manually classifying contracts across 12 categories — high cost, error-prone, and impossible to scale. Deployed an autonomous agent that reads, classifies, and routes contracts, flagging only low-confidence cases for human review. Classification accuracy: 96%. Daily processing time: 4 hours → 20 minutes. Estimated annual saving: ~800 paralegal hours.

    LangGraph · Python · FastAPI · Tesseract OCR · PostgreSQL
  • Enterprise 2024 – 2025

    On-Prem Clinical Documentation Assistant — RGPD-Compliant

    Clinical staff were losing significant time to note drafting, but strict data residency rules made every commercial AI tool off-limits. Deployed a self-hosted language model on the client's own hardware — zero data ever leaves the clinical network, fully RGPD-compliant. 35% reduction in note drafting time across the 30-user pilot; approximately 2,000 hours saved annually.

    vLLM · Mistral · FastAPI · Keycloak · K8s · PostgreSQL
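The human-in-the-loop pattern from the contract processing project, where only low-confidence cases are flagged for review, can be sketched as a simple threshold router. This is an illustrative simplification: the `Prediction` type, the 0.85 threshold, and the sample values are assumptions, and in practice the threshold would be tuned on labeled data.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    contract_id: str
    category: str
    confidence: float  # classifier confidence in [0, 1]


def route(predictions, threshold=0.85):
    """Split predictions into auto-routed and human-review queues."""
    auto, review = [], []
    for p in predictions:
        # Confident predictions are routed automatically; the rest go to a person.
        (auto if p.confidence >= threshold else review).append(p)
    return auto, review


# Hypothetical classifier outputs for three contracts.
batch = [
    Prediction("c-001", "NDA", 0.97),
    Prediction("c-002", "Lease", 0.62),
    Prediction("c-003", "Employment", 0.85),
]
auto_routed, needs_review = route(batch)
```

Raising the threshold sends more contracts to human review and lowers the risk of a misrouted document, so the value is a trade-off between saved time and acceptable error.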

Contact

Send a message