AI platform for personalized product recommendation

Personalizing eCommerce discovery with AI recommendations, improving shopper engagement by 40%


Project highlights

Industry: eCommerce, Retail

Client services: AI Product Development

Started in 2024

Location: United States

Team size: 4 members

Duration: 6 months

About the client

A U.S.-based eCommerce platform in the outdoor/sporting-goods category.

The company helps consumers discover and purchase gear that fits their needs without spending hours scrolling through product catalogs.

Business challenges

The client’s AI-powered catalog and conversation experience limited product discovery and conversion.

Pain points included:

Duplicate SKUs from color/size variants inflating catalog “noise”
Inconsistent vectorization across sources reducing semantic match accuracy
Weak categorization/labeling that constrained facet filtering and recommendations
Unclear scraping scope creating unstable upstream data
Limited dialog (few clarifying questions, flat result lists) that increased abandonment
Low explainability/determinism that eroded user trust in RAG-style answers

Goals set for Achievion

The engagement’s goal was to deliver explainable, high-relevance product recommendations through a conversational assistant that reasons over a clean, consistent catalog.

Success criteria:

Consolidate variants to reduce duplication and sharpen embeddings
Standardize attributes and categories to enable meaningful filters and comparisons
Improve retrieval quality and dialog flow so users get to a confident choice faster
Lay a scalable foundation that can support growth and potential white-label deployment without re-architecture

The north star was a shopping experience that feels guided, transparent, and predictable – where users understand why an item is recommended and can easily refine the path to purchase.

Solution

We designed a normalization pipeline that consolidates product variants (e.g., color/size) under a canonical item, reducing duplication and clarifying attribution. This improves embedding quality by focusing signals on the product, not fragmenting them across near-identical entries. A consistent attribute schema (categories → subcategories → features) increases the precision of both retrieval and filtering.
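As a rough illustration of the consolidation step, the sketch below groups raw SKU rows under one canonical item. The field names (brand, model, color, size, sku), the (brand, model) grouping key, and the sample rows are assumptions made for this example, not the client's actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class CanonicalItem:
    key: tuple                      # hypothetical identity: (brand, model)
    title: str
    colors: set = field(default_factory=set)
    sizes: set = field(default_factory=set)
    skus: list = field(default_factory=list)


def consolidate_variants(rows):
    """Group raw SKU rows into one canonical item per (brand, model)."""
    items = {}
    for row in rows:
        key = (row["brand"].strip().lower(), row["model"].strip().lower())
        item = items.get(key)
        if item is None:
            item = CanonicalItem(key=key, title=f'{row["brand"]} {row["model"]}')
            items[key] = item
        # Variant details are kept as attributes instead of separate entries.
        item.colors.add(row["color"].lower())
        item.sizes.add(row["size"].upper())
        item.skus.append(row["sku"])
    return list(items.values())


def embedding_text(item):
    """Text to embed: product-level only, so variants do not split the signal."""
    return item.title


# Illustrative data only.
rows = [
    {"sku": "A1", "brand": "Trailhead", "model": "Daypack 22", "color": "Red", "size": "m"},
    {"sku": "A2", "brand": "Trailhead", "model": "Daypack 22", "color": "Blue", "size": "l"},
    {"sku": "B1", "brand": "Summit", "model": "Expedition 70", "color": "Black", "size": "l"},
]
catalog = consolidate_variants(rows)
```

Here three raw rows collapse into two canonical items, and the text passed to the embedding model describes the product once rather than once per color/size combination.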

To raise recommendation confidence, we combined accurate vector search with a deterministic query-builder path for scenarios that require traceability (e.g., strict attribute filters, exclusions, inventory constraints). The assistant can favor semantic retrieval for discovery and pivot to rule/SQL filtering when users need precise control – maintaining performance and transparency in both modes.
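One way to picture the two retrieval paths is the minimal sketch below. It assumes a toy in-memory catalog, three-dimensional stand-in embeddings, and an SQLite table; the names retrieve, ALLOWED_FILTERS, and the filter keys are hypothetical, and a production system would use a real vector database and embedding model.

```python
import math
import sqlite3

# Illustrative catalog: (id, name, weight_g, price_usd, in_stock, toy embedding)
PRODUCTS = [
    (1, "Ultralight Daypack", 450, 89.0, 1, [0.9, 0.1, 0.0]),
    (2, "Expedition Pack", 2100, 329.0, 1, [0.1, 0.9, 0.2]),
    (3, "Trail Running Vest", 300, 120.0, 0, [0.8, 0.0, 0.3]),
]

# Whitelisted filter fragments; user values are only ever bound as parameters.
ALLOWED_FILTERS = {
    "max_weight_g": "weight_g <= ?",
    "max_price_usd": "price_usd <= ?",
    "in_stock": "in_stock = ?",
}


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def semantic_search(query_vec, k=2):
    ranked = sorted(PRODUCTS, key=lambda p: cosine(query_vec, p[5]), reverse=True)
    return [p[0] for p in ranked[:k]]


def build_db():
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE products "
        "(id INTEGER, name TEXT, weight_g INTEGER, price_usd REAL, in_stock INTEGER)"
    )
    db.executemany("INSERT INTO products VALUES (?, ?, ?, ?, ?)", [p[:5] for p in PRODUCTS])
    return db


def deterministic_search(db, filters):
    clauses, params = [], []
    for name, value in filters.items():
        clauses.append(ALLOWED_FILTERS[name])  # unknown filter names raise KeyError
        params.append(value)
    sql = "SELECT id FROM products"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    sql += " ORDER BY id"
    return [row[0] for row in db.execute(sql, params)], sql


def retrieve(db, query_vec=None, filters=None):
    """Use the traceable SQL path when strict filters are present, else semantic search."""
    if filters:
        ids, sql = deterministic_search(db, filters)
        return {"mode": "deterministic", "ids": ids, "trace": sql}
    return {"mode": "semantic", "ids": semantic_search(query_vec), "trace": "cosine top-k"}


db = build_db()
discovery = retrieve(db, query_vec=[1.0, 0.0, 0.0])
strict = retrieve(db, filters={"max_weight_g": 500, "in_stock": 1})
```

The deterministic branch returns the exact SQL it ran, which is what makes that path auditable; the semantic branch trades that traceability for flexibility in open-ended discovery.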

We also reworked the conversational flow to elicit clarifying preferences (intended use, conditions, fit, budget) early, then present results in cluster-aware groupings (e.g., “lightweight daypacks,” “ultra-durable expedition packs”) with short rationales (“recommended for X terrain; weight ≤ Y; fits torso length Z”). This turns a flat list into explainable sets, helping users understand trade-offs and refine quickly.
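The grouped presentation with rationales might look roughly like the sketch below. The product fields, the cluster labels attached to each product, and the preference keys (terrain, max_weight_g, budget_usd) are illustrative assumptions; in practice clusters could come from an upstream clustering step and the wording from a templating or LLM layer.

```python
from collections import defaultdict


def rationale(product, prefs):
    """Build a short 'why recommended' string from attributes the user can see."""
    reasons = []
    if prefs.get("terrain") in product["terrains"]:
        reasons.append(f"recommended for {prefs['terrain']} terrain")
    if "max_weight_g" in prefs and product["weight_g"] <= prefs["max_weight_g"]:
        reasons.append(f"weight {product['weight_g']} g <= {prefs['max_weight_g']} g")
    if "budget_usd" in prefs and product["price_usd"] <= prefs["budget_usd"]:
        reasons.append(f"within ${prefs['budget_usd']} budget")
    # Fallback text when no stated preference matches (hypothetical wording).
    return "; ".join(reasons) or "closest match to your description"


def group_results(products, prefs):
    """Turn a flat result list into named groups, each entry with its rationale."""
    groups = defaultdict(list)
    for p in products:
        groups[p["cluster"]].append({"name": p["name"], "why": rationale(p, prefs)})
    return dict(groups)


# Illustrative data only.
products = [
    {"name": "Ultralight Daypack", "cluster": "lightweight daypacks",
     "terrains": {"trail"}, "weight_g": 450, "price_usd": 89.0},
    {"name": "Expedition Pack", "cluster": "ultra-durable expedition packs",
     "terrains": {"alpine"}, "weight_g": 2100, "price_usd": 329.0},
]
prefs = {"terrain": "trail", "max_weight_g": 600, "budget_usd": 150}
grouped = group_results(products, prefs)
```

Because every reason is derived from a stored attribute and a stated preference, each explanation can be checked against what the shopper sees on the product page.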

Trace logs, retrieval/ranking telemetry, and conversation analytics enable continuous tuning of prompts, filters, and vectors. Explanations are generated in plain language and tied to visible attributes so recommendations are auditable. The system’s componentized services (catalog integration, scraping, dialog manager, retrieval/rules engine) support iterative changes without destabilizing the whole stack.

Business outcome

The upgraded assistant delivers a cleaner, more trustworthy discovery experience, translating into faster decisions and higher session completion. Users see duplicate-free, well-explained recommendations that match their intent, reducing choice overload and improving confidence at the moment of selection. Clarifying prompts and cluster-based presentation help shoppers converge on the right product without back-and-forth searching, shortening the path from browse to buy.

Operationally, a structured, de-duplicated catalog lowers maintenance overhead and stabilizes upstream data flow. Normalized attributes and consistent categories improve search and filtering, reduce manual corrections, and make it easier to onboard new inventory or vendors without rework. The hybrid retrieval approach (semantic + deterministic) provides predictability when precision matters (e.g., strict specs, exclusions), while preserving the flexibility needed for discovery.

From a product and analytics standpoint, observability is built in. Telemetry on retrieval quality, ranking behavior, and conversation flow supports continuous tuning of prompts, filters, and embeddings. Explanations are generated in plain language and tied to visible attributes, improving auditability and trust for both users and internal stakeholders. The system now supports A/B experimentation on dialog and ranking strategies, enabling data-driven iteration rather than one-off changes.

Commercially, the platform is positioned for scalable growth and new monetization paths. The componentized architecture supports white-label distribution and partner integrations without full re-architecture, and the improved explainability opens opportunities for sponsored placements that remain transparent and user-centric. Together, these outcomes align user experience, catalog integrity, and engineering scalability – supporting measurable gains in engagement and conversion while keeping long-term cost and complexity in check.

Timeline

Q4 2024

Discovery & Foundations

Stakeholder workshops to define success measures (relevance, conversion lift, time-to-choice) and guardrails (explainability, determinism, privacy)
Current-state audit of catalog sources, scraping coverage, duplication patterns, and attribute gaps; draft the canonical product/variant model
Technical spikes for vector store, rules/SQL path, and dialog manager; decide on service boundaries and integration points

Deliverables: Discovery report, canonical data model (v1), success metrics baseline, architecture blueprint & PoC plan

Q1 2025

Data Normalization & Catalog Integrity

Build normalization pipeline to consolidate variants (color/size) into canonical items; implement attribute schema and category hierarchy
Stand up ingestion & enrichment services; establish quality gates (missing attributes, invalid ranges, dedupe thresholds)
Backfill embeddings with improved signals; measure retrieval quality vs. pre-normalization baseline

Deliverables: Clean catalog (pilot categories), dedupe metrics, attribute completeness dashboard, embedding refresh (v1)
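The quality gates from this phase (missing attributes, invalid ranges, dedupe thresholds) can be sketched as simple checks. The required fields, the numeric ranges, and the 0.85 title-similarity threshold below are assumptions chosen for illustration, and the standard-library SequenceMatcher stands in for whatever similarity measure the pipeline actually uses.

```python
from difflib import SequenceMatcher

# Hypothetical gate configuration.
REQUIRED = ("category", "weight_g", "price_usd")
RANGES = {"weight_g": (1, 50_000), "price_usd": (0.01, 10_000)}
DEDUPE_THRESHOLD = 0.85


def check_quality(item):
    """Return a list of gate violations for one catalog item (empty list = pass)."""
    issues = []
    for attr in REQUIRED:
        if item.get(attr) in (None, ""):
            issues.append(f"missing:{attr}")
    for attr, (low, high) in RANGES.items():
        value = item.get(attr)
        if value not in (None, "") and not (low <= value <= high):
            issues.append(f"out_of_range:{attr}")
    return issues


def is_probable_duplicate(title_a, title_b, threshold=DEDUPE_THRESHOLD):
    """Flag two titles as likely variants of one product by string similarity."""
    ratio = SequenceMatcher(None, title_a.lower(), title_b.lower()).ratio()
    return ratio >= threshold


# Illustrative data only.
good = {"category": "packs", "weight_g": 450, "price_usd": 89.0}
bad = {"category": "", "weight_g": -5, "price_usd": 89.0}
good_issues = check_quality(good)
bad_issues = check_quality(bad)
```

Items that fail a gate can be routed to review rather than embedded, so bad records do not degrade retrieval quality downstream.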

Q2 2025

Retrieval & Assistant MVP

Implement hybrid retrieval (semantic + deterministic filters) with clear fallbacks for strict/spec-driven queries
Design conversational flows to capture intent (use, conditions, fit, budget) and present cluster-based results with short rationales
Add explanation UX (plain-language “why recommended” tied to visible attributes) and quick refiners

Deliverables: Assistant MVP in staging, top-journey flows, explanation components, initial KPIs (relevance, abandonment, time-to-choice)

Team

Product Manager

AI Solutions Architect

LLM/NLP Engineer

Data Engineer

Tech Stack

Technologies:

LLM orchestration

Vector database for semantic retrieval

Rule/SQL filters for deterministic querying

Integrations:

Product catalog

Scraping pipeline

Dialog manager

Observability/Trace logs

Deployment:

Service-oriented components
