Same strain. Dozens of names. One canonical record.
A single cannabis SKU appears as "Blue Dream 3.5g" in one POS, "BLUE DREAM - Eighth" in e-commerce, and "BluDream 1/8oz" in a brand’s inventory. Multiply that across thousands of SKUs and dozens of data sources. Fuzzy resolves all of them automatically, and every human correction makes the next match more accurate.
"Blue Dream 3.5g" · "BLUE DREAM - Eighth" · "BluDream 1/8oz" · Cookies
➜
Blue Dream · Flower · 3.5g · Cookies
The Problem
Strain variants are everywhere. “GSC”, “Girl Scout Cookies”, and “GSC by Cookies” are the same strain — your POS doesn’t know that.
Weight formats are chaos. Eighth, 3.5g, 1/8 oz, 3.5 grams — every source uses a different convention.
Every new retailer compounds the mess. Each dispensary, brand, or distributor feed brings its own naming variants, so duplicates multiply with every source you add.
Dirty data blocks analytics. Ad targeting, market share, and purchasing insights all break without a clean product graph.
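To make the weight-format problem concrete, here is a minimal sketch of weight normalization in Python. It is an illustration only, not Fuzzy's actual implementation: the function name `normalize_weight`, the slang table, and the 28 g retail ounce are all assumptions.

```python
import re
from fractions import Fraction

# Assumption: retail convention where an ounce is sold as 28 g,
# so 1/8 oz is exactly 3.5 g.
GRAMS_PER_OUNCE = 28.0

# Hypothetical slang table; a real deployment would make this configurable.
NAMED_WEIGHTS = {"eighth": 3.5, "quarter": 7.0, "ounce": 28.0}

# A number (fraction or decimal) followed by a gram or ounce unit.
_WEIGHT = re.compile(
    r"(\d+/\d+|\d+(?:\.\d+)?)\s*(grams?|g|ounces?|oz)\b", re.IGNORECASE
)


def normalize_weight(text):
    """Return the weight in grams found in a product string, or None."""
    match = _WEIGHT.search(text)
    if match:
        amount = float(Fraction(match.group(1)))  # handles "1/8" and "3.5"
        unit = match.group(2).lower()
        grams = amount * GRAMS_PER_OUNCE if unit.startswith("o") else amount
        return round(grams, 2)
    # No numeric weight: fall back to slang names such as "Eighth".
    lowered = text.lower()
    for name, grams in NAMED_WEIGHTS.items():
        if re.search(rf"\b{name}\b", lowered):
            return grams
    return None
```

Under these assumptions, "Blue Dream 3.5g", "BLUE DREAM - Eighth", "BluDream 1/8oz", and "Blue Dream 3.5 grams" all normalize to 3.5, while a string with no weight returns None.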
Self-improving. Every human correction trains the system. The same match is never decided twice.
Fully explainable. See exactly why every match was made — confidence scores, tier breakdown, reasoning. No black boxes.
Veto fields prevent false positives. Weight or category mismatch = automatic reject. “Blue Dream 3.5g” never matches “Blue Dream 1g”.
Deeply configurable. Tune matching rules, weight normalization, veto fields, and confidence thresholds — every dispensary’s data is different.
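A veto check of the kind described above can be sketched in a few lines of Python. The `Product` record, the `VETO_FIELDS` tuple, and the `vetoed_by` helper are hypothetical names chosen for illustration, and the sketch assumes weights have already been normalized to grams.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Product:
    """Hypothetical normalized record; field names are illustrative."""
    name: str
    category: str
    weight_g: float


# Assumed default; the point of a veto list is that it is configurable.
VETO_FIELDS = ("category", "weight_g")


def vetoed_by(a, b, veto_fields=VETO_FIELDS):
    """Return the first veto field on which two records disagree, else None.

    A non-None result means the pair is rejected before any name
    similarity is considered, however alike the names look.
    """
    for field in veto_fields:
        if getattr(a, field) != getattr(b, field):
            return field
    return None
```

With this sketch, a 3.5 g and a 1 g "Blue Dream" flower record are rejected on `weight_g` even though the names are identical, and returning the offending field name gives a reviewer the reason for the rejection.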
Four-Tier Matching Pipeline
T1 | Exact Match | Normalized | Free
T2 | Fuzzy Consensus | 3/5 vote | Free
T3 | LLM Reasoning | Ambiguous pairs | ~$0.001
T4 | Human Review | Feedback loop | You
80% free · 15% LLM · 5% human
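The tiered flow can be sketched in Python as follows. This is a simplified illustration under several assumptions, not the product's real logic: the five similarity measures, the 0.8 vote threshold, and the rule that every pair missing the quorum is escalated are all choices made for this example, and `ask_llm` is a hypothetical callable that returns True, False, or None when unsure.

```python
import re
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    return " ".join(re.sub(r"[^a-z0-9 ]+", " ", name.lower()).split())


def _ratio(a, b):
    return SequenceMatcher(None, a, b).ratio()


def _sorted_token_ratio(a, b):
    # Same ratio, but insensitive to word order.
    return _ratio(" ".join(sorted(a.split())), " ".join(sorted(b.split())))


def _compact_ratio(a, b):
    # Same ratio, but insensitive to spacing ("blue dream" vs "bluedream").
    return _ratio(a.replace(" ", ""), b.replace(" ", ""))


def _jaccard(a, b):
    x, y = set(a.split()), set(b.split())
    return len(x & y) / len(x | y) if x | y else 0.0


def _bigram_dice(a, b):
    def grams(s):
        s = s.replace(" ", "")
        return {s[i:i + 2] for i in range(len(s) - 1)}
    x, y = grams(a), grams(b)
    return 2 * len(x & y) / (len(x) + len(y)) if x or y else 0.0


# Assumption: the source does not name its five algorithms; these stand in.
VOTERS = (_ratio, _sorted_token_ratio, _compact_ratio, _jaccard, _bigram_dice)


def resolve(a, b, ask_llm=None, threshold=0.8, quorum=3):
    """Return (tier, decision); decision is None when a human must decide."""
    na, nb = normalize(a), normalize(b)
    if na == nb:                      # T1: exact match after normalization
        return ("T1", True)
    votes = sum(score(na, nb) >= threshold for score in VOTERS)
    if votes >= quorum:               # T2: at least 3 of 5 measures agree
        return ("T2", True)
    verdict = ask_llm(a, b) if ask_llm else None
    if verdict is not None:           # T3: the LLM settles the ambiguous pair
        return ("T3", verdict)
    return ("T4", None)               # T4: queue for human review
```

In this sketch "BLUE DREAM" and "Blue Dream" resolve at T1, "Blue Dream" and "BluDream" clear the quorum at T2, and "Wedding Cake Quarter" versus "Wedding Cake 7g" falls through to the LLM or, without one, to human review. A production pipeline would presumably run veto checks first and reject clearly unrelated pairs cheaply rather than escalating them.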
LLM + Human Review
The LLM understands that “Wedding Cake Quarter” = “Wedding Cake 7g” even when string algorithms disagree. It resolves “Gelato #33” vs “Gelato 33” by reasoning about cannabis naming conventions. Human reviewers handle the final 5% — and every correction is cached permanently so the same pair is never re-decided.
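One way to picture the permanent correction cache is a small persistent lookup keyed by the pair of names. The sketch below is an assumption about how such a cache could work, not a description of Fuzzy's storage: the `DecisionCache` class, the JSON file format, and the order-independent key are all illustrative choices.

```python
import json
from pathlib import Path


class DecisionCache:
    """Hypothetical store for reviewer decisions, persisted as a JSON file."""

    def __init__(self, path):
        self.path = Path(path)
        # Load earlier decisions if the file already exists.
        self._decisions = (
            json.loads(self.path.read_text()) if self.path.exists() else {}
        )

    @staticmethod
    def _key(a, b):
        # Order-independent key so (a, b) and (b, a) share one decision.
        return json.dumps(sorted([a.strip().lower(), b.strip().lower()]))

    def get(self, a, b):
        """Return True or False for a known pair, or None if never decided."""
        return self._decisions.get(self._key(a, b))

    def record(self, a, b, is_match):
        """Store a decision and write the whole cache back to disk."""
        self._decisions[self._key(a, b)] = bool(is_match)
        self.path.write_text(json.dumps(self._decisions, indent=2))
```

A pipeline built this way would call `get` before any other tier and `record` after each human decision, so a pair such as "Gelato #33" and "Gelato 33" is looked up rather than re-decided, in either order and across restarts.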