This is the fourth article in our mini-series, which aims to showcase the evolution of data extraction solutions over the past two to three years. These articles highlight the advancements made in recent months and illustrate what is now possible. The most significant improvement is the ability of current solutions to learn autonomously.
If you missed the introduction to this mini-series, you can find it here: Mini-series: The real power of new data extraction strategies. The previous article discussed how modern solutions can effectively handle repetitions in handwritten documents: Use Case 3 - Handling Repetitions.
Could you please sort your groceries before I scan them?
When you go shopping, it is not uncommon to purchase the same item several times. Most likely, you drop your groceries on the conveyor belt in random order, so the items appear on the receipt in the order they were scanned. For you as a consumer, this has no consequence and is probably of no interest. However, for companies extracting data from receipts, these details matter. These companies typically want to know what you buy, in what quantity, and how often. Grouping identical items is a fundamental part of their analysis.
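To make the grouping step concrete, here is a minimal Python sketch. It assumes the receipt has already been extracted into a list of (item name, quantity) pairs in scan order. The sample data and the `group_items` helper are hypothetical and only illustrate the idea; they are not the method used by any particular extraction solution.

```python
# Hypothetical line items, in the order they were scanned at the checkout.
receipt_lines = [
    ("Milk 1L", 1),
    ("Bananas", 1),
    ("Milk 1L", 1),
    ("Bread", 1),
    ("Milk 1L", 2),
]


def group_items(lines):
    """Sum the quantities of identical items, keeping first-seen order."""
    totals = {}
    for name, quantity in lines:
        # Assumption: identical items carry exactly the same name string.
        totals[name] = totals.get(name, 0) + quantity
    return totals


grouped = group_items(receipt_lines)
print(grouped)  # {'Milk 1L': 4, 'Bananas': 1, 'Bread': 1}
```

Matching on the exact name is the simplest possible choice. On real receipts, the same product may be printed or read slightly differently from one line to the next, so a production pipeline would likely need to normalize names before grouping.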