Loma Lookup: A Case Study in AI Limitations

Benjamin Garcia | May 20, 2025 | Status: Scrapped

Next.js, Tailwind CSS, Firebase, Google Vision API, Google Document AI, LLaVA, 4o, Python

Sample data. Believe it or not, this was one of the better/easier images to work with.

Problem Statement

At Loma Linda University’s Dental School (LLU), D3 and D4 students begin clinical rotations under the guidance of attendees—senior practitioners responsible for oversight and support.

However, the system used to track these attendees is confusing and inaccessible. The weekly clinic schedule, formatted as a dense spreadsheet, is hard to read and interpret, especially given the complex layout of the clinic floor.

To make matters worse, LLU does not distribute this schedule digitally. Instead, it’s displayed only on a single hallway TV monitor, making it nearly impossible for students to plan ahead or access it remotely.

Design Objective

Build a system capable of extracting structured JSON from arbitrary photographs of LLU’s clinic schedule—removing the need to physically view the TV and enabling digital lookup of attendees by day, time slot, and clinic block.
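As a rough illustration of the target output, the sketch below shows one possible JSON shape and a lookup keyed by day, time slot, and clinic block. The field names (day, slot, clinic, attendee) and the sample entries are placeholders invented for this example, not the project's actual schema or real schedule data.

```python
import json

# Hypothetical JSON shape; every field name and value here is a placeholder.
SAMPLE_JSON = """
[
  {"day": "Mon", "slot": "AM", "clinic": 7, "attendee": "Dr. Placeholder A"},
  {"day": "Mon", "slot": "AM", "clinic": 9, "attendee": "Dr. Placeholder B"},
  {"day": "Mon", "slot": "PM", "clinic": 7, "attendee": "Dr. Placeholder C"}
]
"""


def build_index(records):
    """Group attendee names under a (day, slot, clinic) key."""
    index = {}
    for record in records:
        key = (record["day"], record["slot"], record["clinic"])
        index.setdefault(key, []).append(record["attendee"])
    return index


def lookup(index, day, slot, clinic):
    """Return the attendees for one day, time slot, and clinic block."""
    return index.get((day, slot, clinic), [])


index = build_index(json.loads(SAMPLE_JSON))
print(lookup(index, "Mon", "AM", 7))  # ['Dr. Placeholder A']
```

Once the schedule exists in a form like this, the lookup itself is trivial; the hard part of the project was producing the records from a photo.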

Preprocessing & Extraction

Each image first passed through a custom OpenCV pipeline that flattened perspective distortion, enhanced contrast, and standardized brightness. This normalized the input for further parsing.

Preprocessed image: the raw photo after passing through a custom Python script.

I began with Google Vision OCR to extract text from individual cells. When results became inconsistent, especially with merged cells or rotated entries, I switched to Google Document AI's table processor, which improved baseline accuracy but still lacked full structural understanding.
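To make the merged-cell problem concrete, here is a small sketch of how cells with row and column spans can be expanded into a dense grid once a table extractor reports those spans correctly. The input format, a list of rows holding (text, row_span, col_span) tuples, is a simplification invented for this example and is not the response format of either Google API.

```python
def expand_merged_cells(rows):
    """Expand spanning cells into a dense grid of strings.

    `rows` is a list of rows; each row is a list of
    (text, row_span, col_span) tuples in left-to-right order.
    A merged cell's text is copied into every grid position it covers.
    """
    grid = {}  # (row, col) -> text
    for r, row in enumerate(rows):
        c = 0
        for text, row_span, col_span in row:
            # Skip positions already claimed by a cell spanning down from above.
            while (r, c) in grid:
                c += 1
            for dr in range(row_span):
                for dc in range(col_span):
                    grid[(r + dr, c + dc)] = text
            c += col_span

    n_rows = max((r for r, _ in grid), default=-1) + 1
    n_cols = max((c for _, c in grid), default=-1) + 1
    return [[grid.get((r, c), "") for c in range(n_cols)] for r in range(n_rows)]


# "Mon" is vertically merged across two rows.
table = [
    [("Mon", 2, 1), ("AM", 1, 1), ("Clin 7", 1, 1)],
    [("PM", 1, 1), ("Clin 9", 1, 1)],
]
for line in expand_merged_cells(table):
    print(line)
# ['Mon', 'AM', 'Clin 7']
# ['Mon', 'PM', 'Clin 9']
```

The expansion is only as good as the spans it is given: if the extractor misses a merge, every cell to the right of it shifts into the wrong column.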

Testing the Frontier

To handle table structure, merge semantics, and embedded shorthand (e.g., ‘Clin 7910’ → Clin 7, 9, 10), I explored multimodal models: LLaVA-v1.5, table-llava, and GPT-4o.
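The shorthand is ambiguous even as a pure string problem, which the sketch below tries to show. It splits the digit run into every sequence of strictly ascending clinic numbers. The ascending-order rule, the upper bound of 12 clinics, and the function names are assumptions made for this example; the source only gives the 'Clin 7910' case.

```python
import re


def split_clinic_digits(digits, max_clinic=12):
    """Return every split of `digits` into strictly ascending clinic numbers."""
    results = []
    max_len = len(str(max_clinic))

    def walk(pos, prev, acc):
        if pos == len(digits):
            results.append(acc)
            return
        for end in range(pos + 1, min(pos + max_len, len(digits)) + 1):
            chunk = digits[pos:end]
            if chunk[0] == "0":
                return  # no clinic number starts with 0
            value = int(chunk)
            if prev < value <= max_clinic:
                walk(end, value, acc + [value])

    walk(0, 0, [])
    return results


def expand_shorthand(label, max_clinic=12):
    """Parse a label such as 'Clin 7910' into candidate clinic lists."""
    match = re.fullmatch(r"\s*Clin\s*(\d+)\s*", label)
    if match is None:
        raise ValueError(f"unrecognized clinic label: {label!r}")
    return split_clinic_digits(match.group(1), max_clinic)


print(expand_shorthand("Clin 7910"))  # [[7, 9, 10]]
print(expand_shorthand("Clin 12"))    # [[1, 2], [12]]
```

Under these assumptions 'Clin 7910' has a single reading, but 'Clin 12' already has two, so some labels cannot be resolved without context from the rest of the schedule.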

Each model could interpret pieces of the layout, but none parsed arbitrary photos consistently and completely. They broke down on skewed perspectives, inconsistent row heights, and vertically merged cells. That's when I realized this wasn’t an implementation issue; it was an AI limitation.

Takeaways & Implications

This project looked simple on paper—turn a photo into JSON. In practice, it revealed a deeper truth: some tasks remain on the edge of AI's capabilities, especially those requiring spatial logic, semantic interpretation, and structural inference.

Loma Lookup is now retired, but it serves as a technical postmortem on where automation fails, and how critical human verification still is when pushing past the bounds of AI reliability.
