Sensible — Designing Trustworthy AI Workflows for Enterprise Document Extraction

Overview

Sensible is an enterprise LLM-based document extraction platform. I designed core AI workflows that helped move the product from a manual, query-based experience toward a more AI-first one — while solving the trust and control challenges that came with automation.

Challenge

AI made extraction much faster, but users trusted it less than the manual system they understood and controlled.

The challenge was not just adding AI. It was deciding where AI should act, where users should stay in control, and how the system should communicate uncertainty in high-stakes workflows.

What I led

  • AI-first extraction workflow design

  • Confidence and human review patterns

  • Workflow design for both beginner and expert users

  • Collaboration with product and engineering on automation boundaries

Impact

  • Designed the core AI workflow for an enterprise document extraction platform

  • Introduced confidence and review patterns that made AI output more usable in high-stakes workflows

  • Helped shift users from full manual checking toward more selective, flagged review

  • Contributed during a period of 5x company growth

  • Supported product scale to 7-figure ARR

Key research findings

1. Trust was the main barrier
Users saw the speed advantage of AI, but hesitated when they could not judge whether the output was reliable enough for production.

2. Users wanted different levels of automation
Some preferred a faster, simpler workflow. Others, especially more technical users, wanted more transparency and control.

3. Neither extreme worked
Full manual review reduced the value of AI, while blind automation reduced trust. The right answer was a better middle ground.

Framing the problem

At first this looked like a trust problem, but the deeper issue was workflow design.

Through user feedback and early testing, three things became clear:

  • users were not rejecting AI itself; they were reacting to the loss of visibility and control

  • different users needed different levels of automation

  • a generic confidence score was not enough to guide action

This shifted the question from “How do we make users trust AI?” to “How do we give users the right checkpoints at the right moments?”

I considered pushing further toward full automation with audit logs, but rejected that direction because it removed too much visibility from users and displaced the trust problem rather than solving it.

Design decisions

1. AI-first extraction workflows

Insight: Users saw the speed advantage of AI, but the manual workflow still felt safer because they understood and controlled it.

Decision: I designed a more AI-first flow that reduced setup effort and made extraction more accessible.

Outcome: Lower friction and a faster path to value, especially for less technical users.

2. Confidence signals

Insight: Users were unsure when to trust AI output and when to review it. A generic confidence score did not help much, because it did not tell them what action to take.

Decision: I explored different ways to communicate uncertainty, including a generic percentage score and more visual field-level treatments. I moved toward confidence states that were tied to action instead of raw probability, so users could quickly understand what looked reliable, what needed review, and where to focus first. I also worked with engineering to make sure these states reflected real extraction behavior rather than arbitrary thresholds.

Outcome: Users could scan results faster and focus on the fields that mattered, instead of reviewing everything with the same level of effort.
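To make this concrete, here is a minimal sketch of the pattern in TypeScript. It is illustrative rather than Sensible’s actual implementation: the state names, field shape, and thresholds are hypothetical stand-ins for the idea of tying confidence to action.

```typescript
// Hypothetical sketch: map a raw extraction confidence score to an
// action-oriented state instead of surfacing the percentage directly.
type ConfidenceState = "looks-reliable" | "needs-review" | "review-first";

interface ExtractedField {
  name: string;
  value: string | null; // null when the model found nothing
  score: number;        // raw model confidence, 0..1
}

// Thresholds here are placeholders; in practice they should be
// calibrated against real extraction behavior, not picked arbitrarily.
function toConfidenceState(field: ExtractedField): ConfidenceState {
  if (field.value === null || field.score < 0.5) return "review-first";
  if (field.score < 0.85) return "needs-review";
  return "looks-reliable";
}

// Order fields so the riskiest ones surface first in the results.
function reviewOrder(fields: ExtractedField[]): ExtractedField[] {
  const rank: Record<ConfidenceState, number> = {
    "review-first": 0,
    "needs-review": 1,
    "looks-reliable": 2,
  };
  return [...fields].sort(
    (a, b) => rank[toConfidenceState(a)] - rank[toConfidenceState(b)]
  );
}
```

The point is the contract with the UI: it never shows a bare “0.87”; it shows what to do next.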

3. Human review

Insight: Blind automation felt risky, but full manual review removed much of AI’s value.

Decision: I designed a human review flow for flagged extractions, with validation, editing, source visibility, and approval.

Outcome: A safer middle ground that supported higher-stakes workflows without forcing manual review on every case.
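To show the shape of that flow, here is a hypothetical sketch (not the shipped data model) of the lifecycle a flagged extraction moves through:

```typescript
// Hypothetical review lifecycle: the statuses, data shape, and function
// names are assumptions for illustration.
type ReviewStatus = "auto-approved" | "flagged" | "in-review" | "approved";

interface FieldEdit {
  field: string;
  from: string | null; // what the model extracted
  to: string;          // what the reviewer corrected it to
}

interface ReviewItem {
  documentId: string;
  status: ReviewStatus;
  edits: FieldEdit[];
}

// Only extractions that fail the confidence check enter human review;
// everything else flows straight through.
function triage(item: ReviewItem, allFieldsReliable: boolean): ReviewItem {
  return { ...item, status: allFieldsReliable ? "auto-approved" : "flagged" };
}

// A reviewer validates a field against the source document and edits it...
function applyEdit(item: ReviewItem, edit: FieldEdit): ReviewItem {
  return { ...item, status: "in-review", edits: [...item.edits, edit] };
}

// ...then approves explicitly, so nothing uncertain ships silently.
function approve(item: ReviewItem): ReviewItem {
  return { ...item, status: "approved" };
}
```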

4. More intentional control for expert users

Insight: More technical users wanted transparency and control, not just a simpler AI experience.

Decision: I made the workflow more transparent and preserved deeper control for advanced users.

Outcome: The product felt more trustworthy and production-ready for expert use cases.

[Screenshot: dashboard]
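One way to picture “different levels of automation” is as a per-workflow setting. The sketch below is hypothetical (the mode names and threshold are mine, not the product’s), but it captures the spectrum the design had to support:

```typescript
// Hypothetical automation modes, from hands-off to fully manual.
type AutomationMode =
  | "auto"            // accept all extractions, no review step
  | "review-flagged"  // the middle ground: humans see only flagged cases
  | "review-all";     // full manual checking, for the most cautious teams

interface WorkflowConfig {
  mode: AutomationMode;
  // Expert users can tune the flagging threshold instead of
  // accepting a default.
  flagBelowScore: number;
}

function needsHumanReview(cfg: WorkflowConfig, score: number): boolean {
  switch (cfg.mode) {
    case "auto":
      return false;
    case "review-all":
      return true;
    case "review-flagged":
      return score < cfg.flagBelowScore;
  }
}
```

Exposing a knob like this, rather than hiding it, is one way to give expert users the transparency and control described above.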

What I learned

The most important lesson from this project was that trust in AI products is not built only by improving the model. It is built in the moments where the system hands work back to the user.

Users could forgive imperfect AI output. What they struggled with was not knowing whether something was wrong, or not knowing what to do next.

That is why the biggest trust gains did not come from making AI feel magical. They came from designing better handoffs: clearer review moments, better visibility into uncertainty, and more control where it mattered most.

The next opportunities I’d explore are:

  • learning from repeated user corrections

  • surfacing only cases that truly need review

  • supporting reasoning across groups of related documents, not just one file at a time
