Taking an alpha-level app to beta with the voice of the user
AI Model Creation Tool Audit
INTRODUCTION
Testing with our users to elevate our app
Shortly after joining WebAI, I proposed a research effort to audit the current state of the app. WebAI Navigator was an application built to let low/no-code users build their own custom AI models. The plan was to walk users through the app, give them a series of tasks based around its core flow, collect feedback, and see where they tripped up while trying to build their first model.





TEAM & ROLES
Elevating an app is a team sport
CURRENT STATE
A no-code AI building tool that transforms ideas into models
When I joined the company, its main product, Navigator, was still in alpha but preparing for a limited beta release. Navigator was designed to let people with no machine learning background create their own custom AI models. To this point, the application had been developed entirely in-house, without any user testing or market feedback.
THE PROBLEM
Helping our users make their first AI model
After onboarding with the team at WebAI, I spoke with our head of product, who highlighted that while the app was getting a fair number of downloads and testers, few users were able to successfully build their first AI model. Our team was tasked with figuring out why users weren't able to get a project set up or get an AI model running.
THE SCRIPT
Getting to know our users and asking the right questions
Before testing, I began preparations by drafting a testing script. The script opened with core demographic questions (location, age, professional title, commonly used tools, AI familiarity, interest in WebAI), and from there I prompted users through the flow from the login screen all the way to getting a model up and running.
To evaluate the app's intuitive design, I intentionally kept the usability prompts open-ended. The script was also structured to allow organic exploration, enabling participants to investigate areas they found particularly interesting, confusing, or delightful. Once I received approval for the testing script, I coordinated with our Growth Product Manager to gather participants for our sessions.
WHO WE SPOKE WITH
Picking the right users
During the script approval process, I collaborated with our Growth Product Manager to define our ideal set of participants. Since engineers were the company's primary target user base, we prioritized participants with an engineering background.
However, I advocated for including one or two non-technical participants to capture an outside perspective on the app. This led us to recruit a group of six participants for our testing sessions.
PARTICIPANTS

THE SESSIONS
Structure and freedom to guide our sessions
Over the course of one week, I conducted 6 interview sessions. We used the latest alpha build of the app to provide participants with the most authentic experience possible. This also had the added benefit of dramatically expediting our timeline and eliminating the need for any prototype work.
We led users through the approved script, recording each session and allowing participants to deviate at appropriate points. Session lengths varied from one to two hours, depending on the depth of participant feedback and their time availability.
SYNTHESIS
Six sessions, forty-nine insights
Following the user interviews, each session was meticulously documented within our research platform. Key takeaways were highlighted and tagged, facilitating efficient analysis. This process yielded 456 distinct highlights, which were then categorized into 49 key insights. Using these insights as a basis, I wrote out 30 actionable design recommendations for the team to consider.
The insights and their corresponding recommendations were then presented to the co-founders, product team, and key engineering members for discussion and feedback before we discussed next steps.
DESIGN RECOMMENDATIONS
Below is a selection from the 30 design recommendations I delivered:
OTHER TOOLS
The daily toolkit of our users
As part of our opening demographic questions, we asked participants which tools they use day to day. Their answers are collected below.
APPLICATIONS
Visual Studio Code
PyCharm
NX Unigraphics
OneNote
Miro
Microsoft Office Suite
Google Workspace (G Suite)
SolidWorks
AutoCAD
Figma
MDM Workbench
Eclipse
Docker
Jupyter Notebook
CODING LANGUAGES
Python
C
C++
DEVELOPMENT LIBRARIES
OpenCV
PyTorch
MDM Workbench
Django
AI TOOLS
Rivet
MLflow
Google Gemini
ChatGPT
Anthropic Claude
Stable Diffusion
Perplexity AI
Mistral AI
Runway
LM Studio
V7
YOLO
SERVICES
Amazon Web Services (AWS)
FAN AI
GitHub
OpenAI APIs
Anthropic APIs
Mistral AI APIs
OPERATING SYSTEMS
Apple macOS
Microsoft Windows
Linux
RESULTS
Insights that moved the needle
After presenting our research to the executive team, a full template feature was prioritized as the focus of our next release.
With templates incorporated into the application, we saw the rate of successfully built projects jump from 7% to over 80%.
The team implemented several quick improvements to the application, including removing Metamask from the login screen, eliminating unnecessary sounds, and automatically displaying results when processes complete.
NEXT STEPS
Building a research driven roadmap
The team also highlighted organization as another point of confusion for users. We ran two separate card sorting exercises, one around template organization and one around elements, to make sure both more accurately mapped to our users' understanding.
Templates significantly improved project success rates, but users still faced challenges when building projects from scratch. In response, the executive team prioritized research into developing a more robust onboarding process and exploring design patterns beyond the canvas interface for AI model building.