App Testing Tips for Vibe Coders
Testing your app can feel like a chore. You've automated most of your development work and are shipping fast with Claude Code or Cursor, but then you hit a wall: testing is still manual.
We brought in Rory Smith—12 years in mobile development, 5 years at Apple—to our latest Office Hours. Rory recently left Apple to co-found Semaloop, a startup building AI-powered mobile app testing. Their tool uses multimodal models to visually test apps without requiring developers to tag UI elements—you just describe what the app should do in plain English.
Here's what we learned.
Your Tests Are Your Specification
"If you've got 300 unit tests but 200 of them are nonsense, that's fake confidence," Rory said. "You're saying 'shipping with confidence,' but you don't really know what you're doing."
When you use AI to generate tests, don't just accept them blindly. Read through them. Ask yourself: does this make sense? If you can't explain what a test is checking, delete it.
Your test suite should read like a functional spec. Anyone should be able to look at it and understand what the app is supposed to do.
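To make the idea concrete, here is a minimal sketch of a "tests as specification" suite. The function and promo codes are hypothetical, invented for illustration; the point is that each test name states one behavior in plain language, so the suite reads like a spec.

```python
def apply_discount(price: float, code: str) -> float:
    """Apply a promo code; unknown codes leave the price unchanged."""
    # Hypothetical discount table for the sake of the example.
    discounts = {"WELCOME10": 0.10, "VIP20": 0.20}
    return round(price * (1 - discounts.get(code, 0.0)), 2)

# Read top to bottom, the test names describe what checkout pricing
# is supposed to do. If you couldn't explain one, you'd delete it.
def test_welcome_code_takes_ten_percent_off():
    assert apply_discount(100.0, "WELCOME10") == 90.0

def test_unknown_code_charges_full_price():
    assert apply_discount(100.0, "TYPO") == 100.0

test_welcome_code_takes_ten_percent_off()
test_unknown_code_charges_full_price()
```

Notice there is no test like `test_case_47` that asserts three unrelated things; every test earns its place by documenting one promise.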
Invest in Developer Experience Early
Rory's advice: fix workflow friction before you go deep into building.
"If there's a way to make your workflow faster, it's going to help you down the line, because it multiplies by the number of times you're iterating," he said.
There's always a temptation to push forward on features. But spending a few hours optimizing your dev setup pays compound interest on every iteration afterward. This applies whether you're the one iterating or an AI agent is.
Use AI as Your UX Tester
Rory described his approach: "I make it roleplay. I say: you're a teacher from Connecticut with this experience, you've landed on this homepage—complete X task. Then I'll run it again as an experienced software developer. They approach things very differently."
His rule of thumb: if Claude can't complete the flow, it's too complicated for real users.
This isn't about catching code errors—it's about catching UX problems. If an AI gets confused by your onboarding, that's a signal humans will too.
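One lightweight way to apply this is to template the persona prompt so you can sweep several personas over the same task. This sketch only builds the prompt string; the wording and personas are illustrative assumptions, and sending it to a model is left to whatever AI tool you use.

```python
def persona_prompt(persona: str, goal: str) -> str:
    """Build a roleplay prompt for an AI UX tester (wording is illustrative)."""
    return (
        f"You are {persona}. You have just landed on our app's homepage. "
        f"Your goal: {goal}. Narrate each step you take, and say out loud "
        "whenever you feel confused or stuck."
    )

# Run the same task through very different personas and compare where
# each one gets stuck.
goal = "sign up and create your first class"
teacher = persona_prompt("a teacher from Connecticut with little tech experience", goal)
developer = persona_prompt("an experienced software developer", goal)
```

The "say out loud whenever you feel confused" instruction is the useful part: the confusion points, not task completion, are your UX findings.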
Separate What Needs Device Testing
Rory's suggestion: isolate what actually requires a device.
If you're iterating on style or layout—things unrelated to camera APIs or device functionality—iterate in a cheaper environment, maybe inside a web browser, still within your React Native setup.
Only pull in actual device testing when you need hardware access. For everything else, keep the feedback loop tight.
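One way to keep this discipline is to make the routing explicit: list which features genuinely need hardware, and send everything else to the fast environment by default. The feature names and the two-environment split below are assumptions for illustration.

```python
# Hypothetical list of features that genuinely need a physical device.
NEEDS_HARDWARE = {"camera_capture", "push_notifications", "biometric_login"}

def pick_environment(feature: str) -> str:
    """Route a feature to the cheapest environment that can exercise it."""
    return "device" if feature in NEEDS_HARDWARE else "browser"

# Layout and logic work stays in the tight browser loop; only
# hardware-touching features pay the device-testing cost.
assert pick_environment("onboarding_layout") == "browser"
assert pick_environment("camera_capture") == "device"
```

The payoff is that "does this need a device?" becomes a reviewable list rather than an ad-hoc decision made mid-debugging.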
Lightweight Tests Beat Comprehensive Tests
"Make it as lightweight as possible," Rory advised. "Your entire test suite should run quickly—maybe in memory, maybe in the cloud. You want fast answers, because that helps you iterate, whether you're driving or an agent is."
Heavy, slow test suites don't get run. Fast, focused tests become part of your workflow.
Watch Out for Overlapping Tests
As Rory put it: "It's very easy to ask Claude to create tests to cover what it's done, but between different requests there can be overlap. Even in a single request, there can be lots of tests that don't actually add value."
Periodically audit your test suite. Look for tests that check the same thing in slightly different ways. Consolidate or cut them.
Use Test Data to Skip Repetitive Steps
For a long time, the folks at Granola had a podcast episode embedded inside their builds that they tested against, so they didn't have to deal with the microphone API during development.
If your app processes images, embed a test image. If it handles audio, include a sample file. Remove the friction of repeatedly capturing real input while you're iterating.
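As a generic sketch of the pattern (filenames and functions here are invented, not Granola's setup): generate or bundle a fixture once, then have the dev build read from it instead of touching the microphone API at all.

```python
import os
import struct
import wave

FIXTURE = "sample_meeting.wav"  # hypothetical embedded fixture path

def make_fixture(path: str, seconds: int = 1, rate: int = 8000) -> None:
    """Write a tiny silent WAV so dev builds never need the microphone."""
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 16-bit samples
        w.setframerate(rate)
        w.writeframes(struct.pack("<h", 0) * rate * seconds)

def audio_source(use_fixture: bool = True) -> str:
    """Return a path to audio input, preferring the bundled fixture."""
    if use_fixture:
        if not os.path.exists(FIXTURE):
            make_fixture(FIXTURE)
        return FIXTURE
    raise RuntimeError("live microphone capture is out of scope for this sketch")

path = audio_source()
```

While you're iterating on transcription UI or summary layout, every run consumes the same known input, so any change in behavior is your code, not the recording.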
The "Does It Work?" Checklist
1. Write tests you can read. If you can't explain what it checks, it's not helping you.
2. Optimize your iteration loop first. Every minute saved compounds across hundreds of cycles.
3. Use AI to simulate real users. Have it roleplay personas navigating your app.
4. Test on-device only when necessary. Keep layout and logic testing in faster environments.
5. Prefer fast, focused tests. A 10-second test suite you actually run beats a 10-minute suite you skip.
6. Audit for redundancy. AI-generated tests often overlap. Trim the fat.
7. Stub repetitive inputs. Embed test data so you're not fighting APIs while debugging UI.