AI Agents in QA: How to Keep Up with AI-Driven Dev Velocity with Vilhelm von Ehrenheim
Categories: Podcasts, Test Guild
AI reshapes testing roles by shifting bottlenecks to QA, emphasizing agentic testing that simulates user behavior and tools integrating with GitHub for automated PR analysis. The shift prioritizes behavior and intent over code-centric testing, while challenges include adoption hurdles, validation needs, and aligning AI-driven speed with quality assurance.
Test Guild
Test Guild, hosted by Joe Colantonio, focuses on testing and automation. Each episode features a different guest. Show notes include comprehensive links and usually a full transcript. Episodes are released as audio and video.
- https://testguild.com/
- https://testguild.com/podcasts/automation/
- https://www.youtube.com/playlist?list=PL9AgRtJkydU1jqvx46esyr56BXtm1QEds
- https://www.youtube.com/@JoeColantonio
Episode Details
- Show Notes: https://testtalks.libsyn.com/ai-agents-in-qa-how-to-keep-up-with-ai-driven-dev-velocity-with-vilhelm-von-ehrenheim
- Published: 2026-06-02T16:48:00Z
- Duration: 35:23
- Author: Unknown
Overview
The podcast explores the evolving role of testers in software development as AI tools automate coding, pull requests (PRs), and testing tasks. It addresses the misconception that AI replaces testers, arguing instead that the bottleneck now shifts to quality assurance (QA) and testing. Key themes include agentic testing, where AI agents do the work of manual testing by interacting with applications as users would rather than following scripted test cases. The discussion covers a QA tech platform that integrates with GitHub workflows to analyze PRs, run targeted tests, and automate test reviews, reducing traditional hurdles such as locator dependencies. The episode emphasizes the difficulty of testing complex systems and the need for intelligent, adaptive solutions, and notes that AI accelerates development cycles but requires structured pipelines to maintain quality. The discussion also touches on the importance of codifying knowledge and aligning team practices to manage AI-driven development speed effectively.
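One way to picture "run targeted tests" for a PR is a lookup from changed files to the user journeys they can affect. The following is a minimal sketch under stated assumptions, not the platform's actual mechanism: the `Journey` type, the `JOURNEYS` table, and the path patterns are all hypothetical, and the list of changed files is assumed to be supplied by whatever integration reads the pull request.

```python
from dataclasses import dataclass
from fnmatch import fnmatch


@dataclass(frozen=True)
class Journey:
    """A user journey an agent could be asked to exercise (hypothetical)."""
    name: str
    goal: str
    path_patterns: tuple[str, ...]  # source paths assumed to affect this journey


# Hypothetical mapping from source areas to user journeys.
JOURNEYS = [
    Journey("checkout", "A signed-in user can pay for the items in their cart",
            ("src/cart/*", "src/payments/*")),
    Journey("signup", "A new visitor can create an account and log in",
            ("src/auth/*",)),
    Journey("search", "A user can find a product by name",
            ("src/search/*",)),
]


def select_journeys(changed_files: list[str]) -> list[Journey]:
    """Return journeys touched by at least one changed file, in declared order."""
    return [
        journey for journey in JOURNEYS
        if any(fnmatch(path, pattern)
               for path in changed_files
               for pattern in journey.path_patterns)
    ]


if __name__ == "__main__":
    # Assumed input: the file paths changed in a pull request.
    changed = ["src/payments/stripe.py", "README.md"]
    for journey in select_journeys(changed):
        print(f"{journey.name}: {journey.goal}")
```

A static table like this is only a stand-in; the episode describes agents that infer the affected behavior from application knowledge rather than from hand-maintained patterns.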
The podcast delves into AI's role in automated decision-making, particularly in high-stakes areas like credit risk, where rigorous validation and performance proof are critical. It outlines approaches to AI testing, including scripted automation, AI-assisted script writing, and agentic testing, with the latter gaining traction despite being unconventional in 2023. Knowledge graphs are presented as tools to capture contextual application insights, enabling agents to understand user journeys, domain-specific knowledge, and technical documentation. These graphs are iteratively updated and validated to avoid obsolescence. Practical examples include an AI-driven testing framework that autonomously explores and validates software without hardcoded test cases, focusing on user intent and experience rather than technical implementation. The podcast also contrasts AI-driven testing with traditional methods, noting advantages in nuance and reduced ambiguity, while acknowledging challenges in adoption, such as integration with CI/CD pipelines and initial setup requirements.
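To make the knowledge graph idea concrete, here is a small sketch of a graph whose nodes carry a validation date, so that entries that have not been re-checked recently can be flagged. Everything in it is an illustrative assumption: the `Node` and `KnowledgeGraph` names, the 30-day threshold, and the choice to model journeys as paths through the graph are not taken from the episode.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class Node:
    """One piece of application knowledge: a screen, a journey step, a domain rule."""
    key: str
    description: str
    last_validated: date


@dataclass
class KnowledgeGraph:
    nodes: dict[str, Node] = field(default_factory=dict)
    edges: dict[str, list[str]] = field(default_factory=dict)  # key -> next steps

    def add(self, node: Node, leads_to: tuple[str, ...] = ()) -> None:
        """Insert or overwrite a node and append its outgoing edges."""
        self.nodes[node.key] = node
        self.edges.setdefault(node.key, []).extend(leads_to)

    def journeys_from(self, start: str) -> list[list[str]]:
        """Enumerate simple paths from `start` until no unvisited successor remains."""
        paths: list[list[str]] = []
        stack = [[start]]
        while stack:
            path = stack.pop()
            successors = [n for n in self.edges.get(path[-1], []) if n not in path]
            if not successors:
                paths.append(path)
            for successor in successors:
                stack.append(path + [successor])
        return paths

    def stale(self, today: date, max_age_days: int = 30) -> list[str]:
        """Return keys of nodes last validated more than `max_age_days` ago."""
        cutoff = today - timedelta(days=max_age_days)
        return sorted(k for k, n in self.nodes.items() if n.last_validated < cutoff)
```

In this sketch, the stale list is what would be handed back to an agent or a person for re-validation, which is one simple way to keep the graph from drifting out of date as the application changes.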
Key topics include agent-driven testing automation, where AI generates test goals and expected results based on application knowledge, and a shift away from code-centric testing toward behavior, intent, and product alignment. The discussion emphasizes the importance of defining clear requirements for AI systems and practical experimentation to refine their capabilities. Challenges include skepticism about “self-healing” mechanisms and the need for strategic alignment with testing goals. The podcast also addresses the evolving role of testers in cross-functional teams, stressing the need for automated pipelines to ensure product quality, including aspects like accessibility and brand guidelines. Finally, it touches on debugging with AI agents, their ability to reproduce user-reported bugs, and the importance of structuring work to ensure meaningful, validated tests rather than over-reliance on automation.
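The contrast between code-centric and intent-based checks can be illustrated with a toy page model. This is a deliberately simplified sketch: the `Element` type and both lookup functions are hypothetical, and a real agent would reason over a rendered application rather than a list of records. The point is only that a check tied to an element id breaks on a rename, while a check tied to what the user sees does not.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Element:
    id: str
    role: str
    label: str


def find_by_id(page: list[Element], element_id: str) -> Optional[Element]:
    """Code-centric lookup: fails as soon as the id is renamed."""
    return next((e for e in page if e.id == element_id), None)


def find_by_intent(page: list[Element], role: str, label_contains: str) -> Optional[Element]:
    """Behavior-centric lookup: matches on role and visible label text."""
    needle = label_contains.lower()
    return next(
        (e for e in page if e.role == role and needle in e.label.lower()),
        None,
    )


# The same button before and after a refactor that renamed its id.
before = [Element("btn-submit", "button", "Place order")]
after = [Element("checkout-cta-v2", "button", "Place order")]

if __name__ == "__main__":
    print(find_by_id(after, "btn-submit"))                 # locator no longer matches
    print(find_by_intent(after, "button", "place order"))  # intent still matches
```

Note that this is not "self-healing" in the sense the episode is skeptical of: nothing is being repaired, because the check never depended on the locator in the first place.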
What If
- What if you integrated agentic testing into your GitHub workflows to automate PR testing?
  - Move: Set up a system where AI agents analyze GitHub pull requests, generate test goals based on code changes, and execute intent-based validation without scripted tests.
  - Why Now?: Modern development teams face bottlenecks in testing pipelines, especially as PR volume increases. Agentic testing aligns with GitHub's integration capabilities and reduces reliance on brittle, code-centric test cases.
  - Expected Upside: Accelerate feedback loops for developers, reduce manual QA intervention, and automatically flag edge cases like UI discrepancies or workflow changes that might be missed by traditional tests.
- What if you built a QA platform using knowledge graphs to simulate user journeys without scripted test cases?
  - Move: Create a tool that ingests application documentation, user flows, and API specs to build a knowledge graph. Train agentic AI to use this graph to explore apps like a user (e.g., clicking buttons, validating outcomes) and report bugs.
  - Why Now?: Testing complex systems requires understanding user intent over technical implementation. Knowledge graphs enable adaptive testing that scales with new features, avoiding the need for constant script updates.
  - Expected Upside: Reduce test maintenance costs by 80% through self-updating knowledge graphs and increase coverage of edge cases that manual testers might overlook in high-velocity development.
- What if you shifted testing to the development stage by using autonomous agents for early defect detection?
  - Move: Deploy autonomous agents during code review stages (e.g., GitHub PRs) to validate logic against domain-specific rules, user journeys, and historical bug patterns before merge.
  - Why Now?: The bottleneck has shifted from development to QA, but fast-paced workflows demand immediate feedback. Autonomous agents can act as a “virtual QA” during coding, complementing code review tools.
  - Expected Upside: Catch critical bugs (e.g., UI mismatches, backend logic flaws) 20% faster than manual QA, reducing rework and enabling developers to focus on higher-value tasks like feature innovation.
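A pre-merge check of the kind these scenarios describe eventually has to turn agent findings into a merge decision. The sketch below shows one simple policy, blocking on any finding at or above a chosen severity. The `Severity` levels, the `Finding` record, and the `gate` function are hypothetical names introduced for illustration; how a real pipeline reports the result back to the pull request is left out.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    INFO = 1
    MINOR = 2
    CRITICAL = 3


@dataclass(frozen=True)
class Finding:
    """One issue reported by an agent while exercising a journey (hypothetical)."""
    journey: str
    summary: str
    severity: Severity


def gate(
    findings: list[Finding],
    block_at: Severity = Severity.CRITICAL,
) -> tuple[bool, list[Finding]]:
    """Return (may_merge, blocking_findings) for a pull request."""
    blocking = [f for f in findings if f.severity >= block_at]
    return (not blocking, blocking)


if __name__ == "__main__":
    findings = [
        Finding("checkout", "Order total is not updated after removing an item", Severity.CRITICAL),
        Finding("search", "Placeholder text has a typo", Severity.INFO),
    ]
    may_merge, blocking = gate(findings)
    print("merge allowed" if may_merge else "merge blocked")
    for finding in blocking:
        print(f"- [{finding.severity.name}] {finding.journey}: {finding.summary}")
```

Keeping the threshold as a parameter lets a team start by blocking only on critical findings and tighten the policy as confidence in the agent's reports grows.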
Takeaway
- Integrate AI-driven testing into GitHub workflows: Utilize platforms that analyze pull requests (PRs) in real-time, run targeted tests, and automate test reviews to streamline quality assurance without manual scripting.
- Adopt agentic testing with behavioral knowledge graphs: Replace scripted test cases with AI agents that simulate user interactions, leveraging contextual understanding of user journeys, API specs, and application behavior to validate functionality autonomously.
- Automate bug reproduction via user-reported issues: Use AI agents to analyze user complaints (e.g., Slack messages) and reproduce bugs by replaying workflows, even for intermittent issues that require repeated testing.
- Continuously update knowledge graphs during onboarding and PRs: Maintain accurate, context-rich models of your application by iteratively refining knowledge graphs with new features, PRs, and user interactions to guide agents in testing critical paths.
- Prioritize intent-based validation over brittle automation: Focus test cases on user intent and experience (e.g., end-to-end validation of a specific feature like a chess opening guide) instead of relying on fragile, time-sensitive technical checks.
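The bug reproduction takeaway, in particular the case of intermittent issues that need repeated testing, can be sketched as a loop that replays a workflow many times and records which attempts failed. This is a minimal illustration under stated assumptions: `reproduce` and `make_flaky_workflow` are hypothetical helpers, and the flaky workflow is simulated with a seeded random number generator in place of an agent driving a real application.

```python
import random
from typing import Callable


def reproduce(workflow: Callable[[], None], attempts: int = 50) -> dict:
    """Replay `workflow` repeatedly and record which attempts raised."""
    failures: list[tuple[int, str]] = []
    for attempt in range(1, attempts + 1):
        try:
            workflow()
        except Exception as exc:  # any exception counts as a reproduction
            failures.append((attempt, repr(exc)))
    return {
        "attempts": attempts,
        "failures": len(failures),
        "first_failure": failures[0] if failures else None,
    }


def make_flaky_workflow(seed: int, failure_rate: float) -> Callable[[], None]:
    """Hypothetical stand-in for an agent replaying a user-reported workflow."""
    rng = random.Random(seed)

    def workflow() -> None:
        if rng.random() < failure_rate:
            raise RuntimeError("cart total did not update")

    return workflow


if __name__ == "__main__":
    report = reproduce(make_flaky_workflow(seed=7, failure_rate=0.1), attempts=100)
    print(report)
```

The failure count over a fixed number of attempts gives a rough reproduction rate, which is more useful in a bug report than a single "could not reproduce".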
For a PDF of longer Software Testing Podcast Episode Summaries with Briefing Notes and more detailed summary notes, visit EvilTester Patreon Podcast Summaries.