30 Days of AI in Testing – Days 1-5

Published on March 8, 2024

I'm currently doing the 30 Days of AI in Testing challenge offered by the excellent Ministry of Testing.

https://www.ministryoftesting.com/events/30-days-of-ai-in-testing

This blog is a record of Days 1 to 5, rather than having them just on the Club. It's quite a lot of writing and I like to have my writing in one place where possible.

Day 1

Introduce yourself: Tell us about your background, your role in testing or tech and how you found this community.

My name is Ash, I'm from the United Kingdom and have been a tester for quite a few years now. I have worked in exploratory testing, automation, and making tools to test with, plus performance and load testing: some of the ingredients of a successful test approach, I think. My first TestBash was in Brighton in 2013; I was working at a consultancy at the time and colleagues introduced me to MoT. I have since been a speaker and volunteer at various MoT events.

Your Interest in AI: What initially piqued your interest in AI in testing? Are there any particular areas of AI in testing that you’re eager to learn more about?

At the moment, I use generative AI to help me research the technologies I am about to test (strengths, weaknesses, claims), get my exploratory testing notes in order, and generate small scripts for tasks that don't warrant the full test toolsmith treatment: generating one-off test data, that sort of thing.
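
To make that concrete, here is the sort of throwaway script I mean: a minimal sketch that writes a CSV of fake user records. The fields and value ranges are invented for illustration.

```python
# Throwaway test data generator: a CSV of fake user records.
# Field names and value ranges are invented for illustration.
import csv
import random
import string

def random_email() -> str:
    user = "".join(random.choices(string.ascii_lowercase, k=8))
    return f"{user}@example.com"

with open("test_users.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["id", "email", "age", "country"])
    for i in range(100):
        writer.writerow([i, random_email(), random.randint(18, 90),
                         random.choice(["GB", "US", "DE", "FR"])])
```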

Your Goals: What do you aim to learn or achieve in this challenge?

I'm looking for a bit more inspiration: can I sensibly deepen my usage of AI in testing, while also being aware of its potential pitfalls?

Day 2

  • Look for an article that introduces AI in software testing. It could be a guide, a blog post, or a case study—anything that you find interesting and informative.

From the many pages of links to companies promising many things, I chose:

https://www.getxray.app/blog/ai-in-automated-testing

Mainly because I know a few xrayers and they are generally quite sensible.

  • Summarise the main takeaways from the article. What are the essential concepts, tools, or methodologies discussed?

The article steps through a couple of test automation design approaches using AI tooling:

  1. Tools that use character or image recognition to ‘spider’ your app and provide information about changes.
  2. The more classic ‘scripting’ approach, where you provide prompts to generate (for example) a Cypress test, perhaps augmented with in-editor tooling like Copilot; a rough sketch of what this can produce follows this list.
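
To make the second approach concrete, here is a rough sketch of the kind of prompt and generated test you might end up with. The article's example is Cypress; this is a Python/Selenium analogue, and the URL and selectors are entirely hypothetical.

```python
# Prompt given to the model (for illustration):
#   "Write a pytest test that opens https://example.com/login, fills in the
#    username and password fields, submits, and asserts the dashboard loads."
#
# The kind of test the model might hand back; URL and selectors are hypothetical.
from selenium import webdriver
from selenium.webdriver.common.by import By

def test_login_reaches_dashboard():
    driver = webdriver.Chrome()
    try:
        driver.get("https://example.com/login")
        driver.find_element(By.ID, "username").send_keys("test-user")
        driver.find_element(By.ID, "password").send_keys("not-a-real-password")
        driver.find_element(By.CSS_SELECTOR, "button[type=submit]").click()
        assert "Dashboard" in driver.title
    finally:
        driver.quit()
```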

Maintenance-wise, the article acknowledges that UI automation can be brittle, but AI tooling can help by updating its own model of your application as it changes, or by using LLMs for code review, comments and documentation.

  • Consider how the insights from the article apply to your testing context. Do you see potential uses for AI in your projects? What are the challenges or opportunities?

There were a few things that stood out:

  • Using AI to accelerate requirements and design mock-ups; this might help surface those hidden requirements that testers come up with early on (error handling, journey abandonment) and flesh them out earlier.
  • Phind, a ChatGPT-based product that can act as your pair programmer for prompt design.
  • Local-only URLs and authentication layers could be a real blocker for AI tools, especially the character and image recognition spiders.
  • The article warns that as complexity grows, prompt design is not enough; humans will need to intervene.
  • The execution section is a little light on the question ‘What is the smallest suite of tests needed?’ Using AI to target tests where the changes are is interesting, but the article lacks depth in this area; a naive sketch of the idea follows this list.
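
For what it's worth, here is a naive, non-AI sketch of that ‘smallest suite’ idea: map the files changed on a branch to test modules and run only those. A real tool would infer the mapping; here it is just an assumed naming convention (src/foo.py maps to tests/test_foo.py).

```python
# Naive "smallest suite" selection: run only the tests that match files
# changed since main. The src/ -> tests/test_*.py mapping is an assumption.
import subprocess
from pathlib import Path

changed = subprocess.run(
    ["git", "diff", "--name-only", "main...HEAD"],
    capture_output=True, text=True, check=True,
).stdout.splitlines()

targets = []
for path in changed:
    p = Path(path)
    if p.suffix == ".py" and p.parts and p.parts[0] == "src":
        candidate = Path("tests") / f"test_{p.stem}.py"
        if candidate.exists():
            targets.append(str(candidate))

if targets:
    subprocess.run(["pytest", *targets], check=False)
else:
    print("No matching tests for the changed files; run the full suite instead.")
```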

Day 3

List three or more different AI uses you discover and note any useful tools you find as well as how they can enhance testing.

I watched Daniel Knott's videos on AI tooling on YouTube:

  • Recognising patterns in your tests and suggesting improvements/optimisations
  • Visual AI to see design differences in your mobile or web application.
  • Self-healing – an ID has changed and the tool can adapt test cases accordingly (the fallback idea behind this is sketched after this list).
  • Visualise user journeys to help focus your tests on what the users are doing.
  • Using natural language processing to turn plain text into the code for an automated test (or even test steps).
  • Analysing test run data to detect trends and suggest improvements and optimisations.
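
The self-healing tools presumably use a model to pick a replacement locator; stripped of the AI, the underlying fallback idea looks something like this sketch (Selenium, with hypothetical selectors).

```python
# The fallback idea behind "self-healing" locators, minus the model: try the
# primary selector, then the candidates, and report which one worked so the
# test can be updated. All selectors here are hypothetical.
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.by import By

def find_with_healing(driver, locators):
    """Try each (By, value) pair in turn; return the first element found."""
    for by, value in locators:
        try:
            element = driver.find_element(by, value)
            print(f"Located element via {by}={value}")
            return element
        except NoSuchElementException:
            continue
    raise NoSuchElementException(f"No locator matched: {locators}")

# Usage: the original ID first, then fallbacks a tool might have learned.
# submit = find_with_healing(driver, [
#     (By.ID, "submit-btn"),                          # original, may have changed
#     (By.CSS_SELECTOR, "button[type=submit]"),       # structural fallback
#     (By.XPATH, "//button[contains(., 'Submit')]"),  # text fallback
# ])
```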

These are covered by the usual suspects: Testim, Mabl, Applitools, etc.

Reflect and write a summary of which AI uses/features would be most useful in your context and why.

In my context I think:

  • Recognising patterns in your tests and suggesting improvements/optimisations – I spend a lot of time reviewing unit, integration and acceptance tests for consistency and code quality, so it would be nice to hand some of this off, as long as I could give the tool guidance in the prompt about the patterns I want to see.
  • Analysing test run data to detect trends and suggest improvements and optimisations – our tests literally run thousands of times a day, so individual reports are not much use. Gathering information over time about where tests are problematic (matched to the class each failure is tied to) would be really useful; a sketch of what I mean follows.
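
Here is a minimal sketch of the kind of analysis I mean, assuming test results land somewhere queryable as simple records; the field names are invented.

```python
# Sketch: aggregate results from thousands of runs to find the most
# problematic test classes. The record fields (cls, passed) are invented.
from collections import Counter

def flakiest_classes(results, top=10):
    runs, failures = Counter(), Counter()
    for record in results:
        runs[record["cls"]] += 1
        if not record["passed"]:
            failures[record["cls"]] += 1
    rates = {cls: failures[cls] / runs[cls] for cls in runs}
    return sorted(rates.items(), key=lambda kv: kv[1], reverse=True)[:top]

# Example with made-up data:
results = [
    {"cls": "CheckoutService", "passed": False},
    {"cls": "CheckoutService", "passed": True},
    {"cls": "AuthService", "passed": True},
]
for cls, rate in flakiest_classes(results):
    print(f"{cls}: {rate:.0%} of runs failed")
```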

The others (self-healing, generating test steps, OCR) seem like they might be useful for teams with separate testing functions (other companies), but most of the need for them disappears if the cross-functional development team works well together.

Day 4

Watch the “Ask Me Anything on Artificial Intelligence in Testing” with Carlos. You can choose to watch the whole thing (highly recommend!) or choose questions of interest by using the chapters icon on the player or by clicking the chapters from the playbar as indicated with small dots. Take notes as you go.

  • Can you test for biases in AI? You can test sentiment analysis: is the output positive or negative? Use invariance testing: replace tokens within a positive or negative sentence and see how the model reacts (sketched after this list).
  • Assessing the confidence your users have in your AI-powered software – usually a prompt has a few possible answers, and some tools ask if one response to a prompt was better than another.
  • What tool are you using for AI testing? Using LangSmith to test LangChain, or make it observable. Otherwise, use pytest as a test runner; it's not as complicated as you might think.
  • How can we make AI unlearn the concepts learned? With open source models you can see what they were trained on; with GPT you don't know for sure. You can also give GPT files to use as a knowledge base, which it can use first rather than just going out to the internet. The main hub for open source models is huggingface.co.
  • How can we use AI with day-to-day testing? At the moment it's cool but not that useful, as it doesn't have the context. Ask ChatGPT for an API test and it will do its best, but if the tool is within Postman and can see your other requests, responses, etc., it will be much better as it has more context.
  • How to get into AI testing? Start using ChatGPT to try to solve some of your problems. Don't underestimate how powerful the tools are. Research prompt engineering.
  • Security and confidentiality of your data being processed by the AI: if you don't want to trust a vendor with the data, don't send it.
  • What's the difference between machine learning and AI? ML is algorithms and pattern matching; AI, rather than matching and mapping, infers from data to answer a question.
  • Where do you see the role of the software tester in 10 years? The analytical part of the tester's role will still be very valuable. AI does well with structured endeavours that have rules and constraints, like writing code, but less so with analytical techniques; AI would be better at automation, for example.
  • Using Copilot to support automation and testing: use Copilot to assist with TDD – write your tests and use Copilot to generate functions and classes. ReportPortal (reportportal.io) can look at test results and check for failure patterns. StarCoder is an open source alternative to Copilot.
  • How can AI help with usability and accessibility? AI can help out with standards (like those in Lighthouse); however, an AI doesn't have to work with a keyboard and mouse, for example, so you can't fully trust it.
  • Where can AI help a junior tester test better? Learning about all the terms that get thrown around in testing that you are supposed to know; a GPT could help you get started on a testing problem. You can ask other testers, but you don't always have other testers around.
  • Guard the quality of an AI that changes how it behaves in production. This is known as data drift: if conditions change after initial model training, the model only ever has the context of the data it was initially trained on (a simple drift check is sketched after this list).
  • What testing tasks lend themselves best to being done with AI tools? AI is good at structured work: creating scenarios, writing code. We need to use tools that have the context of what we are working on, rather than treating ChatGPT and Gemini themselves as the tools.
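
A minimal sketch of the invariance idea from that first answer, assuming a hypothetical get_sentiment function standing in for whatever model is under test: swapping names that should not matter must not change the label.

```python
# Invariance testing for bias: swap tokens that should not matter (here,
# names) and check the sentiment label stays the same for all of them.
# get_sentiment is a hypothetical stand-in for the model under test.

def get_sentiment(text: str) -> str:
    """Placeholder: return 'positive' or 'negative' from the real model."""
    raise NotImplementedError("wire this up to the model being tested")

NAMES = ["Alice", "Mohammed", "Priya", "John"]
TEMPLATE = "{name} was extremely helpful and resolved my issue quickly."

def test_sentiment_is_invariant_to_name():
    # The sentence is clearly positive, so every name swap should agree.
    labels = {get_sentiment(TEMPLATE.format(name=name)) for name in NAMES}
    assert labels == {"positive"}
```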
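
And one simple way to watch for the data drift mentioned above: compare the distribution of a feature in production against the training data, for example with a two-sample Kolmogorov-Smirnov test from scipy. The data and the 0.05 threshold here are made up for illustration.

```python
# Simple data drift check: compare a production feature distribution with
# the training distribution using a two-sample KS test. The synthetic data
# and the 0.05 threshold are illustrative assumptions.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(seed=1)
training_feature = rng.normal(loc=0.0, scale=1.0, size=5_000)    # what the model saw
production_feature = rng.normal(loc=0.4, scale=1.0, size=5_000)  # what it sees now

statistic, p_value = ks_2samp(training_feature, production_feature)
if p_value < 0.05:
    print(f"Possible drift detected (KS={statistic:.3f}, p={p_value:.4f})")
else:
    print("No significant drift detected")
```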

After watching, reflect on the session and share the takeaway that had the biggest impact for you by replying to this topic. For example, this could be a new understanding of AI’s potential in testing or any ethical considerations that stood out to you.

  • I really liked the emphasis that generative AIs in their web containers are not testing tools per se; they are interesting but not that useful. To be truly useful, the AI needs to be in context (that word again), like the example of generating tests within Postman with Postbot, rather than just from the ChatGPT response window.

Day 5

Option 1: Case Study Analysis

Search for a real-world example of where AI has been used to tackle testing challenges. This could be a published case study or an example shared in an article or blog post.

This one seemed interesting to me, as managing lower-level tests and their quality is quite a challenge:

https://www.parasoft.com/resources/case-studies/ai-driven-java-unit-testing-boosts-developer-productivity-for-financial-firm

Select and analyse a case study that seems relevant or interesting to you. Make a note of the company and context, how AI was applied in their testing process, the specific AI tools or techniques used and the impact on testing outcomes/efficiency.

  • This was a large financial services company with a set of legacy services, some of which had little code coverage and were seemingly buggy. Devs had to do a lot of rework on these services and spent a lot of time creating unit tests for legacy code.
  • In this context, the AI tooling was an IDE plugin that guided users through the process of creating unit tests and then targeted those tests at change hotspots.
  • Apparently there were big gains in developer productivity through less time on unit testing and more on ‘innovation’, plus better delivery through reduced defects and rework.

As this is a company selling their tool, it's obviously biased and only gives you the positive side of the case study. The ideas are interesting though, as a lot of time is spent adding tests to legacy code (or not, and watching it break) and running tests in areas that haven't changed. These seem to me like great uses of generative AI, depending on how well it is done of course!