Friday 16 December 2016

The Testing Pendulum: Finding balance in exploration

How detailed should exploratory testing be?

I spotted this question in the Cambridge Lean Coffee topics that James Thomas collated in his blog. It's a question that I'm often asked, and I consistently use the same analogy in my response: the testing pendulum.

The Testing Pendulum

Imagine a pendulum at rest. The space in which the pendulum can swing is our test approach. At the left apex we are going too deep in our testing and at the right apex we are staying too shallow. Initially, the testing pendulum sits directly in the middle of these two extremes:

The Testing Pendulum

When a tester starts in a new organisation or a new project, we apply the first movement to our testing pendulum. Think of it as lifting the pendulum up to the highest point and letting go. The swing starts from the shallow side of the spectrum to reflect the limited knowledge that we have as we enter a new situation. As our experience deepens, so too does our testing.

Starting the pendulum

"When given an initial push, [the pendulum] will swing back and forth at a constant amplitude. Real pendulums are subject to friction and air drag, so the amplitude of their swings declines." [Ref]

Similarly in testing, the pendulum will swing backwards and forwards. When our testing has become too intensive, we switch direction. When our testing has become ineffectual, we switch again.

Pendulum changing direction at the peak of its swing

The skill in knowing how detailed testing should be is to recognise the indicators that tell you when the pendulum is at the top of its swing. You need to be able to identify when you've gone too deep or stayed too shallow, so that you can adjust your approach appropriately.

Indicators

Indicators help us determine the position of our pendulum in the testing spectrum. I see three categories of indicator: bug count, team feedback and management feedback.

Bug count

If you are not finding many bugs in your testing, but your customers are reporting a lot of problems in production, then your testing may be too shallow. On the flip side, if you raise a lot of bugs but not many are being fixed, or your customers are reporting zero problems in production, then your testing may be too deep.

As a caveat, the zero problems in production measure may not apply in some industries. A web-based application may allow some user interface problems to be released where the economics of fixing these does not make sense, but a company that produces medical hardware may seek to release a product that they believe is perfect no matter how long it takes to test. Apply the lens of your own organisation.

Team feedback

Whether you're working in waterfall testing team or an agile delivery team, you are likely to receive feedback from those around you. Be open to those opinions and use them to adjust your behaviour.

If your colleagues are frequently adding scope to your testing, questioning whether you've spent enough time doing your testing, or perform testing themselves that you think is unnecessary, then your testing may be too shallow. On the flip side, if your colleagues are frequently removing scope from your testing, questioning what else you could be doing with the time that you spend testing, or do not perform any testing themselves, then your testing may be too deep.

On the point of colleagues doing testing, this is a particularly useful indicator in agile teams. In the extreme case, if no unit tests are being written and your developers are outsourcing their testing to you, or if the business trust you to do their user acceptance testing, then it's likely that you're testing too deeply. If you want to cultivate an environment with shared ownership of quality then you have to allow room for that to happen.

Management feedback

If your testing pendulum is sitting at a point in the spectrum where your team are unhappy, it's likely that your manager will have a direct conversation with you about your test approach.

If you're testing too much your manager will probably feel comfortable about telling you this directly. If your testing is too shallow, you might be asked for more detail about what you're testing, be questioned about bugs reported by users, or have to explain where you're spending your time.

Indicators are heuristics, not rules. They include subjective language, i.e. "many", "not many", "frequently" or "a lot", which will mean different things in different situations. As always, apply the context of your organisation to your decision making.

The indicators that I've described can be summarised by opposing statements that represent the extremes of the pendulum swing:


Finding equilibrium

Eventually, most testers will end up at an equilibrium where the pendulum hangs at rest, halfway between the two extremes. The comfort of this can be problematic. Once we have confidence in our approach we can become blind to the need to adjust it.

I believe the state that we want to strive for lies slightly towards the left of the spectrum, or too deep. In order to keep a pendulum positioned off-center, we have to regularly apply small amounts of pressure: a bump!

Pushing the boundaries

If you've been testing a product for a long time and wish to avoid becoming stale, give your testing a bump towards greater depth. Apply some different test heuristics. Explore using different customer personas. Alter your test data. Any variation that could provide you with a fresh crop of problems to explore the validity of with your team.

You can use the outcome of a bump to calibrate your test approach within a smaller range of pendulum movement. Which changes are too much? Which are welcome and should be permanent added to your testing repertoire? Continued small experiments help to determine that you are in the right place.

Conclusion

The testing pendulum is a useful analogy for describing how to find balance in a test approach. When entering a new team or project, it can be used to illustrate how we experience large initial swings in the depth of our testing as we learn the indicators of our environment. For those who have been testing the same product for a while, it can remind us to continuously verify our approach through bumps.

There is no one right answer to "How detailed should exploratory testing be?", but I hope the testing pendulum will help you to determine and describe the right level of detail for you.

Monday 12 December 2016

Take control of your test environment

As CAST 2015, Ioana Serban delivered a talk titled "Take control of your test environment". It's an entertaining and animated tale of her experiences with test environments. I'd encourage you to watch the talk if you haven't seen it previously.

I chose Ioana's presentation as the basis for a recent online testing knowledge sharing session in my organisation. As we are still working as a dispersed team, it currently isn't practical to bring all the testers together in one physical location to share ideas. Instead I asked them to watch Ioana's talk in their own time prior to a discussion session where we met for an hour in an online group chat to talk about the ideas that she raised.

Though our discussion was a little difficult to follow at times, on account of the many people typing in parallel, there were some very interesting points raised. We covered four topics: our build stability, techniques for test automation triage, the keys to our environments, and the people we work with in our own test environment quests.

Build Stability

Ioana spoke about techniques for encouraging stability in automated builds. She contrasted an approach where punishment is a deterrent e.g. wear a funny hat or pay a fine if you are the person responsible for a broken build, to a more positive outlook e.g. a counter on the wall that tracks the number of days since the last time that the build failed. Rather than amplifying mistakes, celebrate success.

We started by talking about how we currently encourage stability in our teams and tribes.

The initial response was that we don't focus on the stability of our master code base but rather the stability of what we are merging to it. The testers are careful to only push change from their branches when it has been peer reviewed and executes reliably. If bugs do end up in master, the tester will take ownership of investigating the automation failure, talk to the developers in their team, and collaboratively create a fix. The testers commented that "most of the time our build is stable compared to other places I have worked" and "in other organisations, it takes days to resolve problems, but in [our product] it takes hours".

But we also had one delivery area where things weren't so rosy. One tester said that their builds are usually broken "to the point where most people don’t have a reaction to a failed or unstable build". There was interesting discussion around the approach to resolving this that touched on the willingness of capable individuals to take ownership of the problems, allocation of time to build-related activities, incorrect configuration in our continuous integration server, failure fatigue and aging infrastructure.

We talked about using the time since last failure that is recorded in our continuous integration server to track stability of our builds. We haven't historically put much focus on this metric, but this may need to change if we move to a model of celebrating success.

The current method to keep attention on stability is to push build failure notifications to a team chat channel. Repeated messages make it difficult to ignore instability. Some of the testers also run their automation every time that a developer deploys into their test environment, which keeps both the automation and the environments stable.

Finally we talked a little about how instability can be a positive thing too. If the tests pass all the time, is there any point to them? There was a comment that "a test that never fails never gives you good information". However, we want our failures to be on branches and not on master. It also shouldn't be the test code itself that fails but rather a problem in the application or configuration that is highlighted by the test.

Test Automation Triage

Ioana spoke about separating her test automation into categories and treating each set differently: a suite of known bugs, a suite of flaky tests, and a suite that works reliably. We don't use an approach like this to triage our automation and I was curious about what the testers thought of it.

It's fair to say that opinions were mixed.

In one of our products we have a relatively old automation suite that has created some stability problems in the past. Testers in this tribe were eager to try Ioana's approach, though some investigation will be required to determine whether it's possible with the tool set that we use.

In our other products instability is either extremely rare or completely pervasive. Where instability is rare, we generally have the luxury of focusing on and resolving a single problem. There is little benefit to segregating this work from our master code repository. Where instability is pervasive, we would effectively be tipping the entire suite into the flaky bucket, which negates the benefit of having a separate category.

Though we are familiar with flaky tests, the idea of having an automation suite of known bugs was not something that we'd considered before. Some people thought that it might be useful to highlight known problems and give the developers a resource to target their code changes. But there was also a viewpoint that it would be an expensive waste of time to code a suite that always failed.

Finally, we diverged slightly into discussion about failures beyond our automation. We realised that we don't have visibility of which bugs are part of our product backlogs or a centralised register of known issues with our release test environments. Both were taken as areas for action.

Keys to our environments

Ioana talked about keys as a metaphor for access to aspects of her test environments. The four keys that she mentioned were access to code, access to the database, access to monitoring, and permission to deploy. Our next discussion topic was about the keys that the testers have for our test environments and the keys that they are missing.

Through this conversation I learned that many of the testers have limited access to our release testing environments. Some can examine server logs and query the release environment database, but others cannot. In theory, every new tester who joins the organisation is granted identical permissions. In practice, it seems that there are inconsistencies in the access that has been configured for individuals. Now that I am aware of this, I've started asking questions.

The other keys that the testers identified weren't specifically about test environments. Instead they were keys to information that will help focus their testing. One tester asked for improvements in device usage analytics to better target their native mobile application testing. Another requested access to customer feedback to get a better understanding of how our customers are using our products.

The idea of keys made people think broadly about areas that were locked to them and who they would need to speak with to open these doors.

People

Ioana finished by talking about the relationships that she'd built with people across her organisation who helped her to take control of her test environments. The last discussion topic focused on sharing the go-to people that the testers would call on for help when facing their own test environment challenges.

I was expecting people to name individuals but the most common response was that the testers direct their questions to a group. One commented that asking a group opens the opportunity for someone who you perhaps wouldn't ask directly to respond. I think the order in which the groups were named in our discussion reflects the order in which people would ask them for help.

First, developers. This was overwhelmingly the immediate response. I like that the testers have built relationships with their developers where they feel completely comfortable asking them questions, even those that aren't directly related to the code. I think developers as a first response is a symptom of healthy delivery teams.

The problems that the developers cannot solve are usually resolved by a solution coming from outside the expertise of the delivery team. The next group mentioned in the discussion were "all of the testers". As a collective that spans across multiple teams and tribes, we may not always know the answer but we are well-connected to people who might. Someone else said that they would choose an appropriate online chat channel to ask their question to.

Finally, one tester commented that testers should try to build our capability to identify the problem with our environments instead of simply saying that one exists. As with raising a bug, it's easier to resolve if more information is provided. Particularly when we turn to a group for assistance, asking the question with appropriate details is important.

Conclusion

Ioana's talk is interesting in its own right, but I would also recommend it as the foundation of a team discussion about test environments. We found a lot of scope for deep conversation about applying Ioana's ideas in our organisation.

I finished our session by asking the team to share one thing that they would think more about or take action on. The points that came back were indicative of a presentation that covered attitude, approach and technical techniques, including: 
  • Keep asking questions, be polite, and build relationships with your colleagues.
  • "We are the processes, they are living agreements."
  • Learn what keys are available for your test environments and work to get them. 
  • Do your research and send good information to the support team when asking for help.
  • Create a failing test to make a bug fix easier for a developer.

I hope that this session will encourage the testers that I work with to be more mindful of build stability, techniques for test automation triage, the keys to our environments, and the people we work with in our own test environment quests. Ultimately, we'll take control of our test environments.

Sunday 4 December 2016

Stop fighting. Start participating.

In her editorial for the October edition of Women Testers, Karen Johnson talks about change in the testing profession, particularly the increasing demand for test engineers and automation. She concludes her thoughts by saying: 

"...we had to fight for our profession in the beginning and, as it turns out, we might need to fight the good fight yet again."

I find the rhetoric of fight relatively common in the testing community. It's strong language to use in a professional context. To encourage people to fight is to encourage them to "take part in a violent struggle to overcome, eliminate, or prevent". It seems at odds with the type of environment that many of us work within or would choose to create with our colleagues.

In my experience, the quickest way to be removed or excluded from a collaborative conversation in the workplace is to focus solely on fighting for your viewpoint. If you are seen as aggressive or stubborn, people will probably choose not to talk to you. Decisions start to happen around you, rather than with you. Change still occurs, but by fighting you've removed your ability to influence it.

I don't believe that we should approach change with an attitude to fight it. 

Instead, I would like testers to focus on being part of the conversations that create evolution in our roles. We need to invite ourselves along, contribute with enthusiasm and pragmatism, and shape change that includes our perspective. Our mindset is not to fight, but to participate.

How can we do that?

I believe that any tester, regardless of their experience or position in an organisation hierarchy, can develop the skills and relationships required to effect decision making. But it's not easy and the scope of what you can influence will vary.

Start by developing your understanding of where change comes from in your organisation. As testers our roles may be shaped by test managers, or software development managers, or managers in the wider organisation. How far away from you are the decision makers? 

What forum are decisions being made in? Could you attend? If you're not sure, don't assume that the meeting is closed. Speak to the organiser. If you're unable to attend as a contributor you may be permitted to observe. 

Regardless of whether you can be present, try to find your closest advocate in that forum. This is probably the person whose views most closely match your own. It could also be the person who is most receptive to alternate perspectives. If your manager is in the forum and you believe your best advocate is someone else, use extra care and diplomacy as you navigate future conversations!

Try to develop a connection with that person. Take time to understand their viewpoint and how they've come to develop it. What problems do they believe will be solved with proposed change? What do they understand to be the potential benefits and drawbacks of their choice? 

Think beyond information to the context that it originated from. How much do you know about the environment that the decision maker is in? What pressure is being applied to them from other areas of the organisation? What is their personal history in the organisation? Empathy will help you feel the weight of their context in their choices.

Only after all this groundwork do you start thinking about your own contribution to the conversation. Can you broaden understanding of the existing problems? Can you offer additional information about the options being considered that may alter the final choice? What new alternatives can you suggest that still resolve the specific concerns of the decision maker?

Then, how can you present your viewpoint persuasively? Successful participation is not just about what you say, but also in how you deliver it. Can you control the pitch and pace of your voice? Can you demonstrate positive body language and illustrate your message with appropriate gestures? If you need to, practice first. You should appear confident in your ideas.

Finally, be comfortable with the outcome, whatever it is. Even when you feel some disappointment in a decision, participating in the discussion will have improved your understanding of how it was made. And remember, the goal is not to "overcome, eliminate, or prevent". The goal is to be involved. 

I've described a high-level approach to contributing to change that illustrates a collaborative mindset rather than a combative one. Behind each step is a depth of knowledge to uncover and new skills to develop. It may be harder to be a diplomat than a warrior, but I believe this path will focus your energy in a constructive way.

There is no doubt that the testing profession is evolving. If you want to help shape the changes that affect you, then contribute in your own organisation. Stop fighting and start participating.

Friday 25 November 2016

Finding the vibe of a dispersed team

Recently there has been an unexpected change in my work environment. Just after midnight on the 14th of November, an earthquake with a magnitude of 7.8 struck New Zealand. The earthquake caused significant damage across the upper South Island and lower North Island, including in Wellington where I am based. My work building is currently closed due to earthquake damage.

I work with over 30 testers who are spread across 18 delivery teams. In a co-located environment that's a challenging number of people to juggle. Now that everyone is working from home, there are new obstacles in trying to lead, support and coach the testers that I work with.

In the past fortnight I've been doing a lot of reading about distributed teams. Though some of the advice is relevant, most of it doesn't apply to our situation. We're not distributed in the traditional sense, across multiple cities, countries and timezones. Though we're set up for remote work, it hasn't been our go-to model. We're still close enough that relatively regular face-to-face meetings are possible.

Instead of distributed, I've started to think of us as dispersed.

The biggest challenge so far, in our first fortnight as a dispersed team, has been in determining the vibe of the testing community. The vibe of the team is the atmosphere they create: what is communicated to and felt by others. The vibe comes from feelings of the individuals within the team.

In a co-located environment, there are a lot of opportunities to determine the vibe. The most obvious is our weekly Testing Afternoon Tea. This is a purely social gathering every Tuesday afternoon at 3pm. We have a roster for who provides the afternoon tea, all of the testers meet in the kitchen area, and spend around 15 minutes catching up. The meeting is unstructured, the conversations are serendipitous.

When everyone turns up to afternoon tea, stays for the entire 15 minutes, and there is a hum of conversation, the vibe of the team feels happy and relaxed. When it is difficult to detach people from their desks, people grab food then leave, and the conversations are mostly cathartic, the vibe of the team feels stressed and frustrated. Often, there's a mixture of both.

But even when the testing team are not together, I am reading the vibe from our co-location. For example, I'll often wander the floor in the morning when stand ups are happening. I look at how many people from outside the team are attending. When I spot multiple delivery managers and product owners with a single team, that may be a sign that the team is under pressure or suffering dysfunction. If it seems like the testers are not contributing, or they have closed body language, that may be a sign of frustration or despondence.

The vibe helps me determine where to focus my attention. It's important to be able to offer timely support to the people who need it, even if they may not think to ask. It's important to determine whether it's an appropriate time to think about formal learning, or if it's better to give people space to focus on their delivery demands. It's important to recognise when to push people and when to leave them alone.

Facing the reality of coaching a dispersed team feels a little bit like being blindfolded. The lack of co-location has removed one of my senses. How do I find the vibe of a dispersed team?

I find working at home quite isolating, so the first action I took was to try and reduce the feeling of being far away from everybody else. Though our communication is now primarily through online channels, we are only dispersed and not distributed.

At the start of this week, I asked the testers to check-in to tell me which suburb of the city they were working from and whether they had all the equipment they needed to work effectively. Through the morning I received responses that I used to create a map of our locations. We are now spread across an area of approximately 600 square kilometres or 230 square miles:

Working locations of testers before and after the earthquake

The information in the map is specific enough to be useful but general enough to be widely shared. Markers are by suburb, not personal address, and are labelled by first name only. Tribe groupings are shown by colour and in separate layers so that they can be toggled so that it's possible to see, for example, where all our mobile testers are located.

Creating the map was a way to re-assert that we are still a community. I felt this was a pre-requisite of keeping the testers connected to each other and mindful of the support available from their peers.

The check-in format that I used to gather the information at the start of the week worked well. It meant that everyone contributed to the discussion. I plan to start each week with a check-in of some description while we remain dispersed.

Next I started to consider how to create an environment for the informal gathering and conversation that would usually happen at our weekly afternoon tea. November is traditionally a busy time of year for our delivery as we work to release before the holiday period. Even when we're co-located, it can be hard to get people together. Any distraction from delivery had to have an element of purpose.

Communication was emphasised in everything that I read about distributed teams, with the message that more is better when people are working remotely. I wanted a daily rather than a weekly pulse, but it had to be designed for asynchronous communication. It wasn't feasible to attempt to book a daily appointment and gather people together.

I decided to make use of a book of objective thinking puzzles that I purchased some time ago but never completed. The puzzles are relatively quick, have a purpose in expanding thinking skills, are well suited to remote asynchronous communication, create enough interest that people participate, and offer the opportunity for some conversation outside of their core purpose.

The hardest Puzzle of the Day so far!

I've started to share a puzzle each morning with the testers via an online chat channel. This is keeping the channel active with conversation, which is essential for me to determine the vibe. I'm yet to determine importance within the patterns that I see. I don't assume that silence is bad. I don't assume that people who are active aren't under pressure. But I hope that encouraging informal conversations will start to provide rich information about how people are feeling, just as it did in the office.

Finally, I've started to attend meetings that I would usually skip in our co-located environment. This week the coaching team that I belong to attended two of our product tribe gatherings. These focus on sharing information that delivery teams need to succeed and recognising achievements in what we've already released to our customers.

The content is not directly relevant to me, but these events were a great opportunity to determine the vibe of those tribal groups and the testers within them. Having the ability to sense the atmosphere was worth the hassle of arranging transport and balancing calendar conflicts to attend. It was also a way to be visible, so that people remember to call on us for help too.

It's still early days for our dispersed team. These are just a few things that I've done this week to try to lift the blindfold. I'm curious to hear from other people who coach across dispersed or distributed teams. How do you determine the atmosphere of your team? Where do you discover opportunities to support people? What suggestions do you have that I could try to apply?


Sunday 6 November 2016

Can we remove a tester from our agile team?

In my role, I work with over 30 testers who are distributed across 18 different agile teams. If you do the math on those numbers, you'll realise that we don't have the same number of testers in each of our teams. Some have one tester, some have two, and in exceptional cases there can even be three testers working together.

We generally try to match the skills of a team with the type of work that we're asking that team to do. This can mean that as the nature of work changes, the teams shifts in response. I was recently asked for my views on whether a team could reduce the number of testers they had. I found that I was quite unprepared for the conversation.

How do you know whether you can remove a tester from an agile team?

As with most things, I don't think there are definite rules here. However, having though a lot about how you'd decide whether to remove a tester, I think there's value in a general set of questions to ask. I see five areas to consider: team dynamic, support for quality, context beyond the team, measurement and bias.


Team Dynamic


You never remove a tester though, you remove a person and that person is unique.

The first thing to do is make the question personal. You want to consider the real people who will be impacted by the decision that you're making. There may be multiple faces to removing a tester: think about the team that they leave, the team that they move to, and the experience of the individual themselves.

How many testers do you have in the team? Are you removing the only tester?

Are you removing the tester role or testing as an activity? Are there, or will there be, others in the team with testing experience and knowledge, even if not testers? How will other people in the team feel about adopting testing activities? What support will they need?

Are you replacing the tester in the team with another person? What will their role be? How will the change impact specialist ratios within the team?

If the person is being moved to a new team, what opportunities exist for them there? How will their skills and experience match their new environment? What impact do you expect this change to have on the team that they are joining? How will the people in that team have to change their work to accommodate a new team member?


Support for Quality


If you remove a piece from a Jenga puzzle, will it fall? The impact depends on what it supports.

The quality of your product doesn't come from testing alone. There are many activities that contribute to creating software that your customers enjoy. It's important to determine the depth and breadth of practices that contribute to quality within the team that you're looking to change.

What level of test automation is available to support the team? Is it widely understood, regularly executed and actively maintained?

What other practices contribute to quality in the team? Pair programming? Code review? Code quality analysis? Continuous integration? Business acceptance? Production monitoring? To what extent are these practices embedded and embraced?

What does the tester do outside of testing? What sort of impact do you expect on the social interactions of the team? Agile rituals? How about depth of product knowledge? Customer focus? These things may not specifically be test activities or skills, but do impact quality of the product.


Context beyond the team


It would be interesting to understand their motivation behind asking that question.

The wider context to the change you are making will have significant impact on how people feel about it. You should consider what your underlying reasons are, how people will feel about those reasons, and how far-reaching the implications of your change could be.

What’s the context of the movement being made? In Dynamic Reteaming, Heidi Hefland talks about five different scenarios in which you may want to change a team:
  1. Company growth
  2. The nature of the work
  3. Learning and fulfilment
  4. Code health
  5. Liberate people from misery
Is one of these applicable? If not, can you succinctly explain what your reason is? How is this wider context being viewed by the team(s) involved? Are they enthusiastic, cautiously optimistic, or decidedly negative?

What is the composition of surrounding teams? Similar or different? How will that impact the outcome? If I’m part of the only team without a tester, or the only team with a single tester, what impact will that have?

If there are governance processes surrounding release, is the person who approves change to production supportive of the change to the team? Will that person still have confidence in what the team deliver?


Measurement


How do you know what quality is now? 

As with any change, it's important to understand how you'll track the impact. Changing the way that a team approach testing could impact both the quality of software they create and how they feel about the work they do, in a positive or negative way.

What metrics help you determine the quality of your product? If quality decreases, how low is acceptable? What is your organisation’s appetite for product risk? How many production bugs can the reputation of your organisation withstand?

What metrics help you determine the health of your team? If productivity or morale decrease, how low is acceptable? What is your organisation’s appetite for people risk? What impact will the change have on happiness and retention of other roles?


Bias


How would someone else answer these questions? 
Remember the bias we have that impacts our answers.

The final point to remember is that you don't want to consider these questions alone. A manager who sits outside a team may have quite different answers to a person who works within it. The tester who is being moved will have opinions. Whoever is ultimately responsible for the decision should also think about how other people would respond to the previous prompts.


When I'm asked whether or not we can remove a tester from a team, I often have an immediate and intuitive response that is either positive or negative. Now that I've thought about what contributes to this decision, I will be able to articulate the reasoning behind my instinct. Team dynamic, support for quality, context beyond the team, measurement and bias; five pillars that I'll be using in my next conversation.


Thanks to Remi Roques, Kathy Barker, David Greenlees and everyone who responded to the initial thread on Twitter for helping to refine my thoughts on this topic.

Wednesday 2 November 2016

Stay Interviews for Testers

Recently one of my colleagues sent out an article about stay interviews. Basically, a stay interview is the opposite to an exit interview. Instead of waiting for people to resign then asking them why they are leaving the organisation, you try to determine what is making them stay.

Stay interviews are primarily a retention tool. They're a means of staying connected with the people who work with you, and maintaining an environment that they're happy to be contributing in.

I was interested in this idea, so I decided to try it out. I met one-on-one with every permanent tester in my department to ask a set of questions that touched on motivation, happiness, unused talents and learning opportunities. The answers that I collected provided me with a pulse of the team as a whole, and valuable insights into the individuals who I coach too.

The Questions

I pulled all of my stay interview questions from across the internet. There are a lot of articles around that will give you examples. Some that I read as I researched the concept of stay interviews were:

The ten questions that I chose as being most relevant to my organisation and the purpose of my discussions were:
  1. The last time you went home and said, "I had a great day, I love my job," what had happened?
  2. The last time you went home and said, "That's it, I can't take it anymore," what had happened?
  3. How happy are you working here on a scale of 1-10 with 10 representing the most happy?
  4. What would have to happen for that number to become a 10?
  5. What might tempt you to leave?
  6. What existing talents are not being used in your current role?
  7. What would you like to learn here?
  8. What can I do to best support you?
  9. What do you think is the biggest problem in [our testing team] at the moment?
  10. What else should I be asking you?

The Answers

All of these individual conversations were in confidence. However I did create a high-level document to share with other leaders in my department, which summarised key themes through illustrations, graphs, and anonymised comments. What follows is a subset of that, suitable for sharing.

I took the answers to the first two questions, categorised the responses, then created word clouds that demonstrated the common topics. An awesome day for a tester was one in which they are discovering new things to learn, have released software to production, or are simply enjoying the momentum of completing their work at a steady pace:

"I had a great day, I love my job"

An awful day for a tester is one in which their delivery team is in conflict or has a misunderstanding, where they’re in the midst of our release process, or when they’ve encountered issues with our test environments.

"That's it, I can't take it anymore"

What I found particularly interesting about these responses was how general they were. There were not many comments that were specific to test alone. Rather, I believe that these themes would be consistent for any of our delivery team members: business analysts, developers, or testers.

The question around happiness prompted for a numeric response, so I was able to graph the results:

How happy are you working here on a scale of 1-10 with 10 representing the most happy?

This data was interesting in that the unhappiest testers were mostly from the same area. This was a clear visual to share with the leadership in that particular part of the department, to help drive discussion around specific changes that I hope will improve the testing habitat.

When asked what would improve happiness, salary was an unsurprising response. But other themes emerged of almost equal weighting. Time to deliver more automation, a consistent workflow for testers, and the ability to pick up and learn new tools.

In response to existing talents that are not being used, the most prevalent skills were those that sit within the Testing Coach role: automation frameworks, leadership and training. This was a strong indicator to me that I need to delegate even more frequently to provide these opportunities.

Frustratingly, but not unusually, the requests for learning were fragmented. The lack of a consistent response makes it difficult to arrange knowledge sharing sessions that will appeal to a wide audience. But it does allow people to specialise in areas that are of interest to them rather than pursuing shallow general learning.

Overall I found the activity of stay interviews very useful. The structured set of questions helped me to have a purposeful and productive conversation with each of the permanent testers that I work with. I learned a lot from the information that was gathered, each set of responses were interesting for different reasons. The results have helped me to shape my actions over the coming months. I'm hoping to create outcomes from these conversations that will continue to keep our testing team happy.

Monday 24 October 2016

Risk-based release testing

In my organisation there's a big push to increase our release cadence. Our current rate of release varies between products, as each adopts a slightly different model of delivering change to our customers, but in every case there's opportunity to streamline our activities and be more responsive.

Recently I've been working with a specific group of testers in one of our online banking applications. They currently operate a monthly release cycle using a release process that takes about a week to complete. Most of the week is spent in manual release testing, which consistently creates frustration for the testers themselves and the people they're working alongside.

My observation from a coaching perspective was that we had fallen into release testing theatre*. Our testers all had the script for every release. They dutifully played their parts and read their lines, but it all felt a bit empty. Unfortunately the playwright hadn't been evolving the play alongside other changes in our organisation. The testers were acting out a release process that no longer made much sense.

The testers all recognised a need to change what they were doing in the release. But instead of trying to edit what we already had, I wanted to question the rationale behind it.

Risk Appetite

I facilitated a workshop that was attended by all of the testers for the product, along with two of the delivery managers who have accountability for release testing sign off as part of our governance process. 

I started the session by gauging opinion of all the attendees about our current approach to release testing. I asked two questions that I adapted from The Risk Questionnaire by Adam Knight:
  1. How do you think [product] currently stands in its typical level of rigour in release testing?
  2. How do you think [product] should stand in its typical level of rigour in release testing?
This generated some unexpected discussion on what the term rigour meant to us! 

I asked people answer the questions by choosing a place to stand in the room: one wall was low and the opposite wall was high. This gave a visual indicator of how people felt about the existing approach and which direction they felt we should be heading towards. 

Interestingly the testers and the delivery managers had quite different views, which was good to highlight and discuss early in the session.

Brainstorming Risk

Next I asked people to consider what risks we were addressing in our release testing, then write out one risk per post-it note. I emphasised that I wanted to focus on risk rather than activities. For example, instead of  'cross-browser testing' I would expect to see 'product may not work on different platforms'.

After five minutes of brainstorming, the attendees shared the risks that they had identified. As each risk was shared, other attendees identified where they held a duplicate risk. For example, when someone said 'product may not work on different platforms', we collected every post-it that said something similar and combined them into a single group.

We ended up with a list of 12 specific risks that spanned the broad categories of functionality, code merge, cross-browser compatibility, cross-platform compatibility, user experience, accessibility, security, performance, infrastructure, test data, confirmation bias and reputation.

Mitigating Risk

Between completion by a delivery team and release to our customers, the product is deployed through six different environments. The next activity was to determine whereabouts in the release process we would mitigate each of the risks that we'd collectively identified. 

I stuck a label for each of our environments across the wall of the workshop room, creating column headings, then put the risk post-it notes into a backlog at the left. We worked through the backlog, discussing one risk at a time and moving it to the environment where it was best suited, or breaking the risk in to parts that were mapped to separate environments if required.

The result was a matrix of environments and risk that looked like this:

Mapping risks to release environments

As you can see from the picture above, we realised that most of our risk was being mitigated early in our release process. As we get closer to the production environment, on the right hand side of the visualisation, there are far fewer post-it notes.

Creating this mapping initially caused some confusion, as the testers were reluctant to say a risk had been mitigated at a particular point in the release process. Eventually I realised that there was a misunderstanding in terminology. I said mitigated, they thought I meant eliminated.

To explain the difference between mitigating and eliminating risk I used an example from one of my volunteering roles as a Brownie Leader. In one of the lodges where we hold our overnight camps there is a staircase to use the bathrooms that are located on a lower level. To mitigate the risk of a girl falling on the stairs at night, we leave the stairwell light switched on. This action doesn't mean that we will never have someone fall on the stairs, but it significantly reduces the likelihood. The risk is mitigated but not eliminated.

Targeted Testing

At the conclusion of the workshop we hadn't talked specifically about test activities. However, the visual mapping of risks to environments raised a lot of questions for both the testers and the delivery managers about the validity of our existing release test process.

Having reached agreement with the delivery managers about the underlying purpose of each release environment, the testers reconvened in a later meeting to discuss how testing could mitigate the specific risks that had been identified. Again we did not reference the existing approach to release testing. Instead we collaboratively mapped out the scenes of a brand new play:

Brainstorming a new risk-based approach to release testing

Our new approach is very different to the old. It's less repetitive and quicker to execute. It's also truly a risk-based approach. The testers are excited about the possibility in what we've agreed. I'm looking forward to seeing how it works too.

I also hope that our release testing for this product continues to evolve. This time around all of the testers collaborated together as playwrights and have shared ownership of the actions they will perform. As our organisation continues to change we should continue to tweak our script to stay relevant. The alternative is a stale process that ends in empty pageantry.


* I'm not the first person to use the theatre analogy. Steve Smith wrote an article on a similar theme, titled Release testing is risk management theatre.

Sunday 9 October 2016

Caring for conference speakers

I've been fortunate to have the opportunity to speak at a number of international conferences. I've traveled to the USA, Canada, India, Estonia, England, Australia, and Denmark, as well as speaking at many events around New Zealand.

My experiences have been generally good. Yet there are many things about speaking at conferences that I feel could be improved. As a co-organiser of the upcoming WeTest conferences, I've spent some time this year reflecting on where the opportunities are to do things better.

The most obvious is paying to speak. I've had to pay my own airfares and accommodation on a few occasions, particularly as a new speaker. Where reimbursement for expenses has been offered it is usually paid after the event, which means that I still need to be financially able to cover these expenses in the short term.

But there are a host of smaller parts that form the overall experience of speaking at an event.

I may not know whether I'm supposed to have my presentation material on my own laptop, on a USB drive, or submitted somewhere in advance. What is the type of connection to the projector? Will there be a microphone? A lecturn? A stage?

I may not know how big my audience is going to be: 10, 100, or more? What type of layout will they be in: tables of 10, rows of chairs, or a staggered amphitheater? What type of people will I be speaking to: testers, test managers, or others who work in software?

I may not know what sort of environment I will face. Is it a conference where presenters simply present, or will there be a Q&A or open season afterwards? Is there a culture of debate, argument or challenge? If so, will I be supported by a facilitator?

All of these unknowns about what I've signed up for can cause anxiety. They also make it difficult for me to picture the audience and tailor my material accordingly.

Then there are the series of small challenges that happen during the experience itself. Arriving from a long haul flight in an unfamiliar country and finding my accommodation. Locating the conference venue and the room in which I'll present. Determining whether I'll be introduced by someone or will introduce myself. Deciding how to manage time keeping. And so on.

So, what are we doing differently for WeTest?

One of the main priorities for our organising committee is to care for our speakers. As many of the WeTest organisers are also regular conference speakers, we've worked hard to remove the worries that may surround accepting a speaking engagement. We know our speakers are putting a lot of work into preparing their presentations. We think that this should be their only concern.

We've arranged and paid for our speaker flights and accommodation in advance. With one exception where a speaker had specific airline requirements, none of our speakers have been asked to foot any of these costs upfront.

We've communicated with our speakers regularly. Since their selection in June we've:
  • agreed on benefits and expectations via a written speaker agreement,
  • offered them the opportunity to check their session and biographical details on the event website prior to our go-live, 
  • provided a mechanism for them to complete their complimentary registration, 
  • shared details of the venue, audio visual setup and event timing, 
  • prepared personal itineraries for travel, accommodation and any associated sponsorship commitments, and
  • sent them a copy of our attendee communication.

Over the past four months I hope that this information has removed a lot of anxiety that can be associated with presenting at an event. As an organising team we've tried to space out these messages, to offer regular opportunities for our speakers to ask questions and eliminate any unknowns.

The speaker itineraries that we've prepared run from arrival in the conference city. We have arranged and paid transport to meet all of our speakers at the airport. For international guests this means they don't have to worry about how to find their hotel or immediately locate New Zealand currency when they land.

And on the conference day itself, we have a dedicated person assigned specifically to our speakers. One of our organising committee will be walking our speakers from their accommodation to the venue, leading the speaker briefing, and be available throughout the event to deal with any questions or problems that arise.

I'm confident that our efforts to look after our speakers will result in fantastic material this year and in years to come. I want to continue to create a safe space for new presenters to step forward from the New Zealand testing community. And I want our WeTest events to be a must for international presenters on the software testing conference circuit.

On a broader note, I hope that our efforts help to change the expectations of speakers for other events. If every organiser aimed to provide a similar level of care, or speakers came to expect this, the experience of speaking at a conference could be consistently better than it is today.

Tuesday 4 October 2016

Observation in Testing

At the WeTest Wellington Quick Lunch Talk today, Donal Christie of Powershop spoke on the topic "Do you see what I see?". Donal has been fascinated by observation from an early age - his favourite childhood toys included a magnifying glass, microscope and telescope. His talk focused on what we see as testers when we examine software.

Donal shared a variety of things to be mindful of, but there were three particularly interesting stories that resonated for me: Rubin's vase, Monet's cataracts, and Walmgate Bar.


Rubin's vase

Donal shared a picture and story about the vase created for the Silver Jubilee of Queen Elizabeth:

Most people are familiar with the Ambiguous Vase illusion. Devised by the Danish psychologist Edgar Rubin, we are not sure if we are looking at a vase, or at two faces, staring at each other.

In 1977, a wonderful 3 dimensional version of this illusion was made, to commemorate the Silver Jubilee of Queen Elizabeth. It was a porcelain vase, but one with a wonderful twist. The profile on one side of the vase was of Her Majesty, but on the other side of the vase, the profile was of Prince Philip. [Ref: The Queen's Speech]



Credit: The Queen's Speech

If you were asked to test this vase, what would be important? Is it the vase itself? Or the silhouette of the vase that shows the royal profiles? Or both?

How does this relate back to software? It's important to have a conversation with your business stakeholders about what the customer wants from your product, then learn what part of your architecture delivers that. What you see may not actually be what you need to test.

One example is feature development that introduces a new screen to the user interface and requires a new web service. It may be easy to test the user interface changes at face value. But we could see an entirely different perspective by testing in the web services layer.

Think of the web service change as the silhouette of the user interface changes. Perhaps it holds a lot of the business logic that the customer desires. Make sure you're testing what you see, but also think about what's around it.

Monet's cataracts

Donal shared a picture and a story about Monet's cataracts:

From 1900 onwards Monet had problems with his vision and complained to his friends that everything he saw was a fog. Although cataract operations had been performed for thousands of years they were still a risky business at the time. He agreed to surgery to totally remove the lense in his left eye in 1923 at the age of 82 and the operation was a success. There were no replacement implant lenses at the time and he had to wear thick glasses but his vision was transformed.

However, the operation had an unexpected side effect; as mentioned before it’s claimed that he began seeing the world with UV vision. His palette which before the operation had been red, brown and earthy took on a more bluish hue. [Ref: Claude Monet and Ultraviolet Light]

Credit: Claude Monet and Ultraviolet Light

People perceive colour differently. Though Monet's example is an extreme one, there are many people with impaired vision and colour blindness. For these people, what they see is not what you see.

Donal made the point that in these cases there can be more than one truth. To one person, the house as seen from the rose garden is red. To another, it's blue. To another, it's grey. None of these people are wrong. The way that they see the house will depend on how they see.

When we test software, you might hear people say "Did you see that bug?". In some cases, perhaps they didn't! Two people observing the same piece of software will form two separate truths. What you see and perceive will be different from your colleagues and your customers.

Donal advocated for pair testing and accessibility testing, approaches that try to incorporate multiple perspectives during the development process. I hadn't boiled down the benefits of these practices to a basic need for many people to observe a system. This is an argument I will be adding to my repertoire.

Walmgate Bar

Donal shared a picture and a story about Walmgate Bar, a historic location in York that he visited with his wife. They saw a plaque that described the site:

Credit: Donal Christie

Take a moment to read the inscription.

It may not be particularly striking. You probably know a little bit more about Walmgate Bar. Did you spot the two small errors? The first is that the word siege is spelled incorrectly. The second is in a sentence that has a duplicate word: "erected in the the reign of".

Occasionally we need to consciously shift our thinking to find different types of problems in software. We need to think about the system as a whole and determine whether it is fit for purpose. In this case, the sign is successfully communicating the intended information. We also need to examine the parts that make up the system and determine whether they are behaving correctly. This is where the problems crept in with the sign above.

I've found it particularly difficult to switch between these levels of thinking as a tester in an agile team. It can be easy to focus on testing each individual story and forget about testing the whole. I have a tendency to get bogged down in detail. The Walmgate Bar sign is a good reminder to think about both perspectives.

Interestingly it looks like this particular sign has now been replaced with one that is correct:

Credit: Yorkshire Walks

Donal's talk was a great reminder about observation and interpretation. He reminded me to consider:

  1. whether the product is what I can see, 
  2. that what I see may not be what others see, and 
  3. that the problems I find will change based on where I look.


Sunday 25 September 2016

Why don't you just

I'm solution oriented. If I hear about a problem, I like to make suggestions about how it can be resolved. Sometimes before people have even stopped speaking, my brain is spinning on ideas.

As a coach and mentor, this trait can be a problem. Thinking about solutions interferes with my active listening. I can't hear someone properly when I'm planning what I'll say next. I can neglect to gather all the context to a situation before jumping in with my ideas. And when I offer my thoughts before acknowledging those of the person who I'm talking to, I lack empathy.

Earlier in my career I was taught the GROW model, which is a tool that has been used to aid coaching conversations since the 1980s. GROW is an acronym that stands for goal, reality, options, way forward. It gives a suggested structure to a conversation about goal setting or problem solving.

When I jump to solutions, I skip straight to the end of the GROW model. I'm focusing on the way forward. While I do want my coaching conversations to end in action, I can end up driving there too fast.

Pace of conversation is a difficult thing to judge. I've started to use a heuristic to help me work out when I'm leaping ahead. If I can prefix a response with "Why don't you just" then it's likely that I've jumped into solution mode alone, without the person that I'm speaking to.

Why don't you just ask Joan to restart the server?

Why don't you just look through the test results and see how many things failed?

Why don't you just buy some new pens?

"Why don't you just" is the start of a question, which indicates I'm not sure that what I'm about to say is a valid way forward. If I'm uncertain, it's because I don't have enough information. Instead of suggesting, I loop back and ask the questions that resolve my uncertainty.

"Why don't you just" indicates an easy option. It's entirely likely that the person has already identified the simplest solutions themselves. Instead of offering an answer that they know, I need to ask about the options they've already recognised and dismissed. There are often many.

"Why don't you just" can also help me identify when I'm frustrated because the conversation is stuck. Perhaps the other person is enjoying a rant about their reality or cycling through options without choosing their own way forward. Then I need to ask a question to push the conversation along, or abandon it if they're simply talking as a cathartic outlet.

This prompt helps me determine the pace of a conversation. I can recognise when I need to slow down and gather more information, or when a conversation has stalled and I need to push the other person along. Perhaps "Why don't you just" will help others who are afflicted with a need for action.

Sunday 18 September 2016

Going to a career

My father-in-law works in HR. A few years ago when I was thinking about changing jobs, he gave me a piece of advice that stuck. He said:

"People are either leaving a job or going to a job. Make sure you're going to something."

Sometimes you're changing jobs primarily to escape your current situation. You might have an unpleasant manager or colleagues, feel that you're being paid unfairly, find your work boring or the working conditions intolerable. You're searching for something else. You're leaving a job.

On the other hand, sometime's you're changing jobs in active pursuit of the next challenge. You might be looking to gain experience in a new industry, for a new role within your profession, or for a greater level of responsibility in your existing discipline. You're searching for something specific. You're going to a job.

These two states aren't mutually exclusive, obviously you might have reasons in both categories. But his advice was that the reasons you're going to a job should always outweigh the reasons that you leave your existing one.

When I reflect on my career, I have definitely changed jobs in both situations. But it has been those occasions where I've moved towards a new role, rather than escaping an old one, that have propelled my career forward. The decisions that I've made consciously in pursuit of a broader purpose, rather than as a convenient change in immediate circumstance, have always served me best.

I find myself regularly sharing this same advice with others who are considering their career. If you're thinking about what's next, make sure you're going to something. Deliberate steps forward are how we grow and challenge ourselves.

Sunday 4 September 2016

The end of the pairing experiment

I have spoken and written about the pairing experiment for sharing knowledge between agile teams that I facilitated for the testers in my organisation. After 12 months of pairing, in which we saw many benefits, I asked the testers whether they would like to continue. The result was overwhelming:

Survey Results

I had asked this same question regularly through the experiment, but this was the first time that a majority of respondents had asked to stop pairing. As a result, we no longer do structured, rostered, cross-team pairing.

Why?

The first and most obvious reason is above. If you ask people for their opinion on an activity that they're being instructed to undertake, and they overwhelmingly don't want to do it, then there's questionable value in insisting that it happens regardless. Listen to what you are being told.

But, behind the survey results is a reason that opinion has changed. This result told me that the testers believed we didn't need the experiment anymore, which meant they collectively recognised that the original reason for its existence had disappeared.

The pairing experiment was put in place to address a specific need. In mid-2015 the testers told me that they felt siloed from their peers who worked in different agile teams. The pairing experiment was primarily focused on breaking down these perceived barriers by sharing ideas and creating new connections.

After 12 months of rostered pairing the testers had formed links with multiple colleagues in different product areas. The opportunity to work alongside more people from the same products offered diminishing returns. Each tester already had the visibility of, and connection to, other teams.

Additionally, our pairing experiment wasn't happening in isolation. Alongside, the testers within particular product areas started to interact more frequently in regular team meetings and online chat channels. We also started meeting as an entire testing competency once a week for afternoon tea.

The increased collaboration between testers has shifted our testing culture. The testers no longer feel that they are disconnected from their colleagues. Instead there's a strong network of people who they can call on for ideas, advice and assistance.

The pairing experiment achieved its objective. I'm proud of this positive outcome. I'm also proud that we're all ready to let the experiment go. I think it's important to be willing to change our approach - not just by introducing new ideas, but also by retiring those that have fulfilled their purpose.

Now that we've stopped pairing, there's time available for the next experiment. I'm still thinking about what that might be, so that our testing continues to evolve.

Thursday 18 August 2016

Post-merge test automation failures

Recently we implemented selenium grid for one of our automated suites. I've written about our reasons for this change, but in short we wanted to improve the speed and stability of our automation. Happily we've seen both those benefits.

We've also seen a noticeable jump in the number of pull requests that are successfully merged back to our master branch each day. This gives some weight to the idea that our rate of application code change was previously impeded by our test infrastructure.

The increase in volume occasionally causes a problem when two feature branches are merged back to master in quick succession. Our tests fail on the second build of the master branch post-merge.

To illustrate, imagine that there are two open pull requests for two feature branches: orange and purple. We can trigger multiple pull request (PR) builds in parallel, so the two delivery teams who are behind these feature branches can receive feedback about their code simultaneously.

When a PR build passes successfully and the code has been through peer review, it can be merged back to the master branch. Each time the master branch changes it triggers the same test suite that executes for a pull request.

We do not trigger multiple builds against master in parallel. If two pull requests are merged in quick succession the first will build immediately and the second will trigger a build that waits for the first to complete before executing. Sometimes the second build will fail.

1. Failing tests after multiple PR merges to master

As the person who had driven sweeping test infrastructure changes, when this happened the first time I assumed that the test automation was somehow faulty. The real issue was that the code changes in orange and purple, while not in conflict with each other at a source code level, caused unexpected problems when put together. The failing tests reflected this.

We hadn't seen this problem previously because our pull requests were rarely merged in such quick succession. They were widely spaced, which meant that when the developer pulled from master to their branch at the beginning of the merge process these type of failures were discovered and resolved.

I raised this as a topic of conversation during Lean Coffee at CAST2016 to find out how other teams move quickly with continuous integration. Those present offered up some possible options to resolve the problem as I described it.

Trunk based development

Google and Facebook move a lot faster than my organisation. Someone suggested that I research these companies to learn about their branching and merging strategy.

I duly found Google's vs Facebook's Trunk Based Development by Paul Hammant and was slightly surprised to see a relevant visualisation at the very top of the article:


2. Google's vs Facebook's Trunk Based Development by Paul Hammant

It seems that, to move very quickly with a large number of people contributing to a code base, trunk-based development is preferred. As the previous diagram illustrates, we currently use a mainline approach with feature branches. This creates larger opportunities for conflicts due to merging.

I had assumed that all possible solutions to these tests failing on master would be a testing-focused. However, a switch to trunk-based development would be a significant change to our practices for every person writing code. I think this solution is too big for the problem.

Sequential build

Someone else suggested that perhaps we were just going faster than we should be. If we weren't running any build requests in parallel and instead triggered everything sequentially, would there still be a problem?

I don't think that switching to sequential builds would fix our issue as the step to trigger the merge is a manual one. A pull request might have successfully passed tests but be waiting on peer review from other developers. In the event that no changes are required by reviewers, the pull request could be merged to master at a time that still creates conflict:

3. Sequential PR build with rapid merge timing

The pull request build being sequential would slow our feedback loop to the delivery teams with no certain benefit.

Staged Build

Another suggestion was to look at introducing an interim step to our branching strategy. Instead of feature branches to master, we'd have a staging zone that might work something like this:

4. Introducing a staging area

The staging branch would use sequential builds. If a test passes there, then it can go to master. If a test fails there, then it doesn't go to master. The theory is that master is always passing.

Where this solution gets a little vague is how the staging branch might automatically rollback a merge. I'm not sure whether it's possible to automatically back changes off a branch based on a test result from continuous integration. If this were possible, why wouldn't we just do this with master instead of introducing an interim step?

I'm relatively sure that the person who suggested this hadn't seen such an approach work in practice.

Do Nothing

After querying the cost of the problem that we're experiencing, the last suggestion that I received was to do nothing. This is the easiest suggestion to implement but one that I find challenging. It feels like I'm leaving a problem unresolved.

However, I know that the build can't always pass successfully. Test automation that is meaningful should fail sometimes and provide information about potential problems in the software. I'm coming to terms with the idea that perhaps the failures we see post-merge are valuable, even though they have become more prevalent since we picked up our pace.

While frustrating, the failures are revealing dependencies between teams that might have been hidden. They also encourage collaboration as people from across the product work together on rapid solutions once the master branch is broken.

While I still feel like there must be a better way, for now it's likely that we will do nothing.



Other posts from CAST2016:

Friday 12 August 2016

Human centered test automation

The opening keynote at CAST2016 was Nicholas Carr. Though his talk was live streamed, unfortunately a recording is not going to be published. If you missed it, much of the content is available in the material of a talk he delivered last June titled "Media takes command".

Nicholas spoke about typology of automation, the substitution myth, automation complacency, automation bias and the automation paradox. His material focused on the application of technology in traditionally non-technical industries e.g. farming, architecture, personal training.

As he spoke, I started to wonder about the use of automation within software development itself. Specifically, as a tester, I thought about the execution of test automation to determine whether a product is ready to release.

Automation providing answers

Nicholas shared an academic study of a group of young students who were learning about words that are opposite in meaning e.g. hot and cold. The group of students were divided in two. Half of the students received flashcards to study that stated a word with the first letter of it's opposite e.g. hot and c. The other half of the students received flashcards that stated both words in their entirety e.g. hot and cold.

The students who were in the first group performed better in their exam than those in the second group. Academics concluded that this was because when we need to generate an answer rather than simply study an answer, then we are more likely to learn it. This phenomenon is labelled the generation effect.

On the flip side, the degeneration effect is where the answers are simply provided, as in many automated solutions. Nicholas stated that this approach is "a great way to stop humans from developing rich talents".

It's interesting to consider which of these effects are most prevalent in processing the results provided by our continuous integration builds. I believe that the intent of build results is to provide an answer: the build will pass or fail. However, I think the reality of the result is that it can rarely be taken at face value.

I like to confirm that a successful build has truly succeeded by checking the execution time and number of tests that were run. When a build fails, there is a lot of investigative work to determine the real root cause. I dig through test results, log files and screenshots.

I have previously thought that this work was annoying, but in the context of the degeneration effect perhaps the need to manually determine an answer is how we continue to learn about our system. If continuous integration were truly hands-off, what would we lose?

Developing human expertise

Nicholas also introduced the idea of human centered automation. This is a set of five principles by which we see the benefits of automation but continue to develop rich human expertise.

  1. Automate after mastery
  2. Transfer control between computer and operator
  3. Allow a professional to assess the situation before providing algorithmic assistance
  4. Don't hide feedback
  5. Allow friction for learning

This list is interesting to consider in the context of test automation. The purpose of having test automation is to get fast feedback, so I think it meets the fourth point listed above. But against every other criteria, I start to question our approach.

When I think about the test automation for our products, there is one suite in particular that has been developed over the past five years. This suite has existed longer than most of our testers have been working within my organisation. There is logic coded into the suite for which we no longer have a depth of human expertise. So we do not automate after mastery.

The suite executes without human interaction, there is no transfer of control between computer and operator. Having no tester involvement is a goal for us. Part of providing rapid feedback is reliably provide results within a set period of time regardless of whether there is a tester available.

The suite provides a result. There is human work to assess the result, as I describe above, but the suite has already provided algorithmic assistance that will bias our investigation. It has decided whether the build has passed or a failed.

Finally, the suite is relatively reliable. When the tests usually pass, there is no friction for learning. When the tests are failing and flaky, that is when testers have the opportunity to deeply understand a particular area of the application and associated test code. This happens, but ideally not very much.

So, what are the long term impacts of test automation on our testing skills? Are we forfeiting opportunities to develop rich human expertise in testing by prioritising fast, reliable, automated test execution?

I plan to think more about this topic and perhaps experiment with ways to make our automation more human centered. I'd be curious to hear if other organisations are already exploring in this area.



Other posts from CAST2016: