Monday, 24 October 2016

Risk-based release testing

In my organisation there's a big push to increase our release cadence. Our current rate of release varies between products, as each adopts a slightly different model of delivering change to our customers, but in every case there's opportunity to streamline our activities and be more responsive.

Recently I've been working with a specific group of testers in one of our online banking applications. They currently operate a monthly release cycle using a release process that takes about a week to complete. Most of the week is spent in manual release testing, which consistently creates frustration for the testers themselves and the people they're working alongside.

My observation from a coaching perspective was that we had fallen into release testing theatre*. Our testers all had the script for every release. They dutifully played their parts and read their lines, but it all felt a bit empty. Unfortunately the playwright hadn't been evolving the play alongside other changes in our organisation. The testers were acting out a release process that no longer made much sense.

The testers all recognised a need to change what they were doing in the release. But instead of trying to edit what we already had, I wanted to question the rationale behind it.

Risk Appetite

I facilitated a workshop that was attended by all of the testers for the product, along with two of the delivery managers who have accountability for release testing sign off as part of our governance process. 

I started the session by gauging the opinion of all the attendees about our current approach to release testing. I asked two questions that I adapted from The Risk Questionnaire by Adam Knight:
  1. How do you think [product] currently stands in its typical level of rigour in release testing?
  2. How do you think [product] should stand in its typical level of rigour in release testing?
This generated some unexpected discussion on what the term rigour meant to us! 

I asked people to answer the questions by choosing a place to stand in the room: one wall was low and the opposite wall was high. This gave a visual indicator of how people felt about the existing approach and which direction they felt we should be heading towards. 

Interestingly the testers and the delivery managers had quite different views, which was good to highlight and discuss early in the session.

Brainstorming Risk

Next I asked people to consider what risks we were addressing in our release testing, then write out one risk per post-it note. I emphasised that I wanted to focus on risk rather than activities. For example, instead of 'cross-browser testing' I would expect to see 'product may not work on different platforms'.

After five minutes of brainstorming, the attendees shared the risks that they had identified. As each risk was shared, other attendees identified where they held a duplicate risk. For example, when someone said 'product may not work on different platforms', we collected every post-it that said something similar and combined them into a single group.

We ended up with a list of 12 specific risks that spanned the broad categories of functionality, code merge, cross-browser compatibility, cross-platform compatibility, user experience, accessibility, security, performance, infrastructure, test data, confirmation bias and reputation.

Mitigating Risk

Between completion by a delivery team and release to our customers, the product is deployed through six different environments. The next activity was to determine whereabouts in the release process we would mitigate each of the risks that we'd collectively identified. 

I stuck a label for each of our environments across the wall of the workshop room, creating column headings, then put the risk post-it notes into a backlog at the left. We worked through the backlog, discussing one risk at a time and moving it to the environment where it was best suited, or breaking the risk into parts that were mapped to separate environments if required.

The result was a matrix of environments and risk that looked like this:

Mapping risks to release environments

As you can see from the picture above, we realised that most of our risk was being mitigated early in our release process. As we get closer to the production environment, on the right hand side of the visualisation, there are far fewer post-it notes.
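For readers who can't see the wall, the same mapping can be sketched as a small data structure. This is only an illustration: the twelve risk categories come from the workshop, but the environment names and the placement of each risk are invented here, since the post does not list them. A risk that was broken into parts appears under more than one environment.

```python
from collections import Counter

# Hypothetical environment names, ordered from earliest to production.
# The product has six environments, but their real names are not given.
ENVIRONMENTS = ["env-1", "env-2", "env-3", "env-4", "env-5", "production"]

# Risk categories are from the workshop. The placements are invented
# purely to illustrate the shape of the matrix.
RISK_PLACEMENT = {
    "functionality": ["env-1"],
    "code merge": ["env-1", "env-2"],  # a risk split across two environments
    "cross-browser compatibility": ["env-2"],
    "cross-platform compatibility": ["env-2"],
    "user experience": ["env-2"],
    "accessibility": ["env-2"],
    "security": ["env-3"],
    "performance": ["env-3"],
    "infrastructure": ["env-4", "production"],
    "test data": ["env-1"],
    "confirmation bias": ["env-3"],
    "reputation": ["env-5"],
}


def risks_per_environment(placement, environments):
    """Count how many post-it notes sit under each environment column."""
    counts = Counter(env for envs in placement.values() for env in envs)
    return [(env, counts.get(env, 0)) for env in environments]


# Print one row per column, with a bar of '#' standing in for the post-its.
for env, count in risks_per_environment(RISK_PLACEMENT, ENVIRONMENTS):
    print(f"{env:<12}{'#' * count}")
```

Counting the notes per column is a quick way to see where mitigation is concentrated and which environments have little stated purpose.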

Creating this mapping initially caused some confusion, as the testers were reluctant to say a risk had been mitigated at a particular point in the release process. Eventually I realised that there was a misunderstanding in terminology. I said mitigated, they thought I meant eliminated.

To explain the difference between mitigating and eliminating risk I used an example from one of my volunteering roles as a Brownie Leader. In one of the lodges where we hold our overnight camps there is a staircase to use the bathrooms that are located on a lower level. To mitigate the risk of a girl falling on the stairs at night, we leave the stairwell light switched on. This action doesn't mean that we will never have someone fall on the stairs, but it significantly reduces the likelihood. The risk is mitigated but not eliminated.

Targeted Testing

At the conclusion of the workshop we hadn't talked specifically about test activities. However, the visual mapping of risks to environments raised a lot of questions for both the testers and the delivery managers about the validity of our existing release test process.

Having reached agreement with the delivery managers about the underlying purpose of each release environment, the testers reconvened in a later meeting to discuss how testing could mitigate the specific risks that had been identified. Again we did not reference the existing approach to release testing. Instead we collaboratively mapped out the scenes of a brand new play:

Brainstorming a new risk-based approach to release testing

Our new approach is very different to the old. It's less repetitive and quicker to execute. It's also truly a risk-based approach. The testers are excited about the possibilities in what we've agreed. I'm looking forward to seeing how it works too.

I also hope that our release testing for this product continues to evolve. This time around all of the testers collaborated as playwrights and have shared ownership of the actions they will perform. As our organisation continues to change we should continue to tweak our script to stay relevant. The alternative is a stale process that ends in empty pageantry.

* I'm not the first person to use the theatre analogy. Steve Smith wrote an article on a similar theme, titled Release testing is risk management theatre.

Sunday, 9 October 2016

Caring for conference speakers

I've been fortunate to have the opportunity to speak at a number of international conferences. I've travelled to the USA, Canada, India, Estonia, England, Australia, and Denmark, as well as speaking at many events around New Zealand.

My experiences have been generally good. Yet there are many things about speaking at conferences that I feel could be improved. As a co-organiser of the upcoming WeTest conferences, I've spent some time this year reflecting on where the opportunities are to do things better.

The most obvious is paying to speak. I've had to pay my own airfares and accommodation on a few occasions, particularly as a new speaker. Where reimbursement for expenses has been offered it is usually paid after the event, which means that I still need to be financially able to cover these expenses in the short term.

But there are a host of smaller parts that form the overall experience of speaking at an event.

I may not know whether I'm supposed to have my presentation material on my own laptop, on a USB drive, or submitted somewhere in advance. What is the type of connection to the projector? Will there be a microphone? A lectern? A stage?

I may not know how big my audience is going to be: 10, 100, or more? What type of layout will they be in: tables of 10, rows of chairs, or a staggered amphitheater? What type of people will I be speaking to: testers, test managers, or others who work in software?

I may not know what sort of environment I will face. Is it a conference where presenters simply present, or will there be a Q&A or open season afterwards? Is there a culture of debate, argument or challenge? If so, will I be supported by a facilitator?

All of these unknowns about what I've signed up for can cause anxiety. They also make it difficult for me to picture the audience and tailor my material accordingly.

Then there are the series of small challenges that happen during the experience itself. Arriving from a long haul flight in an unfamiliar country and finding my accommodation. Locating the conference venue and the room in which I'll present. Determining whether I'll be introduced by someone or will introduce myself. Deciding how to manage time keeping. And so on.

So, what are we doing differently for WeTest?

One of the main priorities for our organising committee is to care for our speakers. As many of the WeTest organisers are also regular conference speakers, we've worked hard to remove the worries that may surround accepting a speaking engagement. We know our speakers are putting a lot of work into preparing their presentations. We think that this should be their only concern.

We've arranged and paid for our speaker flights and accommodation in advance. With one exception where a speaker had specific airline requirements, none of our speakers have been asked to foot any of these costs upfront.

We've communicated with our speakers regularly. Since their selection in June we've:
  • agreed on benefits and expectations via a written speaker agreement,
  • offered them the opportunity to check their session and biographical details on the event website prior to our go-live, 
  • provided a mechanism for them to complete their complimentary registration, 
  • shared details of the venue, audio visual setup and event timing, 
  • prepared personal itineraries for travel, accommodation and any associated sponsorship commitments, and
  • sent them a copy of our attendee communication.

Over the past four months I hope that this information has removed a lot of anxiety that can be associated with presenting at an event. As an organising team we've tried to space out these messages, to offer regular opportunities for our speakers to ask questions and eliminate any unknowns.

The speaker itineraries that we've prepared run from arrival in the conference city. We have arranged and paid for transport to meet all of our speakers at the airport. For international guests this means they don't have to worry about how to find their hotel or immediately locate New Zealand currency when they land.

And on the conference day itself, we have a dedicated person assigned specifically to our speakers. One of our organising committee will walk our speakers from their accommodation to the venue, lead the speaker briefing, and be available throughout the event to deal with any questions or problems that arise.

I'm confident that our efforts to look after our speakers will result in fantastic material this year and in years to come. I want to continue to create a safe space for new presenters to step forward from the New Zealand testing community. And I want our WeTest events to be a must for international presenters on the software testing conference circuit.

On a broader note, I hope that our efforts help to change the expectations of speakers for other events. If every organiser aimed to provide a similar level of care, or speakers came to expect this, the experience of speaking at a conference could be consistently better than it is today.

Tuesday, 4 October 2016

Observation in Testing

At the WeTest Wellington Quick Lunch Talk today, Donal Christie of Powershop spoke on the topic "Do you see what I see?". Donal has been fascinated by observation from an early age - his favourite childhood toys included a magnifying glass, microscope and telescope. His talk focused on what we see as testers when we examine software.

Donal shared a variety of things to be mindful of, but there were three particularly interesting stories that resonated for me: Rubin's vase, Monet's cataracts, and Walmgate Bar.

Rubin's vase

Donal shared a picture and story about the vase created for the Silver Jubilee of Queen Elizabeth:

Most people are familiar with the Ambiguous Vase illusion. Devised by the Danish psychologist Edgar Rubin, we are not sure if we are looking at a vase, or at two faces, staring at each other.

In 1977, a wonderful 3 dimensional version of this illusion was made, to commemorate the Silver Jubilee of Queen Elizabeth. It was a porcelain vase, but one with a wonderful twist. The profile on one side of the vase was of Her Majesty, but on the other side of the vase, the profile was of Prince Philip. [Ref: The Queen's Speech]

Credit: The Queen's Speech

If you were asked to test this vase, what would be important? Is it the vase itself? Or the silhouette of the vase that shows the royal profiles? Or both?

How does this relate back to software? It's important to have a conversation with your business stakeholders about what the customer wants from your product, then learn what part of your architecture delivers that. What you see may not actually be what you need to test.

One example is feature development that introduces a new screen to the user interface and requires a new web service. It may be easy to test the user interface changes at face value. But we could see an entirely different perspective by testing in the web services layer.

Think of the web service change as the silhouette of the user interface changes. Perhaps it holds a lot of the business logic that the customer desires. Make sure you're testing what you see, but also think about what's around it.

Monet's cataracts

Donal shared a picture and a story about Monet's cataracts:

From 1900 onwards Monet had problems with his vision and complained to his friends that everything he saw was a fog. Although cataract operations had been performed for thousands of years they were still a risky business at the time. He agreed to surgery to totally remove the lense in his left eye in 1923 at the age of 82 and the operation was a success. There were no replacement implant lenses at the time and he had to wear thick glasses but his vision was transformed.

However, the operation had an unexpected side effect; as mentioned before it’s claimed that he began seeing the world with UV vision. His palette which before the operation had been red, brown and earthy took on a more bluish hue. [Ref: Claude Monet and Ultraviolet Light]

Credit: Claude Monet and Ultraviolet Light

People perceive colour differently. Though Monet's example is an extreme one, there are many people with impaired vision and colour blindness. For these people, what they see is not what you see.

Donal made the point that in these cases there can be more than one truth. To one person, the house as seen from the rose garden is red. To another, it's blue. To another, it's grey. None of these people are wrong. The way that they see the house will depend on how they see.

When we test software, you might hear people say "Did you see that bug?". In some cases, perhaps they didn't! Two people observing the same piece of software will form two separate truths. What you see and perceive will be different from your colleagues and your customers.

Donal advocated for pair testing and accessibility testing, approaches that try to incorporate multiple perspectives during the development process. I hadn't boiled down the benefits of these practices to a basic need for many people to observe a system. This is an argument I will be adding to my repertoire.

Walmgate Bar

Donal shared a picture and a story about Walmgate Bar, a historic location in York that he visited with his wife. They saw a plaque that described the site:

Credit: Donal Christie

Take a moment to read the inscription.

It may not be particularly striking. You probably know a little bit more about Walmgate Bar. Did you spot the two small errors? The first is that the word siege is spelled incorrectly. The second is in a sentence that has a duplicate word: "erected in the the reign of".

Occasionally we need to consciously shift our thinking to find different types of problems in software. We need to think about the system as a whole and determine whether it is fit for purpose. In this case, the sign is successfully communicating the intended information. We also need to examine the parts that make up the system and determine whether they are behaving correctly. This is where the problems crept in with the sign above.

I've found it particularly difficult to switch between these levels of thinking as a tester in an agile team. It can be easy to focus on testing each individual story and forget about testing the whole. I have a tendency to get bogged down in detail. The Walmgate Bar sign is a good reminder to think about both perspectives.

Interestingly it looks like this particular sign has now been replaced with one that is correct:

Credit: Yorkshire Walks

Donal's talk was a great reminder about observation and interpretation. He reminded me to consider:

  1. whether the product is what I can see, 
  2. that what I see may not be what others see, and 
  3. that the problems I find will change based on where I look.

Sunday, 25 September 2016

Why don't you just

I'm solution oriented. If I hear about a problem, I like to make suggestions about how it can be resolved. Sometimes before people have even stopped speaking, my brain is spinning on ideas.

As a coach and mentor, this trait can be a problem. Thinking about solutions interferes with my active listening. I can't hear someone properly when I'm planning what I'll say next. I can neglect to gather all the context to a situation before jumping in with my ideas. And when I offer my thoughts before acknowledging those of the person who I'm talking to, I lack empathy.

Earlier in my career I was taught the GROW model, which is a tool that has been used to aid coaching conversations since the 1980s. GROW is an acronym that stands for goal, reality, options, way forward. It gives a suggested structure to a conversation about goal setting or problem solving.

When I jump to solutions, I skip straight to the end of the GROW model. I'm focusing on the way forward. While I do want my coaching conversations to end in action, I can end up driving there too fast.

Pace of conversation is a difficult thing to judge. I've started to use a heuristic to help me work out when I'm leaping ahead. If I can prefix a response with "Why don't you just" then it's likely that I've jumped into solution mode alone, without the person that I'm speaking to.

Why don't you just ask Joan to restart the server?

Why don't you just look through the test results and see how many things failed?

Why don't you just buy some new pens?

"Why don't you just" is the start of a question, which indicates I'm not sure that what I'm about to say is a valid way forward. If I'm uncertain, it's because I don't have enough information. Instead of suggesting, I loop back and ask the questions that resolve my uncertainty.

"Why don't you just" indicates an easy option. It's entirely likely that the person has already identified the simplest solutions themselves. Instead of offering an answer that they know, I need to ask about the options they've already recognised and dismissed. There are often many.

"Why don't you just" can also help me identify when I'm frustrated because the conversation is stuck. Perhaps the other person is enjoying a rant about their reality or cycling through options without choosing their own way forward. Then I need to ask a question to push the conversation along, or abandon it if they're simply talking as a cathartic outlet.

This prompt helps me determine the pace of a conversation. I can recognise when I need to slow down and gather more information, or when a conversation has stalled and I need to push the other person along. Perhaps "Why don't you just" will help others who are afflicted with a need for action.

Sunday, 18 September 2016

Going to a career

My father-in-law works in HR. A few years ago when I was thinking about changing jobs, he gave me a piece of advice that stuck. He said:

"People are either leaving a job or going to a job. Make sure you're going to something."

Sometimes you're changing jobs primarily to escape your current situation. You might have an unpleasant manager or colleagues, feel that you're being paid unfairly, find your work boring or the working conditions intolerable. You're searching for something else. You're leaving a job.

On the other hand, sometimes you're changing jobs in active pursuit of the next challenge. You might be looking to gain experience in a new industry, for a new role within your profession, or for a greater level of responsibility in your existing discipline. You're searching for something specific. You're going to a job.

These two states aren't mutually exclusive; obviously you might have reasons in both categories. But his advice was that the reasons you're going to a job should always outweigh the reasons that you're leaving your existing one.

When I reflect on my career, I have definitely changed jobs in both situations. But it has been those occasions where I've moved towards a new role, rather than escaping an old one, that have propelled my career forward. The decisions that I've made consciously in pursuit of a broader purpose, rather than as a convenient change in immediate circumstance, have always served me best.

I find myself regularly sharing this same advice with others who are considering their career. If you're thinking about what's next, make sure you're going to something. Deliberate steps forward are how we grow and challenge ourselves.

Sunday, 4 September 2016

The end of the pairing experiment

I have spoken and written about the pairing experiment for sharing knowledge between agile teams that I facilitated for the testers in my organisation. After 12 months of pairing, in which we saw many benefits, I asked the testers whether they would like to continue. The result was overwhelming:

Survey Results

I had asked this same question regularly through the experiment, but this was the first time that a majority of respondents had asked to stop pairing. As a result, we no longer do structured, rostered, cross-team pairing.


The first and most obvious reason for stopping is the survey result above. If you ask people for their opinion on an activity that they're being instructed to undertake, and they overwhelmingly don't want to do it, then there's questionable value in insisting that it happens regardless. Listen to what you are being told.

But, behind the survey results is a reason that opinion has changed. This result told me that the testers believed we didn't need the experiment anymore, which meant they collectively recognised that the original reason for its existence had disappeared.

The pairing experiment was put in place to address a specific need. In mid-2015 the testers told me that they felt siloed from their peers who worked in different agile teams. The pairing experiment was primarily focused on breaking down these perceived barriers by sharing ideas and creating new connections.

After 12 months of rostered pairing the testers had formed links with multiple colleagues in different product areas. The opportunity to work alongside more people from the same products offered diminishing returns. Each tester already had the visibility of, and connection to, other teams.

Additionally, our pairing experiment wasn't happening in isolation. Alongside it, the testers within particular product areas started to interact more frequently in regular team meetings and online chat channels. We also started meeting as an entire testing competency once a week for afternoon tea.

The increased collaboration between testers has shifted our testing culture. The testers no longer feel that they are disconnected from their colleagues. Instead there's a strong network of people who they can call on for ideas, advice and assistance.

The pairing experiment achieved its objective. I'm proud of this positive outcome. I'm also proud that we're all ready to let the experiment go. I think it's important to be willing to change our approach - not just by introducing new ideas, but also by retiring those that have fulfilled their purpose.

Now that we've stopped pairing, there's time available for the next experiment. I'm still thinking about what that might be, so that our testing continues to evolve.

Thursday, 18 August 2016

Post-merge test automation failures

Recently we implemented Selenium Grid for one of our automated suites. I've written about our reasons for this change, but in short we wanted to improve the speed and stability of our automation. Happily we've seen both those benefits.

We've also seen a noticeable jump in the number of pull requests that are successfully merged back to our master branch each day. This gives some weight to the idea that our rate of application code change was previously impeded by our test infrastructure.

The increase in volume occasionally causes a problem when two feature branches are merged back to master in quick succession. Our tests fail on the second build of the master branch post-merge.

To illustrate, imagine that there are two open pull requests for two feature branches: orange and purple. We can trigger multiple pull request (PR) builds in parallel, so the two delivery teams who are behind these feature branches can receive feedback about their code simultaneously.

When a PR build passes successfully and the code has been through peer review, it can be merged back to the master branch. Each time the master branch changes it triggers the same test suite that executes for a pull request.

We do not trigger multiple builds against master in parallel. If two pull requests are merged in quick succession the first will build immediately and the second will trigger a build that waits for the first to complete before executing. Sometimes the second build will fail.

1. Failing tests after multiple PR merges to master

As the person who had driven sweeping test infrastructure changes, when this happened the first time I assumed that the test automation was somehow faulty. The real issue was that the code changes in orange and purple, while not in conflict with each other at a source code level, caused unexpected problems when put together. The failing tests reflected this.

We hadn't seen this problem previously because our pull requests were rarely merged in such quick succession. They were widely spaced, which meant that when the developer pulled from master to their branch at the beginning of the merge process these types of failures were discovered and resolved.

I raised this as a topic of conversation during Lean Coffee at CAST2016 to find out how other teams move quickly with continuous integration. Those present offered up some possible options to resolve the problem as I described it.

Trunk based development

Google and Facebook move a lot faster than my organisation. Someone suggested that I research these companies to learn about their branching and merging strategy.

I duly found Google's vs Facebook's Trunk Based Development by Paul Hammant and was slightly surprised to see a relevant visualisation at the very top of the article:

2. Google's vs Facebook's Trunk Based Development by Paul Hammant

It seems that, to move very quickly with a large number of people contributing to a code base, trunk-based development is preferred. As the previous diagram illustrates, we currently use a mainline approach with feature branches. This creates larger opportunities for conflicts due to merging.

I had assumed that all possible solutions to these tests failing on master would be testing-focused. However, a switch to trunk-based development would be a significant change to our practices for every person writing code. I think this solution is too big for the problem.

Sequential build

Someone else suggested that perhaps we were just going faster than we should be. If we weren't running any build requests in parallel and instead triggered everything sequentially, would there still be a problem?

I don't think that switching to sequential builds would fix our issue as the step to trigger the merge is a manual one. A pull request might have successfully passed tests but be waiting on peer review from other developers. In the event that no changes are required by reviewers, the pull request could be merged to master at a time that still creates conflict:

3. Sequential PR build with rapid merge timing

The pull request build being sequential would slow our feedback loop to the delivery teams with no certain benefit.

Staged Build

Another suggestion was to look at introducing an interim step to our branching strategy. Instead of feature branches to master, we'd have a staging zone that might work something like this:

4. Introducing a staging area

The staging branch would use sequential builds. If a test passes there, then it can go to master. If a test fails there, then it doesn't go to master. The theory is that master is always passing.
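The gate described above might behave something like the following sketch. It is one possible reading of the suggestion, not a description of any real CI tool. It assumes the staging build tests a trial copy of each merge and only advances master on a pass; whether a real setup can do this is not established here. The function name, the change names, and the rule that makes the suite fail are all hypothetical.

```python
from collections import deque


def run_staging_queue(master, queued_changes, tests_pass):
    """Process merges one at a time, promoting only those that pass.

    master: the current known-good state
    queued_changes: (name, apply_change) pairs in the order they were merged
    tests_pass: function that takes a candidate state and returns a bool
    Returns the final master state and the names of rejected changes.
    """
    rejected = []
    queue = deque(queued_changes)
    while queue:
        name, apply_change = queue.popleft()
        candidate = apply_change(master)  # trial merge on the staging branch
        if tests_pass(candidate):
            master = candidate            # promote: master moves forward
        else:
            rejected.append(name)         # master is left untouched
    return master, rejected


# Hypothetical changes, each modelled as adding a label to a set.
changes = [
    ("orange", lambda state: state | {"orange"}),
    ("purple", lambda state: state | {"purple"}),
    ("green", lambda state: state | {"green"}),
]


def suite_passes(state):
    # Hypothetical rule: orange and purple only break the suite together.
    return not {"orange", "purple"} <= state


final_master, rejected = run_staging_queue(frozenset(), changes, suite_passes)
print(sorted(final_master), rejected)
```

In this model master only ever moves to a state that has already passed, so the cost shifts to the rejected change, which has to be reworked and queued again.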

Where this solution gets a little vague is how the staging branch might automatically roll back a merge. I'm not sure whether it's possible to automatically back changes off a branch based on a test result from continuous integration. If this were possible, why wouldn't we just do this with master instead of introducing an interim step?

I'm relatively sure that the person who suggested this hadn't seen such an approach work in practice.

Do Nothing

After querying the cost of the problem that we're experiencing, the last suggestion that I received was to do nothing. This is the easiest suggestion to implement but one that I find challenging. It feels like I'm leaving a problem unresolved.

However, I know that the build can't always pass successfully. Test automation that is meaningful should fail sometimes and provide information about potential problems in the software. I'm coming to terms with the idea that perhaps the failures we see post-merge are valuable, even though they have become more prevalent since we picked up our pace.

While frustrating, the failures are revealing dependencies between teams that might have been hidden. They also encourage collaboration as people from across the product work together on rapid solutions once the master branch is broken.

While I still feel like there must be a better way, for now it's likely that we will do nothing.

Other posts from CAST2016: