The incident that made me understand risk management

In 2018, in the depths of an underground mine, I piloted a drone through the dimly lit tunnels. Suddenly, it crashed into a wall, and the propellers shattered, sending shards flying in all directions. One of the largest shards, a jagged piece of carbon fibre and plastic, later hung on my wall as a stark reminder of mitigation failures. For me, as the Chief Remote Pilot of FlyFreely, it served as a constant warning: things can go wrong, and if circumstances had been slightly different, I could have had a large chunk of propeller embedded in my leg, far from the surface and even further from medical help.

The project carried numerous risks, making it one of the riskiest I had ever undertaken. To start with, we were operating in an active underground mine, part of an industry with a history of incidents and, as a result, extensive controls for hazards and risks. Adding a large drone and cutting-edge autonomous technology to the mix only raised the stakes. We needed to exercise extreme caution and meticulous risk management.

Despite the challenges, most of the project went smoothly. We identified the majority of hazards before setting foot in the mine. We had PPE (personal protective equipment), processes, patterns, operational protocols, rehearsals before each flight, flight briefings, walkthroughs, and physical barriers in place.

But then came the wildcard: the development of the automation and its interaction with the environment.

On the day it all went wrong, the team had been discussing how to enable a new mode that would fly through virtual waypoints in the GNSS-denied environment. We planned to remove the wall object collision routine, which had been tested by manually flying into walls, chains, wires, and vehicles.

With the pre-flight safety brief and flight plan completed, non-flight personnel retreated behind barriers. I, along with the person responsible for monitoring the data, took our positions.

I started the drone, and dust billowed from underneath as it hovered. After a quick control check, our operational protocol took over. I engaged the toggle to put it into the mode that allowed the automation to take over. The drone rose, rotated, and slowly moved towards the first waypoint just ahead of the take-off point.

Then it turned and accelerated towards the next waypoint. Realising it was aiming for the wall instead of the open drive, I yelled, “WALL!” while flicking out of autonomous mode and making control inputs to try to avoid the collision.

It was too late.

The drone collided with the wall, and the front propellers shattered on impact with the hard rock, shotcrete, and steel rebar. The rest of the drone continued into the wall, and it was all over in seconds. Too fast for anyone to react, and then, silence.

Out of the corner of my eye, I saw something streak across my field of vision. I looked down to my left and saw a large chunk of the propeller (the one in the image above).

This could have been so much worse.

As the dust settled, the rest of the team emerged from behind the barriers. We all moved slowly towards the crash site, aware that the drone was still active and would remain so until we could remove the power supply.

The learnings

Until then, the potential consequences of a failure had all been academic: words written down on a piece of paper, with only small incidents observed in practice. This was the first time that I, or someone standing next to me, had been at real risk of serious injury, and the environment we were in magnified the potential for harm. Some of the specific learnings were:

Proactive Risk Identification:
- I now place greater emphasis on proactively identifying and mitigating risks before they become critical issues. This involves comprehensive risk assessments and scenario planning.
Enhanced Safety Protocols:
- My approach to safety protocols has become more rigorous. I ensure that all team members are well-versed in safety procedures and that these protocols are strictly adhered to during operations.
Holistic Risk Management:
- I now adopt a comprehensive approach to risk management, considering as many scenarios as possible, along with their impacts. This involves integrating risk management into every aspect of operations.
Emphasis on Redundancy:
- I prioritize building redundant systems to ensure reliability and resilience. This approach minimizes the impact of any single point of failure. I also consider how many things depend on each single point of failure.
Balancing Automation with Human Insight:
- While leveraging automation for efficiency, I ensure that human expertise is always available to intervene when necessary. This balance is crucial for effective risk management.
Adaptability and Learning:
- I emphasize the importance of being adaptable and learning from past incidents. This involves regularly reviewing and updating risk management strategies to stay ahead of potential challenges.
Effective Communication:
- Clear and immediate communication, such as yelling “WALL,” is essential in emergency situations. This incident emphasized the need for a well-coordinated team that can respond quickly to unforeseen events.
Continuous Improvement:
- The incident served as a reminder that risk management is an ongoing process. Regularly reviewing and updating safety protocols based on new experiences and technologies is essential.

Risk management is about preventing what could go wrong. In this instance, a failure manifested in something that could have been physically very serious. But I can apply the learnings from this incident to other areas, such as risk management in IT and technology.