Sifting the wreckage / Global News screenshot
A friend of mine who formerly worked at NASA was talking about his volunteer work at his church, which is to operate the video camera that records the pastor's sermon. He's going to ask them to buy a second camera, and when I asked him why, he said, “Single-point failure. That camera goes out, we're up a creek without a paddle.”
The same concept that can be applied to as humble and non-life-threatening a situation as recording sermons also applies to highly complex systems such as Boeing's 737 Max 8, a new version of the popular 737 aircraft that is in service around the world. But new evidence from the March 10 crash of a 737 Max 8 outside Addis Ababa, Ethiopia, in which 157 people died, indicates that a single-point failure may be responsible for both that disaster and a similar crash of another 737 Max 8 on Oct. 28, 2018, in Indonesia that killed 189.
The single-point failure possibility involves a new anti-stall system called MCAS, which Boeing installed on the Max 8 version of their 737s when the two engines were moved forward compared to earlier versions. Because this move made the aircraft more prone to stall, the MCAS system was intended to make the plane handle more like older 737s, reducing the need for extensive pilot retraining. But evidently, pilots were not thoroughly informed that the new MCAS system was in place and activated until the Ethiopian crash brought attention to the system.
The system works by monitoring information from two sensors called angle-of-attack (AOA) sensors. These are small fins that stick out from the side of the aircraft rather like wind vanes, and rotate to sense the direction of local airflow with respect to the fuselage. In a stall, the plane is tilted nose-up excessively with respect to the direction of airflow. This makes the sensor turn at an angle that the on-board computers use to figure out that it's time to take over the controls from the pilot and push the nose down.
Normally, according to a post on aviationstackexchange.com, the on-board computer takes the output of both AOA sensors into account, and if one indicates a stall and the other doesn't, perhaps just a warning is issued to the pilot. But according to a New York Times report, the MCAS anti-stall system activates even if only one of the two sensors says the nose is too high. If anything happens to make one of the sensors give a false reading—a stray updraft from the backwash of a flight that just took off, for example—the MCAS goes into action and pushes the nose down, even if the take-off is proceeding normally.
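To make the difference concrete, here is a minimal sketch in Python of the two approaches as described above. It is not Boeing's actual logic: the function names, the returned action labels, and both threshold values are assumptions invented purely for illustration.

```python
# Illustrative sketch only; this is NOT Boeing's MCAS code.
# Both thresholds below are hypothetical values chosen for the example.

STALL_AOA_DEG = 14.0         # assumed angle of attack treated as "near stall"
MAX_DISAGREEMENT_DEG = 5.0   # assumed largest plausible gap between healthy sensors


def single_sensor_trigger(aoa_left_deg: float, aoa_right_deg: float) -> str:
    """Act as soon as either sensor alone reports a high angle of attack."""
    if aoa_left_deg > STALL_AOA_DEG or aoa_right_deg > STALL_AOA_DEG:
        return "nose_down"
    return "no_action"


def cross_checked_trigger(aoa_left_deg: float, aoa_right_deg: float) -> str:
    """Act only when both sensors agree; warn the pilot when they disagree."""
    if abs(aoa_left_deg - aoa_right_deg) > MAX_DISAGREEMENT_DEG:
        # The sensors contradict each other, so neither is trusted.
        return "warn_pilot"
    if aoa_left_deg > STALL_AOA_DEG and aoa_right_deg > STALL_AOA_DEG:
        return "nose_down"
    return "no_action"


if __name__ == "__main__":
    # One faulty sensor reads 22 degrees while the healthy one reads 4.
    print(single_sensor_trigger(22.0, 4.0))
    print(cross_checked_trigger(22.0, 4.0))
```

With one sensor stuck at a falsely high reading, the first function commands the nose down on the strength of that single reading, while the second declines to act and hands the problem to the pilot. That gap is the single-point failure in miniature.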
The altitude records of both 737s involved in the crashes in Ethiopia and Indonesia show that the pilots went on a desperate roller-coaster ride, executing climbs and descents every half-minute or so three or four times before the final descent and crash. This is consistent with a struggle between the MCAS and the pilots, although other causes could be responsible as well. Following the Ethiopian crash, China and many other countries grounded all 737 Max 8 and Max 9 planes, and later last week, the US followed suit.
Boeing says it is working on a software upgrade for the planes involved, but it may not be available until April, and so until then, millions of dollars' worth of aviation assets will be out of service. But that's better than having another 737 crash on take-off.
It is too soon to draw definite conclusions about the causes of these crashes. That has to wait for the analyses of black-box records and other pertinent data. But investigators have already found that the horizontal stabilizer in the Ethiopian plane was set to push the nose down, which is not something you normally do on take-off. And the fact that the MCAS can be triggered by only one AOA sensor is enough reason to take measures such as grounding planes until a remedy can be developed and installed.
Planes are designed by people who work in organizations, and successful designs of safe planes emerge from an exceedingly complex process involving thousands of designers, technicians, supervisors, inspectors, regulators, and other interested parties. Successful companies manage to evolve with new young staff replacing retired engineers and managers while maintaining the core principles and knowledge that are essential to making planes safe. And one of those core principles, so easy to understand that even I get it, is to avoid single-point-failure situations whenever possible by installing backup systems and procedures.
If what the Times reported is true, someone dropped the ball with regard to the MCAS system's behaviour in response to only one erroneous sensor. It could take months or years to figure out how this design error happened. But the lesson is one that has to be learned if Boeing is to recover from this sequence of disasters, which it probably will.
It's also possible that the accidents involved pilot error in combination with a misbehaving MCAS. If the pilots didn't know that the MCAS was even installed, or were unfamiliar with what flying the plane with an activated MCAS is like, their actions with regard to it may have contributed to the crashes. Part of the problem here is that the MCAS rarely activates under typical flight conditions. Perhaps there is something about the meteorological conditions at the two airports involved which gave rise to a single AOA sensor error that persisted long enough to cause the accidents.
These and other speculations will have to await the full accident reports, which may not be available for months. But in the absence of more knowledge, grounding the 737 Max 8 and 9 planes until the single-point-failure problem with MCAS can be addressed and demonstrated to be fixed is the wisest course.
Karl D. Stephan is a professor of electrical engineering at Texas State University in San Marcos, Texas. This article has been republished, with permission, from his blog Engineering Ethics, which is a MercatorNet partner site. His ebook Ethical and Otherwise: Engineering In the Headlines is available in Kindle format and also in the iTunes store.