The Patriot Missile Failure in Dhahran: Is Software to Blame?

Shelley Toich
February 9, 1998
Prof. Eric Roberts
CS 201

Background

During the Gulf War, the United States' Patriot missile defense system was widely hailed as a wartime savior. News reports and government sources alike attributed to it a nearly perfect success rate in destroying Iraqi Scud missiles. Not until the war was over did observers begin to express doubts regarding the success of the Patriot; eventually the Army determined that the Patriot succeeded in intercepting Scud missiles in perhaps only 10 to 24 of more than 80 attempts. Some critics, in particular Ted Postol of MIT, believe the success rate was much lower--perhaps as low as just one successful interception. This disagreement persists because, for several reasons, determining the true success rate of the Patriot is difficult. First, "success" is defined in several ways: destruction of, damage to, and deflection of a Scud missile may each be counted as a success depending on who makes the assessment. Second, the criteria used for "proof" of a "kill" varied--in some cases Army soldiers made little or no investigation and assumed a kill; in other cases they observed hard evidence of a Scud's destruction. Postol arrived at his much lower figure by carefully examining film footage of Patriot missile intercept attempts. Many have attacked Postol's methods, claiming that video technology cannot accurately capture the activity of missiles traveling at extremely high speeds. Although controversy persists concerning the actual effectiveness of the Patriot missile in the Gulf War, one thing is certain: the Patriot failed to intercept a Scud missile that hit an American military barracks in Dhahran, Saudi Arabia, on February 25, 1991. In fact, no Patriot missile was launched to intercept the Scud that day--twenty-eight people were killed and ninety-seven were injured. Why did the Patriot fail to respond to this threat?

"Computer failure" blamed

Eventually, the Army attributed the Patriot missile failure in Dhahran to "a software failure in the…computer" as a result of "long use of the radar system" (U.S. News 330). But, as is the case in any failure of a complex system, many factors may have contributed to the failure of the Patriot missile to reliably perform its duty. The Patriot problems likely stemmed from one fundamental aspect of its design: the Patriot was originally designed as an anti-aircraft, not anti-missile, defense system. With this limited purpose in mind, Raytheon designed the system with certain constraints. One such constraint was that the designers did not expect the Patriot system to operate for more than a few hours at a time--it was expected to be used in a mobile unit rather than at a fixed location. They also made other design decisions which later caused failures when the Army modified the Patriot for anti-missile defense.

At the time of the Scud attack on Dhahran, the Patriot battery had been running continuously for four days--more than 100 hours. This fact alone probably explains why the Patriot failed to intercept the Scud which hit the American barracks, but some more discussion is required to understand why extended operation caused the Patriot to fail.

When the Patriot system is in operation, it must have a way of determining whether "targets" it finds in the air are actually incoming missiles rather than false alarms. The Patriot makes this determination by tracking the target to see whether it is following the expected path of a ballistic missile. Ballistic missiles travel at extremely high speeds, which means that the time interval between radar "sightings" of the target must be very small. The Patriot tracks a target by first noting the location of the original radar sighting, then using knowledge of the characteristics of a ballistic missile in flight to anticipate where the target should be at the next radar sighting--a fraction of a second later. If, at the second radar sighting, the target does not appear in the "range gate" (the calculated zone in which the target will appear if it is a ballistic missile), it is classified as a false alarm and subsequently ignored by the Patriot.
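The range gate test described above can be sketched in a few lines of Python. This is only an illustration, not the Patriot's actual tracking code: it reduces the problem to one dimension (range along the line of sight), assumes a constant closing speed, and every name and number in it (RangeGate, predict_gate, the 2,000 m/s speed, the 100 m half-width) is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RangeGate:
    """Hypothetical one-dimensional range gate."""
    center_m: float      # predicted range at the next radar sighting
    half_width_m: float  # tolerance allowed on either side of the prediction

    def contains(self, observed_range_m: float) -> bool:
        # The target counts as a real missile only if it shows up inside the gate.
        return abs(observed_range_m - self.center_m) <= self.half_width_m


def predict_gate(last_range_m: float, closing_speed_mps: float,
                 dt_s: float, half_width_m: float) -> RangeGate:
    """Center the gate where a target closing at constant speed would be after dt_s."""
    predicted_range_m = last_range_m - closing_speed_mps * dt_s
    return RangeGate(predicted_range_m, half_width_m)


# Illustrative values only: first sighting at 50 km, next sighting 0.1 s later.
gate = predict_gate(last_range_m=50_000.0, closing_speed_mps=2_000.0,
                    dt_s=0.1, half_width_m=100.0)

on_path = gate.contains(49_800.0)   # where a target on the expected path would be
off_path = gate.contains(49_000.0)  # far from the prediction: treated as a false alarm
print(on_path, off_path)
```

Note that the prediction depends directly on the elapsed time dt_s: any error in the time value moves the center of the gate by the closing speed multiplied by that error.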

In order to make this path calculation, the Patriot depends on its internal clock. Because the memory available to the program was limited, the clock value was truncated slightly when stored. By itself, this truncation would not have been likely to cause significant errors; however, the Patriot's software was written so that the error accumulated over time--the longer the Patriot ran without a restart, the larger the error became. The Israeli military, analyzing data from Patriot batteries operating in Israel, was the first to discover the clock drift error. Its analysts calculated that after only 8 hours of continuous operation, the Patriot's stored clock value would be off by 0.0275 seconds, causing an error in the range gate calculation of approximately 55 meters. At the time of the Dhahran attack, the Patriot battery in that area had been operating continuously for more than 100 hours--its stored clock value was 0.3433 seconds off, shifting the range gate 687 meters, far enough that the Patriot was looking for the target in the wrong place. Consequently, the target did not appear where the Patriot incorrectly calculated it should, and the Patriot classified the incoming Scud as a false alarm and ignored it--with disastrous results.
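The drift figures quoted above can be reproduced with a short calculation. The sketch below rests on assumptions that this essay does not state: that the clock counts ticks of one tenth of a second, and that the constant 1/10 is stored chopped to 23 binary fractional bits, a width chosen here because it reproduces the quoted numbers. The closing speed of 2,000 m/s is likewise not a stated Scud speed; it is simply the ratio implied by the figures above (55 m / 0.0275 s). All function and constant names are illustrative.

```python
from fractions import Fraction

FRACTION_BITS = 23          # ASSUMPTION: binary fractional bits kept for the constant 1/10
TICKS_PER_HOUR = 10 * 3600  # ASSUMPTION: the clock ticks once every tenth of a second
CLOSING_SPEED_MPS = 2000    # inferred from the essay's own figures (55 m / 0.0275 s)


def stored_tenth(bits: int = FRACTION_BITS) -> Fraction:
    """Return 1/10 chopped (not rounded) to the given number of binary fractional bits."""
    scale = 2 ** bits
    return Fraction(scale // 10, scale)


def clock_drift_s(hours: float, bits: int = FRACTION_BITS) -> float:
    """Accumulated clock error in seconds after the given hours of continuous operation."""
    error_per_tick = Fraction(1, 10) - stored_tenth(bits)
    ticks = Fraction(hours) * TICKS_PER_HOUR
    return float(error_per_tick * ticks)


def gate_shift_m(hours: float) -> float:
    """Range gate displacement in meters implied by the accumulated clock error."""
    return clock_drift_s(hours) * CLOSING_SPEED_MPS


for hours in (8, 100):
    print(f"{hours:>3} h: drift {clock_drift_s(hours):.4f} s, "
          f"gate shift {gate_shift_m(hours):.0f} m")
```

Under these assumptions the error per tick is tiny (under one ten-millionth of a second), which is why the flaw stays invisible over a few hours of operation and only becomes serious after days without a restart.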

On February 11, 1991, after determining the effect of the error over time, the Israelis notified the U.S. Patriot project office of the problem, and the programming team set to work on a solution. "Within a few days, the Patriot project office made a software fix correcting the timing error, and sent it out to the troops on February 16, 1991" (Marshall). In the meantime, the project office had sent out a warning that "'very long run times' could affect the targeting accuracy" (Marshall). Sadly, the software update had not yet reached Dhahran at the time of the attack; it arrived the day after, and it might have saved the lives of those in the barracks. On the day of the attack, two Patriot batteries were deployed to cover the Dhahran area. However, the Bravo battery was having trouble with its radar, a problem unrelated to the clock drift error, so the Alpha battery had to run continuously, as it had for four days, to keep coverage over Dhahran uninterrupted. Moreover, because the phrase "very long run times" was never specifically defined, the Patriot operators could not know that they were operating under dangerous conditions when the attack occurred.

A final question to be answered is why the programmers allowed a software error that grows over time, eventually causing significant errors in range calculation. The probable answer is, again, that the system was designed to be mobile, and to defend against aircraft, which move much more slowly than ballistic missiles. Because the system was intended to be mobile, it was expected that the computer would be periodically rebooted--certainly in less than 14 hours. As a result, any clock drift error would not accumulate over extended periods and would not cause significant errors in range calculation. In fact, because the Patriot system was not intended to run for extended times, it was probably never tested under those conditions, which would explain why the problem was not discovered until the war was in progress. The other consideration, that the system was designed as an anti-aircraft system, probably also allowed such a design flaw to go unnoticed, since slower-moving airplanes are easier to track and tracking them is therefore less dependent upon a highly accurate clock value.

What did it cost?

The most important cost of the Patriot failure was, of course, the loss of life at Dhahran. Twenty-eight people died there; more than ninety were injured. If the Patriot had worked correctly and consistently, these lives likely would have been saved. Monetary costs are more difficult to assess. Each Patriot missile fired cost approximately $600,000. With 159 missiles fired, the total cost of the missiles was more than $95 million. If 25% of Patriots fired successfully intercepted their targets (an estimate of success which many critics consider generous), then about $24 million was spent on successful missions and nearly $72 million on unsuccessful ones. If each of these failures to intercept was due to the Patriot working incorrectly, we could say that the failure cost nearly $72 million. This does not take into account the cost of developing (and improving and upgrading) the Patriot or the cost of each of the Patriot systems deployed in the Gulf--if those are included, the monetary costs skyrocket.
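The cost estimate above is simple enough to check directly. The snippet below uses only the inputs given in the text (an approximate $600,000 per missile, 159 missiles fired, and the hypothetical 25% success rate) and does not round intermediate values; the variable names are illustrative.

```python
COST_PER_MISSILE_USD = 600_000   # approximate cost per missile, as given in the text
MISSILES_FIRED = 159             # missiles fired, as given in the text
ASSUMED_SUCCESS_RATE = 0.25      # hypothetical success rate used in the text

total_usd = COST_PER_MISSILE_USD * MISSILES_FIRED
successful_usd = total_usd * ASSUMED_SUCCESS_RATE
unsuccessful_usd = total_usd - successful_usd

print(f"total:        ${total_usd / 1e6:.2f} million")
print(f"successful:   ${successful_usd / 1e6:.2f} million")
print(f"unsuccessful: ${unsuccessful_usd / 1e6:.2f} million")
```

Because the per-missile cost is itself approximate and the success rate is only a supposition, these totals are best read as an order-of-magnitude estimate rather than an accounting.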

Who is responsible?

As discussed previously, a number of factors contributed to the failure of the Patriot. Although some of the problems were caused by a software "glitch," it would be unfair to lay all blame on the program (or programmer). If the system had been operated as planned (fewer than 14 hours at a time), the clock drift would have been insignificant. Perhaps if the system had been tested under circumstances similar to those encountered in the Gulf, the programming problem which caused the clock drift would have been discovered and corrected ahead of time. So does that mean the operators were at fault? Yes and no: if they had rebooted the system once or twice a day, the Patriot would have been more reliable; however, the operators were never specifically told that they should reboot the system periodically. We can hardly blame the operators for not performing an action they were not aware was necessary. We also cannot blame their superiors for not ordering the systems rebooted since they were also unaware of the problem. Perhaps we can blame the designers of the system who never thought to inform anyone of the operational limitations, and who wrote the software so that the system was required to be rebooted periodically; but since the system was meant to operate for only a few hours at a time it is perhaps understandable that they never realized it might be used differently in the field. Another possible place to lay blame is with the Patriot project office or the Army for not ensuring a more timely software update--it may have taken as long as nine days for the Dhahran Patriot batteries to receive the update after it had been created.

Who is ultimately at fault, then? No specific individual is to blame; instead many people and systems were involved, including Raytheon (the Patriot's designer), the Army, and perhaps the operators. The Patriot missile failure is a perfect example of a system so complex that one simple cause cannot easily be found: instead, several problems exist which, taken together, caused a catastrophic failure. There are several lessons to be learned from analyzing the Patriot failure. First, testing of computer-controlled systems must be thorough, especially when safety is at stake. If the Patriot had been tested under varying conditions, including very long periods of continuous operation, the clock drift error would likely have been discovered long before the Patriot was used in the Gulf. Second, special care must be taken when redesigning a system for a new use--even when the uses seem very similar, as with an anti-aircraft versus an anti-missile weapon, there can still be unexpected difficulties in adaptation. Finally, communication among the designers, programmers, and operators of a safety-critical system is imperative--even if the other suggestions were not implemented, better communication might have saved lives in Dhahran both by informing users of specific limits (for example, reboot every 8 hours) and by expediting the software upgrade.

 

Selected list of works consulted

Hughes, David. "Tracking Software Error Likely Reason Patriot Battery Failed to Engage Scud." Aviation Week and Space Technology 10 Jun. 1991: 25+.

Marshall, Eliot. "Fatal Error: How Patriot Overlooked a Scud." Science 13 Mar. 1992: 1347.

Neumann, Peter G. Computer-Related Risks. New York: ACM Press, 1995.

---. "Inside Risks." ACM SIGSOFT Software Engineering Notes 16.3 (1991): 19-20.

---. "Inside Risks." ACM SIGSOFT Software Engineering Notes 16.4 (1991): 17-18.

---. "Inside Risks." ACM SIGSOFT Software Engineering Notes 17.2 (1992): 4-5.

---. "Inside Risks." ACM SIGSOFT Software Engineering Notes 18.1 (1993): 25.

The Risks Digest 11.63, 11.70, 11.84, 11.92, 11.94, 12.1, 12.13, 13.19, 13.32, 13.35, 13.37, 13.46, 13.76. Online. Internet. 1991-1992. Available FTP: ftp.sri.com/risks.

Schmitt, Eric. "Army is Blaming Patriot's Computer for Failure to Stop the Dhahran Scud." New York Times 20 May 1991, late ed.: A6.

United States. House. Legislation and National Security Subcommittee of the Committee on Government Operations. Performance of the Patriot Missile in the Gulf War. 102nd Cong., 2nd sess. Washington: GPO, 1992.

---. GAO. Information Management and Technology Division. Patriot Missile Software Problem. Online. Internet. 1992. Available WWW: www.fas.org/spp/starwars/gao/im92026.htm.

U.S. News & World Report. Triumph Without Victory. New York: Times Books, 1992.