Table of contents
- Abstract
- Introduction
- Analysis of disaster
- Discussion
Abstract
This paper examines the possible causes and engineering failures that led to the loss of NASA’s Mars Climate Orbiter in 1999 by summarising and analysing the technical and human factors behind the incident. The primary fault was the failure to programme and operate the spacecraft’s trajectory in the required manner, which placed the probe on a path that took it below the minimum altitude at which it could survive and operate effectively. The paper also draws attention to any underlying issues or ethical malpractices that may have contributed to the failure, and to any regulatory requirements that were ignored or not followed correctly and that, had they been followed, might have prevented the engineering failures associated with this disaster. Finally, it identifies the post-disaster actions taken to prevent similar engineering and communication failures in future projects.
Introduction
The Mars Climate Orbiter (MCO) was launched on December 11, 1998 and was lost on September 23, 1999. The MCO had unintentionally been placed on a path that brought it too close to Mars’ surface. [1] The MCO had not been engineered with the structure or the expensive materials required to survive that deep in the Martian atmosphere, despite the probe costing $327.6 million to research and produce. [2] As a result, the spacecraft either disintegrated in Mars’ atmosphere or was deflected and re-entered heliocentric space.
The primary cause of the loss of the MCO was a mismatch of units in the software used to calculate its trajectory. The software developed by Lockheed Martin Astronautics, who designed and built the spacecraft, provided data in English (imperial) units, whereas NASA’s ground teams worked in metric units. [3] As a result, the trajectory data was incorrect, and post-failure calculations showed that the spacecraft was on a path that would have taken it to within 57 kilometres of the surface of Mars. Previous calculations had shown that the MCO could only survive at altitudes above 80 kilometres.[1]
These failures were fatal, and altitude should have been a key consideration for NASA before launch, since one of the key objectives of the MCO mission was to map the thermal structure of the atmosphere from the surface to 80 km altitude. The MCO needed to be placed at very specific altitudes to complete this objective, so its trajectory should have been a central concern when its programming was tested.
Analysis of disaster
The first and arguably most significant technical fault was that the MCO’s software used the wrong units for NASA’s purposes. In addition, the MCO’s computers contained no conversion algorithms and gave no indication of which unit system was in use, so with NASA’s teams controlling the probe in metric units, it was bound to be sent onto an undesired trajectory.[3] Together, these faults caused the space probe to descend to an altitude at which it could not operate and would be destroyed or lost in space.
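To illustrate the kind of safeguard that was missing, the following is a minimal sketch of a value that always carries its unit and is converted before use. The class and function names are hypothetical and are not taken from the mission software; only the conversion factor is a fixed definition.

```python
from dataclasses import dataclass

# Exact by definition: 1 lbf = 0.45359237 kg * 9.80665 m/s^2
LBF_TO_NEWTON = 0.45359237 * 9.80665


@dataclass(frozen=True)
class Impulse:
    """Hypothetical impulse value that always states its own unit."""
    value: float
    unit: str  # "N*s" (metric) or "lbf*s" (English)

    def to_newton_seconds(self) -> float:
        """Return the impulse in newton-seconds, converting if needed."""
        if self.unit == "N*s":
            return self.value
        if self.unit == "lbf*s":
            return self.value * LBF_TO_NEWTON
        # Refuse to guess: an unlabelled or unknown unit is an error.
        raise ValueError(f"unknown impulse unit: {self.unit!r}")


def total_impulse_ns(readings):
    """Sum a list of Impulse readings in newton-seconds."""
    return sum(r.to_newton_seconds() for r in readings)
```

With a design of this kind, a reading supplied in English units is either converted explicitly or rejected, rather than being silently treated as metric.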
A second key technical fault was that errors went undetected in NASA’s computer models, which predicted the effect of thruster firings on the spacecraft before they were carried out during the mission. These models were programmed in metric units, so when it was discovered during the mission that the MCO was on the wrong trajectory, the calculations produced in an attempt to salvage the mission were incorrect. [4] The teams working on the trajectory requested calculations of how long to fire the MCO’s small thrusters in order to deflect its path away from Mars’ atmosphere. The results were given in pound-force seconds rather than the newton-seconds that the software of the probe’s computers expected. Consequently, when the small thrusters were fired, the programmed impulse was too small to manoeuvre the MCO away from the atmosphere of Mars, and it remained on the incorrect trajectory on which it was lost. [5]
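The size of this error can be estimated with a short calculation. The sketch below assumes a figure produced in pound-force seconds is read as if it were already in newton-seconds; the function name and the input value are illustrative only and are not mission data.

```python
# Exact by definition: 1 lbf*s = 0.45359237 * 9.80665 N*s (about 4.448 N*s)
LBF_S_TO_N_S = 0.45359237 * 9.80665


def misread_error(impulse_lbf_s: float) -> dict:
    """Compare the true impulse with the value assumed by a metric-only reader."""
    true_ns = impulse_lbf_s * LBF_S_TO_N_S  # what the figure really means
    assumed_ns = impulse_lbf_s              # same number, taken as N*s
    return {
        "true_ns": true_ns,
        "assumed_ns": assumed_ns,
        "ratio": true_ns / assumed_ns,
        "shortfall_fraction": 1 - assumed_ns / true_ns,
    }


# Illustrative input only: 10 lbf*s is not a value from the mission.
result = misread_error(10.0)
print(f"true impulse:    {result['true_ns']:.2f} N*s")
print(f"assumed impulse: {result['assumed_ns']:.2f} N*s")
print(f"ratio:           {result['ratio']:.3f}")
print(f"shortfall:       {result['shortfall_fraction']:.1%}")
```

Under this assumption the assumed impulse is smaller than the true impulse by a factor of about 4.45, a shortfall of roughly 78 per cent, whatever the input value.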
The key human errors in the failure of the MCO mission were the failure of the engineering design, manufacturing, project management and testing teams to communicate effectively and to carry out sufficiently thorough testing in the pre-launch phase. On multiple occasions important information was not passed on to those who needed it, either because communications were too informal or because key information was simply not highlighted to the particular teams for whom it would have been significant. For example, NASA’s ground teams did not identify that the contractor’s software might be in English units, and nobody had queried the contractor’s production teams.[6] Furthermore, the testing teams did not highlight that the software was not in metric units and simply assumed the operating teams would be aware of this. They therefore did not communicate this key finding during their testing of the product, and this fault in the software was ultimately the basis of the mission’s failure.[7]
There were also further miscommunications, and a lack of communication, between engineering teams and operating teams during the post-launch phase of the mission. As a result, operating teams did not produce sufficiently detailed calculations, and the trajectory of the MCO was not controlled effectively. During the mission, days after the MCO was directed towards the atmosphere of Mars, navigation teams identified that the MCO appeared to be at a much lower altitude than intended.[1] The navigation team produced these calculations and passed the information to the correct operating unit far too late, by which time the MCO was approaching Mars too quickly for the control teams to abandon the initial trajectory programming. In practice, the teams had to abandon the entire mission, as it was too late to recover the MCO, and shortly afterwards NASA’s ground teams lost its signal.[4]
Taken together, these two key human errors suggest that NASA’s ground staff and the teams working on this project may have lacked training and professional guidance. The fact that communications were too informal, and that key information was overlooked or, in some cases, not passed on within a suitable timeframe, suggests that the teams selected for the MCO mission lacked experience of working with one another. Furthermore, the fact that key technical faults, such as the use of the wrong units, were not identified during testing shows that the testing teams were not up to the standard required to collaborate with the other teams on such a high-profile project.[6]
Discussion
When analysing this incident, a recurring theme emerges. Each key technical fault or human error relates to the fact that the MCO software was programmed in imperial units while NASA’s ground teams used metric units. This is therefore the primary factor in the failure of NASA’s MCO mission: it caused navigation teams to direct the MCO to an altitude where it lost signal and was either destroyed in Mars’ atmosphere or deflected into heliocentric space. Whether NASA itself was to blame, or whether Lockheed Martin Astronautics was at fault for designing and manufacturing a product not entirely suited to its client’s intended use, is open to question.
However, as analysed above, many other factors and key players were involved in the failure of this mission, not simply the failure to collaborate and agree on a single set of units. NASA must take responsibility for the failure of the mission, since it had numerous opportunities to identify this key fault in its spacecraft and failed to identify and rectify its mistakes in a timely manner.[8]
The technical faults of the product itself were clearly significant; however, the key reason for them was NASA’s failure to specify that the product’s software needed to be programmed in metric units. Furthermore, NASA’s project management during this mission was poor and led to complacency within the teams designated to run the project. The lack of effective communication between teams and external stakeholders was the key underlying factor in the mission failure. Communication of vital information between NASA’s ground teams was not monitored sufficiently, so discussions between teams remained informal and vital information was not passed on to those who needed it. For example, the navigation unit did not inform the operating teams that the MCO appeared to be off its intended position until it had already passed an altitude at which it was bound to lose signal at any moment. The key reason for this was that the navigation team was not carrying out regular calculations of the MCO’s trajectory and documenting them to the correct standard.
A key professional ethical concern arises here, as the discrepancy between calculated and measured position had been noticed earlier by navigators, whose concerns were rejected by management teams because they 'did not follow the rules about filling out the form to document their concerns'.[7] That management teams did not consider that the entire mission was at risk, simply because ground staff had not followed company policy correctly, seems unprofessional in itself. However, the fact that such a significant observation was not documented or communicated professionally by the navigation team is further evidence of a lapse in personal engineering ethics, as the individuals failed to follow professional standards when handling vital information.
In the aftermath of the MCO’s mission failure, and before the launch of its next Mars orbiter in 2001, NASA began to conform to US Public Law 94-168, under which SI units should be used in US scientific projects.[9] NASA has since updated its mandatory engineering project management policy, which now requires the use of SI units unless otherwise specified. When SI units are not used, the units that will be used must be listed and passed on to every party involved in the engineering of each product used in the project.[10] These standards have been in place since 2001.