Machine Learning for Space Applications on Embedded Systems

Ric Dengel

As space missions grow in complexity, their operational capabilities and the volume of data they gather demand ever more advanced systems. Currently, mission capabilities are often constrained by link bandwidth and on-board processing capacity, and spacecraft operations require a large number of commands and complex ground station systems. Methods that use bandwidth and computing capacity more efficiently and that increase autonomy are therefore of strong research interest. Artificial Intelligence (AI), with its wide range of application scenarios, allows these challenges and more to be tackled in spacecraft design. In particular, the flexibility of Artificial Neural Networks as a Machine Learning technology offers many possibilities; for example, they can be used for object detection and classification tasks. Unfortunately, executing current Machine Learning algorithms consumes large amounts of power and memory. Additionally, the qualification of such algorithms remains challenging, which limits their possible applications in space systems. Thus, an increase in efficiency in all aspects is required to further enable these technologies for space applications. Optimising an algorithm for System on Chip (SoC) platforms allows it to benefit from the best of both a general-purpose processor and hardware acceleration. This increased complexity of the processing system is intended to allow broader and more flexible applications of these technologies with a minimal increase in power consumption. As commercial off-the-shelf embedded systems are commonly used in NewSpace applications and such SoCs are not yet available in qualified form, the deployment of Machine Learning algorithms on such devices has been evaluated. For this purpose, a Convolutional Neural Network model was optimised on a workstation.
The neural network was then deployed with Xilinx's Vitis AI onto a SoC that combines a powerful general-purpose processor with the programmable hardware of a Field Programmable Gate Array (FPGA). The result was evaluated against relevant performance and efficiency parameters, and a summary is given in this thesis. Additionally, a tool following a different approach was developed: using a high-level synthesis tool, the hardware description language code of an accelerated linear algebra optimised network is generated and deployed directly into FPGA logic. The implementation of this tool was started, and a proof of concept is presented. Furthermore, existing challenges with the auto-generated code are outlined, and future steps to automate and improve the entire workflow are presented. As the two workflows differ substantially and target different usage scenarios, both are outlined together with their respective benefits and disadvantages.