Artificial intelligence has recently transformed many areas of science and is increasingly entering commercial applications. This progress is driven by advances in machine learning, most prominently deep learning, in which large multi-layer artificial neural networks are trained on large datasets using advanced learning algorithms. Public trust in the performance of such complex technical systems is critical to their widespread applicability and acceptance. In this project, we extend our existing audit catalog so that it can be used to certify machine learning systems in safety-critical applications, with the aim of validating their quality and usability and of increasing public understanding and trust. We contribute new insights, theories, and methods for evaluating and inspecting machine learning systems, develop quality assurances based on robust statistical test scenarios, and demonstrate their applicability in various application domains.