Prof. Dr. Michael Möller

In the future, artificial intelligence (AI) will be able to extract ever more information from image data. However, conventional cameras may not be optimal for the technological development in this field. In the »Learning to Sense« project, a research group consisting of seven research chairs at the University of Siegen is bringing the two sides together: the researchers plan to develop both novel image sensors and the AI software to match – for tomorrow’s cameras, scanners, and microscopes.
The image sensors in even the most expensive smartphone or camera are no match for the human eye. That’s because the eye and the brain’s image processing interact to form an incredibly effective system. It starts with the retina, which is covered with millions of photosensitive cells distributed very unevenly. At the point of sharpest vision, a large number of small photosensitive cells are packed closely together. This produces a high image resolution when we focus on an object. Toward the edge of the retina, the photosensitive cells are larger and more sparsely distributed, so the image there has a lower resolution. Nevertheless, we still perceive movement very clearly.
That was a life-saver for early humans, allowing our ancestors to detect approaching predators in time to react. Today, our peripheral vision stops us from stepping onto the road when a car is approaching. This distribution of photosensitive cells on the retina benefits the brain’s image processing enormously: because the point of sharpest vision is very small, the brain has to process only a small volume of high-resolution data. The low-resolution information from the edge of the retina is usually less important and requires less processing effort.
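To make the idea concrete, the following minimal Python sketch (with purely illustrative parameters, not tied to the project) mimics such a retina-like layout in software: it keeps full resolution around a fixation point and substitutes ever coarser samples toward the edges.

```python
# A toy retina: full resolution near the fixation point, ever coarser
# "photoreceptors" toward the periphery. Parameters are illustrative only.
import numpy as np

def foveated_sample(image, cx, cy, ring_width=32):
    """Keep detail near (cx, cy); replace outer rings with coarser samples."""
    h, w = image.shape
    ys, xs = np.mgrid[0:h, 0:w]
    dist = np.hypot(ys - cy, xs - cx)            # distance to the fixation point
    out = image.astype(float).copy()
    for ring, block in [(1, 2), (2, 4), (3, 8)]:
        small = image[::block, ::block]          # fewer, larger "cells"
        coarse = np.repeat(np.repeat(small, block, axis=0), block, axis=1)[:h, :w]
        out[dist >= ring * ring_width] = coarse[dist >= ring * ring_width]
    return out

image = np.random.rand(128, 128)                 # stand-in for a camera frame
print(foveated_sample(image, 64, 64).shape)      # (128, 128): sharp center, coarse edges
```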
Current optical image sensors have a completely different design: their photosensors, or sensor pixels, are arranged in a rectangular grid with even spacing. »This design makes automated image evaluation increasingly difficult in many of today’s application fields,« says Margret Keuper, Professor for Visual Computing at the University of Mannheim and a project leader within the research group »Learning to Sense«. A vast number of applications with a fixed camera position need high-resolution data in only a small part of the image while still relying heavily on the larger context. Such applications would benefit immensely from a new chip design – examples are quality assurance in production plants and traffic monitoring in self-driving cars. The problem is that a traditional chip captures the entire image in very high resolution, creating a vast quantity of image data. The software then has to process all of this data, even though only a small section of the image is relevant – for example, a defective component on a conveyor belt or a significant change in traffic at the periphery.
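A quick back-of-the-envelope comparison illustrates the imbalance; the numbers below are assumptions for illustration, not figures from the project.

```python
# Illustrative numbers: a 12-megapixel frame versus the small region that
# actually matters in a fixed-camera inspection task.
full_pixels = 4000 * 3000        # full frame read out by a conventional chip
roi_pixels = 256 * 256           # region around the component being inspected
share = 100 * roi_pixels / full_pixels
print(f"Full frame: {full_pixels:,} pixels")
print(f"Relevant region: {roi_pixels:,} pixels ({share:.2f}% of the data)")
```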
This overabundance of data is a growing problem, because increasingly complex AI is used for image processing – in particular neural networks that process information in multiple steps, or layers. The more image information one feeds into a neural network, the more computing power is needed and the longer it takes the network to produce a result. »There are many applications where image sensors could become significantly more effective when designed in an unconventional way that is particularly well suited for a subsequent analysis with machine learning technology,« says Michael Möller, Professor of Computer Vision and spokesperson of the project »Learning to Sense«.
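The scaling is easy to see with a rough calculation. The sketch below (with assumed layer sizes, not taken from the project) counts the multiply-accumulate operations of a single convolution layer: its cost grows in direct proportion to the number of input pixels, so doubling the resolution along each side quadruples the work of every such layer in the network.

```python
# Cost of one stride-1 "same" 3x3 convolution layer at different resolutions.
# Channel counts are assumptions chosen only to make the scaling visible.
def conv_macs(height, width, in_channels=3, out_channels=32, kernel=3):
    """Multiply-accumulate operations for a single convolution layer."""
    return height * width * in_channels * out_channels * kernel * kernel

for side in (256, 512, 1024):
    print(f"{side}x{side} input: {conv_macs(side, side):,} MACs per layer")
# 256x256 input: 56,623,104 MACs per layer
# 512x512 input: 226,492,416 MACs per layer
# 1024x1024 input: 905,969,664 MACs per layer
```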
So far, the two fields have developed separately from each other: electrical engineering, which has continuously optimized conventional image sensors, and computer science, which has developed its own, distinct tools. Up to now, these tools were rarely created with the needs of the other discipline in mind. »In our project ›Learning to Sense‹, we aim for the joint optimization of the design of the sensor system as well as of the computer analysis of the captured data,« says Michael Möller.
Working on the project together with Möller are Prof. Dr Volker Blanz and Prof. Dr Andreas Kolb (both University of Siegen) and Prof. Dr Margret Keuper (University of Mannheim) from the field of computer science, as well as Prof. Dr Bhaskar Choubey, Prof. Dr Peter Haring Bolívar, and Prof. Dr Ivo Ihrke (all University of Siegen) from the field of sensor technology. »Together with our doctoral students, we aim to design new sensor chips and machine learning methods that complement each other perfectly,« explains Möller. The groups are jointly developing novel techniques to optimize image sensor systems together with the machine learning methods that analyze the resulting data. Far beyond particular applications, the goal of this fundamental research is to establish a new paradigm: the ability to »learn« the design of future sensor systems, much as today’s artificial intelligence »learns« to understand our world.
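What it could mean to »learn« a sensor design can be sketched in a few lines of PyTorch. The example below is a hypothetical toy, not the project’s actual method: a differentiable stand-in for the sensor (a learnable per-pixel sensitivity mask) is trained jointly with a small classifier, so the same training signal shapes the »hardware« and the analysis software at once.

```python
# Hypothetical toy: jointly optimize a "sensor" (learnable per-pixel mask)
# and a classifier, so gradients flow through both at the same time.
import torch
import torch.nn as nn

class LearnableSensor(nn.Module):
    """Differentiable stand-in for a sensor: a learnable per-pixel sensitivity."""
    def __init__(self, height, width):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(1, 1, height, width))

    def forward(self, scene):
        mask = torch.sigmoid(self.logits)   # in (0, 1): how strongly each pixel "senses"
        return scene * mask                 # simulated measurement of the scene

sensor = LearnableSensor(32, 32)
classifier = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32, 64), nn.ReLU(), nn.Linear(64, 10))
optimizer = torch.optim.Adam(list(sensor.parameters()) + list(classifier.parameters()), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

scenes = torch.rand(8, 1, 32, 32)            # random stand-ins for physical scenes
labels = torch.randint(0, 10, (8,))
optimizer.zero_grad()
loss = loss_fn(classifier(sensor(scenes)), labels)
loss.backward()                              # gradients reach sensor AND classifier
optimizer.step()
print(f"loss of this joint step: {loss.item():.3f}")
```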
To this end, they will demonstrate and validate their findings in three application areas. The first is terahertz imaging, a technique for measuring radiation at frequencies invisible to the human eye, which, for instance, makes it possible to detect sub-surface defects in industrial production. The second is 3D microscopy, where, for example, the illumination can be optimized to capture the detailed geometry of cells – highly relevant for cancer research. The third is the optimization of CMOS sensors for visible light: deviating from the commonly used grid of equally sized and evenly spaced red, green, and blue pixels can make such sensors significantly more efficient in specific applications, for example in difficult lighting conditions that classically require a very high dynamic range.
Today’s neural networks and other AI software are so complex that even the experts who develop them can barely comprehend how they analyze data. The AI software is fed training data, such as images showing typical component defects, and over time the neural network learns what defects look like. But its internal processes remain a closed book. As long as a neural network is fed image information a human can perceive, it is possible to check in retrospect whether it has worked correctly – for example, whether a fault detected by the software really is a hole in a component. It gets difficult, however, when you design completely novel sensors that don’t deliver conventional image information. In this case, neural networks can learn entirely different image characteristics that humans can’t see, for example the brightness difference between adjacent pixels. »That’s why we need to make sure our AI solutions produce plausible results and the algorithms really do output the information we want,« says Margret Keuper.
The »Learning to Sense« project is one of eight prestigious research units funded by the German Research Foundation (DFG) in its special initiative on artificial intelligence, and it is supported with over 3.3 million euros. If everything goes to plan, the new generation of sensors, with a design optimized for AI, will significantly boost the effectiveness of image processing.