Despite recent progress, our understanding of how the visual system is shaped by the statistics of the environment remains in its early stages, and much work lies ahead.
Deep artificial neural networks offer only a partial understanding: the mathematical and physical principles underlying their performance are scattered across thousands of parameters, providing little insight. Moreover, they fail to account for the critical role of feedback and recurrent connectivity, the diversity of problems that biological visual systems can solve, and the speed with which they learn.
Where else shall we look? Some clues are offered by the papers in this special issue. Simple theoretical principles, such as the geometry of objects, the physics of image formation, and efficient and sparse coding, yield ideas that provide insight while keeping the dimensionality of models low. We believe a continuing confluence of these simple but powerful ideas will ultimately lead to a more complete understanding of the statistics of the natural environment and their implications for vision.