DL is currently an experimental science. That is, researchers discover DL capabilities by surprise. A lot of engineering certainly goes into optimizing and improving these machines. However, DL's capabilities are ‘unreasonably effective’; in short, we don’t have very good theories to explain them.

It is clear that there are gaps in understanding in at least 3 open questions:

1. How is DL able to search high-dimensional discrete spaces?
2. How is DL able to generalize if it appears to be performing rote memorization?
3. How do (1) and (2) arise from simple components?
Source: Deep Learning Can be Applied to Natural Language Processing