In the past few years, data science has grown considerably in importance and now heavily influences many domains, ranging from computer science to biology, medicine, economics, and finance. As we rely more and more on data science to make decisions, we become increasingly vulnerable to programming errors. Errors are particularly likely to go unnoticed in data science, because the code is often written by domain experts rather than software engineers. In financial applications, flawed code can cause huge monetary losses. In medical applications, programming errors can be deadly.
This research project is the first step in a longer effort to make data science code more reliable. Its goal is to develop the foundations for the static analysis of data science code, so that rigorous mathematical guarantees can be given about the code's behavior. To this end, we are currently targeting Python, one of the most popular programming languages for data science.
The Lyra project has been funded by ETH Zurich.