Lyra

In the past few years, data science has grown considerably in importance and now heavily influences many domains, ranging from computer science to biology, medicine, economics, and finance. As we rely more and more on data science for making decisions, we become increasingly vulnerable to programming errors. Errors are particularly likely to remain unnoticed in data science code, which is often written by domain experts rather than software engineers. Flawed code can cause huge monetary losses in financial applications. In medical applications, programming errors can be deadly.

This research project is the first step in a longer research effort to enhance the reliability of data science code. The goal of the project is the development of foundations for the static analysis of data science code to provide rigorous mathematical guarantees of its behavior. For this purpose, we are currently targeting Python, one of the most popular programming languages for data science. 

Project Members

The project has been completed. Please contact Peter Müller in case of questions or comments.

External Collaborators

Caterina Urban

Links

The project is open-source and available on GitHub.

Publications

  • M. Hassan, C. Urban, M. Eilers, and P. Müller, MaxSMT-Based Type Inference for Python 3
    In Computer Aided Verification (CAV), 2018.
  • C. Urban and P. Müller, An Abstract Interpretation Framework for Input Data Usage
    In European Symposium on Programming (ESOP), 2018.

Completed Student Projects

Acknowledgments

The Lyra project has been funded by ETH Zurich.

 
