Lyra

In the past few years, data science has grown considerably in importance and now heavily influences many domains, ranging from computer science to biology, medicine, economics, and finance. As we rely more and more on data science for making decisions, we become increasingly vulnerable to programming errors. Errors are particularly likely to remain unnoticed in data science, as code is often written by domain experts rather than software engineers. Flawed code can cause huge monetary losses in financial applications, and in medical applications, programming errors can be deadly.

This research project is the first step in a longer research effort to enhance the reliability of data science code. The goal of the project is to develop foundations for the static analysis of data science code that provide rigorous mathematical guarantees of its behavior. For this purpose, the project targets Python, one of the most popular programming languages for data science.

Project Members

The project has been completed. Please contact Peter Müller in case of questions or comments.

External Collaborators

Caterina Urban

Links

The project is open source and available on GitHub.

Publications

M. Hassan, C. Urban, M. Eilers, and P. Müller, MaxSMT-Based Type Inference for Python 3
In Computer Aided Verification (CAV), 2018.
C. Urban and P. Müller, An Abstract Interpretation Framework for Input Data Usage
In European Symposium on Programming (ESOP), 2018.

Completed Student Projects

Radwa Sherif Abdelbar, Bachelor's Thesis, SS 2018
Automatic Checking of Implicit Assumptions on Textual Data
Lowis Engel, Bachelor's Thesis, SS 2018
Usage Analysis of Data Stored in Map Data Structures
Madelin Schumacher, Master's Thesis, SS 2017
Automated Generation of Data Quality Checks
Simon Wehrli, Master's Thesis, SS 2017
Static Program Analysis of Data Usage Properties
Mostafa Hassan, Bachelor's Thesis, SS 2017
Static Type Inference for Python

Acknowledgments

The Lyra project has been funded by ETH Zurich.