## Project Details

### Description

One of the biggest challenges today is the analysis of complex data whose content is often not directly visible. For instance, imagine a group of corona patients for whom we know how long each pair of patients was in contact. To trace chains of infection, we would like to assign patients to clusters and say, for instance, that two patients belong to the same cluster if they had at least 10 minutes of contact.

However, the result might differ significantly if we use 5 or 20 minutes instead of 10 as the minimum contact time. We call such a threshold a parameter: a value on which the result of our analysis depends and which we can vary over a range of values.

It makes sense to consider all possible parameter values instead of only one and to investigate how clusters change when the parameter value changes. Besides clusters, we could also analyze other properties, for instance, the number of "holes" in the data. Persistent homology is a mathematical theory that tells us how topological properties (such as the number of clusters or holes) change when the parameter changes.
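The idea of tracking clusters across parameter values can be illustrated with a minimal sketch. The contact durations, the function name `count_clusters`, and the choice to treat clusters as connected components (patients linked through a chain of sufficiently long contacts) are assumptions made for illustration; they are not part of the project itself.

```python
def count_clusters(n_patients, contact_minutes, threshold):
    """Count clusters when two patients are linked if their contact
    lasted at least `threshold` minutes (clusters = connected components)."""
    parent = list(range(n_patients))

    def find(i):
        # Follow parent pointers to the cluster representative,
        # shortening the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for (a, b), minutes in contact_minutes.items():
        if minutes >= threshold:
            parent[find(a)] = find(b)  # merge the two clusters

    return len({find(i) for i in range(n_patients)})


# Hypothetical contact durations (in minutes) between 5 patients.
contact_minutes = {(0, 1): 25, (1, 2): 12, (2, 3): 7, (3, 4): 3}

# Number of clusters for each choice of the minimum contact time.
cluster_counts = {t: count_clusters(5, contact_minutes, t) for t in (5, 10, 20)}
print(cluster_counts)
```

In this toy data set, raising the threshold removes links and therefore splits clusters; recording when clusters merge or split over the whole parameter range is the kind of information persistent homology captures.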

When analyzing data, we are, however, not restricted to a single parameter. To extend the example above, we might gain more insight into the data if we restrict attention to symptomatic or severely ill patients.

This gives us two parameters: the severity of the infection and the duration of contact, and we get a different clustering for every choice of the two parameters. Again, we ask how the clusters evolve as the parameters change. This extension is called multi-parameter persistent homology.
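A two-parameter version of the example can be sketched in the same spirit. The severity scale (0 = asymptomatic, 1 = symptomatic, 2 = severely ill), the data, and the function name `count_clusters_2d` are hypothetical and serve only to illustrate how each pair of parameter values yields its own clustering.

```python
def count_clusters_2d(severity, contact_minutes, min_severity, min_contact):
    """Count clusters among patients with severity >= `min_severity`,
    linking two of them if their contact lasted >= `min_contact` minutes."""
    kept = [p for p, s in enumerate(severity) if s >= min_severity]
    parent = {p: p for p in kept}

    def find(p):
        # Follow parent pointers to the cluster representative.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for (a, b), minutes in contact_minutes.items():
        # Only contacts between two retained patients count.
        if a in parent and b in parent and minutes >= min_contact:
            parent[find(a)] = find(b)

    return len({find(p) for p in kept})


# Hypothetical data: severity per patient and pairwise contact minutes.
severity = [2, 0, 1, 2, 1]
contact_minutes = {(0, 1): 25, (1, 2): 12, (2, 3): 7, (3, 4): 3}

# One cluster count for every combination of the two parameters.
grid = {
    (s, t): count_clusters_2d(severity, contact_minutes, s, t)
    for s in (0, 1, 2)
    for t in (5, 10, 20)
}
for (s, t), count in sorted(grid.items()):
    print(f"min severity {s}, min contact {t} min: {count} clusters")
```

This brute-force grid only counts clusters at a few sampled parameter pairs; it does not describe how clusters at one pair relate to those at another, which is the harder structural question.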

When passing to multiple parameters, we face a problem: the object of analysis (in the example, the combination of all clusterings over all choices of contact duration and severity) does not have a simple mathematical description, which seriously complicates the interpretation of the results. Still, several breakthroughs in recent years indicate that analyzing data sets with several parameters is possible. However, these works are mostly restricted to a purely mathematical point of view and neglect the important aspect of practicality, that is, how to compute the results fast enough.

The goal of this project is to develop tools (i.e., computer programs) that make multi-parameter analysis of big data sets possible. Besides the design and implementation of fast algorithms, the basic mathematical structure of the investigated objects is also important. We expect that the results of this project will be useful for application-oriented research.

Status | Not started |
---|---|
Effective start/end date | 1/08/21 → 31/07/25 |