FWF-VAD - Visual Analysis of Heterogeneous Data using Semantic Subsets

  • Lex, Alexander (Co-Investigator (CoI))

Project: Research project

Description

Analyzing and understanding very large and heterogeneous datasets is a fundamental challenge researchers face in
many scientific domains. Disciplines such as astronomy, physics and biology have to deal with datasets of an
unprecedented scale and complexity. While analyzing these datasets is challenging, they also have the potential to
revolutionize our understanding of the underlying processes.
To realize this potential, novel analysis approaches have to be developed in all fields of the data sciences. In this
proposal for an Erwin Schrödinger fellowship I introduce semantic subsets as a novel method for the visual
analysis of large, heterogeneous, and multiple datasets. I propose to leverage machine learning, statistical and other
methods to first partition datasets into meaningful subsets, and then use a tight integration of computational and
visualization methods to support experts in choosing subsets relevant to a task. These subsets and their
relationships are then visualized, facilitating an open, exploratory analysis of the data. The core research challenges
addressed in this proposal are how to efficiently and effectively find suitable subsets, manage multiple subsets, and
visualize the relationships between them.
I argue that this approach is suitable to address the problems posed by the analysis of multiple large and
heterogeneous datasets, as it scales well, is highly flexible, and naturally integrates multiple datasets.
I intend to develop prototypes realizing the semantic subsets concept for the analysis of biomolecular data in design
studies. These applications will be the product of a user-centered design process involving close collaboration with
domain experts. The applications will address the domain experts' data analysis problems and aid them in their
scientific discovery process. The formal evaluation of the utility of the approach will be conducted using case
studies based on longitudinal observations of the deployed applications in addition to controlled user studies.
I plan to conduct this research at the Visual Computing Group at Harvard University, led by Professor Hanspeter
Pfister. Professor Pfister and his group have considerable expertise in developing visualization methods for
molecular biology. In addition, the greater Boston area is home to many top-tier molecular biology research labs,
including the Harvard Medical School and the Broad Institute of MIT and Harvard, to which Professor Pfister and
I have established ties. This environment is therefore uniquely suited to the proposed research.
During the planned return phase at the Institute for Computer Graphics and Vision at Graz University of
Technology, I will not only be able to pass on the knowledge I have gained to my peers and students, but will also be
able to support Professor Schmalstieg in his agenda of building a strong data visualization group in Graz and
thereby strengthen the already sizable Austrian visualization research community.
Status: Finished
Effective start/end date: 1/06/15 → 31/05/16