Combining spreadsheet smells for improved fault prediction

Patrick Koch, Konstantin Schekotihin, Dietmar Jannach, Birgit Hofer, Franz Wotawa, Thomas Schmitz

Publication: Contribution to book/report/conference proceedings, conference paper, research, peer-reviewed

Abstract

Spreadsheets are commonly used in organizations as a programming tool for business-related calculations and decision making. Since faults in spreadsheets can have severe business impacts, a number of approaches from general software engineering have been applied to spreadsheets in recent years, among them the concept of code smells. Smells can be used in particular for fault prediction. An analysis of existing spreadsheet smells, however, revealed that the predictive power of individual smells can be limited. In this work we therefore propose a machine-learning-based approach that combines the predictions of individual smells using an AdaBoost ensemble classifier. Experiments on two public datasets containing real-world spreadsheet faults show significant improvements in fault prediction accuracy.
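The idea of combining weak smell-based predictors with AdaBoost can be illustrated with a minimal sketch. This is not the authors' implementation: the three binary smell flags, the toy data, and the function names `stump`, `train_adaboost`, `predict`, and `single_smell_accuracy` are all hypothetical, and the weak learners are assumed to be simple one-smell decision stumps. In the toy data a cell is labeled faulty when at least two smells fire, so no single smell is a perfect predictor on its own.

```python
import math

def stump(row, j, polarity):
    """Weak learner: vote `polarity` if smell j fires, the opposite otherwise."""
    return polarity if row[j] == 1 else -polarity

def train_adaboost(X, y, rounds=3):
    """X: rows of 0/1 smell flags; y: +1 (faulty) or -1 (clean)."""
    n, m = len(X), len(X[0])
    w = [1.0 / n] * n              # start with uniform sample weights
    ensemble = []                  # list of (alpha, smell index, polarity)
    for _ in range(rounds):
        # pick the stump with the lowest weighted error
        best = None
        for j in range(m):
            for polarity in (1, -1):
                err = sum(wi for wi, row, yi in zip(w, X, y)
                          if stump(row, j, polarity) != yi)
                if best is None or err < best[0]:
                    best = (err, j, polarity)
        err, j, polarity = best
        err = min(max(err, 1e-10), 1 - 1e-10)   # avoid log(0)
        if err >= 0.5:             # no stump beats chance; stop early
            break
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, j, polarity))
        # raise the weight of misclassified samples, lower the rest
        w = [wi * math.exp(-alpha * yi * stump(row, j, polarity))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, row):
    """Weighted vote of all stumps in the ensemble."""
    score = sum(alpha * stump(row, j, p) for alpha, j, p in ensemble)
    return 1 if score >= 0 else -1

def single_smell_accuracy(X, y, j):
    """Accuracy of predicting 'faulty' whenever smell j fires."""
    return sum(stump(row, j, 1) == yi for row, yi in zip(X, y)) / len(X)

# Hypothetical toy data: faulty (+1) when at least two of three smells fire.
X = [[0, 0, 0], [0, 0, 1], [0, 1, 0], [1, 0, 0],
     [0, 1, 1], [1, 0, 1], [1, 1, 0], [1, 1, 1]]
y = [-1, -1, -1, -1, 1, 1, 1, 1]

model = train_adaboost(X, y, rounds=3)
ensemble_accuracy = sum(predict(model, r) == t for r, t in zip(X, y)) / len(X)
best_single = max(single_smell_accuracy(X, y, j) for j in range(3))
print(best_single, ensemble_accuracy)
```

On this toy set each smell alone misclassifies two of the eight rows, while the boosted vote over three stumps classifies all of them correctly. Results on real spreadsheet corpora depend on the smells and data used.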

Original language: English
Title: Proceedings 2018 ACM/IEEE 40th International Conference on Software Engineering
Subtitle: New Ideas and Emerging Results, ICSE-NIER 2018
Publisher: IEEE Computer Society, 1998
Pages: 25-28
Number of pages: 4
ISBN (electronic): 9781450356626
DOI: https://doi.org/10.1145/3183399.3183402
Publication status: Published - 27 May 2018
Event: 40th ACM/IEEE International Conference on Software Engineering: New Ideas and Emerging Results, ICSE-NIER 2018 - Gothenburg, Sweden
Duration: 30 May 2018 to 1 Jun 2018

Conference

Conference: 40th ACM/IEEE International Conference on Software Engineering: New Ideas and Emerging Results, ICSE-NIER 2018
Country: Sweden
City: Gothenburg
Period: 30/05/18 to 1/06/18

Fingerprint

Spreadsheets
Adaptive boosting
Learning systems
Software engineering
Industry
Classifiers
Decision making
Experiments

Keywords

    ASJC Scopus subject areas

    • Software

    Cite this

    Koch, P., Schekotihin, K., Jannach, D., Hofer, B., Wotawa, F., & Schmitz, T. (2018). Combining spreadsheet smells for improved fault prediction. In Proceedings 2018 ACM/IEEE 40th International Conference on Software Engineering: New Ideas and Emerging Results, ICSE-NIER 2018 (pp. 25-28). IEEE Computer Society, 1998. https://doi.org/10.1145/3183399.3183402

    @inproceedings{fa2a308764c84143bb9cad975e405840,
    title = "Combining spreadsheet smells for improved fault prediction",
    abstract = "Spreadsheets are commonly used in organizations as a programming tool for business-related calculations and decision making. Since faults in spreadsheets can have severe business impacts, a number of approaches from general software engineering have been applied to spreadsheets in recent years, among them the concept of code smells. Smells can in particular be used for the task of fault prediction. An analysis of existing spreadsheet smells, however, revealed that the predictive power of individual smells can be limited. In this work we therefore propose a machine learning based approach which combines the predictions of individual smells by using an AdaBoost ensemble classifier. Experiments on two public datasets containing real-world spreadsheet faults show significant improvements in terms of fault prediction accuracy.",
    keywords = "Fault Prediction, Spreadsheet QA, Spreadsheet Smells",
    author = "Patrick Koch and Konstantin Schekotihin and Dietmar Jannach and Birgit Hofer and Franz Wotawa and Thomas Schmitz",
    year = "2018",
    month = "5",
    day = "27",
    doi = "10.1145/3183399.3183402",
    language = "English",
    pages = "25--28",
    booktitle = "Proceedings 2018 ACM/IEEE 40th International Conference on Software Engineering",
    publisher = "IEEE Computer Society, 1998",
    }

    TY - GEN

    T1 - Combining spreadsheet smells for improved fault prediction

    AU - Koch, Patrick

    AU - Schekotihin, Konstantin

    AU - Jannach, Dietmar

    AU - Hofer, Birgit

    AU - Wotawa, Franz

    AU - Schmitz, Thomas

    PY - 2018/5/27

    Y1 - 2018/5/27

    N2 - Spreadsheets are commonly used in organizations as a programming tool for business-related calculations and decision making. Since faults in spreadsheets can have severe business impacts, a number of approaches from general software engineering have been applied to spreadsheets in recent years, among them the concept of code smells. Smells can in particular be used for the task of fault prediction. An analysis of existing spreadsheet smells, however, revealed that the predictive power of individual smells can be limited. In this work we therefore propose a machine learning based approach which combines the predictions of individual smells by using an AdaBoost ensemble classifier. Experiments on two public datasets containing real-world spreadsheet faults show significant improvements in terms of fault prediction accuracy.

    KW - Fault Prediction

    KW - Spreadsheet QA

    KW - Spreadsheet Smells

    UR - http://www.scopus.com/inward/record.url?scp=85049772205&partnerID=8YFLogxK

    U2 - 10.1145/3183399.3183402

    DO - 10.1145/3183399.3183402

    M3 - Conference contribution

    SP - 25

    EP - 28

    BT - Proceedings 2018 ACM/IEEE 40th International Conference on Software Engineering

    PB - IEEE Computer Society, 1998

    ER -