ExplainED: Explanations for EDA Notebooks

Abstract

Exploratory Data Analysis (EDA) is an essential yet highly demanding task. To get a head start before exploring a new dataset, data scientists often prefer to view existing EDA notebooks: illustrative exploratory sessions created by fellow data scientists who examined the same dataset and shared their notebooks via online platforms. Unfortunately, creating an illustrative, well-documented notebook is cumbersome and time-consuming, so users sometimes share their notebooks without explaining their exploratory steps or results. Such notebooks are difficult to follow and understand. To address this, we present ExplainED, a system that automatically attaches explanations to views in EDA notebooks. ExplainED analyzes each view to detect which of its elements are particularly interesting, and produces a corresponding textual explanation. The explanations are generated by first evaluating the interestingness of the given view using several measures that capture different facets of interestingness, then computing the Shapley values of the elements in the view with respect to the interestingness measure that yields the highest score. These Shapley values are then used to guide the generation of the textual explanation. We demonstrate the usefulness of the explanations generated by ExplainED on real-life, undocumented EDA notebooks.
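
The pipeline described in the abstract (score a view with several measures, keep the highest-scoring one, then attribute that score to the view's elements) can be illustrated with a minimal sketch. This is not ExplainED's implementation: the two measures (spread and skew), the toy view, and every function name below are hypothetical, raw scores are compared without the normalization a real system would likely need, and the Shapley values are computed by exact enumeration, which is only practical for a handful of elements.

```python
from itertools import combinations
from math import factorial
from statistics import mean, median

# A toy "view": (label, aggregated value) rows, e.g. the output of a group-by.
VIEW = [("A", 10.0), ("B", 12.0), ("C", 50.0)]

def spread(rows):
    """Hypothetical measure: range of the values (0 for fewer than 2 rows)."""
    values = [v for _, v in rows]
    return max(values) - min(values) if len(values) >= 2 else 0.0

def skew(rows):
    """Hypothetical measure: distance between mean and median (0 if empty)."""
    values = [v for _, v in rows]
    return abs(mean(values) - median(values)) if values else 0.0

# Assumed stand-ins for "measures capturing different interestingness facets".
MEASURES = {"spread": spread, "skew": skew}

def shapley_values(rows, measure):
    """Exact Shapley value of each row with respect to measure(subset of rows)."""
    n = len(rows)
    result = {}
    for i, row in enumerate(rows):
        others = rows[:i] + rows[i + 1:]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain the row.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                coalition = list(subset)
                # Marginal contribution of the row to this coalition.
                total += weight * (measure(coalition + [row]) - measure(coalition))
        result[row[0]] = total
    return result

def explain_view(rows):
    """Pick the highest-scoring measure, then attribute its score to the rows."""
    scores = {name: fn(rows) for name, fn in MEASURES.items()}
    best = max(scores, key=scores.get)
    phi = shapley_values(rows, MEASURES[best])
    top = max(phi, key=phi.get)
    text = f"Row '{top}' contributes most to the view's {best} score."
    return best, phi, text

best_measure, contributions, explanation = explain_view(VIEW)
print(best_measure, contributions)
print(explanation)
```

In this sketch the per-row values sum to the chosen measure's score on the full view (the efficiency property of Shapley values), which is what makes them a natural guide for deciding which elements the textual explanation should mention.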

Publication
In PVLDB 13(12)

Related