Microkernel Notebooks

  • Added: 27 Aug 2024
  • Speaker: Oliver Kennedy, Associate Professor at the University at Buffalo
    Reproducibility is critical for effective, interpretable, and equitable data science, and notebooks are a step in the right direction: they help data scientists keep track of how they arrived at specific results. Yet criticisms remain, from one study of notebooks on GitHub that found only 10% exhibited reproducible behavior, to talks like Joel Grus's "I don't like notebooks," which points out how out-of-order execution confuses new users.
    In this talk, Oliver introduces a new "microkernel" notebook architecture that addresses many of these concerns. Inspired by workflow systems, microkernel notebooks (like our system, Vizier) allocate a small, lightweight kernel per cell and handle data flow between cells through message passing and data-interop frameworks like Arrow. With the same friendly programming model as Jupyter or Zeppelin, microkernel notebooks can automatically re-run stale cells, revert notebook state, instantly resume after a reboot, and seamlessly share data across languages or Python versions.
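To make the "re-run stale cells" idea concrete, here is a minimal Python sketch of dataflow-driven staleness tracking. It is not Vizier's implementation or API: the `Cell` and `Notebook` classes, the `reads`/`writes` declarations, and the in-memory `artifacts` dictionary are all hypothetical stand-ins. A real microkernel notebook would presumably run each cell in its own short-lived process and pass artifacts in a serialized form (for example via Arrow) rather than as Python objects.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Set


@dataclass
class Cell:
    """One notebook cell (hypothetical model, not Vizier's)."""
    name: str
    reads: List[str]    # artifact names this cell consumes
    writes: List[str]   # artifact names this cell produces
    run: Callable[[Dict[str, Any]], Dict[str, Any]]


class Notebook:
    def __init__(self, cells: List[Cell]) -> None:
        self.cells = list(cells)                  # kept in notebook order
        self.artifacts: Dict[str, Any] = {}       # stand-in for a shared artifact store
        self.stale: Set[str] = {c.name for c in self.cells}
        self.log: List[str] = []                  # names of cells actually executed

    def edit(self, name: str, run: Callable[[Dict[str, Any]], Dict[str, Any]]) -> None:
        """Replace a cell's code; mark it and every downstream reader stale."""
        dirty: Set[str] = set()                   # artifacts whose values may change
        for cell in self.cells:
            if cell.name == name:
                cell.run = run
                self.stale.add(cell.name)
                dirty.update(cell.writes)
            elif dirty & set(cell.reads):
                self.stale.add(cell.name)
                dirty.update(cell.writes)

    def refresh(self) -> None:
        """Re-run only the stale cells, in notebook order."""
        for cell in self.cells:
            if cell.name not in self.stale:
                continue
            # The cell sees only its declared inputs, never another cell's globals.
            inputs = {key: self.artifacts[key] for key in cell.reads}
            outputs = cell.run(inputs)
            self.artifacts.update({key: outputs[key] for key in cell.writes})
            self.stale.discard(cell.name)
            self.log.append(cell.name)


nb = Notebook([
    Cell("load", [], ["raw"], lambda a: {"raw": [1, 2, 3]}),
    Cell("scale", ["raw"], ["scaled"], lambda a: {"scaled": [x * 10 for x in a["raw"]]}),
    Cell("label", [], ["title"], lambda a: {"title": "demo"}),
    Cell("total", ["scaled"], ["total"], lambda a: {"total": sum(a["scaled"])}),
])
nb.refresh()   # first run executes every cell; total is 60

# Editing "scale" invalidates "scale" and "total", but not "load" or "label".
nb.edit("scale", lambda a: {"scaled": [x * 2 for x in a["raw"]]})
nb.refresh()
print(nb.artifacts["total"])   # 12
print(nb.log[4:])              # ['scale', 'total']
```

Because each cell touches only the artifacts it declares, execution order is fixed by the notebook rather than by the order in which a user happened to click "run", which is the out-of-order confusion the abstract mentions.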
