CSE researchers present new findings at OOPSLA and SOSP

Several researchers in CSE are presenting papers at the two conferences on programming languages, operating systems, and more.
Computer screen with multicolored code

CSE researchers will present new findings at two upcoming conferences: Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA) and the ACM Symposium on Operating Systems Principles (SOSP).

OOPSLA, taking place October 25-27 in Lisbon, Portugal, is organized by the ACM Special Interest Group on Programming Languages (SIGPLAN) and is now part of SPLASH (Systems, Programming, Languages, and Applications: Software for Humanity). The annual conference is a leading venue for researchers specializing in various facets of programming languages and software engineering to come together and share the latest findings in these areas. 

SOSP is a top conference for academic and industrial researchers focused on systems technology. It brings together innovators from across the globe to present new findings on topics spanning the theory and practice of computer systems and software. It will be held October 23-26 in Koblenz, Germany.

Papers presented at OOPSLA

The two papers by U-M researchers appearing at OOPSLA are as follows, with authors affiliated with CSE in bold:

Gradual Typing for Effect Handlers

Max S. New, Eric Giovannini, Daniel R. Licata

Abstract: We present a gradually typed language, GrEff, with effects and handlers that supports migration from unchecked to checked effect typing. This serves as a simple model of the integration of an effect typing discipline with an existing effectful typed language that does not track fine-grained effect information. Our language supports a simple module system to model the programming model of gradual migration from unchecked to checked effect typing in the style of Typed Racket.

The surface language GrEff is given semantics by elaboration to a core language Core GrEff. We equip Core GrEff with an inequational theory for reasoning about the semantic error ordering and desired program equivalences for programming with effects and handlers. We derive an operational semantics for the language from the equations provable in the theory. We then show that the theory is sound by constructing an operational logical relations model to prove the graduality theorem. This extends prior work on embedding-projection pair models of gradual typing to handle effect typing and subtyping.

Live Pattern Matching with Typed Holes

Yongwei Yuan, Scott Guest, Eric Griffis, Hannah Potter, David Moon, Cyrus Omar

Abstract: Several modern programming systems, including GHC Haskell, Agda, Idris, and Hazel, support typed holes. Assigning static and, to varying degree, dynamic meaning to programs with holes allows program editors and other tools to offer meaningful feedback and assistance throughout editing, i.e. in a live manner. Prior work, however, has considered only holes appearing in expressions and types. This paper considers, from type theoretic and logical first principles, the problem of typed pattern holes. We confront two main difficulties, (1) statically reasoning about exhaustiveness and irredundancy when patterns are not fully known, and (2) live evaluation of expressions containing both pattern and expression holes. In both cases, this requires reasoning conservatively about all possible hole fillings. We develop a typed lambda calculus, Peanut, where reasoning about exhaustiveness and redundancy is mapped to the problem of deriving first order entailments. We equip Peanut with an operational semantics in the style of Hazelnut Live that allows us to evaluate around holes in both expressions and patterns. We mechanize the metatheory of Peanut in Agda and formalize a procedure capable of deciding the necessary entailments. Finally, we scale up and implement these mechanisms within Hazel, a programming environment for a dialect of Elm that automatically inserts holes during editing to provide static and dynamic feedback to the programmer in a maximally live manner, i.e. for every possible editor state. Hazel is the first maximally live environment for a general-purpose functional language.

Figure from paper showing examples of exhaustiveness checking. Example a shows indeterminately exhaustive checking, b shows necessarily inexhaustive checking, and c shows necessarily exhaustive checking.
Figure 2 from the above paper showing examples of exhaustiveness checking with pattern holes

Papers presented at SOSP

The three papers by U-M researchers being presented at SOSP are as follows, with CSE-affiliated authors in bold:

Pushing Performance Isolation Boundaries into Application with pBox

Yigong Hu, Gongqi Huang, Peng (Ryan) Huang

Abstract: 

Modern applications are highly concurrent with a diverse mix of activities. One activity can adversely impact the performance of other activities in an application, leading to intra-application interference. Providing fine-grained performance isolation is desirable. Unfortunately, the extensive performance isolation solutions today focus on mitigating coarse-grained interference among multiple applications. They cannot well address intra-app interference, because such issues are typically not caused by contention on hardware resources.

This paper presents an abstraction called pBox for developers to systematically achieve strong performance isolation within an application. Our insight is that intra-app interference involves application-level virtual resources, which are often invisible to the OS. We define pBox APIs that allow an application to inform the OS about a few general types of state events. Leveraging this information, we design algorithms that effectively predict imminent interference and carefully apply penalties to the noisy pBoxes to achieve a specified isolation goal. We apply pBox on five large applications. We evaluate the pBox-enhanced applications with 16 real-world performance interference cases. pBox successfully mitigates 15 cases, with an average of 86.3% reduction of the interference.

Oobleck: Resilient Distributed Training of Large Models Using Pipeline Templates

Insu Jang, Zhenning Yang, Zhen Zhang, Xin Jin, Mosharaf Chowdhury

Abstract: Oobleck enables resilient distributed training of large DNN models with guaranteed fault tolerance. It takes a planning-execution co-design approach, where it first generates a set of heterogeneous pipeline templates and instantiates at least f + 1 logically equivalent pipeline replicas to tolerate any f simultaneous failures. During execution, it relies on already-replicated model states across the replicas to provide fast recovery. Oobleck provably guarantees that some combination of the initially created pipeline templates can be used to cover all available resources after f or fewer simultaneous failures, thereby avoiding resource idling at all times. Evaluation on large DNN models with billions of parameters shows that Oobleck provides consistently high throughput, and it outperforms state-of-the-art fault tolerance solutions like Bamboo and Varuna by up to 13.9×.

Oobleck decouples planning (generating pipeline templates) from execution (pipeline instantiation) to enable fast failure recovery, as well as high throughput. It first generates a set of heterogeneous pipeline templates that cover all available GPU nodes, then instantiates pipelines for actual execution from the templates.

Siloz: Leveraging DRAM Isolation Domains to Prevent Inter-VM Rowhammer

Kevin Loughlin, Jonah Rosenblum, Stefan Saroiu, Alec Wolman, Dimitrios Skarlatos, Baris Kasikci

Abstract: Today’s cloud DRAM lacks strong isolation primitives, highlighted by Rowhammer bit flips. Rowhammer poses an increasing threat to cloud security/reliability, given (1) DRAM activation rates in commodity and malicious workloads already exceed Rowhammer thresholds, and (2) thresholds are decreasing in newer DRAM. Deployed hardware mitigations remain vulnerable, turning cloud providers toward software defenses. However, existing defenses incur high performance or memory overhead or contain significant protection gaps.

Accordingly, we introduce Siloz, a hypervisor that uses subarray groups as DRAM isolation domains to enable efficient protection against inter-VM Rowhammer. Siloz exploits the insights that (a) Rowhammer can only flip bits in DRAM rows located in the same subarray—not across subarrays—and (b) VMs can be isolated to groups of subarrays without sacrificing bank-level parallelism, a key component of DRAM performance. Siloz thus prevents inter-VM bit flips by placing each VM’s and the host’s data into private subarray groups. To additionally ensure that a VM cannot escape its provisioned subarray group(s), Siloz provides integrity protection for extended page tables (EPTs). We show that Siloz’s implementation has negligible effect on average performance across various cloud workloads, SPEC CPU 2017, and PARSEC 3.0 (within ±0.5% of baseline Linux/KVM).