Burcu Kulahcioglu Ozkan: From Formal Methods to Testing of Distributed Systems
Abstract: Our recent works aim to narrow the gap between formal verification and randomized testing of distributed systems, bringing together the effectiveness of formal methods and the practicality of randomized testing. In this talk, I will give an overview of the key ideas in these works and discuss how transferring ideas from formal methods and verification improves the effectiveness of randomized test generation.
Umang Mathur: Algorithmic techniques for scaling concurrency testing
Abstract: Concurrent software is ubiquitous in modern computing systems, yet writing reliable concurrent programs remains hard. Data race detectors serve as the first line of defense, helping developers identify synchronization errors that could lead to subtle and catastrophic failures.
However, even the most well-engineered race detectors impose significant runtime overhead, making them impractical for use in production environments, precisely where many real bugs manifest.
In this talk, I will discuss three orthogonal algorithmic approaches to tackling this challenge: sampling-based techniques with probabilistic guarantees, grammar-based compression for offline analysis, and the tree clock data structure that offers optimal performance for happens-before analysis. Each approach provides unique theoretical benefits, from constant-factor sampling guarantees to optimal time complexity for causal analysis, while our empirical evaluations suggest substantial performance improvements in practice.
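As background for the third approach, here is a minimal sketch of the classic happens-before analysis that tree clocks accelerate, using plain vector clocks; the trace format and names are illustrative, not taken from the speaker's tools.

```python
from collections import defaultdict

def hb_races(trace, num_threads):
    """trace: list of (tid, op, target); op in {'acq', 'rel', 'rd', 'wr'}."""
    vc = [defaultdict(int) for _ in range(num_threads)]   # vector clock per thread
    for t in range(num_threads):
        vc[t][t] = 1
    lock_vc = {}                                          # lock -> clock at last release
    accesses = defaultdict(list)                          # var -> [(snapshot, tid, op)]
    races = []
    for tid, op, x in trace:
        if op == 'acq' and x in lock_vc:                  # acquire: join releaser's clock
            for k, v in lock_vc[x].items():
                vc[tid][k] = max(vc[tid][k], v)
        elif op == 'rel':                                 # release: publish clock, then tick
            lock_vc[x] = dict(vc[tid])
            vc[tid][tid] += 1
        elif op in ('rd', 'wr'):
            for snap, ptid, pop in accesses[x]:
                conflict = ptid != tid and 'wr' in (op, pop)
                ordered = snap.get(ptid, 0) <= vc[tid][ptid]   # prior access HB-before us?
                if conflict and not ordered:
                    races.append((x, ptid, tid))
            accesses[x].append((dict(vc[tid]), tid, op))
    return races

# Two unsynchronized writes to x race; the writes to y are ordered via lock m.
trace = [(0, 'wr', 'x'), (1, 'wr', 'x'),
         (0, 'acq', 'm'), (0, 'wr', 'y'), (0, 'rel', 'm'),
         (1, 'acq', 'm'), (1, 'wr', 'y'), (1, 'rel', 'm')]
print(hb_races(trace, 2))   # -> [('x', 0, 1)]
```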
Constantin Enea: The Power of Forward Reasoning in Non-Blocking Concurrency Verification
Abstract: Linearizability is the standard correctness criterion for concurrent objects. It ensures that a concurrent object is a refinement of an atomic reference implementation. In general, proving such a refinement requires both forward and backward reasoning, or equivalently, the use of both history and prophecy variables. However, prophecy variables are notoriously challenging to work with and are not supported by all deductive verification tools.
In this talk, we present several results that connect the necessity of backward reasoning with non-blocking progress conditions such as lock-freedom and wait-freedom. For example, we show that no lock-free implementation of a concurrent queue or stack -- based solely on restricted atomic read-write operations (with consensus number 2) -- can be verified using forward reasoning alone.
Bertrand Meyer: PRISM: Programming Really Is Simple Mathematics
Abstract: A reconstruction of the fundamentals of programming as a small mathematical theory (PRISM) based on elementary set theory. Highlights:
Zero axioms. No properties are assumed; all are proved (from standard set theory).
A single concept covers specifications and programs.
Its definition involves only one relation and one set.
Everything proceeds from three operations: choice, composition and restriction.
These techniques suffice to derive the axioms of classic papers on the "laws of programming" as consequences and prove them mechanically.
The ordinary subset operator suffices to define both the notion of program correctness and the concepts of specialization and refinement.
From this basis, the theory deduces dozens of theorems characterizing important properties of programs and programming.
All these theorems have been mechanically verified (using Isabelle/HOL); the proofs are available in a public repository. This paper is a considerable extension and rewrite of an earlier contribution.
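To convey the flavor of the set-theoretic style, here is one plausible rendering in notation of my own choosing; the theory itself pairs a relation with a set, whereas this sketch flattens programs to bare relations over a state set S.

```latex
% Illustrative only: a set-theoretic reading of the three operations and of
% refinement, for programs/specifications viewed as relations p, q over S.
\[
\begin{aligned}
  &p \cup q && \text{choice: either behavior may occur} \\
  &p \mathbin{;} q = \{(x,z) \mid \exists y.\ (x,y) \in p \wedge (y,z) \in q\} && \text{composition} \\
  &D \lhd p = p \cap (D \times S) && \text{restriction to a subdomain } D \subseteq S \\
  &p \sqsubseteq q \iff p \subseteq q && \text{refinement (and correctness) as ordinary subset}
\end{aligned}
\]
```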
Mihai Nicola: Automated Refinement of Concurrent Object Quotients
Abstract: Lock-free concurrent implementations of standard abstract data types leverage efficient hardware primitives for performance. However, identifying bugs in these implementations remains challenging, and formally verifying their correctness, typically via linearizability relative to a sequential specification, requires substantial and rigorous effort. A promising recent verification strategy called commutativity quotients first reduces an object's interleavings to a core representative set, but so far it has been unknown whether such quotients can be soundly derived automatically.
I will discuss a new automated method for deriving commutativity quotients through a novel refinement calculus. Quotients are derived through a series of refinement steps; in each step, we compute a fixpoint over a pattern of unboundedly many thread-action movements, obtaining Kleene-star or context-free summaries of the quotient. I will also briefly discuss our mechanization of the quotient construction within the Lean theorem prover and a set of general proof tactics, and demonstrate their applicability across a diverse collection of concurrent data structures.
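As a toy illustration of the quotient idea (not the paper's refinement calculus): enumerate all interleavings of two threads and keep representatives up to swapping adjacent commuting actions.

```python
def commutes(a, b):
    # illustrative independence: actions commute iff they come from different
    # threads and touch different variables
    return a[0] != b[0] and a[1] != b[1]

def canonical(trace):
    # Map a trace to an equivalent one by repeatedly swapping adjacent
    # commuting actions toward lexicographic order. Terminates (each swap is
    # strictly decreasing); in general this may keep more than one
    # representative per class, which is still a sound reduction.
    t = list(trace)
    changed = True
    while changed:
        changed = False
        for i in range(len(t) - 1):
            if commutes(t[i], t[i + 1]) and t[i + 1] < t[i]:
                t[i], t[i + 1] = t[i + 1], t[i]
                changed = True
    return tuple(t)

def interleavings(t1, t2):
    # all shuffles of two per-thread sequences, preserving program order
    if not t1: yield tuple(t2); return
    if not t2: yield tuple(t1); return
    for rest in interleavings(t1[1:], t2): yield (t1[0],) + rest
    for rest in interleavings(t1, t2[1:]): yield (t2[0],) + rest

t1 = [('T1', 'x', 'inc'), ('T1', 'y', 'inc')]   # (thread, variable, action)
t2 = [('T2', 'x', 'dec'), ('T2', 'z', 'inc')]
full = set(interleavings(t1, t2))
quotient = {canonical(t) for t in full}
print(len(full), '->', len(quotient))   # 6 -> 2 on this example
```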
Sreeja S Nair: Reasoning about a sparsely-connected, partially-replicated, peer-to-peer system
Abstract: For an offline-first collaborative application to operate in a true peer-to-peer fashion, its collaborative features must function even in environments where internet connectivity is limited or unavailable. Each peer may only be interested in a subset of the application data relevant to its workload, and this subset can overlap in different ways with those of other peers. Limitations imposed by access control and mesh network technologies often result in peers being sparsely connected. Reasoning about consistency in these systems is hard, especially when considering transactional updates that may alter different sets of data in the same transaction. We present some properties to reason about offline-first collaborative applications that are sparsely connected and rely on partially replicating different subsets of a broader set of data. We then use them to propose a set of guidelines to help developers design their applications with atomicity and consistency guarantees.
Rupak Majumdar: Reasoning about availability
Abstract: Availability is the property that a system responds to requests in a timely manner. In this talk, I will outline a research program on formal modeling and analysis for availability.
Samuel Grahn: Efficient Linearizability Monitoring
Abstract: We present efficient and novel algorithms for checking the linearizability of executions of concurrent data structure libraries such as stacks, queues and sets. For stacks, we construct a linearization recursively by detecting patterns that correspond to intervals of time during which the stack is populated or may be empty, and obtain an algorithm that is quadratic in the length of the input. For queues, we have proven a small-model theorem, from which we can extract an algorithm that is O(n log n). For (multi)sets, we show that it is sufficient to count certain events at each step, and thus obtain an algorithm that is O(n).
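For contrast with these specialized algorithms, here is a naive exponential baseline monitor, sketched under a toy FIFO-queue specification of my own choosing.

```python
from itertools import permutations

# Naive baseline: try every permutation of completed operations that respects
# real-time order and replay it against a sequential queue spec. Exponential,
# where the talk's algorithms are O(n^2), O(n log n), or O(n).

def linearizable(history):
    """history: list of (op, value, invoke_time, response_time, result)."""
    def respects_realtime(order):
        for i in range(len(order)):
            for j in range(i + 1, len(order)):
                # a later op must not respond before an earlier op is invoked
                if order[j][3] < order[i][2]:
                    return False
        return True

    def replays(order):                       # sequential FIFO queue semantics
        q = []
        for op, val, _, _, res in order:
            if op == 'enq':
                q.append(val)
            elif op == 'deq':
                if (q.pop(0) if q else None) != res:
                    return False
        return True

    return any(respects_realtime(p) and replays(p)
               for p in permutations(history))

h = [('enq', 1, 0, 2, None), ('enq', 2, 1, 3, None), ('deq', None, 4, 5, 1)]
print(linearizable(h))   # True: deq returning 1 is consistent with enq order
```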
Enrique Román Calvo: On the Complexity of Checking Mixed Isolation Levels for SQL Transactions
Abstract: Concurrent accesses to databases are typically grouped in transactions which define units of work that should be isolated from other concurrent computations and resilient to failures. Modern databases provide different levels of isolation for transactions that correspond to different trade-offs between consistency and throughput. Quite often, an application can use transactions with different isolation levels at the same time. In this work, we investigate the problem of testing isolation level implementations in databases, i.e., checking whether a given execution composed of multiple transactions adheres to the prescribed isolation level semantics. We particularly focus on transactions formed of SQL queries and the use of multiple isolation levels at the same time. We show that many restrictions of this problem are NP-complete and provide an algorithm which is exponential-time in the worst-case, polynomial-time in relevant cases, and practically efficient.
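As a toy glimpse of the checking problem (far simpler than the algorithms in the talk), here is a sketch that checks one ingredient of weak isolation, the absence of dirty reads, under the simplifying assumption that each (key, value) pair identifies a unique write.

```python
# Checks that every value a transaction reads was written by a transaction
# that had already committed at the time of the read.

def no_dirty_reads(log):
    """log: chronological events ('w', txn, key, val), ('r', txn, key, val),
    ('commit', txn) or ('abort', txn)."""
    writer = {}           # (key, val) -> txn that wrote it (values assumed unique)
    committed = set()
    for ev in log:
        if ev[0] == 'w':
            _, txn, key, val = ev
            writer[(key, val)] = txn
        elif ev[0] == 'r':
            _, txn, key, val = ev
            w = writer.get((key, val))
            if w is not None and w != txn and w not in committed:
                return False          # read an uncommitted write: dirty read
        elif ev[0] == 'commit':
            committed.add(ev[1])
    return True

# T2 reads x=1 before T1 commits: a dirty read.
bad = [('w', 'T1', 'x', 1), ('r', 'T2', 'x', 1), ('commit', 'T1'), ('commit', 'T2')]
print(no_dirty_reads(bad))   # False
```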
Gennaro Parlato: Verifying Tree-Manipulating Programs via CHCs
Abstract: In this talk, I will present a new, unified approach to analyzing programs that manipulate tree-shaped data structures, tasks that often demand complex, specialized, and hard-to-automate proofs. Central to our method is the 'knitted-tree' encoding, which models each program execution as a tree encompassing its input, output, and intermediate states. By leveraging the compositional properties of these knitted trees, we can encode them as constrained Horn clauses (CHCs) and then reduce verification to a CHC satisfiability problem. Although our primary focus is on memory safety, the approach naturally generalizes to preconditions, postconditions, termination, and program-specific contracts like insertion, deletion, and search. It can support deductive verification and can be extended to data structures with bounded treewidth.
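Not the knitted-tree encoding itself, but a minimal illustration of the target formalism: verification reduced to CHC satisfiability, discharged here with Z3's Horn solver (assumes the z3-solver package is installed).

```python
from z3 import Function, IntSort, BoolSort, Ints, ForAll, Implies, And, SolverFor

Inv = Function('Inv', IntSort(), BoolSort())   # unknown inductive invariant
x, x1 = Ints('x x1')

s = SolverFor('HORN')
s.add(ForAll([x], Implies(x == 0, Inv(x))))                          # init: x = 0
s.add(ForAll([x, x1], Implies(And(Inv(x), x1 == x + 2), Inv(x1))))   # loop: x += 2
s.add(ForAll([x], Implies(And(Inv(x), x < 0), False)))               # safety: x >= 0

print(s.check())   # sat: the CHC system has a model, so the program is safe
```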
Arnaud Sangnier: Counting Abstraction for the Verification of Structured Parameterized Networks
Abstract: We consider the verification of parameterized networks of replicated processes whose architecture is described by hyperedge-replacement graph grammars. To solve safety verification problems, we present a counting abstraction able to produce, from a graph grammar describing a parameterized system, a finite set of Petri nets that over-approximate the behaviors of the original system. The counting abstraction is implemented in a prototype tool, evaluated on a non-trivial set of test cases. This is joint work with Marius Bozga, Radu Iosif and Neven Villani.
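A hand-rolled toy, not the tool from the talk: the kind of Petri net a counting abstraction produces for a simple mutual-exclusion protocol, with places counting processes per local state and transitions moving tokens.

```python
from collections import deque

TRANSITIONS = [
    ({'idle': 1, 'lock': 1}, {'crit': 1}),     # acquire: consume the lock token
    ({'crit': 1}, {'idle': 1, 'lock': 1}),     # release: return the lock token
]

def successors(m):
    for pre, post in TRANSITIONS:
        if all(m.get(p, 0) >= k for p, k in pre.items()):
            m2 = dict(m)
            for p, k in pre.items():  m2[p] -= k
            for p, k in post.items(): m2[p] = m2.get(p, 0) + k
            yield frozenset(m2.items())

def safe(n):
    # explore all markings for n processes; check 'crit' never exceeds 1
    init = frozenset({'idle': n, 'lock': 1, 'crit': 0}.items())
    seen, todo = {init}, deque([init])
    while todo:
        m = dict(todo.popleft())
        if m.get('crit', 0) > 1:
            return False
        for m2 in successors(m):
            if m2 not in seen:
                seen.add(m2); todo.append(m2)
    return True

print(all(safe(n) for n in range(1, 6)))   # True: mutual exclusion holds
```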
Prakash Saivasan: String Constraints with Subword Ordering
Abstract: In this talk, we will consider a variant of string constraints given by membership constraints in context-free languages and the subword relation between variables. The satisfiability problem for this variant turns out to be undecidable. We will consider a fragment in which the subword-order constraints do not impose any cyclic dependency between variables and show that this fragment is decidable. As an application of our result, we will show that the complexity of control-state reachability in acyclic lossy channel pushdown systems is NEXPTIME-complete.
Cezara Dragoi: Data integrity in distributed systems at scale
Abstract: Data integrity ensures that customer bytes are not tampered with. In this talk, I will describe some challenges in designing, implementing, and operating distributed systems at large scale, focusing on the efforts required to ensure data integrity. At today's scale, everything that can happen will very likely happen, and I will describe how formal methods and verification can prevent it.
Karem Sakallah: Verifying Unbounded Protocol Specifications is (should be!) Easy
Abstract: A major challenge in the verification of unbounded distributed protocol specifications (also known as parameterized systems) is automating the search for quantified inductive invariants that serve as proof certificates of their safety properties. Unlike most recent work in this area, we take a radically different approach: algorithmically deriving a unique quantified formula R for a protocol's reachable states. This may be somewhat surprising, since finding compact formulas for the reachable states of non-parameterized irregular finite systems (e.g., hardware) is generally intractable. The intuition behind our approach is that the inherent regularity of a distributed protocol, namely its structural symmetry, must necessarily yield such a compact quantified first-order logic (FOL) formula for R.
Our starting point for deriving such a formula is the oft-cited cutoff phenomenon that many researchers have documented over the past several decades. Without loss of generality, assume a single domain whose size parameter is k. The key insight is that it should be possible to identify this cutoff by deriving a sequence of “special” finitely-quantified reachability formulas R1, R2, R3, ... at increasing domain sizes that converge, i.e., become size-independent, at cutoff. Intuitively, at any given size k, the protocol cannot exhibit fewer 'behaviors' than at sizes less than k. On the other hand, behaviors saturate at cutoff.
The secret sauce in our approach for deriving these reachability formulas is the use of a symmetry-enhanced extension of the classical logic minimization algorithm whose input, denoted as Reach[k], is the explicit (enumerated) set of reachable states at size k, and whose output is Rk expressed as a minimum-cost conjunction of FOL sentences, where cost (denoted as qCost) accounts for both the number of quantified variables in the prefix and the number of literals in the matrix. By construction, Rk implicitly captures the explicit set Reach[k]. Cutoff is reached when the evaluation of Rk at k+1 yields Reach[k+1], indicating that the sequence of reachability formulas has reached a fixed point, which we refer to as the Reachability Invariant. The rationale for using this terminology is the following. Clearly any superset of the reachable states is an invariant. On the other hand, at cutoff the invariants constructed by this minimization procedure are special in that their conjunction encodes the exact set of reachable states for any size!
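In symbols (notation mine), the convergence test reads:

```latex
% Reach[k] is the explicit reachable-state set at size k; eval(R_k, k+1)
% denotes evaluating the formula R_k over a domain of size k+1.
\[
\mathrm{eval}(R_k,\, k{+}1) \;=\; \mathit{Reach}[k{+}1]
\quad\Longrightarrow\quad
R_k \text{ is the fixed point (the Reachability Invariant).}
\]
```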
Looking ahead, we plan to explore the extension of this approach to protocol specifications that involve totally-ordered types that model state evolution over "time". Regularity in this temporal dimension can be viewed as "repetitiveness" (e.g., a repeating sequence of finite transactions) and suggests a notion of temporal cutoff that augments the symmetry-based spatial cutoff. We are also exploring the derivation of disjunctive formulas for R where each conjunct encodes an infinite set of states that represent a certain "phase" of protocol execution.
Eric Koskinen: Whole-Program Parallelization with Commutativity
Abstract: I will discuss a new way to exploit code commutativity to parallelize sequential programs. While prior work provided the ability to identify and exploit commutative function calls or adjacent code blocks, we introduce the first method that supports conditional commutativity of arbitrary regions of code. In this method, programmers give parameterized names to code blocks anywhere in the program, and provide commutativity expressions over those parameters and the global state. Novel algorithms exploit this commutativity and dependence information to automatically construct a program-specific parallelization scheme, yielding a collection of asynchronous tasks. These tasks are controlled by a run-time scheduler that uses commutativity conditions to dynamically control which tasks are executed in parallel. Commutativity conditions can often be verified automatically.
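A bare-bones sketch of the runtime idea (names and structure mine, not the paper's system): tasks carry parameterized labels, a user-supplied predicate says when two labeled tasks commute, and the scheduler runs a task in parallel only if it commutes with everything currently in flight.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

class CommutativityScheduler:
    def __init__(self, commutes, workers=4):
        self.commutes = commutes            # (label_a, label_b) -> bool
        self.in_flight = []                 # labels of running tasks
        self.cv = threading.Condition()
        self.pool = ThreadPoolExecutor(max_workers=workers)

    def submit(self, label, fn, *args):
        with self.cv:
            # block until this task commutes with every task currently running
            while not all(self.commutes(label, l) for l in self.in_flight):
                self.cv.wait()
            self.in_flight.append(label)
        def run():
            try:
                fn(*args)
            finally:
                with self.cv:
                    self.in_flight.remove(label)
                    self.cv.notify_all()
        return self.pool.submit(run)

# Example: deposits to different accounts commute; ops on one account don't.
balances = {'a': 0, 'b': 0}
def deposit(acct, amt): balances[acct] += amt

sched = CommutativityScheduler(lambda l1, l2: l1[1] != l2[1])
futures = [sched.submit(('deposit', acct), deposit, acct, 10)
           for acct in ['a', 'b', 'a']]
for f in futures: f.result()
sched.pool.shutdown()
print(balances)   # {'a': 20, 'b': 10}
```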
Dominik Klumpp: Finding Commutativity in Verification of Concurrent Programs
Abstract: The benefits of commutativity-based reduction in the verification of concurrent programs are well-established: By focusing on a representative subset of program interleavings, soundly selected based on a given commutativity relation between statements, the verification can succeed with simpler invariants and can avoid combinatorial explosion. Yet, there are many possible commutativity relations. In this talk, we focus on the setting of fully-automated algorithmic verification, and explore strategies to derive commutativity relations that are sound by construction. As the resulting commutativity relations are in general incomparable, we further discuss the potential of combining multiple commutativity relations, and the challenges for algorithmic realizations of such combinations.
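One simple point in this space, as a toy sketch of my own: the classic sound-by-construction relation that declares two statements commutative when their read/write footprints are disjoint. It is sound but coarse, and, as the talk notes, incomparable with finer (e.g., semantic) relations.

```python
class Stmt:
    def __init__(self, name, reads, writes):
        self.name, self.reads, self.writes = name, set(reads), set(writes)

def commute(s, t):
    # sound syntactic check: neither statement writes what the other touches
    return (s.writes.isdisjoint(t.writes)
            and s.writes.isdisjoint(t.reads)
            and t.writes.isdisjoint(s.reads))

inc_x = Stmt('x := x + 1', reads={'x'}, writes={'x'})
inc_y = Stmt('y := y + 1', reads={'y'}, writes={'y'})
read_x = Stmt('r := x',    reads={'x'}, writes={'r'})

print(commute(inc_x, inc_y))   # True: disjoint footprints
print(commute(inc_x, read_x))  # False syntactically; a semantic relation could
                               # still judge, say, two increments of x to commute
```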
Swen Jacobs: 01-Abstraction in Parameterized Verification
Abstract: Since parameterized verification aims at proving properties for an infinite family of system instances, it naturally considers an infinite state space.
Assuming that the local state space of a single process is finite, 01-abstraction is a finite abstraction of the infinite global state space: intuitively, it keeps track, for each local state, only of whether or not there *exists* at least one process in that state.
In this talk, I will present a sufficient condition for when 01-abstraction is precise for safety verification, and show that a surprising number of existing system models from the literature satisfy this condition.
Beyond giving a much simpler explanation for many existing results, 01-abstraction also enables the verification of novel system models.
Moreover, I will show how 01-abstraction can be used in combination with other abstraction techniques in systems where processes are modeled as timed automata or threshold automata.
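A minimal sketch of the abstraction (formulation mine): an abstract state is the set of local states occupied by at least one process, and a local rule a -> b yields two successors, since the moving process may or may not have been the last one in a.

```python
def successors(abstract_state, rules):
    """abstract_state: frozenset of local states; rules: set of (a, b) moves."""
    for a, b in rules:
        if a in abstract_state:
            yield abstract_state | {b}            # some other process stays in a
            yield (abstract_state - {a}) | {b}    # the mover was the last one in a

# Example: processes move idle -> waiting -> done, for any number of processes.
rules = {('idle', 'waiting'), ('waiting', 'done')}
init = frozenset({'idle'})
seen, frontier = {init}, [init]
while frontier:
    s = frontier.pop()
    for t in successors(s, rules):
        if t not in seen:
            seen.add(t); frontier.append(t)
print(sorted(tuple(sorted(s)) for s in seen))   # all reachable abstract states
```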
Ruzica Piskac: Efficient and Distributed Privacy-Preserving Verification
Abstract: Zero-knowledge (ZK) protocols are well-known cryptographic primitives that allow one party to prove a statement to another party without revealing anything beyond the statement itself. A ZK protocol consists of two parties: a "prover" and a "verifier". In our work, the prover holds a secret formula and its proof of validity and needs to convince the verifier of the correctness of the proof. The verifier validates the prover's claims by checking every step of the proof. However, the high overhead of zero-knowledge proofs (ZKPs) can restrict their practical applicability. In this talk, we describe our design of a programming language, Ou, aimed at easing the programmer's burden when writing efficient ZKPs. We also developed a compiler framework that automates the analysis and distribution of statements to a computing cluster. Our framework uses programming language semantics, formal methods, and combinatorial optimization to automatically partition an Ou program into efficiently sized chunks for parallel ZK proving and/or verification.
Klaus v. Gleissenthall: Pantomime: Simulation-Based Leakage Proofs for Hardware Side-Channel Security
Abstract: Tools for verifying leakage descriptions of hardware aim to ensure that a given hardware design doesn't leak secrets via its microarchitecture when running programs with appropriate countermeasures. However, current techniques either don't allow for leakage descriptions expressive enough to capture real-world software countermeasures like constant-time programming, or they rely on expensive solvers and handwritten invariants, making them difficult to apply to larger designs, especially for hardware designers who are not experts in formal verification. In this talk, I will present a new approach to leakage verification: simulation-based leakage proofs, where proofs are developed alongside the design.
Inspired by techniques in cryptography, simulation-based leakage proofs show that a leakage description correctly captures a hardware design by constructing a simulator: another hardware design that must faithfully replicate all attacker-observable behavior from the explicitly leaked secrets. Simulation-based leakage proofs are easy to write and to debug: writing a simulator just means writing another hardware module, which is what hardware designers are already best at. They also capture all common software defenses and allow for fast checking via a simple SMT query.
We implemented simulation-based leakage proofs in Pantomime, a tool that supports writing processors (and their leakage proofs) in Haskell; we report on writing and verifying a 5-stage in-order processor with Pantomime. Unlike previous leakage proofs, which are conditional on the (unproven) functional correctness of the CPU against an ISA-level specification, our proofs hold unconditionally.
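A drastically simplified rendering of the proof obligation (model and names mine, not Pantomime's): a toy design whose attacker-visible trace must be reproducible by a simulator from the declared leak alone; exhaustive checking stands in for the SMT query.

```python
from itertools import product

def design(secret, public):
    # observation: cycle count of a toy multiply; it depends only on the
    # public operand, i.e., a constant-time-style design
    return [1 + bin(public).count('1')]

def leakage(secret, public):
    # leakage description: this design declares that it leaks nothing secret
    return []

def simulator(leak, public):
    # must reconstruct the attacker-visible trace from leak + public data only
    return [1 + bin(public).count('1')]

ok = all(simulator(leakage(s, p), p) == design(s, p)
         for s, p in product(range(16), repeat=2))
print(ok)   # True: observations are fully explained without the secret
```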
Swarat Chaudhuri: Evolution, Abstraction, and Discovery with Large Language Models
Suresh Jagannathan: Type-Guided Synthesis of Distributed Test Controllers
Abstract: In a distributed system, a test controller manages the order in which messages are handled by actors, and the payloads of external components that are not under test. Manually constructing effective controllers is quite challenging, given that the executions that lead to violations of important safety or liveness properties represent a vanishingly small fraction of all possible behaviors the system can exhibit. In this talk, we present a type-guided synthesis procedure that automatically constructs a test controller geared towards exhibiting executions that violate a property of interest. Central to our approach is a new type abstraction called a Prophecy Automata Type, which describes both the history of the system and its future behaviors using a symbolic variant of linear temporal logic, and which is embedded within a refinement type system; these types are used by a deductive synthesis algorithm to construct a distributed test controller in a specialized DSL designed for this purpose. We have implemented our ideas to synthesize controllers for P programs, with promising results.
Konstantinos Kallas: Performance and Correctness for Microservice Applications
Abstract: The microservice paradigm is widely used as an application structuring methodology in the industry due to the benefits it offers to organizations. Unfortunately, structuring applications as microservices incurs significant overheads that are not present in monolithic architectures. At the same time, microservice applications are hard to reason about due to their complex service graphs with dynamic dependencies. In this talk I will present some of our work in improving the performance of microservice applications without jeopardizing their correctness, as well as talk about open challenges and future directions in the design of provably correct systems for microservices.
Marc Shapiro: Modelling and Verifying a Database Backend
Massinissa Tighilt, Lina Azerouk, Camille Palisoc, Binh-Minh Bui-Xuan and Maria Potop-Butucaru: Distributed computation of temporal twins in periodic undirected time-varying graphs
Gowtham Kaki: Convergence is Half Way to Consensus
Abstract: The Replicated State Machine (RSM) is a foundational abstraction of a distributed system, known for its guarantee of strong consistency: every execution steps through a linear sequence of states no matter where the operations are submitted. In practice, distributed systems are characterized by non-determinism induced by system-level faults, such as network partitions. Overcoming non-determinism and ensuring a functional degree of fault tolerance requires correctly implementing consensus algorithms, such as Raft and Paxos, which are notoriously difficult to reason about. Ensuring the correctness of consensus protocol implementations currently requires heroic program verification efforts, which is infeasible in practice. In this talk, I argue that the complexity of verification is primarily due to the low-level programming model in which consensus and strong replication are implemented: asynchronous message passing, which thwarts decidable automation by exposing the details of asynchronous communication. To solve this problem, I propose implementing the RSM abstraction as a wrapper on top of a Weakly-Consistent Replicated State Machine that guarantees a weaker property called Convergence. Weak RSMs are available (in the sense of the CAP theorem) under network partitions and can therefore serve as suitable foundations for performant distributed systems. Crucially, they abstract asynchronous communication and allow us to derive local-scope verification conditions (as opposed to global-scope inductive invariants) for consensus safety. I describe a verification framework we built, called Super-V, that leverages this approach to enable SMT-aided verification of consensus protocol implementations. I describe our experience of using Super-V to automatically verify Ferry, a strongly-consistent log replication system based on a novel adaptation of the Raft consensus algorithm.
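As a minimal example of the Convergence property that Weak RSMs guarantee (a classic state-based grow-only counter, not the paper's system): replicas update locally and merge pairwise; merges commute, so all replicas converge regardless of delivery order.

```python
class GCounter:
    def __init__(self, replica_id, n):
        self.i, self.counts = replica_id, [0] * n   # one slot per replica
    def increment(self):
        self.counts[self.i] += 1
    def merge(self, other):
        # pointwise max: commutative, associative, idempotent
        self.counts = [max(a, b) for a, b in zip(self.counts, other.counts)]
    def value(self):
        return sum(self.counts)

a, b = GCounter(0, 2), GCounter(1, 2)
a.increment(); a.increment(); b.increment()
a.merge(b); b.merge(a)        # anti-entropy, in either order
print(a.value(), b.value())   # 3 3: converged without consensus
```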
Gangamreddypalli Namratha Reddy: Commutativity-based refinement proofs using Civl
Abstract: Civl is a static verifier for concurrent programs based on the idea of layered refinement. A program at a lower layer (such as a concrete, low-level implementation) refines a program at a higher layer (such as a high-level specification) in such a way that proving correctness of the higher layer implies proving correctness of the lower layer. One form of refinement is reducing the space of possible executions of a concurrent program. This can be achieved by exploiting the commutativity of statements, defined as movers in Lipton's reduction. A statement is a left (right) mover if it can be commuted to the left (right) of every other statement executed by a different thread, without altering the outcome of the execution. I describe a new refinement proof rule based on commutativity and use it to prove correctness of a snapshot algorithm. The goal of the algorithm is to return a snapshot of the entire memory, despite concurrent updates to different locations.
Marc Shapiro: Modelling and Verifying a Database Backend, technical deep dive
Note: I will be giving an invited talk titled "Modelling and Verifying a Database Backend" to the NETYS audience. This presentation dives deeper into the formal and verification aspects of that work.
Abstract: Although a database backing store is conceptually just a shared memory, modern stores have high internal complexity, for performance and fault-tolerance reasons, and to support rich data types with incremental and/or convergent updates. As such complexity is bug-prone, this research proposes a correct-by-design approach, based on a set of formal specifications. Possible executions are abstracted as a trace, a partial order of transactional events. Its valuation specifies the expected value of a key at each event, according to datatype semantics. We specify two common variants of store: the eager, random-access map and the lazy, sequential-access journal. We show that both conform to the valuation, meaning that they are (i) safe, (ii) functionally equivalent, and (iii) causally consistent. Finally, we propose a transaction semantics that is representative of common implementations and avoids the timestamp inversion anomaly. We show an equivalence between the trace and the transaction semantics, implying that maps and journals remain safe. Our results extend naturally to systems with stronger guarantees, such as classical assignment-based data types or strong consistency.
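A toy rendering of the two store variants (code mine): an eager map applies each update on arrival, a lazy journal replays its log on read, and random traces check functional equivalence, here under plain assignment (last-writer-wins) semantics rather than the richer datatypes the work supports.

```python
import random

class MapStore:
    def __init__(self): self.kv = {}
    def update(self, key, val): self.kv[key] = val         # eager, random access
    def read(self, key): return self.kv.get(key)

class JournalStore:
    def __init__(self): self.log = []
    def update(self, key, val): self.log.append((key, val))  # lazy append
    def read(self, key):                                      # replay on demand
        val = None
        for k, v in self.log:
            if k == key: val = v
        return val

random.seed(0)
m, j = MapStore(), JournalStore()
for _ in range(1000):
    k, v = random.choice('abc'), random.randint(0, 9)
    m.update(k, v); j.update(k, v)
    assert m.read(k) == j.read(k)
print('map and journal agree on all reads')
```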