Operating System

Concurrency

Laws of Order: Expensive Synchronization in Concurrent Algorithms Cannot be Eliminated
Effective Concurrency Testing for Distributed Systems

Model checking is notorious for state explosion in concurrency testing, even when coupled with partial order reduction. Random walks were therefore proposed to explore interleavings randomly, but they still explore some useless orderings. This paper uses thread conflicts observed in past test executions to predict future conflicts and thus avoid useless exploration. It still looks like a kind of order reduction.

File systems

Local File Systems

Split-Level I/O Scheduling, SOSP’16
IOFlow: A Software-Defined Storage Architecture

Distributed file systems

Empowering Azure Storage with RDMA
Octopus: an RDMA-enabled Distributed Persistent Memory File System
The bottleneck of distributed file systems:
- SSD + ethernet: slow device latency (ms)
- PM + RDMA (InfiniBand): Many data duplications among layers.
- Data plane:
  - Clients directly access the servers’ PM to reduce the number of data copies.
  - Concurrent accesses are protected by locks: servers acquire them with GCC-provided atomic primitives, and clients release them through RDMA atomic instructions.
- Metadata plane:
  - Two-phase commit with: 1) memory-based RPC for the prepare phase and 2) direct writes for the commit phase.
Q: Can RDMA read/write operations tell whether they succeeded or failed?
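A minimal sketch of the data-plane lock handoff noted above, simulated in plain Python. Everything here is an assumption for illustration (`LockWord`, `server_acquire`, `client_release`, and the 0/1 encoding are hypothetical names, not Octopus code). The `threading.Lock` only stands in for the atomicity that a CPU or RDMA NIC would provide in hardware.

```python
import threading

UNLOCKED, LOCKED = 0, 1


class LockWord:
    """Simulated lock word that would live in the server's persistent memory."""

    def __init__(self):
        self._value = UNLOCKED
        # Stand-in for hardware atomicity (CPU atomics or RDMA atomics).
        self._guard = threading.Lock()

    def compare_and_swap(self, expected, new):
        """Atomically set the word to `new` if it equals `expected`.

        Returns the value observed before the operation, so the caller can
        tell whether the swap took effect.
        """
        with self._guard:
            old = self._value
            if old == expected:
                self._value = new
            return old


def server_acquire(word):
    """Server side: take the lock with a local compare-and-swap."""
    return word.compare_and_swap(UNLOCKED, LOCKED) == UNLOCKED


def client_release(word):
    """Client side: release the lock; a real client would use a remote atomic."""
    return word.compare_and_swap(LOCKED, UNLOCKED) == LOCKED


word = LockWord()
first = server_acquire(word)    # lock was free, so this succeeds
second = server_acquire(word)   # lock is held, so this fails
released = client_release(word)
```

The point of the split is that the client, which finishes the data transfer, is also the one that knows when the lock can be dropped, so it releases the lock itself instead of sending another RPC to the server.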
Facebook’s Tectonic Filesystem: Efficiency from Exascale
Three targeted issues:
- Scaling to exabyte-scale -> hash-partition metadata instead of range-partitioning to avoid hotspots. Why?
- Providing performance isolation between tenants -> Isolation among groups instead of applications.
- Enabling tenant-specific optimization -> Runtime file system configuration instead of pre-configuration
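On the "Why?" for hash partitioning: a small sketch of how keys that sort next to each other behave under the two schemes. The key format (`dir-000123`), the shard count, and the range boundaries are made up for illustration and are not Tectonic's actual schema.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 4
# Hypothetical range boundaries: shard i holds keys below RANGE_BOUNDARIES[i],
# and the last shard holds everything else.
RANGE_BOUNDARIES = ["dir-250000", "dir-500000", "dir-750000"]


def range_shard(key):
    """Range partitioning: keys that sort together land on the same shard."""
    for shard, upper in enumerate(RANGE_BOUNDARIES):
        if key < upper:
            return shard
    return len(RANGE_BOUNDARIES)


def hash_shard(key):
    """Hash partitioning: the shard depends only on a hash of the key."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


# A burst of related metadata keys, e.g. directories created by one job.
hot_keys = [f"dir-{i:06d}" for i in range(1000)]

range_load = Counter(range_shard(k) for k in hot_keys)
hash_load = Counter(hash_shard(k) for k in hot_keys)

print("range:", dict(range_load))
print("hash: ", dict(hash_load))
```

Under range partitioning all 1000 adjacent keys fall into one shard, which becomes the hotspot. Under hash partitioning the same keys scatter across all shards, at the cost of losing cheap ordered scans over a key range.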
Crail: Unification of Temporary Storage in the NodeKernel Architecture Crail website

FPGA file system

Reconfigurable Virtual Memory for FPGA-Driven I/O

Kernel extension

Security

Kernel security

EPF: Evil Packet Filter

Hardware security

WESEE: Using Malicious #VC Interrupts to Break AMD SEV-SNP
DMAAUTH: A Lightweight Pointer Integrity-based Secure Architecture to Defeat DMA Attacks

Architecture

TOREAD

ASPLOS24 TOREAD

Last-Level Cache Side-Channel Attacks Are Feasible in the Modern Public Cloud
Flexible Non-intrusive Dynamic Instrumentation for WebAssembly
Lightweight Fault Isolation: Practical, Efficient, and Secure Software Sandboxing
λFS: A Scalable and Elastic Distributed File System Metadata Service using Serverless Functions
Predict; Don’t React for Enabling Efficient Fine-Grain DVFS in GPUs
Formal Mechanised Semantics of CHERI C: Capabilities, Undefined Behaviour, and Provenance
Verifying Rust Implementation of Page Tables in a Software Enclave Hypervisor
Merlin: Multi-tier Optimization of eBPF Code for Performance and Compactness
Follow up work to K2.

Implements an LLVM pass that rewrites IR according to two high-level optimization rules:
- Instruction merging
- Replace expensive operations with cheap ones, such as replacing multiplication with shifting.
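A toy sketch of the second rule (replacing a multiplication with a shift). It works on made-up instruction tuples of the form `(op, register, immediate)`, not on real LLVM IR or eBPF bytecode, and `reduce_multiply` is a hypothetical helper, not part of Merlin.

```python
def is_power_of_two(n):
    """True for 1, 2, 4, 8, ... (positive integers with a single set bit)."""
    return n > 0 and (n & (n - 1)) == 0


def reduce_multiply(instr):
    """Rewrite ("mul", reg, 2**k) into ("lsh", reg, k); leave the rest alone."""
    op, reg, imm = instr
    if op == "mul" and is_power_of_two(imm):
        return ("lsh", reg, imm.bit_length() - 1)
    return instr


program = [("mul", "r1", 8), ("add", "r1", 3), ("mul", "r2", 6)]
optimized = [reduce_multiply(i) for i in program]
# Only the multiplication by 8 is rewritten; 6 is not a power of two.
```

The rewrite is safe because multiplying an integer by 2**k gives the same result as shifting it left by k bits, including under fixed-width wraparound.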
Skip It: Take Control of Your Cache!

Tools: Chipyard, Rocket and BOOM cores, TileLink, hardware language Chisel. What’s Enzian for? Synchronous and asynchronous writebacks (flush and clean)?
Direct Memory Translation for Virtualized Clouds
Instead of translating a virtual address to a physical address with a multi-level page walk, DMT directly maps a Virtual Memory Area to a contiguous physical memory area with additional hardware registers.
Shortcomings:
- Applications that allocate memory aggressively can leave physical memory fragmented, unused, and wasted.
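A rough sketch of the base-plus-offset lookup as these notes describe it. This follows the note, not the paper's actual register layout or hardware design. `VmaMapping` and `translate` are hypothetical names, and returning `None` stands in for falling back to a regular page walk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VmaMapping:
    """One hypothetical register entry: a VMA backed by a contiguous area."""
    va_base: int
    pa_base: int
    size: int


def translate(va, mappings):
    """Translate with one range check and one addition, no page walk.

    Returns None when no mapping covers `va` (the caller would fall back
    to the ordinary multi-level walk).
    """
    for m in mappings:
        if m.va_base <= va < m.va_base + m.size:
            return m.pa_base + (va - m.va_base)
    return None


mappings = [VmaMapping(va_base=0x7F0000000000, pa_base=0x100000000, size=0x200000)]
pa = translate(0x7F0000001234, mappings)
miss = translate(0x1000, mappings)
```

The sketch also hints at the shortcoming above: each mapping reserves `size` bytes of contiguous physical memory up front, whether or not the application ever touches all of it.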
GMT: GPU Orchestrated Memory Tiering for the Big Data Era
BypassD: Enabling fast userspace access to shared SSDs
AERO: Adaptive Erase Operation for Improving Lifetime and Performance of Modern NAND Flash-Based SSDs
A Journey of a 1,000 Kernels Begins with a Single Step: A Retrospective of Deep Learning on GPUs
Thesios: Synthesizing Accurate Counterfactual I/O Traces from I/O Samples
Everywhere All at Once: Co-Location Attacks on Public Cloud FaaS
A Verified Confidential Computing as a Service Framework for Privacy Preservation

CXL

Rcmp: Reconstructing RDMA-Based Memory Disaggregation via CXL

Book

Software foundation