Narcissus: Deriving Correct-By-Construction Decoders and Encoders from Binary Formats
Every injective function has an inverse, although constructing the inverse for a particular injective function can be quite tricky. One common instance of inverse-function pairs is the binary encoders and decoders used to convert in-memory data into and out of a structured binary format for network communication. Ensuring that a given decoder is a proper inverse of the original encoder is particularly important, as any error has the potential to introduce security vulnerabilities or to corrupt or lose data in translation. In this paper, we present a synthesis framework, Narcissus, that eliminates both the tedium and the potential for error in building the inverse of a binary encoder. The starting point of the process is a binary format, expressed as a functional program in the nondeterminism monad, that precisely captures all the valid binary encodings of an arbitrary datatype instance. From this specification, Narcissus synthesizes a decoder that is guaranteed to be the inverse of this relation, drawing on an extensible set of decoding strategies to construct the implementation. Each decoder is furthermore guaranteed to detect malformed encodings by failing on inputs not included in this relation. The derivation is carried out inside the Coq proof assistant and produces a proof trail certifying the correctness of the synthesized decoder. We demonstrate the utility of our framework by deriving and evaluating the performance of decoders for all packet formats used in a standard network stack.
READ FULL TEXT