Sound Transpilation from Binary to Machine-Independent Code
In order to handle the complexity and heterogeneity of mod- ern instruction set architectures, analysis platforms share a common design, the adoption of hardware-independent intermediate representa- tions. The usage of these platforms to verify systems down to binary-level is appealing due to the high degree of automation they provide. How- ever, it introduces the need for trusting the correctness of the translation from binary code to intermediate language. Achieving a high degree of trust is challenging since this transpilation must handle (i) all the side effects of the instructions, (ii) multiple instruction encoding (e.g. ARM Thumb), and (iii) variable instruction length (e.g. Intel). We overcome these problems by formally modeling one of such intermediate languages in the interactive theorem prover HOL4 and by implementing a proof- producing transpiler. This tool translates ARMv8 programs to the in- termediate language and generates a HOL4 proof that demonstrates the correctness of the translation in the form of a simulation theorem. We also show how the transpiler theorems can be used to transfer properties verified on the intermediate language to the binary code.
READ FULL TEXT