SymPas: Symbolic Program Slicing
Program slicing is a technique for simplifying programs by focusing on selected aspects of their behaviour. Current mainstream static slicing methods operate on the PDG (program dependence graph) or SDG (system dependence graph), but these graph representations, although convenient, may be expensive and error-prone for some users. In this paper we study a lightweight approach to static program slicing, called Symbolic Program Slicing (SymPas), which works as a dataflow analysis on LLVM (Low-Level Virtual Machine). In SymPas, slices are stored symbolically rather than recomputed by re-analysing procedures (cf. procedure summaries). Instead of analysing a procedure once per calling context to find its slices, SymPas calculates a single symbolic (or parameterized) slice that can be instantiated at each call site, avoiding re-analysis. It is implemented in LLVM and performs slicing on the LLVM intermediate representation (IR). For comparison, we systematically adapt IFDS (Interprocedural Finite Distributive Subset) analysis and the SDG-based slicing method (SDG-IFDS) to statically slice IR programs. Evaluated on open-source and benchmark programs, backward SymPas shows a factor-of-6 reduction in time cost and a factor-of-4 reduction in space cost compared to backward SDG-IFDS, and is thus more efficient. In addition, a study of slices from 66 programs, ranging up to 336,800 IR instructions in size, shows that SymPas is highly size-scalable.
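The idea of computing one symbolic slice per procedure and instantiating it at each call site can be illustrated with a minimal Python sketch. This is a toy model under stated assumptions, not the SymPas implementation: it handles only straight-line code, and the statement encoding, the `scale` procedure, and the helper names `backward_slice` and `instantiate` are all hypothetical.

```python
# Toy illustration only: straight-line code, no LLVM, hypothetical names.
# A statement is (label, defined_variable, used_variables).

def backward_slice(stmts, criterion):
    """Backward slice of straight-line code with respect to a set of variables.

    Returns the labels of the statements kept and the variables still
    relevant at procedure entry (here, the formal parameters).
    """
    relevant = set(criterion)
    kept = []
    for label, defined, used in reversed(stmts):
        if defined in relevant:
            relevant.discard(defined)   # this definition satisfies the demand
            relevant.update(used)       # its operands become relevant instead
            kept.append(label)
    return frozenset(kept), frozenset(relevant)

def instantiate(symbolic_slice, binding):
    """Substitute call-site actuals for the formals of a symbolic slice."""
    statements, formals = symbolic_slice
    return statements, frozenset(binding[f] for f in formals)

# Hypothetical callee: scale(a, b) { s1: t = a * 2; s2: u = b + 1; s3: ret = t }
SCALE_BODY = [
    ("s1", "t", ("a",)),
    ("s2", "u", ("b",)),
    ("s3", "ret", ("t",)),
]

# Computed once per procedure, parameterized by the formal 'a'.
SCALE_SLICE = backward_slice(SCALE_BODY, {"ret"})

# Reused at two call sites, scale(x, y) and scale(p, q), without re-analysis.
site1 = instantiate(SCALE_SLICE, {"a": "x", "b": "y"})
site2 = instantiate(SCALE_SLICE, {"a": "p", "b": "q"})
```

In this sketch the slice of `scale` keeps `s1` and `s3` and depends only on the formal `a`, so each call site continues slicing the caller on the actual bound to `a` (`x` or `p`) and ignores the one bound to `b`.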