Towards Enhancing In-Context Learning for Code Generation

03/31/2023
by Jia Li, et al.

In-context learning (ICL) with pre-trained language models (PTLMs) has shown great success in code generation. ICL requires no training: PTLMs take as input a prompt consisting of a few requirement-code examples plus a new requirement, and output a new program. However, existing studies simply reuse ICL techniques designed for natural language generation and ignore unique features of code generation; we refer to these studies as standard ICL. Inspired by observations of the human coding process, we propose a novel ICL approach for code generation named AceCoder. Compared to standard ICL, AceCoder has two novelties. (1) Example retrieval: it retrieves similar programs as examples and learns programming skills (e.g., algorithms, APIs) from them. (2) Guided code generation: it encourages PTLMs to output an intermediate preliminary (e.g., test cases, APIs) before generating programs. The preliminary helps PTLMs understand the requirement and guides the subsequent code generation. We apply AceCoder to six PTLMs (e.g., Codex) and evaluate it on three public benchmarks using the Pass@k metric. Results show that AceCoder significantly improves the performance of PTLMs on code generation. (1) In terms of Pass@1, AceCoder outperforms standard ICL by up to 79.7% and fine-tuned models by up to 171%. (2) AceCoder is effective across PTLMs of different sizes (e.g., 1B to 175B) and different languages (e.g., Python, Java, and JavaScript). (3) We investigate multiple choices of the intermediate preliminary. (4) We manually evaluate generated programs in three aspects and confirm the superiority of AceCoder. (5) Finally, we discuss some insights about ICL for practitioners.
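The abstract does not give the prompt template, but the two ideas are easy to picture. Below is a minimal Python sketch assuming a BM25-style lexical retriever (via the rank_bm25 package, a common choice the abstract does not confirm) and test cases as the preliminary; the function names and prompt wording are illustrative, not the paper's.

from typing import List, Tuple
from rank_bm25 import BM25Okapi  # stand-in lexical retriever (assumption)

def retrieve_examples(requirement: str,
                      corpus: List[Tuple[str, str]],
                      n: int = 3) -> List[Tuple[str, str]]:
    """Return the n (requirement, code) pairs from the retrieval
    corpus whose requirements are most similar to the new one."""
    bm25 = BM25Okapi([req.split() for req, _ in corpus])
    scores = bm25.get_scores(requirement.split())
    ranked = sorted(range(len(corpus)), key=lambda i: scores[i], reverse=True)
    return [corpus[i] for i in ranked[:n]]

def build_prompt(requirement: str,
                 examples: List[Tuple[str, str]]) -> str:
    """Concatenate the retrieved examples, then ask the model to emit
    an intermediate preliminary (test cases here) before the program."""
    parts = []
    for req, code in examples:
        parts.append(f"### Requirement:\n{req}\n### Program:\n{code}\n")
    parts.append(f"### Requirement:\n{requirement}\n"
                 "### Test cases (write these first):\n")
    return "\n".join(parts)

At inference time the PTLM continues from this prompt, first emitting test cases and then the program, so the preliminary can constrain the code that follows.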
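Pass@k itself is standard: a problem counts as solved if any of k sampled programs passes all of its tests. It is commonly computed with the unbiased estimator from the Codex evaluation (Chen et al., 2021), shown here for reference:

from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased Pass@k estimator: 1 - C(n - c, k) / C(n, k),
    where n samples were drawn per problem and c passed all tests."""
    if n - c < k:
        return 1.0  # every size-k subset contains a passing sample
    return 1.0 - comb(n - c, k) / comb(n, k)

# e.g., 200 samples with 10 correct gives pass_at_k(200, 10, 1) == 0.05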
