This repo contains the code for Rewriting the Code: A Simple Method for Large Language Model Augmented Code Search, accepted to ACL 2024. It provides instructions for reproducing the results reported in the paper. We hope this work will be useful for future research on the Generation-Augmented Retrieval (GAR) framework for code search.
conda create -n ReCo python=3.8 -y
conda activate ReCo
conda install pytorch-gpu=1.7.1 -y
pip install transformers datasets tqdm tree-sitter openai fairscale fire sentencepiece backoff edit_distance pyserini
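After running the commands above, it can help to confirm that the environment actually contains every dependency before starting an experiment. The following is a minimal sketch, not part of the repository: the function name `missing_modules` and the list `REQUIRED_MODULES` are hypothetical, and the import names are assumed to be the usual top-level module names for the packages listed above (for example, `tree-sitter` is imported as `tree_sitter` and PyTorch as `torch`).

```python
from importlib.util import find_spec

# Import names (not pip/conda names) for the packages installed above.
# Assumption: these are the usual top-level module names for each package.
REQUIRED_MODULES = [
    "torch", "transformers", "datasets", "tqdm", "tree_sitter", "openai",
    "fairscale", "fire", "sentencepiece", "backoff", "edit_distance", "pyserini",
]


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be located in the active environment."""
    # find_spec only locates a top-level module; it does not import it.
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

Run the script inside the activated ReCo environment. It only checks that each module can be located, so it does not verify package versions or GPU availability.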
For details on the data used in our experiments, please refer to README.md in ./data.
For details on ReCo and GAR as described in our paper, please refer to README.md in ./ReCo.
For details on the Code Style Distance metric from our paper, please refer to README.md in ./metrics.
If you find this repository useful, please consider citing:
@article{li2024rewriting,
title={Rewriting the Code: A Simple Method for Large Language Model Augmented Code Search},
author={Li, Haochen and Zhou, Xin and Shen, Zhiqi},
journal={arXiv preprint arXiv:2401.04514},
year={2024}
}