Project Reference: ITP/051/23LP
Project Title: GPT Security Testbed
Hosting Institution: LSCM R&D Centre (LSCM)
Abstract: One of the concerns with using GPT is that it can be attacked to generate malicious content, leading to the spread of misinformation, manipulation of public opinion, and fraud. This project emphasizes the importance of testbeds in addressing the security needs of AI-based models and proposes the creation of testbeds specifically for GPT attacks. One significant attack technique explored is prompt injection, in which carefully crafted prompts manipulate the GPT into disregarding its instructions or executing unintended actions. While new attack techniques continue to be discovered, defenses against prompt injection attacks are still in their early stages. Another aim of this project is to develop an evaluation framework that measures the impact, success probability, and weighted resilience of prompt injection attacks. The deliverables include the construction of an attack dataset, measurement of the Average Impact Metric for each attack, calculation of the Attack Success Probability, and determination of the Weighted Resilience Score. We will also evaluate three common GPT models, including online and localized versions developed in different countries.
Project Coordinator: Dr Chung Dak Shum
Approved Funding Amount: HK$ 2.79M
Project Period: 1 Feb 2024 - 31 Jan 2025
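
The abstract names three evaluation metrics but does not define them. As a rough illustration only, the sketch below assumes the simplest plausible definitions: Attack Success Probability as the fraction of attacks in the dataset that succeed, Average Impact Metric as the mean severity of the successful attacks, and Weighted Resilience Score as the weighted share of attacks the model resists. All names, fields, and formulas here are assumptions, not the project's actual framework.

```python
# Hypothetical sketch of the three metrics named in the abstract.
# The source gives no formulas; everything below is an assumed definition.
from dataclasses import dataclass


@dataclass
class AttackResult:
    succeeded: bool  # did the prompt injection bypass the model's instructions?
    impact: float    # assumed severity score in [0, 1] assigned by an evaluator
    weight: float    # assumed importance weight for the attack's category


def attack_success_probability(results: list[AttackResult]) -> float:
    """Assumed: fraction of attacks in the dataset that succeeded."""
    return sum(r.succeeded for r in results) / len(results)


def average_impact_metric(results: list[AttackResult]) -> float:
    """Assumed: mean impact score over the successful attacks."""
    hits = [r.impact for r in results if r.succeeded]
    return sum(hits) / len(hits) if hits else 0.0


def weighted_resilience_score(results: list[AttackResult]) -> float:
    """Assumed: weighted share of attacks the model resisted, in [0, 1]."""
    total = sum(r.weight for r in results)
    resisted = sum(r.weight for r in results if not r.succeeded)
    return resisted / total


if __name__ == "__main__":
    # Toy attack dataset for demonstration only.
    dataset = [
        AttackResult(succeeded=True, impact=0.8, weight=2.0),
        AttackResult(succeeded=False, impact=0.0, weight=1.0),
        AttackResult(succeeded=True, impact=0.4, weight=1.0),
    ]
    print(f"Attack Success Probability: {attack_success_probability(dataset):.2f}")  # 0.67
    print(f"Average Impact Metric:      {average_impact_metric(dataset):.2f}")       # 0.60
    print(f"Weighted Resilience Score:  {weighted_resilience_score(dataset):.2f}")   # 0.25
```

Under these assumed definitions, a higher Weighted Resilience Score indicates a model that withstands the more heavily weighted attack categories; the weighting scheme itself would be a design choice of the project's actual framework.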