Artificial intelligence is not just flooding social media with garbage; it’s also apparently afflicting the open-source programming community. And in the same way that fact-checking tools like X’s Community Notes struggle to refute a deluge of false information, contributors to open-source projects are lamenting the time wasted evaluating and debunking bug reports created using AI code-generation tools.
The Register reported today on concerns raised by Seth Larson in a recent blog post. Larson, a security developer-in-residence at the Python Software Foundation, says he has noticed an uptick in “extremely low-quality, spammy, and LLM-hallucinated security reports to open source projects.”
“These reports appear at first glance to be potentially legitimate and thus require time to refute,” Larson added. That could be a big problem for open-source projects such as Python, WordPress, and Android, which power much of the internet but are often maintained by small groups of unpaid contributors. Legitimate bugs in ubiquitous code libraries are dangerous because they have such a wide potential impact if exploited. Larson said he’s seeing only a relatively small number of AI-generated junk reports so far, but the number is increasing.
Another developer, Daniel Stenberg, called out a bug submitter for wasting his time with a report he believed was generated using AI:
You submitted what seems to be an obvious AI slop ‘report’ where you say there is a security problem, probably because an AI tricked you into believing this. You then waste our time by not telling us that an AI did this for you and you then continue the discussion with even more crap responses – seemingly also generated by AI.
Code generation is an increasingly popular use case for large language models, though many developers are still torn on how useful these tools truly are. Programs like GitHub Copilot or ChatGPT’s own code generator can be quite effective at producing scaffolding, the basic skeleton code that gets a project started. They can also be useful for navigating a programming library a developer isn’t familiar with, quickly surfacing the small snippets of code they need.
But as with any language model, they will hallucinate and produce wrong code, or only partial snippets. They don’t “understand” code; they’re probability machines guessing at what you might want based on what they have seen before. To produce a complete project, developers still need to fundamentally understand the programming language they’re working with, so they can debug issues, know what they’re trying to build, and see how all the individual pieces of code fit together. That’s why experts in the field have said junior developers will be the most directly affected by these tools. Simple apps that can be built using AI alone have, in all likelihood, already been built before.
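To illustrate what one of those hallucinations can look like, here is a minimal, hypothetical Python sketch: the commented-out call is the kind of plausible-sounding but nonexistent function an LLM might invent, while the line below it uses the real standard-library API. The hallucinated name is made up for illustration and is not drawn from any actual report.

    import hashlib

    # A plausible-sounding call an LLM might invent (hypothetical; no such function exists):
    # digest = hashlib.sha3("hello")  # would raise AttributeError if actually run
    # The real standard-library call looks like this:
    digest = hashlib.sha3_256(b"hello").hexdigest()
    print(digest)

A snippet like the commented-out line reads fine to a non-expert, which is exactly why such reports “appear at first glance to be potentially legitimate” and take maintainers real time to check.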
Platforms like HackerOne offer bounties for successful bug reports, which may encourage some individuals to ask ChatGPT to scan a codebase for flaws and then submit whatever spurious findings the LLM returns.
Spam has always been around on the internet, but AI is making it a lot easier to generate. It seems possible we’ll find ourselves in a situation that demands yet more technology, akin to the CAPTCHAs used on login screens, to combat it. An unfortunate situation and a big waste of time for everyone.