Polymathic

Digital transformation, higher education, innovation, technology, professional skills, management, and strategy


Large language models struggle with generating clean code

The linked article covers a study of how reliable and robust code is when large language models (LLMs) generate it in response to Java coding questions. The study evaluated four code-capable LLMs, including OpenAI's GPT-3.5 and GPT-4, and found high rates of API misuse in their output. It stressed that code reliability has to be assessed beyond semantic correctness, and that static analysis is needed to ensure full coverage. Llama 2, an open model, performed best, with a failure rate of less than one percent.
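To make "API misuse" concrete, here is a minimal Java sketch. It is not taken from the study: the class and method names are hypothetical, and it assumes that "misuse" covers code that returns the right answer while breaking an API's usage contract, in this case by never closing a reader. Both methods produce the same result, which is why a check on semantic correctness alone would not tell them apart.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical illustration only; none of these names come from the study.
final class ApiMisuseSketch {

    /** A StringReader that records whether close() was ever called. */
    static final class TrackingReader extends StringReader {
        boolean closed = false;

        TrackingReader(String text) {
            super(text);
        }

        @Override
        public void close() {
            closed = true;
            super.close();
        }
    }

    /** Misuse: returns the correct first line but never closes the reader. */
    static String firstLineLeaky(Reader source) throws IOException {
        BufferedReader reader = new BufferedReader(source);
        return reader.readLine();
    }

    /** Safer: try-with-resources closes the reader even if readLine() throws. */
    static String firstLineSafe(Reader source) throws IOException {
        try (BufferedReader reader = new BufferedReader(source)) {
            return reader.readLine();
        }
    }
}
```

A test that only compares return values passes for both methods. The difference shows up only when the resource state is inspected, which is the kind of property a static check on API usage patterns is meant to catch.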

Original article: Perhaps AI is going to take away coding jobs of those who trust this tech too much



About Me

Visionary leader driving digital transformation across higher education and Fortune 500 companies. Pioneered AI integration at Emory University, including GenAI and AI agents, while spearheading faculty information systems and student entrepreneurship initiatives. Led crisis management during pandemic, transitioning 200+ courses online and revitalizing continuing education through AI-driven improvements. Designed, built, and launched the Emory Center for Innovation. Combines Ph.D. in Philosophy with deep tech expertise to navigate ethical implications of emerging technologies. International experience includes DAAD fellowship in Germany. Proven track record in thought leadership, workforce development, and driving profitability in diverse sectors.
