Overview
- A Lifehacker trial found that ChatGPT, after roughly 45 minutes, produced a partly working browser-based elevator simulator with persistent logic and graphical bugs, while Gemini performed better but remained imperfect.
- The practical workflow depended on iterative prompting and in-chat debugging rather than reliable one-shot generation, reinforcing that results vary with the model and the level of prompt detail.
- Major chatbots now output raw code alongside live previews, lowering the barrier for small experiments and enabling non-programmers to test and refine ideas inside the chat.
- Industry signals show strain: a Stack Overflow survey reported that developer trust in AI tools fell to 29%, and usage data tracked by Similarweb showed traffic declines of 30%–50% for coding agents since spring.
- Reliability and risk concerns are mounting: reported incidents of file and code deletion, rising compute costs that pressure pricing, and security analyses finding far more vulnerabilities in AI-assisted code than in human-written code.