The problem

Most code an AI agent writes looks right. It compiles, it reads well, it probably works. "Probably" is the problem.
Source: [Dev.to](https://dev.to/shan_wijenayaka_ecbe5dc32/proven-python-make-your-ai-agent-prove-its-python-before-calling-it-done-3kj1)