4 Comments

I loved this quote!

"This limits the value of testing, because if you had the foresight to write a test for a particular case, then you probably had the foresight to make the code handle that case too. This makes conventional testing great for catching regressions, but really terrible at catching all the “unknown unknowns” that life, the universe, and your endlessly creative users will throw at you."

Helping me create UTs is my main use of LLMs while coding (in addition to generating ORM models).

I feel it's indeed the best strategic use of its ability to think differently than you and catch things you missed - with minimal risk for the company.

author

Thanks Anton! I agree - I find LLMs really useful for giving me ideas and options that I can then evaluate, rather than strictly generating code (at least when I'm not writing boilerplate).

Feb 26 (edited Feb 26) · Liked by Leonardo Creed

Using the existing code instead of actual reasoning (be it human or automated) seems like a bad idea. Imagine my code has an unknown bug: this LLM could write a test case that asserts the bug stays in the code. If someone or something later discovers the bug and tries to fix it, the generated unit test would fail, and detecting that it's actually a wrongly generated test seems much more difficult than simply using human reasoning to write correct test cases in the first place. Does the paper mention anything about this risk?

author

You're correct, which is why each test that makes it through all the filters requires a manual sign-off by a human. A good way to think about this LLM is as a junior dev tasked with creating more comprehensive tests for existing code. Other devs have more important things to work on, so this LLM gets the fun task of improving unit tests.

The tests it creates in its pull requests are often good, and sometimes trivial or pointless. Occasionally, a test it produces is really good or inadvertently uncovers a bug. Regardless, this work wouldn't have been done by humans anyway due to priorities. All of its pull requests require a human reviewer before being pushed into the codebase.

Unit tests shouldn't become mere "change detectors", which is what would happen if the LLM generically wrote tests to cover 100% of the codebase. That would solidify any existing bugs, like you said.
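To make the "change detector" failure mode concrete, here is a minimal sketch (the function and test are hypothetical, not from the paper) of how a test generated from existing behavior can encode a bug rather than catch it:

```python
def apply_discount(price: float, percent: float) -> float:
    """Intended: reduce price by `percent` percent.
    Bug: divides by 10 instead of 100."""
    return price - price * percent / 10


# A generator that treats current behavior as the oracle would emit
# something like this: it asserts the buggy output (-100.0) instead of
# the intended 80.0, so a correct fix to the divisor breaks the test.
def test_apply_discount_generated():
    assert apply_discount(100, 20) == -100.0


test_apply_discount_generated()  # passes against the buggy code
```

This is exactly why the filtered tests still need human review: only a reader who knows the intent can tell that the test, not the fix, is what's wrong.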
