Engineer’s Codex is a publication about real-world software engineering.
Using booleans is a trap. Let me explain.
Something I’ve run into more often in the past year is the “boolean trap.” Booleans are great: simple, with only two states, and very easy to use.
However, the issue comes when using booleans as parameters in functions or APIs. I’ve run into it often enough that I now try to avoid boolean parameters where possible.
In my specific example, I was working on some LLM pipeline outputs. Initially, the LLM outputs were all in plaintext, so we had no flag. Suddenly, there was a need to also support JSON outputs because JSON is much easier to parse for further pipeline steps compared to plaintext.
So, we started off with the following output_plaintext flag:
def call_llm(prompt: str, output_plaintext: bool): ...
While premature optimization is the devil, this flag should already be setting off red flags in your head. Remember, the key to a good API is that calls to it are easy to understand when read.
Imagine, however, seeing this call in your Python code, where types aren’t necessarily enforced:
output = call_llm("Subscribe!", True)
Anybody reading this will be confused. What is True referring to? What is the argument here?
Remember, well-written code is predictable. Don’t allow surprises.
Let’s say we add the argument names in:
output = call_llm(prompt="Subscribe!", output_plaintext=True)
Now, this raises the question: what is the output when output_plaintext is False? HTML? CSV? LISP?
Now, let’s say we have 400 instances of call_llm throughout our codebase, all using this boolean flag. Uh oh, we just got a request to have call_llm support XML output too! Now this is a problem because we can’t shove XML support into a binary state. Plus, we have to go around and update each call to support the new type. Gross.
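To make the problem concrete, here is a minimal sketch of what tends to happen next: a second boolean gets bolted on. The output_xml parameter and the stubbed function body are hypothetical and not taken from a real codebase. The point is only that two booleans give four combinations for three formats, so one combination is meaningless and has to be rejected at runtime.

```python
def call_llm(prompt: str, output_plaintext: bool, output_xml: bool = False) -> str:
    """Hypothetical stub: returns the name of the format that would be produced."""
    # Two booleans allow four combinations, but only three formats exist.
    if output_plaintext and output_xml:
        raise ValueError("output_plaintext and output_xml cannot both be True")
    if output_xml:
        return "xml"
    return "text" if output_plaintext else "json"


# The calls still run, but they are no easier to read than before.
print(call_llm("Subscribe!", True))
print(call_llm("Subscribe!", False))
print(call_llm("Subscribe!", False, True))
```

Every additional format would need another flag and another set of invalid combinations to guard against.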
You can see how this big, ugly problem is very prone to human error, and the greatest creator of bugs is… humans. Yes, code is written for and processed by computers, but you don’t code in a black box. You’re also coding for other humans, including your future self.
So, make your life easier and don’t use boolean flags.
In my opinion, it’s better to use enums, EVEN IF you only have a binary state to represent.
Enums are much more readable - both in terms of function parameters and explicit typing.
Enums are more maintainable because they are extendable.
Enums are shareable across files.
From the above example, we can use the following enum:
from enum import Enum

class LLM_OUTPUT(Enum):
    TEXT = "text"
    JSON = "json"
    XML = "xml"
    MARKDOWN = "markdown"
    HTML = "html"
    CSV = "csv"
    YAML = "yaml"
This is much nicer. We’re able to easily extend functions that already take in the LLM_OUTPUT enum past just binary states into N-ary states, without needing to update existing callers.
Now the codebase will look like this. Even without explicit keyword arguments, what the call does is still extremely clear. Plus, nobody has to dig into the application code to see what the possible outputs are - a simple hover over the enum solves that.
output = call_llm("Subscribe!", LLM_OUTPUT.JSON)
output = call_llm(prompt="Subscribe!", output_type=LLM_OUTPUT.JSON)
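The post doesn’t show the body of call_llm, so here is a minimal sketch of how it might look when written against the enum. The _fake_llm helper, the TEXT default, and the instruction appended to the prompt are all assumptions for illustration. The enum is redeclared with a subset of its members so the block runs on its own.

```python
from enum import Enum


class LLM_OUTPUT(Enum):
    TEXT = "text"
    JSON = "json"
    XML = "xml"


def _fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model call; echoes the prompt it receives."""
    return prompt


def call_llm(prompt: str, output_type: LLM_OUTPUT = LLM_OUTPUT.TEXT) -> str:
    # Reject stray booleans or strings left over from an older signature.
    if not isinstance(output_type, LLM_OUTPUT):
        raise TypeError(f"output_type must be an LLM_OUTPUT, got {output_type!r}")
    if output_type is LLM_OUTPUT.TEXT:
        return _fake_llm(prompt)
    # Non-text formats are requested by name, so a new member needs no new branch.
    return _fake_llm(f"{prompt}\nRespond only in {output_type.value.upper()}.")


print(call_llm("Subscribe!"))
print(call_llm("Subscribe!", LLM_OUTPUT.JSON))
```

In this sketch, supporting another format means adding one enum member. Existing call sites stay as they are, and a leftover call_llm("Subscribe!", True) fails loudly instead of silently picking a format.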
Now, as always, context is everything. If a function does one thing, and it’s literally called setIsModalOpen(bool isModalOpen), like for a modal in React, then booleans are perfect here. There is no chance of any possible extension in the future (what, you’re going to have it half open?), and it is extremely obvious what setIsModalOpen(true) does. (Of course, there’s another argument here that instead of having a function like that, we could just have setModalOpen() and setModalClosed(), which I agree is much nicer.)
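The two-function alternative can be sketched as follows. This is in Python rather than React to stay consistent with the earlier examples, and the Modal class and its method names are hypothetical.

```python
class Modal:
    """Hypothetical modal state holder, standing in for a UI component."""

    def __init__(self) -> None:
        self.is_open = False

    def open(self) -> None:
        # One method per state change: the call site needs no argument at all.
        self.is_open = True

    def close(self) -> None:
        self.is_open = False


modal = Modal()
modal.open()
print(modal.is_open)   # True
modal.close()
print(modal.is_open)   # False
```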
In general, though, coding is as much science as it is art, and art is easier to have opinions on. Using booleans can be a trap, but they aren’t necessarily wrong. In my own experience, though, booleans are usually less maintainable than enums for API/function parameters over time.
I do want to note (else Hacker News will kill me) that I’m not the first person to write about this, nor will I be the last:
Is it wrong to use a boolean parameter to determine behavior?
And many more, I’m sure
While it's possible that booleans in an API are bad in general, the example given here doesn't prove it. As soon as I read that there was a need to support a new output type, it was obvious an enum was needed. Putting in a boolean there didn't make sense to me at all.
Why even use an enum? That's just asking one thing to do multiple things. One thing should do one thing.
Why not just have a function for each output type?