snake_case_is_the_best_case

If you're working with code that already has a case style, just use that. An imperfect convention is better than two competing ones.

But when you need to decide on one, pick snake_case 🐍.

Ideas need names

Sketching out an idea is invaluable; I highly recommend it. Boxes and arrows on a whiteboard convey so much data quickly, clearly, and concisely. But which box is which? What does this arrow mean? What is this whole diagram about?

You need to label these things. Ideas need names.

In code, variables, classes, and functions[1] need names, but names can't contain spaces. So what to do for names with multiple words? What should you use instead of spaces?

The key is to take advantage of your perception.

Prose-based perception

You first learned how to read prose 📖 and then you honed that skill over decades. You read books, text messages, news articles, instruction manuals... You also learned to write prose. Writing prose shapes the way you think, letting you transform vague inspirations into clear plans and ideas.

Your eyeballs 👀 and meat computer 🧠 have been relentlessly fine-tuned for reading and writing prose.

What techniques does your prose-optimized perception use? And are there code styles that tap into those?

Word shapes

My intuition is that our perception uses word shapes as a shortcut to detect word boundaries and identify individual words. By "shapes", I mean the silhouettes of each word.

[Figure: GIF fading between words and their corresponding word shapes]

snake_case replaces spaces with underscores, which are excellent at preserving word shapes. PascalCase and camelCase remove spaces and uppercase leading characters, disrupting word shapes. The leading uppercase at least preserves most of the word shape, but removing spaces probably makes it harder to quickly detect word boundaries.

kebab-case fares much better. It preserves individual word shapes as well as snake_case, but hyphens - occupy areas normally used by letters whereas underscores _ stay out of the way. Makes sense; hyphens are literally a way of joining separate words. But my guess is that this probably hurts your ability to detect word boundaries, even if just a little.[2]

Remember that "word shapes" as a heuristic for perception is just speculation on my part. I would be shocked if our perception didn't use word shapes at all, but it's important not to draw too many conclusions without controlled studies.[3]

Visual similarity

Intuitively, code that is most visually similar to prose could best reuse your prose-reading abilities.

Let's figure out what "visually similar" means for code. One idea is to use edit distance to measure it. The smaller the edit distance, the more visually similar. I think an even better idea is to use visual regression testing to visualize the difference between prose and code.

[Figure: GIF fading between prose and snake_cased]

Whether you prefer the numerical objectivity of edit distance or the visceral clarity of visual regression diffs, it's clear that snake_case is the victor.[4]
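Here's a minimal sketch of the edit distance comparison, assuming the prose is already lowercase so that only the word separators differ. The function name edit_distance and the sample strings are mine, chosen for illustration. It's the textbook Levenshtein algorithm, not a measurement from any study.

```typescript
// Textbook Levenshtein distance: the minimum number of single-character
// insertions, deletions, and substitutions needed to turn `a` into `b`.
function edit_distance(a: string, b: string): number {
  // previous_row[j] = distance between the first i-1 chars of a and the first j chars of b.
  let previous_row: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const current_row: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const substitution_cost = a[i - 1] === b[j - 1] ? 0 : 1;
      current_row.push(
        Math.min(
          previous_row[j] + 1, // delete a character from a
          current_row[j - 1] + 1, // insert a character from b
          previous_row[j - 1] + substitution_cost // substitute, or keep a match
        )
      );
    }
    previous_row = current_row;
  }
  return previous_row[b.length];
}

// Assumption: lowercase prose, so only the separators change.
const prose = "fetch rss feed as xml";
const snake_distance = edit_distance(prose, "fetch_rss_feed_as_xml"); // 4
const camel_distance = edit_distance(prose, "fetchRssFeedAsXml"); // 8
```

Under those assumptions, each space costs snake_case one edit and camelCase two. If the prose keeps its acronyms in capitals, the numbers shift, so treat this as an illustration of the per-space cost rather than a general proof.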

If you want more scientific evidence, you're in luck! There's a controlled study examining code comprehension that compares snake_case and camelCase. Unsurprisingly, snake_case comes out on top.

One obvious way

Unlike other case types, snake_case has one obvious way to write names:

  1. Express the idea in prose 👉 "fetch RSS feed as XML"
  2. Replace spaces with underscores 👉 fetch_RSS_feed_as_XML [5]
  3. Lowercase it 👉 fetch_rss_feed_as_xml [6]
  4. That's it! 🎉

With other cases, you need to decide how acronyms should be capitalized: fetchRssFeedAsXml or fetchRSSFeedAsXML? Should you always CAPS LOCK acronyms? Or never do so? The issue is that you are using capital letters both syntactically (to split up words) and semantically (to convey acronyms), so there's no good rule that always works. Your choices are:

In snake_case it's just fetch_rss_feed_as_xml. You can focus on the semantics of your code instead of spending time deciding on syntax. It's like having a formatter for your names. It's so systematic that you could trivially implement snake_case naming in code:

// Replace every space (a string pattern would only replace the first), then lowercase.
const to_snake_case = (prosey_name: string) =>
  prosey_name.replace(/ /g, "_").toLowerCase();
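For contrast, here's a rough sketch of what a camelCase converter needs. The names to_camel_case and Acronym_policy, and the acronyms list, are hypothetical and made up for illustration. The point is that the function can't run without extra inputs that the snake_case version never asks for.

```typescript
// Hypothetical policy: should known acronyms be fully uppercased, or
// capitalized like any other word?
type Acronym_policy = "upper" | "capitalize";

// Assumption: the caller supplies the (lowercased) acronym list, since
// nothing in the prose name itself reliably marks which words are acronyms.
const to_camel_case = (
  prosey_name: string,
  acronyms: string[],
  policy: Acronym_policy
): string =>
  prosey_name
    .split(" ")
    .map((word, index) => {
      const lower = word.toLowerCase();
      if (index === 0) return lower; // camelCase keeps the first word lowercase
      if (policy === "upper" && acronyms.includes(lower)) return lower.toUpperCase();
      return lower.charAt(0).toUpperCase() + lower.slice(1);
    })
    .join("");

to_camel_case("fetch RSS feed as XML", ["rss", "xml"], "upper"); // "fetchRSSFeedAsXML"
to_camel_case("fetch RSS feed as XML", ["rss", "xml"], "capitalize"); // "fetchRssFeedAsXml"
```

The same prose name yields two different identifiers depending on a policy your team has to agree on and remember.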

You'll also get some bonuses from using lowercased names:

  1. Compatibility across case-sensitive (Linux) and case-insensitive (macOS and Windows) filesystems
  2. Simple renaming with s/old_name/new_name/g since foo and is_foo aren't capitalized differently (unlike foo and isFoo in camelCase) [7]
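To illustrate the renaming bonus, here's a small sketch. rename_all is a hypothetical helper that mimics s/old_name/new_name/g, and the sample sources are invented for the example.

```typescript
// Rename every occurrence of old_name, like s/old_name/new_name/g.
// Assumption: old_name contains only letters, digits, and underscores,
// so it needs no regex escaping.
const rename_all = (source: string, old_name: string, new_name: string): string =>
  source.replace(new RegExp(old_name, "g"), new_name);

const snake_source = "let foo = 1; let is_foo = true;";
const snake_renamed = rename_all(snake_source, "foo", "bar");
// "let bar = 1; let is_bar = true;"

const camel_source = "let foo = 1; let isFoo = true;";
const camel_renamed = rename_all(camel_source, "foo", "bar");
// "let bar = 1; let isFoo = true;" (isFoo is missed because of its capital F)
```

With camelCase you would need a second pass for the capitalized form, which is exactly the extra complexity that case-aware rename tools exist to handle.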

Affordance of Capital Letters

Since capital letters are not used syntactically in snake_case, they can instead be used to convey semantics. In prose, capital letters are useful for indicating proper nouns. To me, namespaces and classes feel most like proper nouns so I like using capital letters as an affordance for those.

// `Order` is a namespace, `order` is a variable
let order = Order.new();

Python, like many languages, uses PascalCase for classes. It works fairly well, though I still don't like that capital letters convey both syntax and semantics.

As an alternative for new programming languages, I propose a snake_case variant that capitalizes the first letter of the first word. I call it Cobra_case since the first letter is reminiscent of how a cobra lifts its hooded head upright. Then, the capital letters in Cobra_case names serve as an affordance for namespaces or classes.

Lastly, SCREAMING_SNAKE_CASE can keep its affordance as a naming convention for constants.
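These variants are just as mechanical as plain snake_case. Here's a minimal sketch, assuming prose input with single spaces between words. The names to_cobra_case and to_screaming_snake_case are illustrative, and to_snake_case is repeated so the block stands alone.

```typescript
// Plain snake_case: every space becomes an underscore, then lowercase.
const to_snake_case = (prosey_name: string): string =>
  prosey_name.replace(/ /g, "_").toLowerCase();

// Cobra_case: snake_case with only the very first letter capitalized,
// used as an affordance for namespaces or classes.
const to_cobra_case = (prosey_name: string): string => {
  const snake = to_snake_case(prosey_name);
  return snake.charAt(0).toUpperCase() + snake.slice(1);
};

// SCREAMING_SNAKE_CASE: snake_case fully uppercased, used for constants.
const to_screaming_snake_case = (prosey_name: string): string =>
  to_snake_case(prosey_name).toUpperCase();

to_cobra_case("order line item"); // "Order_line_item"
to_screaming_snake_case("max retry count"); // "MAX_RETRY_COUNT"
```

In each variant, capitalization is decided once for the whole name, so it never competes with the job of separating words.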

Longer names are fine

One argument I hear against snake_case is that it's longer than camelCase. It's true that fetch_rss_feed_as_xml has four more characters than fetchRSSFeedAsXML, but why is that a bad thing?

What makes a name good or bad isn't its length, but what it conveys to the reader. If your variable name is long because you have Object or Factory in it, I bet you can rename it with domain-related words instead. Once you have a good descriptive name, tools like autocomplete, snippets, and GitHub Copilot will make writing it fast and easy.[8]

I also get that your formatter might complain about lines over a certain length. Ideally, you could just increase that length since modern editors don't have line length limitations. Otherwise you'll have to shorten your name, either by coming up with one that is just as good or by sacrificing some descriptive power.

Footnotes

  1. Anonymous functions don't need names, but I doubt there are meaningful programs where all the functions are anonymous.

  2. I like the aesthetics of kebab-case, but it is slightly harder to read than snake_case. Also, many programming languages do not support kebab-case for variable names. I do occasionally use kebab-case for filenames and URL path segments.

  3. The transposed letter effect makes some intuitive sense along the same lines as "word shapes", but its effects were overstated. I'd argue that my "word shapes" theory is substantially more likely since letter order matters, but like I said, probably best not to draw too many conclusions until the science lady sings.

  4. snake_case does exactly 1 replacement for every space whereas camelCase does 1 removal and 1 replacement (uppercasing first letter) for every space. Consequently, snake_case will have a lower edit distance than camelCase.

  5. Underscores are exceedingly rare in names so snake_case tends to be a lossless encoding of the prosey name.

  6. Acronyms in code should be familiar to you and your team, so lowercasing them doesn't hurt. If they aren't familiar, you'll want to use a more descriptive name in the first place.

  7. There are tools like vim abolish that will handle renames across different cases, but those are inherently more complex.

  8. I recommend that you keymap Shift + Space to insert an underscore so that snake_case becomes as convenient to write as normal prose.