CamelCase vs. underscores revisited (2013)
layer8 11 days ago [-]
I'm slightly conflicted about CamelCase. On the one hand it can be somewhat harder to read, on the other hand it does have a number of benefits:

- Typing "ABC" Ctrl+Space gives me code completion for AsynchronousByteChannel (and the like), whereas "abc" Ctrl+Space gives me code completion only for what strictly starts with "abc". I.e. uppercase means "words starting with that letter". Of course you could have something equivalent for non-CamelCase naming conventions, but for CamelCase it is a very natural fit.

- It provides the two variants lowerCamelCase and UpperCamelCase (often used for variable names vs. type names), whereas lower-kebab-case & Upper-Kebab-Case or lower_snake_case & Upper_Snake_Case seem more awkward in comparison (more keystrokes for the "upper" variant).

- When lowerCamelCase is used for function names, the case distinction often maps nicely to verb + noun (`createFooBar()`, `validateBazQuz()`).

- You still have underscore available to occasionally "index" a name (`transmogrifyFooBar()`, `transmogrifyFooBar_unchecked()`) unambiguously.

- The fact that CamelCase does not match normal English spelling has the advantage that it can't clash with it. CamelCase can stand both for hyphenated-compound-words as well as for open compound words (or just a phrase), whereas other naming conventions may look like the one although the words really stand for the other.

- Minor advantage: You can fit slightly more words into a line.

Apart from that, I'm pro-kebab-case.

Biganon 11 days ago [-]
Does kebab-case even work, generally speaking? Does it not clash with the subtraction operator?
grose 11 days ago [-]
Kebab-case is used in languages that don't have subtraction operators in arbitrary locations (in the sense of C), like Lisp and CSS.
ogoffart 10 days ago [-]
The language I designed for Slint uses kebab-case, so it mandates spaces around the minus sign. (If the lookup for `foo-bar` fails, it tries with `foo`, and if that succeeds, the error message suggests the use of spaces.)
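A rough sketch of that fallback lookup, in Python rather than in the Slint compiler's own code. The function name `lookup_hint` and the message wording are made up for illustration; only the idea (retry with the part before the first hyphen, then suggest spaces) comes from the comment above.

```python
from typing import Optional


def lookup_hint(name: str, known_names: set[str]) -> Optional[str]:
    """Return None if the name resolves, otherwise an error message.

    Hypothetical helper: if a kebab-case name is unknown but the part
    before the first hyphen is known, the message suggests that the
    author probably meant a subtraction and should add spaces.
    """
    if name in known_names:
        return None
    head, dash, tail = name.partition("-")
    if dash and tail and head in known_names:
        return f"unknown identifier '{name}'; did you mean '{head} - {tail}'?"
    return f"unknown identifier '{name}'"
```

With `foo` and `bar` in scope, `lookup_hint("foo-bar", {"foo", "bar"})` returns the message with the spaced suggestion, while a name whose first part is also unknown gets the plain error.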
vvillena 10 days ago [-]
Like most esoteric language features in existence, Raku allows for it. Probably easier to support, as variables are prefixed with sigils.
lizmat 10 days ago [-]
The rule for kebab-case (regardless of whether or not it has a sigil) in Raku is:

- each part of the identifier must start with an alpha character

- cannot end with -

So, "foo-bar" would be legal, "foo-" would not, "foo-4" would be considered "foo" - 4. If you want to subtract two sigilless identifiers, you *must* have whitespace between them. So "foo - bar" instead of "foo-bar".
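To make the rule concrete, here is a small Python tokenizer sketch that follows the two bullet points above. It is not Raku's actual grammar: the pattern names `PART` and `TOKEN` and the three token kinds are assumptions chosen for illustration, and the underscore is treated as a letter.

```python
import re

# Hypothetical token pattern modelled on the rule described above:
# an identifier is one or more parts joined by "-", and every part
# must start with a letter (or underscore).
PART = r"[A-Za-z_][A-Za-z0-9_]*"
TOKEN = re.compile(
    rf"\s*(?:(?P<ident>{PART}(?:-{PART})*)|(?P<number>\d+)|(?P<minus>-))"
)


def tokenize(source: str) -> list[tuple[str, str]]:
    """Split source into (kind, text) pairs; raise on unknown input."""
    tokens = []
    pos = 0
    source = source.rstrip()
    while pos < len(source):
        match = TOKEN.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        kind = match.lastgroup  # name of the alternative that matched
        tokens.append((kind, match.group(kind)))
        pos = match.end()
    return tokens
```

Under these assumptions `tokenize("foo-bar")` yields a single identifier, while `tokenize("foo-4")` and `tokenize("foo - bar")` each yield three tokens with a minus in the middle. A trailing hyphen as in "foo-" is split off as a separate minus token, which a parser would then reject.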

m463 11 days ago [-]
> typing "ABC" Ctrl+Space gives me code completion for AsynchronousByteChannel

that's clever. I suspect you could add editor logic to do that with underscores too.

pocketarc 11 days ago [-]
IDEA does something similar - you just type, and as long as the characters are found in the string in the order you typed them, it doesn't matter what's separating them, it will find a match.

So ABC would match AsynchronousByteChannel, and abc would match asynchronous_byte_channel. As would asyncbc or whatever else your brain comes up with.

It's quite nice.
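The matching described here (characters found in the order typed, no matter what separates them) is a plain subsequence test. A minimal Python sketch follows; the function name is made up and this is not IDEA's implementation.

```python
def is_subsequence_match(query: str, candidate: str) -> bool:
    """True if the characters of query appear in candidate in order, ignoring case."""
    remaining = iter(candidate.lower())
    # "ch in remaining" advances the iterator past the first occurrence
    # of ch, so each later character must be found further to the right.
    return all(ch in remaining for ch in query.lower())
```

This accepts `ABC` for `AsynchronousByteChannel`, and both `abc` and `asyncbc` for `asynchronous_byte_channel`. It is also quite permissive, since any in-order scattering of letters counts as a match.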

moring 10 days ago [-]
I think IDEA actually splits a CamelCase identifier and expects your input to be a concatenation of prefixes of these parts, in the correct order, but parts may be omitted. Though the latest version goes further and also detects flipped characters, so maybe Levenshtein distance on top of that.
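A Python sketch of that stricter scheme, assuming the behaviour is as guessed in this comment: split the identifier into words, then require the query to be a concatenation of word prefixes in order, with words allowed to be skipped. The helper names and the splitting regex are illustrative, and the flipped-character handling is left out.

```python
import re


def split_words(identifier: str) -> list[str]:
    """Split camelCase, PascalCase, snake_case or kebab-case into lowercase words."""
    pattern = r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+"
    return [word.lower() for word in re.findall(pattern, identifier)]


def matches_word_prefixes(query: str, identifier: str) -> bool:
    """True if query is a concatenation of prefixes of the identifier's words."""
    words = split_words(identifier)
    query = query.lower()

    def match(q_pos: int, w_idx: int) -> bool:
        if q_pos == len(query):
            return True   # whole query consumed
        if w_idx == len(words):
            return False  # query left over but no words remain
        word = words[w_idx]
        # Try every non-empty prefix of this word that agrees with the query...
        length = 0
        while (length < len(word) and q_pos + length < len(query)
               and word[length] == query[q_pos + length]):
            length += 1
            if match(q_pos + length, w_idx + 1):
                return True
        # ...or skip this word entirely.
        return match(q_pos, w_idx + 1)

    return match(0, 0)
```

Here `ABC`, `AC` and `asyncbc` all match `AsynchronousByteChannel`, but `syn` does not, because no word starts with "s". A pure in-order character search would have accepted it.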
brabel 10 days ago [-]
On Emacs, this is called "fuzzy" search and is easy to enable everywhere (autocompletion, file search, text etc).
xlii 10 days ago [-]
And just yesterday I started toying with “orderless” Emacs package which allows that (and more!) - both for underscores and camel case.

Since such features often cross environments I wouldn’t be surprised if other editors/IDEs had similar stuff.

forgotmypw17 11 days ago [-]
- Typing CamelCase is also easier on the pinkies.
int_19h 11 days ago [-]
While I find snake_case more readable when looking at the identifier alone, it has the downside that _ is mostly whitespace with the glyph at the bottom of the character cell, and so it visually groups more with punctuation like "." than with letters. Which wouldn't be so bad if "." wasn't so common in property / method call chains, so you end up with stuff like:

   foo.bar_baz.blah()

being difficult to parse when quickly skimming through code. OTOH kebab-case solves this nicely because "-" is not in the same place as "." in the character cell, and being in the middle, groups more naturally with letters:

   foo.bar-baz.blah()

Then again, maybe the established formatting conventions for member access are just suboptimal? Suppose that we put a space before every "."; then:

   foo .bar_baz .blah()
kelnos 11 days ago [-]
kebab-case is fine in strings, but many languages will parse that as

    BinaryExpression(Identifier("kebab"), Identifier("case"), Operator.MINUS)
I admit there's personal preference and just quirks of vision involved, but I don't think I've ever had an issue distinguishing '_' and '.'. One thing you can do, which helps also with breaking up long chains of calls that would give you a really long line, is break it up with newlines, e.g.:

int_19h 11 days ago [-]
I believe that syntactic limitation to be a historical mistake that we keep doubling down on for no good reason. There's really no practical purpose to parse a-b as (a - b), as evidenced by popular style conventions for pretty much every language with a binary minus operator.
adrian_b 10 days ago [-]
COBOL has used the hyphen a.k.a. dash as the word separator in identifiers.

Then IBM PL/I (in 1964, when it was still named NPL) has replaced it with the underscore, to avoid the confusion with minus.

All the other programming languages that use underscores have taken this usage from PL/I.

Most LISP variants have continued to follow the COBOL practice, because they also have avoided the use of the minus sign as an operator.

Because ASCII includes only a single ambiguous HYPHEN/MINUS sign, as long as programs are restricted to be written in ASCII it is difficult to use the same sign both as an operator and as a word separator. If U+2212 were used for the minus sign and U+002D for hyphen, with a typeface that differentiates them, the hyphen could replace the underscore, but that would not change anything for typing as they would still be two distinct keys.

For easier typing, I use a Dvorak variant, where -/_ is on the home row. Whoever uses the underscore more frequently than the minus may swap which of them is obtained with shift and which without shift.

GoblinSlayer 10 days ago [-]
You can try to use U+2013 en dash (–) for hyphen.
adrian_b 10 days ago [-]
Proper hyphen is U+2010.

Hyphen, which is the correct symbol for separating words in a compound word, is much shorter than the en dash, which is used in other contexts, e.g. as a figure dash.

The en dash has about the same length as the minus sign, so they cannot be distinguished visually, which is the purpose of using different characters for the word separator and for the mathematical operator.

Vt71fcAqt7 11 days ago [-]
I think it's more about reducing cognitive load. I wouldn't want "+" to be used for anything that isn't adding, for example. Having to differentiate a-b from a - b sounds tedious and error prone.
int_19h 10 days ago [-]
There's precedent for this, though - 123.456 is lexed as one token in most languages, but foo.bar is three. It doesn't seem to be causing much issue in practice for readability because the context is sufficiently distinctive. Having / not having whitespace around the character would be even more distinctive, though, no?

FWIW I did write a fair bit of XPath (in XSLT context) and XQuery back in the day, which do allow kebab-case identifiers and use them for standard library names other than types, while also using minus as a binary operator. I don't recall ever consciously thinking about how to differentiate the two uses, not even when still learning the ropes coming from C++ and C#; it "just worked". Of course, this is anecdotal - it would be interesting to poll people with prior experience with kebab-case languages that have also been exposed to other styles regarding their overall preference and this specific ambiguity.

tirpen 10 days ago [-]
People usually don't mix up pointers in C (*foo) with multiplication (foo * bar) nor divisions (foo / bar) with comments ( // ).

So I don't really think people would have a problem with kebab case after getting used to it for a couple of minutes.

reuben364 11 days ago [-]
At most it's an error of 'a-b' not in scope. Unless you have dynamic scoping or somehow have a variable 'a-b' and also want to perform 'a - b'. I think the rule that binary operators must be separated by spaces is much easier to reason about than not having certain characters usable in identifiers.
afiori 10 days ago [-]
The only solution being polish notation for all operators
Someone 10 days ago [-]
That’s not the only solution. You can also use different characters. In fact, kebab–case should not use minuses, but dashes, like this:

  kebab-case   // three tokens: ‘kebab’ ‘minus’ ‘case’
  kebab–case   // single token ‘kebab n–dash case’
  kebab—case   // single token ‘kebab m–dash case’
acomjean 10 days ago [-]
A great solution for messing with people. Much like the “smart quotes”
andreareina 10 days ago [-]
I'll admit to not putting space around binary operators when I write code, counting on format-on-save to fix it :)
int_19h 10 days ago [-]
You'll have to put one extra Space before each "-" that is an operator. But you also won't have to reach for Shift every time you had to type "_", which is usually much more common. So I think even in this case you might end up typing code slightly faster on average!
rollcat 10 days ago [-]
I'm a fan of that idea; I would love to see other common operators (such as vertical pipe) reclaimed for expressing operations way more common in the past decade than bit twiddling (e.g. iterator/filter chaining, like in shell pipelines).
Jaxan 10 days ago [-]
This begs the question why we don’t allow spaces in identifiers. (We do allow them in file names for example.)
int_19h 10 days ago [-]
It complicates parsing significantly unless you avoid keywords altogether (i.e. all syntax is punctuation) or use some form of stropping as early PLs did.
layer8 11 days ago [-]
Also, when using a proportional font it gets worse, because underscores become wider (significantly wider than a normal space) and dots become narrower, suggesting the wrong grouping in syntax like `foo.bar_baz.blah()` (the words in "foo.bar" are closer together than in "bar_baz"), or just when having a snake_case_identifier in normal text ("a snake" is closer together than "snake_case", etc.). This problem is less pronounced with kebab-case.

Underscores can also clash with hyperlink underlines, depending on the font. I've seen documentation with linked identifiers where that was confusing.

ripe 10 days ago [-]
This is a really good point. Not sure if the study included this case.

Yes, I do like kebab-case. Common Lisp!

citrin_ru 10 days ago [-]
Lua uses : for a method call and IMHO it is easier to read than with a dot.
rollcat 10 days ago [-]
I think it's the most underrated "wart" of Lua. It's not strictly speaking a method call operator; it's a shorthand used to provide the LHS object (a table) as the first argument in a function call (where the function is looked up from that table).

So if you've implemented some form of inheritance (usually via metatables), you can trivially up/down-cast by using the dot notation and specifying the desired class/prototype for the method, and some instance/object as the first argument.

(Speaking of up/down-casting, classes, and instances in Lua is a bit awkward. I have a very quick & dirty "class.lua" file that I copy&paste between projects, I wish something like that was just a part of the stdlib.)

camgunz 10 days ago [-]
Yeah will +1 this. If you mess it up it's pretty hard to find where you did it. ':' is very close to '.'.
inDigiNeous 11 days ago [-]
Good to see some scientific studies on the easier_readability_of_snake_case, versusComparedToCamelCaseBouncingUpAndDown.

I tried to convince my co-workers toTransitionFromCamelCase to the world_of_easy_reading_snake_case, but alas, the codebase alreadyUsingCamelCase won.

Maybe it's the idea that "shorter == better", or whatever, but if I could choose, I would use snake_case_everywhere man.

brookst 11 days ago [-]
I am pro-snake myself, but those that use double underscores, especially in a meaningful way, are drilling holes in our boat.
stnmtn 10 days ago [-]
I don't hate leading underscores on _private_function or _class_variable; in fact I think it's a great way to semantically signal intent.

The double underscore for __super_private_function doesn't make much sense to me though. I can't see a reason why this shouldn't just be a single underscore, especially when __double_underscore__ signals very important Python internals.

clawlor 10 days ago [-]
The leading double underscore functions differently from a leading single underscore in this case:

> Any identifier of the form __spam (at least two leading underscores, at most one trailing underscore) is textually replaced with _classname__spam, where classname is the current class name with leading underscore(s) stripped.
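A short Python illustration of the quoted rule. The class `Account` and its attribute names are invented for the example; the mangling behaviour itself is what the quoted documentation describes.

```python
class Account:
    def __init__(self):
        self._balance = 10   # single underscore: a convention only
        self.__secret = 42   # double underscore: stored as _Account__secret

    def reveal(self):
        # Inside the class body, __secret is rewritten to _Account__secret.
        return self.__secret


account = Account()
# The instance dictionary shows the mangled name, not "__secret".
mangled_names = sorted(vars(account))
```

So the double underscore does not make the attribute inaccessible: `account._Account__secret` still works from outside. What it mainly buys you is that a subclass defining its own `__secret` will not clobber this one.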


spookthesunset 11 days ago [-]
What is a double underscore?

jayknight 11 days ago [-]
Yep, or "dunder" as it's often called in the python world.
eddsh1994 11 days ago [-]


Etc, in the land of Python

goodoldneon 10 days ago [-]
I wish Python enforced private attributes. I once worked in a codebase where this was roughly true:

brookst 10 days ago [-]
I mean doesn’t it make you want to make ___________super_ultra_private, with an accessor that corrupts the filesystem and hangs if you so much as look at it?
goodoldneon 10 days ago [-]
That sounds good, but we might need a class attribute that disables filesystem corruption in case we need to access `___________super_ultra_private` somewhere
stevula 11 days ago [-]
I do prefer snakecase visually but having to press shift for the underscore is a pain and it’s also quite far from the home row on the keyboard.

I wonder if a lot of people rebind the underscore character to a more convenient key?

simonblack 11 days ago [-]
> but having to press shift for the underscore is a pain

But having to press shift for a capitalised word is not a pain. OK, sounds about right.

irrational 11 days ago [-]
For me it is easier/faster to type uppercase letters than an underscore. But, I have small hands.
kitsunesoba 11 days ago [-]
Same. I'm noticeably faster with camelCase.
timbit42 11 days ago [-]
That's why kebab-case-is-best.
shrimp_emoji 11 days ago [-]
This haunts me. It seems almost as good as snek_case but doesn't require a Shift. If we're not Shifting the letters, why are we shifting the -s? My OCD. (Or my autism?) Help.

Another consideration: whether the program you're typing in will stop at underscores or hyphens when using Ctrl + Shift + Left/Right Arrow when highlighting. If I want to highlight `this_variable_with_a_long_name` without using the mouse, it's going to be a pain if I have to hit the left arrow key six times instead of once. (Frustratingly, it varies from editor to editor.)

eyelidlessness 11 days ago [-]
The reason we (most of us) can’t have kebab-case is the same reason it mostly occurs in lisps and other declarative syntaxes: infix operators. ie if you can have identifiers or keywords like-this and you can also have subtraction like-this, you have some serious syntax ambiguities to deal with.

As far as keyboard navigation around partial or whole words, I find it really helpful to have separate keybindings. For me (on a Mac so keyboard layouts matter), opt+left/right moves a “whole word” (as in to the nearest non-word character) and ctrl+left/right moves to the nearest internal word boundary (where boundary is mostly arbitrary; in bouncyCase it moves to the nearest non contiguous case change, in any-punctuated_case it moves to the nearest punctuation).

Agree it’s frustrating how much this varies by editor, but being able to configure the behavior is one of my first checks when deciding if I’ll use the editor at all.

askvictor 11 days ago [-]
Because many languages don't require whitespace between symbols (e.g. x) and operators (e.g. +). So in Python, x+3 is valid (x plus three). As is x+y. And x-y. So how does the interpreter/compiler know if coffee-bean is a single variable, or an operation (coffee - bean) between two variables?
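Python's own parser can show how it answers that question. The sketch below uses the standard `ast` module; the helper names are made up for the example.

```python
import ast


def parse_expression(source: str) -> ast.expr:
    """Parse a single Python expression and return its AST node."""
    return ast.parse(source, mode="eval").body


def is_valid_statement(source: str) -> bool:
    """True if Python accepts the source as a statement."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


# "coffee-bean" is read as a subtraction of two names, never as one name.
node = parse_expression("coffee-bean")
```

`node` comes back as an `ast.BinOp` with an `ast.Sub` operator between the names `coffee` and `bean`, and `is_valid_statement("coffee-bean = 1")` is false because an expression cannot be an assignment target.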
brookst 11 days ago [-]
Sounds like job security!

  x-y = 'a string or something'
  # bunch of lines of code here
  x = 5
  y = 2
  assert x-y == 3
fiddlerwoaroof 11 days ago [-]
This might be configurable: vim and emacs, at least, let you define which characters are part of a word.
inDigiNeous 8 days ago [-]
Well, this-case-has-the-problem-of-mixing-with-the-letters-more-easily. When you type snake_case it is clearly not in the line of the letters.

But yeah kebab-case-might-be-easier than camelCaseToRead in the long run.

simonblack 11 days ago [-]
kebab-case is all very well as long as a hyphen is not treated as the 'minus sign'

Many years ago, I was programming in COBOL, where kebab-case is the norm. For a while there I was programming in both "C" AND in COBOL. That was a nightmare, I invariably used the wrong syntax for my variable-names, as both languages used incompatible syntax for those variable-names.

bostonvaulter2 11 days ago [-]
I've bound underscore to a top-level key in my default keymap. It's made a big difference to me!
snozolli 11 days ago [-]
You skipped the important half of the sentence:

> and it’s also quite far from the home row on the keyboard.

I'm not going to adopt a coding style that requires me to reach that far repeatedly for every name.

layer8 11 days ago [-]
The position of the -/_ key is a bit awkward relative to the right-hand shift key in the US layout.
stevula 11 days ago [-]
Ha! You got me there.
hgs3 11 days ago [-]
> I do prefer snakecase visually but having to press shift for the underscore is a pain

Code is written once but read many times. Better to optimize for reading than writing.

eklitzke 11 days ago [-]
Obviously you should swap the bindings for _ and - (so you press shift for dash instead of the other way around).

If you're really perverse you can do the same for 9/( and 0/), because most programmers type parens way more often than they type 9 or 0.

smcameron 11 days ago [-]
Remember those keyboards with a split space bar? I don't know what they were meant for, but I always thought it would be cool to map one of them to underscore. Never had such a keyboard myself though.
gpderetta 11 days ago [-]
SAI_Peregrinus 11 days ago [-]
eyelidlessness 11 days ago [-]
Works great on Macs, and the default layout makes it an easier reach than underscores!
SAI_Peregrinus 10 days ago [-]
Those are "em" dashes, not hyphens. Shift+Option+- is harder than Shift+-.
stevefolta 11 days ago [-]
Indeed. I wouldn't have a strong preference between snake_case and camelCase, but I use snake_case for new code because it's closer to kebab-case.
dahfizz 11 days ago [-]
This makes me feel uncomfortable
bbarnett 11 days ago [-]
You and me both, brother.

nlnn 11 days ago [-]
I thought about this, but realised that I don't spend that much time typing when coding.

Maybe when defining a new variable/class/etc. the first time, then IDE autocomplete fills it in the next.

bitwize 11 days ago [-]
Having to press shift shouldn't matter much except when you are introducing a new identifier because your autocomplete should kick in after you type the first couple of characters. (You are using an IDE, right?)
gsinclair 10 days ago [-]
> I wonder if a lot of people rebind the underscore character to a more convenient key?

I create bindings using Karabiner (Mac) for all symbols so that I don’t have to press Shift or leave the home position.

Underscore is “s comma”.

cassepipe 10 days ago [-]
I personally like the underscore because it is always the same combination while Caps are combinations all over the place and you have to choose the most convenient shift key, left or right.
8jy89hui 11 days ago [-]
I bound _ to capslock + u. It is very convenient for both hands.
Pxtl 11 days ago [-]
My dream is a programmer's keyboard where part of the spacebar is underscore. Also add A-F to the 10key numerical keypad.
taeric 11 days ago [-]
I do find your straw man for the camel case amusing. Specifically, "versus compared to" doesn't parse well and hurts my ability to read it.

That said, data in this study is interesting for being data and a replication. I don't know that I've given much thought to liking one style over the other in a long time, neat to see the impact it can have.

Directly to your last point, I do greatly prefer single word names, if they can be used. So, `candidates` over `candidateList` or `candidate_list`.

nashashmi 11 days ago [-]
Definitely shorter is better especially when the names are incredibly long. Also helps in narrow word wrapped screens.
wvenable 11 days ago [-]
I wonder if novelty matters. This "versusComparedToCamelCaseBouncingUpAndDown" is certainly harder to read but I also found it harder to read than all my CamelCase identifiers in code.

For me OriginalAmount or AdjustmentFees are just one word each.

hypertele-Xii 10 days ago [-]
An underscore is a word-separating space that selects as part of the sentence.

If you use spaces in your normal writing, there's no reason you shouldn't use spaces in programming - except that a space prevents you from easily selecting the whole phrase, a very common operation programmers face.

So they replaced the space with a character that is effectively identical, but works better.

Some people figured they could just remove it. WellTellMeThis,DoesItMakeMyTextMoreReadableToYou? If you argue that camelcase is better, then shouldn't we adopt it into normal writing too? Why not?

dgb23 10 days ago [-]
I like the notion of discriminating the general kind that names refer to with different separation styles. I think the most common are:

- camelCase

- PascalCase

- snake_case

- kebab-case

- |sentence case|

I think the last one can be used in CL, but not 100% sure.

Some examples:

- Clojure prefers kebab-case mostly, but some types and definitions use PascalCase to imply that there is a more direct mapping to a Java class. Clojure prefers to use namespaces to name things that belong together over long, prefixed names. Single letter names are common for function parameters such as 'm' for maps, 's' for strings and so on.

- Go prefers camelCase for private/local types, fields, and functions. When PascalCase is used, then the name is exported. Go generally prefers short, often abbreviated names (Unix/C heritage).

- Rust prefers PascalCase for type definitions, snake_case for function names, locals and parameter names.

- Both in Go and Rust, packages/modules/crates give additional naming structure but also control visibility. So those things have multiple jobs.

So generally the-wayWe_Type/And-NameThings in different languages can give us a bit more structure and meaning without cluttering the syntax with additional keywords. A big side-benefit is that we then don't need to decide which style to use! It's better to have a uniform style than your personally preferred one IMO.
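Since the styles listed above differ only in how the words are joined, moving a name between them is mechanical. A Python sketch, with helper names and a splitting regex that are assumptions for illustration (acronym handling in particular is a judgment call):

```python
import re


def split_words(identifier: str) -> list[str]:
    """Split camelCase, PascalCase, snake_case or kebab-case into lowercase words."""
    pattern = r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+"
    return [word.lower() for word in re.findall(pattern, identifier)]


def to_snake(identifier: str) -> str:
    return "_".join(split_words(identifier))


def to_kebab(identifier: str) -> str:
    return "-".join(split_words(identifier))


def to_pascal(identifier: str) -> str:
    return "".join(word.capitalize() for word in split_words(identifier))


def to_camel(identifier: str) -> str:
    words = split_words(identifier)
    if not words:
        return ""
    # First word stays lowercase, the rest are capitalized.
    return words[0] + "".join(word.capitalize() for word in words[1:])
```

For example, `to_snake("candidateList")` gives `candidate_list` and `to_pascal("candidate-list")` gives `CandidateList`. The round trip loses information for acronyms: `HTTPServer` becomes `http_server` and comes back as `HttpServer`.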

atombender 10 days ago [-]
I never understood why some languages like Rust are so inconsistent here: snake_case for non-type identifiers, but PascalCase rather than Snake_Pascal_Case for types.

Maybe OP's study is real, but the underscore has always bothered me. I'm fine with it in principle, but it creates visual noise, since pretty much all fonts put the underscore below the baseline.

Here's a mockup that fixes the vertical placement of the underscore (right hand side):

Much better!

toastal 10 days ago [-]
If you think that last bit of wacky casing combinations is offensive, wait until you see BEM-style CSS selectors (not the philosophy or separation of components, just the selectors themselves). Basically any other naming or separator selection would have been better.
kagevf 10 days ago [-]
> I think the last one can be used in CL, but not 100% sure.

You can create a symbol that way:

  (let ((|my symbol| 123))
    |my symbol|)

  => 123
__ryan__ 10 days ago [-]
There’s plenty of things that are done in code that we don’t adopt in prose and vice versa. You could just as easily make the case for adopting underscores in place of spaces in prose to increase readability, but we haven’t done that.

In code, you have to balance readability in many contexts.

Snake case and kebab case look more readable in isolation, but do they look better when used as part of a larger expression, for example part of a chain of member access and method calls and argument passing? I don’t think the answer is obvious, and it probably depends on other formatting affordances (e.g. breaking the expression up on multiple lines).

I’m on mobile and I don’t trust HN’s formatting to do any justice with examples, but it might be worth trying it out in your code editor.

TomSwirly 10 days ago [-]


__ryan__ 10 days ago [-]
I’m specifically saying that prose/text and code are different contexts that don’t necessarily benefit from the same styles of writing.

What I was saying is that just because spaces in prose naturally improve visibility does not _necessarily_ mean that underscores and dashes improve overall readability across many contexts _in real code_, because code and prose are different. It could be the case, but it’s not the obvious innate property people are suggesting it is.

If it is such an obvious innate improvement in readability, then why don’t we replace spaces with underscores in prose and handwriting? It would remove any ambiguity between intentional separation and awkward unintentional spacing or kerning. But we don’t and haven’t done that for some reason, so there must be something else at play.

Your examples (and many examples in this thread) of long sentences in these formats are not what code actually looks like.

Real code has meaningful symbols and syntax that occur between identifiers that carry semantic meaning. Maybe the lack of symbols in identifiers in camel and pascal case make it easier to identify these other symbols and syntactic elements, so you end up with better overall readability. Maybe adding to that the flexibility of using camel and pascal and upper snake case for different "types" of identifiers improves mental mapping of code concepts that you'd lose if you always used snake case.

Again, I'm only making the argument that readability in code and readability in prose are two totally different things, and the effects of different casing and different identifier naming schemes are likely more subtle than which one separates the words in an identifier most clearly.

Clamchop 10 days ago [-]
Is it not natural or are you just unpracticed at it?
tgv 10 days ago [-]
> So they replaced the space with a character that is effectively identical

I don't think that's the reason. The reason was ease of parsing. It's harder to write a token recognizer if spaces are allowed in identifiers and distinguish them from type names and reserved words.

ALGOL-68 famously did allow spaces in identifiers, e.g.

jfk13 10 days ago [-]
I don't think the issue with spaces is primarily about selection; it's about parsing.
hypertele-Xii 8 days ago [-]
When I write code I'm not thinking about parsing. When I write code, I'm thinking about writing code. Underscores make coding easier (for improved selection behavior). Ergo, the issue with spaces is primarily about selection (for me).
ulizzle 10 days ago [-]
Because "normal" writing (prose) isn't code.

The medium dictates the delivery.

hypertele-Xii 8 days ago [-]
Normal writing and code are both text to be read by humans. They are the same medium with regards to readability. Our brains use the same circuits we developed to read normal writing to read code. I haven't seen any evidence to the contrary.
paulcole 10 days ago [-]
CamelCase doesn’t need to be more readable, it just needs to be readable enough. And when you keep names short, it is.
nmz 10 days ago [-]
3 wordsIsTheLimit, afterThat, its_better_to_use_underscores.
tlb 10 days ago [-]
I hacked my editor to make _ look like a small hyphen, and - look like an extra-large em-dash. Then I can work with snake_case underneath, with the pleasing aesthetic of kebab-case. And (for the robot control stuff I work on), negation is an extremely significant operator so I like having it stand out.

Screenshots at

This is for a custom language & IDE, but someday I'll get around to making VSCode do the same.

gpderetta 10 days ago [-]
emacs has glasses-mode to switch visually between snake case and camel case. I'm sure it should be easy to hack to support kebab-case in... case it doesn't already.
jjice 11 days ago [-]
Whatever the standard is for the language. Unless we're in a language without a standard (cough cough PHP cough cough). I work in a PHP codebase that mixes snake_case and camelCase at seemingly random intervals. Worst part is that this codebase started only a few years ago.

Snake is my preferred. The Python/Rust use of snake case and Pascal case for their respective purposes is my favorite.

tredre3 11 days ago [-]
> Unless we're in a language without a standard (cough cough PHP cough cough)

Frameworks, IDEs, and linters all follow PSR-2.

It is indeed common for new developers to not know or care about it, but most professional shops I've been to adhere to it unless they're working on very legacy code.

jjice 10 days ago [-]
Huh, thanks for the link. I swear that when I read that document (or a similar PSR) about a year ago, it specified that it didn't take a stance for method names (or maybe regular function names). I wonder why I thought that or if I read it somewhere else.

Thanks for that.

doutunlined 11 days ago [-]
No linter? Seems like a quick thing to standardize
jjice 10 days ago [-]
I actually did set up our linter but can't enforce the casing since so much of the codebase is all over the place and leadership didn't want to take any risk with an automatic conversion :shrug:.
asiachick 11 days ago [-]
I'm more of a camelCase person than a snake_case, probably because I've been writing lots of JavaScript for the last 10 years. But, .... You could argue that camelCase is culturally insensitive given that plenty of languages (Japanese, Chinese, Korean, Hindi, Sanskrit, etc) have no concept of case.

Many modern computer languages (JavaScript, Rust, Swift, ...) allow non-ascii identifiers so if you pick camelCase, then someone writing Japanese, Chinese, Korean has no way to obey. That doesn't mean snake_case would be all that better in those languages though.

    var 画面_幅 = ...;
    var 窓_縦 = ...;
The point is, both camelCase and snake_case are a thing based around western languages.
wodenokoto 11 days ago [-]
A non-programming colleague asked me why I use so many underscores “wouldn’t it be easier to type a dash if you can’t use a space?”

Why yes. Kebab case would be so much better.

Or dots, like R uses.

Neither of those is compatible with most languages, but they are better options for current keyboards.

tzs 11 days ago [-]
Offhand I can't think of a good reason why many of the popular languages couldn't allow space. Most either don't have any places where you can have a keyword adjacent to a variable, or would just have to not allow variables containing space if the first word in the name matches a keyword.
thefaux 11 days ago [-]
It is significantly easier to write a parser if you assume names don't have spaces in them. I also tend to think that names with spaces would be confusing to read as well: you'd end up with statements like this `x = foo bar baz (buzz)`. Certainly not unparseable, but I don't think the convenience is worth giving up space as an almost universal token separator.
collyw 10 days ago [-]
I imagine it's a lot harder to read as well - as in read and parse in your head
CoolGuySteve 11 days ago [-]
Everywhere I work seems to settle on CamelCase but reading/writing it is annoying because: sSkKcClIwWxXzZvVoO0

Goldman Sachs' Slang language allowed spaces in tokens which was actually much better but I guess your grammar has to support that from the ground up.

dahfizz 11 days ago [-]
Unicode has lots of space characters, maybe we could choose one to be a space that is legal in identifiers.
layer8 11 days ago [-]
Significant syntax shouldn't be invisible to the human reader. If a parser can't parse it without special space characters, then a human will also tend to have trouble parsing it.
gpderetta 10 days ago [-]
case in point: Makefile tab vs spaces are a pain.
pavlov 10 days ago [-]
Somehow I can't believe that it's 2023 and "word breaks within identifiers" remains such a fundamental issue in programming. Why are we stuck treating programs as structureless character sequences? We don't need to enter them on punch cards or Teletypes anymore.

It's like if file systems were forever stuck on FAT and everybody was quietly resigned to the idea that you can simply never, ever have more than 8+3 characters in a file name. "That's just how computers work. Anyway, I'll share you the latest budget spreadsheet on Google Docs, it's called BDG2023A.GDC"

iainmerrick 10 days ago [-]
> Why are we stuck treating programs as structureless character sequences?

The converse question is “why aren’t we all doing visual programming?”

I think the answer is that plain text actually works out better for most purposes. It’s tool-agnostic, it’s reasonably easy to manage in source control, it’s reasonably easy to repair by hand when things get broken, it’s reasonably easy to maintain as tools get upgraded.

Yes, it sucks in many ways. Spaces in filenames messing up shell scripts is my own pet peeve.

On balance I think it’s good that you can write programs in a generic raw text editor, with things like syntax highlighting, formatting and completion being very-nice-to-have added extras. If you couldn’t so much as double-click to select an identifier without tool assistance, that would be pretty annoying when it (inevitably) sometimes goes wrong.

pavlov 10 days ago [-]
> "The converse question is 'why aren’t we all doing visual programming?'"

IMO this is not the right framing because it suggests a dichotomy where we must choose between a CLI or some kind of GUI node spaghetti editor.

It's perfectly possible to add structure to data and still edit it on a CLI. After all that's how file systems work. We don't address our data by raw sectors on disk; instead there are layers of data structures on top, and they're designed to be reasonably easy to use via CLI through concepts like directories and files (rather than having to poke raw inodes for example).

The problem is that a tree of directories and files is clearly not a very good abstraction for structuring programs, but there hasn't been a lot of experimentation on how we could layer a more suitable data structure on top. Smalltalk had a completely different (and incompatible) approach with its image model. It kind of feels like everyone in the PC space looked at Smalltalk, said "oh that's not going to work either", and then gave up on trying to improve program structure.

iainmerrick 10 days ago [-]
The original question was around spaces in identifier names; let me describe how I imagined that would go and you can pick holes in it:

- on disk, use (brackets around identifiers) so they parse

- but you don't want to have to keep typing those, so...

- have a smart IDE that manages identifier brackets for you

- display identifiers in special colours, fonts, etc

Maybe this is a slippery slope argument, but I think when you do that, you've taken a step towards visual programming, where the code is a special data structure and you can only feasibly interact with it via special tooling. I'm with the camp who think that gives you SmallTalk images or UML and it just doesn't work out.

An alternative would be not to put brackets around long identifiers, just have a smarter context-sensitive parser; but I think that's also a bad direction to go in. Putting lots of complexity into the parser (C++, Perl) is a bad idea and keeping the parser simple and regular is a good idea (Go, Python).


> The problem is that a tree of directories and files is clearly not a very good abstraction for structuring programs

Hmm, I disagree! It's not perfect, but I think it actually works pretty well. Files and folders map nicely onto modules and packages, and those are a reasonably effective way of organising the code for a big project.

pavlov 10 days ago [-]
I haven’t thought about this deeply at all, but could there be a level of indirection similar to what the file system does with inodes vs. paths?

It would of course require IDE support to resolve the identifier names for display.

I know there are some recent systems that use content addressing for code. Each function or piece of program data is identified by its hash only, so they can be modified without breaking dependencies (editing a function creates a new version). This is basically the same idea as my inode-like suggestion but with deeper ramifications due to immutability.
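A minimal sketch of that indirection, under loud assumptions: `store`, `names`, and `put` are hypothetical, and hashing the raw source text is a simplification (a real system would more likely hash a normalized syntax tree). The point it illustrates is that the human-readable name is only a label pointing at a hash, so it could even contain spaces.

```python
import hashlib

store: dict[str, str] = {}   # hash -> source text; entries are never overwritten
names: dict[str, str] = {}   # display name -> hash (the inode-like indirection)

def put(name: str, source: str) -> str:
    """Store a definition under its content hash and point `name` at it."""
    digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
    store[digest] = source
    names[name] = digest
    return digest

# Editing a function creates a new version; the old one stays addressable.
v1 = put("area", "def f(r): return 3.14 * r * r")
v2 = put("area", "def f(r): return 3.14159 * r * r")

# A second display name (with a space) for the same content is just another label.
v3 = put("circle area", "def f(r): return 3.14159 * r * r")
```

Callers that recorded `v1` keep working after the edit, while `names["area"]` now resolves to `v2`.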

ihatepython 10 days ago [-]
Instead of using spaces in identifier names, how about using %20 in its place?

For example, the identifier "home address" becomes "home%20address"

Firadeoclus 10 days ago [-]
A "generic raw text editor" is a tool. The vast majority of text editors people use are actually quite complex, way beyond the complexity that would be needed to implement a "generic structured data editor". And the latter could make many very-nice-to-have features simpler to implement so it would very likely end up with less complexity than a full-featured "text editor".

The main problem to be solved is that structured data editors (and structured data query tools) are not ubiquitous, even though structured data formats are. Which really is a shame, considering the amount of time and effort sunk into issues (TFA, tabs vs. spaces, unmatched delimiters and other structural typos, syntax extensibility, etc.) that wouldn't even have to exist if we weren't dealing with raw text.

userbinator 11 days ago [-]
Why not both and neither? IMHO this is one of those things where people seem to congregate and advocate for one side religiously, but I don't see much value in that. Why can't the style simply imply how much scope and importance an identifier has? I've never liked naming dogma, but this is what I find to be natural:

    trivialidentifier - local variables inside a function, usually < 3 words/abbrs
    slightly_important_identifier - function names with limited scope
    ImportantIdentifier - widely-used functions, class/struct names
    More_Important_Identifier - classes/structs that are quite important
    VERY_IMPORTANT_IDENTIFIER - global constants and (rarely) classes/structs
kelnos 11 days ago [-]
That seems less understandable to me. I think it's easier to grok if certain classes of things always use the same casing. So all functions would use snake_case, type names would use PascalCase, global constants use UPPER_SNAKE_CASE, etc.

"Importance" is subjective, and regardless, IMO not really an important measure of anything.

rootusrootus 11 days ago [-]
That's sort of what Python's PEP8 recommends. CamelCase for class names, snake_case for the rest, and SNAKE_CASE for constants.
cpeterso 11 days ago [-]
And Ruby and Rust.
dotancohen 10 days ago [-]
Interestingly, PHP's PSR family specifically does _not_ take a stance on naming conventions for locally-scoped variables.
jjgreen 10 days ago [-]
Shame that the standard library doesn't follow those recommendations (hello unittest).
pavo-etc 10 days ago [-]
also startswith(), endswith()
layer8 11 days ago [-]
Because everyone will have a different opinion about the importance of any given name.
andreareina 10 days ago [-]
Hell I'll have differing opinions about the importance of the same name at different times of day.
__ryan__ 11 days ago [-]

  Why can't the style simply imply how much scope and importance an identifier has? 
Because lack of underscores doesn’t unanimously imply lack of importance or scope.
scns 10 days ago [-]
I call it CONSTANT_CASE and use it for constants.
layer8 10 days ago [-]
It’s commonly called ALL_CAPS (e.g.
pleb_nz 11 days ago [-]
A little confused. People seem to be mixing PascalCase and camelCase in the article and conversation.

Assuming as they're similar they're being used interchangeably in this context?

tmtvl 10 days ago [-]
Let's just call it BactrianCase or dromedaryCase.
yakubin 11 days ago [-]
Do you have a moment to talk about our lord and saviour kebab-case?
txbuck 11 days ago [-]
The fact that kebab-case support is a rarity constantly boggles my mind, nevermind that it isn't the de facto default for any language created after *sh/lisp. Readability, ease-of-typing, parallel with the way it's used in (Romantic) natural language. If I were writing a new language I intended to popularize this would be one of the features I would emphasize.

Spicy semi-snarky aside: if your counterpoint is that kebab-case prevents crushing your arithmetic operators together, I strongly suggest you either reconsider or never write any code you think may be read by another human being (and possibly yourself).

kibwen 11 days ago [-]
Make your language syntax require whitespace around arithmetic operators, and then we can finally live in a glorious paradise of kebab-case and `foo/bar` for namespaces.
txbuck 11 days ago [-]
For reference, something like this (Python regex):



  binary - operation
  binary + operation
  double - binary - operation
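The pattern itself is missing from this copy of the comment. As a rough stand-in (not the original regex), here is a minimal sketch of the rule being described: dashes inside a name belong to the identifier, and `+` or `-` only counts as a binary operator when it has whitespace on both sides. `IDENT`, `TOKEN`, and `tokens` are made-up names for illustration.

```python
import re

# Hypothetical pattern, assumed for illustration only.
# Identifier: starts with a letter, may contain single interior dashes.
IDENT = r"[A-Za-z][A-Za-z0-9]*(?:-[A-Za-z0-9]+)*"
# Operator: '+' or '-' with whitespace immediately before and after it.
TOKEN = re.compile(rf"(?P<ident>{IDENT})|(?<=\s)(?P<op>[+-])(?=\s)")

def tokens(source: str) -> list[tuple[str, str]]:
    """Return (kind, text) pairs for each identifier or spaced operator."""
    return [(m.lastgroup, m.group()) for m in TOKEN.finditer(source)]

print(tokens("double - binary - operation"))
print(tokens("double-binary - operation"))
```

This sketch silently skips anything it does not recognize (for example the dash in `a -b`); a real lexer would presumably report that as an error instead.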

int_19h 11 days ago [-]
It is (was?) the default for XSLT and XQuery, if we're talking about something relatively mainstream. Beyond that I can think of Dylan and REBOL.
capableweb 10 days ago [-]
CSS is probably the most mainstream language allowing dashes in names.
int_19h 10 days ago [-]
Depends on whether you count all the serialized data on the wire or not; if you do, it's probably XML. ~

But I think OP meant programming languages, not styling or markup.

gpderetta 10 days ago [-]
Everybody forgets about shell scripts! Bash allows dashes in executable names (how could it not?) and in function names, but not in variable names, at least not without a lot of effort.
Gibbon1 10 days ago [-]
My snarky suggestion is we switch to emojis for arithmetic operators. Semi-serious one: incorporate syntax highlighting via markup into the language.
Joker_vD 10 days ago [-]
I actually have a toy language that uses emojis for keywords, allows me to reduce tokenizing to basically "split by whitespace, split into runs of characters of the same class (every character is its own class except for [A-Za-z0-9_-] which make up a single class), then post-process tokens for finer distinctions". Strings are somewhat more painful, but using \q instead of \" to escape the double quotes helps.
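A minimal sketch of the first two steps described above (split by whitespace, then split into runs of same-class characters), assuming nothing about the actual toy language. `char_class` and `tokenize` are hypothetical names, and the post-processing and string handling are left out.

```python
import string
from itertools import groupby

# The one shared class: [A-Za-z0-9_-]
WORD_CHARS = set(string.ascii_letters + string.digits + "_-")

def char_class(ch: str) -> str:
    # Word characters share a class; every other character is its own class.
    return "word" if ch in WORD_CHARS else ch

def tokenize(source: str) -> list[str]:
    tokens = []
    for chunk in source.split():                        # split by whitespace
        for _, run in groupby(chunk, key=char_class):   # runs of one class
            tokens.append("".join(run))
    return tokens

print(tokenize("foo-bar := (x_1++y)"))
```

Because `-` is in the word class, `foo-bar` stays one token, and a repeated symbol such as `++` forms a single run while `:=` splits into two.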
kaba0 10 days ago [-]
I mean, not even a bad idea. I would honestly prefer operators to be more explicit like a-variableanother-varsomething-else. (EDIT: HN removed my emojis, but imagine a plus/minus emoji inbetween ids :( )

With a good font it would basically look similar to what a good IDE with syntax highlighting already does (different color for operators).

Lio 10 days ago [-]
I think this would be a bad idea because emoji are hard to type.

This might be a bit old school but I prefer things limited to plain ASCII. There's only so many keys on a keyboard and I want to be able to touch type everything without thinking or looking at a special emoji bar.

goerz 10 days ago [-]
With a proper keyboard layout, you can type a lot of unicode:
kaba0 10 days ago [-]
That I absolutely agree with, but I think it is about time we let go of the “programs as plain text” axiom. With good tooling/IDE — which are already sort of visual editors only converting to and from plain text on save — you could enter a ‘+’ and get autocomplete show the “emoji +” as an option for example. Because the actually important question to ask here in my opinion is “what is the easiest way to read/comprehend programs”, and we seem to have implicitly agreed that syntax highlighting is already a must (which works on even not yet valid program text!). Slightly more reliance on tooling doesn’t seem too bad to me.
gnulinux 11 days ago [-]
I think you're maybe just a little confused, or I'm missing something. Kebab case is in general not viable to implement unless you have a very special/quirky syntax, because there is no way to tell from syntax alone whether `a-b` means `Id("a-b")` or `Subtract(Id("a"), Id("b"))`. Now, there are of course options that introduce trade-offs.

First obvious option is to get rid of the infix "-" operator, which is what Lisp does. In lisp-like languages you don't write "a - b" instead you write "- a b", this way there is nothing to confuse "a-b" with.

Another option is to require a space between operators. E.g. you are not allowed to write "a+b" to mean "add a to b". You have to write "a + b". This is used in Agda programming language. This is very useful because then you can have identifiers like "a+b", or even identifiers like "a+[b+c-d]" etc... As long as any char doesn't have a special meaning (e.g. in Agda "(", ";", "," etc have special meanings) you can use it in an identifier. The trade-off is that, well now you're not allowed to condense arithmetic operations. This may or may not be a problem, depending on the programming language designer. When you said:

> I strongly suggest you either reconsider or never write any code you think may be read by another human being (and possibly yourself).

I'm guessing your opinion is that you're ok with this trade-off. Fact of the matter is that this a very fringe syntax for any programming language to have. As an Agda programmer, I like it, and it is useful, but I'm not convinced something like this would find mass appeal.

The last option I'm aware of is semantic differentiation. When you find a statement like "c = a-b" you need to ask two things. One, are there identifiers "a", "b" and "a-b"? If "a-b" exists and "a" or "b" doesn't, you're all set. If all three exist, the second question is: are "a" and "b" subtractable? If the answer is yes, the language designer can choose to prioritize "a - b" over the identifier "a-b". Alternatively, you can always choose to prioritize the identifier "a-b" as long as it exists. I'm personally not aware of any language that implements something like this; however, I have implemented toy languages that go through this, and it's pretty easy. It's a matter of making the decision to introduce this type of complexity into your language.
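A minimal sketch of that lookup, assuming the "identifier wins if it exists" policy and a flat scope of numbers. `resolve` and `scope` are hypothetical, and only a single dash is handled.

```python
def resolve(expr: str, scope: dict[str, float]) -> float:
    """Resolve a name that may contain one '-' against the names in scope.

    Assumed policy: an identifier that exists exactly as written wins;
    otherwise fall back to subtracting the two halves.
    """
    if expr in scope:
        return scope[expr]
    left, dash, right = expr.partition("-")
    if dash and left in scope and right in scope:
        return scope[left] - scope[right]
    raise NameError(f"cannot resolve {expr!r}")

print(resolve("a-b", {"a": 5, "b": 2}))              # only a and b exist
print(resolve("a-b", {"a": 5, "b": 2, "a-b": 100}))  # all three exist
```

The cost is visible even here: the meaning of `a-b` changes when someone later adds a variable named `a-b` to the scope.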

All in all, although I love kebab case, in order to have it in your language you need to make pretty significant trade-offs. Given this, I'm not surprised that no mainstream non-Lisp-like language has it.

txbuck 11 days ago [-]
(Hot damn, that was a super thorough/thoughtful reply in a short amount of time.)

You're spot on about the quirky syntax, but I don't think it's as serious a trade-off or addition in complexity (or even a change), given that:

- (IIRC) many style-guides/formatters already enforce spaces between binary operators and their operands (but especially identifiers) and in my super-subjectively-opinionated opinion you should already be doing that even without a formatter

- I don't feel particularly strongly one way or another about any other special characters like "+", so really in this case I'm only considering the dash

- Requiring the dash be between alpha/alphanumerics makes it play nice with unary operators

- The language would be terrible for code-golfing, but that's a relatively niche application I'd definitely consider worth spurning

thaumasiotes 11 days ago [-]
> First obvious option is to get rid of the infix "-" operator, which is what Lisp does. In lisp-like languages you don't write "a - b" instead you write "- a b", this way there is nothing to confuse "a-b" with.

This is a gross error. In your sense, Lisp does not even have operators, only identifiers. The reason there is no confusion between "(- a b)" and "(-ab)" is the spacing that separates the three identifiers in the first case.[1]

Your comment is especially weird because you go on to discuss Lisp's approach as being "an alternative option to what Lisp does".

[1] However, Lisp does have a potential problem with identifiers that begin with a hyphen, due to the need to support literal numeric values like -3. Thus the Common Lisp decrement function is named "1-" despite not returning the value (1 - operand).

cornstalks 11 days ago [-]
> Your comment is especially weird because you go on to discuss Lisp's approach as being "an alternative option to what Lisp does".

I assume you're talking about GP's third paragraph here. Assuming that's true, I think you've misinterpreted it: GP was talking about using infix operator notation, which is most certainly not what Lisp does.

thaumasiotes 11 days ago [-]
No, that's irrelevant.

Lisp will treat (a - b) as 5 tokens, just the same way it will treat (- a b) as 5 tokens. Infix operators are completely unrelated to this problem. What lisp is doing is determining tokens by reference to spacing (the parentheses don't need to be spaced; I believe they are reader macros but in any event they are special-cased) and then acting on the tokens. What C is doing is not that; the concept is that you eliminate all spacing before you decide what the tokens are.

So in C, there is no such thing as "a - b", only "a-b", and that's why "a-b" cannot be used as an identifier.

If you want to write your lisp in infix notation, you can, but it will remain true that (a - b) is a list with 3 elements and (a-b) is a list with 1 element, which is what matters here.
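A minimal sketch of that reading rule, not a real Lisp reader: parentheses are special-cased, and everything else is split on whitespace only. `read_list` is a hypothetical helper that handles one flat list.

```python
def read_list(text: str) -> list[str]:
    """Return the atoms inside one flat parenthesized list."""
    # Parentheses do not need surrounding spaces, so pad them first.
    spaced = text.replace("(", " ( ").replace(")", " ) ")
    tokens = spaced.split()
    if tokens[:1] != ["("] or tokens[-1:] != [")"]:
        raise ValueError("expected a single parenthesized list")
    return tokens[1:-1]

print(read_list("(a - b)"))   # three elements
print(read_list("(- a b)"))   # three elements
print(read_list("(a-b)"))     # one element
```

Prefix versus infix order makes no difference to the token count; only the spacing does.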

bmacho 10 days ago [-]
What GGP wrote:

> First obvious option is to get rid of the infix "-" operator, which is what Lisp does. In lisp-like languages you don't write "a - b" instead you write "- a b", this way there is nothing to confuse "a-b" with.

> Another option is to require a space between operators. E.g. you are not allowed to write "a+b" to mean "add a to b". You have to write "a + b". This is used in Agda programming language.

is kinda not good. Lisp allows "-" in names for the same reason as Agda: it tokenizes by spaces (correct me if I'm wrong). To one person this may seem like a gross error; to another, like an only-implied, who-cares error in a statement that is even true if taken word-by-word.

gnulinux 10 days ago [-]
This is true, I apologize for the error.
kaba0 10 days ago [-]
> As long as any char doesn't have a special meaning (e.g. in Agda "(", ";", "," etc have special meanings) you can use it in an identifier. The trade-off is that

I would say the trade off is variable names like “a+b” themselves. I fail to see any reason why I would want something like that; even in math, where the grammar is very hand-wavy to accommodate human parsing, you would be insane to write that (though to be fair, math does have its own share of problems with identifiers; enumerating all the letters in different alphabets is not a sustainable solution)

goerz 10 days ago [-]
I would totally want that! Specifically, `a+b = a + b` where `a + b` is an expensive operation. Julia allows stuff like that to some extent, e.g., but unfortunately not with `+`
kaba0 10 days ago [-]
Why not just give it some temporary name, like ‘t’? Supposedly it is only used in a very tight scope. Haskell and similar languages often do these things with `let .. in ..` or `.. where t=a+b`.
goerz 10 days ago [-]
Because `a+b` is a much better variable name than `t`? Why `t`? If anything, `a_plus_b` or `apb`
kaba0 10 days ago [-]
Not really better at all, but I feel this is a bit of a contrived example. If it really is just numeric addition then just let the compiler do its job; otherwise I’m sure there are better names for whatever you are actually trying to do. And random helper variables are a thing even in math; you don’t name the thing after the way you calculate it, because then you don’t save any characters.
jostylr 10 days ago [-]
One could also not have subtraction but simply negation, so instead of a-b being subtraction, one could have a+-b. It probably already works in most languages. Doubt it would be embraced.
Arch-TK 11 days ago [-]
It's a bit hard to parse it outside of languages like lisp because of infix.
thaumasiotes 11 days ago [-]
Infix has nothing to do with it. We can already parse expressions of the form "x - 20". The change would be to stop interpreting "x-20" as being the same set of three tokens as "x - 20".
taeric 11 days ago [-]
That is the point, though? Is "x- 20" the same as "x - 20" the same as "x -20"? In the vast majority of modern programming languages, that is a yes. If you allowed dashes in the names, not so much.

Now, I grant the point that it is doable. But the point is that it complicates things. Now, fair, we have some of these complications already by virtue of the fact that we allow numbers in variable names. "foo3" is already allowed in many languages, and that clearly gets altered as you add space between the characters.

thaumasiotes 11 days ago [-]
> Is "x- 20" the same as "x - 20" the same as "x -20"[?] In the vast majority of modern programming languages, that is a yes.

Is that really true? I wouldn't exactly feel comfortable with "x -20", though I suspect you're right about at least a large number of languages determining the meaning of the hyphen through local syntactic context ("I just saw an identifier; that must be a non-unary hyphen").

Now I'm interested in whether the corpus of existing code shows any bias between "y = -x" (perfectly allowed, I think) and "y = -1 * x".

Arch-TK 11 days ago [-]
Distinguishing unary minus from binary minus is not done by spacing in parsers. Most modern language parsers ignore whitespace except where it would cause the concatenation of two identifiers (or in the case of languages like python or nim, where it is at the beginning of a line).
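A toy illustration of that claim, not taken from any real language's grammar: whitespace is discarded during tokenizing, and each `-` is classified purely by the token that precedes it. `TOKEN`, `ends_operand`, and `classify_minus` are hypothetical names.

```python
import re

# Numbers, identifiers, and a few single-character symbols; spaces are skipped.
TOKEN = re.compile(r"\d+|[A-Za-z_]\w*|[-+*/()=]")

def ends_operand(tok: str) -> bool:
    """True if tok can end an operand: a number, a name, or ')'."""
    return tok == ")" or tok[0].isalnum() or tok[0] == "_"

def classify_minus(source: str) -> list[str]:
    """Label each '-' as 'unary' or 'binary' using only the previous token."""
    labels = []
    prev = None
    for tok in TOKEN.findall(source):
        if tok == "-":
            binary = prev is not None and ends_operand(prev)
            labels.append("binary" if binary else "unary")
        prev = tok
    return labels

for text in ["x- 20", "x - 20", "x -20", "y = -x", "a - -b"]:
    print(text, classify_minus(text))
```

The first three inputs produce the same labels regardless of spacing, which is exactly the property that allowing dashes inside identifiers would give up.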

I know this for a fact since I've studied their grammars.

Infix notation and - are really the reasons why nobody does this, some people like to use spacing to indicate precedence too (writing code such as "a - 1*g") so it would break some workflows and realistically having a language which is whitespace agnostic except for identifiers AND the minus operator just seems too irregular for people to commit to it.

Joker_vD 10 days ago [-]
Depends on the parser. See e.g. [0] for Elm's take on it: it already had special logic for differentiating "(.)" and "( . )".


Arch-TK 10 days ago [-]
That's why I said most. Obviously there exist languages, such as elm, where this is not the case, but in the case of those languages, the teaching literature (as is the case with elm) explains these things. Most languages don't do this precisely because their designers made the judgement call that the additional parsing complexity (both for the actual parser and the human) outweighed the benefits of allowing you to have minuses in identifiers.
tmtvl 11 days ago [-]
I'll second that, kebab-case is easy to type, has nice readability, and fits well with regular English constructions like case-sensitive, human-sized, looking-glass, and so on.
roenxi 11 days ago [-]
maybe it is some sort of status thing. since people have to be able to afford a fancy keyboard with a shift key to use camel case or underscores, it shows they can afford nicer things than us kebab-cases.

more seriously, it is the unfortunate fact that a lot of systems interpret named-thing as named minus thing. many people don't have the option of using kebab-case. it isn't a good style because it is often impossible to use. which is unfortunate because it is more ergonomic.

waiting4shoe2 11 days ago [-]
The obvious solution is to make `kebab‐case` a name, `kebab−case` a subtraction, and `kebab-case` an error.
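The three spellings in that comment use three different code points that look nearly identical. A small sketch of the proposed rule, with `classify` as a hypothetical helper; the characters are written as escapes so they cannot be confused in the source.

```python
import unicodedata

HYPHEN = "\u2010"        # HYPHEN
MINUS = "\u2212"         # MINUS SIGN
HYPHEN_MINUS = "\u002d"  # HYPHEN-MINUS, the key on an ASCII keyboard

def classify(text: str) -> str:
    """Apply the proposed rule: ASCII dash is an error, minus sign subtracts."""
    if HYPHEN_MINUS in text:
        return "error"
    if MINUS in text:
        return "subtraction"
    return "name"

for ch in (HYPHEN, MINUS, HYPHEN_MINUS):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), classify(f"kebab{ch}case"))
```

The only dash most keyboards can type directly is the one this rule rejects.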
alexisread 11 days ago [-]
The problem I've had with kebab case is cross language / devops systems. Some languages and systems don't like kebab case whereas snake case appears more compatible eg. Rpc calls to a named (kebab or snake endpoint).
infradig 11 days ago [-]
I had never heard of kebab-case before as kebab refers to cooked chunks of meat. Soon it dawned on me that shish (meaning skewer) was meant, as in shish-kebab-case. Sorry to be that guy.
Waterluvian 11 days ago [-]
aloukissas 11 days ago [-]
TylerE 11 days ago [-]
Zamicol 11 days ago [-]
We've dubbed this "ridicule" case.

Upper case is majuscule.

Lower case is minuscule.

Mixed case is ridicule.

TylerE 11 days ago [-]
Mine is actually a bit more specific. Capitalize every other letter PLUS the start of each word
txbuck 11 days ago [-]
I've seen it dubbed "sponge-case" and am pretty fond of the term.
downvotetruth 11 days ago [-]
WoXCvSZ: randomly only capitalizing C,O,S,V,W,X,Z
tmtvl 11 days ago [-]
There arr worse options? Is that Pirate Case?
bsder 11 days ago [-]
Then we have to talk about the devil token which indicates variables/names.

You can't have infix minus and kebab case without something to differentiate between the two.

Lisp chooses to remove infix minus. Other than old-school BASIC, I don't remember anything else which uses tokens for variables.

yakubin 11 days ago [-]
> Lisp chooses to remove infix minus.

Infix has nothing to do with it. - refers to the subtraction function regardless of whether it is placed in prefix or infix position. Infix will trigger a type error, not a syntactic error — it will tell you that e.g. it cannot call a fixnum. Lisp quite ordinarily requires function names to be separated from their arguments by spaces, which is just a special case of the rule that atoms need to be separated by spaces. That’s all.

lispm 11 days ago [-]
In many Lisps (+ + +) means this: call the function named + on the two values from the variables named +. The function is then a function object and the values of + can be anything they are currently set to.

Thus + serves more than one purpose: function or variable and both may be different values.

Thus the position actually matters.

What we have below the Lisp syntax are s-expressions: nested lists of symbols and other objects. In s-expressions, symbols need to be separated by whitespace, parentheses or possibly by some special character type (-> terminating character). Characters like +, -, /, *, _, ... are valid characters for any symbols.

Thus Lisp does not require functions to be separated by characters but symbols. That's a part of the reader mechanism for reading s-expressions (-> symbolic expression).

yakubin 10 days ago [-]
Right, I was thinking about Lisp-1. Good call-out.
gonzus 10 days ago [-]
... which is the one and only Lisp. Long live Scheme!
inopinatus 11 days ago [-]
I’ve always liked the readability, but interaction is a problem when many UIs treat the hyphen as a break character. Most (all?) browsers do, and even some (albeit not good) IDEs cannot be configured otherwise. When you’re in a hurry, it’s easy to miss a bit off the end.
layer8 11 days ago [-]
Luckily Unicode has the non‑breaking hyphen U+2011. This‑is‑a‑long‑identifier‑that‑should‑not‑break‑at‑any‑of‑the‑hyphens.
inopinatus 9 days ago [-]
Alas, that doesn't seem to be a general solution. Sure, it doesn't cause line or selection break in my IDEs, nor does it line break in browsers, but it's still a selection break in everything except IDEs (such as all the web and chat services developers use to talk about code), it's harder to type, and as a homoglyph it's super problematic to grep or index identifiers that use it. This last consideration could potentially even increase an attack surface.

I'd suggest that it's feasible if we lived our entire developer lives inside a monoculture that adopted it, but that just ain't reality.

This all adds up to "why I do not kebab my identifiers".

djbusby 11 days ago [-]
Doesn't match \w+
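For illustration, here is what that looks like with Python's `re` module, where `\w` covers letters, digits, and underscore but not the dash. `words` and `words_with_dash` are hypothetical helpers.

```python
import re

def words(text: str) -> list[str]:
    """Find identifiers using the common \\w+ idiom."""
    return re.findall(r"\w+", text)

def words_with_dash(text: str) -> list[str]:
    """Same idea, with '-' explicitly added to the character class."""
    return re.findall(r"[\w-]+", text)

sample = "snake_case kebab-case camelCase"
print(words(sample))
print(words_with_dash(sample))
```

The first call splits `kebab-case` into two matches; any tool built on the `\w+` idiom would need the second form to treat it as one identifier.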
taeric 11 days ago [-]
I thought this was mainly because of infix math operators?
choult 11 days ago [-]
kebab-case = doner meat
r2b2 11 days ago [-]

    TypeName, ClassName

    functionName, methodName

    variable-name, symbol-name   # if possible

    variable_name, symbol_name
ldh0011 10 days ago [-]
I'm just mildly annoyed that most languages disallow kebab-case I assume almost entirely so '-' can be used for infix subtraction (and maybe decrement) without having to surround it with spaces... not a good tradeoff imo.
dools 10 days ago [-]
It’s obvious that the only value in using different cases is so you can differentiate between different types of thing:

1) functionsLikeThis

2) variables_like_this

3) ClassesLikeThis

Jaxan 10 days ago [-]
What if a variable is also a function?
dools 10 days ago [-]
Then it’s a variable.
flyingfences 10 days ago [-]



danbruc 10 days ago [-]
I found a paper years ago that was also looking into the readability of different style choices: how easy it is to miss a leading underscore in _field because it is almost a space, or to confuse foo_bar and foo bar, and other similar stuff mostly related to casing and spacing. I have been trying to find this paper again from time to time for years now without success. Is anyone by chance aware of it? One thing I remember in more detail is that it was looking at the bounding box shapes and how similar they are; it had figures with different styles and bounding boxes drawn around characters and words.
frereubu 11 days ago [-]
The title needs (2013) at the end, particularly as the results might be different now.
Zigurd 11 days ago [-]
This feels like an issue studied through a keyhole: You can do a study that makes underscores seem advantageous. And in a vacuum, a global with an uppercase name otherwise identical to a local with a lowercase name looks like a debacle waiting to happen. But, with a modern IDE, some combination of color, bold, and italics enables coders to easily distinguish among these seemingly inadvisable symbolic names. And if you want to find all uses of a name, that's easy.
pyrolistical 11 days ago [-]
I wish a language was designed to allow identifiers with spaces in them without quotes. That is the most readable. But I don’t know the consequences to the rest of the grammar
thaumasiotes 11 days ago [-]
Underscores are visually very similar to spaces. Their major problem is unrelated to their appearance - it's that you have to use a key combination in order to type them on a standard keyboard.

The obvious solution would appear to be to use better keyboards, but I don't even see anyone suggesting that.

But note that the lisp identifier style, words-separated-by-hyphens, is better than using underscores despite being less legible. It's purely a matter of how easy it is to type the separator. ("Standard" languages don't allow hyphens in identifier names because they want to allow subtraction without requiring a space between the subtraction operator - and the two operands. In Lisp, spacing around an operator, including the subtraction operator, is required. You could go a long way in a C-like language by just requiring spacing around operators and immediately being able to allow hyphens in identifiers.)

downvotetruth 11 days ago [-]
> Their major problem is unrelated to their appearance

Shift key is a problem for all the shifted only chars. Their major problem is appearance: a single underscore '_' vs underscores '__' . Concatenated glyphs are hard to discern due to only distinguisher being width.

thaumasiotes 11 days ago [-]
I have no idea what you're trying to say. You just compared underscores with underscores; relative to underscores, underscores cannot even theoretically have any problems at all.
wizofaus 11 days ago [-]
Use non-breaking space for identifiers and regular/breaking space as a delimiter? Admittedly you'd want an editor that clearly showed the difference and made the former easy to type.
jfk13 11 days ago [-]
Try Algol68.
orbital223 10 days ago [-]
DAX, used in Microsoft's Power BI, allows for identifiers with spaces.
userbinator 11 days ago [-]
> But I don’t know the consequences to the rest of the grammar

We have various natural languages to tell us:

Mikhail_Edoshin 11 days ago [-]
I used to like snake case and variants, but now I like something more elaborate: camel case with semantic underscores:

    Lx_DoThis -- function
    Lx_DoThis_Gen -- generic impl.
    Lx_DoThis_Gcc -- GCC-specific
    LxMyType -- type (class)
    LxMyType_DoThat -- method
That is if I use underscore to separate words, I get names that appear to be composed of an arbitrary number of parts. I want that apparent composition to be meaningful.

In a more sophisticated language where types, methods, and variants have native support, this reduces to essentially camel case. Which still looks better to me because a single thing appears as a single word, not a random number of words.

claytongulick 11 days ago [-]
My style comes from a weird combined history of all the languages I've worked with over the years.

I do lowercase snake for_variable_names, camel case forFunctions() and title case ClassNames.

In markup, CSS class names are kebab-case but IDs follow snake_case rules (this is so that they can be referenced easily and are valid identifiers in js).

I like this approach because I can tell at a glance by the naming convention what kind of identifier I'm dealing with.

I dislike the common js style of everythingIsCamelCase, I find it more difficult to read - and in a language where functions are a primary type, I think it's good practice to differentiate stylistically a variable from a function within the closure or prototype chain.

HarHarVeryFunny 10 days ago [-]
For decades(!) I used CamelCase identifiers, with a convention of using an upper case first letter for constants/types/functions and a lower case first letter for variables.

e.g. MyType myType;

I'm not quite sure what made me switch, but nowadays I use lower case and underscores for all identifiers other than constants, and use a "_t" suffix to distinguish types.

e.g. color_t color; string_list_t list;

I think it's certainly easier on the eyes and more readable.

At the end of the day though it's personal preference, unless you're working on an existing code base where you should adopt the naming and formatting conventions of the code base.

cpeterso 11 days ago [-]
I like the symmetry and readability of “Ada case”: snake_case but with uppercase letters where appropriate, such as acronyms. So instead of XMLHttpRequest or xml_http_request, you would use XML_HTTP_Request.
quesera 10 days ago [-]
It has always kind of bothered me that XMLHttpRequest was neither XmlHttpRequest, nor XMLHTTPRequest.

There's probably some reason buried deep in the Microsoft (formerly Micro-Soft!) project docs.

MetaWhirledPeas 11 days ago [-]
Interesting results! Underscores do have two built-in caveats that offset the recognition benefit somewhat: more characters to type, and an awkward key to hit (shift + hyphen).
Waterluvian 11 days ago [-]
I’m interested in what kinds of software development jobs have you churning out so much code that these things practically matter.

I have found that most of the time you can just automatically switch between the two cases, too. So maybe it’s moot.
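
For what it's worth, a minimal sketch of that kind of automatic switch in Python. The function names `snake_to_camel` and `camel_to_snake` are just illustrative, and this assumes plain ASCII names with no leading underscores.

```python
import re

def snake_to_camel(name):
    """parse_http_request -> parseHttpRequest"""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)

def camel_to_snake(name):
    """parseHttpRequest -> parse_http_request"""
    # Insert "_" wherever a lowercase letter or digit is followed by a capital.
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name).lower()

print(snake_to_camel("parse_http_request"))
print(camel_to_snake("parseHttpRequest"))
```

Runs of capitals are not split, so names with acronyms in them don't round-trip cleanly. That's about the only place where the conversion stops being mechanical.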

innocentoldguy 11 days ago [-]
To overcome this issue, I switched to the Dvorak keyboard layout and use Karabiner Elements to remap some of my keys. The result is that I can type underscores on the home row and parentheses on my shift keys (tap == parenthesis and hold == shift). This has eliminated the RSI pain I used to feel from these hard-to-reach keys.
kbd 11 days ago [-]
Came here to say this. Underscore/dash is on the home row in Dvorak, making snake_case/kebab-case way more natural to type.

Cool trick about mapping shift keys with Karabiner-Elements. I do that with my caps lock to create a "hyper" key: tapping it brings up Alfred, using it in a chord is "hyper", and long pressing caps lock maintains its default behavior.

inDigiNeous 11 days ago [-]
Typing is not the slow part. Reading and understanding the code is what takes most of the time, and is something that should be optimized.

But '_' might be difficult to hit on non-US keyboards though.

TylerE 11 days ago [-]
It’s not really that great on US boards, either. Way off in a corner, and requiring shift.
bonsaibilly 11 days ago [-]
I wouldn’t say that offsets the benefits at all. You write it once, but it will likely be read and reread many many times.
robomartin 11 days ago [-]
Sometimes I feel these discussions can derail into the realm of being pointless.

Computers do not care. Seriously. I don’t know about others, I have far more important and urgent things to worry about in the course of completing a project than this_case or thatCase. I prefer this_case, probably out of habit. Yet, it isn’t important. I’ll use anyCase if required. My bank account also could not care less.

cassepipe 10 days ago [-]
I hate camelCase and Pascal Case.

You want arguments? Even though they're post rationalizations of what I like best?

OK, well first I hate to type caps, then I hate CAPS, so more than one in a word is unbearable. ThenIThinkItsHardToRead. Finally I like that we have a visible symbol to mean space that's still visible. Who is going to use it if we programmers don't?

popcorncowboy 10 days ago [-]
And I hate snake_case.

You want post rationalized arguments? Even though this is a parody to bikeshed on a polemic?

OK, well first I hate to type underscores, then I hate "shift -" because it's RSI inducing and unbearable. then_i_think_this_is_ugly. Finally I hate that we use a symbol so removed from natural human writing it screams subservience to the machine. Who is going to rebel if we programmers don't?

jacobsenscott 11 days ago [-]
kebab-case is best - no awkward shift but more readable than camelcase.
Pxtl 11 days ago [-]
Let's just make a language where space isn't used to delimit between tokens and let us have spaces in our names.
benreesman 10 days ago [-]
I would happily trade using some other lexeme for subtraction and negation if it bought me hyphenated identifiers. Code has a fair amount of subtraction but it’s like 2-3 orders of magnitude off from identifier word breaks I’d wager: I take that deal, camel and snake case are both godawful.
moring 10 days ago [-]
I'd also happily accept '-' to mean 'minus' when separated by spaces and be part of an identifier if not. You'd probably not want to space-separate a unary minus, so identifiers could not start with a hyphen, but that's ugly anyway. (BTW so is starting an identifier with an underscore, so either we'd have to finally come up with a better scheme for "private" identifiers, or just keep the underscore for that purpose.)
psychoslave 10 days ago [-]
Everybody seems to ignore the middle dot option: `some·name` is a valid identifier in a surprisingly large set of mainstream languages actually. At least it works just fine with C, C++, JavaScript, PHP, Python and Ruby.

However it doesn’t in any Shell I tried (bash, fish, zsh) nor C#, Go or Java.
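
The Python part is easy to check without an editor that can type the character. A small sketch, where the helper name `describe` is made up; it reports whether Python accepts a string as an identifier and names any non-ASCII characters in it:

```python
import unicodedata

def describe(candidate):
    """Return (is a valid Python identifier, names of non-ASCII characters)."""
    marks = [unicodedata.name(ch) for ch in candidate if not ch.isascii()]
    return candidate.isidentifier(), marks

print(describe("some\u00b7name"))  # middle dot, U+00B7
print(describe("some\u2009name"))  # thin space, U+2009, for comparison
```

As far as I can tell the middle dot is accepted because Unicode lists it as an identifier-continue character, while a thin space is a space separator and is rejected.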

innocentoldguy 11 days ago [-]
I don't see as well as I used to and snake_case is much easier for me to read when scanning through code than camelCase.

This may be why I tend to gravitate towards languages like Elixir and Ruby, which prefer snake_case, and away from languages like Go, which use camelCase.

bitwize 11 days ago [-]
(define kebab-case-for-the-win #t)
inopinatus 11 days ago [-]
Always liked the Ruby convention of PascalCase for constants and under_score for lexicals.
jacobsenscott 11 days ago [-]
In ruby anything that starts with a capital is a constant, but PascalCase is only used for ClassNames and ModuleNames, and ALL_CAPS_SNAKE_CASE = "is used for constant values". Unless you are some kind of monster anyway.
inopinatus 11 days ago [-]
Quite right, that’s the full picture.

Honestly though I am that kind of monster

cb321 10 days ago [-]
This relates to a hotly contended topic in style-insensitive Nim
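
For anyone who hasn't seen it: Nim compares identifiers case-sensitively on the first character only, and ignores case and underscores after that. A small Python sketch of that comparison rule as I understand it from the Nim manual (the function name `nim_style_equal` is made up):

```python
def nim_style_equal(a, b):
    """Compare two identifiers the way Nim's partial style insensitivity does."""
    def normalize(name):
        # First character is kept as-is; the rest drops "_" and ignores case.
        return name[0] + name[1:].replace("_", "").lower()
    return normalize(a) == normalize(b)

print(nim_style_equal("fooBar", "foo_bar"))
print(nim_style_equal("Foo", "foo"))
```

So `fooBar` and `foo_bar` name the same thing, while `Foo` and `foo` stay distinct, which keeps the usual type vs. variable split working.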
The_Colonel 10 days ago [-]
Wow, I didn't know this is a thing.

One of the good things about the Java ecosystem is that casing and indentation are not really debated (apart from a few minor corner cases) and basically everyone follows the standard.

I'm kind of fine with any reasonable coding standard, but I can't stand the debates. Endless bikeshedding.

cb321 10 days ago [-]
nimpretty (like gofmt & etc.) was actually mentioned (buried within that large discussion) as a way for Nim to go full sensitivity but avoid the debates. :-)
teddyh 10 days ago [-]
The GNU project prescribes snake_case:

emodendroket 11 days ago [-]
Like brace style or a million other nitpicks, I don't care. I'll use whatever the linter says, so set one up. (though I will say I also hate code styles that don't adhere to the general language convention)
dec0dedab0de 11 days ago [-]
I prefer snake_case, but I just use whatever the standard is for the language I'm using, or the project if it already exists. Staying consistent is most important for these kinds of things.
osigurdson 10 days ago [-]
I just do what is common in the language. Python is interesting as snake case is supposed to be used for the most part but a lot of code seems to use camel case.
every 11 days ago [-]
CamelCase seems to be preferred by screen readers for the visually impaired. Just recently encountered this on Mastodon with their extensive usage of hashtags...
kovac 11 days ago [-]
Isn't camelCase CamelCase and CamelCase PascalCase?

On a more serious note, I feel like a reasonable middle ground is to agree to use one-word identifiers only.

rpaddock 10 days ago [-]
"A well placed underscore can make the difference between a s_exchange and a sex_change." - Intel 8048 User Manual circa 1978
jws 11 days ago [-]
This has remained unsettled for 60 years. Perhaps we need a new choice?

How about Unicode "Thin Space" 0x2009 for a legal identifier character? (HN isn't letting me put the Thin Space in there.)

How about Unicode "Middle Dot" 0x00B7 for a legal identifier·character?

Best if you aren't in a fixed-width editor for those.

I used middle dot in an experimental language which was Unicode heavy so already didn't like fixed width fonts. It parses trivially and reads well.

I haven't tried the thin space in earnest, but the example code I typed up looked reasonable.

vsskanth 11 days ago [-]
It's not on the keyboard though
jws 11 days ago [-]
Neither are the upper case letters or the underscore. Those are all composed with multiple keys. Easy enough to carve one out for the new separator.
vsskanth 11 days ago [-]
you can see the underscore on the keyboard
eviks 10 days ago [-]
no you can't, its only difference on many keyboards is the length of the line, you can't see that it should be _down_
MichaelMoser123 11 days ago [-]
i think you need to do what the built-in functions/standard library of language X is doing; if it is camel case then do camel case in your own code - having your own code differ from the convention of the built-in functions/standard library is very confusing.
metadat 11 days ago [-]
snake_case strains my fingers way more with the excessive reaching to tap shift-underscore.

Thank goodness for PyCharm auto-complete. Typing every variable name fully manually in snake_case 24/7 every day was begging for me to develop RSI.

0x073 11 days ago [-]
I prefer camelCase for class properties and under_score for local variables.
skerit 10 days ago [-]
snake_case for variables and properties, camelCase for methods and classes. That's the way I do all my projects, no matter what the "language standard" is.
reportgunner 10 days ago [-]
This is the way
teddyh 11 days ago [-]
Best of both worlds: Emacs with M-x glasses-mode
transfire 10 days ago [-]
Oh, if only middot had been in the ASCII.
thrown1212 10 days ago [-]
If you find yourself using reallyLongKeywordChainsForIdentifiers then you're probably coding in Java and now you have two problems.
kybernetyk 11 days ago [-]
under_scores look more "l33t" while CamelCase is more readable.