Short variable names explained

Today the average variable name contains at least one whole English word. Naming things remains one of the hardest problems in programming, but most programmers agree that using very short variable names is bad practice. I was one of them.

The Two Hard Things

But over the past 4 years some mysterious force kept pushing my programming towards short variable names. At first I resisted, but later I embraced this path, and I see things differently now.

I'll try to explain how one might naturally transition to using short variable names, and perhaps convince you that we have been looking at this topic from the wrong angle. With my new perspective I can tell that name length is not the property we are looking for.

If you are to truly understand, then you will need the contrast, not adherence to a single idea.

- Kreia

Origin story

In 2020 I started using the C programming language for small CLI tools, Advent of Code and more serious software for invoicing and accounting. I avoided external libraries and used only the standard libc, and libc is full of short names. Function names and example argument names in the libc man pages are usually a few characters long, rarely a whole English word. At first I renamed all of them to "proper" names of the kind more common in a web app code base. After a while this became inconvenient. That's when I started using short names, which led me to a few observations.

1. Using familiar names is convenient

Because C documentation, books, online articles and the community all use short names in examples, conversation and source code, it is very convenient to use the same names. Not longer, not shorter, but the same. It's like a dialect within the programming language. The dialect of C would not be understood by programmers who speak the dialect of a different programming language.

Such a dialect is usually defined by the standard library or by commonly used APIs. I think deviating from it is reasonable when you have your own standard library or work in a custom/handmade environment. Otherwise you just swim against the current, and you make your code harder for others to read if you open source it.
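To make the dialect idea concrete, here is a minimal sketch. The helper "copystr" is hypothetical and not part of libc; only its parameter names ("dst", "src", "n") follow the style you typically meet in the man pages for string functions.

```c
#include <stddef.h>
#include <string.h>

/* copystr is a hypothetical helper, not a libc function.
 * Copies at most n-1 bytes of src into dst and NUL-terminates dst
 * whenever n > 0. Returns the length of src, so a result >= n
 * means the copy was truncated. */
size_t copystr(char *dst, const char *src, size_t n) {
	size_t len = strlen(src);
	if (n > 0) {
		size_t k = len < n - 1 ? len : n - 1;
		memcpy(dst, src, k);
		dst[k] = '\0';
	}
	return len;
}
```

Someone who has read a few man pages will probably guess the role of each argument before reading the comment, which is the convenience I mean.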

2. Names are not descriptions

Using libc made me see variable names more like desktop icons. Desktop icons are usually abstract at first and don't tell you much about the software hidden behind the small picture. The main role of an icon is to be recognizable once you have familiarized yourself with what it represents.

It's the same with any other names: names of plants, animals, places, people etc. Now I see that programmers try to use a variable name as a description of what the variable holds instead of actually giving the variable a name.

3. Comments are better descriptions

The meaning of a name has to be explained to a stranger who sees it for the first time. This is where comments that answer the question "what?" are justified. Adding a comment to a variable definition that describes what it is for is better than doing that in the variable name, because a comment can be much longer, link to external resources, use proper grammar and refer to other variable names.

Separating the description of a variable from its name makes it easier to keep the name unique or recognizable, and to reuse the same variable names in different contexts.
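A small sketch of what I mean. The struct, its fields and the function are hypothetical, invented only for this illustration; the descriptions live in the comments while the names stay short.

```c
/* Hypothetical invoicing example; every name here is an assumption
 * made for illustration. */
struct inv {
	long net;  /* Net amount in the smallest currency unit, e.g. cents. */
	int  vat;  /* VAT rate in whole percent, e.g. 23 for 23%. */
};

/* Gross amount: net plus VAT. The VAT part uses integer division,
 * so any fraction of a unit is truncated. */
long gross(const struct inv *p) {
	return p->net + p->net * p->vat / 100;
}
```

The comment on "vat" can state the unit and give an example, which a name like "vat_rate_in_whole_percent" could only hint at.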

4. Meaning by context

Names of functions, types and variables can be shorter if you consider the context or scope in which they are defined. When you rely on namespace and context, it's easy to reuse the same names across different files, modules and functions.

Shorter, more generic names are more reusable. A variable like "name" can store the name of a document, employee, entity etc. A variable "str" can be used for even more values. A variable "book_author_name" can't be reused that easily. Long names tend to make commonly written code look like something new. When you always loop over a list of values the same way, you read the entire block of code like a single symbol.
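Here is a hedged sketch of that reuse. The types "struct doc" and "struct emp" and both functions are hypothetical; the point is only that "name", "v", "n" and "i" mean the right thing in each place because the surrounding struct and function supply the context.

```c
#include <string.h>

/* Hypothetical types, invented for this illustration. Both use the
 * field "name" because the struct already supplies the context. */
struct doc { const char *name; int pages; };
struct emp { const char *name; int age; };

/* Index of the document called name among the n entries of v, or -1. */
int doc_find(const struct doc *v, int n, const char *name) {
	int i;
	for (i = 0; i < n; i++)
		if (strcmp(v[i].name, name) == 0)
			return i;
	return -1;
}

/* Index of the employee called name among the n entries of v, or -1. */
int emp_find(const struct emp *v, int n, const char *name) {
	int i;
	for (i = 0; i < n; i++)
		if (strcmp(v[i].name, name) == 0)
			return i;
	return -1;
}
```

Both function bodies are identical character for character, so after reading the first one the second is recognized rather than read.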

5. Variables as math symbols

For ages mathematics was developed and explained through long descriptions. Only after graphical symbols were introduced to represent well defined operations did math advance to higher and more abstract levels.

Programmers are limited to the predefined set of symbols that can be typed on a keyboard. It's not easy to define new shapes, although there are languages that try to do just that, like APL.

But the point is that instead of thinking about a variable name as a description, we could think of it as a graphical symbol. Names like argc, argv, buf, str, fp require further explanation when read for the first few times. But after that they are more like small pictures.

This goes even further. When the same common names are used over and over, entire code blocks become recognisable and can be read as a single graphical symbol. This reduces cognitive load. Programmers do that all the time by introducing functions. Unfortunately they tend to forget that the source code of that function will have to be read too at some point.

About chunking information

6. Make them neither short nor long, but distinct

When thinking of variable names as graphical symbols, it's important to make them distinct. Names that look similar will become a source of errors. In a short name every single letter contributes a lot to how the name looks. Long names often carry repeated prefixes and suffixes, which create "information noise", a term used in graphic design for a situation in which important information is obscured by a large number of other visual elements. For example, imagine a stop sign against a background of many red geometric shapes. It might become completely invisible to a driver.

// If you think that I'm overdoing this then know that there are many
// programmers that will find this function not descriptive enough.

int get_document_index(char *document_name) {
	int document_index;
	char *current_document_name;
	for (document_index = 0;
	     document_names[document_index];
	     document_index++) {
		current_document_name = document_names[document_index];
		if (strcmp(current_document_name, document_name) == 0)
			return document_index;
	}
	return -1;
}

// Versus a function that can be read as a whole, as a single graphical
// symbol, but makes sense only in the context of the file.

int indexof(char *str) {
	int i;
	for (i=0; list[i]; i++)
		if (strcmp(list[i], str) == 0)
			return i;
	return -1;
}

I was already using the very short name "siz" for variables that hold the size of a buffer, but then I saw "sz" in some code snippet. At first I thought this was too much. Then I realized that "sz" is a far more unique combination of letters than "siz", at least in English. When you search for "sz" you don't get results like size, emphasize, resize etc.
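The search argument can be checked with a few lines of C. This is only a sketch: "hits" is a hypothetical helper that counts how many words contain a pattern, standing in for whatever search your editor or grep performs.

```c
#include <string.h>

/* Hypothetical helper: counts how many of the n words contain pat
 * as a substring, using strstr from libc. */
int hits(const char **words, int n, const char *pat) {
	int i, cnt = 0;
	for (i = 0; i < n; i++)
		if (strstr(words[i], pat))
			cnt++;
	return cnt;
}
```

For the word list size, emphasize, resize, sz, siz the pattern "siz" matches four entries, while "sz" matches only the variable itself.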

7. Recycle

Example list of usages/contexts with variable names that I use:

index
	i j k

capacity
size
amount
	n m sz len

sorting
pairs
	a b

difference
	diff delta

coordinates
position
distance
	x y z

colors
channels
	r g b u v

size
order
position
	w h next prev first last beg end row col

hierarchy
tree
list
	head tail node child parent cells

data
memory
	str buf mem tmp pt key value

More unique names are usually necessary for things that define context, scope or namespace like directories, files, structures and functions.
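As a hedged illustration, here are a few names from the list used together. The functions "cmp" and "sort" are hypothetical wrappers written for this example; only qsort itself comes from libc.

```c
#include <stdlib.h>

/* Comparison callback for qsort: a and b point to the two ints
 * being compared. Returns <0, 0 or >0 for ascending order. */
static int cmp(const void *a, const void *b) {
	const int *p = a, *q = b;
	return (*p > *q) - (*p < *q);
}

/* Sorts the n ints stored in buf in ascending order. */
void sort(int *buf, size_t n) {
	qsort(buf, n, sizeof *buf, cmp);
}
```

The recycled names "a", "b", "buf" and "n" carry no description of their own; the function names "cmp" and "sort" define the context in which they make sense.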

Different perspective

The Go programming language promotes the idea that a variable name should be longer the farther the variable is defined from the place where it is used. I think this is great advice as long as you still look at variable naming as a short vs long debate. But with my new understanding I no longer see it that way. For me it's now all about:

Names instead of descriptions.
Symbols instead of words.
Familiarity and distinction.
Context and scope.
Repetition and reusability.

In all those points length can play a role, but it is not the center of attention.

And those who were seen dancing were thought to be insane by those who could not hear the music.

- Friedrich Nietzsche