While I was working on a big project full of macro tricks and wizardry, I stumbled upon a bug in which a macro was not expanding properly. The resulting output was "EXPAND(0)", but EXPAND was defined as "#define EXPAND(X) X", so clearly the output should have been "0".
"No problem", I thought to myself. "It's probably some silly mistake. There are some nasty macros here, after all, and plenty of places to go wrong". With that in mind, I isolated the misbehaving macros into their own project, about 200 lines, and started working on an MWE to pinpoint the problem. 200 lines became 150, which in turn became 100, then 20, 10... To my absolute shock, this was my final MWE:
#define EXPAND(X) X
#define PARENTHESIS() ()
#define TEST() EXPAND(0)
EXPAND(TEST PARENTHESIS()) // EXPAND(0)
4 lines.
To add insult to injury, almost any modification to the macros will make them work correctly:
#define EXPAND(X) X
#define PARENTHESIS() ()
#define TEST() EXPAND(0)
// Manually replaced PARENTHESIS()
EXPAND(TEST ()) // 0

#define EXPAND(X) X
#define PARENTHESIS() ()
#define TEST() EXPAND(0)
// Manually replaced TEST()
EXPAND(EXPAND(0)) // 0

// Set EXPAND to 0 instead of X
#define EXPAND(X) 0
#define PARENTHESIS() ()
#define TEST() EXPAND(0)
EXPAND(TEST PARENTHESIS()) // 0
But most importantly, and most oddly, the code below fails in the exact same way:
#define EXPAND(X) X
#define PARENTHESIS() ()
#define TEST() EXPAND(0)
EXPAND(EXPAND(EXPAND(EXPAND(TEST PARENTHESIS())))) // EXPAND(0)
This means the preprocessor is perfectly capable of expanding EXPAND, but for some reason, it absolutely refuses to expand it again in the last step.
Now, how I'm going to solve this problem in my actual program is neither here nor there. Although a solution would be nice (i.e. a way to expand the token sequence EXPAND(TEST PARENTHESIS()) to 0), the thing I'm most interested in is: why? Why did the C preprocessor come to the conclusion that "EXPAND(0)" was the correct expansion in the first case, but not in the other ones?
Although it's easy to find resources on what the C preprocessor does (and some magic that you can do with it), I've yet to find one that explains how it does it, and I want to take this opportunity to understand better how the preprocessor does its job and what rules it uses when expanding macros.
So in light of that: What is the reasoning behind the preprocessor's decision to expand the final macro to "EXPAND(0)" instead of "0"?
Edit: After reading Chris Dodd's very detailed, logical and well-put answer, I did what anybody would do in the same situation... try to come up with a counterexample :)
What I concocted was this different 4-liner:
#define EXPAND(X) X
#define GLUE(X,Y) X Y
#define MACRO() GLUE(A,B)
EXPAND(GLUE(MACRO, ())) // GLUE(A,B)
Now, knowing that the C preprocessor is not Turing complete, there is no way the above will ever expand to A B. If it did, GLUE would expand MACRO and MACRO would expand GLUE. That would open the door to unlimited recursion, which would probably imply Turing completeness for the preprocessor. So, sadly for the preprocessor wizards out there, the above macro is guaranteed not to expand.
That it fails is not really the problem. The real question is: where? Where did the preprocessor decide to stop the expansion?
Analyzing the steps:
- step 1 sees the macro EXPAND and scans in the argument list GLUE(MACRO, ()) for X
- step 2 recognizes GLUE(MACRO, ()) as a macro:
  - step 1 (nested) gets MACRO and () as arguments
  - step 2 scans them but finds no macro
  - step 3 inserts them into the macro body, yielding: MACRO ()
  - step 4 suppresses GLUE and scans MACRO () for macros, finding MACRO
    - step 1 (nested) gets an empty token sequence for the argument
    - step 2 scans that empty sequence and does nothing
    - step 3 inserts it into the macro body, yielding GLUE(A,B)
    - step 4 scans GLUE(A,B) for macros, finding GLUE. It is suppressed, however, so it is left as is.
- so the final value for X after step 2 is GLUE(A,B) (notice that since we are not in step 4 of GLUE, in theory, it is not suppressed anymore)
- step 3 inserts that into the body, giving GLUE(A,B)
- step 4 suppresses EXPAND and scans GLUE(A,B) for more macros, finding GLUE (uuh)
  - step 1 gets A and B for the arguments (oh no)
  - step 2 does nothing with them
  - step 3 substitutes into the body, giving A B (well...)
  - step 4 scans A B for macros, but finds nothing
- the final result is then A B
Which would be our dream. Sadly, the macro expands to GLUE(A,B).
So our question is: Why?