TL;DR, check the decision tree and summary.
Intro
For most beginners, CMake’s if command is a nightmare. Sometimes it takes a quoted string as a variable name, and sometimes it takes a quoted string as a pure string literal, as most other languages do.
In CMake, everything is a string. You may have seen that the CACHE option can set a type for the string, such as BOOL, STRING, and PATH. As far as I understand, those types only serve cmake-gui. As the cache entry documentation says:
BOOL
Boolean ON/OFF value. cmake-gui(1) offers a checkbox.
FILEPATH
Path to a file on disk. cmake-gui(1) offers a file dialog.
PATH
Path to a directory on disk. cmake-gui(1) offers a file dialog.
STRING
A line of text. cmake-gui(1) offers a text field or a drop-down selection if the STRINGS cache entry property is set.
INTERNAL
A line of text. cmake-gui(1) does not show internal entries. They may be used to store variables persistently across runs. Use of this type implies FORCE.
Is there a good rule for avoiding the brain-twisting difference between quoted and unquoted strings? I will try to give an answer to this problem.
string arguments for a command
Strings play a very important role in CMake. Every command argument is a string and every variable in CMake is a string. You could say everything in CMake is a string, which is similar to Bash and Make.
set(OKAY "Oll Korrect")
if("OKAY")
message("Oll Korrect")
endif()
Let’s run this script with cmake --trace --debug-output -P Okay.cmake:
/Users/guihaoliang/Work/CMakeDemo/Okay.cmake[4]: set(OKAY Oll Korrect )
/Users/guihaoliang/Work/CMakeDemo/Okay.cmake[5]: if(OKAY )
CMake Warning (dev) at Okay.cmake:5 (if):
Policy CMP0054 is not set: Only interpret if() arguments as variables or
keywords when unquoted. Run "cmake --help-policy CMP0054" for policy
details. Use the cmake_policy command to set the policy and suppress this
warning.
Quoted variables like "OKAY" will no longer be dereferenced when the policy is set to NEW. Since the policy is not set the OLD behavior will be used.
This warning is for project developers. Use -Wno-dev to suppress it.
/Users/guihaoliang/Work/CMakeDemo/Okay.cmake[6]: message(Oll Korrect )
Oll Korrect
We will cover the warning later. Basically, CMake treats if(OKAY) and if("OKAY") as the same thing. Or equivalently, CMake strips the double quotes.
That’s right: everything is a string, and no explicit double quotes are needed. The CMake engine checks its variable definition map to see whether OKAY is defined. If it is, the if command expands it to its value.
However, what if I only want to use "OKAY" as a pure string, with no other meaning, and certainly not as a variable? If double quotes are used explicitly, CMake should recognize that the input to the command is just a string and not interpret it any further. That is exactly what policy CMP0054 does, if you read the warning carefully.
Since everything is a string, let’s discuss by its 2 different forms: unquoted and quoted (or string literal, in most languages).
unquoted string
An unquoted string can be treated either as a single string entity (one without any whitespace that cannot be parsed as a list) or as a variable that refers to some string value. It’s confusing.
set(VAR a) # same as set("VAR" "a")
message(${VAR})
In this example, VAR is a variable and a is a string. Okay, it should be easy to distinguish a variable from a string: if we use ${}, it’s a variable; otherwise, it’s a string.
Things get complicated when CMake’s if comes into play:
if(<variable|string>)
True if the variable is defined to a value that is not a false constant. False otherwise. (Note macro arguments are not variables.)
…
The if command was written very early in CMake’s history, predating the ${} variable evaluation syntax, and for convenience, (if command) evaluates variables named by its arguments as shown in the above signatures.
It says that since there was no variable evaluation syntax when if was developed, CMake had to get the variable’s value implicitly by expanding the variable named by the argument, because the explicit ${} was unavailable at that time. That caused this super weird implicit variable evaluation. As the Zen of Python says, explicit is better than implicit.
If you read the documentation carefully, if(<string>) with a string that is neither a true constant nor the name of a defined variable always evaluates to FALSE, regardless of whether the string is empty or not. That’s weird coming from other languages, especially C/C++, where a non-empty string literal evaluates to TRUE.
if(<variable|string>)
True if the variable is defined to a value that is not a false constant. False otherwise. (Note macro arguments are not variables.)
As a result, in the if(unquoted) clause, you cannot tell whether unquoted is a variable or a string used to evaluate the condition. If unquoted is treated as a string, the result is FALSE.
For the simple if(unquoted) case, this turns out to be convenient and easy to remember. You can blindly treat unquoted as a variable if it’s not a constant. Why? if(unquoted) is only TRUE when unquoted is a defined variable whose value is not a false constant. Just remember this condition; all other situations evaluate to FALSE.
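A minimal sketch of this rule, meant to be run with cmake -P. The variable names DEFINED_TRUTHY, DEFINED_FALSY, and NEVER_DEFINED are made up for illustration, and both policies are set to NEW up front:

```cmake
# Hypothetical variable names, for illustration only.
cmake_policy(SET CMP0012 NEW)
cmake_policy(SET CMP0054 NEW)

set(DEFINED_TRUTHY hello)  # defined, value is not a false constant
set(DEFINED_FALSY OFF)     # defined, but the value is a false constant

if(DEFINED_TRUTHY)
  message("DEFINED_TRUTHY -> TRUE")   # this branch runs
else()
  message("DEFINED_TRUTHY -> FALSE")
endif()

if(DEFINED_FALSY)
  message("DEFINED_FALSY -> TRUE")
else()
  message("DEFINED_FALSY -> FALSE")   # this branch runs
endif()

if(NEVER_DEFINED)
  message("NEVER_DEFINED -> TRUE")
else()
  message("NEVER_DEFINED -> FALSE")   # this branch runs: no such variable
endif()
```

Only the first condition is TRUE; a false value and a missing definition both end up in the else() branch.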
But, but, but, in anything other than the simple form if(unquoted), such as
# explained later
cmake_policy(SET CMP0054 NEW)
cmake_policy(SET CMP0012 NEW)
if(unquoted STREQUAL "unquoted")
message(TRUE)
else()
message(FALSE)
endif()
you cannot blindly treat unquoted as a variable. Annoying. No simple rule covers every case.
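To see why, here is a small sketch, assuming the name unquoted is first undefined and then defined to an illustrative value. The same if() line gives a different answer depending on whether a variable with that name exists:

```cmake
cmake_policy(SET CMP0054 NEW)

# No variable named `unquoted` exists yet, so the left side is the string "unquoted".
if(unquoted STREQUAL "unquoted")
  message(TRUE)   # this branch runs
else()
  message(FALSE)
endif()

# Now define it (illustrative value); the left side is dereferenced to "something else".
set(unquoted "something else")
if(unquoted STREQUAL "unquoted")
  message(TRUE)
else()
  message(FALSE)  # this branch runs
endif()
```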
quoted string
Since everything is a string, can we strip the double quotes from the string literal form?
That is okay in most cases, except for whitespace and the variable reference form ${}. Let me explain.
quoted string as variable name
Each argument is a string. If an argument contains spaces or variable references, quotes come into play. If it’s quoted, it is treated as one entity. Otherwise, it is split into separate arguments, and a command such as set stores them as a list, which is merely a semicolon-delimited string.
When you pass a quoted string to a command, you explicitly tell CMake not to split the argument on whitespace into multiple arguments that would be treated as parts of a list.
Let’s first look at quoted string arguments with whitespaces inside.
We can use set command to set up a variable:
set(OKAY "Oll Korrect") # set("OKAY" "Oll Korrect")
message(${OKAY})
Thanks to the quoted string, we can even set a variable whose name contains a space (which is bad practice):
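A minimal sketch of such a definition, assuming a helper variable Indirect_Var that stores the name (the value all correct is made up for illustration):

```cmake
# The quotes make "Oll Korrect" a single argument, so it becomes the variable name.
set("Oll Korrect" "all correct")

# Indirect_Var holds the name of the variable above.
set(Indirect_Var "Oll Korrect")

# The inner reference is evaluated first, yielding the name with a space in it.
message("${${Indirect_Var}}")  # prints: all correct
```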
Even though this violates the CMake variable naming convention, where no space is allowed in a valid variable name, we can still reference that invalid variable name indirectly. If you are still confused, you can think of ${${Indirect_Var}} as expanding to ${Oll Korrect}, which you cannot write directly.
quoted string as input argument
As in Bash or Make, quotes can be used to explicitly group several arguments into one single parameter:
# VAR is a string
set(VAR "a b c")
message(${VAR}) # prints a b c
# VAR is a list
set(VAR a b c)
message(${VAR}) # prints abc
a b c in the first set is treated as a single entity, whereas in the second set it’s treated as 3 separate entities.
quoted variable reference
What about quoted variable references?
You can substitute a variable with its value by using ${variable}, where variable is not a quoted string literal. If you write ${"variable"}, you will get an error.
# VAR is a list
set(VAR a b c)
message(${VAR}) # equivalent to message(a b c) or message(a;b;c), prints abc
message("${VAR}") # prints list a;b;c
As the above example shows, if the variable reference ${VAR} is unquoted, its value, which is a list, is expanded into 3 separate arguments to the message command. The second call, by contrast, treats the list as a single entity. This behavior is quite similar to how a Bash command treats its input arguments.
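One way to make the difference visible is to count the arguments a command actually receives. The sketch below uses a hypothetical helper function, count_args, which is not part of CMake:

```cmake
# Hypothetical helper: reports how many arguments it was called with.
function(count_args)
  message("${ARGC} argument(s)")
endfunction()

set(VAR a b c)        # VAR holds the list a;b;c
count_args(${VAR})    # prints: 3 argument(s)
count_args("${VAR}")  # prints: 1 argument(s)
```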
quoted string to refer to a variable
In old versions (the dark days of CMake), a quoted string could also be used in the if clause to represent a variable.
It’s chaos that both quoted and unquoted strings can be treated as variables and then dereferenced to get a value. It’s painful when the quoted string happens to be a variable’s name and you only intend to use it as a string, not as a variable.
set(GUI gui)
if ("GUI" STREQUAL "gui")
# "GUI" will be dereferenced to "gui" since it's defined
# whereas "gui" is treated as string since there's no variable
# named as "gui"
message(TRUE)
else()
message(FALSE)
endif()
# prints TRUE
This behavior is annoying. Luckily, CMake warns about it. Policy CMP0054 can be set to NEW to stop a quoted string from being interpreted as a variable and dereferenced when it happens to be defined. That way, quoted strings are purely strings.
What if the quoted-string is a boolean constant?
if(<constant>)
True if the constant is 1, ON, YES, TRUE, Y, or a non-zero number. False if the constant is 0, OFF, NO, FALSE, N, IGNORE, NOTFOUND, the empty string, or ends in the suffix -NOTFOUND. Named boolean constants are case-insensitive. If the argument is not one of these constants, it is treated as a variable.
if ("ON")
# "ON" is treated as variable ON
message(TRUE)
else()
message(FALSE)
endif()
# prints FALSE
CMake tries to interpret ON as a variable, which is undefined, then treats it as a string, and finally evaluates the condition as FALSE.
set(ON ON)
if ("ON")
# ON is dereferenced by ${ON}
message(TRUE)
endif()
# prints TRUE
In CMake versions 2.6.4 and lower the if() command implicitly dereferenced arguments corresponding to variables, even those named like numbers or boolean constants, except for 0 and 1.
To disable this behavior, set policy CMP0012 to NEW so that boolean constants are no longer dereferenced. A quoted boolean constant is then treated as a normal boolean constant.
Later versions of CMake prefer to treat numbers and boolean constants literally, so they should not be used as variable names.
cmake_policy(SET CMP0012 NEW)
set(ON OFF)
# if(ON) works as the same. ON won't have a chance to be treated as a variable.
if ("ON")
# "ON" -> ON, won't dereference it to OFF
message(TRUE)
endif()
# prints TRUE
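With both policies set to NEW, the constants from the documentation quoted above can be checked directly. A rough sketch, where the sample values in the loop are chosen only for illustration:

```cmake
cmake_policy(SET CMP0012 NEW)
cmake_policy(SET CMP0054 NEW)

# Quoting "${value}" keeps if() from treating the value as a variable name.
foreach(value ON yes 2 0 n FOO-NOTFOUND hello)
  if("${value}")
    message("${value} -> TRUE")
  else()
    message("${value} -> FALSE")
  endif()
endforeach()
# ON, yes, and 2 report TRUE; 0, n, FOO-NOTFOUND, and hello report FALSE.
```

The last value, hello, is neither a constant nor allowed to act as a variable, so it falls through to a plain string and evaluates to FALSE.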
decision tree for string input
To address this ambiguity, I have this decision tree for you to think like a CMake program:
- Parse all string entities. If policy CMP0054 is set to NEW, remember each quoted string as a single entity that cannot be used as <variable> later for dereferencing.
- Translate to the equivalent form with quotes stripped, because CMake doesn’t see double quotes when it processes string inputs internally.
- If the string entity in the command can be accepted as <constant>, do so. If CMP0012 is set to OLD, most boolean constants, such as ON, TRUE, and FALSE, cannot be identified.
- Otherwise, continue to fall back. If it can be accepted as <variable>, check the definition table. If it’s found, dereference its value.
- Otherwise, it is used as a pure <string>.
The same thing applies to CMake’s while.
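For instance, a countdown loop can rely on the same implicit dereference. A small sketch, with a made-up counter variable named count:

```cmake
set(count 3)

# `count` is unquoted and not a constant, so while() looks up its value.
# The loop stops once the value becomes 0, which is a false constant.
while(count)
  message("count = ${count}")
  math(EXPR count "${count} - 1")
endwhile()
# prints count = 3, count = 2, count = 1
```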
Here’s a single example to verify the rule above:
set("Oll Korrect" ON)
if("Oll Korrect") # will be expanded if CMP0054 is not set
message("Oll Korrect")
endif()
- “Oll Korrect” is one entity.
- “Oll Korrect” becomes Oll Korrect, still one entity.
- if() can accept a constant, and Oll Korrect is not a constant.
- Oll Korrect is a variable, expand it to ON. Stop.
with trace output:
/Users/guihaoliang/Work/CMakeDemo/Okay.cmake[4]: set(Oll Korrect ON )
/Users/guihaoliang/Work/CMakeDemo/Okay.cmake[5]: if(Oll Korrect )
/Users/guihaoliang/Work/CMakeDemo/hello.cmake[6]: message(Oll Korrect )
You see, CMake knows Oll Korrect is one string entity in if(Oll Korrect ), instead of 2.
Now let’s set CMP0054 to NEW:
cmake_policy(SET CMP0054 NEW)
set("Oll Korrect" ON)
if("Oll Korrect")
message("Oll Korrect")
else()
message("Not Oll Korrect")
endif()
- “Oll Korrect” is one entity, which is a string literal and cannot be treated as a variable.
- “Oll Korrect” becomes Oll Korrect, still one entity.
- if() can accept a constant, and Oll Korrect is not a constant.
- Oll Korrect cannot be a variable, skip this step.
- Oll Korrect is a string, so the condition is FALSE and Not Oll Korrect is printed.
Another example,
cmake_policy(SET CMP0054 NEW)
cmake_policy(SET CMP0012 NEW)
set(ON "NOT ON")
if(ON STREQUAL "ON")
message(TRUE)
else()
message(FALSE)
endif()
- The first ON can be a variable or a string. The second “ON” is a string.
- “ON” becomes ON, but it is not the same as the first ON. To distinguish them, call the first ON ON1 and the second ON ON2.
- The if(<variable|string> STREQUAL <variable|string>) form doesn’t accept constants, skip.
- ON2 cannot be a variable, but ON1 can be dereferenced to the string “NOT ON”.
- ON2 is a string.
The net result is if("NOT ON" STREQUAL "ON"), which prints FALSE.
A take-home question for you: walk through our decision tree and work out the result in your head or with cmake -P ;-)
cmake_policy(SET CMP0054 NEW)
cmake_policy(SET CMP0012 NEW)
set(gui GUI)
if(GUI STREQUAL gui)
message(TRUE)
else()
message(FALSE)
endif()
if(GUI STREQUAL "gui")
message(TRUE)
else()
message(FALSE)
endif()
All these examples show that our decision tree works well as hell (it might even be the internal logic of CMake’s C++ implementation, who knows).
Update (2020-01-05): my guess was right. My posted question got answered!
bool cmConditionEvaluator::GetBooleanValue(
  cmExpandedCommandArgument& arg) const
{
  // Check basic constants.
  if (arg == "0") {
    return false;
  }
  if (arg == "1") {
    return true;
  }
  // Check named constants.
  if (cmIsOn(arg.GetValue())) {
    return true;
  }
  if (cmIsOff(arg.GetValue())) {
    return false;
  }
  // Check for numbers.
  if (!arg.empty()) {
    char* end;
    double d = strtod(arg.c_str(), &end);
    if (*end == '\0') {
      // The whole string is a number. Use C conversion to bool.
      return static_cast<bool>(d);
    }
  }
  // Check definition.
  const char* def = this->GetDefinitionIfUnquoted(arg);
  return !cmIsOff(def);
}
The above CMake source code is equivalent to if(<arg>).
The arg carries the state that tells whether it was quoted or not (under the hood, its value is stored in a std::string). If you are curious, check the definition of this->GetDefinitionIfUnquoted:
const char* cmConditionEvaluator::GetDefinitionIfUnquoted(
  cmExpandedCommandArgument const& argument) const
{
  // skip if it's quoted when CMP0054 is set to NEW
  if ((this->Policy54Status != cmPolicies::WARN &&
       this->Policy54Status != cmPolicies::OLD) &&
      argument.WasQuoted()) {
    return nullptr;
  }
  const char* def = this->Makefile.GetDefinition(argument.GetValue());
  if (def && argument.WasQuoted() &&
      this->Policy54Status == cmPolicies::WARN) {
    // ...
    // warnings
  }
  return def;
}
All in all
- set CMP0012 and CMP0054 to NEW so that CMake works the way you expect from a normal programming language. :-)
- don’t use a quoted string to refer to a variable.
- don’t use a quoted string to refer to a constant.
- quote variable references to avoid list expansion.
- quote string inputs that contain whitespace so they are not treated as a list of input strings.
- use our decision tree model!
some great resources
I recommend reading the book Mastering CMake. It covers many key and fundamental concepts of CMake. Personally, I like to learn things systematically. The first few chapters of this book explain CMake’s internal working mechanism and its basic language syntax extremely well and clearly.
For practitioners, I recommend getting your hands dirty by following examples once you have been exposed to some fundamentals of CMake. Besides, I found this wonderful post quite useful for getting started with CMake.