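Reflection for C++26! At Last, You're Here

C++26 Is Feature-Complete and the Reflection Proposals Are Voted In

On June 21, 2025, Herb Sutter, a well-known C++ standards committee member and longtime Microsoft engineer, published a post on his personal blog titled “Trip report: June 2025 ISO C++ standards meeting (Sofia, Bulgaria) – Sutter’s Mill”.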

A few minutes ago, the C++ committee voted the first seven (7) papers for compile-time reflection into draft C++26

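In the first section, “A unique milestone: ‘Whole new language’”, Herb Sutter excitedly reports that just minutes earlier the C++ standards committee had voted in the seven papers that make up compile-time reflection. Together they will be merged into the C++26 draft.

Sutter's seemingly calm announcement hit like a thunderclap and stirred up the enthusiasm that first drew me to C++; it took me a long while to settle down. WOW! As a C++ engineer and enthusiast, I started studying the various reflection proposals early on and watched them evolve from the first version, Static Reflection (N3996), to the final Reflection for C++26, so it is hard to put my excitement into words. “I finally got you, and I'm glad I never gave up”: C++ keeps its footing in systems and high-performance programming and is playing an ever more important role.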

Past Tense: Macro-Based Code Generation

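In day-to-day engineering work that involves code generation, such as JSON serialization and deserialization, I usually combine macros with libclang to parse and generate code, and then use compiler-specific predefined identifiers such as __PRETTY_FUNCTION__ (GCC/Clang) or __FUNCSIG__ (MSVC) to read the signature string of the enclosing function. Finally, by injecting an enumerator into a non-type template parameter, its string form can be read back from that signature; combined with structured bindings, the public members of a class can be read in order up to a fixed limit (say, 128), which gives a simple form of member reflection.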
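The signature trick described above can be sketched in C++17 roughly as follows. This is only a minimal illustration for GCC and Clang; the names Color and enum_name are made up for the example, and the parsing assumes the "V = ..." layout these compilers currently print, which no standard guarantees.

```cpp
#include <string_view>

#if !defined(__GNUC__) && !defined(__clang__)
#error "This sketch relies on __PRETTY_FUNCTION__ (GCC/Clang only)."
#endif

// Hypothetical enum used only for this illustration.
enum class Color { red, green, blue };

// Returns the enumerator name of V by slicing the compiler-generated
// signature string. The layout of that string is NOT standardized.
template <auto V>
constexpr std::string_view enum_name() {
  std::string_view sig = __PRETTY_FUNCTION__;
  const auto start = sig.find("V = ");
  if (start == std::string_view::npos) return {};
  sig.remove_prefix(start + 4);                  // drop everything up to "V = "
  sig = sig.substr(0, sig.find_first_of(";]"));  // cut at the end of the value
  const auto scope = sig.rfind("::");
  if (scope != std::string_view::npos) sig.remove_prefix(scope + 2);
  return sig;                                    // e.g. "green"
}

static_assert(enum_name<Color::green>() == "green");
```

The static_assert only holds as long as the compiler keeps printing the template argument in this form, which is exactly the fragility discussed in this section.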

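Modern C++ lacks a powerful code-generation facility. Without external tools, macros can only do simple code expansion, that is, text substitution. Variadic macro expansion (__VA_ARGS__) is not flexible either: arguments cannot be accessed by index. For more complex expansion, one typically uses libclang, as mentioned above, to parse the code's AST, recursively searches it for the constructs of interest, and finally feeds the results into a text-templating library to build a project-private generation system.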
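As a small illustration of macro-based expansion as plain text substitution, here is the well-known X-macro pattern. It is a sketch, not the author's generator; SHAPE_LIST, Shape, and to_string are hypothetical names chosen for the example.

```cpp
#include <string_view>

// A single list of names drives every expansion below.
#define SHAPE_LIST(X) \
  X(circle)           \
  X(square)           \
  X(triangle)

// Expansion 1: the enumerators.
enum class Shape {
#define AS_ENUMERATOR(name) name,
  SHAPE_LIST(AS_ENUMERATOR)
#undef AS_ENUMERATOR
};

// Expansion 2: one switch case per enumerator, with #name as its string.
constexpr std::string_view to_string(Shape s) {
  switch (s) {
#define AS_CASE(name) \
  case Shape::name:   \
    return #name;
    SHAPE_LIST(AS_CASE)
#undef AS_CASE
  }
  return "<unknown>";
}
```

The preprocessor only pastes text: it cannot inspect the enum, index into the list, or see anything that was not spelled out in SHAPE_LIST.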

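Reflection has long been a weak spot of modern C++ in engineering practice. The typeinfo/typeindex machinery (std::type_info, std::type_index) only provides limited information such as a type's runtime name. With structured bindings from C++17 onward, one can write macros that adapt the binding to classes with different numbers of data members, and then use the “trick” from the first paragraph to extract each member's name. With constexpr/consteval functions, this yields zero-overhead basic reflection that covers everyday serialization and deserialization.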
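The structured-bindings approach can be sketched in C++17 as below. It is a simplified illustration limited to aggregates with at most three members (real libraries generate the cases up to a much larger limit, often with macros); AnyType, can_brace_init, field_count, to_tuple, and Sample are names invented for this example.

```cpp
#include <cstddef>
#include <tuple>
#include <type_traits>
#include <utility>

// Converts to any type; only ever used in unevaluated contexts.
struct AnyType {
  template <typename T>
  operator T() const;
};

// True when T{args...} is well-formed for the given argument types.
template <typename T, typename = void, typename... Args>
struct can_brace_init : std::false_type {};

template <typename T, typename... Args>
struct can_brace_init<T, std::void_t<decltype(T{std::declval<Args>()...})>,
                      Args...> : std::true_type {};

// Counts members by probing the largest brace-init arity that compiles.
template <typename T>
constexpr std::size_t field_count() {
  if constexpr (can_brace_init<T, void, AnyType, AnyType, AnyType>::value)
    return 3;
  else if constexpr (can_brace_init<T, void, AnyType, AnyType>::value)
    return 2;
  else if constexpr (can_brace_init<T, void, AnyType>::value)
    return 1;
  else
    return 0;
}

// One structured binding per supported member count.
template <typename T>
auto to_tuple(const T& value) {
  constexpr std::size_t n = field_count<T>();
  if constexpr (n == 3) {
    const auto& [a, b, c] = value;
    return std::make_tuple(a, b, c);
  } else if constexpr (n == 2) {
    const auto& [a, b] = value;
    return std::make_tuple(a, b);
  } else if constexpr (n == 1) {
    const auto& [a] = value;
    return std::make_tuple(a);
  } else {
    return std::tuple<>{};
  }
}

// Hypothetical aggregate used for the demonstration.
struct Sample {
  int id;
  double weight;
};
```

Note that this recovers member values in declaration order but not their names; the names still have to come from the compiler-specific signature trick.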

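This approach is inelegant and runs against C++'s platform-independent philosophy. An implementation has to account for different compilers and versions, because the “extracted” strings change across compiler releases and are not guaranteed by the standard. For example, the strings obtained from Clang 14, 15, 16, and 17 differ in subtle ways; you can verify this yourself on Compiler Explorer (godbolt.org). Moreover, the information available this way is very limited: other metadata such as class names, namespaces, aliases, member function lists, private data members, and attributes is out of reach. With the arrival of Reflection for C++26, this long-standing pain point is finally put to rest, and the language's feature set is reborn.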

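Present Tense: A Surprise! Seven Proposals Adopted at Once

In the post above, Herb Sutter notes that six reflection-related core language papers and one reflection-related standard library paper were adopted in a single round of voting. Congrats! They are: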

P2996R13 “Reflection for C++26” by Wyatt Childers, Peter Dimov, Dan Katz, Barry Revzin, Andrew Sutton, Faisal Vali, and Daveed Vandevoorde

P3394R4 “Annotations for reflection” by Wyatt Childers, Dan Katz, Barry Revzin, and Daveed Vandevoorde.

P3293R3 “Splicing a base class subobject” by Peter Dimov, Dan Katz, Barry Revzin, and Daveed Vandevoorde.

P3491R3 “define_static_{string,object,array}” by Wyatt Childers, Peter Dimov, Barry Revzin, and Daveed Vandevoorde

P1306R5 “Expansion statements” by Dan Katz, Andrew Sutton, Sam Goodrick, Daveed Vandevoorde, and Barry Revzin

P3096R12 “Function parameter reflection in reflection for C++26”[sic] by Adam Lach, Dan Katz, and Walter Genovese

P3560R2 “Error handling in reflection” by Peter Dimov and Barry Revzin

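The Bombshell: P2996R13 “Reflection for C++26”

This is the core of Reflection for C++26; it lays the foundation for the whole compile-time reflection mechanism. Sutter notes that although this proposal does not yet include the heavier code-generation part (Metaclasses for Generative C++), it is still an unparalleled and highly usable new language feature: complete reflection support.

The adopted proposal introduces a new language-level operator, ^^ (it looks rather happy), which we call the reflection operator. It maps a syntactically valid construct to a reflection value of type std::meta::info. The following forms are valid: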

^^::
^^ namespace-name
^^ type-id
^^ id-expression

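If ^^ is followed by an id-expression, the result is a reflection of the specific entity found by compile-time name lookup. The possibilities include:

  • Variables, static data members, structured bindings
  • Free functions, member functions
  • Non-static data members (member variables)
  • A primary template or primary member template (the general template before specialization)
  • Enumerators
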
namespace foo {
  class bar {
  public:
    void do_it() {}
  private:
    int data_{};
  };
}

constexpr auto global_value = 123;
constexpr std::meta::info rbar = ^^foo::bar;
constexpr auto rdo_it = ^^foo::bar::do_it;
constexpr auto rns_foo = ^^foo;
constexpr auto rglobal_value = ^^global_value;

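Given a reflection value of type std::meta::info, the splice operator [: xxx :], which the proposal calls a splicer, maps it back to the original language construct, such as a namespace, type, alias, function, variable, enumerator, or template.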

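int value = 123;
constexpr auto rvalue = ^^value;
// rvalue reflects the variable, so take its type before splicing a type.
typename [: std::meta::type_of(rvalue) :] value2 = 456;  // int value2 = 456;

struct S { int a; };

// Splicing a member in a member-access expression:
constexpr S s{2};
static_assert(s.[:^^S::a:] == 2);
// P2996 also discusses splicing a designator, {.[:^^S::a:] = 2}, but to my
// understanding defers that form to a future extension.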

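In addition, P2996 discusses a special form called range splicers, written [: ...members :]. As the name suggests, this form expands a range of std::meta::info values, for example the list of a class's non-static data members. Note that the paper presents range splicers as a design direction inherited from P1240, so check the adopted wording before relying on them in C++26. The idea looks like this: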

template <typename T>
constexpr auto struct_to_tuple(const T& t) {
  constexpr auto members = std::meta::nonstatic_data_members_of(^^T);
  return std::make_tuple(t.[: ...members :]...);
}

The form above would expand to:

make_tuple(t.[:members[0]:], t.[:members[1]:], ..., t.[:members[N-1]:])
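For readers without a C++26 compiler, the same kind of expansion can be approximated today with a pack of pointers to members. The C++17 sketch below is only an analogy, not the proposal's mechanism: each pointer to member plays the role of one spliced reflection, and the caller has to list the members by hand. Point and this struct_to_tuple signature are assumptions made for the example.

```cpp
#include <tuple>

// Hypothetical aggregate used for the demonstration.
struct Point {
  int x;
  double y;
};

// t.*members... expands to t.*m0, t.*m1, ..., mirroring
// t.[:members[0]:], t.[:members[1]:], ... in the reflection version.
template <typename T, typename... M>
constexpr auto struct_to_tuple(const T& t, M T::*... members) {
  return std::make_tuple(t.*members...);
}

constexpr Point origin_offset{1, 2.5};
static_assert(struct_to_tuple(origin_offset, &Point::x, &Point::y) ==
              std::make_tuple(1, 2.5));
```

What reflection adds is the ability to obtain the member list from the type itself instead of spelling out &Point::x and &Point::y at every call site.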

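Reflection for C++26 defines a large set of standard library APIs that support all kinds of compile-time queries on std::meta::info. With them we can easily obtain metadata for almost everything in the type system, even linkage and storage duration. The API surface looks like this: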

namespace std::meta {
  using info = decltype(^^::);

  template <typename R>
    concept reflection_range = /* see above */;

  // name and location
  consteval auto identifier_of(info r) -> string_view;
  consteval auto u8identifier_of(info r) -> u8string_view;

  consteval auto display_string_of(info r) -> string_view;
  consteval auto u8display_string_of(info r) -> u8string_view;

  consteval auto source_location_of(info r) -> source_location;

  // type queries
  consteval auto type_of(info r) -> info;
  consteval auto parent_of(info r) -> info;
  consteval auto dealias(info r) -> info;

  // object and constant queries
  consteval auto object_of(info r) -> info;
  consteval auto constant_of(info r) -> info;

  // template queries
  consteval auto template_of(info r) -> info;
  consteval auto template_arguments_of(info r) -> vector<info>;

  // member queries
  consteval auto members_of(info r) -> vector<info>;
  consteval auto bases_of(info type_class) -> vector<info>;
  consteval auto static_data_members_of(info type_class) -> vector<info>;
  consteval auto nonstatic_data_members_of(info type_class) -> vector<info>;
  consteval auto enumerators_of(info type_enum) -> vector<info>;

  // substitute
  template <reflection_range R = initializer_list<info>>
    consteval auto can_substitute(info templ, R&& args) -> bool;
  template <reflection_range R = initializer_list<info>>
    consteval auto substitute(info templ, R&& args) -> info;

  // reflect expression results
  template <typename T>
    consteval auto reflect_constant(const T& value) -> info;
  template <typename T>
    consteval auto reflect_object(T& value) -> info;
  template <typename T>
    consteval auto reflect_function(T& value) -> info;

  // extract
  template <typename T>
    consteval auto extract(info) -> T;

  // other type predicates (see the wording)
  consteval auto is_public(info r) -> bool;
  consteval auto is_protected(info r) -> bool;
  consteval auto is_private(info r) -> bool;
  consteval auto is_virtual(info r) -> bool;
  consteval auto is_pure_virtual(info r) -> bool;
  consteval auto is_override(info r) -> bool;
  consteval auto is_final(info r) -> bool;
  consteval auto is_deleted(info r) -> bool;
  consteval auto is_defaulted(info r) -> bool;
  consteval auto is_explicit(info r) -> bool;
  consteval auto is_noexcept(info r) -> bool;
  consteval auto is_bit_field(info r) -> bool;
  consteval auto is_enumerator(info r) -> bool;
  consteval auto is_const(info r) -> bool;
  consteval auto is_volatile(info r) -> bool;
  consteval auto is_mutable_member(info r) -> bool;
  consteval auto is_lvalue_reference_qualified(info r) -> bool;
  consteval auto is_rvalue_reference_qualified(info r) -> bool;
  consteval auto has_static_storage_duration(info r) -> bool;
  consteval auto has_thread_storage_duration(info r) -> bool;
  consteval auto has_automatic_storage_duration(info r) -> bool;
  consteval auto has_internal_linkage(info r) -> bool;
  consteval auto has_module_linkage(info r) -> bool;
  consteval auto has_external_linkage(info r) -> bool;
  consteval auto has_linkage(info r) -> bool;
  consteval auto is_class_member(info r) -> bool;
  consteval auto is_namespace_member(info r) -> bool;
  consteval auto is_nonstatic_data_member(info r) -> bool;
  consteval auto is_static_member(info r) -> bool;
  consteval auto is_base(info r) -> bool;
  consteval auto is_data_member_spec(info r) -> bool;
  consteval auto is_namespace(info r) -> bool;
  consteval auto is_function(info r) -> bool;
  consteval auto is_variable(info r) -> bool;
  consteval auto is_type(info r) -> bool;
  consteval auto is_type_alias(info r) -> bool;
  consteval auto is_namespace_alias(info r) -> bool;
  consteval auto is_complete_type(info r) -> bool;
  consteval auto is_enumerable_type(info r) -> bool;
  consteval auto is_template(info r) -> bool;
  consteval auto is_function_template(info r) -> bool;
  consteval auto is_variable_template(info r) -> bool;
  consteval auto is_class_template(info r) -> bool;
  consteval auto is_alias_template(info r) -> bool;
  consteval auto is_conversion_function_template(info r) -> bool;
  consteval auto is_operator_function_template(info r) -> bool;
  consteval auto is_literal_operator_template(info r) -> bool;
  consteval auto is_constructor_template(info r) -> bool;
  consteval auto is_concept(info r) -> bool;
  consteval auto is_structured_binding(info r) -> bool;
  consteval auto is_value(info r) -> bool;
  consteval auto is_object(info r) -> bool;
  consteval auto has_template_arguments(info r) -> bool;
  consteval auto has_default_member_initializer(info r) -> bool;

  consteval auto is_special_member_function(info r) -> bool;
  consteval auto is_conversion_function(info r) -> bool;
  consteval auto is_operator_function(info r) -> bool;
  consteval auto is_literal_operator(info r) -> bool;
  consteval auto is_constructor(info r) -> bool;
  consteval auto is_default_constructor(info r) -> bool;
  consteval auto is_copy_constructor(info r) -> bool;
  consteval auto is_move_constructor(info r) -> bool;
  consteval auto is_assignment(info r) -> bool;
  consteval auto is_copy_assignment(info r) -> bool;
  consteval auto is_move_assignment(info r) -> bool;
  consteval auto is_destructor(info r) -> bool;
  consteval auto is_user_provided(info r) -> bool;
  consteval auto is_user_declared(info r) -> bool;

  // define_aggregate
  struct data_member_options;
  consteval auto data_member_spec(info type_class,
                                  data_member_options options) -> info;
  template <reflection_range R = initializer_list<info>>
    consteval auto define_aggregate(info type_class, R&&) -> info;

  // data layout
  struct member_offset {
    ptrdiff_t bytes;
    ptrdiff_t bits;
    constexpr auto total_bits() const -> ptrdiff_t;
    auto operator<=>(member_offset const&) const = default;
  };

  consteval auto offset_of(info r) -> member_offset;
  consteval auto size_of(info r) -> size_t;
  consteval auto alignment_of(info r) -> size_t;
  consteval auto bit_size_of(info r) -> size_t;
}

Reflection Example A: The Classic Enum-to-String Conversion

template<typename E, bool Enumerable = std::meta::is_enumerable_type(^^E)>
  requires std::is_enum_v<E>
constexpr std::string_view enum_to_string(E value) {
  if constexpr (Enumerable)
    template for (constexpr auto e :
                  std::define_static_array(std::meta::enumerators_of(^^E)))
      if (value == [:e:])
        return std::meta::identifier_of(e);

  return "<unnamed>";
}

int main() {
  enum Color : int;
  static_assert(enum_to_string(Color(0)) == "<unnamed>");
  std::println("Color 0: {}", enum_to_string(Color(0)));  // prints '<unnamed>'

  enum Color : int { red, green, blue };
  static_assert(enum_to_string(Color::red) == "red");
  static_assert(enum_to_string(Color(42)) == "<unnamed>");
  std::println("Color 0: {}", enum_to_string(Color(0)));  // prints 'red'
}

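Note the template for (constexpr auto e : ...) construct: it is an expansion statement over a compile-time range (P1306), not a real for loop; the body is instantiated once per element. It greatly simplifies template metaprogramming, and many of the tricks in use today (std::apply, for example) can largely retire.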
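For comparison, this is roughly what the std::apply trick looks like in C++17: a generic lambda receives the tuple elements as a pack and a fold expression "iterates" over them. The helper names for_each_element and sum_all are assumptions made for this sketch.

```cpp
#include <tuple>
#include <utility>

// Calls f once per tuple element, in order, via a fold over the comma operator.
template <typename Tuple, typename F>
void for_each_element(Tuple&& tuple, F&& f) {
  std::apply(
      [&f](auto&&... elems) { (f(std::forward<decltype(elems)>(elems)), ...); },
      std::forward<Tuple>(tuple));
}

// Usage: add up heterogeneous numeric elements.
template <typename... Ts>
double sum_all(const std::tuple<Ts...>& values) {
  double total = 0;
  for_each_element(values, [&total](const auto& x) { total += x; });
  return total;
}
```

With an expansion statement the same loop body can be written inline, without the lambda-and-fold indirection.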

Reflection Example B: Parsing Command-Line Arguments into a Struct

template<typename Opts>
auto parse_options(std::span<std::string_view const> args) -> Opts {
  Opts opts;

  constexpr auto ctx = std::meta::access_context::current();
  template for (constexpr auto dm : nonstatic_data_members_of(^^Opts, ctx)) {
    auto it = std::ranges::find_if(args,
      [](std::string_view arg){
        return arg.starts_with("--") && arg.substr(2) == identifier_of(dm);
      });

    if (it == args.end()) {
      // no option provided, use default
      continue;
    } else if (it + 1 == args.end()) {
      std::print(stderr, "Option {} is missing a value\n", *it);
      std::exit(EXIT_FAILURE);
    }

    using T = typename[:type_of(dm):];
    auto iss = std::ispanstream(it[1]);
    if (iss >> opts.[:dm:]; !iss) {
      std::print(stderr, "Failed to parse option {} into a {}\n", *it, display_string_of(^^T));
      std::exit(EXIT_FAILURE);
    }
  }
  return opts;
}

struct MyOpts {
  std::string file_name = "input.txt";  // Option "--file_name <string>"
  int    count = 1;                     // Option "--count <int>"
};

int main(int argc, char *argv[]) {
  MyOpts opts = parse_options<MyOpts>(std::vector<std::string_view>(argv+1, argv+argc));
  // ...
}
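To see what the reflection-based parser automates, here is a hand-written C++17 counterpart in which the member list must be supplied explicitly as name/pointer pairs. It is a simplified sketch: Field and this parse_options signature are hypothetical, error handling is omitted, and each value is read with operator>> as in the example above.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical descriptor: an option name plus a pointer to the member it fills.
template <typename C, typename M>
struct Field {
  std::string_view name;
  M C::* ptr;
};
template <typename C, typename M>
Field(std::string_view, M C::*) -> Field<C, M>;

template <typename Opts, typename... Fields>
Opts parse_options(const std::vector<std::string_view>& args,
                   const Fields&... fields) {
  Opts opts{};
  auto parse_field = [&](const auto& field) {
    // Look for "--<name> <value>" pairs; the last match wins.
    for (std::size_t i = 0; i + 1 < args.size(); ++i) {
      const std::string_view arg = args[i];
      if (arg.substr(0, 2) == "--" && arg.substr(2) == field.name) {
        std::istringstream iss{std::string(args[i + 1])};
        iss >> opts.*(field.ptr);
      }
    }
  };
  (parse_field(fields), ...);  // one call per listed member
  return opts;
}

struct MyOpts {
  std::string file_name = "input.txt";
  int count = 1;
};
```

Every new member of MyOpts needs a matching Field entry at the call site; with nonstatic_data_members_of and identifier_of that list is derived from the struct itself.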

Reflection Example C: A Compile-Time Counter

template<int N> struct Helper;

struct TU_Ticket {
  static consteval int latest() {
    int k = 0;
    while (is_complete_type(substitute(^^Helper,
                                       { std::meta::reflect_constant(k) })))
      ++k;
    return k;
  }

  static consteval void increment() {
    define_aggregate(substitute(^^Helper,
                                { std::meta::reflect_constant(latest())}),
                     {});
  }
};

constexpr int x = TU_Ticket::latest();  // x initialized to 0.

consteval { TU_Ticket::increment(); }
constexpr int y = TU_Ticket::latest();  // y initialized to 1.

consteval { TU_Ticket::increment(); }
constexpr int z = TU_Ticket::latest();  // z initialized to 2.

static_assert(x == 0);
static_assert(y == 1);
static_assert(z == 2);

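C++ Gets Annotations Too! P3394R4 “Annotations for reflection”

P3394R4 is, in my view, one of the most elegant and important reflection features. It brings user-defined annotations to C++26, similar to annotations in Java and attributes in C#. I had assumed custom annotations would not make it this time, yet they were unexpectedly voted in, and I practically jumped three feet in the air (exaggerating a little).

C++ itself has had a similar built-in feature since C++11: attributes. The authors of P3394R4 point out that although attributes and annotations are close in functionality, the grammar of traditional attributes is loosely specified: an implementation may ignore attributes it does not recognize, and may define its own vendor-specific attributes. This is an established fact that would be very hard to change. C++26 therefore introduces annotations as a new construct that stands for user-defined “attributes” and fills its own niche.

The simplest numeric annotation can be written as follows: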

struct C {
    [[=1]] int a;
};

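[[=1]] attaches an annotation of type int with value 1. The = inside [[ ]] is mandatory; it distinguishes annotations from traditional attributes. Multiple annotations can be stacked, in several ways: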

[[=42, =42]] int x;
static_assert(std::meta::annotations_of(^^x).size() == 2);

[[=42]] int f();
[[=24]] int f();
static_assert(std::meta::annotations_of(^^f).size() == 2);

struct [[=0]] S {};  // Okay: Appertains to S.
[[=42]] int f();     // Okay: Appertains to f.
int f[[=0]] ();      // Ditto.

The std::meta::annotations_of metafunction returns all annotations on a type, variable, or function. An annotation can also be a value of class or struct type:

struct display_name {
  const char* value{};
};

struct table {
  [[=display_name("MyName")]]
  std::string name;
  
  [[=display_name("Int32Number")]]
  int number{};
};

Annotations Example: Parameter Injection and Automatic Discovery for Unit Tests

namespace N {
    [[=parametrize({
        Tuple{1, 1, 2},
        Tuple{1, 2, 3}
        })]]
    void test_sum(int x, int y, int z) {
        std::println("Called test_sum(x={}, y={}, z={})", x, y, z);
    }

    struct Fixture {
        Fixture() {
            std::println("setup fixture");
        }

        ~Fixture() {
            std::println("teardown fixture");
        }

        [[=parametrize({Tuple{1}, Tuple{2}})]]
        void test_one(int x) {
            std::println("test one({})", x);
        }

        void test_two() {
            std::println("test two");
        }
    };
}

int main() {
    invoke_all<^^N>();
}

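Suppose invoke_all<std::meta::info> uses Reflection for C++26 to discover every unit-test function in a namespace that carries the =parametrize annotation (parametrize, Tuple, and invoke_all are user-written helpers that are not shown here). The =parametrize annotation injects the fixed data a test case needs at compile time. Doesn't it suddenly feel like Python or C#?

Afterword

I originally thought Reflection for C++26 would only bring the core language feature, and did not expect the accompanying heavyweight proposals to enter the standard all at once. The standards committee is gradually catching up with the times, and C++ keeps changing day by day, striding toward modernization. I sincerely look forward to the task type and execution model in C++26, which bring large-scale use of coroutines ever closer. Let's keep at it together!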

Introducing SvcHostify: Simplify Hosting Custom DLL Services

Have you ever struggled with hosting your own DLLs as Windows services using svchost.exe? Many developers face challenges due to the undocumented and restrictive nature of svchost.exe. That’s where SvcHostify comes in—a lightweight, open-source tool designed to make hosting custom DLL services easier than ever.

Why SvcHostify?

SvcHostify eliminates the complexity of writing svchost-compatible services. Whether you’re coding in Java, C#, or C++, this tool handles the heavy lifting, letting you focus on your application logic. No more worries about C-style exports or system quirks.

  • Support for multiple languages: Build services in Java, C#, or C++ without worrying about low-level plumbing.
  • Two hosting modes:
    • SvcHost Mode: For academic/research purposes.
    • Standalone Mode: Run services using rundll32.exe—perfect for production.
  • Streamlined JSON configuration: Define service metadata, runtime behavior, and logging in one simple file.

What Can You Do with SvcHostify?

  • Host Java applications via JVM integration.
  • Run .NET DLLs as in-process COM servers.
  • Manage lightweight C++ services by exporting minimalistic C-style functions.

Features You’ll Love

  1. Effortless Service Management:
    Install and uninstall services in seconds with simple commands.

    rundll32 svchostify.dll invoke -i -c <CONFIG_FILE>
    rundll32 svchostify.dll invoke -u -c <CONFIG_FILE>

  2. Centralized Logging:
    Consolidates logs from stdout, stderr, and other streams into a single file, making debugging a breeze.
  3. Production-Ready Standalone Mode:
    Distribute your service easily without needing svchost.exe.
  4. Open Source Flexibility:
    Modify, enhance, and adapt SvcHostify for your unique use cases. It is licensed under MIT.

Sample Use Case: Hosting a Java Service

Here’s an example JSON configuration for running a Java-based service:

{
  "workerType": "jvm",
  "name": "MyJavaService",
  "displayName": "My Java Test Service",
  "context": "path/to/your/app.jar",
  "accountType": "networkService",
  "standalone": true,
  "logger": {
    "basePath": "logs/java.log",
    "maxSize": "10 MiB",
    "maxFiles": 5
  }
}

A few lines of JSON are all it takes to spin up your service!

Who Should Use SvcHostify?

  • Developers who need a quick way to deploy custom DLL services on Windows.
  • Organizations looking to simplify the deployment of internal tools and applications.
  • Hobbyists and researchers exploring the possibilities of Windows Service customization.

Get Started Today!

Visit the SvcHostify GitHub Repository to download the latest release, check out detailed documentation, and dive into sample projects.

Simplify your Windows Service development with SvcHostify—your next DLL-hosting project just got easier!

Controlling the C++ Runtime Linkage: A Portable Solution

Many open-source C++ projects use Modern CMake to manage builds across multiple platforms. Modern CMake refers to the practices, features, and methodologies introduced in CMake 3.x and later that simplify and modernize build configuration for C++ and other languages. It emphasizes clarity, reusability, and portability, and aims to keep CMake projects easy to maintain.

For the most part, the CMAKE_<LANG>_FLAGS CMake variable is provided for consumers to set additional compiler flags for the <LANG> programming language. For example, CMAKE_CXX_FLAGS impacts the current C++ compiler and CMAKE_C_FLAGS will take effect when building a C target in the subsequent tasks.

if(CMAKE_CXX_COMPILER_ID STREQUAL "MSVC")
  string(APPEND CMAKE_CXX_FLAGS " /EHsc")
else()
  string(APPEND CMAKE_CXX_FLAGS " -fexceptions")
endif()

The CMAKE_<LANG>_FLAGS variable applies to every build type, Release and Debug alike. To set flags for a single build type, use the per-configuration variables such as CMAKE_<LANG>_FLAGS_RELEASE and CMAKE_<LANG>_FLAGS_DEBUG.

if(CMAKE_CXX_COMPILER_ID STREQUAL "MSVC")
  string(APPEND CMAKE_CXX_FLAGS_DEBUG " /Od")
  string(APPEND CMAKE_CXX_FLAGS_RELEASE " /O2")
else()
  string(APPEND CMAKE_CXX_FLAGS_DEBUG " -O0")
  string(APPEND CMAKE_CXX_FLAGS_RELEASE " -O3")
endif()

On Windows, Visual Studio 2015 and later versions of Visual Studio all use one Universal Visual C Runtime Library, called the Universal CRT (UCRT). The UCRT is a Microsoft Windows operating system component. It’s included as part of the operating system in Windows 10 or later, and Windows Server 2016 or later. The Visual C++ Runtime Library is usually distributed with the corresponding MSVC tooling or installed independently with third-party software. The VC++ Runtime Library depends on the UCRT in the operating system.

The UCRT and the VC++ Runtime have static and dynamic libraries within MSVC and thereby the MSVC’s compiler cl.exe provides several options for linkage mode control of the two runtimes simultaneously. The options are:

Option (Release) | Option (Debug) | Description
/MD              | /MDd           | Links to the UCRT and VC++ Runtime dynamically (DLL)
/MT              | /MTd           | Links to the UCRT and VC++ Runtime statically
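To check from inside a program which of these options was actually in effect, one can look at the macros MSVC predefines: _DLL is defined for /MD and /MDd, and _DEBUG for the debug variants /MDd and /MTd. The function name below is made up for this sketch, and on non-MSVC compilers it simply reports that the question does not apply.

```cpp
#include <string_view>

// Reports the CRT linkage mode selected on the MSVC command line.
constexpr std::string_view msvc_runtime_linkage() {
#if !defined(_MSC_VER)
  return "not MSVC";
#elif defined(_DLL) && defined(_DEBUG)
  return "/MDd (dynamic, debug)";
#elif defined(_DLL)
  return "/MD (dynamic, release)";
#elif defined(_DEBUG)
  return "/MTd (static, debug)";
#else
  return "/MT (static, release)";
#endif
}
```

Printing this value from each dependency is a quick way to spot a library that was built with a different runtime than the rest of the project.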

On Linux and other Unix-like platforms, GCC provides the -static-libstdc++ and -static-libgcc options to link against the static archives libstdc++.a and libgcc.a. libgcc is the GCC low-level runtime library; GCC automatically generates calls to its routines whenever it needs to perform an operation that is too complicated to emit as inline code.

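Clang accepts -static-libstdc++ and -static-libgcc as well, for compatibility with GCC. The table below also lists -static-libc++ and -static-compiler-rt as Clang spellings for libc++ and compiler-rt; verify those two against your Clang version, since the function shown later in this post relies only on the GCC-compatible names.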

Compiler | Option                              | Description
GCC      | -static-libstdc++                   | Links to libstdc++ statically
GCC      | -static-libgcc                      | Links to libgcc statically
Clang    | -static-libc++, -static-libstdc++   | Links to libc++ statically
Clang    | -static-libgcc, -static-compiler-rt | Links to compiler-rt statically

In a C++ project, it is recommended to compile all dependencies with the same configuration. Because compilers and option names differ, writing portable CMake code for this is somewhat complex, and users may also pass options via -DCMAKE_<LANG>_FLAGS=xxx on the command line, which can lead to an incorrect build configuration or unexpected behavior.

A solution is to define a CMake option by convention, e.g. WITH_STATIC_RUNTIME, and to declare it the official way to change the runtime linkage mode.

option(WITH_STATIC_RUNTIME "Link to the C++ runtime statically." OFF)

A user-defined CMake function then generates the corresponding flags for each compiler, following the rules above.

function(es_make_c_cxx_runtime_flags)
  set(options STATIC_RUNTIME)
  set(one_value_args RESULT_DEBUG_FLAGS RESULT_RELEASE_FLAGS)
  set(multi_value_args "")
  cmake_parse_arguments(PARSE_ARGV 0 ARG "${options}" "${one_value_args}" "${multi_value_args}")
  es_ensure_parameters(es_make_c_cxx_runtime_flags ARG RESULT_DEBUG_FLAGS RESULT_RELEASE_FLAGS)

  if(CMAKE_C_COMPILER_ID STREQUAL "MSVC" OR CMAKE_CXX_COMPILER_ID STREQUAL "MSVC")
    if(ARG_STATIC_RUNTIME)
      set(debug_flag /MTd)
      set(release_flag /MT)
    else()
      set(debug_flag /MDd)
      set(release_flag /MD)
    endif()
  elseif(CMAKE_C_COMPILER_ID STREQUAL "GNU" OR CMAKE_C_COMPILER_ID STREQUAL "Clang" OR CMAKE_CXX_COMPILER_ID STREQUAL "GNU" OR CMAKE_CXX_COMPILER_ID STREQUAL "Clang")
    if(ARG_STATIC_RUNTIME)
      set(debug_flag "-static-libstdc++ -static-libgcc")
      set(release_flag ${debug_flag})
    else()
      set(debug_flag "")
      set(release_flag "")
    endif()
  else()
    message(FATAL_ERROR "Unsupported compiler: ${CMAKE_CXX_COMPILER_ID}.")
  endif()
  
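  # Quote the values so that an empty flag string sets an empty variable
  # instead of unsetting it in the parent scope.
  set(${ARG_RESULT_DEBUG_FLAGS} "${debug_flag}" PARENT_SCOPE)
  set(${ARG_RESULT_RELEASE_FLAGS} "${release_flag}" PARENT_SCOPE)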
endfunction()

The es_ensure_parameters function raises a fatal error when the mandatory parameters do not exist. Here is an example showing how to use this function:

if(WITH_STATIC_RUNTIME)
  set(runtime_switch STATIC_RUNTIME)
else()
  set(runtime_switch "")
endif()

es_make_c_cxx_runtime_flags(
  ${runtime_switch}
  RESULT_DEBUG_FLAGS debug_flags
  RESULT_RELEASE_FLAGS release_flags
)

message(STATUS "debug_flags: ${debug_flags}")
message(STATUS "release_flags: ${release_flags}")

Another design goal is robustness: whatever flags users have set must never corrupt the build. A workaround is to clear the old runtime flags first and then set the new ones. One possible implementation iterates over all the flag variables, removes the conflicting flags, and then appends the generated flags. The string(REPLACE ...) command performs the text replacement.

string(REPLACE "string-to-find" "replacement" <RESULT> "source")

set(input_str "Hello world")
string(REPLACE "Hello" "Echo" output_str "${input_str}")
message(STATUS "${output_str}")

In CMake, a list is simply a string whose items are separated by semicolons, so an ordinary string can be turned into a list by replacing its separators with semicolons. The CMAKE_<LANG>_FLAGS and CMAKE_<LANG>_FLAGS_<BUILD_TYPE> variables use spaces as delimiters, so replacing every space with a semicolon lets them be passed to foreach. The conflicting flags can be obtained by calling the function with the WITH_STATIC_RUNTIME choice reversed.

if(ARG_STATIC_RUNTIME)
  set(runtime_switch STATIC_RUNTIME)
  set(reverse_runtime_switch "")
else()
  set(runtime_switch "")
  set(reverse_runtime_switch STATIC_RUNTIME)
endif()

es_make_c_cxx_runtime_flags(
  ${reverse_runtime_switch}
  RESULT_DEBUG_FLAGS reverse_debug_flags
  RESULT_RELEASE_FLAGS reverse_release_flags
)

string(REPLACE " " ";" reverse_debug_flag_list "${reverse_debug_flags}")
string(REPLACE " " ";" reverse_release_flag_list "${reverse_release_flags}")

foreach(item IN LISTS reverse_debug_flag_list)
  string(REPLACE ${item} "" CMAKE_C_FLAGS "${CMAKE_C_FLAGS}")
  string(REPLACE ${item} "" CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS}")
  string(REPLACE ${item} "" CMAKE_C_FLAGS_DEBUG "${CMAKE_C_FLAGS_DEBUG}")
  string(REPLACE ${item} "" CMAKE_CXX_FLAGS_DEBUG "${CMAKE_CXX_FLAGS_DEBUG}")
endforeach()

foreach(item IN LISTS reverse_release_flag_list)
  string(REPLACE ${item} "" CMAKE_C_FLAGS "${CMAKE_C_FLAGS}")
  string(REPLACE ${item} "" CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS}")
  string(REPLACE ${item} "" CMAKE_C_FLAGS_RELEASE "${CMAKE_C_FLAGS_RELEASE}")
  string(REPLACE ${item} "" CMAKE_CXX_FLAGS_RELEASE "${CMAKE_CXX_FLAGS_RELEASE}")
endforeach()

Now all the variables do not contain conflicting flags anymore and the new flags can be appended to them immediately.

es_make_c_cxx_runtime_flags(
  ${runtime_switch}
  RESULT_DEBUG_FLAGS debug_flags
  RESULT_RELEASE_FLAGS release_flags
)

string(APPEND CMAKE_C_FLAGS_DEBUG " ${debug_flags}")
string(APPEND CMAKE_CXX_FLAGS_DEBUG " ${debug_flags}")
string(APPEND CMAKE_C_FLAGS_RELEASE " ${release_flags}")
string(APPEND CMAKE_CXX_FLAGS_RELEASE " ${release_flags}")
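To make the intent of the clean-then-append sequence concrete outside CMake, here is a small C++ model of it. It is a sketch with an invented function name, and it deliberately differs from the string(REPLACE ...) calls above in one respect: it compares whole whitespace-separated tokens, so removing /MD cannot accidentally damage /MDd.

```cpp
#include <algorithm>
#include <sstream>
#include <string>
#include <vector>

// Drops every token listed in `conflicting`, then appends `wanted`.
std::string replace_runtime_flags(const std::string& flags,
                                  const std::vector<std::string>& conflicting,
                                  const std::string& wanted) {
  std::istringstream tokens(flags);
  std::string token;
  std::string result;
  while (tokens >> token) {
    const bool is_conflict =
        std::find(conflicting.begin(), conflicting.end(), token) !=
        conflicting.end();
    if (is_conflict) continue;  // step 1: clear the old runtime flag
    if (!result.empty()) result += ' ';
    result += token;
  }
  if (!wanted.empty()) {        // step 2: append the new runtime flag
    if (!result.empty()) result += ' ';
    result += wanted;
  }
  return result;
}
```

For example, replace_runtime_flags("/O2 /MD /EHsc", {"/MD", "/MDd"}, "/MT") yields "/O2 /EHsc /MT".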

If all of the code above lives inside a CMake function, which has its own variable scope, extra set statements are needed to copy the modified variables to the caller's scope.

set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS}" PARENT_SCOPE)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS}" PARENT_SCOPE)
set(CMAKE_C_FLAGS_DEBUG "${CMAKE_C_FLAGS_DEBUG}" PARENT_SCOPE)
set(CMAKE_CXX_FLAGS_DEBUG "${CMAKE_CXX_FLAGS_DEBUG}" PARENT_SCOPE)
set(CMAKE_C_FLAGS_RELEASE "${CMAKE_C_FLAGS_RELEASE}" PARENT_SCOPE)
set(CMAKE_CXX_FLAGS_RELEASE "${CMAKE_CXX_FLAGS_RELEASE}" PARENT_SCOPE)

Well done! Everything should be in place now. The full implementation is available in my personal repo named “Cpp Essence” on GitHub. I would appreciate your stars and contributions there. Big thanks!

Windows String Representations and Simple Conversions via _bstr_t

The Windows NT kernel (from WinNT to Windows 11) internally uses UTF-16 strings by default, including Windows Drivers, Native Applications and COM clients and servers, etc. All other common encodings like UTF-8, GBK, GB18030, BIG-5 should always be converted to UTF-16 before invocations to the kernel functions within ntdll.dll.

There is a compatibility layer on top of the kernel, called the Win32 API, a huge legacy left by the Win9x series. The Win32 API is a family of functions that lets programmers communicate with the operating system and hardware conveniently. Microsoft preserves this compatibility on the Windows NT kernel, so most Win32 functions remain unchanged and ABI-compatible. For example, the CreateFile function has existed from Windows 98 through Windows 11. That is remarkable, given that a Linux distribution may break an API in a minor update!

HANDLE CreateFile(
  [in]           LPCSTR                lpFileName,
  [in]           DWORD                 dwDesiredAccess,
  [in]           DWORD                 dwShareMode,
  [in, optional] LPSECURITY_ATTRIBUTES lpSecurityAttributes,
  [in]           DWORD                 dwCreationDisposition,
  [in]           DWORD                 dwFlagsAndAttributes,
  [in, optional] HANDLE                hTemplateFile
);

This is where the challenges began. The Unicode Standard became widely accepted by OS vendors only after Win9x had been released, while Win9x itself still used the ANSI code page or a locale-specific multi-byte encoding such as GB2312 or BIG-5. The Windows NT kernel chose UTF-16 as its string representation, which made compatibility with existing code harder.

To resolve this issue, Microsoft's engineers decided to provide two versions of each affected Win32 API. The only difference between the two is the string type: LPCSTR versus LPCWSTR, aliases of const char* and const wchar_t* in C++. The former holds an ANSI string and the latter a UTF-16 encoded string. To tell the exported names apart at the C ABI level, the developers added a single-letter suffix to the function name: A for the ANSI version and W for the Unicode version. Users are free to call either one in their projects.

BOOL DeleteFileA(
  [in] LPCSTR lpFileName
);

BOOL DeleteFileW(
  [in] LPCWSTR lpFileName
);
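The Windows SDK headers let callers use the unsuffixed name and pick the A or W version with the UNICODE macro. The sketch below imitates that convention with a made-up function pair so that it compiles on any platform; TextLengthA, TextLengthW, TextLength, and MY_TEXT are hypothetical names, not Win32 APIs.

```cpp
#include <cstddef>
#include <string>

// Hypothetical API pair following the Win32 naming convention.
inline std::size_t TextLengthA(const char* s) {
  return std::char_traits<char>::length(s);
}
inline std::size_t TextLengthW(const wchar_t* s) {
  return std::char_traits<wchar_t>::length(s);
}

// The unsuffixed name is just a macro that selects one of the two.
#ifdef UNICODE
#define TextLength TextLengthW
#define MY_TEXT(s) L##s
#else
#define TextLength TextLengthA
#define MY_TEXT(s) s
#endif
```

A call such as TextLength(MY_TEXT("abc")) then compiles against whichever version the project's UNICODE setting selects.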

Generally speaking, the encoding APIs MultiByteToWideChar and WideCharToMultiByte are the usual way to interoperate between user-mode programs that use different internal string representations on Windows.

int WideCharToMultiByte(
  [in]            UINT                               CodePage,
  [in]            DWORD                              dwFlags,
  [in]            _In_NLS_string_(cchWideChar)LPCWCH lpWideCharStr,
  [in]            int                                cchWideChar,
  [out, optional] LPSTR                              lpMultiByteStr,
  [in]            int                                cbMultiByte,
  [in, optional]  LPCCH                              lpDefaultChar,
  [out, optional] LPBOOL                             lpUsedDefaultChar
);

int MultiByteToWideChar(
  [in]            UINT                              CodePage,
  [in]            DWORD                             dwFlags,
  [in]            _In_NLS_string_(cbMultiByte)LPCCH lpMultiByteStr,
  [in]            int                               cbMultiByte,
  [out, optional] LPWSTR                            lpWideCharStr,
  [in]            int                               cchWideChar
);

Each conversion requires two calls to the same function: the first computes the required buffer size and the second performs the actual conversion. A simpler way to achieve the same result is the _bstr_t class supplied with the compiler's COM support.

COM always uses UTF-16 strings, as mentioned above, and the BSTR type (a wchar_t* preceded by a length prefix) is its standard string type. MSVC ships a wrapper for BSTR called _bstr_t, which encapsulates the BSTR data type and offers a simpler interface than the general approach.

A simpler way to convert encodings is to construct a _bstr_t from a const char*. This constructor takes an ANSI string (interpreted in the system's active code page) and converts it to UTF-16 immediately. _bstr_t also provides conversion operators to const wchar_t* and const char*, so the same object can be read back in either representation, which covers the reverse direction as well.
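To show what such a conversion actually computes, independent of the Win32 functions, here is a portable sketch that decodes UTF-8 and re-encodes it as UTF-16. It is an illustration only, not a replacement for MultiByteToWideChar: the function name is made up, malformed bytes become U+FFFD, and overlong or surrogate encodings are not rejected.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

std::u16string utf8_to_utf16(std::string_view in) {
  std::u16string out;
  std::size_t i = 0;
  while (i < in.size()) {
    const auto lead = static_cast<unsigned char>(in[i]);
    char32_t cp = 0xFFFD;  // replacement character unless decoding succeeds
    std::size_t len = 1;   // number of bytes consumed for this code point
    if (lead < 0x80)                { cp = lead; }
    else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; len = 2; }
    else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; len = 3; }
    else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; len = 4; }

    if (len > 1 && i + len > in.size()) {  // sequence truncated by end of input
      cp = 0xFFFD;
      len = 1;
    }
    for (std::size_t k = 1; k < len; ++k) {
      const auto next = static_cast<unsigned char>(in[i + k]);
      if ((next & 0xC0) != 0x80) {  // not a continuation byte: resynchronize
        cp = 0xFFFD;
        len = 1;
        break;
      }
      cp = (cp << 6) | (next & 0x3F);
    }
    if (cp > 0x10FFFF) cp = 0xFFFD;  // outside the Unicode range
    i += len;

    if (cp < 0x10000) {
      out.push_back(static_cast<char16_t>(cp));
    } else {  // encode as a surrogate pair
      cp -= 0x10000;
      out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
      out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
    }
  }
  return out;
}
```

Unlike the Win32 pattern, std::u16string grows as needed, so no separate size-query call is required here.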

#include <string>
#include <string_view>

#include <comutil.h>

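// ANSI to UTF-16: the narrow string is interpreted in the system ANSI code page.
const std::string str{ "Hello some 汉字 characters" };
const _bstr_t wide_str{ str.c_str() };
const std::wstring_view wide_view{ static_cast<const wchar_t*>(wide_str) };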

#include <string>
#include <string_view>

#include <comutil.h>

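// UTF-16 to ANSI: operator const char*() converts to the ANSI code page on demand.
const std::wstring str{ L"一些宽字符,逆向转换" };
const _bstr_t wide_str{ str.c_str() };
const std::string_view narrow_view{ static_cast<const char*>(wide_str) };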