A partial archive of discourse.readcsbooksforgreatgood.org as of Thursday April 25, 2024.

Are uninitialized variables no longer weak symbols by default under GCC 11?

dianqk
2022-04-12

According to section 7.6.1, How Linkers Resolve Duplicate Symbol Names, uninitialized global variables become weak symbols.

Functions and initialized global variables get strong symbols. Uninitialized global variables get weak symbols.

I tried the example from the book (the corresponding files are foo3.c and bar3.c).

// bar3.c
int x;

void f()
{
  x = 15212;
}

// foo3.c
#include <stdio.h>

int x = 15213;

int main()
{
  f();
  printf("x = %d\n", x);
  return 0;
}

$ gcc -o foobar3 foo3.c bar3.c
/usr/bin/ld: /tmp/ccynZE9A.o:(.bss+0x0): multiple definition of `x'; /tmp/ccK9RGkg.o:(.data+0x0): first defined here
collect2: error: ld returned 1 exit status

$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-pc-linux-gnu/11.2.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /build/gcc/src/gcc/configure --enable-languages=c,c++,ada,fortran,go,lto,objc,obj-c++,d --enable-bootstrap --prefix=/usr --libdir=/usr/lib --libexecdir=/usr/lib --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=https://bugs.archlinux.org/ --with-linker-hash-style=gnu --with-system-zlib --enable-__cxa_atexit --enable-cet=auto --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-default-ssp --enable-gnu-indirect-function --enable-gnu-unique-object --enable-linker-build-id --enable-lto --enable-multilib --enable-plugin --enable-shared --enable-threads=posix --disable-libssp --disable-libstdcxx-pch --disable-werror --with-build-config=bootstrap-lto --enable-link-serialization=1 gdc_include_dir=/usr/include/dlang/gdc
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 11.2.0 (GCC) 

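The x in bar3.c should be a weak symbol and should be discarded when duplicate symbols are resolved, yet the linker reports a multiple-definition error instead.
I spun up an older Ubuntu release in Docker: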

root@62c178e49658:~# gcc -o foobar3 foo3.c bar3.c
root@62c178e49658:~# ./foobar3 
x = 15212
root@62c178e49658:~# gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/4.8/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 4.8.4-2ubuntu1~14.04.4' --with-bugurl=file:///usr/share/doc/gcc-4.8/README.Bugs --enable-languages=c,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.8 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.8 --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-gnu-unique-object --disable-libmudflap --enable-plugin --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-4.8-amd64/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-4.8-amd64 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-4.8-amd64 --with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-objc-gc --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 4.8.4 (Ubuntu 4.8.4-2ubuntu1~14.04.4) 

This one does match what the CSAPP section describes.

I used nm to compare the two .o files:
Ubuntu 14.04:

root@62c178e49658:~# gcc -o bar3.o -c bar3.c
root@62c178e49658:~# nm bar3.o
0000000000000000 T f
0000000000000004 C x

Arch Linux:

$ gcc -o bar3.o -c bar3.c
$ nm bar3.o
0000000000000000 T f
0000000000000000 B x

uFzK3VbZ8aAVmt2h
2022-04-12

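I noticed that on Arch Linux x has type B, which means x lives in the .bss section and is therefore being treated as if it were initialized, because a passage just before CSAPP Practice Problem 7.2 says: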

on the other hand, if x is initialized to zero, then it is a strong symbol, so the compiler can confidently assign it to .bss

The symbol table of bar3.o compiled on my Ubuntu 20.04 machine is:

nm bar3.o 
0000000000000000 T f
0000000000000004 C x

C means COMMON:

Common symbols are uninitialized data – nm(1): symbols from object files - Linux man page

My compiler version:

gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/9/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none:hsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 9.4.0-1ubuntu1~20.04.1' --with-bugurl=file:///usr/share/doc/gcc-9/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --prefix=/usr --with-gcc-major-version-only --program-suffix=-9 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-plugin --enable-default-pie --with-system-zlib --with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none=/build/gcc-9-Av3uEd/gcc-9-9.4.0/debian/tmp-nvptx/usr,hsa --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 9.4.0 (Ubuntu 9.4.0-1ubuntu1~20.04.1) 

Then I found the -fno-common option:

The -fno-common option specifies that the compiler should
instead place uninitialized global variables in the BSS
section of the object file. —gcc(1) - Linux manual page

After adding this option:

bar3.o: bar3.c
	gcc -o $@ -c $^ -fno-common

The result:

nm bar3.o
0000000000000000 T f
0000000000000000 B x

B means x is in the .bss section.

Finally, I found a similar issue: [master] Fails to compile with compile flag -fno-common (that is default for GCC >=10) · Issue #17 · FreeSpacenav/spacenavd · GitHub

Note that the OP's compiler on Arch Linux is version 11, while mine is version 9. The OP's gcc enables -fno-common by default.

That means x is a strong symbol in both foo3.c and bar3.c, which necessarily violates Rule 1 on page 680: multiple strong symbols with the same name are not allowed.

On Ubuntu 20.04 with gcc 9, adding -fno-common reproduces the error seen on Arch Linux with gcc 11:

all: a.out foo3.o bar3.o

a.out: bar3.o foo3.o
	gcc -o $@ $^
foo3.o: foo3.c
	gcc -o $@ -c $^ -fno-common
bar3.o: bar3.c
	gcc -o $@ -c $^ -fno-common

Build output:

 make
gcc -o bar3.o -c bar3.c -fno-common
gcc -o foo3.o -c foo3.c -fno-common
foo3.c: In function ‘main’:
foo3.c:7:5: warning: implicit declaration of function ‘f’ [-Wimplicit-function-declaration]
    7 |     f();
      |     ^
gcc -o a.out bar3.o foo3.o
/usr/bin/ld: foo3.o:(.data+0x0): multiple definition of `x'; bar3.o:(.bss+0x0): first defined here
collect2: error: ld returned 1 exit status
make: *** [makefile:4: a.out] Error 1

Evidence:

Default to -fno-common
Porting to GCC 10 - GNU Project