In this article, we’ll take our previous work getting GRUB2 on a QEMU disk and actually use it to boot code that we’ve written.
Don’t get too excited. Mostly we’re going to focus on getting a working Makefile
and coaxing GCC into creating an ELF binary without being Linux specific.
The Code
The kernel we’re going to run is extremely simple and pointless, but it will form the basis for our future experiments and get our codebase kicked off.
void main (void) { while(1) {} }
Nothing to see here, just a loop to keep the CPU in place.
Compiling
Now, how do we get this to work? Well, if you’re interested, you can compile this as a (pointless) Linux program with the standard gcc main.c -o kernel
but that will generate a binary with a lot of stuff in it that we don’t want, and can’t have even if we did. Looking at the output of objdump -d kernel
(a tool that will come in handy later) you can see a lot of symbols and sections from glibc stuff. From the end, for example:
    ...
    0000000000400550 <__libc_csu_fini>:
      400550: f3 c3          repz retq
      400552: 90             nop
      400553: 90             nop

    Disassembly of section .fini:

    0000000000400554 <_fini>:
      400554: 48 83 ec 08    sub $0x8,%rsp
      400558: 48 83 c4 08    add $0x8,%rsp
      40055c: c3
These ELF sections come from libc and its startup files, which this binary has been implicitly linked against. They let GCC run things like constructors and destructors around your code, and let the program interact with the operating system for things like argv and other magic that is 100% irrelevant to our kernel.
No, we need to find some flags to GCC that let us ignore everything else and just compile what’s written in our source. No libraries, nothing. A brief look through the GCC manpage leads us to:
    -ffreestanding
        Assert that compilation takes place in a freestanding environment. This implies -fno-builtin. A freestanding environment is one in which the standard library may not exist, and program startup may not necessarily be at "main". The most obvious example is an OS kernel. This is equivalent to -fno-hosted.

    -nostdlib
        Do not use the standard system startup files or libraries when linking. No startup files and only the libraries you specify will be passed to the linker, options specifying linkage of the system libraries, such as "-static-libgcc" or "-shared-libgcc", will be ignored. The compiler may generate calls to "memcmp", "memset", "memcpy" and "memmove". These entries are usually resolved by entries in libc. These entry points should be supplied through some other mechanism when this option is specified.

        One of the standard libraries bypassed by -nostdlib and -nodefaultlibs is libgcc.a, a library of internal subroutines which GCC uses to overcome shortcomings of particular machines, or special needs for some languages. In most cases, you need libgcc.a even when you want to avoid other standard libraries. In other words, when you specify -nostdlib or -nodefaultlibs you should usually specify -lgcc as well. This ensures that you have no unresolved references to internal GCC library subroutines. (For example, __main, used to ensure C++ constructors will be called.)
These look like good suspects. -nostdlib
is the real workhorse option, stripping the glibc cruft from our binary. -ffreestanding
is less important, but it will suppress GCC complaining about our main function being non-standard at least.
    jack@sagan:$ gcc -o kernel -nostdlib -ffreestanding main.c
    /usr/bin/ld: warning: cannot find entry symbol _start; defaulting to 0000000000400144
We’ll handle that warning later. For now, let’s see what code it put out with objdump -d:
    jack@sagan:$ objdump -d kernel

    ./kernel: file format elf64-x86-64

    Disassembly of section .text:

    0000000000400144 <main>:
      400144: 55          push %rbp
      400145: 48 89 e5    mov %rsp,%rbp
      400148: eb fe       jmp 400148 <main+0x4>
Excellent. Much more concise and understandable. main()
is just setting up an empty stack frame and looping in place infinitely.
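To see why that last instruction loops in place, it helps to decode it by hand. The sketch below assumes the standard x86 short-jump encoding (opcode 0xEB followed by one signed displacement byte, measured from the end of the two-byte instruction). The function name short_jmp_target is made up for this illustration and is not part of the kernel.

```c
#include <stdint.h>

/* Target of an x86 short jump (opcode 0xEB, one signed displacement byte).
 * The displacement is relative to the address of the NEXT instruction,
 * i.e. the jump's own address plus its length of 2 bytes.
 * short_jmp_target is a hypothetical helper name for this sketch. */
uint64_t short_jmp_target(uint64_t insn_addr, uint8_t rel8)
{
    /* Sign-extend the displacement byte by hand: 0xFE becomes -2. */
    int64_t disp = (rel8 >= 0x80) ? (int64_t)rel8 - 0x100 : (int64_t)rel8;

    /* Unsigned arithmetic wraps, so adding a negative value works. */
    return insn_addr + 2 + (uint64_t)disp;
}
```

For the eb fe at 0x400148, the displacement is -2, so the jump lands back on its own first byte. That is why objdump prints the target as main+0x4.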
Linking
We have a number of problems with our current ELF output. The first of which, as ld
told us above, is that it doesn’t know what the entry symbol is, so it guessed. The second is that the link address ld picked by default is arbitrary as far as our kernel is concerned, and isn’t a good default. And the third is that, if you use objdump -D
(capital D) to dump all of the sections of the file, we still have two extraneous sections, .eh_frame
and .comment
that are wasting space.
All three of these problems can be solved with a linker script, which tells the linker, ld:
- What address the code should be linked at.
- What address the code should be loaded at.
- What symbol is the entry symbol.
- What sections should be kept and which discarded.
Let’s take a look at the linker script:
    OUTPUT_FORMAT("elf64-x86-64")
    ENTRY(main)

    SECTIONS
    {
        .text 0xFFFFFFFF80100000 : AT(0x100000)
        {
            *(.text)
        }

        .data :
        {
            *(.data)
        }

        .bss :
        {
            *(.bss)
        }

        /DISCARD/ :
        {
            *(.comment)
            *(.eh_frame)
        }
    }
This script keeps the relevant sections (.text
, which is code, .data
which is initialized data, and .bss
which is basically uninitialized data) by grouping them together. It discards the extra GCC sections (.comment
, and .eh_frame
) by placing them in the ld
special “/DISCARD/” section. It also sets the output format as 64-bit x86 ELF, which is correct for our kernel to be loaded by GRUB, and sets the entry point to main()
.
Most importantly, it sets the link address for the code to 0xFFFFFFFF80100000
, but loads the code at physical address 0x100000
with the AT directive. If we omit the AT directive, GRUB will attempt to load the kernel at physical address 0xFFFFFFFF80100000
and, unless you’ve got 16 million terabytes of memory in your VM, it will complain about being out of memory and fail.
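The relationship between the two addresses is just a constant offset. Here is a minimal sketch of that arithmetic, using the two values from the linker script. The macro and function names are hypothetical and only serve the illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from the linker script; the names are made up for this sketch. */
#define KERNEL_LINK_ADDR 0xFFFFFFFF80100000ULL /* where the code is linked */
#define KERNEL_LOAD_ADDR 0x0000000000100000ULL /* AT(): where GRUB loads it */

/* Translate a link-time address inside the kernel image to the physical
 * address at which the loader places that byte. */
uint64_t link_to_load(uint64_t link_addr)
{
    assert(link_addr >= KERNEL_LINK_ADDR); /* must lie inside the image */
    return link_addr - KERNEL_LINK_ADDR + KERNEL_LOAD_ADDR;
}
```

So a byte linked at 0xFFFFFFFF80100010 ends up at physical address 0x100010 once GRUB has loaded the file.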
Why are we linking at 0xFFFFFFFF80100000?
First, let’s note that the x86-64 architecture, as commonly implemented, only supports 48-bit virtual addresses, and the top 16 bits must be sign extensions of the 48th bit. Because of this sign extension, there’s a massive hole of unaddressable memory between 0x00007FFFFFFFFFFF and 0xFFFF800000000000. We take advantage of this hole by using it to separate user addresses (0 to 128 TB) from kernel addresses (the top 128 TB of the address space, ending at roughly 16 exabytes). This gives both halves (user and kernel) plenty of space.
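The sign-extension rule is easy to state as a check. The following sketch assumes 48-bit virtual addressing as described above; is_canonical is a name chosen for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if addr is valid under 48-bit virtual addressing: bits 63..47
 * must be all zeros (user half) or all ones (kernel half).
 * is_canonical is a hypothetical helper name for this sketch. */
bool is_canonical(uint64_t addr)
{
    uint64_t top = addr >> 47;          /* the 17 bits 47..63 */
    return top == 0 || top == 0x1FFFF;  /* all clear or all set */
}
```

Anything inside the hole fails the check, while our link address 0xFFFFFFFF80100000 passes because all of its upper bits are set.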
However, there is one more wrinkle. When object files are linked together, the linker has to process ‘relocations’, which are the places in the code where final addresses get filled in. Consider loading a pointer like int *bar = &foo
. Syntactically and logically that is sound. However, as part of optimizing the 99% (non-kernel) use case, GCC assumes that your code is going to be linked at addresses between 0 and 2G. The result is that &foo
is assumed to fit in four bytes by GCC. When ld discovers at link time that it actually needs eight bytes (a 64-bit address), it throws an error complaining that the relocation has been truncated (i.e. the top four bytes would be discarded if this program were run).
GCC’s 0 to 2G assumption can be controlled with the -mcmodel
flag. By default, it’s set to “small” (code in 0-2G), but there are also “large” (which makes no assumption about addresses but generates less efficient assembly, since every pointer and jump may be anywhere in the 64-bit range), “medium” (a compromise between small and large) and, most importantly, “kernel”, which was added so that the Linux kernel could have the assembly efficiency of “small” with the desired virtual address separation. The downside is that “kernel” assumes the code lives in the top 2G of the address space, at 0xFFFFFFFF80000000 and above. So, to take advantage of this compromise between address restrictions and assembly efficiency, we link at 0xFFFFFFFF80100000 and specify -mcmodel=kernel
on the GCC command line.
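As a rough mental model, the two code models we care about each accept one 2G window of addresses. The sketch below is a simplification: the real linker checks individual relocations rather than whole addresses, and the function names are invented for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* "small" model: symbol addresses are expected in the low 2G, so they
 * fit in a 32-bit field with the upper bits zero. (Hypothetical helper.) */
bool fits_small_model(uint64_t addr)
{
    return addr <= 0x7FFFFFFFULL;
}

/* "kernel" model: symbol addresses are expected in the top 2G, i.e. they
 * are what you get by sign-extending a negative 32-bit value.
 * (Hypothetical helper.) */
bool fits_kernel_model(uint64_t addr)
{
    return addr >= 0xFFFFFFFF80000000ULL;
}
```

Our link address 0xFFFFFFFF80100000 fails the first check and passes the second, which matches the choice of -mcmodel=kernel.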
To use this linker script, we split the compilation process into two parts. First, the compilation of the C into object (.o
) files. Then the linking of object files into an ELF binary, with the linker script.
    jack@sagan:$ gcc -nostdlib -ffreestanding -mcmodel=kernel -c main.c
    jack@sagan:$ ld -T linker.ld -o kernel main.o
Which now yields kernel
which is a 64-bit ELF file, linked to 0xFFFFFFFF80100000
and ready to be loaded at 0x100000
.
Unfortunately, on x86-64 hosts, this also generates an executable that’s positively massive (1 or 2M) compared to the amount of code we have. This is no good because it’s a waste of space and, worse, it pushes the actual sections of our code out of the 8k that GRUB is going to search for a magic header.
On x86-64 we can solve this by giving the -n
flag to ld
which tells it to not align the program sections at a huge offset.
The following produces a kernel under 1k on x86-64.
jack@sagan:$ ld -T linker.ld -n -o kernel main.o
GRUB Magic
If you tried to load the kernel at this point, GRUB would complain that the binary is missing a signature and you wouldn’t get any farther.
GRUB expects to find a known “Multiboot Header”. Section 3.1 of the Multiboot Specification describes what must be embedded into the binary for GRUB to recognize the ELF file as bootable.
We’ll be using more of the GRUB features when we want to take advantage of some of the values that it can give us (denoted in flags
) but for right now we just want to make GRUB happy so we can load our kernel.
In short, just to boot, we need three 32-bit values:
    u32  0x0  magic value (0x1BADB002)
    u32  0x4  flags (we'll set to 0x0 for now)
    u32  0x8  checksum (added to the previous 2 must = 0)
And, just to make it easy on GRUB, the signature has to show up in the first 8192 bytes (8k) of the binary. Considering that our code is only three instructions (six bytes) long, not counting the ELF header, we could place it anywhere, but let’s take advantage of our linker script to place the GRUB magic immediately after the ELF header.
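The checksum rule is plain 32-bit wraparound arithmetic. Here is a small sketch of it; the function names are hypothetical, and only the magic value and the "sum must be zero" rule come from the description above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Checksum that makes magic + flags + checksum wrap around to 0
 * in 32-bit unsigned arithmetic. (Hypothetical helper name.) */
uint32_t multiboot_checksum(uint32_t magic, uint32_t flags)
{
    return (uint32_t)0 - (magic + flags);
}

/* Check the three fields the way the rule above states it.
 * (Hypothetical helper name.) */
bool multiboot_fields_valid(uint32_t magic, uint32_t flags, uint32_t checksum)
{
    return magic == 0x1BADB002u && (uint32_t)(magic + flags + checksum) == 0;
}
```

With flags set to 0, the checksum works out to 0xE4524FFE, since 0x1BADB002 + 0xE4524FFE is exactly 2^32.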
Specifying the GRUB Signature
Using the above information and some basic information about default types on 64-bit (i.e. that unsigned int
is 32-bit) we can easily create a struct
to contain the information.
    struct grub_signature {
        unsigned int magic;
        unsigned int flags;
        unsigned int checksum;
    };

    #define GRUB_MAGIC 0x1BADB002
    #define GRUB_FLAGS 0x0
    #define GRUB_CHECKSUM (-1 * (GRUB_MAGIC + GRUB_FLAGS))

    struct grub_signature gs = { GRUB_MAGIC, GRUB_FLAGS, GRUB_CHECKSUM };
But now we have to ensure that the signature shows up in the first 8k of the file so GRUB can find it.
Considering the kernel is less than 1k, that’s already done and this kernel will boot. But eventually the kernel will be far larger than 8k, so we can’t rely on it.
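To make the 8k constraint concrete, here is a sketch of the kind of search a Multiboot loader performs. It is a model written from the specification's rules (header within the first 8192 bytes, 32-bit aligned, fields stored little-endian), not GRUB's actual code, and every name in it is invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MB_MAGIC        0x1BADB002u
#define MB_SEARCH_LIMIT 8192u   /* header must lie within these bytes */
#define MB_HEADER_SIZE  12u     /* magic + flags + checksum */

/* Read a little-endian 32-bit value from a byte buffer. */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Return the file offset of the first valid header, or -1 if none.
 * find_multiboot_header is a hypothetical name for this sketch. */
long find_multiboot_header(const unsigned char *image, size_t len)
{
    assert(image != NULL);
    size_t limit = len < MB_SEARCH_LIMIT ? len : MB_SEARCH_LIMIT;

    /* The header is 32-bit aligned, so step four bytes at a time. */
    for (size_t off = 0; off + MB_HEADER_SIZE <= limit; off += 4) {
        uint32_t magic    = read_le32(image + off);
        uint32_t flags    = read_le32(image + off + 4);
        uint32_t checksum = read_le32(image + off + 8);

        if (magic == MB_MAGIC && (uint32_t)(magic + flags + checksum) == 0)
            return (long)off;
    }
    return -1;
}
```

A signature sitting at file offset 8192 or later is never seen by this search, which is exactly the failure we want the build to rule out.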
The easy way to accomplish this is to split the GRUB signature into a separate file (grub.c) and make sure that that file’s object code (grub.o) is the first file linked into the kernel by making sure it’s the first object argument to ld
. However, that seems too fragile since it’s based on the build system that we haven’t even touched yet.
In my opinion, we need to enforce that the GRUB signature is the first thing in the binary. To that end, let’s add a new section to the linker script and tell GCC to put our grub_signature
struct gs
into it.
First, the modifications to linker.ld:
    ...
    SECTIONS
    {
        .grub_sig 0xFFFFFFFF80100000 : AT(0x100000)
        {
            *(.grub_sig)
        }

        .text :
        {
            *(.text)
        }
    ...
The .grub_sig
section is now the very first thing in our binary after the ELF header.
Now, let’s use GCC’s __attribute__
directive to put the signature in that section by changing the definition of gs:
    struct grub_signature gs __attribute__ ((section (".grub_sig"))) =
        { GRUB_MAGIC, GRUB_FLAGS, GRUB_CHECKSUM };
Great. After a recompile, we can look again at the output of objdump -D
and make sure that worked:
    jack@sagan:$ objdump -D kernel

    kernel: file format elf64-x86-64

    Disassembly of section .grub_sig:

    ffffffff80100000 <gs>:
    ffffffff80100000: 02 b0 ad 1b 00 00    add 0x1bad(%rax),%dh
    ffffffff80100006: 00 00                add %al,(%rax)
    ffffffff80100008: fe 4f 52             decb 0x52(%rdi)
    ffffffff8010000b: e4                   .byte 0xe4

    Disassembly of section .text:

    ffffffff8010000c <main>:
    ffffffff8010000c: 55                   push %rbp
    ffffffff8010000d: 48 89 e5             mov %rsp,%rbp
    ffffffff80100010: eb fe                jmp ffffffff80100010 <main+0x4>
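Looks correct: the .grub_sig section is ahead of .text as the first thing in the binary after the ELF header. (The “instructions” objdump shows for gs are just our three data words, which -D disassembles as if they were code.)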
Booting
Now, all that’s left is to give it a try. Copy your kernel onto the first partition of your disk (instructions on mounting from the disk image here).
    jack@sagan:$ sudo mount loop0 /mnt/os_boot
    jack@sagan:$ sudo cp kernel /mnt/os_boot/
    jack@sagan:$ sync
After the sync completes (it should be nearly instant unless you’ve got a bunch of other IO going), you can fire up QEMU.
jack@sagan:$ qemu -hda disk.img -m 1024
Which will quickly drop you at the GRUB prompt.
    grub> multiboot (hd0,msdos1)/kernel
    grub> boot
And if no errors are printed, the kernel is running.
Double Checking
I wouldn’t be much of a hacker if I thought that no output and no confirmation means everything is okay. Let’s check and make sure that everything looks good.
If this was a real machine, we’d be in a hurry to get output to the screen, or flashing LEDs, or we’d be breaking out hardware debuggers to analyze the chip state in the worst case. Fortunately, using QEMU, you can use GDB on your kernel like any other piece of software. We’ll get into more detail later, but for now let’s just see if the machine is looping.
First, make sure you have GDB installed. QEMU won’t complain if you don’t.
Second, (re)start QEMU with the -s
option that tells QEMU to start a gdbserver for your system on TCP port 1234. If you wanted to use breakpoints or walk through GRUB you could also pass it -S
which will keep the CPU from starting until you’ve engaged GDB and issued a ‘continue’.
jack@sagan:$ qemu -hda disk.img -m 1024 -s
Now simply fire up GDB from another terminal and give it a remote target:
    jack@sagan:~ $ gdb
    ...
    (gdb) target remote tcp::1234
    Remote debugging using tcp::1234
    0x00008376 in ?? ()
    (gdb) c
    Continuing.

    [ Booted with GRUB to get into our code ]

    ^C
    Program received signal SIGINT, Interrupt.
    0x00100010 in ?? ()
    (gdb) info registers
    eax            0x2badb001    732803073
    ecx            0x0           0
    edx            0x0           0
    ebx            0x10000       65536
    esp            0x7fefc       0x7fefc
    ebp            0x7fefc       0x7fefc
    esi            0x0           0
    edi            0x0           0
    eip            0x100010      0x100010
    eflags         0x200002      [ ID ]
    cs             0x10          16
    ss             0x18          24
    ds             0x18          24
    es             0x18          24
    fs             0x18          24
    gs             0x18          24
This output confirms that we’ve booted our kernel. The dead giveaway is that the current instruction address (listed when I hit ^C, and also in the eip
register) matches the load address of our jmp
instruction.
Packaging it Up
It’s extremely tedious to have to hand compile this over and over again. I’ve included my source in my git repo with main.c
, linker.ld
, and a Makefile
.
You can browse the ‘the-null-kernel’ tag here.
Alternatively you can clone the git repo with the files and history:
jack@sagan:$ git clone http://codezen.org/src/viridis.git
Thank you for your contribution, this really simplifies simple operating system development! It is much more sensible to use GRUB, and I also made a simple grub.cfg, which I’m leaving here for future newbies like me:
    menuentry "My Operating System" {
        multiboot (hd0,msdos1)/kernel
        boot
    }
Do not forget to change hd0,msdos1 to wherever your kernel is. Write it to a file named grub.cfg under the grub directory and voila!