% -*- coding: utf-8 -*-
% Copyright (C) 2024 Adrien 'neox' Bourmault <neox@gnu.org>
%
% Permission is granted to copy, distribute and/or modify this document
% under the terms of the GNU Free Documentation License, Version 1.3
% or any later version published by the Free Software Foundation;
% with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.
% A copy of the license is included in the section entitled "GNU
% Free Documentation License".
\input{packages.tex}
\title{Hardware initialization of modern computers}
\author{Adrien 'neox' Bourmault}
\date{\today}
% setup things
\setcounter{secnumdepth}{4}
\setcounter{tocdepth}{4}
%\setcounter{secnumdepth}{4}
% setup bibliography
\addbibresource{bibliographie.bib}
% ------------------------------------------------------------------------------
\begin{document}
% ------------------------------------------------------------------------------
\sloppy % allow flexible margins
\input{titlepage.tex} % import titlepage
\newpage
% ------------------------------------------------------------------------------
% License header
% ------------------------------------------------------------------------------
\setcounter{page}{2}
\vspace*{\fill} % fill the page so that text is at the bottom
This is Edition 0.0. \\
Copyright (C) 2024 Adrien 'neox' Bourmault
\href{mailto:neox@gnu.org}{<neox@gnu.org>} \\
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3
or any later version published by the Free Software Foundation;
with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.
A copy of the license is included in the section entitled "GNU
Free Documentation License".
Source-code included in this document is licensed under the GNU General
Public License version 2 or later. You can find a copy of this license
at <https://www.gnu.org/licenses/>.
\newpage
% ------------------------------------------------------------------------------
% ACKNOWLEDGMENTS
% ------------------------------------------------------------------------------
\chapter*{Acknowledgments}
\addcontentsline{toc}{chapter}{Acknowledgments}
Thanks, I guess? (TODO)
\newpage
% ------------------------------------------------------------------------------
% ABSTRACT
% ------------------------------------------------------------------------------
\chapter*{Abstract}
\addcontentsline{toc}{chapter}{Abstract}
The global trend is towards the scarcity of hardware compatible with
free software, and soon there may be no computer that works free of
software domination by big companies, especially where firmware such
as the BIOS is concerned. \\
A Basic Input Output System (BIOS) was originally a set of low-level
functions contained in the read-only memory of a computer's mainboard,
enabling it to perform basic operations when powered up. However, the
definition of a BIOS has evolved to include what used to be known as the
Power-On Self-Test (POST), checking for the presence of peripherals,
allocating resources for them to avoid conflicts, and then handing over
to an operating system
boot loader. Nowadays, the bulk of the BIOS work is the initialization
and training of RAM. This means, for example, initializing the memory
controller and optimizing timing and read/write voltage for optimal
performance, making the code complex, as its role is to optimize several
parallel buses operating at high speeds and shared by many CPU cores,
and make them act as a homogeneous whole. \\
This document is the product of a project hosted by the \textit{LIP6
laboratory} and supported by the \textit{GNU Boot Project} and the
\textit{Free Software Foundation}. It delves into the importance
of firmware in the hardware initialization of modern computers and
explores various aspects of firmware, such as Intel Management Engine
(ME), AMD Platform Security Processor (PSP), Advanced Configuration and
Power Interface (ACPI), and System Management Mode (SMM). Additionally,
it provides an in-depth look at memory initialization and training
algorithms, highlighting their critical role in system stability and
performance. Examples of the implementation in the ASUS KGPE-D16 mainboard
are presented, describing its hardware characteristics, topology, and the
crucial role of firmware in its operation after the mainboard architecture
is examined. Practical examples illustrate the impact of firmware on
hardware initialization, memory optimization, resource allocation,
power management, and security. Specific algorithms used for memory
training and their outcomes are analyzed to demonstrate the complexity
and importance of firmware in achieving optimal system performance.
Furthermore, this document explores the relationship between firmware
and hardware virtualization. Security considerations and future trends
in firmware development are also addressed, emphasizing the need for
continued research and advocacy for free software-compatible hardware.
\newpage
% ------------------------------------------------------------------------------
% Table of contents
% ------------------------------------------------------------------------------
\tableofcontents
\newpage
% List of figures
\addcontentsline{toc}{chapter}{List of Figures}
\listoffigures
\newpage
% List of listings
\addcontentsline{toc}{chapter}{List of Listings}
\listoflistings
\newpage
% ------------------------------------------------------------------------------
% CHAPTER 1: Introduction to firmware and BIOS evolution
% ------------------------------------------------------------------------------
\chapter{Introduction to firmware and BIOS evolution}
\section{Historical context of BIOS}
2024-07-24 17:00:17 +02:00
\subsection{Definition and origin}
2024-08-22 19:18:34 +02:00
The BIOS (Basic Input/Output System) is firmware, which is a type of
software that is embedded into hardware devices to control their basic
functions, acting as a bridge between hardware and other software,
ensuring that the hardware operates correctly. Unlike regular
software, firmware is usually stored in a non-volatile memory like
ROM or flash memory. The term "firmware" comes from its role: it is
"firm" because it's more permanent than regular software (which can
be easily changed) but not as rigid as hardware. \\
The BIOS is used to perform initialization during the booting process
and to provide runtime services for operating systems and programs.
Being a critical component for the startup of personal computers,
acting as an intermediary between the computer's hardware and its
operating system, the BIOS is embedded on a chip on the motherboard
and is the first code that runs when a PC is powered on. The concept
of BIOS has its roots in the early days of personal computing. It
was first developed by IBM for their IBM PC, which was introduced
in 1981 \cite{freiberger2000fire}. The term BIOS itself was
coined by Gary Kildall, who developed the CP/M (Control Program for
Microcomputers) operating system \cite{shustek2016kildall}. In CP/M,
BIOS was used to describe a component that interfaced directly
with the hardware, allowing the operating system to be somewhat
hardware-independent. \\
\begin{figure}[H]
\centering
\includegraphics[width=0.5\textwidth]{images/IBM_logo.png}
\caption{The eight-striped wordmark of IBM (1967, public domain,
trademarked)}
\end{figure}
IBM's implementation of BIOS became a de facto standard in
the industry, as it was part of the IBM PC's open architecture
\cite{grewal_ibm_pc}\cite{ibm_pc}, which refers to the design
philosophy adopted by IBM when developing the IBM Personal Computer
(PC), introduced in 1981. This architecture is characterized by the use
of off-the-shelf components and publicly available specifications,
which allowed other manufacturers to create compatible hardware
and software. It was in fact a departure from the proprietary
systems prevalent at the time, where companies closely guarded their
designs to maintain control over the hardware and software ecosystem.
For example, IBM used the Intel 8088 CPU, a well-documented and widely
available processor, and also the Industry Standard Architecture
(ISA) bus, which defined how various components like memory, storage,
and peripherals communicated with the CPU. This open architecture
allowed other manufacturers to create IBM-compatible computers, also
known as "clones", which further popularized the BIOS concept. As
a result, the IBM PC BIOS set the stage for a standardized method
of interacting with computer hardware, which has evolved over the
years but remains fundamentally the same in principle. IBM also
published detailed technical documentation at that time, including
circuit diagrams, BIOS listings, and interface specifications. This
transparency allowed other companies to understand and replicate
the IBM PC's functionality \cite{freiberger2000fire}.
2024-07-24 17:00:17 +02:00
\subsection{Functionalities and limitations}
When a computer is powered on, the BIOS executes a Power-On
Self-Test (POST), a diagnostic sequence that verifies the integrity
and functionality of critical hardware components such as the CPU,
RAM, disk drives, keyboard, and other peripherals \cite{wiki_bios}.
This process ensures that all essential hardware components are
operational before the system attempts to load the operating system.
If any issues are detected, the BIOS generates error messages or
beep codes to alert the user. Following the successful completion
of POST, the BIOS runs the bootstrap loader, a small program that
identifies the operating system's bootloader on a storage device,
such as a hard drive, floppy disk, or optical drive. The bootstrap
loader then transfers control to the OS bootloader, initiating
the process of loading the operating system into the computer's
memory and starting it. This step effectively bridges the gap
between hardware initialization and operating system execution.
The BIOS also provides a set of low-level software routines known
as interrupts. These routines enable software to perform basic
input/output operations, such as reading from the keyboard, writing
to the display, and accessing disk drives, without needing to manage
the hardware directly. By providing standardized interfaces for
hardware components, the BIOS simplifies software development and
improves compatibility across different hardware configurations
\cite{ibm_pc}. \\
\begin{figure}[H]
\centering
\includegraphics[width=0.5\textwidth]{images/bios_chip.jpg}
\caption{An AMI BIOS chip from a Dell 310, by Jud McCranie
(CC BY-SA 4.0, 2018)}
\end{figure}
Despite its essential role, the early BIOS had several limitations.
One significant limitation was its limited storage capacity.
Early BIOS firmware was stored in Read-Only Memory (ROM) chips with
very limited storage, often just a few kilobytes. This constrained
the complexity and functionality of the BIOS, limiting it to only the
most essential tasks needed to start the system and provide basic
hardware control. The original BIOS was also non-extensible. ROM
chips were typically soldered onto the motherboard, making updates
difficult and costly. Bug fixes, updates for new hardware support,
or enhancements required replacing the ROM chip, leading to challenges
in maintaining and upgrading systems. Furthermore, the early BIOS was
tailored for the specific hardware configurations of the initial IBM
PC models, which included a limited set of peripherals and expansion
options. As new hardware components and peripherals were developed,
the BIOS often needed to be updated to support them, which was not
always feasible or timely. Performance bottlenecks were another
limitation. The BIOS provided basic input/output operations that
were often slower than direct hardware access methods. For example,
disk I/O operations through BIOS interrupts were slower compared
to later direct access methods provided by operating systems,
resulting in performance bottlenecks, especially for disk-intensive
operations. Early BIOS
implementations also had minimal security features. There were no
mechanisms to verify the integrity of the BIOS code or to protect
against unauthorized modifications, leaving systems vulnerable to
attacks that could alter the BIOS and potentially compromise the
entire system, such as rootkits and firmware viruses. Added to that,
the traditional BIOS operates in 16-bit real mode, a constraint that
limits the amount of code and memory it can address. This limitation
hinders the performance and complexity of firmware, making it less
suitable for modern computing needs \cite{intel_uefi}. Additionally,
BIOS relies on the Master Boot Record (MBR) partitioning scheme,
which supports a maximum disk size of 2 terabytes and allows only
four primary partitions \cite{uefi_spec}\cite{russinovich2012}.
This constraint has become a significant drawback as storage
capacities have increased. Furthermore, the traditional BIOS has
limited flexibility and is challenging to update or extend. This
inflexibility restricts the ability to support new hardware and
technologies efficiently \cite{anderson_2018}\cite{acmcs2015}.
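
To make the first constraint concrete, the 2-terabyte ceiling follows
directly from the MBR format: partition entries address sectors with
32-bit fields, and sectors are conventionally 512 bytes, so
\[
2^{32}\ \text{sectors} \times 512\ \text{B/sector}
= 2^{41}\ \text{B} = 2\ \text{TiB}.
\]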
\section{Modern BIOS and UEFI}
\subsection{Transition from traditional BIOS to UEFI (Unified
Extensible Firmware Interface)}
All the limitations listed earlier caused a transition to a more
modern firmware interface, designed to address the shortcomings of
the traditional BIOS. This section delves into the historical context
of this shift, the driving factors behind it, and the advantages
UEFI offers over the traditional BIOS. \\
The development of UEFI began in the mid-1990s as part of the
Intel Boot Initiative, which aimed to modernize the boot process
and overcome the limitations of the traditional BIOS. By 2005, the
Unified EFI Forum, a consortium of technology companies including
Intel, AMD, and Microsoft, had formalized the UEFI specification
\cite{uefi_spec}. UEFI was designed to address the shortcomings of
the traditional BIOS, providing several key improvements.
\begin{figure}[H]
\centering
\includegraphics[width=0.25\textwidth]{images/uefi_logo.png}
\caption{The UEFI logo (public domain, 2010)}
\end{figure}
One of the most significant advancements of UEFI is its support for
32-bit and 64-bit modes, allowing it to address more memory and
run more complex firmware programs. This capability enables UEFI
to handle the increased demands of modern hardware and software
\cite{intel_uefi}\cite{shin2011}. Additionally, UEFI uses the GUID
Partition Table (GPT) instead of the MBR, supporting disks larger
than 2 terabytes and allowing for a nearly unlimited number of
partitions \cite{microsoft_uefi}\cite{russinovich2012}.
Improved boot performance is another driving factor. UEFI
provides faster boot times compared to the traditional BIOS,
thanks to its efficient hardware and software initialization
processes. This improvement is particularly beneficial for systems
with complex hardware configurations, where quick boot times
are essential \cite{intel_uefi}. UEFI's modular architecture
makes it more extensible and easier to update compared to the
traditional BIOS. This design allows for the addition of drivers,
applications, and other components without requiring a complete
firmware overhaul, providing greater flexibility and adaptability
to new technologies \cite{acmcs2015}. UEFI also includes enhanced
security features such as \textit{Secure Boot}, which ensures that
only trusted software can be executed during the boot process,
thereby protecting the system from unauthorized modifications and
malware \cite{anderson_2018}\cite{chang2013}. \\
The industry-wide support and standardization of UEFI have accelerated
its adoption across various platforms and devices. Major industry
players, including Intel, AMD, and Microsoft, have adopted UEFI as
the new standard for firmware interfaces, ensuring broad compatibility
and interoperability \cite{uefi_spec}.
\subsection{Another way with \textit{coreboot}}
While UEFI has become the dominant firmware interface for modern
computing systems, it is not without its critics. Some of the primary
concerns about UEFI include its complexity, potential security
vulnerabilities, and the degree of control it provides to hardware
manufacturers over the boot process. Originally known as LinuxBIOS,
\textit{coreboot} is a free firmware project initiated in 1999 by
Ron Minnich and his team at the Los Alamos National Laboratory. The
project's primary goal was to create a fast, lightweight, and
flexible firmware solution that could initialize hardware and
boot operating systems quickly, while remaining transparent and
auditable \cite{coreboot}. As an alternative to UEFI, \textit{coreboot}
offers a different approach to firmware that aims to address some
of these concerns and continue the evolution of BIOS.\\
One of the main advantages of \textit{coreboot} over UEFI is its
simplicity, as it is designed to perform only the minimal tasks
required to initialize hardware and pass control to a payload, such
as a bootloader or operating system kernel. This minimalist approach
reduces the attack surface and potential for security vulnerabilities,
as there is less code that could be exploited by malicious actors
\cite{rudolph2007}. Another significant benefit of \textit{coreboot}
is its libre nature. Unlike UEFI, which is controlled by a consortium
of hardware and software vendors, \textit{coreboot}'s source code
is freely available and can be audited, modified, and improved by
anyone. This transparency ensures that security researchers and
developers can review the code for potential vulnerabilities and
contribute to its improvement, fostering a community-driven approach
to firmware development \cite{coreboot}. This project also supports
a wide range of bootloaders, called payloads, allowing users to
customize their boot process to suit their specific needs. Popular
payloads include SeaBIOS, which provides legacy BIOS compatibility, and
Tianocore, which offers UEFI functionality within the \textit{coreboot}
framework. This flexibility allows \textit{coreboot} to be used in
a variety of environments, from embedded systems to high-performance
servers \cite{coreboot_payloads}. \\
\begin{figure}[H]
\centering
\includegraphics[width=0.3\textwidth]{images/coreboot_logo.png}
\caption{The \textit{coreboot} logo, by Konsult Stuge \&
coresystems
(coreboot logo license, 2008)}
\end{figure}
Despite its advantages, \textit{coreboot} is not without its
challenges. The project relies heavily on community contributions, and
support for new hardware often lags behind that of UEFI. Additionally,
the minimalist design of \textit{coreboot} means that some advanced
features provided by UEFI are not available by default. However,
the \textit{coreboot} community continues to work on adding
new features, improving compatibility with modern hardware, and
addressing security issues \cite{coreboot_challenges}. For example,
it provides a \textit{verified boot} function, which helps prevent
rootkits and
other attacks based on firmware modifications \cite{coreboot_docs}.
However, it's important to note that \textit{coreboot} is not entirely
free in all aspects. Many modern processors and chipsets require
\textit{proprietary blobs} (short for \textit{Binary Large Objects}),
collections of binary data distributed without source code. These
blobs are necessary for \textit{coreboot} to function correctly on
a wide range of hardware, but they compromise the goal of having
a fully free firmware one day \cite{blobs}, since these blobs are
used for certain functionalities such as memory initialization and
hardware management.
\begin{figure}[H]
\centering
\includegraphics[width=0.25\textwidth]{images/gnuboot.png}
\caption{The \textit{GNU Boot} logo, by Jason Self (CC0, 2020)}
\end{figure}
To address these concerns, the GNU Project has developed GNU Boot,
a fully free distribution of firmware, including \textit{coreboot},
that aims to be entirely free by avoiding the use of proprietary
binary blobs. GNU Boot is committed to using only free software
for all aspects of firmware, making it a preferred choice for users
and organizations that prioritize software freedom and transparency
\cite{gnuboot}.
\section{Shift in firmware responsibilities}
Initially, the BIOS's primary function was to perform the POST, a basic
diagnostic testing process to check the system's hardware components
and ensure they were functioning correctly. This included verifying
the CPU, memory, and essential peripherals before passing control to
the operating system's bootloader. This process was relatively simple,
given the limited capabilities and straightforward architecture of
early computer systems \cite{anderson_2018}.
As computer systems advanced, particularly with the advent of more
sophisticated memory technologies, the role of firmware expanded
significantly. Modern memory modules operate at much higher
speeds and capacities than their predecessors, requiring precise
configuration to ensure stability and optimal performance. Firmware
now plays a critical role in managing the memory controller, which is
responsible for regulating data flow between the processor and memory
modules. This includes configuring memory frequencies, voltage levels,
and timing parameters to match the specifications of the installed
memory \cite{uefi_spec}\cite{BKDG}. Beyond memory management,
firmware responsibilities have broadened to encompass a wide range
of system-critical tasks. One key area is power management, where
firmware is responsible for optimizing energy consumption across
various components of the system. Efficient power management is
essential not only for extending battery life in portable devices
but also for reducing thermal output and ensuring system longevity
in desktop and server environments. Moreover, modern firmware takes
on significant roles in hardware initialization and configuration,
which were traditionally handled by the operating system. For
example, the initialization of USB controllers, network interfaces,
and storage devices is now often managed by the firmware during
the early stages of the boot process. This shift ensures that the
operating system can seamlessly interact with hardware from the
moment it takes control, reducing boot times and improving overall
system reliability \cite{uefi_spec}. Security has also become a
paramount concern for modern firmware. UEFI (Unified Extensible
Firmware Interface), which has largely replaced traditional BIOS
in modern systems, includes features which prevents unauthorized
or malicious software from loading during the boot process. This
helps protect the system from rootkits and other low-level malware
that could compromise the integrity of the operating system before
it even starts \cite{uefi_spec}. In the context of performance
tuning, firmware sometimes also plays a key role in enabling and
managing overclocking, particularly for the memory subsystem. By
allowing adjustments to memory frequencies, voltages, and timings,
firmware provides tools for enthusiasts to push their systems beyond
default limits. At the same time, it includes safeguards to manage
the risks of instability and hardware damage, balancing performance
gains with system reliability \cite{anderson_2018}. \\
In summary, the evolution of firmware from simple hardware
initialization routines to complex management systems reflects the
increasing sophistication of modern computer architectures. Firmware
is now a critical layer that not only ensures the correct functioning
of hardware components but also optimizes performance, manages power
consumption, and enhances system security, making it an indispensable
part of contemporary computing. \\
This document will focus on \textit{coreboot} during the next parts
to study how modern firmware interact with hardware and also as a
basis for improvements.
% ------------------------------------------------------------------------------
% CHAPTER 2: Characteristics of ASUS KGPE-D16 mainboard
% ------------------------------------------------------------------------------
\chapter{Characteristics of ASUS KGPE-D16 mainboard}
\begin{figure}[H]
\centering \includegraphics[width=0.9\textwidth]{images/kgpe-d16.png}
\caption{The KGPE-D16 (CC BY-SA 4.0, 2021)}
\end{figure}
\newpage
\section{Overview of ASUS KGPE-D16 hardware}
The ASUS KGPE-D16 server mainboard is a dual-socket motherboard
designed to support AMD Family 10h/15h series processors. Released
in 2009, this mainboard was later awarded the \textit{Respects Your
Freedom} (RYF) certification in March 2017, underscoring its commitment
to fully free software compatibility \cite{fsf_ryf}. Indeed, this
mainboard can be operated with a fully free firmware such as GNU
Boot \cite{gnuboot_status}. \\
This mainboard is equipped with robust hardware components designed to
meet the demands of high-performance computing. It features 16 DDR3
DIMM slots, capable of supporting up to 256GB of memory, although
certain configurations are limited to 192GB.
In terms of expandability, the KGPE-D16 includes multiple PCIe
slots, with five physical slots available, although only four
can be used simultaneously due to slot sharing. For storage, the
mainboard provides several SATA ports. Networking capabilities are
enhanced by integrated dual gigabit Ethernet ports, which provide
high-speed connectivity essential for data-intensive tasks and network
communication \cite{asus_kgpe_d16_manual}. Additionally, the board
is equipped with various peripheral interfaces, including USB ports,
audio outputs, and other I/O ports, ensuring compatibility with a
wide range of external devices. \\
\begin{figure}[H]
\centering
\includegraphics[width=0.8\textwidth]{images/fig1_schema_basique.png}
\caption{Basic schematics of the ASUS KGPE-D16 Mainboard, ASUS
(2011)} \label{fig:d16_basic_schematics}
2024-08-21 12:53:44 +02:00
\end{figure}
The physical layout of the ASUS KGPE-D16 is meticulously designed
to optimize airflow, cooling, and power distribution. All of this
is critical for maintaining system stability, particularly under
heavy computational loads, as this board was designed for server
operations. In particular, key components such as the CPU sockets,
memory slots, and PCIe slots are strategically positioned. \\
\begin{figure}[H]
\centering
\includegraphics[width=0.8\textwidth]{images/kgpe-d16_real.png}
\caption{The KGPE-D16, viewed from the top (CC BY-SA 4.0, 2024)}
\label{fig:d16_top_view}
\end{figure}
\section{Chipset}
Before diving into the specific components, it is essential
to understand the roles of the northbridge and southbridge in
traditional motherboard architecture. These chipsets historically
managed communication between the CPU and other critical components
of the system \cite{amd_chipsets}. \\
The northbridge is a chipset on the motherboard that traditionally
manages high-speed communication between the CPU, memory (RAM), and
graphics card (if applicable). It serves as a hub for data that needs
to move quickly between these components. On the ASUS KGPE-D16, the
functions typically associated with the northbridge are divided between
the CPU's internal northbridge and an external SR5690 northbridge
chip. The SR5690 specifically acts as a translator and switch,
handling the HyperTransport interface, a high-speed communication
protocol used by AMD processors, and converting it to ALink and PCIe
interfaces, which are crucial for connecting peripherals like graphics
cards \cite{SR5690BDG}. Additionally, the northbridge on the KGPE-D16
incorporates the IOMMU (Input-Output Memory Management Unit), which
is crucial for ensuring secure and efficient memory access by I/O
devices. The IOMMU allows for the virtualization of memory addresses,
providing device isolation and preventing unauthorized memory access,
which is particularly important in environments that run multiple
virtual machines \cite{amd_chipsets}\cite{northbridge_wiki}. \\
The southbridge, on the other hand, is responsible for handling
lower-speed, peripheral interfaces such as the PCI, USB, and
IDE/SATA connections, as well as managing onboard audio and
network controllers. On the KGPE-D16, these functions are managed
by the SP5100 southbridge chip, which integrates several critical
functions including the LPC bridge, SATA controllers, and other
essential I/O operations \cite{amd_chipsets}\cite{southbridge_wiki}.
It is essentially an ALink bus controller and includes the hardware
interrupt controller, the IOAPIC. Interrupts from peripherals always
pass through the northbridge (fig. \ref{fig:d16_ioapic}), since it
translates ALink to HyperTransport for the CPUs and contains the
IOMMU \cite{SR5690BDG}. \\
\begin{figure}[H]
\centering \includegraphics[width=0.9\textwidth]{images/ioapic.png}
\caption{Functional diagram presenting the IOAPIC function of
the SP5100,
ASUS (2011)}
\label{fig:d16_ioapic}
\end{figure}
In addition to the northbridge and southbridge, the KGPE-D16 also
contains specialized chips for managing input/output operations and
system health monitoring. The WINBOND W83667HG-A Super I/O chip handles
traditional I/O functions such as legacy serial and parallel ports,
keyboard, and mouse interfaces, as well as the SPI chip that contains the
firmware \cite{winbond}. Meanwhile, the Nuvoton W83795G/ADG Hardware
Monitor oversees the system's health by monitoring temperatures,
voltages, and fan speeds, ensuring that the system operates within
safe parameters \cite{nuvoton}. On the KGPE-D16, access to the Super
I/O from a CPU core is done through the SR5690, then the SP5100,
as can be observed in the functional diagram of the chipset
(fig. \ref{fig:d16_chipset}) \cite{SR5690BDG}.
\begin{figure}[H]
\centering
\includegraphics[width=0.8\textwidth]{images/fig2_diagramme_chipset.png}
\caption{Functional diagram of the KGPE-D16 chipset (CC BY-SA 4.0,
2024)} \label{fig:d16_chipset}
2024-08-21 12:53:44 +02:00
\end{figure}
\section{Processors}
The ASUS KGPE-D16 supports AMD Family 10h processors, but
it is important to note that Vikings, a vendor known for
libre-software-compatible hardware, does not recommend using the
Opteron 6100 series due to the lack of IOMMU support, which is
critical for security. Fortunately, AMD Family 15h processors are also
supported. However, the Opteron 6300 series, while supported, requires
proprietary microcode updates for stability, IOMMU functionality,
and fixes for specific vulnerabilities, including a gain-root-via-NMI
exploit. The Opteron 6200 series does not suffer from these
problems and works properly without any proprietary microcode update
needed \cite{vikings}. \\
\begin{figure}[H]
\centering
\includegraphics[width=0.9\textwidth]{images/opteron6200_annoté.png}
\caption{Annotated photograph of an Opteron 6200 series
CPU (2024), from a photograph by AMD Inc. (2008)}
\label{fig:opteron2600}
\end{figure}
The Opteron 6200 series, part of the Bulldozer microarchitecture,
was designed to target high-performance server applications. These
processors feature 16 cores, organized into 8 Bulldozer modules,
with each module containing two integer cores that share
resources such as the floating-point unit (FPU) and L2 cache
(fig. \ref{fig:opteron2600}, \ref{fig:opteron2600_diagram})
\cite{amd_6200}\cite{anandtech_bulldozer}.
The architecture of the Opteron 6200 series is built around AMD's
Bulldozer core design, which uses Clustered Multithreading (CMT) to
maximize resource utilization. This is a technique where each processor
module contains two integer cores that share certain resources like
the floating-point unit (FPU), L2 cache, and instruction fetch/decode
stages. Unlike traditional multithreading, where each core handles
multiple threads, CMT allows two cores to share resources to improve
parallel processing efficiency. This approach aims to balance
performance and resource usage, particularly in multi-threaded
workloads, though it can lead to some performance trade-offs in
single-threaded tasks. In the Opteron 6272, the processor consists
of eight modules, effectively creating 16 integer cores. Due to
the CMT architecture, each Opteron 6272 chip functions as two CPUs
within a single processor, each with its own set of cores, L2 caches,
and shared L3 cache. Here, each CPU consists of four modules, with
each module sharing certain components, such as the FPU and L2 cache,
between its two integer cores. The L3 cache is shared across these
modules. HyperTransport links provide high-speed communication between
the two sockets of the KGPE-D16, and each socket provides a shared L3
cache and direct memory access \cite{amd_6200}\cite{hill_impact_caching}. \\
This architecture also integrates a quad-channel DDR3 memory
controller directly into the processor die, which facilitates high
bandwidth and low latency access to memory. This memory controller
supports DDR3 memory speeds up to 1600 MHz and connects directly
to the memory modules via the memory bus. By integrating the memory
controller into the processor, the Opteron 6200 series reduces memory
access latency, enhancing overall performance
\cite{amd_6200}\cite{amd_ddr3_guide}.
It is interesting to note that Opterons
incorporate the internal northbridge that we cited previously. The
traditional northbridge functions, such as memory controller and PCIe
interface management, are partially integrated into the processor. This
integration reduces the distance data must travel between the CPU and
memory, decreasing latency and improving performance, particularly
in memory-intensive applications \cite{amd_6200}. \\
\begin{figure}[H]
\centering \includegraphics[width=0.8\textwidth]{
images/fig3_img_dual_processor_node.png}
\caption{Functional diagram of an Opteron 6200 package
(CC BY-SA 4.0, 2024)}
\label{fig:opteron2600_diagram}
\end{figure}
Power efficiency was a key focus in the design of the Opteron 6200
series. Despite the high core count, the processor includes several
power management features, such as Dynamic Power Management (DPM)
and Turbo Core technology. These features allow the processor to
adjust power usage based on workload demands, balancing performance
with energy consumption. However, the Bulldozer architecture's
focus on high clock speeds and multi-threaded performance resulted
in higher power consumption compared to competing architectures
\cite{anandtech_bulldozer}. Special models of the series, the
\textit{high efficiency} models, mitigate this problem by offering
slightly lower performance in exchange for a power consumption reduced
by a factor of 1.5 to 2.0 in some cases. \\
The processor connected to the I/O hub is known as the Bootstrap
Processor (BSP). The BSP is responsible for starting up the system
by executing the initial firmware code from the reset vector,
a specific memory address where the CPU begins execution after a
reset \cite{amd_bsp}. Core 0 of the BSP, called the Bootstrap Core
(BSC), initiates this process. During early initialization, the
BSP performs several critical tasks, such as memory initialization,
and bringing other CPU cores online. One of its duties is storing
Built-In Self-Test (BIST) information, which involves checking the
integrity of the processor's internal components to ensure they are
functioning correctly. The BSP also determines the type of reset
that has occurred—whether it's a cold reset, which happens when
the system is powered on from an off state, or a warm reset, which
is a restart without turning off the power. Identifying the reset
type is crucial for deciding which initialization procedures need
to be executed \cite{amd_bsp}\cite{BKDG}.
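
As a rough illustration of the reset-type check (the register address,
bit, and helper names below are hypothetical and do not reproduce the
actual layout documented in the BKDG), the idea is to read a sticky
flag that survives a warm reset but is cleared by a full power cycle:
\begin{minted}{c}
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sticky scratch register; a real implementation would
 * use the southbridge register layout documented in the BKDG. */
#define SCRATCH_REG_ADDR  0xFED80000u  /* hypothetical MMIO address */
#define BOOT_FLAG         (1u << 0)

static inline uint32_t mmio_read32(uintptr_t addr)
{
        return *(volatile uint32_t *)addr;
}

static inline void mmio_write32(uintptr_t addr, uint32_t val)
{
        *(volatile uint32_t *)addr = val;
}

/* A cold reset clears the scratch register, while a warm reset
 * preserves it. Firmware sets BOOT_FLAG on every boot, so finding it
 * already set means the previous contents survived the reset. */
static bool is_warm_reset(void)
{
        bool warm = mmio_read32(SCRATCH_REG_ADDR) & BOOT_FLAG;
        mmio_write32(SCRATCH_REG_ADDR,
                     mmio_read32(SCRATCH_REG_ADDR) | BOOT_FLAG);
        return warm;
}
\end{minted}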
\section{Baseboard Management Controller}
The Baseboard Management Controller (BMC) on the KGPE-D16 motherboard,
specifically the ASpeed AST2050, plays a role in the server's
architecture by managing out-of-band communication and control of
the hardware. The AST2050 is based on an ARM926EJ-S processor,
a low-power 32-bit ARM architecture designed for embedded systems
\cite{ast2050_architecture}. This architecture is well-suited for BMCs
due to its efficiency and capability to handle multiple management
tasks concurrently without significant resource demands from the
main system. \\
The AST2050 features several key components that contribute to
its functionality. It includes an integrated VGA controller,
which enables remote graphical management through KVM-over-IP
(Keyboard, Video, Mouse), a critical feature for administrators who
need to interact with the system remotely, including BIOS updates
and troubleshooting \cite{ast2050_kvm}. Additionally, the AST2050
integrates a dedicated memory controller, which supports up to 256MB
of DDR2 RAM. This allows it to handle complex tasks and maintain
responsiveness during management operations \cite{ast2050_memory}.
The BMC also features a network interface controller (NIC) dedicated to
management traffic, ensuring that remote management does not interfere
with the primary network traffic of the server. This separation is
vital for maintaining secure and uninterrupted system management,
especially in environments where uptime is critical \cite{ast2050_nic}.
Another important architectural aspect of the AST2050 is its support
for multiple I/O interfaces, including I2C, GPIO, UART, and USB,
which allow it to interface with various sensors and peripherals
on the motherboard \cite{ast2050_io}. This versatility enables
comprehensive monitoring of hardware health, such as temperature
sensors, fan speeds, and power supplies, all of which can be managed
and configured through the BMC. \\
When combined with OpenBMC \cite{openbmc_wiki}, a libre firmware
that can be run on the AST2050 thanks to Raptor Engineering
\cite{raptor_engineering}, the architecture of the BMC becomes even
more powerful. OpenBMC takes advantage of the AST2050's architecture,
providing a flexible and customizable environment that can be tailored
to specific use cases. This includes adding or modifying features
related to security, logging, and network management, all within
the BMC's ARM architecture framework \cite{openbmc_customization}.
% ------------------------------------------------------------------------------
% CHAPTER 3: Key components in modern firmware
% ------------------------------------------------------------------------------
\chapter{Key components in modern firmware}
\section{General structure of coreboot}
The firmware of the ASUS KGPE-D16 is crucial in ensuring the proper
functioning and optimization of the mainboard's hardware components.
In this chapter and for the rest of this document, we base our
study on version 4.11 of \textit{coreboot} \cite{coreboot_4_11},
which is the last version that supported the ASUS KGPE-D16 mainboard. \\
For the firmware tasks to be done efficiently, \textit{coreboot} is
organized in different stages (fig. \ref{fig:coreboot_stages})
\cite{coreboot_docs}.
\begin{figure}[H]
\centering
\includegraphics[width=0.9\textwidth]{
images/fig9_coreboot_stages.png}
\caption{\textit{coreboot}'s stages timeline, by
\textit{coreboot} project (CC BY-SA 4.0, 2009)}
\label{fig:coreboot_stages}
\end{figure}
Being a complex project with ambitious goals, \textit{coreboot} decided
early on to establish a file-system-based architecture for its images
(also called ROMs). This special file-system is CBFS (which stands for
coreboot file system). The CBFS architecture consists of a binary image
that can be interpreted as a physical disk, referred to here as ROM. A
number of independent components, each with a header added to the data,
are located within the ROM. The components are nominally arranged
sequentially, although they are aligned along a predefined boundary
(fig. \ref{fig:coreboot_diagram}). \\
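To make the component layout concrete, the per-component header can be
sketched in C. The field layout below follows coreboot's
\path{cbfs_serialized.h} (multi-byte fields are stored big-endian in
flash); it is given for illustration and may differ in detail between
versions:
\begin{minted}{c}
#include <stdint.h>

/* Magic string marking the start of each CBFS component. */
#define CBFS_FILE_MAGIC "LARCHIVE"

/* Per-component header; the (null-terminated) file name follows it,
 * then optional attributes, then the component data at 'offset'. */
struct cbfs_file {
        char     magic[8];          /* "LARCHIVE" */
        uint32_t len;               /* length of the data, in bytes */
        uint32_t type;              /* stage, payload, raw data, ... */
        uint32_t attributes_offset; /* offset of optional attributes */
        uint32_t offset;            /* data offset from header start */
} __attribute__((packed));
\end{minted}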
Each stage is compiled as a separate binary and inserted into the CBFS
with custom compression. The bootblock stage is usually not compressed,
while the ramstage and the payload are compressed with LZMA. Each stage
loads the next stage at a given address (possibly decompressing it in
the process). \\
Some stages are relocatable and can be placed anywhere in the RAM.
These stages are typically cached in the CBMEM for faster loading times
during wake-up. The CBMEM is a specific memory area used by the
\textit{coreboot} firmware to store important data structures and logs
during the boot process. This area is typically allocated in the
system's RAM and is used to store various types of runtime information
that it might need to reference after the initial boot stages. \\
In general, \textit{coreboot} manages main memory through a structured
memory map (fig. \ref{tab:memmap}), allocating specific address ranges
for various hardware functions and system operations. The first 640KB
of memory space is typically unused by coreboot due to historical
reasons. Graphics-related operations use the VGA address range
and the text mode address ranges. It also reserves the higher memory
regions for operating system use, ensuring that critical system components
like the IOAPIC and TPM registers have dedicated address spaces.
This structured approach helps maintain system stability and
compatibility across different platforms and allows for a reset vector
fixed at an address (\textit{0xFFFFFFF0}), regardless of the ROM size.
Payloads are typically loaded into high memory, above the reserved areas
for hardware components and system resources. The exact memory location
can vary depending on the system's configuration, but generally,
payloads are placed in a region of memory that does not conflict with
the firmware code or the reserved memory map areas, such as the ROM
mapping ranges. This placement ensures that payloads have sufficient
space to execute without interfering with other critical memory regions
allocated \cite{coreboot_mem_management}.
\begin{table}[ht]
\makebox[\textwidth][c]{%
\begin{tabular}{
|>{\centering\arraybackslash}p{0.35\textwidth}
|>{\centering\arraybackslash}p{0.5\textwidth}|}
\hline
\path{0x00000 - 0x9FFFF}
& Low memory (first 640KB). Never used. \\
\hline
\path{0xA0000 - 0xAFFFF}
& VGA graphics address range. \\
\hline
\path{0xB0000 - 0xB7FFF}
& Monochrome text mode address range.
Few motherboards use
it, but the KGPE-D16 does. \\
\hline
\path{0xB8000 - 0xBFFFF}
& Text mode address range. \\
\hline
\path{0xFEC00000}
& IOAPIC address. \\
\hline
\path{0xFED44000 - 0xFED4FFFF}
& Address range for TPM registers. \\
\hline
\path{0xFF000000 - 0xFFFFFFFF}
& 16 MB ROM mapping address range. \\
\hline
\path{0xFF800000 - 0xFFFFFFFF}
& 8 MB ROM mapping address range. \\
\hline
\path{0xFFC00000 - 0xFFFFFFFF}
& 4 MB ROM mapping address range. \\
\hline
\path{0xFEC00000 - DEVICE MEM HIGH}
& Reserved area for OS use. \\
\hline
\end{tabular}}
\caption{\textit{coreboot} memory map}
\label{tab:memmap}
\end{table}
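
Because the ROM is mapped so that it ends exactly at the 4 GB
boundary, the reset vector at \textit{0xFFFFFFF0} always falls 16
bytes before the end of the image, whatever the ROM size. A small
stand-alone sketch of that arithmetic:
\begin{minted}{c}
#include <stdint.h>
#include <stdio.h>

int main(void)
{
        const uint64_t top = 1ULL << 32;          /* 4 GB boundary */
        const uint64_t reset_vector = 0xFFFFFFF0ULL;
        const uint64_t rom_sizes[] = { 4ULL << 20, 8ULL << 20,
                                       16ULL << 20 };

        for (unsigned int i = 0; i < 3; i++) {
                /* Mapping base, as in the table above. */
                uint64_t base = top - rom_sizes[i];
                printf("%2llu MB ROM: reset vector at offset 0x%llx\n",
                       (unsigned long long)(rom_sizes[i] >> 20),
                       (unsigned long long)(reset_vector - base));
        }
        return 0;
}
\end{minted}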
\subsection{Bootblock}
The bootblock is the first stage executed after the CPU reset. The
beginning of this stage is written in assembly language, and its
main task is to set everything up for a C environment. The rest, of
course, is written in C. This stage occupies the last 20 KB
(fig. \ref{fig:coreboot_diagram}) of the image, and within it is a
main header containing information about the ROM, including the
size, component alignment, and the offset of the start of the first
CBFS component. This block is a mandatory component as it also
contains the entry point of the firmware. \\
\begin{figure}[H]
\centering
\includegraphics[width=0.8\textwidth]{images/fig8_coreboot_architecture.png}
2024-08-22 15:38:22 +02:00
\caption{\textit{coreboot} ROM architecture
(CC BY-SA 4.0, 2024)}
\label{fig:coreboot_diagram}
\end{figure}
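The main header mentioned above can be sketched the same way; the
fields mirror coreboot's \path{cbfs_serialized.h} (stored big-endian
in flash) and are shown for illustration only:
\begin{minted}{c}
#include <stdint.h>

#define CBFS_HEADER_MAGIC 0x4F524243u  /* "ORBC" */

/* Master header describing the whole CBFS image; a pointer to it is
 * kept near the end of the ROM, inside the bootblock region. */
struct cbfs_header {
        uint32_t magic;          /* CBFS_HEADER_MAGIC */
        uint32_t version;
        uint32_t romsize;        /* total ROM size, in bytes */
        uint32_t bootblocksize;
        uint32_t align;          /* component alignment boundary */
        uint32_t offset;         /* offset of the first component */
        uint32_t architecture;
        uint32_t pad[1];
} __attribute__((packed));
\end{minted}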
Upon startup, the first responsibility of the bootblock is to
execute the code from the reset vector located at the conventional
reset vector in 16-bit real mode. This code is specific to the
processor architecture and, for our board, is stored in the
architecture-specific sources for x86 within \textit{coreboot}.
The entry point into \textit{coreboot} code is defined in two files
in the \path{src/cpu/x86/16bit/} directory: \path{reset16.inc}
and \path{entry16.inc}. The first file serves as a jump to the
\path{_start16bit} procedure defined in the second. Due to space
constraints, this function must remain below the 1MB address space
because the IOMMU has not yet been configured to allow anything
else. \\
During this early initialization, the Bootstrap Core (BSC) performs
several critical tasks while the other cores remain dormant. These
tasks include saving the results (and displaying them if necessary)
of the Built-in Self-Test (BIST), formerly known as POST;
invalidating the TLB to prevent any address translation errors;
determining the type of reset (e.g., cold start or warm start);
creating and loading an empty Interrupt Descriptor Table (IDT) to
prevent the use of "legacy" interrupts from real mode until
protected mode is reached. In practice, this means that at the
slightest exception, the BSC will halt. The code then switches to
32-bit protected mode by mapping the first 4 GB of address space for
code and data, and finally jumps to the 32-bit reset code labeled
\path{_protected_start}. \\
Once in protected mode, which constitutes the "normal" operating
mode for the processor, the next step is to set up the execution
environment. To achieve this, the code contained in
\path{src/cpu/x86/32bit/entry32.inc}, followed by
\path{src/cpu/x86/64bit/entry64.inc}, and finally
\path{src/arch/x86/bootblock_crt0.S}, establishes a temporary
stack, transitions to long mode (64-bit addressing) with paging
enabled, and sets up a proper exception vector table. The execution
then jumps to chipset-specific code via the
\path{bootblock_pre_c_entry} procedure.
Once these steps are completed, the bootblock has a minimal C
environment. The procedure now involves allocating
memory for the BSS, and decompressing and loading the next stage. \\
The jump to \path{bootblock_pre_c_entry} leads to the code files
\path{src/soc/amd/common/block/cpu/car/cache_as_ram.S} and
\path{src/vendorcode/amd/agesa/f15tn/gcccar.inc}, which are specific
to AMD chipsets. It's worth noting that these files were developed by
AMD's engineers as part of the \textit{AGESA} project. The operations
performed at this stage are related to pre-RAM memory initialization.
All cores of all processors (up to a limit of 64 cores) are started.
The \textit{Cache-As-Ram} is configured using the
Memory-Type Range Registers (MTRRs). These registers allow the
specification of a specific configuration for a given memory area
\cite{BKDG}.
In this case, the area that should correspond to physical memory is
mapped to the cache, while other areas, such as PCI or other bus
zones, are configured accordingly. A specific stack is set up for
each core of each processor (within the arbitrary limit of 64 cores
and 7 nodes, meaning 7 Core 0s). Core 0s receive 16KB, while the
Bootstrap Core (BSC) gets 64KB. The other cores receive 4KB each.
All cores except the BSC are halted and will restart during the
romstage. Finally, the execution jumps to the entry point of the
\textit{bootblock} written in C, labeled
\path{bootblock_c_entry}.
This entry point is located in
\path{src/soc/amd/stoneyridge/bootblock/bootblock.c} and is
specific to AMD processors. It is the first C routine executed, and
its role is to verify that the current processor is indeed the BSC,
allowing the function \path{bootblock_main_with_basetime}
to be called exclusively by the BSC. \\
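
The Memory-Type Range Register programming used for
\textit{Cache-as-Ram} can be illustrated as follows. The MSR indices
and bit positions are the architectural variable-range MTRR ones, but
the helper names and the single write-back region are a simplified
sketch, not the actual AGESA code:
\begin{minted}{c}
#include <stdint.h>

/* Architectural variable-range MTRR MSRs (pair i: base, mask). */
#define MTRR_PHYS_BASE(i) (0x200u + 2u * (i))
#define MTRR_PHYS_MASK(i) (0x201u + 2u * (i))
#define MTRR_TYPE_WRBACK  0x06u
#define MTRR_MASK_VALID   (1ull << 11)

static inline void wrmsr(uint32_t index, uint64_t val)
{
        __asm__ volatile("wrmsr" :: "c"(index), "a"((uint32_t)val),
                         "d"((uint32_t)(val >> 32)));
}

/* Mark a naturally aligned, power-of-two region as write-back so the
 * cache can back it before DRAM is usable (illustrative only). */
static void set_var_mtrr(unsigned int reg, uint64_t base, uint64_t size)
{
        /* 40-bit physical address space on Family 10h/15h parts. */
        const uint64_t addr_mask = (1ull << 40) - 1;

        wrmsr(MTRR_PHYS_BASE(reg),
              (base & addr_mask) | MTRR_TYPE_WRBACK);
        wrmsr(MTRR_PHYS_MASK(reg),
              (~(size - 1) & addr_mask) | MTRR_MASK_VALID);
}
\end{minted}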
We are now in the file \path{src/lib/bootblock.c}, written by
Google's team, and entering the
\path{bootblock_main_with_basetime} function, which immediately
calls \path{bootblock_main_with_timestamp}. At this stage, the
goal is to start the romstage, but a few more tasks need to be
completed.
The \path{bootblock_soc_early_init} function is called to
initialize the I2C bus of the southbridge. The
\path{bootblock_fch_early_init} function is invoked to
initialize the SPI buses (including the one for the ROM) and the
serial and "legacy" buses of the southbridge. The CMOS clock is then
initialized, followed by the pre-initialization of the serial
console.
The code then calls the \path{bootblock_mainboard_init}
function, which enters, for the first time, the files specific to
the ASUS KGPE-D16 motherboard:
\path{src/mainboard/ASUS/kgpe-d16/bootblock.c}.
This code performs the northbridge initialization via the
\path{bootblock_northbridge_init} function found in
\path{src/northbridge/amd/amdfam10/bootblock.c}. This involves
locating the HyperTransport bus and enabling the discovery of
devices connected to it (e.g., processors). The southbridge is
initialized using the \path{bootblock_southbridge_init}
function from \path{src/southbridge/amd/sb700/bootblock.c}.
This function, largely programmed by Timothy Pearson from Raptor
Engineering, who performed the first coreboot port for the ASUS
KGPE-D16, finalizes the activation of the SPI bus and the connection
to the ROM memory via SuperIO. The state of a recovery jumper is
then checked (this jumper is intended to reset the CMOS content,
although it is not fully functional at the moment, as indicated by
the \path{FIXME} comment in the code). Control then returns to
\path{bootblock_main} in \path{src/lib/bootblock.c}. \\
At this point, everything is ready to enter the romstage.
\textit{coreboot} has successfully started and can now continue its
execution by calling the \path{run_romstage} function from
\path{src/lib/prog_loaders.c}. This function begins by locating
the corresponding segment in the ROM via the southbridge and SPI
bus using \path{prog_locate}, which utilizes the SPI driver in
\path{src/drivers/cbfs_spi.c}. The contents of the romstage are
then copied into the cache-as-ram by
\path{cbfs_prog_stage_load}. Finally, the \path{prog_run}
2024-08-22 19:18:34 +02:00
function transitions to the romstage after switching back to
32-bit mode.
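In simplified form, this loading sequence can be pictured as follows.
The three function names are those cited above, while the surrounding
declarations are illustrative stand-ins; the real coreboot prototypes
and \path{struct prog} descriptor differ in detail:

\begin{verbatim}
/* Illustrative stand-ins for the coreboot program-loader API. */
struct prog { const char *name; };

extern int  prog_locate(struct prog *p);          /* find file in CBFS */
extern int  cbfs_prog_stage_load(struct prog *p); /* copy into CAR     */
extern void prog_run(struct prog *p);             /* jump to its entry */
extern void halt(void);

/* Sketch of the flow performed by run_romstage(). */
void run_romstage_sketch(void)
{
    struct prog romstage = { .name = "fallback/romstage" };

    if (prog_locate(&romstage))          /* locate stage via SPI/CBFS */
        halt();
    if (cbfs_prog_stage_load(&romstage)) /* load into cache-as-RAM */
        halt();
    prog_run(&romstage);                 /* transfer control */
}
\end{verbatim}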
\subsection{Romstage}
The \textit{romstage} in \textit{coreboot} serves the critical function
of early initialization of peripherals, particularly system memory.
This stage is crucial for setting up the necessary components for the
platform's operation, ensuring that everything is in place for
subsequent stages of the boot process.
During this phase, \textit{coreboot} configures the Advanced
Programmable Interrupt Controller (APIC), which is responsible for
correctly handling interrupts across multiple CPUs, especially in
systems using Symmetric Multiprocessing (SMP). This includes setting
up the Local APIC on each processor and the IOAPIC, part of the
southbridge, to ensure that interrupts from peripherals are routed
to the appropriate CPUs. Additionally, the firmware configures the
HyperTransport (HT) technology, a high-speed communication protocol
that facilitates data exchange between the processor and the
northbridge, ensuring smooth data flow between these components. \\
The \textit{romstage} begins with a call to the
\path{_start} function, defined in
\path{src/cpu/x86/32bit/entry32.inc} via
\path{src/arch/x86/assembly_entry.S}. We then enter the
\path{cache_as_ram_setup} procedure, written in assembly
language, located in \path{src/cpu/amd/car/cache_as_ram.inc}. This
procedure configures the cache to load the future \textit{ramstage}
and initialize memory based on the number of processors and cores
present. Once this is completed, the code calls
\path{cache_as_ram_main} in
\path{src/mainboard/asus/kgpe-d16/romstage.c}, which serves as the
main function of the \textit{romstage}.

In the \path{cache_as_ram_main} function, after reducing the
speed of the HyperTransport bus, only the Bootstrap Core (BSC)
initializes the spinlocks for the serial console, the CMOS storage
memory (used for saving parameters), and the ROM. At this point, the
HyperTransport bus is enumerated, and the PCI bridges are
temporarily disabled. Port 0x80 of the southbridge, used for
motherboard debugging with \textit{Post Codes}, is also initialized.
These codes indicate the status of the boot process and can be
displayed using special PCI cards connected to the system. The
SuperIO is then initialized to activate the serial port, allowing
the serial console to follow \textit{coreboot}'s progress in
real time. If everything proceeds as expected, code 0x30 is
sent, and the boot process continues. \\
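Emitting a \textit{Post Code} amounts to writing a single byte to I/O
port 0x80; a minimal sketch of the mechanism:

\begin{verbatim}
#include <stdint.h>

/* Write one byte to an x86 I/O port. */
static inline void outb(uint8_t val, uint16_t port)
{
    __asm__ volatile("outb %0, %1" : : "a"(val), "Nd"(port));
}

/* Emit a POST code; a debug card wired to port 0x80 displays it. */
static void post_code(uint8_t code)
{
    outb(code, 0x80);
}

/* e.g. post_code(0x30) once the serial console is running. */
\end{verbatim}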
If the result of the Built-in Self-Test (BIST), saved during the
\textit{bootblock}, shows no anomalies, all cores of all nodes are
configured, and they are placed back into sleep mode (except for the
Core 0s). If everything goes well, code 0x32 is sent, and the
process continues. Using the \path{enable_sr5650_dev8} function,
the southbridge's P2P bridge is activated. Additionally, a check is
performed to ensure that the number of physical processors detected
does not exceed the number of sockets available on the board. If any
issues were detected during the BIST, the machine will halt, and the
error will be displayed on the console. Otherwise, the process
continues: the default hardware information table is
constructed, and the microcode of the physical processors is updated
if necessary. If everything proceeds correctly, codes 0x33 and
then 0x34 are sent, and the process continues. The information about
the physical processors is retrieved using \path{amd_ht_init},
and communication between the two sockets is configured via
\path{amd_ht_fixup}. This process includes disabling any
defective HT links (one per socket in this AMD Family 15h chipset).
If everything is working as expected, the code 0x35 is sent, and
the boot process continues.
With the \path{finalize_node_setup} function, the PCI bus is
initialized, and a mapping is created
(\path{setup_mb_resource_map}). If all goes well, the code 0x36
is sent. This is done in parallel across all Core 0s, so the system
waits for all cores to finish using the
\path{wait_all_core0_started} function. The communication
between the northbridge and southbridge is prepared using
\path{sr5650_early_setup} and
\path{sb7xx_51xx_early_setup}, followed by the activation of
all cores on all nodes, with the system waiting for all cores to be
fully initialized. If everything is successful, the code 0x38 is
sent. \\
At this point, the timer is activated, and a warm reset is performed
via the \path{soft_reset} function to validate all configuration
changes to the HT, PCI buses, and voltage/power settings of the
processors and buses. This results in a system reboot, passing again
through the \textit{bootblock}, but much faster this time since the
system recognizes the warm reset condition. Once this reboot is
complete, the HyperTransport bus is reconfigured into isochronous
mode (switching from asynchronous mode), finalizing the
configuration process. \\
Memory training and optimization are also key functions of the
firmware during the \textit{romstage}. This process involves
adjusting memory settings, such as timings, frequencies, and
voltages, to ensure that the installed memory modules operate
efficiently and stably. This step is crucial for achieving optimal
performance, especially when dealing with large amounts of RAM
and many CPU cores, as supported by the KGPE-D16. We will examine
this in detail in the next chapter. \\
After memory initialization, the process returns to the
\path{cache_as_ram_main} function, where a memory test is
performed. This involves writing predefined values to specific
memory locations and then verifying that the values can be read
back correctly.
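A test of this kind can be sketched as follows (illustrative, not the
exact coreboot routine):

\begin{verbatim}
#include <stddef.h>
#include <stdint.h>

/* Write known patterns to a memory window, then read them back.
 * Returns 0 on success, -1 on the first mismatch. */
static int ram_check(volatile uint32_t *base, size_t words)
{
    static const uint32_t patterns[] = {
        0x55555555u, 0xAAAAAAAAu, 0x00000000u, 0xFFFFFFFFu
    };
    for (size_t p = 0; p < 4; p++) {
        for (size_t i = 0; i < words; i++)
            base[i] = patterns[p] ^ (uint32_t)i;   /* vary per cell */
        for (size_t i = 0; i < words; i++)
            if (base[i] != (patterns[p] ^ (uint32_t)i))
                return -1;                         /* bad bit or wiring */
    }
    return 0;
}
\end{verbatim}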
If everything passes successfully, the CBMEM is initialized and
code 0x41 is sent. At this point, the configuration of
the PCI bus is prepared, which will be completed during the ramstage
by configuring the PCI bridges. The system then exits
\path{cache_as_ram_main} and returns to
\path{cache_as_ram_setup} to finalize the process.

\textit{coreboot} then transitions to the next stage, known as the
postcar stage, where it exits the cache-as-RAM mode and
begins using physical RAM.
\subsection{Ramstage}
The ramstage performs the general initialization of all peripherals,
including the initialization of PCI devices, on-chip devices, the
TPM (if not done by verstage), graphics (optional), and the CPU
(setting up the System Management Mode). After this initialization,
tables are written to inform the payload or operating system about
the existence and current state of the hardware. These tables
include ACPI tables (specific to x86), SMBIOS tables (specific to
x86), coreboot tables, and updates to the device tree (specific to
ARM). Additionally, the ramstage locks down the hardware and
firmware by applying write protection to boot media, locking
security-related registers, and locking SMM (specific to x86)
\cite{coreboot_docs}.
Effective resource allocation is essential for system stability,
particularly in complex configurations involving multiple CPUs
and peripherals. This stage manages initial resource allocation,
resolving any conflicts between hardware components to prevent
resource contention and ensure smooth operation and security, which
is a major concern in modern systems. This includes support for
IOMMU, which is crucial for preventing unauthorized direct memory
access (DMA) attacks, particularly in virtualized environments
(however there are still vulnerabilities that can be exploited,
such as sub-page or IOTLB-based attacks or even configuration
weaknesses \cite{medeiros2017}\cite{markuze2021}). \\
\subsubsection{Advanced Configuration and Power Interface}
The Advanced Configuration and Power Interface (ACPI) is a
critical component of modern computing systems, providing an
open standard for device configuration and power management by
the operating system (OS). Developed in 1996 by Intel,
Microsoft, and Toshiba, ACPI replaced the older Advanced Power
Management (APM) standard with more advanced and flexible power
management capabilities \cite{intel_acpi_introduction_2023}.
At its core,
ACPI is implemented through a series of data structures and
executable code known as ACPI tables, which are provided by the
system firmware and interpreted by the OS. These tables describe
various aspects of the system, including hardware resources,
device power states, and thermal zones. The ACPI Specification
outlines these structures and provides the necessary
standardization for interoperability across different platforms
and operating systems \cite{acpi_os_support}. These tables are
used by the OS to perform low-level tasks, including managing
power states of the CPU, controlling the voltage and frequency
scaling (also known as Dynamic Voltage and Frequency Scaling,
or DVFS), and coordinating power delivery to peripherals. \\
The ACPI Component Architecture (ACPICA) is the reference
implementation of ACPI, providing a common codebase that can be
used by OS developers to integrate ACPI support. ACPICA includes
tools and libraries that allow for the parsing and execution of
ACPI Machine Language (AML) code, which is embedded within the
ACPI tables \cite{acpi_programming}. One of the key tools in
ACPICA is the Intel ACPI Source Language (IASL) compiler, which
converts ACPI Source Language (ASL) code into AML bytecode,
allowing firmware developers to write custom ACPI
methods \cite{intel_acpi_spec}. The triggering of ACPI events is
managed through a combination of hardware signals and software
routines. For example, when a user presses the power button on a
system, an ACPI event is generated, which is then handled by the
OS. This event might trigger the system to enter a low-power
state, such as sleep or hibernation, depending on the
configuration provided by the ACPI tables
\cite{acpi_os_support}. These power states are defined in the
ACPI specification, with global states (G0 to G3) representing
different levels of system power consumption, and device states
(D0 to D3) representing individual device power levels. \\
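All system description tables share a common 36-byte header defined by
the ACPI specification; declared in C, it looks like this:

\begin{verbatim}
#include <stdint.h>

/* Common header of every ACPI system description table. */
struct acpi_table_header {
    char     signature[4];      /* e.g. "FACP", "APIC", "SSDT" */
    uint32_t length;            /* total size, header included */
    uint8_t  revision;
    uint8_t  checksum;          /* all bytes sum to zero mod 256 */
    char     oem_id[6];
    char     oem_table_id[8];
    uint32_t oem_revision;
    uint32_t creator_id;        /* e.g. the ASL compiler's ID */
    uint32_t creator_revision;
} __attribute__((packed));
\end{verbatim}

The OS locates these tables through the RSDP and the RSDT/XSDT pointer
tables, validating each one by summing its bytes against the checksum
field.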
The ASUS KGPE-D16 mainboard, which is designed for server and
high-performance computing environments, needs ACPI for managing
its power distribution across multiple CPUs and attached
peripherals. ACPI is integral to controlling the power states of
various components, thereby optimizing performance and energy
use. Additionally, the firmware on the KGPE-D16 uses ACPI tables
to manage system temperature and fan speed, ensuring reliable
operation under heavy workloads \cite{asus_kgpe_d16_manual}.
\subsubsection{System Management Mode}
System Management Mode (SMM) is a highly privileged operating
mode provided by x86 processors for handling system-level
functions such as power management, hardware control, and other
critical tasks that are to be isolated from the OS and
applications. Introduced by Intel, SMM operates in an
environment separate from the main operating system, offering a
controlled space for executing sensitive operations
\cite{uefi_smm_security}. \\
SMM is triggered by a System Management Interrupt (SMI), which
is a non-maskable interrupt that causes the CPU to save its
current state and switch to executing code stored in a protected
area of memory called System Management RAM (SMRAM). SMRAM is a
specialized memory region that is isolated from the rest of the
system, making it inaccessible to the OS and preventing
tampering or interference from other software
\cite{heasman2007}.
Within SMM, the firmware can execute various low-level functions
that require direct hardware control or need to be protected
from the OS. This includes tasks such as thermal management,
where the system monitors CPU temperature and adjusts
performance or power levels to prevent overheating, as well as
power management routines that enable efficient energy usage
by adjusting power states based on system activity
\cite{offsec_bios_smm}. One of the critical security features of
SMM is its role in managing firmware updates and handling
system-level security events. Because SMM operates in a
privileged mode that is isolated from the OS, it can
apply firmware updates and respond to security threats
without being affected by potentially compromised system
software \cite{domas2015}. However, the high privilege level and
isolation of SMM also present significant security challenges.
If an attacker can compromise SMM, they gain full control over
the system, bypassing all security measures implemented by the
OS \cite{cyber_smm_hack}. Moreover, with proprietary firmware,
this code, running at a very high privilege level, can neither
be audited nor replaced. \\
The ASUS KGPE-D16 mainboard relies on SMM to perform critical
management tasks that must run independently of the
operating system. For example, SMM is used to monitor and manage
system health by responding to thermal events and adjusting
power levels to maintain system stability. SMM operates
independently of the main operating system, allowing it to
perform sensitive tasks securely. \textit{coreboot}
supports SMM, but its implementation is typically
minimal compared to traditional proprietary firmware. In
\textit{coreboot}, SMM initialization involves setting
up the System Management Interrupt (SMI) handler and configuring
System Management RAM (SMRAM), the memory region where SMM code
executes\cite{brown2003linuxbios}. The extent of SMM support in
\textit{coreboot} can vary significantly depending on the
hardware platform and the specific requirements of the system.
\textit{coreboot}'s design philosophy emphasizes a lightweight
and fast boot process, delegating more complex management tasks
to payloads or the operating system itself
\cite{reinauer2008coreboot}.
One of the key challenges with implementing SMM in
\textit{coreboot} is ensuring that SMI handlers are configured
correctly to manage necessary system tasks without compromising
security or performance. \textit{coreboot}'s approach to SMM is
consistent with its overall goal of providing a streamlined and
efficient firmware solution, leaving more intricate
functionalities to be handled by subsequent software layers
\cite{mohr2012comparative}.
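Purely as an illustration of the dispatch idea, and not of coreboot's
actual API, an SMI handler typically reads a status register from the
southbridge, services the pending events, and acknowledges them before
resuming the interrupted code. All names below are hypothetical:

\begin{verbatim}
#include <stdint.h>

/* Hypothetical status bits; the real SB700 definitions differ. */
#define SMI_STS_POWER_BUTTON  (1u << 0)
#define SMI_STS_THERMAL       (1u << 1)

extern uint32_t smi_read_status(void);           /* hypothetical */
extern void     smi_clear_status(uint32_t bits); /* hypothetical */
extern void     power_off(void);
extern void     throttle_cpu(void);

/* Sketch of an SMI dispatch routine executed from SMRAM. */
void smi_handler_sketch(void)
{
    uint32_t sts = smi_read_status();

    if (sts & SMI_STS_POWER_BUTTON)
        power_off();        /* user pressed the power button */
    if (sts & SMI_STS_THERMAL)
        throttle_cpu();     /* thermal event: reduce power */

    smi_clear_status(sts);  /* ack so future SMIs can trigger */
}
\end{verbatim}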
\subsection{Payload}
The payload is the software that executes after coreboot has
completed its initialization tasks. It resides in the CBFS and is
predetermined at compile time, with no option to choose it at
runtime. The primary role of the payload is to load and hand control
over to the operating system. In some cases, the payload itself can
be a component of the operating system \cite{coreboot_docs}.
Examples of payloads are \textit{GNU GRUB}, \textit{SeaBIOS},
\textit{memtest86+} or even sometimes the \textit{Linux kernel}
itself. \\
\textit{TianoCore}, a free implementation of the UEFI (Unified
Extensible Firmware Interface) specification, is often used as a
payload \cite{tianocore_payload}.
It provides a UEFI environment after \textit{coreboot} has completed
its initial hardware initialization. This allows the system to
benefit from the advanced features of UEFI, such as a more flexible
boot manager, enhanced features, and support for modern
hardware. Indeed, UEFI, and by extension \textit{TianoCore},
includes a driver model that allows hardware manufacturers to
provide UEFI-compatible drivers. These drivers can be loaded at
boot time, allowing the firmware to support a wide range of modern
devices that \textit{coreboot}, with its more minimalistic and
custom-tailored approach, might not support out of the box.
For example, GOP (Graphics Output Protocol) drivers are responsible
for setting up the
graphics hardware in UEFI environments. They replace the older VGA
BIOS routines used in legacy BIOS systems. With GOP drivers,
the system can initialize the GPU and display a graphical interface
even before the operating system loads \cite{osdev_gop}.
Hardware manufacturers can distribute proprietary UEFI drivers as
part of firmware updates, making it straightforward for end-users
to install and use them. This is especially useful for specialized
hardware that requires specific drivers not available from the
free software community. It also gives hardware vendors more control
over how their devices are initialized and used, which can be
an advantage for vendors but is a freedom and user control
limitation. \\
Payloads are thus an integral part of the firmware stack.

\section{AMD Platform Security Processor and Intel Management Engine}
The AMD Platform Security Processor (PSP) and Intel Management Engine
(ME) are embedded subsystems within AMD and Intel processors,
respectively, that handle a range of security-related tasks independent
of the main CPU. These subsystems are fundamental to the security
architecture of modern computing platforms, providing functions such as
secure boot, cryptographic key management, and remote system management
\cite{amd_psp_overview}.
The AMD PSP is based on an ARM Cortex-A5 processor and is responsible
for several security functions, including the validation of firmware
during boot (secure boot), management of Trusted Platform Module (TPM)
functions, and handling cryptographic operations such as key generation
and storage. The PSP operates independently of the main x86 cores,
which allows it to execute security functions even when the main system
is powered off or compromised by malware \cite{amd_psp_overview}.
The PSP's isolated environment ensures that sensitive operations are
protected from threats that could affect the main OS. \\
Similarly, the Intel Management Engine (ME) is a dedicated
processor embedded within Intel chipsets that operates
independently of the main CPU. The ME is a comprehensive subsystem that
provides a variety of functions, including out-of-band system
management, security enforcement, and support for Digital Rights
Management (DRM) \cite{intel_csme}. The ME's firmware runs on an
isolated environment that allows it to perform these tasks securely,
even when the system is powered off. This capability is crucial for
enterprise environments where administrators need to perform remote
diagnostics, updates, and security checks without relying on the main
OS.
Intel ME enforces Digital Rights Management (DRM)
through a multifaceted approach leveraging its deeply embedded,
hardware-based capabilities. At the core is the Protected
Execution Environment (PEE), which operates independently from the main
CPU and operating system. This isolation allows it to privately
manage cryptographic keys, certificates, and other sensitive data
critical for DRM, which can be very problematic from a user freedom
perspective \cite{fsf_intel_me}. By handling encryption and decryption
processes within this protected environment, Intel ME ensures that
DRM-protected content, such as video streams, remains secure and
unreachable by the user, raising concerns about the control users have
over their own devices \cite{eff_intel_me}.
Intel ME also plays a significant role in maintaining platform
integrity through the secure boot process. During secure boot, Intel ME
ensures that only digitally signed and authorized operating systems and
applications are loaded, which can prevent users from installing
alternative or modified software on their own hardware, further
restricting their freedom \cite{uefi_what_is_uefi}. This is further
reinforced by Intel ME's remote attestation capabilities, where the
system's state is reported to a remote server. This process verifies
that only systems meeting specific security standards, dictated by
third parties, are allowed to access DRM-protected content, potentially
limiting users' control over their own devices \cite{proprivacy_intel_me}.
Moreover, Intel ME supports High-bandwidth Digital Content Protection
(HDCP), a technology that restricts how digital content is transmitted
over interfaces like HDMI or DisplayPort. By enforcing HDCP, Intel ME
ensures that protected digital content, such as high-definition video,
is only transmitted to and displayed on authorized devices, effectively
preventing users from freely using the content they have legally
acquired \cite{phoronix_hdcp_2_2_i915}\cite{kernel_mei_hdcp}.
Together, these features enable Intel ME to provide a comprehensive and
robust DRM enforcement mechanism. However, this also means that users
have less control over their own hardware and digital content, raising
serious concerns about privacy, user autonomy, and the broader
implications for freedom in computing
\cite{fsf_intel_me}\cite{netgarage_intel_me}. \\
In addition, Intel ME has been a source of controversy due to its deep
integration into the hardware and its potential to be exploited if
vulnerabilities are discovered. Researchers have demonstrated ways to
hack into the ME, potentially gaining control over a system even when
it is powered off \cite{blackhat_me_hack}. These concerns have led to
calls for greater transparency and security measures around the ME and
similar subsystems. When comparing Intel ME and AMD PSP, the primary
difference lies in their scope and functionality. Intel ME offers more
extensive remote management capabilities, making it a more comprehensive
tool for enterprise environments, while AMD PSP focuses more narrowly on
core security tasks. Nonetheless, both play critical roles in ensuring
the security and integrity of modern computing systems. \\
The ASUS KGPE-D16 mainboard includes neither an AMD PSP nor an Intel ME.
% ------------------------------------------------------------------------------
% CHAPTER 4: Memory initialization and training
% ------------------------------------------------------------------------------
\chapter{Memory initialization and training}
\section{Importance of DDR3 Memory Initialization}
Memory modules are designed solely for storing data. The only valid
operations on a memory device are reading data stored in the device,
writing (or storing) data into the device, and refreshing the data.
Memory modules consist of large rectangular arrays of memory cells,
including circuits used to read and write data into the arrays, and
refresh circuits to maintain the integrity of the stored data. The
memory arrays are organized into rows and columns of memory cells,
known as word lines and bit lines, respectively. Each memory cell
has a unique location or address defined by the intersection of a
row and a column. A DRAM memory cell is essentially a capacitor
whose charge state encodes a 1 or a 0. \\
DDR3 (Double Data Rate Type 3) is a widely used type of
SDRAM (Synchronous Dynamic Random-Access Memory) that offers
significant performance improvements over its predecessors,
DDR and DDR2. A DDR3 DIMM module contains 240 contacts.
Key features of DDR3 include higher data rates,
lower power consumption, and increased memory capacity, making
it essential for high-performance computing environments
\cite{DDR3_wiki}. One of the critical aspects of DDR3 is its
internal architecture, which supports data rates ranging from
800 to 1600 Mbps and operates at a lower voltage of 1.5V. This
enables faster data processing and more efficient power usage,
crucial for modern applications that require high-speed memory
access \cite{samsung_ddr3}. Additionally, DDR3 memory modules are
available in larger capacities, allowing systems to handle larger
datasets and more complex computing tasks \cite{altera2008}.
However, the advanced features of DDR3 come with increased
complexity in its initialization and operation.
The DDR3 memory interface, used by the ASUS KGPE-D16, is
source-synchronous. Each memory module generates a Data Strobe
(DQS) pulse simultaneously with the data (DQ) it sends during
a memory read operation. Similarly, a DQS must be generated
with its DQ information when writing to memory. The DQS differs
between write and read operations. Specifically, the DQS generated
by the system for a write operation is centered in the data bit
period, while the DQS provided by the memory during a read operation
is aligned with the edge of the data period \cite{samsung_ddr3}. \\
Due to this edge alignment, the read DQS timing can be adjusted
to meet the setup and hold requirements of the registers capturing
the read data. To improve timing margins or reduce simultaneous
switching noise in the system, the DDR3 memory interface also allows
various other timing parameters to be adjusted. If the system uses
dual-inline memory modules (DIMMs), as in our case, the interface
provides write leveling: a timing adjustment that compensates for
variations in signal travel time \cite{micron_ddr3}.
To reduce simultaneous switching noise, DIMM modules feature a
fly-by architecture for routing the address, command, and clock
signals, which causes command signals to reach the
different memory devices with a delay. The fly-by topology has a
"daisy-chain" structure with either very short stubs or no stubs
at all. This structure results in fewer branches and point-to-point
connections: everything originates from the controller, passing
through each module on the node, thereby increasing the throughput.
2024-08-25 11:54:54 +02:00
In this topology, signals are routed sequentially
from the memory controller to each DRAM chip, reducing signal
reflections and improving overall signal integrity.
This means that routing is done in the order of byte lane
and the data byte lanes are routed on the same layer. Routing can be
simplified by swapping data bits within a byte lane if necessary.
The fly-by topology contrasts with the dual-T topology
(fig. \ref{fig:fly-by}). This design is essential for maintaining
stability at the high speeds DDR3 operates at, but it also
introduces timing challenges, such as timing skew, that must be
carefully managed \cite{micron_ddr3}. \\
\begin{figure}[H]
\centering
\begin{minipage}[b]{0.45\textwidth}
\centering
\includegraphics[width=0.90\textwidth]{images/fly-by.png}
\end{minipage}%
\begin{minipage}[b]{0.45\textwidth}
\centering
\includegraphics[width=0.824\textwidth]{images/t.png}
\end{minipage}
\caption{DDR3 fly-by \textit{versus} T-topology
(CC BY-SA 4.0, 2021)}
\label{fig:fly-by}
\end{figure}
Proper memory initialization ensures that the memory controller
and the memory modules are correctly configured to work together,
allowing for efficient data transfer and reliable operation. The
initialization process involves setting various parameters,
such as memory timings, voltages, and frequencies, which are
critical for ensuring that the memory operates within its optimal
range \cite{samsung_ddr3}. Failure to initialize DDR3 memory
correctly can lead to several serious consequences, including
system instability, data corruption, and reduced performance
\cite{SridharanVilas2015MEiM}. In the worst-case scenario, improper
memory initialization can prevent the system from booting entirely,
as the memory subsystem fails to function correctly.
In the context of the ASUS KGPE-D16, a server motherboard
designed for high-performance applications, proper DDR3 memory
initialization is particularly important. The KGPE-D16 supports
up to 256GB of DDR3 memory across 16 DIMM slots, and any issues
during memory initialization, if non-fatal, could severely impact
the system's ability to handle large datasets or maintain stable
operation under heavy workloads \cite{asus_kgpe_d16_manual}. Given
the critical role that memory plays in the overall performance of
the KGPE-D16, ensuring that DDR3 memory is correctly initialized
is essential for achieving the desired balance of performance,
reliability, and stability in demanding server environments.
\subsection{General steps for DDR3 configuration}
DDR3 memory initialization is a detailed and essential
process that ensures both the stability and performance of the
system. The process involves several critical steps: detection
and identification of memory modules, initial configuration of the
memory controller, adjustment of timing and voltage settings, and
the execution of training and calibration procedures. \\
The initialization begins with the detection and identification of
the installed memory modules. During the BIST, the firmware reads
the Serial Presence Detect (SPD) data stored on
each memory module. SPD data contains crucial information about
the memory module's specifications, including size, speed, CAS
latency (CL), RAS to CAS delay (tRCD), row precharge time (tRP),
and row cycle time (tRC). This data allows the firmware to configure
the memory controller for optimal compatibility and performance. \\
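A hedged sketch of decoding a few of these fields, assuming the JEDEC
DDR3 SPD layout (byte 2: device type, bytes 10 to 12: medium timebase
and minimum cycle time, byte 16: minimum CAS latency time); the
\path{spd} buffer would have been read from the DIMM over SMBus:

\begin{verbatim}
#include <stdint.h>

struct ddr3_timings { uint32_t tck_ps; uint32_t cas_latency; };

/* Decode minimum cycle time and CAS latency from a DDR3 SPD image. */
static int spd_decode_ddr3(const uint8_t spd[128], struct ddr3_timings *t)
{
    if (spd[2] != 0x0B)                  /* byte 2: 0x0B means DDR3 */
        return -1;

    /* Medium timebase in picoseconds, typically 1000/8 = 125 ps. */
    uint32_t mtb_ps = (spd[10] * 1000u) / spd[11];

    t->tck_ps = spd[12] * mtb_ps;        /* byte 12: tCKmin in MTB */

    /* Byte 16: tAAmin; the usable CAS latency is ceil(tAA / tCK). */
    uint32_t taa_ps = spd[16] * mtb_ps;
    t->cas_latency = (taa_ps + t->tck_ps - 1) / t->tck_ps;
    return 0;
}
\end{verbatim}

For a DDR3-1600 module, for instance, the minimum cycle time is
10 MTB units, i.e. 1250~ps, corresponding to an 800~MHz memory clock.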
Indeed, once the memory modules have been identified, the firmware
proceeds to the initial configuration of the memory controller.
This controller is governed by a state machine that
manages the sequence of operations required to initialize,
maintain, and control memory access. This state machine consists of
multiple states that represent various phases of memory operation,
such as reset, initialization, calibration, and data transfer.
The transitions between these states are either automatic or
command-driven, depending on the specific requirements of each
phase \cite{samsung_ddr3}\cite{micron_ddr3}.
This state machine is presented in the
fig. \ref{fig:ddr3_state_machine}. Automatic transitions, depicted
by thick arrows in the automaton, occur without external
intervention. These typically include transitions that ensure
the memory enters a stable state, such as the transition from
power-on to initialization, or from calibration to idle states.
These transitions are crucial for maintaining the integrity and
stability of the memory system, as they ensure that the controller
progresses through necessary stages like ZQ calibration and write
leveling, which are essential for proper signal timing and
impedance matching
\cite{samsung_ddr3}\cite{micron_ddr3}\cite{burnett_ddr3}. \\
On the other hand, command-driven transitions, represented by normal
arrows in the automaton, require specific commands issued by the
memory controller or the CPU to advance to the next state. For
instance, the transition from the idle state to the data transfer
state requires explicit read or write commands. Similarly,
transitioning from the initialization state to the calibration
state involves issuing mode register set (MRS) commands that
configure the memory's operating parameters. These command-driven
transitions are integral to the dynamic operation of the memory
system, allowing the controller to respond to the system's
operational needs and ensuring that memory accesses are performed
efficiently and accurately \cite{samsung_ddr3}\cite{micron_ddr3}. \\
The memory controller configuration
involves setting up fundamental parameters such as the memory clock
(MEMCLK) frequency and the memory channel configuration. The MEMCLK
frequency is derived from the SPD data, while the memory channels
are configured to operate in single, dual, or quad-channel modes,
depending on the system architecture and the installed modules
\cite{burnett_ddr3}. Proper configuration of the memory controller
is vital to ensure synchronization with the memory modules,
establishing a stable foundation for subsequent operations. \\
The first critical step, during the INIT phase, involves the
adjustment of timing and voltage settings. These settings are
essential for ensuring that DDR3 memory operates efficiently and
reliably. Key timing parameters include CAS Latency (CL), RAS to
CAS Delay (tRCD), Row Precharge Time (tRP), and Row Cycle Time (tRC).
These parameters are finely tuned to balance speed and stability
\cite{samsung_ddr3}. The BIOS uses the SPD data to set these
parameters and may also adjust them dynamically to achieve the
best possible performance. Voltage settings, such as DRAM voltage
(typically 1.5V for DDR3) and termination voltage (VTT), are also
configured to maintain stable operation, especially under varying
conditions such as temperature fluctuations \cite{micron_ddr3}. \\
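As a worked example of what these figures mean in absolute terms, the
CAS latency in nanoseconds follows directly from the clock period; for
a DDR3-1600 module (800~MHz clock, $t_{\mathrm{CK}} = 1.25$~ns)
running at CL11:
\[
t_{\mathrm{CAS}} = \mathrm{CL} \times t_{\mathrm{CK}}
= 11 \times 1.25\,\mathrm{ns} = 13.75\,\mathrm{ns}.
\]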
Training and calibration are among the most complex and crucial
stages of DDR3 memory initialization. The fly-by topology used
for address, command, and clock signals in DDR3 modules enhances
signal integrity by reducing the number of stubs and their lengths,
but it also introduces skew between the clock (CK) and data strobe
(DQS) signals \cite{micron_ddr3}. This skew must be compensated to
ensure that data is written and read correctly. The BIOS performs
write leveling, which adjusts the timing of DQS relative to CK
for each memory module. This process ensures that the memory
controller can write data accurately across all modules, even
when they exhibit slight variations in signal timing due to the
physical layout \cite{samsung_ddr3}. \\
\begin{figure}[H]
\centering
\begin{tikzpicture}[scale=0.6,
transform shape,
shorten >=1pt,
node distance=5cm and 5cm,
on grid,
auto]
% States
\node[state, initial] (reset) {RESET};
\node[draw=none,fill=none] (any) [below=2cm of reset] {ANY};
\node[state] (init) [right=of reset] {INIT};
\node[state] (zqcal) [below=of init] {ZQ Calibration};
\node[state, accepting] (idle) [right=of init] {IDLE};
\node[state] (writelevel) [above=of idle] {WRITE LEVELING};
\node[state] (refresh) [right=of idle] {REFRESH};
\node[state] (activation) [below=of idle] {ACTIVATION};
\node[state] (bankactive) [below=of activation] {BANK ACTIVE};
\node[state] (readop) [below right=of bankactive] {READ OP};
\node[state] (writeop) [below left=of bankactive] {WRITE OP};
\node[state] (prechrg) [below right=of readop] {PRE-CHARGING};
% Transitions
\path[->, line width=0.2mm, >=stealth]
(reset) edge node {} (init)
(idle) edge [bend left=20] node {} (writelevel)
edge [bend left=20] node {REF} (refresh)
edge node {} (activation)
edge [bend left=10] node {ZQCL/S} (zqcal)
(activation) edge node {} (bankactive)
(bankactive) edge [bend left=30] node {PRE} (prechrg)
edge [bend left=20] node {write} (writeop)
edge [bend right=20] node {read} (readop)
(writeop) edge [loop left] node {write} (writeop)
edge [bend left=10] node {read\_a} (readop)
edge [bend right=15] node {PRE} (prechrg)
(readop) edge [loop right] node {read} (readop)
edge [bend left=10] node {write\_a} (writeop)
edge [bend right=15] node {PRE} (prechrg);
% Thick transitions
\path[->, line width=0.5mm, >=stealth]
(any) edge node {} (reset)
(init) edge node {ZQCL} (zqcal)
(zqcal) edge [bend left=10] node {} (idle)
(writelevel) edge [bend left=20] node {MRS} (idle)
(refresh) edge [bend left=20] node {} (idle)
(writeop) edge node {} (prechrg)
edge [bend left=20] node {} (bankactive)
(readop) edge [bend left=15] node {} (prechrg)
edge [bend right=20] node {} (bankactive)
(prechrg) edge [bend right=20] node {} (idle);
\end{tikzpicture}
\caption{DDR3 controller state machine}
\label{fig:ddr3_state_machine}
\end{figure}
ZQ calibration is another vital procedure that adjusts the
output driver impedance and on-die termination (ODT) to match
the system's characteristic impedance \cite{micron_ddr3}. This
calibration is critical for maintaining signal integrity under
different operating conditions, such as voltage and temperature
changes. During initialization, the memory controller issues a
ZQCL command, triggering the calibration sequence that optimizes
impedance settings. This ensures that the memory system can
operate with tight timing tolerances, which is crucial for
systems requiring high reliability.
Read training is also essential to ensure that data read from
the memory modules is interpreted correctly by the memory
controller. This process involves adjusting the timing of the
read data strobe (DQS) to align perfectly with the data being
received. Proper read training is necessary for reliable data
retrieval, which directly impacts system performance and stability. \\
In summary, the DDR3 memory initialization process in systems
like the ASUS KGPE-D16 involves a series of detailed and
interdependent steps that are critical for ensuring system
stability and performance. These include the detection and
identification of memory modules, the initial configuration of
the memory controller, precise adjustments of timing and voltage
settings, and rigorous training and calibration procedures.
\section{Memory initialization techniques}
\subsection{Memory training algorithms}
Memory training algorithms are designed to fine-tune the
operational parameters of memory modules, such as timing, voltage,
and impedance. These algorithms play a crucial role in achieving
the optimal performance of DDR3 memory systems, particularly
in complex multi-core environments where synchronization
and timing are challenging. The primary algorithms used in
memory training include ZQ calibration and write leveling.
Optimizing timing and voltage settings is a critical aspect of
memory training. The memory controller adjusts parameters such as
CAS latency, RAS to CAS delay, and other timing characteristics
to ensure that data is read and written with minimal delay
and maximum accuracy. Voltage adjustments are also crucial,
as they help stabilize the operation of memory modules by
ensuring that the power supplied is within the optimal range,
compensating for any variations due to temperature or other factors
\cite{micron_ddr3}\cite{burnett_ddr3}\cite{gopikrishna2021novel}.
\\
ZQ calibration is a critical step in DDR3 memory initialization that
ensures the proper impedance matching of the output driver and
on-die termination (ODT) resistance. Impedance matching is crucial
for maintaining signal integrity by minimizing reflections and
ensuring reliable data transmission between the memory controller
and the DRAM modules. It is initiated by sending ZQCL (ZQ
Calibration Long) commands to the DDR3 DIMMs. Each ZQCL command
triggers a long calibration cycle within the DRAM module. The
purpose of this calibration is to adjust the output driver impedance
and the ODT resistance to match the specified target impedance. This
adjustment compensates for process variations, voltage fluctuations,
and temperature changes that can affect the impedance
characteristics of the DRAM module \cite{gopikrishna2021novel}. \\
A bit in the DRAM Controller
Timing register is set to 1 to send the ZQCL command, and an address
bit is also set to 1 to indicate that the ZQCL command should be
directed to the memory module. Upon receiving the ZQCL command, the
DRAM module begins the calibration process. This involves a series
of internal adjustments where the DRAM module measures its current
impedance and compares it against the target impedance. The module
then modifies its internal settings to reduce the difference between
the current and target impedance values
\cite{gopikrishna2021novel}\cite{samsung_ddr3}. This process is
iterative, meaning that it may require multiple adjustments to
converge on the optimal impedance settings. The calibration is
designed to ensure that the DRAM module's impedance remains within
a tight tolerance, which is critical for high-speed data
communication. The ZQ calibration process is time-sensitive. After
issuing the ZQCL command, the system must wait for 512 memory
clock cycles (MEMCLKs) to allow the calibration to complete.
This delay is necessary because the calibration involves both
measurement and adjustment phases, which require precise timing
to ensure accuracy \cite{gopikrishna2021novel}. If the system does
not wait the full 512 MEMCLKs, the calibration may be incomplete,
leading to suboptimal impedance matching and potential signal
integrity issues, such as reflections or noise on the data lines. \\
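The absolute duration of this wait scales with the configured memory
clock; for instance, for a DDR3-1600 configuration with an 800~MHz
memory clock:
\[
t_{\mathrm{wait}} = \frac{512}{f_{\mathrm{MEMCLK}}}
= \frac{512}{800\,\mathrm{MHz}} = 640\,\mathrm{ns}.
\]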
During the ZQ calibration, the DRAM module adjusts its output driver
impedance, which controls the strength of the signals it sends out.
The stronger the signal, the less susceptible it is to noise, but if
the impedance is too high or too low, it can cause signal distortion
or reflections. The ODT resistance is also calibrated to properly
terminate signals that reach the end of a data line. Proper
termination is essential to prevent signal reflections that could
interfere with the integrity of the data being transmitted. The ZQCL
command adjusts these settings by fine-tuning the resistance values
based on the module's feedback, ensuring that the signal paths are
optimized for both transmission and termination. Once the ZQ
calibration is complete, the DCT register bit is reset to 0,
indicating that the calibration command has been processed. The
memory controller then verifies that the DRAM module has correctly
adjusted its impedance settings. This verification process may
involve additional test signals sent across the memory bus to
confirm that signal integrity meets the required standards. If the
calibration is successful, the memory subsystem is now properly
calibrated and ready for normal operation. In systems with LRDIMMs
or RDIMMs, additional steps may be necessary to ensure that all
ranks and channels are calibrated correctly, particularly in
multi-rank configurations where impedance matching can be more
complex. However, in systems with complex memory configurations,
such as those using multiple DIMMs per channel or operating at
higher memory frequencies, the ZQ calibration process becomes even
more critical. The calibration may need to be repeated at different
operating points to ensure that the memory subsystem remains stable
across all conditions. This could involve performing multiple ZQCL
calibrations at different memory frequencies, or under different
thermal conditions, to account for the dynamic nature of memory
operation in modern systems. \\
In seed-based algorithms, an initial "seed" value is used
as a reference point for the calibration process. The memory
controller iteratively adjusts the impedance based on feedback
from the memory module, refining the calibration with each
iteration. This method provides a more precise calibration,
particularly in systems where fine-tuned impedance matching is
critical for high-frequency operations \cite{kim2010design}.
Also, while seed-based methods can accelerate the convergence
of calibration, they require careful selection of initial seed
values to avoid suboptimal or even faulty impedance settings
\cite{gopikrishna2021novel}. \\
Write leveling is another critical aspect of memory training,
particularly in DDR3 systems that use a fly-by topology. It involves
using the physical layer (PHY) to detect the edge of the Data Strobe
(DQS) signal in synchronization with the clock (CK) signal on the
DIMM (Dual In-line Memory Module) during write access. The DQS
signal is a timing signal generated by the memory controller that
accompanies data (DQ) during read and write operations. For write
operations, the DQS signal must be perfectly aligned with the CK
signal to ensure that data is correctly written to memory cells.
Indeed, in systems using a fly-by topology, the DQS signal might
arrive at different times for different memory devices on the same
module due to the signal traveling through different lengths of
trace. Write leveling compensates for this skew by adjusting the
timing of the DQS signal relative to the CK signal for each lane
(a group of data lines) \cite{burnett_ddr3}. This training is
performed on a per-channel and per-DIMM basis, ensuring that each
memory module is correctly synchronized with the memory controller,
minimizing timing mismatches that could lead to data corruption. \\
Using seed-based algorithms, the memory controller sets an initial
delay value and then iteratively adjusts it based on the feedback
received from the memory module. This process ensures that the DQS
signal is correctly aligned with the CK signal at the memory
module's pins, minimizing the risk of data corruption and ensuring
reliable write operations
\cite{samsung_ddr3}\cite{gopikrishna2021novel}.
Seed-based write leveling offers improved precision but must be
finely tuned to account for the specific characteristics of the
memory module and the overall system architecture
\cite{gopikrishna2021novel}. \\
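The search itself can be pictured as a sweep that starts at the seed
delay and looks for the point where the feedback bit sampled by the
PHY flips, meaning DQS now lines up with CK at the DIMM. The helper
names below are hypothetical:

\begin{verbatim}
#include <stdint.h>

extern void    phy_set_dqs_delay(int lane, uint32_t delay); /* hypothetical */
extern uint8_t phy_sample_feedback(int lane);               /* hypothetical */

/* Sweep the DQS delay upward from a seed until the level of CK
 * sampled by the DRAM on the DQS edge changes from 0 to 1. */
static uint32_t write_level_lane(int lane, uint32_t seed, uint32_t max)
{
    for (uint32_t d = seed; d <= max; d++) {
        phy_set_dqs_delay(lane, d);
        if (phy_sample_feedback(lane))
            return d;       /* first delay where DQS meets CK high */
    }
    return seed;            /* no edge found: fall back to the seed */
}
\end{verbatim}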
In contrast to seed-based algorithms, seedless methods
do not rely on an initial reference value. Instead, they
dynamically adjust the impedance and timing parameters during
the calibration process. Seedless ZQ calibration continuously
monitors the impedance of the memory module and makes real-time
adjustments to maintain optimal matching. This approach can be
beneficial in environments where the operating conditions are
highly variable, as it allows for more flexible and adaptive
calibration \cite{kim2010design}. Similarly, seedless write
leveling dynamically adjusts the DQS timing based on real-time
feedback from the memory module. This method is particularly
useful in systems where the memory configuration is frequently
changed or where the operating conditions vary significantly
\cite{micron_ddr3}\cite{gopikrishna2021novel}. The traditional
ZQ calibration methods, while effective, often struggle with
matching impedance perfectly across all conditions. A master's
thesis by \textcite{gopikrishna2021novel} builds upon these
traditional methods by proposing enhancements that involve more
sophisticated calibration approaches, leading to better impedance
matching and overall memory performance \cite{gopikrishna2021novel}.
\subsection{BIOS and Kernel Developer Guide (BKDG) recommendations}
The BIOS and Kernel Developer Guide (BKDG from \textcite{BKDG}) is a
technical manual aimed at BIOS developers and operating system kernel
programmers. It provides in-depth documentation on the AMD
processor architecture, system initialization processes, and
configuration guidelines. The document is essential for
understanding the proper initialization sequences, including
those for DDR3 memory, to ensure system stability and
performance, particularly for AMD Family 15h processors. \\
The initialization of DDR3 memory begins with configuring the DDR
supply voltage regulator, which ensures that the memory modules
receive the correct power levels. Following this, the Northbridge
(NB) P-state is forced to \path{NBP0}, a state that guarantees
stable operation during the initial configuration phases. Once these
preliminary steps are completed, the initialization of the DDR
physical layer (PHY) begins, which is critical for setting up
the communication interface between the memory controller and the
DDR3 modules. PHY fence training deals with overall signal alignment
at the physical interface, while ZQ calibration focuses on impedance
matching, and write leveling addresses timing alignment during
write operations. Each process involves different methods: PHY
fence training uses iterative timing adjustments, ZQ calibration
uses impedance adjustments via the ZQ pin, and write leveling
adjusts DQS timing relative to CK during writes. These processes are
critical for configuring DDR3 DIMMs and ensuring stable and reliable
operation, especially when booting from an unpowered state such as
ACPI S4 (hibernation), S5 (soft off), or G3 (mechanical off).
\subsubsection{DDR3 initialization procedure}
DDR3 initialization is a multi-step process that prepares
both the memory controllers and the DIMMs for operation. This
initialization is essential to set up the memory configuration
and to ensure that the memory subsystem operates correctly
under various conditions.
\begin{itemize}
\item \textbf{Enable DRAM initialization}: The process
begins by
enabling DRAM initialization. This is done
                        by setting the \path{EnDramInit} bit in
                        the \path{D18F2x7C_dct} register to 1. The
                        \path{D18F2x7C_dct} register is a specific
configuration register within the memory
controller that controls various aspects of the
DRAM initialization process. Enabling this bit
initiates the sequence of operations required to
prepare the memory for use. After setting this bit,
the system waits for 200 microseconds to allow the
                        initialization command to propagate and stabilize
                        (this step and the two that follow are sketched
                        in code after this list).
\item \textbf{Deassert memory reset}: Next, the memory
reset
                        signal, known as \path{MemRstX}, is deasserted
                        by setting the \path{DeassertMemRstX} bit in the
                        \path{D18F2x7C_dct} register to 1. Deasserting
                        \path{MemRstX} effectively takes the memory
components out of their reset state, allowing them
to begin normal operation. The system then waits
for an additional 500 microseconds to ensure that
the memory reset is fully deasserted and the memory
components are stable.
\item \textbf{Assert clock enable (CKE)}: The next
                step involves asserting the clock enable signal, known as
                \path{CKE}, by setting the \path{AssertCke} bit in the
                \path{D18F2x7C_dct} register to 1. The \path{CKE}
                signal is critical because it enables the clocking
of the DRAM modules, allowing them to synchronize
with the memory controller. The system must wait
                for 360 nanoseconds after asserting \path{CKE}
to ensure that the clocking is correctly established.
\item \textbf{Registered DIMMs and LRDIMMs initialization}:
For systems using registered DIMMs (RDIMMs) or Load
Reduced DIMMs (LRDIMMs), additional initialization
steps are necessary. RDIMMs and LRDIMMs have
buffering mechanisms that reduce electrical loading
and improve signal integrity, especially in systems
with multiple memory modules. During initialization,
                        the BIOS programs the \path{ParEn} bit in the
                        \path{D18F2x90_dct} register based on whether
the DIMM is buffered or unbuffered. For RDIMMs,
specific Register Control (RC) commands, such as RC0
through RC7, are sent to initialize the memory module's
control registers. Similarly, LRDIMMs require a series
of Flexible Register Control (FRC) commands, such as
F0RC and F1RC, to initialize their internal registers
                        according to the manufacturer's specifications.
\item \textbf{Mode Register Set (MRS)}: The initialization
process also involves sending Mode Register Set
(MRS) commands. These commands are used to configure
various operational parameters of the DDR3 memory
modules, such as burst length, latency timings,
and operating modes. Each MRS command targets a
specific mode register within the memory module,
and the exact sequence of commands is crucial for
                        setting up the DIMMs according to the system's
                        requirements and the DIMM manufacturer's guidelines.
\end{itemize}
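Under the assumption of simple helpers for PCI configuration access
and delays, steps one to three above can be sketched as follows; the
exact bit positions inside \path{D18F2x7C_dct} must be taken from the
BKDG, and the masks below are placeholders:

\begin{verbatim}
#include <stdint.h>

/* Hypothetical helpers: access D18F2x7C of a given DCT through
 * PCI configuration space (device 18h, function 2). */
extern uint32_t dct_read7c(int dct);
extern void     dct_write7c(int dct, uint32_t val);
extern void     udelay(unsigned int us);

/* Placeholder bit masks; consult the BKDG for the real positions. */
#define EN_DRAM_INIT       (1u << 31)
#define DEASSERT_MEMRST_X  (1u << 27)
#define ASSERT_CKE         (1u << 28)

static void dram_init_enable(int dct)
{
    dct_write7c(dct, dct_read7c(dct) | EN_DRAM_INIT);
    udelay(200);                     /* let the command settle */

    dct_write7c(dct, dct_read7c(dct) | DEASSERT_MEMRST_X);
    udelay(500);                     /* memory comes out of reset */

    dct_write7c(dct, dct_read7c(dct) | ASSERT_CKE);
    udelay(1);                       /* covers the 360 ns after CKE */
}
\end{verbatim}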
\subsubsection{ZQ calibration process}
ZQ calibration is a key step in DDR3 initialization,
responsible for calibrating the output driver impedance and
on-die termination (ODT) resistance of the DDR3 modules. Proper
impedance matching is essential for maintaining signal
integrity, reducing signal reflections, and ensuring reliable
data communication between the memory controller and the
memory modules. It is important to note that ZQ calibration
is carried out by the memory controller itself; the firmware
merely triggers it.
\begin{itemize}
\item \textbf{Sending ZQCL commands}: The BIOS initiates
ZQ calibration by sending two ZQCL (ZQ Calibration Long)
commands to each DDR3 DIMM. ZQCL commands instruct the
memory module to perform a long calibration cycle, during
which the module adjusts its output driver impedance and
ODT resistance to match the desired target impedance. This
process compensates for variations due to manufacturing
differences, voltage fluctuations, and temperature
                changes. To send a ZQCL command, the BIOS programs the
                \path{SendZQCmd} bit in the \path{D18F2x7C_dct}
                register to 1 and sets the \path{MrsAddress[10]} bit to 1,
                indicating that the ZQCL command should be sent to the
                memory module (this sequence is sketched in code after
                this list).
\item \textbf{Calibration timing}: After sending the
ZQCL command, the system must wait for 512 memory clock
cycles (MEMCLKs) to allow the calibration process to
complete. During this time, the memory module adjusts
its internal impedance to ensure it matches the specified
target impedance. This timing is critical, as inadequate
wait times could result in incomplete or inaccurate
calibration, leading to signal integrity issues and
potential data errors.
\item \textbf{Finalization of initialization}: Once the
ZQ calibration is complete, the BIOS deactivates the DRAM
                initialization process by setting the \path{EnDramInit}
                bit in the \path{D18F2x7C_dct} register to 0. For
LRDIMMs, additional configuration steps are required to
finalize the initialization process. These steps include
programming the DCT registers to monitor for errors and
ensure that the LRDIMMs are operating correctly.
\end{itemize}
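Continuing the previous sketch with the same hypothetical helpers and
placeholder bit positions, the ZQCL trigger described above would look
like this:

\begin{verbatim}
#include <stdint.h>

extern uint32_t dct_read7c(int dct);             /* as before */
extern void     dct_write7c(int dct, uint32_t val);
extern void     wait_memclks(unsigned int n);    /* delay in MEMCLKs */

/* Placeholder positions; the encoding is defined in the BKDG. */
#define SEND_ZQ_CMD     (1u << 29)
#define MRS_ADDRESS_10  (1u << 10)   /* MrsAddress[10] selects ZQCL */

static void zq_calibrate(int dct)
{
    /* Request a long ZQ calibration cycle on this controller. */
    dct_write7c(dct, dct_read7c(dct) | SEND_ZQ_CMD | MRS_ADDRESS_10);

    wait_memclks(512);               /* calibration time */

    /* The controller clears SendZQCmd once the command is issued. */
    while (dct_read7c(dct) & SEND_ZQ_CMD)
        ;
}
\end{verbatim}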
\subsubsection{Write leveling process}
The BIOS and Kernel Developer Guide (BKDG) provides a
comprehensive approach to the write leveling process, which is
essential for ensuring correct data alignment during write
operations in DDR3 memory systems. Write leveling is
particularly crucial in systems utilizing a fly-by topology,
where timing skew between the clock and data signals can
introduce significant challenges. Such algorithms were not
necessary for DDR2, for example.
If the target operating
frequency is higher than the lowest supported MEMCLK frequency,
the BIOS must perform multiple passes to achieve proper write
leveling. The MEMCLK is the clock signal that synchronizes the
communication between the memory controller and the memory
modules. \\
During each pass, the memory subsystem is configured for a
progressively higher operating frequency:
\begin{itemize}
\item \textbf{Pass 1:} The memory subsystem is configured
for the lowest supported MEMCLK, ensuring that initial
timing adjustments are made under the most stable
conditions.
\item \textbf{Pass 2:} The subsystem is then adjusted for
the second-lowest MEMCLK, gradually increasing the
operating frequency while fine-tuning the alignment of
the DQS and CK signals.
\item \textbf{Pass N:} This process continues until the
highest MEMCLK supported by the system is reached,
ensuring that the memory operates reliably at its
maximum speed.
\end{itemize}
This step-wise configuration ensures that the memory system is
stable across all supported operating frequencies, minimizing
the risk of timing errors during write operations, especially
as frequencies increase and timing margins become tighter. The
configuration process varies depending on whether the DIMM is
a Registered DIMM (RDIMM) or an Unregistered DIMM (UDIMM).
RDIMMs include an additional buffer to improve signal integrity,
which is particularly important in systems with multiple DIMMs.
The steps common to both types include a preparation with the
DDR3 Mode Register Commands
(see fig. \ref{fig:ddr3_state_machine}). Mode registers in DDR3
memory are used to configure various operational parameters such
as latency settings, burst length, and write leveling. One of
the key mode registers is \path{MR1_dct}, which is specific to
DDR3 and controls certain features of the memory module,
including write leveling. \path{MR1_dct} is used to enable or
disable specific functions such as write leveling and output
driver settings. The \path{_dct} suffix indicates that the
register is instantiated once per DRAM controller (DCT), the
block that manages the timing and control of data operations
for its memory channel. For RDIMMs, a 4-rank module is treated as two
separate DIMMs, where each rank is essentially a separate memory
module within the same DIMM. The first two ranks are the primary
target for the initial configuration. The remaining two ranks
are treated as non-target and are configured separately. \\
Then, these steps are followed, still common to both RDIMMs and
UDIMMs:
\begin{itemize}
\item \textbf{Step 1A: Output Driver and ODT configuration
for target DIMM:}
\begin{itemize}
\item For the first rank (target):
\begin{itemize}
\item Set \path{MR1_dct[1:0][Level] = 1}
to enable write leveling.
\item Set \path{MR1_dct[1:0][Qoff] = 0}
to ensure the output drivers are active.
\end{itemize}
\item For other ranks:
\begin{itemize}
\item Set \path{MR1_dct[1:0][Level] = 1}
to prepare for write leveling.
\item Set \path{MR1_dct[1:0][Qoff] = 1}
to deactivate the output drivers for
ranks that are not currently being
leveled.
\end{itemize}
\item If there are two or more DIMMs per channel,
or if there is one DIMM per three channels:
\begin{itemize}
\item Program the target ranks'
\path{RttNom} (nominal termination
resistance value) for \path{RttWr}
termination, which helps in managing signal
integrity during the write process by
ensuring the correct impedance matching.
\end{itemize}
\end{itemize}
\item \textbf{Step 1B: Configure non-target RttNom to normal
operation:}
\begin{itemize}
\item After the initial configuration, the
\path{RttNom} values for the non-target ranks
are set to their normal operating states.
\item A wait time of 40 MEMCLKs is observed to
ensure the configuration settings are stable
before proceeding.
\end{itemize}
\item \textbf{Step 3: PHY configuration:}
\begin{itemize}
\item The PHY is then configured to measure and
adjust the timing delays accurately for each
data lane. The PHY layer is responsible for
converting the signals from the memory
controller into a form that can be transmitted
over the physical connections to the memory
modules.
\end{itemize}
\item \textbf{Step 4: Perform write leveling:}
\begin{itemize}
\item The actual write leveling process is executed,
where the DQS signal timing is adjusted to
ensure it aligns perfectly with the CK signal at
the memory module's pins, ensuring that data is
written accurately.
\end{itemize}
\item \textbf{Step 5: Disable PHY configuration
post-measurement:}
\begin{itemize}
\item After completing the write leveling process,
the PHY configuration is disabled to stop further
timing measurements and adjustments, locking in the
calibrated settings.
\end{itemize}
\item \textbf{Step 6: Program the DIMM to normal operation:}
\begin{itemize}
\item Finally, the DIMM is reprogrammed to its
normal operational state, resetting \path{Qoff}
and \path{Level} to \path{0} to conclude the
write leveling process and return to standard
operation.
\end{itemize}
\end{itemize}
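Steps 1A and 6 above boil down to toggling two bits of MR1 on
each rank. In DDR3, write leveling enable is carried on address
bit A7 of MR1 and \path{Qoff} on A12; the following sketch uses
a hypothetical \path{mrs_write} helper rather than the real
\path{MR1_dct} plumbing. \\
\begin{listing}[H]
\begin{adjustwidth}{0.5cm}{0.5cm}
\begin{minted}[linenos]{c}
#include <stdint.h>

#define MR1_LEVEL (1u << 7)  /* A7: write leveling enable */
#define MR1_QOFF  (1u << 12) /* A12: output drivers disabled */

/* Hypothetical helpers: issue an MRS command to mode register
 * 'mr' of 'rank', and return the rank's normal MR1 operand
 * (ODT and driver settings). */
extern void mrs_write(uint8_t rank, uint8_t mr, uint16_t value);
extern uint16_t base_mr1_value(uint8_t rank);

static void enter_write_leveling(uint8_t target, uint8_t nranks)
{
	for (uint8_t rank = 0; rank < nranks; rank++) {
		uint16_t mr1 = base_mr1_value(rank) | MR1_LEVEL;

		if (rank != target)
			mr1 |= MR1_QOFF; /* silence non-target drivers */
		mrs_write(rank, 1, mr1);
	}
}

static void exit_write_leveling(uint8_t nranks)
{
	/* Step 6: Level = 0 and Qoff = 0, back to normal operation. */
	for (uint8_t rank = 0; rank < nranks; rank++)
		mrs_write(rank, 1, base_mr1_value(rank));
}
\end{minted}
\end{adjustwidth}
\caption{Illustrative sketch of the MR1 write leveling programming}
\label{lst:mr1_wl_sketch}
\end{listing}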
For each DIMM, the BIOS must calculate the coarse and fine
delays for each lane in the DQS Write Timing register:
\begin{itemize}
\item \textbf{Coarse Delay Calculation:} This involves
setting a basic delay based on a seed value specific to
the platform. The seed value is determined during
initial system configuration and serves as a starting
point for further delay adjustments.
\item \textbf{Critical Delay Determination:} The minimum of
the coarse delays for each lane and DIMM is considered
the critical delay. This delay is crucial for ensuring
that all data lanes are correctly synchronized.
\item \textbf{Platform-Specific Seed:} The seed ranges
between -1.20ns and +1.20ns, providing a small
adjustment range to fine-tune the timing based on the
specific characteristics of the platform. This seed
value can differ for the first pass compared to
subsequent passes, allowing for incremental adjustments
as the system stabilizes.
\end{itemize}
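These delays are typically stored as a gross count in
half-MEMCLK steps plus a fine count in 1/32-MEMCLK steps. A
sketch of the seed handling, under those assumed units and with
illustrative names, could look like this: \\
\begin{listing}[H]
\begin{adjustwidth}{0.5cm}{0.5cm}
\begin{minted}[linenos]{c}
#include <stdint.h>

struct wl_delay {
	uint8_t gross; /* half-MEMCLK units */
	uint8_t fine;  /* 1/32-MEMCLK units, 16 per gross step */
};

static struct wl_delay seed_to_delay(int16_t fine_steps)
{
	struct wl_delay d;

	/* A negative total delay cannot be programmed: clamp it
	 * (and, ideally, flag the lane for re-training). */
	if (fine_steps < 0)
		fine_steps = 0;

	d.gross = fine_steps / 16;
	d.fine = fine_steps % 16;
	return d;
}

/* Critical delay: minimum coarse delay over all lanes/DIMMs. */
static int16_t critical_delay(const int16_t *lane, int nlanes)
{
	int16_t min = lane[0];

	for (int i = 1; i < nlanes; i++)
		if (lane[i] < min)
			min = lane[i];
	return min;
}
\end{minted}
\end{adjustwidth}
\caption{Illustrative sketch of seed-based delay computation}
\label{lst:seed_delay_sketch}
\end{listing}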
\section{Current implementation and potential improvements}
\subsection{Current implementation in coreboot on the KGPE-D16}
In this part, as in the rest of this document, the study is
based on version 4.11 of \textit{coreboot} \cite{coreboot_4_11},
the last version that supported the ASUS KGPE-D16
mainboard. \\
The process starts in
\path{src/mainboard/asus/kgpe-d16/romstage.c}, in the
\path{cache_as_ram_main} function by calling
\path{fill_mem_ctrl} from
\path{src/northbridge/amd/amdfam10/raminit_sysinfo_in_ram.c}
(lst. \ref{lst:fill_mem_ctrl}).
At this step, only the BSC is running the firmware code.
This function iterates over all memory controllers (one per
node) and initializes their corresponding structures with the
system information needed for the RAM to function. This includes
the addresses of PCI nodes (important for DMA operations) and
SPD addresses, i.e. the SMBus addresses of the small EEPROMs
present on each memory module, which contain the information
needed to detect and initialize them. \\
\begin{listing}
\begin{adjustwidth}{0.5cm}{0.5cm}
\inputminted{c}{
listings/src_northbridge_amd_amdfam10_raminit_sysinfo_in_ram.c}
\end{adjustwidth}
\caption{
\protect\path{fill_mem_ctrl()}, extract from
\protect\path{src/northbridge/amd/amdfam10/raminit_sysinfo_in_ram.c}}
\label{lst:fill_mem_ctrl}
\end{listing}
If successful, the system posts codes \path{0x3D} and then
\path{0x40}. The \path{raminit_amdmct} function from
\path{src/northbridge/amd/amdfam10/raminit_amdmct.c} is then
called. This function, in turn, calls \path{mctAutoInitMCT_D}
(lst. \ref{lst:mctAutoInitMCT_D_1}) from
\path{src/northbridge/amd/amdmct/mct_ddr3/mct_d.c},
which is responsible for the initial memory initialization and
was predominantly written by Raptor Engineering.
At this stage, it is assumed that memory has been pre-mapped
contiguously from address 0 to 4GB and that the previous code
has correctly mapped non-cacheable I/O areas below 4GB for the
PCI bus and Local APIC access for processor cores. \\
The following prerequisites must be in place from the previous
steps:
\begin{itemize}
\item The HyperTransport bus is configured, and its speed is
correctly set.
\item The SMBus controller is configured.
\item The BSP is in unreal mode.
\item A stack is set up for all cores.
\item All cores are initialized at a frequency of 2GHz.
\item If saved values are being used, the NVRAM has been
verified with checksums.
\end{itemize}
\begin{listing}
\begin{adjustwidth}{0.5cm}{0.5cm}
\inputminted{c}{
listings/src_northbridge_amd_amdmct_mct_ddr3_mct_d_1.c}
\end{adjustwidth}
\caption{
Beginning of
\protect\path{mctAutoInitMCT_D()}, extract from
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mct_d.c}}
\label{lst:mctAutoInitMCT_D_1}
\end{listing}
The memory controller for the BSP is queried to check if it can
manage ECC memory, which is a type of memory that includes
error-correcting code to detect and correct common types of data
corruption (lst. \ref{lst:mctAutoInitMCT_D_2}).
For each node available in the system, the memory controllers
are identified and initialized using a \path{DCTStatStruc}
structure defined in
\path{src/northbridge/amd/amdmct/mct_ddr3/mct_d.h}. This
structure contains all necessary fields for managing a memory
module. The process includes:
\begin{itemize}
\item Retrieving the corresponding field in the sysinfo
structure for the node.
\item Clearing the structure's fields to zero.
\item Initializing basic fields.
\item Initializing the controller linked to the current node.
\item Verifying the presence of the node (checking if the
processor associated with this controller is present).
If yes, the SMBus is informed.
\item Pre-initializing the memory module controller for this
node using \path{mct_preInitDCT}.
\end{itemize}
The memory modules must then be initialized. All modules present
on valid nodes are configured for a 1.5V supply voltage
(lst. \ref{lst:mctAutoInitMCT_D_3}). The ZQ calibration
is triggered at this stage. \\
\begin{listing}
\begin{adjustwidth}{0.5cm}{0.5cm}
\inputminted{c}{
listings/src_northbridge_amd_amdmct_mct_ddr3_mct_d_2.c}
\end{adjustwidth}
\caption{
DIMM initialization in
\protect\path{mctAutoInitMCT_D()}, extract from
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mct_d.c}}
\label{lst:mctAutoInitMCT_D_2}
\end{listing}
\begin{listing}
\begin{adjustwidth}{0.5cm}{0.5cm}
\inputminted{c}{
listings/src_northbridge_amd_amdmct_mct_ddr3_mct_d_3.c}
\end{adjustwidth}
\caption{
Voltage control in
\protect\path{mctAutoInitMCT_D()}, extract from
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mct_d.c}}
\label{lst:mctAutoInitMCT_D_3}
\end{listing}
Now, the memory modules present are detected using \path{mct_initDCT}
(lst. \ref{lst:mctAutoInitMCT_D_4}). The modules' existence is
checked, and the machine halts immediately after displaying a
message if no memory is found.
\textit{coreboot} waits for all modules to be available using
\path{SyncDCTsReady_D}. \\
The firmware maps the physical memory address ranges into the
address space with \path{HTMemMapInit_D} as contiguously as possible
while also constructing the physical memory map. If there is an
area occupied by something else, it is ignored, and a memory hole is
created. \\
Mapping the address ranges into the cache is done with
\path{CPUMemTyping_D} either as WriteBack (cacheable) or
Uncacheable, depending on whether the area corresponds to physical
memory or a memory hole. \\
The external northbridge is notified of this new memory
configuration. \\
\begin{listing}
\begin{adjustwidth}{0.5cm}{0.5cm}
\inputminted{c}{
listings/src_northbridge_amd_amdmct_mct_ddr3_mct_d_fixme.c}
\end{adjustwidth}
\caption{
\protect\path{mctAutoInitMCT_D()} does not allow restoring
previous training values, extract from
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mct_d.c}}
\label{lst:mctAutoInitMCT_D_fixme}
\end{listing}
\begin{listing}
\begin{adjustwidth}{0.5cm}{0.5cm}
\inputminted{c}{
listings/src_northbridge_amd_amdmct_mct_ddr3_mct_d_4.c}
\end{adjustwidth}
\caption{
Preparing SMBus, DCTs and NB in
\protect\path{mctAutoInitMCT_D()}, extract from
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mct_d.c}}
\label{lst:mctAutoInitMCT_D_4}
\end{listing}
The \textit{coreboot} code compensates for the delay between DQS
and DQ signals, as well as between CMD and DQ. This is handled by
the \path{DQSTiming_D} function (lst. \ref{lst:mctAutoInitMCT_D_5}).
If needed, the initialization can be redone at this point;
otherwise, the channels and nodes are interleaved and ECC is
enabled (if supported by every module). \\
\begin{listing}
\begin{adjustwidth}{0.5cm}{0.5cm}
\inputminted{c}{
listings/src_northbridge_amd_amdmct_mct_ddr3_mct_d_5.c}
\end{adjustwidth}
\caption{
Get DQS, reset and activate ECC in
\protect\path{mctAutoInitMCT_D()}, extract from
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mct_d.c}}
\label{lst:mctAutoInitMCT_D_5}
\end{listing}
Once that is done, the DRAM can be mapped into the address
space with cacheability, and the init process finishes with
validation of every populated DCT node
(lst. \ref{lst:mctAutoInitMCT_D_6}). \\
\begin{listing}
\begin{adjustwidth}{0.5cm}{0.5cm}
\inputminted{c}{
listings/src_northbridge_amd_amdmct_mct_ddr3_mct_d_6.c}
\end{adjustwidth}
\caption{
Mapping DRAM with cache, validating DCT nodes
and finishing the init process in
\protect\path{mctAutoInitMCT_D()}, extract from
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mct_d.c}}
\label{lst:mctAutoInitMCT_D_6}
\end{listing}
Finally, if the RAM is of the ECC type, error-correcting codes
are enabled, and the function ends by activating power-saving
features if requested by the user. \\
\subsubsection{Details on the DQS training function}
The \path{DQSTiming_D} function is a critical part of the
firmware responsible for initializing and training the system's
memory.
The function primarily handles the DQS timing, which is
essential for ensuring data integrity and synchronization
between the memory controller and the DRAM. Proper DQS training
is crucial to align the data signals correctly with the clock
signals.
The function begins by declaring local variables, which are
used throughout the function for various operations. It also
includes an early exit condition to bypass DQS training if a
specific status flag (\path{GSB_EnDIMMSpareNW}) is set,
indicating that a DIMM spare feature is enabled
(lst. \ref{lst:var_decl_and_exit}). \\
\begin{listing}
\begin{adjustwidth}{0.5cm}{0.5cm}
\begin{minted}[linenos]{c}
uint8_t Node;
u8 nv_DQSTrainCTL;
uint8_t retry_requested;
if (pMCTstat->GStatus & (1 << GSB_EnDIMMSpareNW)) {
return;
}
\end{minted}
\end{adjustwidth}
\caption{Initial variable declarations and early exit check,
extract from the
\protect\path{DQSTiming_D} function in
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mct_d.c}}
\label{lst:var_decl_and_exit}
\end{listing}
Next, the function initializes the TCWL (CAS Write Latency)
offset to zero for each node and DCT (DRAM controller).
This ensures that the memory write latency is properly aligned
before the DQS training begins
(lst. \ref{lst:set_tcwl_offset}). \\
\begin{listing}
\begin{adjustwidth}{0.5cm}{0.5cm}
\begin{minted}[linenos]{c}
for (Node = 0; Node < MAX_NODES_SUPPORTED; Node++) {
uint8_t dct;
struct DCTStatStruc *pDCTstat;
pDCTstat = pDCTstatA + Node;
for (dct = 0; dct < 2; dct++)
pDCTstat->tcwl_delay[dct] = 0;
}
\end{minted}
\end{adjustwidth}
\caption{Setting initial TCWL offset to zero for all nodes and DCTs,
extract from the
\protect\path{DQSTiming_D} function in
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mct_d.c}}
\label{lst:set_tcwl_offset}
\end{listing}
A retry mechanism is introduced to handle potential errors
during DQS training. The \path{nv_DQSTrainCTL} variable is
set based on the \path{allow_config_restore} parameter,
determining whether to restore a previous configuration or
proceed with fresh training; the restore path is, however,
non-functional in the current ASUS KGPE-D16 implementation
(lst. \ref{lst:mctAutoInitMCT_D_fixme}). \\
Then, the pre-training functions are called
(lst. \ref{lst:retry_pre_training}). \\
\begin{listing}
\begin{adjustwidth}{0.5cm}{0.5cm}
\begin{minted}[linenos]{c}
retry_dqs_training_and_levelization:
nv_DQSTrainCTL = !allow_config_restore;
mct_BeforeDQSTrain_D(pMCTstat, pDCTstatA);
phyAssistedMemFnceTraining(pMCTstat, pDCTstatA, -1);
\end{minted}
\end{adjustwidth}
\caption{Retry mechanism initialization and pre-training operations,
extract from the
\protect\path{DQSTiming_D} function in
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mct_d.c}}
\label{lst:retry_pre_training}
\end{listing}
For AMD's Fam15h processors, additional PHY compensation is
performed for each node and valid DCT
(lst. \ref{lst:phy_compensation_init}). This is necessary to
fine-tune the electrical characteristics of the memory
interface. For more information about PHY training, see the
earlier sections on the RAM training algorithms. \\
\begin{listing}
\begin{adjustwidth}{0.5cm}{0.5cm}
\begin{minted}[linenos]{c}
if (is_fam15h()) {
struct DCTStatStruc *pDCTstat;
for (Node = 0; Node < MAX_NODES_SUPPORTED; Node++) {
pDCTstat = pDCTstatA + Node;
if (pDCTstat->NodePresent) {
if (pDCTstat->DIMMValidDCT[0])
InitPhyCompensation(pMCTstat, pDCTstat, 0);
if (pDCTstat->DIMMValidDCT[1])
InitPhyCompensation(pMCTstat, pDCTstat, 1);
}
}
}
\end{minted}
\end{adjustwidth}
\caption{Family-specific PHY compensation initialization for Fam15h processors,
extract from the
\protect\path{DQSTiming_D} function in
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mct_d.c}}
\label{lst:phy_compensation_init}
\end{listing}
Before proceeding with the main DQS training, the function
invokes a hook function that allows for additional
configurations or custom operations. \\
If \path{nv_DQSTrainCTL} indicates that fresh training should
proceed, the function performs the main DQS training in multiple
passes, including receiver enable training with
\path{TrainReceiverEn_D} and DQS position
training with \path{mct_TrainDQSPos_D}
(lst. \ref{lst:dqs_training_process}). The process is
repeated in different modes to achieve optimal timing. \\
\begin{listing}
\begin{adjustwidth}{0.5cm}{0.5cm}
\begin{minted}[linenos]{c}
if (nv_DQSTrainCTL) {
mct_WriteLevelization_HW(pMCTstat, pDCTstatA, FirstPass);
if (is_fam15h()) {
TrainReceiverEn_D(pMCTstat, pDCTstatA, FirstPass);
}
mct_WriteLevelization_HW(pMCTstat, pDCTstatA, SecondPass);
if (is_fam15h()) {
TrainReceiverEn_D(pMCTstat, pDCTstatA, FirstPass);
} else {
TrainReceiverEn_D(pMCTstat, pDCTstatA, FirstPass);
}
mct_TrainDQSPos_D(pMCTstat, pDCTstatA);
[...]
}
\end{minted}
\end{adjustwidth}
\caption{Main DQS training process in multiple passes,
extract from the
\protect\path{DQSTiming_D} function in
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mct_d.c}}
\label{lst:dqs_training_process}
\end{listing}
The function checks for any errors during the DQS training. If
errors are detected, it may request a retrain, reset certain
parameters, and restart the training process, or even restart
the whole system if needed (lst. \ref{lst:error_handling}). \\
\begin{listing}
\begin{adjustwidth}{0.5cm}{0.5cm}
\begin{minted}[linenos]{c}
retry_requested = 0;
for (Node = 0; Node < MAX_NODES_SUPPORTED; Node++) {
struct DCTStatStruc *pDCTstat;
pDCTstat = pDCTstatA + Node;
if (pDCTstat->NodePresent) {
if (pDCTstat->TrainErrors & (1 << SB_FatalError)) {
printk(BIOS_ERR, "DIMM training FAILED! Restarting system...");
soft_reset();
}
if (pDCTstat->TrainErrors & (1 << SB_RetryConfigTrain)) {
retry_requested = 1;
pDCTstat->TrainErrors &= ~(1 << SB_RetryConfigTrain);
pDCTstat->TrainErrors &= ~(1 << SB_NODQSPOS);
pDCTstat->ErrStatus &= ~(1 << SB_RetryConfigTrain);
pDCTstat->ErrStatus &= ~(1 << SB_NODQSPOS);
}
}
}
\end{minted}
\end{adjustwidth}
\caption{Error detection and retry mechanism during DQS training,
extract from the
\protect\path{DQSTiming_D} function in
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mct_d.c}}
\label{lst:error_handling}
\end{listing}
Once the training is successfully completed without errors, the
function finalizes the process by setting the maximum read
latency and exiting the training mode. For systems with
\path{allow_config_restore} enabled, it restores the previous
configuration from NVRAM instead of performing a fresh training
(lst. \ref{lst:finalization_exit}). \\
\begin{listing}
\begin{adjustwidth}{0.5cm}{0.5cm}
\begin{minted}[linenos]{c}
TrainMaxRdLatency_En_D(pMCTstat, pDCTstatA);
if (is_fam15h())
exit_training_mode_fam15(pMCTstat, pDCTstatA);
else
mctSetEccDQSRcvrEn_D(pMCTstat, pDCTstatA);
} else {
mct_WriteLevelization_HW(pMCTstat, pDCTstatA, FirstPass);
mct_WriteLevelization_HW(pMCTstat, pDCTstatA, SecondPass);
#if CONFIG(HAVE_ACPI_RESUME)
printk(BIOS_DEBUG, "mctAutoInitMCT_D: Restoring DIMM training configuration from NVRAM\n");
if (restore_mct_information_from_nvram(1) != 0)
printk(BIOS_CRIT, "%s: ERROR: Unable to restore DCT configuration from NVRAM\n", __func__);
#endif
if (is_fam15h())
exit_training_mode_fam15(pMCTstat, pDCTstatA);
pMCTstat->GStatus |= 1 << GSB_ConfigRestored;
}
\end{minted}
\end{adjustwidth}
\caption{Finalization of DQS training and configuration restoration,
extract from the
\protect\path{DQSTiming_D} function in
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mct_d.c}}
\label{lst:finalization_exit}
\end{listing}
Finally, the function performs a cleanup operation specific to
Fam15h processors, where it switches the DCT control register
back to DCT 0, as required by AMD's Erratum 505 referenced in
the BKDG. This is followed by a post-training hook that
allows for any additional necessary actions
(lst. \ref{lst:post_training_cleanup}). \\
\begin{listing}
\begin{adjustwidth}{0.5cm}{0.5cm}
\begin{minted}[linenos]{c}
if (is_fam15h()) {
struct DCTStatStruc *pDCTstat;
for (Node = 0; Node < MAX_NODES_SUPPORTED; Node++) {
pDCTstat = pDCTstatA + Node;
if (pDCTstat->NodePresent) {
fam15h_switch_dct(pDCTstat->dev_map, 0);
}
}
}
/* FIXME - currently uses calculated value TrainMaxReadLatency_D(pMCTstat, pDCTstatA); */
mctHookAfterAnyTraining();
\end{minted}
\end{adjustwidth}
\caption{Post-training cleanup and final hook execution,
extract from the
\protect\path{DQSTiming_D} function in
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mct_d.c}}
\label{lst:post_training_cleanup}
\end{listing}
\subsubsection{Details on the DQS receiver training function}
TODO study \path{TrainReceiverEn_D} \\
\subsubsection{Details on the DQS position training function}
TODO study \path{mct_TrainDQSPos_D} \\
\subsection{Potential enhancements [WIP]}
\begin{itemize}
\item Identifying areas for improvement in the current
implementation
\item Potential enhancements to memory training algorithms
and configuration settings
\item Broader applicability of these improvements to other
systems using \textit{coreboot}
\end{itemize}
FIXME (lst. \ref{lst:mctAutoInitMCT_D_fixme}) \\
It seems that the seeds used for DQS training should be
determined extensively for each motherboard, and the BKDG
\cite{BKDG} does not suggest otherwise. Moreover, seeds can be
configured uniquely for every possible socket, channel, DIMM module,
and even byte lane combination. The current implementation of
\path{DQSTiming_D} only uses the recommended seeds from
table 99 of the BKDG \cite{BKDG}, which is insufficient and
certainly not adapted to every DIMM module on the market. \\
See \path{TrainDQSRdWrPos_D_Fam15} in
\path{src/northbridge/amd/amdmct/mct/mct_ddr3/mctdqs_d.c}: negative
DQS delay values are allowed there ("Attempting to continue but
your system may be unstable"). Such values should be discarded
and the calculation done again. \\
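A straightforward hardening of this code path, sketched below
with hypothetical names, would be to validate each computed
delay and re-run the affected training pass instead of
continuing with a negative value: \\
\begin{listing}[H]
\begin{adjustwidth}{0.5cm}{0.5cm}
\begin{minted}[linenos]{c}
#include <stdint.h>

extern void run_dqs_training_pass(int16_t *lane_delay, int nlanes);

/* Returns 0 when every lane trained to a valid (non-negative)
 * delay, -1 after exhausting the retries. */
static int train_with_validation(int16_t *lane_delay, int nlanes)
{
	int retries = 3;

	do {
		run_dqs_training_pass(lane_delay, nlanes);

		int bad = 0;
		for (int i = 0; i < nlanes; i++)
			if (lane_delay[i] < 0)
				bad++; /* invalid result: discard */

		if (bad == 0)
			return 0; /* accept the training result */
	} while (retries-- > 0);

	return -1; /* persistent failure: caller should reset */
}
\end{minted}
\end{adjustwidth}
\caption{Illustrative sketch of negative-delay validation with retries}
\label{lst:dqs_seed_validation}
\end{listing}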
% ------------------------------------------------------------------------------
% CHAPTER 5: Virtualization of the operating system through firmware abstraction
% ------------------------------------------------------------------------------
\chapter{Virtualization of the operating system through firmware abstraction}
In contemporary computing systems, the operating system (OS) no longer
interacts directly with hardware in the same way it did in earlier computing
architectures. Instead, the OS operates within a highly abstracted
environment, where critical functions are managed by various firmware
components such as ACPI, SMM, UEFI, Intel Management Engine (ME), and AMD
Platform Security Processor (PSP). This layered abstraction has led to the
argument that the OS is effectively running in a virtualized environment,
akin to a virtual machine (VM).
\section{ACPI and abstraction of hardware control}
The Advanced Configuration and Power Interface (ACPI) provides a
standardized method for the OS to manage hardware configuration and
power states, effectively abstracting the underlying hardware
complexities. ACPI abstracts hardware details, allowing the OS to
interact with hardware components without needing direct control over
them. This abstraction is similar to how a hypervisor abstracts physical
hardware for VMs, enabling a consistent interface regardless of the
underlying hardware specifics. \\
According to \textcite{bellosa2010}, the abstraction provided by ACPI
not only simplifies the OS's interaction with hardware but also limits
the OS's ability to fully control the hardware, which is instead managed
by ACPI-compliant firmware. This layer of abstraction contributes to the
virtualization-like environment in which the OS operates. \\
More importantly, the ACPI Component Architecture (ACPICA) is a critical
component integrated into the Linux kernel, serving as the foundation
for the system's ACPI implementation \cite{intel_acpi_programming_2023}.
ACPICA provides the core ACPI functionalities, such as hardware
configuration, power management, and thermal management, which are
essential for modern computing platforms. However, its integration into
the Linux kernel has brought significant complexity and code overhead,
making Linux heavily dependent on ACPICA for managing ACPI-related
tasks.
ACPICA is a large and complex project, with its codebase encompassing
a wide range of functionalities required to implement ACPI standards.
The integration of ACPICA into the Linux kernel significantly increases
the kernel's overall code size, as can easily be verified with a
small experiment (lst. \ref{lst:acpica_in_linux}).
\begin{listing}[H]
\begin{adjustwidth}{0.5cm}{0.5cm}
\inputminted{sh}{listings/acpica_size.sh}
\end{adjustwidth}
\caption{How to estimate the impact of ACPICA in Linux}
\label{lst:acpica_in_linux}
\end{listing}
As of recent statistics, ACPICA comprises between 100,000 and
200,000 lines of code, making it one of the larger subsystems within the Linux
kernel. This size is indicative of the extensive range of features
and capabilities ACPICA must support, including but not limited to the
ACPI interpreter, AML (ACPI Machine Language) parser, and various
hardware-specific drivers. The ACPICA codebase is not monolithic; it is
highly modular and consists of various components, each responsible for
specific ACPI functions. For instance, ACPICA includes components for
managing ACPI tables, interpreting AML bytecode, handling events, and
interacting with hardware. This modularity, while beneficial for
isolating different functionalities, also contributes to the overall
complexity of the system. The separation of ACPICA into multiple modules
necessitates careful coordination and integration with the rest of the
Linux kernel, adding to the kernel's complexity. \\
ACPICA's integration into the Linux kernel is designed to maintain a
clear separation between the core ACPI functionalities and the kernel's
other subsystems \cite{intel_acpi_programming_2023}. This separation is
achieved through well-defined interfaces and abstraction layers,
allowing the Linux kernel to interact with ACPICA without being tightly
coupled to its internal implementation details. For example, ACPICA
provides an API that the Linux kernel can use to interact with ACPI
tables, execute ACPI methods, and manage power states. This API
abstracts the underlying complexity of the ACPI implementation, making
it easier for kernel developers to incorporate ACPI support without
delving into the intricacies of ACPICA's internals.
Moreover, ACPICA's role in interpreting AML bytecode, which is
essentially a form of low-level programming language embedded in ACPI
tables, adds a layer of abstraction. The Linux kernel relies on ACPICA
to execute AML methods and manage hardware resources according to the
ACPI specifications. This reliance further underscores the idea that
ACPI acts as a virtualizing environment, shielding the kernel from
the complexities of directly interfacing with hardware components.
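To give a feel for this interface, the following fragment
evaluates an AML method through ACPICA's public API, the same
entry point the Linux kernel wraps; the namespace path is just
an example and error handling is reduced to a minimum.
\begin{listing}[H]
\begin{adjustwidth}{0.5cm}{0.5cm}
\begin{minted}[linenos]{c}
#include "acpi.h" /* ACPICA public headers */

/* Evaluate \_SB.PCI0._STA: the kernel never parses AML itself,
 * it delegates the interpretation to ACPICA. */
static ACPI_STATUS read_device_status(UINT64 *sta)
{
	ACPI_BUFFER result = { ACPI_ALLOCATE_BUFFER, NULL };
	ACPI_OBJECT *obj;
	ACPI_STATUS status;

	status = AcpiEvaluateObject(NULL, "\\_SB.PCI0._STA",
	                            NULL, &result);
	if (ACPI_FAILURE(status))
		return status;

	obj = result.Pointer;
	if (obj->Type == ACPI_TYPE_INTEGER)
		*sta = obj->Integer.Value;

	ACPI_FREE(result.Pointer);
	return AE_OK;
}
\end{minted}
\end{adjustwidth}
\caption{Evaluating an AML method through the ACPICA API (sketch)}
\label{lst:acpica_eval_sketch}
\end{listing}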
\section{SMM as a hidden execution layer}
System Management Mode (SMM) is a special-purpose operating mode
provided by x86 processors, designed to handle system-wide functions
such as power management, thermal monitoring, and hardware control,
independent of the OS. SMM operates transparently to the OS, executing
code that the OS cannot detect or control, similar to how a hypervisor
controls the execution environment of VMs. \\
Research by \textcite{huang2009invisible} argues that SMM introduces a
hidden layer of execution that diminishes the OS's control over the
hardware, creating a virtualized environment where the OS is unaware of
and unable to influence certain system-level operations. This hidden
execution layer reinforces the idea that the OS runs in an environment
similar to a VM, with the firmware acting as a hypervisor. \\
\section{UEFI and persistence}
The Unified Extensible Firmware Interface (UEFI) has largely replaced
the traditional BIOS in modern systems, providing a sophisticated
environment that includes a kernel-like structure capable of running
drivers and applications independently of the OS. UEFI remains active
even after the OS has booted, continuing to manage certain hardware
functions, which abstracts these functions away from the OS. \\
\textcite{mcclean2017uefi} discusses how UEFI creates a persistent
execution environment that overlaps with the OS's operation, effectively
placing the OS in a position where it runs on top of another controlling
layer, much like a guest OS in a VM. This persistence and the ability of
UEFI to manage hardware resources independently further blur the lines
between traditional OS operation and virtualized environments.
Indeed, as we studied in a previous chapter, UEFI is designed as a
modular and extensible firmware interface that sits between the
computer's hardware and the operating system. Unlike the monolithic
BIOS, UEFI is composed of several layers and components, each
responsible for different aspects of the system's boot and runtime
processes. The core components of UEFI include the Pre-EFI
Initialization (PEI), Driver Execution Environment (DXE),
Boot Device Selection (BDS), and Runtime Services. Each of these
components plays a critical role in initializing the hardware,
managing drivers, selecting boot devices, and providing runtime
services to the OS. \\
The PEI (Pre-EFI Initialization) phase is responsible for initializing
the CPU, memory, and other essential hardware components. It ensures
that the system is in a stable state before handing control to the
DXE phase. In the DXE phase, the system loads and initializes various
drivers required for the OS to interact with the hardware. The DXE phase
also constructs the UEFI Boot Services, which provide the OS with
interfaces to the hardware during the boot process. The BDS (Boot Device
Selection) phase is responsible for selecting the device from which the
OS will boot. It interacts with the UEFI Boot Manager to determine the
correct boot path and load the OS. After the OS has booted, UEFI
provides Runtime Services that remain accessible to the OS. These
services include interfaces for managing system variables, time, and
hardware. UEFI also supports the execution of standalone applications,
which can be used for system diagnostics, firmware updates, or other
tasks. These applications operate independently of the OS, highlighting
UEFI's capabilities as a minimalistic OS. \\
UEFI abstracts the underlying hardware from the OS, providing a
standardized interface for the OS to interact with different hardware
components. This abstraction simplifies the development of OSes and
drivers, as they do not need to be tailored for specific hardware
configurations. UEFI's hardware abstraction is one of the key features
that enable it to act as a virtualizing environment for the OS
\cite{mcclean2017uefi}.
\subsection{Memory Management}
UEFI provides a detailed memory map to the OS during the boot process,
which includes information about available, reserved, and used memory
regions. The OS uses this memory map to manage its own memory allocation
and paging mechanisms. The overlap in memory management functions
highlights UEFI's role in preparing the system for OS operation.
This memory map includes all the memory regions in the system,
categorized into different types, such as usable memory, reserved
memory, and memory-mapped I/O. The OS relies on this map to understand
the system's memory layout and avoid conflicts \cite{osdev_uefi_memory}.
The OS extends UEFI's memory
management by implementing its own memory allocation, paging, and
virtual memory mechanisms. However, the OS's memory management is
built on the foundation provided by UEFI, demonstrating the close
relationship between the two.
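The handoff can be illustrated with a short EDK II style
fragment that retrieves the UEFI memory map and counts the
conventional-memory pages; names follow the UEFI specification,
and error paths are omitted for brevity.
\begin{listing}[H]
\begin{adjustwidth}{0.5cm}{0.5cm}
\begin{minted}[linenos]{c}
#include <Uefi.h>
#include <Library/UefiBootServicesTableLib.h> /* gBS */
#include <Library/MemoryAllocationLib.h>

STATIC UINT64 CountUsablePages(VOID)
{
	UINTN MapSize = 0, MapKey, DescSize;
	UINT32 DescVersion;
	EFI_MEMORY_DESCRIPTOR *Map, *Desc;
	UINT64 Pages = 0;

	/* First call reports the required buffer size. */
	gBS->GetMemoryMap(&MapSize, NULL, &MapKey,
	                  &DescSize, &DescVersion);
	MapSize += 2 * DescSize; /* room for allocation side effects */
	Map = AllocatePool(MapSize);
	gBS->GetMemoryMap(&MapSize, Map, &MapKey,
	                  &DescSize, &DescVersion);

	for (Desc = Map;
	     (UINT8 *)Desc < (UINT8 *)Map + MapSize;
	     Desc = (EFI_MEMORY_DESCRIPTOR *)((UINT8 *)Desc + DescSize))
		if (Desc->Type == EfiConventionalMemory)
			Pages += Desc->NumberOfPages;

	FreePool(Map);
	return Pages;
}
\end{minted}
\end{adjustwidth}
\caption{Walking the UEFI memory map (simplified EDK II sketch)}
\label{lst:uefi_memmap_sketch}
\end{listing}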
\subsection{File System Management}
UEFI includes its own file system management capabilities, which overlap
with those of the OS. The most notable example is the EFI System
Partition (ESP), a special partition formatted with the FAT file system
that UEFI uses to store bootloaders, drivers, and other critical files
\cite{uefi_spec}. The ESP is a mandatory partition in UEFI systems,
containing the bootloaders, firmware updates, and other files
necessary for system initialization. UEFI accesses the ESP
independently of the OS, but the OS can also access and manage files
on the ESP, creating an overlap in file system management functions
\cite{uefi_smm_security}. UEFI natively supports the FAT file
system, allowing it to read and write files on the ESP. This support
overlaps with the OS's file system management, as both UEFI and the
OS can manipulate files on the ESP.
\subsection{Device Drivers}
As we studied in an earlier chapter, UEFI includes its own driver
model, allowing it to load and execute drivers independently of the
OS. This capability overlaps with the OS's driver management
functions, as both UEFI and the OS manage hardware devices through
drivers.
UEFI drivers are typically used during
the boot process to initialize and control hardware devices. These
drivers provide the necessary interfaces for the OS to interact with
the hardware once it has booted \cite{uefi_smm_security}.
After the OS has booted, it loads its own drivers for hardware
devices. However, the OS often relies on the initial hardware setup
performed by UEFI drivers.
\subsection{Power Management}
UEFI provides power management services that overlap with the OS's
power management functions. These services allow UEFI to manage
power states and transitions independently of the OS \cite{uefi_spec}.
These services ensure that the system conserves power during periods
of inactivity and can quickly resume operation when needed.
The OS extends UEFI's power management by implementing its own
power-saving mechanisms, such as CPU throttling and dynamic voltage
scaling.
\section{Intel and AMD: control beyond the OS}
Intel Management Engine (ME) and AMD Platform Security Processor (PSP)
are embedded microcontrollers within Intel and AMD processors,
respectively. These components run their own firmware and operate
independently of the main CPU, handling tasks such as security
enforcement, remote management, and digital rights management (DRM). \\
\textcite{bulygin2013chipset} highlights how these microcontrollers have
control over the system that supersedes the OS, managing hardware and
security functions without the OS's knowledge or consent. This level of
control is reminiscent of a hypervisor that manages the resources and
security of VMs. The OS, in this context, operates similarly to a VM
that does not have full control over the hardware it ostensibly manages. \\
\section{The OS as a virtualized environment}
The combined effect of these firmware components (ACPI, SMM, UEFI,
Intel ME, and AMD PSP) creates an environment where the OS operates in
a virtualized or highly abstracted layer. The OS does not directly
manage the hardware; instead, it interfaces with these firmware
components, which themselves control the hardware resources. This
situation is analogous to a virtual machine, where the guest OS
operates on virtualized hardware managed by a hypervisor. \\
\textcite{smith2019firmware} argues that modern OS environments,
influenced by these firmware components, should be considered
virtualized environments. The firmware acts as an intermediary layer
that abstracts and controls hardware resources, thereby limiting the
OS's direct access and control. \\
The presence and operation of modern firmware components such as ACPI,
SMM, UEFI, Intel ME, and AMD PSP contribute to a significant abstraction
of hardware from the OS. This abstraction creates an environment that
parallels the operation of a virtual machine, where the OS functions
within a controlled, virtualized layer managed by these firmware
systems. The growing body of research supports this perspective,
suggesting that the traditional notion of an OS directly managing
hardware is increasingly outdated in the face of these complex,
autonomous firmware components.
\chapter*{Conclusion}
\addcontentsline{toc}{chapter}{Conclusion}
This document has explored the evolution and current state of firmware,
particularly focusing on the transition from traditional BIOS to more
advanced firmware interfaces such as UEFI and \textit{coreboot}. The
evolution from a simple set of routines stored in ROM to complex systems
like UEFI and \textit{coreboot} highlights the growing importance of
firmware in modern computing. Firmware now plays a critical role not
only in hardware initialization but also in memory management, security,
and system performance optimization. \\
The study of the ASUS KGPE-D16 mainboard illustrates how firmware,
particularly \textit{coreboot}, plays a crucial role in the efficient
and secure operation of high-performance systems. The KGPE-D16, with its
support for free software-compatible firmware, exemplifies the potential
of libre firmware to deliver both high performance and freedom from
proprietary constraints. However, it is important to acknowledge that
the KGPE-D16 is not without its imperfections. The detailed analysis of
firmware components, such as the bootblock, romstage, and especially the
RAM initialization and training algorithms, reveals areas where the
firmware can be further refined to enhance system stability and
performance. These improvements are not only beneficial for the KGPE-D16
but can also be applied to other boards, extending the impact of these
optimizations across a broader range of hardware. \\
Moreover, the discussion on modern firmware components such as ACPI,
SMM, UEFI, Intel ME, and AMD PSP demonstrates how these elements
abstract hardware from the operating system, creating a virtualized
environment where the OS operates more like a guest in a
hypervisor-controlled system. This abstraction raises important
considerations about control, security, and user freedom in contemporary
computing.
As we continue to witness the increasing complexity and influence of
firmware in computing, it becomes crucial to advocate for free
software-compatible hardware. The dependence on proprietary firmware and
the associated restrictions on user freedom are growing concerns that
need to be addressed. The development and adoption of libre firmware
solutions, such as \textit{coreboot} and GNU Boot, are essential steps
towards ensuring that users retain control over their hardware and
software environments. \\
It is imperative that the community of developers, researchers, and
users come together to support and contribute to the development of
free firmware. By fostering innovation and collaboration in this field,
we can advance towards a future where free software-compatible hardware
becomes the norm, ensuring that computing remains open, secure, and
under the control of its users. The significance of a libre BIOS cannot
be overstated: it is the foundation upon which a truly free and open
computing ecosystem can be built \cite{coreboot_fsf}.
The GNU Boot project is of particular importance in this
regard. As a fully free firmware initiative, GNU Boot represents a
critical step towards achieving truly libre BIOSes, ensuring that users
can maintain full control over their hardware and firmware environments.
The continued development and support of GNU Boot are essential for
advancing the goals of free software and protecting user freedoms in the
increasingly complex landscape of modern computing. \\
\newpage
% Bibliography
\nocite{*}
\addcontentsline{toc}{chapter}{Bibliography}
\printbibliography
\newpage
% ------------------------------------------------------------------------------
% LICENSE
% ------------------------------------------------------------------------------
\chapter*{\center\rlap{GNU Free Documentation License}}
\addcontentsline{toc}{chapter}{GNU Free Documentation License}
Version 1.3, 3 November 2008
Copyright \copyright{} 2000, 2001, 2002, 2007, 2008 Free Software Foundation, Inc.
\bigskip
\path{<https://fsf.org/>}
\bigskip
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
\bigskip\bigskip{\bf\large Preamble}\bigskip
The purpose of this License is to make a manual, textbook, or other
functional and useful document ``free'' in the sense of freedom: to
assure everyone the effective freedom to copy and redistribute it,
with or without modifying it, either commercially or noncommercially.
Secondarily, this License preserves for the author and publisher a way
to get credit for their work, while not being considered responsible
for modifications made by others.
This License is a kind of ``copyleft'', which means that derivative
works of the document must themselves be free in the same sense. It
complements the GNU General Public License, which is a copyleft
license designed for free software.
We have designed this License in order to use it for manuals for free
software, because free software needs free documentation: a free
program should come with manuals providing the same freedoms that the
software does. But this License is not limited to software manuals;
it can be used for any textual work, regardless of subject matter or
whether it is published as a printed book. We recommend this License
principally for works whose purpose is instruction or reference.
\bigskip\bigskip{\Large\bf 1. APPLICABILITY AND DEFINITIONS\par}\bigskip
This License applies to any manual or other work, in any medium, that
contains a notice placed by the copyright holder saying it can be
distributed under the terms of this License. Such a notice grants a
world-wide, royalty-free license, unlimited in duration, to use that
work under the conditions stated herein. The ``\textbf{Document}'', below,
refers to any such manual or work. Any member of the public is a
licensee, and is addressed as ``\textbf{you}''. You accept the license if you
copy, modify or distribute the work in a way requiring permission
under copyright law.
A ``\textbf{Modified Version}'' of the Document means any work containing the
Document or a portion of it, either copied verbatim, or with
modifications and/or translated into another language.
A ``\textbf{Secondary Section}'' is a named appendix or a front-matter section of
the Document that deals exclusively with the relationship of the
publishers or authors of the Document to the Document's overall subject
(or to related matters) and contains nothing that could fall directly
within that overall subject. (Thus, if the Document is in part a
textbook of mathematics, a Secondary Section may not explain any
mathematics.) The relationship could be a matter of historical
connection with the subject or with related matters, or of legal,
commercial, philosophical, ethical or political position regarding
them.
The ``\textbf{Invariant Sections}'' are certain Secondary Sections whose titles
are designated, as being those of Invariant Sections, in the notice
that says that the Document is released under this License. If a
section does not fit the above definition of Secondary then it is not
allowed to be designated as Invariant. The Document may contain zero
Invariant Sections. If the Document does not identify any Invariant
Sections then there are none.
The ``\textbf{Cover Texts}'' are certain short passages of text that are listed,
as Front-Cover Texts or Back-Cover Texts, in the notice that says that
the Document is released under this License. A Front-Cover Text may
be at most 5 words, and a Back-Cover Text may be at most 25 words.
A ``\textbf{Transparent}'' copy of the Document means a machine-readable copy,
represented in a format whose specification is available to the
general public, that is suitable for revising the document
straightforwardly with generic text editors or (for images composed of
pixels) generic paint programs or (for drawings) some widely available
drawing editor, and that is suitable for input to text formatters or
for automatic translation to a variety of formats suitable for input
to text formatters. A copy made in an otherwise Transparent file
format whose markup, or absence of markup, has been arranged to thwart
or discourage subsequent modification by readers is not Transparent.
An image format is not Transparent if used for any substantial amount
of text. A copy that is not ``Transparent'' is called ``\textbf{Opaque}''.
Examples of suitable formats for Transparent copies include plain
ASCII without markup, Texinfo input format, LaTeX input format, SGML
or XML using a publicly available DTD, and standard-conforming simple
HTML, PostScript or PDF designed for human modification. Examples of
transparent image formats include PNG, XCF and JPG. Opaque formats
include proprietary formats that can be read and edited only by
proprietary word processors, SGML or XML for which the DTD and/or
processing tools are not generally available, and the
machine-generated HTML, PostScript or PDF produced by some word
processors for output purposes only.
The ``\textbf{Title Page}'' means, for a printed book, the title page itself,
plus such following pages as are needed to hold, legibly, the material
this License requires to appear in the title page. For works in
formats which do not have any title page as such, ``Title Page'' means
the text near the most prominent appearance of the work's title,
preceding the beginning of the body of the text.
The ``\textbf{publisher}'' means any person or entity that distributes
copies of the Document to the public.
A section ``\textbf{Entitled XYZ}'' means a named subunit of the Document whose
title either is precisely XYZ or contains XYZ in parentheses following
text that translates XYZ in another language. (Here XYZ stands for a
specific section name mentioned below, such as ``\textbf{Acknowledgements}'',
``\textbf{Dedications}'', ``\textbf{Endorsements}'', or ``\textbf{History}''.)
To ``\textbf{Preserve the Title}''
of such a section when you modify the Document means that it remains a
section ``Entitled XYZ'' according to this definition.
The Document may include Warranty Disclaimers next to the notice which
states that this License applies to the Document. These Warranty
Disclaimers are considered to be included by reference in this
License, but only as regards disclaiming warranties: any other
implication that these Warranty Disclaimers may have is void and has
no effect on the meaning of this License.
\bigskip\bigskip{\Large\bf 2. VERBATIM COPYING\par}\bigskip
You may copy and distribute the Document in any medium, either
commercially or noncommercially, provided that this License, the
copyright notices, and the license notice saying this License applies
to the Document are reproduced in all copies, and that you add no other
conditions whatsoever to those of this License. You may not use
technical measures to obstruct or control the reading or further
copying of the copies you make or distribute. However, you may accept
compensation in exchange for copies. If you distribute a large enough
number of copies you must also follow the conditions in section~3.
You may also lend copies, under the same conditions stated above, and
you may publicly display copies.
\bigskip\bigskip{\Large\bf 3. COPYING IN QUANTITY\par}\bigskip
If you publish printed copies (or copies in media that commonly have
printed covers) of the Document, numbering more than 100, and the
Document's license notice requires Cover Texts, you must enclose the
copies in covers that carry, clearly and legibly, all these Cover
Texts: Front-Cover Texts on the front cover, and Back-Cover Texts on
the back cover. Both covers must also clearly and legibly identify
you as the publisher of these copies. The front cover must present
the full title with all words of the title equally prominent and
visible. You may add other material on the covers in addition.
Copying with changes limited to the covers, as long as they preserve
the title of the Document and satisfy these conditions, can be treated
as verbatim copying in other respects.
If the required texts for either cover are too voluminous to fit
legibly, you should put the first ones listed (as many as fit
reasonably) on the actual cover, and continue the rest onto adjacent
pages.
If you publish or distribute Opaque copies of the Document numbering
more than 100, you must either include a machine-readable Transparent
copy along with each Opaque copy, or state in or with each Opaque copy
a computer-network location from which the general network-using
public has access to download using public-standard network protocols
a complete Transparent copy of the Document, free of added material.
If you use the latter option, you must take reasonably prudent steps,
when you begin distribution of Opaque copies in quantity, to ensure
that this Transparent copy will remain thus accessible at the stated
location until at least one year after the last time you distribute an
Opaque copy (directly or through your agents or retailers) of that
edition to the public.
It is requested, but not required, that you contact the authors of the
Document well before redistributing any large number of copies, to give
them a chance to provide you with an updated version of the Document.
\bigskip\bigskip{\Large\bf 4. MODIFICATIONS\par}\bigskip
You may copy and distribute a Modified Version of the Document under
the conditions of sections 2 and 3 above, provided that you release
the Modified Version under precisely this License, with the Modified
Version filling the role of the Document, thus licensing distribution
and modification of the Modified Version to whoever possesses a copy
of it. In addition, you must do these things in the Modified Version:
\begin{itemize}
\item[A.]
Use in the Title Page (and on the covers, if any) a title distinct
from that of the Document, and from those of previous versions
(which should, if there were any, be listed in the History section
of the Document). You may use the same title as a previous version
if the original publisher of that version gives permission.
\item[B.]
List on the Title Page, as authors, one or more persons or entities
responsible for authorship of the modifications in the Modified
Version, together with at least five of the principal authors of the
Document (all of its principal authors, if it has fewer than five),
unless they release you from this requirement.
\item[C.]
State on the Title page the name of the publisher of the
Modified Version, as the publisher.
\item[D.]
Preserve all the copyright notices of the Document.
\item[E.]
Add an appropriate copyright notice for your modifications
adjacent to the other copyright notices.
\item[F.]
Include, immediately after the copyright notices, a license notice
giving the public permission to use the Modified Version under the
terms of this License, in the form shown in the Addendum below.
\item[G.]
Preserve in that license notice the full lists of Invariant Sections
and required Cover Texts given in the Document's license notice.
\item[H.]
Include an unaltered copy of this License.
\item[I.]
Preserve the section Entitled ``History'', Preserve its Title, and add
to it an item stating at least the title, year, new authors, and
publisher of the Modified Version as given on the Title Page. If
there is no section Entitled ``History'' in the Document, create one
stating the title, year, authors, and publisher of the Document as
given on its Title Page, then add an item describing the Modified
Version as stated in the previous sentence.
\item[J.]
Preserve the network location, if any, given in the Document for
public access to a Transparent copy of the Document, and likewise
the network locations given in the Document for previous versions
it was based on. These may be placed in the ``History'' section.
You may omit a network location for a work that was published at
least four years before the Document itself, or if the original
publisher of the version it refers to gives permission.
\item[K.]
For any section Entitled ``Acknowledgements'' or ``Dedications'',
Preserve the Title of the section, and preserve in the section all
the substance and tone of each of the contributor acknowledgements
and/or dedications given therein.
\item[L.]
Preserve all the Invariant Sections of the Document,
unaltered in their text and in their titles. Section numbers
or the equivalent are not considered part of the section titles.
\item[M.]
Delete any section Entitled ``Endorsements''. Such a section
may not be included in the Modified Version.
\item[N.]
Do not retitle any existing section to be Entitled ``Endorsements''
or to conflict in title with any Invariant Section.
\item[O.]
Preserve any Warranty Disclaimers.
\end{itemize}
If the Modified Version includes new front-matter sections or
appendices that qualify as Secondary Sections and contain no material
copied from the Document, you may at your option designate some or all
of these sections as invariant. To do this, add their titles to the
list of Invariant Sections in the Modified Version's license notice.
These titles must be distinct from any other section titles.

You may add a section Entitled ``Endorsements'', provided it contains
nothing but endorsements of your Modified Version by various
parties---for example, statements of peer review or that the text has
been approved by an organization as the authoritative definition of a
standard.

You may add a passage of up to five words as a Front-Cover Text, and a
passage of up to 25 words as a Back-Cover Text, to the end of the list
of Cover Texts in the Modified Version. Only one passage of
Front-Cover Text and one of Back-Cover Text may be added by (or
through arrangements made by) any one entity. If the Document already
includes a cover text for the same cover, previously added by you or
by arrangement made by the same entity you are acting on behalf of,
you may not add another; but you may replace the old one, on explicit
permission from the previous publisher that added the old one.

The author(s) and publisher(s) of the Document do not by this License
give permission to use their names for publicity for or to assert or
imply endorsement of any Modified Version.
\bigskip\bigskip{\Large\bf 5. COMBINING DOCUMENTS\par}\bigskip
You may combine the Document with other documents released under this
License, under the terms defined in section~4 above for modified
versions, provided that you include in the combination all of the
Invariant Sections of all of the original documents, unmodified, and
list them all as Invariant Sections of your combined work in its
license notice, and that you preserve all their Warranty Disclaimers.

The combined work need only contain one copy of this License, and
multiple identical Invariant Sections may be replaced with a single
copy. If there are multiple Invariant Sections with the same name but
different contents, make the title of each such section unique by
adding at the end of it, in parentheses, the name of the original
author or publisher of that section if known, or else a unique number.
Make the same adjustment to the section titles in the list of
Invariant Sections in the license notice of the combined work.

In the combination, you must combine any sections Entitled ``History''
in the various original documents, forming one section Entitled
``History''; likewise combine any sections Entitled ``Acknowledgements'',
and any sections Entitled ``Dedications''. You must delete all sections
Entitled ``Endorsements''.
\bigskip\bigskip{\Large\bf 6. COLLECTIONS OF DOCUMENTS\par}\bigskip
You may make a collection consisting of the Document and other documents
released under this License, and replace the individual copies of this
License in the various documents with a single copy that is included in
the collection, provided that you follow the rules of this License for
verbatim copying of each of the documents in all other respects.

You may extract a single document from such a collection, and distribute
it individually under this License, provided you insert a copy of this
License into the extracted document, and follow this License in all
other respects regarding verbatim copying of that document.
\bigskip\bigskip{\Large\bf 7. AGGREGATION WITH INDEPENDENT WORKS\par}\bigskip
A compilation of the Document or its derivatives with other separate
and independent documents or works, in or on a volume of a storage or
distribution medium, is called an ``aggregate'' if the copyright
resulting from the compilation is not used to limit the legal rights
of the compilation's users beyond what the individual works permit.
When the Document is included in an aggregate, this License does not
apply to the other works in the aggregate which are not themselves
derivative works of the Document.

If the Cover Text requirement of section~3 is applicable to these
copies of the Document, then if the Document is less than one half of
the entire aggregate, the Document's Cover Texts may be placed on
covers that bracket the Document within the aggregate, or the
electronic equivalent of covers if the Document is in electronic form.
Otherwise they must appear on printed covers that bracket the whole
aggregate.
\bigskip\bigskip{\Large\bf 8. TRANSLATION\par}\bigskip
Translation is considered a kind of modification, so you may
distribute translations of the Document under the terms of section~4.
Replacing Invariant Sections with translations requires special
permission from their copyright holders, but you may include
translations of some or all Invariant Sections in addition to the
original versions of these Invariant Sections. You may include a
translation of this License, and all the license notices in the
Document, and any Warranty Disclaimers, provided that you also include
the original English version of this License and the original versions
of those notices and disclaimers. In case of a disagreement between
the translation and the original version of this License or a notice
or disclaimer, the original version will prevail.

If a section in the Document is Entitled ``Acknowledgements'',
``Dedications'', or ``History'', the requirement (section~4) to Preserve
its Title (section~1) will typically require changing the actual
title.
\bigskip\bigskip{\Large\bf 9. TERMINATION\par}\bigskip
You may not copy, modify, sublicense, or distribute the Document
except as expressly provided under this License. Any attempt
otherwise to copy, modify, sublicense, or distribute it is void, and
will automatically terminate your rights under this License.

However, if you cease all violation of this License, then your license
from a particular copyright holder is reinstated (a) provisionally,
unless and until the copyright holder explicitly and finally
terminates your license, and (b) permanently, if the copyright holder
fails to notify you of the violation by some reasonable means prior to
60 days after the cessation.

Moreover, your license from a particular copyright holder is
reinstated permanently if the copyright holder notifies you of the
violation by some reasonable means, this is the first time you have
received notice of violation of this License (for any work) from that
copyright holder, and you cure the violation prior to 30 days after
your receipt of the notice.

Termination of your rights under this section does not terminate the
licenses of parties who have received copies or rights from you under
this License. If your rights have been terminated and not permanently
reinstated, receipt of a copy of some or all of the same material does
not give you any rights to use it.
\bigskip\bigskip{\Large\bf 10. FUTURE REVISIONS OF THIS LICENSE\par}\bigskip
The Free Software Foundation may publish new, revised versions
of the GNU Free Documentation License from time to time. Such new
versions will be similar in spirit to the present version, but may
differ in detail to address new problems or concerns. See
\path{https://www.gnu.org/licenses/}.

Each version of the License is given a distinguishing version number.
If the Document specifies that a particular numbered version of this
License ``or any later version'' applies to it, you have the option of
following the terms and conditions either of that specified version or
of any later version that has been published (not as a draft) by the
Free Software Foundation. If the Document does not specify a version
number of this License, you may choose any version ever published (not
as a draft) by the Free Software Foundation. If the Document
specifies that a proxy can decide which future versions of this
License can be used, that proxy's public statement of acceptance of a
version permanently authorizes you to choose that version for the
Document.
\bigskip\bigskip{\Large\bf 11. RELICENSING\par}\bigskip
``Massive Multiauthor Collaboration Site'' (or ``MMC Site'') means any
World Wide Web server that publishes copyrightable works and also
provides prominent facilities for anybody to edit those works. A
public wiki that anybody can edit is an example of such a server. A
``Massive Multiauthor Collaboration'' (or ``MMC'') contained in the
site means any set of copyrightable works thus published on the MMC
site.

``CC-BY-SA'' means the Creative Commons Attribution-Share Alike 3.0
license published by Creative Commons Corporation, a not-for-profit
corporation with a principal place of business in San Francisco,
California, as well as future copyleft versions of that license
published by that same organization.

``Incorporate'' means to publish or republish a Document, in whole or
in part, as part of another Document.

An MMC is ``eligible for relicensing'' if it is licensed under this
License, and if all works that were first published under this License
somewhere other than this MMC, and subsequently incorporated in whole
or in part into the MMC, (1) had no cover texts or invariant sections,
and (2) were thus incorporated prior to November 1, 2008.

The operator of an MMC Site may republish an MMC contained in the site
under CC-BY-SA on the same site at any time before August 1, 2009,
provided the MMC is eligible for relicensing.
\bigskip\bigskip{\Large\bf ADDENDUM: How to use this License for your documents\par}\bigskip
To use this License in a document you have written, include a copy of
the License in the document and put the following copyright and
license notices just after the title page:
\bigskip
\begin{quote}
Copyright \copyright{} YEAR YOUR NAME.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3
or any later version published by the Free Software Foundation;
with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.
A copy of the license is included in the section entitled ``GNU
Free Documentation License''.
\end{quote}
\bigskip
If you have Invariant Sections, Front-Cover Texts and Back-Cover Texts,
replace the ``with \dots\ Texts.''\ line with this:
\bigskip
\begin{quote}
with the Invariant Sections being LIST THEIR TITLES, with the
Front-Cover Texts being LIST, and with the Back-Cover Texts being LIST.
\end{quote}
\bigskip
If you have Invariant Sections without Cover Texts, or some other
combination of the three, merge those two alternatives to suit the
situation.

If your document contains nontrivial examples of program code, we
recommend releasing these examples in parallel under your choice of
free software license, such as the GNU General Public License,
to permit their use in free software.
\end{document}