5483 lines
294 KiB
TeX
5483 lines
294 KiB
TeX
% -*- coding: utf-8 -*-
|
||
% Copyright (C) 2024 Adrien 'neox' Bourmault <neox@gnu.org>
|
||
%
|
||
% Permission is granted to copy, distribute and/or modify this document
|
||
% under the terms of the GNU Free Documentation License, Version 1.3
|
||
% or any later version published by the Free Software Foundation;
|
||
% with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.
|
||
% A copy of the license is included in the section entitled "GNU
|
||
% Free Documentation License".
|
||
|
||
\input{packages.tex}
|
||
|
||
\title{Hardware initialization of modern computers}
|
||
\author{Adrien 'neox' Bourmault}
|
||
\date{\today}
|
||
|
||
% setup bibliography
|
||
\addbibresource{bibliographie.bib}
|
||
|
||
% ------------------------------------------------------------------------------
|
||
\begin{document}{
|
||
% ------------------------------------------------------------------------------
|
||
|
||
\sloppy % allow flexible margins
|
||
\input{titlepage.tex} % import titlepage
|
||
\newpage
|
||
|
||
% ------------------------------------------------------------------------------
|
||
% License header
|
||
% ------------------------------------------------------------------------------
|
||
|
||
\setcounter{page}{2}
|
||
\vspace*{\fill} % fill the page so that text is at the bottom
|
||
|
||
This is Edition 1.1. \\
|
||
|
||
Copyright \copyright\ 2024 Adrien 'neox' Bourmault
|
||
\href{mailto:neox@gnu.org}{<neox@gnu.org>} \\
|
||
|
||
Permission is granted to copy, distribute and/or modify this document
|
||
under the terms of the GNU Free Documentation License, Version 1.3
|
||
or any later version published by the Free Software Foundation;
|
||
with the Invariant Sections being "GNU General Public License version 2"
|
||
and "GNU Free Documentation License", with no Front-Cover Texts, and no
|
||
Back-Cover Texts.
|
||
A copy of the license is included in the section entitled "GNU
|
||
Free Documentation License". \\
|
||
|
||
Source-code included in this document is licensed under the GNU General
|
||
Public License version 2 or any later version published by the
|
||
Free Software Foundation.
|
||
A copy of the license is included in the section entitled "GNU
|
||
General Public License version 2". \\
|
||
|
||
\newpage
|
||
|
||
% ------------------------------------------------------------------------------
|
||
% ACKNOWLEDGMENTS
|
||
% ------------------------------------------------------------------------------
|
||
\chapter*{Acknowledgments}
|
||
\addcontentsline{toc}{chapter}{Acknowledgments}
|
||
|
||
First and foremost, I would like to express my deep gratitude to
|
||
Marie-Minerve Louërat, without whom
|
||
this work would not have come to fruition. Her efforts to assist me on
|
||
legal matters and our enriching discussions on the philosophy of free
|
||
software have been invaluable to me. \\
|
||
|
||
I am also thankful to Roselyne Chotin for agreeing to fund this work as
|
||
without her support, none of this would have been possible, and
|
||
Franck Wajsburt for his invaluable advice at the beginning of my work,
|
||
which greatly helped me organize myself, as well as for his support and
|
||
reviews throughout this period. \\
|
||
|
||
I wish to express my appreciation to the Free Software Foundation for
|
||
funding the necessary equipment for this project, especially to
|
||
the board members of the foundation, including Richard M. Stallman and
|
||
Odile Bénassy, for making the funding for this project possible.
|
||
Special thanks go to Zoë Kooyman and Ian Kelling for their exceptional
|
||
support in managing the hardware order and for their kindness and assistance
|
||
throughout the entire process. \\
|
||
|
||
I am deeply grateful to Denis Carikli (GNUtoo), my fellow GNU Boot
|
||
co-maintainer, for his meticulous reviews, emotional support, and brilliant
|
||
ideas that enriched this work, and to Richard M. Stallman for his advice and
|
||
support throughout this journey. \\
|
||
|
||
A big thank you to Manuel Bouyer for his infinite patience with all my
|
||
requests regarding network, software, and hardware configurations. \\
|
||
|
||
Also, I warmly thank my family and friends for their constant
|
||
encouragement and for reviewing my work throughout this entire process. \\
|
||
|
||
And finally, I would like to thank the break room and its kettle,
|
||
without which no tea would have been possible, thereby jeopardizing this
|
||
work. \\
|
||
|
||
\newpage
|
||
|
||
% ------------------------------------------------------------------------------
|
||
% ABSTRACT
|
||
% ------------------------------------------------------------------------------
|
||
\chapter*{Abstract}
|
||
\addcontentsline{toc}{chapter}{Abstract}
|
||
|
||
The global trend is towards the scarcity of free software-compatible
|
||
hardware, and soon there will be no computer that will work without
|
||
software domination by big companies, especially involving firmware like
|
||
BIOSes. \\
|
||
|
||
A Basic Input Output System (BIOS) was originally a set of low-level
|
||
functions contained in the read-only memory of a computer's mainboard,
|
||
enabling it to perform basic operations when powered up. However, the
|
||
definition of a BIOS has evolved to include what used to be known as Power
|
||
On Self Test (POST) for the presence of peripherals, allocating resources
|
||
for them to avoid conflicts, and then handing over to an operating system
|
||
boot loader. Nowadays, the bulk of the BIOS work is the initialization
|
||
and training of RAM. This means, for example, initializing the memory
|
||
controller and optimizing timing and read/write voltage for optimal
|
||
performance, making the code complex, as its role is to optimize several
|
||
parallel buses operating at high speeds and shared by many CPU cores,
|
||
and make them act as a homogeneous whole. \\
|
||
|
||
This document is the product of a project hosted by the \textit{LIP6
|
||
laboratory} and supported by the \textit{GNU Boot Project} and the
|
||
\textit{Free Software Foundation}. It delves into the importance
|
||
of firmware in the hardware initialization of modern computers and
|
||
explores various aspects of firmware, such as Intel Management Engine
|
||
(ME), AMD Platform Security Processor (PSP), Advanced Configuration and
|
||
Power Interface (ACPI), and System Management Mode (SMM). Additionally,
|
||
it provides an in-depth look at memory initialization and training
|
||
algorithms, highlighting their critical role in system stability and
|
||
performance. Examples of the implementation in the ASUS KGPE-D16 mainboard
|
||
are presented, describing its hardware characteristics, topology, and the
|
||
crucial role of firmware in its operation after the mainboard architecture
|
||
is examined. Practical examples illustrate the impact of firmware on
|
||
hardware initialization, memory optimization, resource allocation,
|
||
power management, and security. Specific algorithms used for memory
|
||
training and their outcomes are analyzed to demonstrate the complexity
|
||
and importance of firmware in achieving optimal system performance.
|
||
Furthermore, this document explores the relationship between firmware
|
||
and hardware virtualization. Security considerations and future trends
|
||
in firmware development are also addressed, emphasizing the need for
|
||
continued research and advocacy for free software-compatible hardware.
|
||
|
||
\newpage
|
||
|
||
% ------------------------------------------------------------------------------
|
||
% Table of contents
|
||
% ------------------------------------------------------------------------------
|
||
\chapter*{\vspace{-\cftbeforechapskip}}
|
||
\addcontentsline{toc}{chapter}{Contents}
|
||
\tableofcontents
|
||
\newpage
|
||
|
||
% List of figures
|
||
\chapter*{\vspace{-\cftbeforechapskip}}
|
||
\addcontentsline{toc}{chapter}{List of Figures}
|
||
\listoffigures
|
||
\newpage
|
||
|
||
% List of listings
|
||
%\chapter*{\vspace{-3em}}
|
||
\listoflistings
|
||
\addcontentsline{toc}{chapter}{List of Listings}
|
||
\newpage
|
||
|
||
% ------------------------------------------------------------------------------
|
||
% CHAPTER 1: Introduction to firmware and BIOS evolution
|
||
% ------------------------------------------------------------------------------
|
||
\chapter{Introduction to firmware and BIOS evolution}
|
||
|
||
\section{Historical context of BIOS}
|
||
|
||
\subsection{Definition and origin}
|
||
|
||
The BIOS (Basic Input/Output System) is firmware, which is a type of
|
||
software that is embedded into hardware devices to control their basic
|
||
functions, acting as a bridge between hardware and other software,
|
||
ensuring that the hardware operates correctly. Unlike regular
|
||
software, firmware is usually stored in a non-volatile memory like
|
||
ROM or flash memory. The term "firmware" comes from its role: it is
|
||
"firm" because it's more permanent than regular software (which can
|
||
be easily changed) but not as rigid as hardware. \\
|
||
|
||
The BIOS is used to perform initialization during the booting process
|
||
and to provide runtime services for operating systems and programs.
|
||
Being a critical component for the startup of personal computers,
|
||
acting as an intermediary between the computer's hardware and its
|
||
operating system, the BIOS is embedded on a chip on the motherboard
|
||
and is the first code that runs when a PC is powered on. The concept
|
||
of BIOS has its roots in the early days of personal computing. It
|
||
was first developed by IBM for their IBM PC, which was introduced
|
||
in 1981 \cite{freiberger2000fire}. The term BIOS itself was
|
||
coined by Gary Kildall, who developed the CP/M (Control Program for
|
||
Microcomputers) operating system \cite{shustek2016kildall}. In CP/M,
|
||
BIOS was used to describe a component that interfaced directly
|
||
with the hardware, allowing the operating system to be somewhat
|
||
hardware-independent. \\
|
||
|
||
\begin{figure}[H]
|
||
\centering
|
||
\includegraphics[width=0.5\textwidth]{images/IBM_logo.png}
|
||
\caption{The eight-striped wordmark of IBM (1967, public domain,
|
||
trademarked)}
|
||
\end{figure}
|
||
|
||
IBM's implementation of BIOS became a de facto standard in
|
||
the industry, as it was part of the IBM PC's open architecture
|
||
\cite{grewal_ibm_pc}\cite{ibm_pc}, which refers to the design
|
||
philosophy adopted by IBM when developing the IBM Personal Computer
|
||
(PC), introduced in 1981. This architecture is characterized by the use
|
||
of off-the-shelf components and publicly available specifications,
|
||
which allowed other manufacturers to create compatible hardware
|
||
and software. It was in fact a departure from the proprietary
|
||
systems prevalent at the time, where companies closely guarded their
|
||
designs to maintain control over the hardware and software ecosystem.
|
||
For example, IBM used the Intel 8088 CPU, a well-documented and widely
|
||
available processor, and also the Industry Standard Architecture
|
||
(ISA) bus, which defined how various components like memory, storage,
|
||
and peripherals communicated with the CPU. This open architecture
|
||
allowed other manufacturers to create IBM-compatible computers, also
|
||
known as "clones", which further popularized the BIOS concept. As
|
||
a result, the IBM PC BIOS set the stage for a standardized method
|
||
of interacting with computer hardware, which has evolved over the
|
||
years but remains fundamentally the same in principle. IBM also
|
||
published detailed technical documentation at that time, including
|
||
circuit diagrams, BIOS listings, and interface specifications. This
|
||
transparency allowed other companies to understand and replicate
|
||
the IBM PC's functionality \cite{freiberger2000fire}.
|
||
|
||
\subsection{Functionalities and limitations}
|
||
|
||
When a computer is powered on, the BIOS executes a Power-On
|
||
Self-Test (POST), a diagnostic sequence that verifies the integrity
|
||
and functionality of critical hardware components such as the CPU,
|
||
RAM, disk drives, keyboard, and other peripherals \cite{wiki_bios}.
|
||
This process ensures that all essential hardware components are
|
||
operational before the system attempts to load the operating system.
|
||
If any issues are detected, the BIOS generates error messages or
|
||
beep codes to alert the user. Following the successful completion
|
||
of POST, the BIOS runs the bootstrap loader, a small program that
|
||
identifies the operating system's bootloader on a storage device,
|
||
such as a hard drive, floppy disk, or optical drive. The bootstrap
|
||
loader then transfers control to the OS bootloader, initiating
|
||
the process of loading the operating system into the computer's
|
||
memory and starting it. This step effectively bridges the gap
|
||
between hardware initialization and operating system execution.
|
||
The BIOS also provides a set of low-level software routines known
|
||
as interrupts. These routines enable software to perform basic
|
||
input/output operations, such as reading from the keyboard, writing
|
||
to the display, and accessing disk drives, without needing to manage
|
||
the hardware directly. By providing standardized interfaces for
|
||
hardware components, the BIOS simplifies software development and
|
||
improves compatibility across different hardware configurations
|
||
\cite{ibm_pc}. \\
|
||
|
||
\begin{figure}[H]
|
||
\centering
|
||
\includegraphics[width=0.5\textwidth]{images/bios_chip.jpg}
|
||
\caption{An AMI BIOS chip from a Dell 310, by Jud McCranie
|
||
(CC BY-SA 4.0, 2018)}
|
||
\end{figure}
|
||
|
||
Despite its essential role, the early BIOS had several limitations.
|
||
One significant limitation was its limited storage capacity.
|
||
Early BIOS firmware was stored in Read-Only Memory (ROM) chips with
|
||
very limited storage, often just a few kilobytes. This constrained
|
||
the complexity and functionality of the BIOS, limiting it to only the
|
||
most essential tasks needed to start the system and provide basic
|
||
hardware control. The original BIOS was also non-extensible. ROM
|
||
chips were typically soldered onto the motherboard, making updates
|
||
difficult and costly. Bug fixes, updates for new hardware support,
|
||
or enhancements required replacing the ROM chip, leading to challenges
|
||
in maintaining and upgrading systems. Furthermore, the early BIOS was
|
||
tailored for the specific hardware configurations of the initial IBM
|
||
PC models, which included a limited set of peripherals and expansion
|
||
options. As new hardware components and peripherals were developed,
|
||
the BIOS often needed to be updated to support them, which was not
|
||
always feasible or timely. Performance bottlenecks were another
|
||
limitation. The BIOS provided basic input/output operations that
|
||
were often slower than direct hardware access methods. For example,
|
||
disk I/O operations through BIOS interrupts were slower compared
|
||
to later direct access methods provided by operating systems,
|
||
resulting in performance bottlenecks, especially for disk-intensive
|
||
operations \cite{osdev_uefi}. Early BIOS
|
||
implementations also had minimal security features. There were no
|
||
mechanisms to verify the integrity of the BIOS code or to protect
|
||
against unauthorized modifications, leaving systems vulnerable to
|
||
attacks that could alter the BIOS and potentially compromise the
|
||
entire system, such as rootkits and firmware viruses. Added to that,
|
||
the traditional BIOS operates in 16-bit real mode, a constraint that
|
||
limits the amount of code and memory it can address. This limitation
|
||
hinders the performance and complexity of firmware, making it less
|
||
suitable for modern computing needs \cite{intel_uefi}. Additionally,
|
||
BIOS relies on the Master Boot Record (MBR) partitioning scheme,
|
||
which supports a maximum disk size of 2 terabytes and allows only
|
||
four primary partitions \cite{uefi_spec}\cite{russinovich2012}.
|
||
This constraint has become a significant drawback as storage
|
||
capacities have increased. Furthermore, the traditional BIOS has
|
||
limited flexibility and is challenging to update or extend. This
|
||
inflexibility restricts the ability to support new hardware and
|
||
technologies efficiently \cite{osdev_uefi}\cite{acmcs2015}.
|
||
|
||
\section{Modern BIOS and UEFI}
|
||
|
||
\subsection{Transition from traditional BIOS to UEFI (Unified
|
||
Extensible Firmware Interface)}
|
||
|
||
All the limitations listed earlier caused a transition to a more
|
||
modern firmware interface, designed to address the shortcomings of
|
||
the traditional BIOS. This section delves into the historical context
|
||
of this shift, the driving factors behind it, and the advantages
|
||
UEFI offers over the traditional BIOS. \\
|
||
|
||
The development of UEFI began in the mid-1990s as part of the
|
||
Intel Boot Initiative, which aimed to modernize the boot process
|
||
and overcome the limitations of the traditional BIOS. By 2005, the
|
||
Unified EFI Forum, a consortium of technology companies including
|
||
Intel, AMD, and Microsoft, had formalized the UEFI specification
|
||
\cite{uefi_spec}. UEFI was designed to address the shortcomings of
|
||
the traditional BIOS, providing several key improvements.
|
||
|
||
\begin{figure}[H]
|
||
\centering
|
||
\includegraphics[width=0.25\textwidth]{images/uefi_logo.png}
|
||
\caption{The UEFI logo (public domain, 2010)}
|
||
\end{figure}
|
||
|
||
One of the most significant advancements of UEFI is its support for
|
||
32-bit and 64-bit modes, allowing it to address more memory and
|
||
run more complex firmware programs. This capability enables UEFI
|
||
to handle the increased demands of modern hardware and software
|
||
\cite{intel_uefi}\cite{shin2011}. Additionally, UEFI uses the GUID
|
||
Partition Table (GPT) instead of the MBR, supporting disks larger
|
||
than 2 terabytes and allowing for a nearly unlimited number of
|
||
partitions \cite{microsoft_uefi}\cite{russinovich2012}.
|
||
|
||
Improved boot performance is another driving factor. UEFI
|
||
provides faster boot times compared to the traditional BIOS,
|
||
thanks to its efficient hardware and software initialization
|
||
processes. This improvement is particularly beneficial for systems
|
||
with complex hardware configurations, where quick boot times
|
||
are essential \cite{intel_uefi}. UEFI's modular architecture
|
||
makes it more extensible and easier to update compared to the
|
||
traditional BIOS. This design allows for the addition of drivers,
|
||
applications, and other components without requiring a complete
|
||
firmware overhaul, providing greater flexibility and adaptability
|
||
to new technologies \cite{acmcs2015}. UEFI also includes enhanced
|
||
security features such as \textit{Secure Boot}, which ensures that
|
||
only trusted software can be executed during the boot process,
|
||
thereby protecting the system from unauthorized modifications and
|
||
malware \cite{osdev_uefi}\cite{chang2013}. \\
|
||
|
||
The industry-wide support and standardization of UEFI have accelerated
|
||
its adoption across various platforms and devices. Major industry
|
||
players, including Intel, AMD, and Microsoft, have adopted UEFI as
|
||
the new standard for firmware interfaces, ensuring broad compatibility
|
||
and interoperability \cite{uefi_spec}.
|
||
|
||
\subsection{An other way with \textit{coreboot}}
|
||
|
||
While UEFI has become the dominant firmware interface for modern
|
||
computing systems, it is not without its critics. Some of the primary
|
||
concerns about UEFI include its complexity, potential security
|
||
vulnerabilities, and the degree of control it provides to hardware
|
||
manufacturers over the boot process. Originally known as LinuxBIOS,
|
||
\textit{coreboot}, is a free firmware project initiated in 1999 by
|
||
Ron Minnich and his team at the Los Alamos National Laboratory. The
|
||
project's primary goal was to create a fast, lightweight, and
|
||
flexible firmware solution that could initialize hardware and
|
||
boot operating systems quickly, while remaining transparent and
|
||
auditable\cite{coreboot}. As an alternative to UEFI, \textit{coreboot}
|
||
offers a different approach to firmware that aims to address some
|
||
of these concerns and continue the evolution of BIOS.\\
|
||
|
||
One of the main advantages of \textit{coreboot} over UEFI is its
|
||
simplicity, as it is designed to perform only the minimal tasks
|
||
required to initialize hardware and pass control to a payload, such
|
||
as a bootloader or operating system kernel. This minimalist approach
|
||
reduces the attack surface and potential for security vulnerabilities,
|
||
as there is less code that could be exploited by malicious actors.
|
||
Another significant benefit of \textit{coreboot}
|
||
is its libre nature. Unlike UEFI, which is controlled by a consortium
|
||
of hardware and software vendors, \textit{coreboot}'s source code
|
||
is freely available and can be audited, modified, and improved by
|
||
anyone. This transparency ensures that security researchers and
|
||
developers can review the code for potential vulnerabilities and
|
||
contribute to its improvement, fostering a community-driven approach
|
||
to firmware development\cite{coreboot}. This project also supports
|
||
a wide range of bootloaders, called payloads, allowing users to
|
||
customize their boot process to suit their specific needs. Popular
|
||
payloads include SeaBIOS, which provides legacy BIOS compatibility, and
|
||
Tianocore, which offers UEFI functionality within the \textit{coreboot}
|
||
framework. This flexibility allows \textit{coreboot} to be used in
|
||
a variety of environments, from embedded systems to high-performance
|
||
servers \cite{coreboot_payloads}. \\
|
||
|
||
\begin{figure}[H]
|
||
\centering
|
||
\includegraphics[width=0.3\textwidth]{images/coreboot_logo.png}
|
||
\caption{The \textit{coreboot} logo, by Konsult Stuge \&
|
||
coresystems
|
||
(coreboot logo license, 2008)}
|
||
\end{figure}
|
||
|
||
Despite its advantages, \textit{coreboot} is not without its
|
||
challenges. The project relies heavily on community contributions, and
|
||
support for new hardware often lags behind that of UEFI. Additionally,
|
||
the minimalist design of \textit{coreboot} means that some advanced
|
||
features provided by UEFI are not available by default. However,
|
||
the \textit{coreboot} community continues to work on adding
|
||
new features and improving compatibility with modern hardware or
|
||
security issues \cite{coreboot_challenges}. For example, it provides
|
||
a \textit{verified boot} function, allowing to prevent rootkits and
|
||
other attacks based on firmware modifications \cite{coreboot_docs}.
|
||
However, it's important to note that \textit{coreboot} is not entirely
|
||
free in all aspects. Many modern processors and chipsets require
|
||
\textit{proprietary blobs}, short for \textit{Binary Large Object},
|
||
which is a collection of binary data stored as a single entity. These
|
||
blobs are necessary for \textit{coreboot} to function correctly on
|
||
a wide range of hardware, but they compromise the goal of having
|
||
a fully free firmware one day \cite{blobs}, since these blobs are
|
||
used for certain functionalities such as memory initialization and
|
||
hardware management.
|
||
|
||
\begin{figure}[H]
|
||
\centering
|
||
\includegraphics[width=0.25\textwidth]{images/gnuboot.png}
|
||
\caption{The \textit{GNU Boot} logo, by Jason Self (CC0, 2020)}
|
||
\end{figure}
|
||
|
||
To address these concerns, the GNU Project has developed GNU Boot,
|
||
a fully free distribution of firmware, including \textit{coreboot},
|
||
that aims to be entirely free by avoiding the use of proprietary
|
||
binary blobs.
|
||
|
||
GNU Boot is only a distribution: it reuses existing software projects
|
||
and is not very different from fully free GNU/Linux distributions like
|
||
Trisquel or Guix, as GNU Boot is committed to use only free software
|
||
for all aspects of firmware, making it a preferred choice for users
|
||
and organizations that prioritize software freedom and transparency.
|
||
Its goal include to build the software and assemble it in something
|
||
that can be installed, and also to test it and to provide installation
|
||
and upgrade instructions \cite{gnuboot}.
|
||
|
||
\section{Shift in firmware responsibilities}
|
||
|
||
Initially, the BIOS's primary function was to perform the POST, a basic
|
||
diagnostic testing process to check the system's hardware components
|
||
and ensure they were functioning correctly. This included verifying
|
||
the CPU, memory, and essential peripherals before passing control to
|
||
the operating system's bootloader. This process was relatively simple,
|
||
given the limited capabilities and straightforward architecture of
|
||
early computer systems \cite{osdev_uefi}.
|
||
|
||
As computer systems advanced, particularly with the advent of more
|
||
sophisticated memory technologies, the role of firmware expanded
|
||
significantly. Modern memory modules operate at much higher
|
||
speeds and capacities than their predecessors, requiring precise
|
||
configuration to ensure stability and optimal performance. Firmware
|
||
now plays a critical role in managing the memory controller, which is
|
||
responsible for regulating data flow between the processor and memory
|
||
modules. This includes configuring memory frequencies, voltage levels,
|
||
and timing parameters to match the specifications of the installed
|
||
memory \cite{uefi_spec}\cite{BKDG}. Beyond memory management,
|
||
firmware responsibilities have broadened to encompass a wide range
|
||
of system-critical tasks, and even so by including runtime components
|
||
in addition to its initialization tasks.
|
||
One key area is power management, where
|
||
firmware is responsible for optimizing energy consumption across
|
||
various components of the system. Efficient power management is
|
||
essential not only for extending battery life in portable devices
|
||
but also for reducing thermal output and ensuring system longevity
|
||
in desktop and server environments. Moreover, modern firmware takes
|
||
on significant roles in hardware initialization and configuration,
|
||
which were traditionally handled by the operating system. For
|
||
example, the initialization of USB controllers, network interfaces,
|
||
and storage devices is now often managed by the firmware during
|
||
the early stages of the boot process. This shift ensures that the
|
||
operating system can seamlessly interact with hardware from the
|
||
moment it takes control, reducing boot times and improving overall
|
||
system reliability \cite{uefi_spec}. Security has also become a
|
||
paramount concern for modern firmware. UEFI (Unified Extensible
|
||
Firmware Interface), which has largely replaced traditional BIOS
|
||
in modern systems, includes features which prevents unauthorized
|
||
or malicious software from loading during the boot process. This
|
||
helps protect the system from rootkits and other low-level malware
|
||
that could compromise the integrity of the operating system before
|
||
it even starts \cite{uefi_spec}. In the context of performance
|
||
tuning, firmware sometimes also plays a key role in enabling and
|
||
managing overclocking, particularly for the memory subsystem. By
|
||
allowing adjustments to memory frequencies, voltages, and timings,
|
||
firmware provides tools for enthusiasts to push their systems beyond
|
||
default limits. At the same time, it includes safeguards to manage
|
||
the risks of instability and hardware damage, balancing performance
|
||
gains with system reliability \cite{osdev_uefi}. \\
|
||
|
||
In summary, the evolution of firmware from simple hardware
|
||
initialization routines to complex management systems reflects the
|
||
increasing sophistication of modern computer architectures. Firmware
|
||
is now a critical layer that not only ensures the correct functioning
|
||
of hardware components but also optimizes performance, manages power
|
||
consumption, and enhances system security, making it an indispensable
|
||
part of contemporary computing. \\
|
||
|
||
This document will focus on \textit{coreboot} during the next parts
|
||
to study how modern firmware interact with hardware and also as a
|
||
basis for improvements.
|
||
|
||
% ------------------------------------------------------------------------------
|
||
% CHAPTER 2: Characteristics of ASUS KGPE-D16 mainboard
|
||
% ------------------------------------------------------------------------------
|
||
\chapter{Characteristics of ASUS KGPE-D16 mainboard}
|
||
|
||
\begin{figure}[H]
|
||
\centering \includegraphics[width=0.9\textwidth]{images/kgpe-d16.png}
|
||
\caption{The KGPE-D16 (CC BY-SA 4.0, 2021)}
|
||
\end{figure}
|
||
|
||
\newpage
|
||
|
||
\section{Overview of ASUS KGPE-D16 hardware}
|
||
|
||
The ASUS KGPE-D16 server mainboard is a dual-socket motherboard
|
||
designed to support AMD Family 10h/15h series processors. Released
|
||
in 2009, this mainboard was later awarded the \textit{Respects Your
|
||
Freedom} (RYF) certification in March 2017, underscoring its commitment
|
||
to fully free software compatibility \cite{fsf_ryf}. Indeed, this
|
||
mainboard can be operated with a fully free firmware such as GNU
|
||
Boot \cite{gnuboot_status}. \\
|
||
|
||
This mainboard is equipped with robust hardware components designed to
|
||
meet the demands of high-performance computing. It features 16 DDR3
|
||
DIMM slots, capable of supporting up to 256GB of memory, although
|
||
certain configurations may be limited to 192GB, with some reports
|
||
suggesting the potential to support 256GB under specific conditions.
|
||
In terms of expandability, the KGPE-D16 includes multiple PCIe
|
||
slots, with five physical slots available, although only four
|
||
can be used simultaneously due to slot sharing. For storage, the
|
||
mainboard provides several SATA ports. Networking capabilities are
|
||
enhanced by integrated dual gigabit Ethernet ports, which provide
|
||
high-speed connectivity essential for data-intensive tasks and network
|
||
communication \cite{asus_kgpe_d16_manual}. Additionally, the board
|
||
is equipped with various peripheral interfaces, including USB ports,
|
||
audio outputs, and other I/O ports, ensuring compatibility with a
|
||
wide range of external devices. \\
|
||
|
||
\begin{figure}[H]
|
||
\centering
|
||
\includegraphics[width=0.8\textwidth]{images/fig1_schema_basique.png}
|
||
\caption{Basic schematics of the ASUS KGPE-D16 Mainboard, ASUS
|
||
(2011)} \label{fig:d16_basic_schematics}
|
||
\end{figure}
|
||
|
||
The physical layout of the ASUS KGPE-D16 is meticulously designed
|
||
to optimize airflow, cooling, and power distribution. All of this
|
||
is critical for maintaining system stability, particularly under
|
||
heavy computational loads, as this board was designed for server
|
||
operations. In particular, key components such as the CPU sockets,
|
||
memory slots, and PCIe slots are strategically positioned. \\
|
||
|
||
\begin{figure}[H]
|
||
\centering
|
||
\includegraphics[width=0.8\textwidth]{images/kgpe-d16_real.png}
|
||
\caption{The KGPE-D16, viewed from the top (CC BY-SA 4.0, 2024)}
|
||
\label{fig:d16_top_view}
|
||
\end{figure}
|
||
|
||
\section{Chipset}
|
||
|
||
Before diving into the specific components, it is essential
|
||
to understand the roles of the northbridge and southbridge in
|
||
traditional motherboard architecture. These chipsets historically
|
||
managed communication between the CPU and other critical components
|
||
of the system \cite{amd_chipsets}. \\
|
||
|
||
The northbridge is a chipset on the motherboard that traditionally
|
||
manages high-speed communication between the CPU, memory (RAM), and
|
||
graphics card (if applicable). It serves as a hub for data that needs
|
||
to move quickly between these components. On the ASUS KGPE-D16, the
|
||
functions typically associated with the northbridge are divided between
|
||
the CPU’s internal northbridge and an external SR5690 northbridge
|
||
chip. The SR5690 specifically acts as a translator and switch,
|
||
handling the HyperTransport interface, a high-speed communication
|
||
protocol used by AMD processors, and converting it to ALink and PCIe
|
||
interfaces, which are crucial for connecting peripherals like graphics
|
||
cards \cite{SR5690BDG}. Additionally, the northbridge on the KGPE-D16
|
||
incorporates the IOMMU (Input-Output Memory Management Unit), which
|
||
is crucial for ensuring secure and efficient memory access by I/O
|
||
devices. The IOMMU allows for the virtualization of memory addresses,
|
||
providing device isolation and preventing unauthorized memory access,
|
||
which is particularly important in environments that run multiple
|
||
virtual machines \cite{amd_chipsets}\cite{northbridge_wiki}. \\
|
||
|
||
The southbridge, on the other hand, is responsible for handling
|
||
lower-speed, peripheral interfaces such as the PCI, USB, and
|
||
IDE/SATA connections, as well as managing onboard audio and
|
||
network controllers. On the KGPE-D16, these functions are managed
|
||
by the SP5100 southbridge chip, which integrates several critical
|
||
functions including the LPC bridge, SATA controllers, and other
|
||
essential I/O operations \cite{amd_chipsets}\cite{southbridge_wiki}.
|
||
It is essentially an ALink bus controller and includes the hardware
|
||
interrupt controller, the IOAPIC. Interrupts from peripheral always
|
||
pass through the northbridge (fig. \ref{fig:d16_ioapic}), since it
|
||
translates ALink to HyperTransport for the CPUs and contains the
|
||
IOMMU \cite{SR5690BDG}. \\
|
||
|
||
\begin{figure}[H]
|
||
\centering \includegraphics[width=0.9\textwidth]{images/ioapic.png}
|
||
\caption{Functional diagram presenting the IOAPIC function of
|
||
the SP5100,
|
||
ASUS (2011)}
|
||
\label{fig:d16_ioapic}
|
||
\end{figure}
|
||
|
||
In addition to the northbridge and southbridge, the KGPE-D16 also
|
||
contains specialized chips for managing input/output operations and
|
||
system health monitoring. The WINBOND W83667HG-A Super I/O chip handles
|
||
traditional I/O functions such as legacy serial and parallel ports,
|
||
keyboard, and mouse interfaces, but also the SPI chip (Serial Peripheral
|
||
Interface, a synchronous serial communication protocol primarily used
|
||
to communicate between microcontrollers and peripheral devices like
|
||
sensors or memory devices) that contains the firmware \cite{winbond}.
|
||
Meanwhile, the Nuvoton W83795G/ADG Hardware
|
||
Monitor oversees the system’s health by monitoring temperatures,
|
||
voltages, and fan speeds, ensuring that the system operates within
|
||
safe parameters \cite{nuvoton}. On the KGPE-D16, access to the Super
|
||
I/O from a CPU core is done through the SR5690, then the SP5100,
|
||
as that can be observed on the functional diagram of the chipset
|
||
(fig. \ref{fig:d16_chipset}) \cite{SR5690BDG}.
|
||
|
||
\begin{figure}[H]
|
||
\centering
|
||
\includegraphics[width=0.8\textwidth]{images/fig2_diagramme_chipset.png}
|
||
\caption{Functional diagram of the KGPE-D16 chipset (CC BY-SA 4.0,
|
||
2024)} \label{fig:d16_chipset}
|
||
\end{figure}
|
||
|
||
\section{Processors}
|
||
|
||
The ASUS KGPE-D16 supports AMD Family 10h processors, but
|
||
it is important to note that Vikings, a known vendor for
|
||
libre-software-compatible hardware, does not recommend using the
|
||
Opteron 6100 series due to the lack of IOMMU support, which is
|
||
critical for security. Fortunately, AMD Family 15h processors are also
|
||
supported. However, the Opteron 6300 series, while supported, requires
|
||
proprietary microcode updates for stability, IOMMU functionality,
|
||
and fixes for specific vulnerabilities, including a gain-root-
|
||
via-NMI exploit. The Opteron 6200 series does not suffer from these
|
||
problems and works properly without any proprietary microcode update
|
||
needed \cite{vikings}. \\
|
||
|
||
\begin{figure}[H]
|
||
\centering
|
||
\includegraphics[width=0.9\textwidth]{images/opteron6200_annoté.png}
|
||
\caption{Annotated photography of an Opteron 6200 series
|
||
CPU (2024), from a photography by AMD Inc. (2008)}
|
||
\label{fig:opteron2600}
|
||
\end{figure}
|
||
|
||
The Opteron 6200 series, part of the Bulldozer microarchitecture,
|
||
was designed to target high-performance server applications. These
|
||
processors feature 16 cores, organized into 8 Bulldozer modules,
|
||
with each module containing two integer cores that shared
|
||
resources like the floating-point unit (FPU) and L2 cache
|
||
(fig. \ref{fig:opteron2600}, \ref{fig:opteron2600_diagram})
|
||
\cite{amd_6200}\cite{anandtech_bulldozer}.
|
||
|
||
The architecture of the Opteron 6200 series is built around AMD's
|
||
Bulldozer core design, which uses Clustered Multithreading (CMT) to
|
||
maximize resource utilization. This is a technique where each processor
|
||
module contains two integer cores that share certain resources like
|
||
the floating-point unit (FPU), L2 cache, and instruction fetch/decode
|
||
stages. Unlike traditional multithreading, where each core handles
|
||
multiple threads, CMT allows two cores to share resources to improve
|
||
parallel processing efficiency. This approach aims to balance
|
||
performance and resource usage, particularly in multi- threaded
|
||
workloads, though it can lead to some performance trade-offs in
|
||
single-threaded tasks. In the Opteron 6272, the processor consists
|
||
of eight modules, effectively creating 16 integer cores. Due to
|
||
the CMT architecture, each Opteron 6272 chip functions as two CPUs
|
||
within a single processor, each with its own set of cores, L2 caches,
|
||
and shared L3 cache. Here, one CPU is made by four modules, each
|
||
module in it sharing certain components, such as the FPU and L2 cache,
|
||
between two integer cores. The L3 cache is shared across these modules.
|
||
HyperTransport links provide high-speed communication between the two
|
||
sockets of the KGPE-D16. Shared L3 cache and direct memory access are
|
||
provided by each socket \cite{amd_6200}\cite{hill_impact_caching}. \\
|
||
|
||
This architecture also integrates a quad-channel DDR3 memory
|
||
controller directly into the processor die, which facilitates high
|
||
bandwidth and low latency access to memory. This memory controller
|
||
supports DDR3 memory speeds up to 1600 MHz and connects directly
|
||
to the memory modules via the memory bus. By integrating the memory
|
||
controller into the processor, the Opteron 6200 series reduces memory
|
||
access latency, enhancing overall performance
|
||
\cite{amd_6200}\cite{amd_ddr3_guide}.
|
||
It is interesting to note that Opterons
|
||
incorporate the internal northbridge that we cited previously. The
|
||
traditional northbridge functions, such as memory controller and PCIe
|
||
interface management, are partially integrated into the processor. This
|
||
integration reduces the distance data must travel between the CPU and
|
||
memory, decreasing latency and improving performance, particularly
|
||
in memory-intensive applications \cite{amd_6200}. \\
|
||
|
||
|
||
\begin{figure}[H]
|
||
\centering \includegraphics[width=0.8\textwidth]{
|
||
images/fig3_img_dual_processor_node.png}
|
||
\caption{Functional diagram of an Opteron 6200 package
|
||
(CC BY-SA 4.0, 2024)}
|
||
\label{fig:opteron2600_diagram}
|
||
\end{figure}
|
||
|
||
Power efficiency was a key focus in the design of the Opteron 6200
|
||
series. Despite the high core count, the processor includes several
|
||
power management features, such as Dynamic Power Management (DPM)
|
||
and Turbo Core technology. These features allow the processor to
|
||
adjust power usage based on workload demands, balancing performance
|
||
with energy consumption. However, the Bulldozer architecture's
|
||
focus on high clock speeds and multi-threaded performance resulted
|
||
in higher power consumption compared to competing architectures
|
||
\cite{anandtech_bulldozer}. A special model of the series, called
|
||
\textit{high efficiency} models, solve a bit this problem by proposing
|
||
a bit less performant processor but with a power consumption divided
|
||
by a factor from 1.5 to 2.0 in some cases. \\
|
||
|
||
The processor connected to the I/O hub is known as the Bootstrap
|
||
Processor (BSP). The BSP is responsible for starting up the system
|
||
by executing the initial firmware code from the reset vector,
|
||
a specific memory address where the CPU begins execution after a
|
||
reset \cite{amd_bsp}. Core 0 of the BSP, called the Bootstrap Core
|
||
(BSC), initiates this process. During early initialization, the
|
||
BSP performs several critical tasks, such as memory initialization,
|
||
and bringing other CPU cores online. One of its duties is storing
|
||
Built-In Self-Test (BIST) information, which involves checking the
|
||
integrity of the processor's internal components to ensure they are
|
||
functioning correctly. The BSP also determines the type of reset
|
||
that has occurred whether it's a cold reset, which happens when
|
||
the system is powered on from an off state, or a warm reset, which
|
||
is a restart without turning off the power. Identifying the reset
|
||
type is crucial for deciding which initialization procedures need
|
||
to be executed \cite{amd_bsp}\cite{BKDG}.
|
||
|
||
\section{Baseboard Management Controller}
|
||
|
||
The Baseboard Management Controller (BMC) on the KGPE-D16 motherboard,
|
||
specifically the ASpeed AST2050, plays a role in the server's
|
||
architecture by managing out-of-band communication and control of
|
||
the hardware. The AST2050 is based on an ARM926EJ-S processor,
|
||
a low-power 32-bit ARM architecture designed for embedded systems
|
||
\cite{ast2050_architecture}. This architecture is well-suited for BMCs
|
||
due to its efficiency and capability to handle multiple management
|
||
tasks concurrently without significant resource demands from the
|
||
main system. \\
|
||
|
||
The AST2050 features several key components that contribute to
|
||
its functionality. It includes an integrated VGA controller,
|
||
which enables remote graphical management through KVM-over-IP
|
||
(Keyboard, Video, Mouse), a critical feature for administrators who
|
||
need to interact with the system remotely, including BIOS updates
|
||
and troubleshooting \cite{ast2050_kvm}. Additionally, the AST2050
|
||
integrates a dedicated memory controller, which supports up to 256MB
|
||
of DDR2 RAM. This allows it to handle complex tasks and maintain
|
||
responsiveness during management operations \cite{ast2050_memory}.
|
||
The BMC also features a network interface controller (NIC) dedicated to
|
||
management traffic, ensuring that remote management does not interfere
|
||
with the primary network traffic of the server. This separation is
|
||
vital for maintaining secure and uninterrupted system management,
|
||
especially in environments where uptime is critical \cite{ast2050_nic}.
|
||
Another important architectural aspect of the AST2050 is its support
|
||
for multiple I/O interfaces, including I2C, GPIO, UART, and USB,
|
||
which allow it to interface with various sensors and peripherals
|
||
on the motherboard \cite{ast2050_io}. This versatility enables
|
||
comprehensive monitoring of hardware health, such as temperature
|
||
sensors, fan speeds, and power supplies, all of which can be managed
|
||
and configured through the BMC. \\
|
||
|
||
When combined with OpenBMC \cite{openbmc_wiki}, a libre firmware
|
||
that can be run on the AST2050 thanks to Raptor Engineering
|
||
\cite{raptor_engineering}, the architecture of the BMC becomes even
|
||
more powerful. OpenBMC takes advantage of the AST2050's architecture,
|
||
providing a flexible and customizable environment that can be tailored
|
||
to specific use cases. This includes adding or modifying features
|
||
related to security, logging, and network management, all within
|
||
the BMC's ARM architecture framework \cite{openbmc_customization}.
|
||
|
||
% ------------------------------------------------------------------------------
|
||
% CHAPTER 3: Key components in modern firmware
|
||
% ------------------------------------------------------------------------------
|
||
\chapter{Key components in modern firmware}
|
||
|
||
\section{General structure of coreboot}
|
||
|
||
The firmware of the ASUS KGPE-D16 is crucial in ensuring the proper
|
||
functioning and optimization of the mainboard's hardware components.
|
||
In this chapter and for the rest of this document, we're basing our
|
||
study on the 4.11 version of \textit{coreboot} \cite{coreboot_4_11},
|
||
which is the last version that supported the ASUS KGPE-D16 mainboard. \\
|
||
|
||
For the initialization tasks to be done efficiently, \textit{coreboot} is
|
||
organized in different stages (fig. \ref{fig:coreboot_stages})
|
||
\cite{coreboot_docs}.
|
||
|
||
\begin{figure}[H]
|
||
\centering
|
||
\includegraphics[width=0.9\textwidth]{
|
||
images/fig9_coreboot_stages.png}
|
||
\caption{\textit{coreboot}'s stages timeline, by
|
||
\textit{coreboot} project (CC BY-SA 4.0, 2009)}
|
||
\label{fig:coreboot_stages}
|
||
\end{figure}
|
||
|
||
Being a complex project with ambitious goals, \textit{coreboot} decided
|
||
early on to establish an file-system-based architecture for its images
|
||
(also called ROMs). This special file-system is CBFS (which stands for
|
||
coreboot file system). The CBFS architecture consists of a binary image
|
||
that can be interpreted as a physical disk, referred to here as ROM. A
|
||
number of independent components, each with a header added to the data,
|
||
are located within the ROM. The components are nominally arranged
|
||
sequentially, although they are aligned along a predefined boundary
|
||
(fig. \ref{fig:coreboot_diagram}). \\
|
||
|
||
Each stage is compiled as a separate binary and inserted into the CBFS
|
||
with custom compression. The bootblock stage is usually not compressed,
|
||
while the ramstage and the payload are compressed with LZMA. Each stage
|
||
loads the next stage at a given address (possibly decompressing it in
|
||
the process). \\
|
||
|
||
Some stages are relocatable and can be placed anywhere in the RAM.
|
||
These stages are typically cached in the CBMEM for faster loading times
|
||
during wake-up. The CBMEM is a specific memory area used by the
|
||
\textit{coreboot} firmware to store important data structures and logs
|
||
during the boot process. This area is typically allocated in the
|
||
system's RAM and is used to store various types of runtime information
|
||
that it might need to reference after the initial boot stages. \\
|
||
|
||
In general, \textit{coreboot} manages main memory through a structured
|
||
memory map (fig. \ref{tab:memmap}), allocating specific address ranges
|
||
for various hardware functions and system operations. The first 640KB
|
||
of memory space is typically unused by coreboot due to historical
|
||
reasons. Graphics-related operations use the VGA address range
|
||
and the text mode address ranges. It also reserves the higher for
|
||
operating system use, ensuring that critical system components
|
||
like the IOAPIC and TPM registers have dedicated address spaces.
|
||
This structured approach helps maintain system stability and
|
||
compatibility across different platforms and allows for a reset vector
|
||
fixed at an address (\texttt{0xFFFFFFF0}), regardless of the ROM size.
|
||
Payloads are typically loaded into high memory, above the reserved areas
|
||
for hardware components and system resources. The exact memory location
|
||
can vary depending on the system's configuration, but generally,
|
||
payloads are placed in a region of memory that does not conflict with
|
||
the firmware code or the reserved memory map areas, such as the ROM
|
||
mapping ranges. This placement ensures that payloads have sufficient
|
||
space to execute without interfering with other critical memory regions
|
||
allocated \cite{coreboot_mem_management}.
|
||
|
||
\begin{table}[ht]
|
||
\makebox[\textwidth][c]{%
|
||
\begin{tabular}{
|
||
|>{\centering\arraybackslash}p{0.35\textwidth}
|
||
|>{\centering\arraybackslash}p{0.5\textwidth}|}
|
||
\hline
|
||
\path{0x00000 - 0x9FFFF}
|
||
& Low memory (first 640KB). Never used. \\
|
||
\hline
|
||
\path{0xA0000 - 0xAFFFF}
|
||
& VGA graphics address range. \\
|
||
\hline
|
||
\path{0xB0000 - 0xB7FFF}
|
||
& Monochrome text mode address range.
|
||
Few motherboards use
|
||
it, but the KGPE-D16 does. \\
|
||
\hline
|
||
\path{0xB8000 - 0xBFFFF}
|
||
& Text mode address range. \\
|
||
\hline
|
||
\path{0xFEC00000}
|
||
& IOAPIC address. \\
|
||
\hline
|
||
\path{0xFED44000 - 0xFED4FFFF}
|
||
& Address range for TPM registers. \\
|
||
\hline
|
||
\path{0xFF000000 - 0xFFFFFFFF}
|
||
& 16 MB ROM mapping address range. \\
|
||
\hline
|
||
\path{0xFF800000 - 0xFFFFFFFF}
|
||
& 8 MB ROM mapping address range. \\
|
||
\hline
|
||
\path{0xFFC00000 - 0xFFFFFFFF}
|
||
& 4 MB ROM mapping address range. \\
|
||
\hline
|
||
\path{0xFEC00000 - DEVICE MEM HIGH}
|
||
& Reserved area for OS use. \\
|
||
\hline
|
||
\end{tabular}}
|
||
\caption{\textit{coreboot} memory map}
|
||
\label{tab:memmap}
|
||
\end{table}
|
||
|
||
\subsection{Bootblock}
|
||
|
||
The bootblock is the first stage executed after the CPU reset. The
|
||
beginning of this stage is written in assembly language, and its
|
||
main task is to set everything up for a C environment. The rest, of
|
||
course, is written in C. This stage occupies the last 20k
|
||
(fig. \ref{fig:coreboot_diagram}) of the image and within it is a
|
||
main header containing information about the ROM, including the
|
||
size, component alignment, and the offset of the start of the first
|
||
CBFS component. This block is a mandatory component as it also
|
||
contains the entry point of the firmware. \\
|
||
|
||
\begin{figure}[H]
|
||
\centering
|
||
\includegraphics[width=0.8\textwidth]{images/fig8_coreboot_architecture.png}
|
||
\caption{\textit{coreboot} ROM architecture
|
||
(CC BY-SA 4.0, 2024)}
|
||
\label{fig:coreboot_diagram}
|
||
\end{figure}
|
||
|
||
Upon startup, the first responsibility of the bootblock is to
|
||
execute the code from the reset vector located at the conventional
|
||
reset vector in 16-bit real mode. This code is specific to the
|
||
processor architecture and, for our board, is stored in the
|
||
architecture-specific sources for x86 within \textit{coreboot}.
|
||
The entry point into \textit{coreboot} code is defined in two files
|
||
in the \path{src/cpu/x86/16bit/} directory: \path{reset16.inc}
|
||
and \path{entry16.inc}. The first file serves as a jump to the
|
||
\path{_start16bit} procedure defined in the second. Due to space
|
||
constraints this function must remain below the 1MB address space
|
||
because the IOMMU has not yet been configured to allow anything
|
||
else. \\
|
||
|
||
During this early initialization, the Bootstrap Core (BSC) performs
|
||
several critical tasks while the other cores remain dormant. These
|
||
tasks include saving the results (and displaying them if necessary)
|
||
of the Built-in Self-Test (BIST), formerly known as POST;
|
||
invalidating the TLB to prevent any address translation errors;
|
||
determining the type of reset (e.g., cold start or warm start);
|
||
creating and loading an empty Interrupt Descriptor Table (IDT) to
|
||
prevent the use of "legacy" interrupts from real mode until
|
||
protected mode is reached. In practice, this means that at the
|
||
slightest exception, the BSC will halt. The code then switches to
|
||
32-bit protected mode by mapping the first 4 GB of address space for
|
||
code and data, and finally jumps to the 32-bit reset code labeled
|
||
\path{_protected_start}. \\
|
||
|
||
|
||
Once in protected mode, which constitutes the "normal" operating
|
||
mode for the processor, the next step is to set up the execution
|
||
environment. To achieve this, the code contained in
|
||
\path{src/cpu/x86/32bit/entry32.inc}, followed by
|
||
\path{src/cpu/x86/64bit/entry64.inc}, and finally
|
||
\path{src/arch/x86/bootblock_crt0.S}, establishes a temporary
|
||
stack, transitions to long mode (64-bit addressing) with paging
|
||
enabled, and sets up a proper exception vector table. The execution
|
||
then jumps to chipset-specific code via the
|
||
\path{bootblock_pre_c_entry} procedure.
|
||
Once these steps are completed, the bootblock has a minimal C
|
||
environment. The procedure now involves allocating
|
||
memory for the BSS, and decompressing and loading the next stage. \\
|
||
|
||
The jump to \path{_bootblock_pre_entry} leads to the code files
|
||
\path{src/soc/amd/common/block/cpu/car/cache_as_ram.S} and
|
||
\path{src/vendorcode/amd/agesa/f15tn/gcccar.inc}, which are specific
|
||
to AMD chipsets. It's worth noting that these files were developed by
|
||
AMD's engineers as part of the \textit{AGESA} project. The operations
|
||
performed at this stage are related to pre-RAM memory initialization.
|
||
All cores of all processors (up to a limit of 64 cores) are started.
|
||
The \textit{Cache-As-Ram} is configured using the
|
||
Memory-type range registers. These registers allow the
|
||
specification of a specific configuration for a given memory area
|
||
\cite{BKDG}.
|
||
In this case, the area that should correspond to physical memory is
|
||
mapped to the cache, while other areas, such as PCI or other bus
|
||
zones, are configured accordingly. A specific stack is set up for
|
||
each core of each processor (within the arbitrary limit of 64 cores
|
||
and 7 nodes, meaning 7 Core 0s). Core 0s receive 16KB, while the
|
||
Bootstrap Core (BSC) gets 64KB. The other cores receive 4KB each.
|
||
All cores except the BSC are halted and will restart during the
|
||
romstage. Finally, the execution jumps to the entry point of the
|
||
\textit{bootblock} written in C, labeled
|
||
\path{bootblock_c_entry}.
|
||
This entry point is located in
|
||
\path{src/soc/amd/stoneyridge/bootblock/bootblock.c} and is
|
||
specific to AMD processors. It is the first C routine executed, and
|
||
its role is to verify that the current processor is indeed the BSC,
|
||
allowing the function \path{bootblock_main_with_basetime}
|
||
to be called exclusively by the BSC. \\
|
||
|
||
We are now in the file \path{src/lib/bootblock.c}, written by
|
||
Google's team, and entering the
|
||
\path{bootblock_main_with_basetime} function, which immediately
|
||
calls \path{bootblock_main_with_timestamp}. At this stage, the
|
||
goal is to start the romstage, but a few more tasks need to be
|
||
completed.
|
||
|
||
The \path{bootblock_soc_early_init} function is called to
|
||
initialize the I2C bus of the southbridge. The
|
||
\path{bootblock_fch_early_init} function is invoked to
|
||
initialize the SPI buses (Serial Peripheral Interface,
|
||
allowing to access the chip that contains the ROM) and the
|
||
serial and "legacy" buses of the southbridge. The CMOS clock is then
|
||
initialized, followed by the pre-initialization of the serial
|
||
console.
|
||
The code then calls the \path{bootblock_mainboard_init}
|
||
function, which enters, for the first time, the files specific to
|
||
the ASUS KGPE-D16 motherboard:
|
||
\path{src/mainboard/ASUS/kgpe-d16/bootblock.c}.
|
||
This code performs the northbridge initialization via the
|
||
\path{bootblock_northbridge_init} function found in
|
||
\path{src/northbridge/amd/amdfam10/bootblock.c}. This involves
|
||
locating the HyperTransport bus and enabling the discovery of
|
||
devices connected to it (e.g., processors). The southbridge is
|
||
initialized using the \path{bootblock_southbridge_init}
|
||
function from \path{src/southbridge/amd/sb700/bootblock.c}.
|
||
This function, largely programmed by Timothy Pearson from Raptor
|
||
Engineering, who performed the first coreboot port for the ASUS
|
||
KGPE-D16, finalizes the activation of the SPI bus and the connection
|
||
to the ROM memory via SuperIO. The state of a recovery jumper is
|
||
then checked (this jumper is intended to reset the CMOS content,
|
||
although it is not fully functional at the moment, as indicated by
|
||
the \path{FIXME} comment in the code). Control then returns to
|
||
\path{bootblock_main} in \path{src/lib/bootblock.c}. \\
|
||
|
||
At this point, everything is ready to enter the romstage.
|
||
\textit{coreboot} has successfully started and can now continue its
|
||
execution by calling the \path{run_romstage} function from
|
||
\path{src/lib/prog_loaders.c}. This function begins by locating
|
||
the corresponding segment in the ROM via the southbridge and SPI
|
||
bus using \path{prog_locate}, which utilizes the SPI driver in
|
||
\path{src/drivers/cbfs_spi.c}. The contents of the romstage are
|
||
then copied into the cache-as-ram by
|
||
\path{cbfs_prog_stage_load}. Finally, the \path{prog_run}
|
||
function transitions to the romstage after switching back to
|
||
32-bit mode.
|
||
|
||
\subsection{Romstage}
|
||
|
||
The \textit{romstage} in \textit{coreboot} serves the critical function
|
||
of early initialization of peripherals, particularly system memory.
|
||
This stage is crucial for setting up the necessary components for the
|
||
platform's operation, ensuring that everything is in place for
|
||
subsequent stages of the boot process.
|
||
During this phase, \textit{coreboot} configures the Advanced
|
||
Programmable Interrupt Controller (APIC), which is responsible for
|
||
correctly handling interrupts across multiple CPUs, especially in
|
||
systems using Symmetric Multiprocessing (SMP). This includes setting
|
||
up the Local APIC on each processor and the IOAPIC, part of the
|
||
southbridge, to ensure that interrupts from peripherals are routed
|
||
to the appropriate CPUs. Additionally, the firmware configures the
|
||
HyperTransport (HT) technology, a high-speed communication protocol
|
||
that facilitates data exchange between the processor and the
|
||
northbridge, ensuring smooth data flow between these components.
|
||
During this stage, microcode patches may be loaded into CPU and
|
||
remain resident, settings related to memory controllers and CPU
|
||
too. \\
|
||
|
||
The \textit{romstage} begins with a call to the
|
||
\path{_start} function, defined in
|
||
\path{src/cpu/x86/32bit/entry32.inc} via
|
||
\path{src/arch/x86/assembly_entry.S}. We then enter the
|
||
\path{cache_as_ram_setup} procedure, written in assembly
|
||
language, located in \path{src/cpu/amd/car/cache_as_ram.inc}. This
|
||
procedure configures the cache to load the future \textit{ramstage}
|
||
and initialize memory based on the number of processors and cores
|
||
present. Once this is completed, the code calls
|
||
\path{cache_as_ram_main} in
|
||
\path{src/mainboard/asus/kgpe-d16/romstage.c}, which serves as the
|
||
main function of the \textit{romstage}.
|
||
In the \path{cache_as_ram_main} function, after reducing the
|
||
speed of the HyperTransport bus, only the Bootstrap Core (BSC)
|
||
initializes the spinlocks for the serial console, the CMOS storage
|
||
memory (used for saving parameters), and the ROM. At this point, the
|
||
HyperTransport bus is enumerated, and the PCI bridges are
|
||
temporarily disabled. The port 0x80 of the southbridge, used for
|
||
motherboard debugging with \textit{Post Codes}, is also initialized.
|
||
These codes indicate the status of the boot process and can be
|
||
displayed using special PCI cards connected to the system. The
|
||
SuperIO is then initialized to activate the serial port, allowing
|
||
the serial console to follow \textit{coreboot}’s progress in
|
||
real-time. If everything proceeds as expected, the code 0x30 is
|
||
sent, and the boot process continues. \\
|
||
|
||
If the result of the Built-in Self-Test (BIST), saved during the
|
||
\textit{bootblock}, shows no anomalies, all cores of all nodes are
|
||
configured, and they are placed back into sleep mode (except for the
|
||
Core 0s). If everything goes well, the code 0x32 is sent, and the
|
||
process continues. Using the \path{enable_sr5650_dev8} function,
|
||
the southbridge’s P2P bridge is activated. Additionally, a check is
|
||
performed to ensure that the number of physical processors detected
|
||
does not exceed the number of sockets available on the board. If any
|
||
issues were detected during the BIST, the machine will halt, and the
|
||
error will be displayed on the console. Otherwise, the process
|
||
continues, and the default hardware information table is
|
||
constructed, and the microcode of the physical processors is updated
|
||
if necessary. If everything proceeds correctly, the code 0x33 and
|
||
then 0x34 is sent, and the process continues. The information about
|
||
the physical processors is retrieved using \path{amd_ht_init},
|
||
and communication between the two sockets is configured via
|
||
\path{amd_ht_fixup}. This process includes disabling any
|
||
defective HT links (one per socket in this AMD Family 15h chipset).
|
||
If everything is working as expected, the code 0x35 is sent, and
|
||
the boot process continues.
|
||
With the \path{finalize_node_setup} function, the PCI bus is
|
||
initialized, and a mapping is created
|
||
(\path{setup_mb_resource_map}). If all goes well, the code 0x36
|
||
is sent. This is done in parallel across all Core 0s, so the system
|
||
waits for all cores to finish using the
|
||
\path{wait_all_core0_started} function. The communication
|
||
between the northbridge and southbridge is prepared using
|
||
\path{sr5650_early_setup} and
|
||
\path{sb7xx_51xx_early_setup}, followed by the activation of
|
||
all cores on all nodes, with the system waiting for all cores to be
|
||
fully initialized. If everything is successful, the code 0x38 is
|
||
sent. \\
|
||
|
||
At this point, the timer is activated, and a warm reset is performed
|
||
via the \path{soft_reset} function to validate all configuration
|
||
changes to the HT, PCI buses, and voltage/power settings of the
|
||
processors and buses. This results in a system reboot, passing again
|
||
through the \textit{bootblock}, but much faster this time since the
|
||
system recognizes the warm reset condition. Once this reboot is
|
||
complete, the HyperTransport bus is reconfigured into isochronous
|
||
mode (switching from asynchronous mode), finalizing the
|
||
configuration process. \\
|
||
|
||
Memory training and optimization are also key functions of the
|
||
firmware during the \textit{romstage}. This process involves
|
||
adjusting memory settings, such as timings, frequencies, and
|
||
voltages, to ensure that the installed memory modules operate
|
||
efficiently and stably. This step is crucial for achieving optimal
|
||
performance, especially when dealing with large amounts of RAM
|
||
and many CPU cores, as supported by the KGPE-D16. We'll see that
|
||
in detail during the next chapter. \\
|
||
|
||
After memory initialization, the process returns to the
|
||
\path{cache_as_ram_main} function, where a memory test is
|
||
performed. This involves writing predefined values to specific
|
||
memory locations and then verifying that the values can be read
|
||
back correctly.
|
||
If everything passes successfully, the CBMEM is initialized and
|
||
one sends code \path{0x41}. At this point, the configuration of
|
||
the PCI bus is prepared, which will be completed during the ramstage
|
||
by configuring the PCI bridges. The system then exits
|
||
\path{cache_as_ram_main} and returns to
|
||
\path{cache_as_ram_setup} to finalize the process.
|
||
|
||
\textit{coreboot} then transitions to the next stage, known as the
|
||
postcar stage, where it exits the cache-as-RAM mode and
|
||
begins using physical RAM.
|
||
|
||
\subsection{Ramstage}
|
||
|
||
The ramstage performs the general initialization of all peripherals,
|
||
including the initialization of PCI devices, on-chip devices, the
|
||
TPM (if not done by verstage), graphics (optional), and the CPU
|
||
(setting up the System Management Mode). After this initialization,
|
||
tables are written to inform the payload or operating system about
|
||
the existence and current state of the hardware. These tables
|
||
include ACPI tables (specific to x86), SMBIOS tables (specific to
|
||
x86), coreboot tables, and updates to the device tree (specific to
|
||
ARM). Additionally, the ramstage locks down the hardware and
|
||
firmware by applying write protection to boot media, locking
|
||
security-related registers, and locking SMM (specific to x86),
|
||
which is a resident component in a protected area.
|
||
\cite{coreboot_docs}. CBMEM data structures (like coreboot tables,
|
||
memory map, etc.) are populated during this stage and left resident
|
||
for the OS or payload to access, same for SMBIOS tables, ACPI tables
|
||
and eventually option ROMS. \\
|
||
|
||
Effective resource allocation is essential for system stability,
|
||
particularly in complex configurations involving multiple CPUs
|
||
and peripherals. This stage manages initial resource allocation,
|
||
resolving any conflicts between hardware components to prevent
|
||
resource contention and ensure smooth operation and security, which
|
||
is a major concern in modern systems. This includes support for
|
||
IOMMU, which is crucial for preventing unauthorized direct memory
|
||
access (DMA) attacks, particularly in virtualized environments
|
||
(however there are still vulnerabilities that can be exploited,
|
||
such as sub-page or IOTLB-based attacks or even configuration
|
||
weaknesses \cite{medeiros2017}\cite{markuze2021}). \\
|
||
|
||
\subsubsection{Advanced Configuration and Power Interface}
|
||
|
||
The Advanced Configuration and Power Interface (ACPI) is a
|
||
critical component of modern computing systems, providing an
|
||
open standard for device configuration and power management by
|
||
the operating system (OS). Developed in 1996 by Intel,
|
||
Microsoft, and Toshiba, ACPI replaced the older Advanced Power
|
||
Management (APM) standard with more advanced and flexible power
|
||
management capabilities \cite{intel_acpi_introduction_2023}.
|
||
At its core,
|
||
ACPI is implemented through a series of data structures and
|
||
executable code known as ACPI tables, which are provided by the
|
||
system firmware and interpreted by the OS at runtime. It means
|
||
that these are components from the firmware that remain resident
|
||
while the OS runs. These tables describe
|
||
various aspects of the system, including hardware resources,
|
||
device power states, and thermal zones. The ACPI Specification
|
||
outlines these structures and provides the necessary
|
||
standardization for interoperability across different platforms
|
||
and operating systems \cite{acpi_os_support}. These tables are
|
||
used by the OS to perform low-level task, including managing
|
||
power states of the CPU, controlling the voltage and frequency
|
||
scaling (also known as Dynamic Voltage and Frequency Scaling,
|
||
or DVFS), and coordinating power delivery to peripherals. \\
|
||
|
||
The ACPI Component Architecture (ACPICA) is the reference
|
||
implementation of ACPI, providing a common codebase that can be
|
||
used by OS developers to integrate ACPI support. ACPICA includes
|
||
tools and libraries that allow for the parsing and execution of
|
||
ACPI Machine Language (AML) code, which is embedded within the
|
||
ACPI tables \cite{acpi_programming}. One of the key tools in
|
||
ACPICA is the Intel ACPI Source Language (IASL) compiler, which
|
||
converts ACPI Source Language (ASL) code into AML bytecode,
|
||
allowing firmware developers to write custom ACPI
|
||
methods \cite{intel_acpi_spec}. The triggering of ACPI events is
|
||
managed through a combination of hardware signals and software
|
||
routines. For example, when a user presses the power button on a
|
||
system, an ACPI event is generated, which is then handled by the
|
||
OS. This event might trigger the system to enter a low-power
|
||
state, such as sleep or hibernation, depending on the
|
||
configuration provided by the ACPI tables
|
||
\cite{acpi_os_support}. These power states are defined in the
|
||
ACPI specification, with global states (G0 to G3) representing
|
||
different levels of system power consumption, and device states
|
||
(D0 to D3) representing individual device power levels. \\
|
||
|
||
The ASUS KGPE-D16 mainboard, which is designed for server and
|
||
high-performance computing environments, needs ACPI for managing
|
||
its power distribution across multiple CPUs and attached
|
||
peripherals. ACPI is integral in controlling the power states of
|
||
various components, thereby optimizing performance and energy
|
||
use. Additionally, the firmware on the KGPE-D16 uses ACPI tables
|
||
to manage system temperature and fan speed, ensuring reliable
|
||
operation under heavy workloads \cite{asus_kgpe_d16_manual}.
|
||
|
||
\subsubsection{System Management Mode}
|
||
|
||
System Management Mode (SMM) is a highly privileged operating
|
||
mode provided by x86 processors for handling system-level
|
||
functions such as power management, hardware control, and other
|
||
critical tasks that are to be isolated from the OS and
|
||
applications. Introduced by Intel, SMM operates in an
|
||
environment separate from the main operating system, offering a
|
||
controlled space for executing sensitive operations
|
||
\cite{uefi_smm_security}. This is another firmware component
|
||
that remains resident while the OS runs. \\
|
||
|
||
SMM is triggered by a System Management Interrupt (SMI), which
|
||
is a non-maskable interrupt that causes the CPU to save its
|
||
current state and switch to executing code stored in a protected
|
||
area of memory called System Management RAM (SMRAM). SMRAM is a
|
||
specialized memory region that is isolated from the rest of the
|
||
system, making it inaccessible to the OS and preventing
|
||
tampering or interference from other software
|
||
\cite{heasman2007}.
|
||
Within SMM, the firmware can execute various low-level functions
|
||
that require direct hardware control or need to be protected
|
||
from the OS. This includes tasks such as thermal management,
|
||
where the system monitors CPU temperature and adjusts
|
||
performance or power levels to prevent overheating, as well as
|
||
power management routines that enable efficient energy usage
|
||
by adjusting power states based on system activity
|
||
\cite{offsec_bios_smm}. One of the critical security features of
|
||
SMM is its role in managing firmware updates and handling
|
||
system-level security events. Because SMM operates in a
|
||
privileged mode that is isolated from the OS, it can
|
||
apply firmware updates and could respond to security threats
|
||
without being affected by potentially compromised system
|
||
software \cite{domas2015}. However, the high privilege level and
|
||
isolation of SMM also present significant security challenges.
|
||
If an attacker can compromise SMM, they gain full control over
|
||
the system, bypassing all security measures implemented by the
|
||
OS \cite{cyber_smm_hack}. Also, with a proprietary firmware,
|
||
it means that this code with a very high priviledge level
|
||
cannot be audited at all, nor even replaced. \\
|
||
|
||
The ASUS KGPE-D16 mainboard needs SMM to perform critical
|
||
management tasks that need to be done in parallel from the
|
||
operating system. For example, SMM is used to monitor and manage
|
||
system health by responding to thermal events and adjusting
|
||
power levels to maintain system stability. SMM operates
|
||
independently of the main operating system, allowing it to
|
||
perform sensitive tasks securely. \textit{coreboot}
|
||
supports SMM, but its implementation is typically
|
||
minimal compared to traditional proprietary firmware. In
|
||
\textit{coreboot}, SMM initialization involves setting
|
||
up the System Management Interrupt (SMI) handler and configuring
|
||
System Management RAM (SMRAM), the memory region where SMM code
|
||
executes\cite{brown2003linuxbios}. The extent of SMM support in
|
||
\textit{coreboot} can vary significantly depending on the
|
||
hardware platform and the specific requirements of the system.
|
||
\textit{coreboot}'s design philosophy emphasizes a lightweight
|
||
and fast boot process, delegating more complex management tasks
|
||
to payloads or the operating system itself
|
||
\cite{reinauer2008coreboot}.
|
||
|
||
One of the key challenges with implementing SMM in
|
||
\textit{coreboot} is ensuring that SMI handlers are configured
|
||
correctly to manage necessary system tasks without compromising
|
||
security or performance. \textit{coreboot}'s approach to SMM is
|
||
consistent with its overall goal of providing a streamlined and
|
||
efficient firmware solution, leaving more intricate
|
||
functionalities to be handled by subsequent software layers
|
||
\cite{mohr2012comparative}.
|
||
|
||
\subsection{Payload}
|
||
|
||
The payload is the software that executes after coreboot has
|
||
completed its initialization tasks. It resides in the CBFS and is
|
||
predetermined at compile time, with no option to choose it at
|
||
runtime. The primary role of the payload is to load and hand control
|
||
over to the operating system. In some cases, the payload itself can
|
||
be a component of the operating system \cite{coreboot_docs}.
|
||
Examples of payloads are \textit{GNU GRUB}, \textit{SeaBIOS},
|
||
\textit{memtest86+} or even sometimes the \textit{Linux kernel}
|
||
itself. \\
|
||
|
||
\textit{TianoCore}, a free implementation of the UEFI (Unified
|
||
Extensible Firmware Interface) specification is often used as a
|
||
payload \cite{tianocore_payload}.
|
||
It provides a UEFI environment after \textit{coreboot} has completed
|
||
its initial hardware initialization. This allows the system to
|
||
benefit from the advanced features of UEFI, such as a more flexible
|
||
boot manager, enhanced features, and support for modern
|
||
hardware. Indeed, UEFI, and by extension \textit{TianoCore},
|
||
includes a driver model that allows hardware manufacturers to
|
||
provide UEFI-compatible drivers. These drivers can be loaded at
|
||
boot time, allowing the firmware to support a wide range of modern
|
||
devices that \textit{coreboot}, with its more minimalistic and
|
||
custom-tailored approach, might not support out of the box.
|
||
For example, GOP drivers are responsible for setting up the
|
||
graphics hardware in UEFI environments. They replace the older VGA
|
||
BIOS routines used in legacy BIOS systems. With GOP drivers,
|
||
the system can initialize the GPU and display a graphical interface
|
||
even before the operating system loads \cite{osdev_gop}.
|
||
These are other examples of resident firmware components while the
|
||
OS is running. \\
|
||
Hardware manufacturers can distribute proprietary UEFI drivers as
|
||
part of firmware updates, making it straightforward for end-users
|
||
to install and use them. This is especially useful for specialized
|
||
hardware that requires specific drivers not included in the
|
||
free software community. It also gives hardware vendors more control
|
||
over how their devices are initialized and used, which can be
|
||
an advantage for vendors but is a freedom and user control
|
||
limitation.
|
||
|
||
Payloads are then definitely important parts of the firmware.
|
||
|
||
\section{AMD Platform Security Processor and Intel Management Engine}
|
||
|
||
The AMD Platform Security Processor (PSP) and Intel Management Engine
|
||
(ME) are embedded subsystems within AMD and Intel processors,
|
||
respectively, that handle a range of security-related tasks independent
|
||
of the main CPU. These subsystems are fundamental to the security
|
||
architecture of modern computing platforms, providing functions such as
|
||
secure boot, cryptographic key management, and remote system management
|
||
\cite{herrmann2017dissecting}.
|
||
|
||
The AMD PSP is based on an ARM Cortex-A5 processor and is responsible
|
||
for several security functions, including the validation of firmware
|
||
during boot (secure boot), management of Trusted Platform Module (TPM)
|
||
functions, and handling cryptographic operations such as key generation
|
||
and storage. The PSP operates independently of the main x86 cores,
|
||
which allows it to execute security functions even when the main system
|
||
is powered off or compromised by malware \cite{herrmann2017dissecting}.
|
||
The PSP's isolated environment ensures that sensitive operations are
|
||
protected from threats that could affect the main OS. \\
|
||
|
||
Similarly, the Intel Management Engine (ME) is a dedicated
|
||
processor embedded within Intel chipsets that operates
|
||
independently of the main CPU. The ME is a comprehensive subsystem that
|
||
provides a variety of functions, including out-of-band system
|
||
management, security enforcement, and support for Digital Rights
|
||
Management (DRM) \cite{intel_csme}. The ME's firmware runs on an
|
||
isolated environment that allows it to perform these tasks securely,
|
||
even when the system is powered off. This capability is crucial for
|
||
enterprise environments where administrators need to perform remote
|
||
diagnostics, updates, and security checks without relying on the main
|
||
OS.
|
||
Intel ME enforces Digital Rights Management (DRM)
|
||
through a multifaceted approach leveraging its deeply embedded,
|
||
hardware-based capabilities. At the core is the Protected
|
||
Execution Environment (PEE), which operates independently from the main
|
||
CPU and operating system. This isolation allows to privately
|
||
manage cryptographic keys, certificates, and other sensitive data
|
||
critical for DRM, which can be very problematic from a user freedom
|
||
perspective \cite{fsf_intel_me}. By handling encryption and decryption
|
||
processes within this protected environment, Intel ME ensures that
|
||
DRM-protected content, such as video streams, remains secure and
|
||
unreachable by the user, raising concerns about the control users have
|
||
over their own devices \cite{eff_intel_me}.
|
||
Intel ME also plays a significant role in maintaining platform
|
||
integrity through the secure boot process. During secure boot, Intel ME
|
||
ensures that only digitally signed and authorized operating systems and
|
||
applications are loaded, which can prevent users from installing
|
||
alternative or modified software on their own hardware, further
|
||
restricting their freedom \cite{uefi_what_is_uefi}. This is further
|
||
reinforced by Intel ME's remote attestation capabilities, where the
|
||
system’s state is reported to a remote server. This process verifies
|
||
that only systems meeting specific security standards dictated by third
|
||
parties are allowed to access DRM-protected content, potentially
|
||
limiting users' control over their own devices \cite{proprivacy_intel_me}.
|
||
Moreover, Intel ME supports High-bandwidth Digital Content Protection
|
||
(HDCP), a technology that restricts how digital content is transmitted
|
||
over interfaces like HDMI or DisplayPort. By enforcing HDCP, Intel ME
|
||
ensures that protected digital content, such as high-definition video,
|
||
is only transmitted to and displayed on authorized devices, effectively
|
||
preventing users from freely using the content they have legally
|
||
acquired \cite{phoronix_hdcp_2_2_i915}\cite{kernel_mei_hdcp}.
|
||
Together, these features enable Intel ME to provide a comprehensive and
|
||
robust DRM enforcement mechanism. However, this also means that users
|
||
have less control over their own hardware and digital content, raising
|
||
serious concerns about privacy, user autonomy, and the broader
|
||
implications for freedom in computing
|
||
\cite{fsf_intel_me}\cite{netgarage_intel_me}. \\
|
||
|
||
Added to that, Intel ME has been a source of controversy due to its deep
|
||
integration into the hardware and its potential to be exploited if
|
||
vulnerabilities are discovered. Researchers have demonstrated ways to
|
||
hack into the ME, potentially gaining control over a system even when
|
||
it is powered off \cite{blackhat_me_hack}. These concerns have led to
|
||
calls for greater transparency and security measures around the ME and
|
||
similar subsystems. When comparing Intel ME and AMD PSP, the primary
|
||
difference lies in their scope and functionality. Intel ME offers more
|
||
extensive remote management capabilities, making it a more comprehensive
|
||
tool for enterprise environments, while AMD PSP focuses more narrowly on
|
||
core security tasks. Nonetheless, both play critical roles in ensuring
|
||
the security and integrity of modern computing systems. \\
|
||
|
||
The ASUS KGPE-D16 mainboard does not include AMD PSP nor Intel ME.
|
||
|
||
% ------------------------------------------------------------------------------
|
||
% CHAPTER 4: Memory initialization and training
|
||
% ------------------------------------------------------------------------------
|
||
\chapter{Memory initialization and training}
|
||
|
||
\section{Importance of DDR3 memory initialization}
|
||
|
||
Memory modules are designed solely for storing data. The only valid
|
||
operations on a memory device are reading data stored in the device,
|
||
writing (or storing) data into the device, and refreshing the data.
|
||
Memory modules consist of large rectangular arrays of memory cells,
|
||
including circuits used to read and write data into the arrays, and
|
||
refresh circuits to maintain the integrity of the stored data. The
|
||
memory arrays are organized into rows and columns of memory cells,
|
||
known as word lines and bit lines, respectively. Each memory cell
|
||
has a unique location or address defined by the intersection of a
|
||
row and a column. A DRAM memory cell is a capacitor that is charged
|
||
to produce a 1 or a 0. \\
|
||
|
||
DDR3 (Double Data Rate Type 3) is a widely used type of
|
||
SDRAM (Synchronous Dynamic Random-Access Memory) that offers
|
||
significant performance improvements over its predecessors,
|
||
DDR and DDR2. A DDR3 DIMM module contains 240 contacts.
|
||
Key features of DDR3 include higher data rates,
|
||
lower power consumption, and increased memory capacity, making
|
||
it essential for high-performance computing environments
|
||
\cite{DDR3_wiki}. One of the critical aspects of DDR3 is its
|
||
internal architecture, which supports data rates ranging from
|
||
800 to 1600 Mbps and operates at a lower voltage of 1.5V. This
|
||
enables faster data processing and more efficient power usage,
|
||
crucial for modern applications that require high-speed memory
|
||
access \cite{samsung_ddr3}. Additionally, DDR3 memory modules are
|
||
available in larger capacities, allowing systems to handle larger
|
||
datasets and more complex computing tasks \cite{altera2008}.
|
||
However, the advanced features of DDR3 come with increased
|
||
complexity in its initialization and operation.
|
||
The DDR3 memory interface, used by the ASUS KGPE-D16, is
|
||
source-synchronous. Each memory module generates a Data Strobe
|
||
(DQS) pulse simultaneously with the data (DQ) it sends during
|
||
a memory read operation. Similarly, a DQS must be generated
|
||
with its DQ information when writing to memory. The DQS differs
|
||
between write and read operations. Specifically, the DQS generated
|
||
by the system for a write operation is centered in the data bit
|
||
period, while the DQS provided by the memory during a read operation
|
||
is aligned with the edge of the data period \cite{samsung_ddr3}. \\
|
||
|
||
Due to this edge alignment, the read DQS timing can be adjusted
|
||
to meet the setup and hold requirements of the registers capturing
|
||
the read data. To improve timing margins or reduce simultaneous
|
||
switching noise in the system, the DDR3 memory interface also allows
|
||
various other timing parameters to be adjusted. If the system uses
|
||
dual-inline memory modules (DIMMs), as in our case, the interface
|
||
provides write leveling: a timing adjustment that compensates for
|
||
variations in signal travel time \cite{micron_ddr3}.
|
||
To reduce simultaneous switching noise, DIMM modules feature a
|
||
fly-by architecture for routing the address, command, and clock
|
||
signals, which causes command signals to reach the
|
||
different memory devices with a delay. The fly-by topology has a
|
||
"daisy-chain" structure with either very short stubs or no stubs
|
||
at all. This structure results in fewer branches and point-to-point
|
||
connections: everything originates from the controller, passing
|
||
through each module on the node, thereby increasing the throughput.
|
||
In this topology, signals are routed sequentially
|
||
from the memory controller to each DRAM chip, reducing signal
|
||
reflections and improving overall signal integrity.
|
||
It means that routing is done in the order of byte lane numbers,
|
||
and the data byte lanes are routed on the same layer. Routing can be
|
||
simplified by swapping data bits within a byte lane if necessary.
|
||
The fly-by topology contrasts with the dual-T topology
|
||
(fig. \ref{fig:fly-by}). This design is essential for maintaining
|
||
stability at the high speeds DDR3 operates at, but it also
|
||
introduces timing challenges, such as timing skew, that must be
|
||
carefully managed \cite{micron_ddr3}. \\
|
||
|
||
\begin{figure}[H]
|
||
\centering
|
||
\begin{minipage}[b]{0.45\textwidth}
|
||
\centering
|
||
\includegraphics[width=0.90\textwidth]{images/fly-by.png}
|
||
\end{minipage}%
|
||
\begin{minipage}[b]{0.45\textwidth}
|
||
\centering
|
||
\includegraphics[width=0.824\textwidth]{images/t.png}
|
||
\end{minipage}
|
||
\caption{DDR3 fly-by \textit{versus} T-topology
|
||
(CC BY-SA 4.0, 2021)}
|
||
\label{fig:fly-by}
|
||
\end{figure}
|
||
|
||
Proper memory initialization ensures that the memory controller
|
||
and the memory modules are correctly configured to work together,
|
||
allowing for efficient data transfer and reliable operation. The
|
||
initialization process involves setting various parameters,
|
||
such as memory timings, voltages, and frequencies, which are
|
||
critical for ensuring that the memory operates within its optimal
|
||
range \cite{samsung_ddr3}. Failure to initialize DDR3 memory
|
||
correctly can lead to several serious consequences, including
|
||
system instability, data corruption, and reduced performance
|
||
\cite{SridharanVilas2015MEiM}. In the worst-case scenario, improper
|
||
memory initialization can prevent the system from booting entirely,
|
||
as the memory subsystem fails to function correctly.
|
||
In the context of the ASUS KGPE-D16, a server motherboard
|
||
designed for high-performance applications, proper DDR3 memory
|
||
initialization is particularly important. The KGPE-D16 supports
|
||
up to 256GB of DDR3 memory across 16 DIMM slots, and any issues
|
||
during memory initialization, if non-fatal, could severely impact
|
||
the system's ability to handle large datasets or maintain stable
|
||
operation under heavy workloads \cite{asus_kgpe_d16_manual}. Given
|
||
the critical role that memory plays in the overall performance of
|
||
the KGPE-D16, ensuring that DDR3 memory is correctly initialized
|
||
is essential for achieving the desired balance of performance,
|
||
reliability, and stability in demanding server environments.
|
||
|
||
\section{General steps for DDR3 configuration}
|
||
|
||
DDR3 memory initialization is a detailed and essential
|
||
process that ensures both the stability and performance of the
|
||
system. The process involves several critical steps: detection
|
||
and identification of memory modules, initial configuration of the
|
||
memory controller, adjustment of timing and voltage settings, and
|
||
the execution of training and calibration procedures. \\
|
||
|
||
The initialization begins with the detection and identification of
|
||
the installed memory modules. During the BIST, the firmware reads
|
||
the Serial Presence Detect (SPD) data stored on
|
||
each memory module. SPD data contains crucial information about
|
||
the memory module's specifications, including size, speed, CAS
|
||
latency (CL), RAS to CAS delay (tRCD), row precharge time (tRP),
|
||
and row cycle time (tRC). This data allows to configure
|
||
the memory controller for optimal compatibility and performance. \\
|
||
|
||
Indeed, once the memory modules have been identified, the firmware
|
||
proceeds to the initial configuration of the memory controller.
|
||
This controller is governed by a state machine that
|
||
manages the sequence of operations required to initialize,
|
||
maintain, and control memory access. This state machine consists of
|
||
multiple states that represent various phases of memory operation,
|
||
such as reset, initialization, calibration, and data transfer.
|
||
The transitions between these states are either automatic or
|
||
command-driven, depending on the specific requirements of each
|
||
phase \cite{samsung_ddr3}\cite{micron_ddr3}.
|
||
This state machine is presented in the
|
||
fig. \ref{fig:ddr3_state_machine}. Automatic transitions, depicted
|
||
by thick arrows in the automaton, occur without external
|
||
intervention. These typically include transitions that ensure
|
||
the memory enters a stable state, such as the transition from
|
||
power-on to initialization, or from calibration to idle states.
|
||
These transitions are crucial for maintaining the integrity and
|
||
stability of the memory system, as they ensure that the controller
|
||
progresses through necessary stages like ZQ calibration and write
|
||
leveling, which are essential for proper signal timing and
|
||
impedance matching
|
||
\cite{samsung_ddr3}\cite{micron_ddr3}\cite{burnett_ddr3}. \\
|
||
|
||
On the other hand, command-driven transitions, represented by normal
|
||
arrows in the automaton, require specific commands issued by the
|
||
memory controller or the CPU to advance to the next state. For
|
||
instance, the transition from the idle state to the data transfer
|
||
state requires explicit read or write commands. Similarly,
|
||
transitioning from the initialization state to the calibration
|
||
state involves issuing mode register set (MRS) commands that
|
||
configure the memory’s operating parameters. These command-driven
|
||
transitions are integral to the dynamic operation of the memory
|
||
system, allowing the controller to respond to the system's
|
||
operational needs and ensuring that memory accesses are performed
|
||
efficiently and accurately \cite{samsung_ddr3}\cite{micron_ddr3}. \\
|
||
|
||
The memory controller configuration
|
||
involves setting up fundamental parameters such as the memory clock
|
||
(MEMCLK) frequency and the memory channel configuration. The MEMCLK
|
||
frequency is derived from the SPD data, while the memory channels
|
||
are configured to operate in single, dual, or quad-channel modes,
|
||
depending on the system architecture and the installed modules
|
||
\cite{burnett_ddr3}. Proper configuration of the memory controller
|
||
is vital to ensure synchronization with the memory modules,
|
||
establishing a stable foundation for subsequent operations. \\
|
||
|
||
The first critical step, during the INIT phase involves the
|
||
adjustment of timing and voltage settings. These settings are
|
||
essential for ensuring that DDR3 memory operates efficiently and
|
||
reliably. Key timing parameters include CAS Latency (CL), RAS to
|
||
CAS Delay (tRCD), Row Precharge Time (tRP), and Row Cycle Time (tRC).
|
||
These parameters are finely tuned to balance speed and stability
|
||
\cite{samsung_ddr3}. The BIOS uses the SPD data to set these
|
||
parameters and may also adjust them dynamically to achieve the
|
||
best possible performance. Voltage settings, such as DRAM voltage
|
||
(typically 1.5V for DDR3) and termination voltage (VTT), are also
|
||
configured to maintain stable operation, especially under varying
|
||
conditions such as temperature fluctuations \cite{micron_ddr3}. \\
|
||
|
||
Training and calibration are among the most complex and crucial
|
||
stages of DDR3 memory initialization. The fly-by topology used
|
||
for address, command, and clock signals in DDR3 modules enhances
|
||
signal integrity by reducing the number of stubs and their lengths,
|
||
but it also introduces skew between the clock (CK) and data strobe
|
||
(DQS) signals \cite{micron_ddr3}. This skew must be compensated to
|
||
ensure that data is written and read correctly. The BIOS performs
|
||
write leveling, which adjusts the timing of DQS relative to CK
|
||
for each memory module. This process ensures that the memory
|
||
controller can write data accurately across all modules, even
|
||
when they exhibit slight variations in signal timing due to the
|
||
physical layout \cite{samsung_ddr3}. \\
|
||
|
||
\begin{figure}[H]
|
||
\centering
|
||
\begin{tikzpicture}[scale=0.6,
|
||
transform shape,
|
||
shorten >=1pt,
|
||
node distance=5cm and 5cm,
|
||
on grid,
|
||
auto]
|
||
% States
|
||
\node[state, initial] (reset) {RESET};
|
||
\node[draw=none,fill=none] (any) [below=2cm of reset] {ANY};
|
||
\node[state] (init) [right=of reset] {INIT};
|
||
\node[state] (zqcal) [below=of init] {ZQ Calibration};
|
||
\node[state, accepting] (idle) [right=of init] {IDLE};
|
||
\node[state] (writelevel) [above=of idle] {WRITE LEVELING};
|
||
\node[state] (refresh) [right=of idle] {REFRESH};
|
||
\node[state] (activation) [below=of idle] {ACTIVATION};
|
||
\node[state] (bankactive) [below=of activation] {BANK ACTIVE};
|
||
\node[state] (readop) [below right=of bankactive] {READ OP};
|
||
\node[state] (writeop) [below left=of bankactive] {WRITE OP};
|
||
\node[state] (prechrg) [below right=of readop] {PRE-CHARGING};
|
||
% Transitions
|
||
\path[->, line width=0.2mm, >=stealth]
|
||
(reset) edge node {} (init)
|
||
(idle) edge [bend left=20] node {} (writelevel)
|
||
edge [bend left=20] node {REF} (refresh)
|
||
edge node {} (activation)
|
||
edge [bend left=10] node {ZQCL/S} (zqcal)
|
||
(activation) edge node {} (bankactive)
|
||
(bankactive) edge [bend left=30] node {PRE} (prechrg)
|
||
edge [bend left=20] node {write} (writeop)
|
||
edge [bend right=20] node {read} (readop)
|
||
(writeop) edge [loop left] node {write} (writeop)
|
||
edge [bend left=10] node {read\_a} (readop)
|
||
edge [bend right=15] node {PRE} (prechrg)
|
||
(readop) edge [loop right] node {read} (readop)
|
||
edge [bend left=10] node {write\_a} (writeop)
|
||
edge [bend right=15] node {PRE} (prechrg);
|
||
% Thick transitions
|
||
\path[->, line width=0.5mm, >=stealth]
|
||
(any) edge node {} (reset)
|
||
(init) edge node {ZQCL} (zqcal)
|
||
(zqcal) edge [bend left=10] node {} (idle)
|
||
(writelevel) edge [bend left=20] node {MRS} (idle)
|
||
(refresh) edge [bend left=20] node {} (idle)
|
||
(writeop) edge node {} (prechrg)
|
||
edge [bend left=20] node {} (bankactive)
|
||
(readop) edge [bend left=15] node {} (prechrg)
|
||
edge [bend right=20] node {} (bankactive)
|
||
(prechrg) edge [bend right=20] node {} (idle);
|
||
\end{tikzpicture}
|
||
\caption{DDR3 controller state machine}
|
||
\label{fig:ddr3_state_machine}
|
||
\end{figure}
|
||
|
||
ZQ calibration is another vital procedure that adjusts the
|
||
output driver impedance and on-die termination (ODT) to match
|
||
the system’s characteristic impedance \cite{micron_ddr3}. This
|
||
calibration is critical for maintaining signal integrity under
|
||
different operating conditions, such as voltage and temperature
|
||
changes. During initialization, the memory controller issues a
|
||
ZQCL command to the DRAM modules, triggering the calibration
|
||
sequence that optimizes impedance settings.
|
||
This ensures that the memory system can
|
||
operate with tight timing tolerances, which is crucial for
|
||
systems requiring high reliability.
|
||
Read training is also essential to ensure that data read from
|
||
the memory modules is interpreted correctly by the memory
|
||
controller. This process involves adjusting the timing of the
|
||
read data strobe (DQS) to align perfectly with the data being
|
||
received. Proper read training is necessary for reliable data
|
||
retrieval, which directly impacts system performance and stability. \\
|
||
|
||
ZQCS (ZQ Calibration Short) however is a procedure used
|
||
to periodically adjust the DRAM's ODT and output driver impedance
|
||
during normal operation. Unlike the full ZQCL (ZQ Calibration Long),
|
||
which is performed during initial memory initialization, ZQCS is a
|
||
quicker, less comprehensive calibration that fine-tunes the
|
||
impedance settings in response to changes in temperature, voltage,
|
||
or other environmental factors. This helps maintain optimal signal
|
||
integrity and performance throughout the memory's operation without
|
||
the need for a full recalibration. \\
|
||
|
||
In summary, the DDR3 memory initialization process in systems
|
||
like the ASUS KGPE-D16 involves a series of detailed and
|
||
interdependent steps that are critical for ensuring system
|
||
stability and performance. These include the detection and
|
||
identification of memory modules, the initial configuration of
|
||
the memory controller, precise adjustments of timing and voltage
|
||
settings, and rigorous training and calibration procedures.
|
||
|
||
\section{Memory initialization techniques}
|
||
|
||
\subsection{Memory training algorithms}
|
||
|
||
Memory training algorithms are designed to fine-tune the
|
||
operational parameters of memory modules, such as timing, voltage,
|
||
and impedance. These algorithms play a crucial role in achieving
|
||
the optimal performance of DDR3 memory systems, particularly
|
||
in complex multi-core environments where synchronization
|
||
and timing are challenging. The primary algorithms used in
|
||
memory training include ZQ calibration and write leveling.
|
||
Optimizing timing and voltage settings is a critical aspect of
|
||
memory training. The memory controller adjusts parameters such as
|
||
CAS latency, RAS to CAS delay, and other timing characteristics
|
||
to ensure that data is read and written with minimal delay
|
||
and maximum accuracy. Voltage adjustments are also crucial,
|
||
as they help stabilize the operation of memory modules by
|
||
ensuring that the power supplied is within the optimal range,
|
||
compensating for any variations due to temperature or other factors
|
||
\cite{micron_ddr3}\cite{burnett_ddr3}\cite{gopikrishna2021novel}.
|
||
\\
|
||
|
||
ZQ calibration is a critical step in DDR3 memory initialization that
|
||
ensures the proper impedance matching of the output driver and
|
||
on-die termination (ODT) resistance. Impedance matching is crucial
|
||
for maintaining signal integrity by minimizing reflections and
|
||
ensuring reliable data transmission between the memory controller
|
||
and the DRAM modules. It is initiated by sending ZQCL (ZQ
|
||
Calibration Long) commands to the DDR3 DIMMs. Each ZQCL command
|
||
triggers a long calibration cycle within the DRAM module. The
|
||
purpose of this calibration is to adjust the output driver impedance
|
||
and the ODT resistance to match the specified target impedance. This
|
||
adjustment compensates for process variations, voltage fluctuations,
|
||
and temperature changes that can affect the impedance
|
||
characteristics of the DRAM module \cite{gopikrishna2021novel}. \\
|
||
|
||
A bit in the DRAM Controller
|
||
Timing register is set to 1 to send the ZQCL command, and an address
|
||
bit is also set to 1 to indicate that the ZQCL command should be
|
||
directed to the memory module. Upon receiving the ZQCL command, the
|
||
DRAM module begins the calibration process. This involves a series
|
||
of internal adjustments where the DRAM module measures its current
|
||
impedance and compares it against the target impedance. The module
|
||
then modifies its internal settings to reduce the difference between
|
||
the current and target impedance values
|
||
\cite{gopikrishna2021novel}\cite{samsung_ddr3}. This process is
|
||
iterative, meaning that it may require multiple adjustments to
|
||
converge on the optimal impedance settings. The calibration is
|
||
designed to ensure that the DRAM module's impedance remains within
|
||
a tight tolerance, which is critical for high-speed data
|
||
communication. The ZQ calibration process is time-sensitive. After
|
||
issuing the ZQCL command, the system must wait for 512 memory
|
||
clock cycles (MEMCLKs) to allow the calibration to complete.
|
||
This delay is necessary because the calibration involves both
|
||
measurement and adjustment phases, which require precise timing
|
||
to ensure accuracy \cite{gopikrishna2021novel}. If the system does
|
||
not wait the full 512 MEMCLKs, the calibration may be incomplete,
|
||
leading to suboptimal impedance matching and potential signal
|
||
integrity issues, such as reflections or noise on the data lines. \\
|
||
|
||
During the ZQ calibration, the DRAM module adjusts its output driver
|
||
impedance, which controls the strength of the signals it sends out.
|
||
The stronger the signal, the less susceptible it is to noise, but if
|
||
the impedance is too high or too low, it can cause signal distortion
|
||
or reflections. The ODT resistance is also calibrated to properly
|
||
terminate signals that reach the end of a data line. Proper
|
||
termination is essential to prevent signal reflections that could
|
||
interfere with the integrity of the data being transmitted. The ZQCL
|
||
command adjusts these settings by fine-tuning the resistance values
|
||
based on the module’s feedback, ensuring that the signal paths are
|
||
optimized for both transmission and termination. Once the ZQ
|
||
calibration is complete, the DCT register bit is reset to 0,
|
||
indicating that the calibration command has been processed. The
|
||
memory controller then verifies that the DRAM module has correctly
|
||
adjusted its impedance settings. This verification process may
|
||
involve additional test signals sent across the memory bus to
|
||
confirm that signal integrity meets the required standards. If the
|
||
calibration is successful, the memory subsystem is now properly
|
||
calibrated and ready for normal operation. In systems with LRDIMMs
|
||
or RDIMMs, additional steps may be necessary to ensure that all
|
||
ranks and channels are calibrated correctly, particularly in
|
||
multi-rank configurations where impedance matching can be more
|
||
complex. However, in systems with complex memory configurations,
|
||
such as those using multiple DIMMs per channel or operating at
|
||
higher memory frequencies, the ZQ calibration process becomes even
|
||
more critical. The calibration may need to be repeated at different
|
||
operating points to ensure that the memory subsystem remains stable
|
||
across all conditions. This could involve performing multiple ZQCL
|
||
calibrations at different memory frequencies, or under different
|
||
thermal conditions, to account for the dynamic nature of memory
|
||
operation in modern systems. \\
|
||
|
||
In seed-based algorithms, an initial "seed" value is used
|
||
as a reference point for the calibration process. The memory
|
||
controller iteratively adjusts the impedance based on feedback
|
||
from the memory module, refining the calibration with each
|
||
iteration. This method provides a more precise calibration,
|
||
particularly in systems where fine-tuned impedance matching is
|
||
critical for high-frequency operations \cite{kim2010design}.
|
||
Also, while seed-based methods can accelerate the convergence
|
||
of calibration, they require careful selection of initial seed
|
||
values to avoid suboptimal or even faulty impedance settings
|
||
\cite{gopikrishna2021novel}. \\
|
||
|
||
Write leveling is another critical aspect of memory training,
|
||
particularly in DDR3 systems that use a fly-by topology. It involves
|
||
using the physical layer (PHY) to detect the edge of the Data Strobe
|
||
(DQS) signal in synchronization with the clock (CK) signal on the
|
||
DIMM (Dual In-line Memory Module) during write access. The DQS
|
||
signal is a timing signal generated by the memory controller that
|
||
accompanies data (DQ) during read and write operations. For write
|
||
operations, the DQS signal must be perfectly aligned with the CK
|
||
signal to ensure that data is correctly written to memory cells.
|
||
Indeed, in systems using a fly-by topology, the DQS signal might
|
||
arrive at different times for different memory devices on the same
|
||
module due to the signal traveling through different lengths of
|
||
trace. Write leveling compensates for this skew by adjusting the
|
||
timing of the DQS signal relative to the CK signal for each lane
|
||
(a group of data lines) \cite{burnett_ddr3}. This training is
|
||
performed on a per-channel and per-DIMM basis, ensuring that each
|
||
memory module is correctly synchronized with the memory controller,
|
||
minimizing timing mismatches that could lead to data corruption. \\
|
||
|
||
Write leveling implies to perform a DQS position training, a
|
||
specific form of training focused on aligning the DQS signal with
|
||
the data (DQ) signals during write operations. In this process,
|
||
the memory controller adjusts the phase of the DQS signal to ensure
|
||
that it is correctly aligned with the data signals across all data
|
||
lanes, centering the DQS signal within the "data eye" for optimal
|
||
timing. This ensures that all data bits are written correctly and
|
||
consistently across the memory module, reducing the risk of timing
|
||
errors and data corruption. Additionally, DQS receiver training is
|
||
also needed to ensure that the memory controller can correctly
|
||
capture the DQS signal during read operations
|
||
\cite{micron_ddr3}.
|
||
The core operation is to make the MCT send out specific test
|
||
patterns to the DRAM to determine the timing relationship between
|
||
the DQS and data signals, then the MCT adjusts the delay or phase of
|
||
the DQS signal relative to the clock signal (CK) and the data
|
||
signals (DQ) while checking the integrity of the test data in the
|
||
DRAM. \\
|
||
|
||
Using seed-based algorithms, the memory controller sets an initial
|
||
delay value and then iteratively adjusts it based on the feedback
|
||
received from the memory module. This process ensures that the DQS
|
||
signal is correctly aligned with the CK signal at the memory
|
||
module's pins, minimizing the risk of data corruption and ensuring
|
||
reliable write operations \cite{samsung_ddr3}\cite{gopikrishna2021novel}.
|
||
Seed-based write leveling offers improved precision but must be
|
||
finely tuned to account for the specific characteristics of the
|
||
memory module and the overall system architecture
|
||
\cite{gopikrishna2021novel}. \\
|
||
|
||
In contrast to seed-based algorithms, seedless methods do not rely on
|
||
an initial reference value. Instead, they dynamically adjust the
|
||
impedance and timing parameters during the calibration process.
|
||
Seedless ZQ calibration continuously monitors the impedance of the
|
||
memory module and makes real-time adjustments to maintain optimal
|
||
matching. This approach can be beneficial in environments where the
|
||
operating conditions are highly variable, as it allows for more
|
||
flexible and adaptive calibration \cite{kim2010design}. Similarly,
|
||
seedless write leveling dynamically adjusts the DQS timing based on
|
||
real-time feedback from the memory module. This method is particularly
|
||
useful in systems where the memory configuration is frequently changed
|
||
or where the operating conditions vary significantly
|
||
\cite{micron_ddr3}\cite{gopikrishna2021novel}. The traditional ZQ
|
||
calibration methods, while effective, often struggle with matching
|
||
impedance perfectly across all conditions. A master thesis by
|
||
\textcite{gopikrishna2021novel} builds upon these traditional methods
|
||
by proposing enhancements that involve more sophisticated calibration
|
||
approaches, leading to better impedance matching and overall memory
|
||
performance \cite{gopikrishna2021novel}.
|
||
|
||
|
||
\subsection{BIOS and Kernel Developer Guide (BKDG) recommendations}
|
||
|
||
The BIOS and Kernel Developer Guide (BKDG from \textcite{BKDG}) is a
|
||
technical manual aimed at BIOS developers and operating system kernel
|
||
programmers. It provides in-depth documentation on the AMD
|
||
processor architecture, system initialization processes, and
|
||
configuration guidelines. The document is essential for
|
||
understanding the proper initialization sequences, including
|
||
those for DDR3 memory, to ensure system stability and
|
||
performance, particularly for AMD Family 15h processors. \\
|
||
|
||
The initialization of DDR3 memory begins with configuring the DDR
|
||
supply voltage regulator, which ensures that the memory modules
|
||
receive the correct power levels. Following this, the Northbridge
|
||
(NB) P-state is forced to \path{NBP0}, a state that guarantees
|
||
stable operation during the initial configuration phases. Once these
|
||
preliminary steps are completed, the initialization of the DDR
|
||
physical layer (PHY) begins, which is critical for setting up
|
||
the communication interface between the memory controller and the
|
||
DDR3 modules. PHY fence training deals with overall signal alignment
|
||
at the physical interface, while ZQ calibration focuses on impedance
|
||
matching, and write leveling addresses timing alignment during
|
||
write operations. Each process involves different methods as PHY
|
||
fence training uses iterative timing adjustments, ZQ calibration
|
||
uses impedance adjustments via the ZQ pin, and write leveling
|
||
adjusts DQS timing relative to CK during writes. These processes are
|
||
critical for configuring DDR3 DIMMs and ensuring stable and reliable
|
||
operation, especially when booting from an unpowered state such as
|
||
ACPI S4 (hibernation), S5 (soft off), or G3 (mechanical off).
|
||
|
||
\subsubsection{DDR3 initialization procedure}
|
||
|
||
DDR3 initialization is a multi-step process that prepares
|
||
both the memory controllers and the DIMMs for operation. This
|
||
initialization is essential to set up the memory configuration
|
||
and to ensure that the memory subsystem operates correctly
|
||
under various conditions.
|
||
|
||
\begin{itemize}
|
||
\item \textbf{Enable DRAM initialization}: The process
|
||
begins by
|
||
enabling DRAM initialization. This is done
|
||
by setting the \path{EnDramInit} bit in
|
||
the \path{D18F2x7C_dct} register to 1. The
|
||
\path{D18F2x7C_dct} register is a specific
|
||
configuration register within the memory
|
||
controller that controls various aspects of the
|
||
DRAM initialization process. Enabling this bit
|
||
initiates the sequence of operations required to
|
||
prepare the memory for use. After setting this bit,
|
||
the system waits for 200 microseconds to allow the
|
||
initialization command to propagate and stabilize.
|
||
|
||
\item \textbf{Deassert memory reset}: Next, the memory
|
||
reset
|
||
signal, known as \path{MemRstX}, is deasserted
|
||
by setting the \path{DeassertMemRstX} bit in the
|
||
\path{D18F2x7C_dct} register to 1. Deasserting
|
||
\path{MemRstX} effectively takes the memory
|
||
components out of their reset state, allowing them
|
||
to begin normal operation. The system then waits
|
||
for an additional 500 microseconds to ensure that
|
||
the memory reset is fully deasserted and the memory
|
||
components are stable.
|
||
|
||
\item \textbf{Assert clock enable (CKE)}: The next
|
||
step involves asserting the clock enable signal, known as
|
||
`CKE`, by setting the \path{AssertCke} bit in the
|
||
\path{D18F2x7C_dct} register to 1. The \path{CKE}
|
||
signal is critical because it enables the clocking
|
||
of the DRAM modules, allowing them to synchronize
|
||
with the memory controller. The system must wait
|
||
for 360 nanoseconds after asserting \path{CKE}
|
||
to ensure that the clocking is correctly established.
|
||
|
||
\item \textbf{Registered DIMMs and LRDIMMs initialization}:
|
||
For systems using registered DIMMs (RDIMMs) or Load
|
||
Reduced DIMMs (LRDIMMs), additional initialization
|
||
steps are necessary. RDIMMs and LRDIMMs have
|
||
buffering mechanisms that reduce electrical loading
|
||
and improve signal integrity, especially in systems
|
||
with multiple memory modules. During initialization,
|
||
the BIOS programs the \path{ParEn} bit in the
|
||
\path{D18F2x90_dct} register based on whether
|
||
the DIMM is buffered or unbuffered. For RDIMMs,
|
||
specific Register Control (RC) commands, such as RC0
|
||
through RC7, are sent to initialize the memory module's
|
||
control registers. Similarly, LRDIMMs require a series
|
||
of Flexible Register Control (FRC) commands, such as
|
||
F0RC and F1RC, to initialize their internal registers
|
||
according to the manufacturer’s specifications.
|
||
|
||
\item \textbf{Mode Register Set (MRS)}: The initialization
|
||
process also involves sending Mode Register Set
|
||
(MRS) commands. These commands are used to configure
|
||
various operational parameters of the DDR3 memory
|
||
modules, such as burst length, latency timings,
|
||
and operating modes. Each MRS command targets a
|
||
specific mode register within the memory module,
|
||
and the exact sequence of commands is crucial for
|
||
setting up the DIMMs according to the system’s
|
||
requirements and the DIMM manufacturer’s guidelines.
|
||
\end{itemize}
|
||
|
||
\subsubsection{ZQ calibration process}
|
||
|
||
ZQ calibration is a key step in DDR3 initialization,
|
||
responsible for calibrating the output driver impedance and
|
||
on-die termination (ODT) resistance of the DDR3 modules. Proper
|
||
impedance matching is essential for maintaining signal
|
||
integrity, reducing signal reflections, and ensuring reliable
|
||
data communication between the memory controller and the
|
||
memory modules. It is important to note that ZQ calibration
|
||
is done directly by the memory controller, and that the firmware
|
||
is simply triggering it.
|
||
|
||
\begin{itemize}
|
||
\item \textbf{Sending ZQCL commands}: The BIOS initiates
|
||
ZQ calibration by sending two ZQCL (ZQ Calibration Long)
|
||
commands to each DDR3 DIMM. ZQCL commands instruct the
|
||
memory module to perform a long calibration cycle, during
|
||
which the module adjusts its output driver impedance and
|
||
ODT resistance to match the desired target impedance. This
|
||
process compensates for variations due to manufacturing
|
||
differences, voltage fluctuations, and temperature
|
||
changes. To send a ZQCL command, the BIOS programs the
|
||
\path{SendZQCmd} bit in the \path{D18F2x7C_dct}
|
||
register to 1 and sets the \path{MrsAddress[10]} bit to 1,
|
||
indicating that the ZQCL command should be sent to the
|
||
memory module.
|
||
|
||
\item \textbf{Calibration timing}: After sending the
|
||
ZQCL command, the system must wait for 512 memory clock
|
||
cycles (MEMCLKs) to allow the calibration process to
|
||
complete. During this time, the memory module adjusts
|
||
its internal impedance to ensure it matches the specified
|
||
target impedance. This timing is critical, as inadequate
|
||
wait times could result in incomplete or inaccurate
|
||
calibration, leading to signal integrity issues and
|
||
potential data errors.
|
||
|
||
\item \textbf{Finalization of initialization}: Once the
|
||
ZQ calibration is complete, the BIOS deactivates the DRAM
|
||
initialization process by setting the \path{EnDramInit}
|
||
bit in the \path{D18F2x7C_dct} register to 0. For
|
||
LRDIMMs, additional configuration steps are required to
|
||
finalize the initialization process. These steps include
|
||
programming the DCT registers to monitor for errors and
|
||
ensure that the LRDIMMs are operating correctly.
|
||
\end{itemize}
|
||
|
||
\subsubsection{Write leveling process}
|
||
|
||
The BIOS and Kernel Developer Guide (BKDG) provides
|
||
information on the write leveling process, which is
|
||
essential for ensuring correct data alignment during write
|
||
operations in DDR3 memory systems. Write leveling is
|
||
particularly crucial in systems utilizing a fly-by topology,
|
||
where timing skew between the clock and data signals can
|
||
introduce significant challenges. This kind of algorithms
|
||
were not necessary for DDR2, for example.
|
||
If the target operating
|
||
frequency is higher than the lowest supported MEMCLK frequency,
|
||
the BIOS must perform multiple passes to achieve proper write
|
||
leveling. The MEMCLK is the clock signal that synchronizes the
|
||
communication between the memory controller and the memory
|
||
modules. \\
|
||
|
||
During each pass, the memory subsystem is configured for a
|
||
progressively higher operating frequency:
|
||
|
||
\begin{itemize}
|
||
\item \textbf{Pass 1:} The memory subsystem is configured
|
||
for the lowest supported MEMCLK, ensuring that initial
|
||
timing adjustments are made under the most stable
|
||
conditions.
|
||
\item \textbf{Pass 2:} The subsystem is then adjusted for
|
||
the second-lowest MEMCLK, gradually increasing the
|
||
operating frequency while fine-tuning the alignment of
|
||
the DQS and CK signals.
|
||
\item \textbf{Pass N:} This process continues until the
|
||
highest MEMCLK supported by the system is reached,
|
||
ensuring that the memory operates reliably at its
|
||
maximum speed.
|
||
\end{itemize}
|
||
|
||
This step-wise configuration ensures that the memory system is
|
||
stable across all supported operating frequencies, minimizing
|
||
the risk of timing errors during write operations, especially
|
||
as frequencies increase and timing margins become tighter. The
|
||
configuration process varies depending on whether the DIMM is
|
||
a Registered DIMM (RDIMM) or an Unregistered DIMM (UDIMM).
|
||
RDIMMs include an additional buffer to improve signal integrity,
|
||
which is particularly important in systems with multiple DIMMs.
|
||
The steps common to both types include a preparation with the
|
||
DDR3 Mode Register Commands
|
||
(see fig. \ref{fig:ddr3_state_machine}).
|
||
For RDIMMs, a 4-rank module is treated as two
|
||
separate DIMMs, where each rank is essentially a separate memory
|
||
module within the same DIMM. The first two ranks are the primary
|
||
target for the initial configuration. The remaining two ranks
|
||
are treated as non-target and are configured separately. \\
|
||
|
||
Mode registers in DDR3
|
||
memory are used to configure various operational parameters such
|
||
as latency settings, burst length, and write leveling. One of
|
||
the key mode registers is \path{MR1_dct}, which is specific to
|
||
DDR3 and controls certain features of the memory module,
|
||
including write leveling. \path{MR1_dct} is used to enable or
|
||
disable specific functions such as write leveling and output
|
||
driver settings. The \path{dct} suffix refers to the Data
|
||
Control Timing that is specific to this register's function in
|
||
managing the timing and control of data operations within the
|
||
memory module. \\
|
||
|
||
Then, these steps are followed, still common to both RDIMMs and
|
||
UDIMMs:
|
||
|
||
\begin{itemize}
|
||
\item \textbf{Step 1A: Output Driver and ODT configuration
|
||
for target DIMM:}
|
||
\begin{itemize}
|
||
\item For the first rank (target):
|
||
\begin{itemize}
|
||
\item Set \path{MR1_dct[1:0][Level] = 1}
|
||
to enable write leveling.
|
||
\item Set \path{MR1_dct[1:0][Qoff] = 0}
|
||
to ensure the output drivers are active.
|
||
\end{itemize}
|
||
\item For other ranks:
|
||
\begin{itemize}
|
||
\item Set \path{MR1_dct[1:0][Level] = 1}
|
||
to prepare for write leveling.
|
||
\item Set \path{MR1_dct[1:0][Qoff] = 1}
|
||
to deactivate the output drivers for
|
||
ranks that are not currently being
|
||
leveled.
|
||
\end{itemize}
|
||
\item If there are two or more DIMMs per channel,
|
||
or if there is one DIMM per three channels:
|
||
\begin{itemize}
|
||
\item Program the target rank’s
|
||
\path{RttNom} (nominal termination
|
||
resistance value) for \path{RttWr}
|
||
termination, which helps in managing signal
|
||
integrity during the write process by
|
||
ensuring the correct impedance matching.
|
||
\end{itemize}
|
||
\end{itemize}
|
||
|
||
\item \textbf{Step 1B: Configure non-target RttNom to normal
|
||
operation:}
|
||
\begin{itemize}
|
||
\item After the initial configuration, the
|
||
\path{RttNom} values for the non-target ranks
|
||
are set to their normal operating states.
|
||
\item A wait time of 40 MEMCLKs is observed to
|
||
ensure the configuration settings are stable
|
||
before proceeding.
|
||
\end{itemize}
|
||
|
||
\item \textbf{Step 3: PHY configuration:}
|
||
\begin{itemize}
|
||
\item The PHY is then configured to measure and
|
||
adjust the timing delays accurately for each
|
||
data lane. The PHY layer is responsible for
|
||
converting the signals from the memory
|
||
controller into a form that can be transmitted
|
||
over the physical connections to the memory
|
||
modules.
|
||
\end{itemize}
|
||
|
||
\item \textbf{Step 4: Perform write leveling:}
|
||
\begin{itemize}
|
||
\item The actual write leveling process is executed,
|
||
where the DQS signal timing is adjusted to
|
||
ensure it aligns perfectly with the CK signal at
|
||
the memory module’s pins, ensuring that data is
|
||
written accurately.
|
||
\end{itemize}
|
||
|
||
\item \textbf{Step 5: Disable PHY configuration
|
||
post-measurement:}
|
||
\begin{itemize}
|
||
\item After completing the write leveling process,
|
||
the PHY configuration is disabled to stop further
|
||
timing measurements and adjustments, locking in the
|
||
calibrated settings.
|
||
\end{itemize}
|
||
|
||
\item \textbf{Step 6: Program the DIMM to normal operation:}
|
||
\begin{itemize}
|
||
\item Finally, the DIMM is reprogrammed to its
|
||
normal operational state, resetting \path{Qoff}
|
||
and \path{Level} to \path{0} to conclude the
|
||
write leveling process and return to standard
|
||
operation.
|
||
\end{itemize}
|
||
\end{itemize}
|
||
|
||
For each DIMM, the BIOS must calculate the coarse and fine
|
||
delays for each lane in the DQS Write Timing register:
|
||
|
||
\begin{itemize}
|
||
\item \textbf{Coarse Delay Calculation:} This involves
|
||
setting a basic delay based on a seed value specific to
|
||
the platform. The seed value is determined during
|
||
initial system configuration and serves as a starting
|
||
point for further delay adjustments.
|
||
\item \textbf{Critical Delay Determination:} The minimum of
|
||
the coarse delays for each lane and DIMM is considered
|
||
the critical delay. This delay is crucial for ensuring
|
||
that all data lanes are correctly synchronized.
|
||
\item \textbf{Platform-Specific Seed:} The seed ranges
|
||
between -1.20ns and +1.20ns, providing a small
|
||
adjustment range to fine-tune the timing based on the
|
||
specific characteristics of the platform. This seed
|
||
value can differ for the first pass compared to
|
||
subsequent passes, allowing for incremental adjustments
|
||
as the system stabilizes.
|
||
\end{itemize}
|
||
|
||
\section{Current implementation and potential improvements}
|
||
|
||
\subsection{Current implementation in coreboot on the KGPE-D16}
|
||
|
||
In this part as for the rest of this document, we're basing our
|
||
study on the 4.11 version of \textit{coreboot} \cite{coreboot_4_11},
|
||
which is the last version that supported the ASUS KGPE-D16
|
||
mainboard. \\
|
||
|
||
The process starts in
|
||
\path{src/mainboard/asus/kgpe-d16/romstage.c}, in the
|
||
\path{cache_as_ram_main} function by calling
|
||
\path{fill_mem_ctrl} from
|
||
\path{src/northbridge/amd/amdfam10/raminit_sysinfo_in_ram.c}
|
||
(lst. \ref{lst:fill_mem_ctrl}).
|
||
At this current step, only the BSC is running the firmware code.
|
||
This function iterates over all memory controllers (one per
|
||
node) and initializes their corresponding structures with the
|
||
system information needed for the RAM to function. This includes
|
||
the addresses of PCI nodes (important for DMA operations) and
|
||
SPD addresses, which are internal ROMs in each memory slot
|
||
containing crucial information for detecting and initializing
|
||
memory modules. \\
|
||
|
||
\begin{listing}[H]
|
||
\begin{adjustwidth}{0.5cm}{0.5cm}
|
||
\inputminted{c}{
|
||
listings/src_northbridge_amd_amdfam10_raminit_sysinfo_in_ram.c}
|
||
\end{adjustwidth}
|
||
\caption{
|
||
\protect\path{fill_mem_ctrl()}, extract from
|
||
\protect\path{src/northbridge/amd/amdfam10/raminit_sysinfo_in_ram.c}}
|
||
\label{lst:fill_mem_ctrl}
|
||
\end{listing}
|
||
|
||
If successful, the system posts codes \path{0x3D} and then
|
||
\path{0x40}. The \path{raminit_amdmct} function from
|
||
\path{src/northbridge/amd/amdfam10/raminit_amdmct.c} is then
|
||
called. This function, in turn, calls \path{mctAutoInitMCT_D}
|
||
(lst. \ref{lst:mctAutoInitMCT_D_1}) from
|
||
\path{src/northbridge/amd/amdmct/mct_ddr3/mct_d.c},
|
||
which is responsible for the initial memory initialization,
|
||
predominantly written by Raptor Engineering. \\
|
||
|
||
At this stage, it is assumed that memory has been pre-mapped
|
||
contiguously from address 0 to 4GB and that the previous code
|
||
has correctly mapped non-cacheable I/O areas below 4GB for the
|
||
PCI bus and Local APIC access for processor cores. \\
|
||
|
||
The following prerequisites must be in place from the previous
|
||
steps:
|
||
|
||
\begin{itemize}
|
||
\item The HyperTransport bus configured, and its speed is
|
||
correctly set.
|
||
\item The SMBus controller is configured.
|
||
\item The BSP is in unreal mode.
|
||
\item A stack is set up for all cores.
|
||
\item All cores are initialized at a frequency of 2GHz.
|
||
\item If we were using saved values, the NVRAM would have been
|
||
verified with checksums.
|
||
\end{itemize}
|
||
|
||
The memory controller for the BSP is queried to check if it can
|
||
manage ECC memory, which is a type of memory that includes
|
||
error-correcting code to detect and correct common types of data
|
||
corruption (lst. \ref{lst:mctAutoInitMCT_D_2}).
|
||
|
||
For each node available in the system, the memory controllers
|
||
are identified and initialized using a \path{DCTStatStruc}
|
||
structure defined in
|
||
\path{src/northbridge/amd/amdmct/mct_ddr3/mct_d.h}. This
|
||
structure contains all necessary fields for managing a memory
|
||
module. The process includes:
|
||
|
||
\begin{itemize}
|
||
\item Retrieving the corresponding field in the sysinfo
|
||
structure for the node.
|
||
\item Clearing fields with \path{zero}.
|
||
\item Initializing basic fields.
|
||
\item Initializing the controller linked to the current node.
|
||
\item Verifying the presence of the node (checking if the
|
||
processor associated with this controller is present).
|
||
If yes, the SMBus is informed.
|
||
\item Pre-initializing the memory module controller for this
|
||
node using \path{mct_preInitDCT}.
|
||
\end{itemize}
|
||
|
||
The memory modules must be initialized. All modules present on
|
||
valid nodes are configured with 1.5V voltage
|
||
(lst. \ref{lst:mctAutoInitMCT_D_3}). The ZQ calibration
|
||
is triggered at this stage. \\
|
||
|
||
Now, present memory modules are detected using \path{mct_initDCT}
|
||
(lst. \ref{lst:mctAutoInitMCT_D_4}). The memory modules existence
|
||
is checked and the machine halts immediately after displaying a
|
||
message if there is no memory.
|
||
\textit{coreboot} waits for all modules to be available using
|
||
\path{SyncDCTsReady_D}. \\
|
||
|
||
The firmware maps the physical memory address ranges into the
|
||
address space with \path{HTMemMapInit_D} as contiguously as possible
|
||
while also constructing the physical memory map. If there is an
|
||
area occupied by something else, it is ignored, and a memory hole is
|
||
created. \\
|
||
|
||
Mapping the address ranges into the cache is done with
|
||
\path{CPUMemTyping_D} either as WriteBack (cacheable) or
|
||
Uncacheable, depending on whether the area corresponds to physical
|
||
memory or a memory hole. \\
|
||
|
||
The external northbridge is notified of this new memory
|
||
configuration. \\
|
||
|
||
The \textit{coreboot} code compensates for the delay between DQS
|
||
and DQ signals, as well as between CMD and DQ. This is handled by
|
||
the \path{DQSTiming_D} function (lst. \ref{lst:mctAutoInitMCT_D_5}).
|
||
The initialization can be done again if needed after that, otherwise
|
||
the channels and nodes are interleaved and ECC is enabled (if
|
||
supported by every module). \\
|
||
|
||
After that being done, the DRAM can be mapped into the address
|
||
space with cacheability, and the init process finishes with
|
||
validation of every populated DCT node
|
||
(lst. \ref{lst:mctAutoInitMCT_D_6}). \\
|
||
|
||
Finally, if the RAM is of the ECC type, error-correcting codes
|
||
are enabled, and the function ends by activating power-saving
|
||
features if requested by the user. \\
|
||
|
||
\subsubsection{Details on the DQS training function}
|
||
|
||
The \path{DQSTiming_D} function is a critical part of the
|
||
firmware responsible for initializing and training the system's
|
||
memory.
|
||
The function primarily handles the DQS timing, which is
|
||
essential for ensuring data integrity and synchronization
|
||
between the memory controller and the DRAM. Proper DQS training
|
||
is crucial to align the data signals correctly with the clock
|
||
signals.
|
||
|
||
The function begins by declaring local variables, which are
|
||
used throughout the function for various operations. It also
|
||
includes an early exit condition to bypass DQS training if a
|
||
specific status flag (\path{GSB_EnDIMMSpareNW}) is set,
|
||
indicating that a DIMM spare feature is enabled
|
||
(lst. \ref{lst:var_decl_and_exit}). These spare DIMMs are not
|
||
used for normal memory operations but are kept in reserve for
|
||
redundancy. \\
|
||
|
||
\begin{listing}[H]
|
||
\begin{adjustwidth}{0.5cm}{0.5cm}
|
||
\begin{minted}[linenos]{c}
|
||
if (pMCTstat->GStatus & (1 << GSB_EnDIMMSpareNW)) {
|
||
return;
|
||
}
|
||
\end{minted}
|
||
\end{adjustwidth}
|
||
\caption{Early exit check,
|
||
extract from the
|
||
\protect\path{DQSTiming_D} function in
|
||
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mct_d.c}}
|
||
\label{lst:var_decl_and_exit}
|
||
\end{listing}
|
||
|
||
Next, the function initializes the TCWL (CAS Write Latency)
|
||
offset to zero for each node and DCT.
|
||
This ensures that the memory write latency is properly aligned
|
||
before the DQS training begins
|
||
(lst. \ref{lst:set_tcwl_offset}). \\
|
||
|
||
\begin{listing}[H]
|
||
\begin{adjustwidth}{0.5cm}{0.5cm}
|
||
\begin{minted}[linenos]{c}
|
||
for (Node = 0; Node < MAX_NODES_SUPPORTED; Node++) {
|
||
uint8_t dct;
|
||
struct DCTStatStruc *pDCTstat;
|
||
pDCTstat = pDCTstatA + Node;
|
||
for (dct = 0; dct < 2; dct++)
|
||
pDCTstat->tcwl_delay[dct] = 0;
|
||
}
|
||
\end{minted}
|
||
\end{adjustwidth}
|
||
\caption{Setting initial TCWL offset to zero for all nodes and DCTs,
|
||
extract from the
|
||
\protect\path{DQSTiming_D} function in
|
||
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mct_d.c}}
|
||
\label{lst:set_tcwl_offset}
|
||
\end{listing}
|
||
|
||
A retry mechanism is introduced to handle potential errors
|
||
during DQS training and the pre-training function are called
|
||
(lst. \ref{lst:retry_pre_training}). \\
|
||
|
||
\begin{listing}[H]
|
||
\begin{adjustwidth}{0.5cm}{0.5cm}
|
||
\begin{minted}[linenos]{c}
|
||
retry_dqs_training_and_levelization:
|
||
nv_DQSTrainCTL = !allow_config_restore;
|
||
|
||
mct_BeforeDQSTrain_D(pMCTstat, pDCTstatA);
|
||
phyAssistedMemFnceTraining(pMCTstat, pDCTstatA, -1);
|
||
\end{minted}
|
||
\end{adjustwidth}
|
||
\caption{Retry mechanism initialization and pre-training operations,
|
||
extract from the
|
||
\protect\path{DQSTiming_D} function in
|
||
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mct_d.c}}
|
||
\label{lst:retry_pre_training}
|
||
\end{listing}
|
||
|
||
For AMD's Fam15h processors, additional PHY compensation is
|
||
needed for each node and valid DCT
|
||
(lst. \ref{lst:phy_compensation_init}). This is necessary to
|
||
fine-tune the electrical characteristics of the memory
|
||
interface. For more information about the PHY training, see
|
||
the earlier sections about RAM training algorithm. \\
|
||
|
||
\begin{listing}[H]
|
||
\begin{adjustwidth}{0.5cm}{0.5cm}
|
||
\begin{minted}[linenos]{c}
|
||
for (Node = 0; Node < MAX_NODES_SUPPORTED; Node++) {
|
||
pDCTstat = pDCTstatA + Node;
|
||
if (pDCTstat->NodePresent) {
|
||
if (pDCTstat->DIMMValidDCT[0])
|
||
InitPhyCompensation(pMCTstat, pDCTstat, 0);
|
||
if (pDCTstat->DIMMValidDCT[1])
|
||
InitPhyCompensation(pMCTstat, pDCTstat, 1);
|
||
}
|
||
}
|
||
\end{minted}
|
||
\end{adjustwidth}
|
||
\caption{PHY compensation initialization,
|
||
extract from the
|
||
\protect\path{DQSTiming_D} function in
|
||
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mct_d.c}}
|
||
\label{lst:phy_compensation_init}
|
||
\end{listing}
|
||
|
||
Before proceeding with the main DQS training, the function
|
||
invokes a hook function that allows for additional
|
||
configurations or custom operations:
|
||
\path{mctHookBeforeAnyTraining}. \\
|
||
|
||
The \path{nv_DQSTrainCTL} variable is
|
||
set based on the \path{allow_config_restore} parameter,
|
||
determining whether to restore a previous configuration or
|
||
proceed with fresh training. This is however not working on the
|
||
current implementation of ASUS KGPE-D16 firmware
|
||
(lst. \ref{lst:mctAutoInitMCT_D_fixme}).
|
||
If \path{nv_DQSTrainCTL} indicates that fresh training should
|
||
proceed, the function performs the main DQS training in multiple
|
||
passes, including receiver enable training with
|
||
\path{TrainReceiverEn_D}, write leveling with
|
||
\path{mct_WriteLevelization_HW}, DQS position
|
||
training with \path{mct_TrainDQSPos_D} and the maximum read
|
||
latency calculation with \path{TrainMaxRdLatency_En_D}
|
||
(lst. \ref{lst:dqs_training_process}).
|
||
Write leveling is done in two passes, with a DQS receiver
|
||
training between and another pass of receiver training after.
|
||
After that, a DQS position training is done and the process
|
||
finished with the maximum read latency, i.e the delay between
|
||
the request for data and the delivery of that data by the DRAM.
|
||
\\
|
||
|
||
\begin{listing}[H]
|
||
\begin{adjustwidth}{0.5cm}{0.5cm}
|
||
\begin{minted}[linenos]{c}
|
||
if (nv_DQSTrainCTL) {
|
||
mct_WriteLevelization_HW(pMCTstat, pDCTstatA, FirstPass);
|
||
TrainReceiverEn_D(pMCTstat, pDCTstatA, FirstPass);
|
||
mct_WriteLevelization_HW(pMCTstat, pDCTstatA, SecondPass);
|
||
|
||
/* TODO: Determine why running TrainReceiverEn_D in SecondPass mode yields
|
||
* less stable training values than when run in FirstPass mode as in the HACK
|
||
* below.*/
|
||
TrainReceiverEn_D(pMCTstat, pDCTstatA, FirstPass);
|
||
mct_TrainDQSPos_D(pMCTstat, pDCTstatA);
|
||
...
|
||
TrainMaxRdLatency_En_D(pMCTstat, pDCTstatA);
|
||
} else {
|
||
mct_WriteLevelization_HW(pMCTstat, pDCTstatA, FirstPass);
|
||
mct_WriteLevelization_HW(pMCTstat, pDCTstatA, SecondPass);
|
||
#if CONFIG(HAVE_ACPI_RESUME)
|
||
printk(BIOS_DEBUG, "mctAutoInitMCT_D: Restoring DIMM training configuration"
|
||
"from NVRAM\n");
|
||
if (restore_mct_information_from_nvram(1) != 0)
|
||
printk(BIOS_CRIT, "%s: ERROR: Unable to restore DCT configuration from"
|
||
"NVRAM\n", __func__);
|
||
#endif
|
||
exit_training_mode_fam15(pMCTstat, pDCTstatA);
|
||
pMCTstat->GStatus |= 1 << GSB_ConfigRestored;"
|
||
}
|
||
\end{minted}
|
||
\end{adjustwidth}
|
||
\caption{Main DQS training process in multiple passes,
|
||
extract from the
|
||
\protect\path{DQSTiming_D} function in
|
||
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mct_d.c}}
|
||
\label{lst:dqs_training_process}
|
||
\end{listing}
|
||
|
||
The function checks for any errors during the DQS training. If
|
||
errors are detected, it may request a retrain, reset certain
|
||
parameters, and restart the training process and even restart
|
||
the whole system if needed (lst. \ref{lst:error_handling}).
|
||
If the training process it to be restarted, the firmware
|
||
sets the DIMMs frequencies to minimum and applies timing changes
|
||
to DIMMs before jumping to the retry label
|
||
(lst. \ref{lst:retry_pre_training}). \\
|
||
|
||
Once the training is successfully completed without errors, the
|
||
function finalizes the process by setting the maximum read
|
||
latency and exiting the training mode. For systems with
|
||
\path{allow_config_restore} enabled, it restores the previous
|
||
configuration from NVRAM instead of performing a fresh training
|
||
(lst. \ref{lst:dqs_training_process}). \\
|
||
|
||
Finally, the function performs a cleanup operation specific to
|
||
Fam15h processors, where it switches the DCT control register
|
||
as required by a known erratum from AMD for the BKDG
|
||
(Erratum 505) \cite{amd_fam15h_revision_guide}.
|
||
This is followed by a post-training hook that
|
||
allows for any additional necessary actions
|
||
(lst. \ref{lst:post_training_cleanup}). \\
|
||
|
||
\begin{listing}[htpb]
|
||
\begin{adjustwidth}{0.5cm}{0.5cm}
|
||
\begin{minted}[linenos]{c}
|
||
for (Node = 0; Node < MAX_NODES_SUPPORTED; Node++) {
|
||
pDCTstat = pDCTstatA + Node;
|
||
if (pDCTstat->NodePresent) {
|
||
fam15h_switch_dct(pDCTstat->dev_map, 0);
|
||
}
|
||
}
|
||
|
||
/* FIXME - currently uses calculated value
|
||
* TrainMaxReadLatency_D(pMCTstat, pDCTstatA); */
|
||
mctHookAfterAnyTraining();
|
||
\end{minted}
|
||
\end{adjustwidth}
|
||
\caption{Post-training cleanup and final hook execution,
|
||
extract from the
|
||
\protect\path{DQSTiming_D} function in
|
||
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mct_d.c}}
|
||
\label{lst:post_training_cleanup}
|
||
\end{listing}
|
||
|
||
\subsubsection{Details on the write leveling implementation}
|
||
|
||
The \path{WriteLevelization_HW} function is responsible for
|
||
performing hardware-level write leveling on DRAM modules during
|
||
the memory initialization process. Write leveling ensures that
|
||
the DQS signals are correctly aligned with the clock signals,
|
||
preventing timing mismatches during write operations. \\
|
||
|
||
The function begins by initializing pointers to key data
|
||
structures, linking the memory controller (MCT) and DRAM
|
||
controller timing (DCT) data for subsequent operations. \\
|
||
|
||
Auto-refresh and short ZQ calibration are temporarily disabled
|
||
to prevent interference during the critical timing adjustments
|
||
of write leveling.
|
||
The memory controller is prepared for write leveling by
|
||
configuring necessary parameters with \path{PrepareC_MCT},
|
||
then the main operation can begin. \\
|
||
|
||
In the first pass (lst. \ref{lst:write_level_first_pass}),
|
||
the function repeatedly attempts to align
|
||
the DQS signals with \path{PhyWLPass1}, retrying if invalid
|
||
values are detected. This phase ensures basic alignment for
|
||
further fine-tuning. The function retries up to 8 times if it
|
||
detects invalid timing values. \\
|
||
|
||
During the second pass (lst. \ref{lst:write_level_second_pass}),
|
||
the function first checks if the target memory frequency
|
||
(\path{TargetFreq}) is higher than the minimum memory clock
|
||
frequency stored in the non-volatile bits
|
||
(\path{NV_MIN_MEMCLK}). If so, the memory frequency is
|
||
incrementally adjusted toward the final target f requency.
|
||
This step-by-step approach is crucial, especially for AMD Fam15h
|
||
processors, where the frequency must be gradually stepped up to
|
||
avoid instability. \\
|
||
|
||
For each frequency step, the write leveling process is
|
||
recalibrated by invoking the \path{PhyWLPass2} function. This
|
||
function adjusts the DQS timing for each data channel (DCT) and
|
||
validates the results. The function retries up to 8 times if it
|
||
detects invalid timing values. The global status
|
||
(\path{global_phy_training_status}) aggregates the results of
|
||
each step, tracking any persistent issues. \\
|
||
|
||
The \path{PhyWLPass1} and \path{PhyWLPass2} function relyon
|
||
\path{AgesaHwWlPhase1}, \path{AgesaHwWlPhase2} and
|
||
\path{AgesaHwWlPhase3} for this. \\
|
||
|
||
Once the target frequency is reached and all write leveling
|
||
adjustments are made, the final timing values are stored.
|
||
The gross and fine delays from the previous passes are copied
|
||
into the final pass structures. This ensures that the DQS
|
||
timings are consistent and stable across all data channels. \\
|
||
|
||
If any issues persist after retries, the function logs a
|
||
warning. This indicates that the system may continue to operate,
|
||
but with a potential risk of instability due to imperfect
|
||
write leveling calibration. \\
|
||
|
||
After leveling, the function re-enables auto-refresh and short
|
||
ZQ calibration, ensuring the memory subsystem is correctly
|
||
configured for normal operation. \\
|
||
|
||
\begin{listing}[htpb]
|
||
\begin{adjustwidth}{0.5cm}{0.5cm}
|
||
\begin{minted}[linenos]{c}
|
||
if (Pass == FirstPass) {
|
||
timeout = 0;
|
||
do {
|
||
status = 0;
|
||
timeout++;
|
||
status |= PhyWLPass1(pMCTstat, pDCTstat, 0);
|
||
status |= PhyWLPass1(pMCTstat, pDCTstat, 1);
|
||
if (status)
|
||
printk(BIOS_INFO, "%s: Retrying write levelling due to invalid "
|
||
"value(s) detected in first phase\n", __func__);
|
||
} while (status && (timeout < 8));
|
||
if (status)
|
||
printk(BIOS_INFO, "%s: Uncorrectable invalid value(s) detected in first "
|
||
"phase of write levelling\n", __func__);
|
||
}
|
||
\end{minted}
|
||
\end{adjustwidth}
|
||
\caption{Write leveling (first pass),
|
||
extract from the
|
||
\protect\path{WriteLevelization_HW} function in
|
||
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mcthwl.c}}
|
||
\label{lst:write_level_first_pass}
|
||
\end{listing}
|
||
|
||
The detailled write leveling process is divided into three
|
||
distinct phases, each managed by a specific function:
|
||
\path{AgesaHwWlPhase1}, \path{AgesaHwWlPhase2}, and
|
||
\path{AgesaHwWlPhase3} from \path{mcthwl.c}.
|
||
These phases work together to
|
||
fine-tune the timing delays (gross and fine) for each byte
|
||
lane, ensuring reliable data transmission. \\
|
||
|
||
The write leveling process begins by selecting the target
|
||
DIMM. This is accomplished by programming the
|
||
\path{TrDimmSel} register to ensure that the subsequent
|
||
operations apply to the correct DIMM
|
||
(lst. \ref{lst:target_dimm_selection}) \\
|
||
|
||
\begin{listing}
|
||
\begin{adjustwidth}{0.5cm}{0.5cm}
|
||
\begin{minted}[linenos]{c}
|
||
set_DCT_ADDR_Bits(pDCTData, dct, pDCTData->NodeId, FUN_DCT,
|
||
DRAM_ADD_DCT_PHY_CONTROL_REG, TrDimmSelStart,
|
||
TrDimmSelEnd, (u32)dimm);
|
||
\end{minted}
|
||
\end{adjustwidth}
|
||
\caption{Target DIMM selection for write leveling,
|
||
extract from
|
||
\protect\path{AgesaHwWlPhase1} function in
|
||
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mhwlc_d.c}}
|
||
\label{lst:target_dimm_selection}
|
||
\end{listing}
|
||
|
||
In the case of x4 DIMMs, which are common in high-density
|
||
memory configurations, write leveling must be performed
|
||
separately for each nibble (4-bit group). The function
|
||
checks if x4 DIMMs are present and, if so, prepares to train
|
||
both nibbles (lst. \ref{lst:x4_dimm_handling}). \\
|
||
|
||
\begin{listing}
|
||
\begin{adjustwidth}{0.5cm}{0.5cm}
|
||
\begin{minted}[linenos]{c}
|
||
train_both_nibbles = 0;
|
||
if (pDCTstat->Dimmx4Present)
|
||
if (is_fam15h())
|
||
train_both_nibbles = 1;
|
||
\end{minted}
|
||
\end{adjustwidth}
|
||
\caption{Handling of x4 DIMMs and nibble training,
|
||
extract from
|
||
\protect\path{AgesaHwWlPhase1} function in
|
||
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mhwlc_d.c}}
|
||
\label{lst:x4_dimm_handling}
|
||
\end{listing}
|
||
|
||
The DIMMs are prepared for write leveling by issuing Mode
|
||
Register (MR) commands. These commands configure the DIMMs
|
||
to enter a state where write leveling can be performed
|
||
(lst. \ref{lst:prepare_dimms}). \\
|
||
|
||
\begin{listing}
|
||
\begin{adjustwidth}{0.5cm}{0.5cm}
|
||
\begin{minted}[linenos]{c}
|
||
prepareDimms(pMCTstat, pDCTstat, dct, dimm, TRUE);
|
||
\end{minted}
|
||
\end{adjustwidth}
|
||
\caption{Preparing DIMMs for write leveling,
|
||
extract from
|
||
\protect\path{AgesaHwWlPhase1} function in
|
||
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mhwlc_d.c}}
|
||
\label{lst:prepare_dimms}
|
||
\end{listing}
|
||
|
||
The \path{procConfig} function is called to configure the
|
||
processor's DDR PHY (Physical Layer) for write leveling.
|
||
This configuration includes setting initial seed values for
|
||
gross and fine delays, which are essential for the
|
||
subsequent timing adjustments. \\
|
||
|
||
\path{procConfig} generates initial seed values
|
||
(lst. \ref{lst:seed_generation}) for gross
|
||
and fine delays. These seeds are calculated based on several
|
||
factors:
|
||
|
||
\begin{itemize}
|
||
\item \textbf{Processor Type:} For Fam15h processors,
|
||
specific tables from the Fam15h BKDG \cite{BKDG} are
|
||
referenced to select appropriate seed values for
|
||
different package types (e.g., Socket G34, Socket
|
||
C32).
|
||
\item \textbf{DIMM Type:} The seed values are adjusted
|
||
based on whether the RDIMMs are registered or
|
||
load-reduced, with different base values used for
|
||
these configurations.
|
||
\item \textbf{Memory Clock Frequency:} The seeds are
|
||
further adjusted based on the current memory clock
|
||
frequency (\path{MemClkFreq}), ensuring that the
|
||
timing is correct for the operating speed of the
|
||
memory.
|
||
\end{itemize}
|
||
|
||
The calculated seed values are then scaled to the minimum
|
||
supported memory frequency and stored in the
|
||
\path{WLSeedGrossDelay} and \path{WLSeedFineDelay} arrays
|
||
for each byte lane. \\
|
||
|
||
\begin{listing}
|
||
\begin{adjustwidth}{0.5cm}{0.5cm}
|
||
\begin{minted}[linenos]{c}
|
||
Seed_Total = (int32_t) (((((int64_t) Seed_Total) *
|
||
fam15h_freq_tab[MemClkFreq] * 100) / (mctGet_NVbits(NV_MIN_MEMCLK) * 100)));
|
||
|
||
Seed_Gross = (Seed_Total >> 5) & 0x1f;
|
||
Seed_Fine = Seed_Total & 0x1f;
|
||
\end{minted}
|
||
\end{adjustwidth}
|
||
\caption{Seed generation,
|
||
extract from
|
||
\protect\path{procConfig} function in
|
||
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mhwlc_d.c}}
|
||
\label{lst:seed_generation}
|
||
\end{listing}
|
||
|
||
Write leveling is initiated by enabling the
|
||
\path{WrtLvTrEn} bit. This allows the DDR PHY to begin
|
||
adjusting the DQS signals relative to the clock signals
|
||
(lst. \ref{lst:initiate_write_leveling}). \\
|
||
|
||
\begin{listing}
|
||
\begin{adjustwidth}{0.5cm}{0.5cm}
|
||
\begin{minted}[linenos]{c}
|
||
set_DCT_ADDR_Bits(pDCTData, dct, pDCTData->NodeId, FUN_DCT,
|
||
DRAM_ADD_DCT_PHY_CONTROL_REG, WrtLvTrEn, WrtLvTrEn, 1);
|
||
\end{minted}
|
||
\end{adjustwidth}
|
||
\caption{Initiating write leveling training,
|
||
extract from
|
||
\protect\path{AgesaHwWlPhase1} function in
|
||
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mhwlc_d.c}}
|
||
\label{lst:initiate_write_leveling}
|
||
\end{listing}
|
||
|
||
If the DIMM is not x4, the function skips the nibble
|
||
training loop, as it is unnecessary
|
||
(lst. \ref{lst:exit_non_x4}). \\
|
||
|
||
\begin{listing}
|
||
\begin{adjustwidth}{0.5cm}{0.5cm}
|
||
\begin{minted}[linenos]{c}
|
||
if ((pDCTstat->Dimmx4Present & (1 << (dimm + dct))) == 0)
|
||
break;
|
||
\end{minted}
|
||
\end{adjustwidth}
|
||
\caption{Exit for non-x4 DIMMs,
|
||
extract from
|
||
\protect\path{AgesaHwWlPhase2} function in
|
||
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mhwlc_d.c}}
|
||
\label{lst:exit_non_x4}
|
||
\end{listing}
|
||
|
||
After a delay to allow the leveling process to stabilize,
|
||
the function reads the gross and fine delay values from the
|
||
relevant registers and stores them
|
||
(lst. \ref{lst:finalize_write_leveling}). These values
|
||
represent the initial timing adjustments necessary for
|
||
correct DQS alignment. \\
|
||
|
||
\begin{listing}
|
||
\begin{adjustwidth}{0.5cm}{0.5cm}
|
||
\begin{minted}[linenos]{c}
|
||
for (ByteLane = 0; ByteLane < lane_count; ByteLane++) {
|
||
getWLByteDelay(pDCTstat, dct, ByteLane, dimm, pass, nibble, lane_count);
|
||
}
|
||
\end{minted}
|
||
\end{adjustwidth}
|
||
\caption{Reading and storing delay values after write leveling,
|
||
extract from
|
||
\protect\path{AgesaHwWlPhase2} function in
|
||
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mhwlc_d.c}}
|
||
\label{lst:finalize_write_leveling}
|
||
\end{listing}
|
||
|
||
\subsubsection{Details on the DQS position training function}
|
||
|
||
The DQS position training is a crucial step in the memory
|
||
initialization process, ensuring that both read and write
|
||
operations are correctly aligned with the clock signal. \\
|
||
|
||
The function \path{TrainDQSRdWrPos_D_Fam15} orchestrates this
|
||
process by iterating over memory lanes and adjusting timing
|
||
parameters to find optimal settings
|
||
(lst. \ref{lst:dqs_train_init}). It is called by
|
||
\path{mct_TrainDQSPos_D}. \\
|
||
|
||
The function begins by initializing several variables and
|
||
settings necessary for the training process. These include:
|
||
|
||
\begin{itemize}
|
||
\item \path{Errors}: A variable to track any errors
|
||
encountered during the training.
|
||
\item \path{dual_rank}: A flag to indicate whether the
|
||
current DIMM has two ranks.
|
||
\item \path{passing_dqs_delay_found}: An array to track
|
||
whether a passing DQS delay has been found for each lane.
|
||
\item \path{dqs_results_array}: A multi-dimensional array to
|
||
store the results of the DQS delay tests across
|
||
different write and read steps.
|
||
\end{itemize}
|
||
|
||
The function then loops over each receiver (loosely associated
|
||
with chip selects) to perform the training for each rank within
|
||
each DIMM. \\
|
||
|
||
\begin{listing}
|
||
\begin{adjustwidth}{0.5cm}{0.5cm}
|
||
\begin{minted}[linenos]{c}
|
||
for (Receiver = receiver_start; Receiver < receiver_end; Receiver++) {
|
||
dimm = (Receiver >> 1);
|
||
...
|
||
if (!mct_RcvrRankEnabled_D(pMCTstat, pDCTstat, dct, Receiver)) {
|
||
continue;
|
||
}
|
||
\end{minted}
|
||
\end{adjustwidth}
|
||
\caption{Looping over each receiver,
|
||
extract from
|
||
\protect\path{TrainDQSRdWrPos_D_Fam15} function in
|
||
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mctdqs_d.c}}
|
||
\label{lst:dqs_train_init}
|
||
\end{listing}
|
||
|
||
For each lane in the memory channel, the function iterates over
|
||
possible write and read delay values to find the optimal
|
||
configuration (lst. \ref{lst:dqs_train_iteration}).
|
||
This is done by:
|
||
|
||
\begin{enumerate}
|
||
\item Iterating over the write data delay values from the
|
||
initial value to the initial value plus 1 UI
|
||
(Unit Interval).
|
||
\item For each write data delay, iterating over possible
|
||
read DQS delay values from 0 to 1 UI.
|
||
\item For each combination of write and read delays, testing
|
||
the configuration by writing a training pattern to the
|
||
memory and reading it back to check if it passes or
|
||
fails.
|
||
\end{enumerate}
|
||
|
||
\begin{listing}
|
||
\begin{adjustwidth}{0.5cm}{0.5cm}
|
||
\begin{minted}[linenos]{c}
|
||
for (current_write_data_delay[lane] = initial_write_dqs_delay[lane];
|
||
current_write_data_delay[lane] < (initial_write_dqs_delay[lane] + 0x20);
|
||
current_write_data_delay[lane]++) {
|
||
...
|
||
for (current_read_dqs_delay[lane] = 0;
|
||
current_read_dqs_delay[lane] < 0x20;
|
||
current_read_dqs_delay[lane]++) {
|
||
...
|
||
write_dqs_read_data_timing_registers(
|
||
current_read_dqs_delay, dev, dct, dimm, index_reg);
|
||
read_dram_dqs_training_pattern_fam15(
|
||
pMCTstat, pDCTstat, dct, Receiver, lane, ((check_antiphase == 0)?1:0));
|
||
...
|
||
}
|
||
}
|
||
\end{minted}
|
||
\end{adjustwidth}
|
||
\caption{Iteration over write and read delay values for each lane,
|
||
extract from
|
||
\protect\path{TrainDQSRdWrPos_D_Fam15} function in
|
||
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mctdqs_d.c}}
|
||
\label{lst:dqs_train_iteration}
|
||
\end{listing}
|
||
|
||
During each iteration, the results are recorded in the
|
||
\path{dqs_results_array}, which tracks whether the combination
|
||
of write and read delays was successful (pass) or not (fail).
|
||
The results are stored for both the primary rank and, if
|
||
applicable, the secondary rank when dual rank DIMMs are used.
|
||
\\
|
||
|
||
After iterating over all possible delay values, the function
|
||
processes the results to determine the best DQS delay settings
|
||
(lst. \ref{lst:dqs_train_results}). \\
|
||
|
||
This is done by:
|
||
|
||
\begin{itemize}
|
||
\item Finding the longest consecutive string of passing
|
||
values for both read and write operations.
|
||
\item Calculating the center of the passing region and using
|
||
this as the optimal delay setting.
|
||
\item If the center of the region is below a threshold,
|
||
issuing a warning that a negative DQS recovery delay
|
||
was detected, which could lead to instability.
|
||
\end{itemize}
|
||
|
||
\begin{listing}
|
||
\begin{adjustwidth}{0.5cm}{0.5cm}
|
||
\begin{minted}[linenos]{c}
|
||
if (best_count > 2) {
|
||
uint16_t region_center = (best_pos + (best_count / 2));
|
||
if (region_center < 16) {
|
||
printk(BIOS_WARNING,
|
||
"TrainDQSRdWrPos: negative DQS recovery delay detected!");
|
||
region_center = 0;
|
||
} else {
|
||
region_center -= 16;
|
||
}
|
||
...
|
||
current_read_dqs_delay[lane] = region_center;
|
||
passing_dqs_delay_found[lane] = 1;
|
||
write_dqs_read_data_timing_registers(
|
||
current_read_dqs_delay, dev, dct, dimm, index_reg);
|
||
}
|
||
\end{minted}
|
||
\end{adjustwidth}
|
||
\caption{Processing the results to determine the best DQS delay settings,
|
||
extract from
|
||
\protect\path{TrainDQSRdWrPos_D_Fam15} function in
|
||
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mctdqs_d.c}}
|
||
\label{lst:dqs_train_results}
|
||
\end{listing}
|
||
|
||
Finally, the function checks if any lane did not find a valid
|
||
passing region (lst. \ref{lst:dqs_train_finalize}).
|
||
If any lanes failed to find a passing DQS delay,
|
||
the \path{Errors} flag is set, and this error is propagated
|
||
through the \path{pDCTstat->TrainErrors} and
|
||
\path{pDCTstat->ErrStatus} variables.
|
||
\\
|
||
|
||
The function returns \path{1} if no errors were encountered,
|
||
and \texttt{0} otherwise, which is unusual. \\
|
||
|
||
\begin{listing}
|
||
\begin{adjustwidth}{0.5cm}{0.5cm}
|
||
\begin{minted}[linenos]{c}
|
||
for (lane = lane_start; lane < lane_end; lane++) {
|
||
if (!passing_dqs_delay_found[lane]) {
|
||
Errors |= 1 << SB_NODQSPOS;
|
||
}
|
||
}
|
||
pDCTstat->TrainErrors |= Errors;
|
||
pDCTstat->ErrStatus |= Errors;
|
||
return !Errors;
|
||
\end{minted}
|
||
\end{adjustwidth}
|
||
\caption{Final error handling and return value,
|
||
extract from
|
||
\protect\path{TrainDQSRdWrPos_D_Fam15} function in
|
||
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mctdqs_d.c}}
|
||
\label{lst:dqs_train_finalize}
|
||
\end{listing}
|
||
|
||
\subsubsection{Details on the DQS receiver training function}
|
||
|
||
In AMD Fam15h G34 processors, the DQS receiver enable training
|
||
is a critical step in ensuring that the memory subsystem operates
|
||
correctly and reliably. This training aligns the DQS signal with
|
||
the clock signal, ensuring proper data capture during memory reads.
|
||
\\
|
||
|
||
The DQS receiver enable training algorithm is executed twice:
|
||
first at the lowest supported MEMCLK frequency and then at the
|
||
highest supported MEMCLK frequency. The purpose of this training
|
||
is to fine-tune the timing parameters so that the memory
|
||
controller can reliably read data from the memory modules.
|
||
The algorithm is implemented in the function
|
||
\path{dqsTrainRcvrEn_SW_Fam15} from
|
||
\path{src/northbridge/amd/amdmct/mct_ddr3/mctsrc.c}, which
|
||
orchestrates the
|
||
entire process, called by the \path{mct_TrainRcvrEn_D} function,
|
||
which has been called itself by \path{TrainReceiverEn_D} from
|
||
\path{src/northbridge/amd/amdmct/mct_ddr3/mctdqs_d.c}. \\
|
||
|
||
Here, seeds are initial delay values used to set
|
||
up the memory controller's timing parameters. These seeds are
|
||
generated based on the specific characteristics of the memory
|
||
configuration, such as the package type (e.g., G34, C32), the
|
||
type of DIMMs installed (Registered, Load Reduced, etc.), and
|
||
the maximum number of DIMMs that can be installed in a channel.
|
||
\\
|
||
|
||
The seed generation is handled by the function
|
||
\path{fam15_receiver_enable_training_seed}. This function
|
||
generates a base seed value for each memory channel, based on
|
||
predefined tables in the BKDG \cite{BKDG}. The base seed values
|
||
are specific to the memory configuration and are adjusted based
|
||
on the type of DIMM and the number of DIMMs in each channel. \\
|
||
|
||
The generated seed values are then adjusted
|
||
(lst. \ref{lst:seed_adjustment}) based on the
|
||
operating frequency of the memory (MEMCLK). The adjustment
|
||
scales the seed values to account for the difference between
|
||
the current memory frequency and the minimum supported
|
||
frequency. This ensures that the training can be accurately
|
||
performed across different operating conditions. \\
|
||
|
||
\begin{listing}
|
||
\begin{adjustwidth}{0.5cm}{0.5cm}
|
||
\begin{minted}[linenos]{c}
|
||
initial_seed = (uint16_t) (((((uint64_t) initial_seed) *
|
||
fam15h_freq_tab[mem_clk] * 100) / (min_mem_clk * 100)));
|
||
\end{minted}
|
||
\end{adjustwidth}
|
||
\caption{Adjusting the seed values based on the operating
|
||
frequency of the memory,
|
||
extract from
|
||
\protect\path{dqsTrainRcvrEn_SW_Fam15} function in
|
||
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mctsrc.c}}
|
||
\label{lst:seed_adjustment}
|
||
\end{listing}
|
||
|
||
Once the seeds are generated and adjusted, they are used to set
|
||
the initial delay values for the DQS receiver enable training
|
||
(lst. \ref{lst:initial_delay_values}).
|
||
The delay values are split into two components: gross delay and
|
||
fine delay. The gross delay determines the overall timing
|
||
offset, while the fine delay adjusts the timing with finer
|
||
granularity. \\
|
||
|
||
\begin{listing}
|
||
\begin{adjustwidth}{0.5cm}{0.5cm}
|
||
\begin{minted}[linenos]{c}
|
||
for (lane = 0; lane < lane_count; lane++) {
|
||
seed_gross[lane] = (seed[lane] >> 5) & 0x1f;
|
||
seed_fine[lane] = seed[lane] & 0x1f;
|
||
|
||
if (seed_gross[lane] & 0x1)
|
||
seed_pre_gross[lane] = 1;
|
||
else
|
||
seed_pre_gross[lane] = 2;
|
||
|
||
// Set the gross delay
|
||
current_total_delay[lane] = ((seed_gross[lane] & 0x1f) << 5);
|
||
}
|
||
\end{minted}
|
||
\end{adjustwidth}
|
||
\caption{Setting initial delay values based on the generated
|
||
seed values,
|
||
extract from
|
||
\protect\path{dqsTrainRcvrEn_SW_Fam15} function in
|
||
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mctsrc.c}}
|
||
\label{lst:initial_delay_values}
|
||
\end{listing}
|
||
|
||
These delay values are then written to the appropriate registers
|
||
to configure the memory controller for the DQS receiver enable
|
||
training. The training is performed in multiple steps,
|
||
iteratively refining the delay values until the DQS signal is
|
||
correctly aligned with the clock signal. \\
|
||
|
||
During the initialization phase, the memory controller is
|
||
prepared for training. This includes enabling the training mode,
|
||
configuring the memory channels, and disabling certain features
|
||
such as ECC (Error-Correcting Code) to prevent interference
|
||
during training (lst. \ref{lst:initialization_phase}). \\
|
||
|
||
\begin{listing}
|
||
\begin{adjustwidth}{0.5cm}{0.5cm}
|
||
\begin{minted}[linenos]{c}
|
||
fam15EnableTrainingMode(pMCTstat, pDCTstat, ch, 1);
|
||
_DisableDramECC = mct_DisableDimmEccEn_D(pMCTstat, pDCTstat);
|
||
\end{minted}
|
||
\end{adjustwidth}
|
||
\caption{Enabling training mode and disabling ECC,
|
||
extract from
|
||
\protect\path{dqsTrainRcvrEn_SW_Fam15} function in
|
||
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mctsrc.c}}
|
||
\label{lst:initialization_phase}
|
||
\end{listing}
|
||
|
||
The training phase is where the actual alignment of the DQS
|
||
signal occurs. The memory controller iterates over each DIMM and
|
||
each lane (lst. \ref{lst:training_phase}),
|
||
applying the seed values and adjusting the delay
|
||
registers accordingly. For each DIMM, the training is performed
|
||
twice: once for the first nibble (lower 4 bits) and once for
|
||
the second nibble (upper 4 bits) if the DIMM is x4. \\
|
||
|
||
\begin{listing}
|
||
\begin{adjustwidth}{0.5cm}{0.5cm}
|
||
\begin{minted}[linenos]{c}
|
||
for (rank = 0; rank < (_2Ranks + 1); rank++) {
|
||
for (nibble = 0; nibble < (train_both_nibbles + 1); nibble++) {
|
||
...
|
||
write_dqs_receiver_enable_control_registers(
|
||
current_total_delay, dev, Channel, dimm, index_reg);
|
||
...
|
||
}
|
||
}
|
||
\end{minted}
|
||
\end{adjustwidth}
|
||
\caption{Iterating over ranks and nibbles to apply delay values,
|
||
extract from
|
||
\protect\path{dqsTrainRcvrEn_SW_Fam15} function in
|
||
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mctsrc.c}}
|
||
\label{lst:training_phase}
|
||
\end{listing}
|
||
|
||
During the training, the controller issues read requests to the
|
||
memory to observe the timing of the DQS signal. The observed
|
||
delays are then averaged and adjusted to ensure the DQS signal
|
||
is correctly aligned across all lanes and ranks. \\
|
||
|
||
In the finalization phase, the memory controller exits the
|
||
training mode (lst. \ref{lst:finalization_phase}),
|
||
and the computed delay values are written back to
|
||
the appropriate registers. This ensures that the DQS signal
|
||
remains correctly aligned during normal operation. \\
|
||
|
||
\begin{listing}
|
||
\begin{adjustwidth}{0.5cm}{0.5cm}
|
||
\begin{minted}[linenos]{c}
|
||
Calc_SetMaxRdLatency_D_Fam15(pMCTstat, pDCTstat, 0, 0);
|
||
Calc_SetMaxRdLatency_D_Fam15(pMCTstat, pDCTstat, 1, 0);
|
||
if (Pass == FirstPass) {
|
||
mct_DisableDQSRcvEn_D(pDCTstat);
|
||
}
|
||
\end{minted}
|
||
\end{adjustwidth}
|
||
\caption{Exiting training mode and setting read latency,
|
||
extract from
|
||
\protect\path{dqsTrainRcvrEn_SW_Fam15} function in
|
||
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mctsrc.c}}
|
||
\label{lst:finalization_phase}
|
||
\end{listing}
|
||
|
||
\subsection{Potential enhancements}
|
||
|
||
\subsubsection{DQS receiver training}
|
||
|
||
While the DQS receiver enable training implementation for AMD
|
||
Fam15h G34 processors can perform its intended function in some
|
||
cases, there are several areas where the code is either
|
||
incomplete, suboptimal, or potentially problematic. \\
|
||
|
||
The presence of \path{TODO} comments in the code indicates areas
|
||
where the implementation is either incomplete or lacks certain
|
||
necessary functionality. These unaddressed tasks can lead to
|
||
performance issues, potential bugs, or incomplete training,
|
||
which could compromise the stability and reliability of the
|
||
memory subsystem. \\
|
||
|
||
In the seed adjustment section for the second pass of training,
|
||
the code includes a \path{TODO} comment regarding fetching the
|
||
correct value from \path{RC2[0]} for the \path{addr_prelaunch}
|
||
variable (lst. \ref{lst:todo_rc2}).
|
||
|
||
\begin{listing}
|
||
\begin{adjustwidth}{0.5cm}{0.5cm}
|
||
\begin{minted}[linenos]{c}
|
||
uint8_t addr_prelaunch = 0; /* TODO: Fetch the correct value from RC2[0] */
|
||
\end{minted}
|
||
\end{adjustwidth}
|
||
\caption{An unimplemented feature in the seed adjustment logic,
|
||
extract from
|
||
\protect\path{dqsTrainRcvrEn_SW_Fam15} function in
|
||
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mcrsrc.c}}
|
||
\label{lst:todo_rc2}
|
||
\end{listing}
|
||
|
||
This unimplemented feature suggests that the training process
|
||
may not be fully optimized, as the correct prelaunch address
|
||
setting is not being applied. This could result in incorrect
|
||
seed values being used during the training, leading to
|
||
suboptimal alignment of the DQS signal. Also, the comment
|
||
is unclear about what RC2[0] really means. \\
|
||
|
||
The code contains another \path{TODO} comment indicating that
|
||
the support for Load Reduced DIMMs (LRDIMMs) is unimplemented
|
||
(lst. \ref{lst:todo_lrdimm}).
|
||
|
||
\begin{listing}
|
||
\begin{adjustwidth}{0.5cm}{0.5cm}
|
||
\begin{minted}[linenos]{c}
|
||
else if ((pDCTstat->Status & (1 << SB_LoadReduced))) {
|
||
/* TODO
|
||
* Load reduced DIMM support unimplemented
|
||
*/
|
||
register_delay = 0x0;
|
||
}
|
||
\end{minted}
|
||
\end{adjustwidth}
|
||
\caption{LRDIMM support is
|
||
unimplemented,
|
||
extract from
|
||
\protect\path{dqsTrainRcvrEn_SW_Fam15} function in
|
||
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mcrsrc.c}}
|
||
\label{lst:todo_lrdimm}
|
||
\end{listing}
|
||
|
||
This omission is significant because LRDIMMs are commonly used
|
||
in server environments where high memory capacity is required.
|
||
The lack of support for LRDIMMs could lead to incorrect training
|
||
or even failures when such DIMMs are installed, severely
|
||
impacting the reliability of the system. \\
|
||
|
||
\path{FIXME} comments in the code are often indicators of known
|
||
issues or temporary workarounds that need to be addressed. In
|
||
this implementation, there are several such comments that
|
||
highlight critical areas where the current approach may be
|
||
flawed or incomplete. \\
|
||
|
||
The first \path{FIXME} comment questions the usage of the
|
||
\path{SSEDIS} setting during the training process
|
||
(lst. \ref{lst:fixme_ssedis}).
|
||
|
||
\begin{listing}
|
||
\begin{adjustwidth}{0.5cm}{0.5cm}
|
||
\begin{minted}[linenos]{c}
|
||
msr = HWCR_MSR;
|
||
_RDMSR(msr, &lo, &hi);
|
||
/* FIXME: Why use SSEDIS */
|
||
if (lo & (1 << 17)) { /* save the old value */
|
||
_Wrap32Dis = 1;
|
||
}
|
||
lo |= (1 << 17); /* HWCR.wrap32dis */
|
||
lo &= ~(1 << 15); /* SSEDIS */
|
||
_WRMSR(msr, lo, hi); /* Setting wrap32dis allows 64-bit memory
|
||
* references in real mode */
|
||
\end{minted}
|
||
\end{adjustwidth}
|
||
\caption{Questioning the use of
|
||
\texttt{SSEDIS} in the MSR setting,
|
||
extract from
|
||
\protect\path{dqsTrainRcvrEn_SW_Fam15} function in
|
||
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mcrsrc.c}}
|
||
\label{lst:fixme_ssedis}
|
||
\end{listing}
|
||
|
||
The concern here is that disabling the \path{SSEDIS}
|
||
(SSE Disable) bit could have unintended side effects,
|
||
particularly in environments where SSE instructions are
|
||
expected to be enabled. This could impact the performance of
|
||
the system during training and potentially lead to instability.
|
||
\\
|
||
|
||
The code also highlights a potential misprint in the BKDG
|
||
regarding the \path{WrDqDqsEarly} value
|
||
(lst. \ref{lst:fixme_misprint}).
|
||
|
||
\begin{listing}
|
||
\begin{adjustwidth}{0.5cm}{0.5cm}
|
||
\begin{minted}[linenos]{c}
|
||
/* NOTE: While the BKDG states to only program DqsRcvEnGrossDelay, this appears
|
||
* to have been a misprint as DqsRcvEnFineDelay should be set to zero as well.
|
||
*/
|
||
\end{minted}
|
||
\end{adjustwidth}
|
||
\caption{A possible misprint
|
||
in the BKDG regarding delay settings,
|
||
extract from
|
||
\protect\path{dqsTrainRcvrEn_SW_Fam15} function in
|
||
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mcrsrc.c}}
|
||
\label{lst:fixme_misprint}
|
||
\end{listing}
|
||
|
||
This indicates that the implementation may be based on incorrect
|
||
or incomplete documentation, leading to potential errors in
|
||
setting the delay values. If this is indeed a misprint in the
|
||
BKDG, the correction should be verified with updated
|
||
documentation, and the implementation should be adjusted
|
||
accordingly. \\
|
||
|
||
In addition to the explicit \path{TODO} and \path{FIXME}
|
||
comments, there are other aspects of the implementation that
|
||
could impact performance and stability. \\
|
||
|
||
The logic for adjusting the seed values based on the memory
|
||
frequency and the platform's minimum supported frequency is
|
||
complex and prone to errors
|
||
(lst. \ref{lst:seed_adjustment_logic}),
|
||
especially when combined with the
|
||
incomplete \path{TODO} features.
|
||
|
||
\begin{listing}
|
||
\begin{adjustwidth}{0.5cm}{0.5cm}
|
||
\begin{minted}[linenos]{c}
|
||
initial_seed = (uint16_t) (((((uint64_t) initial_seed) *
|
||
fam15h_freq_tab[mem_clk] * 100) / (min_mem_clk * 100)));
|
||
\end{minted}
|
||
\end{adjustwidth}
|
||
\caption{Complex seed adjustment logic,
|
||
extract from
|
||
\protect\path{dqsTrainRcvrEn_SW_Fam15} function in
|
||
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mcrsrc.c}}
|
||
\label{lst:seed_adjustment_logic}
|
||
\end{listing}
|
||
|
||
The risk here is that incorrect
|
||
seed values could be used, leading to timing mismatches during
|
||
the training process. \\
|
||
|
||
Added to that, stock seeds from the BKDG are used
|
||
(lst. \ref{lst:dqs_receiver_training_seeds}).
|
||
However, it seems that that seeds for used for DQS
|
||
training should be extensively determined for each motherboard,
|
||
and the BKDG \cite{BKDG} does not tell otherwise. Moreover,
|
||
seeds can be configured uniquely for every possible socket,
|
||
channel, DIMM module, and even byte lane combination. The current
|
||
implementation is here only using the recommended seeds from
|
||
the table 99 of the BKDG \cite{BKDG}, which is not sufficient
|
||
and absolutely not adapted to every DIMM module in the market.
|
||
\\
|
||
|
||
\begin{listing}
|
||
\begin{adjustwidth}{0.5cm}{0.5cm}
|
||
\begin{minted}[linenos]{c}
|
||
if (pDCTstat->Status & (1 << SB_Registered)) {
|
||
if (package_type == PT_GR) {
|
||
/* Socket G34: Fam15h BKDG v3.14 Table 99 */
|
||
if (MaxDimmsInstallable == 1) {
|
||
if (channel == 0)
|
||
seed = 0x43;
|
||
else if (channel == 1)
|
||
seed = 0x3f;
|
||
else if (channel == 2)
|
||
seed = 0x3a;
|
||
else if (channel == 3)
|
||
seed = 0x35;
|
||
} else if (MaxDimmsInstallable == 2) {
|
||
if (channel == 0)
|
||
seed = 0x54;
|
||
else if (channel == 1)
|
||
seed = 0x4d;
|
||
else if (channel == 2)
|
||
seed = 0x45;
|
||
else if (channel == 3)
|
||
seed = 0x40;
|
||
} else if (MaxDimmsInstallable == 3) {
|
||
if (channel == 0)
|
||
seed = 0x6b;
|
||
else if (channel == 1)
|
||
seed = 0x5e;
|
||
else if (channel == 2)
|
||
seed = 0x4b;
|
||
else if (channel == 3)
|
||
seed = 0x3d;
|
||
}
|
||
\end{minted}
|
||
\end{adjustwidth}
|
||
\caption{Seeds used for DQS Receiver training,
|
||
extract from
|
||
\protect\path{dqsTrainRcvrEn_SW_Fam15} function in
|
||
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mcrsrc.c}}
|
||
\label{lst:dqs_receiver_training_seeds}
|
||
\end{listing}
|
||
|
||
The current implementation also has limited error handling and
|
||
reporting. While some errors are detected during training, the
|
||
code does not have robust mechanisms for recovering from or
|
||
correcting these errors. \\
|
||
|
||
This approach might lead to further complications in high-load
|
||
scenarios or when the memory configuration changes, as the
|
||
underlying issues are not resolved. \\
|
||
|
||
\subsubsection{Write leveling}
|
||
|
||
While the current implementation of write leveling on AMD Fam15h
|
||
G34 processors with RDIMMs can be functional in some cases and
|
||
provides the necessary steps to align DQS signals correctly
|
||
during write operations, there are several areas where the
|
||
implementation is either incomplete, relies on temporary
|
||
workarounds, or may introduce stability and performance issues.
|
||
\\
|
||
|
||
One of the most significant concerns with the current
|
||
implementation is the presence of unresolved \path{TODO} and
|
||
\path{FIXME} comments throughout the code. These comments
|
||
indicate areas where the implementation is either incomplete or
|
||
has known issues that have not been fully resolved. \\
|
||
|
||
In the \path{procConfig} function, a \path{TODO} comment
|
||
mentions that the current implementation may not be using
|
||
the correct or final value for this variable, once again because
|
||
of a value from RC2[0] that isn't fetched, potentially
|
||
leading to inaccuracies in the seed values used during write
|
||
leveling (lst. \ref{lst:todo_seed_generation}).
|
||
This inaccuracy can result in timing mismatches, which
|
||
may cause data corruption or other stability issues. \\
|
||
|
||
\begin{listing}
|
||
\begin{adjustwidth}{0.5cm}{0.5cm}
|
||
\begin{minted}[linenos]{c}
|
||
uint8_t AddrCmdPrelaunch = 0; /* TODO: Fetch the correct value from RC2[0] */
|
||
\end{minted}
|
||
\end{adjustwidth}
|
||
\caption{Incomplete seed generation
|
||
implementation,
|
||
extract from
|
||
\protect\path{procConfig} function in
|
||
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mhwlc_d.c}}
|
||
\label{lst:todo_seed_generation}
|
||
\end{listing}
|
||
|
||
Another \path{FIXME} in the code indicates that the
|
||
\path{WrDqDqsEarly} parameter, which is critical for fine-tuning
|
||
the DQS signal’s timing during write operations, is being
|
||
ignored due to unresolved issues
|
||
(lst. \ref{lst:fixme_wrdqdqs_early}). This omission can result in
|
||
less accurate timing adjustments, leading to potential marginal
|
||
instability in systems where tight timing margins are critical.
|
||
\\
|
||
|
||
\begin{listing}
|
||
\begin{adjustwidth}{0.5cm}{0.5cm}
|
||
\begin{minted}[linenos]{c}
|
||
/* FIXME: Ignore WrDqDqsEarly for now to work around training issues */
|
||
\end{minted}
|
||
\end{adjustwidth}
|
||
\caption{Omission of
|
||
\texttt{WrDqDqsEarly} parameter,
|
||
extract from
|
||
\protect\path{procConfig} function in
|
||
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mhwlc_d.c}}
|
||
\label{lst:fixme_wrdqdqs_early}
|
||
\end{listing}}
|
||
|
||
The current implementation uses generic or "stock" seed values
|
||
for certain configurations, such as Socket G34
|
||
(lst. \ref{lst:fixme_mainboard_specific_overrides}). Without
|
||
mainboard-specific overrides, the memory initialization process
|
||
might not be fully optimized for the particular motherboard in
|
||
use. This could result in suboptimal performance or stability
|
||
issues in specific environments, particularly in server
|
||
applications where memory performance is critical. \\
|
||
|
||
\begin{listing}
|
||
\begin{adjustwidth}{0.5cm}{0.5cm}
|
||
\begin{minted}[linenos]{c}
|
||
/* FIXME: Implement mainboard-specific seed and WrDqsGrossDly base overrides.
|
||
* 0x41 and 0x0 are the "stock" values */
|
||
\end{minted}
|
||
\end{adjustwidth}
|
||
\caption{Lack of
|
||
mainboard-specific seed overrides,
|
||
extract from
|
||
\protect\path{procConfig} function in
|
||
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mhwlc_d.c}}
|
||
\label{lst:fixme_mainboard_specific_overrides}
|
||
\end{listing}
|
||
|
||
In \path{AgesaHwWlPhase2}, there is a \path{FIXME} comment that
|
||
suggests that the Critical Gross Delay adjustment has been
|
||
temporarily disabled due to conflicts with RDIMM training
|
||
(lst. \ref{lst:fixme_cgd_adjustment}).
|
||
Disabling this adjustment can lead to less precise DQS alignment,
|
||
especially in complex memory configurations like those using
|
||
RDIMMs, potentially causing instability or degraded performance.
|
||
\\
|
||
|
||
\begin{listing}
|
||
\begin{adjustwidth}{0.5cm}{0.5cm}
|
||
\begin{minted}[linenos]{c}
|
||
/* FIXME: For now, disable CGD adjustment as it seems to interfere with
|
||
* registered DIMM training */
|
||
\end{minted}
|
||
\end{adjustwidth}
|
||
\caption{Disabled CGD adjustment due
|
||
to conflicts,
|
||
extract from
|
||
\protect\path{AgesaHwWlPhase2} function in
|
||
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mhwlc_d.c}}
|
||
\label{lst:fixme_cgd_adjustment}
|
||
\end{listing}
|
||
|
||
The function also bypasses (lst. \ref{lst:fixme_bypass_critical_adjustments})
|
||
certain critical adjustments if the memory speed is being tuned (e.g.,
|
||
during frequency stepping). This bypass is noted as a temporary
|
||
measure due to problems encountered during testing, where the
|
||
first pass values were found to cause issues with PHY training
|
||
on all Family 15h processors tested. This approach indicates a
|
||
lack of robustness in the implementation, particularly in
|
||
handling dynamic changes in memory frequency, which is essential
|
||
for server environments where performance tuning is common. \\
|
||
|
||
\begin{listing}
|
||
\begin{adjustwidth}{0.5cm}{0.5cm}
|
||
\begin{minted}[linenos]{c}
|
||
/* FIXME: Using the Pass 1 training values causes major phy training problems on
|
||
* all Family 15h processors I tested (Pass 1 values are randomly too high,
|
||
* and Pass 2 cannot lock). Figure out why this is and fix it, then remove
|
||
* the bypass code below... */
|
||
\end{minted}
|
||
\end{adjustwidth}
|
||
\caption{Bypass of critical
|
||
adjustments during speed tuning,
|
||
extract from
|
||
\protect\path{AgesaHwWlPhase2} function in
|
||
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mhwlc_d.c}}
|
||
\label{lst:fixme_bypass_critical_adjustments}
|
||
\end{listing}
|
||
|
||
The current implementation attempts to compensate for noise and
|
||
instability by overriding faulty values with seed values in
|
||
\path{AgesaHwWlPhase2} (lst. \ref{lst:reactive_error_handling}).
|
||
However, this approach is somewhat blunt
|
||
and reactive, addressing the symptoms rather than the underlying
|
||
causes of instability. This method does not ensure that noise or
|
||
instability is sufficiently mitigated, potentially leading to
|
||
marginal or sporadic failures during normal operation. \\
|
||
|
||
\begin{listing}
|
||
\begin{adjustwidth}{0.5cm}{0.5cm}
|
||
\begin{minted}[linenos]{c}
|
||
if (faulty_value_detected) {
|
||
pDCTData->WLGrossDelay[index+ByteLane] =
|
||
pDCTData->WLSeedGrossDelay[index+ByteLane];
|
||
pDCTData->WLFineDelay[index+ByteLane] =
|
||
pDCTData->WLSeedFineDelay[index+ByteLane];
|
||
status = 1;
|
||
}
|
||
\end{minted}
|
||
\end{adjustwidth}
|
||
\caption{Blunt error handling to compensate for noise and
|
||
instability,
|
||
extract from
|
||
\protect\path{AgesaHwWlPhase2} function in
|
||
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mhwlc_d.c}}
|
||
\label{lst:reactive_error_handling}
|
||
\end{listing}
|
||
|
||
The handling of x4 DIMMs, with separate training for each nibble,
|
||
introduces additional complexity. While necessary for these
|
||
configurations, the logic is fragmented, with several points
|
||
where the function branches based on whether the DIMM is x4.
|
||
This complexity increases the risk of bugs or missed conditions,
|
||
particularly if future changes or enhancements are made to the
|
||
code. The overcomplicated logic can also make the code more
|
||
difficult to maintain and extend. \\
|
||
|
||
\subsubsection{DQS position training}
|
||
|
||
While the DQS position training algorithm implemented in the
|
||
\path{TrainDQSRdWrPos_D_Fam15} function may work in some
|
||
cased to ensure optimal data strobe alignment, there are
|
||
several critical flaws and issues within the implementation
|
||
that could impact its effectiveness and reliability. \\
|
||
|
||
Throughout the function, there is an overreliance on hardcoded
|
||
constants and magic numbers, such as:
|
||
|
||
\begin{itemize}
|
||
\item The use of \texttt{0x20} to represent 1 UI (Unit
|
||
Interval) in multiple places.
|
||
\item The constant \texttt{16} used in the adjustment of
|
||
\texttt{region\_center} during the processing of results.
|
||
\item Magic numbers like \texttt{32} and \texttt{48} in the
|
||
array dimensions for \texttt{dqs\_results\_array}.
|
||
\end{itemize}
|
||
|
||
These values should be replaced with named constants or
|
||
variables that clearly indicate their purpose, improving code
|
||
readability and maintainability. Additionally, using
|
||
well-defined constants would allow easier adjustments if the
|
||
algorithm needs to be adapted for different hardware
|
||
configurations or future revisions of the architecture. \\
|
||
|
||
The error handling within the function is rudimentary, with
|
||
errors being flagged primarily by setting bits in the
|
||
\texttt{Errors} variable. However, the function does not
|
||
provide detailed diagnostics or recovery strategies when an
|
||
error occurs. For example:
|
||
|
||
\begin{itemize}
|
||
\item If no passing DQS delay is found for a lane, the
|
||
function simply sets an error bit without attempting any
|
||
corrective actions or providing detailed information on
|
||
what went wrong.
|
||
\item The early abort mechanism based on the value read from
|
||
the \texttt{0x264} register does not offer a robust
|
||
fallback or retry mechanism, which could lead to
|
||
situations where minor, recoverable issues cause the
|
||
entire training process to fail.
|
||
\end{itemize}
|
||
|
||
Improving the error handling to include detailed diagnostics,
|
||
logging, and potentially corrective actions (such as retrying
|
||
the training with adjusted parameters) would make the function
|
||
more resilient and reliable. \\
|
||
|
||
The function contains several areas where the logic is more
|
||
complex than necessary, which can lead to difficulties in
|
||
understanding and maintaining the code. Examples include:
|
||
|
||
\begin{itemize}
|
||
\item The nested loops for iterating over write and read
|
||
delays are deeply nested, making it challenging to
|
||
follow the flow of the code and understand the
|
||
interactions between different parts of the algorithm.
|
||
\item The use of multiple copies of delay settings (e.g.,
|
||
\texttt{current\_write\_data\_delay},
|
||
\texttt{initial\_write\_data\_timing}, and
|
||
\texttt{initial\_write\_dqs\_delay}) introduces
|
||
redundancy and increases the likelihood of errors
|
||
or inconsistencies.
|
||
\end{itemize}
|
||
|
||
Refactoring the code to simplify the logic, reduce redundancy,
|
||
and make the flow of operations clearer would improve both the
|
||
readability and reliability of the implementation. \\
|
||
|
||
The current implementation does not adequately handle edge cases
|
||
and boundary conditions, such as:
|
||
|
||
\begin{itemize}
|
||
\item The warning issued when a negative DQS recovery delay
|
||
is detected suggests that the function continues despite
|
||
recognizing a potentially critical issue, which could
|
||
lead to system instability
|
||
(lst. \ref{lst:dqs_train_negative_delay}).
|
||
\item The averaging of delay values for dual-rank DIMMs does
|
||
not account for the possibility of significant
|
||
discrepancies between the ranks, which could result in
|
||
suboptimal or unstable settings.
|
||
\item The function does not include comprehensive checks for
|
||
situations where the calculated delay settings might
|
||
exceed hardware limitations or cause timing violations.
|
||
\end{itemize}
|
||
|
||
|
||
\begin{listing}
|
||
\begin{adjustwidth}{0.5cm}{0.5cm}
|
||
\begin{minted}[linenos]{c}
|
||
if (best_count > 2) {
|
||
uint16_t region_center = (best_pos + (best_count / 2));
|
||
if (region_center < 16) {
|
||
printk(BIOS_WARNING,
|
||
"TrainDQSRdWrPos: negative DQS recovery delay detected!");
|
||
region_center = 0;
|
||
} else {
|
||
region_center -= 16;
|
||
}
|
||
...
|
||
current_read_dqs_delay[lane] = region_center;
|
||
passing_dqs_delay_found[lane] = 1;
|
||
write_dqs_read_data_timing_registers(current_read_dqs_delay, dev, dct, dimm, index_reg);
|
||
}
|
||
\end{minted}
|
||
\end{adjustwidth}
|
||
\caption{Allowing a negative DQS recovery delay measurement,
|
||
extract from
|
||
\protect\path{TrainDQSRdWrPos_D_Fam15} function in
|
||
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mctdqs_d.c}}
|
||
\label{lst:dqs_train_negative_delay}
|
||
\end{listing}
|
||
|
||
Improving the handling of edge cases and boundary conditions,
|
||
possibly by incorporating more robust validation checks and
|
||
conservative fallback mechanisms, would make the algorithm more
|
||
reliable in a wider range of scenarios. \\
|
||
|
||
The code contains several \texttt{TODO} and \texttt{FIXME}
|
||
comments that indicate incomplete or problematic parts of
|
||
the implementation:
|
||
|
||
\begin{itemize}
|
||
\item The comment \texttt{TODO: Fetch the correct value
|
||
from RC2[0]} suggests that critical configuration values
|
||
are not correctly initialized, which could compromise
|
||
the entire training process.
|
||
\item The \texttt{FIXME} comments related to early abort
|
||
checks and DQS recovery delay calculations indicate that
|
||
there are known issues with the current approach that
|
||
have not been resolved, potentially leading to incorrect
|
||
or unstable results.
|
||
\item The handling of antiphase results, particularly with
|
||
respect to checking for early aborts, is incomplete and
|
||
could lead to situations where incorrect results are
|
||
accepted without proper validation.
|
||
\end{itemize}
|
||
|
||
The current implementation's approach to iterating over every
|
||
possible combination of write and read delays is exhaustive but
|
||
may be inefficient. The function performs multiple reads and
|
||
writes to hardware registers for every iteration, which could
|
||
be time-consuming, especially on systems with a large number
|
||
of lanes or complex memory configurations. \\
|
||
|
||
Consideration should be given to optimizing the algorithm,
|
||
possibly by narrowing the search space based on prior knowledge
|
||
or implementing more efficient search techniques, to reduce
|
||
the time required for DQS position training without compromising
|
||
accuracy. \\
|
||
|
||
\subsubsection{On saving training values in NVRAM}
|
||
|
||
The function \path{mctAutoInitMCT_D} is responsible for
|
||
automatically initializing the memory controller training (MCT)
|
||
process, which involves configuring various memory parameters
|
||
and performing training routines to ensure stable and efficient
|
||
memory operation. However, the fact that
|
||
\path{mctAutoInitMCT\_D} does not allow for the restoration of
|
||
training data from NVRAM (lst. \ref{lst:mctAutoInitMCT_D_fixme})
|
||
poses several significant problems. \\
|
||
|
||
Memory training is a time-consuming process that involves
|
||
multiple iterations of read/write operations, delay adjustments,
|
||
and calibration steps. By not restoring previously saved
|
||
training data from NVRAM, the system is forced to re-run the
|
||
full training sequence every time it boots up. This leads to
|
||
longer boot times, which can be particularly problematic in
|
||
environments where quick system restarts are critical, such
|
||
as in servers or embedded systems. \\
|
||
|
||
Each time memory training is performed, it puts additional
|
||
stress on the memory modules and the memory controller.
|
||
Repeatedly executing the training process at every boot can
|
||
contribute to the wear and tear of hardware components,
|
||
potentially reducing their lifespan. This issue is especially
|
||
concerning in systems that frequently power cycle or reboot. \\
|
||
|
||
Memory training is sensitive to various factors, such as
|
||
temperature, voltage, and load conditions. As a result, the
|
||
training results can vary slightly between different boot
|
||
cycles. Without the ability to restore previously validated
|
||
training data, there is a risk of inconsistency in memory
|
||
performance across reboots. This could lead to instability
|
||
or suboptimal memory operation, affecting the overall
|
||
performance of the system. \\
|
||
|
||
If the memory training process fails during boot, the system
|
||
may be unable to operate properly or may fail to boot entirely.
|
||
By restoring validated training data from NVRAM, the system
|
||
can bypass the training process altogether, reducing the risk
|
||
of boot failures caused by training issues. Without this
|
||
feature, any minor issue that affects training could result
|
||
in system downtime. \\
|
||
|
||
Finally, modern memory controllers often include power-saving
|
||
features that are fine-tuned during the training process. By
|
||
reusing validated training data from NVRAM, the system can
|
||
quickly return to an optimized state with lower power
|
||
consumption.
|
||
The inability to restore this data forces the system to
|
||
operate at a potentially less efficient state until training
|
||
is complete, leading to higher power consumption during the
|
||
boot process. \\
|
||
|
||
\subsubsection{A seedless DQS position training algorithm}
|
||
|
||
An algorithm to find the best timing for the DQS so that the
|
||
memory controller can reliably read data from the memory
|
||
could be done without relying on any pre-known starting
|
||
values (seeds). This would allow for better reliability and
|
||
wider support for different situations. The algorithm
|
||
could be describe as follows. \\
|
||
|
||
\begin{itemize}
|
||
\item Prepare Memory Controller:
|
||
The memory controller needs to be in a state where it can
|
||
safely adjust the DQS timing without affecting the normal
|
||
operation of the system. By blocking the DQS signal locking,
|
||
we ensure that the adjustments made during training do not
|
||
interfere with the controller’s ability to capture data
|
||
until the optimal settings are found.
|
||
|
||
\item Initialize Variables:
|
||
Set up variables to store the various timing settings and
|
||
test results for each bytelane. This setup is crucial
|
||
because each bytelane might require a different optimal
|
||
timing, and keeping track of these values ensures that the
|
||
algorithm can correctly determine the best delay settings
|
||
later.
|
||
\end{itemize}
|
||
|
||
The main loop is the core of the algorithm, where different
|
||
timing settings are systematically explored. By looping
|
||
through possible delay settings, the algorithm ensures
|
||
that it doesn't miss any potential optimal timings. The
|
||
loop structure allows a methodical test of a range of
|
||
delays to find the most reliable one. \\
|
||
|
||
The gross delay is here the coarse adjustment to the timing
|
||
of the DQS signal. It shifts the timing window by a large
|
||
amount, helping to broadly align the DQS with the data
|
||
lines (DQ). The fine delay, which is the smaller, more
|
||
precise change to the timing of the DQS signal once the
|
||
coarse alignment (through gross delay) has been achieved,
|
||
would then be computed. \\
|
||
|
||
To compute a delay, here would be the steps:
|
||
|
||
\begin{itemize}
|
||
\item Set a delay:
|
||
Setting an initial delay allows the algorithm to start
|
||
testing. The initial delay might be zero or another default
|
||
value, providing a baseline from which to begin the search
|
||
for the optimal timing.
|
||
|
||
\item Test it:
|
||
After setting the delay, it is essential to test whether the
|
||
memory controller can read data correctly. This step is
|
||
critical because it indicates whether the current delay
|
||
setting is within the acceptable range for reliable data
|
||
capture.
|
||
|
||
\item Check the result:
|
||
If the memory controller successfully reads data, it means
|
||
the current delay setting is valid. This information is
|
||
crucial because it helps define the range of acceptable
|
||
timings. If the test fails, it indicates that the curren
|
||
t delay setting is outside the range where the memory
|
||
controller can reliably capture data.
|
||
|
||
\item Increase/decrease delay:
|
||
By incrementally adjusting the delay, either increasing or
|
||
decreasing, the algorithm can explore different timing
|
||
settings in a controlled manner. This ensures that the
|
||
entire range of possible delays is covered without skipping
|
||
over any potential good delays.
|
||
|
||
\item Test again:
|
||
Re-testing after each adjustment ensures that the exact
|
||
point where the DQS timing goes from acceptable (pass) to
|
||
unacceptable (fail) is caught. This step helps in
|
||
identifying the transition point, which is often the optimal
|
||
place to set the DQS delay.
|
||
|
||
\item Look for a transition:
|
||
The transition from pass to fail is where the DQS timing
|
||
crosses the boundary of the valid timing window. This
|
||
transition is crucial because it marks the end of the
|
||
reliable range. The best timing is usually just before
|
||
this transition.
|
||
|
||
\item Record the best setting:
|
||
Storing the best delay setting for each bytelane ensures
|
||
that a reliable timing configuration is available when the
|
||
training is complete.
|
||
|
||
\item Confirm all bytelanes:
|
||
Before finalizing the settings, it is important to ensure
|
||
that the chosen delays work for all bytelanes. This step
|
||
serves as a final safeguard against errors, ensuring that
|
||
every part of the data bus is correctly aligned.
|
||
\end{itemize}
|
||
|
||
Each bytelane (8-bit segment of data) may require a
|
||
different optimal delay setting. By repeating the process
|
||
for all bytelanes, the algorithm ensures that the entire
|
||
data bus is correctly timed. Misalignment in even one
|
||
bytelane can lead to data errors, making it essential to
|
||
tune every bytelane individually. \\
|
||
|
||
Once the best settings are confirmed, they need to be
|
||
applied to the memory controller for use during normal
|
||
operation. This step locks in the most reliable timing
|
||
configuration found during the training process. \\
|
||
|
||
After the optimal settings are applied, it is necessary
|
||
to allow the DQS signal locking mechanism to resume. This
|
||
locks in the delay settings, ensuring stable operation going
|
||
forward. \\
|
||
|
||
Finally, the algorithm needs to indicate whether it was
|
||
successful in finding reliable timing settings for all
|
||
bytelanes. This feedback is crucial for determining whether
|
||
the memory system is correctly configured or if further
|
||
adjustments or troubleshooting are needed. \\
|
||
|
||
% ------------------------------------------------------------------------------
|
||
% CHAPTER 5: Virtualization of the operating system through firmware abstraction
|
||
% ------------------------------------------------------------------------------
|
||
\chapter{Virtualization of the operating system through firmware abstraction}
|
||
|
||
In contemporary computing systems, the operating system (OS) no longer
|
||
interacts directly with hardware in the same way it did in earlier computing
|
||
architectures. Instead, the OS operates within a highly abstracted
|
||
environment, where critical functions are managed by various firmware
|
||
components such as ACPI, SMM, UEFI, Intel Management Engine (ME), and AMD
|
||
Platform Security Processor (PSP). This layered abstraction has led to the
|
||
argument that the OS is effectively running in a virtualized environment,
|
||
akin to a virtual machine (VM).
|
||
|
||
\section{ACPI and abstraction of hardware control}
|
||
|
||
The Advanced Configuration and Power Interface (ACPI) provides a
|
||
standardized method for the OS to manage hardware configuration and
|
||
power states, effectively abstracting the underlying hardware
|
||
complexities. ACPI abstracts hardware details, allowing the OS to
|
||
interact with hardware components without needing direct control over
|
||
them. This abstraction is similar to how a hypervisor abstracts physical
|
||
hardware for VMs, enabling a consistent interface regardless of the
|
||
underlying hardware specifics. \\
|
||
|
||
According to \textcite{bellosa2010}, the abstraction provided by ACPI
|
||
not only simplifies the OS's interaction with hardware but also limits
|
||
the OS's ability to fully control the hardware, which is instead managed
|
||
by ACPI-compliant firmware. This layer of abstraction contributes to the
|
||
virtualization-like environment in which the OS operates. \\
|
||
|
||
More importantly, the ACPI Component Architecture (ACPICA) is a critical
|
||
component integrated into the Linux kernel, serving as the foundation
|
||
for the system's ACPI implementation \cite{intel_acpi_programming_2023}.
|
||
ACPICA provides the core ACPI functionalities, such as hardware
|
||
configuration, power management, and thermal management, which are
|
||
essential for modern computing platforms. However, its integration into
|
||
the Linux kernel has brought significant complexity and code overhead,
|
||
making Linux heavily dependent on ACPICA for managing ACPI-related
|
||
tasks.
|
||
|
||
ACPICA is a large and complex project, with its codebase encompassing
|
||
a wide range of functionalities required to implement ACPI standards.
|
||
The integration of ACPICA into the Linux kernel significantly increases
|
||
the kernel's overall code size. An example of that can easily be
|
||
reproduced with a small experiment (lst. \ref{lst:acpica_in_linux}).
|
||
|
||
\begin{listing}[H]
|
||
\begin{adjustwidth}{0.5cm}{0.5cm}
|
||
\inputminted{sh}{listings/acpica_size.sh}
|
||
\end{adjustwidth}
|
||
\caption{How to estimate the impact of ACPICA in Linux}
|
||
\label{lst:acpica_in_linux}
|
||
\end{listing}
|
||
|
||
As of recent statistics, ACPICA comprises between 100,000 to 200,000
|
||
lines of code, making it one of the larger subsystems within the Linux
|
||
kernel. This size is indicative of the extensive range of features
|
||
and capabilities ACPICA must support, including but not limited to the
|
||
ACPI interpreter, AML (ACPI Machine Language) parser, and various
|
||
hardware-specific drivers. The ACPICA codebase is not monolithic; it is
|
||
highly modular and consists of various components, each responsible for
|
||
specific ACPI functions. For instance, ACPICA includes components for
|
||
managing ACPI tables, interpreting AML bytecode, handling events, and
|
||
interacting with hardware. This modularity, while beneficial for
|
||
isolating different functionalities, also contributes to the overall
|
||
complexity of the system. The separation of ACPICA into multiple modules
|
||
necessitates careful coordination and integration with the rest of the
|
||
Linux kernel, adding to the kernel's complexity. \\
|
||
|
||
ACPICA's integration into the Linux kernel is designed to maintain a
|
||
clear separation between the core ACPI functionalities and the kernel's
|
||
other subsystems \cite{intel_acpi_programming_2023}. This separation is
|
||
achieved through well-defined interfaces and abstraction layers,
|
||
allowing the Linux kernel to interact with ACPICA without being tightly
|
||
coupled to its internal implementation details. For example, ACPICA
|
||
provides an API that the Linux kernel can use to interact with ACPI
|
||
tables, execute ACPI methods, and manage power states. This API
|
||
abstracts the underlying complexity of the ACPI implementation, making
|
||
it easier for kernel developers to incorporate ACPI support without
|
||
delving into the intricacies of ACPICA's internals.
|
||
Moreover, ACPICA's role in interpreting AML bytecode, which is
|
||
essentially a form of low-level programming language embedded in ACPI
|
||
tables, adds a layer of abstraction. The Linux kernel relies on ACPICA
|
||
to execute AML methods and manage hardware resources according to the
|
||
ACPI specifications. This reliance further underscores the idea that
|
||
ACPI acts as a virtualizing environment, shielding the kernel from
|
||
the complexities of directly interfacing with hardware components.
|
||
|
||
\section{SMM as a hidden execution layer}
|
||
|
||
System Management Mode (SMM) is a special-purpose operating mode
|
||
provided by x86 processors, designed to handle system-wide functions
|
||
such as power management, thermal monitoring, and hardware control,
|
||
independent of the OS. SMM operates transparently to the OS, executing
|
||
code that the OS cannot detect or control, similar to how a hypervisor
|
||
controls the execution environment of VMs. \\
|
||
|
||
Research by \textcite{huang2009invisible} argues that SMM introduces a
|
||
hidden layer of execution that diminishes the OS's control over the
|
||
hardware, creating a virtualized environment where the OS is unaware of
|
||
and unable to influence certain system-level operations. This hidden
|
||
execution layer reinforces the idea that the OS runs in an environment
|
||
similar to a VM, with the firmware acting as a hypervisor. \\
|
||
|
||
\section{UEFI and persistence}
|
||
|
||
The Unified Extensible Firmware Interface (UEFI) has largely replaced
|
||
the traditional BIOS in modern systems, providing a sophisticated
|
||
environment that includes a kernel-like structure capable of running
|
||
drivers and applications independently of the OS. UEFI remains active
|
||
even after the OS has booted, continuing to manage certain hardware
|
||
functions, which abstracts these functions away from the OS. \\
|
||
|
||
\textcite{mcclean2017uefi} discusses how UEFI creates a persistent
|
||
execution environment that overlaps with the OS's operation, effectively
|
||
placing the OS in a position where it runs on top of another controlling
|
||
layer, much like a guest OS in a VM. This persistence and the ability of
|
||
UEFI to manage hardware resources independently further blur the lines
|
||
between traditional OS operation and virtualized environments.
|
||
Indeed, as we studied in a precedent chapter, UEFI is designed as a
|
||
modular and extensible firmware interface that sits between the
|
||
computer's hardware and the operating system. Unlike the monolithic
|
||
BIOS, UEFI is composed of several layers and components, each
|
||
responsible for different aspects of the system's boot and runtime
|
||
processes. The core components of UEFI include the Pre-EFI
|
||
Initialization (PEI), Driver Execution Environment (DXE),
|
||
Boot Device Selection (BDS), and Runtime Services. Each of these
|
||
components plays a critical role in initializing the hardware,
|
||
managing drivers, selecting boot devices, and providing runtime
|
||
services to the OS. \\
|
||
|
||
The PEI (Pre-EFI Initialization) phase is responsible for initializing
|
||
the CPU, memory, and other essential hardware components. It ensures
|
||
that the system is in a stable state before handing control to the
|
||
DXE phase. In the DXE phase, the system loads and initializes various
|
||
drivers required for the OS to interact with the hardware. The DXE phase
|
||
also constructs the UEFI Boot Services, which provide the OS with
|
||
interfaces to the hardware during the boot process. The BDS (Boot Device
|
||
Selection) phase is responsible for selecting the device from which the
|
||
OS will boot. It interacts with the UEFI Boot Manager to determine the
|
||
correct boot path and load the OS. After the OS has booted, UEFI
|
||
provides Runtime Services that remain accessible to the OS. These
|
||
services include interfaces for managing system variables, time, and
|
||
hardware. UEFI also supports the execution of standalone applications,
|
||
which can be used for system diagnostics, firmware updates, or other
|
||
tasks. These applications operate independently of the OS, highlighting
|
||
UEFI's capabilities as a minimalistic OS. \\
|
||
|
||
UEFI abstracts the underlying hardware from the OS, providing a
|
||
standardized interface for the OS to interact with different hardware
|
||
components. This abstraction simplifies the development of OSes and
|
||
drivers, as they do not need to be tailored for specific hardware
|
||
configurations. UEFI's hardware abstraction is one of the key features
|
||
that enable it to act as a virtualizing environment for the OS
|
||
\cite{mcclean2017uefi}.
|
||
|
||
\subsection{Memory Management}
|
||
|
||
UEFI provides a detailed memory map to the OS during the boot process,
|
||
which includes information about available, reserved, and used memory
|
||
regions. The OS uses this memory map to manage its own memory allocation
|
||
and paging mechanisms. The overlap in memory management functions
|
||
highlights UEFI's role in preparing the system for OS operation.
|
||
This memory map includes all the memory regions in the system,
|
||
categorized into different types, such as usable memory, reserved
|
||
memory, and memory-mapped I/O. The OS relies on this map to understand
|
||
the system's memory layout and avoid conflicts \cite{osdev_uefi_memory}.
|
||
The OS extends UEFI's memory
|
||
management by implementing its own memory allocation, paging, and
|
||
virtual memory mechanisms. However, the OS's memory management is
|
||
built on the foundation provided by UEFI, demonstrating the close
|
||
relationship between the two.
|
||
|
||
\subsection{File System Management}
|
||
|
||
UEFI includes its own file system management capabilities, which overlap
|
||
with those of the OS. The most notable example is the EFI System
|
||
Partition (ESP), a special partition formatted with the FAT file system
|
||
that UEFI uses to store bootloaders, drivers, and other critical files
|
||
\cite{uefi_spec}. The ESP is a mandatory partition in UEFI systems,
|
||
containing the bootloaders, firmware updates, and other files
|
||
necessary for system initialization. UEFI accesses the ESP
|
||
independently of the OS, but the OS can also access and manage files
|
||
on the ESP, creating an overlap in file system management functions
|
||
\cite{uefi_smm_security}. UEFI natively supports the FAT file
|
||
system, allowing it to read and write files on the ESP. This support
|
||
overlaps with the OS's file system management, as both UEFI and the
|
||
OS can manipulate files on the ESP.
|
||
|
||
\subsection{Device Drivers}
|
||
|
||
As we studied in an earlier chapter, UEFI includes its own driver
|
||
model, allowing it to load and execute drivers independently of the
|
||
OS. This capability overlaps with the OS's driver management
|
||
functions, as both UEFI and the OS manage hardware devices through
|
||
drivers.
|
||
UEFI drivers are typically used during
|
||
the boot process to initialize and control hardware devices. These
|
||
drivers provide the necessary interfaces for the OS to interact with
|
||
the hardware once it has booted \cite{uefi_smm_security}.
|
||
After the OS has booted, it loads its own drivers for hardware
|
||
devices. However, the OS often relies on the initial hardware setup
|
||
performed by UEFI drivers.
|
||
|
||
\subsection{Power Management}
|
||
|
||
UEFI provides power management services that overlap with the OS's
|
||
power management functions. These services allow UEFI to manage
|
||
power states and transitions independently of the OS \cite{uefi_spec}.
|
||
These services ensure that the system conserves power during periods
|
||
of inactivity and can quickly resume operation when needed
|
||
The OS extends UEFI's power management by implementing its own
|
||
power-saving mechanisms, such as CPU throttling and dynamic voltage
|
||
scaling.
|
||
|
||
\section{Intel and AMD: control beyond the OS}
|
||
|
||
Intel Management Engine (ME) and AMD Platform Security Processor (PSP)
|
||
are embedded microcontrollers within Intel and AMD processors,
|
||
respectively. These components run their own firmware and operate
|
||
independently of the main CPU, handling tasks such as security
|
||
enforcement, remote management, and digital rights management (DRM). \\
|
||
|
||
\textcite{bulygin2013chipset} highlights how these microcontrollers have
|
||
control over the system that supersedes the OS, managing hardware and
|
||
security functions without the OS's knowledge or consent. This level of
|
||
control is reminiscent of a hypervisor that manages the resources and
|
||
security of VMs. The OS, in this context, operates similarly to a VM
|
||
that does not have full control over the hardware it ostensibly manages. \\
|
||
|
||
\section{Processors microcode}
|
||
|
||
Modern CPUs are incredibly complex, with their functionality relying
|
||
heavily on microcode to interpret and execute instructions. Microcode
|
||
acts as a translation layer between the high-level instructions that
|
||
software provides and the lower-level operations that the hardware
|
||
can execute. Microcode operates directly within the CPU. \\
|
||
|
||
CPU microcode is a set of low-level firmware instructions embedded
|
||
within the processor. It translates complex machine instructions into
|
||
simpler, executable sequences of operations that the CPU's hardware
|
||
can directly perform \cite{Intel2018}. This layer of abstraction allows
|
||
CPU manufacturers to update or patch the behavior of the processor
|
||
post-manufacturing, which is crucial for addressing bugs, optimizing
|
||
performance, and applying security patches \cite{Wilcox2018}.
|
||
|
||
In a sense, microcode can be seen as an argument for the CPU running
|
||
a form of low-level virtual machine. Just as a VM abstracts and manages
|
||
hardware resources for a guest OS, microcode abstracts and manages the
|
||
complexity of CPU hardware for machine-level instructions. This
|
||
virtualization enables the CPU to support a wide variety of instructions
|
||
and operational modes without needing to change the underlying hardware
|
||
\cite{Abraham1983}.
|
||
|
||
\section{The OS as a virtualized environment}
|
||
|
||
The combined effect of these firmware components (ACPI, SMM, UEFI,
|
||
Intel ME, and AMD PSP) creates an environment where the OS operates in
|
||
a virtualized or highly abstracted layer. The OS does not directly
|
||
manage the hardware; instead, it interfaces with these firmware
|
||
components, which themselves control the hardware resources. This
|
||
situation is analogous to a virtual machine, where the guest OS
|
||
operates on virtualized hardware managed by a hypervisor. \\
|
||
|
||
\textcite{smith2019firmware} argues that modern OS environments,
|
||
influenced by these firmware components, should be considered
|
||
virtualized environments. The firmware acts as an intermediary layer
|
||
that abstracts and controls hardware resources, thereby limiting the
|
||
OS's direct access and control. \\
|
||
|
||
The presence and operation of modern firmware components such as ACPI,
|
||
SMM, UEFI, Intel ME, and AMD PSP and even CPU microcode contribute to
|
||
a significant abstraction of hardware from the OS.
|
||
This abstraction creates an environment that
|
||
parallels the operation of a virtual machine, where the OS functions
|
||
within a controlled, virtualized layer managed by these firmware
|
||
systems. The growing body of research supports this perspective,
|
||
suggesting that the traditional notion of an OS directly managing
|
||
hardware is increasingly outdated in the face of these complex,
|
||
autonomous firmware components.
|
||
|
||
\chapter*{Conclusion}
|
||
\addcontentsline{toc}{chapter}{Conclusion}
|
||
|
||
This document has explored the evolution and current state of firmware,
|
||
particularly focusing on the transition from traditional BIOS to more
|
||
advanced firmware interfaces such as UEFI and \textit{coreboot}. The
|
||
evolution from a simple set of routines stored in ROM to complex systems
|
||
like UEFI and \textit{coreboot} highlights the growing importance of
|
||
firmware in modern computing.
|
||
|
||
Firmware now plays a critical role not
|
||
only in hardware initialization but also in memory management, security,
|
||
and system performance optimization. \\
|
||
|
||
The study of the ASUS KGPE-D16 mainboard illustrates how firmware,
|
||
particularly \textit{coreboot}, plays a crucial role in the efficient
|
||
and secure operation of high-performance systems. The KGPE-D16, with its
|
||
support for free software-compatible firmware, exemplifies the potential
|
||
of libre firmware to deliver both high performance and freedom from
|
||
proprietary constraints. However, it is important to acknowledge that
|
||
the KGPE-D16 is not without its imperfections. The detailed analysis of
|
||
firmware components, such as the bootblock, romstage, and especially the
|
||
RAM initialization and training algorithms, reveals areas where the
|
||
firmware can be further refined to enhance system stability and
|
||
performance. These improvements are not only beneficial for the KGPE-D16
|
||
but can also be applied to other boards, extending the impact of these
|
||
optimizations across a broader range of hardware. \\
|
||
|
||
Moreover, the discussion on modern firmware components such as ACPI,
|
||
SMM, UEFI, Intel ME, and AMD PSP demonstrates how these elements
|
||
abstract hardware from the operating system, creating a virtualized
|
||
environment where the OS operates more like a guest in a
|
||
hypervisor-controlled system. This abstraction raises important
|
||
considerations about control, security, and user freedom in contemporary
|
||
computing.
|
||
As we continue to witness the increasing complexity and influence of
|
||
firmware in computing, it becomes crucial to advocate for free
|
||
software-compatible hardware. The dependence on proprietary firmware and
|
||
the associated restrictions on user freedom are growing concerns that
|
||
need to be addressed. The development and adoption of libre firmware
|
||
solutions, such as \textit{coreboot} and GNU Boot, are essential steps
|
||
towards ensuring that users retain control over their hardware and
|
||
software environments. \\
|
||
|
||
It is imperative that the community of developers, researchers, and
|
||
users come together to support and contribute to the development of
|
||
free firmware. By fostering innovation and collaboration in this field,
|
||
we can advance towards a future where free software-compatible hardware
|
||
becomes the norm, ensuring that computing remains open, secure, and
|
||
under the control of its users. The significance of a libre BIOS cannot
|
||
be overstated, it is the foundation upon which a truly free and open
|
||
computing ecosystem can be built \cite{coreboot_fsf}.
|
||
The importance of the GNU Boot project cannot be
|
||
overstated. As a fully free firmware initiative, GNU Boot represents a
|
||
critical step towards achieving truly libre BIOSes, ensuring that users
|
||
can maintain full control over their hardware and firmware environments.
|
||
The continued development and support of GNU Boot are essential for
|
||
advancing the goals of free software and protecting user freedoms in the
|
||
increasingly complex landscape of modern computing. \\
|
||
|
||
\newpage
|
||
|
||
% Bibliography
|
||
\nocite{*}
|
||
\printbibliography
|
||
\addcontentsline{toc}{chapter}{Bibliography}
|
||
\newpage
|
||
|
||
\chapter*{Appendix: Long code listings}
|
||
\addcontentsline{toc}{chapter}{Appendix: Long code listings}
|
||
\renewcommand{\thelisting}{L.\arabic{listing}}
|
||
|
||
\begin{listing}
|
||
\begin{adjustwidth}{0.5cm}{0.5cm}
|
||
\inputminted{c}{
|
||
listings/src_northbridge_amd_amdmct_mct_ddr3_mct_d_1.c}
|
||
\end{adjustwidth}
|
||
\caption{
|
||
Beginning of
|
||
\protect\path{mctAutoInitMCT_D()}, extract from
|
||
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mct_d.c}}
|
||
\label{lst:mctAutoInitMCT_D_1}
|
||
\end{listing}
|
||
|
||
\begin{listing}
|
||
\begin{adjustwidth}{0.5cm}{0.5cm}
|
||
\inputminted{c}{
|
||
listings/src_northbridge_amd_amdmct_mct_ddr3_mct_d_2.c}
|
||
\end{adjustwidth}
|
||
\caption{
|
||
DIMM initialization in
|
||
\protect\path{mctAutoInitMCT_D()}, extract from
|
||
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mct_d.c}}
|
||
\label{lst:mctAutoInitMCT_D_2}
|
||
\end{listing}
|
||
|
||
\begin{listing}
|
||
\begin{adjustwidth}{0.5cm}{0.5cm}
|
||
\inputminted{c}{
|
||
listings/src_northbridge_amd_amdmct_mct_ddr3_mct_d_3.c}
|
||
\end{adjustwidth}
|
||
\caption{
|
||
Voltage control in
|
||
\protect\path{mctAutoInitMCT_D()}, extract from
|
||
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mct_d.c}}
|
||
\label{lst:mctAutoInitMCT_D_3}
|
||
\end{listing}
|
||
|
||
\begin{listing}
|
||
\begin{adjustwidth}{0.5cm}{0.5cm}
|
||
\inputminted{c}{
|
||
listings/src_northbridge_amd_amdmct_mct_ddr3_mct_d_fixme.c}
|
||
\end{adjustwidth}
|
||
\caption{
|
||
\protect\path{mctAutoInitMCT_D()} does not allow restoring
|
||
previous training values, extract from
|
||
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mct_d.c}}
|
||
\label{lst:mctAutoInitMCT_D_fixme}
|
||
\end{listing}
|
||
|
||
\begin{listing}
|
||
\begin{adjustwidth}{0.5cm}{0.5cm}
|
||
\inputminted{c}{
|
||
listings/src_northbridge_amd_amdmct_mct_ddr3_mct_d_4.c}
|
||
\end{adjustwidth}
|
||
\caption{
|
||
Preparing SMBus, DCTs and NB in
|
||
\protect\path{mctAutoInitMCT_D()} from
|
||
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mct_d.c}}
|
||
\label{lst:mctAutoInitMCT_D_4}
|
||
\end{listing}
|
||
|
||
\begin{listing}
|
||
\begin{adjustwidth}{0.5cm}{0.5cm}
|
||
\inputminted{c}{
|
||
listings/src_northbridge_amd_amdmct_mct_ddr3_mct_d_5.c}
|
||
\end{adjustwidth}
|
||
\caption{
|
||
Get DQS, reset and activate ECC in
|
||
\protect\path{mctAutoInitMCT_D()} from
|
||
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mct_d.c}}
|
||
\label{lst:mctAutoInitMCT_D_5}
|
||
\end{listing}
|
||
|
||
\begin{listing}
|
||
\begin{adjustwidth}{0.5cm}{0.5cm}
|
||
\inputminted{c}{
|
||
listings/src_northbridge_amd_amdmct_mct_ddr3_mct_d_6.c}
|
||
\end{adjustwidth}
|
||
\caption{
|
||
Mapping DRAM with cache, validating DCT nodes
|
||
and finishing the init process in
|
||
\protect\path{mctAutoInitMCT_D()} from
|
||
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mct_d.c}}
|
||
\label{lst:mctAutoInitMCT_D_6}
|
||
\end{listing}
|
||
|
||
\begin{listing}[H]
|
||
\begin{adjustwidth}{0.5cm}{0.5cm}
|
||
\begin{minted}[linenos]{c}
|
||
retry_requested = 0;
|
||
for (Node = 0; Node < MAX_NODES_SUPPORTED; Node++) {
|
||
struct DCTStatStruc *pDCTstat;
|
||
pDCTstat = pDCTstatA + Node;
|
||
|
||
if (pDCTstat->NodePresent) {
|
||
if (pDCTstat->TrainErrors & (1 << SB_FatalError)) {
|
||
printk(BIOS_ERR, "DIMM training FAILED! Restarting system...");
|
||
soft_reset();
|
||
}
|
||
if (pDCTstat->TrainErrors & (1 << SB_RetryConfigTrain)) {
|
||
retry_requested = 1;
|
||
|
||
pDCTstat->TrainErrors &= ~(1 << SB_RetryConfigTrain);
|
||
pDCTstat->TrainErrors &= ~(1 << SB_NODQSPOS);
|
||
pDCTstat->ErrStatus &= ~(1 << SB_RetryConfigTrain);
|
||
pDCTstat->ErrStatus &= ~(1 << SB_NODQSPOS);
|
||
}
|
||
}
|
||
}
|
||
|
||
if (retry_requested) {
|
||
printk(BIOS_DEBUG, "%s: Restarting training on algorithm request\n",
|
||
__func__);
|
||
/* Reset frequency to minimum */
|
||
for (Node = 0; Node < MAX_NODES_SUPPORTED; Node++) {
|
||
struct DCTStatStruc *pDCTstat;
|
||
pDCTstat = pDCTstatA + Node;
|
||
if (pDCTstat->NodePresent) {
|
||
uint8_t original_target_freq = pDCTstat->TargetFreq;
|
||
uint8_t original_auto_speed = pDCTstat->DIMMAutoSpeed;
|
||
pDCTstat->TargetFreq = mhz_to_memclk_config(mctGet_NVbits(NV_MIN_MEMCLK));
|
||
pDCTstat->Speed = pDCTstat->DIMMAutoSpeed = pDCTstat->TargetFreq;
|
||
SetTargetFreq(pMCTstat, pDCTstatA, Node);
|
||
pDCTstat->TargetFreq = original_target_freq;
|
||
pDCTstat->DIMMAutoSpeed = original_auto_speed;
|
||
}
|
||
}
|
||
/* Apply any DIMM timing changes */
|
||
for (Node = 0; Node < MAX_NODES_SUPPORTED; Node++) {
|
||
struct DCTStatStruc *pDCTstat;
|
||
pDCTstat = pDCTstatA + Node;
|
||
if (pDCTstat->NodePresent) {
|
||
AutoCycTiming_D(pMCTstat, pDCTstat, 0);
|
||
if (!pDCTstat->GangedMode)
|
||
if (pDCTstat->DIMMValidDCT[1] > 0)
|
||
AutoCycTiming_D(pMCTstat, pDCTstat, 1);
|
||
}
|
||
}
|
||
goto retry_dqs_training_and_levelization;
|
||
}
|
||
\end{minted}
|
||
\end{adjustwidth}
|
||
\caption{Error detection and retry mechanism during DQS training,
|
||
extract from the
|
||
\protect\path{DQSTiming_D} function in
|
||
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mct_d.c}}
|
||
\label{lst:error_handling}
|
||
\end{listing}
|
||
|
||
\begin{listing}[H]
|
||
\begin{adjustwidth}{0.5cm}{0.5cm}
|
||
\begin{minted}[linenos]{c}
|
||
if (Pass == SecondPass) {
|
||
if (pDCTstat->TargetFreq > mhz_to_memclk_config(mctGet_NVbits(NV_MIN_MEMCLK))) {
|
||
uint8_t global_phy_training_status = 0;
|
||
final_target_freq = pDCTstat->TargetFreq;
|
||
|
||
while (pDCTstat->Speed != final_target_freq) {
|
||
if (is_fam15h())
|
||
pDCTstat->TargetFreq =
|
||
fam15h_next_highest_memclk_freq(pDCTstat->Speed);
|
||
else
|
||
pDCTstat->TargetFreq = final_target_freq;
|
||
SetTargetFreq(pMCTstat, pDCTstatA, Node);
|
||
timeout = 0;
|
||
do {
|
||
status = 0;
|
||
timeout++;
|
||
status |= PhyWLPass2(pMCTstat, pDCTstat, 0,
|
||
(pDCTstat->TargetFreq == final_target_freq));
|
||
status |= PhyWLPass2(pMCTstat, pDCTstat, 1,
|
||
(pDCTstat->TargetFreq == final_target_freq));
|
||
if (status)
|
||
printk(BIOS_INFO,
|
||
"%s: Retrying write levelling due to invalid value(s) "
|
||
"detected in last phase\n",
|
||
__func__);
|
||
} while (status && (timeout < 8));
|
||
global_phy_training_status |= status;
|
||
}
|
||
|
||
pDCTstat->TargetFreq = final_target_freq;
|
||
|
||
if (global_phy_training_status)
|
||
printk(BIOS_WARNING,
|
||
"%s: Uncorrectable invalid value(s) detected in second phase of "
|
||
"write levelling; "
|
||
"continuing but system may be unstable!\n",
|
||
__func__);
|
||
|
||
uint8_t dct;
|
||
for (dct = 0; dct < 2; dct++) {
|
||
sDCTStruct *pDCTData = pDCTstat->C_DCTPtr[dct];
|
||
memcpy(pDCTData->WLGrossDelayFinalPass,
|
||
pDCTData->WLGrossDelayPrevPass,
|
||
sizeof(pDCTData->WLGrossDelayPrevPass));
|
||
memcpy(pDCTData->WLFineDelayFinalPass,
|
||
pDCTData->WLFineDelayPrevPass,
|
||
sizeof(pDCTData->WLFineDelayPrevPass));
|
||
pDCTData->WLCriticalGrossDelayFinalPass =
|
||
pDCTData->WLCriticalGrossDelayPrevPass;
|
||
}
|
||
}
|
||
}
|
||
\end{minted}
|
||
\end{adjustwidth}
|
||
\caption{Write Leveling (second pass), extract from the
|
||
\texttt{WriteLevelization\_HW} function in
|
||
\texttt{src/northbridge/amd/amdmct/mct\_ddr3/mcthwl.c}.}
|
||
\label{lst:write_level_second_pass}
|
||
\end{listing}
|
||
|
||
\begin{listing}
|
||
\begin{adjustwidth}{0.5cm}{0.5cm}
|
||
\begin{minted}[linenos]{c}
|
||
uint8_t MaxDimmsInstallable = mctGet_NVbits(NV_MAX_DIMMS_PER_CH);
|
||
|
||
if (pDCTstat->Status & (1 << SB_Registered)) {
|
||
if (package_type == PT_GR) {
|
||
// Socket G34: Fam15h BKDG v3.14 Table 99
|
||
if (MaxDimmsInstallable == 1) {
|
||
if (channel == 0)
|
||
seed = 0x43;
|
||
else if (channel == 1)
|
||
seed = 0x3f;
|
||
else if (channel == 2)
|
||
seed = 0x3a;
|
||
else if (channel == 3)
|
||
seed = 0x35;
|
||
}
|
||
...
|
||
}
|
||
...
|
||
} else if (pDCTstat->Status & (1 << SB_LoadReduced)) {
|
||
// Load Reduced DIMM configuration
|
||
if (package_type == PT_GR) {
|
||
// Socket G34: Fam15h BKDG v3.14 Table 99
|
||
if (MaxDimmsInstallable == 1) {
|
||
if (channel == 0)
|
||
seed = 0x123;
|
||
...
|
||
}
|
||
}
|
||
}
|
||
\end{minted}
|
||
\end{adjustwidth}
|
||
\caption{Seed generation for DQS receiver enable training based on DIMM type
|
||
and configuration,
|
||
extract from
|
||
\protect\path{fam15_receiver_enable_training_seed} function in
|
||
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mctsrc.c}}
|
||
\label{lst:seed_generation}
|
||
\end{listing}
|
||
|
||
\newpage
|
||
|
||
% ------------------------------------------------------------------------------
|
||
%
|
||
%
|
||
% LICENSES
|
||
%
|
||
%
|
||
%
|
||
% ------------------------------------------------------------------------------
|
||
\chapter*{\center\rlap{GNU General Public License version 2}}
|
||
\addcontentsline{toc}{chapter}{GNU General Public License version 2}
|
||
|
||
\parindent 0in
|
||
|
||
Version 2, June 1991
|
||
|
||
Copyright \copyright\ 1989, 1991 Free Software Foundation, Inc.
|
||
|
||
\bigskip
|
||
|
||
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA
|
||
|
||
\bigskip
|
||
|
||
Everyone is permitted to copy and distribute verbatim copies
|
||
of this license document, but changing it is not allowed.
|
||
|
||
\bigskip{\bf\large Preamble}\bigskip
|
||
|
||
|
||
The licenses for most software are designed to take away your freedom to
|
||
share and change it. By contrast, the GNU General Public License is
|
||
intended to guarantee your freedom to share and change free software---to
|
||
make sure the software is free for all its users. This General Public
|
||
License applies to most of the Free Software Foundation's software and to
|
||
any other program whose authors commit to using it. (Some other Free
|
||
Software Foundation software is covered by the GNU Library General Public
|
||
License instead.) You can apply it to your programs, too.
|
||
|
||
When we speak of free software, we are referring to freedom, not price.
|
||
Our General Public Licenses are designed to make sure that you have the
|
||
freedom to distribute copies of free software (and charge for this service
|
||
if you wish), that you receive source code or can get it if you want it,
|
||
that you can change the software or use pieces of it in new free programs;
|
||
and that you know you can do these things.
|
||
|
||
To protect your rights, we need to make restrictions that forbid anyone to
|
||
deny you these rights or to ask you to surrender the rights. These
|
||
restrictions translate to certain responsibilities for you if you
|
||
distribute copies of the software, or if you modify it.
|
||
|
||
For example, if you distribute copies of such a program, whether gratis or
|
||
for a fee, you must give the recipients all the rights that you have. You
|
||
must make sure that they, too, receive or can get the source code. And
|
||
you must show them these terms so they know their rights.
|
||
|
||
We protect your rights with two steps: (1) copyright the software, and (2)
|
||
offer you this license which gives you legal permission to copy,
|
||
distribute and/or modify the software.
|
||
|
||
Also, for each author's protection and ours, we want to make certain that
|
||
everyone understands that there is no warranty for this free software. If
|
||
the software is modified by someone else and passed on, we want its
|
||
recipients to know that what they have is not the original, so that any
|
||
problems introduced by others will not reflect on the original authors'
|
||
reputations.
|
||
|
||
Finally, any free program is threatened constantly by software patents.
|
||
We wish to avoid the danger that redistributors of a free program will
|
||
individually obtain patent licenses, in effect making the program
|
||
proprietary. To prevent this, we have made it clear that any patent must
|
||
be licensed for everyone's free use or not licensed at all.
|
||
|
||
The precise terms and conditions for copying, distribution and
|
||
modification follow.
|
||
|
||
\bigskip{\Large \sc Terms and Conditions For Copying, Distribution and
|
||
Modification}\bigskip
|
||
|
||
|
||
%\renewcommand{\theenumi}{\alpha{enumi}}
|
||
\begin{enumerate}
|
||
|
||
\addtocounter{enumi}{-1}
|
||
|
||
\item
|
||
|
||
This License applies to any program or other work which contains a notice
|
||
placed by the copyright holder saying it may be distributed under the
|
||
terms of this General Public License. The ``Program'', below, refers to
|
||
any such program or work, and a ``work based on the Program'' means either
|
||
the Program or any derivative work under copyright law: that is to say, a
|
||
work containing the Program or a portion of it, either verbatim or with
|
||
modifications and/or translated into another language. (Hereinafter,
|
||
translation is included without limitation in the term ``modification''.)
|
||
Each licensee is addressed as ``you''.
|
||
|
||
Activities other than copying, distribution and modification are not
|
||
covered by this License; they are outside its scope. The act of
|
||
running the Program is not restricted, and the output from the Program
|
||
is covered only if its contents constitute a work based on the
|
||
Program (independent of having been made by running the Program).
|
||
Whether that is true depends on what the Program does.
|
||
|
||
\item You may copy and distribute verbatim copies of the Program's source
|
||
code as you receive it, in any medium, provided that you conspicuously
|
||
and appropriately publish on each copy an appropriate copyright notice
|
||
and disclaimer of warranty; keep intact all the notices that refer to
|
||
this License and to the absence of any warranty; and give any other
|
||
recipients of the Program a copy of this License along with the Program.
|
||
|
||
You may charge a fee for the physical act of transferring a copy, and you
|
||
may at your option offer warranty protection in exchange for a fee.
|
||
|
||
\item
|
||
|
||
You may modify your copy or copies of the Program or any portion
|
||
of it, thus forming a work based on the Program, and copy and
|
||
distribute such modifications or work under the terms of Section 1
|
||
above, provided that you also meet all of these conditions:
|
||
|
||
\begin{enumerate}
|
||
|
||
\item
|
||
|
||
You must cause the modified files to carry prominent notices stating that
|
||
you changed the files and the date of any change.
|
||
|
||
\item
|
||
|
||
You must cause any work that you distribute or publish, that in
|
||
whole or in part contains or is derived from the Program or any
|
||
part thereof, to be licensed as a whole at no charge to all third
|
||
parties under the terms of this License.
|
||
|
||
\item
|
||
If the modified program normally reads commands interactively
|
||
when run, you must cause it, when started running for such
|
||
interactive use in the most ordinary way, to print or display an
|
||
announcement including an appropriate copyright notice and a
|
||
notice that there is no warranty (or else, saying that you provide
|
||
a warranty) and that users may redistribute the program under
|
||
these conditions, and telling the user how to view a copy of this
|
||
License. (Exception: if the Program itself is interactive but
|
||
does not normally print such an announcement, your work based on
|
||
the Program is not required to print an announcement.)
|
||
|
||
\end{enumerate}
|
||
|
||
|
||
These requirements apply to the modified work as a whole. If
|
||
identifiable sections of that work are not derived from the Program,
|
||
and can be reasonably considered independent and separate works in
|
||
themselves, then this License, and its terms, do not apply to those
|
||
sections when you distribute them as separate works. But when you
|
||
distribute the same sections as part of a whole which is a work based
|
||
on the Program, the distribution of the whole must be on the terms of
|
||
this License, whose permissions for other licensees extend to the
|
||
entire whole, and thus to each and every part regardless of who wrote it.
|
||
|
||
Thus, it is not the intent of this section to claim rights or contest
|
||
your rights to work written entirely by you; rather, the intent is to
|
||
exercise the right to control the distribution of derivative or
|
||
collective works based on the Program.
|
||
|
||
In addition, mere aggregation of another work not based on the Program
|
||
with the Program (or with a work based on the Program) on a volume of
|
||
a storage or distribution medium does not bring the other work under
|
||
the scope of this License.
|
||
|
||
\item
|
||
You may copy and distribute the Program (or a work based on it,
|
||
under Section 2) in object code or executable form under the terms of
|
||
Sections 1 and 2 above provided that you also do one of the following:
|
||
|
||
\begin{enumerate}
|
||
|
||
\item
|
||
|
||
Accompany it with the complete corresponding machine-readable
|
||
source code, which must be distributed under the terms of Sections
|
||
1 and 2 above on a medium customarily used for software interchange; or,
|
||
|
||
\item
|
||
|
||
Accompany it with a written offer, valid for at least three
|
||
years, to give any third party, for a charge no more than your
|
||
cost of physically performing source distribution, a complete
|
||
machine-readable copy of the corresponding source code, to be
|
||
distributed under the terms of Sections 1 and 2 above on a medium
|
||
customarily used for software interchange; or,
|
||
|
||
\item
|
||
|
||
Accompany it with the information you received as to the offer
|
||
to distribute corresponding source code. (This alternative is
|
||
allowed only for noncommercial distribution and only if you
|
||
received the program in object code or executable form with such
|
||
an offer, in accord with Subsection b above.)
|
||
|
||
\end{enumerate}
|
||
|
||
|
||
The source code for a work means the preferred form of the work for
|
||
making modifications to it. For an executable work, complete source
|
||
code means all the source code for all modules it contains, plus any
|
||
associated interface definition files, plus the scripts used to
|
||
control compilation and installation of the executable. However, as a
|
||
special exception, the source code distributed need not include
|
||
anything that is normally distributed (in either source or binary
|
||
form) with the major components (compiler, kernel, and so on) of the
|
||
operating system on which the executable runs, unless that component
|
||
itself accompanies the executable.
|
||
|
||
If distribution of executable or object code is made by offering
|
||
access to copy from a designated place, then offering equivalent
|
||
access to copy the source code from the same place counts as
|
||
distribution of the source code, even though third parties are not
|
||
compelled to copy the source along with the object code.
|
||
|
||
\item
|
||
You may not copy, modify, sublicense, or distribute the Program
|
||
except as expressly provided under this License. Any attempt
|
||
otherwise to copy, modify, sublicense or distribute the Program is
|
||
void, and will automatically terminate your rights under this License.
|
||
However, parties who have received copies, or rights, from you under
|
||
this License will not have their licenses terminated so long as such
|
||
parties remain in full compliance.
|
||
|
||
\item
|
||
You are not required to accept this License, since you have not
|
||
signed it. However, nothing else grants you permission to modify or
|
||
distribute the Program or its derivative works. These actions are
|
||
prohibited by law if you do not accept this License. Therefore, by
|
||
modifying or distributing the Program (or any work based on the
|
||
Program), you indicate your acceptance of this License to do so, and
|
||
all its terms and conditions for copying, distributing or modifying
|
||
the Program or works based on it.
|
||
|
||
\item
|
||
Each time you redistribute the Program (or any work based on the
|
||
Program), the recipient automatically receives a license from the
|
||
original licensor to copy, distribute or modify the Program subject to
|
||
these terms and conditions. You may not impose any further
|
||
restrictions on the recipients' exercise of the rights granted herein.
|
||
You are not responsible for enforcing compliance by third parties to
|
||
this License.
|
||
|
||
\item
|
||
If, as a consequence of a court judgment or allegation of patent
|
||
infringement or for any other reason (not limited to patent issues),
|
||
conditions are imposed on you (whether by court order, agreement or
|
||
otherwise) that contradict the conditions of this License, they do not
|
||
excuse you from the conditions of this License. If you cannot
|
||
distribute so as to satisfy simultaneously your obligations under this
|
||
License and any other pertinent obligations, then as a consequence you
|
||
may not distribute the Program at all. For example, if a patent
|
||
license would not permit royalty-free redistribution of the Program by
|
||
all those who receive copies directly or indirectly through you, then
|
||
the only way you could satisfy both it and this License would be to
|
||
refrain entirely from distribution of the Program.
|
||
|
||
If any portion of this section is held invalid or unenforceable under
|
||
any particular circumstance, the balance of the section is intended to
|
||
apply and the section as a whole is intended to apply in other
|
||
circumstances.
|
||
|
||
It is not the purpose of this section to induce you to infringe any
|
||
patents or other property right claims or to contest validity of any
|
||
such claims; this section has the sole purpose of protecting the
|
||
integrity of the free software distribution system, which is
|
||
implemented by public license practices. Many people have made
|
||
generous contributions to the wide range of software distributed
|
||
through that system in reliance on consistent application of that
|
||
system; it is up to the author/donor to decide if he or she is willing
|
||
to distribute software through any other system and a licensee cannot
|
||
impose that choice.
|
||
|
||
This section is intended to make thoroughly clear what is believed to
|
||
be a consequence of the rest of this License.
|
||
|
||
\item
|
||
If the distribution and/or use of the Program is restricted in
|
||
certain countries either by patents or by copyrighted interfaces, the
|
||
original copyright holder who places the Program under this License
|
||
may add an explicit geographical distribution limitation excluding
|
||
those countries, so that distribution is permitted only in or among
|
||
countries not thus excluded. In such case, this License incorporates
|
||
the limitation as if written in the body of this License.
|
||
|
||
\item
|
||
The Free Software Foundation may publish revised and/or new versions
|
||
of the General Public License from time to time. Such new versions will
|
||
be similar in spirit to the present version, but may differ in detail to
|
||
address new problems or concerns.
|
||
|
||
Each version is given a distinguishing version number. If the Program
|
||
specifies a version number of this License which applies to it and ``any
|
||
later version'', you have the option of following the terms and conditions
|
||
either of that version or of any later version published by the Free
|
||
Software Foundation. If the Program does not specify a version number of
|
||
this License, you may choose any version ever published by the Free Software
|
||
Foundation.
|
||
|
||
\item
|
||
If you wish to incorporate parts of the Program into other free
|
||
programs whose distribution conditions are different, write to the author
|
||
to ask for permission. For software which is copyrighted by the Free
|
||
Software Foundation, write to the Free Software Foundation; we sometimes
|
||
make exceptions for this. Our decision will be guided by the two goals
|
||
of preserving the free status of all derivatives of our free software and
|
||
of promoting the sharing and reuse of software generally.
|
||
|
||
\bigskip{\Large\sc
|
||
No Warranty
|
||
}\bigskip
|
||
|
||
\item
|
||
{\sc Because the program is licensed free of charge, there is no warranty
|
||
for the program, to the extent permitted by applicable law. Except when
|
||
otherwise stated in writing the copyright holders and/or other parties
|
||
provide the program ``as is'' without warranty of any kind, either expressed
|
||
or implied, including, but not limited to, the implied warranties of
|
||
merchantability and fitness for a particular purpose. The entire risk as
|
||
to the quality and performance of the program is with you. Should the
|
||
program prove defective, you assume the cost of all necessary servicing,
|
||
repair or correction.}
|
||
|
||
\item
|
||
{\sc In no event unless required by applicable law or agreed to in writing
|
||
will any copyright holder, or any other party who may modify and/or
|
||
redistribute the program as permitted above, be liable to you for damages,
|
||
including any general, special, incidental or consequential damages arising
|
||
out of the use or inability to use the program (including but not limited
|
||
to loss of data or data being rendered inaccurate or losses sustained by
|
||
you or third parties or a failure of the program to operate with any other
|
||
programs), even if such holder or other party has been advised of the
|
||
possibility of such damages.}
|
||
|
||
\end{enumerate}
|
||
|
||
|
||
\bigskip{\Large\sc End of Terms and Conditions}\bigskip
|
||
|
||
|
||
\pagebreak[2]
|
||
|
||
\section*{Appendix: How to Apply These Terms to Your New Programs}
|
||
|
||
If you develop a new program, and you want it to be of the greatest
|
||
possible use to the public, the best way to achieve this is to make it
|
||
free software which everyone can redistribute and change under these
|
||
terms.
|
||
|
||
To do so, attach the following notices to the program. It is safest to
|
||
attach them to the start of each source file to most effectively convey
|
||
the exclusion of warranty; and each file should have at least the
|
||
``copyright'' line and a pointer to where the full notice is found.
|
||
|
||
\begin{quote}
|
||
one line to give the program's name and a brief idea of what it does. \\
|
||
Copyright (C) yyyy name of author \\
|
||
|
||
This program is free software; you can redistribute it and/or modify
|
||
it under the terms of the GNU General Public License as published by
|
||
the Free Software Foundation; either version 2 of the License, or
|
||
(at your option) any later version.
|
||
|
||
This program is distributed in the hope that it will be useful,
|
||
but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
||
GNU General Public License for more details.
|
||
|
||
You should have received a copy of the GNU General Public License
|
||
along with this program; if not, write to the Free Software
|
||
Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
|
||
\end{quote}
|
||
|
||
Also add information on how to contact you by electronic and paper mail.
|
||
|
||
If the program is interactive, make it output a short notice like this
|
||
when it starts in an interactive mode:
|
||
|
||
\begin{quote}
|
||
Gnomovision version 69, Copyright (C) yyyy name of author \\
|
||
Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'. \\
|
||
This is free software, and you are welcome to redistribute it
|
||
under certain conditions; type `show c' for details.
|
||
\end{quote}
|
||
|
||
|
||
The hypothetical commands {\tt show w} and {\tt show c} should show the
|
||
appropriate parts of the General Public License. Of course, the commands
|
||
you use may be called something other than {\tt show w} and {\tt show c};
|
||
they could even be mouse-clicks or menu items---whatever suits your
|
||
program.
|
||
|
||
You should also get your employer (if you work as a programmer) or your
|
||
school, if any, to sign a ``copyright disclaimer'' for the program, if
|
||
necessary. Here is a sample; alter the names:
|
||
|
||
\begin{quote}
|
||
Yoyodyne, Inc., hereby disclaims all copyright interest in the program \\
|
||
`Gnomovision' (which makes passes at compilers) written by James Hacker. \\
|
||
|
||
signature of Ty Coon, 1 April 1989 \\
|
||
Ty Coon, President of Vice
|
||
\end{quote}
|
||
|
||
|
||
This General Public License does not permit incorporating your program
|
||
into proprietary programs. If your program is a subroutine library, you
|
||
may consider it more useful to permit linking proprietary applications
|
||
with the library. If this is what you want to do, use the GNU Library
|
||
General Public License instead of this License.
|
||
|
||
\chapter*{\center\rlap{GNU Free Documentation License}}
|
||
\addcontentsline{toc}{chapter}{GNU Free Documentation License}
|
||
|
||
Version 1.3, 3 November 2008
|
||
|
||
|
||
Copyright \copyright{} 2000, 2001, 2002, 2007, 2008 Free Software Foundation, Inc.
|
||
|
||
\bigskip
|
||
|
||
\path{<https://fsf.org/>}
|
||
|
||
\bigskip
|
||
|
||
Everyone is permitted to copy and distribute verbatim copies
|
||
of this license document, but changing it is not allowed.
|
||
|
||
|
||
|
||
|
||
\bigskip\bigskip{\bf\large Preamble}\bigskip
|
||
|
||
|
||
The purpose of this License is to make a manual, textbook, or other
|
||
functional and useful document ``free'' in the sense of freedom: to
|
||
assure everyone the effective freedom to copy and redistribute it,
|
||
with or without modifying it, either commercially or noncommercially.
|
||
Secondarily, this License preserves for the author and publisher a way
|
||
to get credit for their work, while not being considered responsible
|
||
for modifications made by others.
|
||
|
||
This License is a kind of ``copyleft'', which means that derivative
|
||
works of the document must themselves be free in the same sense. It
|
||
complements the GNU General Public License, which is a copyleft
|
||
license designed for free software.
|
||
|
||
We have designed this License in order to use it for manuals for free
|
||
software, because free software needs free documentation: a free
|
||
program should come with manuals providing the same freedoms that the
|
||
software does. But this License is not limited to software manuals;
|
||
it can be used for any textual work, regardless of subject matter or
|
||
whether it is published as a printed book. We recommend this License
|
||
principally for works whose purpose is instruction or reference.
|
||
|
||
|
||
|
||
\bigskip\bigskip{\Large\bf 1. APPLICABILITY AND DEFINITIONS\par}\bigskip
|
||
|
||
|
||
This License applies to any manual or other work, in any medium, that
|
||
contains a notice placed by the copyright holder saying it can be
|
||
distributed under the terms of this License. Such a notice grants a
|
||
world-wide, royalty-free license, unlimited in duration, to use that
|
||
work under the conditions stated herein. The ``\textbf{Document}'', below,
|
||
refers to any such manual or work. Any member of the public is a
|
||
licensee, and is addressed as ``\textbf{you}''. You accept the license if you
|
||
copy, modify or distribute the work in a way requiring permission
|
||
under copyright law.
|
||
|
||
A ``\textbf{Modified Version}'' of the Document means any work containing the
|
||
Document or a portion of it, either copied verbatim, or with
|
||
modifications and/or translated into another language.
|
||
|
||
A ``\textbf{Secondary Section}'' is a named appendix or a front-matter section of
|
||
the Document that deals exclusively with the relationship of the
|
||
publishers or authors of the Document to the Document's overall subject
|
||
(or to related matters) and contains nothing that could fall directly
|
||
within that overall subject. (Thus, if the Document is in part a
|
||
textbook of mathematics, a Secondary Section may not explain any
|
||
mathematics.) The relationship could be a matter of historical
|
||
connection with the subject or with related matters, or of legal,
|
||
commercial, philosophical, ethical or political position regarding
|
||
them.
|
||
|
||
The ``\textbf{Invariant Sections}'' are certain Secondary Sections whose titles
|
||
are designated, as being those of Invariant Sections, in the notice
|
||
that says that the Document is released under this License. If a
|
||
section does not fit the above definition of Secondary then it is not
|
||
allowed to be designated as Invariant. The Document may contain zero
|
||
Invariant Sections. If the Document does not identify any Invariant
|
||
Sections then there are none.
|
||
|
||
The ``\textbf{Cover Texts}'' are certain short passages of text that are listed,
|
||
as Front-Cover Texts or Back-Cover Texts, in the notice that says that
|
||
the Document is released under this License. A Front-Cover Text may
|
||
be at most 5 words, and a Back-Cover Text may be at most 25 words.
|
||
|
||
A ``\textbf{Transparent}'' copy of the Document means a machine-readable copy,
|
||
represented in a format whose specification is available to the
|
||
general public, that is suitable for revising the document
|
||
straightforwardly with generic text editors or (for images composed of
|
||
pixels) generic paint programs or (for drawings) some widely available
|
||
drawing editor, and that is suitable for input to text formatters or
|
||
for automatic translation to a variety of formats suitable for input
|
||
to text formatters. A copy made in an otherwise Transparent file
|
||
format whose markup, or absence of markup, has been arranged to thwart
|
||
or discourage subsequent modification by readers is not Transparent.
|
||
An image format is not Transparent if used for any substantial amount
|
||
of text. A copy that is not ``Transparent'' is called ``\textbf{Opaque}''.
|
||
|
||
Examples of suitable formats for Transparent copies include plain
|
||
ASCII without markup, Texinfo input format, LaTeX input format, SGML
|
||
or XML using a publicly available DTD, and standard-conforming simple
|
||
HTML, PostScript or PDF designed for human modification. Examples of
|
||
transparent image formats include PNG, XCF and JPG. Opaque formats
|
||
include proprietary formats that can be read and edited only by
|
||
proprietary word processors, SGML or XML for which the DTD and/or
|
||
processing tools are not generally available, and the
|
||
machine-generated HTML, PostScript or PDF produced by some word
|
||
processors for output purposes only.
|
||
|
||
The ``\textbf{Title Page}'' means, for a printed book, the title page itself,
|
||
plus such following pages as are needed to hold, legibly, the material
|
||
this License requires to appear in the title page. For works in
|
||
formats which do not have any title page as such, ``Title Page'' means
|
||
the text near the most prominent appearance of the work's title,
|
||
preceding the beginning of the body of the text.
|
||
|
||
The ``\textbf{publisher}'' means any person or entity that distributes
|
||
copies of the Document to the public.
|
||
|
||
A section ``\textbf{Entitled XYZ}'' means a named subunit of the Document whose
|
||
title either is precisely XYZ or contains XYZ in parentheses following
|
||
text that translates XYZ in another language. (Here XYZ stands for a
|
||
specific section name mentioned below, such as ``\textbf{Acknowledgements}'',
|
||
``\textbf{Dedications}'', ``\textbf{Endorsements}'', or ``\textbf{History}''.)
|
||
To ``\textbf{Preserve the Title}''
|
||
of such a section when you modify the Document means that it remains a
|
||
section ``Entitled XYZ'' according to this definition.
|
||
|
||
The Document may include Warranty Disclaimers next to the notice which
|
||
states that this License applies to the Document. These Warranty
|
||
Disclaimers are considered to be included by reference in this
|
||
License, but only as regards disclaiming warranties: any other
|
||
implication that these Warranty Disclaimers may have is void and has
|
||
no effect on the meaning of this License.
|
||
|
||
|
||
|
||
\bigskip\bigskip{\Large\bf 2. VERBATIM COPYING\par}\bigskip
|
||
|
||
|
||
You may copy and distribute the Document in any medium, either
|
||
commercially or noncommercially, provided that this License, the
|
||
copyright notices, and the license notice saying this License applies
|
||
to the Document are reproduced in all copies, and that you add no other
|
||
conditions whatsoever to those of this License. You may not use
|
||
technical measures to obstruct or control the reading or further
|
||
copying of the copies you make or distribute. However, you may accept
|
||
compensation in exchange for copies. If you distribute a large enough
|
||
number of copies you must also follow the conditions in section~3.
|
||
|
||
You may also lend copies, under the same conditions stated above, and
|
||
you may publicly display copies.
|
||
|
||
|
||
|
||
\bigskip\bigskip{\Large\bf 3. COPYING IN QUANTITY\par}\bigskip
|
||
|
||
|
||
|
||
If you publish printed copies (or copies in media that commonly have
|
||
printed covers) of the Document, numbering more than 100, and the
|
||
Document's license notice requires Cover Texts, you must enclose the
|
||
copies in covers that carry, clearly and legibly, all these Cover
|
||
Texts: Front-Cover Texts on the front cover, and Back-Cover Texts on
|
||
the back cover. Both covers must also clearly and legibly identify
|
||
you as the publisher of these copies. The front cover must present
|
||
the full title with all words of the title equally prominent and
|
||
visible. You may add other material on the covers in addition.
|
||
Copying with changes limited to the covers, as long as they preserve
|
||
the title of the Document and satisfy these conditions, can be treated
|
||
as verbatim copying in other respects.
|
||
|
||
If the required texts for either cover are too voluminous to fit
|
||
legibly, you should put the first ones listed (as many as fit
|
||
reasonably) on the actual cover, and continue the rest onto adjacent
|
||
pages.
|
||
|
||
If you publish or distribute Opaque copies of the Document numbering
|
||
more than 100, you must either include a machine-readable Transparent
|
||
copy along with each Opaque copy, or state in or with each Opaque copy
|
||
a computer-network location from which the general network-using
|
||
public has access to download using public-standard network protocols
|
||
a complete Transparent copy of the Document, free of added material.
|
||
If you use the latter option, you must take reasonably prudent steps,
|
||
when you begin distribution of Opaque copies in quantity, to ensure
|
||
that this Transparent copy will remain thus accessible at the stated
|
||
location until at least one year after the last time you distribute an
|
||
Opaque copy (directly or through your agents or retailers) of that
|
||
edition to the public.
|
||
|
||
It is requested, but not required, that you contact the authors of the
|
||
Document well before redistributing any large number of copies, to give
|
||
them a chance to provide you with an updated version of the Document.
|
||
|
||
|
||
|
||
\bigskip\bigskip{\Large\bf 4. MODIFICATIONS\par}\bigskip
|
||
|
||
|
||
You may copy and distribute a Modified Version of the Document under
|
||
the conditions of sections 2 and 3 above, provided that you release
|
||
the Modified Version under precisely this License, with the Modified
|
||
Version filling the role of the Document, thus licensing distribution
|
||
and modification of the Modified Version to whoever possesses a copy
|
||
of it. In addition, you must do these things in the Modified Version:
|
||
|
||
\begin{itemize}
|
||
\item[A.]
|
||
Use in the Title Page (and on the covers, if any) a title distinct
|
||
from that of the Document, and from those of previous versions
|
||
(which should, if there were any, be listed in the History section
|
||
of the Document). You may use the same title as a previous version
|
||
if the original publisher of that version gives permission.
|
||
|
||
\item[B.]
|
||
List on the Title Page, as authors, one or more persons or entities
|
||
responsible for authorship of the modifications in the Modified
|
||
Version, together with at least five of the principal authors of the
|
||
Document (all of its principal authors, if it has fewer than five),
|
||
unless they release you from this requirement.
|
||
|
||
\item[C.]
|
||
State on the Title page the name of the publisher of the
|
||
Modified Version, as the publisher.
|
||
|
||
\item[D.]
|
||
Preserve all the copyright notices of the Document.
|
||
|
||
\item[E.]
|
||
Add an appropriate copyright notice for your modifications
|
||
adjacent to the other copyright notices.
|
||
|
||
\item[F.]
|
||
Include, immediately after the copyright notices, a license notice
|
||
giving the public permission to use the Modified Version under the
|
||
terms of this License, in the form shown in the Addendum below.
|
||
|
||
\item[G.]
|
||
Preserve in that license notice the full lists of Invariant Sections
|
||
and required Cover Texts given in the Document's license notice.
|
||
|
||
\item[H.]
|
||
Include an unaltered copy of this License.
|
||
|
||
\item[I.]
|
||
Preserve the section Entitled ``History'', Preserve its Title, and add
|
||
to it an item stating at least the title, year, new authors, and
|
||
publisher of the Modified Version as given on the Title Page. If
|
||
there is no section Entitled ``History'' in the Document, create one
|
||
stating the title, year, authors, and publisher of the Document as
|
||
given on its Title Page, then add an item describing the Modified
|
||
Version as stated in the previous sentence.
|
||
|
||
\item[J.]
|
||
Preserve the network location, if any, given in the Document for
|
||
public access to a Transparent copy of the Document, and likewise
|
||
the network locations given in the Document for previous versions
|
||
it was based on. These may be placed in the ``History'' section.
|
||
You may omit a network location for a work that was published at
|
||
least four years before the Document itself, or if the original
|
||
publisher of the version it refers to gives permission.
|
||
|
||
\item[K.]
|
||
For any section Entitled ``Acknowledgements'' or ``Dedications'',
|
||
Preserve the Title of the section, and preserve in the section all
|
||
the substance and tone of each of the contributor acknowledgements
|
||
and/or dedications given therein.
|
||
|
||
\item[L.]
|
||
Preserve all the Invariant Sections of the Document,
|
||
unaltered in their text and in their titles. Section numbers
|
||
or the equivalent are not considered part of the section titles.
|
||
|
||
\item[M.]
|
||
Delete any section Entitled ``Endorsements''. Such a section
|
||
may not be included in the Modified Version.
|
||
|
||
\item[N.]
|
||
Do not retitle any existing section to be Entitled ``Endorsements''
|
||
or to conflict in title with any Invariant Section.
|
||
|
||
\item[O.]
|
||
Preserve any Warranty Disclaimers.
|
||
\end{itemize}
|
||
|
||
If the Modified Version includes new front-matter sections or
|
||
appendices that qualify as Secondary Sections and contain no material
|
||
copied from the Document, you may at your option designate some or all
|
||
of these sections as invariant. To do this, add their titles to the
|
||
list of Invariant Sections in the Modified Version's license notice.
|
||
These titles must be distinct from any other section titles.
|
||
|
||
You may add a section Entitled ``Endorsements'', provided it contains
|
||
nothing but endorsements of your Modified Version by various
|
||
parties---for example, statements of peer review or that the text has
|
||
been approved by an organization as the authoritative definition of a
|
||
standard.
|
||
|
||
You may add a passage of up to five words as a Front-Cover Text, and a
|
||
passage of up to 25 words as a Back-Cover Text, to the end of the list
|
||
of Cover Texts in the Modified Version. Only one passage of
|
||
Front-Cover Text and one of Back-Cover Text may be added by (or
|
||
through arrangements made by) any one entity. If the Document already
|
||
includes a cover text for the same cover, previously added by you or
|
||
by arrangement made by the same entity you are acting on behalf of,
|
||
you may not add another; but you may replace the old one, on explicit
|
||
permission from the previous publisher that added the old one.
|
||
|
||
The author(s) and publisher(s) of the Document do not by this License
|
||
give permission to use their names for publicity for or to assert or
|
||
imply endorsement of any Modified Version.
|
||
|
||
|
||
|
||
\bigskip\bigskip{\Large\bf 5. COMBINING DOCUMENTS\par}\bigskip
|
||
|
||
|
||
|
||
You may combine the Document with other documents released under this
|
||
License, under the terms defined in section~4 above for modified
|
||
versions, provided that you include in the combination all of the
|
||
Invariant Sections of all of the original documents, unmodified, and
|
||
list them all as Invariant Sections of your combined work in its
|
||
license notice, and that you preserve all their Warranty Disclaimers.
|
||
|
||
The combined work need only contain one copy of this License, and
|
||
multiple identical Invariant Sections may be replaced with a single
|
||
copy. If there are multiple Invariant Sections with the same name but
|
||
different contents, make the title of each such section unique by
|
||
adding at the end of it, in parentheses, the name of the original
|
||
author or publisher of that section if known, or else a unique number.
|
||
Make the same adjustment to the section titles in the list of
|
||
Invariant Sections in the license notice of the combined work.
|
||
|
||
In the combination, you must combine any sections Entitled ``History''
|
||
in the various original documents, forming one section Entitled
|
||
``History''; likewise combine any sections Entitled ``Acknowledgements'',
|
||
and any sections Entitled ``Dedications''. You must delete all sections
|
||
Entitled ``Endorsements''.
|
||
|
||
|
||
\bigskip\bigskip{\Large\bf 6. COLLECTIONS OF DOCUMENTS\par}\bigskip
|
||
|
||
|
||
You may make a collection consisting of the Document and other documents
|
||
released under this License, and replace the individual copies of this
|
||
License in the various documents with a single copy that is included in
|
||
the collection, provided that you follow the rules of this License for
|
||
verbatim copying of each of the documents in all other respects.
|
||
|
||
You may extract a single document from such a collection, and distribute
|
||
it individually under this License, provided you insert a copy of this
|
||
License into the extracted document, and follow this License in all
|
||
other respects regarding verbatim copying of that document.
|
||
|
||
|
||
|
||
\bigskip\bigskip{\Large\bf 7. AGGREGATION WITH INDEPENDENT WORKS\par}\bigskip
|
||
|
||
|
||
|
||
A compilation of the Document or its derivatives with other separate
|
||
and independent documents or works, in or on a volume of a storage or
|
||
distribution medium, is called an ``aggregate'' if the copyright
|
||
resulting from the compilation is not used to limit the legal rights
|
||
of the compilation's users beyond what the individual works permit.
|
||
When the Document is included in an aggregate, this License does not
|
||
apply to the other works in the aggregate which are not themselves
|
||
derivative works of the Document.
|
||
|
||
If the Cover Text requirement of section~3 is applicable to these
|
||
copies of the Document, then if the Document is less than one half of
|
||
the entire aggregate, the Document's Cover Texts may be placed on
|
||
covers that bracket the Document within the aggregate, or the
|
||
electronic equivalent of covers if the Document is in electronic form.
|
||
Otherwise they must appear on printed covers that bracket the whole
|
||
aggregate.
|
||
|
||
|
||
|
||
\bigskip\bigskip{\Large\bf 8. TRANSLATION\par}\bigskip
|
||
|
||
|
||
|
||
Translation is considered a kind of modification, so you may
|
||
distribute translations of the Document under the terms of section~4.
|
||
Replacing Invariant Sections with translations requires special
|
||
permission from their copyright holders, but you may include
|
||
translations of some or all Invariant Sections in addition to the
|
||
original versions of these Invariant Sections. You may include a
|
||
translation of this License, and all the license notices in the
|
||
Document, and any Warranty Disclaimers, provided that you also include
|
||
the original English version of this License and the original versions
|
||
of those notices and disclaimers. In case of a disagreement between
|
||
the translation and the original version of this License or a notice
|
||
or disclaimer, the original version will prevail.
|
||
|
||
If a section in the Document is Entitled ``Acknowledgements'',
|
||
``Dedications'', or ``History'', the requirement (section~4) to Preserve
|
||
its Title (section~1) will typically require changing the actual
|
||
title.
|
||
|
||
|
||
|
||
\bigskip\bigskip{\Large\bf 9. TERMINATION\par}\bigskip
|
||
|
||
|
||
|
||
You may not copy, modify, sublicense, or distribute the Document
|
||
except as expressly provided under this License. Any attempt
|
||
otherwise to copy, modify, sublicense, or distribute it is void, and
|
||
will automatically terminate your rights under this License.
|
||
|
||
However, if you cease all violation of this License, then your license
|
||
from a particular copyright holder is reinstated (a) provisionally,
|
||
unless and until the copyright holder explicitly and finally
|
||
terminates your license, and (b) permanently, if the copyright holder
|
||
fails to notify you of the violation by some reasonable means prior to
|
||
60 days after the cessation.
|
||
|
||
Moreover, your license from a particular copyright holder is
|
||
reinstated permanently if the copyright holder notifies you of the
|
||
violation by some reasonable means, this is the first time you have
|
||
received notice of violation of this License (for any work) from that
|
||
copyright holder, and you cure the violation prior to 30 days after
|
||
your receipt of the notice.
|
||
|
||
Termination of your rights under this section does not terminate the
|
||
licenses of parties who have received copies or rights from you under
|
||
this License. If your rights have been terminated and not permanently
|
||
reinstated, receipt of a copy of some or all of the same material does
|
||
not give you any rights to use it.
|
||
|
||
|
||
|
||
\bigskip\bigskip{\Large\bf 10. FUTURE REVISIONS OF THIS LICENSE\par}\bigskip
|
||
|
||
|
||
|
||
The Free Software Foundation may publish new, revised versions
|
||
of the GNU Free Documentation License from time to time. Such new
|
||
versions will be similar in spirit to the present version, but may
|
||
differ in detail to address new problems or concerns. See
|
||
\path{https://www.gnu.org/licenses/}.
|
||
|
||
Each version of the License is given a distinguishing version number.
|
||
If the Document specifies that a particular numbered version of this
|
||
License ``or any later version'' applies to it, you have the option of
|
||
following the terms and conditions either of that specified version or
|
||
of any later version that has been published (not as a draft) by the
|
||
Free Software Foundation. If the Document does not specify a version
|
||
number of this License, you may choose any version ever published (not
|
||
as a draft) by the Free Software Foundation. If the Document
|
||
specifies that a proxy can decide which future versions of this
|
||
License can be used, that proxy's public statement of acceptance of a
|
||
version permanently authorizes you to choose that version for the
|
||
Document.
|
||
|
||
|
||
|
||
\bigskip\bigskip{\Large\bf 11. RELICENSING\par}\bigskip
|
||
|
||
|
||
|
||
``Massive Multiauthor Collaboration Site'' (or ``MMC Site'') means any
|
||
World Wide Web server that publishes copyrightable works and also
|
||
provides prominent facilities for anybody to edit those works. A
|
||
public wiki that anybody can edit is an example of such a server. A
|
||
``Massive Multiauthor Collaboration'' (or ``MMC'') contained in the
|
||
site means any set of copyrightable works thus published on the MMC
|
||
site.
|
||
|
||
``CC-BY-SA'' means the Creative Commons Attribution-Share Alike 3.0
|
||
license published by Creative Commons Corporation, a not-for-profit
|
||
corporation with a principal place of business in San Francisco,
|
||
California, as well as future copyleft versions of that license
|
||
published by that same organization.
|
||
|
||
``Incorporate'' means to publish or republish a Document, in whole or
|
||
in part, as part of another Document.
|
||
|
||
An MMC is ``eligible for relicensing'' if it is licensed under this
|
||
License, and if all works that were first published under this License
|
||
somewhere other than this MMC, and subsequently incorporated in whole
|
||
or in part into the MMC, (1) had no cover texts or invariant sections,
|
||
and (2) were thus incorporated prior to November 1, 2008.
|
||
|
||
The operator of an MMC Site may republish an MMC contained in the site
|
||
under CC-BY-SA on the same site at any time before August 1, 2009,
|
||
provided the MMC is eligible for relicensing.
|
||
|
||
|
||
|
||
\bigskip\bigskip{\Large\bf ADDENDUM: How to use this License for your documents\par}\bigskip
|
||
|
||
|
||
To use this License in a document you have written, include a copy of
|
||
the License in the document and put the following copyright and
|
||
license notices just after the title page:
|
||
|
||
\bigskip
|
||
\begin{quote}
|
||
Copyright \copyright{} YEAR YOUR NAME.
|
||
Permission is granted to copy, distribute and/or modify this document
|
||
under the terms of the GNU Free Documentation License, Version 1.3
|
||
or any later version published by the Free Software Foundation;
|
||
with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.
|
||
A copy of the license is included in the section entitled ``GNU
|
||
Free Documentation License''.
|
||
\end{quote}
|
||
\bigskip
|
||
|
||
If you have Invariant Sections, Front-Cover Texts and Back-Cover Texts,
|
||
replace the ``with \dots\ Texts.''\ line with this:
|
||
|
||
\bigskip
|
||
\begin{quote}
|
||
with the Invariant Sections being LIST THEIR TITLES, with the
|
||
Front-Cover Texts being LIST, and with the Back-Cover Texts being LIST.
|
||
\end{quote}
|
||
\bigskip
|
||
|
||
If you have Invariant Sections without Cover Texts, or some other
|
||
combination of the three, merge those two alternatives to suit the
|
||
situation.
|
||
|
||
If your document contains nontrivial examples of program code, we
|
||
recommend releasing these examples in parallel under your choice of
|
||
free software license, such as the GNU General Public License,
|
||
to permit their use in free software.
|
||
|
||
\end{document}
|