% -*- coding: utf-8 -*-
% Copyright (C) 2024 Adrien 'neox' Bourmault <neox@gnu.org>
%
% Permission is granted to copy, distribute and/or modify this document
% under the terms of the GNU Free Documentation License, Version 1.3
% or any later version published by the Free Software Foundation;
% with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.
% A copy of the license is included in the section entitled "GNU
% Free Documentation License".

\input{packages.tex}

\title{Hardware initialization of modern computers}
\author{Adrien 'neox' Bourmault}
\date{\today}

% setup things
\setcounter{secnumdepth}{4}
\setcounter{tocdepth}{4}
%\setcounter{secnumdepth}{4}

% setup bibliography
\addbibresource{bibliographie.bib}

% ------------------------------------------------------------------------------
\begin{document}{
% ------------------------------------------------------------------------------

\sloppy % allow flexible margins
\input{titlepage.tex} % import titlepage
\newpage

% ------------------------------------------------------------------------------
% License header
% ------------------------------------------------------------------------------

\setcounter{page}{2}
\vspace*{\fill} % fill the page so that text is at the bottom

This is Edition 0.2. \\

Copyright \copyright\ 2024 Adrien 'neox' Bourmault
\href{mailto:neox@gnu.org}{<neox@gnu.org>} \\

Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3
or any later version published by the Free Software Foundation;
with the Invariant Sections being ``GNU General Public License version 2''
and ``GNU Free Documentation License'', with no Front-Cover Texts, and no
Back-Cover Texts.
A copy of the license is included in the section entitled ``GNU
Free Documentation License''. \\

Source code included in this document is licensed under the GNU General
Public License version 2 or any later version published by the
Free Software Foundation.
A copy of the license is included in the section entitled ``GNU
General Public License version 2''. \\

\newpage

% ------------------------------------------------------------------------------
% ACKNOWLEDGMENTS
% ------------------------------------------------------------------------------
\chapter*{Acknowledgments}
\addcontentsline{toc}{chapter}{Acknowledgments}

First and foremost, I would like to express my deep gratitude to
Marie-Minerve Louërat, without whom
this work would not have come to fruition. Her efforts to assist me on
legal matters and our enriching discussions on the philosophy of free
software have been invaluable to me. \\

I am also thankful to Roselyne Chotin for agreeing to fund this work,
as without her support none of this would have been possible, and to
Franck Wajsburt for his invaluable advice at the beginning of my work,
which greatly helped me organize myself, as well as for his support and
reviews throughout this period. \\

I wish to express my appreciation to the Free Software Foundation for
funding the necessary equipment for this project. Special thanks go to
Zoë Kooyman and Ian Kelling for their dedication in securing this funding
and for their kindness throughout all the procedures. \\

I am deeply grateful to Denis Carikli (GNUtoo), my fellow GNU Boot
co-maintainer, for his meticulous reviews, emotional support, and brilliant
ideas that enriched this work, and to Richard M. Stallman for his advice and
support throughout this journey. \\

A big thank you to Manuel Bouyer for his infinite patience with all my
requests regarding network, software, and hardware configurations. \\

I also warmly thank my family and friends for their constant
encouragement and for reviewing my work throughout this entire process. \\

Finally, I would like to thank the break room and its kettle,
without which no tea would have been possible, thereby jeopardizing this
work. \\

\newpage

% ------------------------------------------------------------------------------
% ABSTRACT
% ------------------------------------------------------------------------------
\chapter*{Abstract}
\addcontentsline{toc}{chapter}{Abstract}

The global trend is towards a scarcity of hardware compatible with free
software, and soon there will be no computer that works without
software controlled by big companies, especially firmware such as
BIOSes. \\

A Basic Input Output System (BIOS) was originally a set of low-level
functions contained in the read-only memory of a computer's mainboard,
enabling it to perform basic operations when powered up. However, the
definition of a BIOS has evolved to include what used to be known as the
Power-On Self-Test (POST): checking for the presence of peripherals,
allocating resources for them to avoid conflicts, and then handing over
to an operating system boot loader. Nowadays, the bulk of the BIOS work
is the initialization and training of RAM. This means, for example,
initializing the memory controller and tuning timings and read/write
voltages for optimal performance. This makes the code complex, as its
role is to optimize several parallel buses operating at high speeds and
shared by many CPU cores, and to make them act as a homogeneous whole. \\

This document is the product of a project hosted by the \textit{LIP6
laboratory} and supported by the \textit{GNU Boot Project} and the
\textit{Free Software Foundation}. It delves into the importance
of firmware in the hardware initialization of modern computers and
explores various aspects of firmware, such as Intel Management Engine
(ME), AMD Platform Security Processor (PSP), Advanced Configuration and
Power Interface (ACPI), and System Management Mode (SMM). Additionally,
it provides an in-depth look at memory initialization and training
algorithms, highlighting their critical role in system stability and
performance. After the architecture of the ASUS KGPE-D16 mainboard is
examined, examples of implementation on this board are presented,
describing its hardware characteristics, topology, and the crucial role
of firmware in its operation. Practical examples illustrate the impact
of firmware on hardware initialization, memory optimization, resource
allocation, power management, and security. Specific algorithms used for
memory training and their outcomes are analyzed to demonstrate the
complexity and importance of firmware in achieving optimal system
performance. Furthermore, this document explores the relationship between
firmware and hardware virtualization. Security considerations and future
trends in firmware development are also addressed, emphasizing the need
for continued research and advocacy for hardware compatible with free
software.

\newpage

% ------------------------------------------------------------------------------
% Table of contents
% ------------------------------------------------------------------------------
\chapter*{\vspace{-\cftbeforechapskip}}
\addcontentsline{toc}{chapter}{Contents}
\tableofcontents
\newpage

% List of figures
\chapter*{\vspace{-\cftbeforechapskip}}
\addcontentsline{toc}{chapter}{List of Figures}
\listoffigures
\newpage

% List of listings
%\chapter*{\vspace{-3em}}
\listoflistings
\addcontentsline{toc}{chapter}{List of Listings}
\newpage

% ------------------------------------------------------------------------------
% CHAPTER 1: Introduction to firmware and BIOS evolution
% ------------------------------------------------------------------------------
\chapter{Introduction to firmware and BIOS evolution}

\section{Historical context of BIOS}

\subsection{Definition and origin}

The BIOS (Basic Input/Output System) is firmware, that is, a type of
software embedded into hardware devices to control their basic
functions. It acts as a bridge between hardware and other software,
ensuring that the hardware operates correctly. Unlike regular
software, firmware is usually stored in non-volatile memory such as
ROM or flash memory. The term ``firmware'' comes from its role: it is
``firm'' because it is more permanent than regular software (which can
be easily changed) but not as rigid as hardware. \\

The BIOS is used to perform initialization during the booting process
and to provide runtime services for operating systems and programs.
As a critical component for the startup of personal computers and
an intermediary between the computer's hardware and its
operating system, the BIOS is embedded on a chip on the motherboard
and is the first code that runs when a PC is powered on. The concept
of BIOS has its roots in the early days of personal computing. The
term itself was coined by Gary Kildall, who developed the CP/M
(Control Program for Microcomputers) operating system
\cite{shustek2016kildall}. In CP/M, BIOS was used to describe a
component that interfaced directly with the hardware, allowing the
rest of the operating system to be somewhat hardware-independent.
The BIOS as PC firmware was first developed by IBM for the IBM PC,
which was introduced in 1981 \cite{freiberger2000fire}. \\

\begin{figure}[H]
    \centering
    \includegraphics[width=0.5\textwidth]{images/IBM_logo.png}
    \caption{The eight-striped wordmark of IBM (1967, public domain,
        trademarked)}
\end{figure}

IBM's implementation of BIOS became a de facto standard in
the industry, as it was part of the IBM PC's open architecture
\cite{grewal_ibm_pc}\cite{ibm_pc}. This term refers to the design
philosophy adopted by IBM when developing the IBM Personal Computer
(PC), introduced in 1981. The architecture is characterized by the use
of off-the-shelf components and publicly available specifications,
which allowed other manufacturers to create compatible hardware
and software. It was a departure from the proprietary
systems prevalent at the time, where companies closely guarded their
designs to maintain control over the hardware and software ecosystem.
For example, IBM used the Intel 8088 CPU, a well-documented and widely
available processor, and the expansion bus later standardized as the
Industry Standard Architecture (ISA) bus, which defined how various
components like memory, storage, and peripherals communicated with the
CPU. This open architecture
allowed other manufacturers to create IBM-compatible computers, also
known as ``clones'', which further popularized the BIOS concept. As
a result, the IBM PC BIOS set the stage for a standardized method
of interacting with computer hardware, which has evolved over the
years but remains fundamentally the same in principle. IBM also
published detailed technical documentation at that time, including
circuit diagrams, BIOS listings, and interface specifications. This
transparency allowed other companies to understand and replicate
the IBM PC's functionality \cite{freiberger2000fire}.

\subsection{Functionalities and limitations}

When a computer is powered on, the BIOS executes a Power-On
Self-Test (POST), a diagnostic sequence that verifies the integrity
and functionality of critical hardware components such as the CPU,
RAM, disk drives, keyboard, and other peripherals \cite{wiki_bios}.
This process ensures that all essential hardware components are
operational before the system attempts to load the operating system.
If any issues are detected, the BIOS generates error messages or
beep codes to alert the user. Following the successful completion
of POST, the BIOS runs the bootstrap loader, a small program that
locates the operating system's bootloader on a storage device,
such as a hard drive, floppy disk, or optical drive. The bootstrap
loader then transfers control to the OS bootloader, which loads
the operating system into the computer's
memory and starts it. This step effectively bridges the gap
between hardware initialization and operating system execution.
The BIOS also provides a set of low-level software routines invoked
through interrupts. These routines enable software to perform basic
input/output operations, such as reading from the keyboard, writing
to the display, and accessing disk drives, without needing to manage
the hardware directly. By providing standardized interfaces for
hardware components, the BIOS simplifies software development and
improves compatibility across different hardware configurations
\cite{ibm_pc}. \\

\begin{figure}[H]
    \centering
    \includegraphics[width=0.5\textwidth]{images/bios_chip.jpg}
    \caption{An AMI BIOS chip from a Dell 310, by Jud McCranie
        (CC BY-SA 4.0, 2018)}
\end{figure}

Despite its essential role, the early BIOS had several limitations.
One significant limitation was its storage capacity.
Early BIOS firmware was stored in Read-Only Memory (ROM) chips with
very limited storage, often just a few kilobytes. This constrained
the complexity and functionality of the BIOS, limiting it to only the
most essential tasks needed to start the system and provide basic
hardware control. The original BIOS was also non-extensible. ROM
chips were typically soldered onto the motherboard, making updates
difficult and costly. Bug fixes, updates for new hardware support,
or enhancements required replacing the ROM chip, leading to challenges
in maintaining and upgrading systems. Furthermore, the early BIOS was
tailored for the specific hardware configurations of the initial IBM
PC models, which included a limited set of peripherals and expansion
options. As new hardware components and peripherals were developed,
the BIOS often needed to be updated to support them, which was not
always feasible or timely. \\

Performance bottlenecks were another
limitation. The BIOS provided basic input/output operations that
were often slower than direct hardware access methods. For example,
disk I/O operations through BIOS interrupts were slower than
the direct access methods later provided by operating systems,
resulting in performance bottlenecks, especially for disk-intensive
operations \cite{osdev_uefi}. Early BIOS
implementations also had minimal security features. There were no
mechanisms to verify the integrity of the BIOS code or to protect
against unauthorized modifications, leaving systems vulnerable to
attacks that could alter the BIOS and potentially compromise the
entire system, such as rootkits and firmware viruses. \\

In addition,
the traditional BIOS operates in 16-bit real mode, a constraint that
limits the amount of code and memory it can address. This limitation
hinders the performance and complexity of firmware, making it less
suitable for modern computing needs \cite{intel_uefi}. The
BIOS also relies on the Master Boot Record (MBR) partitioning scheme,
which supports a maximum disk size of 2 terabytes and allows only
four primary partitions \cite{uefi_spec}\cite{russinovich2012}.
This constraint has become a significant drawback as storage
capacities have increased. Finally, as noted above, the traditional
BIOS has limited flexibility and is challenging to update or extend.
This inflexibility restricts the ability to support new hardware and
technologies efficiently \cite{osdev_uefi}\cite{acmcs2015}.
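
Two of the limits quoted above can be checked with a rough worked
calculation. It rests on assumptions that are not stated in the text:
an 8088-class processor with a 20-bit address bus, 32-bit sector
addresses in the MBR partition entries, and 512-byte sectors. In real
mode, a physical address is built from a 16-bit segment and a 16-bit
offset, so each segment spans at most $2^{16}$ bytes (64~KiB) and the
whole address space is bounded by the width of the address bus:
\[
    \mathrm{address} = 16 \times \mathrm{segment} + \mathrm{offset},
    \qquad
    2^{20}\ \mathrm{bytes} = 1\ \mathrm{MiB}
\]
Similarly, the largest disk that an MBR can describe under these
assumptions is:
\[
    2^{32}\ \mathrm{sectors} \times 512\ \mathrm{bytes/sector}
    = 2^{41}\ \mathrm{bytes} = 2\ \mathrm{TiB}
\]
The second figure depends directly on the assumed sector size: with
4096-byte sectors, the same 32-bit field would reach
$2^{32} \times 2^{12} = 2^{44}$ bytes, that is 16~TiB. \\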

\section{Modern BIOS and UEFI}

\subsection{Transition from traditional BIOS to UEFI (Unified
            Extensible Firmware Interface)}

The limitations listed above drove a transition to UEFI, a more
modern firmware interface designed to address the shortcomings of
the traditional BIOS. This section delves into the historical context
of this shift, the driving factors behind it, and the advantages
UEFI offers over the traditional BIOS. \\

The development of UEFI began in the mid-1990s as part of the
Intel Boot Initiative, which aimed to modernize the boot process
and overcome the limitations of the traditional BIOS. By 2005, the
Unified EFI Forum, a consortium of technology companies including
Intel, AMD, and Microsoft, had formalized the UEFI specification
\cite{uefi_spec}, which provides several key improvements over
the traditional BIOS.

\begin{figure}[H]
    \centering
    \includegraphics[width=0.25\textwidth]{images/uefi_logo.png}
    \caption{The UEFI logo (public domain, 2010)}
\end{figure}

One of the most significant advancements of UEFI is its support for
32-bit and 64-bit modes, allowing it to address more memory and
run more complex firmware programs. This capability enables UEFI
to handle the increased demands of modern hardware and software
\cite{intel_uefi}\cite{shin2011}. Additionally, UEFI uses the GUID
Partition Table (GPT) instead of the MBR, supporting disks larger
than 2 terabytes and allowing many more partitions (typically 128)
than the four primary partitions of the MBR
\cite{microsoft_uefi}\cite{russinovich2012}. \\
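
As a rough illustration of the gain in addressable space, the
following calculation assumes flat 32-bit addresses for the firmware,
64-bit logical block addresses in the GPT, and 512-byte sectors; none
of these figures is stated in the text above. A 32-bit mode can
address:
\[
    2^{32}\ \mathrm{bytes} = 4\ \mathrm{GiB}
\]
and a GPT with 64-bit block addresses can describe a disk of up to:
\[
    2^{64}\ \mathrm{sectors} \times 512\ \mathrm{bytes/sector}
    = 2^{73}\ \mathrm{bytes} = 8\ \mathrm{ZiB}
\]
Under the same sector-size assumption, this is $2^{32}$ times the
2~TiB ($2^{41}$ bytes) reachable with 32-bit sector addresses in an
MBR. \\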

Improved boot performance is another driving factor. UEFI
can provide faster boot times than the traditional BIOS,
thanks to more efficient hardware and software initialization
processes. This improvement is particularly beneficial for systems
with complex hardware configurations, where quick boot times
are essential \cite{intel_uefi}. UEFI's modular architecture also
makes it more extensible and easier to update than the
traditional BIOS. This design allows for the addition of drivers,
applications, and other components without requiring a complete
firmware overhaul, providing greater flexibility and adaptability
to new technologies \cite{acmcs2015}. UEFI also includes enhanced
security features such as \textit{Secure Boot}, which ensures that
only trusted software can be executed during the boot process,
thereby protecting the system from unauthorized modifications and
malware \cite{osdev_uefi}\cite{chang2013}. \\

The industry-wide support and standardization of UEFI have accelerated
its adoption across various platforms and devices. Major industry
players, including Intel, AMD, and Microsoft, have adopted UEFI as
the new standard for firmware interfaces, ensuring broad compatibility
and interoperability \cite{uefi_spec}.

\subsection{Another way with \textit{coreboot}}

While UEFI has become the dominant firmware interface for modern
computing systems, it is not without its critics. Some of the primary
concerns about UEFI include its complexity, potential security
vulnerabilities, and the degree of control it provides to hardware
manufacturers over the boot process. Originally known as LinuxBIOS,
\textit{coreboot} is a free firmware project initiated in 1999 by
Ron Minnich and his team at the Los Alamos National Laboratory. The
project's primary goal was to create a fast, lightweight, and
flexible firmware solution that could initialize hardware and
boot operating systems quickly, while remaining transparent and
auditable \cite{coreboot}. As an alternative to UEFI, \textit{coreboot}
offers a different approach to firmware that aims to address some
of these concerns and continue the evolution of BIOS. \\

One of the main advantages of \textit{coreboot} over UEFI is its
simplicity, as it is designed to perform only the minimal tasks
required to initialize hardware and pass control to a payload, such
as a bootloader or operating system kernel. This minimalist approach
reduces the attack surface and potential for security vulnerabilities,
as there is less code that could be exploited by malicious actors.
Another significant benefit of \textit{coreboot}
is its libre nature. Unlike UEFI, which is controlled by a consortium
of hardware and software vendors, \textit{coreboot}'s source code
is freely available and can be audited, modified, and improved by
anyone. This transparency ensures that security researchers and
developers can review the code for potential vulnerabilities and
contribute to its improvement, fostering a community-driven approach
to firmware development \cite{coreboot}. The project also supports
a wide range of bootloaders, called payloads, allowing users to
customize their boot process to suit their specific needs. Popular
payloads include SeaBIOS, which provides legacy BIOS compatibility, and
TianoCore, which offers UEFI functionality within the \textit{coreboot}
framework. This flexibility allows \textit{coreboot} to be used in
a variety of environments, from embedded systems to high-performance
servers \cite{coreboot_payloads}. \\

\begin{figure}[H]
    \centering
    \includegraphics[width=0.3\textwidth]{images/coreboot_logo.png}
    \caption{The \textit{coreboot} logo, by Konsult Stuge \&
        coresystems
        (coreboot logo license, 2008)}
\end{figure}

Despite its advantages, \textit{coreboot} is not without its
challenges. The project relies heavily on community contributions, and
support for new hardware often lags behind that of UEFI. Additionally,
the minimalist design of \textit{coreboot} means that some advanced
features provided by UEFI are not available by default. However,
the \textit{coreboot} community continues to work on adding
new features, improving compatibility with modern hardware, and
addressing security issues \cite{coreboot_challenges}. For example,
it provides a \textit{verified boot} function, which helps prevent
rootkits and other attacks based on firmware modifications
\cite{coreboot_docs}.
However, it is important to note that \textit{coreboot} is not entirely
free in all aspects. Many modern processors and chipsets require
\textit{proprietary blobs}. A blob, short for \textit{Binary Large
Object}, is a collection of binary data stored as a single entity. These
blobs are necessary for \textit{coreboot} to function correctly on
a wide range of hardware, since they are
used for certain functionalities such as memory initialization and
hardware management, but they compromise the goal of one day having
a fully free firmware \cite{blobs}.

\begin{figure}[H]
    \centering
    \includegraphics[width=0.25\textwidth]{images/gnuboot.png}
    \caption{The \textit{GNU Boot} logo, by Jason Self (CC0, 2020)}
\end{figure}

To address these concerns, the GNU Project has developed GNU Boot,
a firmware distribution that includes \textit{coreboot} and
aims to be entirely free by avoiding the use of proprietary
binary blobs. \\

GNU Boot is only a distribution: it reuses existing software projects
and, in this respect, is not very different from fully free GNU/Linux
distributions like Trisquel or Guix. GNU Boot is committed to using
only free software
for all aspects of firmware, making it a preferred choice for users
and organizations that prioritize software freedom and transparency.
Its goals include building the software and assembling it into something
that can be installed, as well as testing it and providing installation
and upgrade instructions \cite{gnuboot}.

\section{Shift in firmware responsibilities}
|
|
|
|
|
|
2024-08-21 12:53:44 +02:00
|
|
|
|
Initially, the BIOS's primary function was to perform the POST, a basic
|
2024-08-22 19:18:34 +02:00
|
|
|
|
diagnostic testing process to check the system's hardware components
|
|
|
|
|
and ensure they were functioning correctly. This included verifying
|
|
|
|
|
the CPU, memory, and essential peripherals before passing control to
|
|
|
|
|
the operating system's bootloader. This process was relatively simple,
|
|
|
|
|
given the limited capabilities and straightforward architecture of
|
2024-08-27 16:03:22 +02:00
|
|
|
|
early computer systems \cite{osdev_uefi}.
|
2024-08-21 12:53:44 +02:00
|
|
|
|
|
|
|
|
|
As computer systems advanced, particularly with the advent of more
|
|
|
|
|
sophisticated memory technologies, the role of firmware expanded
|
2024-08-22 19:18:34 +02:00
|
|
|
|
significantly. Modern memory modules operate at much higher
|
|
|
|
|
speeds and capacities than their predecessors, requiring precise
|
|
|
|
|
configuration to ensure stability and optimal performance. Firmware
|
|
|
|
|
now plays a critical role in managing the memory controller, which is
|
|
|
|
|
responsible for regulating data flow between the processor and memory
|
|
|
|
|
modules. This includes configuring memory frequencies, voltage levels,
|
|
|
|
|
and timing parameters to match the specifications of the installed
|
2024-08-25 11:54:54 +02:00
|
|
|
|
memory \cite{uefi_spec}\cite{BKDG}. Beyond memory management,
|
2024-08-22 19:18:34 +02:00
|
|
|
|
firmware responsibilities have broadened to encompass a wide range
|
2024-08-27 13:29:46 +02:00
|
|
|
|
of system-critical tasks, to the point of including runtime components
in addition to its initialization tasks.
|
|
|
|
|
One key area is power management, where
|
2024-08-22 19:18:34 +02:00
|
|
|
|
firmware is responsible for optimizing energy consumption across
|
|
|
|
|
various components of the system. Efficient power management is
|
2024-08-21 12:53:44 +02:00
|
|
|
|
essential not only for extending battery life in portable devices
|
2024-08-22 19:18:34 +02:00
|
|
|
|
but also for reducing thermal output and ensuring system longevity
|
2024-08-25 11:54:54 +02:00
|
|
|
|
in desktop and server environments. Moreover, modern firmware takes
|
2024-08-22 19:18:34 +02:00
|
|
|
|
on significant roles in hardware initialization and configuration,
|
|
|
|
|
which were traditionally handled by the operating system. For
|
|
|
|
|
example, the initialization of USB controllers, network interfaces,
|
|
|
|
|
and storage devices is now often managed by the firmware during
|
|
|
|
|
the early stages of the boot process. This shift ensures that the
|
|
|
|
|
operating system can seamlessly interact with hardware from the
|
|
|
|
|
moment it takes control, reducing boot times and improving overall
|
2024-08-25 11:54:54 +02:00
|
|
|
|
system reliability \cite{uefi_spec}. Security has also become a
|
2024-08-22 19:18:34 +02:00
|
|
|
|
paramount concern for modern firmware. UEFI (Unified Extensible
|
|
|
|
|
Firmware Interface), which has largely replaced traditional BIOS
|
|
|
|
|
in modern systems, includes features that prevent unauthorized
|
|
|
|
|
or malicious software from loading during the boot process. This
|
|
|
|
|
helps protect the system from rootkits and other low-level malware
|
|
|
|
|
that could compromise the integrity of the operating system before
|
2024-08-25 11:54:54 +02:00
|
|
|
|
it even starts \cite{uefi_spec}. In the context of performance
|
2024-08-22 19:18:34 +02:00
|
|
|
|
tuning, firmware sometimes also plays a key role in enabling and
|
|
|
|
|
managing overclocking, particularly for the memory subsystem. By
|
|
|
|
|
allowing adjustments to memory frequencies, voltages, and timings,
|
|
|
|
|
firmware provides tools for enthusiasts to push their systems beyond
|
|
|
|
|
default limits. At the same time, it includes safeguards to manage
|
|
|
|
|
the risks of instability and hardware damage, balancing performance
|
2024-08-27 16:03:22 +02:00
|
|
|
|
gains with system reliability \cite{osdev_uefi}. \\
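
As a hedged illustration of the timing parameters that firmware has to
program, consider a DDR3-1600 module with a CAS latency of 11 clock
cycles. This value is an assumption chosen as a typical example, and it
is not taken from the KGPE-D16 documentation. The I/O clock of such a
module runs at 800 MHz, so the clock period and the absolute CAS latency
are:
\[
    t_{\mathrm{CK}} = \frac{1}{800\,\mathrm{MHz}} = 1.25\,\mathrm{ns},
    \qquad
    t_{\mathrm{CAS}} = \mathrm{CL} \times t_{\mathrm{CK}}
                     = 11 \times 1.25\,\mathrm{ns} = 13.75\,\mathrm{ns}.
\]
Firmware typically works in the other direction: it starts from the
delays advertised by each installed module, expressed in time units,
and rounds them up to a whole number of clock periods for the selected
frequency.
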
|
2024-08-21 12:53:44 +02:00
|
|
|
|
|
2024-08-22 19:18:34 +02:00
|
|
|
|
In summary, the evolution of firmware from simple hardware
|
|
|
|
|
initialization routines to complex management systems reflects the
|
|
|
|
|
increasing sophistication of modern computer architectures. Firmware
|
|
|
|
|
is now a critical layer that not only ensures the correct functioning
|
|
|
|
|
of hardware components but also optimizes performance, manages power
|
|
|
|
|
consumption, and enhances system security, making it an indispensable
|
|
|
|
|
part of contemporary computing. \\
|
2024-08-21 12:53:44 +02:00
|
|
|
|
|
2024-08-22 19:18:34 +02:00
|
|
|
|
The following parts of this document focus on \textit{coreboot}, both
to study how modern firmware interacts with hardware and as a
basis for improvements.
|
2024-08-21 12:53:44 +02:00
|
|
|
|
|
2024-08-25 11:54:54 +02:00
|
|
|
|
% ------------------------------------------------------------------------------
|
|
|
|
|
% CHAPTER 2: Characteristics of ASUS KGPE-D16 mainboard
|
|
|
|
|
% ------------------------------------------------------------------------------
|
2024-08-21 12:53:44 +02:00
|
|
|
|
\chapter{Characteristics of ASUS KGPE-D16 mainboard}
|
|
|
|
|
|
2024-08-21 21:27:29 +02:00
|
|
|
|
\begin{figure}[H]
|
2024-08-22 19:18:34 +02:00
|
|
|
|
\centering \includegraphics[width=0.9\textwidth]{images/kgpe-d16.png}
|
2024-08-21 21:27:29 +02:00
|
|
|
|
\caption{The KGPE-D16 (CC BY-SA 4.0, 2021)}
|
|
|
|
|
\end{figure}
|
2024-08-22 19:18:34 +02:00
|
|
|
|
|
2024-08-21 21:27:29 +02:00
|
|
|
|
\newpage
|
2024-08-21 12:53:44 +02:00
|
|
|
|
|
|
|
|
|
\section{Overview of ASUS KGPE-D16 hardware}
|
|
|
|
|
|
2024-08-22 19:18:34 +02:00
|
|
|
|
The ASUS KGPE-D16 server mainboard is a dual-socket motherboard
|
|
|
|
|
designed to support AMD Family 10h/15h series processors. Released
|
|
|
|
|
in 2009, this mainboard was later awarded the \textit{Respects Your
|
|
|
|
|
Freedom} (RYF) certification in March 2017, underscoring its commitment
|
|
|
|
|
to fully free software compatibility \cite{fsf_ryf}. Indeed, this
|
|
|
|
|
mainboard can be operated with a fully free firmware such as GNU
|
|
|
|
|
Boot \cite{gnuboot_status}. \\
|
|
|
|
|
|
|
|
|
|
This mainboard is equipped with robust hardware components designed to
|
|
|
|
|
meet the demands of high-performance computing. It features 16 DDR3
|
|
|
|
|
DIMM slots, capable of supporting up to 256GB of memory under specific
conditions, although certain configurations may be limited to 192GB.
|
|
|
|
|
In terms of expandability, the KGPE-D16 includes multiple PCIe
|
|
|
|
|
slots, with five physical slots available, although only four
|
|
|
|
|
can be used simultaneously due to slot sharing. For storage, the
|
|
|
|
|
mainboard provides several SATA ports. Networking capabilities are
|
|
|
|
|
enhanced by integrated dual gigabit Ethernet ports, which provide
|
|
|
|
|
high-speed connectivity essential for data-intensive tasks and network
|
2024-08-25 11:54:54 +02:00
|
|
|
|
communication \cite{asus_kgpe_d16_manual}. Additionally, the board
|
2024-08-22 19:18:34 +02:00
|
|
|
|
is equipped with various peripheral interfaces, including USB ports,
|
|
|
|
|
audio outputs, and other I/O ports, ensuring compatibility with a
|
|
|
|
|
wide range of external devices. \\
|
2024-08-21 12:53:44 +02:00
|
|
|
|
|
|
|
|
|
\begin{figure}[H]
|
|
|
|
|
\centering
|
|
|
|
|
\includegraphics[width=0.8\textwidth]{images/fig1_schema_basique.png}
|
2024-08-22 19:18:34 +02:00
|
|
|
|
\caption{Basic schematics of the ASUS KGPE-D16 Mainboard, ASUS
|
|
|
|
|
(2011)} \label{fig:d16_basic_schematics}
|
2024-08-21 12:53:44 +02:00
|
|
|
|
\end{figure}
|
|
|
|
|
|
2024-08-22 19:18:34 +02:00
|
|
|
|
The physical layout of the ASUS KGPE-D16 is meticulously designed
|
|
|
|
|
to optimize airflow, cooling, and power distribution. All of this
|
|
|
|
|
is critical for maintaining system stability, particularly under
|
|
|
|
|
heavy computational loads, as this board was designed for server
|
|
|
|
|
operations. In particular, key components such as the CPU sockets,
|
|
|
|
|
memory slots, and PCIe slots are strategically positioned. \\
|
2024-08-21 12:53:44 +02:00
|
|
|
|
|
|
|
|
|
\begin{figure}[H]
|
|
|
|
|
\centering
|
|
|
|
|
\includegraphics[width=0.8\textwidth]{images/kgpe-d16_real.png}
|
|
|
|
|
\caption{The KGPE-D16, viewed from the top (CC BY-SA 4.0, 2024)}
|
|
|
|
|
\label{fig:d16_top_view}
|
|
|
|
|
\end{figure}
|
|
|
|
|
|
|
|
|
|
\section{Chipset}
|
|
|
|
|
|
2024-08-22 19:18:34 +02:00
|
|
|
|
Before diving into the specific components, it is essential
|
|
|
|
|
to understand the roles of the northbridge and southbridge in
|
|
|
|
|
traditional motherboard architecture. These chipsets historically
|
|
|
|
|
managed communication between the CPU and other critical components
|
|
|
|
|
of the system \cite{amd_chipsets}. \\
|
|
|
|
|
|
|
|
|
|
The northbridge is a chipset on the motherboard that traditionally
|
|
|
|
|
manages high-speed communication between the CPU, memory (RAM), and
|
|
|
|
|
graphics card (if applicable). It serves as a hub for data that needs
|
|
|
|
|
to move quickly between these components. On the ASUS KGPE-D16, the
|
|
|
|
|
functions typically associated with the northbridge are divided between
|
|
|
|
|
the CPU’s internal northbridge and an external SR5690 northbridge
|
|
|
|
|
chip. The SR5690 specifically acts as a translator and switch,
|
|
|
|
|
handling the HyperTransport interface, a high-speed communication
|
|
|
|
|
protocol used by AMD processors, and converting it to ALink and PCIe
|
|
|
|
|
interfaces, which are crucial for connecting peripherals like graphics
|
2024-08-25 11:54:54 +02:00
|
|
|
|
cards \cite{SR5690BDG}. Additionally, the northbridge on the KGPE-D16
|
2024-08-22 19:18:34 +02:00
|
|
|
|
incorporates the IOMMU (Input-Output Memory Management Unit), which
|
|
|
|
|
is crucial for ensuring secure and efficient memory access by I/O
|
|
|
|
|
devices. The IOMMU allows for the virtualization of memory addresses,
|
|
|
|
|
providing device isolation and preventing unauthorized memory access,
|
|
|
|
|
which is particularly important in environments that run multiple
|
|
|
|
|
virtual machines \cite{amd_chipsets}\cite{northbridge_wiki}. \\
|
|
|
|
|
|
|
|
|
|
The southbridge, on the other hand, is responsible for handling
|
|
|
|
|
lower-speed, peripheral interfaces such as the PCI, USB, and
|
|
|
|
|
IDE/SATA connections, as well as managing onboard audio and
|
|
|
|
|
network controllers. On the KGPE-D16, these functions are managed
|
|
|
|
|
by the SP5100 southbridge chip, which integrates several critical
|
|
|
|
|
functions including the LPC bridge, SATA controllers, and other
|
|
|
|
|
essential I/O operations \cite{amd_chipsets}\cite{southbridge_wiki}.
|
|
|
|
|
It is essentially an ALink bus controller and includes the hardware
|
|
|
|
|
interrupt controller, the IOAPIC. Interrupts from peripherals always
|
|
|
|
|
pass through the northbridge (fig. \ref{fig:d16_ioapic}), since it
|
|
|
|
|
translates ALink to HyperTransport for the CPUs and contains the
|
|
|
|
|
IOMMU \cite{SR5690BDG}. \\
|
|
|
|
|
|
2024-08-21 12:53:44 +02:00
|
|
|
|
\begin{figure}[H]
|
2024-08-22 19:18:34 +02:00
|
|
|
|
\centering \includegraphics[width=0.9\textwidth]{images/ioapic.png}
|
|
|
|
|
\caption{Functional diagram presenting the IOAPIC function of
|
|
|
|
|
the SP5100,
|
2024-08-21 12:53:44 +02:00
|
|
|
|
ASUS (2011)}
|
|
|
|
|
\label{fig:d16_ioapic}
|
|
|
|
|
\end{figure}
|
|
|
|
|
|
2024-08-22 19:18:34 +02:00
|
|
|
|
In addition to the northbridge and southbridge, the KGPE-D16 also
|
|
|
|
|
contains specialized chips for managing input/output operations and
|
|
|
|
|
system health monitoring. The WINBOND W83667HG-A Super I/O chip handles
|
|
|
|
|
traditional I/O functions such as legacy serial and parallel ports,
|
2024-08-27 13:46:50 +02:00
|
|
|
|
keyboard, and mouse interfaces, but also the SPI chip (Serial Peripheral
|
|
|
|
|
Interface, a synchronous serial communication protocol primarily used
|
|
|
|
|
to communicate between microcontrollers and peripheral devices like
|
|
|
|
|
sensors or memory devices) that contains the firmware \cite{winbond}.
|
|
|
|
|
Meanwhile, the Nuvoton W83795G/ADG Hardware
|
2024-08-22 19:18:34 +02:00
|
|
|
|
Monitor oversees the system’s health by monitoring temperatures,
|
|
|
|
|
voltages, and fan speeds, ensuring that the system operates within
|
2024-08-25 11:54:54 +02:00
|
|
|
|
safe parameters \cite{nuvoton}. On the KGPE-D16, access to the Super
|
2024-08-22 19:18:34 +02:00
|
|
|
|
I/O from a CPU core is done through the SR5690, then the SP5100,
|
|
|
|
|
as can be observed on the functional diagram of the chipset
|
|
|
|
|
(fig. \ref{fig:d16_chipset}) \cite{SR5690BDG}.
|
|
|
|
|
|
2024-08-21 12:53:44 +02:00
|
|
|
|
\begin{figure}[H]
|
|
|
|
|
\centering
|
|
|
|
|
\includegraphics[width=0.8\textwidth]{images/fig2_diagramme_chipset.png}
|
2024-08-22 19:18:34 +02:00
|
|
|
|
\caption{Functional diagram of the KGPE-D16 chipset (CC BY-SA 4.0,
|
|
|
|
|
2024)} \label{fig:d16_chipset}
|
2024-08-21 12:53:44 +02:00
|
|
|
|
\end{figure}
|
|
|
|
|
|
|
|
|
|
\section{Processors}
|
|
|
|
|
|
2024-08-22 19:18:34 +02:00
|
|
|
|
The ASUS KGPE-D16 supports AMD Family 10h processors, but
|
|
|
|
|
it is important to note that Vikings, a known vendor for
|
|
|
|
|
libre-software-compatible hardware, does not recommend using the
|
|
|
|
|
Opteron 6100 series due to the lack of IOMMU support, which is
|
|
|
|
|
critical for security. Fortunately, AMD Family 15h processors are also
|
|
|
|
|
supported. However, the Opteron 6300 series, while supported, requires
|
|
|
|
|
proprietary microcode updates for stability, IOMMU functionality,
|
|
|
|
|
and fixes for specific vulnerabilities, including a gain-root-via-NMI
exploit. The Opteron 6200 series does not suffer from these
|
|
|
|
|
problems and works properly without any proprietary microcode update
|
2024-08-21 12:53:44 +02:00
|
|
|
|
needed \cite{vikings}. \\
|
|
|
|
|
|
|
|
|
|
\begin{figure}[H]
|
|
|
|
|
\centering
|
|
|
|
|
\includegraphics[width=0.9\textwidth]{images/opteron6200_annoté.png}
|
2024-08-22 19:18:34 +02:00
|
|
|
|
    \caption{Annotated photograph of an Opteron 6200 series
        CPU (2024), from a photograph by AMD Inc. (2008)}
|
2024-08-21 12:53:44 +02:00
|
|
|
|
\label{fig:opteron2600}
|
|
|
|
|
\end{figure}
|
2024-08-22 19:18:34 +02:00
|
|
|
|
|
|
|
|
|
The Opteron 6200 series, part of the Bulldozer microarchitecture,
|
|
|
|
|
was designed to target high-performance server applications. These
|
|
|
|
|
processors feature 16 cores, organized into 8 Bulldozer modules,
|
|
|
|
|
with each module containing two integer cores that share
|
|
|
|
|
resources like the floating-point unit (FPU) and L2 cache
|
2024-08-21 12:53:44 +02:00
|
|
|
|
(fig. \ref{fig:opteron2600}, \ref{fig:opteron2600_diagram})
|
|
|
|
|
\cite{amd_6200}\cite{anandtech_bulldozer}.
|
2024-08-22 19:18:34 +02:00
|
|
|
|
|
|
|
|
|
The architecture of the Opteron 6200 series is built around AMD's
|
|
|
|
|
Bulldozer core design, which uses Clustered Multithreading (CMT) to
|
|
|
|
|
maximize resource utilization. This is a technique where each processor
|
|
|
|
|
module contains two integer cores that share certain resources like
|
|
|
|
|
the floating-point unit (FPU), L2 cache, and instruction fetch/decode
|
|
|
|
|
stages. Unlike traditional multithreading, where each core handles
|
|
|
|
|
multiple threads, CMT allows two cores to share resources to improve
|
|
|
|
|
parallel processing efficiency. This approach aims to balance
|
|
|
|
|
performance and resource usage, particularly in multi-threaded
|
|
|
|
|
workloads, though it can lead to some performance trade-offs in
|
|
|
|
|
single-threaded tasks. In the Opteron 6272, the processor consists
|
|
|
|
|
of eight modules, effectively creating 16 integer cores. Because the
package is a multi-chip module built from two dies, each Opteron 6272
functions as two CPUs within a single processor, each with its own set
of cores, L2 caches, and shared L3 cache. Each of these CPUs is made up
of four modules, and each module shares certain components, such as the
FPU and L2 cache, between its two integer cores. The L3 cache is shared
across these modules.
|
|
|
|
|
HyperTransport links provide high-speed communication between the two
|
|
|
|
|
sockets of the KGPE-D16. Shared L3 cache and direct memory access are
|
|
|
|
|
provided by each socket \cite{amd_6200}\cite{hill_impact_caching}. \\
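
The hierarchy described above can be summarized with a short count.
This is only a sketch, under the assumption that both sockets of the
board are populated with an Opteron 6272; the symbols
$N_{\mathrm{cores}}$ and $N_{\mathrm{FPU}}$ are introduced here for
illustration:
\[
    N_{\mathrm{cores}} = 2\ \mathrm{sockets} \times 2\ \mathrm{CPUs}
        \times 4\ \mathrm{modules} \times 2\ \mathrm{cores} = 32,
\]
\[
    N_{\mathrm{FPU}} = 2\ \mathrm{sockets} \times 2\ \mathrm{CPUs}
        \times 4\ \mathrm{modules} = 16.
\]
In such a configuration, the firmware therefore has to bring up 32
integer cores spread over four internal CPUs, while only 16
floating-point units are available.
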
|
|
|
|
|
|
|
|
|
|
This architecture also integrates a quad-channel DDR3 memory
|
|
|
|
|
controller directly into the processor die, which facilitates high
|
|
|
|
|
bandwidth and low latency access to memory. This memory controller
|
|
|
|
|
supports DDR3 memory speeds up to 1600 MHz and connects directly
|
|
|
|
|
to the memory modules via the memory bus. By integrating the memory
|
|
|
|
|
controller into the processor, the Opteron 6200 series reduces memory
|
|
|
|
|
access latency, enhancing overall performance
|
|
|
|
|
\cite{amd_6200}\cite{amd_ddr3_guide}.
|
|
|
|
|
It is interesting to note that Opterons
incorporate the internal northbridge mentioned earlier. The
|
|
|
|
|
traditional northbridge functions, such as memory controller and PCIe
|
|
|
|
|
interface management, are partially integrated into the processor. This
|
|
|
|
|
integration reduces the distance data must travel between the CPU and
|
|
|
|
|
memory, decreasing latency and improving performance, particularly
|
|
|
|
|
in memory-intensive applications \cite{amd_6200}. \\
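
To give an order of magnitude for the quad-channel controller described
above, the theoretical peak bandwidth per processor package can be
estimated as follows. Two assumptions are made here: each DDR3 channel
has the standard 64-bit (8-byte) data bus, and the figure of 1600 MHz is
read as $1600 \times 10^{6}$ transfers per second.
\[
    B_{\mathrm{peak}} = 4\ \mathrm{channels} \times 8\,\mathrm{B}
        \times 1600 \times 10^{6}\,\mathrm{s}^{-1}
        = 51.2\,\mathrm{GB/s}.
\]
This is an upper bound only: the bandwidth actually reached depends on
the installed modules and on the timings programmed by the firmware.
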
|
2024-08-21 12:53:44 +02:00
|
|
|
|
|
|
|
|
|
|
|
|
|
|
\begin{figure}[H]
|
2024-08-22 19:18:34 +02:00
|
|
|
|
\centering \includegraphics[width=0.8\textwidth]{
|
|
|
|
|
images/fig3_img_dual_processor_node.png}
|
|
|
|
|
\caption{Functional diagram of an Opteron 6200 package
|
|
|
|
|
(CC BY-SA 4.0, 2024)}
|
2024-08-21 12:53:44 +02:00
|
|
|
|
\label{fig:opteron2600_diagram}
|
|
|
|
|
\end{figure}
|
|
|
|
|
|
2024-08-22 19:18:34 +02:00
|
|
|
|
Power efficiency was a key focus in the design of the Opteron 6200
|
2024-08-25 11:54:54 +02:00
|
|
|
|
series. Despite the high core count, the processor includes several
|
2024-08-22 19:18:34 +02:00
|
|
|
|
power management features, such as Dynamic Power Management (DPM)
|
|
|
|
|
and Turbo Core technology. These features allow the processor to
|
|
|
|
|
adjust power usage based on workload demands, balancing performance
|
|
|
|
|
with energy consumption. However, the Bulldozer architecture's
|
|
|
|
|
focus on high clock speeds and multi-threaded performance resulted
|
2024-08-21 21:27:29 +02:00
|
|
|
|
in higher power consumption compared to competing architectures
|
2024-08-22 19:18:34 +02:00
|
|
|
|
\cite{anandtech_bulldozer}. Special models of the series, called
\textit{high efficiency} models, partly address this problem by offering
slightly lower performance in exchange for a power consumption reduced
by a factor of 1.5 to 2.0 in some cases. \\
|
2024-08-21 12:53:44 +02:00
|
|
|
|
|
|
|
|
|
The processor connected to the I/O hub is known as the Bootstrap
|
2024-08-25 11:54:54 +02:00
|
|
|
|
Processor (BSP). The BSP is responsible for starting up the system
|
2024-08-22 19:18:34 +02:00
|
|
|
|
by executing the initial firmware code from the reset vector,
|
|
|
|
|
a specific memory address where the CPU begins execution after a
|
|
|
|
|
reset \cite{amd_bsp}. Core 0 of the BSP, called the Bootstrap Core
|
|
|
|
|
(BSC), initiates this process. During early initialization, the
|
|
|
|
|
BSP performs several critical tasks, such as memory initialization,
|
|
|
|
|
and bringing other CPU cores online. One of its duties is storing
|
|
|
|
|
Built-In Self-Test (BIST) information, which involves checking the
|
|
|
|
|
integrity of the processor's internal components to ensure they are
|
|
|
|
|
functioning correctly. The BSP also determines the type of reset
|
2024-08-27 13:52:27 +02:00
|
|
|
|
that has occurred, whether it is a cold reset, which happens when
|
2024-08-22 19:18:34 +02:00
|
|
|
|
the system is powered on from an off state, or a warm reset, which
|
|
|
|
|
is a restart without turning off the power. Identifying the reset
|
|
|
|
|
type is crucial for deciding which initialization procedures need
|
|
|
|
|
to be executed \cite{amd_bsp}\cite{BKDG}.
|
2024-08-21 12:53:44 +02:00
|
|
|
|
|
2024-08-21 21:27:29 +02:00
|
|
|
|
\section{Baseboard Management Controller}
|
2024-08-22 19:18:34 +02:00
|
|
|
|
|
2024-08-21 21:27:29 +02:00
|
|
|
|
The Baseboard Management Controller (BMC) on the KGPE-D16 motherboard,
|
|
|
|
|
specifically the ASpeed AST2050, plays a role in the server's
|
2024-08-22 19:18:34 +02:00
|
|
|
|
architecture by managing out-of-band communication and control of
|
2024-08-25 11:54:54 +02:00
|
|
|
|
the hardware. The AST2050 is based on an ARM926EJ-S processor,
|
2024-08-22 19:18:34 +02:00
|
|
|
|
a low-power 32-bit ARM architecture designed for embedded systems
|
|
|
|
|
\cite{ast2050_architecture}. This architecture is well-suited for BMCs
|
|
|
|
|
due to its efficiency and capability to handle multiple management
|
|
|
|
|
tasks concurrently without significant resource demands from the
|
|
|
|
|
main system. \\
|
|
|
|
|
|
|
|
|
|
The AST2050 features several key components that contribute to
|
2024-08-25 11:54:54 +02:00
|
|
|
|
its functionality. It includes an integrated VGA controller,
|
2024-08-22 19:18:34 +02:00
|
|
|
|
which enables remote graphical management through KVM-over-IP
|
|
|
|
|
(Keyboard, Video, Mouse), a critical feature for administrators who
|
|
|
|
|
need to interact with the system remotely, including BIOS updates
|
|
|
|
|
and troubleshooting \cite{ast2050_kvm}. Additionally, the AST2050
|
|
|
|
|
integrates a dedicated memory controller, which supports up to 256MB
|
2024-08-25 11:54:54 +02:00
|
|
|
|
of DDR2 RAM. This allows it to handle complex tasks and maintain
|
2024-08-22 19:18:34 +02:00
|
|
|
|
responsiveness during management operations \cite{ast2050_memory}.
|
2024-08-21 21:27:29 +02:00
|
|
|
|
The BMC also features a network interface controller (NIC) dedicated to
|
2024-08-22 19:18:34 +02:00
|
|
|
|
management traffic, ensuring that remote management does not interfere
|
|
|
|
|
with the primary network traffic of the server. This separation is
|
|
|
|
|
vital for maintaining secure and uninterrupted system management,
|
|
|
|
|
especially in environments where uptime is critical \cite{ast2050_nic}.
|
|
|
|
|
Another important architectural aspect of the AST2050 is its support
|
|
|
|
|
for multiple I/O interfaces, including I2C, GPIO, UART, and USB,
|
|
|
|
|
which allow it to interface with various sensors and peripherals
|
|
|
|
|
on the motherboard \cite{ast2050_io}. This versatility enables
|
|
|
|
|
comprehensive monitoring of hardware health, such as temperature
|
|
|
|
|
sensors, fan speeds, and power supplies, all of which can be managed
|
2024-08-21 21:27:29 +02:00
|
|
|
|
and configured through the BMC. \\
|
2024-08-21 13:13:02 +02:00
|
|
|
|
|
2024-08-22 19:18:34 +02:00
|
|
|
|
When combined with OpenBMC \cite{openbmc_wiki}, a libre firmware
|
|
|
|
|
that can be run on the AST2050 thanks to Raptor Engineering
|
|
|
|
|
\cite{raptor_engineering}, the architecture of the BMC becomes even
|
|
|
|
|
more powerful. OpenBMC takes advantage of the AST2050's architecture,
|
|
|
|
|
providing a flexible and customizable environment that can be tailored
|
|
|
|
|
to specific use cases. This includes adding or modifying features
|
|
|
|
|
related to security, logging, and network management, all within
|
|
|
|
|
the BMC's ARM architecture framework \cite{openbmc_customization}.
|
2024-08-21 12:53:44 +02:00
|
|
|
|
|
2024-08-25 11:54:54 +02:00
|
|
|
|
% ------------------------------------------------------------------------------
|
|
|
|
|
% CHAPTER 3: Key components in modern firmware
|
|
|
|
|
% ------------------------------------------------------------------------------
|
2024-08-22 19:18:34 +02:00
|
|
|
|
\chapter{Key components in modern firmware}
|
2024-08-21 12:53:44 +02:00
|
|
|
|
|
2024-08-21 21:27:29 +02:00
|
|
|
|
\section{General structure of coreboot}
|
|
|
|
|
|
2024-08-22 15:38:22 +02:00
|
|
|
|
The firmware of the ASUS KGPE-D16 is crucial in ensuring the proper
|
|
|
|
|
functioning and optimization of the mainboard's hardware components.
|
2024-08-25 15:57:26 +02:00
|
|
|
|
In this chapter and for the rest of this document, we're basing our
|
|
|
|
|
study on the 4.11 version of \textit{coreboot} \cite{coreboot_4_11},
|
|
|
|
|
which is the last version that supported the ASUS KGPE-D16 mainboard. \\
|
|
|
|
|
|
2024-08-27 13:41:07 +02:00
|
|
|
|
For the initialization tasks to be done efficiently, \textit{coreboot} is
|
2024-08-25 15:57:26 +02:00
|
|
|
|
organized in different stages (fig. \ref{fig:coreboot_stages})
|
|
|
|
|
\cite{coreboot_docs}.
|
2024-08-21 12:53:44 +02:00
|
|
|
|
|
2024-08-21 21:27:29 +02:00
|
|
|
|
\begin{figure}[H]
|
|
|
|
|
\centering
|
|
|
|
|
\includegraphics[width=0.9\textwidth]{
|
2024-08-21 12:53:44 +02:00
|
|
|
|
images/fig9_coreboot_stages.png}
|
2024-08-22 15:38:22 +02:00
|
|
|
|
\caption{\textit{coreboot}'s stages timeline, by
|
|
|
|
|
\textit{coreboot} project (CC BY-SA 4.0, 2009)}
|
2024-08-21 21:27:29 +02:00
|
|
|
|
\label{fig:coreboot_stages}
|
|
|
|
|
\end{figure}
|
|
|
|
|
|
2024-08-22 15:38:22 +02:00
|
|
|
|
Being a complex project with ambitious goals, \textit{coreboot} decided
|
|
|
|
|
early on to establish a file-system-based architecture for its images
|
|
|
|
|
(also called ROMs). This special file-system is CBFS (which stands for
|
|
|
|
|
coreboot file system). The CBFS architecture consists of a binary image
|
|
|
|
|
that can be interpreted as a physical disk, referred to here as ROM. A
|
|
|
|
|
number of independent components, each with a header added to the data,
|
|
|
|
|
are located within the ROM. The components are nominally arranged
|
|
|
|
|
sequentially, although they are aligned along a predefined boundary
|
|
|
|
|
(fig. \ref{fig:coreboot_diagram}). \\
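
The alignment rule can be written down as a small formula. This is a
sketch with notation introduced here for illustration: $o_k$ is the
offset of component $k$ in the ROM, $h_k$ the size of its header, $d_k$
the size of its data, and $a$ the alignment boundary. The next component
then starts at the first multiple of $a$ that follows the end of
component $k$:
\[
    o_{k+1} = \left\lceil \frac{o_k + h_k + d_k}{a} \right\rceil \times a.
\]
For instance, assuming $a = 64$ bytes, a component placed at $o_k = 0$
with $h_k + d_k = 100$ bytes is followed by a component at
\[
    o_{k+1} = \left\lceil \frac{100}{64} \right\rceil \times 64
            = 2 \times 64 = 128.
\]
The value of 64 bytes is only an assumption for this example; the
alignment that actually applies is a property of the image itself.
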
|
|
|
|
|
|
|
|
|
|
Each stage is compiled as a separate binary and inserted into the CBFS
|
|
|
|
|
with custom compression. The bootblock stage is usually not compressed,
|
|
|
|
|
while the ramstage and the payload are compressed with LZMA. Each stage
|
|
|
|
|
loads the next stage at a given address (possibly decompressing it in
|
|
|
|
|
the process). \\
|
|
|
|
|
|
|
|
|
|
Some stages are relocatable and can be placed anywhere in the RAM.
|
|
|
|
|
These stages are typically cached in the CBMEM for faster loading times
|
|
|
|
|
during wake-up. The CBMEM is a specific memory area used by the
|
|
|
|
|
\textit{coreboot} firmware to store important data structures and logs
|
|
|
|
|
during the boot process. This area is typically allocated in the
|
|
|
|
|
system's RAM and is used to store various types of runtime information
|
|
|
|
|
that it might need to reference after the initial boot stages. \\
|
|
|
|
|
|
|
|
|
|
In general, \textit{coreboot} manages main memory through a structured
|
|
|
|
|
memory map (table \ref{tab:memmap}), allocating specific address ranges
|
|
|
|
|
for various hardware functions and system operations. The first 640KB
|
|
|
|
|
of memory space is typically unused by coreboot due to historical
|
|
|
|
|
reasons. Graphics-related operations use the VGA address range
|
|
|
|
|
and the text mode address ranges. It also reserves the higher range for
|
|
|
|
|
operating system use, ensuring that critical system components
|
|
|
|
|
like the IOAPIC and TPM registers have dedicated address spaces.
|
|
|
|
|
This structured approach helps maintain system stability and
|
2024-08-22 19:18:34 +02:00
|
|
|
|
compatibility across different platforms and allows for a reset vector
|
2024-08-27 14:15:12 +02:00
|
|
|
|
fixed at an address (\texttt{0xFFFFFFF0}), regardless of the ROM size.
|
2024-08-22 15:38:22 +02:00
|
|
|
|
Payloads are typically loaded into high memory, above the reserved areas
|
|
|
|
|
for hardware components and system resources. The exact memory location
|
|
|
|
|
can vary depending on the system's configuration, but generally,
|
|
|
|
|
payloads are placed in a region of memory that does not conflict with
|
|
|
|
|
the firmware code or the reserved memory map areas, such as the ROM
|
|
|
|
|
mapping ranges. This placement ensures that payloads have sufficient
|
|
|
|
|
space to execute without interfering with other critical memory regions
|
|
|
|
|
allocated \cite{coreboot_mem_management}.
|
|
|
|
|
|
|
|
|
|
\begin{table}[ht]
|
|
|
|
|
\makebox[\textwidth][c]{%
|
|
|
|
|
\begin{tabular}{
|
|
|
|
|
|>{\centering\arraybackslash}p{0.35\textwidth}
|
|
|
|
|
|>{\centering\arraybackslash}p{0.5\textwidth}|}
|
|
|
|
|
\hline
|
2024-08-25 15:57:26 +02:00
|
|
|
|
\path{0x00000 - 0x9FFFF}
|
2024-08-22 15:38:22 +02:00
|
|
|
|
& Low memory (first 640KB). Never used. \\
|
|
|
|
|
\hline
|
2024-08-25 15:57:26 +02:00
|
|
|
|
\path{0xA0000 - 0xAFFFF}
|
2024-08-22 15:38:22 +02:00
|
|
|
|
& VGA graphics address range. \\
|
|
|
|
|
\hline
|
2024-08-25 15:57:26 +02:00
|
|
|
|
\path{0xB0000 - 0xB7FFF}
|
2024-08-22 15:38:22 +02:00
|
|
|
|
& Monochrome text mode address range.
|
|
|
|
|
Few motherboards use
|
|
|
|
|
it, but the KGPE-D16 does. \\
|
|
|
|
|
\hline
|
2024-08-25 15:57:26 +02:00
|
|
|
|
\path{0xB8000 - 0xBFFFF}
|
2024-08-22 15:38:22 +02:00
|
|
|
|
& Text mode address range. \\
|
|
|
|
|
\hline
|
2024-08-25 15:57:26 +02:00
|
|
|
|
\path{0xFEC00000}
|
2024-08-22 15:38:22 +02:00
|
|
|
|
& IOAPIC address. \\
|
|
|
|
|
\hline
|
2024-08-25 15:57:26 +02:00
|
|
|
|
\path{0xFED44000 - 0xFED4FFFF}
|
2024-08-22 15:38:22 +02:00
|
|
|
|
& Address range for TPM registers. \\
|
|
|
|
|
\hline
|
2024-08-25 15:57:26 +02:00
|
|
|
|
\path{0xFF000000 - 0xFFFFFFFF}
|
2024-08-22 15:38:22 +02:00
|
|
|
|
& 16 MB ROM mapping address range. \\
|
|
|
|
|
\hline
|
2024-08-25 15:57:26 +02:00
|
|
|
|
\path{0xFF800000 - 0xFFFFFFFF}
|
2024-08-22 15:38:22 +02:00
|
|
|
|
& 8 MB ROM mapping address range. \\
|
|
|
|
|
\hline
|
2024-08-25 15:57:26 +02:00
|
|
|
|
\path{0xFFC00000 - 0xFFFFFFFF}
|
2024-08-22 15:38:22 +02:00
|
|
|
|
& 4 MB ROM mapping address range. \\
|
|
|
|
|
\hline
|
2024-08-25 15:57:26 +02:00
|
|
|
|
\path{0xFEC00000 - DEVICE MEM HIGH}
|
2024-08-22 15:38:22 +02:00
|
|
|
|
& Reserved area for OS use. \\
|
|
|
|
|
\hline
|
|
|
|
|
\end{tabular}}
|
|
|
|
|
\caption{\textit{coreboot} memory map}
|
|
|
|
|
\label{tab:memmap}
|
|
|
|
|
\end{table}
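
The three ROM mapping ranges of table \ref{tab:memmap} follow from a
single rule: the ROM is mapped so that it ends at the top of the 32-bit
address space. Writing $S$ for the ROM size (a symbol introduced here
for illustration), the base address of the mapping is
\[
    \mathrm{base}(S) = 2^{32} - S,
\]
which gives
\[
    2^{32} - 16\,\mathrm{MB} = \mathtt{0x100000000} - \mathtt{0x01000000}
        = \mathtt{0xFF000000},
\]
\[
    2^{32} - 8\,\mathrm{MB} = \mathtt{0x100000000} - \mathtt{0x00800000}
        = \mathtt{0xFF800000},
\]
\[
    2^{32} - 4\,\mathrm{MB} = \mathtt{0x100000000} - \mathtt{0x00400000}
        = \mathtt{0xFFC00000}.
\]
Since the reset vector sits at $2^{32} - 16 = \mathtt{0xFFFFFFF0}$, it
always falls within the last 16 bytes of the mapped ROM, whatever the
value of $S$.
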
|
2024-08-21 21:27:29 +02:00
|
|
|
|
|
2024-08-25 15:57:26 +02:00
|
|
|
|
\subsection{Bootblock}
|
2024-08-22 15:38:22 +02:00
|
|
|
|
|
|
|
|
|
The bootblock is the first stage executed after the CPU reset. The
|
|
|
|
|
beginning of this stage is written in assembly language, and its
|
|
|
|
|
main task is to set everything up for a C environment. The rest, of
|
|
|
|
|
course, is written in C. This stage occupies the last 20 KB
|
|
|
|
|
(fig. \ref{fig:coreboot_diagram}) of the image and within it is a
|
|
|
|
|
main header containing information about the ROM, including the
|
|
|
|
|
size, component alignment, and the offset of the start of the first
|
|
|
|
|
CBFS component. This block is a mandatory component as it also
|
|
|
|
|
contains the entry point of the firmware. \\
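
As a rough check of this layout, the start address of the bootblock can
be estimated under two assumptions: the stage occupies exactly
$20 \times 1024$ bytes, and the ROM is mapped so that it ends at the top
of the 32-bit address space, as in table \ref{tab:memmap}.
\[
    2^{32} - 20 \times 1024 = \mathtt{0x100000000} - \mathtt{0x5000}
        = \mathtt{0xFFFFB000}.
\]
Under these assumptions the bootblock covers the range
\texttt{0xFFFFB000} to \texttt{0xFFFFFFFF}, which contains the reset
vector at \texttt{0xFFFFFFF0}. This is consistent with the bootblock
holding the entry point of the firmware.
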
|
|
|
|
|
|
|
|
|
|
\begin{figure}[H]
|
|
|
|
|
\centering
|
2024-08-22 19:18:34 +02:00
|
|
|
|
\includegraphics[width=0.8\textwidth]{images/fig8_coreboot_architecture.png}
|
2024-08-22 15:38:22 +02:00
|
|
|
|
\caption{\textit{coreboot} ROM architecture
|
|
|
|
|
(CC BY-SA 4.0, 2024)}
|
|
|
|
|
\label{fig:coreboot_diagram}
|
|
|
|
|
\end{figure}
|
|
|
|
|
|
2024-08-22 19:18:34 +02:00
|
|
|
|
Upon startup, the first responsibility of the bootblock is to
execute the code located at the conventional reset vector in
16-bit real mode. This code is specific to the
|
|
|
|
|
processor architecture and, for our board, is stored in the
|
|
|
|
|
architecture-specific sources for x86 within \textit{coreboot}.
|
|
|
|
|
The entry point into \textit{coreboot} code is defined in two files
|
2024-08-25 15:57:26 +02:00
|
|
|
|
in the \path{src/cpu/x86/16bit/} directory: \path{reset16.inc}
|
|
|
|
|
and \path{entry16.inc}. The first file serves as a jump to the
|
|
|
|
|
\path{_start16bit} procedure defined in the second. Due to space
|
2024-08-22 19:18:34 +02:00
|
|
|
|
constraints, this function must remain below the 1MB address space
|
|
|
|
|
because the IOMMU has not yet been configured to allow anything
|
|
|
|
|
else. \\
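
As background for the 1MB limit, the following sketch recalls how
addresses are formed in 16-bit real mode on x86 processors. It relies on
general x86 behaviour rather than on the \textit{coreboot} sources. A
physical address is built from a 16-bit segment and a 16-bit offset:
\[
    \mathrm{phys} = 16 \times \mathrm{segment} + \mathrm{offset},
\]
so the highest address that can be formed this way is
\[
    16 \times \mathtt{0xFFFF} + \mathtt{0xFFFF}
        = \mathtt{0xFFFF0} + \mathtt{0xFFFF} = \mathtt{0x10FFEF},
\]
which lies just above 1MB. The very first instruction fetch is a special
case: right after reset, the hidden base of the code segment is
\texttt{0xFFFF0000} and the instruction pointer is \texttt{0xFFF0},
which gives
\[
    \mathtt{0xFFFF0000} + \mathtt{0xFFF0} = \mathtt{0xFFFFFFF0},
\]
the reset vector.
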
|
|
|
|
|
|
|
|
|
|
During this early initialization, the Bootstrap Core (BSC) performs
|
|
|
|
|
several critical tasks while the other cores remain dormant. These
|
|
|
|
|
tasks include saving the results (and displaying them if necessary)
|
|
|
|
|
of the Built-in Self-Test (BIST), formerly known as POST;
|
|
|
|
|
invalidating the TLB to prevent any address translation errors;
|
|
|
|
|
determining the type of reset (e.g., cold start or warm start);
|
|
|
|
|
and creating and loading an empty Interrupt Descriptor Table (IDT) to
|
|
|
|
|
prevent the use of "legacy" interrupts from real mode until
|
|
|
|
|
protected mode is reached. In practice, this means that at the
|
|
|
|
|
slightest exception, the BSC will halt. The code then switches to
|
|
|
|
|
32-bit protected mode by mapping the first 4 GB of address space for
|
|
|
|
|
code and data, and finally jumps to the 32-bit reset code labeled
|
2024-08-25 15:57:26 +02:00
|
|
|
|
\path{_protected_start}. \\
|
2024-08-22 19:18:34 +02:00
|
|
|
|
|
|
|
|
|
|
|
|
|
|
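
To make the idea of mapping the first 4 GB more concrete, here is a
minimal Python sketch of how a flat segment descriptor is encoded on
x86. It models the standard 8-byte descriptor layout; the helper names
and the access byte values (0x9A for code, 0x92 for data) are
illustrative assumptions and not necessarily those used by
\textit{coreboot}.

\begin{verbatim}
```python
def encode_descriptor(base: int, limit: int, access: int, flags: int) -> int:
    """Pack an x86 segment descriptor into its 64-bit representation."""
    descriptor = limit & 0xFFFF                    # limit bits 15..0
    descriptor |= (base & 0xFFFFFF) << 16          # base bits 23..0
    descriptor |= (access & 0xFF) << 40            # access byte
    descriptor |= ((limit >> 16) & 0xF) << 48      # limit bits 19..16
    descriptor |= (flags & 0xF) << 52              # granularity and size flags
    descriptor |= ((base >> 24) & 0xFF) << 56      # base bits 31..24
    return descriptor


def covered_bytes(limit: int, page_granular: bool) -> int:
    """Size of the segment described by a 20-bit limit."""
    unit = 4096 if page_granular else 1
    return (limit + 1) * unit


# Assumed example values: base 0, 20-bit limit 0xFFFFF, 4 KB granularity
# and 32-bit default operand size (flags 0xC).
FLAT_CODE = encode_descriptor(0, 0xFFFFF, 0x9A, 0xC)
FLAT_DATA = encode_descriptor(0, 0xFFFFF, 0x92, 0xC)

if __name__ == "__main__":
    print(f"{FLAT_CODE:#018x}")
    print(f"{FLAT_DATA:#018x}")
    print(covered_bytes(0xFFFFF, True))
```
\end{verbatim}

A limit of 0xFFFFF counted in 4 KB units covers exactly 4 GB, which is
why a single code descriptor and a single data descriptor are enough to
reach the whole 32-bit address space. \\
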
Once in protected mode, which constitutes the ``normal'' operating
mode for the processor, the next step is to set up the execution
environment. To achieve this, the code contained in
\path{src/cpu/x86/32bit/entry32.inc}, followed by
\path{src/cpu/x86/64bit/entry64.inc}, and finally
\path{src/arch/x86/bootblock_crt0.S}, establishes a temporary
stack, transitions to long mode (64-bit addressing) with paging
enabled, and sets up a proper exception vector table. Execution
then jumps to chipset-specific code via the
\path{bootblock_pre_c_entry} procedure.
Once these steps are completed, the bootblock has a minimal C
environment. What remains is to allocate memory for the BSS, and to
decompress and load the next stage. \\

The jump to \path{bootblock_pre_c_entry} leads to the code files
\path{src/soc/amd/common/block/cpu/car/cache_as_ram.S} and
\path{src/vendorcode/amd/agesa/f15tn/gcccar.inc}, which are specific
to AMD chipsets. It is worth noting that these files were developed by
AMD's engineers as part of the \textit{AGESA} project. The operations
performed at this stage are related to pre-RAM memory initialization.
All cores of all processors (up to a limit of 64 cores) are started.
The \textit{Cache-As-Ram} is configured using the
Memory Type Range Registers (MTRRs). These registers allow a
caching policy to be assigned to a given memory area
\cite{BKDG}.
In this case, the area that should correspond to physical memory is
mapped to the cache, while other areas, such as PCI or other bus
zones, are configured accordingly. A specific stack is set up for
each core of each processor (within the arbitrary limit of 64 cores
and 7 nodes, meaning 7 Core 0s). Core 0s receive 16 KB, while the
Bootstrap Core (BSC) gets 64 KB. The other cores receive 4 KB each.
All cores except the BSC are halted and will restart during the
romstage. Finally, execution jumps to the entry point of the
\textit{bootblock} written in C, labeled
\path{bootblock_c_entry}.
This entry point is located in
\path{src/soc/amd/stoneyridge/bootblock/bootblock.c} and is
specific to AMD processors. It is the first C routine executed, and
its role is to verify that the current processor is indeed the BSC,
so that the function \path{bootblock_main_with_basetime}
is called exclusively by the BSC. \\

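
The stack sizes listed above lend themselves to a quick budget
calculation. The Python sketch below is only a model of that
arithmetic: the function names and the example topology (4 nodes of
8 cores) are assumptions made here for illustration, and the BSC is
assumed to be Core 0 of the first node.

\begin{verbatim}
```python
KIB = 1024
BSC_STACK = 64 * KIB      # Bootstrap Core
CORE0_STACK = 16 * KIB    # Core 0 of every other node
OTHER_STACK = 4 * KIB     # all remaining cores
MAX_CORES = 64
MAX_NODES = 7


def stack_size(node: int, core: int) -> int:
    """Stack size for one core; node 0 / core 0 is assumed to be the BSC."""
    if core == 0:
        return BSC_STACK if node == 0 else CORE0_STACK
    return OTHER_STACK


def total_stack_bytes(nodes: int, cores_per_node: int) -> int:
    """Total cache-as-RAM stack space needed for a uniform topology."""
    if not 1 <= nodes <= MAX_NODES:
        raise ValueError("node count outside the supported range")
    if cores_per_node < 1 or nodes * cores_per_node > MAX_CORES:
        raise ValueError("core count outside the supported range")
    return sum(
        stack_size(node, core)
        for node in range(nodes)
        for core in range(cores_per_node)
    )


if __name__ == "__main__":
    # Hypothetical topology: 4 nodes of 8 cores each.
    print(total_stack_bytes(4, 8) // KIB, "KiB")
```
\end{verbatim}

For this hypothetical topology the total comes to 224 KB
($64 + 3 \times 16 + 28 \times 4$), which gives an idea of how much of
the cache must be set aside for stacks alone. \\
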
We are now in the file \path{src/lib/bootblock.c}, written by
Google's team, and entering the
\path{bootblock_main_with_basetime} function, which immediately
calls \path{bootblock_main_with_timestamp}. At this stage, the
goal is to start the romstage, but a few more tasks need to be
completed.

The \path{bootblock_soc_early_init} function is called to
initialize the I2C bus of the southbridge. The
\path{bootblock_fch_early_init} function is invoked to
initialize the SPI buses (Serial Peripheral Interface,
providing access to the chip that contains the ROM) and the
serial and ``legacy'' buses of the southbridge. The CMOS clock is then
initialized, followed by the pre-initialization of the serial
console.
The code then calls the \path{bootblock_mainboard_init}
function, which enters, for the first time, the files specific to
the ASUS KGPE-D16 motherboard:
\path{src/mainboard/asus/kgpe-d16/bootblock.c}.
This code performs the northbridge initialization via the
\path{bootblock_northbridge_init} function found in
\path{src/northbridge/amd/amdfam10/bootblock.c}. This involves
locating the HyperTransport bus and enabling the discovery of
devices connected to it (e.g., processors). The southbridge is
initialized using the \path{bootblock_southbridge_init}
function from \path{src/southbridge/amd/sb700/bootblock.c}.
This function, largely programmed by Timothy Pearson from Raptor
Engineering, who performed the first \textit{coreboot} port for the
ASUS KGPE-D16, finalizes the activation of the SPI bus and the
connection to the ROM memory via SuperIO. The state of a recovery
jumper is then checked (this jumper is intended to reset the CMOS
content, although it is not fully functional at the moment, as
indicated by the \path{FIXME} comment in the code). Control then
returns to \path{bootblock_main_with_timestamp} in
\path{src/lib/bootblock.c}. \\

At this point, everything is ready to enter the romstage.
\textit{coreboot} has successfully started and can now continue its
execution by calling the \path{run_romstage} function from
\path{src/lib/prog_loaders.c}. This function begins by locating
the corresponding segment in the ROM via the southbridge and SPI
bus using \path{prog_locate}, which relies on the SPI driver in
\path{src/drivers/cbfs_spi.c}. The contents of the romstage are
then copied into the cache-as-RAM by
\path{cbfs_prog_stage_load}. Finally, the \path{prog_run}
function transitions to the romstage after switching back to
32-bit mode.

\subsection{Romstage}

The \textit{romstage} in \textit{coreboot} serves the critical function
of early initialization of peripherals, particularly system memory.
This stage is crucial for setting up the necessary components for the
platform's operation, ensuring that everything is in place for
subsequent stages of the boot process.
During this phase, \textit{coreboot} configures the Advanced
Programmable Interrupt Controller (APIC), which is responsible for
correctly handling interrupts across multiple CPUs, especially in
systems using Symmetric Multiprocessing (SMP). This includes setting
up the Local APIC on each processor and the IOAPIC, part of the
southbridge, to ensure that interrupts from peripherals are routed
to the appropriate CPUs. Additionally, the firmware configures
HyperTransport (HT), a high-speed communication protocol
that carries data between the processor and the northbridge.
Microcode patches may also be loaded into the CPU during this stage;
they remain resident afterwards, as do the settings applied to the
memory controllers and the CPU. \\

The \textit{romstage} begins with a call to the
\path{_start} function, defined in
\path{src/cpu/x86/32bit/entry32.inc} via
\path{src/arch/x86/assembly_entry.S}. We then enter the
\path{cache_as_ram_setup} procedure, written in assembly
language, located in \path{src/cpu/amd/car/cache_as_ram.inc}. This
procedure configures the cache to load the future \textit{ramstage}
and initialize memory based on the number of processors and cores
present. Once this is completed, the code calls
\path{cache_as_ram_main} in
\path{src/mainboard/asus/kgpe-d16/romstage.c}, which serves as the
main function of the \textit{romstage}.
In the \path{cache_as_ram_main} function, after reducing the
speed of the HyperTransport bus, only the Bootstrap Core (BSC)
initializes the spinlocks for the serial console, the CMOS storage
memory (used for saving parameters), and the ROM. At this point, the
HyperTransport bus is enumerated, and the PCI bridges are
temporarily disabled. Port 0x80 of the southbridge, used for
motherboard debugging with \textit{POST codes}, is also initialized.
These codes indicate the status of the boot process and can be
displayed using special PCI cards connected to the system. The
SuperIO is then initialized to activate the serial port, allowing
\textit{coreboot}'s progress to be followed in real time on the
serial console. If everything proceeds as expected, the code 0x30 is
sent, and the boot process continues. \\

If the result of the Built-in Self-Test (BIST), saved during the
\textit{bootblock}, shows no anomalies, all cores of all nodes are
configured, and they are placed back into sleep mode (except for the
Core 0s). If everything goes well, the code 0x32 is sent, and the
process continues. Using the \path{enable_sr5650_dev8} function,
the southbridge's P2P bridge is activated. Additionally, a check is
performed to ensure that the number of physical processors detected
does not exceed the number of sockets available on the board. If any
issues were detected during the BIST, the machine halts, and the
error is displayed on the console. Otherwise, the default hardware
information table is constructed, and the microcode of the physical
processors is updated if necessary. If everything proceeds correctly,
the codes 0x33 and then 0x34 are sent, and the process continues. The
information about the physical processors is retrieved using
\path{amd_ht_init}, and communication between the two sockets is
configured via \path{amd_ht_fixup}. This process includes disabling
any defective HT links (one per socket in this AMD Family 15h
chipset). If everything is working as expected, the code 0x35 is
sent, and the boot process continues.
With the \path{finalize_node_setup} function, the PCI bus is
initialized, and a mapping is created
(\path{setup_mb_resource_map}). If all goes well, the code 0x36
is sent. This is done in parallel across all Core 0s, so the system
waits for all of them to finish using the
\path{wait_all_core0_started} function. The communication
between the northbridge and southbridge is prepared using
\path{sr5650_early_setup} and
\path{sb7xx_51xx_early_setup}, followed by the activation of
all cores on all nodes, with the system waiting for all cores to be
fully initialized. If everything is successful, the code 0x38 is
sent. \\

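
The checkpoints above can be summarized in a small lookup table. The
Python sketch below is an aid for reading a POST card or a serial log,
not code taken from \textit{coreboot}: the dictionary and function
names are hypothetical, and the descriptions merely paraphrase the
steps listed in this section.

\begin{verbatim}
```python
# POST codes sent on port 0x80 during the romstage, as described in the
# surrounding text. The wording of each entry is a paraphrase.
ROMSTAGE_POST_CODES = {
    0x30: "SuperIO initialized, serial console available",
    0x32: "cores configured, secondary cores back to sleep",
    0x33: "hardware information table built, microcode updated if needed",
    0x34: "hardware information table built, microcode updated if needed",
    0x35: "HyperTransport links between sockets configured",
    0x36: "PCI bus initialized and resource map created",
    0x38: "all cores on all nodes initialized",
}


def describe(code):
    """Return the step associated with a POST code, if it is known."""
    return ROMSTAGE_POST_CODES.get(code, "unknown code")


def last_completed_step(codes_seen):
    """Describe the most recent known checkpoint in a sequence of codes."""
    for code in reversed(codes_seen):
        if code in ROMSTAGE_POST_CODES:
            return f"{code:#04x}: {ROMSTAGE_POST_CODES[code]}"
    return "no known checkpoint reached"


if __name__ == "__main__":
    # Hypothetical log of codes captured before the machine stopped.
    print(last_completed_step([0x30, 0x32, 0x33, 0x34, 0x35]))
```
\end{verbatim}

If a machine stops after displaying 0x35, for instance, the table
suggests looking at the step that follows the HyperTransport
configuration, namely the PCI bus setup. \\
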
At this point, the timer is activated, and a warm reset is performed
via the \path{soft_reset} function so that all configuration
changes to the HT and PCI buses and to the voltage/power settings of
the processors and buses take effect. This results in a system reboot,
passing again through the \textit{bootblock}, but much faster this
time since the system recognizes the warm reset condition. Once this
reboot is complete, the HyperTransport bus is reconfigured into
isochronous mode (switching from asynchronous mode), finalizing the
configuration process. \\

Memory training and optimization are also key functions of the
firmware during the \textit{romstage}. This process involves
adjusting memory settings, such as timings, frequencies, and
voltages, to ensure that the installed memory modules operate
efficiently and stably. This step is crucial for achieving optimal
performance, especially when dealing with large amounts of RAM
and many CPU cores, as supported by the KGPE-D16. This is covered
in detail in the next chapter. \\

After memory initialization, the process returns to the
\path{cache_as_ram_main} function, where a memory test is
performed. This involves writing predefined values to specific
memory locations and then verifying that the values can be read
back correctly.
If everything passes successfully, the CBMEM is initialized and
the code 0x41 is sent. At this point, the configuration of
the PCI bus is prepared, which will be completed during the ramstage
by configuring the PCI bridges. The system then exits
\path{cache_as_ram_main} and returns to
\path{cache_as_ram_setup} to finalize the process.

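
The write-then-read-back principle of the memory test described above
can be sketched as follows. This Python model is only an illustration:
\textit{coreboot} performs the test on real RAM, whereas here memory is
simulated by a byte array, and the helper names, the test patterns and
the simulated stuck bit are assumptions made for the example.

\begin{verbatim}
```python
# Assumed test patterns: all zeros, all ones, and two alternating ones.
PATTERNS = (0x00, 0xFF, 0x55, 0xAA)


def memory_test(write, read, addresses, patterns=PATTERNS):
    """Write each pattern to each address and collect read-back mismatches."""
    failures = []
    for address in addresses:
        for pattern in patterns:
            write(address, pattern)
            value = read(address)
            if value != pattern:
                failures.append((address, pattern, value))
    return failures


def make_ram(size, stuck_low_bit_at=None):
    """Simulated RAM; optionally one cell has bit 0 stuck at zero."""
    cells = bytearray(size)

    def write(address, value):
        if address == stuck_low_bit_at:
            value &= 0xFE  # the faulty cell never stores bit 0
        cells[address] = value

    def read(address):
        return cells[address]

    return write, read


if __name__ == "__main__":
    good_write, good_read = make_ram(16)
    print(memory_test(good_write, good_read, range(16)))
    bad_write, bad_read = make_ram(16, stuck_low_bit_at=3)
    print(memory_test(bad_write, bad_read, range(16)))
```
\end{verbatim}

Using several patterns matters: in this model the faulty cell passes
with 0x00 and 0xAA, whose lowest bit is already zero, and only fails
with 0xFF and 0x55.
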
\textit{coreboot} then transitions to the next stage, known as the
postcar stage, where it exits the cache-as-RAM mode and
begins using physical RAM.

\subsection{Ramstage}

The ramstage performs the general initialization of all peripherals,
including the initialization of PCI devices, on-chip devices, the
TPM (if not done by verstage), graphics (optional), and the CPU
(setting up the System Management Mode). After this initialization,
tables are written to inform the payload or operating system about
the existence and current state of the hardware. These tables
include ACPI tables (specific to x86), SMBIOS tables (specific to
x86), coreboot tables, and updates to the device tree (specific to
ARM). Additionally, the ramstage locks down the hardware and
firmware by applying write protection to boot media, locking
security-related registers, and locking SMM (specific to x86),
which is a resident component in a protected area
\cite{coreboot_docs}. CBMEM data structures (such as the coreboot
tables and the memory map) are populated during this stage and left
resident for the OS or payload to access, as are the SMBIOS tables,
the ACPI tables and possibly option ROMs. \\

Effective resource allocation is essential for system stability,
particularly in complex configurations involving multiple CPUs
and peripherals. This stage manages initial resource allocation,
resolving any conflicts between hardware components to prevent
resource contention and ensure smooth operation and security, which
is a major concern in modern systems. This includes support for
the IOMMU, which is crucial for preventing unauthorized direct memory
access (DMA) attacks, particularly in virtualized environments
(although vulnerabilities remain that can be exploited,
such as sub-page or IOTLB-based attacks, or even configuration
weaknesses \cite{medeiros2017}\cite{markuze2021}). \\

\subsubsection{Advanced Configuration and Power Interface}

The Advanced Configuration and Power Interface (ACPI) is a
critical component of modern computing systems, providing an
open standard for device configuration and power management by
the operating system (OS). Developed in 1996 by Intel,
Microsoft, and Toshiba, ACPI replaced the older Advanced Power
Management (APM) standard with more advanced and flexible power
management capabilities \cite{intel_acpi_introduction_2023}.
At its core,
ACPI is implemented through a series of data structures and
executable code known as ACPI tables, which are provided by the
system firmware and interpreted by the OS at runtime. This means
that these firmware components remain resident
while the OS runs. These tables describe
various aspects of the system, including hardware resources,
device power states, and thermal zones. The ACPI Specification
outlines these structures and provides the necessary
standardization for interoperability across different platforms
and operating systems \cite{acpi_os_support}. These tables are
used by the OS to perform low-level tasks, including managing
power states of the CPU, controlling voltage and frequency
scaling (also known as Dynamic Voltage and Frequency Scaling,
or DVFS), and coordinating power delivery to peripherals. \\

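
To make the notion of an ACPI table more tangible, the following Python
sketch builds a table with the standard 36-byte header and verifies its
checksum, which must make all bytes of the table sum to zero modulo
256. The signature, OEM strings and payload used here are placeholders
invented for the example, and the helper names are hypothetical.

\begin{verbatim}
```python
import struct

# Standard ACPI table header: signature, length, revision, checksum,
# OEM ID, OEM table ID, OEM revision, creator ID, creator revision.
HEADER = struct.Struct("<4sIBB6s8sI4sI")
CHECKSUM_OFFSET = 9


def build_table(signature: bytes, payload: bytes) -> bytes:
    """Assemble a table and fill in its checksum byte."""
    length = HEADER.size + len(payload)
    header = HEADER.pack(
        signature, length, 1, 0,       # checksum left at 0 for now
        b"EXAMPL", b"SKETCH  ", 1,     # placeholder OEM fields
        b"TEST", 1,                    # placeholder creator fields
    )
    table = bytearray(header + payload)
    table[CHECKSUM_OFFSET] = (-sum(table)) % 256
    return bytes(table)


def checksum_ok(table: bytes) -> bool:
    """A table is valid when all of its bytes sum to 0 modulo 256."""
    return sum(table) % 256 == 0


def read_header(table: bytes):
    """Return the signature and declared length of a table."""
    signature, length = struct.unpack_from("<4sI", table, 0)
    return signature, length


if __name__ == "__main__":
    table = build_table(b"DEMO", b"\x01\x02\x03")
    print(read_header(table), checksum_ok(table))
```
\end{verbatim}

An operating system performs the same kind of check when it walks the
tables left in memory by the firmware: a table whose bytes do not sum
to zero is treated as corrupted. \\
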
The ACPI Component Architecture (ACPICA) is the reference
implementation of ACPI, providing a common codebase that can be
used by OS developers to integrate ACPI support. ACPICA includes
tools and libraries that allow for the parsing and execution of
ACPI Machine Language (AML) code, which is embedded within the
ACPI tables \cite{acpi_programming}. One of the key tools in
ACPICA is the iASL compiler (Intel ACPI Source Language compiler),
which converts ACPI Source Language (ASL) code into AML bytecode,
allowing firmware developers to write custom ACPI
methods \cite{intel_acpi_spec}. The triggering of ACPI events is
managed through a combination of hardware signals and software
routines. For example, when a user presses the power button on a
system, an ACPI event is generated, which is then handled by the
OS. This event might trigger the system to enter a low-power
state, such as sleep or hibernation, depending on the
configuration provided by the ACPI tables
\cite{acpi_os_support}. These power states are defined in the
ACPI specification, with global states (G0 to G3) representing
different levels of system power consumption, and device states
(D0 to D3) representing individual device power levels. \\

The ASUS KGPE-D16 mainboard, which is designed for server and
high-performance computing environments, relies on ACPI to manage
its power distribution across multiple CPUs and attached
peripherals. ACPI is integral to controlling the power states of
various components, thereby optimizing performance and energy
use. Additionally, the firmware on the KGPE-D16 uses ACPI tables
to manage system temperature and fan speed, ensuring reliable
operation under heavy workloads \cite{asus_kgpe_d16_manual}.

\subsubsection{System Management Mode}

System Management Mode (SMM) is a highly privileged operating
mode provided by x86 processors for handling system-level
functions such as power management, hardware control, and other
critical tasks that must be isolated from the OS and
applications. Introduced by Intel, SMM operates in an
environment separate from the main operating system, offering a
controlled space for executing sensitive operations
\cite{uefi_smm_security}. This is another firmware component
that remains resident while the OS runs. \\

SMM is triggered by a System Management Interrupt (SMI), which
is a non-maskable interrupt that causes the CPU to save its
current state and switch to executing code stored in a protected
area of memory called System Management RAM (SMRAM). SMRAM is a
specialized memory region that is isolated from the rest of the
system, making it inaccessible to the OS and preventing
tampering or interference from other software
\cite{heasman2007}.
Within SMM, the firmware can execute various low-level functions
that require direct hardware control or need to be protected
from the OS. This includes tasks such as thermal management,
where the system monitors CPU temperature and adjusts
performance or power levels to prevent overheating, as well as
power management routines that enable efficient energy usage
by adjusting power states based on system activity
\cite{offsec_bios_smm}. One of the critical security features of
SMM is its role in managing firmware updates and handling
system-level security events. Because SMM operates in a
privileged mode that is isolated from the OS, it can
apply firmware updates and respond to security threats
without being affected by potentially compromised system
software \cite{domas2015}. However, the high privilege level and
isolation of SMM also present significant security challenges.
If an attacker can compromise SMM, they gain full control over
the system, bypassing all security measures implemented by the
OS \cite{cyber_smm_hack}. Moreover, with proprietary firmware,
this highly privileged code cannot be audited,
let alone replaced. \\

The ASUS KGPE-D16 mainboard needs SMM to perform critical
management tasks that must be carried out independently of the
operating system. For example, SMM is used to monitor and manage
system health by responding to thermal events and adjusting
power levels to maintain system stability. SMM operates
independently of the main operating system, allowing it to
perform sensitive tasks securely. \textit{coreboot}
supports SMM, but its implementation is typically
minimal compared to traditional proprietary firmware. In
\textit{coreboot}, SMM initialization involves setting
up the System Management Interrupt (SMI) handler and configuring
System Management RAM (SMRAM), the memory region where SMM code
executes \cite{brown2003linuxbios}. The extent of SMM support in
\textit{coreboot} can vary significantly depending on the
hardware platform and the specific requirements of the system.
\textit{coreboot}'s design philosophy emphasizes a lightweight
and fast boot process, delegating more complex management tasks
to payloads or the operating system itself
\cite{reinauer2008coreboot}.

One of the key challenges with implementing SMM in
\textit{coreboot} is ensuring that SMI handlers are configured
correctly to manage necessary system tasks without compromising
security or performance. \textit{coreboot}'s approach to SMM is
consistent with its overall goal of providing a streamlined and
efficient firmware solution, leaving more intricate
functionalities to be handled by subsequent software layers
\cite{mohr2012comparative}.

\subsection{Payload}

The payload is the software that executes after \textit{coreboot} has
completed its initialization tasks. It resides in the CBFS and is
predetermined at compile time, with no option to choose it at
runtime. The primary role of the payload is to load and hand control
over to the operating system. In some cases, the payload itself can
be a component of the operating system \cite{coreboot_docs}.
Examples of payloads are \textit{GNU GRUB}, \textit{SeaBIOS},
\textit{memtest86+} or even sometimes the \textit{Linux kernel}
itself. \\

\textit{TianoCore}, a free implementation of the UEFI (Unified
Extensible Firmware Interface) specification, is often used as a
payload \cite{tianocore_payload}.
It provides a UEFI environment after \textit{coreboot} has completed
its initial hardware initialization. This allows the system to
benefit from the advanced features of UEFI, such as a more flexible
boot manager and support for modern
hardware. Indeed, UEFI, and by extension \textit{TianoCore},
includes a driver model that allows hardware manufacturers to
provide UEFI-compatible drivers. These drivers can be loaded at
boot time, allowing the firmware to support a wide range of modern
devices that \textit{coreboot}, with its more minimalistic and
custom-tailored approach, might not support out of the box.
For example, Graphics Output Protocol (GOP) drivers are responsible
for setting up the graphics hardware in UEFI environments. They
replace the older VGA BIOS routines used in legacy BIOS systems.
With GOP drivers,
the system can initialize the GPU and display a graphical interface
even before the operating system loads \cite{osdev_gop}.
These are further examples of firmware components that remain
resident while the OS is running. \\
Hardware manufacturers can distribute proprietary UEFI drivers as
part of firmware updates, making it straightforward for end-users
to install and use them. This is especially useful for specialized
hardware that requires specific drivers not provided by the
free software community. It also gives hardware vendors more control
over how their devices are initialized and used, which can be
an advantage for vendors but limits the freedom and control
of users.

Payloads are therefore an essential part of the firmware.

\section{AMD Platform Security Processor and Intel Management Engine}

The AMD Platform Security Processor (PSP) and Intel Management Engine
(ME) are embedded subsystems within AMD and Intel processors,
respectively, that handle a range of security-related tasks
independently of the main CPU. These subsystems are fundamental to the
security architecture of modern computing platforms, providing
functions such as secure boot, cryptographic key management, and
remote system management
\cite{herrmann2017dissecting}.

The AMD PSP is based on an ARM Cortex-A5 processor and is responsible
for several security functions, including the validation of firmware
during boot (secure boot), management of Trusted Platform Module (TPM)
functions, and handling cryptographic operations such as key generation
and storage. The PSP operates independently of the main x86 cores,
which allows it to execute security functions even when the main system
is powered off or compromised by malware \cite{herrmann2017dissecting}.
The PSP's isolated environment ensures that sensitive operations are
protected from threats that could affect the main OS. \\

Similarly, the Intel Management Engine (ME) is a dedicated
|
|
|
|
|
processor embedded within Intel chipsets that operates
|
|
|
|
|
independently of the main CPU. The ME is a comprehensive subsystem that
|
|
|
|
|
provides a variety of functions, including out-of-band system
|
|
|
|
|
management, security enforcement, and support for Digital Rights
|
|
|
|
|
Management (DRM) \cite{intel_csme}. The ME's firmware runs on an
|
|
|
|
|
isolated environment that allows it to perform these tasks securely,
|
|
|
|
|
even when the system is powered off. This capability is crucial for
|
|
|
|
|
enterprise environments where administrators need to perform remote
|
|
|
|
|
diagnostics, updates, and security checks without relying on the main
|
|
|
|
|
OS.
|
|
|
|
|
Intel ME enforces Digital Rights Management (DRM)
|
|
|
|
|
through a multifaceted approach leveraging its deeply embedded,
|
|
|
|
|
hardware-based capabilities. At the core is the Protected
|
|
|
|
|
Execution Environment (PEE), which operates independently from the main
|
|
|
|
|
CPU and operating system. This isolation allows to privately
|
|
|
|
|
manage cryptographic keys, certificates, and other sensitive data
|
|
|
|
|
critical for DRM, which can be very problematic from a user freedom
|
|
|
|
|
perspective \cite{fsf_intel_me}. By handling encryption and decryption
|
|
|
|
|
processes within this protected environment, Intel ME ensures that
|
|
|
|
|
DRM-protected content, such as video streams, remains secure and
|
|
|
|
|
unreachable by the user, raising concerns about the control users have
|
|
|
|
|
over their own devices \cite{eff_intel_me}.
|
|
|
|
|
Intel ME also plays a significant role in maintaining platform
integrity through the secure boot process. During secure boot, Intel ME
ensures that only digitally signed and authorized operating systems and
applications are loaded, which can prevent users from installing
alternative or modified software on their own hardware, further
restricting their freedom \cite{uefi_what_is_uefi}. This is
reinforced by Intel ME's remote attestation capabilities, where the
system’s state is reported to a remote server. This process verifies
that only systems meeting specific security standards dictated by third
parties are allowed to access DRM-protected content, potentially
limiting users' control over their own devices \cite{proprivacy_intel_me}.
Moreover, Intel ME supports High-bandwidth Digital Content Protection
(HDCP), a technology that restricts how digital content is transmitted
over interfaces like HDMI or DisplayPort. By enforcing HDCP, Intel ME
ensures that protected digital content, such as high-definition video,
is only transmitted to and displayed on authorized devices, effectively
preventing users from freely using the content they have legally
acquired \cite{phoronix_hdcp_2_2_i915}\cite{kernel_mei_hdcp}.
Together, these features enable Intel ME to provide a comprehensive and
robust DRM enforcement mechanism. However, this also means that users
have less control over their own hardware and digital content, raising
serious concerns about privacy, user autonomy, and the broader
implications for freedom in computing
\cite{fsf_intel_me}\cite{netgarage_intel_me}. \\

In addition, Intel ME has been a source of controversy due to its deep
integration into the hardware and its potential to be exploited if
vulnerabilities are discovered. Researchers have demonstrated ways to
compromise the ME, potentially gaining control over a system even when
it is powered off \cite{blackhat_me_hack}. These concerns have led to
calls for greater transparency and security measures around the ME and
similar subsystems. When comparing Intel ME and AMD PSP, the primary
difference lies in their scope and functionality. Intel ME offers more
extensive remote management capabilities, making it a more comprehensive
tool for enterprise environments, while AMD PSP focuses more narrowly on
core security tasks. Nonetheless, both play critical roles in ensuring
the security and integrity of modern computing systems. \\

The ASUS KGPE-D16 mainboard includes neither AMD PSP nor Intel ME.

% ------------------------------------------------------------------------------
% CHAPTER 4: Memory initialization and training
% ------------------------------------------------------------------------------
\chapter{Memory initialization and training}

\section{Importance of DDR3 memory initialization}

Memory modules are designed solely for storing data. The only valid
operations on a memory device are reading data stored in the device,
writing (or storing) data into the device, and refreshing the data.
Memory modules consist of large rectangular arrays of memory cells,
together with circuits used to read and write data into the arrays, and
refresh circuits to maintain the integrity of the stored data. The
memory arrays are organized into rows and columns of memory cells,
known as word lines and bit lines, respectively. Each memory cell
has a unique location or address defined by the intersection of a
row and a column. A DRAM memory cell is a capacitor whose charge
state (charged or discharged) represents a 1 or a 0. \\

DDR3 (Double Data Rate Type 3) is a widely used type of
SDRAM (Synchronous Dynamic Random-Access Memory) that offers
significant performance improvements over its predecessors,
DDR and DDR2. A DDR3 DIMM has 240 contacts.
Key features of DDR3 include higher data rates,
lower power consumption, and increased memory capacity, making
it essential for high-performance computing environments
\cite{DDR3_wiki}. One of the critical aspects of DDR3 is its
internal architecture, which supports data rates ranging from
800 to 1600~Mbit/s per pin and operates at a lower voltage of 1.5~V.
This enables faster data processing and more efficient power usage,
crucial for modern applications that require high-speed memory
access \cite{samsung_ddr3}. Additionally, DDR3 memory modules are
available in larger capacities, allowing systems to handle larger
datasets and more complex computing tasks \cite{altera2008}.
However, the advanced features of DDR3 come with increased
complexity in its initialization and operation.
The DDR3 memory interface, used by the ASUS KGPE-D16, is
source-synchronous. Each memory module generates a Data Strobe
(DQS) pulse simultaneously with the data (DQ) it sends during
a memory read operation. Similarly, the system must generate a DQS
together with the DQ information when writing to memory. The DQS differs
between write and read operations. Specifically, the DQS generated
by the system for a write operation is centered in the data bit
period, while the DQS provided by the memory during a read operation
is aligned with the edge of the data period \cite{samsung_ddr3}. \\
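
As a worked illustration of these figures, consider the DDR3-1600 speed
grade (chosen here only as an example). Because DDR memory transfers
data on both clock edges, the memory clock runs at half the transfer
rate, and one bit period lasts half a clock period:
\[
    f_{\mathrm{CK}} = \frac{1600\,\mathrm{MT/s}}{2} = 800\,\mathrm{MHz},
    \qquad
    t_{\mathrm{CK}} = \frac{1}{f_{\mathrm{CK}}} = 1.25\,\mathrm{ns},
    \qquad
    t_{\mathrm{bit}} = \frac{t_{\mathrm{CK}}}{2} = 0.625\,\mathrm{ns}.
\]
With the 64-bit data bus of a DIMM, this corresponds to a peak transfer
rate of $1600 \times 10^{6} \times 8\,\mathrm{bytes} =
12.8\,\mathrm{GB/s}$ per channel. In the ideal case, a write DQS
centered in the data bit period has its edges
$t_{\mathrm{bit}}/2 \approx 0.31\,\mathrm{ns}$ after the start of each
bit, whereas the edges of a read DQS coincide with the start of each bit.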
Due to this edge alignment, the read DQS timing can be adjusted
to meet the setup and hold requirements of the registers capturing
the read data. To improve timing margins or reduce simultaneous
switching noise in the system, the DDR3 memory interface also allows
various other timing parameters to be adjusted. If the system uses
dual in-line memory modules (DIMMs), as in our case, the interface
provides write leveling: a timing adjustment that compensates for
variations in signal travel time \cite{micron_ddr3}.
To reduce simultaneous switching noise, DIMMs feature a
fly-by architecture for routing the address, command, and clock
signals, which causes command signals to reach the
different memory devices at slightly different times. The fly-by
topology has a ``daisy-chain'' structure with either very short stubs
or no stubs at all. This structure results in fewer branches and
point-to-point connections: everything originates from the controller,
passing through each module on the node, thereby increasing the
throughput.
In this topology, signals are routed sequentially
from the memory controller to each DRAM chip, reducing signal
reflections and improving overall signal integrity.
This means that routing is done in the order of byte lane numbers,
and the data byte lanes are routed on the same layer. Routing can be
simplified by swapping data bits within a byte lane if necessary.
The fly-by topology contrasts with the dual-T topology
(fig.~\ref{fig:fly-by}). This design is essential for maintaining
stability at the high speeds DDR3 operates at, but it also
introduces timing challenges, such as timing skew, that must be
carefully managed \cite{micron_ddr3}. \\

\begin{figure}[H]
    \centering
    \begin{minipage}[b]{0.45\textwidth}
        \centering
        \includegraphics[width=0.90\textwidth]{images/fly-by.png}
    \end{minipage}%
    \begin{minipage}[b]{0.45\textwidth}
        \centering
        \includegraphics[width=0.824\textwidth]{images/t.png}
    \end{minipage}
    \caption{DDR3 fly-by \textit{versus} T-topology
                (CC BY-SA 4.0, 2021)}
    \label{fig:fly-by}
\end{figure}

Proper memory initialization ensures that the memory controller
and the memory modules are correctly configured to work together,
allowing for efficient data transfer and reliable operation. The
initialization process involves setting various parameters,
such as memory timings, voltages, and frequencies, which are
critical for ensuring that the memory operates within its optimal
range \cite{samsung_ddr3}. Failure to initialize DDR3 memory
correctly can lead to several serious consequences, including
system instability, data corruption, and reduced performance
\cite{SridharanVilas2015MEiM}. In the worst-case scenario, improper
memory initialization can prevent the system from booting entirely,
as the memory subsystem fails to function correctly.
In the context of the ASUS KGPE-D16, a server motherboard
designed for high-performance applications, proper DDR3 memory
initialization is particularly important. The KGPE-D16 supports
up to 256~GB of DDR3 memory across 16 DIMM slots, and any issue
during memory initialization, even if non-fatal, could severely impact
the system's ability to handle large datasets or maintain stable
operation under heavy workloads \cite{asus_kgpe_d16_manual}. Given
the critical role that memory plays in the overall performance of
the KGPE-D16, ensuring that DDR3 memory is correctly initialized
is essential for achieving the desired balance of performance,
reliability, and stability in demanding server environments.

\section{General steps for DDR3 configuration}

DDR3 memory initialization is a detailed and essential
process that ensures both the stability and performance of the
system. The process involves several critical steps: detection
and identification of memory modules, initial configuration of the
memory controller, adjustment of timing and voltage settings, and
the execution of training and calibration procedures. \\

The initialization begins with the detection and identification of
the installed memory modules. During the BIST, the firmware reads
the Serial Presence Detect (SPD) data stored on
each memory module. SPD data contains crucial information about
the memory module's specifications, including size, speed, CAS
latency (CL), RAS to CAS delay (tRCD), row precharge time (tRP),
and row cycle time (tRC). This data allows the firmware to configure
the memory controller for optimal compatibility and performance. \\
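
As a sketch of how such SPD values become controller settings, consider
the CAS latency. DDR3 SPD data expresses most timings as integer
multiples of a medium timebase (MTB), commonly $0.125\,\mathrm{ns}$.
The values used here are illustrative assumptions, not data read from a
particular module: a stored minimum CAS latency time of 105 MTB units
and a clock period $t_{\mathrm{CK}} = 1.5\,\mathrm{ns}$ (DDR3-1333).
\[
    t_{\mathrm{AA,min}} = 105 \times 0.125\,\mathrm{ns}
                        = 13.125\,\mathrm{ns}
\]
\[
    \mathrm{CL} = \left\lceil
        \frac{t_{\mathrm{AA,min}}}{t_{\mathrm{CK}}}
    \right\rceil
    = \left\lceil \frac{13.125}{1.5} \right\rceil
    = \lceil 8.75 \rceil = 9
\]
The result is rounded up because using fewer clock cycles than the
module requires would violate its minimum timing. The same conversion
applies to tRCD, tRP, and tRC.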
Once the memory modules have been identified, the firmware
proceeds to the initial configuration of the memory controller.
This controller is governed by a state machine that
manages the sequence of operations required to initialize,
maintain, and control memory access. This state machine consists of
multiple states that represent various phases of memory operation,
such as reset, initialization, calibration, and data transfer.
The transitions between these states are either automatic or
command-driven, depending on the specific requirements of each
phase \cite{samsung_ddr3}\cite{micron_ddr3}.
This state machine is shown in
fig.~\ref{fig:ddr3_state_machine}. Automatic transitions, depicted
by thick arrows in the automaton, occur without external
intervention. These typically include transitions that ensure
the memory enters a stable state, such as the transition from
power-on to initialization, or from calibration to idle states.
These transitions are crucial for maintaining the integrity and
stability of the memory system, as they ensure that the controller
progresses through necessary stages like ZQ calibration and write
leveling, which are essential for proper signal timing and
impedance matching
\cite{samsung_ddr3}\cite{micron_ddr3}\cite{burnett_ddr3}. \\

On the other hand, command-driven transitions, represented by thin
arrows in the automaton, require specific commands issued by the
memory controller or the CPU to advance to the next state. For
instance, the transition from the idle state to the data transfer
state requires explicit read or write commands. Similarly,
transitioning from the initialization state to the calibration
state involves issuing mode register set (MRS) commands that
configure the memory’s operating parameters. These command-driven
transitions are integral to the dynamic operation of the memory
system, allowing the controller to respond to the system's
operational needs and ensuring that memory accesses are performed
efficiently and accurately \cite{samsung_ddr3}\cite{micron_ddr3}. \\

The memory controller configuration
involves setting up fundamental parameters such as the memory clock
(MEMCLK) frequency and the memory channel configuration. The MEMCLK
frequency is derived from the SPD data, while the memory channels
are configured to operate in single, dual, or quad-channel modes,
depending on the system architecture and the installed modules
\cite{burnett_ddr3}. Proper configuration of the memory controller
is vital to ensure synchronization with the memory modules,
establishing a stable foundation for subsequent operations. \\
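
One way to picture the choice of MEMCLK is the following example,
which assumes two hypothetical modules on the same channel whose SPD
data report minimum clock periods of $1.25\,\mathrm{ns}$ and
$1.5\,\mathrm{ns}$. The channel has to run at a clock period that every
module supports, so the largest of the minimum periods is retained:
\[
    t_{\mathrm{CK}} = \max(1.25\,\mathrm{ns},\ 1.5\,\mathrm{ns})
                    = 1.5\,\mathrm{ns},
    \qquad
    f_{\mathrm{MEMCLK}} = \frac{1}{t_{\mathrm{CK}}}
                        \approx 667\,\mathrm{MHz}.
\]
This corresponds to DDR3-1333 operation, even though one of the two
modules could run as DDR3-1600 on its own. In practice, the frequency
is further bounded by what the memory controller and the board support.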
The first critical step during the INIT phase involves the
adjustment of timing and voltage settings. These settings are
essential for ensuring that DDR3 memory operates efficiently and
reliably. Key timing parameters include CAS Latency (CL), RAS to
CAS Delay (tRCD), Row Precharge Time (tRP), and Row Cycle Time (tRC).
These parameters are finely tuned to balance speed and stability
\cite{samsung_ddr3}. The BIOS uses the SPD data to set these
parameters and may also adjust them dynamically to achieve the
best possible performance. Voltage settings, such as DRAM voltage
(typically 1.5~V for DDR3) and termination voltage (VTT), are also
configured to maintain stable operation, especially under varying
conditions such as temperature fluctuations \cite{micron_ddr3}. \\
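
Two simple relations help to check such settings by hand. The numbers
are illustrative and correspond to a typical DDR3-1333 speed bin
($t_{\mathrm{CK}} = 1.5\,\mathrm{ns}$), not to a specific module;
$t_{\mathrm{RAS}}$ denotes the minimum row active time, a parameter not
listed in the text. The row cycle time is the sum of the row active
time and the row precharge time:
\[
    t_{\mathrm{RC}} = t_{\mathrm{RAS}} + t_{\mathrm{RP}}
                    = 36\,\mathrm{ns} + 13.5\,\mathrm{ns}
                    = 49.5\,\mathrm{ns},
    \qquad
    \frac{49.5\,\mathrm{ns}}{1.5\,\mathrm{ns}} = 33\ \mathrm{clock\ cycles}.
\]
The termination voltage is nominally half of the DRAM supply voltage:
\[
    V_{\mathrm{TT}} = \frac{V_{\mathrm{DD}}}{2}
                    = \frac{1.5\,\mathrm{V}}{2} = 0.75\,\mathrm{V}.
\]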
Training and calibration are among the most complex and crucial
stages of DDR3 memory initialization. The fly-by topology used
for address, command, and clock signals in DDR3 modules enhances
signal integrity by reducing the number of stubs and their lengths,
but it also introduces skew between the clock (CK) and data strobe
(DQS) signals \cite{micron_ddr3}. This skew must be compensated to
ensure that data is written and read correctly. The BIOS performs
write leveling, which adjusts the timing of DQS relative to CK
for each memory module. This process ensures that the memory
controller can write data accurately across all modules, even
when they exhibit slight variations in signal timing due to the
physical layout \cite{samsung_ddr3}. \\
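
An order-of-magnitude estimate shows why this compensation is needed.
The figures are assumptions chosen for illustration only: a signal
propagation delay of about $7\,\mathrm{ps/mm}$ on the module, and
$60\,\mathrm{mm}$ of additional clock trace between the first and the
last DRAM device of the fly-by chain.
\[
    \Delta t = 60\,\mathrm{mm} \times 7\,\mathrm{ps/mm}
             = 420\,\mathrm{ps},
    \qquad
    \frac{\Delta t}{t_{\mathrm{CK}}}
        = \frac{420\,\mathrm{ps}}{1250\,\mathrm{ps}} \approx 0.34.
\]
At DDR3-1600 ($t_{\mathrm{CK}} = 1.25\,\mathrm{ns}$), the clock would
thus reach the last device roughly a third of a clock cycle later than
the first one. A skew of this size cannot be ignored, and it is what
write leveling measures and cancels.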
\begin{figure}[H]
    \centering
    \begin{tikzpicture}[scale=0.6,
                        transform shape,
                        shorten >=1pt,
                        node distance=5cm and 5cm,
                        on grid,
                        auto]
        % States
        \node[state, initial] (reset) {RESET};
        \node[draw=none,fill=none] (any) [below=2cm of reset] {ANY};
        \node[state] (init) [right=of reset] {INIT};
        \node[state] (zqcal) [below=of init] {ZQ Calibration};
        \node[state, accepting] (idle) [right=of init] {IDLE};
        \node[state] (writelevel) [above=of idle] {WRITE LEVELING};
        \node[state] (refresh) [right=of idle] {REFRESH};
        \node[state] (activation) [below=of idle] {ACTIVATION};
        \node[state] (bankactive) [below=of activation] {BANK ACTIVE};
        \node[state] (readop) [below right=of bankactive] {READ OP};
        \node[state] (writeop) [below left=of bankactive] {WRITE OP};
        \node[state] (prechrg) [below right=of readop] {PRE-CHARGING};
        % Transitions
        \path[->, line width=0.2mm, >=stealth]
            (reset) edge node {} (init)
            (idle) edge [bend left=20] node {} (writelevel)
                edge [bend left=20] node {REF} (refresh)
                edge node {} (activation)
                edge [bend left=10] node {ZQCL/S} (zqcal)
            (activation) edge node {} (bankactive)
            (bankactive) edge [bend left=30] node {PRE} (prechrg)
                edge [bend left=20] node {write} (writeop)
                edge [bend right=20] node {read} (readop)
            (writeop) edge [loop left] node {write} (writeop)
                edge [bend left=10] node {read\_a} (readop)
                edge [bend right=15] node {PRE} (prechrg)
            (readop) edge [loop right] node {read} (readop)
                edge [bend left=10] node {write\_a} (writeop)
                edge [bend right=15] node {PRE} (prechrg);
        % Thick transitions
        \path[->, line width=0.5mm, >=stealth]
            (any) edge node {} (reset)
            (init) edge node {ZQCL} (zqcal)
            (zqcal) edge [bend left=10] node {} (idle)
            (writelevel) edge [bend left=20] node {MRS} (idle)
            (refresh) edge [bend left=20] node {} (idle)
            (writeop) edge node {} (prechrg)
                edge [bend left=20] node {} (bankactive)
            (readop) edge [bend left=15] node {} (prechrg)
                edge [bend right=20] node {} (bankactive)
            (prechrg) edge [bend right=20] node {} (idle);
    \end{tikzpicture}
    \caption{DDR3 controller state machine}
    \label{fig:ddr3_state_machine}
\end{figure}

ZQ calibration is another vital procedure that adjusts the
output driver impedance and on-die termination (ODT) to match
the system’s characteristic impedance \cite{micron_ddr3}. This
calibration is critical for maintaining signal integrity under
different operating conditions, such as voltage and temperature
changes. During initialization, the memory controller issues a
ZQCL command to the DRAM modules, triggering the calibration
sequence that optimizes impedance settings.
This ensures that the memory system can
operate with tight timing tolerances, which is crucial for
systems requiring high reliability.
Read training is also essential to ensure that data read from
the memory modules is interpreted correctly by the memory
controller. This process involves adjusting the timing of the
read data strobe (DQS) to align perfectly with the data being
received. Proper read training is necessary for reliable data
retrieval, which directly impacts system performance and stability. \\

ZQCS (ZQ Calibration Short), however, is a procedure used
to periodically adjust the DRAM's ODT and output driver impedance
during normal operation. Unlike the full ZQCL (ZQ Calibration Long),
which is performed during memory initialization, ZQCS is a
quicker, less comprehensive calibration that fine-tunes the
impedance settings in response to changes in temperature, voltage,
or other environmental factors. This helps maintain optimal signal
integrity and performance throughout the memory's operation without
the need for a full recalibration. \\

In summary, the DDR3 memory initialization process in systems
like the ASUS KGPE-D16 involves a series of detailed and
interdependent steps that are critical for ensuring system
stability and performance. These include the detection and
identification of memory modules, the initial configuration of
the memory controller, precise adjustments of timing and voltage
settings, and rigorous training and calibration procedures.

\section{Memory initialization techniques}

\subsection{Memory training algorithms}

Memory training algorithms are designed to fine-tune the
operational parameters of memory modules, such as timing, voltage,
and impedance. These algorithms play a crucial role in achieving
the optimal performance of DDR3 memory systems, particularly
in complex multi-core environments where synchronization
and timing are challenging. The primary algorithms used in
memory training include ZQ calibration and write leveling.
Optimizing timing and voltage settings is a critical aspect of
memory training. The memory controller adjusts parameters such as
CAS latency, RAS to CAS delay, and other timing characteristics
to ensure that data is read and written with minimal delay
and maximum accuracy. Voltage adjustments are also crucial,
as they help stabilize the operation of memory modules by
ensuring that the power supplied is within the optimal range,
compensating for any variations due to temperature or other factors
\cite{micron_ddr3}\cite{burnett_ddr3}\cite{gopikrishna2021novel}. \\

ZQ calibration is a critical step in DDR3 memory initialization that
ensures the proper impedance matching of the output driver and
on-die termination (ODT) resistance. Impedance matching is crucial
for maintaining signal integrity by minimizing reflections and
ensuring reliable data transmission between the memory controller
and the DRAM modules. ZQ calibration is initiated by sending ZQCL (ZQ
Calibration Long) commands to the DDR3 DIMMs. Each ZQCL command
triggers a long calibration cycle within the DRAM module. The
purpose of this calibration is to adjust the output driver impedance
and the ODT resistance to match the specified target impedance. This
adjustment compensates for process variations, voltage fluctuations,
and temperature changes that can affect the impedance
characteristics of the DRAM module \cite{gopikrishna2021novel}. \\
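
The benefit of impedance matching can be quantified with the reflection
coefficient of a transmission line of characteristic impedance $Z_0$
terminated by a load $Z_L$. This is a standard transmission-line
result, and the numerical values are illustrative only:
\[
    \Gamma = \frac{Z_L - Z_0}{Z_L + Z_0}.
\]
With $Z_0 = 40\,\Omega$, a perfectly matched termination gives
$\Gamma = 0$, while a termination that has drifted to $48\,\Omega$
gives $\Gamma = 8/88 \approx 0.09$: about 9\% of the incident voltage
is reflected back onto the line.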
A bit in the DRAM Controller
Timing register is set to 1 to send the ZQCL command, and an address
bit is also set to 1 to indicate that the ZQCL command should be
directed to the memory module. Upon receiving the ZQCL command, the
DRAM module begins the calibration process. This involves a series
of internal adjustments where the DRAM module measures its current
impedance and compares it against the target impedance. The module
then modifies its internal settings to reduce the difference between
the current and target impedance values
\cite{gopikrishna2021novel}\cite{samsung_ddr3}. This process is
iterative, meaning that it may require multiple adjustments to
converge on the optimal impedance settings. The calibration is
designed to ensure that the DRAM module's impedance remains within
a tight tolerance, which is critical for high-speed data
communication. The ZQ calibration process is time-sensitive. After
issuing the ZQCL command, the system must wait for 512 memory
clock cycles (MEMCLKs) to allow the calibration to complete.
This delay is necessary because the calibration involves both
measurement and adjustment phases, which require precise timing
to ensure accuracy \cite{gopikrishna2021novel}. If the system does
not wait the full 512 MEMCLKs, the calibration may be incomplete,
leading to suboptimal impedance matching and potential signal
integrity issues, such as reflections or noise on the data lines. \\
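
Since the wait is counted in clock cycles, its duration depends on the
memory clock. As a worked example for two speed grades:
\[
    512 \times 2.5\,\mathrm{ns} = 1280\,\mathrm{ns}
    \quad (\mathrm{DDR3\mbox{-}800}),
    \qquad
    512 \times 1.25\,\mathrm{ns} = 640\,\mathrm{ns}
    \quad (\mathrm{DDR3\mbox{-}1600}).
\]
In both cases the delay is on the order of a microsecond, which is
negligible compared with the overall boot time but must still be
respected by the firmware.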
During the ZQ calibration, the DRAM module adjusts its output driver
impedance, which controls the strength of the signals it sends out.
The stronger the signal, the less susceptible it is to noise, but if
the impedance is too high or too low, it can cause signal distortion
or reflections. The ODT resistance is also calibrated to properly
terminate signals that reach the end of a data line. Proper
termination is essential to prevent signal reflections that could
interfere with the integrity of the data being transmitted. The ZQCL
command adjusts these settings by fine-tuning the resistance values
based on the module’s feedback, ensuring that the signal paths are
optimized for both transmission and termination. Once the ZQ
calibration is complete, the DCT register bit is reset to 0,
indicating that the calibration command has been processed. The
memory controller then verifies that the DRAM module has correctly
adjusted its impedance settings. This verification process may
involve additional test signals sent across the memory bus to
confirm that signal integrity meets the required standards. If the
calibration is successful, the memory subsystem is now properly
calibrated and ready for normal operation. In systems with LRDIMMs
or RDIMMs, additional steps may be necessary to ensure that all
ranks and channels are calibrated correctly, particularly in
multi-rank configurations where impedance matching can be more
complex. Moreover, in systems with complex memory configurations,
such as those using multiple DIMMs per channel or operating at
higher memory frequencies, the ZQ calibration process becomes even
more critical. The calibration may need to be repeated at different
operating points to ensure that the memory subsystem remains stable
across all conditions. This could involve performing multiple ZQCL
calibrations at different memory frequencies, or under different
thermal conditions, to account for the dynamic nature of memory
operation in modern systems. \\
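
To give an idea of the values involved: DDR3 devices calibrate against
an external precision resistor connected to their ZQ pin, with a
nominal value $R_{\mathrm{ZQ}} = 240\,\Omega$, and the selectable
impedances are fractions of it. The selection shown here is only an
example; which settings are actually programmed depends on the platform.
\[
    R_{\mathrm{ON}} = \frac{R_{\mathrm{ZQ}}}{7} \approx 34\,\Omega,
    \qquad
    R_{\mathrm{ON}} = \frac{R_{\mathrm{ZQ}}}{6} = 40\,\Omega,
    \qquad
    R_{\mathrm{TT}} = \frac{R_{\mathrm{ZQ}}}{4} = 60\,\Omega,
    \qquad
    R_{\mathrm{TT}} = \frac{R_{\mathrm{ZQ}}}{2} = 120\,\Omega.
\]
Here $R_{\mathrm{ON}}$ denotes the output driver impedance and
$R_{\mathrm{TT}}$ the ODT resistance.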
In seed-based algorithms, an initial ``seed'' value is used
as a reference point for the calibration process. The memory
controller iteratively adjusts the impedance based on feedback
from the memory module, refining the calibration with each
iteration. This method provides a more precise calibration,
particularly in systems where fine-tuned impedance matching is
critical for high-frequency operations \cite{kim2010design}.
While seed-based methods can accelerate the convergence
of calibration, they require careful selection of initial seed
values to avoid suboptimal or even faulty impedance settings
\cite{gopikrishna2021novel}. \\

Write leveling is another critical aspect of memory training,
particularly in DDR3 systems that use a fly-by topology. It involves
using the physical layer (PHY) to detect the edge of the Data Strobe
(DQS) signal in synchronization with the clock (CK) signal on the
DIMM (Dual In-line Memory Module) during write access. The DQS
signal is a timing signal that accompanies the data (DQ): it is
driven by the memory controller during write operations and by the
DRAM during read operations. For write
operations, the DQS signal must be aligned with the CK
signal at each memory device to ensure that data is correctly
written to memory cells.
In systems using a fly-by topology, the CK signal arrives at
different times at the different memory devices on the same
module, because it travels through different lengths of
trace, whereas each DQS signal is routed directly to its device.
Write leveling compensates for this skew by adjusting the
timing of the DQS signal relative to the CK signal for each lane
(a group of data lines) \cite{burnett_ddr3}. This training is
performed on a per-channel and per-DIMM basis, ensuring that each
memory module is correctly synchronized with the memory controller,
minimizing timing mismatches that could lead to data corruption. \\

Write leveling goes together with DQS position training, a
specific form of training focused on aligning the DQS signal with
the data (DQ) signals during write operations. In this process,
the memory controller adjusts the phase of the DQS signal to ensure
that it is correctly aligned with the data signals across all data
lanes, centering the DQS signal within the ``data eye'' for optimal
timing. This ensures that all data bits are written correctly and
consistently across the memory module, reducing the risk of timing
errors and data corruption. Additionally, DQS receiver training is
needed to ensure that the memory controller can correctly
capture the DQS signal during read operations
\cite{micron_ddr3}.
The core operation is for the MCT to send specific test
patterns to the DRAM to determine the timing relationship between
the DQS and data signals. The MCT then adjusts the delay or phase of
the DQS signal relative to the clock signal (CK) and the data
signals (DQ) while checking the integrity of the test data in the
DRAM. \\
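
The centering step can be sketched as follows, under the assumption
(made here for illustration) that the delay is programmed in discrete
steps and that the test pattern is read back correctly for every
setting between a first passing step $d_{\min}$ and a last passing step
$d_{\max}$. The delay that is finally programmed is the middle of this
passing window:
\[
    d = \frac{d_{\min} + d_{\max}}{2}.
\]
For example, if the pattern passes from step 6 to step 20, the chosen
delay is step 13, which leaves a margin of 7 steps on either side.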
Using seed-based algorithms, the memory controller sets an initial
delay value and then iteratively adjusts it based on the feedback
received from the memory module. This process ensures that the DQS
signal is correctly aligned with the CK signal at the memory
module's pins, minimizing the risk of data corruption and ensuring
reliable write operations \cite{samsung_ddr3}\cite{gopikrishna2021novel}.
Seed-based write leveling offers improved precision but must be
finely tuned to account for the specific characteristics of the
memory module and the overall system architecture
\cite{gopikrishna2021novel}. \\

In contrast to seed-based algorithms, seedless methods do not rely on
|
|
|
|
|
an initial reference value. Instead, they dynamically adjust the
|
|
|
|
|
impedance and timing parameters during the calibration process.
|
|
|
|
|
Seedless ZQ calibration continuously monitors the impedance of the
|
|
|
|
|
memory module and makes real-time adjustments to maintain optimal
|
|
|
|
|
matching. This approach can be beneficial in environments where the
|
|
|
|
|
operating conditions are highly variable, as it allows for more
|
|
|
|
|
flexible and adaptive calibration \cite{kim2010design}. Similarly,
|
|
|
|
|
seedless write leveling dynamically adjusts the DQS timing based on
|
|
|
|
|
real-time feedback from the memory module. This method is particularly
|
|
|
|
|
useful in systems where the memory configuration is frequently changed
|
|
|
|
|
or where the operating conditions vary significantly
|
|
|
|
|
\cite{micron_ddr3}\cite{gopikrishna2021novel}. The traditional ZQ
|
|
|
|
|
calibration methods, while effective, often struggle with matching
|
|
|
|
|
impedance perfectly across all conditions. A master's thesis by
|
|
|
|
|
\textcite{gopikrishna2021novel} builds upon these traditional methods
|
|
|
|
|
by proposing enhancements that involve more sophisticated calibration
|
|
|
|
|
approaches, leading to better impedance matching and overall memory
|
|
|
|
|
performance \cite{gopikrishna2021novel}.
|
|
|
|
|
|
2024-08-25 11:54:54 +02:00
|
|
|
|
|
|
|
|
|
\subsection{BIOS and Kernel Developer Guide (BKDG) recommendations}
|
|
|
|
|
|
|
|
|
|
The BIOS and Kernel Developer Guide (BKDG from \textcite{BKDG}) is a
|
|
|
|
|
technical manual aimed at BIOS developers and operating system kernel
|
|
|
|
|
programmers. It provides in-depth documentation on the AMD
|
|
|
|
|
processor architecture, system initialization processes, and
|
|
|
|
|
configuration guidelines. The document is essential for
|
|
|
|
|
understanding the proper initialization sequences, including
|
|
|
|
|
those for DDR3 memory, to ensure system stability and
|
|
|
|
|
performance, particularly for AMD Family 15h processors. \\
|
|
|
|
|
|
|
|
|
|
The initialization of DDR3 memory begins with configuring the DDR
|
|
|
|
|
supply voltage regulator, which ensures that the memory modules
|
|
|
|
|
receive the correct power levels. Following this, the Northbridge
|
2024-08-25 15:57:26 +02:00
|
|
|
|
(NB) P-state is forced to \path{NBP0}, a state that guarantees
|
2024-08-25 11:54:54 +02:00
|
|
|
|
stable operation during the initial configuration phases. Once these
|
|
|
|
|
preliminary steps are completed, the initialization of the DDR
|
|
|
|
|
physical layer (PHY) begins, which is critical for setting up
|
|
|
|
|
the communication interface between the memory controller and the
|
|
|
|
|
DDR3 modules. PHY fence training deals with overall signal alignment
|
|
|
|
|
at the physical interface, while ZQ calibration focuses on impedance
|
|
|
|
|
matching, and write leveling addresses timing alignment during
|
|
|
|
|
write operations. Each process involves different methods as PHY
|
|
|
|
|
fence training uses iterative timing adjustments, ZQ calibration
|
|
|
|
|
uses impedance adjustments via the ZQ pin, and write leveling
|
|
|
|
|
adjusts DQS timing relative to CK during writes. These processes are
|
|
|
|
|
critical for configuring DDR3 DIMMs and ensuring stable and reliable
|
|
|
|
|
operation, especially when booting from an unpowered state such as
|
|
|
|
|
ACPI S4 (hibernation), S5 (soft off), or G3 (mechanical off).
|
|
|
|
|
|
|
|
|
|
\subsubsection{DDR3 initialization procedure}
|
|
|
|
|
|
|
|
|
|
DDR3 initialization is a multi-step process that prepares
|
|
|
|
|
both the memory controllers and the DIMMs for operation. This
|
|
|
|
|
initialization is essential to set up the memory configuration
|
|
|
|
|
and to ensure that the memory subsystem operates correctly
|
|
|
|
|
under various conditions.
|
2024-08-22 19:18:34 +02:00
|
|
|
|
|
2024-08-25 11:54:54 +02:00
|
|
|
|
\begin{itemize}
|
|
|
|
|
\item \textbf{Enable DRAM initialization}: The process
|
|
|
|
|
begins by
|
|
|
|
|
enabling DRAM initialization. This is done
|
2024-08-25 15:57:26 +02:00
|
|
|
|
by setting the \path{EnDramInit} bit in
|
|
|
|
|
the \path{D18F2x7C_dct} register to 1. The
|
|
|
|
|
\path{D18F2x7C_dct} register is a specific
|
2024-08-25 11:54:54 +02:00
|
|
|
|
configuration register within the memory
|
|
|
|
|
controller that controls various aspects of the
|
|
|
|
|
DRAM initialization process. Enabling this bit
|
|
|
|
|
initiates the sequence of operations required to
|
|
|
|
|
prepare the memory for use. After setting this bit,
|
|
|
|
|
the system waits for 200 microseconds to allow the
|
|
|
|
|
initialization command to propagate and stabilize.
|
|
|
|
|
|
|
|
|
|
\item \textbf{Deassert memory reset}: Next, the memory
|
|
|
|
|
reset
|
2024-08-25 15:57:26 +02:00
|
|
|
|
signal, known as \path{MemRstX}, is deasserted
|
|
|
|
|
by setting the \path{DeassertMemRstX} bit in the
|
|
|
|
|
\path{D18F2x7C_dct} register to 1. Deasserting
|
|
|
|
|
\path{MemRstX} effectively takes the memory
|
2024-08-25 11:54:54 +02:00
|
|
|
|
components out of their reset state, allowing them
|
|
|
|
|
to begin normal operation. The system then waits
|
|
|
|
|
for an additional 500 microseconds to ensure that
|
|
|
|
|
the memory reset is fully deasserted and the memory
|
|
|
|
|
components are stable.
|
|
|
|
|
|
|
|
|
|
\item \textbf{Assert clock enable (CKE)}: The next
|
|
|
|
|
step involves asserting the clock enable signal, known as
|
2024-08-25 15:57:26 +02:00
|
|
|
|
\path{CKE}, by setting the \path{AssertCke} bit in the
|
|
|
|
|
\path{D18F2x7C_dct} register to 1. The \path{CKE}
|
2024-08-25 11:54:54 +02:00
|
|
|
|
signal is critical because it enables the clocking
|
|
|
|
|
of the DRAM modules, allowing them to synchronize
|
|
|
|
|
with the memory controller. The system must wait
|
2024-08-25 15:57:26 +02:00
|
|
|
|
for 360 nanoseconds after asserting \path{CKE}
|
2024-08-25 11:54:54 +02:00
|
|
|
|
to ensure that the clocking is correctly established.
|
|
|
|
|
|
|
|
|
|
\item \textbf{Registered DIMMs and LRDIMMs initialization}:
|
|
|
|
|
For systems using registered DIMMs (RDIMMs) or Load
|
|
|
|
|
Reduced DIMMs (LRDIMMs), additional initialization
|
|
|
|
|
steps are necessary. RDIMMs and LRDIMMs have
|
|
|
|
|
buffering mechanisms that reduce electrical loading
|
|
|
|
|
and improve signal integrity, especially in systems
|
|
|
|
|
with multiple memory modules. During initialization,
|
2024-08-25 15:57:26 +02:00
|
|
|
|
the BIOS programs the \path{ParEn} bit in the
|
|
|
|
|
\path{D18F2x90_dct} register based on whether
|
2024-08-25 11:54:54 +02:00
|
|
|
|
the DIMM is buffered or unbuffered. For RDIMMs,
|
|
|
|
|
specific Register Control (RC) commands, such as RC0
|
|
|
|
|
through RC7, are sent to initialize the memory module's
|
|
|
|
|
control registers. Similarly, LRDIMMs require a series
|
|
|
|
|
of Flexible Register Control (FRC) commands, such as
|
|
|
|
|
F0RC and F1RC, to initialize their internal registers
|
|
|
|
|
according to the manufacturer’s specifications.
|
|
|
|
|
|
|
|
|
|
\item \textbf{Mode Register Set (MRS)}: The initialization
|
|
|
|
|
process also involves sending Mode Register Set
|
|
|
|
|
(MRS) commands. These commands are used to configure
|
|
|
|
|
various operational parameters of the DDR3 memory
|
|
|
|
|
modules, such as burst length, latency timings,
|
|
|
|
|
and operating modes. Each MRS command targets a
|
|
|
|
|
specific mode register within the memory module,
|
|
|
|
|
and the exact sequence of commands is crucial for
|
|
|
|
|
setting up the DIMMs according to the system’s
|
|
|
|
|
requirements and the DIMM manufacturer’s guidelines.
|
|
|
|
|
\end{itemize}
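The first three steps of this sequence can be sketched against a simulated register. The bit positions, the \path{sim_dct} structure and the function names are assumptions made for illustration. Real firmware would access the register through PCI configuration space and use a calibrated delay routine.

```c
#include <stdint.h>

/* Bit positions are assumptions made for this sketch; the authoritative
 * layout of D18F2x7C_dct is the one given in the BKDG. */
#define EN_DRAM_INIT       (1u << 31)
#define ASSERT_CKE         (1u << 28)
#define DEASSERT_MEM_RST_X (1u << 27)

/* Simulated DCT: one register plus a counter of the time spent waiting. */
struct sim_dct {
    uint32_t init_reg;   /* stands in for D18F2x7C_dct */
    uint64_t waited_ns;  /* sum of all delays requested so far */
};

static void sim_wait_ns(struct sim_dct *dct, uint64_t ns)
{
    dct->waited_ns += ns;
}

/* Enable DRAM initialization, release the reset, then assert CKE,
 * waiting after each step for the duration quoted in the text. */
void ddr3_start_init(struct sim_dct *dct)
{
    dct->init_reg |= EN_DRAM_INIT;
    sim_wait_ns(dct, 200000);  /* 200 microseconds */
    dct->init_reg |= DEASSERT_MEM_RST_X;
    sim_wait_ns(dct, 500000);  /* 500 microseconds */
    dct->init_reg |= ASSERT_CKE;
    sim_wait_ns(dct, 360);     /* 360 nanoseconds */
}
```

The RDIMM/LRDIMM control words and the MRS commands that follow are left out of the sketch because their values depend on the module.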
|
2024-08-22 19:18:34 +02:00
|
|
|
|
|
2024-08-25 11:54:54 +02:00
|
|
|
|
\subsubsection{ZQ calibration process}
|
2024-08-22 19:18:34 +02:00
|
|
|
|
|
2024-08-25 11:54:54 +02:00
|
|
|
|
ZQ calibration is a key step in DDR3 initialization,
|
|
|
|
|
responsible for calibrating the output driver impedance and
|
|
|
|
|
on-die termination (ODT) resistance of the DDR3 modules. Proper
|
|
|
|
|
impedance matching is essential for maintaining signal
|
|
|
|
|
integrity, reducing signal reflections, and ensuring reliable
|
|
|
|
|
data communication between the memory controller and the
|
2024-08-25 18:51:20 +02:00
|
|
|
|
memory modules. It is important to note that ZQ calibration
|
|
|
|
|
is done directly by the memory controller, and that the firmware
|
|
|
|
|
simply triggers it.
|
2024-08-22 19:18:34 +02:00
|
|
|
|
|
2024-08-25 11:54:54 +02:00
|
|
|
|
\begin{itemize}
|
|
|
|
|
\item \textbf{Sending ZQCL commands}: The BIOS initiates
|
|
|
|
|
ZQ calibration by sending two ZQCL (ZQ Calibration Long)
|
|
|
|
|
commands to each DDR3 DIMM. ZQCL commands instruct the
|
|
|
|
|
memory module to perform a long calibration cycle, during
|
|
|
|
|
which the module adjusts its output driver impedance and
|
|
|
|
|
ODT resistance to match the desired target impedance. This
|
|
|
|
|
process compensates for variations due to manufacturing
|
|
|
|
|
differences, voltage fluctuations, and temperature
|
|
|
|
|
changes. To send a ZQCL command, the BIOS programs the
|
2024-08-25 15:57:26 +02:00
|
|
|
|
\path{SendZQCmd} bit in the \path{D18F2x7C_dct}
|
|
|
|
|
register to 1 and sets the \path{MrsAddress[10]} bit to 1,
|
2024-08-25 11:54:54 +02:00
|
|
|
|
indicating that the ZQCL command should be sent to the
|
|
|
|
|
memory module.
|
|
|
|
|
|
|
|
|
|
\item \textbf{Calibration timing}: After sending the
|
|
|
|
|
ZQCL command, the system must wait for 512 memory clock
|
|
|
|
|
cycles (MEMCLKs) to allow the calibration process to
|
|
|
|
|
complete. During this time, the memory module adjusts
|
|
|
|
|
its internal impedance to ensure it matches the specified
|
|
|
|
|
target impedance. This timing is critical, as inadequate
|
|
|
|
|
wait times could result in incomplete or inaccurate
|
|
|
|
|
calibration, leading to signal integrity issues and
|
|
|
|
|
potential data errors.
|
|
|
|
|
|
|
|
|
|
\item \textbf{Finalization of initialization}: Once the
|
|
|
|
|
ZQ calibration is complete, the BIOS deactivates the DRAM
|
2024-08-25 15:57:26 +02:00
|
|
|
|
initialization process by setting the \path{EnDramInit}
|
|
|
|
|
bit in the \path{D18F2x7C_dct} register to 0. For
|
2024-08-25 11:54:54 +02:00
|
|
|
|
LRDIMMs, additional configuration steps are required to
|
|
|
|
|
finalize the initialization process. These steps include
|
|
|
|
|
programming the DCT registers to monitor for errors and
|
|
|
|
|
ensure that the LRDIMMs are operating correctly.
|
|
|
|
|
\end{itemize}
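The ZQCL request and the wait that follows it can be sketched as two small helpers. The position of \path{SendZQCmd} and the assumption that \path{MrsAddress} occupies the low 16 bits are illustrative only, and the function names are hypothetical.

```c
#include <stdint.h>

#define SEND_ZQ_CMD       (1u << 29) /* assumed position of SendZQCmd */
#define MRS_ADDRESS_10    (1u << 10) /* MrsAddress[10], assuming bits 15:0 */
#define ZQCL_WAIT_MEMCLKS 512u       /* wait quoted in the text */

/* Bits to set in the initialization register to request one ZQCL. */
uint32_t zqcl_command_bits(void)
{
    return SEND_ZQ_CMD | MRS_ADDRESS_10;
}

/* Converts a number of MEMCLK cycles to nanoseconds, rounding up.
 * memclk_mhz must be non-zero. */
uint32_t memclks_to_ns(uint32_t memclks, uint32_t memclk_mhz)
{
    return (memclks * 1000u + memclk_mhz - 1u) / memclk_mhz;
}
```

With a 400\,MHz MEMCLK, \path{memclks_to_ns(ZQCL_WAIT_MEMCLKS, 400)} gives 1280\,ns. Rounding up keeps the wait from ever being shorter than the required number of cycles.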
|
2024-08-22 19:18:34 +02:00
|
|
|
|
|
2024-08-25 11:54:54 +02:00
|
|
|
|
\subsubsection{Write leveling process}
|
|
|
|
|
|
2024-08-26 19:19:02 +02:00
|
|
|
|
The BIOS and Kernel Developer Guide (BKDG) provides
|
|
|
|
|
information on the write leveling process, which is
|
2024-08-25 11:54:54 +02:00
|
|
|
|
essential for ensuring correct data alignment during write
|
|
|
|
|
operations in DDR3 memory systems. Write leveling is
|
|
|
|
|
particularly crucial in systems utilizing a fly-by topology,
|
|
|
|
|
where timing skew between the clock and data signals can
|
|
|
|
|
introduce significant challenges. This kind of algorithm
was not necessary for DDR2, for example.
|
|
|
|
|
If the target operating
|
|
|
|
|
frequency is higher than the lowest supported MEMCLK frequency,
|
|
|
|
|
the BIOS must perform multiple passes to achieve proper write
|
|
|
|
|
leveling. The MEMCLK is the clock signal that synchronizes the
|
|
|
|
|
communication between the memory controller and the memory
|
|
|
|
|
modules. \\
|
|
|
|
|
|
|
|
|
|
During each pass, the memory subsystem is configured for a
|
|
|
|
|
progressively higher operating frequency:
|
2024-08-22 19:18:34 +02:00
|
|
|
|
|
|
|
|
|
\begin{itemize}
|
2024-08-25 11:54:54 +02:00
|
|
|
|
\item \textbf{Pass 1:} The memory subsystem is configured
|
|
|
|
|
for the lowest supported MEMCLK, ensuring that initial
|
|
|
|
|
timing adjustments are made under the most stable
|
|
|
|
|
conditions.
|
|
|
|
|
\item \textbf{Pass 2:} The subsystem is then adjusted for
|
|
|
|
|
the second-lowest MEMCLK, gradually increasing the
|
|
|
|
|
operating frequency while fine-tuning the alignment of
|
|
|
|
|
the DQS and CK signals.
|
|
|
|
|
\item \textbf{Pass N:} This process continues until the
|
|
|
|
|
highest MEMCLK supported by the system is reached,
|
|
|
|
|
ensuring that the memory operates reliably at its
|
|
|
|
|
maximum speed.
|
2024-08-22 19:18:34 +02:00
|
|
|
|
\end{itemize}
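The selection of the frequencies used for the successive passes can be sketched as follows. The function name is hypothetical, and the frequency table in the usage note is only an example, not a list taken from the BKDG.

```c
#include <stddef.h>

/* Copies into passes[] every supported frequency from the lowest one up
 * to and including target_mhz. supported_mhz[] must be sorted in
 * ascending order and passes[] must have room for n entries.
 * Returns the number of passes, or 0 if the target is not supported. */
size_t plan_wl_passes(const unsigned *supported_mhz, size_t n,
                      unsigned target_mhz, unsigned *passes)
{
    for (size_t i = 0; i < n; i++) {
        passes[i] = supported_mhz[i];
        if (supported_mhz[i] == target_mhz)
            return i + 1;
    }
    return 0;
}
```

For an example table of 400, 533, 667 and 800\,MHz with a 667\,MHz target, the function plans three passes. A target equal to the lowest frequency yields a single pass.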
|
|
|
|
|
|
2024-08-25 11:54:54 +02:00
|
|
|
|
This step-wise configuration ensures that the memory system is
|
|
|
|
|
stable across all supported operating frequencies, minimizing
|
|
|
|
|
the risk of timing errors during write operations, especially
|
|
|
|
|
as frequencies increase and timing margins become tighter. The
|
|
|
|
|
configuration process varies depending on whether the DIMM is
|
|
|
|
|
a Registered DIMM (RDIMM) or an Unregistered DIMM (UDIMM).
|
|
|
|
|
RDIMMs include an additional buffer to improve signal integrity,
|
|
|
|
|
which is particularly important in systems with multiple DIMMs.
|
|
|
|
|
The steps common to both types include a preparation with the
|
|
|
|
|
DDR3 Mode Register Commands
|
2024-08-26 19:19:02 +02:00
|
|
|
|
(see fig. \ref{fig:ddr3_state_machine}).
|
|
|
|
|
For RDIMMs, a 4-rank module is treated as two
|
|
|
|
|
separate DIMMs, where each rank is essentially a separate memory
|
|
|
|
|
module within the same DIMM. The first two ranks are the primary
|
|
|
|
|
target for the initial configuration. The remaining two ranks
|
|
|
|
|
are treated as non-target and are configured separately. \\
|
|
|
|
|
|
|
|
|
|
Mode registers in DDR3
|
2024-08-25 11:54:54 +02:00
|
|
|
|
memory are used to configure various operational parameters such
|
|
|
|
|
as latency settings, burst length, and write leveling. One of
|
2024-08-25 15:57:26 +02:00
|
|
|
|
the key mode registers is \path{MR1_dct}, which is specific to
|
2024-08-25 11:54:54 +02:00
|
|
|
|
DDR3 and controls certain features of the memory module,
|
2024-08-25 15:57:26 +02:00
|
|
|
|
including write leveling. \path{MR1_dct} is used to enable or
|
2024-08-25 11:54:54 +02:00
|
|
|
|
disable specific functions such as write leveling and output
|
2024-08-25 15:57:26 +02:00
|
|
|
|
driver settings. The \path{dct} suffix refers to the DRAM
controller (DCT): the value is maintained separately for each DCT
that drives the memory module. \\
|
2024-08-25 11:54:54 +02:00
|
|
|
|
|
|
|
|
|
Then, these steps are followed, still common to both RDIMMs and
|
|
|
|
|
UDIMMs:
|
2024-08-22 19:18:34 +02:00
|
|
|
|
|
|
|
|
|
\begin{itemize}
|
2024-08-25 11:54:54 +02:00
|
|
|
|
\item \textbf{Step 1A: Output Driver and ODT configuration
|
|
|
|
|
for target DIMM:}
|
|
|
|
|
\begin{itemize}
|
|
|
|
|
\item For the first rank (target):
|
|
|
|
|
\begin{itemize}
|
2024-08-25 15:57:26 +02:00
|
|
|
|
\item Set \path{MR1_dct[1:0][Level] = 1}
|
2024-08-25 11:54:54 +02:00
|
|
|
|
to enable write leveling.
|
2024-08-25 15:57:26 +02:00
|
|
|
|
\item Set \path{MR1_dct[1:0][Qoff] = 0}
|
2024-08-25 11:54:54 +02:00
|
|
|
|
to ensure the output drivers are active.
|
|
|
|
|
\end{itemize}
|
|
|
|
|
\item For other ranks:
|
|
|
|
|
\begin{itemize}
|
2024-08-25 15:57:26 +02:00
|
|
|
|
\item Set \path{MR1_dct[1:0][Level] = 1}
|
2024-08-25 11:54:54 +02:00
|
|
|
|
to prepare for write leveling.
|
2024-08-25 15:57:26 +02:00
|
|
|
|
\item Set \path{MR1_dct[1:0][Qoff] = 1}
|
2024-08-25 11:54:54 +02:00
|
|
|
|
to deactivate the output drivers for
|
|
|
|
|
ranks that are not currently being
|
|
|
|
|
leveled.
|
|
|
|
|
\end{itemize}
|
|
|
|
|
\item If there are two or more DIMMs per channel,
|
|
|
|
|
or if there is one DIMM per three channels:
|
|
|
|
|
\begin{itemize}
|
|
|
|
|
\item Program the target rank’s
|
2024-08-25 15:57:26 +02:00
|
|
|
|
\path{RttNom} (nominal termination
|
|
|
|
|
resistance value) for \path{RttWr}
|
2024-08-25 11:54:54 +02:00
|
|
|
|
termination, which helps in managing signal
|
|
|
|
|
integrity during the write process by
|
|
|
|
|
ensuring the correct impedance matching.
|
|
|
|
|
\end{itemize}
|
|
|
|
|
\end{itemize}
|
|
|
|
|
|
|
|
|
|
\item \textbf{Step 1B: Configure non-target RttNom to normal
|
|
|
|
|
operation:}
|
|
|
|
|
\begin{itemize}
|
|
|
|
|
\item After the initial configuration, the
|
2024-08-25 15:57:26 +02:00
|
|
|
|
\path{RttNom} values for the non-target ranks
|
2024-08-25 11:54:54 +02:00
|
|
|
|
are set to their normal operating states.
|
|
|
|
|
\item A wait time of 40 MEMCLKs is observed to
|
|
|
|
|
ensure the configuration settings are stable
|
|
|
|
|
before proceeding.
|
|
|
|
|
\end{itemize}
|
|
|
|
|
|
|
|
|
|
\item \textbf{Step 3: PHY configuration:}
|
|
|
|
|
\begin{itemize}
|
|
|
|
|
\item The PHY is then configured to measure and
|
|
|
|
|
adjust the timing delays accurately for each
|
|
|
|
|
data lane. The PHY layer is responsible for
|
|
|
|
|
converting the signals from the memory
|
|
|
|
|
controller into a form that can be transmitted
|
|
|
|
|
over the physical connections to the memory
|
|
|
|
|
modules.
|
|
|
|
|
\end{itemize}
|
|
|
|
|
|
|
|
|
|
\item \textbf{Step 4: Perform write leveling:}
|
|
|
|
|
\begin{itemize}
|
|
|
|
|
\item The actual write leveling process is executed,
|
|
|
|
|
where the DQS signal timing is adjusted to
|
|
|
|
|
ensure it aligns perfectly with the CK signal at
|
|
|
|
|
the memory module’s pins, ensuring that data is
|
|
|
|
|
written accurately.
|
|
|
|
|
\end{itemize}
|
|
|
|
|
|
|
|
|
|
\item \textbf{Step 5: Disable PHY configuration
|
|
|
|
|
post-measurement:}
|
|
|
|
|
\begin{itemize}
|
|
|
|
|
\item After completing the write leveling process,
|
|
|
|
|
the PHY configuration is disabled to stop further
|
|
|
|
|
timing measurements and adjustments, locking in the
|
|
|
|
|
calibrated settings.
|
|
|
|
|
\end{itemize}
|
|
|
|
|
|
|
|
|
|
\item \textbf{Step 6: Program the DIMM to normal operation:}
|
|
|
|
|
\begin{itemize}
|
|
|
|
|
\item Finally, the DIMM is reprogrammed to its
|
2024-08-25 15:57:26 +02:00
|
|
|
|
normal operational state, resetting \path{Qoff}
|
|
|
|
|
and \path{Level} to \path{0} to conclude the
|
2024-08-25 11:54:54 +02:00
|
|
|
|
write leveling process and return to standard
|
|
|
|
|
operation.
|
|
|
|
|
\end{itemize}
|
2024-08-22 19:18:34 +02:00
|
|
|
|
\end{itemize}
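The handling of \path{Level} and \path{Qoff} in steps 1A and 6 can be sketched as plain bit manipulation on an MR1 value. The bit positions follow the JEDEC DDR3 definition of MR1 (A7 for write leveling enable, A12 for Qoff). The function names are hypothetical, and the sketch leaves out the \path{RttNom} programming.

```c
#include <stdint.h>

#define MR1_LEVEL (1u << 7)  /* write leveling enable */
#define MR1_QOFF  (1u << 12) /* 1 = output buffers disabled */

/* Step 1A: every rank gets Level = 1; only the target rank keeps its
 * output drivers active (Qoff = 0). */
uint16_t mr1_enter_write_leveling(uint16_t mr1, int is_target_rank)
{
    mr1 |= MR1_LEVEL;
    if (is_target_rank)
        mr1 &= (uint16_t)~MR1_QOFF;
    else
        mr1 |= MR1_QOFF;
    return mr1;
}

/* Step 6: clear Level and Qoff to return to normal operation. */
uint16_t mr1_exit_write_leveling(uint16_t mr1)
{
    return mr1 & (uint16_t)~(MR1_LEVEL | MR1_QOFF);
}
```

All other MR1 fields pass through unchanged, so the same helpers can be applied to whatever base value the platform has already computed.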
|
|
|
|
|
|
2024-08-25 11:54:54 +02:00
|
|
|
|
For each DIMM, the BIOS must calculate the coarse and fine
|
|
|
|
|
delays for each lane in the DQS Write Timing register:
|
2024-08-22 19:18:34 +02:00
|
|
|
|
|
2024-08-25 11:54:54 +02:00
|
|
|
|
\begin{itemize}
|
|
|
|
|
\item \textbf{Coarse Delay Calculation:} This involves
|
|
|
|
|
setting a basic delay based on a seed value specific to
|
|
|
|
|
the platform. The seed value is determined during
|
|
|
|
|
initial system configuration and serves as a starting
|
|
|
|
|
point for further delay adjustments.
|
|
|
|
|
\item \textbf{Critical Delay Determination:} The minimum of
|
|
|
|
|
the coarse delays for each lane and DIMM is considered
|
|
|
|
|
the critical delay. This delay is crucial for ensuring
|
|
|
|
|
that all data lanes are correctly synchronized.
|
|
|
|
|
\item \textbf{Platform-Specific Seed:} The seed ranges
|
|
|
|
|
between -1.20ns and +1.20ns, providing a small
|
|
|
|
|
adjustment range to fine-tune the timing based on the
|
|
|
|
|
specific characteristics of the platform. This seed
|
|
|
|
|
value can differ for the first pass compared to
|
|
|
|
|
subsequent passes, allowing for incremental adjustments
|
|
|
|
|
as the system stabilizes.
|
|
|
|
|
\end{itemize}
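The relation between the per-lane delays and the critical delay can be sketched as follows. The number of fine steps per coarse step is an assumption made for this example, and the structure and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define FINE_STEPS_PER_COARSE 32u /* assumption for this sketch */

struct wl_delay {
    uint8_t coarse;
    uint8_t fine;
};

/* Splits a total delay, expressed in fine steps, into its coarse and
 * fine components. */
struct wl_delay split_delay(uint16_t total_steps)
{
    struct wl_delay d;

    d.coarse = (uint8_t)(total_steps / FINE_STEPS_PER_COARSE);
    d.fine = (uint8_t)(total_steps % FINE_STEPS_PER_COARSE);
    return d;
}

/* Critical delay: minimum coarse delay over all lanes of all DIMMs,
 * passed here as one flat array. Returns 0 for an empty array. */
uint8_t critical_delay(const struct wl_delay *lanes, size_t count)
{
    uint8_t min;

    if (count == 0)
        return 0;
    min = lanes[0].coarse;
    for (size_t i = 1; i < count; i++) {
        if (lanes[i].coarse < min)
            min = lanes[i].coarse;
    }
    return min;
}
```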
|
2024-08-22 19:18:34 +02:00
|
|
|
|
|
2024-08-25 15:57:26 +02:00
|
|
|
|
\section{Current implementation and potential improvements}
|
2024-08-25 11:54:54 +02:00
|
|
|
|
|
2024-08-25 15:57:26 +02:00
|
|
|
|
\subsection{Current implementation in coreboot on the KGPE-D16}
|
|
|
|
|
|
|
|
|
|
In this part, as in the rest of this document, we base our
|
|
|
|
|
study on the 4.11 version of \textit{coreboot} \cite{coreboot_4_11},
|
|
|
|
|
which is the last version that supported the ASUS KGPE-D16
|
|
|
|
|
mainboard. \\
|
|
|
|
|
|
|
|
|
|
The process starts in
|
|
|
|
|
\path{src/mainboard/asus/kgpe-d16/romstage.c}, in the
|
|
|
|
|
\path{cache_as_ram_main} function by calling
|
|
|
|
|
\path{fill_mem_ctrl} from
|
|
|
|
|
\path{src/northbridge/amd/amdfam10/raminit_sysinfo_in_ram.c}
|
|
|
|
|
(lst. \ref{lst:fill_mem_ctrl}).
|
2024-08-25 11:54:54 +02:00
|
|
|
|
At this step, only the BSC is running the firmware code.
|
|
|
|
|
This function iterates over all memory controllers (one per
|
|
|
|
|
node) and initializes their corresponding structures with the
|
|
|
|
|
system information needed for the RAM to function. This includes
|
|
|
|
|
the addresses of PCI nodes (important for DMA operations) and
|
|
|
|
|
SPD addresses, which are internal ROMs in each memory slot
|
|
|
|
|
containing crucial information for detecting and initializing
|
2024-08-25 15:57:26 +02:00
|
|
|
|
memory modules. \\
|
|
|
|
|
|
2024-08-27 16:03:22 +02:00
|
|
|
|
\begin{listing}[H]
|
2024-08-25 15:57:26 +02:00
|
|
|
|
\begin{adjustwidth}{0.5cm}{0.5cm}
|
|
|
|
|
\inputminted{c}{
|
|
|
|
|
listings/src_northbridge_amd_amdfam10_raminit_sysinfo_in_ram.c}
|
|
|
|
|
\end{adjustwidth}
|
|
|
|
|
\caption{
|
|
|
|
|
\protect\path{fill_mem_ctrl()}, extract from
|
|
|
|
|
\protect\path{src/northbridge/amd/amdfam10/raminit_sysinfo_in_ram.c}}
|
|
|
|
|
\label{lst:fill_mem_ctrl}
|
|
|
|
|
\end{listing}
|
|
|
|
|
|
|
|
|
|
If successful, the system posts codes \path{0x3D} and then
|
|
|
|
|
\path{0x40}. The \path{raminit_amdmct} function from
|
|
|
|
|
\path{src/northbridge/amd/amdfam10/raminit_amdmct.c} is then
|
|
|
|
|
called. This function, in turn, calls \path{mctAutoInitMCT_D}
|
|
|
|
|
(lst. \ref{lst:mctAutoInitMCT_D_1}) from
|
|
|
|
|
\path{src/northbridge/amd/amdmct/mct_ddr3/mct_d.c},
|
2024-08-25 11:54:54 +02:00
|
|
|
|
which is responsible for the initial memory initialization,
|
2024-08-27 16:03:22 +02:00
|
|
|
|
predominantly written by Raptor Engineering. \\
|
2024-08-25 11:54:54 +02:00
|
|
|
|
|
|
|
|
|
At this stage, it is assumed that memory has been pre-mapped
|
|
|
|
|
contiguously from address 0 to 4GB and that the previous code
|
|
|
|
|
has correctly mapped non-cacheable I/O areas below 4GB for the
|
2024-08-25 15:57:26 +02:00
|
|
|
|
PCI bus and Local APIC access for processor cores. \\
|
2024-08-25 11:54:54 +02:00
|
|
|
|
|
|
|
|
|
The following prerequisites must be in place from the previous
|
|
|
|
|
steps:
|
|
|
|
|
|
|
|
|
|
\begin{itemize}
|
|
|
|
|
\item The HyperTransport bus is configured, and its speed is
|
|
|
|
|
correctly set.
|
|
|
|
|
\item The SMBus controller is configured.
|
|
|
|
|
\item The BSP is in unreal mode.
|
|
|
|
|
\item A stack is set up for all cores.
|
|
|
|
|
\item All cores are initialized at a frequency of 2GHz.
|
|
|
|
|
\item If we were using saved values, the NVRAM would have been
|
|
|
|
|
verified with checksums.
|
|
|
|
|
\end{itemize}
|
|
|
|
|
|
|
|
|
|
The memory controller for the BSP is queried to check if it can
|
|
|
|
|
manage ECC memory, which is a type of memory that includes
|
|
|
|
|
error-correcting code to detect and correct common types of data
|
2024-08-25 15:57:26 +02:00
|
|
|
|
corruption (lst. \ref{lst:mctAutoInitMCT_D_2}).
|
2024-08-25 11:54:54 +02:00
|
|
|
|
|
|
|
|
|
For each node available in the system, the memory controllers
|
2024-08-25 15:57:26 +02:00
|
|
|
|
are identified and initialized using a \path{DCTStatStruc}
|
2024-08-25 11:54:54 +02:00
|
|
|
|
structure defined in
|
|
|
|
|
\path{src/northbridge/amd/amdmct/mct_ddr3/mct_d.h}. This
|
|
|
|
|
structure contains all necessary fields for managing a memory
|
|
|
|
|
module. The process includes:
|
|
|
|
|
|
|
|
|
|
\begin{itemize}
|
|
|
|
|
\item Retrieving the corresponding field in the sysinfo
|
|
|
|
|
structure for the node.
|
2024-08-25 15:57:26 +02:00
|
|
|
|
\item Clearing fields with \path{zero}.
|
2024-08-25 11:54:54 +02:00
|
|
|
|
\item Initializing basic fields.
|
|
|
|
|
\item Initializing the controller linked to the current node.
|
|
|
|
|
\item Verifying the presence of the node (checking if the
|
|
|
|
|
processor associated with this controller is present).
|
|
|
|
|
If yes, the SMBus is informed.
|
|
|
|
|
\item Pre-initializing the memory module controller for this
|
2024-08-25 15:57:26 +02:00
|
|
|
|
node using \path{mct_preInitDCT}.
|
2024-08-25 11:54:54 +02:00
|
|
|
|
\end{itemize}
|
|
|
|
|
|
2024-08-25 15:57:26 +02:00
|
|
|
|
The memory modules must be initialized. All modules present on
|
|
|
|
|
valid nodes are configured with 1.5V voltage
|
2024-08-25 18:51:20 +02:00
|
|
|
|
(lst. \ref{lst:mctAutoInitMCT_D_3}). The ZQ calibration
|
|
|
|
|
is triggered at this stage. \\
|
2024-08-25 15:57:26 +02:00
|
|
|
|
|
|
|
|
|
Now, present memory modules are detected using \path{mct_initDCT}
|
|
|
|
|
(lst. \ref{lst:mctAutoInitMCT_D_4}). The memory modules' existence
|
|
|
|
|
is checked and the machine halts immediately after displaying a
|
|
|
|
|
message if there is no memory.
|
|
|
|
|
\textit{coreboot} waits for all modules to be available using
|
|
|
|
|
\path{SyncDCTsReady_D}. \\
|
|
|
|
|
|
|
|
|
|
The firmware maps the physical memory address ranges into the
|
|
|
|
|
address space with \path{HTMemMapInit_D} as contiguously as possible
|
|
|
|
|
while also constructing the physical memory map. If there is an
|
|
|
|
|
area occupied by something else, it is ignored, and a memory hole is
|
|
|
|
|
created. \\
|
|
|
|
|
|
|
|
|
|
Mapping the address ranges into the cache is done with
|
|
|
|
|
\path{CPUMemTyping_D} either as WriteBack (cacheable) or
|
|
|
|
|
Uncacheable, depending on whether the area corresponds to physical
|
|
|
|
|
memory or a memory hole. \\
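The distinction between cacheable DRAM and an uncacheable hole can be sketched with a simplified layout. The layout used here (a single hole ending at 4\,GiB, with the remaining DRAM placed above it) and all identifiers are assumptions for illustration. They do not reproduce the data structures used by \path{CPUMemTyping_D}.

```c
#include <stdint.h>

#define FOUR_GIB (1ull << 32)

enum mem_type {
    MEM_UNCACHEABLE,
    MEM_WRITEBACK
};

/* Hypothetical layout: DRAM at [0, hole_base), a memory hole at
 * [hole_base, 4 GiB), and the rest of the DRAM at [4 GiB, top_of_mem). */
struct mem_layout {
    uint64_t hole_base;
    uint64_t top_of_mem;
};

/* Returns the cache type for an address: WriteBack when it is backed by
 * DRAM, Uncacheable when it falls in the hole or beyond the DRAM. */
enum mem_type classify_address(const struct mem_layout *layout, uint64_t addr)
{
    if (addr < layout->hole_base)
        return MEM_WRITEBACK;
    if (addr >= FOUR_GIB && addr < layout->top_of_mem)
        return MEM_WRITEBACK;
    return MEM_UNCACHEABLE;
}
```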
|
|
|
|
|
|
|
|
|
|
The external northbridge is notified of this new memory
|
|
|
|
|
configuration. \\
|
|
|
|
|
|
2024-08-25 11:54:54 +02:00
|
|
|
|
The \textit{coreboot} code compensates for the delay between DQS
|
2024-08-25 15:57:26 +02:00
|
|
|
|
and DQ signals, as well as between CMD and DQ. This is handled by
|
|
|
|
|
the \path{DQSTiming_D} function (lst. \ref{lst:mctAutoInitMCT_D_5}).
|
|
|
|
|
The initialization can be done again if needed after that, otherwise
|
|
|
|
|
the channels and nodes are interleaved and ECC is enabled (if
|
|
|
|
|
supported by every module). \\
|
2024-08-25 11:54:54 +02:00
|
|
|
|
|
2024-08-25 15:57:26 +02:00
|
|
|
|
Once this is done, the DRAM can be mapped into the address
|
|
|
|
|
space with cacheability, and the init process finishes with
|
|
|
|
|
validation of every populated DCT node
|
|
|
|
|
(lst. \ref{lst:mctAutoInitMCT_D_6}). \\
|
2024-08-22 19:18:34 +02:00
|
|
|
|
|
2024-08-25 15:57:26 +02:00
|
|
|
|
Finally, if the RAM is of the ECC type, error-correcting codes
|
|
|
|
|
are enabled, and the function ends by activating power-saving
|
|
|
|
|
features if requested by the user. \\
|
|
|
|
|
|
2024-08-25 18:51:20 +02:00
|
|
|
|
\subsubsection{Details on the DQS training function}
|
|
|
|
|
|
|
|
|
|
The \path{DQSTiming_D} function is a critical part of the
|
|
|
|
|
firmware responsible for initializing and training the system's
|
|
|
|
|
memory.
|
|
|
|
|
The function primarily handles the DQS timing, which is
|
|
|
|
|
essential for ensuring data integrity and synchronization
|
|
|
|
|
between the memory controller and the DRAM. Proper DQS training
|
|
|
|
|
is crucial to align the data signals correctly with the clock
|
|
|
|
|
signals.
|
|
|
|
|
|
|
|
|
|
The function begins by declaring local variables, which are
|
|
|
|
|
used throughout the function for various operations. It also
|
|
|
|
|
includes an early exit condition to bypass DQS training if a
|
|
|
|
|
specific status flag (\path{GSB_EnDIMMSpareNW}) is set,
|
|
|
|
|
indicating that a DIMM spare feature is enabled
|
2024-08-26 19:19:02 +02:00
|
|
|
|
(lst. \ref{lst:var_decl_and_exit}). These spare DIMMs are not
|
|
|
|
|
used for normal memory operations but are kept in reserve for
|
|
|
|
|
redundancy. \\
|
2024-08-25 18:51:20 +02:00
|
|
|
|
|
2024-08-26 19:19:02 +02:00
|
|
|
|
\begin{listing}[H]
|
2024-08-25 18:51:20 +02:00
|
|
|
|
\begin{adjustwidth}{0.5cm}{0.5cm}
|
|
|
|
|
\begin{minted}[linenos]{c}
if (pMCTstat->GStatus & (1 << GSB_EnDIMMSpareNW)) {
    return;
}
\end{minted}
|
|
|
|
|
\end{adjustwidth}
|
2024-08-26 19:19:02 +02:00
|
|
|
|
\caption{Early exit check,
|
|
|
|
|
extract from the
|
|
|
|
|
\protect\path{DQSTiming_D} function in
|
|
|
|
|
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mct_d.c}}
|
2024-08-25 18:51:20 +02:00
|
|
|
|
\label{lst:var_decl_and_exit}
|
|
|
|
|
\end{listing}
|
|
|
|
|
|
|
|
|
|
Next, the function initializes the TCWL (CAS Write Latency)
|
2024-08-26 19:19:02 +02:00
|
|
|
|
offset to zero for each node and DCT.
|
2024-08-25 18:51:20 +02:00
|
|
|
|
This ensures that the memory write latency is properly aligned
|
|
|
|
|
before the DQS training begins
|
|
|
|
|
(lst. \ref{lst:set_tcwl_offset}). \\
|
|
|
|
|
|
2024-08-26 19:19:02 +02:00
|
|
|
|
\begin{listing}[H]
|
2024-08-25 18:51:20 +02:00
|
|
|
|
\begin{adjustwidth}{0.5cm}{0.5cm}
|
|
|
|
|
\begin{minted}[linenos]{c}
for (Node = 0; Node < MAX_NODES_SUPPORTED; Node++) {
    uint8_t dct;
    struct DCTStatStruc *pDCTstat;
    pDCTstat = pDCTstatA + Node;
    for (dct = 0; dct < 2; dct++)
        pDCTstat->tcwl_delay[dct] = 0;
}
\end{minted}
|
|
|
|
|
\end{adjustwidth}
|
|
|
|
|
\caption{Setting initial TCWL offset to zero for all nodes and DCTs,
|
|
|
|
|
extract from the
|
|
|
|
|
\protect\path{DQSTiming_D} function in
|
|
|
|
|
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mct_d.c}}
|
|
|
|
|
\label{lst:set_tcwl_offset}
|
|
|
|
|
\end{listing}
|
|
|
|
|
|
|
|
|
|
A retry mechanism is introduced to handle potential errors
|
2024-08-26 19:19:02 +02:00
|
|
|
|
during DQS training, and the pre-training functions are called
|
2024-08-25 18:51:20 +02:00
|
|
|
|
(lst. \ref{lst:retry_pre_training}). \\
|
|
|
|
|
|
2024-08-26 19:19:02 +02:00
|
|
|
|
\begin{listing}[H]
|
2024-08-25 18:51:20 +02:00
|
|
|
|
\begin{adjustwidth}{0.5cm}{0.5cm}
|
|
|
|
|
\begin{minted}[linenos]{c}
retry_dqs_training_and_levelization:
    nv_DQSTrainCTL = !allow_config_restore;

    mct_BeforeDQSTrain_D(pMCTstat, pDCTstatA);
    phyAssistedMemFnceTraining(pMCTstat, pDCTstatA, -1);
\end{minted}
|
|
|
|
|
\end{adjustwidth}
|
|
|
|
|
\caption{Retry mechanism initialization and pre-training operations,
|
|
|
|
|
extract from the
|
|
|
|
|
\protect\path{DQSTiming_D} function in
|
|
|
|
|
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mct_d.c}}
|
|
|
|
|
\label{lst:retry_pre_training}
|
|
|
|
|
\end{listing}
|
|
|
|
|
|
|
|
|
|
For AMD's Fam15h processors, additional PHY compensation is
|
2024-08-26 19:19:02 +02:00
|
|
|
|
needed for each node and valid DCT
|
2024-08-25 18:51:20 +02:00
|
|
|
|
(lst. \ref{lst:phy_compensation_init}). This is necessary to
|
|
|
|
|
fine-tune the electrical characteristics of the memory
|
|
|
|
|
interface. For more information about the PHY training, see
|
|
|
|
|
the earlier sections about RAM training algorithms. \\
|
|
|
|
|
|
2024-08-26 19:19:02 +02:00
|
|
|
|
\begin{listing}[H]
|
2024-08-25 18:51:20 +02:00
|
|
|
|
\begin{adjustwidth}{0.5cm}{0.5cm}
|
|
|
|
|
\begin{minted}[linenos]{c}
for (Node = 0; Node < MAX_NODES_SUPPORTED; Node++) {
    pDCTstat = pDCTstatA + Node;
    if (pDCTstat->NodePresent) {
        if (pDCTstat->DIMMValidDCT[0])
            InitPhyCompensation(pMCTstat, pDCTstat, 0);
        if (pDCTstat->DIMMValidDCT[1])
            InitPhyCompensation(pMCTstat, pDCTstat, 1);
    }
}
\end{minted}
|
|
|
|
|
\end{adjustwidth}
|
2024-08-26 19:19:02 +02:00
|
|
|
|
\caption{PHY compensation initialization,
|
2024-08-25 18:51:20 +02:00
|
|
|
|
extract from the
|
|
|
|
|
\protect\path{DQSTiming_D} function in
|
|
|
|
|
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mct_d.c}}
|
|
|
|
|
\label{lst:phy_compensation_init}
|
|
|
|
|
\end{listing}
|
|
|
|
|
|
|
|
|
|
Before proceeding with the main DQS training, the function
|
|
|
|
|
invokes a hook function that allows for additional
|
2024-08-26 19:19:02 +02:00
|
|
|
|
configurations or custom operations:
|
|
|
|
|
\path{mctHookBeforeAnyTraining}. \\
|
2024-08-25 18:51:20 +02:00
|
|
|
|
|
2024-08-26 19:19:02 +02:00
|
|
|
|
The \path{nv_DQSTrainCTL} variable is
set based on the \path{allow_config_restore} parameter,
determining whether to restore a previous configuration or
proceed with fresh training. However, restoring a configuration
does not work in the current implementation of the ASUS KGPE-D16
firmware (lst. \ref{lst:mctAutoInitMCT_D_fixme}).
|
2024-08-25 18:51:20 +02:00
|
|
|
|
If \path{nv_DQSTrainCTL} indicates that fresh training should
|
|
|
|
|
proceed, the function performs the main DQS training in multiple
|
|
|
|
|
passes, including receiver enable training with
|
2024-08-26 19:19:02 +02:00
|
|
|
|
\path{TrainReceiverEn_D}, write leveling with
|
|
|
|
|
\path{mct_WriteLevelization_HW}, DQS position
|
|
|
|
|
training with \path{mct_TrainDQSPos_D} and the maximum read
|
|
|
|
|
latency calculation with \path{TrainMaxRdLatency_En_D}
|
|
|
|
|
(lst. \ref{lst:dqs_training_process}).
|
|
|
|
|
Write leveling is done in two passes, with a DQS receiver
enable training pass in between and another receiver enable
training pass afterwards. DQS position training is then
performed, and the process finishes with the calculation of the
maximum read latency, i.e. the delay between
the request for data and the delivery of that data by the DRAM.
|
|
|
|
|
\\
|
|
|
|
|
|
|
|
|
|
\begin{listing}[H]
|
2024-08-25 18:51:20 +02:00
|
|
|
|
\begin{adjustwidth}{0.5cm}{0.5cm}
|
|
|
|
|
\begin{minted}[linenos]{c}
|
|
|
|
|
if (nv_DQSTrainCTL) {
    mct_WriteLevelization_HW(pMCTstat, pDCTstatA, FirstPass);
    TrainReceiverEn_D(pMCTstat, pDCTstatA, FirstPass);
    mct_WriteLevelization_HW(pMCTstat, pDCTstatA, SecondPass);

    /* TODO: Determine why running TrainReceiverEn_D in SecondPass mode yields
     * less stable training values than when run in FirstPass mode as in the HACK
     * below.*/
    TrainReceiverEn_D(pMCTstat, pDCTstatA, FirstPass);
    mct_TrainDQSPos_D(pMCTstat, pDCTstatA);
    ...
    TrainMaxRdLatency_En_D(pMCTstat, pDCTstatA);
} else {
    mct_WriteLevelization_HW(pMCTstat, pDCTstatA, FirstPass);
    mct_WriteLevelization_HW(pMCTstat, pDCTstatA, SecondPass);
#if CONFIG(HAVE_ACPI_RESUME)
    printk(BIOS_DEBUG, "mctAutoInitMCT_D: Restoring DIMM training configuration "
        "from NVRAM\n");
    if (restore_mct_information_from_nvram(1) != 0)
        printk(BIOS_CRIT, "%s: ERROR: Unable to restore DCT configuration from "
            "NVRAM\n", __func__);
#endif
    exit_training_mode_fam15(pMCTstat, pDCTstatA);
    pMCTstat->GStatus |= 1 << GSB_ConfigRestored;
}
|
|
|
|
|
\end{minted}
|
|
|
|
|
\end{adjustwidth}
|
|
|
|
|
\caption{Main DQS training process in multiple passes,
|
|
|
|
|
extract from the
|
|
|
|
|
\protect\path{DQSTiming_D} function in
|
|
|
|
|
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mct_d.c}}
|
|
|
|
|
\label{lst:dqs_training_process}
|
|
|
|
|
\end{listing}
|
|
|
|
|
|
2024-08-26 19:19:02 +02:00
|
|
|
|
The function then checks for errors that occurred during DQS
training. If errors are detected, it may request a retrain, reset
certain parameters and restart the training process, or even
reset the whole system if needed (lst. \ref{lst:error_handling}).
If the training process is to be restarted, the firmware
sets the DIMM frequencies to the minimum and applies the timing
changes to the DIMMs before jumping back to the retry label
(lst. \ref{lst:retry_pre_training}). \\
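
The retry-with-fallback idea described above can be sketched in isolation as follows. This is only an illustrative sketch, not the coreboot code: the firmware uses a \texttt{goto} to a retry label rather than a loop, and the names \path{train_with_retry}, \path{train_fn}, \path{fallback_fn} and the \path{demo_*} helpers are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical callback types standing in for the real training and
 * fallback routines; the coreboot functions have different signatures. */
typedef bool (*train_fn)(void *ctx);     /* returns true on success */
typedef void (*fallback_fn)(void *ctx);  /* e.g. drop to the minimum MEMCLK */

/* Run the training once, then retry up to max_retries more times,
 * applying the fallback before each retry. */
static bool train_with_retry(train_fn train, fallback_fn fallback,
                             void *ctx, unsigned int max_retries)
{
    unsigned int attempt;

    for (attempt = 0; attempt <= max_retries; attempt++) {
        if (train(ctx))
            return true;
        if (attempt < max_retries)
            fallback(ctx);
    }
    return false;
}

/* Demo state (assumption): training only succeeds at the minimum frequency. */
struct demo_state {
    unsigned int freq_mhz;
    unsigned int min_freq_mhz;
    unsigned int calls;
};

static bool demo_train(void *ctx)
{
    struct demo_state *s = (struct demo_state *)ctx;

    s->calls++;
    return s->freq_mhz <= s->min_freq_mhz;
}

static void demo_fallback(void *ctx)
{
    struct demo_state *s = (struct demo_state *)ctx;

    s->freq_mhz = s->min_freq_mhz;
}
```

With a demo state starting at 800 MHz and a minimum of 333 MHz, the first attempt fails, the fallback lowers the frequency, and the second attempt succeeds.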
|
|
|
|
|
|
2024-08-25 18:51:20 +02:00
|
|
|
|
Once the training is successfully completed without errors, the
|
|
|
|
|
function finalizes the process by setting the maximum read
|
|
|
|
|
latency and exiting the training mode. For systems with
|
|
|
|
|
\path{allow_config_restore} enabled, it restores the previous
|
|
|
|
|
configuration from NVRAM instead of performing a fresh training
|
2024-08-26 19:19:02 +02:00
|
|
|
|
(lst. \ref{lst:dqs_training_process}). \\
|
2024-08-25 18:51:20 +02:00
|
|
|
|
|
2024-08-26 19:19:02 +02:00
|
|
|
|
Finally, the function performs a cleanup operation specific to
Fam15h processors: on every present node, it switches the DCT
selection back to DCT 0, as required by a known AMD erratum
(Erratum 505) \cite{amd_fam15h_revision_guide}.
This is followed by a post-training hook that
allows for any additional necessary actions
(lst. \ref{lst:post_training_cleanup}). \\
|
|
|
|
|
|
|
|
|
|
\begin{listing}[htpb]
|
2024-08-25 18:51:20 +02:00
|
|
|
|
\begin{adjustwidth}{0.5cm}{0.5cm}
|
|
|
|
|
\begin{minted}[linenos]{c}
|
2024-08-26 19:19:02 +02:00
|
|
|
|
for (Node = 0; Node < MAX_NODES_SUPPORTED; Node++) {
|
|
|
|
|
pDCTstat = pDCTstatA + Node;
|
|
|
|
|
if (pDCTstat->NodePresent) {
|
|
|
|
|
fam15h_switch_dct(pDCTstat->dev_map, 0);
|
|
|
|
|
}
|
|
|
|
|
}
|
2024-08-25 18:51:20 +02:00
|
|
|
|
|
2024-08-26 19:19:02 +02:00
|
|
|
|
/* FIXME - currently uses calculated value
|
|
|
|
|
* TrainMaxReadLatency_D(pMCTstat, pDCTstatA); */
|
|
|
|
|
mctHookAfterAnyTraining();
|
|
|
|
|
\end{minted}
|
|
|
|
|
\end{adjustwidth}
|
|
|
|
|
\caption{Post-training cleanup and final hook execution,
|
|
|
|
|
extract from the
|
|
|
|
|
\protect\path{DQSTiming_D} function in
|
|
|
|
|
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mct_d.c}}
|
|
|
|
|
\label{lst:post_training_cleanup}
|
|
|
|
|
\end{listing}
|
2024-08-25 18:51:20 +02:00
|
|
|
|
|
2024-08-26 19:19:02 +02:00
|
|
|
|
\subsubsection{Details on the write leveling implementation}
|
|
|
|
|
|
|
|
|
|
The \path{WriteLevelization_HW} function is responsible for
|
|
|
|
|
performing hardware-level write leveling on DRAM modules during
|
|
|
|
|
the memory initialization process. Write leveling ensures that
|
|
|
|
|
the DQS signals are correctly aligned with the clock signals,
|
|
|
|
|
preventing timing mismatches during write operations. \\
|
|
|
|
|
|
|
|
|
|
The function begins by initializing pointers to key data
structures, linking the memory controller (MCT) and DRAM
controller (DCT) data for subsequent operations. \\
|
|
|
|
|
|
|
|
|
|
Auto-refresh and short ZQ calibration are temporarily disabled
|
|
|
|
|
to prevent interference during the critical timing adjustments
|
|
|
|
|
of write leveling.
|
|
|
|
|
The memory controller is prepared for write leveling by
|
|
|
|
|
configuring necessary parameters with \path{PrepareC_MCT},
|
|
|
|
|
then the main operation can begin. \\
|
|
|
|
|
|
|
|
|
|
In the first pass (lst. \ref{lst:write_level_first_pass}),
the function attempts to align the DQS signals on both DCTs
with \path{PhyWLPass1}. If invalid timing values are detected,
the attempt is repeated, up to 8 times in total. This phase
establishes the basic alignment that the later passes
fine-tune. \\
|
|
|
|
|
|
|
|
|
|
During the second pass (lst. \ref{lst:write_level_second_pass}),
|
|
|
|
|
the function first checks if the target memory frequency
|
|
|
|
|
(\path{TargetFreq}) is higher than the minimum memory clock
|
|
|
|
|
frequency stored in the non-volatile bits
|
|
|
|
|
(\path{NV_MIN_MEMCLK}). If so, the memory frequency is
|
|
|
|
|
incrementally adjusted toward the final target frequency.
|
|
|
|
|
This step-by-step approach is crucial, especially for AMD Fam15h
|
|
|
|
|
processors, where the frequency must be gradually stepped up to
|
|
|
|
|
avoid instability. \\
|
|
|
|
|
|
|
|
|
|
For each frequency step, the write leveling process is
|
|
|
|
|
recalibrated by invoking the \path{PhyWLPass2} function. This
|
|
|
|
|
function adjusts the DQS timing for each data channel (DCT) and
|
|
|
|
|
validates the results. The function retries up to 8 times if it
|
|
|
|
|
detects invalid timing values. The global status
|
|
|
|
|
(\path{global_phy_training_status}) aggregates the results of
|
|
|
|
|
each step, tracking any persistent issues. \\
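
The stepping logic described above can be approximated by the following sketch. It is written under assumptions: the frequency table, the callback type \path{level_fn} and the function \path{step_and_level} are hypothetical, and the real code works on coreboot's own frequency indices and calls \path{PhyWLPass2} per DCT.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_WL_RETRIES 8

/* Hypothetical per-step leveling callback: returns 0 when the
 * resulting timing values are valid, non-zero otherwise. */
typedef int (*level_fn)(unsigned int freq_mhz, void *ctx);

/* Walk an ascending frequency table from just above min_freq up to
 * target_freq, re-running leveling at each step (at most
 * MAX_WL_RETRIES attempts per step) and OR-ing the final status of
 * each step into a global status. */
static int step_and_level(const unsigned int *freq_tab, size_t n,
                          unsigned int min_freq, unsigned int target_freq,
                          level_fn level, void *ctx)
{
    int global_status = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        unsigned int f = freq_tab[i];
        unsigned int tries = 0;
        int status;

        if (f <= min_freq)
            continue;   /* the first pass already ran at the minimum */
        if (f > target_freq)
            break;      /* never step past the target */

        do {
            tries++;
            status = level(f, ctx);
        } while (status && tries < MAX_WL_RETRIES);

        global_status |= status;
    }
    return global_status;
}

/* Demo callback (assumption): fails exactly once at one frequency. */
struct demo_level_ctx {
    unsigned int calls;
    unsigned int fail_once_at;
};

static int demo_level(unsigned int freq_mhz, void *ctx)
{
    struct demo_level_ctx *c = (struct demo_level_ctx *)ctx;

    c->calls++;
    if (freq_mhz == c->fail_once_at) {
        c->fail_once_at = 0;
        return 1;
    }
    return 0;
}
```

A non-zero return value means that at least one step still had invalid values after all of its retries, which mirrors the role of the aggregated status in the text.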
|
|
|
|
|
|
2024-08-27 16:03:22 +02:00
|
|
|
|
The \path{PhyWLPass1} and \path{PhyWLPass2} functions rely on
\path{AgesaHwWlPhase1}, \path{AgesaHwWlPhase2} and
\path{AgesaHwWlPhase3} for this. \\
|
|
|
|
|
|
|
|
|
|
Once the target frequency is reached and all write leveling
|
|
|
|
|
adjustments are made, the final timing values are stored.
|
|
|
|
|
The gross and fine delays from the previous passes are copied
|
|
|
|
|
into the final pass structures. This ensures that the DQS
|
|
|
|
|
timings are consistent and stable across all data channels. \\
|
|
|
|
|
|
|
|
|
|
If any issues persist after retries, the function logs a
|
|
|
|
|
warning. This indicates that the system may continue to operate,
|
|
|
|
|
but with a potential risk of instability due to imperfect
|
|
|
|
|
write leveling calibration. \\
|
|
|
|
|
|
|
|
|
|
After leveling, the function re-enables auto-refresh and short
|
|
|
|
|
ZQ calibration, ensuring the memory subsystem is correctly
|
|
|
|
|
configured for normal operation. \\
|
|
|
|
|
|
|
|
|
|
\begin{listing}[htpb]
|
|
|
|
|
\begin{adjustwidth}{0.5cm}{0.5cm}
|
|
|
|
|
\begin{minted}[linenos]{c}
|
|
|
|
|
if (Pass == FirstPass) {
|
|
|
|
|
timeout = 0;
|
|
|
|
|
do {
|
|
|
|
|
status = 0;
|
|
|
|
|
timeout++;
|
|
|
|
|
status |= PhyWLPass1(pMCTstat, pDCTstat, 0);
|
|
|
|
|
status |= PhyWLPass1(pMCTstat, pDCTstat, 1);
|
|
|
|
|
if (status)
|
|
|
|
|
printk(BIOS_INFO, "%s: Retrying write levelling due to invalid "
|
|
|
|
|
"value(s) detected in first phase\n", __func__);
|
|
|
|
|
} while (status && (timeout < 8));
|
|
|
|
|
if (status)
|
|
|
|
|
printk(BIOS_INFO, "%s: Uncorrectable invalid value(s) detected in first "
|
|
|
|
|
"phase of write levelling\n", __func__);
|
|
|
|
|
}
|
|
|
|
|
\end{minted}
|
|
|
|
|
\end{adjustwidth}
|
|
|
|
|
\caption{Write leveling (first pass),
|
|
|
|
|
extract from the
|
|
|
|
|
\protect\path{WriteLevelization_HW} function in
|
|
|
|
|
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mcthwl.c}}
|
|
|
|
|
\label{lst:write_level_first_pass}
|
|
|
|
|
\end{listing}
|
|
|
|
|
|
2024-08-27 13:56:15 +02:00
|
|
|
|
The detailed write leveling process is divided into three
distinct phases, each managed by a specific function:
\path{AgesaHwWlPhase1}, \path{AgesaHwWlPhase2}, and
\path{AgesaHwWlPhase3} from \path{mhwlc_d.c}.
These phases work together to
fine-tune the timing delays (gross and fine) for each byte
lane, ensuring reliable data transmission. \\
|
|
|
|
|
|
|
|
|
|
The write leveling process begins by selecting the target
DIMM. This is accomplished by programming the
\path{TrDimmSel} field of the DCT PHY control register, so
that the subsequent operations apply to the correct DIMM
(lst. \ref{lst:target_dimm_selection}). \\
|
2024-08-26 19:19:02 +02:00
|
|
|
|
|
|
|
|
|
\begin{listing}
|
2024-08-27 13:27:07 +02:00
|
|
|
|
\begin{adjustwidth}{0.5cm}{0.5cm}
|
|
|
|
|
\begin{minted}[linenos]{c}
|
2024-08-26 19:19:02 +02:00
|
|
|
|
set_DCT_ADDR_Bits(pDCTData, dct, pDCTData->NodeId, FUN_DCT,
|
|
|
|
|
DRAM_ADD_DCT_PHY_CONTROL_REG, TrDimmSelStart,
|
|
|
|
|
TrDimmSelEnd, (u32)dimm);
|
2024-08-27 13:27:07 +02:00
|
|
|
|
\end{minted}
|
|
|
|
|
\end{adjustwidth}
|
|
|
|
|
\caption{Target DIMM selection for write leveling,
|
|
|
|
|
extract from
|
|
|
|
|
\protect\path{AgesaHwWlPhase1} function in
|
|
|
|
|
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mhwlc_d.c}}
|
2024-08-26 19:19:02 +02:00
|
|
|
|
\label{lst:target_dimm_selection}
|
|
|
|
|
\end{listing}
|
|
|
|
|
|
|
|
|
|
In the case of x4 DIMMs, which are common in high-density
|
|
|
|
|
memory configurations, write leveling must be performed
|
|
|
|
|
separately for each nibble (4-bit group). The function
checks if x4 DIMMs are present and, on Fam15h processors,
prepares to train both nibbles (lst. \ref{lst:x4_dimm_handling}). \\
|
2024-08-25 18:51:20 +02:00
|
|
|
|
|
2024-08-26 19:19:02 +02:00
|
|
|
|
\begin{listing}
|
2024-08-27 13:27:07 +02:00
|
|
|
|
\begin{adjustwidth}{0.5cm}{0.5cm}
|
|
|
|
|
\begin{minted}[linenos]{c}
|
2024-08-26 19:19:02 +02:00
|
|
|
|
train_both_nibbles = 0;
|
|
|
|
|
if (pDCTstat->Dimmx4Present)
|
2024-08-25 18:51:20 +02:00
|
|
|
|
if (is_fam15h())
|
2024-08-26 19:19:02 +02:00
|
|
|
|
train_both_nibbles = 1;
|
2024-08-27 13:27:07 +02:00
|
|
|
|
\end{minted}
|
|
|
|
|
\end{adjustwidth}
|
|
|
|
|
\caption{Handling of x4 DIMMs and nibble training,
|
|
|
|
|
extract from
|
|
|
|
|
\protect\path{AgesaHwWlPhase1} function in
|
|
|
|
|
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mhwlc_d.c}}
|
2024-08-26 19:19:02 +02:00
|
|
|
|
\label{lst:x4_dimm_handling}
|
|
|
|
|
\end{listing}
|
|
|
|
|
|
|
|
|
|
The DIMMs are prepared for write leveling by issuing Mode
|
|
|
|
|
Register (MR) commands. These commands configure the DIMMs
|
2024-08-27 16:03:22 +02:00
|
|
|
|
to enter a state where write leveling can be performed
|
|
|
|
|
(lst. \ref{lst:prepare_dimms}). \\
|
2024-08-26 19:19:02 +02:00
|
|
|
|
|
|
|
|
|
\begin{listing}
|
|
|
|
|
\begin{adjustwidth}{0.5cm}{0.5cm}
|
|
|
|
|
\begin{minted}[linenos]{c}
|
|
|
|
|
prepareDimms(pMCTstat, pDCTstat, dct, dimm, TRUE);
|
|
|
|
|
\end{minted}
|
|
|
|
|
\end{adjustwidth}
|
2024-08-27 13:27:07 +02:00
|
|
|
|
\caption{Preparing DIMMs for write leveling,
|
|
|
|
|
extract from
|
|
|
|
|
\protect\path{AgesaHwWlPhase1} function in
|
|
|
|
|
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mhwlc_d.c}}
|
2024-08-26 19:19:02 +02:00
|
|
|
|
\label{lst:prepare_dimms}
|
|
|
|
|
\end{listing}
|
|
|
|
|
|
|
|
|
|
The \path{procConfig} function is called to configure the
|
|
|
|
|
processor's DDR PHY (Physical Layer) for write leveling.
|
|
|
|
|
This configuration includes setting initial seed values for
|
|
|
|
|
gross and fine delays, which are essential for the
|
|
|
|
|
subsequent timing adjustments. \\
|
|
|
|
|
|
2024-08-27 16:03:22 +02:00
|
|
|
|
The seed values generated by \path{procConfig}
(lst. \ref{lst:seed_generation}) depend on several factors:
|
|
|
|
|
|
|
|
|
|
\begin{itemize}
|
|
|
|
|
\item \textbf{Processor Type:} For Fam15h processors,
|
|
|
|
|
specific tables from the Fam15h BKDG \cite{BKDG} are
|
|
|
|
|
referenced to select appropriate seed values for
|
|
|
|
|
different package types (e.g., Socket G34, Socket
|
|
|
|
|
C32).
|
|
|
|
|
\item \textbf{DIMM Type:} The seed values are adjusted
based on whether the DIMMs are registered (RDIMM) or
load-reduced (LRDIMM), with different base values used for
these configurations.
|
|
|
|
|
\item \textbf{Memory Clock Frequency:} The seeds are
|
|
|
|
|
further adjusted based on the current memory clock
|
|
|
|
|
frequency (\path{MemClkFreq}), ensuring that the
|
|
|
|
|
timing is correct for the operating speed of the
|
|
|
|
|
memory.
|
|
|
|
|
\end{itemize}
|
|
|
|
|
|
|
|
|
|
The resulting seed is then scaled by the ratio between the
current memory frequency and the minimum supported memory
frequency, split into its gross and fine components, and stored
in the \path{WLSeedGrossDelay} and \path{WLSeedFineDelay} arrays
for each byte lane. \\
|
|
|
|
|
|
|
|
|
|
\begin{listing}
|
|
|
|
|
\begin{adjustwidth}{0.5cm}{0.5cm}
|
|
|
|
|
\begin{minted}[linenos]{c}
|
|
|
|
|
Seed_Total = (int32_t) (((((int64_t) Seed_Total) *
|
|
|
|
|
fam15h_freq_tab[MemClkFreq] * 100) / (mctGet_NVbits(NV_MIN_MEMCLK) * 100)));
|
|
|
|
|
|
|
|
|
|
Seed_Gross = (Seed_Total >> 5) & 0x1f;
|
|
|
|
|
Seed_Fine = Seed_Total & 0x1f;
|
|
|
|
|
\end{minted}
|
|
|
|
|
\end{adjustwidth}
|
2024-08-27 13:27:07 +02:00
|
|
|
|
\caption{Seed generation,
|
|
|
|
|
extract from
|
|
|
|
|
\protect\path{procConfig} function in
|
|
|
|
|
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mhwlc_d.c}}
|
2024-08-26 19:19:02 +02:00
|
|
|
|
\label{lst:seed_generation}
|
|
|
|
|
\end{listing}
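
As a worked example of the arithmetic in the extract above, the following sketch reproduces the scaling and the gross/fine split on plain integers. It assumes frequencies are passed directly in MHz instead of being looked up in \path{fam15h_freq_tab}; the names \path{scale_seed} and \path{wl_seed} are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

struct wl_seed {
    uint8_t gross;  /* bits 9:5 of the scaled seed */
    uint8_t fine;   /* bits 4:0 of the scaled seed */
};

/* Scale a seed defined at min_freq_mhz to cur_freq_mhz, then split it
 * into a 5-bit gross delay and a 5-bit fine delay. */
static struct wl_seed scale_seed(int32_t seed_total,
                                 uint32_t cur_freq_mhz,
                                 uint32_t min_freq_mhz)
{
    struct wl_seed out;
    int32_t scaled = (int32_t)((((int64_t)seed_total) * cur_freq_mhz * 100) /
                               ((int64_t)min_freq_mhz * 100));

    out.gross = (uint8_t)((scaled >> 5) & 0x1f);
    out.fine = (uint8_t)(scaled & 0x1f);
    return out;
}
```

For instance, a seed of \texttt{0x1a} (26) defined at 333 MHz and scaled to 667 MHz gives $26 \times 667 / 333 = 52$ after integer division, i.e. a gross delay of 1 and a fine delay of 20, since $52 = 1 \times 32 + 20$.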
|
2024-08-25 18:51:20 +02:00
|
|
|
|
|
2024-08-26 19:19:02 +02:00
|
|
|
|
Write leveling is initiated by enabling the
|
|
|
|
|
\path{WrtLvTrEn} bit. This allows the DDR PHY to begin
|
2024-08-27 16:03:22 +02:00
|
|
|
|
adjusting the DQS signals relative to the clock signals
|
|
|
|
|
(lst. \ref{lst:initiate_write_leveling}). \\
|
2024-08-26 19:19:02 +02:00
|
|
|
|
|
|
|
|
|
\begin{listing}
|
|
|
|
|
\begin{adjustwidth}{0.5cm}{0.5cm}
|
|
|
|
|
\begin{minted}[linenos]{c}
|
|
|
|
|
set_DCT_ADDR_Bits(pDCTData, dct, pDCTData->NodeId, FUN_DCT,
|
|
|
|
|
DRAM_ADD_DCT_PHY_CONTROL_REG, WrtLvTrEn, WrtLvTrEn, 1);
|
|
|
|
|
\end{minted}
|
|
|
|
|
\end{adjustwidth}
|
2024-08-27 13:27:07 +02:00
|
|
|
|
\caption{Initiating write leveling training,
|
|
|
|
|
extract from
|
|
|
|
|
\protect\path{AgesaHwWlPhase1} function in
|
|
|
|
|
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mhwlc_d.c}}
|
2024-08-26 19:19:02 +02:00
|
|
|
|
\label{lst:initiate_write_leveling}
|
|
|
|
|
\end{listing}
|
|
|
|
|
|
2024-08-27 13:27:07 +02:00
|
|
|
|
If the DIMM is not x4, the function skips the nibble
|
2024-08-27 16:03:22 +02:00
|
|
|
|
training loop, as it is unnecessary
|
|
|
|
|
(lst. \ref{lst:exit_non_x4}). \\
|
2024-08-27 13:27:07 +02:00
|
|
|
|
|
|
|
|
|
\begin{listing}
|
|
|
|
|
\begin{adjustwidth}{0.5cm}{0.5cm}
|
|
|
|
|
\begin{minted}[linenos]{c}
|
|
|
|
|
if ((pDCTstat->Dimmx4Present & (1 << (dimm + dct))) == 0)
|
|
|
|
|
break;
|
|
|
|
|
\end{minted}
|
|
|
|
|
\end{adjustwidth}
|
|
|
|
|
\caption{Exit for non-x4 DIMMs,
|
|
|
|
|
extract from
|
|
|
|
|
\protect\path{AgesaHwWlPhase2} function in
|
|
|
|
|
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mhwlc_d.c}}
|
|
|
|
|
\label{lst:exit_non_x4}
|
|
|
|
|
\end{listing}
|
|
|
|
|
|
2024-08-26 19:19:02 +02:00
|
|
|
|
After a delay to allow the leveling process to stabilize,
|
|
|
|
|
the function reads the gross and fine delay values from the
|
2024-08-27 16:03:22 +02:00
|
|
|
|
relevant registers and stores them
|
|
|
|
|
(lst. \ref{lst:finalize_write_leveling}). These values
|
|
|
|
|
represent the initial timing adjustments necessary for
|
|
|
|
|
correct DQS alignment. \\
|
2024-08-26 19:19:02 +02:00
|
|
|
|
|
|
|
|
|
\begin{listing}
|
|
|
|
|
\begin{adjustwidth}{0.5cm}{0.5cm}
|
|
|
|
|
\begin{minted}[linenos]{c}
|
|
|
|
|
for (ByteLane = 0; ByteLane < lane_count; ByteLane++) {
|
|
|
|
|
getWLByteDelay(pDCTstat, dct, ByteLane, dimm, pass, nibble, lane_count);
|
2024-08-25 18:51:20 +02:00
|
|
|
|
}
|
2024-08-26 19:19:02 +02:00
|
|
|
|
\end{minted}
|
|
|
|
|
\end{adjustwidth}
|
2024-08-27 13:27:07 +02:00
|
|
|
|
\caption{Reading and storing delay values after write leveling,
|
|
|
|
|
extract from
|
|
|
|
|
\protect\path{AgesaHwWlPhase2} function in
|
|
|
|
|
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mhwlc_d.c}}
|
2024-08-26 19:19:02 +02:00
|
|
|
|
\label{lst:finalize_write_leveling}
|
|
|
|
|
\end{listing}
|
|
|
|
|
|
|
|
|
|
\subsubsection{Details on the DQS position training function}
|
|
|
|
|
|
|
|
|
|
The DQS position training is a crucial step in the memory
|
|
|
|
|
initialization process, ensuring that both read and write
|
|
|
|
|
operations are correctly aligned with the clock signal. \\
|
|
|
|
|
|
|
|
|
|
The function \path{TrainDQSRdWrPos_D_Fam15} orchestrates this
|
|
|
|
|
process by iterating over memory lanes and adjusting timing
|
2024-08-27 16:03:22 +02:00
|
|
|
|
parameters to find optimal settings
|
|
|
|
|
(lst. \ref{lst:dqs_train_init}). It is called by
|
2024-08-26 19:19:02 +02:00
|
|
|
|
\path{mct_TrainDQSPos_D}. \\
|
|
|
|
|
|
|
|
|
|
The function begins by initializing several variables and
|
|
|
|
|
settings necessary for the training process. These include:
|
|
|
|
|
|
|
|
|
|
\begin{itemize}
|
2024-08-27 13:27:07 +02:00
|
|
|
|
\item \path{Errors}: A variable to track any errors
|
|
|
|
|
encountered during the training.
|
|
|
|
|
\item \path{dual_rank}: A flag to indicate whether the
|
|
|
|
|
current DIMM has two ranks.
|
|
|
|
|
\item \path{passing_dqs_delay_found}: An array to track
|
|
|
|
|
whether a passing DQS delay has been found for each lane.
|
|
|
|
|
\item \path{dqs_results_array}: A multi-dimensional array to
|
|
|
|
|
store the results of the DQS delay tests across
|
|
|
|
|
different write and read steps.
|
2024-08-26 19:19:02 +02:00
|
|
|
|
\end{itemize}
|
|
|
|
|
|
|
|
|
|
The function then loops over each receiver (loosely associated
|
|
|
|
|
with chip selects) to perform the training for each rank within
|
|
|
|
|
each DIMM. \\
|
|
|
|
|
|
|
|
|
|
\begin{listing}
|
|
|
|
|
\begin{adjustwidth}{0.5cm}{0.5cm}
|
|
|
|
|
\begin{minted}[linenos]{c}
|
|
|
|
|
for (Receiver = receiver_start; Receiver < receiver_end; Receiver++) {
|
|
|
|
|
dimm = (Receiver >> 1);
|
|
|
|
|
...
|
|
|
|
|
if (!mct_RcvrRankEnabled_D(pMCTstat, pDCTstat, dct, Receiver)) {
|
|
|
|
|
continue;
|
|
|
|
|
}
|
2024-08-25 18:51:20 +02:00
|
|
|
|
\end{minted}
|
|
|
|
|
\end{adjustwidth}
|
2024-08-27 13:27:07 +02:00
|
|
|
|
\caption{Looping over each receiver,
|
|
|
|
|
extract from
|
|
|
|
|
\protect\path{TrainDQSRdWrPos_D_Fam15} function in
|
|
|
|
|
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mctdqs_d.c}}
|
2024-08-26 19:19:02 +02:00
|
|
|
|
\label{lst:dqs_train_init}
|
2024-08-25 18:51:20 +02:00
|
|
|
|
\end{listing}
|
|
|
|
|
|
2024-08-26 19:19:02 +02:00
|
|
|
|
For each lane in the memory channel, the function iterates over
|
|
|
|
|
possible write and read delay values to find the optimal
|
2024-08-27 16:03:22 +02:00
|
|
|
|
configuration (lst. \ref{lst:dqs_train_iteration}).
|
|
|
|
|
This is done by:
|
2024-08-26 19:19:02 +02:00
|
|
|
|
|
|
|
|
|
\begin{enumerate}
|
2024-08-27 13:27:07 +02:00
|
|
|
|
\item Iterating over the write data delay values from the
|
|
|
|
|
initial value to the initial value plus 1 UI
|
|
|
|
|
(Unit Interval).
|
|
|
|
|
\item For each write data delay, iterating over possible
|
|
|
|
|
read DQS delay values from 0 to 1 UI.
|
|
|
|
|
\item For each combination of write and read delays, testing
|
|
|
|
|
the configuration by writing a training pattern to the
|
|
|
|
|
memory and reading it back to check if it passes or
|
|
|
|
|
fails.
|
2024-08-26 19:19:02 +02:00
|
|
|
|
\end{enumerate}
|
2024-08-25 18:51:20 +02:00
|
|
|
|
|
|
|
|
|
\begin{listing}
|
|
|
|
|
\begin{adjustwidth}{0.5cm}{0.5cm}
|
|
|
|
|
\begin{minted}[linenos]{c}
|
2024-08-26 19:19:02 +02:00
|
|
|
|
for (current_write_data_delay[lane] = initial_write_dqs_delay[lane];
|
|
|
|
|
current_write_data_delay[lane] < (initial_write_dqs_delay[lane] + 0x20);
|
|
|
|
|
current_write_data_delay[lane]++) {
|
|
|
|
|
...
|
|
|
|
|
for (current_read_dqs_delay[lane] = 0;
|
|
|
|
|
current_read_dqs_delay[lane] < 0x20;
|
|
|
|
|
current_read_dqs_delay[lane]++) {
|
|
|
|
|
...
|
2024-08-27 16:03:22 +02:00
|
|
|
|
write_dqs_read_data_timing_registers(
|
|
|
|
|
current_read_dqs_delay, dev, dct, dimm, index_reg);
|
|
|
|
|
read_dram_dqs_training_pattern_fam15(
|
|
|
|
|
pMCTstat, pDCTstat, dct, Receiver, lane, ((check_antiphase == 0)?1:0));
|
2024-08-26 19:19:02 +02:00
|
|
|
|
...
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
\end{minted}
|
|
|
|
|
\end{adjustwidth}
|
2024-08-27 13:27:07 +02:00
|
|
|
|
\caption{Iteration over write and read delay values for each lane,
|
|
|
|
|
extract from
|
|
|
|
|
\protect\path{TrainDQSRdWrPos_D_Fam15} function in
|
|
|
|
|
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mctdqs_d.c}}
|
2024-08-26 19:19:02 +02:00
|
|
|
|
\label{lst:dqs_train_iteration}
|
|
|
|
|
\end{listing}
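
The nested sweep shown above can be modeled for a single lane as the construction of a pass/fail grid. The sketch below assumes a hypothetical \path{probe_fn} callback that stands in for programming the delay registers, writing the training pattern and reading it back; \path{sweep_delays} and \path{demo_probe} are illustrative names only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define UI_STEPS 0x20   /* 32 delay steps cover 1 UI, as in the extract */

/* Hypothetical probe: returns non-zero if the pattern read back
 * correctly for this write/read delay pair. */
typedef int (*probe_fn)(uint16_t write_delay, uint16_t read_delay, void *ctx);

/* Try every write delay in [initial, initial + 1 UI) against every
 * read delay in [0, 1 UI) and record pass (1) or fail (0). */
static void sweep_delays(uint16_t initial_write_delay, probe_fn probe,
                         void *ctx, uint8_t results[UI_STEPS][UI_STEPS])
{
    uint16_t w, r;

    for (w = 0; w < UI_STEPS; w++)
        for (r = 0; r < UI_STEPS; r++)
            results[w][r] =
                probe((uint16_t)(initial_write_delay + w), r, ctx) ? 1 : 0;
}

/* Demo probe (assumption): any read delay in [8, 24) passes. */
static int demo_probe(uint16_t write_delay, uint16_t read_delay, void *ctx)
{
    (void)write_delay;
    (void)ctx;
    return read_delay >= 8 && read_delay < 24;
}
```

Each row of the grid then corresponds to one write data delay, and the passing region is searched for within the rows.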
|
2024-08-25 18:51:20 +02:00
|
|
|
|
|
2024-08-26 19:19:02 +02:00
|
|
|
|
During each iteration, the results are recorded in the
|
|
|
|
|
\path{dqs_results_array}, which tracks whether the combination
|
|
|
|
|
of write and read delays was successful (pass) or not (fail).
|
|
|
|
|
The results are stored for both the primary rank and, if
|
|
|
|
|
applicable, the secondary rank when dual rank DIMMs are used.
|
|
|
|
|
\\
|
|
|
|
|
|
|
|
|
|
After iterating over all possible delay values, the function
|
2024-08-27 16:03:22 +02:00
|
|
|
|
processes the results to determine the best DQS delay settings
|
|
|
|
|
(lst. \ref{lst:dqs_train_results}). \\
|
2024-08-26 19:19:02 +02:00
|
|
|
|
|
|
|
|
|
This is done by:
|
|
|
|
|
|
|
|
|
|
\begin{itemize}
|
2024-08-27 13:27:07 +02:00
|
|
|
|
\item Finding the longest consecutive string of passing
|
|
|
|
|
values for both read and write operations.
|
|
|
|
|
\item Calculating the center of the passing region and using
|
|
|
|
|
this as the optimal delay setting.
|
|
|
|
|
\item If the center of the region is below a threshold,
|
|
|
|
|
issuing a warning that a negative DQS recovery delay
|
|
|
|
|
was detected, which could lead to instability.
|
2024-08-26 19:19:02 +02:00
|
|
|
|
\end{itemize}
|
|
|
|
|
|
|
|
|
|
\begin{listing}
|
|
|
|
|
\begin{adjustwidth}{0.5cm}{0.5cm}
|
|
|
|
|
\begin{minted}[linenos]{c}
|
|
|
|
|
if (best_count > 2) {
|
|
|
|
|
uint16_t region_center = (best_pos + (best_count / 2));
|
|
|
|
|
if (region_center < 16) {
|
2024-08-27 16:03:22 +02:00
|
|
|
|
printk(BIOS_WARNING,
|
|
|
|
|
"TrainDQSRdWrPos: negative DQS recovery delay detected!");
|
2024-08-26 19:19:02 +02:00
|
|
|
|
region_center = 0;
|
|
|
|
|
} else {
|
|
|
|
|
region_center -= 16;
|
2024-08-25 18:51:20 +02:00
|
|
|
|
}
|
2024-08-26 19:19:02 +02:00
|
|
|
|
...
|
|
|
|
|
current_read_dqs_delay[lane] = region_center;
|
|
|
|
|
passing_dqs_delay_found[lane] = 1;
|
2024-08-27 16:03:22 +02:00
|
|
|
|
write_dqs_read_data_timing_registers(
|
|
|
|
|
current_read_dqs_delay, dev, dct, dimm, index_reg);
|
2024-08-25 18:51:20 +02:00
|
|
|
|
}
|
2024-08-26 19:19:02 +02:00
|
|
|
|
\end{minted}
|
|
|
|
|
\end{adjustwidth}
|
2024-08-27 13:27:07 +02:00
|
|
|
|
\caption{Processing the results to determine the best DQS delay settings,
|
|
|
|
|
extract from
|
|
|
|
|
\protect\path{TrainDQSRdWrPos_D_Fam15} function in
|
|
|
|
|
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mctdqs_d.c}}
|
2024-08-26 19:19:02 +02:00
|
|
|
|
\label{lst:dqs_train_results}
|
|
|
|
|
\end{listing}
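
The extract above starts once \path{best_pos} and \path{best_count} are known. A minimal sketch of how the longest run of passing values and its center might be found is given below; the names \path{pass_region}, \path{longest_pass_run} and \path{region_center} are hypothetical, and the offset of 16 applied by the firmware afterwards is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct pass_region {
    uint16_t start;   /* index of the first passing value in the run */
    uint16_t length;  /* number of consecutive passing values */
};

/* Scan a row of pass (non-zero) / fail (zero) results and return the
 * longest consecutive passing run; the earliest run wins on ties. */
static struct pass_region longest_pass_run(const uint8_t *results, size_t n)
{
    struct pass_region best = { 0, 0 };
    uint16_t cur_start = 0, cur_len = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (results[i]) {
            if (cur_len == 0)
                cur_start = (uint16_t)i;
            cur_len++;
            if (cur_len > best.length) {
                best.start = cur_start;
                best.length = cur_len;
            }
        } else {
            cur_len = 0;
        }
    }
    return best;
}

/* Center of the passing region, computed as in the extract:
 * start position plus half the run length. */
static uint16_t region_center(struct pass_region r)
{
    return (uint16_t)(r.start + r.length / 2);
}
```

Choosing the center of the widest passing window leaves the largest margin on both sides, which is why a short window (two values or fewer in the extract) is not accepted.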
|
2024-08-25 18:51:20 +02:00
|
|
|
|
|
2024-08-26 19:19:02 +02:00
|
|
|
|
Finally, the function checks if any lane did not find a valid
|
2024-08-27 16:03:22 +02:00
|
|
|
|
passing region (lst. \ref{lst:dqs_train_finalize}).
|
|
|
|
|
If any lanes failed to find a passing DQS delay,
|
2024-08-26 19:19:02 +02:00
|
|
|
|
the \path{Errors} flag is set, and this error is propagated
|
|
|
|
|
through the \path{pDCTstat->TrainErrors} and
|
|
|
|
|
\path{pDCTstat->ErrStatus} variables.
|
|
|
|
|
\\
|
|
|
|
|
|
|
|
|
|
The function returns \texttt{1} if no errors were encountered
and \texttt{0} otherwise, the reverse of the common C
convention in which \texttt{0} signals success. \\
|
|
|
|
|
|
|
|
|
|
\begin{listing}
|
|
|
|
|
\begin{adjustwidth}{0.5cm}{0.5cm}
|
|
|
|
|
\begin{minted}[linenos]{c}
|
|
|
|
|
for (lane = lane_start; lane < lane_end; lane++) {
|
|
|
|
|
if (!passing_dqs_delay_found[lane]) {
|
|
|
|
|
Errors |= 1 << SB_NODQSPOS;
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
pDCTstat->TrainErrors |= Errors;
|
|
|
|
|
pDCTstat->ErrStatus |= Errors;
|
|
|
|
|
return !Errors;
|
2024-08-25 18:51:20 +02:00
|
|
|
|
\end{minted}
|
|
|
|
|
\end{adjustwidth}
|
2024-08-27 13:27:07 +02:00
|
|
|
|
\caption{Final error handling and return value,
|
|
|
|
|
extract from
|
|
|
|
|
\protect\path{TrainDQSRdWrPos_D_Fam15} function in
|
|
|
|
|
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mctdqs_d.c}}
|
2024-08-26 19:19:02 +02:00
|
|
|
|
\label{lst:dqs_train_finalize}
|
2024-08-25 18:51:20 +02:00
|
|
|
|
\end{listing}
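
To make the return convention concrete, the following self-contained sketch mirrors the extract above on a simplified structure. The structure \path{demo_dct_stat}, the function \path{finalize_dqs_training} and the bit index \path{DEMO_SB_NODQSPOS} are assumptions made for illustration; the real value of \path{SB_NODQSPOS} is not shown in this document.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_SB_NODQSPOS 3  /* hypothetical bit index, for illustration */

/* Simplified stand-in for the error fields of the DCT status structure. */
struct demo_dct_stat {
    uint32_t train_errors;
    uint32_t err_status;
};

/* Flag an error if any lane has no passing DQS delay, accumulate it
 * into both status fields, and return 1 on success, 0 on failure. */
static int finalize_dqs_training(struct demo_dct_stat *stat,
                                 const uint8_t *found, size_t lanes)
{
    uint32_t errors = 0;
    size_t lane;

    for (lane = 0; lane < lanes; lane++)
        if (!found[lane])
            errors |= 1u << DEMO_SB_NODQSPOS;

    stat->train_errors |= errors;
    stat->err_status |= errors;
    return !errors;
}
```

A caller therefore treats a zero return value as a failure, for example \texttt{if (!finalize\_dqs\_training(...))}, and can inspect the accumulated bits to find out which kind of error occurred.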
|
|
|
|
|
|
|
|
|
|
\subsubsection{Details on the DQS receiver training function}
|
|
|
|
|
|
2024-08-26 19:19:02 +02:00
|
|
|
|
In AMD Fam15h G34 processors, the DQS receiver enable training
|
|
|
|
|
is a critical step in ensuring that the memory subsystem operates
|
|
|
|
|
correctly and reliably. This training aligns the DQS signal with
|
|
|
|
|
the clock signal, ensuring proper data capture during memory reads.
|
|
|
|
|
\\
|
|
|
|
|
|
|
|
|
|
The DQS receiver enable training algorithm is executed twice:
|
|
|
|
|
first at the lowest supported MEMCLK frequency and then at the
|
|
|
|
|
highest supported MEMCLK frequency. The purpose of this training
|
|
|
|
|
is to fine-tune the timing parameters so that the memory
|
|
|
|
|
controller can reliably read data from the memory modules.
|
|
|
|
|
The algorithm is implemented in the function
\path{dqsTrainRcvrEn_SW_Fam15} from
\path{src/northbridge/amd/amdmct/mct_ddr3/mctsrc.c}, which
orchestrates the entire process. It is called by
\path{mct_TrainRcvrEn_D}, which is itself called by
\path{TrainReceiverEn_D} from
\path{src/northbridge/amd/amdmct/mct_ddr3/mctdqs_d.c}. \\
|
2024-08-26 19:19:02 +02:00
|
|
|
|
|
|
|
|
|
Here, seeds are initial delay values used to set
|
|
|
|
|
up the memory controller's timing parameters. These seeds are
|
|
|
|
|
generated based on the specific characteristics of the memory
|
|
|
|
|
configuration, such as the package type (e.g., G34, C32), the
|
|
|
|
|
type of DIMMs installed (Registered, Load Reduced, etc.), and
|
|
|
|
|
the maximum number of DIMMs that can be installed in a channel.
|
|
|
|
|
\\
|
|
|
|
|
|
|
|
|
|
The seed generation is handled by the function
|
|
|
|
|
\path{fam15_receiver_enable_training_seed}. This function
|
|
|
|
|
generates a base seed value for each memory channel, based on
|
|
|
|
|
predefined tables in the BKDG \cite{BKDG}. The base seed values
|
|
|
|
|
are specific to the memory configuration and are adjusted based
|
|
|
|
|
on the type of DIMM and the number of DIMMs in each channel. \\
|
2024-08-25 18:51:20 +02:00
|
|
|
|
|
2024-08-27 16:03:22 +02:00
|
|
|
|
The generated seed values are then adjusted
|
|
|
|
|
(lst. \ref{lst:seed_adjustment}) based on the
|
2024-08-26 19:19:02 +02:00
|
|
|
|
operating frequency of the memory (MEMCLK). The adjustment
|
|
|
|
|
scales the seed values to account for the difference between
|
|
|
|
|
the current memory frequency and the minimum supported
|
|
|
|
|
frequency. This ensures that the training can be accurately
|
|
|
|
|
performed across different operating conditions. \\
|
2024-08-22 19:18:34 +02:00
|
|
|
|
|
2024-08-26 19:19:02 +02:00
|
|
|
|
\begin{listing}
|
|
|
|
|
\begin{adjustwidth}{0.5cm}{0.5cm}
|
|
|
|
|
\begin{minted}[linenos]{c}
|
|
|
|
|
initial_seed = (uint16_t) (((((uint64_t) initial_seed) *
|
|
|
|
|
fam15h_freq_tab[mem_clk] * 100) / (min_mem_clk * 100)));
|
|
|
|
|
\end{minted}
|
|
|
|
|
\end{adjustwidth}
|
2024-08-27 13:27:07 +02:00
|
|
|
|
\caption{Adjusting the seed values based on the operating
|
|
|
|
|
frequency of the memory,
|
|
|
|
|
extract from
|
|
|
|
|
\protect\path{dqsTrainRcvrEn_SW_Fam15} function in
|
|
|
|
|
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mctsrc.c}}
|
2024-08-26 19:19:02 +02:00
|
|
|
|
\label{lst:seed_adjustment}
|
|
|
|
|
\end{listing}
|
|
|
|
|
|
|
|
|
|
Once the seeds are generated and adjusted, they are used to set
|
2024-08-27 16:03:22 +02:00
|
|
|
|
the initial delay values for the DQS receiver enable training
|
|
|
|
|
(lst. \ref{lst:initial_delay_values}).
|
2024-08-26 19:19:02 +02:00
|
|
|
|
The delay values are split into two components: gross delay and
|
|
|
|
|
fine delay. The gross delay determines the overall timing
|
|
|
|
|
offset, while the fine delay adjusts the timing with finer
|
|
|
|
|
granularity. \\
|
|
|
|
|
|
|
|
|
|
\begin{listing}
|
|
|
|
|
\begin{adjustwidth}{0.5cm}{0.5cm}
|
|
|
|
|
\begin{minted}[linenos]{c}
|
|
|
|
|
for (lane = 0; lane < lane_count; lane++) {
|
|
|
|
|
seed_gross[lane] = (seed[lane] >> 5) & 0x1f;
|
|
|
|
|
seed_fine[lane] = seed[lane] & 0x1f;
|
|
|
|
|
|
|
|
|
|
if (seed_gross[lane] & 0x1)
|
|
|
|
|
seed_pre_gross[lane] = 1;
|
|
|
|
|
else
|
|
|
|
|
seed_pre_gross[lane] = 2;
|
|
|
|
|
|
|
|
|
|
// Set the gross delay
|
|
|
|
|
current_total_delay[lane] = ((seed_gross[lane] & 0x1f) << 5);
|
|
|
|
|
}
|
|
|
|
|
\end{minted}
|
|
|
|
|
\end{adjustwidth}
|
2024-08-27 13:27:07 +02:00
|
|
|
|
\caption{Setting initial delay values based on the generated
|
|
|
|
|
seed values,
|
|
|
|
|
extract from
|
|
|
|
|
\protect\path{dqsTrainRcvrEn_SW_Fam15} function in
|
|
|
|
|
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mctsrc.c}}
|
2024-08-26 19:19:02 +02:00
|
|
|
|
\label{lst:initial_delay_values}
|
|
|
|
|
\end{listing}
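
The split between gross and fine delay can be illustrated in isolation
(lst. \ref{lst:sketch_gross_fine_split}). The structure and function
names below are hypothetical; only the bit layout (five bits of fine
delay in the low bits, five bits of gross delay above them) is taken
from the extract above. \\

\begin{listing}
\begin{adjustwidth}{0.5cm}{0.5cm}
\begin{minted}[linenos]{c}
#include <stdint.h>

/* Hypothetical container for the two delay components. */
struct rcvr_en_delay {
    uint8_t gross; /* bits [9:5] of the seed */
    uint8_t fine;  /* bits [4:0] of the seed */
};

/* Split a seed into its gross and fine components. */
static struct rcvr_en_delay split_seed(uint16_t seed)
{
    struct rcvr_en_delay d;

    d.gross = (seed >> 5) & 0x1f;
    d.fine = seed & 0x1f;
    return d;
}

/* Recombine both components into a single delay value. */
static uint16_t join_delay(struct rcvr_en_delay d)
{
    return (uint16_t)((d.gross << 5) | d.fine);
}
\end{minted}
\end{adjustwidth}
\caption{Illustrative sketch of the gross and fine delay split
(hypothetical helpers, not part of coreboot)}
\label{lst:sketch_gross_fine_split}
\end{listing}

For a seed of \texttt{0x86}, this gives a gross delay of 4 and a fine
delay of 6, and joining both components yields \texttt{0x86} again. \\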
|
|
|
|
|
|
|
|
|
|
These delay values are then written to the appropriate registers
|
|
|
|
|
to configure the memory controller for the DQS receiver enable
|
|
|
|
|
training. The training is performed in multiple steps,
|
|
|
|
|
iteratively refining the delay values until the DQS signal is
|
|
|
|
|
correctly aligned with the clock signal. \\
|
|
|
|
|
|
|
|
|
|
During the initialization phase, the memory controller is
|
|
|
|
|
prepared for training. This includes enabling the training mode,
|
|
|
|
|
configuring the memory channels, and disabling certain features
|
|
|
|
|
such as ECC (Error-Correcting Code) to prevent interference
|
2024-08-27 16:03:22 +02:00
|
|
|
|
during training (lst. \ref{lst:initialization_phase}). \\
|
2024-08-26 19:19:02 +02:00
|
|
|
|
|
|
|
|
|
\begin{listing}
|
|
|
|
|
\begin{adjustwidth}{0.5cm}{0.5cm}
|
|
|
|
|
\begin{minted}[linenos]{c}
|
|
|
|
|
fam15EnableTrainingMode(pMCTstat, pDCTstat, ch, 1);
|
|
|
|
|
_DisableDramECC = mct_DisableDimmEccEn_D(pMCTstat, pDCTstat);
|
|
|
|
|
\end{minted}
|
|
|
|
|
\end{adjustwidth}
|
2024-08-27 16:03:22 +02:00
|
|
|
|
\caption{Enabling training mode and disabling ECC,
|
2024-08-27 13:27:07 +02:00
|
|
|
|
extract from
|
|
|
|
|
\protect\path{dqsTrainRcvrEn_SW_Fam15} function in
|
|
|
|
|
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mctsrc.c}}
|
2024-08-26 19:19:02 +02:00
|
|
|
|
\label{lst:initialization_phase}
|
|
|
|
|
\end{listing}
|
|
|
|
|
|
|
|
|
|
The training phase is where the actual alignment of the DQS
|
|
|
|
|
signal occurs. The memory controller iterates over each DIMM and
|
2024-08-27 16:03:22 +02:00
|
|
|
|
each lane (lst. \ref{lst:training_phase}),
|
|
|
|
|
applying the seed values and adjusting the delay
|
2024-08-26 19:19:02 +02:00
|
|
|
|
registers accordingly. For each rank, the training is performed
once for the first nibble (lower 4 bits) and, if the DIMM is x4,
a second time for the second nibble (upper 4 bits). \\
|
|
|
|
|
|
|
|
|
|
\begin{listing}
|
|
|
|
|
\begin{adjustwidth}{0.5cm}{0.5cm}
|
|
|
|
|
\begin{minted}[linenos]{c}
|
|
|
|
|
for (rank = 0; rank < (_2Ranks + 1); rank++) {
|
|
|
|
|
for (nibble = 0; nibble < (train_both_nibbles + 1); nibble++) {
|
|
|
|
|
...
|
2024-08-27 16:03:22 +02:00
|
|
|
|
write_dqs_receiver_enable_control_registers(
|
|
|
|
|
current_total_delay, dev, Channel, dimm, index_reg);
|
2024-08-26 19:19:02 +02:00
|
|
|
|
...
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
\end{minted}
|
|
|
|
|
\end{adjustwidth}
|
2024-08-27 13:27:07 +02:00
|
|
|
|
\caption{Iterating over ranks and nibbles to apply delay values,
|
|
|
|
|
extract from
|
|
|
|
|
\protect\path{dqsTrainRcvrEn_SW_Fam15} function in
|
|
|
|
|
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mctsrc.c}}
|
2024-08-26 19:19:02 +02:00
|
|
|
|
\label{lst:training_phase}
|
|
|
|
|
\end{listing}
|
|
|
|
|
|
|
|
|
|
During the training, the controller issues read requests to the
|
|
|
|
|
memory to observe the timing of the DQS signal. The observed
|
|
|
|
|
delays are then averaged and adjusted to ensure the DQS signal
|
|
|
|
|
is correctly aligned across all lanes and ranks. \\
|
|
|
|
|
|
|
|
|
|
In the finalization phase, the memory controller exits the
|
2024-08-27 16:03:22 +02:00
|
|
|
|
training mode (lst. \ref{lst:finalization_phase}),
|
|
|
|
|
and the computed delay values are written back to
|
2024-08-26 19:19:02 +02:00
|
|
|
|
the appropriate registers. This ensures that the DQS signal
|
|
|
|
|
remains correctly aligned during normal operation. \\
|
|
|
|
|
|
|
|
|
|
\begin{listing}
|
|
|
|
|
\begin{adjustwidth}{0.5cm}{0.5cm}
|
|
|
|
|
\begin{minted}[linenos]{c}
|
|
|
|
|
Calc_SetMaxRdLatency_D_Fam15(pMCTstat, pDCTstat, 0, 0);
|
|
|
|
|
Calc_SetMaxRdLatency_D_Fam15(pMCTstat, pDCTstat, 1, 0);
|
|
|
|
|
if (Pass == FirstPass) {
|
|
|
|
|
mct_DisableDQSRcvEn_D(pDCTstat);
|
|
|
|
|
}
|
|
|
|
|
\end{minted}
|
|
|
|
|
\end{adjustwidth}
|
2024-08-27 13:27:07 +02:00
|
|
|
|
\caption{Exiting training mode and setting read latency,
|
|
|
|
|
extract from
|
|
|
|
|
\protect\path{dqsTrainRcvrEn_SW_Fam15} function in
|
|
|
|
|
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mctsrc.c}}
|
2024-08-26 19:19:02 +02:00
|
|
|
|
\label{lst:finalization_phase}
|
|
|
|
|
\end{listing}
|
|
|
|
|
|
|
|
|
|
\subsection{Potential enhancements}
|
|
|
|
|
|
|
|
|
|
\subsubsection{DQS receiver training}
|
|
|
|
|
|
|
|
|
|
While the DQS receiver enable training implementation for AMD
|
|
|
|
|
Fam15h G34 processors can perform its intended function in some
|
|
|
|
|
cases, there are several areas where the code is
|
|
|
|
|
incomplete, suboptimal, or potentially problematic. \\
|
|
|
|
|
|
|
|
|
|
The presence of \path{TODO} comments in the code indicates areas
|
|
|
|
|
where the implementation is either incomplete or lacks certain
|
|
|
|
|
necessary functionality. These unaddressed tasks can lead to
|
|
|
|
|
performance issues, potential bugs, or incomplete training,
|
|
|
|
|
which could compromise the stability and reliability of the
|
|
|
|
|
memory subsystem. \\
|
|
|
|
|
|
|
|
|
|
In the seed adjustment section for the second pass of training,
|
|
|
|
|
the code includes a \path{TODO} comment regarding fetching the
|
|
|
|
|
correct value from \path{RC2[0]} for the \path{addr_prelaunch}
|
2024-08-27 16:03:22 +02:00
|
|
|
|
variable (lst. \ref{lst:todo_rc2}).
|
2024-08-26 19:19:02 +02:00
|
|
|
|
|
|
|
|
|
\begin{listing}
|
|
|
|
|
\begin{adjustwidth}{0.5cm}{0.5cm}
|
|
|
|
|
\begin{minted}[linenos]{c}
|
|
|
|
|
uint8_t addr_prelaunch = 0; /* TODO: Fetch the correct value from RC2[0] */
|
|
|
|
|
\end{minted}
|
|
|
|
|
\end{adjustwidth}
|
2024-08-27 13:27:07 +02:00
|
|
|
|
\caption{An unimplemented feature in the seed adjustment logic,
|
|
|
|
|
extract from
|
|
|
|
|
\protect\path{dqsTrainRcvrEn_SW_Fam15} function in
|
|
|
|
|
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mctsrc.c}}
|
2024-08-26 19:19:02 +02:00
|
|
|
|
\label{lst:todo_rc2}
|
|
|
|
|
\end{listing}
|
|
|
|
|
|
|
|
|
|
This unimplemented feature suggests that the training process
|
|
|
|
|
may not be fully optimized, as the correct prelaunch address
|
|
|
|
|
setting is not being applied. This could result in incorrect
|
|
|
|
|
seed values being used during the training, leading to
|
2024-08-27 13:27:07 +02:00
|
|
|
|
suboptimal alignment of the DQS signal. In addition, the comment
does not explain what RC2[0] refers to. \\
|
2024-08-26 19:19:02 +02:00
|
|
|
|
|
|
|
|
|
The code contains another \path{TODO} comment indicating that
|
2024-08-27 16:03:22 +02:00
|
|
|
|
support for Load Reduced DIMMs (LRDIMMs) is unimplemented
|
|
|
|
|
(lst. \ref{lst:todo_lrdimm}).
|
2024-08-26 19:19:02 +02:00
|
|
|
|
|
|
|
|
|
\begin{listing}
|
|
|
|
|
\begin{adjustwidth}{0.5cm}{0.5cm}
|
|
|
|
|
\begin{minted}[linenos]{c}
|
|
|
|
|
else if ((pDCTstat->Status & (1 << SB_LoadReduced))) {
|
|
|
|
|
/* TODO
|
|
|
|
|
* Load reduced DIMM support unimplemented
|
|
|
|
|
*/
|
|
|
|
|
register_delay = 0x0;
|
|
|
|
|
}
|
|
|
|
|
\end{minted}
|
|
|
|
|
\end{adjustwidth}
|
2024-08-27 13:27:07 +02:00
|
|
|
|
\caption{LRDIMM support is
|
|
|
|
|
unimplemented,
|
|
|
|
|
extract from
|
|
|
|
|
\protect\path{dqsTrainRcvrEn_SW_Fam15} function in
|
|
|
|
|
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mctsrc.c}}
|
2024-08-26 19:19:02 +02:00
|
|
|
|
\label{lst:todo_lrdimm}
|
|
|
|
|
\end{listing}
|
|
|
|
|
|
|
|
|
|
This omission is significant because LRDIMMs are commonly used
|
|
|
|
|
in server environments where high memory capacity is required.
|
|
|
|
|
The lack of support for LRDIMMs could lead to incorrect training
|
|
|
|
|
or even failures when such DIMMs are installed, severely
|
|
|
|
|
impacting the reliability of the system. \\
|
|
|
|
|
|
|
|
|
|
\path{FIXME} comments in the code are often indicators of known
|
|
|
|
|
issues or temporary workarounds that need to be addressed. In
|
|
|
|
|
this implementation, there are several such comments that
|
|
|
|
|
highlight critical areas where the current approach may be
|
|
|
|
|
flawed or incomplete. \\
|
|
|
|
|
|
|
|
|
|
The first \path{FIXME} comment questions the usage of the
|
2024-08-27 16:03:22 +02:00
|
|
|
|
\path{SSEDIS} setting during the training process
|
|
|
|
|
(lst. \ref{lst:fixme_ssedis}).
|
2024-08-26 19:19:02 +02:00
|
|
|
|
|
|
|
|
|
\begin{listing}
|
|
|
|
|
\begin{adjustwidth}{0.5cm}{0.5cm}
|
|
|
|
|
\begin{minted}[linenos]{c}
|
2024-08-27 13:27:07 +02:00
|
|
|
|
msr = HWCR_MSR;
|
|
|
|
|
_RDMSR(msr, &lo, &hi);
|
|
|
|
|
/* FIXME: Why use SSEDIS */
|
2024-08-27 16:03:22 +02:00
|
|
|
|
if (lo & (1 << 17)) { /* save the old value */
|
|
|
|
|
_Wrap32Dis = 1;
|
2024-08-27 13:27:07 +02:00
|
|
|
|
}
|
2024-08-27 16:03:22 +02:00
|
|
|
|
lo |= (1 << 17); /* HWCR.wrap32dis */
|
|
|
|
|
lo &= ~(1 << 15); /* SSEDIS */
|
|
|
|
|
_WRMSR(msr, lo, hi); /* Setting wrap32dis allows 64-bit memory
|
|
|
|
|
* references in real mode */
|
2024-08-26 19:19:02 +02:00
|
|
|
|
\end{minted}
|
|
|
|
|
\end{adjustwidth}
|
2024-08-27 13:27:07 +02:00
|
|
|
|
\caption{Questioning the use of
|
|
|
|
|
\texttt{SSEDIS} in the MSR setting,
|
|
|
|
|
extract from
|
|
|
|
|
\protect\path{dqsTrainRcvrEn_SW_Fam15} function in
|
|
|
|
|
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mctsrc.c}}
|
2024-08-26 19:19:02 +02:00
|
|
|
|
\label{lst:fixme_ssedis}
|
|
|
|
|
\end{listing}
|
|
|
|
|
|
|
|
|
|
The concern here is that clearing the \path{SSEDIS}
(SSE Disable) bit, which enables SSE instructions, could have
unintended side effects, particularly in environments where SSE
instructions are expected to remain disabled. This could impact
the performance of
|
|
|
|
|
the system during training and potentially lead to instability.
|
|
|
|
|
\\
|
|
|
|
|
|
|
|
|
|
The code also highlights a potential misprint in the BKDG
|
2024-08-27 16:03:22 +02:00
|
|
|
|
regarding the programming of \path{DqsRcvEnFineDelay}
|
|
|
|
|
(lst. \ref{lst:fixme_misprint}).
|
2024-08-26 19:19:02 +02:00
|
|
|
|
|
|
|
|
|
\begin{listing}
|
|
|
|
|
\begin{adjustwidth}{0.5cm}{0.5cm}
|
|
|
|
|
\begin{minted}[linenos]{c}
|
|
|
|
|
/* NOTE: While the BKDG states to only program DqsRcvEnGrossDelay, this appears
|
|
|
|
|
* to have been a misprint as DqsRcvEnFineDelay should be set to zero as well.
|
|
|
|
|
*/
|
|
|
|
|
\end{minted}
|
|
|
|
|
\end{adjustwidth}
|
2024-08-27 13:27:07 +02:00
|
|
|
|
\caption{A possible misprint
|
|
|
|
|
in the BKDG regarding delay settings,
|
|
|
|
|
extract from
|
|
|
|
|
\protect\path{dqsTrainRcvrEn_SW_Fam15} function in
|
|
|
|
|
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mctsrc.c}}
|
2024-08-26 19:19:02 +02:00
|
|
|
|
\label{lst:fixme_misprint}
|
|
|
|
|
\end{listing}
|
|
|
|
|
|
|
|
|
|
This indicates that the implementation may be based on incorrect
|
|
|
|
|
or incomplete documentation, leading to potential errors in
|
|
|
|
|
setting the delay values. If this is indeed a misprint in the
|
|
|
|
|
BKDG, the correction should be verified with updated
|
|
|
|
|
documentation, and the implementation should be adjusted
|
|
|
|
|
accordingly. \\
|
|
|
|
|
|
|
|
|
|
In addition to the explicit \path{TODO} and \path{FIXME}
|
|
|
|
|
comments, there are other aspects of the implementation that
|
|
|
|
|
could impact performance and stability. \\
|
|
|
|
|
|
|
|
|
|
The logic for adjusting the seed values based on the memory
|
|
|
|
|
frequency and the platform's minimum supported frequency is
|
2024-08-27 16:03:22 +02:00
|
|
|
|
complex and prone to errors
|
|
|
|
|
(lst. \ref{lst:seed_adjustment_logic}),
|
|
|
|
|
especially when combined with the
|
|
|
|
|
incomplete \path{TODO} features.
|
|
|
|
|
|
|
|
|
|
\begin{listing}
|
|
|
|
|
\begin{adjustwidth}{0.5cm}{0.5cm}
|
|
|
|
|
\begin{minted}[linenos]{c}
|
|
|
|
|
initial_seed = (uint16_t) (((((uint64_t) initial_seed) *
|
|
|
|
|
fam15h_freq_tab[mem_clk] * 100) / (min_mem_clk * 100)));
|
|
|
|
|
\end{minted}
|
|
|
|
|
\end{adjustwidth}
|
|
|
|
|
\caption{Complex seed adjustment logic,
|
|
|
|
|
extract from
|
|
|
|
|
\protect\path{dqsTrainRcvrEn_SW_Fam15} function in
|
|
|
|
|
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mctsrc.c}}
|
|
|
|
|
\label{lst:seed_adjustment_logic}
|
|
|
|
|
\end{listing}
|
|
|
|
|
|
|
|
|
|
The risk here is that incorrect
|
2024-08-26 19:19:02 +02:00
|
|
|
|
seed values could be used, leading to timing mismatches during
|
2024-08-27 16:03:22 +02:00
|
|
|
|
the training process. \\
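
One way to make this computation easier to audit would be to isolate
it in a helper that drops the two factors of 100 (they cancel out in
integer division) and checks the result before narrowing it to 16
bits (lst. \ref{lst:sketch_seed_scaling_checked}). This is a sketch
under stated assumptions, not the coreboot implementation: the names
\path{scale_seed_checked} and \path{SEED_MAX} are hypothetical, and
the 10-bit limit is assumed from the 5-bit gross and 5-bit fine
fields shown in lst. \ref{lst:initial_delay_values}. \\

\begin{listing}
\begin{adjustwidth}{0.5cm}{0.5cm}
\begin{minted}[linenos]{c}
#include <stdint.h>

/* Assumed limit: 5-bit gross delay plus 5-bit fine delay. */
#define SEED_MAX 0x3ff

/* Returns 0 on success, -1 if an input or the result is out of range. */
static int scale_seed_checked(uint16_t seed, uint32_t mem_clk_mhz,
                              uint32_t min_mem_clk_mhz, uint16_t *out)
{
    uint64_t scaled;

    if (min_mem_clk_mhz == 0)
        return -1; /* avoid a division by zero */

    /* Same ratio as the original expression without the "* 100". */
    scaled = ((uint64_t)seed * mem_clk_mhz) / min_mem_clk_mhz;
    if (scaled > SEED_MAX)
        return -1; /* would not fit the delay fields */

    *out = (uint16_t)scaled;
    return 0;
}
\end{minted}
\end{adjustwidth}
\caption{Illustrative sketch of a range-checked seed scaling
(hypothetical helper, not part of coreboot)}
\label{lst:sketch_seed_scaling_checked}
\end{listing}

Reporting an error instead of silently truncating the value lets the
caller decide whether to fall back to the unscaled seed or to abort
the training. \\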
|
|
|
|
|
|
|
|
|
|
In addition, stock seeds from the BKDG are used
|
|
|
|
|
(lst. \ref{lst:dqs_receiver_training_seeds}).
|
|
|
|
|
However, it seems that the seeds used for DQS
training should be extensively determined for each motherboard,
and the BKDG \cite{BKDG} does not state otherwise. Moreover,
|
|
|
|
|
seeds can be configured uniquely for every possible socket,
|
|
|
|
|
channel, DIMM, and even byte lane combination. The current
implementation only uses the recommended seeds from
Table 99 of the BKDG \cite{BKDG}, which is not sufficient
and is not adapted to every DIMM on the market.
|
|
|
|
|
\\
|
|
|
|
|
|
|
|
|
|
\begin{listing}
|
|
|
|
|
\begin{adjustwidth}{0.5cm}{0.5cm}
|
|
|
|
|
\begin{minted}[linenos]{c}
|
|
|
|
|
if (pDCTstat->Status & (1 << SB_Registered)) {
|
|
|
|
|
if (package_type == PT_GR) {
|
|
|
|
|
/* Socket G34: Fam15h BKDG v3.14 Table 99 */
|
|
|
|
|
if (MaxDimmsInstallable == 1) {
|
|
|
|
|
if (channel == 0)
|
|
|
|
|
seed = 0x43;
|
|
|
|
|
else if (channel == 1)
|
|
|
|
|
seed = 0x3f;
|
|
|
|
|
else if (channel == 2)
|
|
|
|
|
seed = 0x3a;
|
|
|
|
|
else if (channel == 3)
|
|
|
|
|
seed = 0x35;
|
|
|
|
|
} else if (MaxDimmsInstallable == 2) {
|
|
|
|
|
if (channel == 0)
|
|
|
|
|
seed = 0x54;
|
|
|
|
|
else if (channel == 1)
|
|
|
|
|
seed = 0x4d;
|
|
|
|
|
else if (channel == 2)
|
|
|
|
|
seed = 0x45;
|
|
|
|
|
else if (channel == 3)
|
|
|
|
|
seed = 0x40;
|
|
|
|
|
} else if (MaxDimmsInstallable == 3) {
|
|
|
|
|
if (channel == 0)
|
|
|
|
|
seed = 0x6b;
|
|
|
|
|
else if (channel == 1)
|
|
|
|
|
seed = 0x5e;
|
|
|
|
|
else if (channel == 2)
|
|
|
|
|
seed = 0x4b;
|
|
|
|
|
else if (channel == 3)
|
|
|
|
|
seed = 0x3d;
|
|
|
|
|
}
|
|
|
|
|
\end{minted}
|
|
|
|
|
\end{adjustwidth}
|
2024-08-27 13:27:07 +02:00
|
|
|
|
\caption{Seeds used for DQS Receiver training,
|
|
|
|
|
extract from
|
|
|
|
|
\protect\path{fam15_receiver_enable_training_seed} function in
|
|
|
|
|
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mctsrc.c}}
|
2024-08-26 19:19:02 +02:00
|
|
|
|
\label{lst:dqs_receiver_training_seeds}
|
|
|
|
|
\end{listing}
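
A possible way to allow per-mainboard seeds would be to replace the
chain of conditions with a table lookup that first consults an
optional override table (lst. \ref{lst:sketch_seed_override}). The
sketch below is hypothetical: the function \path{lookup_seed}, the
override mechanism and the convention that a zero entry means ``no
override'' are assumptions made for illustration; only the stock
values are copied from the extract above. \\

\begin{listing}
\begin{adjustwidth}{0.5cm}{0.5cm}
\begin{minted}[linenos]{c}
#include <stddef.h>
#include <stdint.h>

#define MAX_DIMMS_PER_CHANNEL 3
#define CHANNELS 4

/* Stock seeds for registered DIMMs on Socket G34, copied from the
 * extract above; rows are indexed by MaxDimmsInstallable - 1. */
static const uint16_t stock_seed[MAX_DIMMS_PER_CHANNEL * CHANNELS] = {
    0x43, 0x3f, 0x3a, 0x35,
    0x54, 0x4d, 0x45, 0x40,
    0x6b, 0x5e, 0x4b, 0x3d,
};

/* Hypothetical lookup: "override" is an optional mainboard table
 * with the same layout, where a zero entry means "no override".
 * Returns 0 for an unsupported configuration. */
static uint16_t lookup_seed(const uint16_t *override,
                            unsigned int max_dimms, unsigned int channel)
{
    size_t idx;

    if (max_dimms < 1 || max_dimms > MAX_DIMMS_PER_CHANNEL ||
        channel >= CHANNELS)
        return 0;

    idx = (size_t)(max_dimms - 1) * CHANNELS + channel;
    if (override != NULL && override[idx] != 0)
        return override[idx];

    return stock_seed[idx];
}
\end{minted}
\end{adjustwidth}
\caption{Illustrative sketch of a seed lookup with mainboard
overrides (hypothetical, not part of coreboot)}
\label{lst:sketch_seed_override}
\end{listing}

With such a layout, a mainboard port would only need to provide the
entries that differ from the stock values, determined by measurement
on that board. \\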
|
|
|
|
|
|
|
|
|
|
The current implementation also has limited error handling and
|
|
|
|
|
reporting. While some errors are detected during training, the
|
|
|
|
|
code does not have robust mechanisms for recovering from or
|
|
|
|
|
correcting these errors. \\
|
|
|
|
|
|
|
|
|
|
This approach might lead to further complications in high-load
|
|
|
|
|
scenarios or when the memory configuration changes, as the
|
|
|
|
|
underlying issues are not resolved. \\
|
|
|
|
|
|
|
|
|
|
\subsubsection{Write leveling}
|
|
|
|
|
|
|
|
|
|
While the current implementation of write leveling on AMD Fam15h
|
|
|
|
|
G34 processors with RDIMMs can be functional in some cases and
|
|
|
|
|
provides the necessary steps to align DQS signals correctly
|
|
|
|
|
during write operations, there are several areas where the
|
|
|
|
|
implementation is incomplete, relies on temporary
|
|
|
|
|
workarounds, or may introduce stability and performance issues.
|
|
|
|
|
\\
|
|
|
|
|
|
|
|
|
|
One of the most significant concerns with the current
|
|
|
|
|
implementation is the presence of unresolved \path{TODO} and
|
|
|
|
|
\path{FIXME} comments throughout the code. These comments
|
|
|
|
|
indicate areas where the implementation is either incomplete or
|
|
|
|
|
has known issues that have not been fully resolved. \\
|
|
|
|
|
|
|
|
|
|
In the \path{procConfig} function, a \path{TODO} comment
mentions that the current implementation may not be using
the correct or final value for the \path{AddrCmdPrelaunch}
variable, once again because the value from RC2[0] is not
fetched, potentially
|
2024-08-26 19:19:02 +02:00
|
|
|
|
leading to inaccuracies in the seed values used during write
|
2024-08-27 16:03:22 +02:00
|
|
|
|
leveling (lst. \ref{lst:todo_seed_generation}).
|
|
|
|
|
This inaccuracy can result in timing mismatches, which
|
2024-08-26 19:19:02 +02:00
|
|
|
|
may cause data corruption or other stability issues. \\
|
|
|
|
|
|
|
|
|
|
\begin{listing}
|
|
|
|
|
\begin{adjustwidth}{0.5cm}{0.5cm}
|
|
|
|
|
\begin{minted}[linenos]{c}
|
|
|
|
|
uint8_t AddrCmdPrelaunch = 0; /* TODO: Fetch the correct value from RC2[0] */
|
|
|
|
|
\end{minted}
|
|
|
|
|
\end{adjustwidth}
|
2024-08-27 13:27:07 +02:00
|
|
|
|
\caption{Incomplete seed generation
|
|
|
|
|
implementation,
|
|
|
|
|
extract from
|
|
|
|
|
\protect\path{procConfig} function in
|
|
|
|
|
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mhwlc_d.c}}
|
2024-08-26 19:19:02 +02:00
|
|
|
|
\label{lst:todo_seed_generation}
|
|
|
|
|
\end{listing}
|
|
|
|
|
|
2024-08-27 13:27:07 +02:00
|
|
|
|
A \path{FIXME} comment in the code indicates that the
|
|
|
|
|
\path{WrDqDqsEarly} parameter, which is critical for fine-tuning
|
|
|
|
|
the DQS signal’s timing during write operations, is being
|
2024-08-27 16:03:22 +02:00
|
|
|
|
ignored due to unresolved issues
|
|
|
|
|
(lst. \ref{lst:fixme_wrdqdqs_early}). This omission can result in
|
2024-08-27 13:27:07 +02:00
|
|
|
|
less accurate timing adjustments, leading to potential marginal
|
|
|
|
|
instability in systems where tight timing margins are critical.
|
2024-08-26 19:19:02 +02:00
|
|
|
|
\\
|
|
|
|
|
|
|
|
|
|
\begin{listing}
|
|
|
|
|
\begin{adjustwidth}{0.5cm}{0.5cm}
|
|
|
|
|
\begin{minted}[linenos]{c}
|
2024-08-27 13:27:07 +02:00
|
|
|
|
/* FIXME: Ignore WrDqDqsEarly for now to work around training issues */
|
2024-08-26 19:19:02 +02:00
|
|
|
|
\end{minted}
|
|
|
|
|
\end{adjustwidth}
|
2024-08-27 13:27:07 +02:00
|
|
|
|
\caption{Omission of
|
|
|
|
|
\texttt{WrDqDqsEarly} parameter,
|
|
|
|
|
extract from
|
|
|
|
|
\protect\path{procConfig} function in
|
|
|
|
|
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mhwlc_d.c}}
|
|
|
|
|
\label{lst:fixme_wrdqdqs_early}
|
|
|
|
|
\end{listing}
|
|
|
|
|
|
|
|
|
|
The current implementation uses generic or ``stock'' seed values
|
2024-08-27 16:03:22 +02:00
|
|
|
|
for certain configurations, such as Socket G34
|
|
|
|
|
(lst. \ref{lst:fixme_mainboard_specific_overrides}). Without
|
2024-08-27 13:27:07 +02:00
|
|
|
|
mainboard-specific overrides, the memory initialization process
|
|
|
|
|
might not be fully optimized for the particular motherboard in
|
|
|
|
|
use. This could result in suboptimal performance or stability
|
|
|
|
|
issues in specific environments, particularly in server
|
|
|
|
|
applications where memory performance is critical. \\
|
|
|
|
|
|
|
|
|
|
\begin{listing}
|
|
|
|
|
\begin{adjustwidth}{0.5cm}{0.5cm}
|
|
|
|
|
\begin{minted}[linenos]{c}
|
|
|
|
|
/* FIXME: Implement mainboard-specific seed and WrDqsGrossDly base overrides.
|
|
|
|
|
* 0x41 and 0x0 are the "stock" values */
|
|
|
|
|
\end{minted}
|
|
|
|
|
\end{adjustwidth}
|
2024-08-27 14:14:41 +02:00
|
|
|
|
\caption{Lack of
|
2024-08-27 13:27:07 +02:00
|
|
|
|
mainboard-specific seed overrides,
|
|
|
|
|
extract from
|
|
|
|
|
\protect\path{procConfig} function in
|
|
|
|
|
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mhwlc_d.c}}
|
|
|
|
|
\label{lst:fixme_mainboard_specific_overrides}
|
2024-08-26 19:19:02 +02:00
|
|
|
|
\end{listing}
|
|
|
|
|
|
2024-08-27 13:27:07 +02:00
|
|
|
|
In \path{AgesaHwWlPhase2}, there is a \path{FIXME} comment that
|
|
|
|
|
suggests that the Critical Gross Delay adjustment has been
|
2024-08-27 16:03:22 +02:00
|
|
|
|
temporarily disabled due to conflicts with RDIMM training
|
|
|
|
|
(lst. \ref{lst:fixme_cgd_adjustment}).
|
2024-08-27 13:27:07 +02:00
|
|
|
|
Disabling this adjustment can lead to less precise DQS alignment,
|
|
|
|
|
especially in complex memory configurations like those using
|
|
|
|
|
RDIMMs, potentially causing instability or degraded performance.
|
2024-08-26 19:19:02 +02:00
|
|
|
|
\\
|
|
|
|
|
|
|
|
|
|
\begin{listing}
|
|
|
|
|
\begin{adjustwidth}{0.5cm}{0.5cm}
|
|
|
|
|
\begin{minted}[linenos]{c}
|
2024-08-27 13:27:07 +02:00
|
|
|
|
/* FIXME: For now, disable CGD adjustment as it seems to interfere with
|
|
|
|
|
* registered DIMM training */
|
2024-08-26 19:19:02 +02:00
|
|
|
|
\end{minted}
|
|
|
|
|
\end{adjustwidth}
|
2024-08-27 13:27:07 +02:00
|
|
|
|
\caption{Disabled CGD adjustment due
|
|
|
|
|
to conflicts,
|
|
|
|
|
extract from
|
|
|
|
|
\protect\path{AgesaHwWlPhase2} function in
|
|
|
|
|
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mhwlc_d.c}}
|
|
|
|
|
\label{lst:fixme_cgd_adjustment}
|
2024-08-26 19:19:02 +02:00
|
|
|
|
\end{listing}
|
|
|
|
|
|
2024-08-27 16:03:22 +02:00
|
|
|
|
The function also bypasses (lst. \ref{lst:fixme_bypass_critical_adjustments})
|
|
|
|
|
certain critical adjustments if the memory speed is being tuned (e.g.,
|
2024-08-26 19:19:02 +02:00
|
|
|
|
during frequency stepping). This bypass is noted as a temporary
|
|
|
|
|
measure due to problems encountered during testing, where the
|
|
|
|
|
first pass values were found to cause issues with PHY training
|
|
|
|
|
on all Family 15h processors tested. This approach indicates a
|
|
|
|
|
lack of robustness in the implementation, particularly in
|
|
|
|
|
handling dynamic changes in memory frequency, which is essential
|
|
|
|
|
for server environments where performance tuning is common. \\
|
|
|
|
|
|
|
|
|
|
\begin{listing}
|
|
|
|
|
\begin{adjustwidth}{0.5cm}{0.5cm}
|
|
|
|
|
\begin{minted}[linenos]{c}
|
|
|
|
|
/* FIXME: Using the Pass 1 training values causes major phy training problems on
|
|
|
|
|
* all Family 15h processors I tested (Pass 1 values are randomly too high,
|
2024-08-27 16:03:22 +02:00
|
|
|
|
* and Pass 2 cannot lock). Figure out why this is and fix it, then remove
|
|
|
|
|
* the bypass code below... */
|
2024-08-26 19:19:02 +02:00
|
|
|
|
\end{minted}
|
|
|
|
|
\end{adjustwidth}
|
2024-08-27 13:27:07 +02:00
|
|
|
|
\caption{Bypass of critical
|
|
|
|
|
adjustments during speed tuning,
|
|
|
|
|
extract from
|
|
|
|
|
\protect\path{AgesaHwWlPhase2} function in
|
|
|
|
|
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mhwlc_d.c}}
|
2024-08-26 19:19:02 +02:00
|
|
|
|
\label{lst:fixme_bypass_critical_adjustments}
|
|
|
|
|
\end{listing}
|
|
|
|
|
|
|
|
|
|
The current implementation attempts to compensate for noise and
|
|
|
|
|
instability by overriding faulty values with seed values in
|
2024-08-27 16:03:22 +02:00
|
|
|
|
\path{AgesaHwWlPhase2} (lst. \ref{lst:reactive_error_handling}).
|
|
|
|
|
However, this approach is somewhat blunt
|
2024-08-26 19:19:02 +02:00
|
|
|
|
and reactive, addressing the symptoms rather than the underlying
|
|
|
|
|
causes of instability. This method does not ensure that noise or
|
|
|
|
|
instability is sufficiently mitigated, potentially leading to
|
|
|
|
|
marginal or sporadic failures during normal operation. \\
|
|
|
|
|
|
|
|
|
|
\begin{listing}
|
|
|
|
|
\begin{adjustwidth}{0.5cm}{0.5cm}
|
|
|
|
|
\begin{minted}[linenos]{c}
|
|
|
|
|
if (faulty_value_detected) {
|
2024-08-27 13:27:07 +02:00
|
|
|
|
pDCTData->WLGrossDelay[index+ByteLane] =
|
|
|
|
|
pDCTData->WLSeedGrossDelay[index+ByteLane];
|
|
|
|
|
pDCTData->WLFineDelay[index+ByteLane] =
|
|
|
|
|
pDCTData->WLSeedFineDelay[index+ByteLane];
|
2024-08-26 19:19:02 +02:00
|
|
|
|
status = 1;
|
|
|
|
|
}
|
|
|
|
|
\end{minted}
|
|
|
|
|
\end{adjustwidth}
|
2024-08-27 13:27:07 +02:00
|
|
|
|
\caption{Blunt error handling to compensate for noise and
|
|
|
|
|
instability,
|
|
|
|
|
extract from
|
|
|
|
|
\protect\path{AgesaHwWlPhase2} function in
|
|
|
|
|
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mhwlc_d.c}}
|
2024-08-26 19:19:02 +02:00
|
|
|
|
\label{lst:reactive_error_handling}
|
|
|
|
|
\end{listing}
|
|
|
|
|
|
|
|
|
|
The handling of x4 DIMMs, with separate training for each nibble,
|
|
|
|
|
introduces additional complexity. While necessary for these
|
|
|
|
|
configurations, the logic is fragmented, with several points
|
|
|
|
|
where the function branches based on whether the DIMM is x4.
|
|
|
|
|
This complexity increases the risk of bugs or missed conditions,
|
|
|
|
|
particularly if future changes or enhancements are made to the
|
|
|
|
|
code. The overcomplicated logic can also make the code more
|
|
|
|
|
difficult to maintain and extend. \\
|
|
|
|
|
|
2024-08-27 14:14:41 +02:00
|
|
|
|
\subsubsection{DQS position training}
|
2024-08-26 19:19:02 +02:00
|
|
|
|
|
|
|
|
|
While the DQS position training algorithm implemented in the
|
|
|
|
|
\path{TrainDQSRdWrPos_D_Fam15} function may work in some
|
|
|
|
|
cases to ensure optimal data strobe alignment, there are
|
|
|
|
|
several critical flaws and issues within the implementation
|
|
|
|
|
that could impact its effectiveness and reliability. \\
|
|
|
|
|
|
|
|
|
|
Throughout the function, there is an overreliance on hardcoded
|
|
|
|
|
constants and magic numbers, such as:
|
|
|
|
|
|
|
|
|
|
\begin{itemize}
|
2024-08-27 13:27:07 +02:00
|
|
|
|
\item The use of \texttt{0x20} to represent 1 UI (Unit
|
|
|
|
|
Interval) in multiple places.
|
|
|
|
|
\item The constant \texttt{16} used in the adjustment of
|
|
|
|
|
\texttt{region\_center} during the processing of results.
|
|
|
|
|
\item Magic numbers like \texttt{32} and \texttt{48} in the
|
|
|
|
|
array dimensions for \texttt{dqs\_results\_array}.
|
2024-08-26 19:19:02 +02:00
|
|
|
|
\end{itemize}
|
|
|
|
|
|
|
|
|
|
These values should be replaced with named constants or
|
|
|
|
|
variables that clearly indicate their purpose, improving code
|
|
|
|
|
readability and maintainability. Additionally, using
|
|
|
|
|
well-defined constants would allow easier adjustments if the
|
|
|
|
|
algorithm needs to be adapted for different hardware
|
|
|
|
|
configurations or future revisions of the architecture. \\
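
As an illustration, the literals listed above could be given names
along the following lines (lst. \ref{lst:sketch_named_constants}).
All identifiers are hypothetical, and the meaning attached to the two
array dimensions is an assumption; the clamping logic mirrors the
extract shown in lst. \ref{lst:dqs_train_negative_delay}. \\

\begin{listing}
\begin{adjustwidth}{0.5cm}{0.5cm}
\begin{minted}[linenos]{c}
#include <stdint.h>

/* Hypothetical names for the literals discussed above. */
#define DQS_STEPS_PER_UI       0x20 /* delay steps in one Unit Interval */
#define DQS_REGION_CENTER_BIAS 16   /* offset removed from region_center */
#define DQS_WRITE_DELAY_STEPS  32   /* assumed: first results dimension */
#define DQS_READ_DELAY_STEPS   48   /* assumed: second results dimension */

/* Remove the bias from a region center. When the result would be
 * negative, it is clamped to 0 and *clamped is set to 1. */
static uint16_t bias_region_center(uint16_t region_center, int *clamped)
{
    if (region_center < DQS_REGION_CENTER_BIAS) {
        *clamped = 1;
        return 0;
    }
    *clamped = 0;
    return (uint16_t)(region_center - DQS_REGION_CENTER_BIAS);
}
\end{minted}
\end{adjustwidth}
\caption{Illustrative sketch of named constants replacing magic
numbers (hypothetical names, not part of coreboot)}
\label{lst:sketch_named_constants}
\end{listing}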
|
|
|
|
|
|
|
|
|
|
The error handling within the function is rudimentary, with
|
|
|
|
|
errors being flagged primarily by setting bits in the
|
|
|
|
|
\texttt{Errors} variable. However, the function does not
|
|
|
|
|
provide detailed diagnostics or recovery strategies when an
|
|
|
|
|
error occurs. For example:
|
|
|
|
|
|
|
|
|
|
\begin{itemize}
|
|
|
|
|
\item If no passing DQS delay is found for a lane, the
|
|
|
|
|
function simply sets an error bit without attempting any
|
|
|
|
|
corrective actions or providing detailed information on
|
|
|
|
|
what went wrong.
|
|
|
|
|
\item The early abort mechanism based on the value read from
|
|
|
|
|
the \texttt{0x264} register does not offer a robust
|
|
|
|
|
fallback or retry mechanism, which could lead to
|
|
|
|
|
situations where minor, recoverable issues cause the
|
|
|
|
|
entire training process to fail.
|
|
|
|
|
\end{itemize}
|
|
|
|
|
|
|
|
|
|
Improving the error handling to include detailed diagnostics,
|
|
|
|
|
logging, and potentially corrective actions (such as retrying
|
|
|
|
|
the training with adjusted parameters) would make the function
|
|
|
|
|
more resilient and reliable. \\
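
A bounded retry with adjusted parameters could look like the
following sketch (lst. \ref{lst:sketch_training_retry}). Everything
in it is hypothetical: the callback type, the retry budget and the
choice of increasing the seed after each failure are assumptions
made for illustration, and \path{fake_train} is only a stand-in used
to exercise the helper. In coreboot, each failed attempt would
typically also be logged. \\

\begin{listing}
\begin{adjustwidth}{0.5cm}{0.5cm}
\begin{minted}[linenos]{c}
#include <stdint.h>

#define MAX_TRAINING_ATTEMPTS 3 /* assumed retry budget */

/* Hypothetical callback: trains one lane with the given seed and
 * returns 0 on success or a non-zero error code. */
typedef int (*train_lane_fn)(unsigned int lane, uint16_t seed);

/* Retry training with a larger seed after each failure. Returns
 * the number of attempts used, or -1 if all of them failed. */
static int train_lane_with_retry(train_lane_fn train, unsigned int lane,
                                 uint16_t seed, uint16_t seed_step)
{
    int attempt;

    for (attempt = 1; attempt <= MAX_TRAINING_ATTEMPTS; attempt++) {
        if (train(lane, seed) == 0)
            return attempt;
        seed = (uint16_t)(seed + seed_step); /* adjust for next try */
    }
    return -1;
}

/* Stand-in for real training: succeeds once the seed reaches 0x48. */
static int fake_train(unsigned int lane, uint16_t seed)
{
    (void)lane;
    return (seed >= 0x48) ? 0 : 1;
}
\end{minted}
\end{adjustwidth}
\caption{Illustrative sketch of a bounded training retry
(hypothetical, not part of coreboot)}
\label{lst:sketch_training_retry}
\end{listing}

Bounding the number of attempts keeps the boot time predictable,
while the return value tells the caller how marginal the lane was. \\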
|
|
|
|
|
|
|
|
|
|
The function contains several areas where the logic is more
|
|
|
|
|
complex than necessary, which can lead to difficulties in
|
|
|
|
|
understanding and maintaining the code. Examples include:
|
|
|
|
|
|
|
|
|
|
\begin{itemize}
|
|
|
|
|
\item The loops iterating over write and read
delays are deeply nested, making it challenging to
|
|
|
|
|
follow the flow of the code and understand the
|
|
|
|
|
interactions between different parts of the algorithm.
|
|
|
|
|
\item The use of multiple copies of delay settings (e.g.,
|
|
|
|
|
\texttt{current\_write\_data\_delay},
|
|
|
|
|
\texttt{initial\_write\_data\_timing}, and
|
|
|
|
|
\texttt{initial\_write\_dqs\_delay}) introduces
|
|
|
|
|
redundancy and increases the likelihood of errors
|
|
|
|
|
or inconsistencies.
|
|
|
|
|
\end{itemize}
|
|
|
|
|
|
|
|
|
|
Refactoring the code to simplify the logic, reduce redundancy,
|
|
|
|
|
and make the flow of operations clearer would improve both the
|
|
|
|
|
readability and reliability of the implementation. \\
|
|
|
|
|
|
|
|
|
|
The current implementation does not adequately handle edge cases
|
|
|
|
|
and boundary conditions, such as:
|
|
|
|
|
|
|
|
|
|
\begin{itemize}
|
|
|
|
|
\item The warning issued when a negative DQS recovery delay
|
|
|
|
|
is detected suggests that the function continues despite
|
|
|
|
|
recognizing a potentially critical issue, which could
|
2024-08-27 16:03:22 +02:00
|
|
|
|
lead to system instability
|
|
|
|
|
(lst. \ref{lst:dqs_train_negative_delay}).
|
2024-08-26 19:19:02 +02:00
|
|
|
|
\item The averaging of delay values for dual-rank DIMMs does
|
|
|
|
|
not account for the possibility of significant
|
|
|
|
|
discrepancies between the ranks, which could result in
|
|
|
|
|
suboptimal or unstable settings.
|
|
|
|
|
\item The function does not include comprehensive checks for
|
|
|
|
|
situations where the calculated delay settings might
|
|
|
|
|
exceed hardware limitations or cause timing violations.
|
|
|
|
|
\end{itemize}
|
|
|
|
|
|
2024-08-27 13:27:07 +02:00
|
|
|
|
|
|
|
|
|
\begin{listing}
|
|
|
|
|
\begin{adjustwidth}{0.5cm}{0.5cm}
|
|
|
|
|
\begin{minted}[linenos]{c}
|
|
|
|
|
if (best_count > 2) {
|
|
|
|
|
uint16_t region_center = (best_pos + (best_count / 2));
|
|
|
|
|
if (region_center < 16) {
|
2024-08-27 16:03:22 +02:00
|
|
|
|
printk(BIOS_WARNING,
|
|
|
|
|
"TrainDQSRdWrPos: negative DQS recovery delay detected!");
|
2024-08-27 13:27:07 +02:00
|
|
|
|
region_center = 0;
|
|
|
|
|
} else {
|
|
|
|
|
region_center -= 16;
|
|
|
|
|
}
|
|
|
|
|
...
|
|
|
|
|
current_read_dqs_delay[lane] = region_center;
|
|
|
|
|
passing_dqs_delay_found[lane] = 1;
|
|
|
|
|
write_dqs_read_data_timing_registers(current_read_dqs_delay, dev, dct, dimm, index_reg);
|
|
|
|
|
}
|
|
|
|
|
\end{minted}
|
|
|
|
|
\end{adjustwidth}
|
|
|
|
|
\caption{Allowing a negative DQS recovery delay measurement,
|
|
|
|
|
extract from
|
|
|
|
|
\protect\path{TrainDQSRdWrPos_D_Fam15} function in
|
|
|
|
|
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mctdqs_d.c}}
|
2024-08-27 16:03:22 +02:00
|
|
|
|
\label{lst:dqs_train_negative_delay}
|
2024-08-27 13:27:07 +02:00
|
|
|
|
\end{listing}
|
|
|
|
|
|
2024-08-26 19:19:02 +02:00
|
|
|
|
Improving the handling of edge cases and boundary conditions,
|
|
|
|
|
possibly by incorporating more robust validation checks and
|
|
|
|
|
conservative fallback mechanisms, would make the algorithm more
|
|
|
|
|
reliable in a wider range of scenarios. \\
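
As a hedged illustration of such a validation check, the sketch below
computes the same region center as the extract in
lst. \ref{lst:dqs_train_negative_delay}, but rejects a negative result
instead of clamping it to zero. The function
\path{compute_region_center} and both constants are hypothetical; the
value 16 and the minimum width simply mirror the extract.

\begin{listing}[H]
\begin{adjustwidth}{0.5cm}{0.5cm}
\begin{minted}[linenos]{c}
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DQS_CENTER_OFFSET 16 /* offset subtracted in the extract */
#define MIN_PASSING_WIDTH 3  /* mirrors the "best_count > 2" test */

/* Return true and store the region center only when the passing
 * window is wide enough and the resulting delay is not negative. */
bool compute_region_center(uint16_t best_pos, uint16_t best_count,
                           uint16_t *center)
{
    assert(center != NULL);

    if (best_count < MIN_PASSING_WIDTH)
        return false;

    uint32_t raw = (uint32_t)best_pos + best_count / 2;
    if (raw < DQS_CENTER_OFFSET)
        return false; /* would be negative: reject, do not clamp */

    *center = (uint16_t)(raw - DQS_CENTER_OFFSET);
    return true;
}
\end{minted}
\end{adjustwidth}
\caption{Sketch of a stricter region center computation (hypothetical
         helper, not taken from \textit{coreboot})}
\end{listing}

A caller could then treat a \texttt{false} result as a training failure
for that lane and apply a conservative fallback, rather than continuing
with a delay known to be questionable.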
|
|
|
|
|
|
|
|
|
|
The code contains several \texttt{TODO} and \texttt{FIXME}
|
|
|
|
|
comments that indicate incomplete or problematic parts of
|
|
|
|
|
the implementation:
|
|
|
|
|
|
|
|
|
|
\begin{itemize}
|
|
|
|
|
\item The comment \texttt{TODO: Fetch the correct value
|
|
|
|
|
from RC2[0]} suggests that critical configuration values
|
|
|
|
|
are not correctly initialized, which could compromise
|
|
|
|
|
the entire training process.
|
|
|
|
|
\item The \texttt{FIXME} comments related to early abort
|
|
|
|
|
checks and DQS recovery delay calculations indicate that
|
|
|
|
|
there are known issues with the current approach that
|
|
|
|
|
have not been resolved, potentially leading to incorrect
|
|
|
|
|
or unstable results.
|
|
|
|
|
\item The handling of antiphase results, particularly with
|
|
|
|
|
respect to checking for early aborts, is incomplete and
|
|
|
|
|
could lead to situations where incorrect results are
|
|
|
|
|
accepted without proper validation.
|
|
|
|
|
\end{itemize}
|
|
|
|
|
|
|
|
|
|
The current implementation's approach to iterating over every
|
|
|
|
|
possible combination of write and read delays is exhaustive but
|
|
|
|
|
may be inefficient. The function performs multiple reads and
|
|
|
|
|
writes to hardware registers for every iteration, which could
|
|
|
|
|
be time-consuming, especially on systems with a large number
|
|
|
|
|
of lanes or complex memory configurations. \\
|
|
|
|
|
|
|
|
|
|
Consideration should be given to optimizing the algorithm,
|
|
|
|
|
possibly by narrowing the search space based on prior knowledge
|
|
|
|
|
or implementing more efficient search techniques, to reduce
|
|
|
|
|
the time required for DQS position training without compromising
|
|
|
|
|
accuracy. \\
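
One possible way to narrow the search is a coarse-to-fine scan, sketched
below under explicit assumptions: the passing delays form a single
contiguous window, and that window is at least as wide as the coarse
step. All names are hypothetical, and \path{sim_test} is only a software
stand-in for the hardware read test.

\begin{listing}[H]
\begin{adjustwidth}{0.5cm}{0.5cm}
\begin{minted}[linenos]{c}
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback: true when reads succeed at this delay. */
typedef bool (*delay_test_fn)(unsigned int delay, void *ctx);

struct window {
    unsigned int first; /* first passing delay */
    unsigned int last;  /* last passing delay */
};

/* Simulated hardware: delays in [first, last] pass, others fail. */
bool sim_test(unsigned int delay, void *ctx)
{
    const struct window *w = ctx;
    return delay >= w->first && delay <= w->last;
}

bool find_window_coarse_to_fine(delay_test_fn test, void *ctx,
                                unsigned int max_delay,
                                unsigned int step, struct window *out)
{
    assert(test != NULL && out != NULL && step > 0);

    /* Coarse pass: probe every step-th delay until one passes. */
    unsigned int hit = 0;
    bool found = false;
    for (unsigned int d = 0; d <= max_delay; d += step) {
        if (test(d, ctx)) {
            hit = d;
            found = true;
            break;
        }
    }
    if (!found)
        return false;

    /* Fine pass: walk outwards from the hit to both window edges. */
    unsigned int first = hit;
    while (first > 0 && test(first - 1, ctx))
        first--;
    unsigned int last = hit;
    while (last < max_delay && test(last + 1, ctx))
        last++;

    out->first = first;
    out->last = last;
    return true;
}
\end{minted}
\end{adjustwidth}
\caption{Sketch of a coarse-to-fine window search (hypothetical, not
         taken from \textit{coreboot})}
\end{listing}

The trade-off is visible in the assumptions: a window narrower than the
coarse step can be missed entirely, so the step has to be chosen
conservatively or an exhaustive scan kept as a fallback.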
|
|
|
|
|
|
2024-08-27 14:14:41 +02:00
|
|
|
|
\subsubsection{On saving training values in NVRAM}
|
|
|
|
|
|
|
|
|
|
The function \path{mctAutoInitMCT_D} is responsible for
|
|
|
|
|
automatically initializing the memory controller training (MCT)
|
|
|
|
|
process, which involves configuring various memory parameters
|
|
|
|
|
and performing training routines to ensure stable and efficient
|
|
|
|
|
memory operation. However, the fact that
|
|
|
|
|
\path{mctAutoInitMCT_D} does not allow for the restoration of
|
|
|
|
|
training data from NVRAM (lst. \ref{lst:mctAutoInitMCT_D_fixme})
|
|
|
|
|
poses several significant problems. \\
|
|
|
|
|
|
|
|
|
|
Memory training is a time-consuming process that involves
|
|
|
|
|
multiple iterations of read/write operations, delay adjustments,
|
|
|
|
|
and calibration steps. By not restoring previously saved
|
|
|
|
|
training data from NVRAM, the system is forced to re-run the
|
|
|
|
|
full training sequence every time it boots up. This leads to
|
|
|
|
|
longer boot times, which can be particularly problematic in
|
|
|
|
|
environments where quick system restarts are critical, such
|
|
|
|
|
as in servers or embedded systems. \\
|
|
|
|
|
|
|
|
|
|
Each time memory training is performed, it puts additional
|
|
|
|
|
stress on the memory modules and the memory controller.
|
|
|
|
|
Repeatedly executing the training process at every boot can
|
|
|
|
|
contribute to the wear and tear of hardware components,
|
|
|
|
|
potentially reducing their lifespan. This issue is especially
|
|
|
|
|
concerning in systems that frequently power cycle or reboot. \\
|
|
|
|
|
|
|
|
|
|
Memory training is sensitive to various factors, such as
|
|
|
|
|
temperature, voltage, and load conditions. As a result, the
|
|
|
|
|
training results can vary slightly between different boot
|
|
|
|
|
cycles. Without the ability to restore previously validated
|
|
|
|
|
training data, there is a risk of inconsistency in memory
|
|
|
|
|
performance across reboots. This could lead to instability
|
|
|
|
|
or suboptimal memory operation, affecting the overall
|
|
|
|
|
performance of the system. \\
|
|
|
|
|
|
|
|
|
|
If the memory training process fails during boot, the system
|
|
|
|
|
may be unable to operate properly or may fail to boot entirely.
|
|
|
|
|
By restoring validated training data from NVRAM, the system
|
|
|
|
|
can bypass the training process altogether, reducing the risk
|
|
|
|
|
of boot failures caused by training issues. Without this
|
|
|
|
|
feature, any minor issue that affects training could result
|
|
|
|
|
in system downtime. \\
|
|
|
|
|
|
|
|
|
|
Finally, modern memory controllers often include power-saving
|
|
|
|
|
features that are fine-tuned during the training process. By
|
|
|
|
|
reusing validated training data from NVRAM, the system can
|
|
|
|
|
quickly return to an optimized state with lower power
|
|
|
|
|
consumption.
|
|
|
|
|
The inability to restore this data forces the system to
|
|
|
|
|
operate at a potentially less efficient state until training
|
|
|
|
|
is complete, leading to higher power consumption during the
|
|
|
|
|
boot process. \\
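
To make the idea concrete, the sketch below shows one way a saved
training record could be validated before reuse. It is an assumption
for illustration only: the structure layout, the lane count, the
checksum and every name are hypothetical and do not describe the
existing \textit{coreboot} code. The key point is that restored data is
only trusted if it is intact and was produced for the same DIMM
configuration.

\begin{listing}[H]
\begin{adjustwidth}{0.5cm}{0.5cm}
\begin{minted}[linenos]{c}
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TRAINING_CACHE_MAGIC 0x4d435444u /* arbitrary tag */
#define TRAINING_LANES 9                 /* assumed: 8 data + ECC */

/* Hypothetical NVRAM record holding the result of one training run. */
struct training_cache {
    uint32_t magic;
    uint32_t dimm_config_hash; /* e.g. a hash of the SPD contents */
    uint16_t read_dqs_delay[TRAINING_LANES];
    uint16_t write_dqs_delay[TRAINING_LANES];
    uint32_t checksum;
};

/* Illustrative rotate-and-xor checksum; a real design would
 * rather use a CRC. */
static uint32_t training_cache_checksum(const struct training_cache *c)
{
    uint32_t sum = c->magic ^ c->dimm_config_hash;
    for (size_t i = 0; i < TRAINING_LANES; i++) {
        sum = ((sum << 1) | (sum >> 31)) ^ c->read_dqs_delay[i];
        sum = ((sum << 1) | (sum >> 31)) ^ c->write_dqs_delay[i];
    }
    return sum;
}

/* Finalize a record after a successful training run. */
void training_cache_seal(struct training_cache *c,
                         uint32_t dimm_config_hash)
{
    assert(c != NULL);
    c->magic = TRAINING_CACHE_MAGIC;
    c->dimm_config_hash = dimm_config_hash;
    c->checksum = training_cache_checksum(c);
}

/* A record is reused only if it is intact and matches the DIMMs
 * currently installed; otherwise full training must run again. */
bool training_cache_usable(const struct training_cache *c,
                           uint32_t dimm_config_hash)
{
    assert(c != NULL);
    return c->magic == TRAINING_CACHE_MAGIC
        && c->dimm_config_hash == dimm_config_hash
        && c->checksum == training_cache_checksum(c);
}
\end{minted}
\end{adjustwidth}
\caption{Sketch of a validated training record for NVRAM storage
         (hypothetical, not taken from \textit{coreboot})}
\end{listing}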
|
|
|
|
|
|
|
|
|
|
\subsubsection{A seedless DQS position training algorithm}
|
|
|
|
|
|
|
|
|
|
An algorithm that finds the best DQS timing, so that the memory
controller can reliably read data from the memory, could be
designed without relying on any pre-known starting values
(seeds). This would improve reliability and broaden support for
different configurations. The algorithm could be described as
follows. \\
|
2024-08-26 19:19:02 +02:00
|
|
|
|
|
2024-08-27 14:14:41 +02:00
|
|
|
|
\begin{itemize}
|
|
|
|
|
\item Prepare Memory Controller:
|
|
|
|
|
The memory controller needs to be in a state where it can
|
|
|
|
|
safely adjust the DQS timing without affecting the normal
|
|
|
|
|
operation of the system. By blocking the DQS signal locking,
|
|
|
|
|
we ensure that the adjustments made during training do not
|
|
|
|
|
interfere with the controller’s ability to capture data
|
|
|
|
|
until the optimal settings are found.
|
|
|
|
|
|
|
|
|
|
\item Initialize Variables:
|
|
|
|
|
Set up variables to store the various timing settings and
|
|
|
|
|
test results for each bytelane. This setup is crucial
|
|
|
|
|
because each bytelane might require a different optimal
|
|
|
|
|
timing, and keeping track of these values ensures that the
|
|
|
|
|
algorithm can correctly determine the best delay settings
|
|
|
|
|
later.
|
|
|
|
|
\end{itemize}
|
|
|
|
|
|
|
|
|
|
The main loop is the core of the algorithm, where different
|
|
|
|
|
timing settings are systematically explored. By looping
|
|
|
|
|
through possible delay settings, the algorithm ensures
|
|
|
|
|
that it doesn't miss any potential optimal timings. The
|
|
|
|
|
loop structure allows a methodical test of a range of
|
|
|
|
|
delays to find the most reliable one. \\
|
|
|
|
|
|
|
|
|
|
Here, the gross delay is the coarse adjustment to the timing
|
|
|
|
|
of the DQS signal. It shifts the timing window by a large
|
|
|
|
|
amount, helping to broadly align the DQS with the data
|
|
|
|
|
lines (DQ). The fine delay, which is the smaller, more
|
|
|
|
|
precise change to the timing of the DQS signal once the
|
|
|
|
|
coarse alignment (through gross delay) has been achieved,
|
|
|
|
|
would then be computed. \\
|
|
|
|
|
|
|
|
|
|
The steps to compute a delay would be as follows:
|
2024-08-26 19:19:02 +02:00
|
|
|
|
|
2024-08-27 14:14:41 +02:00
|
|
|
|
\begin{itemize}
|
|
|
|
|
\item Set a delay:
|
|
|
|
|
Setting an initial delay allows the algorithm to start
|
|
|
|
|
testing. The initial delay might be zero or another default
|
|
|
|
|
value, providing a baseline from which to begin the search
|
|
|
|
|
for the optimal timing.
|
|
|
|
|
|
|
|
|
|
\item Test it:
|
|
|
|
|
After setting the delay, it is essential to test whether the
|
|
|
|
|
memory controller can read data correctly. This step is
|
|
|
|
|
critical because it indicates whether the current delay
|
|
|
|
|
setting is within the acceptable range for reliable data
|
|
|
|
|
capture.
|
|
|
|
|
|
|
|
|
|
\item Check the result:
|
|
|
|
|
If the memory controller successfully reads data, it means
|
|
|
|
|
the current delay setting is valid. This information is
|
|
|
|
|
crucial because it helps define the range of acceptable
|
|
|
|
|
timings. If the test fails, it indicates that the current
delay setting is outside the range where the memory
|
|
|
|
|
controller can reliably capture data.
|
|
|
|
|
|
|
|
|
|
\item Increase/decrease delay:
|
|
|
|
|
By incrementally adjusting the delay, either increasing or
|
|
|
|
|
decreasing, the algorithm can explore different timing
|
|
|
|
|
settings in a controlled manner. This ensures that the
|
|
|
|
|
entire range of possible delays is covered without skipping
|
|
|
|
|
over any potential good delays.
|
|
|
|
|
|
|
|
|
|
\item Test again:
|
|
|
|
|
Re-testing after each adjustment ensures that the exact
|
|
|
|
|
point where the DQS timing goes from acceptable (pass) to
|
|
|
|
|
unacceptable (fail) is caught. This step helps in
identifying the transition points, which bound the window of
valid DQS delays.
|
|
|
|
|
|
|
|
|
|
\item Look for a transition:
|
|
|
|
|
The transition from pass to fail is where the DQS timing
|
|
|
|
|
crosses the boundary of the valid timing window. This
|
|
|
|
|
transition is crucial because it marks the end of the
|
|
|
|
|
reliable range. The best timing usually lies well inside the
window bounded by this transition, where the margin is largest.
|
|
|
|
|
|
|
|
|
|
\item Record the best setting:
|
|
|
|
|
Storing the best delay setting for each bytelane ensures
|
|
|
|
|
that a reliable timing configuration is available when the
|
|
|
|
|
training is complete.
|
|
|
|
|
|
|
|
|
|
\item Confirm all bytelanes:
|
|
|
|
|
Before finalizing the settings, it is important to ensure
|
|
|
|
|
that the chosen delays work for all bytelanes. This step
|
|
|
|
|
serves as a final safeguard against errors, ensuring that
|
|
|
|
|
every part of the data bus is correctly aligned.
|
|
|
|
|
\end{itemize}
|
2024-08-26 19:19:02 +02:00
|
|
|
|
|
2024-08-27 14:14:41 +02:00
|
|
|
|
Each bytelane (8-bit segment of data) may require a
|
|
|
|
|
different optimal delay setting. By repeating the process
|
|
|
|
|
for all bytelanes, the algorithm ensures that the entire
|
|
|
|
|
data bus is correctly timed. Misalignment in even one
|
|
|
|
|
bytelane can lead to data errors, making it essential to
|
|
|
|
|
tune every bytelane individually. \\
|
|
|
|
|
|
|
|
|
|
Once the best settings are confirmed, they need to be
|
|
|
|
|
applied to the memory controller for use during normal
|
|
|
|
|
operation. This step locks in the most reliable timing
|
|
|
|
|
configuration found during the training process. \\
|
|
|
|
|
|
|
|
|
|
After the optimal settings are applied, it is necessary
|
|
|
|
|
to allow the DQS signal locking mechanism to resume. This
|
|
|
|
|
locks in the delay settings, ensuring stable operation going
|
|
|
|
|
forward. \\
|
|
|
|
|
|
|
|
|
|
Finally, the algorithm needs to indicate whether it was
|
|
|
|
|
successful in finding reliable timing settings for all
|
|
|
|
|
bytelanes. This feedback is crucial for determining whether
|
|
|
|
|
the memory system is correctly configured or if further
|
|
|
|
|
adjustments or troubleshooting are needed. \\
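
The procedure described in this subsection can be sketched as follows.
This is a minimal illustration under stated assumptions, not a drop-in
replacement for the existing code: the lane count, the number of gross
and fine steps, and all names are hypothetical, and
\path{sim_lane_test} stands in for the real write/read-back test on
hardware. The sketch places the delay at the center of the longest
passing run, following the \texttt{region\_center} computation of
lst. \ref{lst:dqs_train_negative_delay}; another placement policy could
be substituted.

\begin{listing}[H]
\begin{adjustwidth}{0.5cm}{0.5cm}
\begin{minted}[linenos]{c}
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_LANES   8  /* assumed number of bytelanes */
#define FINE_STEPS  32 /* assumed fine steps per gross step */
#define GROSS_STEPS 2  /* assumed number of gross steps */
#define MAX_DELAY   (GROSS_STEPS * FINE_STEPS)

/* Hypothetical callback: true when the lane reads back correctly.
 * gross = delay / FINE_STEPS, fine = delay % FINE_STEPS. */
typedef bool (*lane_test_fn)(unsigned int lane, unsigned int delay,
                             void *ctx);

/* Simulated hardware: per-lane passing window [first, last]. */
struct sim_lanes {
    unsigned int first[NUM_LANES];
    unsigned int last[NUM_LANES];
};

bool sim_lane_test(unsigned int lane, unsigned int delay, void *ctx)
{
    const struct sim_lanes *s = ctx;
    return delay >= s->first[lane] && delay <= s->last[lane];
}

/* Sweep every delay without any seed, remember the longest run of
 * passing delays and return its center. */
bool train_lane(lane_test_fn test, void *ctx, unsigned int lane,
                unsigned int *best_delay)
{
    unsigned int best_pos = 0, best_count = 0;
    unsigned int run_pos = 0, run_count = 0;

    assert(test != NULL && best_delay != NULL);
    for (unsigned int d = 0; d < MAX_DELAY; d++) {
        if (test(lane, d, ctx)) {
            if (run_count == 0)
                run_pos = d;
            run_count++;
            if (run_count > best_count) {
                best_count = run_count;
                best_pos = run_pos;
            }
        } else {
            run_count = 0; /* pass-to-fail transition */
        }
    }
    if (best_count == 0)
        return false;
    *best_delay = best_pos + best_count / 2;
    return true;
}

/* Train every bytelane, then confirm each chosen delay once more. */
bool train_all_lanes(lane_test_fn test, void *ctx,
                     unsigned int delays[NUM_LANES])
{
    bool ok = true;
    for (unsigned int lane = 0; lane < NUM_LANES; lane++) {
        if (!train_lane(test, ctx, lane, &delays[lane]) ||
            !test(lane, delays[lane], ctx))
            ok = false;
    }
    return ok;
}
\end{minted}
\end{adjustwidth}
\caption{Sketch of a seedless per-bytelane delay sweep (hypothetical,
         not taken from \textit{coreboot})}
\end{listing}

Blocking and resuming the DQS locking mechanism, as well as writing the
chosen delays to the controller, are hardware-specific and deliberately
left out of the sketch.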
|
2024-07-24 17:00:17 +02:00
|
|
|
|
|
2024-08-25 11:54:54 +02:00
|
|
|
|
% ------------------------------------------------------------------------------
|
|
|
|
|
% CHAPTER 5: Virtualization of the operating system through firmware abstraction
|
|
|
|
|
% ------------------------------------------------------------------------------
|
2024-08-22 19:40:05 +02:00
|
|
|
|
\chapter{Virtualization of the operating system through firmware abstraction}
|
|
|
|
|
|
|
|
|
|
In contemporary computing systems, the operating system (OS) no longer
|
|
|
|
|
interacts directly with hardware in the same way it did in earlier computing
|
|
|
|
|
architectures. Instead, the OS operates within a highly abstracted
|
|
|
|
|
environment, where critical functions are managed by various firmware
|
|
|
|
|
components such as ACPI, SMM, UEFI, Intel Management Engine (ME), and AMD
|
|
|
|
|
Platform Security Processor (PSP). This layered abstraction has led to the
|
|
|
|
|
argument that the OS is effectively running in a virtualized environment,
|
|
|
|
|
akin to a virtual machine (VM).
|
|
|
|
|
|
|
|
|
|
\section{ACPI and abstraction of hardware control}
|
|
|
|
|
|
|
|
|
|
The Advanced Configuration and Power Interface (ACPI) provides a
|
|
|
|
|
standardized method for the OS to manage hardware configuration and
|
|
|
|
|
power states, effectively abstracting the underlying hardware
|
|
|
|
|
complexities. ACPI abstracts hardware details, allowing the OS to
|
|
|
|
|
interact with hardware components without needing direct control over
|
|
|
|
|
them. This abstraction is similar to how a hypervisor abstracts physical
|
|
|
|
|
hardware for VMs, enabling a consistent interface regardless of the
|
|
|
|
|
underlying hardware specifics. \\
|
|
|
|
|
|
|
|
|
|
According to \textcite{bellosa2010}, the abstraction provided by ACPI
|
|
|
|
|
not only simplifies the OS's interaction with hardware but also limits
|
|
|
|
|
the OS's ability to fully control the hardware, which is instead managed
|
|
|
|
|
by ACPI-compliant firmware. This layer of abstraction contributes to the
|
|
|
|
|
virtualization-like environment in which the OS operates. \\
|
|
|
|
|
|
2024-08-25 11:54:54 +02:00
|
|
|
|
More importantly, the ACPI Component Architecture (ACPICA) is a critical
|
|
|
|
|
component integrated into the Linux kernel, serving as the foundation
|
|
|
|
|
for the system's ACPI implementation \cite{intel_acpi_programming_2023}.
|
|
|
|
|
ACPICA provides the core ACPI functionalities, such as hardware
|
|
|
|
|
configuration, power management, and thermal management, which are
|
|
|
|
|
essential for modern computing platforms. However, its integration into
|
|
|
|
|
the Linux kernel has brought significant complexity and code overhead,
|
|
|
|
|
making Linux heavily dependent on ACPICA for managing ACPI-related
|
|
|
|
|
tasks.
|
|
|
|
|
|
|
|
|
|
ACPICA is a large and complex project, with its codebase encompassing
|
|
|
|
|
a wide range of functionalities required to implement ACPI standards.
|
|
|
|
|
The integration of ACPICA into the Linux kernel significantly increases
|
|
|
|
|
the kernel's overall code size. An example of that can easily be
|
|
|
|
|
reproduced with a small experiment (lst. \ref{lst:acpica_in_linux}).
|
|
|
|
|
|
|
|
|
|
\begin{listing}[H]
|
|
|
|
|
\begin{adjustwidth}{0.5cm}{0.5cm}
|
|
|
|
|
\inputminted{sh}{listings/acpica_size.sh}
|
|
|
|
|
\end{adjustwidth}
|
2024-08-25 15:57:26 +02:00
|
|
|
|
\caption{How to estimate the impact of ACPICA in Linux}
|
2024-08-25 11:54:54 +02:00
|
|
|
|
\label{lst:acpica_in_linux}
|
|
|
|
|
\end{listing}
|
|
|
|
|
|
|
|
|
|
As of recent statistics, ACPICA comprises between 100,000 and 200,000
|
|
|
|
|
lines of code, making it one of the larger subsystems within the Linux
|
|
|
|
|
kernel. This size is indicative of the extensive range of features
|
|
|
|
|
and capabilities ACPICA must support, including but not limited to the
|
|
|
|
|
ACPI interpreter, AML (ACPI Machine Language) parser, and various
|
|
|
|
|
hardware-specific drivers. The ACPICA codebase is not monolithic; it is
|
|
|
|
|
highly modular and consists of various components, each responsible for
|
|
|
|
|
specific ACPI functions. For instance, ACPICA includes components for
|
|
|
|
|
managing ACPI tables, interpreting AML bytecode, handling events, and
|
|
|
|
|
interacting with hardware. This modularity, while beneficial for
|
|
|
|
|
isolating different functionalities, also contributes to the overall
|
|
|
|
|
complexity of the system. The separation of ACPICA into multiple modules
|
|
|
|
|
necessitates careful coordination and integration with the rest of the
|
|
|
|
|
Linux kernel, adding to the kernel's complexity. \\
|
|
|
|
|
|
|
|
|
|
ACPICA's integration into the Linux kernel is designed to maintain a
|
|
|
|
|
clear separation between the core ACPI functionalities and the kernel's
|
|
|
|
|
other subsystems \cite{intel_acpi_programming_2023}. This separation is
|
|
|
|
|
achieved through well-defined interfaces and abstraction layers,
|
|
|
|
|
allowing the Linux kernel to interact with ACPICA without being tightly
|
|
|
|
|
coupled to its internal implementation details. For example, ACPICA
|
|
|
|
|
provides an API that the Linux kernel can use to interact with ACPI
|
|
|
|
|
tables, execute ACPI methods, and manage power states. This API
|
|
|
|
|
abstracts the underlying complexity of the ACPI implementation, making
|
|
|
|
|
it easier for kernel developers to incorporate ACPI support without
|
|
|
|
|
delving into the intricacies of ACPICA's internals.
|
|
|
|
|
Moreover, ACPICA's role in interpreting AML bytecode, which is
|
|
|
|
|
essentially a form of low-level programming language embedded in ACPI
|
|
|
|
|
tables, adds a layer of abstraction. The Linux kernel relies on ACPICA
|
|
|
|
|
to execute AML methods and manage hardware resources according to the
|
|
|
|
|
ACPI specifications. This reliance further underscores the idea that
|
|
|
|
|
ACPI acts as a virtualizing environment, shielding the kernel from
|
|
|
|
|
the complexities of directly interfacing with hardware components.
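
To give a feeling for what managing ACPI tables involves at the lowest
level, the sketch below validates a table the way the ACPI
specification requires: every system description table starts with a
36-byte header (signature at offset 0, little-endian length at
offset 4), and the bytes of the whole table must sum to zero modulo
256. This is an independent illustration, not ACPICA code, and the
function names are hypothetical.

\begin{listing}[H]
\begin{adjustwidth}{0.5cm}{0.5cm}
\begin{minted}[linenos]{c}
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ACPI_HEADER_SIZE 36 /* size of the common table header */

/* Read the little-endian 32-bit table length at offset 4. */
uint32_t acpi_table_length(const uint8_t *table)
{
    return (uint32_t)table[4] | ((uint32_t)table[5] << 8)
         | ((uint32_t)table[6] << 16) | ((uint32_t)table[7] << 24);
}

/* Check signature, length and checksum of a table held in a
 * buffer of 'avail' bytes. */
bool acpi_table_valid(const uint8_t *table, size_t avail,
                      const char *signature)
{
    assert(table != NULL && signature != NULL);

    if (avail < ACPI_HEADER_SIZE)
        return false;
    if (memcmp(table, signature, 4) != 0)
        return false;

    uint32_t length = acpi_table_length(table);
    if (length < ACPI_HEADER_SIZE || length > avail)
        return false;

    /* All bytes of the table, checksum included, must sum to 0. */
    uint8_t sum = 0;
    for (uint32_t i = 0; i < length; i++)
        sum = (uint8_t)(sum + table[i]);
    return sum == 0;
}
\end{minted}
\end{adjustwidth}
\caption{Sketch of an ACPI table header and checksum validation
         (illustration, not ACPICA code)}
\end{listing}

On a Linux system, such a function could be tried on the raw tables
exposed under \path{/sys/firmware/acpi/tables}.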
|
|
|
|
|
|
2024-08-22 19:40:05 +02:00
|
|
|
|
\section{SMM as a hidden execution layer}
|
|
|
|
|
|
|
|
|
|
System Management Mode (SMM) is a special-purpose operating mode
|
|
|
|
|
provided by x86 processors, designed to handle system-wide functions
|
|
|
|
|
such as power management, thermal monitoring, and hardware control,
|
|
|
|
|
independent of the OS. SMM operates transparently to the OS, executing
|
|
|
|
|
code that the OS cannot detect or control, similar to how a hypervisor
|
|
|
|
|
controls the execution environment of VMs. \\
|
|
|
|
|
|
|
|
|
|
Research by \textcite{huang2009invisible} argues that SMM introduces a
|
|
|
|
|
hidden layer of execution that diminishes the OS's control over the
|
|
|
|
|
hardware, creating a virtualized environment where the OS is unaware of
|
|
|
|
|
and unable to influence certain system-level operations. This hidden
|
|
|
|
|
execution layer reinforces the idea that the OS runs in an environment
|
|
|
|
|
similar to a VM, with the firmware acting as a hypervisor. \\
|
|
|
|
|
|
|
|
|
|
\section{UEFI and persistence}
|
|
|
|
|
|
|
|
|
|
The Unified Extensible Firmware Interface (UEFI) has largely replaced
|
|
|
|
|
the traditional BIOS in modern systems, providing a sophisticated
|
|
|
|
|
environment that includes a kernel-like structure capable of running
|
|
|
|
|
drivers and applications independently of the OS. UEFI remains active
|
|
|
|
|
even after the OS has booted, continuing to manage certain hardware
|
|
|
|
|
functions, which abstracts these functions away from the OS. \\
|
|
|
|
|
|
|
|
|
|
\textcite{mcclean2017uefi} discusses how UEFI creates a persistent
|
|
|
|
|
execution environment that overlaps with the OS's operation, effectively
|
|
|
|
|
placing the OS in a position where it runs on top of another controlling
|
|
|
|
|
layer, much like a guest OS in a VM. This persistence and the ability of
|
|
|
|
|
UEFI to manage hardware resources independently further blur the lines
|
2024-08-25 11:54:54 +02:00
|
|
|
|
between traditional OS operation and virtualized environments.
|
|
|
|
|
Indeed, as we studied in a previous chapter, UEFI is designed as a
|
|
|
|
|
modular and extensible firmware interface that sits between the
|
|
|
|
|
computer's hardware and the operating system. Unlike the monolithic
|
|
|
|
|
BIOS, UEFI is composed of several layers and components, each
|
|
|
|
|
responsible for different aspects of the system's boot and runtime
|
|
|
|
|
processes. The core components of UEFI include the Pre-EFI
|
|
|
|
|
Initialization (PEI), Driver Execution Environment (DXE),
|
|
|
|
|
Boot Device Selection (BDS), and Runtime Services. Each of these
|
|
|
|
|
components plays a critical role in initializing the hardware,
|
|
|
|
|
managing drivers, selecting boot devices, and providing runtime
|
|
|
|
|
services to the OS. \\
|
|
|
|
|
|
|
|
|
|
The PEI (Pre-EFI Initialization) phase is responsible for initializing
|
|
|
|
|
the CPU, memory, and other essential hardware components. It ensures
|
|
|
|
|
that the system is in a stable state before handing control to the
|
|
|
|
|
DXE phase. In the DXE phase, the system loads and initializes various
|
|
|
|
|
drivers required for the OS to interact with the hardware. The DXE phase
|
|
|
|
|
also constructs the UEFI Boot Services, which provide the OS with
|
|
|
|
|
interfaces to the hardware during the boot process. The BDS (Boot Device
|
|
|
|
|
Selection) phase is responsible for selecting the device from which the
|
|
|
|
|
OS will boot. It interacts with the UEFI Boot Manager to determine the
|
|
|
|
|
correct boot path and load the OS. After the OS has booted, UEFI
|
|
|
|
|
provides Runtime Services that remain accessible to the OS. These
|
|
|
|
|
services include interfaces for managing system variables, time, and
|
|
|
|
|
hardware. UEFI also supports the execution of standalone applications,
|
|
|
|
|
which can be used for system diagnostics, firmware updates, or other
|
|
|
|
|
tasks. These applications operate independently of the OS, highlighting
|
|
|
|
|
UEFI's capabilities as a minimalistic OS. \\
|
|
|
|
|
|
|
|
|
|
UEFI abstracts the underlying hardware from the OS, providing a
|
|
|
|
|
standardized interface for the OS to interact with different hardware
|
|
|
|
|
components. This abstraction simplifies the development of OSes and
|
|
|
|
|
drivers, as they do not need to be tailored for specific hardware
|
|
|
|
|
configurations. UEFI's hardware abstraction is one of the key features
|
|
|
|
|
that enable it to act as a virtualizing environment for the OS
|
|
|
|
|
\cite{mcclean2017uefi}.
|
|
|
|
|
|
|
|
|
|
\subsection{Memory Management}
|
|
|
|
|
|
|
|
|
|
UEFI provides a detailed memory map to the OS during the boot process,
|
|
|
|
|
which includes information about available, reserved, and used memory
|
|
|
|
|
regions. The OS uses this memory map to manage its own memory allocation
|
|
|
|
|
and paging mechanisms. The overlap in memory management functions
|
|
|
|
|
highlights UEFI's role in preparing the system for OS operation.
|
|
|
|
|
This memory map includes all the memory regions in the system,
|
|
|
|
|
categorized into different types, such as usable memory, reserved
|
|
|
|
|
memory, and memory-mapped I/O. The OS relies on this map to understand
|
|
|
|
|
the system's memory layout and avoid conflicts \cite{osdev_uefi_memory}.
|
|
|
|
|
The OS extends UEFI's memory
|
|
|
|
|
management by implementing its own memory allocation, paging, and
|
|
|
|
|
virtual memory mechanisms. However, the OS's memory management is
|
|
|
|
|
built on the foundation provided by UEFI, demonstrating the close
|
|
|
|
|
relationship between the two.
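
The way an OS loader consumes this memory map can be sketched as
follows. The structure mirrors the fields of the
\texttt{EFI\_MEMORY\_DESCRIPTOR} defined by the UEFI specification, but
it is redeclared here with hypothetical names so that the example is
self-contained. Two details matter in practice: descriptors must be
walked using the descriptor size reported by the firmware, which may be
larger than the structure itself, and only the
\texttt{EfiConventionalMemory} type (value 7) is counted in this
simplified version.

\begin{listing}[H]
\begin{adjustwidth}{0.5cm}{0.5cm}
\begin{minted}[linenos]{c}
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EFI_PAGE_SIZE           4096u
#define EFI_CONVENTIONAL_MEMORY 7u /* EfiConventionalMemory */

/* Local mirror of the UEFI memory descriptor layout. */
struct efi_memory_descriptor {
    uint32_t type;
    uint64_t physical_start;
    uint64_t virtual_start;
    uint64_t number_of_pages;
    uint64_t attribute;
};

/* Sum the size of all conventional (free) memory regions.
 * 'desc_size' is the stride reported by the firmware. */
uint64_t efi_usable_bytes(const void *map, size_t map_size,
                          size_t desc_size)
{
    assert(map != NULL);
    assert(desc_size >= sizeof(struct efi_memory_descriptor));

    const uint8_t *p = map;
    uint64_t total = 0;
    for (size_t off = 0; off + desc_size <= map_size;
         off += desc_size) {
        struct efi_memory_descriptor d;
        memcpy(&d, p + off, sizeof(d)); /* avoids alignment issues */
        if (d.type == EFI_CONVENTIONAL_MEMORY)
            total += d.number_of_pages * EFI_PAGE_SIZE;
    }
    return total;
}
\end{minted}
\end{adjustwidth}
\caption{Sketch of a walk over the UEFI memory map (illustration with
         locally declared types)}
\end{listing}

A real kernel would go further, for instance by also reclaiming loader
and boot services regions once boot services have been exited.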
|
|
|
|
|
|
|
|
|
|
\subsection{File System Management}
|
|
|
|
|
|
|
|
|
|
UEFI includes its own file system management capabilities, which overlap
|
|
|
|
|
with those of the OS. The most notable example is the EFI System
|
|
|
|
|
Partition (ESP), a special partition formatted with the FAT file system
|
|
|
|
|
that UEFI uses to store bootloaders, drivers, and other critical files
|
|
|
|
|
\cite{uefi_spec}. The ESP is a mandatory partition in UEFI systems,
|
|
|
|
|
containing the bootloaders, firmware updates, and other files
|
|
|
|
|
necessary for system initialization. UEFI accesses the ESP
|
|
|
|
|
independently of the OS, but the OS can also access and manage files
|
|
|
|
|
on the ESP, creating an overlap in file system management functions
|
|
|
|
|
\cite{uefi_smm_security}. UEFI natively supports the FAT file
|
|
|
|
|
system, allowing it to read and write files on the ESP. This support
|
|
|
|
|
overlaps with the OS's file system management, as both UEFI and the
|
|
|
|
|
OS can manipulate files on the ESP.
|
|
|
|
|
|
|
|
|
|
\subsection{Device Drivers}
|
|
|
|
|
|
|
|
|
|
As we studied in an earlier chapter, UEFI includes its own driver
|
|
|
|
|
model, allowing it to load and execute drivers independently of the
|
|
|
|
|
OS. This capability overlaps with the OS's driver management
|
|
|
|
|
functions, as both UEFI and the OS manage hardware devices through
|
|
|
|
|
drivers.
|
|
|
|
|
UEFI drivers are typically used during
|
|
|
|
|
the boot process to initialize and control hardware devices. These
|
|
|
|
|
drivers provide the necessary interfaces for the OS to interact with
|
|
|
|
|
the hardware once it has booted \cite{uefi_smm_security}.
|
|
|
|
|
After the OS has booted, it loads its own drivers for hardware
|
|
|
|
|
devices. However, the OS often relies on the initial hardware setup
|
|
|
|
|
performed by UEFI drivers.
|
|
|
|
|
|
|
|
|
|
\subsection{Power Management}
|
|
|
|
|
|
|
|
|
|
UEFI provides power management services that overlap with the OS's
|
|
|
|
|
power management functions. These services allow UEFI to manage
|
|
|
|
|
power states and transitions independently of the OS \cite{uefi_spec}.
|
|
|
|
|
These services ensure that the system conserves power during periods
|
|
|
|
|
of inactivity and can quickly resume operation when needed.
|
|
|
|
|
The OS extends UEFI's power management by implementing its own
|
|
|
|
|
power-saving mechanisms, such as CPU throttling and dynamic voltage
|
|
|
|
|
scaling.
|
2024-08-22 19:40:05 +02:00
|
|
|
|
|
|
|
|
|
\section{Intel and AMD: control beyond the OS}
|
|
|
|
|
|
|
|
|
|
Intel Management Engine (ME) and AMD Platform Security Processor (PSP)
|
|
|
|
|
are embedded microcontrollers within Intel and AMD processors,
|
|
|
|
|
respectively. These components run their own firmware and operate
|
|
|
|
|
independently of the main CPU, handling tasks such as security
|
|
|
|
|
enforcement, remote management, and digital rights management (DRM). \\
|
|
|
|
|
|
|
|
|
|
\textcite{bulygin2013chipset} highlights how these microcontrollers have
|
|
|
|
|
control over the system that supersedes the OS, managing hardware and
|
|
|
|
|
security functions without the OS's knowledge or consent. This level of
|
|
|
|
|
control is reminiscent of a hypervisor that manages the resources and
|
|
|
|
|
security of VMs. The OS, in this context, operates similarly to a VM
|
|
|
|
|
that does not have full control over the hardware it ostensibly manages. \\
|
|
|
|
|
|
2024-08-27 16:03:22 +02:00
|
|
|
|
\section{Processor microcode}
|
|
|
|
|
|
|
|
|
|
Modern CPUs are incredibly complex, with their functionality relying
|
|
|
|
|
heavily on microcode to interpret and execute instructions. Microcode
|
|
|
|
|
acts as a translation layer between the high-level instructions that
|
|
|
|
|
software provides and the lower-level operations that the hardware
|
|
|
|
|
can execute. Microcode operates directly within the CPU. \\
|
|
|
|
|
|
|
|
|
|
CPU microcode is a set of low-level firmware instructions embedded
|
|
|
|
|
within the processor. It translates complex machine instructions into
|
|
|
|
|
simpler, executable sequences of operations that the CPU's hardware
|
|
|
|
|
can directly perform \cite{Intel2018}. This layer of abstraction allows
|
|
|
|
|
CPU manufacturers to update or patch the behavior of the processor
|
|
|
|
|
post-manufacturing, which is crucial for addressing bugs, optimizing
|
|
|
|
|
performance, and applying security patches \cite{Wilcox2018}.
|
|
|
|
|
|
|
|
|
|
In a sense, microcode can be seen as an argument for the CPU running
|
|
|
|
|
a form of low-level virtual machine. Just as a VM abstracts and manages
|
|
|
|
|
hardware resources for a guest OS, microcode abstracts and manages the
|
|
|
|
|
complexity of CPU hardware for machine-level instructions. This
|
|
|
|
|
virtualization enables the CPU to support a wide variety of instructions
|
|
|
|
|
and operational modes without needing to change the underlying hardware
|
|
|
|
|
\cite{Abraham1983}.
|
|
|
|
|
|
2024-08-22 19:40:05 +02:00
|
|
|
|
\section{The OS as a virtualized environment}
|
|
|
|
|
|
|
|
|
|
The combined effect of these firmware components (ACPI, SMM, UEFI,
|
|
|
|
|
Intel ME, and AMD PSP) creates an environment where the OS operates in
|
|
|
|
|
a virtualized or highly abstracted layer. The OS does not directly
|
|
|
|
|
manage the hardware; instead, it interfaces with these firmware
|
|
|
|
|
components, which themselves control the hardware resources. This
|
|
|
|
|
situation is analogous to a virtual machine, where the guest OS
|
|
|
|
|
operates on virtualized hardware managed by a hypervisor. \\
|
|
|
|
|
|
|
|
|
|
\textcite{smith2019firmware} argues that modern OS environments,
|
|
|
|
|
influenced by these firmware components, should be considered
|
|
|
|
|
virtualized environments. The firmware acts as an intermediary layer
|
|
|
|
|
that abstracts and controls hardware resources, thereby limiting the
|
|
|
|
|
OS's direct access and control. \\
|
|
|
|
|
|
|
|
|
|
The presence and operation of modern firmware components such as ACPI,
|
2024-08-27 16:03:22 +02:00
|
|
|
|
SMM, UEFI, Intel ME, AMD PSP, and even CPU microcode contribute to
|
|
|
|
|
a significant abstraction of hardware from the OS.
|
|
|
|
|
This abstraction creates an environment that
|
2024-08-22 19:40:05 +02:00
|
|
|
|
parallels the operation of a virtual machine, where the OS functions
|
|
|
|
|
within a controlled, virtualized layer managed by these firmware
|
|
|
|
|
systems. The growing body of research supports this perspective,
|
|
|
|
|
suggesting that the traditional notion of an OS directly managing
|
|
|
|
|
hardware is increasingly outdated in the face of these complex,
|
|
|
|
|
autonomous firmware components.
|
2024-07-24 17:00:17 +02:00
|
|
|
|
|
2024-08-22 19:54:35 +02:00
|
|
|
|
\chapter*{Conclusion}
|
\addcontentsline{toc}{chapter}{Conclusion}

This document has explored the evolution and current state of firmware,
focusing in particular on the transition from the traditional BIOS to more
advanced firmware such as UEFI and \textit{coreboot}. The path from a
simple set of routines stored in ROM to complex systems like UEFI and
\textit{coreboot} highlights the growing importance of firmware in modern
computing.

Firmware now plays a critical role not only in hardware initialization but
also in memory management, security, and system performance optimization. \\

The study of the ASUS KGPE-D16 mainboard illustrates how firmware,
particularly \textit{coreboot}, plays a crucial role in the efficient
and secure operation of high-performance systems. The KGPE-D16, with its
support for free-software-compatible firmware, exemplifies the potential
of libre firmware to deliver both high performance and freedom from
proprietary constraints. However, it is important to acknowledge that
the KGPE-D16 is not without its imperfections. The detailed analysis of
firmware components, such as the bootblock, romstage, and especially the
RAM initialization and training algorithms, reveals areas where the
firmware can be further refined to enhance system stability and
performance. These improvements are not only beneficial for the KGPE-D16
but can also be applied to other boards, extending the impact of these
optimizations across a broader range of hardware. \\

Moreover, the discussion of modern firmware components such as ACPI,
SMM, UEFI, Intel ME, and AMD PSP demonstrates how these elements
abstract hardware from the operating system, creating a virtualized
environment where the OS operates more like a guest in a
hypervisor-controlled system. This abstraction raises important
questions about control, security, and user freedom in contemporary
computing.
As firmware grows in complexity and influence, it becomes crucial to
advocate for free-software-compatible hardware. The dependence on
proprietary firmware and the associated restrictions on user freedom are
growing concerns that need to be addressed. The development and adoption
of libre firmware solutions, such as \textit{coreboot} and GNU Boot, are
essential steps towards ensuring that users retain control over their
hardware and software environments. \\

It is imperative that the community of developers, researchers, and
users come together to support and contribute to the development of
free firmware. By fostering innovation and collaboration in this field,
we can advance towards a future where free-software-compatible hardware
becomes the norm, ensuring that computing remains open, secure, and
under the control of its users. The significance of a libre BIOS cannot
be overstated: it is the foundation upon which a truly free and open
computing ecosystem can be built \cite{coreboot_fsf}.
The GNU Boot project is just as important. As a fully free firmware
initiative, GNU Boot represents a
critical step towards achieving truly libre BIOSes, ensuring that users
can maintain full control over their hardware and firmware environments.
The continued development and support of GNU Boot are essential for
advancing the goals of free software and protecting user freedoms in the
increasingly complex landscape of modern computing. \\
\newpage

% Bibliography
\nocite{*}
\printbibliography[heading=bibintoc]
\newpage
\chapter*{Appendix: Long code listings}
\addcontentsline{toc}{chapter}{Appendix: Long code listings}
\renewcommand{\thelisting}{L.\arabic{listing}}
\begin{listing}
\begin{adjustwidth}{0.5cm}{0.5cm}
\inputminted{c}{listings/src_northbridge_amd_amdmct_mct_ddr3_mct_d_1.c}
\end{adjustwidth}
\caption{Beginning of
\protect\path{mctAutoInitMCT_D()}, extract from
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mct_d.c}}
\label{lst:mctAutoInitMCT_D_1}
\end{listing}
\begin{listing}
\begin{adjustwidth}{0.5cm}{0.5cm}
\inputminted{c}{listings/src_northbridge_amd_amdmct_mct_ddr3_mct_d_2.c}
\end{adjustwidth}
\caption{DIMM initialization in
\protect\path{mctAutoInitMCT_D()}, extract from
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mct_d.c}}
\label{lst:mctAutoInitMCT_D_2}
\end{listing}
\begin{listing}
\begin{adjustwidth}{0.5cm}{0.5cm}
\inputminted{c}{listings/src_northbridge_amd_amdmct_mct_ddr3_mct_d_3.c}
\end{adjustwidth}
\caption{Voltage control in
\protect\path{mctAutoInitMCT_D()}, extract from
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mct_d.c}}
\label{lst:mctAutoInitMCT_D_3}
\end{listing}
\begin{listing}
\begin{adjustwidth}{0.5cm}{0.5cm}
\inputminted{c}{listings/src_northbridge_amd_amdmct_mct_ddr3_mct_d_fixme.c}
\end{adjustwidth}
\caption{\protect\path{mctAutoInitMCT_D()} does not allow restoring
previous training values, extract from
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mct_d.c}}
\label{lst:mctAutoInitMCT_D_fixme}
\end{listing}
\begin{listing}
\begin{adjustwidth}{0.5cm}{0.5cm}
\inputminted{c}{listings/src_northbridge_amd_amdmct_mct_ddr3_mct_d_4.c}
\end{adjustwidth}
\caption{Preparing SMBus, DCTs and NB in
\protect\path{mctAutoInitMCT_D()} from
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mct_d.c}}
\label{lst:mctAutoInitMCT_D_4}
\end{listing}
\begin{listing}
\begin{adjustwidth}{0.5cm}{0.5cm}
\inputminted{c}{listings/src_northbridge_amd_amdmct_mct_ddr3_mct_d_5.c}
\end{adjustwidth}
\caption{Get DQS, reset and activate ECC in
\protect\path{mctAutoInitMCT_D()} from
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mct_d.c}}
\label{lst:mctAutoInitMCT_D_5}
\end{listing}
\begin{listing}
\begin{adjustwidth}{0.5cm}{0.5cm}
\inputminted{c}{listings/src_northbridge_amd_amdmct_mct_ddr3_mct_d_6.c}
\end{adjustwidth}
\caption{Mapping DRAM with cache, validating DCT nodes
and finishing the init process in
\protect\path{mctAutoInitMCT_D()} from
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mct_d.c}}
\label{lst:mctAutoInitMCT_D_6}
\end{listing}
\begin{listing}[H]
\begin{adjustwidth}{0.5cm}{0.5cm}
\begin{minted}[linenos]{c}
retry_requested = 0;
for (Node = 0; Node < MAX_NODES_SUPPORTED; Node++) {
    struct DCTStatStruc *pDCTstat;
    pDCTstat = pDCTstatA + Node;

    if (pDCTstat->NodePresent) {
        if (pDCTstat->TrainErrors & (1 << SB_FatalError)) {
            printk(BIOS_ERR, "DIMM training FAILED! Restarting system...");
            soft_reset();
        }
        if (pDCTstat->TrainErrors & (1 << SB_RetryConfigTrain)) {
            retry_requested = 1;

            pDCTstat->TrainErrors &= ~(1 << SB_RetryConfigTrain);
            pDCTstat->TrainErrors &= ~(1 << SB_NODQSPOS);
            pDCTstat->ErrStatus &= ~(1 << SB_RetryConfigTrain);
            pDCTstat->ErrStatus &= ~(1 << SB_NODQSPOS);
        }
    }
}

if (retry_requested) {
    printk(BIOS_DEBUG, "%s: Restarting training on algorithm request\n",
        __func__);
    /* Reset frequency to minimum */
    for (Node = 0; Node < MAX_NODES_SUPPORTED; Node++) {
        struct DCTStatStruc *pDCTstat;
        pDCTstat = pDCTstatA + Node;
        if (pDCTstat->NodePresent) {
            uint8_t original_target_freq = pDCTstat->TargetFreq;
            uint8_t original_auto_speed = pDCTstat->DIMMAutoSpeed;
            pDCTstat->TargetFreq = mhz_to_memclk_config(mctGet_NVbits(NV_MIN_MEMCLK));
            pDCTstat->Speed = pDCTstat->DIMMAutoSpeed = pDCTstat->TargetFreq;
            SetTargetFreq(pMCTstat, pDCTstatA, Node);
            pDCTstat->TargetFreq = original_target_freq;
            pDCTstat->DIMMAutoSpeed = original_auto_speed;
        }
    }
    /* Apply any DIMM timing changes */
    for (Node = 0; Node < MAX_NODES_SUPPORTED; Node++) {
        struct DCTStatStruc *pDCTstat;
        pDCTstat = pDCTstatA + Node;
        if (pDCTstat->NodePresent) {
            AutoCycTiming_D(pMCTstat, pDCTstat, 0);
            if (!pDCTstat->GangedMode)
                if (pDCTstat->DIMMValidDCT[1] > 0)
                    AutoCycTiming_D(pMCTstat, pDCTstat, 1);
        }
    }
    goto retry_dqs_training_and_levelization;
}
\end{minted}
\end{adjustwidth}
\caption{Error detection and retry mechanism during DQS training,
extract from the
\protect\path{DQSTiming_D} function in
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mct_d.c}}
\label{lst:error_handling}
\end{listing}
\begin{listing}[H]
\begin{adjustwidth}{0.5cm}{0.5cm}
\begin{minted}[linenos]{c}
if (Pass == SecondPass) {
    if (pDCTstat->TargetFreq > mhz_to_memclk_config(mctGet_NVbits(NV_MIN_MEMCLK))) {
        uint8_t global_phy_training_status = 0;
        final_target_freq = pDCTstat->TargetFreq;

        while (pDCTstat->Speed != final_target_freq) {
            if (is_fam15h())
                pDCTstat->TargetFreq =
                    fam15h_next_highest_memclk_freq(pDCTstat->Speed);
            else
                pDCTstat->TargetFreq = final_target_freq;
            SetTargetFreq(pMCTstat, pDCTstatA, Node);
            timeout = 0;
            do {
                status = 0;
                timeout++;
                status |= PhyWLPass2(pMCTstat, pDCTstat, 0,
                    (pDCTstat->TargetFreq == final_target_freq));
                status |= PhyWLPass2(pMCTstat, pDCTstat, 1,
                    (pDCTstat->TargetFreq == final_target_freq));
                if (status)
                    printk(BIOS_INFO,
                        "%s: Retrying write levelling due to invalid value(s) "
                        "detected in last phase\n",
                        __func__);
            } while (status && (timeout < 8));
            global_phy_training_status |= status;
        }

        pDCTstat->TargetFreq = final_target_freq;

        if (global_phy_training_status)
            printk(BIOS_WARNING,
                "%s: Uncorrectable invalid value(s) detected in second phase of "
                "write levelling; "
                "continuing but system may be unstable!\n",
                __func__);

        uint8_t dct;
        for (dct = 0; dct < 2; dct++) {
            sDCTStruct *pDCTData = pDCTstat->C_DCTPtr[dct];
            memcpy(pDCTData->WLGrossDelayFinalPass,
                pDCTData->WLGrossDelayPrevPass,
                sizeof(pDCTData->WLGrossDelayPrevPass));
            memcpy(pDCTData->WLFineDelayFinalPass,
                pDCTData->WLFineDelayPrevPass,
                sizeof(pDCTData->WLFineDelayPrevPass));
            pDCTData->WLCriticalGrossDelayFinalPass =
                pDCTData->WLCriticalGrossDelayPrevPass;
        }
    }
}
\end{minted}
\end{adjustwidth}
\caption{Write leveling (second pass), extract from the
\protect\path{WriteLevelization_HW} function in
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mcthwl.c}}
\label{lst:write_level_second_pass}
\end{listing}
\begin{listing}
\begin{adjustwidth}{0.5cm}{0.5cm}
\begin{minted}[linenos]{c}
uint8_t MaxDimmsInstallable = mctGet_NVbits(NV_MAX_DIMMS_PER_CH);

if (pDCTstat->Status & (1 << SB_Registered)) {
    if (package_type == PT_GR) {
        // Socket G34: Fam15h BKDG v3.14 Table 99
        if (MaxDimmsInstallable == 1) {
            if (channel == 0)
                seed = 0x43;
            else if (channel == 1)
                seed = 0x3f;
            else if (channel == 2)
                seed = 0x3a;
            else if (channel == 3)
                seed = 0x35;
        }
        ...
    }
    ...
} else if (pDCTstat->Status & (1 << SB_LoadReduced)) {
    // Load Reduced DIMM configuration
    if (package_type == PT_GR) {
        // Socket G34: Fam15h BKDG v3.14 Table 99
        if (MaxDimmsInstallable == 1) {
            if (channel == 0)
                seed = 0x123;
            ...
        }
    }
}
\end{minted}
\end{adjustwidth}
\caption{Seed generation for DQS receiver enable training based on DIMM type
and configuration,
extract from the
\protect\path{fam15_receiver_enable_training_seed} function in
\protect\path{src/northbridge/amd/amdmct/mct_ddr3/mctsrc.c}}
\label{lst:seed_generation}
\end{listing}
\newpage

% ------------------------------------------------------------------------------
%
%
% LICENSES
%
%
%
% ------------------------------------------------------------------------------
\chapter*{\center\rlap{GNU General Public License version 2}}
\addcontentsline{toc}{chapter}{GNU General Public License version 2}

\parindent 0in

Version 2, June 1991

Copyright \copyright\ 1989, 1991 Free Software Foundation, Inc.

\bigskip

51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA

\bigskip

Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.

\bigskip{\bf\large Preamble}\bigskip


The licenses for most software are designed to take away your freedom to
share and change it. By contrast, the GNU General Public License is
intended to guarantee your freedom to share and change free software---to
make sure the software is free for all its users. This General Public
License applies to most of the Free Software Foundation's software and to
any other program whose authors commit to using it. (Some other Free
Software Foundation software is covered by the GNU Library General Public
License instead.) You can apply it to your programs, too.

When we speak of free software, we are referring to freedom, not price.
Our General Public Licenses are designed to make sure that you have the
freedom to distribute copies of free software (and charge for this service
if you wish), that you receive source code or can get it if you want it,
that you can change the software or use pieces of it in new free programs;
and that you know you can do these things.

To protect your rights, we need to make restrictions that forbid anyone to
deny you these rights or to ask you to surrender the rights. These
restrictions translate to certain responsibilities for you if you
distribute copies of the software, or if you modify it.

For example, if you distribute copies of such a program, whether gratis or
for a fee, you must give the recipients all the rights that you have. You
must make sure that they, too, receive or can get the source code. And
you must show them these terms so they know their rights.

We protect your rights with two steps: (1) copyright the software, and (2)
offer you this license which gives you legal permission to copy,
distribute and/or modify the software.

Also, for each author's protection and ours, we want to make certain that
everyone understands that there is no warranty for this free software. If
the software is modified by someone else and passed on, we want its
recipients to know that what they have is not the original, so that any
problems introduced by others will not reflect on the original authors'
reputations.

Finally, any free program is threatened constantly by software patents.
We wish to avoid the danger that redistributors of a free program will
individually obtain patent licenses, in effect making the program
proprietary. To prevent this, we have made it clear that any patent must
be licensed for everyone's free use or not licensed at all.

The precise terms and conditions for copying, distribution and
modification follow.

\bigskip{\Large \sc Terms and Conditions For Copying, Distribution and
Modification}\bigskip


%\renewcommand{\theenumi}{\alpha{enumi}}
\begin{enumerate}

\addtocounter{enumi}{-1}
\item

This License applies to any program or other work which contains a notice
placed by the copyright holder saying it may be distributed under the
terms of this General Public License. The ``Program'', below, refers to
any such program or work, and a ``work based on the Program'' means either
the Program or any derivative work under copyright law: that is to say, a
work containing the Program or a portion of it, either verbatim or with
modifications and/or translated into another language. (Hereinafter,
translation is included without limitation in the term ``modification''.)
Each licensee is addressed as ``you''.

Activities other than copying, distribution and modification are not
covered by this License; they are outside its scope. The act of
running the Program is not restricted, and the output from the Program
is covered only if its contents constitute a work based on the
Program (independent of having been made by running the Program).
Whether that is true depends on what the Program does.
\item You may copy and distribute verbatim copies of the Program's source
code as you receive it, in any medium, provided that you conspicuously
and appropriately publish on each copy an appropriate copyright notice
and disclaimer of warranty; keep intact all the notices that refer to
this License and to the absence of any warranty; and give any other
recipients of the Program a copy of this License along with the Program.

You may charge a fee for the physical act of transferring a copy, and you
may at your option offer warranty protection in exchange for a fee.
\item

You may modify your copy or copies of the Program or any portion
of it, thus forming a work based on the Program, and copy and
distribute such modifications or work under the terms of Section 1
above, provided that you also meet all of these conditions:

\begin{enumerate}

\item

You must cause the modified files to carry prominent notices stating that
you changed the files and the date of any change.

\item

You must cause any work that you distribute or publish, that in
whole or in part contains or is derived from the Program or any
part thereof, to be licensed as a whole at no charge to all third
parties under the terms of this License.

\item
If the modified program normally reads commands interactively
when run, you must cause it, when started running for such
interactive use in the most ordinary way, to print or display an
announcement including an appropriate copyright notice and a
notice that there is no warranty (or else, saying that you provide
a warranty) and that users may redistribute the program under
these conditions, and telling the user how to view a copy of this
License. (Exception: if the Program itself is interactive but
does not normally print such an announcement, your work based on
the Program is not required to print an announcement.)

\end{enumerate}


These requirements apply to the modified work as a whole. If
identifiable sections of that work are not derived from the Program,
and can be reasonably considered independent and separate works in
themselves, then this License, and its terms, do not apply to those
sections when you distribute them as separate works. But when you
distribute the same sections as part of a whole which is a work based
on the Program, the distribution of the whole must be on the terms of
this License, whose permissions for other licensees extend to the
entire whole, and thus to each and every part regardless of who wrote it.

Thus, it is not the intent of this section to claim rights or contest
your rights to work written entirely by you; rather, the intent is to
exercise the right to control the distribution of derivative or
collective works based on the Program.

In addition, mere aggregation of another work not based on the Program
with the Program (or with a work based on the Program) on a volume of
a storage or distribution medium does not bring the other work under
the scope of this License.
\item
You may copy and distribute the Program (or a work based on it,
under Section 2) in object code or executable form under the terms of
Sections 1 and 2 above provided that you also do one of the following:

\begin{enumerate}

\item

Accompany it with the complete corresponding machine-readable
source code, which must be distributed under the terms of Sections
1 and 2 above on a medium customarily used for software interchange; or,

\item

Accompany it with a written offer, valid for at least three
years, to give any third party, for a charge no more than your
cost of physically performing source distribution, a complete
machine-readable copy of the corresponding source code, to be
distributed under the terms of Sections 1 and 2 above on a medium
customarily used for software interchange; or,

\item

Accompany it with the information you received as to the offer
to distribute corresponding source code. (This alternative is
allowed only for noncommercial distribution and only if you
received the program in object code or executable form with such
an offer, in accord with Subsection b above.)

\end{enumerate}


The source code for a work means the preferred form of the work for
making modifications to it. For an executable work, complete source
code means all the source code for all modules it contains, plus any
associated interface definition files, plus the scripts used to
control compilation and installation of the executable. However, as a
special exception, the source code distributed need not include
anything that is normally distributed (in either source or binary
form) with the major components (compiler, kernel, and so on) of the
operating system on which the executable runs, unless that component
itself accompanies the executable.

If distribution of executable or object code is made by offering
access to copy from a designated place, then offering equivalent
access to copy the source code from the same place counts as
distribution of the source code, even though third parties are not
compelled to copy the source along with the object code.
\item
You may not copy, modify, sublicense, or distribute the Program
except as expressly provided under this License. Any attempt
otherwise to copy, modify, sublicense or distribute the Program is
void, and will automatically terminate your rights under this License.
However, parties who have received copies, or rights, from you under
this License will not have their licenses terminated so long as such
parties remain in full compliance.
\item
You are not required to accept this License, since you have not
signed it. However, nothing else grants you permission to modify or
distribute the Program or its derivative works. These actions are
prohibited by law if you do not accept this License. Therefore, by
modifying or distributing the Program (or any work based on the
Program), you indicate your acceptance of this License to do so, and
all its terms and conditions for copying, distributing or modifying
the Program or works based on it.
\item
Each time you redistribute the Program (or any work based on the
Program), the recipient automatically receives a license from the
original licensor to copy, distribute or modify the Program subject to
these terms and conditions. You may not impose any further
restrictions on the recipients' exercise of the rights granted herein.
You are not responsible for enforcing compliance by third parties to
this License.
\item
If, as a consequence of a court judgment or allegation of patent
infringement or for any other reason (not limited to patent issues),
conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License. If you cannot
distribute so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you
may not distribute the Program at all. For example, if a patent
license would not permit royalty-free redistribution of the Program by
all those who receive copies directly or indirectly through you, then
the only way you could satisfy both it and this License would be to
refrain entirely from distribution of the Program.

If any portion of this section is held invalid or unenforceable under
any particular circumstance, the balance of the section is intended to
apply and the section as a whole is intended to apply in other
circumstances.

It is not the purpose of this section to induce you to infringe any
patents or other property right claims or to contest validity of any
such claims; this section has the sole purpose of protecting the
integrity of the free software distribution system, which is
implemented by public license practices. Many people have made
generous contributions to the wide range of software distributed
through that system in reliance on consistent application of that
system; it is up to the author/donor to decide if he or she is willing
to distribute software through any other system and a licensee cannot
impose that choice.

This section is intended to make thoroughly clear what is believed to
be a consequence of the rest of this License.
\item
If the distribution and/or use of the Program is restricted in
certain countries either by patents or by copyrighted interfaces, the
original copyright holder who places the Program under this License
may add an explicit geographical distribution limitation excluding
those countries, so that distribution is permitted only in or among
countries not thus excluded. In such case, this License incorporates
the limitation as if written in the body of this License.
\item
The Free Software Foundation may publish revised and/or new versions
of the General Public License from time to time. Such new versions will
be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.

Each version is given a distinguishing version number. If the Program
specifies a version number of this License which applies to it and ``any
later version'', you have the option of following the terms and conditions
either of that version or of any later version published by the Free
Software Foundation. If the Program does not specify a version number of
this License, you may choose any version ever published by the Free Software
Foundation.
\item
If you wish to incorporate parts of the Program into other free
programs whose distribution conditions are different, write to the author
to ask for permission. For software which is copyrighted by the Free
Software Foundation, write to the Free Software Foundation; we sometimes
make exceptions for this. Our decision will be guided by the two goals
of preserving the free status of all derivatives of our free software and
of promoting the sharing and reuse of software generally.

\bigskip{\Large\sc
No Warranty
}\bigskip
\item
{\sc Because the program is licensed free of charge, there is no warranty
for the program, to the extent permitted by applicable law. Except when
otherwise stated in writing the copyright holders and/or other parties
provide the program ``as is'' without warranty of any kind, either expressed
or implied, including, but not limited to, the implied warranties of
merchantability and fitness for a particular purpose. The entire risk as
to the quality and performance of the program is with you. Should the
program prove defective, you assume the cost of all necessary servicing,
repair or correction.}
\item
{\sc In no event unless required by applicable law or agreed to in writing
will any copyright holder, or any other party who may modify and/or
redistribute the program as permitted above, be liable to you for damages,
including any general, special, incidental or consequential damages arising
out of the use or inability to use the program (including but not limited
to loss of data or data being rendered inaccurate or losses sustained by
you or third parties or a failure of the program to operate with any other
programs), even if such holder or other party has been advised of the
possibility of such damages.}

\end{enumerate}


\bigskip{\Large\sc End of Terms and Conditions}\bigskip


\pagebreak[2]
\section*{Appendix: How to Apply These Terms to Your New Programs}
|
|
|
|
|
|
|
|
|
|
If you develop a new program, and you want it to be of the greatest
possible use to the public, the best way to achieve this is to make it
free software which everyone can redistribute and change under these
terms.

To do so, attach the following notices to the program. It is safest to
attach them to the start of each source file to most effectively convey
the exclusion of warranty; and each file should have at least the
``copyright'' line and a pointer to where the full notice is found.

\begin{quote}
one line to give the program's name and a brief idea of what it does. \\
Copyright (C) yyyy name of author \\

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
\end{quote}

Also add information on how to contact you by electronic and paper mail.

If the program is interactive, make it output a short notice like this
when it starts in an interactive mode:

\begin{quote}
Gnomovision version 69, Copyright (C) yyyy name of author \\
Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'. \\
This is free software, and you are welcome to redistribute it
under certain conditions; type `show c' for details.
\end{quote}


The hypothetical commands {\tt show w} and {\tt show c} should show the
appropriate parts of the General Public License. Of course, the commands
you use may be called something other than {\tt show w} and {\tt show c};
they could even be mouse-clicks or menu items---whatever suits your
program.

You should also get your employer (if you work as a programmer) or your
school, if any, to sign a ``copyright disclaimer'' for the program, if
necessary. Here is a sample; alter the names:

\begin{quote}
Yoyodyne, Inc., hereby disclaims all copyright interest in the program \\
`Gnomovision' (which makes passes at compilers) written by James Hacker. \\

signature of Ty Coon, 1 April 1989 \\
Ty Coon, President of Vice
\end{quote}


This General Public License does not permit incorporating your program
into proprietary programs. If your program is a subroutine library, you
may consider it more useful to permit linking proprietary applications
with the library. If this is what you want to do, use the GNU Library
General Public License instead of this License.

\chapter*{\center\rlap{GNU Free Documentation License}}
\addcontentsline{toc}{chapter}{GNU Free Documentation License}

Version 1.3, 3 November 2008


Copyright \copyright{} 2000, 2001, 2002, 2007, 2008 Free Software Foundation, Inc.

\bigskip

\path{<https://fsf.org/>}

\bigskip

Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.


\bigskip\bigskip{\bf\large Preamble}\bigskip


The purpose of this License is to make a manual, textbook, or other
functional and useful document ``free'' in the sense of freedom: to
assure everyone the effective freedom to copy and redistribute it,
with or without modifying it, either commercially or noncommercially.
Secondarily, this License preserves for the author and publisher a way
to get credit for their work, while not being considered responsible
for modifications made by others.

This License is a kind of ``copyleft'', which means that derivative
works of the document must themselves be free in the same sense. It
complements the GNU General Public License, which is a copyleft
license designed for free software.

We have designed this License in order to use it for manuals for free
software, because free software needs free documentation: a free
program should come with manuals providing the same freedoms that the
software does. But this License is not limited to software manuals;
it can be used for any textual work, regardless of subject matter or
whether it is published as a printed book. We recommend this License
principally for works whose purpose is instruction or reference.


\bigskip\bigskip{\Large\bf 1. APPLICABILITY AND DEFINITIONS\par}\bigskip
This License applies to any manual or other work, in any medium, that
contains a notice placed by the copyright holder saying it can be
distributed under the terms of this License. Such a notice grants a
world-wide, royalty-free license, unlimited in duration, to use that
work under the conditions stated herein. The ``\textbf{Document}'', below,
refers to any such manual or work. Any member of the public is a
licensee, and is addressed as ``\textbf{you}''. You accept the license if you
copy, modify or distribute the work in a way requiring permission
under copyright law.

A ``\textbf{Modified Version}'' of the Document means any work containing the
Document or a portion of it, either copied verbatim, or with
modifications and/or translated into another language.

A ``\textbf{Secondary Section}'' is a named appendix or a front-matter section of
the Document that deals exclusively with the relationship of the
publishers or authors of the Document to the Document's overall subject
(or to related matters) and contains nothing that could fall directly
within that overall subject. (Thus, if the Document is in part a
textbook of mathematics, a Secondary Section may not explain any
mathematics.) The relationship could be a matter of historical
connection with the subject or with related matters, or of legal,
commercial, philosophical, ethical or political position regarding
them.

The ``\textbf{Invariant Sections}'' are certain Secondary Sections whose titles
are designated, as being those of Invariant Sections, in the notice
that says that the Document is released under this License. If a
section does not fit the above definition of Secondary then it is not
allowed to be designated as Invariant. The Document may contain zero
Invariant Sections. If the Document does not identify any Invariant
Sections then there are none.

The ``\textbf{Cover Texts}'' are certain short passages of text that are listed,
as Front-Cover Texts or Back-Cover Texts, in the notice that says that
the Document is released under this License. A Front-Cover Text may
be at most 5 words, and a Back-Cover Text may be at most 25 words.

A ``\textbf{Transparent}'' copy of the Document means a machine-readable copy,
represented in a format whose specification is available to the
general public, that is suitable for revising the document
straightforwardly with generic text editors or (for images composed of
pixels) generic paint programs or (for drawings) some widely available
drawing editor, and that is suitable for input to text formatters or
for automatic translation to a variety of formats suitable for input
to text formatters. A copy made in an otherwise Transparent file
format whose markup, or absence of markup, has been arranged to thwart
or discourage subsequent modification by readers is not Transparent.
An image format is not Transparent if used for any substantial amount
of text. A copy that is not ``Transparent'' is called ``\textbf{Opaque}''.

Examples of suitable formats for Transparent copies include plain
ASCII without markup, Texinfo input format, LaTeX input format, SGML
or XML using a publicly available DTD, and standard-conforming simple
HTML, PostScript or PDF designed for human modification. Examples of
transparent image formats include PNG, XCF and JPG. Opaque formats
include proprietary formats that can be read and edited only by
proprietary word processors, SGML or XML for which the DTD and/or
processing tools are not generally available, and the
machine-generated HTML, PostScript or PDF produced by some word
processors for output purposes only.

The ``\textbf{Title Page}'' means, for a printed book, the title page itself,
plus such following pages as are needed to hold, legibly, the material
this License requires to appear in the title page. For works in
formats which do not have any title page as such, ``Title Page'' means
the text near the most prominent appearance of the work's title,
preceding the beginning of the body of the text.

The ``\textbf{publisher}'' means any person or entity that distributes
copies of the Document to the public.

A section ``\textbf{Entitled XYZ}'' means a named subunit of the Document whose
title either is precisely XYZ or contains XYZ in parentheses following
text that translates XYZ in another language. (Here XYZ stands for a
specific section name mentioned below, such as ``\textbf{Acknowledgements}'',
``\textbf{Dedications}'', ``\textbf{Endorsements}'', or ``\textbf{History}''.)
To ``\textbf{Preserve the Title}''
of such a section when you modify the Document means that it remains a
section ``Entitled XYZ'' according to this definition.

The Document may include Warranty Disclaimers next to the notice which
states that this License applies to the Document. These Warranty
Disclaimers are considered to be included by reference in this
License, but only as regards disclaiming warranties: any other
implication that these Warranty Disclaimers may have is void and has
no effect on the meaning of this License.


\bigskip\bigskip{\Large\bf 2. VERBATIM COPYING\par}\bigskip
You may copy and distribute the Document in any medium, either
commercially or noncommercially, provided that this License, the
copyright notices, and the license notice saying this License applies
to the Document are reproduced in all copies, and that you add no other
conditions whatsoever to those of this License. You may not use
technical measures to obstruct or control the reading or further
copying of the copies you make or distribute. However, you may accept
compensation in exchange for copies. If you distribute a large enough
number of copies you must also follow the conditions in section~3.

You may also lend copies, under the same conditions stated above, and
you may publicly display copies.


\bigskip\bigskip{\Large\bf 3. COPYING IN QUANTITY\par}\bigskip
If you publish printed copies (or copies in media that commonly have
printed covers) of the Document, numbering more than 100, and the
Document's license notice requires Cover Texts, you must enclose the
copies in covers that carry, clearly and legibly, all these Cover
Texts: Front-Cover Texts on the front cover, and Back-Cover Texts on
the back cover. Both covers must also clearly and legibly identify
you as the publisher of these copies. The front cover must present
the full title with all words of the title equally prominent and
visible. You may add other material on the covers in addition.
Copying with changes limited to the covers, as long as they preserve
the title of the Document and satisfy these conditions, can be treated
as verbatim copying in other respects.

If the required texts for either cover are too voluminous to fit
legibly, you should put the first ones listed (as many as fit
reasonably) on the actual cover, and continue the rest onto adjacent
pages.

If you publish or distribute Opaque copies of the Document numbering
more than 100, you must either include a machine-readable Transparent
copy along with each Opaque copy, or state in or with each Opaque copy
a computer-network location from which the general network-using
public has access to download using public-standard network protocols
a complete Transparent copy of the Document, free of added material.
If you use the latter option, you must take reasonably prudent steps,
when you begin distribution of Opaque copies in quantity, to ensure
that this Transparent copy will remain thus accessible at the stated
location until at least one year after the last time you distribute an
Opaque copy (directly or through your agents or retailers) of that
edition to the public.

It is requested, but not required, that you contact the authors of the
Document well before redistributing any large number of copies, to give
them a chance to provide you with an updated version of the Document.


\bigskip\bigskip{\Large\bf 4. MODIFICATIONS\par}\bigskip
You may copy and distribute a Modified Version of the Document under
the conditions of sections 2 and 3 above, provided that you release
the Modified Version under precisely this License, with the Modified
Version filling the role of the Document, thus licensing distribution
and modification of the Modified Version to whoever possesses a copy
of it. In addition, you must do these things in the Modified Version:

\begin{itemize}
\item[A.]
Use in the Title Page (and on the covers, if any) a title distinct
from that of the Document, and from those of previous versions
(which should, if there were any, be listed in the History section
of the Document). You may use the same title as a previous version
if the original publisher of that version gives permission.

\item[B.]
List on the Title Page, as authors, one or more persons or entities
responsible for authorship of the modifications in the Modified
Version, together with at least five of the principal authors of the
Document (all of its principal authors, if it has fewer than five),
unless they release you from this requirement.

\item[C.]
State on the Title page the name of the publisher of the
Modified Version, as the publisher.

\item[D.]
Preserve all the copyright notices of the Document.

\item[E.]
Add an appropriate copyright notice for your modifications
adjacent to the other copyright notices.

\item[F.]
Include, immediately after the copyright notices, a license notice
giving the public permission to use the Modified Version under the
terms of this License, in the form shown in the Addendum below.

\item[G.]
Preserve in that license notice the full lists of Invariant Sections
and required Cover Texts given in the Document's license notice.

\item[H.]
Include an unaltered copy of this License.
\item[I.]
Preserve the section Entitled ``History'', Preserve its Title, and add
to it an item stating at least the title, year, new authors, and
publisher of the Modified Version as given on the Title Page. If
there is no section Entitled ``History'' in the Document, create one
stating the title, year, authors, and publisher of the Document as
given on its Title Page, then add an item describing the Modified
Version as stated in the previous sentence.

\item[J.]
Preserve the network location, if any, given in the Document for
public access to a Transparent copy of the Document, and likewise
the network locations given in the Document for previous versions
it was based on. These may be placed in the ``History'' section.
You may omit a network location for a work that was published at
least four years before the Document itself, or if the original
publisher of the version it refers to gives permission.

\item[K.]
For any section Entitled ``Acknowledgements'' or ``Dedications'',
Preserve the Title of the section, and preserve in the section all
the substance and tone of each of the contributor acknowledgements
and/or dedications given therein.

\item[L.]
Preserve all the Invariant Sections of the Document,
unaltered in their text and in their titles. Section numbers
or the equivalent are not considered part of the section titles.

\item[M.]
Delete any section Entitled ``Endorsements''. Such a section
may not be included in the Modified Version.

\item[N.]
Do not retitle any existing section to be Entitled ``Endorsements''
or to conflict in title with any Invariant Section.

\item[O.]
Preserve any Warranty Disclaimers.
\end{itemize}

If the Modified Version includes new front-matter sections or
appendices that qualify as Secondary Sections and contain no material
copied from the Document, you may at your option designate some or all
of these sections as invariant. To do this, add their titles to the
list of Invariant Sections in the Modified Version's license notice.
These titles must be distinct from any other section titles.

You may add a section Entitled ``Endorsements'', provided it contains
nothing but endorsements of your Modified Version by various
parties---for example, statements of peer review or that the text has
been approved by an organization as the authoritative definition of a
standard.

You may add a passage of up to five words as a Front-Cover Text, and a
passage of up to 25 words as a Back-Cover Text, to the end of the list
of Cover Texts in the Modified Version. Only one passage of
Front-Cover Text and one of Back-Cover Text may be added by (or
through arrangements made by) any one entity. If the Document already
includes a cover text for the same cover, previously added by you or
by arrangement made by the same entity you are acting on behalf of,
you may not add another; but you may replace the old one, on explicit
permission from the previous publisher that added the old one.

The author(s) and publisher(s) of the Document do not by this License
give permission to use their names for publicity for or to assert or
imply endorsement of any Modified Version.


\bigskip\bigskip{\Large\bf 5. COMBINING DOCUMENTS\par}\bigskip
You may combine the Document with other documents released under this
License, under the terms defined in section~4 above for modified
versions, provided that you include in the combination all of the
Invariant Sections of all of the original documents, unmodified, and
list them all as Invariant Sections of your combined work in its
license notice, and that you preserve all their Warranty Disclaimers.

The combined work need only contain one copy of this License, and
multiple identical Invariant Sections may be replaced with a single
copy. If there are multiple Invariant Sections with the same name but
different contents, make the title of each such section unique by
adding at the end of it, in parentheses, the name of the original
author or publisher of that section if known, or else a unique number.
Make the same adjustment to the section titles in the list of
Invariant Sections in the license notice of the combined work.

In the combination, you must combine any sections Entitled ``History''
in the various original documents, forming one section Entitled
``History''; likewise combine any sections Entitled ``Acknowledgements'',
and any sections Entitled ``Dedications''. You must delete all sections
Entitled ``Endorsements''.


\bigskip\bigskip{\Large\bf 6. COLLECTIONS OF DOCUMENTS\par}\bigskip
You may make a collection consisting of the Document and other documents
released under this License, and replace the individual copies of this
License in the various documents with a single copy that is included in
the collection, provided that you follow the rules of this License for
verbatim copying of each of the documents in all other respects.

You may extract a single document from such a collection, and distribute
it individually under this License, provided you insert a copy of this
License into the extracted document, and follow this License in all
other respects regarding verbatim copying of that document.


\bigskip\bigskip{\Large\bf 7. AGGREGATION WITH INDEPENDENT WORKS\par}\bigskip
A compilation of the Document or its derivatives with other separate
and independent documents or works, in or on a volume of a storage or
distribution medium, is called an ``aggregate'' if the copyright
resulting from the compilation is not used to limit the legal rights
of the compilation's users beyond what the individual works permit.
When the Document is included in an aggregate, this License does not
apply to the other works in the aggregate which are not themselves
derivative works of the Document.

If the Cover Text requirement of section~3 is applicable to these
copies of the Document, then if the Document is less than one half of
the entire aggregate, the Document's Cover Texts may be placed on
covers that bracket the Document within the aggregate, or the
electronic equivalent of covers if the Document is in electronic form.
Otherwise they must appear on printed covers that bracket the whole
aggregate.


\bigskip\bigskip{\Large\bf 8. TRANSLATION\par}\bigskip
Translation is considered a kind of modification, so you may
distribute translations of the Document under the terms of section~4.
Replacing Invariant Sections with translations requires special
permission from their copyright holders, but you may include
translations of some or all Invariant Sections in addition to the
original versions of these Invariant Sections. You may include a
translation of this License, and all the license notices in the
Document, and any Warranty Disclaimers, provided that you also include
the original English version of this License and the original versions
of those notices and disclaimers. In case of a disagreement between
the translation and the original version of this License or a notice
or disclaimer, the original version will prevail.

If a section in the Document is Entitled ``Acknowledgements'',
``Dedications'', or ``History'', the requirement (section~4) to Preserve
its Title (section~1) will typically require changing the actual
title.


\bigskip\bigskip{\Large\bf 9. TERMINATION\par}\bigskip
You may not copy, modify, sublicense, or distribute the Document
except as expressly provided under this License. Any attempt
otherwise to copy, modify, sublicense, or distribute it is void, and
will automatically terminate your rights under this License.

However, if you cease all violation of this License, then your license
from a particular copyright holder is reinstated (a) provisionally,
unless and until the copyright holder explicitly and finally
terminates your license, and (b) permanently, if the copyright holder
fails to notify you of the violation by some reasonable means prior to
60 days after the cessation.

Moreover, your license from a particular copyright holder is
reinstated permanently if the copyright holder notifies you of the
violation by some reasonable means, this is the first time you have
received notice of violation of this License (for any work) from that
copyright holder, and you cure the violation prior to 30 days after
your receipt of the notice.

Termination of your rights under this section does not terminate the
licenses of parties who have received copies or rights from you under
this License. If your rights have been terminated and not permanently
reinstated, receipt of a copy of some or all of the same material does
not give you any rights to use it.


\bigskip\bigskip{\Large\bf 10. FUTURE REVISIONS OF THIS LICENSE\par}\bigskip
The Free Software Foundation may publish new, revised versions
of the GNU Free Documentation License from time to time. Such new
versions will be similar in spirit to the present version, but may
differ in detail to address new problems or concerns. See
\path{https://www.gnu.org/licenses/}.

Each version of the License is given a distinguishing version number.
If the Document specifies that a particular numbered version of this
License ``or any later version'' applies to it, you have the option of
following the terms and conditions either of that specified version or
of any later version that has been published (not as a draft) by the
Free Software Foundation. If the Document does not specify a version
number of this License, you may choose any version ever published (not
as a draft) by the Free Software Foundation. If the Document
specifies that a proxy can decide which future versions of this
License can be used, that proxy's public statement of acceptance of a
version permanently authorizes you to choose that version for the
Document.


\bigskip\bigskip{\Large\bf 11. RELICENSING\par}\bigskip
``Massive Multiauthor Collaboration Site'' (or ``MMC Site'') means any
World Wide Web server that publishes copyrightable works and also
provides prominent facilities for anybody to edit those works. A
public wiki that anybody can edit is an example of such a server. A
``Massive Multiauthor Collaboration'' (or ``MMC'') contained in the
site means any set of copyrightable works thus published on the MMC
site.

``CC-BY-SA'' means the Creative Commons Attribution-Share Alike 3.0
license published by Creative Commons Corporation, a not-for-profit
corporation with a principal place of business in San Francisco,
California, as well as future copyleft versions of that license
published by that same organization.

``Incorporate'' means to publish or republish a Document, in whole or
in part, as part of another Document.

An MMC is ``eligible for relicensing'' if it is licensed under this
License, and if all works that were first published under this License
somewhere other than this MMC, and subsequently incorporated in whole
or in part into the MMC, (1) had no cover texts or invariant sections,
and (2) were thus incorporated prior to November 1, 2008.

The operator of an MMC Site may republish an MMC contained in the site
under CC-BY-SA on the same site at any time before August 1, 2009,
provided the MMC is eligible for relicensing.


\bigskip\bigskip{\Large\bf ADDENDUM: How to use this License for your documents\par}\bigskip
To use this License in a document you have written, include a copy of
the License in the document and put the following copyright and
license notices just after the title page:

\bigskip
\begin{quote}
Copyright \copyright{} YEAR YOUR NAME.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3
or any later version published by the Free Software Foundation;
with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.
A copy of the license is included in the section entitled ``GNU
Free Documentation License''.
\end{quote}
\bigskip

If you have Invariant Sections, Front-Cover Texts and Back-Cover Texts,
replace the ``with \dots\ Texts.''\ line with this:

\bigskip
\begin{quote}
with the Invariant Sections being LIST THEIR TITLES, with the
Front-Cover Texts being LIST, and with the Back-Cover Texts being LIST.
\end{quote}
\bigskip

If you have Invariant Sections without Cover Texts, or some other
combination of the three, merge those two alternatives to suit the
situation.

If your document contains nontrivial examples of program code, we
recommend releasing these examples in parallel under your choice of
free software license, such as the GNU General Public License,
to permit their use in free software.

\end{document}