Misc enhancements
parent
b23764ea21
commit
207fa28d1a
Binary file not shown.
|
@ -1542,186 +1542,186 @@
|
||||||
is essential for achieving the desired balance of performance,
|
is essential for achieving the desired balance of performance,
|
||||||
reliability, and stability in demanding server environments.
|
reliability, and stability in demanding server environments.
|
||||||
|
|
||||||
\subsection{General steps for DDR3 configuration}
|
\section{General steps for DDR3 configuration}
|
||||||
|
|
||||||
DDR3 memory initialization is a detailed and essential
|
DDR3 memory initialization is a detailed and essential
|
||||||
process that ensures both the stability and performance of the
|
process that ensures both the stability and performance of the
|
||||||
system. The process involves several critical steps: detection
|
system. The process involves several critical steps: detection
|
||||||
and identification of memory modules, initial configuration of the
|
and identification of memory modules, initial configuration of the
|
||||||
memory controller, adjustment of timing and voltage settings, and
|
memory controller, adjustment of timing and voltage settings, and
|
||||||
the execution of training and calibration procedures. \\
|
the execution of training and calibration procedures. \\
|
||||||
|
|
||||||
The initialization begins with the detection and identification of
|
The initialization begins with the detection and identification of
|
||||||
the installed memory modules. During the BIST, the firmware reads
|
the installed memory modules. During the BIST, the firmware reads
|
||||||
the Serial Presence Detect (SPD) data stored on
|
the Serial Presence Detect (SPD) data stored on
|
||||||
each memory module. SPD data contains crucial information about
|
each memory module. SPD data contains crucial information about
|
||||||
the memory module's specifications, including size, speed, CAS
|
the memory module's specifications, including size, speed, CAS
|
||||||
latency (CL), RAS to CAS delay (tRCD), row precharge time (tRP),
|
latency (CL), RAS to CAS delay (tRCD), row precharge time (tRP),
|
||||||
and row cycle time (tRC). This data allows the firmware to configure
|
and row cycle time (tRC). This data allows the firmware to configure
|
||||||
the memory controller for optimal compatibility and performance. \\
|
the memory controller for optimal compatibility and performance. \\
|
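As an illustration of how SPD timings become controller settings, the minimal sketch below converts a minimum timing expressed in picoseconds into a whole number of MEMCLK cycles, rounding up. The helper \path{ps_to_cycles}, the structure and the sample values are assumptions chosen for illustration (a DDR3-1600 module with a 1.25 ns clock period and 13.75 ns for CL, tRCD and tRP); the firmware's own code may proceed differently.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the firmware sources: convert a
 * minimum timing given in picoseconds into clock cycles, rounding up so
 * that the programmed value never undercuts the module's minimum.
 */
static uint32_t ps_to_cycles(uint32_t t_ps, uint32_t tck_ps)
{
	return (t_ps + tck_ps - 1) / tck_ps;
}

/* Timing set in picoseconds, as it could be decoded from SPD data. */
struct ddr3_timings_ps {
	uint32_t tck;  /* clock period */
	uint32_t taa;  /* CAS latency time (CL) */
	uint32_t trcd; /* RAS to CAS delay */
	uint32_t trp;  /* row precharge time */
	uint32_t trc;  /* row cycle time */
};

/* Assumed example values for a DDR3-1600 CL11 module. */
static const struct ddr3_timings_ps example_ddr3_1600 = {
	.tck = 1250, .taa = 13750, .trcd = 13750, .trp = 13750, .trc = 48750,
};
```

With these example values, CL, tRCD and tRP each round to 11 cycles and tRC to 39 cycles.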
||||||
|
|
||||||
Indeed, once the memory modules have been identified, the firmware
|
Indeed, once the memory modules have been identified, the firmware
|
||||||
proceeds to the initial configuration of the memory controller.
|
proceeds to the initial configuration of the memory controller.
|
||||||
This controller is governed by a state machine that
|
This controller is governed by a state machine that
|
||||||
manages the sequence of operations required to initialize,
|
manages the sequence of operations required to initialize,
|
||||||
maintain, and control memory access. This state machine consists of
|
maintain, and control memory access. This state machine consists of
|
||||||
multiple states that represent various phases of memory operation,
|
multiple states that represent various phases of memory operation,
|
||||||
such as reset, initialization, calibration, and data transfer.
|
such as reset, initialization, calibration, and data transfer.
|
||||||
The transitions between these states are either automatic or
|
The transitions between these states are either automatic or
|
||||||
command-driven, depending on the specific requirements of each
|
command-driven, depending on the specific requirements of each
|
||||||
phase \cite{samsung_ddr3}\cite{micron_ddr3}.
|
phase \cite{samsung_ddr3}\cite{micron_ddr3}.
|
||||||
This state machine is presented in
|
This state machine is presented in
|
||||||
fig. \ref{fig:ddr3_state_machine}. Automatic transitions, depicted
|
fig. \ref{fig:ddr3_state_machine}. Automatic transitions, depicted
|
||||||
by thick arrows in the automaton, occur without external
|
by thick arrows in the automaton, occur without external
|
||||||
intervention. These typically include transitions that ensure
|
intervention. These typically include transitions that ensure
|
||||||
the memory enters a stable state, such as the transition from
|
the memory enters a stable state, such as the transition from
|
||||||
power-on to initialization, or from calibration to idle states.
|
power-on to initialization, or from calibration to idle states.
|
||||||
These transitions are crucial for maintaining the integrity and
|
These transitions are crucial for maintaining the integrity and
|
||||||
stability of the memory system, as they ensure that the controller
|
stability of the memory system, as they ensure that the controller
|
||||||
progresses through necessary stages like ZQ calibration and write
|
progresses through necessary stages like ZQ calibration and write
|
||||||
leveling, which are essential for proper signal timing and
|
leveling, which are essential for proper signal timing and
|
||||||
impedance matching
|
impedance matching
|
||||||
\cite{samsung_ddr3}\cite{micron_ddr3}\cite{burnett_ddr3}. \\
|
\cite{samsung_ddr3}\cite{micron_ddr3}\cite{burnett_ddr3}. \\
|
||||||
|
|
||||||
On the other hand, command-driven transitions, represented by normal
|
On the other hand, command-driven transitions, represented by normal
|
||||||
arrows in the automaton, require specific commands issued by the
|
arrows in the automaton, require specific commands issued by the
|
||||||
memory controller or the CPU to advance to the next state. For
|
memory controller or the CPU to advance to the next state. For
|
||||||
instance, the transition from the idle state to the data transfer
|
instance, the transition from the idle state to the data transfer
|
||||||
state requires explicit read or write commands. Similarly,
|
state requires explicit read or write commands. Similarly,
|
||||||
transitioning from the initialization state to the calibration
|
transitioning from the initialization state to the calibration
|
||||||
state involves issuing mode register set (MRS) commands that
|
state involves issuing mode register set (MRS) commands that
|
||||||
configure the memory’s operating parameters. These command-driven
|
configure the memory’s operating parameters. These command-driven
|
||||||
transitions are integral to the dynamic operation of the memory
|
transitions are integral to the dynamic operation of the memory
|
||||||
system, allowing the controller to respond to the system's
|
system, allowing the controller to respond to the system's
|
||||||
operational needs and ensuring that memory accesses are performed
|
operational needs and ensuring that memory accesses are performed
|
||||||
efficiently and accurately \cite{samsung_ddr3}\cite{micron_ddr3}. \\
|
efficiently and accurately \cite{samsung_ddr3}\cite{micron_ddr3}. \\
|
||||||
|
|
||||||
The memory controller configuration
|
The memory controller configuration
|
||||||
involves setting up fundamental parameters such as the memory clock
|
involves setting up fundamental parameters such as the memory clock
|
||||||
(MEMCLK) frequency and the memory channel configuration. The MEMCLK
|
(MEMCLK) frequency and the memory channel configuration. The MEMCLK
|
||||||
frequency is derived from the SPD data, while the memory channels
|
frequency is derived from the SPD data, while the memory channels
|
||||||
are configured to operate in single, dual, or quad-channel modes,
|
are configured to operate in single, dual, or quad-channel modes,
|
||||||
depending on the system architecture and the installed modules
|
depending on the system architecture and the installed modules
|
||||||
\cite{burnett_ddr3}. Proper configuration of the memory controller
|
\cite{burnett_ddr3}. Proper configuration of the memory controller
|
||||||
is vital to ensure synchronization with the memory modules,
|
is vital to ensure synchronization with the memory modules,
|
||||||
establishing a stable foundation for subsequent operations. \\
|
establishing a stable foundation for subsequent operations. \\
|
||||||
|
|
||||||
The first critical step, during the INIT phase, involves the
|
The first critical step, during the INIT phase, involves the
|
||||||
adjustment of timing and voltage settings. These settings are
|
adjustment of timing and voltage settings. These settings are
|
||||||
essential for ensuring that DDR3 memory operates efficiently and
|
essential for ensuring that DDR3 memory operates efficiently and
|
||||||
reliably. Key timing parameters include CAS Latency (CL), RAS to
|
reliably. Key timing parameters include CAS Latency (CL), RAS to
|
||||||
CAS Delay (tRCD), Row Precharge Time (tRP), and Row Cycle Time (tRC).
|
CAS Delay (tRCD), Row Precharge Time (tRP), and Row Cycle Time (tRC).
|
||||||
These parameters are finely tuned to balance speed and stability
|
These parameters are finely tuned to balance speed and stability
|
||||||
\cite{samsung_ddr3}. The BIOS uses the SPD data to set these
|
\cite{samsung_ddr3}. The BIOS uses the SPD data to set these
|
||||||
parameters and may also adjust them dynamically to achieve the
|
parameters and may also adjust them dynamically to achieve the
|
||||||
best possible performance. Voltage settings, such as DRAM voltage
|
best possible performance. Voltage settings, such as DRAM voltage
|
||||||
(typically 1.5V for DDR3) and termination voltage (VTT), are also
|
(typically 1.5V for DDR3) and termination voltage (VTT), are also
|
||||||
configured to maintain stable operation, especially under varying
|
configured to maintain stable operation, especially under varying
|
||||||
conditions such as temperature fluctuations \cite{micron_ddr3}. \\
|
conditions such as temperature fluctuations \cite{micron_ddr3}. \\
|
||||||
|
|
||||||
Training and calibration are among the most complex and crucial
|
Training and calibration are among the most complex and crucial
|
||||||
stages of DDR3 memory initialization. The fly-by topology used
|
stages of DDR3 memory initialization. The fly-by topology used
|
||||||
for address, command, and clock signals in DDR3 modules enhances
|
for address, command, and clock signals in DDR3 modules enhances
|
||||||
signal integrity by reducing the number of stubs and their lengths,
|
signal integrity by reducing the number of stubs and their lengths,
|
||||||
but it also introduces skew between the clock (CK) and data strobe
|
but it also introduces skew between the clock (CK) and data strobe
|
||||||
(DQS) signals \cite{micron_ddr3}. This skew must be compensated to
|
(DQS) signals \cite{micron_ddr3}. This skew must be compensated to
|
||||||
ensure that data is written and read correctly. The BIOS performs
|
ensure that data is written and read correctly. The BIOS performs
|
||||||
write leveling, which adjusts the timing of DQS relative to CK
|
write leveling, which adjusts the timing of DQS relative to CK
|
||||||
for each memory module. This process ensures that the memory
|
for each memory module. This process ensures that the memory
|
||||||
controller can write data accurately across all modules, even
|
controller can write data accurately across all modules, even
|
||||||
when they exhibit slight variations in signal timing due to the
|
when they exhibit slight variations in signal timing due to the
|
||||||
physical layout \cite{samsung_ddr3}. \\
|
physical layout \cite{samsung_ddr3}. \\
|
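The write leveling search can be sketched as follows. In write leveling mode, the DRAM samples CK on each rising DQS edge and returns the sampled level on DQ; the controller increases the DQS delay until that feedback changes from 0 to 1. The callback type, the simulated clock model and all names below are assumptions made for illustration, not the firmware's code.

```c
/*
 * Hypothetical callback: returns the CK level (0 or 1) that the DRAM
 * reports on DQ when DQS is delayed by `delay` steps.
 */
typedef int (*wl_sample_fn)(unsigned int delay, void *ctx);

/*
 * Sweep the DQS delay upwards and return the first delay at which the
 * feedback goes from 0 to 1, i.e. where the DQS rising edge lines up
 * with the CK rising edge. Returns -1 if no such transition is seen.
 */
static int write_level_lane(unsigned int max_delay, wl_sample_fn sample,
			    void *ctx)
{
	if (max_delay == 0)
		return -1;

	int prev = sample(0, ctx);
	for (unsigned int d = 1; d < max_delay; d++) {
		int cur = sample(d, ctx);
		if (prev == 0 && cur == 1)
			return (int)d;
		prev = cur;
	}
	return -1;
}

/* Simulated lane for experimentation: CK is high for the first half of
 * each period, and its rising edge sits at delay `edge`. */
struct sim_ck {
	unsigned int period; /* clock period, in delay steps */
	unsigned int edge;   /* delay at which CK rises */
};

static int sim_sample(unsigned int delay, void *ctx)
{
	const struct sim_ck *ck = ctx;
	unsigned int phase =
		(delay + ck->period - ck->edge % ck->period) % ck->period;
	return phase < ck->period / 2;
}
```

On real hardware the sample callback would program the delay register, pulse DQS and read back DQ; here it is replaced by a square-wave model so that the search logic can be exercised in isolation.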
||||||
|
|
||||||
\begin{figure}[H]
|
\begin{figure}[H]
|
||||||
\centering
|
\centering
|
||||||
\begin{tikzpicture}[scale=0.6,
|
\begin{tikzpicture}[scale=0.6,
|
||||||
transform shape,
|
transform shape,
|
||||||
shorten >=1pt,
|
shorten >=1pt,
|
||||||
node distance=5cm and 5cm,
|
node distance=5cm and 5cm,
|
||||||
on grid,
|
on grid,
|
||||||
auto]
|
auto]
|
||||||
% States
|
% States
|
||||||
\node[state, initial] (reset) {RESET};
|
\node[state, initial] (reset) {RESET};
|
||||||
\node[draw=none,fill=none] (any) [below=2cm of reset] {ANY};
|
\node[draw=none,fill=none] (any) [below=2cm of reset] {ANY};
|
||||||
\node[state] (init) [right=of reset] {INIT};
|
\node[state] (init) [right=of reset] {INIT};
|
||||||
\node[state] (zqcal) [below=of init] {ZQ Calibration};
|
\node[state] (zqcal) [below=of init] {ZQ Calibration};
|
||||||
\node[state, accepting] (idle) [right=of init] {IDLE};
|
\node[state, accepting] (idle) [right=of init] {IDLE};
|
||||||
\node[state] (writelevel) [above=of idle] {WRITE LEVELING};
|
\node[state] (writelevel) [above=of idle] {WRITE LEVELING};
|
||||||
\node[state] (refresh) [right=of idle] {REFRESH};
|
\node[state] (refresh) [right=of idle] {REFRESH};
|
||||||
\node[state] (activation) [below=of idle] {ACTIVATION};
|
\node[state] (activation) [below=of idle] {ACTIVATION};
|
||||||
\node[state] (bankactive) [below=of activation] {BANK ACTIVE};
|
\node[state] (bankactive) [below=of activation] {BANK ACTIVE};
|
||||||
\node[state] (readop) [below right=of bankactive] {READ OP};
|
\node[state] (readop) [below right=of bankactive] {READ OP};
|
||||||
\node[state] (writeop) [below left=of bankactive] {WRITE OP};
|
\node[state] (writeop) [below left=of bankactive] {WRITE OP};
|
||||||
\node[state] (prechrg) [below right=of readop] {PRE-CHARGING};
|
\node[state] (prechrg) [below right=of readop] {PRE-CHARGING};
|
||||||
% Transitions
|
% Transitions
|
||||||
\path[->, line width=0.2mm, >=stealth]
|
\path[->, line width=0.2mm, >=stealth]
|
||||||
(reset) edge node {} (init)
|
(reset) edge node {} (init)
|
||||||
(idle) edge [bend left=20] node {} (writelevel)
|
(idle) edge [bend left=20] node {} (writelevel)
|
||||||
edge [bend left=20] node {REF} (refresh)
|
edge [bend left=20] node {REF} (refresh)
|
||||||
edge node {} (activation)
|
edge node {} (activation)
|
||||||
edge [bend left=10] node {ZQCL/S} (zqcal)
|
edge [bend left=10] node {ZQCL/S} (zqcal)
|
||||||
(activation) edge node {} (bankactive)
|
(activation) edge node {} (bankactive)
|
||||||
(bankactive) edge [bend left=30] node {PRE} (prechrg)
|
(bankactive) edge [bend left=30] node {PRE} (prechrg)
|
||||||
edge [bend left=20] node {write} (writeop)
|
edge [bend left=20] node {write} (writeop)
|
||||||
edge [bend right=20] node {read} (readop)
|
edge [bend right=20] node {read} (readop)
|
||||||
(writeop) edge [loop left] node {write} (writeop)
|
(writeop) edge [loop left] node {write} (writeop)
|
||||||
edge [bend left=10] node {read\_a} (readop)
|
edge [bend left=10] node {read\_a} (readop)
|
||||||
edge [bend right=15] node {PRE} (prechrg)
|
edge [bend right=15] node {PRE} (prechrg)
|
||||||
(readop) edge [loop right] node {read} (readop)
|
(readop) edge [loop right] node {read} (readop)
|
||||||
edge [bend left=10] node {write\_a} (writeop)
|
edge [bend left=10] node {write\_a} (writeop)
|
||||||
edge [bend right=15] node {PRE} (prechrg);
|
edge [bend right=15] node {PRE} (prechrg);
|
||||||
% Thick transitions
|
% Thick transitions
|
||||||
\path[->, line width=0.5mm, >=stealth]
|
\path[->, line width=0.5mm, >=stealth]
|
||||||
(any) edge node {} (reset)
|
(any) edge node {} (reset)
|
||||||
(init) edge node {ZQCL} (zqcal)
|
(init) edge node {ZQCL} (zqcal)
|
||||||
(zqcal) edge [bend left=10] node {} (idle)
|
(zqcal) edge [bend left=10] node {} (idle)
|
||||||
(writelevel) edge [bend left=20] node {MRS} (idle)
|
(writelevel) edge [bend left=20] node {MRS} (idle)
|
||||||
(refresh) edge [bend left=20] node {} (idle)
|
(refresh) edge [bend left=20] node {} (idle)
|
||||||
(writeop) edge node {} (prechrg)
|
(writeop) edge node {} (prechrg)
|
||||||
edge [bend left=20] node {} (bankactive)
|
edge [bend left=20] node {} (bankactive)
|
||||||
(readop) edge [bend left=15] node {} (prechrg)
|
(readop) edge [bend left=15] node {} (prechrg)
|
||||||
edge [bend right=20] node {} (bankactive)
|
edge [bend right=20] node {} (bankactive)
|
||||||
(prechrg) edge [bend right=20] node {} (idle);
|
(prechrg) edge [bend right=20] node {} (idle);
|
||||||
\end{tikzpicture}
|
\end{tikzpicture}
|
||||||
\caption{DDR3 controller state machine}
|
\caption{DDR3 controller state machine}
|
||||||
\label{fig:ddr3_state_machine}
|
\label{fig:ddr3_state_machine}
|
||||||
\end{figure}
|
\end{figure}
|
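Part of the automaton in fig. \ref{fig:ddr3_state_machine} can be written down as a small transition function. The sketch below is a simplified model, not the controller's implementation: it covers only the command-driven edges that carry a label in the figure and the unlabeled automatic edges that return to IDLE. All identifiers are hypothetical.

```c
enum ddr3_state {
	ST_RESET, ST_INIT, ST_ZQ_CAL, ST_IDLE, ST_WRITE_LEVELING, ST_REFRESH,
	ST_ACTIVATION, ST_BANK_ACTIVE, ST_READ_OP, ST_WRITE_OP, ST_PRECHARGING
};

enum ddr3_cmd {
	CMD_ZQCL, CMD_ZQCS, CMD_REF, CMD_PRE,
	CMD_READ, CMD_WRITE, CMD_READ_A, CMD_WRITE_A
};

/* Command-driven edges labeled in the figure. A command that is not
 * accepted in the current state leaves the state unchanged. */
static enum ddr3_state on_command(enum ddr3_state s, enum ddr3_cmd c)
{
	switch (s) {
	case ST_IDLE:
		if (c == CMD_REF)
			return ST_REFRESH;
		if (c == CMD_ZQCL || c == CMD_ZQCS)
			return ST_ZQ_CAL;
		break;
	case ST_BANK_ACTIVE:
		if (c == CMD_PRE)
			return ST_PRECHARGING;
		if (c == CMD_READ)
			return ST_READ_OP;
		if (c == CMD_WRITE)
			return ST_WRITE_OP;
		break;
	case ST_READ_OP:
		if (c == CMD_READ)
			return ST_READ_OP;
		if (c == CMD_WRITE_A)
			return ST_WRITE_OP;
		if (c == CMD_PRE)
			return ST_PRECHARGING;
		break;
	case ST_WRITE_OP:
		if (c == CMD_WRITE)
			return ST_WRITE_OP;
		if (c == CMD_READ_A)
			return ST_READ_OP;
		if (c == CMD_PRE)
			return ST_PRECHARGING;
		break;
	default:
		break;
	}
	return s;
}

/* Automatic (thick, unlabeled) edges that lead back to IDLE. */
static enum ddr3_state on_auto(enum ddr3_state s)
{
	switch (s) {
	case ST_ZQ_CAL:
	case ST_REFRESH:
	case ST_PRECHARGING:
		return ST_IDLE;
	default:
		return s;
	}
}
```

Keeping command-driven and automatic transitions in separate functions mirrors the distinction between normal and thick arrows in the figure.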
||||||
|
|
||||||
ZQ calibration is another vital procedure that adjusts the
|
ZQ calibration is another vital procedure that adjusts the
|
||||||
output driver impedance and on-die termination (ODT) to match
|
output driver impedance and on-die termination (ODT) to match
|
||||||
the system’s characteristic impedance \cite{micron_ddr3}. This
|
the system’s characteristic impedance \cite{micron_ddr3}. This
|
||||||
calibration is critical for maintaining signal integrity under
|
calibration is critical for maintaining signal integrity under
|
||||||
different operating conditions, such as voltage and temperature
|
different operating conditions, such as voltage and temperature
|
||||||
changes. During initialization, the memory controller issues a
|
changes. During initialization, the memory controller issues a
|
||||||
ZQCL command to the DRAM modules, triggering the calibration
|
ZQCL command to the DRAM modules, triggering the calibration
|
||||||
sequence that optimizes impedance settings.
|
sequence that optimizes impedance settings.
|
||||||
This ensures that the memory system can
|
This ensures that the memory system can
|
||||||
operate with tight timing tolerances, which is crucial for
|
operate with tight timing tolerances, which is crucial for
|
||||||
systems requiring high reliability.
|
systems requiring high reliability.
|
||||||
Read training is also essential to ensure that data read from
|
Read training is also essential to ensure that data read from
|
||||||
the memory modules is interpreted correctly by the memory
|
the memory modules is interpreted correctly by the memory
|
||||||
controller. This process involves adjusting the timing of the
|
controller. This process involves adjusting the timing of the
|
||||||
read data strobe (DQS) to align perfectly with the data being
|
read data strobe (DQS) to align perfectly with the data being
|
||||||
received. Proper read training is necessary for reliable data
|
received. Proper read training is necessary for reliable data
|
||||||
retrieval, which directly impacts system performance and stability. \\
|
retrieval, which directly impacts system performance and stability. \\
|
||||||
|
|
||||||
ZQCS (ZQ Calibration Short), however, is a procedure used
|
ZQCS (ZQ Calibration Short), however, is a procedure used
|
||||||
to periodically adjust the DRAM's ODT and output driver impedance
|
to periodically adjust the DRAM's ODT and output driver impedance
|
||||||
during normal operation. Unlike the full ZQCL (ZQ Calibration Long),
|
during normal operation. Unlike the full ZQCL (ZQ Calibration Long),
|
||||||
which is performed during initial memory initialization, ZQCS is a
|
which is performed during initial memory initialization, ZQCS is a
|
||||||
quicker, less comprehensive calibration that fine-tunes the
|
quicker, less comprehensive calibration that fine-tunes the
|
||||||
impedance settings in response to changes in temperature, voltage,
|
impedance settings in response to changes in temperature, voltage,
|
||||||
or other environmental factors. This helps maintain optimal signal
|
or other environmental factors. This helps maintain optimal signal
|
||||||
integrity and performance throughout the memory's operation without
|
integrity and performance throughout the memory's operation without
|
||||||
the need for a full recalibration. \\
|
the need for a full recalibration. \\
|
||||||
|
|
||||||
In summary, the DDR3 memory initialization process in systems
|
In summary, the DDR3 memory initialization process in systems
|
||||||
like the ASUS KGPE-D16 involves a series of detailed and
|
like the ASUS KGPE-D16 involves a series of detailed and
|
||||||
interdependent steps that are critical for ensuring system
|
interdependent steps that are critical for ensuring system
|
||||||
stability and performance. These include the detection and
|
stability and performance. These include the detection and
|
||||||
identification of memory modules, the initial configuration of
|
identification of memory modules, the initial configuration of
|
||||||
the memory controller, precise adjustments of timing and voltage
|
the memory controller, precise adjustments of timing and voltage
|
||||||
settings, and rigorous training and calibration procedures.
|
settings, and rigorous training and calibration procedures.
|
||||||
|
|
||||||
\section{Memory initialization techniques}
|
\section{Memory initialization techniques}
|
||||||
|
|
||||||
|
@ -3468,7 +3468,7 @@ uint8_t AddrCmdPrelaunch = 0; /* TODO: Fetch the correct value from RC2[0] */
|
||||||
* 0x41 and 0x0 are the "stock" values */
|
* 0x41 and 0x0 are the "stock" values */
|
||||||
\end{minted}
|
\end{minted}
|
||||||
\end{adjustwidth}
|
\end{adjustwidth}
|
||||||
\caption{\texttt{FIXME} indicating the need for
|
\caption{Lack of
|
||||||
mainboard-specific seed overrides,
|
mainboard-specific seed overrides,
|
||||||
extract from
|
extract from
|
||||||
\protect\path{procConfig} function in
|
\protect\path{procConfig} function in
|
||||||
|
@ -3562,7 +3562,7 @@ if (faulty_value_detected) {
|
||||||
code. The overcomplicated logic can also make the code more
|
code. The overcomplicated logic can also make the code more
|
||||||
difficult to maintain and extend. \\
|
difficult to maintain and extend. \\
|
||||||
|
|
||||||
\subsection{DQS position training}
|
\subsubsection{DQS position training}
|
||||||
|
|
||||||
While the DQS position training algorithm implemented in the
|
While the DQS position training algorithm implemented in the
|
||||||
\path{TrainDQSRdWrPos_D_Fam15} function may work in some
|
\path{TrainDQSRdWrPos_D_Fam15} function may work in some
|
||||||
|
@ -3714,183 +3714,181 @@ if (best_count > 2) {
|
||||||
the time required for DQS position training without compromising
|
the time required for DQS position training without compromising
|
||||||
accuracy. \\
|
accuracy. \\
|
||||||
|
|
||||||
\subsection{On a wider scale...}
|
\subsubsection{On saving training values in NVRAM}
|
||||||
|
|
||||||
\subsubsection{Saving training values in NVRAM}
|
The function \path{mctAutoInitMCT_D} is responsible for
|
||||||
|
automatically initializing the memory controller training (MCT)
|
||||||
|
process, which involves configuring various memory parameters
|
||||||
|
and performing training routines to ensure stable and efficient
|
||||||
|
memory operation. However, the fact that
|
||||||
|
\path{mctAutoInitMCT_D} does not allow for the restoration of
|
||||||
|
training data from NVRAM (lst. \ref{lst:mctAutoInitMCT_D_fixme})
|
||||||
|
poses several significant problems. \\
|
||||||
|
|
||||||
The function \path{mctAutoInitMCT_D} is responsible for
|
Memory training is a time-consuming process that involves
|
||||||
automatically initializing the memory controller training (MCT)
|
multiple iterations of read/write operations, delay adjustments,
|
||||||
process, which involves configuring various memory parameters
|
and calibration steps. By not restoring previously saved
|
||||||
and performing training routines to ensure stable and efficient
|
training data from NVRAM, the system is forced to re-run the
|
||||||
memory operation. However, the fact that
|
full training sequence every time it boots up. This leads to
|
||||||
\path{mctAutoInitMCT_D} does not allow for the restoration of
|
longer boot times, which can be particularly problematic in
|
||||||
training data from NVRAM (lst. \ref{lst:mctAutoInitMCT_D_fixme})
|
environments where quick system restarts are critical, such
|
||||||
poses several significant problems. \\
|
as in servers or embedded systems. \\
|
||||||
|
|
||||||
Memory training is a time-consuming process that involves
|
Each time memory training is performed, it puts additional
|
||||||
multiple iterations of read/write operations, delay adjustments,
|
stress on the memory modules and the memory controller.
|
||||||
and calibration steps. By not restoring previously saved
|
Repeatedly executing the training process at every boot can
|
||||||
training data from NVRAM, the system is forced to re-run the
|
contribute to the wear and tear of hardware components,
|
||||||
full training sequence every time it boots up. This leads to
|
potentially reducing their lifespan. This issue is especially
|
||||||
longer boot times, which can be particularly problematic in
|
concerning in systems that frequently power cycle or reboot. \\
|
||||||
environments where quick system restarts are critical, such
|
|
||||||
as in servers or embedded systems. \\
|
|
||||||
|
|
||||||
Each time memory training is performed, it puts additional
|
Memory training is sensitive to various factors, such as
|
||||||
stress on the memory modules and the memory controller.
|
temperature, voltage, and load conditions. As a result, the
|
||||||
Repeatedly executing the training process at every boot can
|
training results can vary slightly between different boot
|
||||||
contribute to the wear and tear of hardware components,
|
cycles. Without the ability to restore previously validated
|
||||||
potentially reducing their lifespan. This issue is especially
|
training data, there is a risk of inconsistency in memory
|
||||||
concerning in systems that frequently power cycle or reboot. \\
|
performance across reboots. This could lead to instability
|
||||||
|
or suboptimal memory operation, affecting the overall
|
||||||
|
performance of the system. \\
|
||||||
|
|
||||||
Memory training is sensitive to various factors, such as
|
If the memory training process fails during boot, the system
|
||||||
temperature, voltage, and load conditions. As a result, the
|
may be unable to operate properly or may fail to boot entirely.
|
||||||
training results can vary slightly between different boot
|
By restoring validated training data from NVRAM, the system
|
||||||
cycles. Without the ability to restore previously validated
|
can bypass the training process altogether, reducing the risk
|
||||||
training data, there is a risk of inconsistency in memory
|
of boot failures caused by training issues. Without this
|
||||||
performance across reboots. This could lead to instability
|
feature, any minor issue that affects training could result
|
||||||
or suboptimal memory operation, affecting the overall
|
in system downtime. \\
|
||||||
performance of the system. \\
|
|
||||||
|
|
||||||
If the memory training process fails during boot, the system
|
Finally, modern memory controllers often include power-saving
|
||||||
may be unable to operate properly or may fail to boot entirely.
|
features that are fine-tuned during the training process. By
|
||||||
By restoring validated training data from NVRAM, the system
|
reusing validated training data from NVRAM, the system can
|
||||||
can bypass the training process altogether, reducing the risk
|
quickly return to an optimized state with lower power
|
||||||
of boot failures caused by training issues. Without this
|
consumption.
|
||||||
feature, any minor issue that affects training could result
|
The inability to restore this data forces the system to
|
||||||
in system downtime. \\
|
operate at a potentially less efficient state until training
|
||||||
|
is complete, leading to higher power consumption during the
|
||||||
|
boot process. \\
|
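A restore path of this kind could be sketched as follows. This is only an outline under stated assumptions: the structure layout, the magic value, the number of bytelanes and every function name are hypothetical, and keying the saved data on a hash of the SPD contents is a design choice assumed here so that replacing a module invalidates the stored values. If validation fails, the firmware would fall back to full training.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TRAINING_MAGIC 0x54524e47u /* arbitrary marker, hypothetical */
#define NUM_BYTELANES 9            /* assumed: 8 data lanes + ECC */

/* Hypothetical layout of the training record kept in NVRAM. */
struct training_cache {
	uint32_t magic;
	uint32_t spd_hash; /* hash of the SPD data seen at training time */
	uint16_t dqs_rd_delay[NUM_BYTELANES];
	uint16_t dqs_wr_delay[NUM_BYTELANES];
	uint32_t checksum; /* covers every field above */
};

/* 32-bit FNV-1a hash, used here as a simple integrity check. */
static uint32_t fnv1a(const void *data, size_t len)
{
	const uint8_t *p = data;
	uint32_t h = 2166136261u;

	for (size_t i = 0; i < len; i++) {
		h ^= p[i];
		h *= 16777619u;
	}
	return h;
}

/* Finalize a record after a successful training run. */
static void training_cache_seal(struct training_cache *c, uint32_t spd_hash)
{
	c->magic = TRAINING_MAGIC;
	c->spd_hash = spd_hash;
	c->checksum = fnv1a(c, offsetof(struct training_cache, checksum));
}

/* Decide whether stored values may be reused instead of retraining. */
static bool training_cache_valid(const struct training_cache *c,
				 uint32_t current_spd_hash)
{
	if (c->magic != TRAINING_MAGIC)
		return false;
	if (c->spd_hash != current_spd_hash)
		return false;
	return c->checksum ==
	       fnv1a(c, offsetof(struct training_cache, checksum));
}
```

Since training results drift with temperature and voltage, a real implementation would probably also need a policy for discarding old records, which this sketch leaves out.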
||||||
|
|
||||||
Finally, modern memory controllers often include power-saving
|
\subsubsection{A seedless DQS position training algorithm}
|
||||||
features that are fine-tuned during the training process. By
|
|
||||||
reusing validated training data from NVRAM, the system can
|
|
||||||
quickly return to an optimized state with lower power
|
|
||||||
consumption.
|
|
||||||
The inability to restore this data forces the system to
|
|
||||||
operate at a potentially less efficient state until training
|
|
||||||
is complete, leading to higher power consumption during the
|
|
||||||
boot process. \\
|
|
||||||
|
|
||||||
\subsubsection{A seedless DQS position training algorithm}
|
An algorithm to find the best timing for the DQS so that the
|
||||||
|
memory controller can reliably read data from the memory
|
||||||
|
could be designed without relying on any pre-known starting
|
||||||
|
values (seeds). This would allow for better reliability and
|
||||||
|
wider support for different situations. The algorithm
|
||||||
|
could be described as follows. \\
|
||||||
|
|
||||||
An algorithm to find the best timing for the DQS so that the
|
\begin{itemize}
|
||||||
memory controller can reliably read data from the memory
|
\item Prepare Memory Controller:
|
||||||
could be designed without relying on any pre-known starting
|
The memory controller needs to be in a state where it can
|
||||||
values (seeds). This would allow for better reliability and
|
safely adjust the DQS timing without affecting the normal
|
||||||
wider support for different situations. The algorithm
|
operation of the system. By blocking the DQS signal locking,
|
||||||
could be described as follows. \\
|
we ensure that the adjustments made during training do not
|
||||||
|
interfere with the controller’s ability to capture data
|
||||||
|
until the optimal settings are found.
|
||||||
|
|
||||||
\begin{itemize}
|
\item Initialize Variables:
|
||||||
\item Prepare Memory Controller:
|
Set up variables to store the various timing settings and
|
||||||
The memory controller needs to be in a state where it can
|
test results for each bytelane. This setup is crucial
|
||||||
safely adjust the DQS timing without affecting the normal
|
because each bytelane might require a different optimal
|
||||||
operation of the system. By blocking the DQS signal locking,
|
timing, and keeping track of these values ensures that the
|
||||||
we ensure that the adjustments made during training do not
|
algorithm can correctly determine the best delay settings
|
||||||
interfere with the controller’s ability to capture data
|
later.
|
||||||
until the optimal settings are found.
|
\end{itemize}
|
||||||
|
|
||||||
\item Initialize Variables:
|
The main loop is the core of the algorithm, where different
|
||||||
Set up variables to store the various timing settings and
|
timing settings are systematically explored. By looping
|
||||||
test results for each bytelane. This setup is crucial
|
through possible delay settings, the algorithm ensures
|
||||||
because each bytelane might require a different optimal
|
that it doesn't miss any potential optimal timings. The
|
||||||
timing, and keeping track of these values ensures that the
|
loop structure allows a methodical test of a range of
|
||||||
algorithm can correctly determine the best delay settings
|
delays to find the most reliable one. \\
|
||||||
later.
|
|
||||||
\end{itemize}
|
|
||||||
|
|
||||||
The main loop is the core of the algorithm, where different
|
Here, the gross delay is the coarse adjustment to the timing
|
||||||
timing settings are systematically explored. By looping
|
of the DQS signal. It shifts the timing window by a large
|
||||||
through possible delay settings, the algorithm ensures
|
amount, helping to broadly align the DQS with the data
|
||||||
that it doesn't miss any potential optimal timings. The
|
lines (DQ). The fine delay, which is the smaller, more
|
||||||
loop structure allows a methodical test of a range of
|
precise change to the timing of the DQS signal once the
|
||||||
delays to find the most reliable one. \\
|
coarse alignment (through gross delay) has been achieved,
|
||||||
|
would then be computed. \\
|
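The exhaustive sweep over gross and fine delays can be sketched as below. The sketch treats the delay as a single value that is split into a gross and a fine part, tests every value, and keeps the longest run of passing settings. Choosing the middle of that run as the final delay is a common convention assumed here, not something the text above prescribes; the step count, the callback type and all names are likewise hypothetical.

```c
#include <stdbool.h>

#define FINE_STEPS_PER_GROSS 32u /* assumed resolution, hardware-specific */

/* Hypothetical callback: program `delay` on `lane`, run a read test and
 * report whether the data came back intact. */
typedef bool (*lane_test_fn)(unsigned int lane, unsigned int delay, void *ctx);

/*
 * Test every delay in [0, max_delay) without any seed value and return
 * the middle of the longest run of passing delays, or -1 if none pass.
 */
static int find_best_delay(unsigned int lane, unsigned int max_delay,
			   lane_test_fn test, void *ctx)
{
	int best_start = -1, best_len = 0;
	int run_start = -1, run_len = 0;

	for (unsigned int d = 0; d < max_delay; d++) {
		if (test(lane, d, ctx)) {
			if (run_len == 0)
				run_start = (int)d;
			run_len++;
			if (run_len > best_len) {
				best_len = run_len;
				best_start = run_start;
			}
		} else {
			run_len = 0;
		}
	}
	if (best_len == 0)
		return -1;
	return best_start + (best_len - 1) / 2;
}

/* Split a combined delay into its coarse and precise components. */
static unsigned int gross_part(unsigned int delay)
{
	return delay / FINE_STEPS_PER_GROSS;
}

static unsigned int fine_part(unsigned int delay)
{
	return delay % FINE_STEPS_PER_GROSS;
}

/* Simulated lane: reads succeed only inside [first_pass, last_pass]. */
struct sim_window {
	unsigned int first_pass;
	unsigned int last_pass;
};

static bool sim_lane_test(unsigned int lane, unsigned int delay, void *ctx)
{
	const struct sim_window *w = ctx;

	(void)lane; /* the simulation behaves the same on every lane */
	return delay >= w->first_pass && delay <= w->last_pass;
}
```

Because each bytelane may need a different delay, the search would be repeated per lane, with the results stored in the per-lane variables set up during initialization.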
||||||
|
|
||||||
Here, the gross delay is the coarse adjustment to the timing
|
To compute a delay, the steps would be as follows:
|
||||||
of the DQS signal. It shifts the timing window by a large
|
|
||||||
amount, helping to broadly align the DQS with the data
|
|
||||||
lines (DQ). The fine delay, which is the smaller, more
|
|
||||||
precise change to the timing of the DQS signal once the
|
|
||||||
coarse alignment (through gross delay) has been achieved,
|
|
||||||
would then be computed. \\
|
|
||||||
|
|
||||||
To compute a delay, the steps are as follows:
|
\begin{itemize}
|
||||||
|
\item Set a delay:
|
||||||
|
Setting an initial delay allows the algorithm to start
|
||||||
|
testing. The initial delay might be zero or another default
|
||||||
|
value, providing a baseline from which to begin the search
|
||||||
|
for the optimal timing.
|
||||||
|
|
||||||
\begin{itemize}
|
\item Test it:
|
||||||
\item Set a delay:
|
After setting the delay, it is essential to test whether the
|
||||||
Setting an initial delay allows the algorithm to start
|
memory controller can read data correctly. This step is
|
||||||
testing. The initial delay might be zero or another default
|
critical because it indicates whether the current delay
|
||||||
value, providing a baseline from which to begin the search
|
setting is within the acceptable range for reliable data
|
||||||
for the optimal timing.
|
capture.
|
||||||
|
|
||||||
\item Test it:
|
\item Check the result:
|
||||||
After setting the delay, it is essential to test whether the
|
If the memory controller successfully reads data, it means
|
||||||
memory controller can read data correctly. This step is
|
the current delay setting is valid. This information is
|
||||||
critical because it indicates whether the current delay
|
crucial because it helps define the range of acceptable
|
||||||
setting is within the acceptable range for reliable data
|
timings. If the test fails, it indicates that the current
|
||||||
capture.
|
delay setting is outside the range where the memory
|
||||||
|
controller can reliably capture data.
|
||||||
|
|
||||||
\item Check the result:
|
\item Increase/decrease delay:
|
||||||
If the memory controller successfully reads data, it means
|
By incrementally adjusting the delay, either increasing or
|
||||||
the current delay setting is valid. This information is
|
decreasing, the algorithm can explore different timing
|
||||||
crucial because it helps define the range of acceptable
|
settings in a controlled manner. This ensures that the
|
||||||
timings. If the test fails, it indicates that the current
|
entire range of possible delays is covered without skipping
|
||||||
delay setting is outside the range where the memory
|
over any potential good delays.
|
||||||
controller can reliably capture data.
|
|
||||||
|
|
||||||
\item Increase/decrease delay:
|
\item Test again:
|
||||||
By incrementally adjusting the delay, either increasing or
|
Re-testing after each adjustment ensures that the exact
|
||||||
decreasing, the algorithm can explore different timing
|
point where the DQS timing goes from acceptable (pass) to
|
||||||
settings in a controlled manner. This ensures that the
|
unacceptable (fail) is caught. This step helps in
|
||||||
entire range of possible delays is covered without skipping
|
identifying the transition point, which is often the optimal
|
||||||
over any potential good delays.
|
place to set the DQS delay.
|
||||||
|
|
||||||
\item Test again:
|
\item Look for a transition:
|
||||||
Re-testing after each adjustment ensures that the exact
|
The transition from pass to fail is where the DQS timing
|
||||||
point where the DQS timing goes from acceptable (pass) to
|
crosses the boundary of the valid timing window. This
|
||||||
unacceptable (fail) is caught. This step helps in
|
transition is crucial because it marks the end of the
|
||||||
identifying the transition point, which is often the optimal
|
reliable range. The best timing is usually just before
|
||||||
place to set the DQS delay.
|
this transition.
|
||||||
|
|
||||||
\item Look for a transition:
|
\item Record the best setting:
|
||||||
The transition from pass to fail is where the DQS timing
|
Storing the best delay setting for each bytelane ensures
|
||||||
crosses the boundary of the valid timing window. This
|
that a reliable timing configuration is available when the
|
||||||
transition is crucial because it marks the end of the
|
training is complete.
|
||||||
reliable range. The best timing is usually just before
|
|
||||||
this transition.
|
|
||||||
|
|
||||||
\item Record the best setting:
|
\item Confirm all bytelanes:
|
||||||
Storing the best delay setting for each bytelane ensures
|
Before finalizing the settings, it is important to ensure
|
||||||
that a reliable timing configuration is available when the
|
that the chosen delays work for all bytelanes. This step
|
||||||
training is complete.
|
serves as a final safeguard against errors, ensuring that
|
||||||
|
every part of the data bus is correctly aligned.
|
||||||
|
\end{itemize}
|
||||||
|
|
||||||
\item Confirm all bytelanes:
|
Each bytelane (8-bit segment of data) may require a
|
||||||
Before finalizing the settings, it is important to ensure
|
different optimal delay setting. By repeating the process
|
||||||
that the chosen delays work for all bytelanes. This step
|
for all bytelanes, the algorithm ensures that the entire
|
||||||
serves as a final safeguard against errors, ensuring that
|
data bus is correctly timed. Misalignment in even one
|
||||||
every part of the data bus is correctly aligned.
|
bytelane can lead to data errors, making it essential to
|
||||||
\end{itemize}
|
tune every bytelane individually. \\
|
||||||
|
|
||||||
Each bytelane (8-bit segment of data) may require a
|
Once the best settings are confirmed, they need to be
|
||||||
different optimal delay setting. By repeating the process
|
applied to the memory controller for use during normal
|
||||||
for all bytelanes, the algorithm ensures that the entire
|
operation. This step locks in the most reliable timing
|
||||||
data bus is correctly timed. Misalignment in even one
|
configuration found during the training process. \\
|
||||||
bytelane can lead to data errors, making it essential to
|
|
||||||
tune every bytelane individually. \\
|
|
||||||
|
|
||||||
Once the best settings are confirmed, they need to be
|
After the optimal settings are applied, it is necessary
|
||||||
applied to the memory controller for use during normal
|
to allow the DQS signal locking mechanism to resume. This
|
||||||
operation. This step locks in the most reliable timing
|
locks in the delay settings, ensuring stable operation going
|
||||||
configuration found during the training process. \\
|
forward. \\
|
||||||
|
|
||||||
After the optimal settings are applied, it is necessary
|
Finally, the algorithm needs to indicate whether it was
|
||||||
to allow the DQS signal locking mechanism to resume. This
|
successful in finding reliable timing settings for all
|
||||||
locks in the delay settings, ensuring stable operation going
|
bytelanes. This feedback is crucial for determining whether
|
||||||
forward. \\
|
the memory system is correctly configured or if further
|
||||||
|
adjustments or troubleshooting are needed. \\
|
||||||
Finally, the algorithm needs to indicate whether it was
|
|
||||||
successful in finding reliable timing settings for all
|
|
||||||
bytelanes. This feedback is crucial for determining whether
|
|
||||||
the memory system is correctly configured or if further
|
|
||||||
adjustments or troubleshooting are needed. \\
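
The steps above (set a delay, test it, adjust, look for the pass-to-fail transition, record the result per bytelane, report success) can be sketched in C as follows. This is only an illustrative sketch, not the coreboot implementation: \texttt{NUM\_BYTELANES}, \texttt{MAX\_DELAY}, \texttt{lane\_test\_fn}, \texttt{train\_lane}, \texttt{train\_all\_lanes} and \texttt{fake\_read\_test} are hypothetical names, and the hardware read test is replaced by a simulated function. Following the text, the recorded setting is the last passing delay just before the transition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_BYTELANES 8   /* assumption: 64-bit data bus, ECC lane ignored */
#define MAX_DELAY     32  /* assumption: number of delay steps to sweep */

/* Hypothetical stand-in for "program the delay, then read a test pattern". */
typedef bool (*lane_test_fn)(unsigned lane, unsigned delay);

struct lane_result {
	bool found;           /* true if at least one delay passed */
	unsigned first_pass;  /* first delay that read correctly */
	unsigned last_pass;   /* last passing delay before the transition */
	unsigned best;        /* setting recorded for this bytelane */
};

/* Sweep one bytelane upward from delay 0 until pass turns into fail. */
static struct lane_result train_lane(unsigned lane, lane_test_fn test)
{
	struct lane_result r = { false, 0, 0, 0 };

	assert(test != NULL);
	for (unsigned delay = 0; delay < MAX_DELAY; delay++) {
		bool pass = test(lane, delay);

		if (pass && !r.found) {
			r.found = true;
			r.first_pass = delay;
			r.last_pass = delay;
		} else if (pass) {
			r.last_pass = delay;
		} else if (r.found) {
			break; /* pass-to-fail transition reached */
		}
	}
	if (r.found)
		r.best = r.last_pass; /* just before the transition */
	return r;
}

/* Repeat for every bytelane; returns true only if all lanes were trained. */
static bool train_all_lanes(lane_test_fn test, unsigned best[NUM_BYTELANES])
{
	bool ok = true;

	for (unsigned lane = 0; lane < NUM_BYTELANES; lane++) {
		struct lane_result r = train_lane(lane, test);

		if (r.found)
			best[lane] = r.best;
		else
			ok = false; /* caller must troubleshoot this lane */
	}
	return ok;
}

/* Simulated hardware: lane n reads correctly for delays in [4+n, 12+n]. */
static bool fake_read_test(unsigned lane, unsigned delay)
{
	return delay >= 4 + lane && delay <= 12 + lane;
}
```

Calling \texttt{train\_all\_lanes(fake\_read\_test, best)} fills \texttt{best} with one delay per bytelane and returns the overall success flag. On real hardware, the simulated test would be replaced by a routine that programs the delay registers and compares a read-back pattern.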
|
|
||||||
|
|
||||||
% ------------------------------------------------------------------------------
|
% ------------------------------------------------------------------------------
|
||||||
% CHAPTER 5: Virtualization of the operating system through firmware abstraction
|
% CHAPTER 5: Virtualization of the operating system through firmware abstraction
|
||||||
|
@ -4219,7 +4217,8 @@ if (best_count > 2) {
|
||||||
|
|
||||||
\chapter*{Appendix: Long code listings}
|
\chapter*{Appendix: Long code listings}
|
||||||
\addcontentsline{toc}{chapter}{Appendix: Long code listings}
|
\addcontentsline{toc}{chapter}{Appendix: Long code listings}
|
||||||
\renewcommand{\thelisting}{\arabic{listing}}
|
\renewcommand{\thelisting}{L.\arabic{listing}}
|
||||||
|
\setcounter{listing}{0}
|
||||||
|
|
||||||
\begin{listing}
|
\begin{listing}
|
||||||
\begin{adjustwidth}{0.5cm}{0.5cm}
|
\begin{adjustwidth}{0.5cm}{0.5cm}
|
||||||
|
|
|
@ -28,28 +28,25 @@
|
||||||
\contentsline {section}{\numberline {3.2}AMD Platform Security Processor and Intel Management Engine}{28}{section.3.2}%
|
\contentsline {section}{\numberline {3.2}AMD Platform Security Processor and Intel Management Engine}{28}{section.3.2}%
|
||||||
\contentsline {chapter}{\numberline {4}Memory initialization and training}{30}{chapter.4}%
|
\contentsline {chapter}{\numberline {4}Memory initialization and training}{30}{chapter.4}%
|
||||||
\contentsline {section}{\numberline {4.1}Importance of DDR3 memory initialization}{30}{section.4.1}%
|
\contentsline {section}{\numberline {4.1}Importance of DDR3 memory initialization}{30}{section.4.1}%
|
||||||
\contentsline {subsection}{\numberline {4.1.1}General steps for DDR3 configuration}{31}{subsection.4.1.1}%
|
\contentsline {section}{\numberline {4.2}General steps for DDR3 configuration}{31}{section.4.2}%
|
||||||
\contentsline {section}{\numberline {4.2}Memory initialization techniques}{34}{section.4.2}%
|
\contentsline {section}{\numberline {4.3}Memory initialization techniques}{34}{section.4.3}%
|
||||||
\contentsline {subsection}{\numberline {4.2.1}Memory training algorithms}{34}{subsection.4.2.1}%
|
\contentsline {subsection}{\numberline {4.3.1}Memory training algorithms}{34}{subsection.4.3.1}%
|
||||||
\contentsline {subsection}{\numberline {4.2.2}BIOS and Kernel Developer Guide (BKDG) recommendations}{35}{subsection.4.2.2}%
|
\contentsline {subsection}{\numberline {4.3.2}BIOS and Kernel Developer Guide (BKDG) recommendations}{35}{subsection.4.3.2}%
|
||||||
\contentsline {subsubsection}{\numberline {4.2.2.1}DDR3 initialization procedure}{36}{subsubsection.4.2.2.1}%
|
\contentsline {subsubsection}{\numberline {4.3.2.1}DDR3 initialization procedure}{36}{subsubsection.4.3.2.1}%
|
||||||
\contentsline {subsubsection}{\numberline {4.2.2.2}ZQ calibration process}{36}{subsubsection.4.2.2.2}%
|
\contentsline {subsubsection}{\numberline {4.3.2.2}ZQ calibration process}{36}{subsubsection.4.3.2.2}%
|
||||||
\contentsline {subsubsection}{\numberline {4.2.2.3}Write leveling process}{37}{subsubsection.4.2.2.3}%
|
\contentsline {subsubsection}{\numberline {4.3.2.3}Write leveling process}{37}{subsubsection.4.3.2.3}%
|
||||||
\contentsline {section}{\numberline {4.3}Current implementation and potential improvements}{39}{section.4.3}%
|
\contentsline {section}{\numberline {4.4}Current implementation and potential improvements}{39}{section.4.4}%
|
||||||
\contentsline {subsection}{\numberline {4.3.1}Current implementation in coreboot on the KGPE-D16}{39}{subsection.4.3.1}%
|
\contentsline {subsection}{\numberline {4.4.1}Current implementation in coreboot on the KGPE-D16}{39}{subsection.4.4.1}%
|
||||||
\contentsline {subsubsection}{\numberline {4.3.1.1}Details on the DQS training function}{41}{subsubsection.4.3.1.1}%
|
\contentsline {subsubsection}{\numberline {4.4.1.1}Details on the DQS training function}{41}{subsubsection.4.4.1.1}%
|
||||||
\contentsline {subsubsection}{\numberline {4.3.1.2}Details on the write leveling implementation}{43}{subsubsection.4.3.1.2}%
|
\contentsline {subsubsection}{\numberline {4.4.1.2}Details on the write leveling implementation}{43}{subsubsection.4.4.1.2}%
|
||||||
\contentsline {subsubsection}{\numberline {4.3.1.3}Details on the write leveling implementation}{44}{subsubsection.4.3.1.3}%
|
\contentsline {subsubsection}{\numberline {4.4.1.3}Details on the DQS position training function}{45}{subsubsection.4.4.1.3}%
|
||||||
\contentsline {subsection}{\numberline {4.3.2}Write Leveling on AMD Fam15h G34 Processors with RDIMMs}{44}{subsection.4.3.2}%
|
\contentsline {subsubsection}{\numberline {4.4.1.4}Details on the DQS receiver training function}{47}{subsubsection.4.4.1.4}%
|
||||||
\contentsline {subsubsection}{\numberline {4.3.2.1}Details on the DQS position training function}{45}{subsubsection.4.3.2.1}%
|
\contentsline {subsection}{\numberline {4.4.2}Potential enhancements}{50}{subsection.4.4.2}%
|
||||||
\contentsline {subsubsection}{\numberline {4.3.2.2}Details on the DQS receiver training function}{48}{subsubsection.4.3.2.2}%
|
\contentsline {subsubsection}{\numberline {4.4.2.1}DQS receiver training}{50}{subsubsection.4.4.2.1}%
|
||||||
\contentsline {subsection}{\numberline {4.3.3}Potential enhancements}{50}{subsection.4.3.3}%
|
\contentsline {subsubsection}{\numberline {4.4.2.2}Write leveling}{52}{subsubsection.4.4.2.2}%
|
||||||
\contentsline {subsubsection}{\numberline {4.3.3.1}DQS receiver training}{50}{subsubsection.4.3.3.1}%
|
\contentsline {subsubsection}{\numberline {4.4.2.3}DQS position training}{54}{subsubsection.4.4.2.3}%
|
||||||
\contentsline {subsubsection}{\numberline {4.3.3.2}Write leveling}{52}{subsubsection.4.3.3.2}%
|
\contentsline {subsubsection}{\numberline {4.4.2.4}On saving training values in NVRAM}{56}{subsubsection.4.4.2.4}%
|
||||||
\contentsline {subsection}{\numberline {4.3.4}DQS position training}{54}{subsection.4.3.4}%
|
\contentsline {subsubsection}{\numberline {4.4.2.5}A seedless DQS position training algorithm}{57}{subsubsection.4.4.2.5}%
|
||||||
\contentsline {subsection}{\numberline {4.3.5}On a wider scale...}{56}{subsection.4.3.5}%
|
|
||||||
\contentsline {subsubsection}{\numberline {4.3.5.1}Saving training values in NVRAM}{56}{subsubsection.4.3.5.1}%
|
|
||||||
\contentsline {subsubsection}{\numberline {4.3.5.2}A seedless DQS position training algorithm}{57}{subsubsection.4.3.5.2}%
|
|
||||||
\contentsline {chapter}{\numberline {5}Virtualization of the operating system through firmware abstraction}{59}{chapter.5}%
|
\contentsline {chapter}{\numberline {5}Virtualization of the operating system through firmware abstraction}{59}{chapter.5}%
|
||||||
\contentsline {section}{\numberline {5.1}ACPI and abstraction of hardware control}{59}{section.5.1}%
|
\contentsline {section}{\numberline {5.1}ACPI and abstraction of hardware control}{59}{section.5.1}%
|
||||||
\contentsline {section}{\numberline {5.2}SMM as a hidden execution layer}{60}{section.5.2}%
|
\contentsline {section}{\numberline {5.2}SMM as a hidden execution layer}{60}{section.5.2}%
|
||||||
|
|
22
packages.tex
|
@ -33,18 +33,29 @@
|
||||||
\usepackage[a4paper, portrait, margin=1.45cm]{geometry}
|
\usepackage[a4paper, portrait, margin=1.45cm]{geometry}
|
||||||
|
|
||||||
% Set parameters
|
% Set parameters
|
||||||
|
|
||||||
|
% No warnings
|
||||||
\WarningsOff
|
\WarningsOff
|
||||||
|
|
||||||
|
% No message for text justification
|
||||||
|
\hbadness=10000
|
||||||
|
|
||||||
|
% Start at page 0
|
||||||
\setcounter{page}{0}
|
\setcounter{page}{0}
|
||||||
|
|
||||||
|
% Link every toc element
|
||||||
\hypersetup{linktoc=all}
|
\hypersetup{linktoc=all}
|
||||||
|
|
||||||
|
% Enhance footnotes
|
||||||
\addtolength{\skip\footins}{0.6pc}
|
\addtolength{\skip\footins}{0.6pc}
|
||||||
\renewcommand*\footnoterule{} %Footnote separator line
|
\renewcommand*\footnoterule{} %Footnote separator line
|
||||||
|
|
||||||
%\def\siecle#1{\textsc{\romannumeral #1}\textsuperscript{e}~siècle}
|
%\def\siecle#1{\textsc{\romannumeral #1}\textsuperscript{e}~siècle}
|
||||||
|
|
||||||
|
% Place dots on sections in toc
|
||||||
\renewcommand{\cftsecleader}{\cftdotfill{\cftdotsep}} %places dots on sections lines as well
|
\renewcommand{\cftsecleader}{\cftdotfill{\cftdotsep}} %places dots on sections lines as well
|
||||||
|
|
||||||
|
% Tweak auto-tabulations
|
||||||
\cftsetindents{section}{0pt}{4em}
|
\cftsetindents{section}{0pt}{4em}
|
||||||
\cftsetindents{subsection}{10pt}{4em}
|
\cftsetindents{subsection}{10pt}{4em}
|
||||||
\cftsetindents{subsubsection}{20pt}{4em}
|
\cftsetindents{subsubsection}{20pt}{4em}
|
||||||
|
@ -52,12 +63,15 @@
|
||||||
\cftsetindents{subparagraph}{40pt}{4em}
|
\cftsetindents{subparagraph}{40pt}{4em}
|
||||||
\def\cftdotsep{1}
|
\def\cftdotsep{1}
|
||||||
\cftsetpnumwidth{1em}
|
\cftsetpnumwidth{1em}
|
||||||
|
|
||||||
\renewcommand{\cftchapafterpnum}{\vspace{\cftbeforechapskip}}
|
|
||||||
\renewcommand{\familydefault}{\sfdefault}
|
|
||||||
|
|
||||||
\setlength\parindent{0pt}
|
\setlength\parindent{0pt}
|
||||||
|
|
||||||
|
% Defaults space before chapters
|
||||||
|
\renewcommand{\cftchapafterpnum}{\vspace{\cftbeforechapskip}}
|
||||||
|
|
||||||
|
% Sans-serif, of course
|
||||||
|
\renewcommand{\familydefault}{\sfdefault}
|
||||||
|
|
||||||
|
% Configure listings
|
||||||
\usemintedstyle{solarized-light}
|
\usemintedstyle{solarized-light}
|
||||||
\definecolor{bg}{HTML}{FAF9F6}
|
\definecolor{bg}{HTML}{FAF9F6}
|
||||||
\definecolor{linenumcolor}{rgb}{0.6, 0.6, 0.6} % Light gray color
|
\definecolor{linenumcolor}{rgb}{0.6, 0.6, 0.6} % Light gray color
|
||||||
|
|