Misc enhancements
parent
b23764ea21
commit
207fa28d1a
Binary file not shown.
|
@ -1542,186 +1542,186 @@
is essential for achieving the desired balance of performance,
reliability, and stability in demanding server environments.

\subsection{General steps for DDR3 configuration}
|
||||
\section{General steps for DDR3 configuration}
|
||||
|
||||
DDR3 memory initialization is a detailed and essential
process that ensures both the stability and performance of the
system. The process involves several critical steps: detection
and identification of memory modules, initial configuration of the
memory controller, adjustment of timing and voltage settings, and
the execution of training and calibration procedures. \\

The initialization begins with the detection and identification of
the installed memory modules. During the BIST, the firmware reads
the Serial Presence Detect (SPD) data stored on
each memory module. SPD data contains crucial information about
the memory module's specifications, including size, speed, CAS
latency (CL), RAS to CAS delay (tRCD), row precharge time (tRP),
and row cycle time (tRC). This data allows the firmware to configure
the memory controller for optimal compatibility and performance. \\

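As a rough illustration of how SPD timing fields can become controller
settings, the sketch below converts minimum timings into whole clock
cycles by rounding up. It assumes the timings are expressed in units of
the SPD medium timebase (MTB), taken here to be 0.125~ns. The types
\texttt{spd\_timings} and \texttt{dram\_clocks} and both function names
are hypothetical and are not taken from coreboot.

```c
#include <stdint.h>

/* Hypothetical container for timings decoded from SPD, in MTB units.
 * Assumption: 1 MTB = 0.125 ns. */
struct spd_timings {
	uint16_t tck_min;  /* minimum clock period */
	uint16_t taa_min;  /* minimum CAS latency time */
	uint16_t trcd_min; /* minimum RAS to CAS delay */
	uint16_t trp_min;  /* minimum row precharge time */
	uint16_t trc_min;  /* minimum row cycle time */
};

/* The same timings expressed in clock cycles. */
struct dram_clocks {
	uint16_t cl;
	uint16_t trcd;
	uint16_t trp;
	uint16_t trc;
};

/* Round a duration up to a whole number of clock periods.
 * tck_mtb must be non-zero. */
static uint16_t mtb_to_clocks(uint16_t t_mtb, uint16_t tck_mtb)
{
	return (uint16_t)((t_mtb + tck_mtb - 1) / tck_mtb);
}

/* Derive cycle counts for the four timings named in the text. */
static struct dram_clocks timings_to_clocks(const struct spd_timings *spd)
{
	struct dram_clocks c;

	c.cl = mtb_to_clocks(spd->taa_min, spd->tck_min);
	c.trcd = mtb_to_clocks(spd->trcd_min, spd->tck_min);
	c.trp = mtb_to_clocks(spd->trp_min, spd->tck_min);
	c.trc = mtb_to_clocks(spd->trc_min, spd->tck_min);
	return c;
}
```

With a 1.25~ns clock period (10 MTB) and a 13.75~ns minimum CAS latency
time (110 MTB), this yields a CAS latency of 11 cycles. Rounding up is
the conservative choice, since rounding down would violate the minimum
timing.
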
Once the memory modules have been identified, the firmware
proceeds to the initial configuration of the memory controller.
This controller is governed by a state machine that
manages the sequence of operations required to initialize,
maintain, and control memory access. This state machine consists of
multiple states that represent various phases of memory operation,
such as reset, initialization, calibration, and data transfer.
The transitions between these states are either automatic or
command-driven, depending on the specific requirements of each
phase \cite{samsung_ddr3}\cite{micron_ddr3}.
This state machine is presented in
Fig.~\ref{fig:ddr3_state_machine}. Automatic transitions, depicted
by thick arrows in the automaton, occur without external
intervention. These typically include transitions that ensure
the memory enters a stable state, such as the transition from
power-on to initialization, or from calibration to idle states.
These transitions are crucial for maintaining the integrity and
stability of the memory system, as they ensure that the controller
progresses through necessary stages like ZQ calibration and write
leveling, which are essential for proper signal timing and
impedance matching
\cite{samsung_ddr3}\cite{micron_ddr3}\cite{burnett_ddr3}. \\

On the other hand, command-driven transitions, represented by normal
arrows in the automaton, require specific commands issued by the
memory controller or the CPU to advance to the next state. For
instance, the transition from the idle state to the data transfer
state requires explicit read or write commands. Similarly,
transitioning from the initialization state to the calibration
state involves issuing mode register set (MRS) commands that
configure the memory's operating parameters. These command-driven
transitions are integral to the dynamic operation of the memory
system, allowing the controller to respond to the system's
operational needs and ensuring that memory accesses are performed
efficiently and accurately \cite{samsung_ddr3}\cite{micron_ddr3}. \\

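The split between automatic and command-driven transitions can be
sketched as two small functions over an enumeration of states. This is
a simplified model for illustration only: the state and command names
are hypothetical, only a subset of the transitions is covered, and the
code is not taken from any controller implementation.

```c
/* States of the simplified model (names are illustrative). */
enum ddr3_state {
	ST_RESET, ST_INIT, ST_ZQ_CAL, ST_IDLE, ST_WRITE_LEVELING,
	ST_REFRESH, ST_ACTIVATION, ST_BANK_ACTIVE, ST_READ_OP,
	ST_WRITE_OP, ST_PRECHARGING
};

/* Commands that drive the non-automatic transitions (assumed names). */
enum ddr3_cmd {
	CMD_ZQ, CMD_REF, CMD_ACT, CMD_READ, CMD_WRITE, CMD_PRE, CMD_WRLVL
};

/* Command-driven transitions: an unexpected command leaves the
 * state unchanged. */
static enum ddr3_state ddr3_on_command(enum ddr3_state s, enum ddr3_cmd c)
{
	switch (s) {
	case ST_IDLE:
		switch (c) {
		case CMD_REF:   return ST_REFRESH;
		case CMD_ACT:   return ST_ACTIVATION;
		case CMD_ZQ:    return ST_ZQ_CAL;
		case CMD_WRLVL: return ST_WRITE_LEVELING;
		default:        return s;
		}
	case ST_BANK_ACTIVE:
	case ST_READ_OP:
	case ST_WRITE_OP:
		switch (c) {
		case CMD_READ:  return ST_READ_OP;
		case CMD_WRITE: return ST_WRITE_OP;
		case CMD_PRE:   return ST_PRECHARGING;
		default:        return s;
		}
	default:
		return s;
	}
}

/* Automatic transitions, taken once the work of a state is done.
 * Simplification: RESET and ACTIVATION are folded in here so the
 * model can progress, and READ_OP/WRITE_OP return to BANK_ACTIVE
 * (the auto-precharge variant is omitted). */
static enum ddr3_state ddr3_on_done(enum ddr3_state s)
{
	switch (s) {
	case ST_RESET:      return ST_INIT;
	case ST_INIT:       return ST_ZQ_CAL;
	case ST_ACTIVATION: return ST_BANK_ACTIVE;
	case ST_READ_OP:
	case ST_WRITE_OP:   return ST_BANK_ACTIVE;
	case ST_ZQ_CAL:
	case ST_WRITE_LEVELING:
	case ST_REFRESH:
	case ST_PRECHARGING:
		return ST_IDLE;
	default:
		return s;
	}
}
```

Keeping the two kinds of transition in separate functions mirrors the
distinction made in the text: \texttt{ddr3\_on\_done} never needs a
command, while \texttt{ddr3\_on\_command} never fires on its own.
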
The memory controller configuration
involves setting up fundamental parameters such as the memory clock
(MEMCLK) frequency and the memory channel configuration. The MEMCLK
frequency is derived from the SPD data, while the memory channels
are configured to operate in single, dual, or quad-channel modes,
depending on the system architecture and the installed modules
\cite{burnett_ddr3}. Proper configuration of the memory controller
is vital to ensure synchronization with the memory modules,
establishing a stable foundation for subsequent operations. \\

The first critical step during the INIT phase involves the
adjustment of timing and voltage settings. These settings are
essential for ensuring that DDR3 memory operates efficiently and
reliably. Key timing parameters include CAS Latency (CL), RAS to
CAS Delay (tRCD), Row Precharge Time (tRP), and Row Cycle Time (tRC).
These parameters are finely tuned to balance speed and stability
\cite{samsung_ddr3}. The BIOS uses the SPD data to set these
parameters and may also adjust them dynamically to achieve the
best possible performance. Voltage settings, such as DRAM voltage
(typically 1.5~V for DDR3) and termination voltage (VTT), are also
configured to maintain stable operation, especially under varying
conditions such as temperature fluctuations \cite{micron_ddr3}. \\

Training and calibration are among the most complex and crucial
stages of DDR3 memory initialization. The fly-by topology used
for address, command, and clock signals in DDR3 modules enhances
signal integrity by reducing the number of stubs and their lengths,
but it also introduces skew between the clock (CK) and data strobe
(DQS) signals \cite{micron_ddr3}. This skew must be compensated for to
ensure that data is written and read correctly. The BIOS performs
write leveling, which adjusts the timing of DQS relative to CK
for each memory module. This process ensures that the memory
controller can write data accurately across all modules, even
when they exhibit slight variations in signal timing due to the
physical layout \cite{samsung_ddr3}. \\

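A minimal sketch of the write leveling search for a single lane is
given below. It assumes a hypothetical hook, \texttt{wl\_sample\_fn},
that launches DQS with a given delay and returns the CK level reported
back by the DRAM. The delay is increased until that level changes from
0 to 1, which marks the point where DQS is aligned with the rising edge
of CK. The \texttt{fake\_lane} model merely stands in for hardware so
that the sketch is self-contained.

```c
/* Hypothetical hook: apply the DQS delay and return the CK level
 * (0 or 1) sampled by the DRAM. */
typedef int (*wl_sample_fn)(unsigned int delay, void *ctx);

/* Return the first delay at which the sampled CK level goes from
 * 0 to 1, or -1 if no such edge exists up to max_delay. */
static int write_level_lane(wl_sample_fn sample, void *ctx,
			    unsigned int max_delay)
{
	int prev = sample(0, ctx);

	for (unsigned int d = 1; d <= max_delay; d++) {
		int cur = sample(d, ctx);

		if (prev == 0 && cur == 1)
			return (int)d;
		prev = cur;
	}
	return -1;
}

/* Stand-in for hardware: CK is modelled as a square wave whose
 * rising edge sits at delay 'edge', with 'period' steps per cycle. */
struct fake_lane {
	unsigned int edge;
	unsigned int period;
};

static int fake_sample(unsigned int delay, void *ctx)
{
	const struct fake_lane *l = ctx;
	unsigned int phase =
		(delay + l->period - l->edge % l->period) % l->period;

	/* CK is high during the first half of each period. */
	return phase < l->period / 2;
}
```

Searching for the 0 to 1 transition, rather than for the first sample
that reads 1, avoids mistaking a sweep that starts while CK is already
high for an aligned strobe.
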
\begin{figure}[H]
  \centering
  \begin{tikzpicture}[scale=0.6,
      transform shape,
      shorten >=1pt,
      node distance=5cm and 5cm,
      on grid,
      auto]
    % States
    \node[state, initial] (reset) {RESET};
    \node[draw=none,fill=none] (any) [below=2cm of reset] {ANY};
    \node[state] (init) [right=of reset] {INIT};
    \node[state] (zqcal) [below=of init] {ZQ Calibration};
    \node[state, accepting] (idle) [right=of init] {IDLE};
    \node[state] (writelevel) [above=of idle] {WRITE LEVELING};
    \node[state] (refresh) [right=of idle] {REFRESH};
    \node[state] (activation) [below=of idle] {ACTIVATION};
    \node[state] (bankactive) [below=of activation] {BANK ACTIVE};
    \node[state] (readop) [below right=of bankactive] {READ OP};
    \node[state] (writeop) [below left=of bankactive] {WRITE OP};
    \node[state] (prechrg) [below right=of readop] {PRE-CHARGING};
    % Transitions
    \path[->, line width=0.2mm, >=stealth]
      (reset) edge node {} (init)
      (idle) edge [bend left=20] node {} (writelevel)
        edge [bend left=20] node {REF} (refresh)
        edge node {} (activation)
        edge [bend left=10] node {ZQCL/S} (zqcal)
      (activation) edge node {} (bankactive)
      (bankactive) edge [bend left=30] node {PRE} (prechrg)
        edge [bend left=20] node {write} (writeop)
        edge [bend right=20] node {read} (readop)
      (writeop) edge [loop left] node {write} (writeop)
        edge [bend left=10] node {read\_a} (readop)
        edge [bend right=15] node {PRE} (prechrg)
      (readop) edge [loop right] node {read} (readop)
        edge [bend left=10] node {write\_a} (writeop)
        edge [bend right=15] node {PRE} (prechrg);
    % Thick transitions
    \path[->, line width=0.5mm, >=stealth]
      (any) edge node {} (reset)
      (init) edge node {ZQCL} (zqcal)
      (zqcal) edge [bend left=10] node {} (idle)
      (writelevel) edge [bend left=20] node {MRS} (idle)
      (refresh) edge [bend left=20] node {} (idle)
      (writeop) edge node {} (prechrg)
        edge [bend left=20] node {} (bankactive)
      (readop) edge [bend left=15] node {} (prechrg)
        edge [bend right=20] node {} (bankactive)
      (prechrg) edge [bend right=20] node {} (idle);
  \end{tikzpicture}
  \caption{DDR3 controller state machine}
  \label{fig:ddr3_state_machine}
\end{figure}

ZQ calibration is another vital procedure that adjusts the
output driver impedance and on-die termination (ODT) to match
the system's characteristic impedance \cite{micron_ddr3}. This
calibration is critical for maintaining signal integrity under
different operating conditions, such as voltage and temperature
changes. During initialization, the memory controller issues a
ZQCL command to the DRAM modules, triggering the calibration
sequence that optimizes impedance settings.
This ensures that the memory system can
operate with tight timing tolerances, which is crucial for
systems requiring high reliability.
Read training is also essential to ensure that data read from
the memory modules is interpreted correctly by the memory
controller. This process involves adjusting the timing of the
read data strobe (DQS) to align perfectly with the data being
received. Proper read training is necessary for reliable data
retrieval, which directly impacts system performance and stability. \\

ZQCS (ZQ Calibration Short), by contrast, is a procedure used
to periodically adjust the DRAM's ODT and output driver impedance
during normal operation. Unlike the full ZQCL (ZQ Calibration Long),
which is performed during initial memory initialization, ZQCS is a
quicker, less comprehensive calibration that fine-tunes the
impedance settings in response to changes in temperature, voltage,
or other environmental factors. This helps maintain optimal signal
integrity and performance throughout the memory's operation without
the need for a full recalibration. \\

In summary, the DDR3 memory initialization process in systems
like the ASUS KGPE-D16 involves a series of detailed and
interdependent steps that are critical for ensuring system
stability and performance. These include the detection and
identification of memory modules, the initial configuration of
the memory controller, precise adjustments of timing and voltage
settings, and rigorous training and calibration procedures.

\section{Memory initialization techniques}
|
||||
|
||||
|
@ -3468,7 +3468,7 @@ uint8_t AddrCmdPrelaunch = 0; /* TODO: Fetch the correct value from RC2[0] */
|
|||
* 0x41 and 0x0 are the "stock" values */
|
||||
\end{minted}
|
||||
\end{adjustwidth}
|
||||
\caption{\texttt{FIXME} indicating the need for
|
||||
\caption{Lack of
|
||||
mainboard-specific seed overrides,
|
||||
extract from
|
||||
\protect\path{procConfig} function in
|
||||
|
@ -3562,7 +3562,7 @@ if (faulty_value_detected) {
|
|||
code. The overcomplicated logic can also make the code more
|
||||
difficult to maintain and extend. \\
|
||||
|
||||
\subsection{DQS position training}
|
||||
\subsubsection{DQS position training}
|
||||
|
||||
While the DQS position training algorithm implemented in the
|
||||
\path{TrainDQSRdWrPos_D_Fam15} function may work in some
|
||||
|
@ -3714,183 +3714,181 @@ if (best_count > 2) {
the time required for DQS position training without compromising
accuracy. \\

\subsection{On a wider scale...}
|
||||
\subsubsection{On saving training values in NVRAM}
|
||||
|
||||
\subsubsection{Saving training values in NVRAM}
|
||||
The function \path{mctAutoInitMCT_D} is responsible for
automatically initializing the memory controller training (MCT)
process, which involves configuring various memory parameters
and performing training routines to ensure stable and efficient
memory operation. However, the fact that
\path{mctAutoInitMCT_D} does not allow for the restoration of
training data from NVRAM (lst. \ref{lst:mctAutoInitMCT_D_fixme})
poses several significant problems. \\

Memory training is a time-consuming process that involves
multiple iterations of read/write operations, delay adjustments,
and calibration steps. By not restoring previously saved
training data from NVRAM, the system is forced to re-run the
full training sequence every time it boots up. This leads to
longer boot times, which can be particularly problematic in
environments where quick system restarts are critical, such
as in servers or embedded systems. \\

Each time memory training is performed, it puts additional
stress on the memory modules and the memory controller.
Repeatedly executing the training process at every boot can
contribute to the wear and tear of hardware components,
potentially reducing their lifespan. This issue is especially
concerning in systems that frequently power cycle or reboot. \\

Memory training is sensitive to various factors, such as
temperature, voltage, and load conditions. As a result, the
training results can vary slightly between different boot
cycles. Without the ability to restore previously validated
training data, there is a risk of inconsistency in memory
performance across reboots. This could lead to instability
or suboptimal memory operation, affecting the overall
performance of the system. \\

If the memory training process fails during boot, the system
may be unable to operate properly or may fail to boot entirely.
By restoring validated training data from NVRAM, the system
can bypass the training process altogether, reducing the risk
of boot failures caused by training issues. Without this
feature, any minor issue that affects training could result
in system downtime. \\

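One possible shape for such a restore path is sketched below. It is
purely illustrative and not taken from coreboot: the
\texttt{saved\_training} layout, the magic value, the checksum, and all
function names are assumptions. The idea is that saved data is reused
only if it is intact and was recorded for the same DIMM population
(identified here by a hash of the SPD contents). Otherwise the firmware
would fall back to full training.

```c
#include <stdint.h>
#include <string.h>

#define TRAINING_MAGIC 0x54524e31u /* arbitrary tag, "TRN1" */
#define NUM_BYTELANES 9            /* assumed: 8 data lanes plus ECC */

/* Hypothetical record stored in NVRAM after a successful training. */
struct saved_training {
	uint32_t magic;
	uint32_t spd_hash;                /* identifies the installed DIMMs */
	uint8_t dqs_delay[NUM_BYTELANES]; /* trained per-lane delays */
	uint32_t checksum;                /* covers the fields above */
};

/* Field-wise checksum, so that struct padding never contributes. */
static uint32_t training_checksum(const struct saved_training *t)
{
	uint32_t sum = t->magic ^ t->spd_hash;

	for (int i = 0; i < NUM_BYTELANES; i++)
		sum = sum * 31u + t->dqs_delay[i];
	return sum;
}

/* Finalize a record before it is written to NVRAM. */
static void training_seal(struct saved_training *t, uint32_t spd_hash)
{
	t->magic = TRAINING_MAGIC;
	t->spd_hash = spd_hash;
	t->checksum = training_checksum(t);
}

/* Return 1 if the record can be restored, 0 if full training is needed. */
static int training_is_reusable(const struct saved_training *t,
				uint32_t current_spd_hash)
{
	return t->magic == TRAINING_MAGIC &&
	       t->spd_hash == current_spd_hash &&
	       t->checksum == training_checksum(t);
}
```

Tying the record to the SPD hash matters because values trained for
one set of modules are not valid for another. A mismatch is therefore
treated the same way as corrupted data.
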
Finally, modern memory controllers often include power-saving
features that are fine-tuned during the training process. By
reusing validated training data from NVRAM, the system can
quickly return to an optimized state with lower power
consumption.
The inability to restore this data forces the system to
operate at a potentially less efficient state until training
is complete, leading to higher power consumption during the
boot process. \\

\subsubsection{A seedless DQS position training algorithm}
DQS position training, which finds the DQS timing at which the
memory controller can reliably read data from the memory,
could be performed without relying on any pre-known starting
values (seeds). This would allow for better reliability and
wider support for different situations. The algorithm
could be described as follows. \\

\begin{itemize}
  \item Prepare Memory Controller:
        The memory controller needs to be in a state where it can
        safely adjust the DQS timing without affecting the normal
        operation of the system. By blocking the DQS signal locking,
        we ensure that the adjustments made during training do not
        interfere with the controller's ability to capture data
        until the optimal settings are found.
  \item Initialize Variables:
        Set up variables to store the various timing settings and
        test results for each bytelane. This setup is crucial
        because each bytelane might require a different optimal
        timing, and keeping track of these values ensures that the
        algorithm can correctly determine the best delay settings
        later.
\end{itemize}

The main loop is the core of the algorithm, where different
timing settings are systematically explored. By looping
through possible delay settings, the algorithm ensures
that it doesn't miss any potential optimal timings. The
loop structure allows a methodical test of a range of
delays to find the most reliable one. \\

Here, the gross delay is the coarse adjustment to the timing
of the DQS signal. It shifts the timing window by a large
amount, helping to broadly align the DQS with the data
lines (DQ). The fine delay, which is the smaller, more
precise change to the timing of the DQS signal once the
coarse alignment (through gross delay) has been achieved,
would then be computed. \\

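The sweep described above can be sketched for one bytelane as follows.
This is a simplified illustration under stated assumptions, not the
coreboot implementation. Gross and fine delays are flattened into a
single linear delay, \texttt{FINE\_STEPS} is an assumed number of fine
steps per gross step, and \texttt{read\_test\_fn} is a hypothetical
hook that applies a delay and reports whether a test pattern was read
back correctly. The sketch records the widest run of passing delays and
selects its centre, so no seed value is needed.

```c
#define FINE_STEPS 32 /* assumed number of fine steps per gross step */

/* Hypothetical hook: apply the delay, run a read test, return 1 on pass. */
typedef int (*read_test_fn)(unsigned int delay, void *ctx);

struct dqs_result {
	int found;          /* 0 if no delay passed at all */
	unsigned int gross; /* coarse part of the chosen delay */
	unsigned int fine;  /* fine part of the chosen delay */
	unsigned int width; /* size of the passing window, in steps */
};

/* Sweep every delay, track the widest passing window, pick its centre. */
static struct dqs_result train_dqs_lane(read_test_fn test, void *ctx,
					unsigned int gross_steps)
{
	struct dqs_result r = {0, 0, 0, 0};
	unsigned int total = gross_steps * FINE_STEPS;
	unsigned int run_start = 0, run_len = 0;
	unsigned int best_start = 0, best_len = 0;

	for (unsigned int d = 0; d < total; d++) {
		if (test(d, ctx)) {
			if (run_len == 0)
				run_start = d;
			run_len++;
			if (run_len > best_len) {
				best_len = run_len;
				best_start = run_start;
			}
		} else {
			run_len = 0;
		}
	}

	if (best_len == 0)
		return r;

	unsigned int center = best_start + (best_len - 1) / 2;

	r.found = 1;
	r.gross = center / FINE_STEPS;
	r.fine = center % FINE_STEPS;
	r.width = best_len;
	return r;
}

/* Stand-in for hardware: reads pass only for delays in [lo, hi]. */
struct fake_eye {
	unsigned int lo;
	unsigned int hi;
};

static int fake_read_test(unsigned int delay, void *ctx)
{
	const struct fake_eye *e = ctx;

	return delay >= e->lo && delay <= e->hi;
}
```

Choosing the centre of the widest window, rather than the first passing
delay, leaves the largest margin on both sides against drift in
temperature or voltage.
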
To compute a delay, here would be the steps:
|
||||
|
||||
The steps to compute a delay are as follows:
|
||||
\begin{itemize}
|
||||
\item Set a delay:
|
||||
Setting an initial delay allows the algorithm to start
|
||||
testing. The initial delay might be zero or another default
|
||||
value, providing a baseline from which to begin the search
|
||||
for the optimal timing.
|
||||
|
||||
\begin{itemize}
|
||||
\item Set a delay:
|
||||
Setting an initial delay allows the algorithm to start
|
||||
testing. The initial delay might be zero or another default
|
||||
value, providing a baseline from which to begin the search
|
||||
for the optimal timing.
|
||||
\item Test it:
|
||||
After setting the delay, it is essential to test whether the
|
||||
memory controller can read data correctly. This step is
|
||||
critical because it indicates whether the current delay
|
||||
setting is within the acceptable range for reliable data
|
||||
capture.
|
||||
|
||||
\item Test it:
|
||||
After setting the delay, it is essential to test whether the
|
||||
memory controller can read data correctly. This step is
|
||||
critical because it indicates whether the current delay
|
||||
setting is within the acceptable range for reliable data
|
||||
capture.
|
||||
\item Check the result:
|
||||
If the memory controller successfully reads data, it means
|
||||
the current delay setting is valid. This information is
|
||||
crucial because it helps define the range of acceptable
|
||||
timings. If the test fails, it indicates that the current
|
||||
delay setting is outside the range where the memory
|
||||
controller can reliably capture data.
|
||||
|
||||
\item Check the result:
|
||||
If the memory controller successfully reads data, it means
|
||||
the current delay setting is valid. This information is
|
||||
crucial because it helps define the range of acceptable
|
||||
timings. If the test fails, it indicates that the current
|
||||
delay setting is outside the range where the memory
|
||||
controller can reliably capture data.
|
||||
\item Increase/decrease delay:
|
||||
By incrementally adjusting the delay, either increasing or
|
||||
decreasing, the algorithm can explore different timing
|
||||
settings in a controlled manner. This ensures that the
|
||||
entire range of possible delays is covered without skipping
|
||||
over any potential good delays.
|
||||
|
||||
\item Increase/decrease delay:
|
||||
By incrementally adjusting the delay, either increasing or
|
||||
decreasing, the algorithm can explore different timing
|
||||
settings in a controlled manner. This ensures that the
|
||||
entire range of possible delays is covered without skipping
|
||||
over any potential good delays.
|
||||
\item Test again:
|
||||
Re-testing after each adjustment ensures that the exact
|
||||
point where the DQS timing goes from acceptable (pass) to
|
||||
unacceptable (fail) is caught. This step helps in
|
||||
identifying the transition point, which is often the optimal
|
||||
place to set the DQS delay.
|
||||
|
||||
\item Test again:
|
||||
Re-testing after each adjustment ensures that the exact
|
||||
point where the DQS timing goes from acceptable (pass) to
|
||||
unacceptable (fail) is caught. This step helps in
|
||||
identifying the transition point, which is often the optimal
|
||||
place to set the DQS delay.
|
||||
\item Look for a transition:
|
||||
The transition from pass to fail is where the DQS timing
|
||||
crosses the boundary of the valid timing window. This
|
||||
transition is crucial because it marks the end of the
|
||||
reliable range. The best timing is usually just before
|
||||
this transition.
|
||||
|
||||
\item Look for a transition:
|
||||
The transition from pass to fail is where the DQS timing
|
||||
crosses the boundary of the valid timing window. This
|
||||
transition is crucial because it marks the end of the
|
||||
reliable range. The best timing is usually just before
|
||||
this transition.
|
||||
\item Record the best setting:
|
||||
Storing the best delay setting for each bytelane ensures
|
||||
that a reliable timing configuration is available when the
|
||||
training is complete.
|
||||
|
||||
\item Record the best setting:
|
||||
Storing the best delay setting for each bytelane ensures
|
||||
that a reliable timing configuration is available when the
|
||||
training is complete.
|
||||
\item Confirm all bytelanes:
|
||||
Before finalizing the settings, it is important to ensure
|
||||
that the chosen delays work for all bytelanes. This step
|
||||
serves as a final safeguard against errors, ensuring that
|
||||
every part of the data bus is correctly aligned.
|
||||
\end{itemize}
|
||||
|
||||
\item Confirm all bytelanes:
|
||||
Before finalizing the settings, it is important to ensure
|
||||
that the chosen delays work for all bytelanes. This step
|
||||
serves as a final safeguard against errors, ensuring that
|
||||
every part of the data bus is correctly aligned.
|
||||
\end{itemize}
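The steps above can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the coreboot implementation: \texttt{read\_test}, \texttt{train\_lane}, \texttt{train\_all\_lanes}, the constants \texttt{NUM\_BYTELANES} and \texttt{MAX\_DELAY}, and the simulated pass windows are all hypothetical names and values introduced only for this example. On real hardware, \texttt{read\_test} would program the delay and read back a test pattern through the memory controller.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_BYTELANES 8   /* hypothetical: eight 8-bit lanes on a 64-bit bus */
#define MAX_DELAY     32  /* hypothetical: number of delay steps to sweep */

/* Simulated pass window per bytelane (illustrative values only). */
static const int win_start[NUM_BYTELANES] = {4, 6, 5, 7, 3, 8, 6, 5};
static const int win_end[NUM_BYTELANES]   = {12, 15, 13, 16, 11, 17, 14, 13};

/* Hypothetical stand-in for "set the delay, then test a read". */
static bool read_test(int lane, int delay)
{
    assert(lane >= 0 && lane < NUM_BYTELANES);
    return delay >= win_start[lane] && delay <= win_end[lane];
}

/*
 * Sweep the delay upward for one bytelane and return the last passing
 * delay before the pass-to-fail transition, or -1 if nothing passed.
 */
static int train_lane(int lane)
{
    bool prev_pass = false;

    for (int delay = 0; delay < MAX_DELAY; delay++) {
        bool pass = read_test(lane, delay);

        if (prev_pass && !pass)
            return delay - 1;   /* transition found: keep the last good delay */
        prev_pass = pass;
    }
    /* No transition seen: either the lane passed up to the end or never passed. */
    return prev_pass ? MAX_DELAY - 1 : -1;
}

/*
 * Train every bytelane, record its best delay, and confirm that each
 * recorded delay still passes. Returns true only if all lanes succeeded.
 */
static bool train_all_lanes(int best[NUM_BYTELANES])
{
    bool ok = true;

    for (int lane = 0; lane < NUM_BYTELANES; lane++) {
        best[lane] = train_lane(lane);
        if (best[lane] < 0 || !read_test(lane, best[lane]))
            ok = false;
    }
    return ok;
}
```

A caller would declare \texttt{int best[NUM\_BYTELANES];}, call \texttt{train\_all\_lanes(best)}, and apply the recorded delays only if it returns \texttt{true}. The sketch records the last passing delay before the transition, as described in the list above; an implementation could instead record both edges of the passing window and pick a point between them.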
|
||||
Each bytelane (8-bit segment of data) may require a
|
||||
different optimal delay setting. By repeating the process
|
||||
for all bytelanes, the algorithm ensures that the entire
|
||||
data bus is correctly timed. Misalignment in even one
|
||||
bytelane can lead to data errors, making it essential to
|
||||
tune every bytelane individually. \\
|
||||
|
||||
Each bytelane (8-bit segment of data) may require a
|
||||
different optimal delay setting. By repeating the process
|
||||
for all bytelanes, the algorithm ensures that the entire
|
||||
data bus is correctly timed. Misalignment in even one
|
||||
bytelane can lead to data errors, making it essential to
|
||||
tune every bytelane individually. \\
|
||||
Once the best settings are confirmed, they need to be
|
||||
applied to the memory controller for use during normal
|
||||
operation. This step locks in the most reliable timing
|
||||
configuration found during the training process. \\
|
||||
|
||||
Once the best settings are confirmed, they need to be
|
||||
applied to the memory controller for use during normal
|
||||
operation. This step locks in the most reliable timing
|
||||
configuration found during the training process. \\
|
||||
After the optimal settings are applied, it is necessary
|
||||
to allow the DQS signal locking mechanism to resume. This
|
||||
locks in the delay settings, ensuring stable operation going
|
||||
forward. \\
|
||||
|
||||
After the optimal settings are applied, it is necessary
|
||||
to allow the DQS signal locking mechanism to resume. This
|
||||
locks in the delay settings, ensuring stable operation going
|
||||
forward. \\
|
||||
|
||||
Finally, the algorithm needs to indicate whether it was
|
||||
successful in finding reliable timing settings for all
|
||||
bytelanes. This feedback is crucial for determining whether
|
||||
the memory system is correctly configured or if further
|
||||
adjustments or troubleshooting are needed. \\
|
||||
Finally, the algorithm needs to indicate whether it was
|
||||
successful in finding reliable timing settings for all
|
||||
bytelanes. This feedback is crucial for determining whether
|
||||
the memory system is correctly configured or if further
|
||||
adjustments or troubleshooting are needed. \\
|
||||
|
||||
% ------------------------------------------------------------------------------
|
||||
% CHAPTER 5: Virtualization of the operating system through firmware abstraction
|
||||
|
@ -4219,7 +4217,8 @@ if (best_count > 2) {
|
|||
|
||||
\chapter*{Appendix: Long code listings}
|
||||
\addcontentsline{toc}{chapter}{Appendix: Long code listings}
|
||||
\renewcommand{\thelisting}{\arabic{listing}}
|
||||
\renewcommand{\thelisting}{L.\arabic{listing}}
|
||||
\setcounter{listing}{0}
|
||||
|
||||
\begin{listing}
|
||||
\begin{adjustwidth}{0.5cm}{0.5cm}
|
||||
|
|
|
@ -28,28 +28,25 @@
|
|||
\contentsline {section}{\numberline {3.2}AMD Platform Security Processor and Intel Management Engine}{28}{section.3.2}%
|
||||
\contentsline {chapter}{\numberline {4}Memory initialization and training}{30}{chapter.4}%
|
||||
\contentsline {section}{\numberline {4.1}Importance of DDR3 memory initialization}{30}{section.4.1}%
|
||||
\contentsline {subsection}{\numberline {4.1.1}General steps for DDR3 configuration}{31}{subsection.4.1.1}%
|
||||
\contentsline {section}{\numberline {4.2}Memory initialization techniques}{34}{section.4.2}%
|
||||
\contentsline {subsection}{\numberline {4.2.1}Memory training algorithms}{34}{subsection.4.2.1}%
|
||||
\contentsline {subsection}{\numberline {4.2.2}BIOS and Kernel Developer Guide (BKDG) recommendations}{35}{subsection.4.2.2}%
|
||||
\contentsline {subsubsection}{\numberline {4.2.2.1}DDR3 initialization procedure}{36}{subsubsection.4.2.2.1}%
|
||||
\contentsline {subsubsection}{\numberline {4.2.2.2}ZQ calibration process}{36}{subsubsection.4.2.2.2}%
|
||||
\contentsline {subsubsection}{\numberline {4.2.2.3}Write leveling process}{37}{subsubsection.4.2.2.3}%
|
||||
\contentsline {section}{\numberline {4.3}Current implementation and potential improvements}{39}{section.4.3}%
|
||||
\contentsline {subsection}{\numberline {4.3.1}Current implementation in coreboot on the KGPE-D16}{39}{subsection.4.3.1}%
|
||||
\contentsline {subsubsection}{\numberline {4.3.1.1}Details on the DQS training function}{41}{subsubsection.4.3.1.1}%
|
||||
\contentsline {subsubsection}{\numberline {4.3.1.2}Details on the write leveling implementation}{43}{subsubsection.4.3.1.2}%
|
||||
\contentsline {subsubsection}{\numberline {4.3.1.3}Details on the write leveling implementation}{44}{subsubsection.4.3.1.3}%
|
||||
\contentsline {subsection}{\numberline {4.3.2}Write Leveling on AMD Fam15h G34 Processors with RDIMMs}{44}{subsection.4.3.2}%
|
||||
\contentsline {subsubsection}{\numberline {4.3.2.1}Details on the DQS position training function}{45}{subsubsection.4.3.2.1}%
|
||||
\contentsline {subsubsection}{\numberline {4.3.2.2}Details on the DQS receiver training function}{48}{subsubsection.4.3.2.2}%
|
||||
\contentsline {subsection}{\numberline {4.3.3}Potential enhancements}{50}{subsection.4.3.3}%
|
||||
\contentsline {subsubsection}{\numberline {4.3.3.1}DQS receiver training}{50}{subsubsection.4.3.3.1}%
|
||||
\contentsline {subsubsection}{\numberline {4.3.3.2}Write leveling}{52}{subsubsection.4.3.3.2}%
|
||||
\contentsline {subsection}{\numberline {4.3.4}DQS position training}{54}{subsection.4.3.4}%
|
||||
\contentsline {subsection}{\numberline {4.3.5}On a wider scale...}{56}{subsection.4.3.5}%
|
||||
\contentsline {subsubsection}{\numberline {4.3.5.1}Saving training values in NVRAM}{56}{subsubsection.4.3.5.1}%
|
||||
\contentsline {subsubsection}{\numberline {4.3.5.2}A seedless DQS position training algorithm}{57}{subsubsection.4.3.5.2}%
|
||||
\contentsline {section}{\numberline {4.2}General steps for DDR3 configuration}{31}{section.4.2}%
|
||||
\contentsline {section}{\numberline {4.3}Memory initialization techniques}{34}{section.4.3}%
|
||||
\contentsline {subsection}{\numberline {4.3.1}Memory training algorithms}{34}{subsection.4.3.1}%
|
||||
\contentsline {subsection}{\numberline {4.3.2}BIOS and Kernel Developer Guide (BKDG) recommendations}{35}{subsection.4.3.2}%
|
||||
\contentsline {subsubsection}{\numberline {4.3.2.1}DDR3 initialization procedure}{36}{subsubsection.4.3.2.1}%
|
||||
\contentsline {subsubsection}{\numberline {4.3.2.2}ZQ calibration process}{36}{subsubsection.4.3.2.2}%
|
||||
\contentsline {subsubsection}{\numberline {4.3.2.3}Write leveling process}{37}{subsubsection.4.3.2.3}%
|
||||
\contentsline {section}{\numberline {4.4}Current implementation and potential improvements}{39}{section.4.4}%
|
||||
\contentsline {subsection}{\numberline {4.4.1}Current implementation in coreboot on the KGPE-D16}{39}{subsection.4.4.1}%
|
||||
\contentsline {subsubsection}{\numberline {4.4.1.1}Details on the DQS training function}{41}{subsubsection.4.4.1.1}%
|
||||
\contentsline {subsubsection}{\numberline {4.4.1.2}Details on the write leveling implementation}{43}{subsubsection.4.4.1.2}%
|
||||
\contentsline {subsubsection}{\numberline {4.4.1.3}Details on the DQS position training function}{45}{subsubsection.4.4.1.3}%
|
||||
\contentsline {subsubsection}{\numberline {4.4.1.4}Details on the DQS receiver training function}{47}{subsubsection.4.4.1.4}%
|
||||
\contentsline {subsection}{\numberline {4.4.2}Potential enhancements}{50}{subsection.4.4.2}%
|
||||
\contentsline {subsubsection}{\numberline {4.4.2.1}DQS receiver training}{50}{subsubsection.4.4.2.1}%
|
||||
\contentsline {subsubsection}{\numberline {4.4.2.2}Write leveling}{52}{subsubsection.4.4.2.2}%
|
||||
\contentsline {subsubsection}{\numberline {4.4.2.3}DQS position training}{54}{subsubsection.4.4.2.3}%
|
||||
\contentsline {subsubsection}{\numberline {4.4.2.4}On saving training values in NVRAM}{56}{subsubsection.4.4.2.4}%
|
||||
\contentsline {subsubsection}{\numberline {4.4.2.5}A seedless DQS position training algorithm}{57}{subsubsection.4.4.2.5}%
|
||||
\contentsline {chapter}{\numberline {5}Virtualization of the operating system through firmware abstraction}{59}{chapter.5}%
|
||||
\contentsline {section}{\numberline {5.1}ACPI and abstraction of hardware control}{59}{section.5.1}%
|
||||
\contentsline {section}{\numberline {5.2}SMM as a hidden execution layer}{60}{section.5.2}%
|
||||
|
|
22
packages.tex
|
@ -33,18 +33,29 @@
|
|||
\usepackage[a4paper, portrait, margin=1.45cm]{geometry}
|
||||
|
||||
% Set parameters
|
||||
|
||||
% No warnings
|
||||
\WarningsOff
|
||||
|
||||
% No message for text justification
|
||||
\hbadness=10000
|
||||
|
||||
% Start at page 0
|
||||
\setcounter{page}{0}
|
||||
|
||||
% Link every toc element
|
||||
\hypersetup{linktoc=all}
|
||||
|
||||
% Enhance footnotes
|
||||
\addtolength{\skip\footins}{0.6pc}
|
||||
\renewcommand*\footnoterule{} %Footnote separator line
|
||||
|
||||
%\def\siecle#1{\textsc{\romannumeral #1}\textsuperscript{e}~siècle}
|
||||
|
||||
% Place dots on sections in toc
|
||||
\renewcommand{\cftsecleader}{\cftdotfill{\cftdotsep}} %places dots on sections lines as well
|
||||
|
||||
% Tweak auto-tabulations
|
||||
\cftsetindents{section}{0pt}{4em}
|
||||
\cftsetindents{subsection}{10pt}{4em}
|
||||
\cftsetindents{subsubsection}{20pt}{4em}
|
||||
|
@ -52,12 +63,15 @@
|
|||
\cftsetindents{subparagraph}{40pt}{4em}
|
||||
\def\cftdotsep{1}
|
||||
\cftsetpnumwidth{1em}
|
||||
|
||||
\renewcommand{\cftchapafterpnum}{\vspace{\cftbeforechapskip}}
|
||||
\renewcommand{\familydefault}{\sfdefault}
|
||||
|
||||
\setlength\parindent{0pt}
|
||||
|
||||
% Defaults space before chapters
|
||||
\renewcommand{\cftchapafterpnum}{\vspace{\cftbeforechapskip}}
|
||||
|
||||
% Sans-serif, of course
|
||||
\renewcommand{\familydefault}{\sfdefault}
|
||||
|
||||
% Configure listings
|
||||
\usemintedstyle{solarized-light}
|
||||
\definecolor{bg}{HTML}{FAF9F6}
|
||||
\definecolor{linenumcolor}{rgb}{0.6, 0.6, 0.6} % Light gray color
|
||||
|
|