Documentation/Intel/NativeRaminit: Remove trailing whitespace

Change-Id: I1d38aea07e2d9ffb89115410603a5beac5e4d44d
Signed-off-by: Elyes HAOUAS <ehaouas@noos.fr>
Reviewed-on: https://review.coreboot.org/25831
Tested-by: build bot (Jenkins) <no-reply@coreboot.org>
Reviewed-by: Patrick Georgi <pgeorgi@google.com>
Elyes HAOUAS 2018-04-25 21:45:53 +02:00 committed by Patrick Georgi
parent 4713b5cd9e
commit 0c80d2f8e3
3 changed files with 70 additions and 70 deletions

@ -2,7 +2,7 @@
The MCHBAR can be enabled by using register 0x48 of the PCI(0:0:0) device.
This documentation is incomplete and might be incorrect.
Please handle with care!
**MCHBAR + 0x4**

@ -15,15 +15,15 @@ This chapter explains the frequency selection done on Sandybride and Ivybridge.
| XMP | Extreme Memory Profiles | - | - |
## SPD
The [SPD](https://de.wikipedia.org/wiki/Serial_Presence_Detect "Serial Presence Detect")
located on every DIMM is factory-programmed with various timings. One of them
specifies the maximum clock frequency the DIMM should be used with. The
operating frequency is stored as a fixed-point value (tCK), rounded to the next
smallest supported operating frequency. Some
[SPDs](https://de.wikipedia.org/wiki/Serial_Presence_Detect "Serial Presence Detect")
contain additional, optional
[XMP](https://de.wikipedia.org/wiki/Extreme_Memory_Profile "Extreme Memory Profile")
data that stores so-called "performance" modes, which advertise higher clock
frequencies.
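The conversion from tCK to a supported frequency could look roughly like the sketch below. It is not taken from the coreboot sources: the 1/256 ns fixed-point unit, the function names and the list of supported frequencies are all assumptions made for illustration.

```python
# Assumption: tCK is a fixed-point clock period in units of 1/256 ns.
TCK_UNITS_PER_NS = 256

# Hypothetical list of operating frequencies (MHz) the controller supports.
SUPPORTED_MHZ = (400, 533, 666, 800)


def tck_to_mhz(tck):
    """Convert a fixed-point clock period into a frequency in MHz."""
    return 1000.0 * TCK_UNITS_PER_NS / tck


def round_down_to_supported(freq_mhz, supported_mhz=SUPPORTED_MHZ):
    """Pick the highest supported frequency that does not exceed freq_mhz."""
    candidates = [f for f in supported_mhz if f <= freq_mhz]
    if not candidates:
        raise ValueError("DIMM is slower than every supported frequency")
    return max(candidates)
```

With these assumptions, a tCK of 384 (1.5 ns) maps to about 666.67 MHz and is rounded down to the 666 MHz entry.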
## XMP profiles
@ -32,51 +32,51 @@ Only **XMP profile 1** is being used in case it advertises:
* 1.5V operating voltage
* The channel's installed DIMM count doesn't exceed the XMP coded limit
In case the XMP profile doesn't fulfill those limits, the regular SPD will be
used.
> **Note:** XMP Profiles are supported since coreboot 4.4.
It is possible to ignore the max DIMM count limit set by XMP profiles.
By activating the Kconfig option `NATIVE_RAMINIT_IGNORE_XMP_MAX_DIMMS` it is
possible to install two DIMMs per channel, even if XMP tells you not to.
> **Note:** Ignoring the XMP profile limit is supported since coreboot 4.7.
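The decision between XMP profile 1 and the regular SPD might be sketched as follows. The sketch only covers the two conditions visible in this hunk (the list may contain more), and every name in it is hypothetical rather than coreboot's actual API.

```python
def use_xmp_profile1(voltage_mv, dimms_in_channel, xmp_max_dimms,
                     ignore_xmp_max_dimms=False):
    """Return True if XMP profile 1 may be used, False to fall back to SPD.

    ignore_xmp_max_dimms models the effect of the Kconfig option
    NATIVE_RAMINIT_IGNORE_XMP_MAX_DIMMS (assumed to skip the DIMM count check).
    """
    if voltage_mv != 1500:
        # The profile must advertise 1.5V operating voltage.
        return False
    if not ignore_xmp_max_dimms and dimms_in_channel > xmp_max_dimms:
        # More DIMMs installed than the XMP profile allows.
        return False
    return True
```

For example, two DIMMs in a channel with an XMP limit of one DIMM fall back to the regular SPD unless the limit is ignored.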
## Soft fuses
Every board manufacturer programs "soft" fuses to indicate the maximum
DRAM frequency supported. However, those fuses don't set a limit in hardware
and thus are called "soft" fuses, as it is possible to ignore them.
> **Note:** Ignoring the fuses might cause system instability!
On Sandy Bridge *CAPID0_A* is being read, and on Ivy Bridge *CAPID0_B* is being
read. coreboot reads those registers and honors the limit unless the Kconfig
option `CONFIG_NATIVE_RAMINIT_IGNORE_MAX_MEM_FUSES` is set.
Power users that want to let their RAM run at DRAM's "stock" frequency need to
enable the Kconfig symbol.
It is possible to override the soft fuses limit by using a board-specific
[devicetree](#devicetree) setting.
> **Note:** Ignoring max mem freq. fuses is supported since coreboot 4.7.
## <a name="hard_fuses"></a> Hard fuses
"Hard" fuses are programmed by Intel and limit the maximum frequency that can
be used on a given CPU/board/chipset. At the time of writing there's no register
to read this limit before trying to set a given DRAM frequency. If the limit is
exceeded, the memory PLL won't lock, indicating that the chosen memory
multiplier isn't available. In this case coreboot tries the next smaller memory
multiplier until the PLL locks.
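The fallback can be pictured as a simple downward search. The following is a minimal sketch, not coreboot's code: `pll_locks` is a hypothetical callback standing in for programming the multiplier and checking whether the PLL locked.

```python
def find_working_multiplier(start_mult, min_mult, pll_locks):
    """Try multipliers from start_mult downwards until the PLL locks.

    pll_locks(mult) is a hypothetical callback that returns True when the
    memory PLL locks with the given multiplier.
    Returns the first working multiplier, or None if none locks.
    """
    for mult in range(start_mult, min_mult - 1, -1):
        if pll_locks(mult):
            return mult
    return None
```

If the hard fuses only allow multipliers up to 7, a search starting at 10 tries 10, 9 and 8 without success and settles on 7.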
## <a name="devicetree"></a> Devicetree
The devicetree register ```max_mem_clock_mhz``` overrides the "soft" fuses set
by the board manufacturer.
By using this register it's possible to force a minimum operating frequency.
## Reference clock
While Sandy Bridge supports only a 133 MHz reference clock (REFCK), Ivy Bridge
also supports a 100 MHz reference clock. The reference clock is multiplied by the
DRAM multiplier to select the DRAM frequency (SCK) by the following formula:
REFCK * MULT = 1 / DCK
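As a worked example of the formula, the sketch below computes the DRAM clock and its period for a given reference clock and multiplier. The function name and the chosen multipliers are illustrative assumptions only.

```python
def dram_clock(refck_mhz, mult):
    """Apply REFCK * MULT = 1 / DCK.

    Returns (frequency in MHz, DCK period in ns).
    """
    freq_mhz = refck_mhz * mult
    dck_ns = 1000.0 / freq_mhz
    return freq_mhz, dck_ns
```

A 100 MHz reference clock with a multiplier of 8 gives 800 MHz, that is a DCK of 1.25 ns. A 133.33 MHz reference clock with a multiplier of 5 gives roughly 666 MHz, that is a DCK of about 1.5 ns.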
@ -122,11 +122,11 @@ else:
for i in SPDs:
    freq_max := MIN(freq_max, ddr_spd_max_mhz[i])```
As you can see, when using DIMMs with different maximum DRAM frequencies, the
slowest DIMM's frequency will be selected, to prevent overclocking it.
The selected frequency gives the PLL multiplier to operate at. In case the PLL
locks (see [Hard fuses](#hard_fuses)) the frequency will be used for
all DIMMs. At this point it's not possible to change the multiplier again
until the system has been powered off. In case the PLL doesn't lock, the next
smaller multiplier will be used until a working multiplier is found.

@ -2,22 +2,22 @@
## Introduction
This chapter explains the read training sequence done during Sandy Bridge and
Ivy Bridge memory initialization.
Read training is done to compensate for the skew between DQS and SCK and to find
the smallest supported roundtrip delay.
Every board has a vendor-dependent routing topology and can be equipped
with any combination of DDR3 memory modules, which introduces different
skew between the memory lanes. With DDR3 a "Fly-By" routing topology
has been introduced, which accounts for the biggest part of the DQS-SCK skew.
The memory code measures the actual skew and activates delay gates
that will "compensate" for the skew.
When in read training the DRAM and the controller are placed in a special mode.
On every read instruction the DRAM outputs a predefined pattern and the memory
controller samples the DQS after a given delay. As the pattern is known, the
actual delay of every lane can be measured.
The values programmed in read training affect DRAM-to-MC transfers only!
@ -36,34 +36,34 @@ The values programmed in read training effect DRAM-to-MC transfers only !
| DQS | Data Strobe signal used to sample all lane's DQ signals | - | - |
## Hardware
The hardware has delay logic blocks that can delay the DQ / DQS of a
lane/rank by one or multiple clock cycles, and it has delay logic blocks
that can delay the signal by a multiple of 1/64th DCK per lane.
All delay values can be controlled via software by writing registers in the
MCHBAR.
## IO phase
The IO phase can be adjusted in [0-512) * 1/64th DCK. Incrementing it by 64 is
the same as incrementing IO delay by 1.
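The relationship between IO phase and IO delay can be illustrated with a little arithmetic. The helper names below are hypothetical. The sketch only restates that 64 phase steps make up one full clock cycle.

```python
PHASE_STEPS_PER_DCK = 64   # one IO phase step is 1/64th DCK
IO_PHASE_RANGE = 512       # IO phase is adjustable in [0, 512)


def total_delay_steps(io_delay, io_phase):
    """Total delay in units of 1/64th DCK for an IO delay / IO phase pair."""
    if not 0 <= io_phase < IO_PHASE_RANGE:
        raise ValueError("IO phase out of range")
    return io_delay * PHASE_STEPS_PER_DCK + io_phase


def normalize(io_delay, io_phase):
    """Move whole clock cycles from the IO phase into the IO delay."""
    extra_cycles, phase = divmod(io_phase, PHASE_STEPS_PER_DCK)
    return io_delay + extra_cycles, phase
```

An IO delay of 0 with an IO phase of 64 yields the same total delay as an IO delay of 1 with an IO phase of 0.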
## IO delay
Delays the DQ / DQS signal by one or multiple clock cycles.
### Roundtrip time
The roundtrip time is the time the memory controller waits for data arriving
after a read has been issued. Due to clock-domain crossings, multiple
delay instances and phase interpolators, the signal runtime to DRAM and back
to memory controller defaults to 55 DCKs. The real roundtrip time has to be
measured.
After a read command has been issued, a counter counts down until zero has been
reached and activates the input buffers.
The following pictures show the relationship between those three values.
The picture was generated from 16 IO delay values times 64 timA values.
The highest IO delay was set on the right-hand side, while the last block
on the left-hand side has zero IO delay.
** roundtrip 55 DCKs **
@ -82,39 +82,39 @@ on the left-hand side has zero IO delay.
[timA_lane0-3_rt53]: timA_lane0-3_rt53.png "timA for lane0 - lane3, roundtrip 53"
As you can see, the signal has some jitter, as every sample was taken in a
different loop iteration. The result register only contains a single bit per
lane.
## Algorithm
### Steps
The algorithm finds the roundtrip time, IO delay and IO phase. The IO phase
will be adjusted to match the falling edge of the preamble of each lane.
The roundtrip time is adjusted to a minimal value that still includes the
preamble.
### Synchronize to data phase
The first measurement done in read-leveling samples all DQS values for one
phase [0-64) * 1/64th DCK. It then searches for the middle of the low data
symbol and adjusts timA to the found phase, so that the following measurements
will be aligned to the low data symbol.
The code assumes that the initial roundtrip time causes the measurement to be
in the alternating pattern data phase.
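Finding the middle of the low data symbol can be sketched as a search for the longest run of zeros in the 64 samples, treating the sample window as circular. This is an illustrative sketch with hypothetical names, not the routine used by coreboot.

```python
def middle_of_low_symbol(samples):
    """Return the phase index in the middle of the longest run of low samples.

    samples holds one DQS sample (0 or 1) per phase step and is treated as
    circular, so a low run may wrap from the last index to the first.
    """
    n = len(samples)
    if all(samples) or not any(samples):
        raise ValueError("no edge found in the sampled window")

    # Start right after a high sample so that no low run is split in two.
    start = next(i for i, s in enumerate(samples) if s)
    best_start, best_len = 0, 0
    run_start, run_len = 0, 0
    for k in range(1, n + 1):
        idx = (start + k) % n
        if samples[idx] == 0:
            if run_len == 0:
                run_start = idx
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0
    return (best_start + best_len // 2) % n
```

For a window that is high for 16 steps, low for 32 steps and high for 16 steps, the function returns phase 32, the middle of the low symbol.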
### Finding the preamble
After adjusting the IO phase to the middle of one data symbol, the preamble will
be located. Unlike the data phase, which is an alternating pattern (010101...),
the preamble consists of two high data cycles.
The code decrements the IO delay/RTT and samples the DQS signal with timA
untouched. As it has been positioned in the middle of the data symbol, it'll
read as either "low" or "high".
If it's "low" we are still in the data phase.
If it's "high" we have found the preamble.
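The search for the preamble can be sketched as a loop that steps the delay down by one clock cycle at a time. This sketch folds IO delay and roundtrip time into a single hypothetical `delay` value, and `sample_dqs` is an assumed callback that returns the sampled DQS level for that delay.

```python
def find_preamble(start_delay, sample_dqs):
    """Decrement the delay until the sampled DQS reads high.

    sample_dqs(delay) is a hypothetical callback returning True for "high"
    and False for "low", with timA left untouched.
    Returns the delay at which the preamble was found, or None.
    """
    delay = start_delay
    while delay >= 0:
        if sample_dqs(delay):
            # "high": we have left the data phase and hit the preamble.
            return delay
        # "low": still in the data phase, go one clock cycle earlier.
        delay -= 1
    return None
```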
The roundtrip time and IO delay will be adjusted until all lanes are aligned.
The resulting IO delay is visible in the picture below.
** roundtrip time: 49 DCKs, IO delay (at blue point): 6 DCKs **
@ -122,17 +122,17 @@ The resulting IO delay is visible in the picture below.
[timA_lane0-3_discover_420x]: timA_lane0-3_discover_420x.png "timA for lane0 - lane3, finding minimum roundtrip time"
** Note: The sampled data has been shifted by timA. The preamble is now
in phase. **
## Fine adjustment
As timA still points to the middle of the data symbol, an offset of 32 is added.
It now points to the falling edge of the preamble.
The fine adjustment is to reduce errors introduced by jitter. The phase is
adjusted from `timA - 25` to `timA + 25` and the DQS signal is sampled 100
times. The fine adjustment finds the middle of each rising edge (it's actually
the falling edge of the preamble) to get the final IO phase. You can see the
result in the picture below.
![alt text][timA_lane0-3_adjust_fine]
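One way to picture the fine adjustment is a sweep that counts how often DQS reads high at each phase and picks the first phase where at least half of the samples are high. This majority threshold is an assumption chosen for the sketch, as are all names. The text above does not state how the middle of the edge is computed, and phase wrap-around is ignored here.

```python
def fine_adjust(tim_a, sample_dqs, window=25, samples=100):
    """Sweep the phase around tim_a and locate the middle of the rising edge.

    sample_dqs(phase) is a hypothetical callback returning True when DQS
    reads high at the given phase. Each phase is sampled several times to
    average out jitter. Returns the first phase where at least half of the
    samples read high, or None if no such phase exists in the window.
    """
    for phase in range(tim_a - window, tim_a + window + 1):
        highs = sum(1 for _ in range(samples) if sample_dqs(phase))
        if 2 * highs >= samples:
            return phase
    return None
```

With a jitter-free signal that turns high at phase 100 and a starting timA of 96, the sweep covers phases 71 to 121 and returns 100.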