Technical details of Barcelona/Phenom erratum #298
Date: 12/11/07
(Computer Geeks) Keywords: html, linux
In case anyone cares, the technical details of the AMD K10 TLB erratum (that's "bug" for the layman) were posted by an AMD fellow on a Linux mailing list. I found them via a post response on Informationweek.com:
http://informationweek.com/blog/main/archives/2007/12/bug_in_amds_qua.html
AMD Family 10h revision B2 processors suffer from an issue in the processor TLB known as erratum 298. Erratum 298 is documented in a forthcoming update to the Revision Guide for AMD Family 10h Processors (PID 41322). The workaround in the Revision Guide document is intended to be applied by BIOS. The BIOS workaround has performance implications which can be avoided by having the OS work around the issue directly. A Linux 64-bit patch was developed for 2.6.23.8 by AMD's OSRC team and will be posted to this list by Joerg Roedel. The patch is for demonstration purposes and is NOT being recommended to be applied upstream.
Erratum 298 will be described as follows: "The processor operation to change the accessed or dirty bits of a page translation table entry in the L2 from 0b to 1b may not be atomic. A small window of time exists where other cached operations may cause the stale page translation table entry to be installed in the L3 before the modified copy is returned to the L2. In addition, if a probe for this cache line occurs during this window of time, the processor may not set the accessed or dirty bit and may corrupt data for an unrelated cached operation. The system may experience a machine check event reporting an L3 protocol error has occurred. In this case, the MC4 status register (MSR 0000_0410) will be equal to B2000000_000B0C0F or BA000000_000B0C0F. The MC4 address register (MSR 0000_0412) will be equal to 26h."
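(An aside that is not part of the AMD post: the two MC4 status values and the MC4 address value quoted above are enough to write a quick check for this machine-check signature. Below is a minimal sketch, assuming you already have the raw register values in hand, for example copied from a kernel log. The function name is made up for illustration, and reading the MSRs themselves is not shown.)

```python
# Signature values taken from the erratum text quoted above.
# MC4 status register is MSR 0000_0410, MC4 address register is MSR 0000_0412.
ERRATUM_298_MC4_STATUS = {0xB2000000_000B0C0F, 0xBA000000_000B0C0F}
ERRATUM_298_MC4_ADDR = 0x26


def looks_like_erratum_298(mc4_status: int, mc4_addr: int) -> bool:
    """Return True if the register pair matches the quoted L3 protocol error signature.

    Hypothetical helper: a match only means the values equal the ones in the
    erratum description, not that erratum 298 is proven to be the cause.
    """
    return mc4_status in ERRATUM_298_MC4_STATUS and mc4_addr == ERRATUM_298_MC4_ADDR


if __name__ == "__main__":
    print(looks_like_erratum_298(0xB2000000000B0C0F, 0x26))
```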
The L2 Eviction Linux kernel performance patch re-enables the registers set for the BIOS workaround described in the Revision Guide document. It then prevents the processor from performing the operation that can trigger erratum 298. The patch works by emulating the Accessed and Dirty bits.
The basis for the kernel patch solution depends on the root cause of the L2 eviction problem. The only exposure for the problem is when the TLB needs to set an A or D bit in a page table entry. If the TLB never needs to set an A or D bit, the bug cannot occur. By emulating the A and D bits with the help of the Present and Writable bits, the patch will ensure the real A and D bits are always preset. It works by forcing a page fault when the first access is made to a page with the emulated A bit not set, and when the first write access is made to a writable page with the emulated D bit not set. Emulated A and D bits are stored in bits generally available to the OS in the page table entry.
Elsie Wahlig
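To make the A/D emulation idea in the post a bit more concrete, here is a toy simulation. This is my own sketch, not AMD's patch: the hardware Accessed and Dirty bits are preset, Present and Writable are withheld until the first access or first write, and the resulting page faults record the emulated bits. The Present, Writable, Accessed and Dirty positions are the standard x86 ones; the two "soft" bit positions, the Page class and all function names are assumptions picked for illustration.

```python
from dataclasses import dataclass

# Architectural x86 page-table-entry flag bits.
PTE_PRESENT = 1 << 0
PTE_WRITABLE = 1 << 1
PTE_ACCESSED = 1 << 5
PTE_DIRTY = 1 << 6
# Hypothetical choice of OS-available bits for the emulated flags.
SOFT_ACCESSED = 1 << 9
SOFT_DIRTY = 1 << 10


class PageFault(Exception):
    """Raised by the toy MMU when an access is not permitted by the PTE."""


@dataclass
class Page:
    logically_writable: bool  # what the OS wants the mapping to allow
    pte: int = 0


def new_mapping(writable: bool) -> Page:
    # Hardware A and D are preset, so the TLB never has to write them.
    # Present and Writable stay clear so the first access and write trap.
    return Page(writable, PTE_ACCESSED | PTE_DIRTY)


def hw_access(page: Page, write: bool) -> None:
    # Toy MMU check: fault if not present, or on a write to a read-only PTE.
    if not page.pte & PTE_PRESENT:
        raise PageFault
    if write and not page.pte & PTE_WRITABLE:
        raise PageFault


def handle_fault(page: Page, write: bool) -> None:
    # Toy fault handler: record the emulated bits, then open up permissions.
    if write and not page.logically_writable:
        raise PermissionError("genuine write-protection fault")
    page.pte |= PTE_PRESENT | SOFT_ACCESSED
    if write:
        page.pte |= PTE_WRITABLE | SOFT_DIRTY


def access(page: Page, write: bool = False) -> int:
    """Perform one access and return how many page faults it took."""
    faults = 0
    while True:
        try:
            hw_access(page, write)
            return faults
        except PageFault:
            handle_fault(page, write)
            faults += 1


if __name__ == "__main__":
    p = new_mapping(writable=True)
    print(access(p), access(p), access(p, write=True), access(p, write=True))
```

In this model the extra cost is at most two faults per page, one for the first access and one for the first write, and the hardware A and D bits never change after the mapping is created.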
This response on the Informationweek page pretty much sums up the situation:
Now here is where the real stink occurs -- AMD invited press (rather than ship press review kits as normal) to benchmark Phenom for the launch day (Nov. 19th). Of course, the bench systems were set up by AMD, without any workaround, and with the northbridge overclocked beyond stock settings. In other words, it is likely that most, if not all, reported data on the Phenom desktop performance is overstated compared with what you will buy at retail. As such, TechReport obtained retail samples outside of AMD and retested the Phenom with and without the erratum workaround: http://www.techreport.com/articles.x/13741
The performance hit is between 5 and 50%, averaging 13.9% overall, when using the BIOS update that AMD supplied but conveniently neglected to install during the "Tahoe Trip".
The entire launch of this product is, for all intents and purposes, entirely botched.
Extremely ugly train wreck.
Source: http://community.livejournal.com/computergeeks/1134601.html