Although Phenom's CPU cores are derived from the ones found in current Athlon 64 X2 processors, AMD has made substantial revisions to them in order to improve per-clock performance and efficiency. The cores now have a wider, 32-byte instruction fetch, and the floating-point units can execute 128-bit SSE operations in a single clock cycle. Phenom cannot execute the Supplemental SSE3 instructions Intel included in its Core 2 processors, nor the newer SSE4 extensions in Intel's just-introduced 45nm chips. The K10 core also has more bandwidth throughout in order to accommodate higher throughput: internally between units on the chip, between the L1 and L2 caches, and between the L2 cache and the north bridge/memory controller.
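To put the single-cycle 128-bit SSE claim in rough numbers, here is a back-of-the-envelope sketch. It assumes the older cores push a 128-bit operation through a 64-bit datapath in two passes, which is an assumption for illustration and not a figure from this article. The function name and parameters are likewise hypothetical.

```python
def packed_elements_per_cycle(datapath_bits, vector_bits=128, element_bits=32):
    """Peak packed elements one FP unit can finish per clock cycle.

    datapath_bits: width of the execution unit (assumed 64 for the older
                   cores, 128 for Phenom; the 64 is an illustrative assumption).
    vector_bits:   width of one SSE operation (128 bits).
    element_bits:  width of each packed element (32 for single precision).
    """
    # A vector wider than the datapath needs several passes (ceiling division).
    passes = -(-vector_bits // datapath_bits)
    elements = vector_bits // element_bits
    return elements / passes


older_core = packed_elements_per_cycle(64)    # 128-bit op split into two passes
phenom_core = packed_elements_per_cycle(128)  # 128-bit op in a single pass
print(older_core, phenom_core, phenom_core / older_core)
```

Under those assumptions the peak rate per unit doubles. Real code rarely reaches that peak, since fetch, decode, and memory bandwidth also have to keep up, which is part of why the rest of the core was widened too.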
These improved cores are, of course, now grouped four to a chip, and AMD has added a third level to the cache hierarchy in order to assist with integration of the cores. As a result, each Phenom core has 64K of L1 data cache, 512K of dedicated L2 cache, and access to the 2MB L3 cache shared between all cores. An interesting quirk of the Phenom design is that the L3 cache runs at the clock speed of the memory controller/north bridge section of the chip, which is typically slower than the CPU core clocks. Since the L3 cache is an integral part of the memory hierarchy, north bridge clock speeds will be a key factor in overall Phenom performance.
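The cache arithmetic and the effect of a slower north bridge clock can be sketched in a few lines. The cache sizes below come from the paragraph above. The clock speeds and the L3 cycle count are made-up illustrative values, not measurements, and the model is simplified: it treats the whole L3 access as happening in the north bridge clock domain.

```python
CORES = 4
L1D_KB = 64     # per core, from the text (L1 data cache only)
L2_KB = 512     # per core, from the text
L3_KB = 2048    # shared by all cores, from the text


def total_cache_kb(cores=CORES, l1d_kb=L1D_KB, l2_kb=L2_KB, l3_kb=L3_KB):
    """Total data-side cache on the chip: private L1D and L2 per core, plus shared L3."""
    return cores * (l1d_kb + l2_kb) + l3_kb


def l3_cost_in_core_cycles(nb_cycles, core_mhz, nb_mhz):
    """Convert a latency counted in north bridge cycles into core clock cycles."""
    return nb_cycles * core_mhz / nb_mhz


print(total_cache_kb())  # 4 * (64 + 512) + 2048 = 4352 KB

# Hypothetical numbers: a 20-cycle L3 access in the north bridge domain,
# seen from a 2300 MHz core, with the north bridge at 1800 or 2000 MHz.
slow_nb = l3_cost_in_core_cycles(20, core_mhz=2300, nb_mhz=1800)
fast_nb = l3_cost_in_core_cycles(20, core_mhz=2300, nb_mhz=2000)
print(round(slow_nb, 1), round(fast_nb, 1))
```

With these placeholder figures, raising the north bridge clock shortens the L3 access as seen by the core even though the core clock has not changed, which is the reason north bridge speed matters for Phenom.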