Architecture detail
Features fine-grained multithreading.
Reintroduces the Single Edge Contact Cartridge (SECC) in some high-end server models with a PCIe slot.
All desktop and laptop models will use BGA packaging; sockets will be abandoned.
HyperTransport will be replaced by dedicated PCIe 4.0 lanes as the default processor interconnect.
20 nm bulk silicon manufacturing process by Taiwan Semiconductor Manufacturing Company (TSMC).
An ARM instruction layer is added to the processor module for emulation purposes.
Each module contains 4 to 8 in-order, non-superscalar cluster cores. A 2-way 64 KiB exclusive L1 instruction cache is shared by all cores in the module, and each module contains a 2 MB exclusive L2 cache.
Each cluster core is capable of 4-way simultaneous multithreading (8-way SMT on the Opteron line).
8 KB unified, direct-mapped, write-through L1 cache per cluster core with a 16-byte cache line (inclusive).
52-stage integer pipeline and 75-stage floating-point pipeline design.
Up to 16 modules per die, with a 4-way 64 MB write-through L3 cache shared by all modules.
The architecture supports up to 4096 threads per processor; high-end models will be packaged as a multi-chip module (MCM).
HSA support by default.
Each processor contains a 2048 SP Hawaii GCN core with an on-chip 2 GB HBM stack (1T cell) serving as a full-speed L4 cache. The south bridge is now integrated, including the SATA and PCIe controllers. Opteron models will feature 8 GB of HBM, soldered onto the PCB inside the SECC package and running at half the processor speed, and will carry 2 additional 4096 SP Fiji cores on board rather than only the single Hawaii GPU of the consumer line.
6 GHz+ by default.
Each cluster core contains a simple instruction decoder for 1-uop instructions; any complex instruction of more than 2 uops will be decoded in the instruction sequencer with a 3-cycle latency.
Up to 3 TFLOPS with HSA.
Some models will compete directly with Xeon Phi in the HPC market.
Up to 800 W TDP.
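
The cache and thread figures listed above can be cross-checked with some simple arithmetic. The following is only a sketch under stated assumptions: it reads the per-core L1 size as 8 KiB (8192 bytes), and it assumes the 4096-thread figure is reached by multiplying modules, cores and SMT ways across several dies in an MCM, which the list does not state explicitly. The function and variable names are illustrative only.

```python
def cache_geometry(size_bytes, line_bytes, ways):
    """Return line count, set count and address-bit split for a cache.

    Assumes size_bytes, line_bytes and the resulting set count are powers of two.
    """
    lines = size_bytes // line_bytes
    sets = lines // ways
    return {
        "lines": lines,
        "sets": sets,
        "offset_bits": line_bytes.bit_length() - 1,  # log2(line size)
        "index_bits": sets.bit_length() - 1,         # log2(set count)
    }


def threads_per_die(modules, cores_per_module, smt_ways):
    """Hardware threads on one die, assuming every core runs all SMT ways."""
    return modules * cores_per_module * smt_ways


# Per-core L1: assumed 8 KiB, 16-byte lines, direct-mapped (1 way).
l1 = cache_geometry(8 * 1024, 16, ways=1)
print(l1)  # {'lines': 512, 'sets': 512, 'offset_bits': 4, 'index_bits': 9}

# Largest listed configuration: 16 modules, 8 cores per module, 8-way SMT.
per_die = threads_per_die(16, 8, 8)
print(per_die)          # 1024
print(4096 // per_die)  # 4 dies would be needed to reach 4096 threads
```

Under these assumptions a single fully populated die provides 1024 threads, so the 4096-thread ceiling would correspond to four such dies in one multi-chip module.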
http://en.wikipedia.org/wiki/AMD_Zen