Intel's Next-Generation Skymont E-cores and Their Role in Upcoming Microarchitectures

Intel's next-generation "Skymont" E-cores are set to play a crucial role in the company's upcoming microarchitectures, including "Lunar Lake" and "Arrow Lake," as well as in the "Twin Lake" low-power processor. They are paired with Intel's next-generation "Lion Cove" P-cores, which have drawn most of the attention because they compete with AMD's "Zen 5." Thumbnail images of slides from an Intel presentation, possibly intended for PC OEMs, recently leaked online. They offer a glimpse of the advances Intel is making with its E-core technology, even though the thumbnails show only part of the slides' content.

The "Skymont" E-core is reported to achieve a double-digit percentage increase in Instructions Per Cycle (IPC) over the "Crestmont" E-core, which powers the current "Meteor Lake" processors. This gain is noteworthy because "Crestmont" itself delivered an approximately 4% IPC improvement over the "Gracemont" E-cores used in the "Raptor Lake" and "Alder Lake" microarchitectures. The IPC of "Skymont" is anticipated to be on par with that of the "Sunny Cove" or "Willow Cove" P-cores, which are integral to the "Ice Lake" and "Tiger Lake" microarchitectures, respectively. Those cores were comparable to AMD's "Zen 3" core in IPC.

The IPC improvement in "Skymont" is attributed to several architectural enhancements. These include an upgraded branch prediction unit and a broader 9-wide decode unit, a significant step up from the 6-wide decode unit found in "Crestmont." Additionally, the integer Arithmetic Logic Unit (ALU) width has been doubled from 4 to 8, allowing more integer operations to execute in parallel. Further enhancements include dependency optimization in the out-of-order execution engine and increased queuing capacity across the engine, both aimed at improving execution efficiency and throughput.
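Because per-generation IPC gains multiply rather than add, the figures above can be combined into a rough lower bound for "Skymont" relative to "Gracemont." The sketch below is illustrative only: the 4% figure is the approximate one quoted above, while the 10% value is an assumption standing in for the smallest possible "double-digit" gain, not a number from the leak. The function name `compound_ipc` is likewise hypothetical.

```python
def compound_ipc(gains):
    """Multiply successive per-generation IPC gains into one cumulative factor."""
    factor = 1.0
    for gain in gains:
        factor *= 1.0 + gain
    return factor


# Approximate Crestmont-over-Gracemont gain quoted in the article.
CRESTMONT_OVER_GRACEMONT = 0.04
# Assumption: the smallest gain that still counts as "double-digit".
SKYMONT_OVER_CRESTMONT_MIN = 0.10

skymont_over_gracemont = compound_ipc(
    [CRESTMONT_OVER_GRACEMONT, SKYMONT_OVER_CRESTMONT_MIN]
)
uplift_percent = 100 * (skymont_over_gracemont - 1)
print(f"Skymont vs. Gracemont: at least {uplift_percent:.1f}% higher IPC")
```

Under these assumptions the cumulative uplift works out to about 14.4%; a larger actual "Skymont" gain would raise the figure accordingly.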

The E-cores are likely to remain arranged in clusters, with the cores in each cluster sharing an L2 cache. This configuration is designed to speed up data retrieval and reduce latency in multi-core operations, thereby improving overall processor efficiency.


Source: HXL (Twitter) via tpu
