The Pattern Cache: A Mechanism to Reduce Power Consumption in Out of Order Processors by Removing Duplicated Efforts

Morrow, Bradley II

dc.contributor.advisor	Barnes, Ronald
dc.contributor.author	Morrow, Bradley II
dc.date.accessioned	2018-05-08T14:25:07Z
dc.date.available	2018-05-08T14:25:07Z
dc.date.issued	2018-05
dc.identifier.uri	https://hdl.handle.net/11244/299798
dc.description.abstract	Out-of-order engines are the basis for nearly every high-performance general-purpose processor today because they can mask the penalties of long-latency operations. Unfortunately, these benefits come at the cost of a substantial amount of power-consuming hardware. In addition, the prevalence of loops in code means that this hardware often duplicates its efforts, rescheduling the same sequence thousands of times within a typical program. While the iterations of a loop are not all identical due to branches and variable-length operations, for the SPEC2017 benchmarks tested, the most common dynamically scheduled instruction pattern accounts for anywhere from 43% to 88% of the reorderings, and the four most common patterns account for anywhere from 70% to 98%. To eliminate some of the duplicated work in finding the same dynamic schedule, the execution patterns that the out-of-order engine creates can be recorded in a cache that indexes patterns by the branch that immediately precedes each pattern. If the same pattern is seen enough times, then much of the dynamic scheduling hardware can be powered off and the previously determined schedule can be used. This powered-off hardware includes the common data bus that broadcasts the output registers written in a particular cycle to every reservation station. Instead, the system can replay instructions based on the recorded order. Many parameters within this system affect both the amount of time spent in this replay mode and the relative performance of the system. These include the start threshold, the stop threshold, the length of time to wait for a replayed instruction to be ready, the mechanism for handling squashes and timeouts, the number of patterns to store for a branch, and the length of history considered.
While the general trade-offs of adjusting these parameters are often the same for most benchmarks, the optimum value of each depends both on the values of the other parameters and on the particular code being run. With this in mind, the proposed system has been shown to spend over 40% of the time in replay mode with only a 2% reduction in performance relative to a standard out-of-order processor. While this system has been shown to work well, the goal of reducing power by removing the duplicated efforts means that the pattern cache size must be limited. These limitations apply to both the length of patterns and the number of patterns that can be stored. Limiting the length of patterns to reasonable sizes has almost no effect on system operation in most cases, but limiting the number of patterns does. While for some benchmarks it is still possible to get most of the performance with a pattern cache size on the order of kilobytes (30% utilization with a 1% performance drop in the best case), other benchmarks reach utilization rates of only half their maximum possible values. The proposed system therefore shows promising initial results, but for it to be an effective power-saving tool, more work must be done to find ways to limit the size of the pattern cache with smaller reductions in utilization rates.	en_US
dc.language	en_US	en_US
dc.subject	Computer	en_US
dc.subject	Out-of-Order	en_US
dc.subject	Power	en_US
dc.subject	Architecture	en_US
dc.title	The Pattern Cache: A Mechanism to Reduce Power Consumption in Out of Order Processors by Removing Duplicated Efforts	en_US
dc.contributor.committeeMember	Sigmarsson, Hjalti
dc.contributor.committeeMember	Dyer, John
dc.date.manuscript	2018-05-07
dc.thesis.degree	Master of Science	en_US
ou.group	College of Engineering::School of Electrical and Computer Engineering	en_US
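
The mechanism summarized in the abstract (patterns indexed by the preceding branch, with replay enabled once a pattern has been seen enough times) can be illustrated with a small software model. This is only a sketch under stated assumptions, not the thesis's design: the class name `PatternCache`, the method `record`, the least-recently-seen eviction policy, and the representation of a pattern as a list of instruction indices are all hypothetical choices made for illustration. Only the start threshold and the per-branch pattern limit correspond to parameters named in the abstract.

```python
from collections import OrderedDict


class PatternCache:
    """Toy model of a pattern cache indexed by the preceding branch.

    Assumption: a pattern is the issue order the scheduler produced,
    given here as a sequence of instruction indices.
    """

    def __init__(self, start_threshold=3, patterns_per_branch=4):
        self.start_threshold = start_threshold
        self.patterns_per_branch = patterns_per_branch
        # branch_pc -> OrderedDict mapping pattern (tuple) -> times seen
        self.table = {}

    def record(self, branch_pc, pattern):
        """Record one observed pattern; return True if replay may start."""
        entries = self.table.setdefault(branch_pc, OrderedDict())
        key = tuple(pattern)
        if key in entries:
            entries[key] += 1
            entries.move_to_end(key)  # mark as most recently seen
        else:
            if len(entries) >= self.patterns_per_branch:
                # Assumed policy: evict the least recently seen pattern.
                entries.popitem(last=False)
            entries[key] = 1
        return entries[key] >= self.start_threshold


cache = PatternCache(start_threshold=3, patterns_per_branch=2)
loop_body = [2, 0, 1, 3]  # hypothetical issue order of four instructions
results = [cache.record(0x400, loop_body) for _ in range(4)]
print(results)  # [False, False, True, True]
```

In this model the third sighting of the same pattern after the same branch crosses the start threshold, which is the point at which the described system could power down scheduling hardware and replay the stored order. The stop threshold, wait time, and squash handling mentioned in the abstract are not modeled here.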

Files in this item

Name: 2018_Morrow_Bradley_Thesis.pdf
Size: 605.0Kb
Format: PDF

Name: 2018_Morrow_Bradley_Thesis.zip
Size: 594.0Kb
Format: Unknown

This item appears in the following Collection(s)

OU - Theses [2188]

SHAREOK™

advancing Oklahoma scholarship, research and institutional memory