The importance of having a good branch predictor depends on how often conditional branches are executed

The importance of having a good branch predictor depends on how often conditional branches are
executed. Together with branch predictor accuracy, this will determine how much time is spent stalling due to
mispredicted branches. In this exercise, assume that the breakdown of dynamic instructions into various instruction
categories is as follows:

R-type | Beq | JMP | LW  | SW
40%    | 25% | 5%  | 25% | 5%

Also, assume the following branch predictor accuracies:

Always-taken | Always-not-taken | 2-bit
45%          | 55%              | 85%

Stall cycles due to mispredicted branches increase the CPI. What is the extra CPI due to
mispredicted branches with the always-taken predictor? Assume that branch outcomes are determined in
the EX stage, that there are no data hazards, and that no delay slots are used.


Answer:

Stall cycles due to mispredicted branches increase the CPI. What is the extra CPI due to mispredicted branches with the always-taken predictor? Assume that branch outcomes are determined in the EX stage, that there are no data hazards, and that no delay slots are used.

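Misprediction rate of the always-taken predictor = 100% - 45% = 55%

Each mispredicted branch costs 3 stall cycles in this solution, and 25% of the instructions are beq, so:

Extra CPI = 3 * 0.25 * 0.55 = 0.4125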

Repeat 4.15.1 for the “always-not-taken” predictor.

Misprediction rate of the always-not-taken predictor = 100% - 55% = 45%

Extra CPI = 3 * 0.25 * 0.45 = 0.3375
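The two results above follow the same pattern: penalty times branch fraction times misprediction rate. A minimal Python sketch of that arithmetic is shown here; the constant and function names are hypothetical, and the 3-cycle penalty is the value this answer assumes, not something derived in the code.

```python
# Values taken from the problem statement.
BRANCH_FRACTION = 0.25      # share of dynamic instructions that are beq
MISPREDICT_PENALTY = 3      # stall cycles per mispredicted branch (assumed by this answer)

ACCURACY = {"always-taken": 0.45, "always-not-taken": 0.55, "2-bit": 0.85}


def extra_cpi(accuracy, branch_fraction=BRANCH_FRACTION, penalty=MISPREDICT_PENALTY):
    """Extra cycles per instruction caused by mispredicted branches."""
    return penalty * branch_fraction * (1.0 - accuracy)


for name, acc in ACCURACY.items():
    print(f"{name}: extra CPI = {extra_cpi(acc):.4f}")
```

The same function also gives the 0.1125 extra CPI that the 2-bit predictor contributes in the later parts.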

With the 2-bit predictor, what speedup would be achieved if we could convert half of the branch instructions in a way that replaces a branch instruction with an ALU instruction? Assume that correctly and incorrectly predicted instructions have the same chance of being replaced.

CPI without conversion = 1 + 3*(1-0.85)*0.25 = 1.1125

CPI with conversion = 1 + 3*(1-0.85)*0.25*0.5 = 1.05625

Speedup = 1.1125/1.05625 ≈ 1.0533
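Because each converted branch becomes exactly one ALU instruction, the instruction count does not change and the speedup is just the ratio of the two CPIs. A short Python check of that calculation, with hypothetical names and the same assumed 3-cycle penalty:

```python
BRANCH_FRACTION = 0.25     # beq share of dynamic instructions (from the problem)
MISPREDICT_PENALTY = 3     # stall cycles per misprediction (assumed by this answer)
ACCURACY_2BIT = 0.85       # 2-bit predictor accuracy (from the problem)


def cpi(branch_fraction):
    """Base CPI of 1 plus the stall cycles from mispredicted branches."""
    return 1 + MISPREDICT_PENALTY * (1 - ACCURACY_2BIT) * branch_fraction


cpi_before = cpi(BRANCH_FRACTION)
cpi_after = cpi(BRANCH_FRACTION * 0.5)   # half of the branches are gone
speedup = cpi_before / cpi_after

print(f"before={cpi_before:.4f} after={cpi_after:.5f} speedup={speedup:.4f}")
```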

With the 2-bit predictor, what speedup would be achieved if we could convert half of the branch instructions in a way that replaced each branch instruction with two ALU instructions? Assume that correctly and incorrectly predicted instructions have the same chance of being replaced.

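CPI without conversion = 1 + 3*(1-0.85)*0.25 = 1.1125

Cycles per original instruction with conversion = 1 + (1 + 3*(1-0.85))*0.25*0.5 = 1.18125. The converted half of the branches adds one extra instruction each (0.25*0.5 cycles), and the remaining half still stalls on mispredictions (3*(1-0.85)*0.25*0.5 cycles).

Speedup = 1.1125/1.18125 ≈ 0.942, i.e. the conversion slows the program down.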
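Here the instruction count grows, so the comparison has to be made on total cycles per original instruction rather than on the CPI of the new program. The sketch below keeps the two quantities separate to make that distinction visible; all names are hypothetical and the 3-cycle penalty is the one this answer assumes.

```python
BRANCH_FRACTION = 0.25     # beq share of dynamic instructions (from the problem)
MISPREDICT_PENALTY = 3     # stall cycles per misprediction (assumed by this answer)
ACCURACY_2BIT = 0.85       # 2-bit predictor accuracy (from the problem)
CONVERTED = 0.5            # fraction of branches that get replaced
ALU_PER_BRANCH = 2         # ALU instructions substituted for each converted branch

mispredict_rate = 1 - ACCURACY_2BIT

# Baseline cycles per original instruction.
base_cycles = 1 + MISPREDICT_PENALTY * mispredict_rate * BRANCH_FRACTION

# After conversion: instructions executed per original instruction.
instr_count = 1 + BRANCH_FRACTION * CONVERTED * (ALU_PER_BRANCH - 1)

# Only the branches that were not converted can still be mispredicted.
stalls = MISPREDICT_PENALTY * mispredict_rate * BRANCH_FRACTION * (1 - CONVERTED)

new_cycles = instr_count + stalls      # cycles per original instruction
new_cpi = new_cycles / instr_count     # CPI measured on the longer program
speedup = base_cycles / new_cycles

print(f"cycles/orig instr={new_cycles:.5f} new CPI={new_cpi:.4f} speedup={speedup:.4f}")
```

The CPI of the converted program is lower than the baseline, yet execution time is higher because more instructions are executed.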

Some branch instructions are much more predictable than others. If we know that 80% of all executed branch instructions are easy-to-predict loop-back branches that are always predicted correctly, what is the accuracy of the 2-bit predictor on the remaining 20% of the branch instructions?

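The 2-bit predictor is 85% accurate overall, so all of its mispredictions (15% of branches) fall in the remaining 20% of branches:

Misprediction rate on the remaining branches = 0.15/0.2 = 75%

Accuracy on the remaining branches = 100% - 75% = 25%

Check: 0.8*100% + 0.2*25% = 85%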
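The overall accuracy is a weighted average of the accuracy on the easy and the hard branches, so the unknown can be solved for directly. The sketch below does that and reports both the misprediction rate (the 0.15/0.2 ratio) and the accuracy on the hard branches; variable names are hypothetical.

```python
OVERALL_ACCURACY = 0.85    # 2-bit predictor accuracy over all branches (from the problem)
EASY_FRACTION = 0.80       # loop-back branches, always predicted correctly
EASY_ACCURACY = 1.0

hard_fraction = 1 - EASY_FRACTION

# OVERALL = EASY_FRACTION * EASY_ACCURACY + hard_fraction * hard_accuracy
hard_accuracy = (OVERALL_ACCURACY - EASY_FRACTION * EASY_ACCURACY) / hard_fraction
hard_mispredict_rate = 1 - hard_accuracy

print(f"hard-branch misprediction rate = {hard_mispredict_rate:.2%}")
print(f"hard-branch accuracy           = {hard_accuracy:.2%}")
```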