RDTSCP - Throughput and Uops
With unroll_count=500 and no inner loop
Code:
0: 0f 01 f9 rdtscp
Results:
Instructions retired: 1.0
Core cycles: 36.0
Reference cycles: 25.45
UOPS_EXECUTED.THREAD: 22.0
RETIRE_SLOTS: 26.0
UOPS_MITE: 2.01
UOPS_MS: 24.0
UOPS_DISPATCHED.INT_EU_ALL: 22.0
UOPS_DISPATCHED.ALU: 8.0
UOPS_DISPATCHED.SLOW: 4.0
UOPS_DISPATCHED.STD: 0.0
UOPS_DISPATCHED.SHIFT: 2.0
UOPS_DISPATCHED.JMP: 6.0
UOPS_DISPATCHED.STA: 0.0
UOPS_DISPATCHED.V0: 0.0
UOPS_DISPATCHED.V1: 0.0
UOPS_DISPATCHED.V2: 0.0
UOPS_DISPATCHED.V3: 0.0
DIV_CYCLES: 0.0
ILD_STALL.LCP: 0.0
UOPS_MITE>=1: 1.0
With unroll_count=500, no inner loop, and 1 NOP
Code:
0: 0f 01 f9 rdtscp
3: 90 nop
Results:
Instructions retired: 2.0
Core cycles: 36.0
Reference cycles: 25.38
UOPS_EXECUTED.THREAD: 22.0
RETIRE_SLOTS: 27.0
UOPS_MITE: 2.99
UOPS_MS: 24.0
UOPS_DISPATCHED.INT_EU_ALL: 22.0
UOPS_DISPATCHED.ALU: 8.0
UOPS_DISPATCHED.SLOW: 4.0
UOPS_DISPATCHED.STD: 0.0
UOPS_DISPATCHED.SHIFT: 2.0
UOPS_DISPATCHED.JMP: 6.0
UOPS_DISPATCHED.STA: 0.0
UOPS_DISPATCHED.V0: 0.0
UOPS_DISPATCHED.V1: 0.0
UOPS_DISPATCHED.V2: 0.0
UOPS_DISPATCHED.V3: 0.0
DIV_CYCLES: 0.0
ILD_STALL.LCP: 0.0
UOPS_MITE>=1: 2.0
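The effect of the extra NOP can be read off by subtracting the two unrolled runs above counter by counter. The following is a minimal sketch of that comparison; the values are copied from the two result lists, while the dictionary names and the helper `counter_deltas` are hypothetical and only serve the illustration.

```python
# Counter values copied from the two runs with unroll_count=500 and no inner loop.
rdtscp_only = {
    "Instructions retired": 1.0,
    "Core cycles": 36.0,
    "UOPS_EXECUTED.THREAD": 22.0,
    "RETIRE_SLOTS": 26.0,
    "UOPS_MITE": 2.01,
    "UOPS_MS": 24.0,
}
rdtscp_plus_nop = {
    "Instructions retired": 2.0,
    "Core cycles": 36.0,
    "UOPS_EXECUTED.THREAD": 22.0,
    "RETIRE_SLOTS": 27.0,
    "UOPS_MITE": 2.99,
    "UOPS_MS": 24.0,
}


def counter_deltas(base, variant):
    """Return variant - base for every counter in base (hypothetical helper)."""
    return {name: round(variant[name] - base[name], 2) for name in base}


deltas = counter_deltas(rdtscp_only, rdtscp_plus_nop)
for name, delta in deltas.items():
    print(f"{name}: {delta:+.2f}")
```

In these numbers the NOP adds one retired instruction, one retire slot, and roughly one MITE uop, while core cycles, executed uops, and UOPS_MS stay the same.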
With loop_count=1000 and unroll_count=10
Code:
0: 0f 01 f9 rdtscp
Results:
Instructions retired: 1.2
Core cycles: 35.8
Reference cycles: 25.24
UOPS_EXECUTED.THREAD: 22.1
RETIRE_SLOTS: 26.1
UOPS_MITE: 2.1
UOPS_MS: 24.02
UOPS_DISPATCHED.INT_EU_ALL: 22.1
UOPS_DISPATCHED.ALU: 8.0
UOPS_DISPATCHED.SLOW: 4.0
UOPS_DISPATCHED.STD: 0.0
UOPS_DISPATCHED.SHIFT: 2.0
UOPS_DISPATCHED.JMP: 6.1
UOPS_DISPATCHED.STA: 0.0
UOPS_DISPATCHED.V0: 0.0
UOPS_DISPATCHED.V1: 0.0
UOPS_DISPATCHED.V2: 0.0
UOPS_DISPATCHED.V3: 0.0
DIV_CYCLES: 0.0
ILD_STALL.LCP: 0.0
UOPS_MITE>=1: 1.1
With loop_count=100 and unroll_count=100
Code:
0: 0f 01 f9 rdtscp
Results:
Instructions retired: 1.02
Core cycles: 35.98
Reference cycles: 25.36
UOPS_EXECUTED.THREAD: 22.01
RETIRE_SLOTS: 26.01
UOPS_MITE: 2.01
UOPS_MS: 24.0
UOPS_DISPATCHED.INT_EU_ALL: 22.01
UOPS_DISPATCHED.ALU: 8.0
UOPS_DISPATCHED.SLOW: 4.0
UOPS_DISPATCHED.STD: 0.0
UOPS_DISPATCHED.SHIFT: 2.0
UOPS_DISPATCHED.JMP: 6.01
UOPS_DISPATCHED.STA: 0.0
UOPS_DISPATCHED.V0: 0.0
UOPS_DISPATCHED.V1: 0.0
UOPS_DISPATCHED.V2: 0.0
UOPS_DISPATCHED.V3: 0.0
DIV_CYCLES: 0.0
ILD_STALL.LCP: 0.0
UOPS_MITE>=1: 1.01
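The fractional parts in the two looped runs (1.2 and 1.02 instructions retired) are consistent with a fixed loop overhead spread across the unrolled copies. The sketch below checks that arithmetic. It assumes the reported values are normalized per unrolled copy of the code, which the listing does not state explicitly, and the function name is hypothetical.

```python
# (loop_count, unroll_count, "Instructions retired") copied from the looped runs.
looped_runs = [
    (1000, 10, 1.2),
    (100, 100, 1.02),
]

# "Instructions retired" from the run with unroll_count=500 and no inner loop.
BASELINE_RETIRED = 1.0


def overhead_per_loop_iteration(unroll_count, retired, baseline=BASELINE_RETIRED):
    """Extra instructions per loop iteration, assuming per-copy normalization."""
    return round((retired - baseline) * unroll_count, 2)


overheads = [
    overhead_per_loop_iteration(unroll, retired)
    for _loop_count, unroll, retired in looped_runs
]
print(overheads)
```

Both runs give the same overhead per loop iteration under this assumption, which suggests that the extra 0.2 and 0.02 come from loop-control instructions and not from RDTSCP itself.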