INVLPG (M8)
Summary:
"Invalidate TLB Entries"
Reference:
https://www.felixcloutier.com/x86/invlpg
Extension:
BASE
Category:
SYSTEM
ISA-Set:
I486REAL
CPL:
0
iform:
INVLPG_MEMb
iclass:
INVLPG
ASM:
INVLPG
Operands
Operand 1 (r): Memory
Available performance data
Arrow Lake-P
Arrow Lake-E
Meteor Lake-P
Meteor Lake-E
Emerald Rapids
Alder Lake-P
Alder Lake-E
Rocket Lake
Tiger Lake
Ice Lake
Cascade Lake
Cannon Lake
Skylake-X
Coffee Lake
Kaby Lake
Skylake
Broadwell
Haswell
Ivy Bridge
Sandy Bridge
Westmere
Nehalem
Wolfdale
Conroe
Tremont
Goldmont Plus
Goldmont
Airmont
Bonnell
AMD Zen 5
AMD Zen 4
AMD Zen 3
AMD Zen 2
AMD Zen+
Arrow Lake-P
Measurements
Throughput
Measured (loop):
226.33
Measured (unrolled):
226.13
Number of μops
Executed: 42
Retire slots: 39
Decoded (MITE): 0
Microcode Sequencer (MS): 41
Requires the complex decoder (no other instruction can be decoded with simple decoders in the same cycle)
Port usage:
22*ALU+3*INT_OTHER+6*JMP+1*SHIFT+5*STA+5*STD
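The port-usage string above gives the number of μops observed per execution-unit group. As a quick sanity check, the per-group counts can be summed and compared with the "Executed" figure. The sketch below assumes the `count*group` terms joined by `+` that appear on this page; the helper name `total_uops` is hypothetical and not part of any tool.

```python
def total_uops(usage: str) -> int:
    """Sum the per-group counts in a string such as '22*ALU+3*INT_OTHER'."""
    # Each term is '<count>*<group>'; only the count before the first '*' matters here.
    return sum(int(term.split("*", 1)[0]) for term in usage.split("+"))


arrow_lake_p = "22*ALU+3*INT_OTHER+6*JMP+1*SHIFT+5*STA+5*STD"
print(total_uops(arrow_lake_p))  # 42, the same as "Executed: 42" above
```

The same check can be applied to the `p…` style strings of the other microarchitectures, since only the leading counts are read.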
Arrow Lake-E
Measurements
Throughput
Measured (loop):
389.08
Measured (unrolled):
384.75
Number of μops
Executed: 103
Microcode Sequencer (MS): 103
Requires the complex decoder
Meteor Lake-P
Measurements
Throughput
Computed from the port usage: 8.00 (if an indexed addressing mode is used: 7.00)
Measured (loop):
220.18 (if an indexed addressing mode is used: 220.03)
Measured (unrolled):
218.17 (if an indexed addressing mode is used: 218.29)
Number of μops
Executed: 42
Retire slots: 40
Decoded (MITE): 0
Microcode Sequencer (MS): 42
Requires the complex decoder (2 other instructions can be decoded with simple decoders in the same cycle)
Port usage:
5*p0+3*p0156B+9*p06+6*p1+1*p15+5*p49+8*p5+5*p78 (if an indexed addressing mode is used: 5*p0+7*p056+9*p06+6*p1+5*p49+5*p78)
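The "Computed from the port usage" figures can be reproduced with a simple lower bound: for every subset of ports, count the μops that can only run on ports inside that subset, divide by the subset size, and take the maximum. The following is a minimal sketch under stated assumptions, not necessarily the procedure used to generate this page: each character after `p` is assumed to name one port (so `B` is treated as a port label), and the function names `parse_port_usage` and `throughput_bound` are hypothetical.

```python
from itertools import combinations


def parse_port_usage(usage: str):
    """Turn '5*p0+9*p06' into [(5, frozenset('0')), (9, frozenset('06'))]."""
    groups = []
    for term in usage.split("+"):
        count, ports = term.split("*p", 1)
        groups.append((int(count), frozenset(ports)))
    return groups


def throughput_bound(usage: str) -> float:
    """Lower bound on cycles per instruction implied by a port-usage string."""
    groups = parse_port_usage(usage)
    all_ports = sorted(set().union(*(ports for _, ports in groups)))
    best = 0.0
    # Brute force over all non-empty port subsets; fine for ~10 ports.
    for size in range(1, len(all_ports) + 1):
        for subset in combinations(all_ports, size):
            chosen = set(subset)
            # μops whose allowed ports all lie inside the chosen subset
            uops = sum(n for n, ports in groups if ports <= chosen)
            best = max(best, uops / size)
    return best


meteor_lake_p = "5*p0+3*p0156B+9*p06+6*p1+1*p15+5*p49+8*p5+5*p78"
meteor_lake_p_indexed = "5*p0+7*p056+9*p06+6*p1+5*p49+5*p78"
print(throughput_bound(meteor_lake_p))          # 8.0
print(throughput_bound(meteor_lake_p_indexed))  # 7.0
```

Both results agree with the computed values listed for Meteor Lake-P. The measured throughput (around 220 cycles) is far higher, which suggests that port contention is not the limiting factor for this instruction.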
Meteor Lake-E
Measurements
Throughput
Measured (loop):
390.82
Measured (unrolled):
387.00
Number of μops
Executed: 103
Microcode Sequencer (MS): 102
Requires the complex decoder
Emerald Rapids
Measurements
Throughput
Computed from the port usage: 8.00 (if an indexed addressing mode is used: 7.33)
Measured (loop):
256.17 (if an indexed addressing mode is used: 251.10)
Measured (unrolled):
245.36 (if an indexed addressing mode is used: 264.78)
Number of μops
Executed: 44
Retire slots: 42
Decoded (MITE): 0
Microcode Sequencer (MS): 44
Requires the complex decoder (no other instruction can be decoded with simple decoders in the same cycle)
Port usage:
5*p0+4*p0156B+9*p06+7*p1+1*p15B+5*p49+8*p5+5*p78 (if an indexed addressing mode is used: 5*p0+8*p056+9*p06+7*p1+5*p49+5*p78)
Alder Lake-P
Measurements
Throughput
Computed from the port usage: 7.00
Measured (loop):
224.15 (if an indexed addressing mode is used: 224.10)
Measured (unrolled):
223.67
Number of μops
Executed: 42
Retire slots: 40
Decoded (MITE): 0
Microcode Sequencer (MS): 42
Requires the complex decoder (no other instruction can be decoded with simple decoders in the same cycle)
Port usage:
5*p0+4*p0156B+8*p06+6*p1+2*p15B+5*p49+7*p5+5*p78 (if an indexed addressing mode is used: 5*p0+8*p056+8*p06+6*p1+5*p49+5*p78)
Alder Lake-E
Measurements
Throughput
Measured (loop):
396.87
Measured (unrolled):
393.00
Number of μops
Executed: 103
Microcode Sequencer (MS): 102
Requires the complex decoder
Rocket Lake
Measurements
Throughput
Computed from the port usage: 9.00
Measured (loop):
261.30
Measured (unrolled):
258.58
Number of μops
Executed: 45
Retire slots: 46
Decoded (MITE): 0
Microcode Sequencer (MS): 48
Requires the complex decoder (no other instruction can be decoded with simple decoders in the same cycle)
Port usage:
5*p0+4*p0156+13*p06+6*p1+1*p23+5*p49+6*p5+5*p78
Tiger Lake
Measurements
Throughput
Computed from the port usage: 9.00
Measured (loop):
260.07
Measured (unrolled):
257.54
Number of μops
Executed: 45
Retire slots: 46
Decoded (MITE): 0
Microcode Sequencer (MS): 48
Requires the complex decoder (no other instruction can be decoded with simple decoders in the same cycle)
Port usage:
5*p0+1*p015+3*p0156+13*p06+6*p1+1*p23+5*p49+6*p5+5*p78
Ice Lake
Measurements
Throughput
Computed from the port usage: 8.50
Measured (loop):
218.00
Measured (unrolled):
216.44
Number of μops
Executed: 43
Retire slots: 42
Decoded (MITE): 0
Microcode Sequencer (MS): 44
Requires the complex decoder (no other instruction can be decoded with simple decoders in the same cycle)
Port usage:
6*p0+4*p0156+11*p06+6*p1+5*p49+6*p5+5*p78
Cascade Lake
Measurements
Throughput
Computed from the port usage: 9.25
Measured (loop):
228.92
Measured (unrolled):
230.47
Number of μops
Executed: 47
Retire slots: 42
Decoded (MITE): 0
Microcode Sequencer (MS): 44
Requires the complex decoder (no other instruction can be decoded with simple decoders in the same cycle)
Port usage:
7*p0+4*p0156+11*p06+7*p1+2*p23+3*p237+5*p4+8*p5
Cannon Lake
Measurements
Throughput
Computed from the port usage: 8.25
Measured (loop):
215.22
Measured (unrolled):
213.12
Number of μops
Executed: 43
Retire slots: 42
Decoded (MITE): 0
Microcode Sequencer (MS): 44
Requires the complex decoder (no other instruction can be decoded with simple decoders in the same cycle)
Port usage:
4*p0+4*p0156+11*p06+8*p1+2*p23+3*p237+5*p4+6*p5
Skylake-X
Measurements
Throughput
Computed from the port usage: 9.50
Measured (loop):
213.80
Measured (unrolled):
212.04
Number of μops
Executed: 47
Retire slots: 42
Decoded (MITE): 0
Microcode Sequencer (MS): 44
Requires the complex decoder (no other instruction can be decoded with simple decoders in the same cycle)
Port usage:
7*p0+3*p0156+12*p06+7*p1+2*p23+3*p237+5*p4+8*p5
Coffee Lake
Measurements
Throughput
Computed from the port usage: 9.50
Measured (loop):
213.38
Measured (unrolled):
210.75
Number of μops
Executed: 47
Retire slots: 42
Decoded (MITE): 0
Microcode Sequencer (MS): 44
Requires the complex decoder (no other instruction can be decoded with simple decoders in the same cycle)
Port usage:
7*p0+3*p0156+12*p06+7*p1+2*p23+3*p237+5*p4+8*p5
Kaby Lake
Measurements
Throughput
Computed from the port usage: 9.50
Measured (loop):
213.38
Measured (unrolled):
210.75
Number of μops
Executed: 47
Retire slots: 42
Decoded (MITE): 0
Microcode Sequencer (MS): 44
Requires the complex decoder (no other instruction can be decoded with simple decoders in the same cycle)
Port usage:
7*p0+3*p0156+12*p06+7*p1+2*p23+3*p237+5*p4+8*p5
Skylake
Measurements
Throughput
Computed from the port usage: 9.50
Measured (loop):
213.38
Measured (unrolled):
210.94
Number of μops
Executed: 47
Retire slots: 42
Decoded (MITE): 0
Microcode Sequencer (MS): 44
Requires the complex decoder (no other instruction can be decoded with simple decoders in the same cycle)
Port usage:
7*p0+3*p0156+12*p06+7*p1+2*p23+3*p237+5*p4+8*p5
Broadwell
Measurements
Throughput
Computed from the port usage: 11.00
Measured (loop):
209.02
Measured (unrolled):
206.75
Number of μops
Executed: 49
Retire slots: 42
Decoded (MITE): 0
Microcode Sequencer (MS): 44
Requires the complex decoder (no other instruction can be decoded with simple decoders in the same cycle)
Port usage:
7*p0+3*p0156+15*p06+6*p1+2*p23+3*p237+5*p4+8*p5
Haswell
Measurements
Throughput
Computed from the port usage: 10.50
Measured (loop):
210.02
Measured (unrolled):
207.84
Number of μops
Executed: 49
Retire slots: 42
Decoded (MITE): 0
Microcode Sequencer (MS): 44
Requires the complex decoder (no other instruction can be decoded with simple decoders in the same cycle)
Port usage:
7*p0+3*p0156+14*p06+7*p1+2*p23+3*p237+5*p4+8*p5
Ivy Bridge
Measurements
Throughput
Computed from the port usage: 19.00
Measured (loop):
238.70
Measured (unrolled):
236.56
Number of μops
Executed: 49
Retire slots: 45
Decoded (MITE): 1
Microcode Sequencer (MS): 46
Requires the complex decoder (no other instruction can be decoded with simple decoders in the same cycle)
Port usage:
10*p0+1*p015+10*p1+5*p23+4*p4+19*p5
Sandy Bridge
Measurements
Throughput
Computed from the port usage: 18.00
Measured (loop):
235.82
Measured (unrolled):
233.63
Number of μops
Executed: 49
Retire slots: 45
Decoded (MITE): 1
Microcode Sequencer (MS): 46
Requires the complex decoder (no other instruction can be decoded with simple decoders in the same cycle)
Port usage:
11*p0+1*p015+10*p1+5*p23+4*p4+18*p5
Westmere
Measurements
Throughput
Computed from the port usage: 16.00
Measured (loop):
203.80
Measured (unrolled):
203.00
Number of μops
Executed: 48
Retire slots: 45
Microcode Sequencer (MS): 133
Requires the complex decoder
Port usage:
11*p0+5*p015+8*p1+2*p2+3*p3+3*p4+16*p5
Nehalem
Measurements
Throughput
Computed from the port usage: 16.00
Measured (loop):
211.80
Measured (unrolled):
211.00
Number of μops
Executed: 49
Retire slots: 45
Microcode Sequencer (MS): 141
Requires the complex decoder
Port usage:
8*p0+4*p015+11*p1+2*p2+4*p3+4*p4+16*p5
Wolfdale
Measurements
Throughput
Computed from the port usage: 33.00
Measured (loop):
250.17
Measured (unrolled):
249.09
Number of μops
Executed: 57
Port usage:
14*p0+11*p1+7*p2+6*p3+6*p4+33*p5
Conroe
Measurements
Throughput
Computed from the port usage: 33.00
Measured (loop):
248.70
Measured (unrolled):
246.48
Number of μops
Executed: 60
Port usage:
13*p0+15*p1+7*p2+6*p3+6*p4+33*p5
Tremont
Measurements
Throughput
Measured (loop):
72.78
Measured (unrolled):
72.00
Number of μops
Executed: 16
Microcode Sequencer (MS): 15
Requires the complex decoder
Goldmont Plus
Measurements
Throughput
Measured (loop):
60.03
Measured (unrolled):
59.75
Number of μops
Executed: 18
Microcode Sequencer (MS): 18
Requires the complex decoder
Goldmont
Measurements
Throughput
Measured (loop):
61.00
Measured (unrolled):
61.00
Number of μops
Executed: 21
Microcode Sequencer (MS): 21
Requires the complex decoder
Airmont
Measurements
Throughput
Measured (loop):
71.44
Measured (unrolled):
70.87
Number of μops
Executed: 21
Microcode Sequencer (MS): 21
Requires the complex decoder
Bonnell
Measurements
Throughput
Measured (loop):
59.04
Measured (unrolled):
59.00
Number of μops
Executed: 20
Microcode Sequencer (MS): 20
Requires the complex decoder
AMD Zen 5
Measurements
Throughput
Measured (loop):
146.90
Measured (unrolled):
144.62
Number of μops
Executed: 27
Documentation
Latency: NA
Throughput: NA
Number of μops: ucode
Port usage: ucode
AMD Zen 4
Measurements
Throughput
Measured (loop):
118.72
Measured (unrolled):
118.00
Number of μops
Executed: 26
Documentation
Number of μops: ucode
AMD Zen 3
Measurements
Throughput
Measured (loop):
119.00
Measured (unrolled):
118.40
Number of μops
Executed: 26
Documentation
Number of μops: ucode
AMD Zen 2
Measurements
Throughput
Measured (loop):
127.80
Measured (unrolled):
126.00
Number of μops
Executed: 24
Documentation
Number of μops: ucode
AMD Zen+
Measurements
Throughput
Measured (loop):
127.53
Measured (unrolled):
126.00
Number of μops
Executed: 24
Documentation
Number of μops: ucode