Windows 11 July 2023 Update: Reduces Game Stutter


So 1000Hz and above can enjoy better smoothness?
throttling and coalescing background raw mouse listeners and restricting their message rate
This sounds like capping input (processing) rate to something less than 1000 Hz?!
Timur Born:

This sounds like capping input (processing) rate to something less than 1000 Hz?!
no
But?
Timur Born:

But?
It's interrupt moderation, but for mice.
USB input devices with polling rates above 125 Hz have been a thing for over a decade (maybe two?). It's a shame that Microsoft still hasn't fixed the issues with this. Well, it took 15 years to fix it in Microsoft Office... Maybe we should not be surprised at all.
Astyanax:

It's interrupt moderation, but for mice.
How does that work for periodically "polled" devices? The mouse does not create an interrupt on its own anyway, but has the USB driver poll either every 1 ms (1000 Hz) or every 125 µs (faster than 1000 Hz). So at least for 1000 Hz I don't understand how "restricting their message rate" works? For mice over 1000 Hz that don't need the full 125 µs polling rate it seems more relevant/feasible.
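For anyone following along, the arithmetic behind those intervals can be sketched in a few lines of Python. This is only an illustration; the function name `polling_rate_hz` is made up for this example.

```python
def polling_rate_hz(interval_us: float) -> float:
    """Convert a USB polling interval in microseconds to reports per second."""
    return 1_000_000 / interval_us


# 8 ms is the classic default, 1 ms is a "1000 Hz" mouse,
# 125 µs is the shortest interval mentioned above.
for interval_us in (8000, 1000, 125):
    print(f"{interval_us} µs -> {polling_rate_hz(interval_us):.0f} Hz")
```

So a 125 µs interval corresponds to up to 8000 reports per second, which is why it only matters for mice running above 1000 Hz.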
Timur Born:

How does that work for periodically "polled" devices? The mouse does not create an interrupt.
USB is only polling-based on the USB bus itself. Beyond the bus, from the controller to the xAPIC, it is interrupt-driven. Positional updates are also implemented as an SWI (rather than wired). So the drivers labelled USB*.sys operate as a message signal (polled) and communicate through to *hci.sys, which then raises a hardware interrupt via either a legacy pin (1-2.x) or MSI (3.x). Windows 8 and later already added some major optimizations to the *hci drivers to minimize interrupt storms for particular device usages. For instance, there are a few games that, paired with an Xbox controller using the native XInput stack, hit heavy CPU utilisation because vibration commands are not throttled (i.e., the developer was stupid and constantly sent zeroed actuator commands down to the device and overloaded the CPU). On Windows 7 this would show up as an FPS hit and high System process usage on a USB 2 controller, and just low FPS on a USB 3 controller. The same app on Windows 8+ does not experience either issue. https://forums.guru3d.com/threads/input-lag-issue-usb-xhci-driver-inner-workings.447496/
But with data only arriving at 1000 Hz I would also expect interrupts to only happen at that period anyway (ignoring other USB devices). Either there is a new positional message or there is not?! And you cannot "coalesce" positional messages either, because that would cause corresponding jitter.
Timur Born:

But with data only arriving at 1000 Hz I would also expect interrupts to only happen at that period anyway (ignoring other USB devices). Either there is a new positional message or there is not?! And you cannot "coalesce" positional messages either, because that would cause corresponding jitter.
A great write up for things is in this thread here https://www.overclock.net/threads/usb-polling-precision.1550666/
Astyanax:

A great write up for things is in this thread here https://www.overclock.net/threads/usb-polling-precision.1550666/
This post seems to explain polling rates, buffers, latencies and jitter, plus the corresponding optimizations when you suffer from problems. But I do not see any explanation beyond that. So I have to ask again: how do you "coalesce" and "restrict the message rate" of a mouse device that is passively polled at a fixed 1000 Hz?
OK, hang on: if they are doing interrupt moderation, what is to stop a whole lot of messages getting bundled up and sent as one, causing a noticeable delay between action and reaction? It does sound as if there is the possibility for mouse movements to be dropped in transit in order to reduce interrupts.
VaultDweller:

OK, hang on: if they are doing interrupt moderation, what is to stop a whole lot of messages getting bundled up and sent as one, causing a noticeable delay between action and reaction? It does sound as if there is the possibility for mouse movements to be dropped in transit in order to reduce interrupts.
Bundles are processed more efficiently than rapid singular interrupts.
But if you have a fixed polling rate, how do you do this? Fixed means that the only way to bundle these things up would be to either ignore some polls or delay those polls being reported. Ignoring polls effectively reduces the number of polls and would make the mouse less responsive. Coalescing the data and feeding it through in packets could result in actions and results being either delayed or completely out of sync. In short, it effectively sounds like Microsoft has unilaterally decided that everyone, no matter how much they have spent on a gaming mouse to gain an advantage, will be limited to the same experience: that of one of Microsoft's desktop office mice... With all that Microsoft is doing with OneUI, it sounds like they are trying to eventually force us onto OneHardware. These sorts of decisions are literally Microsoft cutting their own throat. This was the response of a friend of mine: "Nice of Microsoft to assume the PC gaming performance required to handle a 1000Hz polling rate is limited to that of a Surface laptop. This is another reason why continually working towards improving the gaming experience in Linux is so important. BTW, I'm thinking of replacing Windows 10 on my other PC with a Linux distribution. I'll keep a version of Windows on the main PC only for certain titles that won't run or run well on Linux."
I did a rough comparison of ISR + DPC calls of Wdf01000.sys while moving a mouse at 1000 Hz before and after the update. Even before the update the combined count of ISRs + DPCs is only a few hundred (around 300-400) per second for 1000 Hz mouse-movement. After the update I don't see any considerable change in that number. If it is lower then not by an amount considerable enough to be easily measurable. So whatever changed, it seems to be a different change than this. I also wonder what exactly is meant by "background listeners" which is emphasized so prominently in the changelog.
Post on the Blur Busters forum:
Microsoft engineer explaining what has changed:
Hi folks, Microsoft engineer here! I posted back in July 2021 asking about people experiencing FPS drops in games when using high report-rate mice. I got a ton of feedback both in that post and via DMs, and now, I have a new ask - can you test out our improvements? They've officially reached Windows Insider build 25272:
"Made a change to help improve performance when using a high report rate mouse while gaming."
By far, the biggest issue we saw in the traces we examined was a high number of background subscribers to mouse raw input. Every time a mouse input message was generated, we'd also have to deliver these raw input messages to all the subscribers, and all of that simultaneous activity across the recipient processes could cause the game itself to be slower in processing the input (taking away time spent rendering frames). Pushed to an extreme, we could see a game taking longer to process the input than it took for new input to be delivered to it (since high report mice produce input so quickly). This could cause the game to get stuck in a loop of input processing, desperately trying to drain the input queue, all the while starving out its rendering.
We made two major changes to help this scenario:
1. We reduced or removed lock acquisitions from some APIs frequently called in this code path, allowing simultaneous recipients to make better forward progress without getting in each other's way.
2. We capped the rate at which background recipients of raw mouse input could receive messages, so e.g. instead of seeing 1000 messages per second for a 1000Hz mouse, it may only see 125 (but each containing the combined payload of the coalesced messages).
As part of testing, I made a stress-test setup on a laptop and tried out League of Legends - here's the before and after. The results are promising, but I'd love more real-world validation now that the changes are live!
We're especially interested in understanding if the reduction of message generation to background subscribers results in any compatibility issues in applications - e.g. you could have some utility that reports your mouse FPS, and it may only say 125Hz instead of 1000Hz. Please try it out if you can, and let me know how it goes.
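A rough way to picture the second change (the cap on background recipients) is the Python sketch below. It is only an illustration under stated assumptions: the fixed 8 ms window, the `RawMouseEvent` type and the `coalesce` function are made up for this example and are not how Windows actually implements it. The idea is that reports falling into the same time window are merged into one message whose deltas are the sum of the originals, so total movement is preserved while the message count drops.

```python
from dataclasses import dataclass


@dataclass
class RawMouseEvent:
    t_ms: int  # timestamp in milliseconds
    dx: int    # relative X movement in counts
    dy: int    # relative Y movement in counts


def coalesce(events, window_ms=8):
    """Merge events that fall into the same window_ms bucket, summing their deltas."""
    merged = []          # coalesced events, in order
    last_bucket = None   # bucket index of the most recent merged event
    for ev in events:
        bucket = ev.t_ms // window_ms
        if bucket == last_bucket:
            # Same window: fold this report into the pending message.
            merged[-1].dx += ev.dx
            merged[-1].dy += ev.dy
            merged[-1].t_ms = ev.t_ms
        else:
            # New window: start a new message (copy, so the input is untouched).
            merged.append(RawMouseEvent(ev.t_ms, ev.dx, ev.dy))
            last_bucket = bucket
    return merged


# One second of a 1000 Hz mouse, each report moving one count to the right.
events = [RawMouseEvent(t, 1, 0) for t in range(1000)]
background = coalesce(events)
print(len(events), len(background), sum(e.dx for e in background))
# prints "1000 125 1000"
```

In this model a foreground game would still receive all 1000 events, while a background listener would get 125 messages that add up to the same total movement, at the cost of coarser timing.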
what a shit fix :V does this affect wndproc only or even if polling directly the HID device for raw data?
You'd have to ask the developer I guess. From what I understand the mouse is still polled at full rate and then the full rate messages are sent to the foreground gaming process. But "listeners" of "background" processes get the capped/coalesced number of messages instead.
all this makes me regret DirectInput..
Well, many small messages in short time-frame have always been a problem for computers. Even the very slow MIDI protocol still suffers from it as do SSDs. Bandwidth isn't everything and that stuff has to be processed in time. So capping/coalescing mouse-input for background processes doesn't seem like the worst idea. What I don't like is how high frequency mouse-polling seems to be more about fixing the symptoms of a problem (like jitter) instead of fixing the cause (how can *fixed* polling cause noticeable jitter in the first place?).