Fixes an issue where a dual GPU system would keep allocating dGPU PBOs in
`cpu_copy()` every frame, but `cleanup()` was not being called, since the
surface thread for the builtin output was rendered on the
primary/integrated GPU, targeting the same GPU.
Even if a different monitor was compositing on the dGPU, That wouldn't
help since that thread has it's own `GpuManager` with it's own renderer
and cache.
Running cleanup at the end (or start) of each frame seems like a good
idea. Not sure if it would be best to avoid additional calls, or if
that's desirable/fine.