Your GPU Is Probably Idle
A GPU holding memory isn't the same as a GPU doing work (an H100 can sit at 0% utilization with 20 GiB allocated), and most idle time comes from everything around the card, not the card itself. So feed it from the input pipeline, hand it big tensor-friendly shapes, fuse small kernels with torch....
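One way to see the gap between "holding memory" and "doing work" is to read both numbers side by side. The sketch below is an illustration, not a monitoring tool: it assumes the `nvidia-smi` CLI is on the PATH, and the helper names (`parse_gpu_stats`, `idle_but_allocated`, `query_nvidia_smi`) and both thresholds are hypothetical choices made for this example.

```python
import subprocess

# Hypothetical thresholds, picked for illustration only.
IDLE_UTIL_PCT = 5      # at or below this, treat the GPU as idle
HELD_MEM_MIB = 1024    # at or above this, treat memory as "held"


def parse_gpu_stats(csv_text):
    """Parse rows of 'index, utilization.gpu, memory.used' into dicts.

    Expects the output of nvidia-smi with --format=csv,noheader,nounits,
    where utilization is a percentage and memory is in MiB.
    """
    stats = []
    for line in csv_text.strip().splitlines():
        index, util, mem = (field.strip() for field in line.split(","))
        stats.append(
            {"gpu": int(index), "util_pct": int(util), "mem_mib": int(mem)}
        )
    return stats


def idle_but_allocated(stats, max_util=IDLE_UTIL_PCT, min_mem=HELD_MEM_MIB):
    """Return the GPUs that hold memory while doing little or no work."""
    return [
        s for s in stats
        if s["util_pct"] <= max_util and s["mem_mib"] >= min_mem
    ]


def query_nvidia_smi():
    """Run nvidia-smi once and return parsed stats (needs an NVIDIA driver)."""
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=index,utilization.gpu,memory.used",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_gpu_stats(result.stdout)
```

On a machine with an NVIDIA driver, `idle_but_allocated(query_nvidia_smi())` lists the cards that match the pattern described above. A single reading is only a snapshot, so polling it in a loop during training gives a fairer picture. Parsing is kept separate from the subprocess call so the logic can be checked against a made-up sample such as `"0, 0, 20480"` (GPU 0, 0% utilization, 20 GiB allocated) without a GPU present.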