Another graphics card question

Posted: Sat Jul 23, 2022 9:18 pm
by Mike in Rancho
So with prices starting to come down, I've begun considering replacing the mostly-pathetic GTX 745 in my desktop. As maybe the best bang for the buck, I have started leaning towards perhaps an RTX 3060ti. Not quite at MSRP yet, but perhaps soon. Also, I think Nvidia and AMD should be releasing the next gen cards this year, hopefully in months, which might also help with prices on GPUs.

I think I recall that past threads directed us to the OpenCL benchmarks for ST's GPU performance? Although those benchmarks and the various gaming/speed rankings don't exactly match, they are often close, and the 3060 Ti seems pretty decent in all of them. Though of course it's no 3090.

One thing I just read about the 30xx series cards though, at least after a certain manufacturing date, is that they implement LHR as an anti-crypto feature. Not that I know anything about crypto, but since ST uses the GPU as an offload, when appropriate, for number crunching and calculations, is ST affected by LHR? And if so, enough that one should drop back to a prior gen like 20xx?

Re: Another graphics card question

Posted: Sun Jul 24, 2022 6:33 am
by hixx
Hi Mike,
sorry, I don't know anything about the crypto restriction piece either, but what is really a must for ST is larger VRAM. The VRAM has fast access, so ST is able to push & pull data more quickly. In that light it might be wiser to go for a middle-of-the-road GPU with the maximum amount of VRAM rather than a top-end model with just a standard "graphics task" amount of VRAM. On my Apple M1 Max I have the binned 24-core GPU version, and I did some GPU testing with some ST modules:
WIPE showed GPU around 20% AVG, peaking up to some 90% or so
HDR is another long-runner with the GPU around 10% loaded mostly and some bursts up to 90%
SVDecon has a huge amount of GPU bursts around 80-90%
The one GPU hog is SuperStructure though: it fully loads the GPU 100% in 4 phases towards the end of the calculation

Of course this is Apple, not NVIDIA, but I assume you should be fine with the 3060 and enough VRAM. Most calculations have large parts leaning on the CPU anyway, so the number of calculations where the GPU becomes the limiting factor is very small. Even my 2015 iMac with an aged AMD390 R9 had decent GPU ST performance compared to my M1 Max. Usually the limiting factor is rather the VRAM.
Clear Skies,
Jochen

Re: Another graphics card question

Posted: Sun Jul 24, 2022 7:26 pm
by Mike in Rancho
Thanks Jochen,

My current card has 4GB of VRAM, and any new one should have more, possibly at least 8. However, when I use various performance monitors, I haven't really seen ST use much of the VRAM at all.

A faster CPU would probably speed things up more for ST overall, but that won't be happening any time soon. A GPU upgrade would help with a number of things, and if it gives ST a little boost also that'll be icing on top.

A 3060 Ti does seem to have a good OpenCL benchmark - though whether that was measured on the initial FE/reference card before all 30xx series implemented LHR, I don't know.

Re: Another graphics card question

Posted: Mon Jul 25, 2022 1:37 am
by admin
I think the Low Hash Rate cards work by detecting specific patterns in behavior. Those patterns apparently have to do with lots of random memory location access. I don't think this should impact StarTools much, though it is possible something like SVDecon may get caught in this, as it switches/interpolates/jumps between a few thousand different point spread functions during decon. I think you should be safe, but I can't say for sure.

Jochen brings up a good point about VRAM. Large datasets will require *a lot* of VRAM. Depending on your future needs (big mosaics? massive JWST datasets?), this may or may not be something to take into account.

Re: Another graphics card question

Posted: Mon Jul 25, 2022 11:16 pm
by Mike in Rancho
Thanks Ivo!

Interesting that a card might think SVD is...mining. :lol:

However, by VRAM I presume you and Jochen are talking about the GPU card's onboard memory? i.e. Video RAM...not virtual?

With my current 4GB GPU, I've run various monitors and I don't think I've seen it go much over 1GB used of the dedicated GPU memory. One was just for a split second (perhaps in Wipe or AutoDev, I forget) to maybe 1.2 gig, and then when I ran SS on a large file it used 0.9 gig of the GPU RAM for quite a bit. Pretty large file too.

Now, plenty of the mainboard's 32GB of RAM was used by ST throughout processing, but why so little of the GPU RAM? And if ST will only use 1GB of the 4GB I have now, would there be any point to a GPU with 8GB? Other than that the GDDR6 might be faster, and that so many new GPUs just happen to come with a decent chunk of RAM...

:?:

Is it possible I have something set wrong in Windows? Or that I am using the wrong diagnostic monitors? I tried the GPU performance section of Task Manager, GPU-Z, and the statistics graphs of MSI Afterburner. Not all at the same time of course.
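As a cross-check on the GUI monitors mentioned above, dedicated GPU memory can also be read from the command line. The following is a minimal sketch, assuming an NVIDIA card whose driver ships the `nvidia-smi` tool and has it on the PATH. The helper names `parse_memory_csv` and `read_gpu_memory` are made up for illustration, and some cards may report `[N/A]` for these fields, which the parser turns into `None`.

```python
import subprocess

# nvidia-smi query: one CSV row per GPU, values in MiB, no header, no units.
QUERY = [
    "nvidia-smi",
    "--query-gpu=name,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def _to_int(value):
    """Convert a numeric field to int; return None for '[N/A]' or similar."""
    return int(value) if value.isdigit() else None


def parse_memory_csv(text):
    """Parse nvidia-smi CSV output into a list of per-GPU dicts."""
    rows = []
    for line in text.strip().splitlines():
        # Split from the right so a comma inside the GPU name cannot break parsing.
        name, used, total = [field.strip() for field in line.rsplit(",", 2)]
        rows.append({
            "name": name,
            "used_mib": _to_int(used),
            "total_mib": _to_int(total),
        })
    return rows


def read_gpu_memory():
    """Run nvidia-smi once and return the parsed memory readings."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_memory_csv(result.stdout)


# Usage (needs an NVIDIA driver installed):
#   for gpu in read_gpu_memory():
#       print(gpu["name"], gpu["used_mib"], "/", gpu["total_mib"], "MiB")
```

Calling `read_gpu_memory()` in a loop while a module runs would show whether peak usage agrees with what Task Manager or GPU-Z report. Note that the figure covers all processes using the card, not StarTools alone.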

Re: Another graphics card question

Posted: Tue Jul 26, 2022 5:34 am
by hixx
Hi Mike,
yes, VRAM means Video RAM, not virtual. VRAM is a type of RAM with much faster throughput than normal RAM. For ST calculations you'll want it to hold the 3 channels of a full image (e.g. 24 MB x 3 (RGB) x 2 (32 bit) = 144 MB), multiplied by a number of copies, e.g. in modules using iterations. Also you'll need actual "video" memory on top to support the graphics. That adds up to a pretty large number, but with 8 GB I suppose you'll be OK for 99.9% of all calculations. If your VRAM runs out, it will push & shove portions of data between VRAM and normal RAM or SSD, so your limiting factor would become the bus architecture, RAM or SSD. I remember there is a thread somewhere around here on the forum regarding VRAM.
Clear Skies,
Jochen
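The back-of-envelope figure in the post above (24 x 3 x 2 = 144 MB) can be written out as a small calculation. This is only a sketch under stated assumptions: the 24 is read as megapixels, the factor 2 as bytes per sample (16 bit), and megabytes are decimal (10^6 bytes). With 4-byte (32-bit) samples the same image would come to 288 MB. The function name and the `copies` parameter are hypothetical, and nothing here reflects how StarTools actually allocates GPU memory.

```python
def estimate_buffer_mb(megapixels, channels=3, bytes_per_sample=2, copies=1):
    """Rough size in MB of an uncompressed image held `copies` times.

    Megapixels times bytes per pixel gives decimal megabytes directly,
    because both carry the same factor of one million.
    """
    return megapixels * channels * bytes_per_sample * copies


# The figure from the post: 24 MP, RGB, 2 bytes per sample.
single_copy = estimate_buffer_mb(24)                      # 144

# Same image with 4-byte samples (an assumption, e.g. 32-bit floats).
float_copy = estimate_buffer_mb(24, bytes_per_sample=4)   # 288

# Hypothetical module keeping 4 working copies at 4 bytes per sample.
working_set = estimate_buffer_mb(24, bytes_per_sample=4, copies=4)  # 1152
```

Even the last, fairly generous case stays a little above 1 GB, which is in the same range as the usage reported earlier in the thread for a 4GB card.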

Re: Another graphics card question

Posted: Wed Jul 27, 2022 10:20 pm
by Mike in Rancho
Thanks Jochen,

No, my point was that my VRAM is not even close to running out, even on the old-school GPU. I did momentarily get it up to 1.5GB used the other night working on my mosaic, which is like two 26 megapixel files stitched together. But it was an HOO. Maybe when I add a third channel to make it SHO? Or if I had an NBAccent file?

Oh well, I'll keep an eye on the monitors, but with the stuff I do I haven't seen even my 4GB VRAM get threatened. But a newer card is likely to have faster speed overall, plus more and faster VRAM.

I suppose I do have to wonder a bit though if my "new to me" i7-6700 is hindering ST. At least for unbinned mosaics and extra large files. For my normal resolution processing things don't seem problematically slow.

Re: Another graphics card question

Posted: Fri Sep 23, 2022 10:42 pm
by riverpoet
To revive this topic... I've been trying to get StarTools to process every step in an "instant" (=5-10s or so) - working on my 20MP FITS stacks.
Upgraded 5600X to 5900X, so double the threads/cores, not much difference.
RAM disk = some difference.
GPUs:
- RX 580 - decent
- RX 5700XT - close to what I wanted
- RTX 3060 12GB LHR - I'd say slower than 5700
- RTX 3060 Ti 8GB LHR - similar to 3060, perhaps even slower
- RTX 3070 Ti 8GB LHR - faster than 3060/3060 Ti/580, but I think 5700XT was faster

Today, my brother asks me to check if his RTX 3070 FHR works on my computer, as it doesn't seem to be working on 2 machines he tried. It is detected and works fine, so I try to give it a "stress test" using StarTools. To my shock, this card seems the fastest of the ones I tried by a mile! Most functions take less than 5s, some 5-10s, Decon and NR 45s. The biggest bottleneck seems to be clicking the Keep button, which pauses for like 5s?
(BTW, how come HDR and SuperStructure seem slower when running StarTools GPU than StarTools CPU? Doesn't the GPU exe use the CPU when appropriate?)

Now this was not a direct, numbers-based comparison of GPUs, but as I am impatient and want StarTools to work as fast as, say, Photoshop, I think my subjective impression is not far from reality. So, I now suspect StarTools is indeed quite noticeably affected by LHR. So, better to buy either an AMD card, an old "FHR" 3000 series card or an RTX 3090...

Peter

Re: Another graphics card question

Posted: Sat Sep 24, 2022 2:10 am
by KlausKlaus
On my MBA M2 (16GB RAM, 512GB) every step apart from HDR and, even more so, SuperStructure takes just an instant.

Klaus

Re: Another graphics card question

Posted: Sat Sep 24, 2022 8:37 pm
by Mike in Rancho
Excellent info from your experiences, Peter, thanks! I've still been browsing prices on occasion, so it is good to know that ST's GPU offloading might be hampered by LHR. It would be interesting if you had any numbers to put to the experiments, with the same dataset and prior processing. Of course much will probably also depend on your CPU and overall rig, which seems fast.
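To put numbers to such a comparison, stopwatch timings per module could be collected for each card (same dataset, same prior processing) and reduced to totals and a relative speed. A minimal sketch follows; the function name `summarize`, the data layout, and the choice of baseline are assumptions made for illustration, and the timings themselves would have to be measured by hand.

```python
def summarize(timings, baseline):
    """Total the per-module seconds for each card and compare to a baseline.

    timings:  {card_name: {module_name: seconds}}
    baseline: card_name used as the reference (speedup 1.0)
    """
    totals = {card: sum(modules.values()) for card, modules in timings.items()}
    base_total = totals[baseline]
    return {
        card: {
            "total_s": total,
            # Values above 1.0 mean the card finished faster than the baseline.
            "speedup_vs_baseline": base_total / total,
        }
        for card, total in totals.items()
    }


def print_summary(summary):
    """Print cards from fastest to slowest total time."""
    for card, row in sorted(summary.items(), key=lambda item: item[1]["total_s"]):
        print(f"{card}: {row['total_s']:.1f} s, "
              f"{row['speedup_vs_baseline']:.2f}x vs baseline")
```

Feeding it one dictionary per card, keyed by module (Wipe, SVDecon, NR and so on), would make it clear whether an LHR card is slower across the board or only in specific modules.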

Is that 3070 an older model, perhaps FE? I wasn't aware that you could get these in FHR anymore, but I should look. I don't know if I want to trust eBay though on something that could be a few hundy.

New 1.8 HDR is very heavily CPU dependent, without a lot of GPU offloading. Just the type of calculations required, according to Ivo. There's a big thread here on it where several of us ran timed experiments trying out various module parameters to see their effect on completion times.

I thought SS did use GPU offloading. Either way, I would be surprised if the non-GPU version ran the modules faster.