This can't be right - 14% CPU utilization render

Terje wrote on 4/16/2013, 5:42 PM
So, I went out and bought me a new workstation. Alienware thing with an Intel Core i7-3930K CPU, 16 GB of RAM and a GTX 660 video card. Just plugged it in and put Vegas on it. Oh, man, I am disappointed.

This thing renders twice as fast as my old AMD quad core thing, but the AMD is ancient, so its four cores are nowhere near 75% of the hexacore. It should render at least 3x faster, or better. Geekbench clocks my AMD at 5000 or something, and the Alienware at above 20,000.

I did not expect Vegas to utilize my GPU - I don't think the Kepler architecture is supported yet. I did expect it to be able to utilize my CPU though. A little bit.

Simple project with some quicktime movs with transparency as overlays, some text, a few images, a couple of AVCHD video clips. Render to AVCHD. The CPU Utilization never goes above 15%. On my AMD it never goes above about 40%, so I am not only disappointed with the new box, the old one wasn't being taxed either.

This can't be right though. Why on earth would Vegas only use 14% CPU for rendering? I have (well, actually, Vegas did) set render threads to 16.

Is there anything I can do to make Vegas use my CPU?

Comments

videoITguy wrote on 4/16/2013, 6:01 PM
Whoa, shame on you for letting Alienware put the 660 card on your tab. Don't know what you're expecting....
musicvid10 wrote on 4/16/2013, 6:09 PM
Your render is only as fast as the slowest link in the chain. Therein lies your problem, not the new computer or Vegas.
All that 14% means is that your CPU is not the bottleneck.
;-)
Terje wrote on 4/16/2013, 6:50 PM
Seems like that is an issue, yes. I am wondering if Quicktime, which I have had very bad experiences with on Windows, is the problem. I tried the good old Rendertest from John Cline, and I got some encouraging data regarding GPU rendering.

GPU rendering on: CPU at about 30% - render time 42 seconds
GPU rendering off: CPU at about 50% - render time 182 seconds

So, GPU definitely makes a difference here.

Tried another project with some more complexity, put the project on an SSD, rendering to another SSD to reduce disk overhead. Was not overly impressed with the results, but perhaps that is attributable to QT. Since I have similar projects sitting around for Premiere, I will try that tomorrow.
Terje wrote on 4/16/2013, 6:53 PM
@videoITguy

>> Alienware must have seen you coming ...
>> such light video timeline requirements...if that is your typical workflow,

Dude, next time, before answering questions from someone, take a look in the mirror. If what you see is a huge snooty orifice that appears to be brown'ish in color and containing parts of yesterday's dinner, please refrain from posting anything online it is unlikely that it is going to be helpful, or actually anything but what resembles arrogant yellow'ish fluid.
NormanPCN wrote on 4/16/2013, 7:27 PM
One thing to remember: with a CPU that supports hyperthreading, 50% CPU use can really mean a fully loaded CPU. Hyperthreading gives a 10-30% boost. A hyperthreaded logical core is not a whole CPU, even though Windows shows it as one in Task Manager.

That said, if the application has enough threads in flight you should peg at 90+%.
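
To put rough numbers on that, here is a minimal Python sketch. The function name `effective_load` and the model behind it are assumptions for illustration, not anything measured in this thread: two logical CPUs per physical core, idle physical cores are filled before any second hardware thread is used, and a second thread adds only `ht_gain` (0.1 to 0.3) of a core's throughput.

```python
def effective_load(reported_pct, ht_gain=0.2):
    """Estimate how much real throughput is in use for a Task Manager reading.

    Assumed model (hypothetical, for illustration only):
    - 2 logical CPUs per physical core
    - the scheduler fills idle physical cores first
    - a second thread on a busy core adds only `ht_gain` of a core
    """
    if not 0 <= reported_pct <= 100:
        raise ValueError("reported_pct must be between 0 and 100")
    r = reported_pct / 100.0
    first_threads = min(r, 0.5) * 2.0          # share of physical cores busy
    second_threads = max(r - 0.5, 0.0) * 2.0   # share of sibling threads busy
    return 100.0 * (first_threads + ht_gain * second_threads) / (1.0 + ht_gain)


for reading in (14, 50, 100):
    print(reading, round(effective_load(reading, ht_gain=0.2), 1))
```

Under those assumptions a 50% reading already corresponds to most of the chip's real throughput, while a 14% reading is still a mostly idle CPU.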

How many render threads do you have set in preferences?
Laurence wrote on 4/16/2013, 7:30 PM
>Dude, next time, before answering questions from someone, take a look in the mirror. If what you see is a huge snooty orifice that appears to be brown'ish in color and containing parts of yesterday's dinner, please refrain from posting anything online it is unlikely that it is going to be helpful, or actually anything but what resembles arrogant yellow'ish fluid.

That is about the best slam I have ever read! I humbly bow at your feet ;-)
Terje wrote on 4/16/2013, 7:34 PM
>> How many render threads do you have set in preferences?

16 threads set in prefs. Will investigate further but probably not before the week end. Looks like the GPU is picking up some of the work load. Going to remove QT from the equation too.
Barry W. Hull wrote on 4/16/2013, 8:04 PM
Dude? Funny.
NormanPCN wrote on 4/17/2013, 9:24 AM
The Vegas default of 16 is excessive and actually counterproductive. It causes excess context switches. At most, the thread count should equal the number of logical cores your CPU has. There is a strong argument for a little less, but that argument gets quite technical.
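
As a rough illustration of that rule of thumb, here is a small Python sketch. The helper `suggested_render_threads` and its `reserve` parameter are hypothetical names made up for this example; only `os.cpu_count()`, which reports logical CPUs, is a real library call. The i7-3930K has 6 cores and 12 logical CPUs, so that is the value used below.

```python
import os


def suggested_render_threads(requested, logical_cpus=None, reserve=0):
    """Cap a requested thread count at the logical CPU count minus `reserve`.

    `reserve` is a hypothetical knob for the "a little less" argument:
    leave that many logical CPUs free for the OS and other work.
    """
    if logical_cpus is None:
        logical_cpus = os.cpu_count() or 1  # os.cpu_count() may return None
    cap = max(1, logical_cpus - reserve)
    return max(1, min(requested, cap))


# 16 requested threads on a 6-core / 12-thread CPU
print(suggested_render_threads(16, logical_cpus=12))             # 12
print(suggested_render_threads(16, logical_cpus=12, reserve=2))  # 10
```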

The only QT I use is with DNxHD as the codec, and I notice that QT spools to disk. Silly, in my mind. At the end of the encode, Vegas is done feeding frames to the encoder and QT keeps going.

Looking at disk activity you see it, the QT file surrogate, reading from a temp file and writing to the final file. This is a waste and slows things down. Better to write to the disk once. This is aggravated by DNxHD having large files, since it is an intermediate codec.

rmack350 wrote on 4/17/2013, 11:02 AM
Norman,

Where would I look to observe Quicktime spooling to disk?

Rob

<Edit>Forget I asked. Disk Activity is viewable in Resource Monitor.</edit>
Hulk wrote on 4/17/2013, 11:43 AM
musicvid,

Can you explain what you mean here?

"Your render is only as fast as the slowest link in the chain. Therein lies your problem, not the new computer or Vegas.
All that 14% means is that your CPU is not the bottleneck."


It seems that no matter what CPU/GPU combination you have it is the job of Vegas to utilize both of them to the fullest extent possible for the best user experience.
NormanPCN wrote on 4/17/2013, 12:00 PM
You can use Resource Monitor via Task Manager to see that. You will see a process that Quicktime created called FileIoSurrogate (if I remember correctly).

I have only rendered to DNxHD via Quicktime and this is what I have observed. In other words, other QT codecs may behave differently.

If your output file is MyMovie.mov, you will see a MyMovie.mov.tmp in the same directory as your final file. Vegas finishes sending frames to the encoder long before the encode is finished. Meaning you stop seeing frames go by in the preview window, and the frame counter stops. During this final phase I looked at resource monitor at all disk activity and see FileIoSurrogate with the activity. The tmp file was the full size and the final mov was growing in size. FileIOSurrogate was reading as much as it was writing. It was reading from the temp and writing to the final mov file.

This is the basis of my statement about Quicktime spooling.
dxdy wrote on 4/17/2013, 7:15 PM
Thinking back to tomshardware.com tests, QT is always described as single threaded.

My 3730k routinely runs high 90s with MC MP4 renders.
NormanPCN wrote on 4/18/2013, 9:49 AM
Now that makes sense. For internal codecs Vegas can control the number of threads. Quicktime is its own system; it will do what it wants unless it exposes API features to control encode threads, and it probably does not.
musicvid10 wrote on 4/18/2013, 11:23 AM
Hulk,
The codec and filter libraries determine initial thread / core utilization, only rarely the rendering engine. Unfortunately, many libraries still use legacy architecture and have never been updated for hyperthreading or even multicore utilization.

If a pipe has a 1/2" inlet (the net decoder->filters->encoder output), and the outlet (engine->CPU) has a 1" opening, the output flow will still be limited as if the entire pipe was only 1/2" in diameter.

Flow = Pressure / Resistance.

This is true of every codec/filter/rendering configuration I know of. For instance, Handbrake has powerful hyperthreading capabilities, a potential to process hundreds of frames per second. However, if one of the filters or codecs in the chain utilizes only one processing thread, that is the bottleneck. Think of an automobile engine with a fuel restrictor. It can only use what it is given.
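
The slowest-link argument can be sketched in a few lines of Python. The stage names and frame rates below are invented for illustration and are not measurements from this thread; the only point is that a serial chain runs at the rate of its slowest stage.

```python
def find_bottleneck(stage_fps):
    """Return (slowest_stage_name, pipeline_fps) for a serial render chain."""
    if not stage_fps:
        raise ValueError("need at least one stage")
    name = min(stage_fps, key=stage_fps.get)
    return name, stage_fps[name]


# Hypothetical per-stage throughput in frames per second
chain = {"decode": 120.0, "filters": 45.0, "single_threaded_encode": 12.0}
slowest, fps = find_bottleneck(chain)
print(f"{slowest} limits the render to {fps} fps")
```

With numbers like these, the other stages sit idle most of the time, which is what a low overall CPU reading would look like.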
Hulk wrote on 4/18/2013, 1:47 PM
musicvid,

I am very familiar with the continuity equation (and the Navier-Stokes equations for that matter, that was the boring part of fluid dynamics in college;).

If I am reading you correctly you are saying that if the encoder or decoder is coded for single thread only then that would be the bottleneck for the entire render/preview?

While I understand that there is only a finite amount of instruction level parallelism to be exploited in legacy code, I think you would agree that any semi-modern processor can decode any codec significantly faster than 30fps. So it doesn't seem logical that the decode would be a bottleneck.

And if one core/thread were occupied with the decode there would still be additional cores at the ready for filtering, and additional cores on top of that available for encoding.

The point being that it seems incredible that a modern application like VP would not be able to exploit the parallel nature of both rendering and previewing the timeline.

Video is one of the text book cases "built" for multicore/multithread processing. Vegas is software and if the engineers find a bottleneck they aren't supposed to throw up their arms in the air and give up. They are supposed to "open the pipe," create a bypass, code their way out of it.

I am an avid Vegas supporter, have been since V3, and will remain so. I will give credit when deserved and criticize when deserved.

The lack of full multi-threading in Vegas, in this day and age of multicores, is inexcusable. I remember when Vegas first went multi-threaded, they simply devoted a core to video and a core to audio. A very caveman-like, coarse-grained multithreading approach. Yes, it was easy and provided some additional core usage, but that's about it.

The lack of full CPU/GPU utilization still being reported around here by Vegas users hints that the Vegas engineers have still not implemented a more fine-grained threading approach.

All of the Vegas features are fantastic but creative video editing comes down to two primary factors:
1. A stable platform that doesn't impede your creative flow
2. The ability to work "in the moment" with realtime previews

- Mark
NormanPCN wrote on 4/18/2013, 3:14 PM
Hulk,

From my observation, Vegas is fully threaded in all cases but one: one item of note is single threaded.

Each track video stream appears to be single threaded. I have edited GoPro Hero3 Black 1080p60 30Mbps H.264 files and Vegas cannot decode that and playback smoothly. While doing this nearly all my cores are unused. So it appears the decode of a stream is single threaded.

Now they could write multi-threaded decoders as applicable, but when someone has a large project with many video tracks simultaneously in flight (aka not empty), multi-threaded decode is actually a hindrance once the number of threads equals the number of cores. Thread context switches cost time. This is a fine point, but I can understand the argument for a single-threaded decode. Why optimize for the trivial case? This is why high-load servers use thread pools rather than thread-per-client architectures.
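
A minimal Python sketch of the thread-pool idea, for anyone who has not seen one. Everything here is hypothetical: `decode_frame` is a stand-in for real decode work and says nothing about how Vegas is built. Also note that CPython threads do not run CPU-bound code in parallel, so this only illustrates the scheduling pattern (many jobs, a fixed number of worker threads), not a speed-up.

```python
import os
import threading
from concurrent.futures import ThreadPoolExecutor


def decode_frame(track, frame):
    """Hypothetical stand-in for decoding one frame of one track."""
    checksum = sum((track * 31 + frame + i) % 251 for i in range(1000))
    return track, frame, threading.current_thread().name, checksum


def decode_all(tracks, frames_per_track, workers=None):
    """Run every (track, frame) job on a fixed-size pool of worker threads."""
    workers = workers or os.cpu_count() or 1
    jobs = [(t, f) for t in range(tracks) for f in range(frames_per_track)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in job order
        return list(pool.map(lambda job: decode_frame(*job), jobs))


results = decode_all(tracks=8, frames_per_track=5, workers=4)
threads_used = {name for _, _, name, _ in results}
print(len(results), "frames decoded on", len(threads_used), "worker thread(s)")
```

However many tracks are in flight, the number of threads never exceeds the pool size, which is what keeps context switching in check.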

That said. I agree that decode is not any bottleneck most of the time, but high bitrate H.264 is harsh to decode. For example, I Smart proxied those 1080p60 files and suddenly my machine went from struggling to not breaking a sweat, and the proxies were higher bitrate, but mpeg-2. Those are the only files I came across that slowed down the pipeline. Everything else, including 1080p30 20Mbps files were able to render way above realtime for me with the MainConcept OpenCL AVC encoder. Most encoders are so slow that almost nothing else matters.
NormanPCN wrote on 4/18/2013, 3:18 PM
This is the basis of my statement about Quicktime spooling.

Edit: FileIoSurrogate is a Vegas process. No idea why Vegas is using this with QT, but it does add overhead.
rmack350 wrote on 4/18/2013, 3:48 PM
I think Vegas always uses FileIOSurrogate with Quicktime files. Does it use it for something else?

Rob
Hulk wrote on 4/18/2013, 8:02 PM
NormanPCN,

Interesting information. When you play back that problematic native video outside of Vegas using a dedicated player does it choke your system? I'm wondering if this is a Vegas specific decode issue.

- Mark
FilmingPhotoGuy wrote on 4/19/2013, 2:05 AM
When I render GoPro 1280x720 footage with FX's I get 33% to 40%. When I render AVCHD I get 90%.

i7 920, 12GB 133 RAM, GeForce GTX560

TheRhino wrote on 4/19/2013, 4:49 AM
I do a lot of AVID DNxHD 220Mbps renders and it is the key reason we have all source footage on one fast (hardware) RAID and render all target footage to another fast RAID. Whenever I recommend folks use a RAID if they are serious about video editing, I quickly get shot-down by a couple regular posters who think they know everything.

On my three year-old 4ghz 6-core 980X system, when Vegas is done serving frames within the DNxHD codec, the final file is written very quickly. Since I typically render two VEGs at the same time, the HDDs are doing a lot of read/writes at the same time and the RAIDs prevent bottlenecks. (I can even copy files across the network and/or burn (2) Blu-rays while (2) VEGs are rendering in the background... Try that on a system without a hardware RAID...)

Compared to the cheap motherboards placed in ready-made systems, my three-year-old ASUS P6T6 is a workstation-class motherboard, meaning that it was designed to prevent PCIe bottlenecks. Although now dated, and only PCIe 2.0, the drivers are very mature and I have not had any serious issues in the last three years. Typically I upgrade my fastest system within three years, but the 6-core Sandy Bridges only perform 20% faster (if set up properly...)

Workstation C with $600 USD of upgrades in April, 2021
--$360 11700K @ 5.0ghz
--$200 ASRock W480 Creator (onboard 10G net, TB3, etc.)
Borrowed from my 9900K until prices drop:
--32GB of G.Skill DDR4 3200 ($100 on Black Friday...)
Reused from same Tower Case that housed the Xeon:
--Used VEGA 56 GPU ($200 on eBay before mining craze...)
--Noctua Cooler, 750W PSU, OS SSD, LSI RAID Controller, SATAs, etc.

Performs VERY close to my overclocked 9900K (below), but at stock settings with no tweaking...

Workstation D with $1,350 USD of upgrades in April, 2019
--$500 9900K @ 5.0ghz
--$140 Corsair H150i liquid cooling with 360mm radiator (3 fans)
--$200 open box Asus Z390 WS (PLX chip manages 4/5 PCIe slots)
--$160 32GB of G.Skill DDR4 3000 (added another 32GB later...)
--$350 refurbished, but like-new Radeon Vega 64 LQ (liquid cooled)

Renders Vegas11 "Red Car Test" (AMD VCE) in 13s when clocked at 4.9 ghz
(note: BOTH onboard Intel & Vega64 show utilization during QSV & VCE renders...)

Source Video1 = 4TB RAID0--(2) 2TB M.2 on motherboard in RAID0
Source Video2 = 4TB RAID0--(2) 2TB M.2 (1) via U.2 adapter & (1) on separate PCIe card
Target Video1 = 32TB RAID0--(4) 8TB SATA hot-swap drives on PCIe RAID card with backups elsewhere

10G Network using used $30 Mellanox2 Adapters & Qnap QSW-M408-2C 10G Switch
Copy of Work Files, Source & Output Video, OS Images on QNAP 653b NAS with (6) 14TB WD RED
Blackmagic Decklink PCie card for capturing from tape, etc.
(2) internal BR Burners connected via USB 3.0 to SATA adapters
Old Cooler Master CM Stacker ATX case with (13) 5.25" front drive-bays holds & cools everything.

Workstations A & B are the 2 remaining 6-core 4.0ghz Xeon 5660 or I7 980x on Asus P6T6 motherboards.

$999 Walmart Evoo 17 Laptop with I7-9750H 6-core CPU, RTX 2060, (2) M.2 bays & (1) SSD bay...

Hulk wrote on 4/19/2013, 8:35 AM
I don't know everything but I do know that with any semi-modern hard drive, disk performance has no impact on rendering or preview. Perhaps if you have an older drive and/or are working with fully uncompressed HD, and several streams at that, you might have a problem. But I doubt it.

You see, it's not 1998, when hard drives were struggling to achieve 10 MB/sec transfer rates and we had to turn off all kinds of background processes, make sure the drive wasn't being polled, etc. Any current hard drive will sustain 50 MB/sec, with many doing over 100 MB/sec. Now consider that a 40 Mbps video stream is only about 5 MB/sec, a bit more with audio, and you know why disk performance isn't really an issue anymore. Even multiple streams aren't an issue these days.
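
The arithmetic is easy to check with a short Python sketch. The helper names are made up for this example, and the model is a deliberate simplification: it divides sustained sequential throughput by stream bitrate and ignores seek overhead when several streams are read at once. The 220 Mbps figure is the DNxHD bitrate mentioned earlier in the thread.

```python
def mbps_to_mb_per_s(megabits_per_second):
    """Convert megabits per second to megabytes per second (8 bits per byte)."""
    return megabits_per_second / 8.0


def streams_supported(disk_mb_per_s, stream_mbps):
    """How many streams of a given bitrate fit in a disk's sustained rate.

    Simplified model: pure bandwidth division, no seek or filesystem overhead.
    """
    return int(disk_mb_per_s // mbps_to_mb_per_s(stream_mbps))


print(mbps_to_mb_per_s(40))         # 5.0 MB/s for a 40 Mbps stream
print(streams_supported(50, 40))    # 10 such streams at 50 MB/s
print(streams_supported(50, 220))   # 1 stream of 220 Mbps DNxHD at 50 MB/s
```

On this simple model, long-GOP camera footage barely touches a single drive, while a high-bitrate intermediate codec gets close to the limit much sooner.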

Any decent SSD will absolutely destroy any mechanical disk RAID setup in any performance metric that matters. Namely the problematic 4k read/write test.

On top of all this I used to run a benchmark site for a certain video editing app. Despite this being nearly 10 years ago, even then hard drive performance had no effect on rendering and preview. If it is a problem then either someone has their system set up incorrectly or a horribly fragmented hard drive.

NormanPCN wrote on 4/19/2013, 10:42 AM
>Interesting information. When you play back that problematic native video outside of Vegas using a dedicated player does it choke your system?

It does seem a little twitchy looking in Media Player Classic Home Cinema. Mostly smooth, I would say. Not to the point of stutter, but something just seems off. Maybe micro frame drops. This is a GoPro in bright daylight, so there is no motion blur.

Vegas can be like this. Odd looking, but often quite stuttery. So I would say the Vegas decoder is a tad slower and/or the video edit engine has a bit of overhead a player does not have.

With the smart proxies: smooth as silk. Vegas smart proxies are MPEG-2 (XDCAM EX 35Mbps HQ) at "24" fps. No resampling, no frames lost/gained, just marked as 24fps. The proxy plays at your source rate, but the point being that the effective bitrate is higher than the encoded bitrate. So my 60fps source is played back at 2.5x the proxy encoded bitrate. The proxy is about 26-27Mbps average.
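
The 2.5x figure falls out of a one-line calculation, sketched here in Python. The function name is made up for this example, and 26.5 Mbps is just the midpoint of the 26-27 Mbps average quoted above.

```python
def effective_bitrate(encoded_mbps, flagged_fps, playback_fps):
    """Bitrate actually consumed when a file flagged at one frame rate
    is played back at another, with no frames added or dropped."""
    return encoded_mbps * playback_fps / flagged_fps


# Proxy flagged as 24 fps, played back at the 60 fps source rate
print(60 / 24)                            # 2.5
print(effective_bitrate(26.5, 24, 60))    # 66.25 Mbps
```

So the proxy is being read at roughly 66 Mbps, more than double the 30 Mbps source, and still plays smoothly, which points at decode cost rather than raw bitrate.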