Friday, April 02, 2010

Fourplay: Q9650 Quad-Core Benchmarks

It's been almost exactly four years since I first upgraded to a dual-core processor. Prophetically, I concluded the article with the statement, "Make no mistake, dual-core CPUs are here to stay". In retrospect, perhaps I should amend that to read, "multi-core CPUs are here to stay" because it's clear that with future applications, there is no room for just a single-core or even dual-core processor. In fact, it's becoming increasingly difficult for a mono-core processor (even a relatively fast one like a 3.2GHz Pentium 4) to successfully handle several tasks at once such as running a virus scan, checking email and editing a Word document.

For the past couple of years, I've observed the maxim that it's better to run a high-clocked dual-core than a slow-clocked quad-core, as few applications and games utilized more than two cores. However, with the release of Battlefield: Bad Company 2 and DiRT 2, I can no longer ignore the need for a quad-core processor. Granted, they're both demanding games, so I initially dismissed the occasional choppiness as a limitation of the video card. But I was still puzzled as to why my brand new Radeon 5850, one of the fastest video cards available, would be struggling. Finally, I spotted the problem when I checked the task manager and saw that both games were completely maxing out my dual-core E8500.

It was September 2006 when Intel unveiled the first quad-core processor at their annual developer conference. The Core 2 Extreme QX6700 was built on the Kentsfield core, featured 8MB L2 Cache, a 65nm fab, and a 1066MHz FSB. It was clocked at 2.66GHz and cost $1,000. Originally, it was used to showcase the game Alan Wake and its multi-threaded ability to utilize four processor cores and run entirely on DirectX 10. However, this has become a sore spot among computer enthusiasts as it was recently announced that when Alan Wake arrives this summer, it will be an Xbox 360 exclusive. The irony is palpable as the Xbox features neither a quad-core processor nor DirectX 10.

Despite that letdown, Alan Wake did provide us a glimpse into how a brand-new multi-threaded PC game such as Battlefield: Bad Company 2 might use a quad-core processor. Alan Wake spawns five independent threads: Audio, Physics, Rendering, Streaming, and Terrain Tessellation. Obviously, the Audio thread is responsible for all sound in the game, and it is also said to be one of the least CPU-dependent. Conversely, the Physics thread can be the most demanding, consuming up to 80% of one core by itself, particularly if there is no hardware support such as PhysX. The Rendering thread organizes the data to be sent to the GPU for display, while the Streaming thread loads the game off the hard drive. Dedicating one thread to streaming helps the game transition seamlessly from one area to the next. Finally, the Terrain Tessellation thread is tasked with procedurally generating the environment as it unfolds, which helps minimize objects popping into view. A key benefit of DirectX 11 is hardware tessellation, but to what degree it is implemented in the DX11 portion of Battlefield is currently unknown. Regardless, with new and future multi-threaded games, it's very easy to see why even dual-core processors are no longer capable of properly supporting them.
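For readers curious what "one thread per subsystem" looks like in code, here is a minimal Python sketch. The five thread names come from the description above; the function names and the dummy workload are hypothetical placeholders, and none of this reflects how the actual game engine is written.

```python
import threading

# The five subsystem names come from the article; everything else is illustrative.
SUBSYSTEMS = ["Audio", "Physics", "Rendering", "Streaming", "Terrain Tessellation"]


def run_subsystem(name, results, lock):
    """Stand-in for a subsystem's work loop (hypothetical placeholder)."""
    work_done = sum(i * i for i in range(10_000))  # dummy CPU work
    with lock:  # protect the shared dictionary while recording the result
        results[name] = work_done


def spawn_all():
    """Start one thread per subsystem and wait for all of them to finish."""
    results = {}
    lock = threading.Lock()
    threads = [
        threading.Thread(target=run_subsystem, args=(name, results, lock), name=name)
        for name in SUBSYSTEMS
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    finished = spawn_all()
    print(f"{len(finished)} subsystem threads completed: {sorted(finished)}")
```

One caveat: CPython's global interpreter lock keeps these threads from executing Python code in parallel, so the sketch only shows the structure. A native game engine would let the operating system schedule each thread on its own core, which is where the extra cores pay off.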

After much deliberation, I decided on a 3GHz Core 2 Quad Q9650 to replace my 3.16GHz Core 2 Duo E8500. It was the path of least resistance and allowed me to keep my current setup and simply swap out the CPU. Frankly, given Intel's inflated prices on its last-generation Core 2 lineup, I could have purchased a new Socket 1156 motherboard and Core i5 processor for the same $330 that the Q9650 cost. But that's just Intel's way of trying to force customers to buy their newest chips. Still, it's quite a premium considering that the 2.83GHz Q9550 is $100 less and the 2.66GHz Q9450 is $200 cheaper. Essentially, I'm paying two benjamins for a paltry 340MHz. However, as I learned later with the benchmarks, megahertz still plays a very important role no matter how many cores are available.

HEATSTROKE

The Q9650 benefits from several architectural improvements over Intel's original quad-core QX6700. For starters, the Q9650's core is code-named Yorkfield and possesses 50% more L2 Cache (for a total of 12MB), a smaller, more efficient 45nm fab and a faster 1333MHz FSB. In essence, it's two 3.0GHz E8400s joined together. However, the stock fan and heatsink are the same ones that shipped with my two-year-old E8500 and are remarkably cheap considering the $330 price tag. In retrospect, I'd have been better off removing the motherboard and spending $25 on an aftermarket cooler rather than struggling with Intel's bargain-bin unit. Locking the heatsink in place so that it maintained direct contact was quite a feat, and several times the entire PC shut down from overheating. At one point, I thought I might actually snap the motherboard in half trying to secure the fan. When I did manage to keep the PC running long enough to check the BIOS, the CPU was hotter than a George Foreman grill. The temperature was a sizzling 200 degrees Fahrenheit, nearly double the safe operating limit. Eventually, with the motherboard removed (something I'd tried to avoid) and the fan fully secured, temps were closer to room temperature. And thankfully, like that New Year's Eve 1999 party, there was no permanent damage. But out of curiosity, I checked one of our new Dell workstations outfitted with a 2.4GHz Q6600 quad-core processor and noticed that Dell also eschews the ill-functioning Intel cooler. Dell's custom unit consists of a towering aluminum heatsink ventilated by a massive 120mm fan. It's both ultra cool and quiet-- resting temps were a chilly 30 degrees Celsius and even under full load it only spiked a little past 40 degrees. In comparison, my sweltering system operates at nearly double those values. Finally, it's worth noting that with the new Core i7 980X, Intel has adopted an entirely different cooling design that attaches via a back plate and four screws. My question is what took them so long?

DISCLAIMER: It's worth noting that these tests are more of an apples-to-oranges comparison, because the E8500 is heavily overclocked to 3.66GHz, whereas I chose to run the Q9650 at its factory setting of 3.0GHz to get an accurate baseline. Had I chosen to return the E8500 to its original 3.16GHz speed, the difference would have been much more pronounced. However, because I run the E8500 continuously at 3.66GHz, I figured it would most accurately represent the true increase on a daily basis. Unfortunately, synthetic tests such as the ones below don't seem to accurately reflect the improvement I'm seeing in actual games. For this reason, I declined to use the results I'd gathered from SiSoft Sandra 2010 as they likewise seemed well off the mark.

3DMARK VANTAGE

Futuremark's DX10 extravaganza is growing long in the tooth, and a DX11 replacement is due soon, but it's still a very demanding benchmark. Typically, 3DMark is used to test the video card, but in this case I was more interested in the CPU tests. Sure enough, the GPU score was virtually identical, but the CPU tests reflected nearly a 70% improvement, with the score leaping from the E8500's 7,250 to the Q9650's 12,260. Additionally, CPU Test 1 bounced from 671 with the dual-core to 1,070 with the quad-core, and CPU-dependent physics tasks such as the Futuremark flags whipping in the wind were noticeably more realistic. Although Futuremark doesn't explicitly advertise Vantage as supporting four cores, it's clear that it does.
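For anyone who wants to check the percentages, a quick Python sketch of the arithmetic follows. The scores are the ones quoted above; the function and variable names are my own.

```python
def percent_gain(before, after):
    """Return the percentage increase going from `before` to `after`."""
    return (after - before) / before * 100.0


# Scores quoted in the article (E8500 at 3.66GHz vs. Q9650 at 3.0GHz).
scores = {
    "CPU score": (7250, 12260),
    "CPU Test 1": (671, 1070),
}

if __name__ == "__main__":
    for test, (dual, quad) in scores.items():
        print(f"{test}: {percent_gain(dual, quad):.1f}% higher on the quad-core")
```

That works out to roughly 69% for the overall CPU score and about 59% for CPU Test 1, despite the quad-core running at a lower clock speed.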

BATTLEFIELD: BAD COMPANY 2

I've joked that I bought a $330 processor to play a $50 game, but it's really not that much of a stretch. Although BC2 doesn't have a proper benchmark, the tangible difference between my overclocked E8500 and the Q9650 was extremely pronounced here. Prior to this installment, I'd never been a fan of the Battlefield series, having only briefly tried the Vietnam chapter in 2004. I was hooked on Call of Duty: Modern Warfare 2, and was skeptical that BC2 could challenge it, but it certainly made a believer out of me from the first few moments. Unfortunately, the intermittent frame rate stuttering was distracting and kept me from enjoying the game to its fullest. So when I forced the DX9 render path (instead of DX11) and saw no improvement, I knew the bottleneck was not the video card. A little research turned up my dual-core processor as the culprit, and even after upgrading to a quad-core, CPU utilization spread over all four cores was still heavy at 70%. Given the processor load, it seems like the Xbox 360's 3.2GHz PowerPC Tri-Core Xenon will be on the ragged edge of running this game.

CINEBENCH RELEASE 10

Ever since I first used Cinebench a couple of years ago, I've been eager to try it with a quad-core processor as it can actually support up to 16 cores. For this test, I also threw the Dell quad-core workstation into the mix as I was curious how it would stack up against my homebuilt quad-core. However, the multi-threaded Cinebench made it clear in no uncertain terms that clock speed is still king. The 3.66GHz E8500 rendered the high-resolution image on one core in just 3 minutes and 30 seconds, the 3GHz Q9650 took nearly a full minute longer at 4 minutes and 25 seconds, and the 2.4GHz Q6600 labored behind at a lengthy six minutes. The roles were somewhat reversed on the Multiple CPU rendering as the Q9650's 3GHz and two extra cores helped it win the fastest time in a scant 1 minute and 14 seconds. Meanwhile, the megahertz-muscle of the dual-core E8500 posted a valiant effort of 1 minute and 56 seconds, but the four slower cores of the Q6600 barely edged it out with a time of 1 minute and 42 seconds. Unfortunately, it wasn't until after I'd benchmarked all three processors (and sold the E8500) that I discovered there was a newer version, Cinebench Release 11.5, available.
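To put those render times in perspective, here is a small Python sketch that works out how well each chip scaled from one core to all of its cores. The times are the ones quoted above, and I'm treating the Q6600's "six minutes" as exactly 6:00, which is an assumption. The function names are my own.

```python
def to_seconds(minutes, seconds):
    """Convert a minutes/seconds render time into total seconds."""
    return minutes * 60 + seconds


def scaling(single_s, multi_s, cores):
    """Return (speedup, per-core efficiency) for a multi-core render."""
    speedup = single_s / multi_s
    return speedup, speedup / cores


# (single-core time, multi-core time, core count), times from the article.
# The Q6600 single-core time is assumed to be exactly six minutes.
runs = {
    "E8500 @ 3.66GHz": (to_seconds(3, 30), to_seconds(1, 56), 2),
    "Q9650 @ 3.0GHz": (to_seconds(4, 25), to_seconds(1, 14), 4),
    "Q6600 @ 2.4GHz": (to_seconds(6, 0), to_seconds(1, 42), 4),
}

if __name__ == "__main__":
    for cpu, (single_s, multi_s, cores) in runs.items():
        speedup, efficiency = scaling(single_s, multi_s, cores)
        print(f"{cpu}: {speedup:.2f}x speedup on {cores} cores "
              f"({efficiency:.0%} per-core efficiency)")
```

By this reckoning all three processors land near 90% per-core efficiency, which suggests Cinebench scales about equally well on each of them and that the remaining gaps come down to clock speed and core count.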

CONVERT X to DVD 4

Aside from the obvious benefit to gaming, I was equally excited to try the Q9650 for video encoding. I typically download several 700MB and 1.5GB AVI files per week and it takes anywhere from 12-24 minutes to encode one. The latest version of Convert X to DVD has an option whereby you can set the number of cores for the program to use. I converted a 1.5GB copy of Avatar with my dual-core E8500 overclocked to 3.66GHz, which took exactly 24 minutes. With its two additional cores, the Q9650 chopped the time in half to 12 minutes. Interestingly, Convert X doesn't seem to care about processor speed, as the time was exactly halved going from a faster dual-core to a slower quad-core.

DiRT 2

Without a doubt, DiRT 2 benefited the most from the quad-core upgrade. But as with the other programs, the DirectX 11 in-game benchmark failed to accurately confirm the significant improvement. It showed a minor increase in the minimum frame rate from 50 to 55 FPS and the maximum frame rate from 60 to 64 FPS. Yet those statistics fail to convey the sense of speed that had been missing from the game. I feel intimately acquainted with DiRT 2 because I played many hours of it in DX9 before I received my Radeon 5850. Following that, I started the entire game over because I wanted to experience it in its full DX11 splendor. Now, despite nearly finishing it again, I have once more restarted it because the difference with four cores is so profound. In fact, the frame rate is so fast now that the game feels like it's perpetually stuck on fast-forward as the action unfurls like a projector reel that has jumped its sprockets. With the drop-dead gorgeous visuals and ultra-high definition environments, DiRT 2 is like watching car porn on Blu-ray.

INTEL ICE STORM FIGHTERS

Surprisingly, not only is this the oldest benchmark in the group, but it's also the only demo that visibly displays the load on all four cores. That's because it was commissioned by Intel as a selling tool to promote their quad-core processors. Designed by Futuremark, it's not even DX10, but the blizzard of activity (think of the climactic snow battle on Hoth in The Empire Strikes Back) dragged my E8500's frame rate into the 30s. However, the Q9650 easily juggled the multi-core onslaught while maintaining a solid 50-100 FPS depending on the action.

WINDOWS EXPERIENCE INDEX

This software debuted in 2007, a not-so-subtle application designed to inform dim-witted customers why their computer was running so slowly with Vista. The arbitrary tests attempt to accurately rate your system on a scale of zero to 7.9 in Windows 7, but as I mentioned before, Windows seems to discriminate against anything less than four cores. My overclocked 3.66GHz E8500 registered a 6.9 on the scale, while the 3GHz Q9650's processor calculations per second netted it a 7.3 score. Naturally, I take these "assessments" with a grain of salt, but they're still interesting nonetheless.

OVERCLOCKING

Obviously, I couldn't resist the urge for long before I tinkered with the front side bus. Overclocking from 333MHz to 370MHz yielded a new speed of 3.33GHz that made it faster than any Intel consumer quad-core you can buy-- the pricey QX9770 and Core i7-960 both top out at 3.2GHz. Considering each of them sells for in excess of $500, it makes my overclocked 3.33GHz Q9650 suddenly seem like a bargain. Additionally, the 11-percent FSB hike didn't noticeably affect the CPU temperature, which was a blessing in itself considering what I had gone through with the overheating. As for the benchmarks, every application save Convert X benefited from the extra megahertz. Echoing what I observed earlier, Convert X required the same time to encode Avatar at 3GHz as it did at 3.33GHz, indicating that it is limited more by physical cores than processor speed. 3DMark's CPU score was boosted from 12,260 to 12,950 and CPU Test 1 went from 1,070 to 1,770. An additional 25 seconds was sliced from the single-core Cinebench test but the four-core test was just 8 seconds quicker. And finally, the Windows Experience Index reassessed my CPU with a 7.4 score.
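The overclocking math is simple enough to sketch in Python: core clock equals the front side bus speed times the CPU multiplier. The multiplier of 9 is not stated above; I'm inferring it from the stock 3.0GHz clock divided by the stock 333MHz FSB. The function names are my own.

```python
STOCK_FSB_MHZ = 333
OVERCLOCKED_FSB_MHZ = 370
STOCK_CORE_MHZ = 3000

# Inferred, not stated in the article: 3000 / 333 rounds to a 9x multiplier.
MULTIPLIER = round(STOCK_CORE_MHZ / STOCK_FSB_MHZ)


def core_clock_mhz(fsb_mhz, multiplier):
    """Core clock is the front side bus speed times the CPU multiplier."""
    return fsb_mhz * multiplier


def percent_increase(old, new):
    """Return the percentage increase going from `old` to `new`."""
    return (new - old) / old * 100.0


if __name__ == "__main__":
    new_clock = core_clock_mhz(OVERCLOCKED_FSB_MHZ, MULTIPLIER)
    fsb_gain = percent_increase(STOCK_FSB_MHZ, OVERCLOCKED_FSB_MHZ)
    print(f"Multiplier: {MULTIPLIER}x")
    print(f"Overclocked core clock: {new_clock / 1000:.2f}GHz")
    print(f"FSB increase: {fsb_gain:.1f}%")
```

With the multiplier held constant, the core clock rises in lockstep with the FSB, which is why an 11-percent bump on the bus turns 3.0GHz into 3.33GHz.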

CONCLUSION

I learned from this exercise that whether it's one or one hundred benchmarks, sometimes there's no substitute for simply playing the game to get a "real-world" feeling for the improvement. Granted, it's easy for cynics to dismiss such a practice because it can be unduly influenced by a host of outside variables-- namely enthusiasm and excitement, which can unrealistically inflate the perceived performance of the product. But when you've scrutinized certain programs dozens of times over a series of months or years as I have with the preceding games and benchmarks, you tend to develop a trained eye for the subject matter. Prior to my purchase, I'd read a lot of forum reviews where owners spoke enthusiastically of never being able to go back to a dual-core after using a quad-core, and I can certainly agree with that. However, Windows 7 doesn't seem to boot any faster, nor do programs appear to load quicker. If anything, single-threaded software feels more responsive with the overclocked dual-core and the benchmarks bear that out. And even in multi-threaded applications like Cinebench, an overclocked dual-core is still nearly as fast as a slower quad-core. So for now, I'd only recommend a quad-core if you run enough multi-threaded applications to warrant it and can afford a fast one. But don't say I didn't warn you, because multi-core CPUs are here to stay.







