Bits ‘n’ Bytes Gaming Overlord Martin Watts has penned a great article about the inherent flaws in game ratings and how both readers and outlets tend to interpret them. This is a longstanding issue, one that everyone seems to hate but for which no one seems able to impose a fix. Check out Martin’s view and ask yourself: how many times have you seen a score of “seven” and considered the game mediocre? I know I’ve done it…
Email the author of this post at steerpike@tap-repeatedly.com.
I think the “70 is the new 50” thing comes from our decades in the school system. 75% is a C. It’s hardwired into our brains. 50% is well into the F range.
This doesn’t make it right, of course, but I definitely believe the traditional scholastic grade factor determines the ratings. Thankfully, we’re on a 0-5 scale and sidestep most of that, even if Metacritic hates us as a result.
Mr. Watts speaks the truth. I think long time gamers are used to all the slanted scoring systems, as sad as that is, but it’s people outside the industry that I worry about.
What Metacritic is quite useful for is to gauge how much acclaim critics in each industry bestow upon their given medium. I won’t use music as a comparison, because music is generally easy to digest and you don’t frequently see scathing reviews; a better comparison is film.
Here is a reality: most (read: 8 out of 10, let’s say) films that see widespread public release are lousy. Forgettable. Awful. Embarrassing. Plain bad. That’s easy to see without Metacritic; but that a lot of film scores on Metacritic actually do reflect this reality is telling of the general integrity of critics in that industry.
Let’s take a random example: Saw III, a film I haven’t seen, but one that I think is generally regarded as “mediocre at best” has an aggregate score of 48. It’s in a nice bright shade of yellow. Now I feel like comparing that to my personal favourite film, Dr. Strangelove. Wow, it has a 96! That’s truly excellent, and I feel that most others would agree with me that it deserves the near 50-point advantage it claims over Saw III (and for a moment, let’s just appreciate the ridiculousness of the 100-point scale, okay?)
Now, I’d like to take another two examples, but this time from the video game industry. Two games I’m quite familiar with: I’m going to make this easy on myself and look up Portal. One of my favourite games ever, one of the most agreed-upon tours de force video games have ever seen. Wow, it has a 90! That’s a pretty stellar score, it doesn’t get a lot better. Now to compare it to a game that I felt was a bit overrated by most, and one that I gave an average, or “middling” score here on Tap, I think deservedly so, Dead Space. It was nothing special, quite clumsy, but peppered with some clever spots. And it has an aggregate score of… 89? Wait.. huh. What the fuck?
Sorry for my over-exaggerated and over-acted point-making, but there lies quite openly for any to see a gaping flaw in the integrity of our industry. Once again, for a moment forgetting the absurdity of 100-point review scales, who in their goddamn right mind would put a game like Dead Space within ONE POINT of a fucking masterpiece like Portal?!?
“I rest my case.”
“You rest your case?”
“Oh, I’m sorry, I thought that was just a figure of speech. Case closed!”
It’s also proof that as a medium grows more complex it becomes increasingly difficult to score things mathematically. It’s like an earlier discussion in Lewis’s Vocal Coaching article, wherein people were debating whether subjective things can be honestly “scored” at all.
Dead Space 89? It’s a fair enough game, but it’s nothing special. I wouldn’t call it broken or unpleasant to play, but it’s certainly forgettable.
Portal? A masterpiece that will be remembered for decades. A timeless jewel. From a purely mathematical perspective I see the “90” – it’s kinda short, for example; certainly shorter than Dead Space. Every now and then the level design strayed from Perfect to the realm of merely Outstanding, which is a downgrade. But as an object it is so glittering that a 1-100 scale can’t really contain it.
This is why I really like the fact that Tap looks back at old games. Citizen Kane bombed during its theatrical release. Only later did people recognize its importance. And some that were seen as important at the time – like Griffith’s Birth of a Nation – are used now in school as cautionary tales of how dangerous an artistic medium can be when it comes to influencing the ignorant.
The ratings/scoring system has been like this, and broken, for a while now. I remember back in the day when I would flip through my copy of “PC Gamer” and pretty much ignore any game that was rated below an “8”. Anything below a “9” would have to be carefully scrutinized and cross-checked to see just how bad the flaws in the game really were.
Like most things, the internet has both helped perpetuate and alleviate this issue. There are more “scores” and “ratings” given to games today than ever before, and most still follow the same 7/70 = horrible and a bunch of 9/90+ ratings. There are also a ton of smaller, independent review sites, like this one, that are more than ready to give a game a fair, honest review and trash a game if need be. I think it’s a good balance.
I remember when “Dragon Age 2” came out last month. Almost all of the big/more popular outlets were giving the game rave reviews, while a lot of the smaller sites were trashing it. I was talking about it to Steerpike and he wisely said, “It’s probably not as good as the big sites say it is and not as bad as the smaller ones. Probably somewhere in between.” After playing it for about 24 hours or so, I agree. It’s not the best thing ever, but it is by no means a mess like some portrayed it as.
At this point, I am not sure how much ratings/reviews really influence my purchases any more. I pretty much know what I like and what I want to buy and what I don’t. There are certainly games on the margins that will be review dependent, but even in those cases, I will usually wait for someone I know well and who knows me well to play it first and give me their assessment before I trust the judgment of some reviewer I have never met.
Great point xtal. Me and the Miss pay close attention to what film critics say because generally speaking they’re usually spot on when compared to fap-happy game critics. There’s always the exception — I’m looking at you Hurt Locker — but we can check a score on Rotten Toms and see which films are doing the rounds and have a good idea of whether it will really be any good.
If you were to present Dead Space and Portal alongside their Metacritic scores to somebody who has no fucking idea what the difference is between the two, I would presume that they’d be hard pressed to choose one over the other. Using your example of Dr. Strangelove and Saw III, it’d be much more clear cut.
I think one problem is that with the rise of the internet in the medium’s infancy anybody can be a critic. You literally have no idea who the person is reviewing a game, their taste, their history, their sensitivity to certain issues and yet in some cases, scratch that, in many cases (as highlighted by Martin Watts’ article) their voices have the potential to ruin people’s lives or elevate garbage.
And that’s where building a trustworthy community comes in! Get to know a site’s personalities, come to know the likes and dislikes of the individuals.
For example, here at Tap, when you glance at a review by Dobry you think: “hate filled,” by Lewis: “a game with no end,” by Steerpike: “wordy and pretentious,” by Gregg: “her slippers my ass.”
I think a large part of the problem with video game scoring lies with us. Maybe not us on Tap, because we’re generally just better people *cough*, but us as gamers. What we expect and/or what we tolerate from a game has warped what we see and read into a review score.
It’s got to the point now where I find myself becoming suspicious of an 8/10 score.. especially when awarded to high profile games. 8/10 has become the new “safe” score to give a game you thoroughly enjoyed and would highly recommend, but which has a few issues. Give it a 7 and you open up a can of worms. “Worse than random other game which got an 8!?!”. “This reads like an 8!?!”. “How can you not give Halo at least 9/10!?”. You get the picture.
I also think hype is a factor. I remember reading reviews of Mafia II when that was released, and some sites gave it some pretty middling scores. Once that happened, I remember reading a ton of comments from gamers slating the reviewers because they wanted Mafia II to be good. Because the trailers looked good and it had a pretty big marketing budget that told people they were getting something of a higher quality. As a result of that, people who hadn’t even played the game were calling out the reviewer – who HAD played it – for being wrong. Some of it was just dog’s abuse, in fact. Is it easier to tell the truth in this situation, or just slap an 8/10 on the end of a review and say “yeah, it’s alright”? And if you do that, what happens to the other 8/10 games that are genuinely good games?
I’m waffling. I’d also quite happily remove score systems altogether.
I’m just going to pretend that you actually commented on the topic you meant to, Armand.
I still stand by my comparison. I realize that Dead Space is generally understood to hold more merit than any Saw film; however, I think my point remains clear: in film criticism you can typically find a vast divide between the magnificent and the “meh.”
In video game scoring you’ll find almost all big name, known releases in the range of 85 to 97. The former being used to describe just about anything that is “pretty good”; the latter reserved for titles that a wealth of critics consider revolutionary (BioShock, GTA IV, Half-Life 2, Super Mario Galaxy, etc.) [note: ^not my opinion^]
There’s no integrity in that.
If review scores worked properly, this is how recent titles would aggregate: [note: indeed my opinion]
Dead Space – 65
Grand Theft Auto IV – 70
Crysis – 75
Fallout 3 – 80
Call of Duty 4 – 85
Braid – 90
BioShock – 95
Portal – 99.999
Instead, what we actually have is ALL of those games scored between 89 and 94 (exception: GTA IV at 98).
Try really hard, just from memory: When was the last time some massively hyped franchise sequel didn’t end up with a 93 or 94 on Metacritic? 93 and 94 is basically the default “safe” score for games that are expected to be awesome (read: they are sequels to popular games…)
How is anyone supposed to differentiate and actually find value in these arbitrary numbers? There are two ways: people reviewing games either a) scrap number systems altogether or b) work out a point system that extends beyond the confines of 8.0 – 9.5.
I think people have found a way to differentiate – they think anything below an 85 is bad, anything below 75 is garbage. Which leaves a lot of numbers unused. I think it’s what Dobry said – we have the school grading system in mind. And while people say a “C” is “average,” everyone knows it’s a little under par.
But games aren’t 100 questions, nor are they two hours like movies. Most games can’t be utterly ruined by a stupid plot twist at the end, or even utterly ruined by bad acting. They’d lose points for that, sure, but a movie might not survive the critical savaging.
A concise score is helpful because sometimes you just want a quick bottom line. Maybe we should add a Pros and Cons section to our reviews as well.
xtal: I’m just going to pretend that you actually commented on the topic you meant to, Armand.
Yes, I'm a fool.
I hear what you’re saying on a lot of this, and agree with most of it. But I think the reason games like GTA4 and Dead Space get higher marks is that a lot of people really think of these as stellar grade A games. Don’t get me wrong, I hated GTA4, but from the reviews I’ve read of it on the major gaming sites, and the way it’s discussed in gaming conversations, you’d think it were The Godfather of videogames or something.
Also, Fallout 3 should get AT LEAST a 90, maybe more like a 95. I just checked and it has a 91 on Metacritic. I’m good with that. (What can I say, I love that game!)
I like seeing a hard number too, but I’ve learned whose to trust. Years ago I read a lot of PC Gamer. Their reviewer quite enjoyed Beyond Good & Evil and awarded it a 73% score. Looking back you might call that underrated, but I’d say it’s a fair score.
IGN, on the other hand, I take all their 9.0 scores with a grain of salt.
A 73% for Beyond Good and Evil fair? Man, you are a much harsher judge on this stuff than myself.
I’m in agreement with Ajax. Scoring games is pretty much moot.
I always liked Siskel’s and Ebert’s thumbs up/thumbs down system for movie reviews. It was honest for one thing and depended on the critic’s argument to back up the rating. It also gave more weight to emotional reactions. You could hate on a movie while admitting its good points.
The 100 point rating system is especially onerous and misleading. This game is an 87. This game is a 96. What the hell does that mean? It’s based on the belief that the reviewer has access to some kind of measuring instrument that gives us a read out of game quality.
In the end it’s Pass or Fail and the devil is in the details.
(Also, wow, you guys really like Portal.)
I find that a lot of game rating scales (Gamespot's especially) make more sense the way they are used in practice if you pretend they are logarithmic. As in, according to Gamespot, Portal (9.0) is ten times as good a game as Dragon Age II (8.0).
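To make the logarithmic reading concrete, here is a minimal sketch. It assumes that each full point on a 10-point scale multiplies "quality" by ten, which is just the half-joking premise above and not anything Gamespot has stated. The function name `relative_quality` and the `base` parameter are made up for illustration.

```python
def relative_quality(score_a: float, score_b: float, base: float = 10.0) -> float:
    """Return how many times 'better' score_a is than score_b,
    assuming (hypothetically) that each full point multiplies quality by `base`."""
    return base ** (score_a - score_b)


# Portal (9.0) versus Dragon Age II (8.0) under this reading
print(relative_quality(9.0, 8.0))  # 10.0
```

Read this way, the gap between an 8 and a 9 is enormous, which is roughly how the top of the scale gets used in practice.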
For the record, I love BG&E. If it’s not in my Top 5 list then it’s on the doorstep. I’m just trying to speak objectively when I say I think that’s a fair score, don’t get me wrong.
And to Scout's point, it's useful to get some sort of "final verdict" or whatnot. I like a simple 4-point scale: 1 is bad, 2 is average, 3 is good, 4 is great and beyond. I also like our 5-point scale (technically 6 I guess, with zeros) here which is accompanied by fun little images. No offense to the new images, but I do miss that smiling gold star fellow.
I guess I’ll disagree with the usefulness of rating reviews beyond up or down and leave it at that. The hardest part of writing a review for me is, after all that writing, canceling it all out by applying a number. People want what they want though.
Hey, you alliance folk: this BnB article got a nod in The Sunday Papers today! That’s always nice, isn’t it?
To be honest, when reading about a game in a review, I don’t really look at the numbers at all anymore… I usually just look at pros and cons (after reading an article, if I do), and I’ll take it into consideration.
Then I’ll watch trailers, play demos, watch gameplay etc to decide if I’ll get a game. Oh, and of course discuss games with people who’ve played them (like the good people here).
And THEN I’ll see what my budget is, and maybe consider getting the game I’ve been looking into.
Word of mouth is pretty important, but it has to come from a source you can trust, and I’d trust any of you guys over a magazine review of someone I haven’t met. There’s no way an arbitrary score can take into consideration my personal preferences in a game etc. A car racing game with a unanimous 99.99% will never convince me to buy it.
Car racing games are at about a 40-50% for me (for the lovely looking ones), and subjectively I can’t see them getting any higher because I just don’t dig them that much. Throw in some weapons, some sci-fi-ness, some RPG-lite, and now we’re getting up into the 80-90% mark for me! (And they can even keep going around a little track).
“Throw in some weapons, some sci-fi-ness, some RPG-lite, and now we’re getting up into the 80-90% mark for me!”
Aside from the “And THEN I’ll see what my budget is”, you’re not Armand in disguise by any chance?
I can neither confirm nor deny my Armand-ness…