An Algorithm for Discovering "Hidden Gems"
So you've probably seen that Ars Technica article that has magically reverse-engineered every Steam game's gross (sales+free-key-redemption) numbers.
Everyone's talking about this graph:
The obvious first statement is "WOWS! Video games are a hit-driven business!!!" For what it's worth, our game Defender's Quest is probably located somewhere between 800-1200 on that graph, and we're doing just fine.
But what about the game at 2600? And what's going to happen when Steam totally opens the floodgates?
Every developer I know has a list of obscure indie games that they know are great, but that just can't get the exposure they deserve. Jay Barnson talks about this phenomenon in his recent article The Best-Kept Secrets. (Incidentally, if you like CRPGs and haven't bought Frayed Knights yet, do yourself a favor and grab it.)
Here's Jay (emphases mine):
A friend told me about an experiment in inequality and the “cultural” marketplace. Test audiences were exposed to new music, and invited to download and keep some songs. The songs had indicators of how much they’d already been downloaded. Some of the songs had an artificially inflated number. What they discovered is that the artificially inflated number helped the popularity of some songs within the controlled audience, but could not others.
In other words – some songs had hit potential and others did not. If a song had both hit potential and an apparent wave of popularity behind it, it could take off. A song that sucked rarely did well. Hit potential could not guarantee success, but its lack could guarantee the lack of success – or at least a large success. In other words, the distorted perception of popularity had some effect, but was not the sole contributor to the success or failure of the song.
Right now stores like Steam and the App Store are dominated by "top selling/grossing/downloaded" charts, which all suffer from the same problem -- the charts are self-reinforcing and prone to freezing, thus encouraging shady behavior. You can mitigate this effect in various ways (such as sampling over a shorter time, as Steam seems to do), but the problem remains -- the games on top stay on top.
My Twitter friends are constantly trying to personally give visibility to such "hidden gems" and I think that's admirable, and we should keep doing stuff like that. At the same time, I think we can supplement those efforts with some mathematical tools.
Here's the idea -- if the above experiment's results are applicable to video games, there might just be a way to build a "hidden gem" detector, given access to the right data. What we're looking for is some quantifiable way to find a game that people really seem to like, but that hasn't yet had a chance to be popular. The kind of game my Twitter friends are talking about when they say, "RT! Man, <game X> is wonderful, such a shame no one has heard of it."
Building a "Hidden Gem" detector
What we're looking for is games with high engagement stats but low overall popularity. The data Ars Technica has been working with is truly awesome, but for these purposes it's a bit thin -- all we've got is gross units owned by players (lumping together actual sales with Steam key redemptions), and some playtime stats which have some known reliability issues.
But Steam has access to better data than that. Steam knows exactly how many copies of each game have been sold as well as the number of concurrent players, all down to the hour, region, and platform. It also has overall playtime stats per player, though that might be less reliable. There are also conversion rates for games that have demos, achievement data, and more. And of course, there are also the recently-added user reviews.
In short, Steam itself has plenty of data, both private and public, to make a pretty good guess at overall player engagement for a particular title. For "traditional" games, a higher-than-average playtime is a good heuristic for engagement, and for short-form games like Dear Esther, some metric for "completed the experience," such as a specially marked achievement, could work. Or we could just follow Kongregate's example and simply go by user rating, as I suggested in a previous article. (I personally prefer something closer to the Tomatometer over anything like MetaCritic scores).
The key is you have to be careful not only with what you measure, but how you measure it. Give these two articles a quick read before proceeding. It'll just take a minute:
How not to sort by average rating
reddit's new comment sorting system
Back already? Great.
As those articles detailed, the ranking system should measure something other than raw popularity. Otherwise, the first item to get close to the top gets seen more, and thus collects more (sales/upvotes/downloads) and the cycle continues. The ranking system should also automatically take into account uncertainty based on sample size.
If you're ranking based on an engagement stat like play-time, or by user rating, then getting to the top does not guarantee that a game will stay there. In fact, the influx of new eyeballs will more than likely kick the game down a few notches as it gets exposed to a wider audience. This way, only games with high engagement --regardless of popularity-- will be able to stay at the top of the charts for a long time, which is how it should be. The chart is self-correcting rather than self-reinforcing.
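To make the sample-size point concrete, here's a minimal Python sketch of the lower bound of the Wilson score interval, the approach recommended in "How not to sort by average rating." The function name, the inputs, and the example numbers are my own assumptions for illustration, not anything Steam actually exposes.

```python
import math


def wilson_lower_bound(positive: int, total: int, z: float = 1.96) -> float:
    """Lower bound of the Wilson score interval for a proportion.

    positive: number of positive ratings (hypothetical input)
    total:    total number of ratings
    z:        normal quantile; 1.96 corresponds to roughly 95% confidence
    """
    if total == 0:
        return 0.0  # no data yet, so assume nothing
    p_hat = positive / total
    z2 = z * z
    centre = p_hat + z2 / (2 * total)
    margin = z * math.sqrt((p_hat * (1 - p_hat) + z2 / (4 * total)) / total)
    return (centre - margin) / (1 + z2 / total)


# Made-up numbers: a tiny game with 45 of 50 positive ratings versus
# a big seller with 6,000 of 8,000.
small = wilson_lower_bound(45, 50)
big = wilson_lower_bound(6000, 8000)
print(f"small niche game: {small:.3f}, big seller: {big:.3f}")
```

With these made-up numbers the small game ranks above the big seller, while a game with a perfect score from only three ratings would rank below both, because three ratings just aren't much evidence.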
Quick example: imagine a single-player RPG with median playtime of 5-10 hours -- well above the average for Steam games, but with fewer than 2,000 sales. This should be easy to detect. Or one of those crazy avant-garde "experience" games, the kind you only play once but it's amazing. Say that game has 2 hours median playtime, but 90% of its 3,000 or so players have gone all the way through. This should stand out, too.
We don't have to get super-detail-oriented with this. In a world where 37% of players' Steam games haven't even been loaded a single time, it should be easy to make some simple broad-strokes guesses and pick out games that make you say: "wowzers! A small niche of folks really love this thing."
Let's make a "hidden gems" list for those.
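Here's roughly what a broad-strokes first pass could look like in Python. Everything in it is an assumption on my part: the `Game` fields, the thresholds, and the catalog entries are invented to mirror the two examples above, and a real version would need to tune them against actual data.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Game:
    name: str
    owners: int                               # copies owned (hypothetical field)
    median_playtime_hours: float
    completion_rate: Optional[float] = None   # None if there is no "finished it" marker


def is_hidden_gem(
    game: Game,
    max_owners: int = 5000,            # assumed cutoff for "low popularity"
    min_playtime_hours: float = 5.0,   # assumed cutoff for "plays it a lot"
    min_completion_rate: float = 0.8,  # assumed cutoff for "most players finish"
) -> bool:
    """True if the game is obscure but shows at least one strong engagement signal."""
    if game.owners > max_owners:
        return False
    long_play = game.median_playtime_hours >= min_playtime_hours
    finished = (
        game.completion_rate is not None
        and game.completion_rate >= min_completion_rate
    )
    return long_play or finished


def find_hidden_gems(catalog: List[Game]) -> List[str]:
    return [game.name for game in catalog if is_hidden_gem(game)]


# Invented catalog, loosely following the examples in the text.
catalog = [
    Game("Obscure RPG", owners=1800, median_playtime_hours=7.5),
    Game("Avant-garde Experience", owners=3000, median_playtime_hours=2.0,
         completion_rate=0.9),
    Game("Mega Hit", owners=2_000_000, median_playtime_hours=40.0),
    Game("Forgotten Dud", owners=1500, median_playtime_hours=0.3,
         completion_rate=0.1),
]
print(find_hidden_gems(catalog))
```

A game only needs to pass one engagement test, so the short "experience" game qualifies on completion rate even though its playtime is low.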
Limitations
The "hidden gem" detector isn't a silver bullet. It doesn't make the game of platform power go away, nor does it solve the structural, sociocultural and/or economic issues that favor the big boys. It also does nothing for games that aren't already on Steam (though Valve has given every indication they're going to let just about everybody in eventually). Here are a few other issues.
Beloved by Niche, Hated by Mainstream
If the "Hidden Gems" list is featured right on the Steam front page, then a lot of cool, weird stuff that's beloved by small fanbases will shoot to the top of the list. Immediately afterwards, there's a fair chance the mass market will cast its withering gaze upon it like the Eye of Sauron and mercilessly smack it down faster than you can say "Walking Simulator." So, perhaps hidden gem lists should be segregated into their natural niches, using Steam tags, or an Amazon/Netflix-style recommendation engine?
On the other hand, you never know what seemingly crazy game idea will actually catch on if it's just given a chance with the mass market. This is exactly the kind of tool that could bring us the next Goat Simulator.
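For the tag-based option, a per-niche chart could be as simple as the Python sketch below. It assumes each game already has some engagement score (however that gets computed) and a list of tags; the function name, the tuple layout, and the sample data are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def top_gems_by_tag(
    games: Iterable[Tuple[str, List[str], float]],
    per_tag: int = 3,
) -> Dict[str, List[str]]:
    """Group (name, tags, score) entries by tag and keep the best few per tag."""
    by_tag = defaultdict(list)
    for name, tags, score in games:
        for tag in tags:
            by_tag[tag].append((score, name))
    return {
        tag: [name for _score, name in sorted(entries, reverse=True)[:per_tag]]
        for tag, entries in by_tag.items()
    }


# Invented sample data: (name, tags, engagement score between 0 and 1).
games = [
    ("Obscure RPG", ["RPG", "Indie"], 0.81),
    ("Quiet Walk", ["Walking Simulator", "Indie"], 0.77),
    ("Dungeon Thing", ["RPG"], 0.64),
    ("Hill Stroll", ["Walking Simulator"], 0.59),
]
charts = top_gems_by_tag(games, per_tag=1)
print(charts)
```

This way a walking simulator only competes with other walking simulators for its slot, instead of being judged by people who were never going to like it.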
Different Strokes for Different Folks
As I mentioned above, you can't use a one-size-fits-all engagement heuristic to catch all the gems. "Did you finish it?" is much more relevant to judging overall engagement for Dear Esther or The Stanley Parable than it is for World of Warcraft or Dwarf Fortress. Since the whole point of this is to identify interesting things that would normally be passed over, special care will have to be taken with the various heuristics used. And what about things that are harder to measure like Proteus? I'm confident there are ways to detect people's interest in these things, but we'll have to be smart about it.
We Won't Find Everything
This method can't possibly find all the gems, nor should we expect it to. It's just another tool, and that's important to remember. It won't fix everything, but it could help make things a lot "less worse."
Gaming the Charts
Anytime you have a system where charts drive sales, it encourages people to game the charts. If it measures downloads, you drop your price to zero, cross-promote, and use your own money to inflate your stats. So I could easily imagine similar scams if we're measuring user playtime -- with proverbial goldfarmers idling Steam games to inflate their clients' chart positions. Safeguards could be put in for this -- it should be pretty easy to look at other metrics to detect if someone is "idlefarming" for chart position. Of course, cheating of any form is always an arms race, with the only final solution being to make it not worth the cheater's time and money to cheat in the first place. Which brings me to my next point.
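Before moving on, here's a toy Python sketch of the kind of safeguard I have in mind for idlefarming. The signal it uses (lots of hours with zero achievements unlocked) and the 20-hour threshold are pure assumptions for illustration; a real system would want several signals and would still be one side of an arms race.

```python
from statistics import median
from typing import List, Tuple

# Each player record is (hours_played, achievements_unlocked); both are
# hypothetical fields standing in for whatever the platform really tracks.
Player = Tuple[float, int]


def looks_idle(hours_played: float, achievements_unlocked: int,
               min_hours: float = 20.0) -> bool:
    """Flag accounts that logged many hours without any visible progress."""
    return hours_played >= min_hours and achievements_unlocked == 0


def filtered_median_playtime(players: List[Player]) -> float:
    """Median playtime after dropping accounts that look like idlers."""
    active = [hours for hours, unlocked in players
              if not looks_idle(hours, unlocked)]
    return median(active) if active else 0.0


# Invented data: three ordinary players and three suspicious idlers.
players = [(3.0, 2), (5.0, 4), (4.0, 3), (200.0, 0), (300.0, 0), (250.0, 0)]
raw = median(hours for hours, _ in players)
cleaned = filtered_median_playtime(players)
print(f"raw median: {raw} h, filtered median: {cleaned} h")
```

In this made-up case the idlers drag the raw median up to over a hundred hours, while the filtered figure reflects what the ordinary players actually did.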
Moving beyond Charts
By itself, I don't think another chart on the Steam front page is going to make a huge difference -- being hand-picked for a feature still makes a world of difference and most Steam developers put their marketing efforts into securing that rather than trying to micro-manage the charts.
Valve has said their vision for Steam's future is that anyone can make a Steam front page and curate it themselves. The algorithm I've described above could be used not just by automated charts, but by actual human curators. You wouldn't have to expose any sensitive data (like sales or playtime), just give curators the ability to look for these kinds of patterns with advanced search tools.
Right now, if I want to promote an obscure indie game, it has to already be on my radar. How cool would it be if I could just whip out my Hidden Gem Detector and go treasure hunting instead?