Recommended Flaw: 31% of Digg Homepage submitted by 10 Users (popfail.com)
26 points by ajbatac on July 7, 2008 | 26 comments


It looks like the number on News.YC is about 26%.

  > (let bys (map [_ 'by] (beststories nil 1000))
      (/ (apply + (map cadr 
                       (firstn 10       
                               (sort (compare > cadr)
                                     (dedup (map [list _ (count _ bys)] 
                                                 bys))))))
         1000))

  .258

This is the fraction of the 1000 top-scoring stories in the past month or so (http://news.ycombinator.com/best) submitted by the top 10 submitters, who incidentally are:

  nickb    77
  edw519   34
  markbao  26
  bdfh42   20
  nreece   19
  ajbatac  18
  wumi     17
  prakash  16
  smanek   16
  ilamont  15
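For readers who don't know Arc, here is a rough Python sketch of the same calculation: count stories per submitter, take the ten largest counts, and divide by the total. The function name `top_share` is made up for illustration, and the 742 one-story "other" users are an assumption added only so the listed counts sit inside a set of 1000 stories; they are not real data.

```python
from collections import Counter


def top_share(submitters, n=10):
    """Fraction of stories whose submitter is among the n most frequent."""
    if not submitters:
        return 0.0
    counts = Counter(submitters)
    # most_common(n) returns the n (name, count) pairs with the highest counts
    top_total = sum(count for _, count in counts.most_common(n))
    return top_total / len(submitters)


# Per-user story counts copied from the list above.
listed = {
    "nickb": 77, "edw519": 34, "markbao": 26, "bdfh42": 20, "nreece": 19,
    "ajbatac": 18, "wumi": 17, "prakash": 16, "smanek": 16, "ilamont": 15,
}

# One list entry per story. The filler users are hypothetical and exist
# only to pad the total out to 1000 stories.
submitters = [name for name, k in listed.items() for _ in range(k)]
submitters += [f"other{i}" for i in range(1000 - len(submitters))]

print(top_share(submitters))
```

The ten listed counts add up to 258, which matches the .258 above.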


To me the percentage itself isn't surprising but the ratio

    percent of stories on homepage by top 10 
    --------------------------------------------
    percent of stories submitted total by top 10
might be interesting, especially compared with

   percent of stories on homepage by non-top n
   -----------------------------------------------
   percent of stories submitted total by non-top n
It would reveal if top submitters are just submitting more stories, or if they're submitting better stories.
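A minimal sketch of that ratio in Python, assuming you have four counts per group: stories on the homepage by the group, homepage stories overall, stories submitted by the group, and submissions overall. The function name and every number in the example are hypothetical, chosen only to show the arithmetic.

```python
def representation_ratio(homepage_by_group, homepage_total,
                         submitted_by_group, submitted_total):
    """(group's share of homepage stories) / (group's share of all submissions).

    Above 1 means the group reaches the homepage more often than its
    submission volume alone would predict; below 1 means less often.
    """
    if homepage_total == 0 or submitted_by_group == 0:
        raise ValueError("homepage_total and submitted_by_group must be nonzero")
    # Cross-multiplied form of (a / b) / (c / d), so only one division is done
    return (homepage_by_group * submitted_total) / (homepage_total * submitted_by_group)


# Hypothetical numbers: the top 10 have 250 of 1000 homepage stories
# and made 1000 of 10000 submissions overall.
top = representation_ratio(250, 1000, 1000, 10000)

# Everyone else, using the complementary (equally hypothetical) counts.
rest = representation_ratio(750, 1000, 9000, 10000)

print(top, rest)
```

With these made-up inputs the top 10 come out at 2.5 and everyone else at roughly 0.83, which would suggest better stories and not just more of them. Real counts could of course point the other way.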


very true - that would be enlightening.

You can see something similar at http://top.searchyc.com/points_per_submission . That shows you the top users (on YC at least) by points per submission.

http://top.searchyc.com/points_per_comment is even more interesting in my opinion


Hey Paul,

I speak to JD all the time (he wrote the article) and we've been looking at these numbers again every couple of hours for the last couple days, and the 31% that he wrote about (it hovers around the 30% mark) is fairly consistent when looking at the last 500 frontpaged stories.

One thing he didn't write about was that, at the time, if you expanded it out to the top 28 users, that number was around 51% of Digg's frontpage stories.

Around 5am this morning (for me, about 5 hours ago), we ran the numbers again and at that particular time, one user was accounting for 53 of the last 500 frontpaged stories on his own.

Not going to say the data is conclusive or anything, because I frankly just don't know, but when you consider Digg says they have something like 23 million uniques a month and well over 1 million registered users, 50% of their frontpage traffic being dictated by fewer than 30 users is a little strange.


It doesn't seem strange to me. The top submitters aren't dictating the stories. They're just submitting them first. And it doesn't seem at all unlikely that a few dedicated users could be the first to find a large fraction of the stories that ultimately became popular.


They're not always submitting them first either though. Users like MakiMaki and MrBabyMan (Maki and Andy are both great guys btw) often resubmit other users' stories to get them frontpaged.

Not saying that's a bad thing either, because I believe that good content should have the best possible chance of getting noticed - it's just that the Recommendation Engine has recently widened the gap between the top users and the large majority of the site.


Hmm. Ok, I believe the number could be partially due to brokenness. Digg made a big design mistake initially in copying Slashdot too literally, and having stories that make the frontpage start at the top and get pushed down. That causes all sorts of problems. Having stories bubble up from the bottom is much more robust. But even if they did that I bet the percentage of frontpage stories from the top 10 users would surprise most people.

Edit: Steve Huffman just reported the number for Reddit, and it's a lot lower than Digg's: http://news.ycombinator.com/item?id=238290


Agreed. I personally think it's a great thing that there is a core set of users that keeps a site interesting and fresh.

I failed to mention these by accident, but here are a couple more stats to give a little more insight into how the update has affected things.

Pre Recommendation Engine, the top 10 only accounted for 12-17% of stories on Digg's frontpage. The #1 digger pre-RE (MrBabyMan) now barely registers post update (he was tied for 11th at the time of the article).


For some reason, that code sample makes me want to investigate lisp.


Over the last week, 6% of the top stories were submitted by top-10 users on reddit.


What are "top" stories? My front page shows 100 stories on reddit -- the default is higher than Digg right?

I do trust that Reddit does have a more robust top-user base, but I'm curious about the context.


A "top" link is a link that has been the hottest link on all of reddit at some point.

There isn't really an analogous concept on Digg since there is no notion of hotness and links don't rise and fall. Links just appear at the top of the front page at some point.


An interesting comparison would be the last n articles that have been ranked <= 25 for the default sub-reddit mix. At least that would compare the articles a user would see on each frontpage without customization.


Well, to answer my own question (close enough anyway): the top 10 submitters of the top 500 submissions this week account for 15.2% of articles; for this month it's 15.8%.


What's the number for subreddits?



yup, nothing new. That might even be a good percentage :) http://www.useit.com/alertbox/participation_inequality.html


Related reading: "The murky demimonde of Amazon's top reviewers"

"Amazon's rankings establish a formal, public competition for power or its online equivalent, recognition...

To the extent that competitive energies drive Top Reviewers and their nemeses to generate content, and to spend time on and publicize Amazon.com, the chief beneficiary of misuse of Amazon.com's ranking system is Amazon itself."

http://www.slate.com/id/2182002/pagenum/all


This is really old news. It first broke at the end of March with this post: http://socialalerter.com/news/digg-got-it-166-wrong

What's interesting is what happened after Digg's algo change in January. That algo change was supposed to increase diversity of users making the front page. It had the exact opposite effect.


How can it be "old news" when the Recommendation Engine only recently got released?


Because the data has been available for ages. The Recommendation Engine has little to do with going popular - yet.

Once thousands of new stories have been published with the Recommendation Engine available to everyone, it would be very interesting to run the same analysis again.

It boils down to whether you have enough data or not.


On the scale of bitching about digg (which I'm known to do :-)) this one doesn't rank. Given the various power laws out there, I'd say that it's actually pretty diverse.

The people at the top, for whatever reason, work very hard to get there.


This isn't all that surprising to me.

I would guess that there are a few groups of users, with approximate proportions:

  Lurkers:            70%
  Diggers:            20%
  Submitters:         10%
  Regular Submitters: <1%

.. is there anyone from Digg here that can confirm this? :)


You could probably get this from the Digg API, with the lurker count estimated from the various traffic figures out there. I'd guess the proportion of lurkers is actually a little higher than that.


I don't know the specific numbers - but I guess if you looked at a site like socialblade.com you could probably guesstimate the submitters/regular submitters number.


On jaanix it is the same ~30% for new users (no personalization).

But for returning users we have a personalized front page, and on top of that the posts are diversified so each topic and person gets fair attention the way YOU want it.



