Ranking Oversaturation

April 7th, 2008 Posted in classical music

There is a negative aspect to all of the blog ranking which has been going on here and in other places recently. I don’t think it’s too awful an effect, but it’s all just a tiny little bit cheapening, this who has the most popular blog malarkey. Maybe it’s just because my eyes have been exposed to too many of these lists in the last few days, but it feels just a little bit sad to see all of those numbers next to everyone’s efforts. There is something uncomfortable about quantifying the writing that people have produced pretty much purely because they are passionate about the subject.

Patty at Oboeinsight feels similarly, I think:

“Sure, it’s fun to be able to see that some people read this little blog. But really, would I stop blogging if I didn’t make the list? Heck no.”

Which I suspect is how almost everyone else either on or off the list feels. I hope so anyway. It’s sort of hard to divorce yourself from this where-am-I-placed type of attitude when the numbers are right in front of you though, and I confess that I will be vigorously scanning any more of these that are published to find my own little website amongst all the others.

I think it’s really important to treat these rankings as vaguely indicative, and not the be-all/end-all of anything. While it’s nice to have a measure of how far your reach extends compared to others, it’s probably not a good thing to take it terribly seriously. Or at all seriously, in fact.

Perhaps I’m just feeling a little mopey and hypocritical on the cusp of another Monday.

20 Responses to “Ranking Oversaturation”

  1. A.C. Douglas Says:

    “I don’t think it’s too awful an effect, but it’s all just a tiny little bit cheapening, this who has the most popular blog malarkey.”
    ————————————————-

    Quite right. And that’s precisely why statistical measures of relative popularity such as a count of RSS subscribers — itself a fairly iffy statistic at best on several fronts — is not the best, nor even a useful, way to rank blogs. There is, however, some value to be derived from a ranking of blogs using a measure of relative importance such as a count of incoming links, incoming links being the universally accepted such measure within the universe of the Web. As I pointed out in my introductory post to what became the S&F Top 50, such lists are “often the only reliable guide a newbie or ‘outsider’ has available to him to sort out the wheat from the chaff initially without his having to slog through dozens upon dozens of blogs himself, most of which turn out to be ultimately valueless reading.” Are lists so constructed the be-all/end-all of anything, as you put it? Of course not, and no-one to my knowledge has ever so much as suggested they are. If they’re properly made using a data set that can be trusted and that measures what needs to be measured for the purpose, they’re merely useful, not determining.

    ACD


  2. A.C. Douglas Says:

    Jeez! Those funny-looking characters up there are not my doing. Your text parser didn’t know how to represent an em dash and so replaced it with that “â??” thingie as it did several other non-ASCII characters.

    Here’s the comment again with the em dashes and those non-ASCII characters replaced. Probably best to delete the original of this (and this explanatory preface as well).

    “I don’t think it’s too awful an effect, but it’s all just a tiny little bit cheapening, this who has the most popular blog malarkey.”
    ————————————————-

    Quite right. And that’s precisely why statistical measures of relative popularity such as a count of RSS subscribers — itself a fairly iffy statistic at best on several fronts — is not the best, nor even a useful, way to rank blogs. There is, however, some value to be derived from a ranking of blogs using a measure of relative importance such as a count of incoming links, incoming links being the universally accepted such measure within the universe of the Web. As I pointed out in my introductory post to what became the S&F Top 50, such lists are “often the only reliable guide a newbie or ‘outsider’ has available to him to sort out the wheat from the chaff initially without his having to slog through dozens upon dozens of blogs himself, most of which turn out to be ultimately valueless reading.” Are lists so constructed the be-all/end-all of anything, as you put it? Of course not, and no-one to my knowledge has ever so much as suggested they are. If they’re properly made using a data set that can be trusted and that measures what needs to be measured for the purpose, they’re merely useful, not determining.

    ACD


  3. patty Says:

    Yeah. I’m with you. I say they don’t matter, but I check ‘em, and my heart takes a little happy leap if I’m on the list. Go figure.

    Now IF I get home from my trip and find that my PO Box is full of money because of my suggestion, I might decide that the whole popularity thing is simply wonderful.

    I’m not counting on that, though! ;-)

    (I have, however, received some oboe reeds due to my little place. Now THAT is quite cool. Especially when I get one I can actually play on!)


  4. Ben Says:

    ACD,

    I think your dismissal of RSS subscribers as “not even a useful” way of ranking blogs is a little shortsighted. Each one of these ranking methods is far from perfect, which is one of the main reasons I looked at a variety of them. The Google backlinks method is certainly not a “universally accepted” measure of relative importance.

    For a start, as you pointed out on your own site, we do not have a very complete understanding of how Google selects which links it includes on the “out of xxx” numbers which we both quote. It does not seem to be nearly as straightforward as a simple pagerank threshold – try clicking on some of the linking websites in any results page and you will find a bunch with no pagerank at all.

    Since we actually have no concrete idea of how this algorithm works I do not think that it is reasonable to claim that this method of ranking is superior to any other, despite what we individually might believe. As we are working with imperfect measures, I feel most comfortable with comparing a selection of them, rather than claiming that any one is obviously the best. I don’t think you can reasonably champion one while dismissing all the others.


  5. A.C. Douglas Says:

    Ben: I didn’t say that the Google Backward Links method is a universally accepted measure of relative importance. What I said was that the count of incoming links is the universally accepted measure of relative importance within the universe of the Web. That’s an inarguable fact. My using Google’s Backward Links as the counter is a judgment call on my part. It’s not perfect by any means as I’ve several times noted. It’s simply the soundest count available at the present time. RSS subscribership is NOT a measure of relative importance, also as I’ve noted several times. It’s a measure of relative popularity. I suppose that in this celebrity-crazy era, some folks might think that’s the right measure to use for ranking blogs. But as I’m certain I don’t have to tell you, the most important things are only rarely also the most popular.

    ACD


  6. A.C. Douglas Says:

    Oh, and I should have noted above that Technorati’s Authority number is NOT a count of incoming links. It’s a count of linking blogs. That’s one of the things that makes it of limited use in a ranking of this sort.

    ACD
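The distinction drawn above, between counting incoming links and counting linking blogs, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the URLs are made up, the function names are hypothetical, and grouping by host name is a guess at what "one blog" means, not a description of how Technorati or Google actually count.

```python
from urllib.parse import urlparse


def count_incoming_links(link_sources):
    """Count every linking page, even if several come from the same site."""
    return len(link_sources)


def count_linking_sites(link_sources):
    """Count each linking site once, however many of its pages link in."""
    return len({urlparse(url).netloc.lower() for url in link_sources})


# Hypothetical pages that link to one blog (made-up example hosts).
sources = [
    "http://blog-one.example/2008/04/post-a",
    "http://blog-one.example/2008/04/post-b",
    "http://blog-one.example/blogroll",
    "http://blog-two.example/links",
]

print(count_incoming_links(sources))  # four linking pages
print(count_linking_sites(sources))   # two distinct sites
```

A blog that is linked many times by one enthusiastic neighbour scores well on the first count and poorly on the second, which is why the two numbers need not agree.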


  7. Ben Says:

    ACD: It certainly is not an inarguable fact. “Importance” is an extremely subjective term. The number of quality incoming links is currently the universally accepted method of determining the relative authority of websites based upon keyword searches. This is true simply because the underlying Google algorithm is based upon this method, and Google is currently king of the internet search.

    I do not at all think that this equates to a universal measure of importance. It is simply a measure of how well-linked a website is and does not directly measure the number of people reading or using that material.

    As almost all of the incoming links to our blogs come from other classical music blogs, and almost all of these are from “blogroll” sections, looking at the number of incoming links is basically a measure of how many other classical bloggers are linking to one’s blog.

    From your previous comments it seems as though this is what you consider to be the most accurate measure of importance. Personally I care far more about the number of people reading what I write than I do the number of other classical music bloggers who have me in their blogroll, and I suspect there are others who feel similarly.

    Of course, the question becomes significantly more complicated when we consider the accuracy of the methods employed.


  8. A.C. Douglas Says:

    “This is true [that incoming links are the universally accepted measure of relative importance] simply because the underlying Google algorithm is based upon this method, and Google is currently king of the internet search.”

    Yes sir. No question about it. And your point would be…What, exactly?

    “Personally I care far more about the number of people reading what I write than I do the number of other classical music bloggers who have me in their blogroll, and I suspect there are others who feel similarly.”

    I can understand that. Problem is, there’s no statistically inclusive and “clean” way to measure that across the universe of classical music blogs. RSS reader subscriptions gleaned from any one online RSS reader (there are at least four that I know of, and there are dozens of RSS desktop readers from which no statistics are glean-able) include only a subset of a subset of a subset of a subset (that’s four levels of subset) of total blog readers, and only a subset of a subset of a subset (three levels of subset) of those who read blogs via RSS readers. And then there’s the problem of which of the three feeds is being read. IOW, RSS subscribership is a fairly useless statistic for ranking purposes even if one believes (I am not among them) that’s the best measure of blog relative importance.

    “Of course, the question becomes significantly more complicated when we consider the accuracy of the methods employed [to count incoming links or subscribers].”

    Oh, indeed. That’s THE major problem in using any data source whatsoever. As I’ve written previously elsewhere, it’s a matter of doing the best we can with the best available to us; ergo, my choice of Google Backward Links as the data set most fit for the S&F Top 50.

    ACD


  9. Ben Says:

    ACD:

    You bracketed out the important bit:

    “The number of quality incoming links is currently the universally accepted method of determining the relative authority of websites based upon keyword searches.”

    In other words, it is universally accepted as a measure of importance for the purpose of Google searches. This is quite different from being a universally accepted measure of importance on the web, and this is what I was taking issue with.

    However, I totally agree that it is an excellent indicator of how well established and integrated into the classical music blogging community a site is, and I also think that it is a comparatively “clean” measure. Otherwise I would not have included it. I just think you are overstating your case a bit.

    Additionally I don’t understand where you get the “four levels” of subset from for RSS readers. A subset of readers use RSS, and a (fairly large) subset of these use either Google Reader or Bloglines. That’s two levels. As I’m sure you are aware, most preferential surveys use only a small sample of an entire population, but are indicative of the population as a whole.

    Of course, if you don’t think RSS subscribers is a useful metric then that is not going to sway your opinions. I think other people will disagree, and therefore I am providing them with that information.


  10. A.C. Douglas Says:

    Ben: I left out “the important bit” because, quite frankly, I thought you simply misspoke due to carelessness. What you wrote is quite wrong. The relative importance (or, “authority,” to use your term) of websites is NOT based by Google on keyword searches. It’s based on the incoming links to a website’s pages which Google expresses in a single number on a page by page basis: PageRank. PageRank is at the very heart of Google’s statistical enterprise; that which propelled it to its position as “curren[t] king of the internet search,” to use your words. And that’s why the concept of incoming links (called Backward Links by Google) has been accepted universally as a measure of relative importance on the Web.

    As to the four levels of subset, *I* misspoke due to carelessness. What I should have said is that there are four levels of classical music blog readership: 1) total classical music blog readership; 2) classical music blog readership via RSS readers; 3) classical music blog readership via online RSS readers; and 4) classical music blog online RSS readership via any single online RSS reader. So, four levels total, and three levels of subset for any single online RSS reader.

    And as to your, “most preferential surveys use only a small sample of an entire population, but are indicative of the population as whole,” that’s true if, and only if, the small sample is a genuine random sample, a condition which cannot at all be verified to be the case with the small, third-subset-level sample of online RSS classical music blog readership.

    ACD


  11. A.C. Douglas Says:

    Oops

    Incomplete last graf up there. Graf should have read:

    “And as to your, ‘most preferential surveys use only a small sample of an entire population, but are indicative of the population as whole,’ that’s true if, and only if, the small sample is a genuine random sample, a condition which cannot at all be verified to be the case with the small, third-subset-level sample of online RSS classical music blog readership, and is almost certainly NOT the case.”

    ACD


  12. Ben Says:

    ACD:

    That’s not what I meant. To clarify, I said that the number of quality incoming links is one of the most important determining factors for keyword searches, i.e. Google searches. When you enter a search term Google identifies appropriate websites based on the keywords entered and ranks them according to their PageRank, which is calculated partially by the number of quality incoming links and partially based on other factors which nobody except the folks at Google know for sure. PageRank is nowhere near to being purely a function of the incoming links, but this is at the core of the algorithm. I’m sure you are already familiar with this. To paraphrase your words, this is why the concept of incoming links has been accepted universally as a measure of relative authority for Google searches. This is far more specific than being some universally accepted measure of importance of a webpage.

    I am not arguing with the fact that this is widely accepted as a fairly important rating of authority for searches, but this is a far weaker statement than it being a universally accepted measure of importance.

    Anyway, I think we both agree that incoming links is a pretty good metric to use. To be honest, if you had said “widely” instead of “universally” I would probably have gone along with it.

    I think we can be pretty confident that the number of Google Reader subscribers is a fairly consistent percentage of the total number of RSS subscriptions when comparing blogs of the same topic (that is to say, since essentially the same group of people are reading our blogs, we expect similar percentages to be using the same RSS feed readers) therefore it is a useful measure of RSS readership if the sample is large. In this case the sample population is the entire user base of Google Reader. Essentially each subscription number result is answering the question: of everybody who uses Google Reader, how many are subscribed to blog X? This is a far larger population than just the classical music blog readership — as you mistakenly state the sample consists of — and IS in fact a pretty decent “random” (clearly it’s not entirely random as it requires people to be using GR, and there is some bias between blogs) sample.

    There are of course a variety of different reasons why it is not an ideal measure…
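For readers unfamiliar with the idea being argued over, here is a minimal sketch of the PageRank calculation as it was originally published: a page's score depends on the scores of the pages linking to it, not just on how many links it has. Everything specific here is an assumption for illustration: the four-blog link graph is invented, the damping factor of 0.85 is the commonly quoted textbook value, and none of this claims to reproduce what Google actually runs, which (as noted above) nobody outside Google knows in full.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook power-iteration PageRank.

    `links` maps each page to the list of pages it links to.
    Returns a dict of page -> score; the scores sum to 1.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the same small baseline.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its current score among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Invented link graph: who links to whom among four hypothetical blogs.
toy_web = {
    "blog_a": ["blog_b", "blog_c"],
    "blog_b": ["blog_c"],
    "blog_c": ["blog_a"],
    "blog_d": ["blog_c"],
}

ranks = pagerank(toy_web)
for name in sorted(ranks, key=ranks.get, reverse=True):
    print(name, round(ranks[name], 3))
```

In this toy graph blog_c collects the most links and comes out on top, while blog_d, which nobody links to, is left with only the baseline score. blog_a has a single incoming link but still ranks second, because that link comes from the highest-scoring page.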


  13. A.C. Douglas Says:

    “To paraphrase your words, this is why the concept of incoming links has been accepted universally as a measure of relative authority for Google searches.”

    I wrote that the success of the concept of PageRank is the reason that incoming links has become the universally accepted measure of relative importance on the Web, and that’s precisely what I meant. Period. Full stop. No qualifying or limiting, “for Google searches.” The acceptance of incoming links as a measure of relative importance goes well beyond that.

    “In this case the sample population is the entire user base of Google Reader…., etc.”

    Not quite. It would have to be a sample population of classical blog readers in our case, and that sample population is three subsets down, and NOT random; ergo, useless as a statistical metric for our purpose, as I wrote, even if one accepts that a measure of relative popularity rather than one of relative importance is a fit metric for ranking purposes in our case which I do not; ergo, my choice of Google’s Backward Links (i.e., incoming links) as the metric best fit for the purpose.

    I’ve elsewhere several times made the point that my chosen method of ranking for the S&F Top 50 is far from perfect. It’s merely the most appropriate and statistically robust presently available.

    ACD


  14. Ben Says:

    ACD:

    “The acceptance of incoming links as a measure of relative importance goes well beyond that.”

    Yes it does. Does that mean it is “universally accepted as a measure of importance?” No. Besides your extravagant claim that you are aware of the feelings of the entire internet (“universally accepted”) you are in fact implicitly defining “importance” as “being placed well on Google”:

    “… the success of the concept of PageRank is the reason that incoming links has [sic] become the universally accepted measure of relative importance on the Web.”

    Which is exactly what I was saying.

    I have several issues which I am not going to get into with your “three subsets down” argument, and I am not going to start getting into sampling theory. I will simply state the following two things:

    1) The Rest is Noise has a subscription number of 946, which means there are at least this many classical music blog readers using the service. Even this tiny fraction of the GR subscribers is a huge sample, statistically.

    2) The GR subscription rankings and the backlink rankings are strongly correlated.

    You can claim all you like that the backlinks method is:

    “… merely the most appropriate and statistically robust presently available.”

    But you have not even begun to prove that statement. It’s outrageous to make that claim when you have essentially no idea about hugely important things such as how Google decides on the number of links to display in the “out of xxx links” line. You are dramatically overselling your chosen method.
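The claim that two rankings are "strongly correlated" can be checked with a short calculation. The sketch below computes both the ordinary (Pearson) correlation of the raw counts and the rank (Spearman) correlation, which compares only the orderings. The numbers are invented purely for illustration and are not the actual backlink or subscriber figures for any blog in either list.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def to_ranks(values):
    """Rank 1 is the largest value; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(to_ranks(xs), to_ranks(ys))


# Invented figures for six hypothetical blogs, listed in the same order.
backlinks = [410, 220, 180, 95, 60, 30]
subscribers = [900, 150, 300, 40, 80, 20]

print("Pearson: ", round(pearson(backlinks, subscribers), 3))
print("Spearman:", round(spearman(backlinks, subscribers), 3))
```

Since the argument here is about rankings rather than raw counts, the rank correlation is arguably the more relevant of the two. A single very large blog can dominate the Pearson figure, whereas the Spearman figure treats it as just one position in the order.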


  15. A.C. Douglas Says:

    Ben:

    “Yes it does. Does that mean it is ‘universally accepted as a measure of importance?’ No, etc.”

    Not “the feelings of the entire internet.” The feelings of those — universally — who compile such Web statistics.

    As to your two points:

    1) Even allowing for your insistence on considering total GR subscribers (in which case, the stat of interest becomes the fourth subset level of five total levels), that “huge sample” is still NOT a random sample; ergo, useless for our purpose.

    2) Strongly correlated? I suspect not. From the cursory investigation I made on that point, I would guess that if the correlation coefficient could be computed it would fall well below 0.7 except in the cases of the Big Boys (like Alex Ross, who’s the biggest of the Big Boys). And in any case, since the RSS sample is almost certainly not a random one, one can’t even begin to use such a statistical measure.

    “But you have not even begun to prove that statement.”

    I don’t have to. That incoming links is (and if you quote this, spare me the sic as in the context of this thread we’re talking about the concept — singular — of incoming links) the universally accepted measure of relative importance is proof enough of the statement for this purpose.

    All the above notwithstanding, this back and forth between us has given me what I think might be a useful idea and a useful thing to do.

    In the next quarterly S&F Top 50 rankings (to be published 1 July), I’ll of course still rank by Google Backward Links, but parenthetically display the Google Reader RSS subscription number for each entry. Better would be to use Google Reader + Bloglines subscription numbers summed for all three feeds (Atom, RSS 1, and RSS 2), but since I have to do everything by hand (the last programming I did was a utility application written in straight assembly for the Intel 8088 — the original IBM PC — some 25 years ago), that would be far too onerous a job for the compiling of a mere curiosity statistic.

    Does that sound like a useful thing to do?

    ACD


  16. A.C. Douglas Says:

    Ha! I just read your new post (on my IE7 feed reader). I swear on all that’s near and dear to me that I read it after I posted the above.

    ACD


  17. A.C. Douglas Says:

    Oops

    Typo up there. My 0.7 should have read 0.3.

    ACD


  18. Ben Says:

    ACD:

    I think the point you are making is that the sample is not random because we are selecting GR RSS subscribers, correct? However, since we are comparing classical music blogs this is only important if GR subscribers are naturally biased toward specific classical music blogs over others. It is “random enough” given that we are comparing things which should not be affected differently by the bias of the sample selection.

    In short, I think that other factors (such as correctly dealing with multiple feeds per site) limit the usefulness of the RSS numbers way more than random sample concerns do.

    Incidentally, the correlation of the entire data set is about 0.65.

    I still disagree with your “universally accepted” phrasing but I feel that at this stage we are arguing way more about the choice of words than the usefulness of the actual statistic.

    If you feel that people would benefit from the RSS numbers (something which I am quite interested in finding out) and you have the patience, I think it would definitely enhance the statistics. Actually what I would really like is a table which lets you rank by every possible available ranking, similar to the adage ones, but that kind of mammoth production is for people who actually get paid to write about these things, rather than us poor lucky fellows who do it for pleasure…

    Ben
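The disagreement over whether a non-random sample is "random enough" comes down to one question: does the sampled fraction differ from blog to blog? A small sketch makes the two cases concrete. All the figures are invented, and the assumption that each blog has a single fixed "share" of readers using one feed reader is a deliberate simplification.

```python
def rank_order(counts):
    """Blog names sorted from most to fewest readers."""
    return sorted(counts, key=counts.get, reverse=True)


def observed(true_counts, share):
    """Subscribers visible in one feed reader, given each blog's share."""
    return {name: round(true_counts[name] * share[name]) for name in true_counts}


# Invented total readerships for four hypothetical blogs.
true_readers = {"blog_a": 2000, "blog_b": 1200, "blog_c": 900, "blog_d": 400}

# Case 1: the same fraction of every blog's readers uses the feed reader.
uniform_share = {name: 0.25 for name in true_readers}

# Case 2: the fraction differs by blog (blog_a's readers mostly read elsewhere).
skewed_share = {"blog_a": 0.10, "blog_b": 0.30, "blog_c": 0.25, "blog_d": 0.25}

print(rank_order(true_readers))
print(rank_order(observed(true_readers, uniform_share)))
print(rank_order(observed(true_readers, skewed_share)))
```

With a uniform share the observed ranking matches the true one even though the sample is not random. With blog-specific shares the largest blog drops to third place, so whether the subscriber counts give a usable ranking depends on how far the shares actually vary from blog to blog.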


  19. A.C. Douglas Says:

    “If you feel that people would benefit from the RSS numbers (something which I am quite interested in finding out) and you have the patience, I think it would definitely enhance the statistics.”

    What I think is that people would have a curiosity satisfied by my including that Google Reader RSS subscriptions number. It would as well be interesting, if only marginally useful, to see that number juxtaposed immediately next to the Backward Links rank.

    What the hell. I think I’ll do it (and credit you and this back-and-forth with provoking the idea).

    Good jousting with you.

    ACD


  20. A.C. Douglas Says:

    Ben: I’ve been giving more thought to including that GR number parenthetically in the next S&F Top 50, and the more I think about it, the less a good idea it seems. To include such a fluff statistic in juxtaposition to such a robust one as Backward Links will, I think, accomplish nothing but to cheapen the entire ranking. And so I think I’ll leave it to others to do what they will with RSS subscribership numbers in the ranking of classical music blogs.

    ACD

