Project Idea: Referer log Zeitgeist
So, take a look at backlinks. It is kind of a mess, culled from my web server’s Apache log, but gratifying to browse. Apparently, I am the top hit on Google for such things as:
I owe such “fame,” I think, in large part to WordPress’ clever habit of putting title keywords into the URLs of my posts. Google seems to lend a bit more credibility to URLs that match keywords than to goofy URLs. “What’s in a name? Would a Rose, by any other name …”
Anyway, another thing you’ll see in there is spam referers, which companies wrap into HTTP requests with the express intent of appearing on a public “backlinks” page to boost their own Google ranking!
I have been thinking of hacking up a little log-file parser that pulls the referers out and checks that they actually link to the pages they claim to. If the referer is a search engine, it would check the results on that search engine and report the ranking. It is nice to:
- Know what keywords are bringing you traffic.
- Filter out the referer spammers.
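A minimal sketch of such a parser, in Python, might look like the following. It assumes the Apache "combined" log format, and the names here (`SEARCH_PARAMS`, `parse_referer`, `search_keywords`, `links_back`, `summarize`) are made up for illustration. Fetching the referring page and looking up the ranking on the search engine are left out; `links_back` only inspects HTML you have already downloaded.

```python
import re
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urlsplit, parse_qs

# Apache "combined" format: host ident user [time] "request" status bytes "referer" "agent"
# Assumption: no escaped quotes inside the quoted fields.
COMBINED = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Hypothetical table: search-engine host fragment -> query-string parameter.
SEARCH_PARAMS = {"google.": "q", "bing.": "q", "search.yahoo.": "p"}


def parse_referer(line):
    """Return the referer from one log line, or None if absent or unparsable."""
    m = COMBINED.match(line)
    if not m:
        return None
    ref = m.group("referer")
    return None if ref in ("", "-") else ref


def search_keywords(referer):
    """Return the search phrase if the referer looks like a search engine, else None."""
    parts = urlsplit(referer)
    host = parts.netloc.lower()
    for fragment, param in SEARCH_PARAMS.items():
        if fragment in host:
            values = parse_qs(parts.query).get(param)
            return values[0].strip().lower() if values else None
    return None


class LinkFinder(HTMLParser):
    """Record whether any <a href> in the page starts with the target URL."""

    def __init__(self, target):
        super().__init__()
        self.target = target
        self.found = False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href") or ""
            if href.startswith(self.target):
                self.found = True


def links_back(html, target):
    """True if the (already fetched) referring page really links to target."""
    finder = LinkFinder(target)
    finder.feed(html)
    return finder.found


def summarize(lines):
    """Count search phrases and all other referers separately."""
    keywords, others = Counter(), Counter()
    for line in lines:
        ref = parse_referer(line)
        if ref is None:
            continue
        phrase = search_keywords(ref)
        if phrase:
            keywords[phrase] += 1
        else:
            others[ref] += 1
    return keywords, others
```

The idea would be to run `summarize` over the log, then fetch each URL in `others` and drop the ones where `links_back` comes up false, since those are the likely referer spammers.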
And, thinking 1.1, it would be neat to track this data by week or month, and see which keyword hits are gaining and which are declining. A personal site “zeitgeist,” if you will.
Does anyone have ideas on implementation, know whether it has already been implemented, or have thoughts on other features that would be nice or how best to arrange the results? Drop me a line or comment here. Thanks!
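As a rough sketch of that weekly tracking, again in Python: bucket each search hit by ISO week using the timestamp from the Apache log, then compare one week's counts against the next. The function names (`week_key`, `weekly_counts`, `trends`) are hypothetical, and the sketch assumes the timestamps look like `10/Jan/2005:10:00:00 -0800` with English month abbreviations.

```python
from collections import Counter, defaultdict
from datetime import datetime


def week_key(timestamp):
    """Map an Apache timestamp to an (ISO year, ISO week) pair."""
    dt = datetime.strptime(timestamp, "%d/%b/%Y:%H:%M:%S %z")
    iso = dt.isocalendar()
    return (iso[0], iso[1])


def weekly_counts(hits):
    """hits: iterable of (timestamp, phrase) pairs -> {week: Counter of phrases}."""
    weeks = defaultdict(Counter)
    for timestamp, phrase in hits:
        weeks[week_key(timestamp)][phrase] += 1
    return weeks


def trends(previous, current):
    """Compare two Counters; return (gaining, declining), biggest change first."""
    deltas = {p: current[p] - previous[p] for p in set(previous) | set(current)}
    gaining = sorted(
        ((p, d) for p, d in deltas.items() if d > 0),
        key=lambda item: (-item[1], item[0]),
    )
    declining = sorted(
        ((p, d) for p, d in deltas.items() if d < 0),
        key=lambda item: (item[1], item[0]),
    )
    return gaining, declining
```

Swapping `week_key` for a function that returns `(year, month)` would give the monthly view instead.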
-danny
Response
Ben
I recently came across a way to filter out the referer spammers. Take a look over at http://cavlec.yarinareth.net/archives/2005/01/11/killing-referrer-spam/ and see what you think. It’s .htaccess-based: they’ll either get 403'd, or it’ll silently redirect them to the spamming site.
As far as parsing and doing intelligent things with page referers in WordPress once you’ve dispensed with the riff-raff, take a look at this plugin David Chait did (actually, he improved upon a plugin that I improved upon): http://www.chait.net/index.php?p=238. It’ll let you generate lists of top referers, recent Google queries that led to your site, etc.
Danny Howard is 100% responsible for the content on this site, except some of it is stolen.
All rights are reserved, unless otherwise noted. Generally, I'm a BSD guy, so you can assume implicit permission to adapt, modify, and redistribute my intellectual property with appropriate attribution. Except some of this content is itself re-appropriated, so you'd best ask first, especially for commercial use. Thanks!
You can contact me via e-mail: dannyman@toldme.com