Some recent articles and blog postings are finally starting to challenge the Google search religion. I’ve provided a bunch of links for you in the webliography at the end of this column. The real issue is search engine spam and the serving up of questionable results. It affects every kind of searcher, learner, educator, and researcher, and it runs the risk of ruining the usefulness of search engines and, in particular, Google. Several scenarios could play out:
1. Google could become a massive dark hole of lousy content driven by the needs of advertisers, marketers, and special interest groups. Users may or may not notice.
2. New competitors could arrive and address these weaknesses and create market options for search that drive improvements across the board. Perhaps Bing or blekko are already starting this.
3. Search engine optimizers could become regulated or self-manage to address the threats to their own interests.
4. Content reputation management systems that have been tried over the years (like a Good Housekeeping Seal) may finally come alive.
5. Recommendation systems that rely on the value of the recommender (your own social connections or respected groups, leaders, or professions) could influence the relevancy of search results. There is some potential in recommendations tied to your own contacts in such environments as Facebook; LinkedIn; StumbleUpon; Digg, Inc.; Quora; or even a renewed Delicious. Peer recommendations are already working better in music, movies, and recreational reading than they are in the research and question-and-answer space.
Each of the scenarios above has some chance of occurring. Some are desirable goals, but most also run the risk of being double-edged swords. While you could get better answers under some scenarios, they come at the cost of narrowness, a dependence on group-sourcing answers, and/or a reduction in innovative thought and serendipity. So, what to do?
I’d suggest that what is most important, in the near term, is to build critical evaluation skills in learners and researchers about what’s behind the results they get from web search engines. To do this, we must add a greater dimension to the teaching of searching and information literacies. We must move beyond the teaching of raw searching skills and the retrieval of information, simple content quality evaluations, and the narrowly based search training for media literacy to avoid the dangerous, prurient, and gambling aspects of the web. These skills are important, but there are more fundamental insights that can be gained by understanding the business models behind search engines. Learners and researchers should know to ask themselves who or what chose to promote each link on the pages of search results they are seeing. Are those links driven by simple mathematical relevancy or a search algorithm? Are special interest groups, political parties, individuals, lobbyists, or commercial advertising interests determining the results searchers are finding?
So, here are some insights into what we need to be teaching, in addition to all the good stuff we’re already doing now.
First, every hour, 1 million spam pages of content are created. Spammers are out to harm users, steal publisher traffic, and defraud legitimate advertisers. A new search engine has created a spam clock to highlight this issue (see Figure 1).
For starters, every searcher should know who creates spam pages, why, and how they influence search results. For instance, did you know that Yahoo! owns one of the largest content and article creation companies that are designed to drive traffic to advertisers? Can you name the other majors? This is an important issue. These so-called content farms are companies such as Demand Media and Answers.com. Each creates thousands of pieces of content per day. This content may actually be correct … or not. On surface analysis it seems a bit shallow, but it serves as link bait to attract searchers to information that may be biased or lack perspective.
For instance, it may be paid for by a single pharmaceutical company to drive people to review its drug therapy. It may be a class action cohort attempting to build numbers for a mesothelioma legal suit. It might be an appliance manufacturer attempting to influence your consumer choice of freezer or stove brand. Both Demand Media and Answers.com are now firmly inside the top 20 web properties in the U.S., on a par with the likes of Apple, Inc. and AOL, Inc. Surprised? Google alone makes $1 billion in profit every month or so. It is highly unlikely that any of that money is coming from the pockets of you or your students. The search engines are focused on serving the needs of their real customers, the advertisers, and have many tools and services at their disposal to delight those paying clients.
Search Engine Optimization
Search engine optimization (SEO) and its little brother, social media optimization (SMO), are the big boys of influence in the world of changing search engine results. These techniques are used by any web property with any degree of sophistication, including library websites. There are white-hat and black-hat search optimizers. Usually for a fee, they work to ensure that your web presence (website, Facebook profile, Twitter feed, etc.) gets the traffic you desire. Sometimes their clients want to sell something, and other times they are promoting a point of view.
There are well-known sites from racist organizations such as Stormfront that promote their causes and points of view. This is an example of black-hat optimization. White-hat optimization is that undertaken by charities and commercial interests. Political parties, politicians, and political action committees have become expert in driving voters to their sites and editorials. In recent years, these have become very sophisticated with the ability to geocode SEO (aka GEO) and direct results at the electoral district, area code, ZIP code, or census tract level. I am told that you can purchase the ability to use localized SEO at the school and college campus level since young targets are the sweet spot of advertisers.
Google is excellent at providing search results for the big who, what, where, and when questions. SEO, SMO, and GEO play a key role in making the search results better. Who would want the results of a search for pizza to contain only the biggest chains and not the local ones they could visit, along with a local coupon? The difficult questions, those that start with why and how, are more important, and they are the foundation of an education that is based on critical thinking. Most of the time we get delightful results because the questions are simple. So, we get lured into a sense of comfort and trust, and we fail to notice that the results are heavily influenced when the questions are harder: health issues, politics, business decisions, and more. Intelligent searchers will question their search results and dig deeper when the response is important to a decision they are making. We need to teach this deeply and scaffold those skills as learners age and their questions increase in difficulty, importance, and impact.
Clutter, Spam, Relevancy
Google search results have become a spammed and cluttered mess. At this point it seems to be a game of Whac-A-Mole to build a search algorithm that senses spam sites and SEO content. The big engines are notoriously secretive about their algorithms, and that is understandable. They reportedly change them often, maybe even daily. Google has become a search religion, or a bad habit, and that’s dangerous to critical thinking, democracy, and the learners and users we care about. Google may have outlasted its usefulness, or it may overcome its current deficiencies and problems. At this point, though, the only ethical thing for educators to do is to train learners and researchers for the future and to encourage them to explore options beyond Google.
In my opinion, blekko, Exalead, and Bing are fine choices for a start. Any one of these can suffer from the same issues, and the skills apply broadly. And when you add the alternative models of library sources that do not depend on revenue from advertisers, you have a better toolkit and skills as a searcher. A library’s licensed database resource and online catalogue results are never influenced by SEO techniques and third-party manipulation. That’s a key piece of information that everyone should know in order to succeed.
To learn more about the dangers of trusting search results too much, read the following citations. Many have good examples of sites and searches that show the impact of overly influenced search results, and these could be readily adapted and used for training sessions.
1. Google’s decreasingly useful, spam-filled web search: www.marco.org/2617546197
2. Trouble In the House of Google: www.codinghorror.com/blog/2011/01/trouble-in-the-house-of-google.html
3. Why We Desperately Need a New (and Better) Google: http://techcrunch.com/2011/01/01/why-we-desperately-need-a-new-and-better-google-2
4. Dishwashers, and How Google Eats Its Own Tail: http://paul.kedrosky.com/archives/2009/12/dishwashers_dem.html
5. Content Farms: Why Media, Blogs & Google Should Be Worried: www.readwriteweb.com/archives/content_farms_impact.php
6. On the increasing uselessness of Google: http://broadstuff.com/archives/2370-On-the-increasing-uselessness-of-Google.html
7. Google’s “Gold Standard” Search Results Take Big Hit In New York Times Story: http://searchengineland.com/googles-gold-standard-results-take-hit-new-york-times-57081
8. How The “Focus On First” Helps Hide Google’s Relevancy Problems: http://searchengineland.com/focus-on-first-helps-hide-googles-relevancy-problems-50253
9. What Is Search Engine Spam? The Video Edition: http://searchengineland.com/what-is-search-engine-spam-the-video-edition-15202
10. Google’s Search Engine Optimization Starter Guide (32-page PDF): www.google.com/webmasters/docs/search-engine-optimization-starter-guide.pdf
11. Google’s Search Algorithm Has Been Ruined, Time To Move Back To Curation (GOOG): www.businessinsider.com/googles-search-algorithm-is-spinning-out-of-control-2011-1?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+typepad%2Falleyinsider%2Fsilicon_alley_insider+%28Silicon+Alley+Insider%29
12. Blekko Launches Spam Clock To Keep Pressure On Google: http://searchengineland.com/blekko-launches-spam-clock-to-keep-pressure-on-google-60634?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+searchengineland+%28Search+Engine+Land%3A+Google%2C+Bing%2C+SEO%2C+PPC%2C+SEM+%26+Search+Marketing+News%29
Contact Stephen at firstname.lastname@example.org.