Benutzer:Mathias Schindler/WDI

Like every other decent open source/free software/free content project, Wikipedia strives for World Domination. So far, we are doing pretty well. Unfortunately, there is no easy way of telling exactly how well.

The World Domination Index (WDI) is a proposal for an instrument that gives us a precise figure for the current world domination of the Wikipedia project. For the time being, the index is calculated as the average rank of the first Wikipedia hit when querying Google for a set of terms taken from the headlines of major US newspapers. As the US, the media and Google dominate the world at the same time, it is logical to use this information for calculating our score. World domination is achieved when Wikipedia has a WDI of 1.

Example

USA Today is the largest newspaper (in a broader sense) in the US. The current headlines are:

Bush sending delegation to Lebanon
President Bush announced Saturday that he is sending a delegation to Lebanon to meet with Lebanese Prime Minister Fuad Siniora
Bush praises Afghan, Pakistani allies
President Bush, playing middleman next week between sometimes-sparring neighbors Afghanistan and Pakistan, praised the U.S. allies
Bush unaware of Musharraf's contention
President Bush said Friday that if a U.S. official tried to strong-arm Pakistan into fighting the war on terror after the Sept.

The top words are:

  1. Bush
  2. delegation
  3. Lebanon
  4. Afghan
  5. Pakistani
  6. allies
  7. Musharraf
  8. contention

If you google these terms, the first Wikipedia hit appears at the following ranks:

  • Bush: 4 (en.wikipedia.org/wiki/George_W._Bush)
  • delegation: 3 (en.wikipedia.org/wiki/Delegation)
  • Lebanon: 1 (en.wikipedia.org/wiki/Lebanon)
  • Afghan: 7 (en.wikipedia.org/wiki/Afghan)
  • Pakistani: 1 (en.wikipedia.org/wiki/Pakistan)
  • allies: 1 (en.wikipedia.org/wiki/Allies)
  • Musharraf: 1 (en.wikipedia.org/wiki/Pervez_Musharraf)
  • contention: 2 (en.wikipedia.org/wiki/Contention)

As an unweighted average of these ranks, the current WDI would be 20 / 8 = 2.5. Way to go.
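The arithmetic behind that figure can be sketched in a few lines of Python. This is only an illustration of the unweighted average; the function name `wdi` and the dictionary layout are assumptions made for this sketch, and the ranks are copied from the list above.

```python
from statistics import mean

# Rank of the first Wikipedia hit per term, copied from the example list.
ranks = {
    "Bush": 4,
    "delegation": 3,
    "Lebanon": 1,
    "Afghan": 7,
    "Pakistani": 1,
    "allies": 1,
    "Musharraf": 1,
    "contention": 2,
}


def wdi(ranks_by_term):
    """Unweighted WDI: arithmetic mean of the first-Wikipedia-hit ranks."""
    if not ranks_by_term:
        raise ValueError("need at least one term")
    return mean(ranks_by_term.values())


print(wdi(ranks))  # 2.5
```

A weighted variant (for example, weighting each term by how often it appears in the headlines) would only need a different averaging step.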

Of course, there are hundreds of possible modifications. The WDI could be calculated for certain topics (US, politics, World, Finance). You have to deal with stemming or word families (the English language makes this pretty easy, compared to, hmm, Finnish). You can plot the index and add trends to it. In the long run, it would be fun to see whether we first achieve feature saturation or world domination itself. You are of course invited to help with both.

Task

  1. Collect headlines from the RSS feeds of the largest newspapers (Appendix 1).
  2. Find the most important words of the day.
  3. Google them and write down the rank of the first Wikipedia hit.
  4. Calculate the WDI.
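The steps above could be wired together roughly as follows. This is a minimal sketch under several assumptions: "most important" is approximated by plain word frequency with a tiny, hypothetical stop-word list, and the search step is left as a caller-supplied function `lookup_rank` (hypothetical), because the proposal does not name a search API. Fetching the RSS feeds is omitted; the headlines are passed in as plain strings.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real one would be longer.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to"}


def top_words(headlines, n=8):
    """Step 2: return the n most frequent non-stop-words in the headlines."""
    counts = Counter()
    for headline in headlines:
        # Drop the possessive 's, then keep alphabetic tokens only.
        cleaned = re.sub(r"'s\b", "", headline)
        for token in re.findall(r"[A-Za-z]+", cleaned):
            if token.lower() not in STOP_WORDS:
                counts[token] += 1
    return [word for word, _ in counts.most_common(n)]


def compute_wdi(headlines, lookup_rank, n=8):
    """Steps 2 to 4: pick terms, look up their ranks, average them.

    lookup_rank(term) is supplied by the caller and returns the rank of the
    first Wikipedia hit for that term, or None if no hit was found.
    """
    ranks = [lookup_rank(word) for word in top_words(headlines, n)]
    found = [rank for rank in ranks if rank is not None]
    if not found:
        raise ValueError("no Wikipedia hit for any term")
    return sum(found) / len(found)


headlines = [
    "Bush sending delegation to Lebanon",
    "Bush praises Afghan, Pakistani allies",
    "Bush unaware of Musharraf's contention",
]
print(top_words(headlines))
```

Note that a pure frequency count also picks up verbs such as "sending", which the hand-made word list in the example leaves out, so the term selection would need refinement before the numbers are comparable.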


Appendix 1

  1. USA Today 2,528,437 http://asp.usatoday.com/marketing/rss/index.aspx?POE=FOOTER
  2. The Wall Street Journal 2,058,342 http://online.wsj.com/public/page/0_0813.html
  3. The New York Times 1,683,855 http://www.nytimes.com/services/xml/rss/index.html
  4. Los Angeles Times 1,231,318 http://www.latimes.com/services/site/la-rssinfopage,0,5039586.htmlstory?coll=la-navigation&track=leftnav-rss
  5. The Washington Post 960,684 http://www.washingtonpost.com/wp-dyn/rss/index.html
  6. Chicago Tribune 957,212 http://www.chicagotribune.com/services/site/chi-rsspromo-htmlstory,0,4885820.htmlstory
  7. New York Daily News 795,153 http://www.nydailynews.com/front/index.rss
  8. The Philadelphia Inquirer 705,965 http://www.philly.com/mld/inquirer/news/front/rss.xml
  9. Denver Post 704,806 http://www.denverpost.com/newshound/ci_2882891
  10. Houston Chronicle 692,557 http://www.chron.com/news/rss/
  11. New York Post 691,420 http://www.nypost.com/rss/
  12. Detroit News 669,315 http://info.detnews.com/rss/index.cfm
  13. Dallas Morning News 649,709 http://www.dallasnews.com/rss/
  14. Minneapolis Star Tribune 606,698 http://www.startribune.com/776/story/58027.html
  15. Boston Globe 604,068 http://www.boston.com/tools/rss/?p1=Foot_RSS-B
  16. The Newark Star-Ledger 599,628 http://www.nj.com/rss/
  17. Atlanta Journal Constitution 561,405 http://www.ajc.com/services/content/services/rss/index.html
  18. The Arizona Republic 556,465 http://www.azcentral.com/rss/feeds/republicfront.xml