24 January 2008

Fischer's Keywords

I'm a big fan of Google News. By a fortunate chain of events, I happened to be browsing it just as the reports of Fischer's death started popping up. Before the first report came through (I think it was BBC News) there were a little more than 40 items matching 'chess Fischer'. Today there are 'about 1,526'. How do you sort through 1,500 stories to find those that are interesting? What can you learn from this mountain of information?

One trick is to count how often certain keywords are repeated across the stories. For example, a search on 'chess fischer Iceland' returns 'about 1,241' stories; 'chess fischer Spassky' returns 'about 1,162'; and 'chess fischer grandmaster' returns 'about 800'. [For the rest of this post and in the interest of readability, I'll leave off both 'chess fischer' when mentioning a search, and 'about' for the count of results. The meaning of 'grandmaster' (800) should be obvious.]
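For anyone who would rather run the same kind of count on a saved set of stories than on a search engine, here is a minimal sketch in Python. It assumes the story texts are already in a list; the function name `count_stories` and the three sample sentences are made up for illustration, and a plain substring match is only a rough stand-in for whatever matching Google News actually does.

```python
def count_stories(stories, *terms):
    """Count the stories that contain every term, ignoring case."""
    lowered_terms = [term.lower() for term in terms]
    return sum(
        all(term in story.lower() for term in lowered_terms)
        for story in stories
    )


# Hypothetical sample texts, not real news stories.
sample_stories = [
    "Chess legend Bobby Fischer died in Iceland.",
    "Fischer beat Spassky in Reykjavik, Iceland, to win the chess title.",
    "Chess grandmaster Fischer dies at 64.",
]

# Every search shares the base terms 'chess fischer', as in the post.
base = ("chess", "fischer")
for keyword in ("Iceland", "Spassky", "grandmaster"):
    count = count_stories(sample_stories, *base, keyword)
    print(f"'{keyword}' ({count})")
```

Because the match is on substrings, a term like 'zonal' would also be counted inside 'interzonal'; matching on whole words would need a regular expression or a tokenizer instead.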

Since the stories are mainly obituaries -- 'died' (1314) and 'kidney failure' (145, although the cause of death has not been confirmed) -- I expect Fischer's residences to be well represented. This works for his childhood: 'Chicago' (453) and 'Brooklyn' (534); fails for his early adult life: 'Pasadena' (36) and 'Los Angeles' (31); and works again for his years in exile: 'Philippines' (125) and 'Japan' (290). The popularity of Japan undoubtedly has a lot to do with his arrest at 'Narita' (101).

His family and friends are less well represented: mother 'Regina' (22), father 'Gerhardt' (9) or 'Gerhard' (1), sister 'Joan' (28) and her married name 'Targ' (1), biological father 'Nemenyi' (6), teacher 'Nigro' (6), trainer and friend 'Collins' (10), and biographer 'Brady' (21). The 'Church of God' (12) and 'Armstrong' (7) receive some notice, while 'Watai' (83) shows that people were paying attention as Fischer's life neared its end.

The public perception of his personality centers on 'genius' (501), 'anti-Semitic' (208) / 'anti-Semite' (30), 'reclusive' (185), 'tragic' (33) / 'tragedy' (125), 'prodigy' (111), 'legend' (88), 'crazy' (46), and 'paranoid' (31) / 'paranoia' (42). The trigger phrases 'mental illness' (8), 'deranged' (6), and 'racist' (5) are less popular.

Besides Spassky, the only other chess players who figure significantly are 'Karpov' (231) and 'Kasparov' (223). 'Anand' (36) gets noticed for the recent possibility of a match, while 'Petrosian' (20), 'Larsen' (12), and 'Taimanov' (13) barely make a dent. They still do better than 'interzonal' (10) and 'zonal' (1), where the single match for 'zonal' in fact appears as part of 'Inter-Zonal'.

Of Fischer's other accomplishments, 'random chess' (78) is better known than 'chess960' (17), while 'fischer clock' (16) is largely overlooked. As for 'Hollywood' (15), is this about the inevitable movie documenting the story of his life?


sdann said...

You seem to be one of the only people documenting such searches. I've done about 50 Google Alerts on Fischer during the last week, seeking interesting headlines and blog comments, and received noteworthy results, and non-Fischer chess info purely by accident. How can all this info be put into a useful consensus? My usual tech circle has other ideas. I think it is worthwhile because of Fischer's role. What is your take?

Stephen Dann

Mark Weeks said...

Stephen - All good questions. I only covered a few of the points I wanted to make with this particular exercise and will come back to it in a few days. I'll tackle your questions at the same time. - Mark