Health Hacker

At the intersection of science, technology and health


Using Google to Track STD Trends

November 22nd, 2008

Google recently released Google Flu Trends through its non-profit arm, Google.org. The site tracks search terms that have been found to be good indicators of flu activity, and Google claims that it can estimate flu activity in a US state up to two weeks faster than traditional monitoring systems.

But what other symptoms do Google users want to learn about? The social taboos and anxiety surrounding STDs make Google a potentially widely-used source of information when it comes to uncovering the symptoms of these diseases.

Just how widely? Well, the idea for this post came to me recently when I was using Google’s Insights for Search to explore search trends on various topics. The amount of search information that Google has made public is pretty staggering, and I think that we’ve only begun to scratch the surface of its application and usefulness. In any event, running a query for the term “symptoms” in 2008 yielded the following Top Searches and Rising Searches:

The US Centers for Disease Control and Prevention (CDC) tracks national occurrences of syphilis, chlamydia, and gonorrhea (Trends in Sexually Transmitted Diseases). Let’s see how the CDC data compare to the Google data. In order to analyze a larger data set, we’ll look at the period from 2006 to the present.

Here are the results of a search for regional interest in the terms “Syphilis Symptoms”. [1]

The most current CDC statistics (from 2006) indicate that the following 10 states have the highest syphilis rates. Where a state also appears in Google’s top ten, its search rank follows in [brackets]. Where a state in the CDC’s top ten does not appear in Google’s top ten, its Search Volume Index follows in (parentheses). The Search Volume Index is a normalized value: the state in position 1 has a Search Volume Index of 100 (see the figure, above).

  1. Louisiana [3]
  2. Alabama [1]
  3. Georgia [4]
  4. Nevada (0)
  5. Maryland (89)
  6. California (77)
  7. Texas (89)
  8. Tennessee (78)
  9. New Mexico (0)
  10. Florida [9]

The Google analysis identifies four of the CDC’s top ten, including all of its top three. Of the remaining six states, two (Maryland and Texas) have enough relative search volume to place them one index point outside of the Google top ten, while two (California and Tennessee) still have a relatively high search index score. The last two (Nevada and New Mexico) are aberrations, with an index of 0. What if we restrict the search to 2006, the year covered by the CDC data?
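For anyone who wants to re-check the tally, here is a minimal sketch in Python. The numbers are transcribed from the syphilis list above; the tuple layout and the `summarize` helper are my own illustrative choices and not part of any Google or CDC tool.

```python
# CDC 2006 syphilis top ten, in CDC rank order, paired with the Google values
# quoted in the list above. "rank" = position in Google's top ten;
# "svi" = Search Volume Index for states outside Google's top ten.
SYPHILIS = [
    ("Louisiana", "rank", 3),
    ("Alabama", "rank", 1),
    ("Georgia", "rank", 4),
    ("Nevada", "svi", 0),
    ("Maryland", "svi", 89),
    ("California", "svi", 77),
    ("Texas", "svi", 89),
    ("Tennessee", "svi", 78),
    ("New Mexico", "svi", 0),
    ("Florida", "rank", 9),
]


def summarize(rows):
    """Split the CDC top-ten states by how they show up in the Google data."""
    # States that Google also ranks in its own top ten.
    in_top_ten = [state for state, kind, _ in rows if kind == "rank"]
    # States outside Google's top ten with no measurable index.
    no_signal = [state for state, kind, value in rows
                 if kind == "svi" and value == 0]
    # States outside Google's top ten, highest index first.
    outside = sorted(
        ((state, value) for state, kind, value in rows
         if kind == "svi" and value > 0),
        key=lambda pair: -pair[1],
    )
    return in_top_ten, outside, no_signal


in_top_ten, outside, no_signal = summarize(SYPHILIS)
print(len(in_top_ten), "of 10 in Google's top ten:", in_top_ten)
print("Outside the top ten, by index:", outside)
print("Index of 0:", no_signal)
```

The same helper works on any list laid out the same way, so it can be pointed at the other diseases as well.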

The correlation breaks down, with only New York, Florida, Texas, and California showing significant search volume. The 2006-only data is much less reliable, due to a number of factors, including internet penetration and Google’s search market share (estimated at 45% in June of 2006 and 69% in January of 2008).

Here’s what the rest of the data looks like, using the 2006-present period:

“Chlamydia Symptoms”

CDC’s Top Ten

  1. Alaska (0)
  2. Mississippi [1]
  3. South Carolina (76)
  4. New Mexico (0)
  5. Alabama [2]
  6. Hawaii (0)
  7. Georgia [6]
  8. Delaware (0)
  9. Tennessee (75)
  10. Illinois (74)

“Gonorrhea Symptoms”

CDC’s Top Ten

  1. Mississippi (0)
  2. South Carolina (0)
  3. Louisiana [1]
  4. Alabama (0)
  5. Georgia [2]
  6. North Carolina [4]
  7. Delaware (0)
  8. Missouri (79)
  9. Ohio (80)
  10. Tennessee (80)
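A quick way to compare the two lists above is to count, for each disease, how many of the CDC’s top ten land in Google’s top ten, how many fall outside it with a non-zero index, and how many register an index of 0. The sketch below does that in Python; the values are transcribed from the lists, and the `tally` function and its category names are just illustrative.

```python
# Each row: (state, Google top-ten rank or None, Search Volume Index or None),
# in CDC rank order, transcribed from the two lists above.
CHLAMYDIA = [
    ("Alaska", None, 0), ("Mississippi", 1, None),
    ("South Carolina", None, 76), ("New Mexico", None, 0),
    ("Alabama", 2, None), ("Hawaii", None, 0),
    ("Georgia", 6, None), ("Delaware", None, 0),
    ("Tennessee", None, 75), ("Illinois", None, 74),
]
GONORRHEA = [
    ("Mississippi", None, 0), ("South Carolina", None, 0),
    ("Louisiana", 1, None), ("Alabama", None, 0),
    ("Georgia", 2, None), ("North Carolina", 4, None),
    ("Delaware", None, 0), ("Missouri", None, 79),
    ("Ohio", None, 80), ("Tennessee", None, 80),
]


def tally(rows):
    """Count CDC top-ten states in each of three Google categories."""
    counts = {"in_google_top_ten": 0, "nonzero_index": 0, "zero_index": 0}
    for _state, google_rank, svi in rows:
        if google_rank is not None:
            counts["in_google_top_ten"] += 1
        elif svi > 0:
            counts["nonzero_index"] += 1
        else:
            counts["zero_index"] += 1
    return counts


for name, rows in (("chlamydia", CHLAMYDIA), ("gonorrhea", GONORRHEA)):
    print(name, tally(rows))
```

For both diseases the split comes out the same: three states in Google’s top ten, three outside it with an index in the 70s or 80s, and four with an index of 0.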

What does this all mean? Well, it’s a bit hard to tell right now, but people are already using Google more frequently to self-diagnose. In some cases, there are pretty glaring differences between the Search Volume Index and the CDC’s numbers (e.g., see Mississippi and South Carolina in the gonorrhea results). This could be an artifact of Google’s normalization algorithms, a function of internet access or cultural/educational differences in certain regions, or both. [2]
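To see how normalization alone could produce a 0 for a state with real disease burden, here is a rough sketch in Python. It is not Google’s algorithm, which isn’t spelled out here. It assumes simple scaling so that the top state gets 100, plus a hypothetical minimum-volume cutoff (`min_volume`) below which a state is reported as 0. The state names and volumes are made up purely for illustration.

```python
def search_volume_index(volume_by_state, min_volume=0):
    """Scale volumes so the largest becomes 100.

    Assumption, not Google's documented method: states whose volume is
    below min_volume are reported as 0 rather than scaled.
    """
    kept = {s: v for s, v in volume_by_state.items() if v >= min_volume}
    top = max(kept.values(), default=0)
    if top == 0:
        return {state: 0 for state in volume_by_state}
    return {
        state: round(100 * kept[state] / top) if state in kept else 0
        for state in volume_by_state
    }


# Hypothetical relative search volumes in arbitrary units (not real data).
example = {"State A": 400, "State B": 360, "State C": 40}
print(search_volume_index(example, min_volume=100))
```

Under these assumptions State C drops to 0 even though people there are still searching, which is one way a small or less-connected state could vanish from the map.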

The CDC’s 2006 report, released in November of 2007, is retrospective, and it will be interesting to see how these numbers change when the 2007 report is released.

[1] Note that I’ve applied the “Health” filter to the data in this article.  While this may change the composition of the “Top Ten” somewhat, due to the small differences between search volumes, the overall distribution on the map remains fairly constant when this filter is removed.

[2] One of the “breakout” terms listed in the first figure, above, is “clamidia”.  Apparently shellfish are also concerned about STDs.

Free Bonus Material:

Genital Herpes Symptoms

Genital Warts Symptoms

HIV Symptoms
