Did anyone get sick this weekend?
Determining the effectiveness of using social media posts for public health surveillance
Background: Recreational water illnesses receive less media attention than foodborne illnesses. Several pathogens are associated with ingesting surface water, including Giardia, Cryptosporidium, and Toxoplasma gondii. The use of technology for public health surveillance is also little known to the public, yet social media may offer insight into illnesses that are never reported to public health or medical professionals. Because social media is a popular outlet for individual expression, illness-related posts could represent a portion of these unreported cases.

Methods: Social media posts were identified using a variety of keywords, including symptoms of significant waterborne illnesses and terms associated with human and environmental contamination. Posts were collected from forums and popular social media platforms such as Reddit. The posts were then correlated with beach water quality data from a sampling site as geographically close to each case location as possible.

Results: Social media and water quality data collected from the Columbia River region were correlated. The correlation coefficient of 0.2335 indicates little to no correlation between social media posts and beach water quality data. Numerous limitations may have affected the correlation coefficient. Keywords associated with symptoms were more effective than other terms at retrieving quality threads and posts.

Conclusions: Correlating social media posts with water quality data in the Columbia River region did not yield statistically significant results. Manual gathering of social media data for public health surveillance was found to be inefficient and impractical. Further study is required to determine whether correlating posts about illness on social media with water quality data is an effective method of public health surveillance.
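The keyword screening and correlation steps described in the abstract could be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not say which correlation statistic was used, so Pearson's r is assumed here, and the keyword list, function names, and sample values are hypothetical placeholders rather than the study's actual terms or data.

```python
import math

# Hypothetical symptom keywords; the study's actual search terms are not listed.
SYMPTOM_KEYWORDS = ("diarrhea", "vomiting", "stomach cramps")


def matches_keywords(post, keywords=SYMPTOM_KEYWORDS):
    """Return True if the post text contains any keyword (case-insensitive)."""
    text = post.lower()
    return any(keyword in text for keyword in keywords)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences.

    Assumption: Pearson's r stands in for the unspecified statistic
    used in the study.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Square roots of the summed squared deviations for each series.
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (dev_x * dev_y)


if __name__ == "__main__":
    # Placeholder values for illustration only; not data from the study.
    weekly_post_counts = [0, 2, 1, 3, 1]
    weekly_water_quality = [40.0, 85.0, 120.0, 60.0, 95.0]
    print(round(pearson_r(weekly_post_counts, weekly_water_quality), 4))
```

A coefficient near 0, such as the 0.2335 reported above, would indicate that the two series move together only weakly, while values near 1 or -1 would indicate a strong linear relationship.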