Some of the most useful datasets in modern science and public health were not collected by governments or hospitals. They were collected by regular people. Volunteer weather stations. Birdwatchers logging sightings into eBird. Cyclists uploading their morning commutes to Strava, where they feed its heatmaps. Period-tracking apps that quietly assembled some of the largest fertility datasets ever compiled.
The pattern is consistent. If you lower the friction of contributing one tiny, specific observation, and you make the aggregation public, the collective signal becomes more useful than any individual data point ever was.
Boogers might be the most underused candidate for this treatment.
The wastewater precedent
During the early waves of COVID-19, public health officials noticed something that surprised almost nobody who had been studying infectious disease for a long time: sewage is data. Measuring the viral load in a city's wastewater often gave earlier warning of a local outbreak than hospital case counts did, because sewage does not wait for people to feel sick enough to go to a clinic.
The lesson wasn't really about sewage. The lesson was that involuntary biological output from a large group of people, aggregated, can be a more honest signal than voluntary reporting.
Nasal mucus sits in roughly the same territory as wastewater, with one crucial difference: it is not anonymized by default. Sewage is pooled before anyone samples it, while a mucus observation starts out attached to one person. Every person produces it, handles it, and disposes of it. The friction to observe it is almost zero. The friction to report it is almost entirely social.
What a public booger dataset could actually show
This is deliberately speculative, because the experiment has not been properly run at scale. But consider the kinds of patterns you would expect to emerge from a large, anonymous, time-stamped dataset of mucus observations:
- Regional allergy seasons, mapped by actual onset of symptoms rather than official pollen counts.
- Wildfire smoke impact on upper airways, hour by hour.
- Post-holiday respiratory illness spikes, resolved at the neighborhood level.
- Office-return effects in cities where remote work had been the norm.
- Correlations between local air quality changes and subjective nasal complaints.
None of this would be diagnostic. All of it would be descriptive. Descriptive epidemiology is how public health historically got most of its useful insights, and it is worth trying on a signal we have been culturally conditioned to discard.
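As a rough illustration of what "descriptive" means here, the sketch below groups observations by region and day and reports how the average shifted from one day to the next. Everything in it is an assumption made for the example: the record shape, the 0 to 3 severity scale, the region names, and the sample values are invented, not drawn from any real dataset.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical records: (region, day, self-reported severity on a 0-3 scale).
# These values are invented for illustration only.
observations = [
    ("north", date(2024, 4, 1), 0),
    ("north", date(2024, 4, 1), 1),
    ("north", date(2024, 4, 2), 2),
    ("north", date(2024, 4, 2), 3),
    ("south", date(2024, 4, 1), 1),
    ("south", date(2024, 4, 2), 1),
]

def daily_means(obs):
    """Group observations by (region, day) and average the severity."""
    buckets = defaultdict(list)
    for region, day, severity in obs:
        buckets[(region, day)].append(severity)
    return {key: mean(values) for key, values in buckets.items()}

def day_over_day_change(means):
    """Change in mean severity versus the previous calendar day, per region."""
    changes = {}
    for (region, day), value in means.items():
        yesterday = date.fromordinal(day.toordinal() - 1)
        previous = means.get((region, yesterday))
        if previous is not None:  # skip days with no prior day to compare
            changes[(region, day)] = value - previous
    return changes

means = daily_means(observations)
changes = day_over_day_change(means)
for (region, day), delta in sorted(changes.items()):
    print(f"{region} {day}: {delta:+.2f}")
```

Nothing here diagnoses anyone. It only says that one region's average moved and another's did not, which is the whole point of a descriptive signal.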
The privacy question comes before the medical one
For a public booger dataset to be useful, two things have to be true. Enough people have to contribute, and none of those contributions can be traceable back to a specific person. The second is a largely solved problem for most citizen-science datasets. It is the part that people assume is hard and that turns out to be relatively easy.
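One common way to make contributions hard to trace, sketched here under stated assumptions, is to drop identifiers, coarsen location and time, and refuse to publish any bucket with too few reports. The field names (`user_id`, `lat`, `lon`, `timestamp`, `severity`), the one-decimal grid, and the threshold `K_MIN` are all hypothetical choices for this example, and small-count suppression on its own is not a formal privacy guarantee.

```python
from collections import defaultdict
from datetime import datetime

K_MIN = 3  # assumed threshold: buckets with fewer reports are not published

def coarsen_key(report):
    """Reduce a raw report to a coarse (lat, lon, day) bucket.

    One decimal place of latitude is roughly 11 km north to south.
    The user_id field is deliberately never read.
    """
    return (
        round(report["lat"], 1),
        round(report["lon"], 1),
        report["timestamp"].date(),  # keep the day, drop the time of day
    )

def publishable(reports, k_min=K_MIN):
    """Aggregate severities per bucket and suppress small buckets."""
    buckets = defaultdict(list)
    for report in reports:
        buckets[coarsen_key(report)].append(report["severity"])
    return {
        key: {"count": len(s), "mean_severity": sum(s) / len(s)}
        for key, s in buckets.items()
        if len(s) >= k_min
    }

# Invented sample reports: three close together, one isolated.
raw_reports = [
    {"user_id": "u1", "lat": 40.71, "lon": -74.01,
     "timestamp": datetime(2024, 4, 1, 7, 30), "severity": 1},
    {"user_id": "u2", "lat": 40.72, "lon": -74.02,
     "timestamp": datetime(2024, 4, 1, 9, 5), "severity": 2},
    {"user_id": "u3", "lat": 40.74, "lon": -73.99,
     "timestamp": datetime(2024, 4, 1, 22, 45), "severity": 3},
    {"user_id": "u4", "lat": 34.05, "lon": -118.24,
     "timestamp": datetime(2024, 4, 1, 8, 0), "severity": 2},
]

published = publishable(raw_reports)
```

The isolated report is withheld because a bucket of one is effectively a person. Raising `K_MIN` or widening the grid trades resolution for safety.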
The hard part is the first part. Getting people to contribute.
The best citizen-science datasets in history were collected by people who weren't thinking about the dataset. They were thinking about birds, or weather, or their own running pace. The data happened anyway.
Here the social stigma we have discussed elsewhere is doing real damage. Saying "here is what I observed about my nasal output this morning" is much more socially costly than saying "here is where I cycled this weekend" or "here is which bird I saw at my feeder," even though none of the three is, on any rational axis, more or less gross than the others.
A primitive version is already running
SnotShot, the app that shares its name with the domain you are reading this on, already runs a small version of this experiment. Users submit their scans to a global feed and a leaderboard. No names are attached. The aggregate data gets used for the same kind of fuzzy pattern-finding that any citizen-science project does — and also, because we are being honest about what actually motivates contributions, for a ranking game.
The ranking game is not incidental. It is how you solve the contribution problem. You give the contributor a reason to show up that has nothing to do with public health, and the public health signal collects itself in the background as a side effect. The bird was cute. The commute was fast. The booger was legendary. The dataset grew.
The underlying argument
There is a reasonable objection to all of this, which is: why bother? Isn't nasal health a minor category compared to the major categories we already track?
It is minor per observation. It is not minor per population. The upper respiratory tract is the first filter most airborne threats have to get through, and the variation in mucus is a cheap proxy for the variation in what people are breathing. A region whose collective mucus shifts noticeably over a weekend is probably a region whose air did too. That kind of cheap proxy is exactly the kind of signal citizen science has historically been best at harvesting.
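If someone wanted to check the proxy claim, a first pass could be as simple as correlating a region's daily average mucus score with its daily air quality index. The sketch below computes a plain Pearson correlation by hand. The two series are invented for illustration and are not results; the variable names are hypothetical, and a high correlation would only suggest the proxy is worth a closer look, not that air quality caused the change.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / sqrt(var_x * var_y)

# Invented values for one region over five days (illustration only).
daily_aqi = [40, 45, 120, 150, 60]
daily_mean_severity = [0.8, 0.9, 1.9, 2.2, 1.1]

r = pearson(daily_aqi, daily_mean_severity)
print(f"correlation between AQI and mean severity: {r:.2f}")
```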
Boogers are not going to replace anybody's existing monitoring. They might complement it, cheaply, at the edge, in the places where the existing monitoring is thinnest. That is a reasonable outcome for an experiment most people would not have agreed to run ten years ago.