Understanding and Forecasting Allergenic Pollen in the United States
Pollen is a common allergen that causes significant health impacts for approximately one third of the population of the United States. Diagnosis and treatment of pollen allergies can be improved with knowledge of airborne pollen concentrations. However, there is a lack of accurate pollen information to support allergy treatments. The goal of this dissertation is to describe and understand allergenic pollen in the atmosphere and to develop a forecast model of pollen for allergy management. Daily pollen concentrations from the National Allergy Bureau (NAB) are analyzed for the contiguous United States and available stations in Canada from 2003 to 2018. Pollen calendars are produced to provide clear, quantitative visualizations of pollen data by location and pollen taxon. The start of the tree and grass pollen seasons depends strongly on latitude, with earlier start dates at lower latitudes. Season duration is correlated with start date, such that locations with earlier start dates have longer seasons. To successfully manage allergy symptoms, it is important to know where, when, and how much pollen will be in the atmosphere. Models can provide this knowledge by forecasting pollen concentrations. A Random Forest machine learning model is developed to predict daily pollen concentrations from meteorological and vegetation conditions. Retrospective forecasts are evaluated for city and regional models for four pollen types (Quercus, Cupressaceae, Ambrosia, and Poaceae) in four locations. Data augmentation is investigated as a technique to improve model performance. Because of the limited spatiotemporal coverage of the NAB pollen data, data-augmented models are the best-performing models. A reanalysis data set is constructed from pollen observations and the data-augmented model predictions.
We developed two forecasts to aid allergy management: a 14-day forecast of the start of the pollen season, because many allergy medications take a couple of weeks to be fully effective; and a 1-3 day forecast of high pollen days, to allow allergy sufferers to minimize pollen exposure on a short-term basis. The temporal resolution of the pollen data is important in training the model and affects the model's skill. The Random Forest model trained on the reanalysis data set makes skillful forecasts for both start date and high pollen days.
- Atmospheric sciences