Explaining County and State Data Discrepancies
ECDH Director Melissa Lyon elaborates on the differences and struggles with real-time data reporting
Melissa Lyon, director of the Erie County Department of Health (ECDH), joined Erie County Executive Kathy Dahlkemper via Zoom today, helping to explain the ECDH's methods for data collection and the unique challenges the department is facing during the COVID-19 pandemic.
Lyon began by discussing some of the data discrepancies between state and county numbers. "There seems to be distress and misunderstanding of how public health data occurs and is reported," she explained. She went on to compare what the health department normally does versus how they proceed in a pandemic. Normally, data collection – for things such as vital statistics or cancer registries – happens over a longer period of time and is entered in a database. It's then reviewed or "scrubbed" for any errors, duplications, and outliers. Then, it's used to map trends and graphs over that given time.
"Rarely is this ever done in real-time," Lyon noted. She pointed to influenza reports, which are compiled in weekly batches on a one-week delay, while data on things like STDs are compared annually, biannually, or monthly.
She weighed this time-intensive process against the public's voracious appetite for a wide scope of real-time COVID-19 statistics. "Everyone wants this data today. In all honesty, the public health data system, their analysis and reporting system – it's just not designed to do this." The county uses the Disease Case Investigation System, she noted.
"We've been doing our best to provide that data that was meaningful to the severity of the COVID-19 pandemic for our community, but by doing so, we've almost set ourselves up for criticism, and that our numbers are wrong, or made up, or useless because they might contain errors."
She warned that by its nature, real-time data may have errors, or not match other data sources, as everything that's coming in is raw data. "Raw data has value, but it takes time to turn raw data into meaningful data," Lyon noted. "And that time is something that unfortunately we don't have during a pandemic," she said.
Lyon then explained why the county data does not match the state data every day. ECDH uses a different reporting timeframe than the PA Department of Health. Both are 24-hour timeframes, but the county records data from 3 p.m. to 3 p.m. and reports it at 9 a.m. the next day, while the state collects data from midnight to midnight and reports it at noon. ECDH uses its timeframe because it better suits the department's workflow and workload.
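To see how two 24-hour windows can produce different daily counts from the same cases, consider the minimal sketch below. It is only an illustration, not ECDH's or the state's actual system: the timestamps and dates are invented, the function names are hypothetical, and it assumes a case logged at exactly 3 p.m. falls into the following county window.

```python
from collections import Counter
from datetime import date, datetime, time, timedelta

COUNTY_CUTOFF = time(15, 0)  # 3 p.m. county cutoff, as described in the article


def county_window_end(ts: datetime) -> date:
    """Date on which the 3 p.m.-to-3 p.m. window containing ts closes.

    Assumption: a case logged at exactly 3 p.m. belongs to the next window.
    """
    if ts.time() < COUNTY_CUTOFF:
        return ts.date()
    return ts.date() + timedelta(days=1)


def state_window_day(ts: datetime) -> date:
    """Calendar day of the midnight-to-midnight window containing ts."""
    return ts.date()


# Invented case timestamps, for illustration only.
cases = [
    datetime(2020, 4, 20, 10, 0),
    datetime(2020, 4, 20, 16, 30),
    datetime(2020, 4, 20, 22, 15),
    datetime(2020, 4, 21, 9, 0),
    datetime(2020, 4, 21, 14, 59),
]

# Tally the same five cases under each agency's window.
county_counts = Counter(county_window_end(ts) for ts in cases)
state_counts = Counter(state_window_day(ts) for ts in cases)

for day in sorted(set(county_counts) | set(state_counts)):
    print(day, "county:", county_counts[day], "state:", state_counts[day])
```

In this made-up example, the daily figures disagree on both days even though each tally covers the same five cases, which is the kind of gap that closes once the totals are compared over a longer span.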
Later in her statements, she asserted that state and county data eventually match up after being thoroughly reviewed and scrubbed. This has been evident even in real time, as minor discrepancies are generally resolved within a few days. She did not endorse either source for media outlets to report from, calling it a choice.
Lyon further promised that ECDH will review its process and make adjustments and improvements as necessary. She concluded her statements with an assurance: "I want you to know and trust that the Erie County Department of Health is reporting data that is accurate and reliable based on our workflow and our workday."
Nick Warren admits to his own personal voracity for real-time data and applauds the difficult work that the Erie County Department of Health is doing amidst this unprecedented crisis. He can be contacted at nick@eriereader.com.