Promising Practices in SNAP PER Reduction: Data-Driven Preauthorization Reviews
By Sarah Esty (Aspen Institute Financial Security Program) and Eric Giannella (Georgetown University Better Government Lab)
To assist states working to rapidly reduce SNAP payment error rates (PER) to avoid new cost-share requirements, the Safety Net Response Network has been convening monthly meetings of state data practitioners for peer learning around successful error reduction strategies. Through these meetings, as well as research led by data experts at Georgetown and Yale, promising practices are emerging. While we will refine and validate these practices in the coming months as more data becomes available, given the urgency of the moment for states, we are sharing them now so states can begin integrating them into their practices while there is still time to influence the PERs that will determine FY27 cost sharing. We welcome feedback or suggestions about other ideas to include.
The top promising practices include:
- Separate models, focused on material errors: consider different models for overpayments, underpayments, and ineligible cases; focus all models only on errors over $58
- Focus on model precision: the goal should be dollars of errors prevented per minute of review time. Given limited review time, most states will want to focus on prioritizing cases with a high likelihood of correctable, high-dollar errors.
- Sequencing reviews: be thoughtful about where in your process to deploy your data-driven reviews. Some states have added an additional pre-certification step (ahead of QA), pending available staff / capacity to complete reviews in a timely fashion; other states do rapid QA of certified cases to catch errors before they persist for long.
- Feature engineering: Given the small number of errors to model, variables need to be as informative as possible. Spend time exploring all the data you have, as well as how to capture more data in each variable, such as scaling by household size. Be sure to explore system data and operational data, such as how long RFIs are outstanding, or number of system calculations run.
- Machine learning-based prioritization criteria: Our analyses have been able to produce new compound criteria with multiple variables (e.g., number of months where expenses > income & has children) that have not emerged from simpler, threshold-based approaches. These can be found through a variety of tree-based models, but regression trees are the easiest to use and interpret.
- Exclusion criteria: Identifying and removing low-error likelihood cases from review queues to focus resources on highest-risk cases. For example, a low benefit amount to max allotment ratio means that a larger amount of income has been reported, meaning that omissions large enough to cause an error are less likely.
- Targeted reviews: Lighter review of more cases, focused on reviewing higher-risk case elements, instead of deeper review of fewer cases.
1. Model Design
Try modeling overpayments, underpayments, and ineligible cases separately to determine if different factors are predictive. If you find better predictive power with separate models, you can also use the information about which model flagged the case to specify which fields need more attention during the review. For example, one state found that the key drivers of their overpayment model were: Total Net Non-Exempt Income; Household Total Gross Earned Income; Shelter Deduction Gap; Medical Expenses; Child Care Expenses; and Maximum Days a task has been sitting in a queue. But their underpayment model had different key variables: Total Net Non-Exempt Income and Shelter Deduction Gap. These differences could allow the team to focus their reviews on different parts of flagged files.
Make sure your model is built only on errors over $58, because smaller errors that don’t affect PER may have different predictors. For example, some states have had issues with not accounting for the annual SSI cost of living adjustment on cases, but the dollar amount of variance for those was under the threshold, so finding and fixing those errors was ineffective for PER reduction. Certain categories of deductions can be distractions as they produce a high volume of small, uncounted, or less important errors.
Depending on capacity, you might want to focus even more closely on the highest magnitude errors. For example, one state found that just 10 cases drove over 1/7th of their PER, enough to make the difference between falling into a lower cost bucket or not. Another state found that the top 10% of highest-error cases contained 30% of the dollar amount of the errors, making finding those cases the very highest priority. Many states are therefore building models that predict the dollar amount of error, rather than a binary yes/no of being likely to have an error.
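To check how concentrated your own error dollars are, a minimal sketch like the one below can help. The function name, the sample amounts, and the choice to treat over- and underpayments by absolute value are illustrative assumptions, not any state's actual method; only the $58 threshold comes from the text above.

```python
MATERIAL_THRESHOLD = 58  # dollar threshold cited in this brief


def top_share(error_dollars, top_fraction=0.10, threshold=MATERIAL_THRESHOLD):
    """Share of material error dollars held by the largest `top_fraction` of error cases.

    Over- and underpayments are both counted by absolute value (an assumption).
    """
    amounts = sorted(
        (abs(x) for x in error_dollars if abs(x) > threshold), reverse=True
    )
    if not amounts:
        return 0.0
    k = max(1, round(len(amounts) * top_fraction))
    return sum(amounts[:k]) / sum(amounts)


# Hypothetical error amounts in dollars (negative = underpayment); not real QC data.
sample_errors = [900, -400, 250, 200, -150, 120, 100, 80, 70, 60, 40]
share = top_share(sample_errors)
print(f"Top 10% of material-error cases hold {share:.1%} of error dollars")
```

If the share is large in your data, that is one signal that a model predicting dollar amount may be more useful than a yes/no error model.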
2. Model Precision
If you have a fixed number of cases the team can review, work backward from that number to set your model’s precision (so that you are pulling the most valuable cases up to that amount). Or, create models that assume you can review a range of volumes, such as 3%, 6%, and 9% of cases. Additionally, if your existing review efforts are preventing timely issuance of benefits, consider setting a fixed review target based on your team’s average review caseload capacity and prioritize reviews within those limits. (You may also want to expand the number of cases the team can review by moving through cases more quickly; see Section 7, Targeted Reviews.)
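One way to work backward from capacity is to rank cases by model score, take as many as the team can review, and measure how many of those historically had errors. The sketch below does this for the 3%, 6%, and 9% review shares mentioned above. The scores, case IDs, and error flags are made up for illustration, and `precision_at_share` is a hypothetical helper name.

```python
def select_for_review(scores, capacity):
    """Return the IDs of the `capacity` highest-scoring cases."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:capacity]


def precision_at_share(scores, had_error, review_share):
    """Fraction of reviewed cases that had an error when reviewing `review_share` of cases."""
    capacity = max(1, round(len(scores) * review_share))
    picked = select_for_review(scores, capacity)
    return sum(had_error[c] for c in picked) / len(picked)


# Hypothetical data: 100 cases with descending risk scores and five known errors.
scores = {f"case-{i:03d}": (100 - i) / 100 for i in range(100)}
error_ids = {"case-000", "case-001", "case-003", "case-005", "case-008"}
had_error = {case_id: case_id in error_ids for case_id in scores}

results = {
    share: precision_at_share(scores, had_error, share)
    for share in (0.03, 0.06, 0.09)
}
for share, precision in results.items():
    print(f"review {share:.0%} of cases -> precision {precision:.2f}")
```

In this toy data precision falls as the review share grows, which is the usual trade-off: each additional block of reviews is less likely to contain an error than the one before it.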
3. Sequencing Reviews
States are also making different choices about where to use risk scoring or prioritized cases based on error likelihood. Some provide that data directly to caseworkers to nudge towards closer reviews of higher-risk cases before cases are finalized (and allow for lighter touch final readthroughs of lower-risk files).
Other states (with sufficient capacity) have added a new preauthorization review step. Typically, states conduct a Quality Assurance (QA) review to identify and fix errors in recently (re-)certified cases (before they become errors that get counted in PER), and a Quality Control (QC) process to measure the official error rate (which, after review and finalization by the federal QC team, determines new cost-share amounts). With OBBBA’s significant financial impacts attached to PER creating heightened focus on reducing errors, many states are also adding in a pre-QA review step: preauthorization review. These reviews are typically performed by supervisors or lead case managers, although sometimes by staff who have been trained in QA. Performed on cases before certification, they look to find and remediate errors before they occur (whether for new cases or renewals). Effective states are able to add this step into their existing workflows, automatically passing cases that are ready to be certified for review (if warranted by targeting criteria), and conducting rapid reviews (within less than a day) to avoid slowdowns and timeliness issues. To avoid causing delays, cases will proceed through the system if they are not reviewed within 24 hours, or will be flagged for another caseworker to review on an expedited basis.
In states with timeliness issues or more limited capacity, data-driven prioritization of which files the QA teams review can help maximize the number of errors caught early, without delaying processing. Files have already been certified by the time they reach the QA step, so expanding this effort is less likely to impact timeliness. And because all states already perform QA, better targeting of QA reviews toward the files likeliest to have the highest-dollar errors is a no-regrets move for every state.
4. Feature Engineering
Many of the best prioritization criteria capture more than one dimension of a household’s situation: rather than relying on a single attribute (like household size over 3), they summarize information across dimensions. For example, shelter cost divided by income might reveal someone living beyond their means who has failed to disclose a source of income, or whose shelter expenses were mis-entered.
There is also a lot of value in system-generated variables, like days pending at a step, number of handoffs between caseworkers, or times the budget was recalculated. These system values can indicate the complexity of a case (such as cases that required multiple requests for additional information or had the budget redone multiple times due to some challenging policy questions), and points in a case where potential errors might have been introduced (like large numbers of handoffs). Talk to your IT team about what you may be able to pull from unexpected sources like change logs.
Some rules we think might be useful:
| Feature Type | Category | Feature description and examples | Hypothesis for inclusion |
| --- | --- | --- | --- |
| Income & Deduction Changes | Volatility, Complexity | • Indicators for and extent of changes in income/deductions within the past 6/12 months • Can scale these by household size | Changes may signal benefits or reported values are likely to change again in the future. |
| Benefit Level | Complexity | • $ difference from maximum benefit • Ratio of Benefit/BenefitMax; between (0, 1] | Summarizes potential for missingness and potential size of error – see brief on getting more from this variable |
| LAM (Living Above Means) | Logical | • Sum of Earnings minus Total Shelter Costs • Could turn into a ratio and include earned income: shelter expenses / gross income | Cases with expenses higher than income may be missing income or overstating expenses |
| ESAP & Time Since Certification | Volatility, Complexity | • Indicator for Elderly Simplified Application Project • Number of months since last certification period | ESAP: low potential for income volatility. Time since certification may allow for more errors |
| Address Changes | Volatility | • Time since address was updated • Number of address updates in previous year | Location volatility could impact shelter costs, making it likely for benefits to be misallocated. |
| Case Composition | Complexity | • Multiple adult households vs. adult + children | More adults may indicate more income information |
| Outlier Values | Logical | • Shelter costs > 99th percentile for that ZIP | May indicate an error in information provided |
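A few of the ratio and gap features in the table can be derived from fields most eligibility systems already store. The sketch below is illustrative only: every field name (`benefit`, `max_allotment`, `gross_income`, and so on) and every dollar value is a hypothetical placeholder, so map them to your own system's variables.

```python
def engineer_features(case):
    """Derive a few compound features from one case record (a plain dict)."""
    household_size = max(case["household_size"], 1)
    gross_income = case["gross_income"]
    return {
        # Benefit Level: ratio to the maximum benefit and dollar gap from it
        "benefit_ratio": case["benefit"] / case["max_allotment"],
        "benefit_gap": case["max_allotment"] - case["benefit"],
        # Living Above Means: earnings minus shelter, and shelter as a share of income
        "earnings_minus_shelter": case["earned_income"] - case["shelter_costs"],
        "shelter_to_income": (
            case["shelter_costs"] / gross_income if gross_income > 0 else None
        ),
        # Scaling by household size
        "income_per_member": gross_income / household_size,
    }


# Hypothetical case record; the values are placeholders, not program figures.
example_case = {
    "benefit": 300,
    "max_allotment": 600,
    "household_size": 3,
    "gross_income": 1200,
    "earned_income": 900,
    "shelter_costs": 1500,
}
features = engineer_features(example_case)
print(features)
```

Returning `None` for the shelter-to-income ratio when income is zero is one design choice among several; zero-income cases may deserve their own indicator rather than a ratio.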
5. Machine Learning-Based Prioritization Criteria
Machine learning can also reveal more complex patterns that don’t have readily discernible reasons for the combination of factors or the thresholds (as the simpler rules and variables above do). Some of the best complex rules we’ve found are:
earned income > $[amount] &
uncapped benefit to max allotment ratio of > [amount] &
uncapped benefit to max allotment ratio of < [amount]
shelter expenses / HH size < $[amount]
unearned / HH size > [$ amount]
shelter expenses > [$ amount] and < [$ amount]
uncapped benefit to max allotment > [amount] &
uncapped benefit to max allotment > [amount]
net_income / FPL
shelter expenses > [$ amount] and < [$ amount]
uncapped benefit to max allotment > [amount] &
uncapped benefit to max allotment > [amount]
has children &
for at least X months: expenses > income
Non elderly, non-disabled &
Zero income &
expenses > $x [amount]
The “[amount]” values indicate where you will want to test your own data to determine relevant thresholds. (You can also reach out to us if you don’t have the needed in-house staffing or tools and would like thresholds estimated for your state using the 2023 and 2024 public QC data.) We hope these rules can be helpful to try out, whether you are a state with a sophisticated analytics team that can re-run our analyses from scratch, or you are a single person with Excel trying our rules with a few different dollar amounts to see how they work.
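Trying a rule with a few different thresholds can be as simple as counting how many cases each threshold flags and what share of those had an error. The sketch below does this for the "has children & for at least X months: expenses > income" rule. The field names and the eight sample cases are invented for illustration; in practice you would run this against your own historical cases with known outcomes.

```python
def flag(case, min_months):
    """Compound rule: has children AND expenses exceeded income for at least `min_months` months."""
    return case["has_children"] and case["months_expenses_over_income"] >= min_months


def evaluate(cases, min_months):
    """Return (number of cases flagged, share of flagged cases that had an error)."""
    flagged = [c for c in cases if flag(c, min_months)]
    if not flagged:
        return 0, 0.0
    hits = sum(c["had_error"] for c in flagged)
    return len(flagged), hits / len(flagged)


# Hypothetical historical cases: (has_children, months_expenses_over_income, had_error)
rows = [
    (True, 6, True), (True, 4, True), (True, 3, False), (True, 1, False),
    (False, 6, False), (True, 5, True), (False, 2, True), (True, 2, False),
]
cases = [
    {"has_children": h, "months_expenses_over_income": m, "had_error": e}
    for h, m, e in rows
]

for x in (2, 4, 6):
    n_flagged, hit_rate = evaluate(cases, x)
    print(f"X = {x}: {n_flagged} cases flagged, {hit_rate:.0%} had an error")
```

A higher threshold typically flags fewer cases with a higher hit rate, so the right value depends on how many reviews your team can absorb.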
6. Exclusion Criteria
Models can also be great at flagging very low risk cases, which allows states to exclude those, preserving limited QA or preauthorization resources for cases with higher error risk.
Examples of exclusion rules that have worked well:
- Cases with < $100 in benefit amount
- Cases where the benefit amount divided by the maximum allotment is less than 0.3 or 0.4
- Cases that have less than $X in total gross income (e.g., $400) AND less than $Y in earned income per HH member (e.g., $50)
- Cases that, given deductions, expenses, and income, would need an extremely large error before counting toward the PER. Could combine with whether there are any ABAWDs in the household. We describe how to create a variable that reflects these situations in this blog.
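The first three exclusion rules above can be expressed as a simple filter applied before cases enter the review queue. In this sketch the default thresholds are the example values from the list ($100, 0.3, $400, $50), the field names are hypothetical, and combining the rules with OR is an assumption; each rule should be validated separately on your own data before use.

```python
def is_excluded(case, min_benefit=100, max_ratio=0.3,
                max_gross=400, max_earned_per_member=50):
    """True if the case matches any of the example low-risk exclusion rules."""
    ratio = case["benefit"] / case["max_allotment"]
    earned_per_member = case["earned_income"] / max(case["household_size"], 1)
    return (
        case["benefit"] < min_benefit
        or ratio < max_ratio
        or (case["gross_income"] < max_gross
            and earned_per_member < max_earned_per_member)
    )


# Hypothetical cases; values are placeholders, not program figures.
cases = [
    {"id": "A", "benefit": 80, "max_allotment": 300, "gross_income": 1500,
     "earned_income": 1500, "household_size": 1},  # low benefit amount
    {"id": "B", "benefit": 500, "max_allotment": 600, "gross_income": 900,
     "earned_income": 900, "household_size": 2},   # no rule applies
    {"id": "C", "benefit": 150, "max_allotment": 600, "gross_income": 2000,
     "earned_income": 2000, "household_size": 4},  # low benefit-to-max ratio
    {"id": "D", "benefit": 290, "max_allotment": 300, "gross_income": 300,
     "earned_income": 0, "household_size": 1},     # low gross and earned income
]

review_queue = [c["id"] for c in cases if not is_excluded(c)]
print(review_queue)
```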
7. Targeted Reviews
In #2, we discussed optimizing which cases you pull if your review capacity is fixed. However, capacity depends on the choices states make about breadth versus depth in how their staff review cases, at both the preauthorization step (if they have one) and QA. We recommend favoring breadth over depth: reviewing more cases in a focused way (looking only at the fields likeliest to be tied to errors) instead of thoroughly reviewing all case elements (which could take an hour or more per case). For example, leading states are able to complete case reviews in under 5 minutes, allowing one person to review an average of 480 cases a week. A state that reviews one case an hour gets through only 40 cases a week. Even if the faster review misses half of the errors the deeper review would find, that reviewer will still catch about 6x as many errors as under the deep review approach (480 × 0.5 = 240, versus 40).
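The breadth-versus-depth arithmetic can be reproduced, and re-run with your own review times, in a few lines. This sketch assumes a 40-hour review week (consistent with the 480 and 40 cases per week cited above) and assumes the cases pulled for fast and deep review are equally likely to contain errors; both are simplifications.

```python
HOURS_PER_WEEK = 40  # assumption consistent with the weekly figures in the text


def weekly_cases(minutes_per_case, hours_per_week=HOURS_PER_WEEK):
    """Number of cases one reviewer can complete in a week."""
    return hours_per_week * 60 / minutes_per_case


fast_cases = weekly_cases(5)    # focused review of about 5 minutes
deep_cases = weekly_cases(60)   # full review of about an hour

fast_catch_rate = 0.5  # the fast review misses half of what a deep review would find
deep_catch_rate = 1.0

# Relative number of errors caught, assuming equal error prevalence in both queues
relative_yield = (fast_cases * fast_catch_rate) / (deep_cases * deep_catch_rate)
print(fast_cases, deep_cases, relative_yield)
```

Changing `fast_catch_rate` shows the break-even point: under these assumptions the fast review would need to miss more than 11 of every 12 errors before the deep review caught more per reviewer-week.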
Using your state’s data on the biggest error reasons can help tailor which case elements get that review. It can also help you decide whether to add or remove elements (by conducting a workload analysis that measures the marginal additional time to complete review of a certain content area, against the share of errors attributable to the issues that review would catch). See our case study for an example of how this has worked well.
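A workload analysis of this kind can be sketched as ranking case elements by the share of errors they account for per minute of review time, then filling a per-case time budget. The element names, minutes, and error shares below are hypothetical placeholders, and the greedy ranking is a simple heuristic rather than an optimal selection method.

```python
def plan_review(elements, minutes_budget):
    """Greedily pick case elements with the most error share per review minute."""
    ranked = sorted(
        elements, key=lambda e: e["error_share"] / e["minutes"], reverse=True
    )
    chosen, used, covered = [], 0.0, 0.0
    for element in ranked:
        if used + element["minutes"] <= minutes_budget:
            chosen.append(element["name"])
            used += element["minutes"]
            covered += element["error_share"]
    return chosen, used, covered


# Hypothetical review times and shares of error dollars by case element.
elements = [
    {"name": "income", "minutes": 2.0, "error_share": 0.45},
    {"name": "shelter", "minutes": 1.5, "error_share": 0.30},
    {"name": "medical", "minutes": 3.0, "error_share": 0.05},
    {"name": "household_composition", "minutes": 1.0, "error_share": 0.10},
]

chosen, used, covered = plan_review(elements, minutes_budget=5)
print(chosen, used, round(covered, 2))
```

Here the medical element is dropped because it adds three minutes of review for a small share of errors, which is the kind of add-or-remove decision the workload analysis is meant to inform.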
If your state is using separate models for different types of errors (like over versus underpayments), knowing which model produced the review flag can also help reviewers look at different fields that might be involved.
Even better than generalized prediction of which case elements to review is an individualized AI-powered review recommendation that looks at the case itself to identify potential discrepancies, miscalculations, missing or conflicting documentation, or misapplications of policy. Such tools can highlight exactly which fields need review on a case, the likely issue to look out for, and where to look in supporting documents like interview notes, data pull results, or paystubs.
