We’re working on today’s batch of new taxpayer-funded patent applications. The FedInvent Report will be out later this week. In the meantime, we wanted to share some thoughts on federal innovation and, in some cases, the lack thereof.
If you have been reading FedInvent for a while, you may have heard about Mr. Tinfoil Hat, the accountant who shuns filing your taxes electronically. His advice is to use as little IRS tech as possible. File your taxes on paper.
Mr. Tinfoil Hat is a prominent IRS educator delivering required classes for IRS agents and private sector practitioners. His partner runs one of the top Masters of Tax graduate degree programs in the country. He is the go-to guy for big corporations with US and international tax policy questions. They both have private clients to keep their skills up by working with real people and real tax returns. They were early into the game of how to handle the tax matters for clients dabbling in cryptocurrency, among other new and emerging tax conundrums. These guys know their stuff. They don't trust IRS's technology.
This week Michelle Singletary wrote a column on the shortcomings of IRS technology. Her column focused on how the IRS can't get its scanning technology act together and automate the paper tax return process. There is a massive backlog of unprocessed tax returns, many filed on paper, that are languishing somewhere. Part of Ms. Singletary’s reporting quotes the IRS.
You can read Ms. Singletary's column here if the paywall behaves.
On March 30, 2022, Erin M. Collins, the IRS's national taxpayer advocate, published a post called "Getting Rid of the Kryptonite: The IRS Should Quickly Implement Scanning Technology to Process Paper Tax Returns."
The Taxpayer Advocate Service blog is the taxpayer's voice at the IRS. Ms. Collins's post says that "paper is the IRS's Kryptonite and the IRS is buried in it. The reason paper returns are so challenging is that the IRS still has not implemented technology to machine read them, so each digit on every paper return must be manually keystroked into IRS systems by an employee." The post adds that for a moderately complex return, several hundred digits may need to be transcribed. For longer returns with more forms and schedules, the number of digits may approach or exceed 1,000.
One thousand digits per return is not a lot of characters. Entering numeric data is called ten-key data entry. The operators use the numeric section of the keyboard. A skilled operator can enter 8,000 keystrokes per hour (KPH). It's not the keying that takes time. It's the handling of the paper documents. Early adopters of image-based data capture had to deal with grumpy ten-key data entry operators because the operators could key in the data faster than the imaging system could refresh their screen. Slow image display made them lose their data entry tempo.
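To put the keying numbers in perspective, here is a rough back-of-the-envelope sketch. It assumes one keystroke per digit and ignores Enter and Tab keystrokes and all paper handling. The function name and the 300-digit figure (our stand-in for "several hundred") are our own illustrative choices.

```python
def keying_minutes(digits: int, keystrokes_per_hour: int = 8_000) -> float:
    """Minutes of pure keying time, assuming one keystroke per digit."""
    return digits / keystrokes_per_hour * 60


# 300 is a hypothetical stand-in for "several hundred" digits;
# 1,000 is the upper figure cited in the Taxpayer Advocate post.
for digits in (300, 1_000):
    print(f"{digits} digits: about {keying_minutes(digits):.1f} minutes of keying")
```

Under those assumptions, even the 1,000-digit return works out to well under ten minutes of actual keying for a skilled operator, which is why the paper handling is the bottleneck.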
Other government agencies routinely capture massive amounts of data that far exceed 1,000 digits on IRS forms. Most of those agencies have already transitioned to using scanning and OCR technology for data capture. The paper stops at the scanner. All of the work is done digitally.
The IRS taxpayer advocate's blog post notes that differentiating between characters is a problem. "For example, a '1' and a '7' may look similar, so OCR may read the digit incorrectly." No, it's not a problem.
Automated data capture providers know how to build business rules into the data capture routines that can add things up and do quality checks to make sure they are collecting the right data. The IRS also has a massive repository of employer and financial institution data on income, withholding, and federal, state, and local taxes. The IRS repository includes 1099 data from your mortgage company and your bank. The IRS can use the employer's payroll data to verify the accuracy of the taxpayer-provided data. For example, if the taxpayer-provided W-2 salary amount is $71,117 and the OCR reads $77,117, the IRS employer salary amount can automatically override the OCR data. The system can then use that data to check for quality control on the OCR data. The IRS could use the scanned social security number to preload all expected data and match it against the taxpayer-provided data before an operator even sees any data.
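The W-2 example can be sketched as a simple business rule. This is a minimal illustration, not how any IRS or vendor system actually works: the function, the `FieldResult` type, and the confusable-digit table are all hypothetical, and the OCR output is assumed to be a digits-only string.

```python
from dataclasses import dataclass

# Digit pairs an OCR engine might confuse (illustrative, not exhaustive).
CONFUSABLE = {("1", "7"), ("7", "1")}


@dataclass
class FieldResult:
    value: int
    source: str  # "ocr", "employer_override", or "needs_review"


def reconcile_wages(ocr_digits: str, employer_wages: int) -> FieldResult:
    """Compare an OCR-read wage amount with the employer-reported amount.

    Assumes ocr_digits contains only the characters 0-9.
    """
    expected = str(employer_wages)
    if ocr_digits == expected:
        return FieldResult(employer_wages, "ocr")
    if len(ocr_digits) == len(expected):
        # Collect the positions where the OCR digit and expected digit differ.
        diffs = [(o, e) for o, e in zip(ocr_digits, expected) if o != e]
        # If every mismatch is a known look-alike pair, trust the employer data.
        if all(pair in CONFUSABLE for pair in diffs):
            return FieldResult(employer_wages, "employer_override")
    # Anything else goes to a human operator.
    return FieldResult(int(ocr_digits), "needs_review")


print(reconcile_wages("77117", 71117))
```

In this sketch only the mismatches that look like a plausible OCR confusion are overridden automatically. Everything else is routed to an operator, so a genuine discrepancy between the taxpayer and the employer is not silently papered over.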
It Can Be Done
The Census of Agriculture collects business and economic data from every farmer, rancher, beekeeper, landscape nursery, greenhouse operator, and fish farm in the US. The USDA's National Agricultural Statistics Service (NASS) outsources the data collection work to the Census Bureau. Census sends out a multipage form to collect data that is mostly numbers. The Ag Census is an economic census that's all about the numbers.
In 2002, the Census Bureau implemented scanning and optical character recognition technology to capture the data on the 39 million Ag Census forms for USDA NASS. The Bureau had four months to do this work. The 2002 forms had over 100,000 unique data fields. The accuracy requirement was above 99% at the character level, not the word level. Most of the data was numbers.
The Census Bureau, an organization with world-class data quality experts, overcame the "is it a 1 or is it a 7" issue by using the built-in capabilities of modern OCR data capture technology.
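Why does it matter that the requirement was set at the character level? A quick sketch shows how character accuracy compounds across a multi-digit field. It assumes character errors are independent, which is a simplification, and the function name and the five-digit example are ours.

```python
def field_accuracy(char_accuracy: float, digits_in_field: int) -> float:
    """Chance that every digit in a field is read correctly.

    Assumes each character is right or wrong independently of the others.
    """
    return char_accuracy ** digits_in_field


# A hypothetical five-digit dollar amount at several character accuracy levels.
for char_accuracy in (0.99, 0.995, 0.999):
    rate = field_accuracy(char_accuracy, 5)
    print(f"{char_accuracy:.1%} per character -> {rate:.2%} per five-digit field")
```

At exactly 99% per character, roughly one five-digit field in twenty would contain an error under this assumption. That gap is what the business rules and quality checks are there to close.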
When the Census Bureau was planning the 2002 Census of Agriculture, the Bureau had an organization called the Computer Assisted Survey Research Organization (CASRO). CASRO's mission was to assess new technology to improve the Bureau's census and survey data capture programs. The program manager responsible for assessing technology for the Economic Census and Ag Census programs wanted something new. CASRO decided to switch things up. They wanted something other than the usual suspects: the big government contractors who wanted a lot of money to present the Bureau with technical options. The team also felt that these companies weren't nimble enough to implement a solution without an eight-digit multi-million dollar contract.
CASRO asked a small specialized systems integration company focused on imaging and data capture to come to CASRO and give a pitch along with two other firms — KPMG and IBM. What could these little guys do? CASRO invited all three firms to provide a proposal and the timeframe to deliver a design and a technical solution that would enable the Bureau to use scanning and OCR for the 2002 Census of Agriculture.
The three proposals CASRO received were:
Small business: $130,000, with a solution design in three months.
KPMG: over $800,000, with nine months of interviews and analysis to get to a solution design.
IBM: over $1 million, with the analysis handed over in just over a year.
CASRO took a chance and hired the small business. When the contracting officer asked why CASRO was willing to take a chance on a small business, the answer was simple. If these little guys failed, CASRO could still hire KPMG and spend less money than the IBM proposal. In addition, if these guys knew their stuff, CASRO would have an extra 9-12 months to install the technology and get it ready for the Ag Census. It was the first step in building a state-of-the-art 2002 image-based data capture solution.
It worked. On January 1, 2002, the Census Bureau turned on its Ag Census automated data capture system and never looked back.
(Neither did we. The FedInvent team designed and implemented the Census Bureau’s 2002 Ag Census platform.)
Not Invented Here
The problem at the IRS is a severe case of Not Invented Here disease.
In 2021, the IRS published a request for proposal (RFP) for scanning technology. Our old technology partners called up and asked if we wanted to get the team back together and see if we could solve this problem. Sure. Why not. We were all sitting around the house, running out of puzzles and movies to watch. We took a look.
The IRS RFP was the most innovation-killing acquisition notice we've seen in a while. The RFP limited participation to companies that had already done work for the IRS. Their proposal would receive extra evaluation points when the IRS assessed their offer. Anyone who bids on federal contracts knows that a clause like this is a kiss of death for new companies seeking to make an offer to a federal agency they haven't worked for already. It’s a waste of corporate proposal resources to even try.
The IRS RFP didn't seek to solve a technical problem. Instead, it sought to bring back its preferred vendors to take another run at solving the data capture problem. There was no innovation going on here.
The IRS Has an FFRDC
Then there's the matter of all the money the IRS spends on innovation. The IRS has its own Federally Funded Research and Development Center, the Center for Enterprise Modernization (CEM). The MITRE Corporation operates this FFRDC.
Under the MITRE agreement, CEM acts as a decision partner, integrator, innovation partner, and systems engineer to meet the mission-critical needs of the Treasury, including the Internal Revenue Service (IRS), through fact-based, data-driven analysis, and decision support. "CEM supports Treasury goals including boosting economic growth, promoting financial stability, and achieving operational excellence."
According to Federal Compass and USA Spending, the MITRE Corporation FFRDC contract was awarded by the Treasury Department's Bureau of the Fiscal Service on April 15, 2002. (Cute.) To date, there have been $1.8B in obligations. The contract has a ceiling value of $1.8B, showing a 100% burn rate so far on the contract. The IRS received one offer to operate CEM. The current CEM contract will end on September 30, 2023.
Almost $1.8B in federal FFRDC money, 21 years of work, and no scanning solution?
The Treasury Department and the IRS know how to innovate. The IRS started looking into cryptocurrency transactions back in 2016 before most people even knew what Bitcoin and Ethereum were. The IRS hired Chainalysis to harvest data from both blockchains so they could use public data to track financial transactions happening on these blockchains. With this data, the IRS can track both legitimate transactions and the nefarious transactions of hackers and drug cartels.
The IRS also hired Palantir, the go-to data harvesting and open-source intelligence company for the Intel Community and DOD. Palantir used its data analysis tool to identify cryptocurrency patterns and connections in data available to the IRS, allowing the IRS to identify worthwhile investigations and advance existing ones. With each step, the IRS has increased the data that can be harvested, sharpened its employees' skills, and left fewer dark corners for crypto non-filers and non-reporters to hide. The person who unlocked the mystery of who was operating the Silk Road darknet marketplace was an IRS investigator. The IRS knows how to innovate when it wants to.
(We'll skip the Palantir intellectual property for now. Palantir has made most of its money working exclusively for the Federal government and received funding from In-Q-Tel, the CIA's venture funding entity. Palantir has hundreds of patents. Not a single one has a government interest statement.)
It is boring old-school technology to scan paper and use OCR to collect data and map it to a database. Building an image-based data capture and workflow process will not get you a job at a big government contractor or get you promoted at your agency. It won't look good on your LinkedIn if you're looking for a new job. It’s 25-year-old technology.
The places in the US where big federal scanning and data capture operations operate don't have a lot of five-star restaurants or sommeliers who can recommend a good pinot noir. The Census Bureau's National Processing Center is a massive facility on an old Army base in Jeffersonville, Indiana. To find a good steak and a nice glass of pinot noir, you have to drive to Louisville. Some of the data capture for big federal data capture applications happens inside the walls of low- and medium-security prisons, not outposts known for quality business travel.
Getting rid of the IRS's kryptonite paper won't be easy. But, if you pull it off, the reward will be having someone tell you it's about time.
Later today, I'll be taking my paper tax returns prepared by Mr. Tinfoil Hat to the post office and mailing them to the IRS. The only proof I'll have that the IRS received the returns will be digital proof of delivery from the Post Office and the presence of the cashed check on my digital bank statement. The IRS probably won't get around to processing the 2021 returns until the current paper backlog is cleared out, maybe in 2023.
As always, please let us know if you have questions about our data or our analysis. You can reach us at email@example.com.
We’re hoping to have the FedInvent Q1 Roundup soon.
Thanks for reading FedInvent. See you later this week.
FedInvent tells the stories of inventors, investigators, and innovators. Wayfinder Digital's FedInvent Project follows the federal innovation ecosphere, taxpayer money, and the inventions it pays for. FedInvent is a work in progress. Please reach out if you have questions or suggestions. You can reach us at firstname.lastname@example.org.