Missing applications in the Pre-Grant Publications Data

Dear Community & PatentsView-Team,

I have a short question about the Pre-Grant Publications data download tables. I am interested in data on all patent applications filed with the USPTO, that is, the pre-grant applications.

However, when I match the PatentsView pre-grant dataset application.tsv (5,570,731 observations) against the original USPTO Patent Examination Research Dataset (Public PAIR) from 2019, using the application number as the matching variable, I see that around 10 million applications are missing from your dataset. Can someone tell me why these applications are missing from the pre-grant datasets? In other words, which conditions must an application fulfill to be included in the PatentsView dataset?
And a second question: why are there duplicate application numbers in the PatentsView dataset?

Thanks a lot in advance for your answers and help!

Best regards and stay healthy,

different date ranges

Hi Patrick,

The PatentsView file is built from the application XML that the USPTO makes available. It covers applications from 2001 on. The PAIR data seems to go back further. Also, part of the patent statute says that an inventor can elect to suppress publication if they are only applying for a patent in the US.

You can see this in action if you look at the granted_patent_crosswalk on the pre-grant download page. It does not have a document_number for every patent granted since 2001, which I assume can be attributed to suppressed applications.
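
If you want to check how much of the gap comes from the date range, one option is to count the unmatched PAIR applications by filing year. Below is a minimal Python sketch of that idea, not a tested pipeline. The column names application_number and filing_date, the helper names, and the sample rows are all assumptions made up for illustration, so adjust them to the actual file headers. It also strips slashes, commas, and leading zeros before comparing, on the assumption that the two sources may format application numbers differently.

```python
from collections import Counter


def normalize_app_number(raw):
    """Keep digits only and drop leading zeros, so '14/235142' matches '14235142'."""
    digits = "".join(ch for ch in str(raw) if ch.isdigit())
    return digits.lstrip("0")


def unmatched_by_year(pair_rows, pregrant_numbers):
    """Count PAIR applications that are absent from the pre-grant set, per filing year.

    pair_rows: iterable of dicts with 'application_number' and 'filing_date'
               (assumed to start with a four-digit year; column names are assumptions).
    pregrant_numbers: iterable of application numbers from the pre-grant table.
    """
    published = {normalize_app_number(n) for n in pregrant_numbers}
    counts = Counter()
    for row in pair_rows:
        if normalize_app_number(row["application_number"]) not in published:
            year = row["filing_date"][:4] or "unknown"
            counts[year] += 1
    return counts


# Made-up sample rows, only to show the shape of the input and output.
pair_rows = [
    {"application_number": "09000001", "filing_date": "1999-05-01"},
    {"application_number": "10000002", "filing_date": "2003-02-10"},
    {"application_number": "10000003", "filing_date": "2003-07-15"},
    {"application_number": "11000004", "filing_date": ""},
]
pregrant_numbers = ["10/000,003"]

missing = unmatched_by_year(pair_rows, pregrant_numbers)
for year, n in sorted(missing.items()):
    print(year, n)
```

With the real files, you could feed the rows from csv.DictReader into the same function. A large count before 2001 would point to the date range, while unmatched applications after 2001 would include, among other cases, the suppressed ones.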

The duplicate application numbers are in the source XML files. One example I checked, 14235142, is the application number for document numbers 20150082774 and 20150308313. If you look up those document numbers in AppFT, you'll see:

20150082774 WORK VEHICLE Appl. No.: 14/235142

20150308313 WORK VEHICLE Appl. No.: 14/235142
with Prior Publication:    
Document Identifier    Publication Date
US 20150082774 A1    March 26, 2015

The most-repeated application number that I found was 14/209480. Document 20170112793 lists 4 other prior publications, each with that same application number:
US 20140193495 A1    
US 20160136125 A2    
US 20160271093 A2    
US 20170014367 A2
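
To list these repeats yourself, you can group the document numbers by application number. Here is a small Python sketch that does so using the two examples above. The field names document_number and application_number are assumed to match the application.tsv headers, the function name is my own, and the last sample row is made up to show a non-repeated case.

```python
from collections import defaultdict


def documents_per_application(rows):
    """Return {application_number: [document_numbers]} for numbers published more than once."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["application_number"]].append(row["document_number"])
    return {app: docs for app, docs in groups.items() if len(docs) > 1}


rows = [
    # The two examples from this post.
    {"document_number": "20150082774", "application_number": "14235142"},
    {"document_number": "20150308313", "application_number": "14235142"},
    {"document_number": "20140193495", "application_number": "14209480"},
    {"document_number": "20160136125", "application_number": "14209480"},
    {"document_number": "20160271093", "application_number": "14209480"},
    {"document_number": "20170014367", "application_number": "14209480"},
    {"document_number": "20170112793", "application_number": "14209480"},
    # Hypothetical row with a single publication, for contrast.
    {"document_number": "00000000000", "application_number": "00000000"},
]

repeated = documents_per_application(rows)
for app, docs in repeated.items():
    print(app, len(docs), docs)
```

If you need one row per application for the match against PAIR, you would then have to pick one publication per group, for example the earliest one. Which one is appropriate depends on your analysis.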

Russ Allen