richardedwardh…
Last seen: 09/25/2018 - 10:54
Joined: 08/08/2018 - 11:12
Link usapplicationcitation to application table

I am using the bulk download data.  My understanding of the table "usapplicationcitation":

   patent_id is the id of the *citing* patent   

   application_id is the id of the *cited* application

I want to find the patent_id corresponding to the *cited* application. I thought the way to do this would be to use the bulk download table "application", but I can't find how to link these two tables. None of the application_id values I find in the "usapplicationcitation" table appear to be in the "application" table.

Here is an example for data from "usapplicationcitation":

   application_id: 2001/20010002464
         number: 20010002464
citingPatentIds: 7827066,8660877,7664685,RE44932,8364513,7603296,9098818,RE44793,RE46278,9092968,8400296,7917409,8321253

Where is this application_id in the table "application"?

 

PVTeam
Role: moderator
Last seen: 11/29/2024 - 15:02
Joined: 10/17/2017 - 10:47
RE: LINK USAPPLICATIONCITATION TO APPLICATION TABLE

Hello,

We understand that currently there are difficulties linking these two tables. It is in our queue to further investigate this issue.

Thanks,

PVTeam

Russ
Last seen: 12/04/2024 - 17:06
Joined: 11/14/2017 - 22:15
possibilities

The problem here is that the "number" field in usapplicationcitation may not be included as its own field in the bulk grant xml files that the patentsview files/tables are built from.  Take a look at the second citing patent in your example.  In patent.tsv the filename field for 8660877 is ipg140225.xml (link to the zip file at reedtech).  What is there for patent 8660877 is the serial number, application date and a whole lot more.

The serial number makes it into application.tsv as id (if you add a slash after the second character) and the application date makes it into application.tsv as date.  What we'd like to add is the document number, which is what other patents or patent applications use to reference a published application that has not yet become a granted patent (usapplicationcitation's application_id, or 20010002464 in your example).
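As a minimal sketch of the "add a slash after the second character" rule, here is a small Python helper. The function name is made up for illustration, and the full serial number 13685380 is an assumed value: only its last six digits, 685380, appear in this thread.

```python
def serial_to_application_id(serial_number: str) -> str:
    """Turn a grant-XML serial number into the application.tsv id form.

    Follows the rule described above: insert a slash after the
    second character. The name of this helper is hypothetical.
    """
    serial = serial_number.strip()
    if len(serial) < 3:
        raise ValueError(f"serial number too short: {serial_number!r}")
    return f"{serial[:2]}/{serial[2:]}"


# Assumed example value; only the trailing 685380 comes from the thread.
print(serial_to_application_id("13685380"))
```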

We can look up the document number for patent 8660877 online at http://appft.uspto.gov/netahtml/PTO/search-adv.html using the application date and the last 6 digits of the serial number in the xml file:  apn/685380 and apd/20121126 to find

United States Patent Application 20130150983
Kind Code A1
Mitchell; Clarence ; et al. June 13, 2013
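The search string used above can be assembled mechanically. This is only a sketch in Python: the helper name is hypothetical, the full serial number 13685380 is an assumed value (only 685380 is given here), and the hyphenated date format is an assumption about how the date may be stored locally.

```python
def build_appft_query(serial_number: str, application_date: str) -> str:
    """Build an 'apn/... and apd/...' search string.

    Uses the last 6 digits of the serial number and the application
    date as YYYYMMDD, as in the lookup described above.
    """
    digits = serial_number.replace("/", "").strip()
    date = application_date.replace("-", "").strip()
    if len(digits) < 6 or len(date) != 8:
        raise ValueError("need a serial of 6+ digits and a YYYYMMDD date")
    return f"apn/{digits[-6:]} and apd/{date}"


# Assumed inputs: a full serial number and a hyphenated date.
print(build_appft_query("13685380", "2012-11-26"))
```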

20130150983 is the document number other patents and patent applications would use to reference this patent application before it was issued as patent 8660877. 20130150983 is found in ipg140225.xml for patent 8660877 as us-related-documents.related-publication.document-id.doc-number. However, I don't know if all patents are like that, or if those fields are even present in all versions of the grant xml files (there are three distinct xml layouts, for the years 1976-2001, 2002-2004 and 2005 and up).
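For anyone who wants to check this on their own grant files, here is a rough Python sketch that reads that element path with the standard library. The XML fragment is a simplified stand-in I assembled from the values above, not a copy of ipg140225.xml, and the wrapper elements around us-related-documents are assumptions. It also only targets the newest layout, since I don't know whether the older layouts carry the field.

```python
import xml.etree.ElementTree as ET
from typing import List

# Simplified, assumed fragment; a real grant record has far more content.
SAMPLE_GRANT = """<us-patent-grant>
  <us-bibliographic-data-grant>
    <us-related-documents>
      <related-publication>
        <document-id>
          <country>US</country>
          <doc-number>20130150983</doc-number>
          <kind>A1</kind>
          <date>20130613</date>
        </document-id>
      </related-publication>
    </us-related-documents>
  </us-bibliographic-data-grant>
</us-patent-grant>"""


def related_publication_numbers(grant_xml: str) -> List[str]:
    """Return every related-publication doc-number found in one grant record."""
    root = ET.fromstring(grant_xml)
    path = ".//us-related-documents/related-publication/document-id/doc-number"
    return [el.text.strip() for el in root.findall(path) if el.text]


print(related_publication_numbers(SAMPLE_GRANT))
```

An empty list for a record means the field is absent there, which is exactly the case that would need checking across the different layouts.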

We can pull down the application xml from reedtech for June 13, 2013, the date that appft said that patent 8660877's application was published.  That xml has the serial number and document number for what eventually became patent 8660877 and every other application published on that date.  In this file the serial numbers and document numbers appear well defined. Hopefully these two fields are present in all the application xml layouts.

The document number would be a fabulous addition to the application table, which would then allow linking to usapplicationcitation application_id.  Further investigation would need to be done to see if document numbers could be pulled from the granted xml files already being processed.  If not, it would require processing all available application xml files just to pull out the serial number and document number. Or perhaps the uspto could be persuaded to provide a custom extract from appft?  That extract would have to be ongoing though, refreshed each quarter or however often the patentsview database is updated.

Potential Problems:

1) There are application citations in usapplicationcitation.tsv before 2001, but bulk application data is only available for applications published from 2001 onwards.
2) Not every patent application goes on to become a granted patent, yet other patents and patent applications can cite it.  

In your example, patent application 20010002464 never became a granted patent, yet as your example shows, 13 granted patents cite it.  (We can look up 20010002464 in appft to find its serial number and application date.  A search in patft using those fields, apn/037630 and apd/20010102, does not turn up a granted patent.)

So even if the document number were added, you would not always be able to equate a cited application to a patent number.
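To make that concrete, here is a hedged Python sketch of what the lookup could look like if a document number to patent mapping existed. The mapping, the function names and the citing id "HYPOTHETICAL_CITER" are all made up for illustration; only the two document numbers, patent 8660877 and citing patent 7827066 come from this thread. Taking the part of application_id after the slash as the document number is an assumption based on the single example row above.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical lookup that does not exist in the bulk data today.
doc_number_to_patent: Dict[str, str] = {"20130150983": "8660877"}

# (citing patent_id, application_id) pairs; the second row is invented.
citations: List[Tuple[str, str]] = [
    ("7827066", "2001/20010002464"),
    ("HYPOTHETICAL_CITER", "2013/20130150983"),
]


def document_number(application_id: str) -> str:
    """Take the part after the slash, e.g. '2001/20010002464' -> '20010002464'."""
    return application_id.split("/", 1)[-1]


def resolve_cited_patents(
    rows: List[Tuple[str, str]], lookup: Dict[str, str]
) -> List[Tuple[str, str, Optional[str]]]:
    """Attach the granted patent number to each citation, or None if there is none."""
    resolved = []
    for citing, application_id in rows:
        doc = document_number(application_id)
        resolved.append((citing, doc, lookup.get(doc)))
    return resolved


for row in resolve_cited_patents(citations, doc_number_to_patent):
    print(row)
```

The None in the first row is the point: a cited application that was never granted has no patent number to return, whatever fields get added.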

I hope this helps.
Russ
 

PVTeam
Role: moderator
Last seen: 11/29/2024 - 15:02
Joined: 10/17/2017 - 10:47
Re: Response to LINK USAPPLICATIONCITATION TO APPLICATION TABLE

Hi,

We are still in the process of figuring out this linking issue between the usapplicationcitation and application tables.

Russ – Thank you for sharing your findings on the ways to link the usapplicationcitation and application tables. This is something we are addressing by looking at the XML files, and we will add "document-number" to our list of requested add-on fields. We will follow up once we have fully developed a way to link the two tables together.

Thank you,

PVTeam