atharvag
Last seen: 10/21/2024 - 23:38
Joined: 10/15/2024 - 19:03
Uncertainty regarding entries in "citation_patent_id" column

I was wondering how the entry in the citation_patent_id column is decided, since a given patent_id usually cites more than one other patent. I also notice that when a patent_id occurs several times in the dataset (say, 4 times), the citation_patent_id sometimes differs between rows and sometimes does not. Why is that, and why is each entry restricted to a single value?

On a deeper level, I also found the following:

Selecting a random citation_patent_id entry from the above, say 9715899:
Ground-level checks:

  • This patent is cited by 33 patents (according to Google Patents), but it occurs in the dataset as a citation_patent_id only once.


Selecting a random patent that cites 9715899, say 11381412:

Ground-level checks:

  • Does 11381412 exist in our dataset as a patent? YES
  • Does this patent cite our 9715899? NO
  • Which patent does it cite? 8487996

Does 8487996 exist as a patent citation for our patent on Google Patents? NO
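
For anyone who wants to repeat these checks on the full file rather than by eye, here is a minimal sketch. It assumes each row of the dataset is one (patent_id, citation_patent_id) pair; that layout is my guess and not something I have confirmed. The function name and the toy ids (P1, C1, ...) are made up for illustration and are not real patents.

```python
from collections import defaultdict


def build_citation_index(rows):
    """Group citation rows in both directions.

    Assumes every row is a dict with the keys "patent_id" (the citing
    patent) and "citation_patent_id" (the cited patent).
    """
    cites = defaultdict(set)     # citing patent -> set of patents it cites
    cited_by = defaultdict(set)  # cited patent  -> set of patents citing it
    for row in rows:
        citing = row["patent_id"]
        cited = row["citation_patent_id"]
        cites[citing].add(cited)
        cited_by[cited].add(citing)
    return cites, cited_by


# Toy rows with made-up ids, only to show the shape of the check.
rows = [
    {"patent_id": "P1", "citation_patent_id": "C1"},
    {"patent_id": "P1", "citation_patent_id": "C2"},
    {"patent_id": "P1", "citation_patent_id": "C2"},  # repeated pair
    {"patent_id": "P2", "citation_patent_id": "C1"},
]

cites, cited_by = build_citation_index(rows)

# Which patents does P1 cite, according to the rows?
print(sorted(cites["P1"]))

# How many distinct patents cite C1, according to the rows?
print(len(cited_by["C1"]))
```

On the real file the rows could come from `csv.DictReader`, and the counts for 9715899 and 11381412 could then be compared against what Google Patents shows.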