jcb1996
Last seen: 01/22/2020 - 10:35
Joined: 12/19/2019 - 09:41
patent_num_cited_by_us_patents

My results for patent_num_cited_by_us_patents differ from those provided by the API. Around 60% of the test cases I ran matched. The ones that didn't match were off in both directions, some above and some below the figure provided by the API.

Rather than using the API, I downloaded the uspatentcitation table. I noticed the patent_num_cited_by_us_patents field wasn't in any of the tables in the data dictionary. So, I created it using the following query:

SELECT citation_id as number,
       COUNT(DISTINCT patent_id) as citations
FROM uspatentcitation
GROUP BY citation_id;

It appears that the computation above differs from what the API is returning. How does the API generate that field?

Some examples:

Patent #: 4985270. API results: 41. Results from my query above: 37.

Patent #: 8592650. API results: 22. Results from my query above: 49.

Thank you,

Justin

 

Russ
Last seen: 01/26/2024 - 08:16
Joined: 11/14/2017 - 22:15
where clause?

Justin,

The source code is in projects under https://github.com/CSSIP-AIR. There's PatentsView-DB to create the database and PatentsView-API for the API itself, along with other projects for the Python wrapper, the online query tool, etc.

patent-field-specs.json in the API project shows that field coming from the patent table:

  "patent_num_cited_by_us_patents": {
    "entity_name": "patent",
    "column_name": "patent.num_times_cited_by_us_patents",
    "datatype": "int",
    "query": "y",
    "sort": "y"
  }

02_Patent.sql in the DB project shows the value is set with SQL similar to yours:

insert into `{{params.reporting_database}}`.`temp_num_times_cited_by_us_patents`
  (`patent_id`, `num_times_cited_by_us_patents`)
select
  `citation_id`, count(*)
from
  `{{params.raw_database}}`.`uspatentcitation`
where
  `citation_id` is not null and `citation_id` != ''
group by
  `citation_id`;

I didn't download the uspatentcitation table to check, but I'd guess you might need the where clause. Is the unzipped file 8.7 GB and/or is your row count 105,027,310? In another thread there was a problem with patent.tsv.zip, which was recently fixed. The API's counts match the USPTO's (ref/4985270) (ref/8592650 and ISD/1/1/1976->10/08/2019), so I'd assume the API is calculating the value properly.
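
One other difference worth checking: the DB project's query uses count(*), while yours uses COUNT(DISTINCT patent_id), so the two diverge wherever the same patent_id appears more than once for a citation_id. Here is a minimal sketch of how you could look for such rows. It uses Python's sqlite3 with a handful of made-up rows standing in for the real table (the data and the in-memory database are assumptions for illustration only; the real table is in MySQL and I haven't run this against it):

```python
import sqlite3

# Toy stand-in for uspatentcitation; these rows are invented for illustration.
rows = [
    ("100", "A"),   # patent 100 cites A
    ("100", "A"),   # same pair again: count(*) counts it, COUNT(DISTINCT) does not
    ("101", "A"),
    ("102", "B"),
    ("103", ""),    # empty citation_id, dropped by the where clause
    ("104", None),  # null citation_id, dropped by the where clause
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE uspatentcitation (patent_id TEXT, citation_id TEXT)")
conn.executemany("INSERT INTO uspatentcitation VALUES (?, ?)", rows)

# List every citation_id where the two counting methods disagree.
query = """
SELECT citation_id,
       COUNT(DISTINCT patent_id) AS distinct_citing,
       COUNT(*)                  AS all_rows
FROM uspatentcitation
WHERE citation_id IS NOT NULL AND citation_id != ''
GROUP BY citation_id
HAVING COUNT(DISTINCT patent_id) != COUNT(*)
ORDER BY citation_id
"""
mismatches = conn.execute(query).fetchall()
print(mismatches)
```

Duplicate rows can only push count(*) above COUNT(DISTINCT patent_id), so at best this would explain a case where your number is lower than the API's. It would not explain a case where yours is higher.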

Russ