EkaterinaLevitskaya
Last seen: 11/16/2022 - 17:03
Joined: 06/02/2022 - 10:31
Inventor id 2009

Hello,

I'm working with a 2009 crosswalk file between author IDs and inventor IDs. The file is called "authorlink_uspto.tsv" and can be found here: https://databank.illinois.edu/datasets/IDB-4370459

The inventor_id in that file looks like this: 1922874

I tried using the "persistent_inventor_disambig.tsv" file, which is a crosswalk between the different inventor disambiguations, but the earliest version of the inventor_id there appears to be from 2017, and it is in a slightly different format, e.g. 4341225-2

Is there a version of the inventor_ids that goes back to 2009? If so, do you know how I can find it?

Thank you very much in advance, with much appreciation

EkaterinaLevitskaya
Last seen: 11/16/2022 - 17:03
Joined: 06/02/2022 - 10:31
As a follow-up on this, I believe I found a way to identify those inventors using their patents (listed in the uiuc_uspto.tsv file at the same link: https://databank.illinois.edu/datasets/IDB-4370459).

However, I still have one question. When I search for patent "5564249" (which comes from the 2009 inventor dataset) in the patent_inventor.tsv file, nothing comes up, but when I search for it in the rawinventor.tsv file, I get the information I need.

Is it better to use the rawinventor.tsv file to get the patent and inventor information?

I used patent_inventor.tsv for this before, but some information seems to be missing there that can be found in rawinventor.tsv.

Thank you very much in advance, with much appreciation

Russ
Last seen: 11/17/2022 - 08:29
Joined: 11/14/2017 - 22:15
disambiguated vs raw

I believe the difference comes down to whether you want the disambiguated inventors or the raw inventors (what was pulled from the USPTO source files). patent_inventor.tsv just associates inventors and their locations with patents; I'm not sure if it used to do more. You'd have to join with inventor.tsv on inventor_id (the "id" column in that file) to get the disambiguated names, and with locations.tsv on location_id if you want disambiguated locations.

I see three rows in patent_inventor.tsv for 5564249:
"patent_id"     "inventor_id"           "location_id"
"5564249"       "fl:al_ln:deleon-3"     "efbc3c39-cb90-11eb-9615-121df0c29c1e"
"5564249"       "fl:av_ln:zohar-2"      "ff94740d-cb8f-11eb-9615-121df0c29c1e"
"5564249"       "fl:ta_ln:borys-1"      "ff94740d-cb8f-11eb-9615-121df0c29c1e"

From inventor.tsv:
"id"                    "name_first"    "name_last"     "male_flag"     "attribution_status"
"fl:al_ln:deleon-3"     "Albert"        "Deleon"        1.0             1
"fl:av_ln:zohar-2"      "Avi"           "Zohar"         1.0             1
"fl:ta_ln:borys-1"      "Tadeusz"       "Borys"         1.0             1
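
For anyone who wants to do this join in pandas, here is a minimal sketch. It rebuilds the rows shown above in memory instead of reading the real files, and it assumes the join key is patent_inventor's "inventor_id" matched against inventor's "id", as the column headers above suggest. The variable names are just for illustration.

```python
import pandas as pd

# Rows copied from the tables above. With the real files you would load them
# with something like pd.read_csv("patent_inventor.tsv", sep="\t", dtype=str).
patent_inventor = pd.DataFrame({
    "patent_id": ["5564249", "5564249", "5564249"],
    "inventor_id": ["fl:al_ln:deleon-3", "fl:av_ln:zohar-2", "fl:ta_ln:borys-1"],
    "location_id": [
        "efbc3c39-cb90-11eb-9615-121df0c29c1e",
        "ff94740d-cb8f-11eb-9615-121df0c29c1e",
        "ff94740d-cb8f-11eb-9615-121df0c29c1e",
    ],
})
inventor = pd.DataFrame({
    "id": ["fl:al_ln:deleon-3", "fl:av_ln:zohar-2", "fl:ta_ln:borys-1"],
    "name_first": ["Albert", "Avi", "Tadeusz"],
    "name_last": ["Deleon", "Zohar", "Borys"],
})

# Left join keeps every patent/inventor row even if a name is missing.
named = patent_inventor.merge(
    inventor, left_on="inventor_id", right_on="id", how="left"
).drop(columns="id")

print(named[["patent_id", "name_first", "name_last"]])
```

A left join is used so that a patent row without a matching inventor record would still show up (with empty name columns) rather than silently disappearing.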

I am an alumnus of U of I and am hoping this helps!  ILL-INI
Russ Allen

EkaterinaLevitskaya
Last seen: 11/16/2022 - 17:03
Joined: 06/02/2022 - 10:31
Thank you very much for the help on this, it is helpful to know

I went back to the files and found out that the issue was with the way the patent_inventor file was read in by pandas in Python. When I read it in with R, I was able to find patent "5564249" and the other patents. I went back to Python and converted the "patent_id" column from object to "category", which is said to be the closest equivalent of the factor type in R. Once I did that, I was able to search for this patent and others in the dataframe in Python.

Thank you very much for looking into this, it was helpful to find out what the problem was. I find that the "rawinventor" file is useful when looking up inventors by name and patent, as this information is all in one table.
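
One possible explanation, which nothing in this thread confirms, is that the object column held a mix of integers and strings, so a string search for "5564249" never matched the integer 5564249. The sketch below illustrates that assumption with made-up in-memory data (the design patent number "D345393" and the inventor ID "fl:jo_ln:doe-1" are hypothetical) and shows an alternative to the category conversion: reading the column as text with `dtype`.

```python
import io
import pandas as pd

# Hypothetical object column mixing an int and a str, as can happen when
# pandas infers types separately for different parts of a large file.
mixed = pd.DataFrame({"patent_id": [5564249, "D345393"]})
hits_mixed = mixed[mixed["patent_id"] == "5564249"]   # int 5564249 != "5564249"

# Tiny stand-in for patent_inventor.tsv (second row is made up).
tsv = "patent_id\tinventor_id\n5564249\tfl:al_ln:deleon-3\nD345393\tfl:jo_ln:doe-1\n"

# Forcing patent_id to be read as text keeps every value a string.
as_text = pd.read_csv(io.StringIO(tsv), sep="\t", dtype={"patent_id": str})
hits_text = as_text[as_text["patent_id"] == "5564249"]

print(len(hits_mixed), len(hits_text))
```

If this is what happened, passing `dtype={"patent_id": str}` (or `dtype=str` for all columns) to `pd.read_csv` when loading the real file should make string lookups behave consistently.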

With much appreciation for the help, thank you