2 posts
kmarhold
Last seen: 07/19/2021 - 07:22
Joined: 03/18/2019 - 01:50
New assignee_ids and misspelled organization names

After downloading the 27-Nov-2018 version of the bulk data files, I noticed that the naming scheme for assignee_ids has changed (they now start with 'org'). However, the algorithm now seems to match most of an organization's patents to misspelled versions of the organization's name.

For example:
When I sort by the number of patent_ids associated with each assignee_id, the top entry is 'International Business Machines of (sic!) Corporation'. Other entries in the top 100 are similarly wrong, such as 'Hewlett-Packasrd Development Comany, L.P.', 'Hitachi Metels, Ltd.', 'Texax Instruments Incorporated' or 'LG Elecronics Inc.'
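
For anyone who wants to repeat this check, here is a minimal sketch of the ranking described above. It assumes the bulk download provides a tab-separated patent-to-assignee file with `patent_id` and `assignee_id` columns and an assignee file with `id` and `organization` columns; those column names are my assumption and may differ between data releases. The rows in the example are made up and only stand in for the real files.

```python
import csv
import io
from collections import Counter


def top_assignees(patent_assignee_rows, assignee_rows, n=100):
    """Rank assignee_ids by number of distinct patent_ids.

    Both arguments are iterables of dicts (e.g. csv.DictReader objects).
    The column names used here are assumptions about the bulk files.
    Returns a list of (assignee_id, organization, patent_count) tuples.
    """
    # Deduplicate (patent_id, assignee_id) pairs so repeated rows are counted once.
    pairs = {(row["patent_id"], row["assignee_id"]) for row in patent_assignee_rows}
    counts = Counter(assignee_id for _, assignee_id in pairs)
    names = {row["id"]: row["organization"] for row in assignee_rows}
    return [(aid, names.get(aid, ""), c) for aid, c in counts.most_common(n)]


# Tiny in-memory stand-ins for the tab-separated bulk files (made-up rows).
patent_assignee_tsv = "patent_id\tassignee_id\n1\torg_a\n2\torg_a\n3\torg_b\n"
assignee_tsv = (
    "id\torganization\n"
    "org_a\tInternational Business Machines of Corporation\n"
    "org_b\tHitachi Metels, Ltd.\n"
)

pa = csv.DictReader(io.StringIO(patent_assignee_tsv), delimiter="\t")
asg = csv.DictReader(io.StringIO(assignee_tsv), delimiter="\t")
for aid, org, count in top_assignees(pa, asg, n=2):
    print(aid, org, count)
```

With the real files, the two `io.StringIO` objects would be replaced by opened file handles; scanning the `organization` column of the top entries is then enough to spot names like the ones listed above.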

I didn't find such errors in the older files (May 2018 version). Were the older files somehow edited manually to choose the right entry among all those that contain 'international business machine corporation' in one form or another?
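
As a workaround on the user side, misspelled canonical names can be flagged by comparing them against a short list of trusted spellings. The sketch below uses fuzzy string matching from Python's standard `difflib` module. It is not the PatentsView disambiguation algorithm; the reference list, the function name `flag_near_misses` and the 0.85 cutoff are all assumptions chosen for illustration.

```python
import difflib


def flag_near_misses(names, reference, cutoff=0.85):
    """Return (name, reference_spelling) pairs for names that closely
    resemble a reference spelling without matching it exactly.

    Comparison is case-insensitive. The cutoff is an arbitrary
    illustrative threshold, not a tuned value.
    """
    ref_lower = {r.lower(): r for r in reference}
    candidates = list(ref_lower)
    flagged = []
    for name in names:
        key = name.lower()
        if key in ref_lower:
            continue  # exact match: spelled correctly
        match = difflib.get_close_matches(key, candidates, n=1, cutoff=cutoff)
        if match:
            flagged.append((name, ref_lower[match[0]]))
    return flagged


# Hypothetical reference spellings supplied by the user.
reference = ["Texas Instruments Incorporated", "LG Electronics Inc."]
names = [
    "Texax Instruments Incorporated",
    "Texas Instruments Incorporated",
    "LG Elecronics Inc.",
]
for bad, good in flag_near_misses(names, reference):
    print(f"{bad!r} looks like a misspelling of {good!r}")
```

This only helps for organizations you already know the correct name of, so it is a spot check rather than a fix for the canonical names in the files themselves.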
 

PVTeam
Role: moderator
Last seen: 09/10/2024 - 13:29
Joined: 10/17/2017 - 10:47
Re: New Assignee_Ids and Misspelled Organization Names

Hi,

Thanks for bringing this up!

We implemented a new disambiguation algorithm with the last data update, so the canonical name chosen for each assignee is a little off. We will adjust this in the next data update, which is planned to go live in mid to late April.

Thank you,

PVTeam