Duplicate Values for Organization Name

Thu, 10/24/2019 - 13:38

Hegedus

Hi,

I am working with the assignee file and have noticed that some organizations are listed two or more times under different ids. For example:

2anza8ga48wsus63rjst1ifmw   3   NULL   NULL   Infineon Technologies AG
5ysm5zlcaf2b02d62mgic7512   3   NULL   NULL   Infineon Technologies AG
org_61gyUoVVQyeF60uJoBif   3   NULL   NULL   Infineon Technologies AG
pn5la4wmxkdt5gof0ubafymhq   3   NULL   NULL   Infineon Technologies AG

Isn't the purpose of the disambiguation process to combine these into a single entity id?
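
For anyone who wants to check their own copy of the file, here is a rough Python sketch of how exact repeats like these can be surfaced. It assumes each row has already been reduced to an (id, organization name) pair; the function name `find_duplicate_names` and the hard-coded sample rows are just for illustration and are not part of any official tooling.

```python
from collections import defaultdict

# Sample rows copied from the post above: (assignee id, organization name).
rows = [
    ("2anza8ga48wsus63rjst1ifmw", "Infineon Technologies AG"),
    ("5ysm5zlcaf2b02d62mgic7512", "Infineon Technologies AG"),
    ("org_61gyUoVVQyeF60uJoBif", "Infineon Technologies AG"),
    ("pn5la4wmxkdt5gof0ubafymhq", "Infineon Technologies AG"),
]


def find_duplicate_names(rows):
    """Return {organization name: [ids]} for names that appear under 2+ ids."""
    ids_by_name = defaultdict(list)
    for assignee_id, organization in rows:
        ids_by_name[organization].append(assignee_id)
    # Keep only the names that map to more than one id.
    return {name: ids for name, ids in ids_by_name.items() if len(ids) > 1}


duplicates = find_duplicate_names(rows)
for name, ids in duplicates.items():
    print(f"{name}: {len(ids)} ids")
```

This only catches names that match character for character; near matches need some normalization first.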

Also, I am working with Natural Language Processing, where a term of art is the lemma: the base (root) form of a word, which allows different forms of the same word, such as plurals, to be grouped into a single entity. Perhaps that could be considered, as in the sample shown below. Note the plural "Americas" in the second line. In reality, these are probably the same business entity.

org_0T3DUOVT6gX9RCesn7iE   2   NULL   NULL   Infineon Technologies America Corp.
org_KvWrsyblXUCdRpJcqGns   2   NULL   NULL   Infineon Technologies Americas Corp.
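
To make the idea concrete, here is a minimal Python sketch of a plural-insensitive grouping key. It is not a real lemmatizer and not how the disambiguation process works as far as I know; it is a crude stand-in that lowercases the name, drops punctuation, and strips one trailing "s" from longer words. The helper name `plural_insensitive_key` is made up for this example.

```python
import re
from collections import defaultdict


def plural_insensitive_key(name):
    """Build a rough grouping key that ignores case, punctuation, and a trailing 's'."""
    tokens = re.findall(r"[a-z0-9]+", name.lower())
    # Crude stand-in for lemmatization: drop one trailing "s" from longer tokens.
    stems = [t[:-1] if len(t) > 3 and t.endswith("s") else t for t in tokens]
    return " ".join(stems)


# Sample rows copied from the post above: (assignee id, organization name).
rows = [
    ("org_0T3DUOVT6gX9RCesn7iE", "Infineon Technologies America Corp."),
    ("org_KvWrsyblXUCdRpJcqGns", "Infineon Technologies Americas Corp."),
]

# Group ids by the normalized key instead of the raw name.
groups = defaultdict(list)
for assignee_id, organization in rows:
    groups[plural_insensitive_key(organization)].append(assignee_id)

for key, ids in groups.items():
    print(key, ids)
```

With this key both rows land in the same group. A rule this simple can also merge names that really are different organizations, so any matches would still need review.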

Andy
