Thanks for creating this great platform!
I'm trying to merge the g_assignee_disambiguated file with the g_detail_desc_text_yyyy files on patent_id, and I'm finding that very few patent_id values match across these files. Specifically:
2015: ~300,000 patent ids in g_detail_desc_text_2015, only ~10,000 matches with g_assignee_disambiguated are found
2016: ~305,000 patent ids in g_detail_desc_text_2016, only ~10,000 matches with g_assignee_disambiguated are found
I have not checked other years, but it does seem to me that these match rates are too low. For example, for the corresponding pg_assignee_disambiguated and pg_detail_desc_text_yyyy files, the match rates are around ten times higher, yielding over 100,000 matches for the same years.
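In case it helps with reproducing this, below is a minimal sketch of how a match count like the ones above could be computed with pandas. The helper name count_matches and the toy patent_id values are made up for illustration. The sketch also assumes that patent_id might be read as a number in one file and as text in the other, which is only a guess at one possible cause, so it compares the ids as stripped strings.

```python
import pandas as pd

def count_matches(left: pd.DataFrame, right: pd.DataFrame, key: str = "patent_id") -> int:
    """Count distinct `key` values in `left` that also appear in `right`."""
    # Compare ids as stripped strings so an int-vs-str dtype difference
    # or stray whitespace cannot hide real matches.
    left_ids = set(left[key].astype(str).str.strip())
    right_ids = set(right[key].astype(str).str.strip())
    return len(left_ids & right_ids)

# Toy stand-ins for the real files (values are invented, not real data).
# With the real downloads, one would load each file first, for example with
# pd.read_csv(path, sep="\t", dtype={"patent_id": str}), assuming they are
# tab-separated.
desc = pd.DataFrame({"patent_id": ["8925112", "8925113", "D720001"]})
assignee = pd.DataFrame({"patent_id": [8925112, 8925113, 9000000]})

print(count_matches(desc, assignee))
```

If the count from a comparison like this is still as low as the numbers above, the dtype explanation can be ruled out and the issue is more likely in the file contents themselves.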
I could be mistaken, but is it possible that something went wrong with the recent December 2023 update to the g_assignee_disambiguated file?
Thank you again!