Hello PatentsView Community,
I appreciate the chance to ask a question. I am working on my Master's thesis and have found a discrepancy in the data that I am not sure how to interpret, and I would appreciate any help.
I have used the IPC classification bulk data download to match IPC classes to the patent data. There are currently around 129 IPC classes, but many more appear in the data set. My problem arises when cleaning the data: some labels appear to use a leading 0 and others don't. For example, some are labelled "H2" and others "H02". According to the official IPC guidelines, "H02" is the correct label.
I would therefore like to merge "H2" into "H02". Does anyone know whether this is correct, or why both forms appear?
I don't want to miss something and overestimate the number of "H2" categories, for example.
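For reference, here is a minimal sketch of the merge I have in mind. It assumes each label is a single string made of the section letter (A to H) followed by a one- or two-digit class number; the function name `normalize_ipc_class` is just a placeholder of mine, and anything that does not fit that pattern is left unchanged so it can be checked by hand.

```python
import re

# Assumed label shape: section letter A-H followed by 1 or 2 digits, e.g. "H2" or "H02".
_IPC_CLASS = re.compile(r"^([A-H])\s*(\d{1,2})$")


def normalize_ipc_class(label):
    """Return the label with a zero-padded two-digit class number.

    Labels that do not match the assumed pattern are returned unchanged
    so they can be reviewed manually rather than silently altered.
    """
    match = _IPC_CLASS.match(str(label).strip().upper())
    if match is None:
        return label
    section, number = match.groups()
    return section + number.zfill(2)


labels = ["H2", "H02", "h2", "A61", "G6", "??"]
normalized = [normalize_ipc_class(x) for x in labels]
print(normalized)
```

Counting distinct values before and after this step should show how many extra categories come purely from the missing leading zero.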