IPC Classes

Hello PatentsView Community,

I appreciate the chance to ask a question. I am working on my Master's thesis and have found a discrepancy in the data that I am not sure how to interpret, so I would appreciate your help.

I have used the IPC classification bulk data download to match IPC classes to the patent data. There are currently around 129 IPC classes, but many more appear in the data set. The problem shows up when cleaning the data: some classes appear with a leading 0 and others without. For example, some are labelled "H2" and others "H02". According to the official IPC guidelines, "H02" is the correct label.

I would therefore like to merge "H2" and "H02". Does anyone know whether this is correct, or why both appear?

I don't want to miss something and overestimate the number of "H2" categories, for example. 

Thank you,

Joseph Emmens

Not current IPC values

Hi Joseph,

The IPC values come from the bulk XML grant files that the patent office makes available. The patent office doesn't produce a bulk source of its current values, so what is in ipcr.tsv are the IPC classifications assigned when each patent was issued. They do not necessarily reflect the IPC values a patent is currently classified under. It also means the patents aren't all classified using the same version of the IPC. lists the different IPC versions that were used over time.

This came up recently, see


Correct linking of IPC records

You are correct: merging the "H2" and "H02" records is currently the right strategy for the IPC fields. The differences are likely caused by changes in the specifications of the bulk data download files from USPTO. We are currently developing a quality control process to catch and correct issues like this that arise from the raw data.
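In the meantime, one way to do the merge yourself is to pad the numeric part of each class label to two digits before counting. The snippet below is only a minimal sketch, not an official PatentsView routine: the helper name `normalize_ipc_class` and the sample labels are made up for illustration, and it assumes each label is a single section letter (A to H) followed by one or two digits, optionally with leading zeros.

```python
import re
from collections import Counter

# Assumed label shape: one section letter (A-H), optional leading zeros,
# then a one- or two-digit class number. Anything else is treated as unparsable.
_LABEL = re.compile(r"^\s*([A-Ha-h])\s*0*(\d{1,2})\s*$")


def normalize_ipc_class(label):
    """Hypothetical helper: return 'H02' for 'H2' or 'H02', None if unparsable."""
    match = _LABEL.match(str(label))
    if match is None:
        return None
    section, number = match.groups()
    # Upper-case the section and zero-pad the class number to two digits.
    return f"{section.upper()}{int(number):02d}"


# Illustrative labels only, not taken from the real data.
labels = ["H2", "H02", "h02", "A61", "G6", "??"]
counts = Counter(normalize_ipc_class(x) for x in labels)
```

Labels that do not fit the assumed pattern come back as `None` rather than being guessed at, so they can be counted and inspected by hand before deciding how to treat them.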

Thank you,

The PatentsView Team