  • What's New with PatentsView - September 2022

    AI & Innovation and Resource Pages Now Available

    As we start a new academic year, PatentsView is working to help researchers better understand the relationships across various patents and innovative technologies. To that end, we’ve launched two new pages: a topic page on Artificial Intelligence & Innovation, and a new Resources page.

    The Artificial Intelligence & Innovation Patent Dataset

    While artificial intelligence (AI) has advanced by leaps and bounds, researchers are still working to understand the many ways AI inventions and innovations have impacted technology and society. To help researchers delve into how this emerging technology is affecting our lives, the United States Patent and Trademark Office (USPTO) released the AI Patent Dataset (AIPD).

    The dataset includes an analysis of 13.2 million patent documents published through 2020, identifying which patents contain AI. The AIPD integrates seamlessly into PatentsView, allowing researchers to explore relationships between patents related to AI and the companies and inventors who hold them.

    What’s on the new AI & Innovation page?

    The new page contains an interactive data visualization that allows users to explore how patents are related to government interest, a deep dive into the machine learning model used to create the AIPD, and the latest AI-related news and reports.

    Visit the new AI & Innovation Page now to find out more.

    What’s on the new Resources page?

    The new PatentsView Resources page gives patent researchers, inventors, and intellectual property aficionados an easy way to find code snippets and packages for working with the PatentsView API, sources to help researchers use BigQuery to explore historical patent and PatentsView data, Zenodo links between patents and scientific articles, and more.

    You can also find information about the I3 Collaborative and IPRoduct repositories. I3 and IPRoduct are working groups for users to contribute to data frames and named data projects as well as export collaboratively made datasets. IPRoduct focuses on connecting patents to products in support of intellectual property rights, and I3 is a project to connect citations and patents among patenting groups worldwide.

    Get connected on the new Resources page.

  • You’re invited to a Symposium about inventors who patent! August 26, 2022

    Policymakers recognize that expanding the participation of women and other underrepresented groups in patenting is critical for growing and sustaining American innovation and prosperity. The challenge is that demographic characteristics of inventors such as sex at birth, gender, ethnicity, and race are not collected as part of the process of applying for or receiving a patent. Join us to learn more about how this challenge is addressed through alternative approaches, the accuracy of various approaches, and the important uses of these statistics.

    At PatentsView, with our partners at the University of Bordeaux, we have identified the gender of inventors named on U.S. patents from 1976 to 2021. By using the information supplied by the World Intellectual Property Organization (WIPO) from their World Gender-Name Dictionary along with other documentation, the team was able to infer the gender of over 92.6% of the 2 million inventors residing in the United States and 92.3% of the nearly 5 million inventors named on U.S. patents residing around the world. You can download bulk data, as well as yearly data files, with gender information according to the latest algorithms for gender attribution on our website. Interest in mapping who is creating the latest technology is rapidly increasing, and this exploration is a global endeavor.

    You are invited to a day-long symposium on Friday, August 26th, to learn about the latest developments for identifying and analyzing the demographics of inventors who patent. The symposium will feature experts from the United States Patent and Trademark Office, ZestAI, RAND Corporation, Rutgers University, University of Bordeaux, and WIPO. To tie everything together, a capstone panel with practitioners and leaders from the financial industry and academic institutions will share their perspectives on the research, next steps, and the overall policy relevance of knowing who drives innovation through patenting.  

    To read more about each presenter and their work, and to register for this symposium, please visit the PatentsView event page. See you there!

  • What's New with PatentsView - July 2022

    A few months ago, our team announced a break in the quarterly data update cycle. This interruption allowed us to standardize, consolidate, clean, and amplify our current resources across the bulk data download and API products.

    The bulk data downloads currently comprise more than 100 individual files. These files include disambiguated inventors, assignees, and locations, patent classification lookup tables, government interest information, long-text data and claims, and pre-granted patent application publications. The PatentsView team’s standardization and consolidation efforts aim to lower the barriers to merging these files. Part of this process targets data fields and table structures across the granted patent and pre-granted publications datasets. The standardization will also include updating PatentsView’s naming conventions for USPC and CPC fields to map more closely to United States Patent and Trademark Office (USPTO) naming conventions.

    Data update processes resume this month, with a release anticipated this September. The September update will include data released by USPTO between January and June 2022.

    Since one goal of the API redesign is to mirror the data available in the bulk data downloads, our team is continuing to increase the number of endpoints and data fields accessible through the beta Elastic Search API. With the September data update, we plan to release all remaining granted patent data that is available in the downloaded data but not currently in the MySQL API (e.g., gender, botanic, long-text description fields, and applicants). We will release pre-grant patent publication data and endpoints by the end of the year.

    As always, we welcome and appreciate your feedback and communication with our team. Our email inbox and feedback form are open; we strive to respond to messages within two business days. To ask questions or start a conversation within the PatentsView user community, please see the Forum available on our website. 

    P.S. Here’s a quick note: Our Query Builder tool was supposed to remain available during this time of data product harmonization, but because of new Google email regulations on APIs, we are currently unable to return your datasets via email. We are seeking a solution to this issue and will keep you posted. In the interim, if you require a dataset from the Query Builder, please reach out to our team for guidance.  

  • What’s New with PatentsView – May 2022

    Updates on our tools, website, and upcoming events

    It’s springtime in the United States, the season for growth and change. Our work at PatentsView continues to move ahead; with our growing user base and data team, we are trying new things and exercising our creativity. This can be seen on our Gender & Innovation topic page, which now features an interactive data visualization made by some of our team members using PatentsView data. Also new to the topic page are annualized data files made in partnership with USPTO data scientists. Each file covers one year and contains disambiguated information on patents, their assignees, inventors, and inventor gender. Smaller, more malleable data packages are a new thing for us, so let us know what you think!


    As of March 28, 2022, the Query Builder, ElasticSearch API, and original MySQL API are up to date with 2021 quarter 4 data from USPTO (data through December 30, 2021). The PatentsView team has been working to create additional endpoints in the ElasticSearch API (version 0.1), which will eventually replace the MySQL interface for PatentsView data. To date, over 100 beta users are engaging with the ElasticSearch API. Our team is appreciative of the feedback we have received so far and looks forward to additional improvements and additions as the year progresses.

    Data Updates

    The bulk data download tables are still the biggest and most exhaustive collection of raw and processed data that our team has to offer. We monitor for updates to disambiguation methods and algorithms so that we can continue to improve the disambiguation of location, inventor, assignee, and lawyer information. It also helps us to improve when users of PatentsView data share errors they find in disambiguated results with our team – so, thank you, it really does make PatentsView stronger.

    Our next data update will be published in September 2022 and will include quarters 1 and 2 of data for 2022. The PatentsView data science team is taking a break from data update activities from April through July to devote all resources to making planned structural changes to the bulk data download and API products we offer. The next communication from our team will outline these structural changes and how they may affect bulk data download and API users.

    During this four-month period, the data available from the API(s) and Query Builder functions will also remain up to date through December 30, 2021. All PatentsView features will be updated in September with data through June 30, 2022.

    Upcoming Events

    PatentsView, with support from AIR, is hosting the United States Patent and Trademark Office’s Gender and Race Attribution Symposium in August 2022. The symposium is a full-day virtual event intended to bring together computer scientists, information scientists, economists, and others to discuss state-of-the-art approaches to, and current applications of, name-to-gender and name-to-race attribution algorithms. The symposium will review methods and applications to provide an overview of current approaches from leading scholars in the field, and to build knowledge, identify a community of practitioners, and facilitate the application of common approaches. Stay tuned for the save-the-date announcement from our team!

  • Releasing Annualized Data Files for Patents, Assignees, and Inventor Gender

    We are pleased to release annual datasets for exploring patents and their inventors and companies. The annual datasets are CSV files small enough for users without data science or coding knowledge to manipulate and view information about the inventors and companies associated with every granted patent, including the gender of inventors generated by our gender attribution algorithm. These annual datasets are constructed around patent grant year and combine the corresponding company (assignee) and inventor information for each patent granted between 1976 and 2020. We constructed these datasets to improve user access to PatentsView data, especially for users interested in women inventors.

    So, what’s in the new dataset?

    To create the annualized data, we combined the following list of datasets available from the bulk download page for granted patents. We did not utilize all the fields from each, but selected fields to generate these smaller files. A data dictionary of all the variables within the annualized datasets is available with the downloads for more information on what fields they contain.

    1. rawinventor: includes information on the unique id associated with each patent and inventor.
    2. inventor: includes information on the unique first and last name of each inventor, a flag to indicate gender of each inventor, and the inventor id from the rawinventor dataset.
    3. patent: includes information about the patent grant date.
    4. patent_assignee: is a cross-walk that gives the unique assignee id associated with each patent.
    5. assignee: identifies the unique location id for each assignee and the unique assignee name.
    6. location: contains information about the country, city and state, and county of the assignee if within the U.S.
    7. application: links the application id to the granted patent id, and gives the application year.
    8. ipcr: includes information on the ipc/cpc section (technology field) of the granted patent.
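
    The combination described above can be pictured as a chain of joins on the shared ids. The following is only a minimal sketch under stated assumptions, not the PatentsView team's actual pipeline: the tables are toy in-memory rows, every column name (`patent_id`, `organization`, `gender`, and so on) and the ISO date format are illustrative assumptions, and the application and ipcr tables are left out for brevity.

```python
# Toy stand-ins for six of the bulk download tables. All column names and
# values are illustrative assumptions, not the real PatentsView schema.
PATENT = [
    {"patent_id": "P1", "grant_date": "2020-03-10"},
    {"patent_id": "P2", "grant_date": "2019-07-02"},
]
RAWINVENTOR = [
    {"patent_id": "P1", "inventor_id": "I1"},
    {"patent_id": "P1", "inventor_id": "I2"},
    {"patent_id": "P2", "inventor_id": "I1"},
]
INVENTOR = [
    {"inventor_id": "I1", "name_first": "Ada", "name_last": "Lee", "gender": "F"},
    {"inventor_id": "I2", "name_first": "Sam", "name_last": "Roy", "gender": "M"},
]
PATENT_ASSIGNEE = [
    {"patent_id": "P1", "assignee_id": "A1"},
    {"patent_id": "P2", "assignee_id": "A2"},
]
ASSIGNEE = [
    {"assignee_id": "A1", "organization": "Acme Corp", "location_id": "L1"},
    {"assignee_id": "A2", "organization": "Globex", "location_id": "L2"},
]
LOCATION = [
    {"location_id": "L1", "country": "US"},
    {"location_id": "L2", "country": "DE"},
]


def index_by(rows, key):
    """Build a lookup from a unique key value to its row."""
    return {row[key]: row for row in rows}


def build_annual_rows(year):
    """Return one row per (patent, assignee, inventor) for a grant year."""
    inventors = index_by(INVENTOR, "inventor_id")
    assignees = index_by(ASSIGNEE, "assignee_id")
    locations = index_by(LOCATION, "location_id")
    rows = []
    for pat in PATENT:
        # Keep only patents granted in the requested year.
        if not pat["grant_date"].startswith(f"{year}-"):
            continue
        pid = pat["patent_id"]
        for link in (x for x in PATENT_ASSIGNEE if x["patent_id"] == pid):
            org = assignees[link["assignee_id"]]
            country = locations[org["location_id"]]["country"]
            for raw in (x for x in RAWINVENTOR if x["patent_id"] == pid):
                person = inventors[raw["inventor_id"]]
                rows.append({
                    "patent_id": pid,
                    "grant_date": pat["grant_date"],
                    "organization": org["organization"],
                    "assignee_country": country,
                    "inventor_id": person["inventor_id"],
                    "inventor_gender": person["gender"],
                })
    return rows


rows_2020 = build_annual_rows(2020)
```

    With real data, the same joins would run over the downloaded files rather than these in-memory lists; the data dictionary that ships with the downloads is the authority on actual field names.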


    A couple notes on using the new dataset:

    • As you become acquainted with the data, you may notice that fields are provided only for the first nine inventors listed on each patent. Some patents have more than nine inventors; the maximum recorded within the PatentsView dataset is 123. However, 99% of patents have nine inventors or fewer. To make the datasets more user-friendly, we truncated the data to include only the first nine inventors. If you are interested in the names and gender information of these very large inventor teams, please refer to the original bulk data.
    • Some patents appear more than once because they are granted to more than one company. We did this to make it easier for an individual to, for example, search for all patents and inventor genders associated with a certain company.
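
    The two notes above describe a wide layout: a fixed set of nine inventor slots per row, and one row per assignee. A rough sketch of that shape follows. The function name and the field names (`inventor_name1`, `inventor_gender1`, and so on) are hypothetical and chosen for illustration; the real field names are in the data dictionary.

```python
MAX_INVENTORS = 9  # the annualized files keep only the first nine inventors


def to_wide_rows(patent_id, inventors, assignees):
    """Return one wide row per assignee, with inventors truncated to nine.

    `inventors` is a list of (name, gender) tuples in listed order;
    `assignees` is a list of assignee names. Field names are hypothetical.
    """
    kept = inventors[:MAX_INVENTORS]
    rows = []
    for assignee_name in assignees:
        # A patent with several assignees is repeated once per assignee.
        row = {"patent_id": patent_id, "assignee": assignee_name}
        for slot in range(MAX_INVENTORS):
            # Unused slots are left blank so every row has the same columns.
            name, gender = kept[slot] if slot < len(kept) else ("", "")
            row[f"inventor_name{slot + 1}"] = name
            row[f"inventor_gender{slot + 1}"] = gender
        rows.append(row)
    return rows


# An 11-person team shared by two companies: two rows, nine inventors each.
big_team = [(f"N{i}", "F" if i % 2 else "M") for i in range(1, 12)]
wide = to_wide_rows("P1", big_team, ["Acme Corp", "Globex"])
```

    Inventors ten and eleven are simply dropped in this layout, which is why the original bulk data remains the place to look for very large teams.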



    We used the annualized data to generate the following graphs and to illustrate some of the potential uses for this release. In figure 1, we tally the number of patents for each country that have companies/assignees from different countries to define the international collaboration that occurred in 2020. The United States had the most international collaborations in 2020 with 1,828 patents. Behind the U.S., China and Germany were found to have the second- and third-most international collaborations.

    Figure 1. Bar graph of the top 20 countries collaborating internationally in 2020.


    Among the 1,828 U.S.-international collaborations, we see from the next figure (Figure 2) that 273 were collaborations with companies in Germany, the most frequent international collaborator of the United States. China and Japan are not far behind, with 249 and 236 collaborations, respectively, in 2020.

    Figure 2. Top 22 United States collaborators by assignee country in 2020.
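
    The tallies behind Figures 1 and 2 can be approximated in a few lines. This is a sketch under an assumed definition, not the exact method used for the figures: a patent counts as an international collaboration when its assignees span more than one country, and the input is a hypothetical mapping from patent id to assignee countries built from the annual files.

```python
from collections import Counter


def collaboration_counts(patent_countries):
    """Count internationally collaborative patents per country.

    `patent_countries` maps a patent id to the list of its assignee
    countries. A patent is collaborative if it spans 2+ distinct countries.
    """
    per_country = Counter()
    for countries in patent_countries.values():
        distinct = set(countries)
        if len(distinct) > 1:
            per_country.update(distinct)  # each country credited once
    return per_country


def partner_counts(patent_countries, home):
    """Count, for one home country, how often each partner country appears."""
    partners = Counter()
    for countries in patent_countries.values():
        distinct = set(countries)
        if home in distinct and len(distinct) > 1:
            partners.update(distinct - {home})
    return partners


# Toy input; the ids and countries are made up for illustration.
sample = {
    "P1": ["US", "DE"],
    "P2": ["US", "US"],        # two U.S. assignees: not international
    "P3": ["US", "CN", "DE"],
    "P4": ["JP"],
    "P5": ["CN", "JP"],
}
by_country = collaboration_counts(sample)
us_partners = partner_counts(sample, "US")
```

    Sorting either counter with `most_common()` gives a ranking of the kind shown in the bar graphs.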


    Last, we calculate and rank the women’s inventor rate (WIR) by country of assignee. The following graph (Figure 3) shows the countries in the top 25 percent by number of granted patents in 2020. As you can see, Taiwan is firmly in first place, with 32.2% of inventors identified as women in 2020. In second place is Spain with 22.6%, followed by France with 17.1%. The U.S. is just above the middle of the pack at 13.1%, with several European and Asian countries ahead of it.

    Figure 3. Bar graph of WIR by assignee country for the top 25% of countries.
    *Note: Some names are more difficult to attribute a gender to than others. There is particular difficulty in attributing gender to Asian names. For these reasons we have eliminated countries with an attribution rate of less than 90%, as the WIR is deemed less reliable. Specifically, this eliminates China, the Cayman Islands, South Korea, India, Hong Kong, Singapore, and Saudi Arabia from this graph. This information is available in the annual datasets.
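
    The WIR ranking and the 90% attribution cutoff in the note can be sketched as follows. The exact formula is not spelled out here, so this example assumes that WIR is the share of women among inventors whose gender was attributed, and that the attribution rate is the share of inventors with any attributed gender. The function name and input shape are hypothetical.

```python
MIN_ATTRIBUTION_RATE = 0.90  # countries below this threshold are dropped


def wir_by_country(inventors):
    """Compute the women's inventor rate (WIR) per assignee country.

    `inventors` is an iterable of (country, gender) pairs, where gender is
    "F", "M", or None when no gender could be attributed. The WIR formula
    used here is an assumption made for illustration.
    """
    totals = {}
    for country, gender in inventors:
        stats = totals.setdefault(
            country, {"all": 0, "attributed": 0, "women": 0}
        )
        stats["all"] += 1
        if gender is not None:
            stats["attributed"] += 1
            if gender == "F":
                stats["women"] += 1

    rates = {}
    for country, stats in totals.items():
        attribution_rate = stats["attributed"] / stats["all"]
        if attribution_rate >= MIN_ATTRIBUTION_RATE:
            rates[country] = stats["women"] / stats["attributed"]
    return rates


# Toy data: "AA" is fully attributed; "BB" is only 80% attributed.
toy = (
    [("AA", "F")] * 3 + [("AA", "M")] * 7
    + [("BB", "F")] * 4 + [("BB", "M")] * 4 + [("BB", None)] * 2
)
rates = wir_by_country(toy)
```

    In the toy data, "BB" is excluded because only 8 of its 10 inventors have an attributed gender, mirroring how the countries listed in the note were removed from the graph.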

