  • Releasing Annualized Data Files for Patents, Assignees, and Inventor Gender

    We are pleased to release annual datasets for exploring patents and their inventors and companies. The annual datasets are CSV files small enough for users without data science or coding knowledge to manipulate, and they show the inventors and the company or companies associated with every granted patent, including the gender of inventors generated by our gender attribution algorithm. These annual datasets are constructed around patent grant year and combine the corresponding company (assignee) and inventor information for each patent granted between 1976 and 2020. We constructed these datasets to improve user access to PatentsView data, especially for users interested in women inventors.

    So, what’s in the new dataset?

    To create the annualized data, we combined the following list of datasets available from the bulk download page for granted patents. We did not utilize all the fields from each, but selected fields to generate these smaller files. A data dictionary of all the variables within the annualized datasets is available with the downloads for more information on what fields they contain.

    1. rawinventor: includes information on the unique id associated with each patent and inventor.
    2. inventor: includes information on the unique first and last name of each inventor, a flag to indicate gender of each inventor, and the inventor id from the rawinventor dataset.
    3. patent: includes information about the patent grant date.
    4. patent_assignee: is a cross-walk that gives the unique assignee id associated with each patent.
    5. assignee: identifies the unique location id for each assignee and the unique assignee name.
    6. location: contains information about the country, city and state, and county of the assignee if within the U.S.
    7. application: links the application id to the granted patent id, and gives the application year.
    8. ipcr: includes information on the ipc/cpc section (technology field) of the granted patent.
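As a rough illustration of how the tables listed above fit together, the following sketch joins toy versions of six of them on their shared ids. It is not the PatentsView pipeline: every column name here (`patent_id`, `male_flag`, `organization`, and so on) is an assumption made for the example, and the application and ipcr tables are left out for brevity.

```python
# Toy rows standing in for the bulk download tables.
# All column names are assumptions for illustration only.
patent = [{"patent_id": "P1", "grant_date": "2020-03-17"}]
rawinventor = [
    {"patent_id": "P1", "inventor_id": "I2", "sequence": 1},
    {"patent_id": "P1", "inventor_id": "I1", "sequence": 0},
]
inventor = [
    {"inventor_id": "I1", "name_first": "Ada", "name_last": "Lee", "male_flag": 0},
    {"inventor_id": "I2", "name_first": "Bo", "name_last": "Kim", "male_flag": 1},
]
patent_assignee = [{"patent_id": "P1", "assignee_id": "A1"}]
assignee = [{"assignee_id": "A1", "organization": "Acme", "location_id": "L1"}]
location = [{"location_id": "L1", "country": "US", "state": "VA", "city": "Alexandria"}]


def index_by(rows, key):
    """Build a lookup from a unique id to its row."""
    return {row[key]: row for row in rows}


def join_tables():
    """Return one combined record per (patent, assignee) pair."""
    inventors = index_by(inventor, "inventor_id")
    assignees = index_by(assignee, "assignee_id")
    locations = index_by(location, "location_id")
    patents = index_by(patent, "patent_id")

    # rawinventor links patents to inventors; keep the listed order.
    team = {}
    for link in sorted(rawinventor, key=lambda r: r["sequence"]):
        team.setdefault(link["patent_id"], []).append(inventors[link["inventor_id"]])

    rows = []
    for link in patent_assignee:  # the patent-to-assignee cross-walk
        p = patents[link["patent_id"]]
        a = assignees[link["assignee_id"]]
        loc = locations[a["location_id"]]
        rows.append({
            "patent_id": p["patent_id"],
            "grant_year": int(p["grant_date"][:4]),
            "assignee": a["organization"],
            "assignee_country": loc["country"],
            "inventors": team.get(p["patent_id"], []),
        })
    return rows
```

Calling `join_tables()` on the toy data yields a single record for patent `P1` with its grant year, assignee, assignee country, and the two inventors in listed order.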


    A couple of notes on using the new dataset:

    • As you become acquainted with the data, you may notice that inventor fields are provided only for the first nine inventors listed on each patent. Of course, some patents have more than nine inventors; the largest inventor team recorded in the PatentsView dataset has 123 inventors. However, 99% of patents have nine or fewer inventors. To make the datasets more user-friendly, we truncated the data to include only the first nine inventors. If you are interested in the names and gender information for these very large inventor teams, please refer to the original bulk data.
    • Some patents appear more than once because they are granted to more than one company. We did this to make it easier for an individual to, for example, search for all patents and inventor genders associated with a certain company.
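The two notes above can be sketched in a few lines. This is only an illustration of the layout they describe, with hypothetical column names (`inventor1_name`, `inventor1_gender`, and so on): each patent produces one flat row per assignee, and only the first nine inventors get columns.

```python
MAX_INVENTORS = 9  # the annual files keep only the first nine inventors


def widen(patent_id, assignees, inventors):
    """Return one flat row per assignee, with nine inventor slots.

    `inventors` is a list of (name, gender) tuples in listed order.
    Column names are hypothetical, chosen for this sketch.
    """
    kept = inventors[:MAX_INVENTORS]
    rows = []
    for assignee in assignees:
        row = {"patent_id": patent_id, "assignee": assignee}
        for slot in range(MAX_INVENTORS):
            # Unused slots stay empty when the team has fewer than nine members.
            name, gender = kept[slot] if slot < len(kept) else ("", "")
            row[f"inventor{slot + 1}_name"] = name
            row[f"inventor{slot + 1}_gender"] = gender
        rows.append(row)
    return rows


def count_unique_patents(rows):
    """Count patents without double counting multi-assignee patents."""
    return len({row["patent_id"] for row in rows})
```

Because a patent with two assignees appears twice, counting rows overstates the number of patents; counting distinct patent ids, as in `count_unique_patents`, avoids that.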



    We used the annualized data to generate the following graphs and to illustrate some of the potential uses for this release. In figure 1, we tally the number of patents for each country that have companies/assignees from different countries to define the international collaboration that occurred in 2020. The United States had the most international collaborations in 2020 with 1,828 patents. Behind the U.S., China and Germany were found to have the second- and third-most international collaborations.

    Figure 1. Bar graph of the top 20 countries collaborating internationally in 2020


    Among the 1,828 U.S.-international collaborations, we see from the next figure (Figure 2) that 273 of them were collaborations with companies in Germany, the most frequent international collaborator of the U.S. China and Japan are not far behind with 249 and 236 collaborations, respectively, in 2020.

    Figure 2. Top 22 United States collaborators by assignee country in 2020
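The tallies behind Figures 1 and 2 can be approximated as follows. This is a minimal sketch that assumes the input is a list of (patent id, assignee country) pairs, one per assignee, and that a patent counts as an international collaboration when its assignees come from at least two different countries. The function names are ours, not part of any PatentsView tool.

```python
from collections import Counter, defaultdict


def countries_by_patent(pairs):
    """Group assignee countries by patent id."""
    grouped = defaultdict(set)
    for patent_id, country in pairs:
        grouped[patent_id].add(country)
    return grouped


def international_collaborations(pairs):
    """For each country, count patents shared with at least one other country."""
    tally = Counter()
    for countries in countries_by_patent(pairs).values():
        if len(countries) > 1:
            tally.update(countries)
    return tally


def partners_of(pairs, home):
    """Count, per partner country, the patents it shares with `home`."""
    tally = Counter()
    for countries in countries_by_patent(pairs).values():
        if home in countries and len(countries) > 1:
            tally.update(countries - {home})
    return tally
```

A patent whose assignees are all in the same country adds nothing to either tally, and a patent shared by three countries counts once for each of them.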


    Last, we calculate and rank the women’s inventor rate (WIR) by country of assignee. The following graph (Figure 3) shows the countries in the top 25th percentile by number of granted patents in 2020. As you can see, Taiwan is firmly in first place, with 32.2% of inventors identified as women in 2020. In second place is Spain with 22.6%, followed by France with 17.1%. The U.S. is just above the middle of the pack at 13.1%, with several European and Asian countries ahead of it.

    Figure 3. Bar graph showing the top 25% of countries with high WIR in their assignees
    *Note: Gender is harder to attribute to some names than to others, and attribution is particularly difficult for Asian names. For this reason, we have excluded countries with an attribution rate below 90%, because their WIR is deemed less reliable. Specifically, this excludes China, the Cayman Islands, South Korea, India, Hong Kong, Singapore, and Saudi Arabia from this graph. This information is available in the annual datasets.
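A WIR ranking with the attribution-rate cutoff described in the note might be computed along these lines. This is a sketch under two assumptions that the text does not spell out: that the WIR denominator is the number of inventors with an attributed gender, and that the attribution rate is the share of inventors with any attributed gender. The input format is hypothetical.

```python
def wir_by_country(records, min_attribution=0.90):
    """Rank countries by women's inventor rate (WIR).

    `records` is an iterable of (country, gender) pairs, where gender is
    "F", "M", or None when the algorithm could not attribute a gender.
    Countries whose attribution rate is below `min_attribution` are dropped.
    """
    totals = {}
    for country, gender in records:
        t = totals.setdefault(country, {"all": 0, "attributed": 0, "women": 0})
        t["all"] += 1
        if gender is not None:
            t["attributed"] += 1
            if gender == "F":
                t["women"] += 1

    rates = {}
    for country, t in totals.items():
        attribution_rate = t["attributed"] / t["all"]
        if t["attributed"] and attribution_rate >= min_attribution:
            # Assumed definition: women / inventors with an attributed gender.
            rates[country] = t["women"] / t["attributed"]

    # Highest WIR first.
    return dict(sorted(rates.items(), key=lambda kv: kv[1], reverse=True))
```

Using a different denominator, such as all inventors including unattributed ones, would lower the rates for countries with many unattributed names, which is one reason the cutoff matters.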


  • What's New with PatentsView - October 2021

    Who are the creative thinkers behind the latest innovations?

    Patenting is an evolving and informative phenomenon. Patenting rates are subject to the ebbs and flows of the financial economy, the devastation of natural disasters, and changes in the labor force (think World Wars and women leaving the home for the workplace). Research on patents, their inventors and owners is expansive and covers the entire world because, beyond what is being created and imagined, patenting has as much to do with the circumstances inventors face as it does with ingenuity.   

    We know of some classic examples, those famous around the world for their inventions that changed our daily lives and possibly the trajectory of humankind. After all, where would we be without the electricity (thank you, Alessandro Volta, Italy) that allows us to work day and night in the name of productivity? Veiled Marxism and broad brushstrokes aside, the subsequent inventions that harnessed the power of Volta’s discovery (Thomas Edison, USA and Joseph Swan, UK) expanded the number of hours in the day that people were able to be awake, active, and creating – a history-altering impact on civilization in pursuit of industrialization. The American inventor Thomas Edison had over one thousand patents in his name and spent much of his later life inventing full time in laboratories with other American inventors of the period, including the division-of-labor transformer Henry Ford. During these prolific years in Edison’s life, his home and children were tended by his first wife, Mary (married in 1871 until her death in 1884), and his second wife, Mina (married in 1886).

    Between 1790, the year U.S. law changed to allow women to own intellectual property, and 1840, only twenty women had patents under their names at the U.S. patent office (source). Over the next century and a half, women inventor rates steadily increased, accelerated by their movement out of traditional home-related tasks and into higher education systems and workplaces. One such inventor, Grace Hopper (1906-1992), earned a PhD in mathematics from Yale University in 1934, and in the 1950s her work on compilers and programming languages led to the computer language COBOL (Common Business-Oriented Language). COBOL is still used today in business, administration, and finance operations for companies and governments. Hopper did not own a patent on COBOL, as it was created before computer software technology was a patentable field of invention, another tribute to the changes in the U.S. patenting system. From the single patent filed by a woman inventor in 1804 to the thousands of women inventors in the U.S. patent system today, something massive has changed. In no small part, that massive change reflects a shift in the culture of the United States and beyond.

    We know from gender attribution algorithms used by researchers around the world that patenting rates are still not at parity (equal proportions) between men and women inventors, despite the percentages of men and women being roughly equal in the world. Perhaps the patenting “playing field” is still leveling out after hundreds of years of differential opportunities, or perhaps there are new barriers to inventing today that prevent some groups of people from creating at full “Edisonian capacity.”

    Entering this new millennium, patenting researchers are delving into the who of inventorship more than ever in the hopes of unlocking and supporting the next wave of innovation worldwide. Here at PatentsView, our team of developers, data scientists, and social science researchers are working together to bring the latest breakthroughs in disambiguation and attribution science to the public. After a summertime of doubling down on disambiguation algorithms and amping up on gender attribution efforts with our partners, PatentsView is closer than ever to the latest information on who is inventing.

    Part of the PatentsView mission is to deliver the best data we can as efficiently as we can. In this spirit, we are redesigning the application programming interface (API) to our database. The goal of the redesign is to allow for faster query results and a simplified querying mechanism. For more information on the API redesign, please visit this link.

    Want to explore gender data further? You can access our latest gender attribution data in the inventor table under bulk data downloads. Refer to the data dictionary to see the variables available in the inventor table. To investigate where inventors are patenting from, use the Query Builder tool or write a script to query the API for the “last known location” field. What are you investigating with the help of PatentsView data this season? We’d love to know.

  • What's New With PatentsView - July 2021

    Since our last data update, PatentsView data scientists and developers have been hard at work rewriting disambiguation algorithms and streamlining our data pipeline processes for smoother and more replicable update cycles in future months and years to come. With this latest update, which includes patent data through March 30, 2021, we are now two full update cycles into use of our revised algorithms for disambiguating data. For more information on data changes, please visit our release notes page.

    As the data sets get larger and more complex and as new fields and attributes are added to the PatentsView database, our servers, domains, and other hardware must also be upgraded to continue to support our work. Our latest upgrade is the PatentsView application programming interface (affectionately known as the API). The PatentsView API serves 3,000–300,000 requests every day. While a majority of these requests succeed, over the past few years the number of requests that fail has increased due to the size of the data sets. To address this and to stay up to date with industry standards, PatentsView has begun the process of redesigning the API.

    For more information about API changes, please read on.

    API Redesign

    Design Goals

    • Enable a search-centric approach to the API rather than a querying/filter-based approach.
    • Achieve response times in range of seconds rather than minutes.
    • Improve user experience by limiting number and size of individual API requests from the server.
    • Align the API design with industry standards in terms of request and response format, headers, and documentation.


    The technology and design choices for the new API were made with the above goals in mind. The v0.1 API will apply this approach to a narrow scope of patent citations and application citations. As a result, the corresponding fields in the current API, shown below, will be discontinued.

    Discontinued Fields

    API Field Name | Group | Common Name | Type | Query | Description
    appcit_app_number | application_citations | Application Number | string | Y | Application ID (issued by USPTO) for application cited by the selected patent
    appcit_category | application_citations | Entity Category | string | Y | Entity that cited an application in the selected patent
    appcit_date | application_citations | Filing Date | date | Y | Filing date for application cited in the selected patent
    appcit_kind | application_citations | Kind Code | string | Y | Patent kind code of application cited by patent
    appcit_sequence | application_citations | Sequence | integer | N | Order in which a citation is cited by patent
    cited_patent_category | cited_patents | Patent Category | string | Y | Category of cited patent
    cited_patent_date | cited_patents | Patent Date | date | Y | Grant date of cited patent
    cited_patent_kind | cited_patents | Patent Kind | string | Y | Patent kind of cited patent (see patent_kind for details)
    cited_patent_number | cited_patents | Patent Number | string | Y | Patent number of cited patent
    cited_patent_sequence | cited_patents | Patent Sequence | string | N | Order in which patent is cited by the selected patent
    cited_patent_title | cited_patents | Patent Title | string | Y | Title of cited patent
    citedby_patent_category | citedby_patents | Patent Category | string | Y | Category of citing patent
    citedby_patent_date | citedby_patents | Patent Date | date | Y | Grant date of patent citing the selected patent
    citedby_patent_kind | citedby_patents | Patent Kind | string | Y | Patent kind of citing patent (see patent_kind for details)
    citedby_patent_number | citedby_patents | Patent Number | string | Y | Patent number of citing patent
    citedby_patent_title | citedby_patents | Patent Title | string | Y | Title of citing patent

    New API Fields

    Patent Citation Endpoint

    API Field Name | Group | Common Name | Type | Description
    patent_number | patent_citations | Patent Number | string | Patent of interest
    cited_patent_number | patent_citations | Cited Patent Number | string | Patent number cited by patent of interest (i.e., backward citation)
    citation_category | patent_citations | Citing Entity Type | string | Entity type (e.g., examiner, applicant, etc.) that made the citation on the patent of interest
    citation_date | patent_citations | Patent Date | date | Grant date of the cited patent
    citation_sequence | patent_citations | Patent Sequence | string | Order in which the cited patent is listed on the patent of interest

    Application Citation Endpoint

    API Field Name | Group | Common Name | Type | Description
    patent_number | application_citations | Patent Number | string | Patent of interest
    cited_application_number | application_citations | Cited Application Number | string | Application number of the application cited by patent of interest
    citation_category | application_citations | Citing Entity Type | string | Entity type (e.g., examiner, applicant, etc.) that made the citation on the patent of interest
    citation_date | application_citations | Filing Date | date | Filing date for application cited on the patent of interest
    citation_sequence | application_citations | Sequence | integer | Order in which the cited application is listed on the patent of interest
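For clients that need to update field lists, the tables above suggest a rough correspondence between some discontinued fields and the new ones. The mapping below is our reading of the field descriptions, not an official migration map, and the helper name `translate_fields` is hypothetical. Fields with no apparent counterpart in the new endpoints (titles, kind codes, and the `citedby_*` group) are reported as unmapped.

```python
# Inferred from the field descriptions in the tables; an assumption,
# not an official mapping. Values are (endpoint group, new field name).
LEGACY_TO_V01 = {
    "appcit_app_number": ("application_citations", "cited_application_number"),
    "appcit_category": ("application_citations", "citation_category"),
    "appcit_date": ("application_citations", "citation_date"),
    "appcit_sequence": ("application_citations", "citation_sequence"),
    "cited_patent_number": ("patent_citations", "cited_patent_number"),
    "cited_patent_category": ("patent_citations", "citation_category"),
    "cited_patent_date": ("patent_citations", "citation_date"),
    "cited_patent_sequence": ("patent_citations", "citation_sequence"),
}


def translate_fields(legacy_fields):
    """Split legacy field names into per-endpoint new names and leftovers."""
    mapped, unmapped = {}, []
    for name in legacy_fields:
        if name in LEGACY_TO_V01:
            endpoint, new_name = LEGACY_TO_V01[name]
            mapped.setdefault(endpoint, []).append(new_name)
        else:
            unmapped.append(name)
    return mapped, unmapped
```

Anything returned in the unmapped list would need a separate request to another endpoint or a different approach.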


    To achieve the design goals related to performance, the scope of the citations endpoint has been reduced, as outlined below.

    1. Patent Fields

    What has changed: Patent-related information such as patent title, patent type, patent kind, etc., will not be available in the citations endpoint.

    How this affects users: API clients will need to make two requests: one to the citations endpoint to obtain the patent numbers and a second to the patents endpoint to get the patent-related information.

    2. Citedby and Cited Patents

    What has changed: Previously, users were able to send a patent number (or other queries) and obtain patent numbers that cite the requested patent (called forward citations) as well as the patent numbers that the requested patent has cited (called backward citations). With the new API, users will only be able to obtain patent numbers that the requested patent has cited (i.e., backward citations).

    How this affects users: API clients will need to send two requests:

    • once with the patent numbers of interest in the “patent_number” field to get the list of patents that the requested patent has cited (i.e., backward citations); and
    • again with patent numbers of interest in the “cited_patent_number” field to get the list of patents that cite the requested patent number (i.e., forward citations).
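The two-request pattern for backward and forward citations can be sketched as follows. The query shape (a dictionary keyed by field name) and the `fetch_citations` callable are assumptions made for illustration; the actual v0.1 request format is defined by the API documentation, not by this example. Only the field names `patent_number` and `cited_patent_number` come from the tables above.

```python
def backward_query(patent_numbers):
    """Patents cited BY the given patents: filter on patent_number."""
    return {"patent_number": list(patent_numbers)}


def forward_query(patent_numbers):
    """Patents that CITE the given patents: filter on cited_patent_number."""
    return {"cited_patent_number": list(patent_numbers)}


def citation_neighbors(patent_number, fetch_citations):
    """Return (backward, forward) citation lists using two requests.

    `fetch_citations` is a hypothetical callable that sends one query to the
    citations endpoint and returns a list of citation rows as dictionaries.
    """
    backward_rows = fetch_citations(backward_query([patent_number]))
    forward_rows = fetch_citations(forward_query([patent_number]))
    backward = [row["cited_patent_number"] for row in backward_rows]
    forward = [row["patent_number"] for row in forward_rows]
    return backward, forward


# In-memory stand-in for the endpoint, with made-up citation pairs.
CITATIONS = [
    {"patent_number": "B", "cited_patent_number": "A"},
    {"patent_number": "C", "cited_patent_number": "A"},
    {"patent_number": "C", "cited_patent_number": "B"},
]


def fake_fetch(query):
    """Filter the toy citation rows the way the endpoint is assumed to."""
    (field, wanted), = query.items()
    return [row for row in CITATIONS if row[field] in wanted]
```

With the toy data, `citation_neighbors("B", fake_fetch)` finds that B cites A and is cited by C. Titles or other patent details for those numbers would require a further request to the patents endpoint.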

    Bulk Requests

    To support the above changes, the new citations API and the current API will both support a “bulk” request wherein API clients can send up to 1,000 values in either patent number field. The maximum number of patents that can be sent will depend on the mechanism of request (POST vs. GET), and this maximum will be revisited at the end of the pilot phase.
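On the client side, a long list of patent numbers has to be split into batches that respect the bulk limit. The helper below is a small sketch; the default of 1,000 comes from the text above, and the limit is kept as a parameter because the effective maximum depends on the request mechanism and may change after the pilot phase.

```python
def batches(patent_numbers, size=1000):
    """Split patent numbers into lists of at most `size` unique values."""
    if size < 1:
        raise ValueError("size must be at least 1")
    # Drop duplicates while keeping the original order.
    unique = list(dict.fromkeys(patent_numbers))
    return [unique[i:i + size] for i in range(0, len(unique), size)]
```

Each batch can then be sent as one bulk request, with a smaller `size` for GET requests if their limit turns out to be lower.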

    What Else Is New?

    Swagger-based API documentation will be released along with the v0.1 public release. A summary of the changes is as follows:

    • Developers will need to obtain an API key to access the API.
    • Each API key will be allowed 45 requests per minute.
    • GET request format remains unchanged.
    • POST requests will need to send JSON data (instead of string representation of JSON).
    • The response from the server will have the following:
      • an “error” field indicating if the request resulted in an error;
      • X-Status-Reason and X-Status-Reason-Code in case of an error; and
      • Retry-After header in case of throttled requests.
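A client adapting to these changes might look roughly like the sketch below. It builds a POST request whose body is real JSON and works out how long to wait after a throttled response. Several details are assumptions: the `X-Api-Key` header name, the example URL, and the shape of the query are placeholders, so check the Swagger documentation for the actual values. No request is sent here.

```python
import json
import urllib.request

# Spacing that keeps a single key at or under 45 requests per minute.
MIN_INTERVAL = 60 / 45


def build_post(url, api_key, query):
    """Build (but do not send) a POST request with a JSON body."""
    body = json.dumps(query).encode("utf-8")  # JSON data, not a quoted string
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "X-Api-Key": api_key,  # header name is an assumption
        },
    )


def retry_delay(headers, default=MIN_INTERVAL):
    """Seconds to wait before retrying, based on the Retry-After header."""
    value = headers.get("Retry-After")
    try:
        return max(float(value), 0.0)
    except (TypeError, ValueError):
        # Header missing or not a number of seconds (e.g., an HTTP date).
        return default


def error_reason(headers):
    """Return (reason, code) from the error headers, or (None, None)."""
    return headers.get("X-Status-Reason"), headers.get("X-Status-Reason-Code")
```

Waiting at least `MIN_INTERVAL` seconds between calls is one simple way to stay within the per-key limit; honoring `Retry-After` covers the cases where the server throttles anyway.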


    Aug. 1: Citations Endpoints API (v0.1) released to pilot users

    Sept. 1: Citations Endpoints API (v0.1) released to public users


  • What's New with PatentsView

    Last year was amazing for PatentsView. In spite of the difficult, uncertain, and changing world brought on by the global coronavirus pandemic, our team was able to successfully transition to working exclusively online. We send a heartfelt thank you to the thousands of scientists and innovators across the globe, and their upstream suppliers, for developing life-saving vaccines that bring hope back to our communities.

    We are happy to inform everyone that the API querying parameters are being improved to better meet the needs of our users (follow the link for more information about changes to the API). We have also updated the algorithms for data disambiguation based on developments in the field of entity resolution and based on feedback from our PatentsView user community. Our disambiguated inventor data are now linked to our gender attribution results. The table includes a flag for inventors identified as male and an attribution status flag for all inventors. Also, for the first time, the raw gender attribution data are available on our download page.

    Our website and community pages have a new domain, with an updated look and feel. With this fresh start, the bulk data download webpages are now searchable by table. The data dictionaries for our Query Builder tool and Bulk Data Download Tables are also integrated into our webpages and fully searchable. Happily, PGPubs data is out of its beta form! The Bulk Data Download Tables are available for granted patents and pre-granted published patent applications (PGPubs). Among other additions to the PatentsView web environment are detailed reference materials for all PatentsView methods and processes, as well as dedicated topic pages. Our first topic page, called Gender & Innovation, focuses on the participation of women as inventors on patents. Future topics may include innovation around COVID-19, trends in AI (artificial intelligence) patenting, and other topics of interest to our data scientists and user community (email us sometime if you have an idea for a page topic).

    If you have questions or interests to share, please use the community forum and data-in-action features that have been integrated into the newly redesigned website. If you are working on a project or using PatentsView data for another purpose you would like to share, please email us to be featured on our Data in Action Spotlight page. These Data in Action articles will also be highlighted on the PatentsView homepage.

    Our PatentsView team also assisted the U.S. Patent and Trademark Office by hosting the virtual USPTO Symposium on Entity Resolution. The symposium included presentations from researchers and applied experts on entity resolution methods and practices from around the world. There were insightful questions and discussion among the 15 panel presenters and over 100 participants, and we are grateful to everyone who joined. Background articles, presentation slides, and recordings of the symposium are available at:

    With all these amazing improvements there are bound to be challenges and difficulties. Please communicate with us via our contact form or email when you encounter an issue with our webpage or data. As always, we continue to improve our systems, documentation, and communications, so don’t be shy—let us know your feedback.

    Happiness and health to all of you!


    The PatentsView Team

  • Diffusion of Artificial Intelligence Technology

    Throughout recent decades, the role of artificial intelligence (AI) in the modern world and the lives of its inhabitants has increased drastically. From advancements in cybersecurity and military technologies to AI-driven greenhouses and algorithms that help health-care workers develop better treatments, AI has made its way into nearly every sector of society.

    The future of AI innovation and its influence on society will only continue to grow. It is with this in mind that the U.S. Patent and Trademark Office (USPTO), Office of the Chief Economist released a new report, titled Inventing AI: Tracing the diffusion of artificial intelligence with U.S. patents. This report details research conducted by USPTO in which a machine-learning AI algorithm was used to determine the volume, nature, and evolution of AI and its component technologies as contained in U.S. patents from 1976 through 2018.

    A main goal of USPTO’s research was to measure the technology diffusion of AI with patent data. Technology diffusion is the process by which a technology is adopted by inventors, organizations, and other innovators as it spreads across different markets. In this report, USPTO details the methods it developed to identify the scope of such diffusion as it relates to AI and its component technologies.

    Patent data are extremely useful for such an analysis as they can give direct insight into the spread and adaptation of a technology or method. When a new, powerful innovation or technology such as AI is created, the speed at which it is adopted by inventors and organizations alike can partially be seen by the increase in patent applications filed and granted with reference to said technology. Figure 1 shows the growth of AI-related patents as a percentage of all U.S. patents by year. In the figure we can see a dramatic increase in AI patents, from less than 5% in 1980 to greater than 20% in 2018—a truly staggering growth in just under 40 years.

    Figure 1. U.S. Inventor and Owner of AI-Related Patents: Percentages From 1975 to 2020


    In addition to the number of patents filed, patent data are useful in this research because each patent document contains detailed information and metadata. The person or organization that filed the patent, the technological classification of said patent, the location the patent was filed in, and so forth can all be found in one document.

    The role of AI moving forward will be determined by the willingness and ability of inventors to continue working with and innovating on the technologies of today. Although we cannot know for sure just how much of an impact AI will have on our future, research into the scope and diffusion of its technologies can give us a glimpse of what is to come.

    More information on the diffusion of AI as well as the AI method used to identify AI patents is available in the USPTO report.
