  • What's New with PatentsView - July 2022

    A few months ago, our team announced a break in the quarterly data update cycle. This interruption  allowed us to standardize, consolidate, clean, and amplify our current resources across the bulk data download and API products.  

    The bulk data downloads currently comprise more than 100 individual files. These files include disambiguated inventors, assignees, and locations, patent classification lookup tables, government interest information, long-text data and claims, and pre-granted patent application publications. The PatentsView team’s standardization and consolidation efforts aim to lower the barriers to merging these files. Part of this process targets data fields and table structures across the granted patent and pre-granted publications datasets. The standardization will also include updating PatentsView’s naming conventions for USPC and CPC fields to map more closely to United States Patent and Trademark Office (USPTO) naming conventions.

    Data update processes resume this month, with a release anticipated this September. The September update will include data released by USPTO between January and June 2022.

    Since one goal of the API redesign is to mirror the data available in the bulk data downloads, our team is continuing to increase the number of endpoints and data fields accessible through the beta ElasticSearch API. With the September data update, we plan to release all remaining granted patent data that is available in the bulk downloads but not currently available in the MySQL API (e.g., gender, botanic, long-text description fields, and applicants). We will release pre-grant patent publication data and endpoints by the end of the year.

    As always, we welcome and appreciate your feedback and communication with our team. Our email inbox and feedback form are open; we strive to respond to messages within two business days. To ask questions or start a conversation within the PatentsView user community, please see the Forum available on our website. 

    P.S. Here’s a quick note: Our Query Builder tool was supposed to remain available during this time of data product harmonization, but because of new Google email regulations on APIs, we are currently unable to return your datasets via email. We are seeking a solution to this issue and will keep you posted. In the interim, if you require a dataset from the Query Builder, please reach out to our team for guidance.  

  • What’s New with PatentsView – May 2022

    Updates on our tools, website, and upcoming events

    It’s springtime in the United States, the season for growth and change. Our work at PatentsView continues to move ahead; with our growing user base and data team, we are trying new things and exercising our creativity. This can be seen on our Gender & Innovation topic page, which now features an interactive data visualization made by some of our team members using PatentsView data. Also new to the topic page are annualized data files made in partnership with USPTO data scientists. These annual files contain disambiguated information on patents, their assignees, inventors, and inventor gender. Smaller, more malleable data packages are a new thing for us – so let us know what you think!


    The Query Builder, ElasticSearch API, and original MySQL API are up to date with 2021 quarter 4 data from USPTO (data through December 30, 2021) as of March 28, 2022. The PatentsView team has been working to create additional endpoints in the ElasticSearch API (version 0.1), which will eventually replace the MySQL interface for PatentsView data. To date, over 100 beta users are engaging with the ElasticSearch API. Our team is appreciative of the feedback we have received so far and looks forward to additional improvements and additions as the year progresses.

    Data Updates

    The bulk data download tables are still the biggest and most exhaustive collection of raw and processed data that our team has to offer. We monitor for updates to disambiguation methods and algorithms so that we can continue to improve the disambiguation of location, inventor, assignee, and lawyer information. It also helps us to improve when users of PatentsView data share errors they find in disambiguated results with our team – so, thank you, it really does make PatentsView stronger.

    Our next data update will publish in September 2022 and include quarters 1 and 2 of data for 2022. The PatentsView data science team is taking a break from data update activities through the months April to July to devote all resources toward making planned structural changes to the bulk data download and API products we offer. The next communication from our team will outline these structural changes and how this may affect bulk data download and API users.

    During this four-month period, the data available from the API(s) and Query Builder functions will also remain up to date through December 30, 2021. All PatentsView features will be updated in September with data through June 30, 2022.

    Upcoming Events

    PatentsView, with support from AIR, is hosting the United States Patent and Trademark Office’s Gender and Race Attribution Symposium in August 2022. The symposium is a full-day virtual event that aims to bring together computer scientists, information scientists, economists, and others to discuss state-of-the-art approaches to and current applications of name-to-gender and name-to-race attribution algorithms. The symposium will review methods and applications to provide an overview of current approaches from leading scholars in the field, and to build knowledge, identify a community of practitioners, and facilitate the application of common approaches. Stay tuned for the save the date announcement from our team!

  • Releasing Annualized Data Files for Patents, Assignees, and Inventor Gender

    We are pleased to release annual datasets for exploring patents and their inventors and companies. The annual datasets are csv files small enough for users without data science or coding knowledge to manipulate and view information about the inventors and company(ies) associated with every granted patent, including the gender of inventors generated by our gender attribution algorithm. These annual datasets are constructed around patent grant year, and combine corresponding company (assignee) and inventor information for each patent granted between 1976 and 2020. We constructed these datasets to improve user access to PatentsView data, especially for users interested in women inventors.

    So, what’s in the new dataset?

    To create the annualized data, we combined the following list of datasets available from the bulk download page for granted patents. We did not utilize all the fields from each, but selected fields to generate these smaller files. A data dictionary of all the variables within the annualized datasets is available with the downloads for more information on what fields they contain.

    1. rawinventor: includes information on the unique id associated with each patent and inventor.
    2. inventor: includes information on the unique first and last name of each inventor, a flag to indicate gender of each inventor, and the inventor id from the rawinventor dataset.
    3. patent: includes information about the patent grant date.
    4. patent_assignee: is a cross-walk that gives the unique assignee id associated with each patent.
    5. assignee: identifies the unique location id for each assignee and the unique assignee name.
    6. location: contains information about the country, city and state, and county of the assignee if within the U.S.
    7. application: links the application id to the granted patent id, and gives the application year.
    8. ipcr: includes information on the ipc/cpc section (technology field) of the granted patent.
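
    As a rough illustration of how these tables relate to one another, the sketch below joins a few in-memory rows on their id fields. The column names (patent_id, assignee_id, location_id, and so on) and the sample values are assumptions made for illustration; the data dictionary lists the actual field names.

```python
# Hypothetical miniature versions of four bulk tables; real files are far
# larger and their column names may differ (see the data dictionary).
patent = [{"patent_id": "P1", "grant_date": "2020-03-10"}]
patent_assignee = [{"patent_id": "P1", "assignee_id": "A1"}]
assignee = [{"assignee_id": "A1", "name": "Example Corp", "location_id": "L1"}]
location = [{"location_id": "L1", "country": "US", "state": "CA", "city": "San Jose"}]


def index_by(rows, key):
    """Build a lookup from a unique id to its row."""
    return {row[key]: row for row in rows}


def join_patent_assignee_location(patent, patent_assignee, assignee, location):
    """Follow the crosswalk patent -> assignee -> location, one row per link."""
    patents = index_by(patent, "patent_id")
    assignees = index_by(assignee, "assignee_id")
    locations = index_by(location, "location_id")
    joined = []
    for link in patent_assignee:
        p = patents.get(link["patent_id"])
        a = assignees.get(link["assignee_id"])
        if p is None or a is None:
            continue  # skip crosswalk rows with no matching record
        loc = locations.get(a["location_id"], {})
        joined.append({
            "patent_id": p["patent_id"],
            "grant_year": int(p["grant_date"][:4]),
            "assignee_name": a["name"],
            "assignee_country": loc.get("country"),
        })
    return joined


rows = join_patent_assignee_location(patent, patent_assignee, assignee, location)
```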


    A couple notes on using the new dataset:

    • As you become acquainted with the data, you may notice that inventor fields are provided only for the first nine inventors listed on each patent. Of course, some patents have more than nine inventors. The maximum number of inventors for a patent recorded within the PatentsView dataset is 123. However, 99% of patents have nine inventors or fewer. To make the datasets more user-friendly, we truncated the data to include only the first nine inventors. If you are interested in acquiring the names and gender information of these very large inventor teams, please refer to the original bulk data.
    • Some patents appear more than once because they are granted to more than one company. We did this to make it easier for an individual to, for example, search for all patents and inventor genders associated with a certain company.
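
    The two notes above can be pictured with a small sketch: each patent becomes one wide row per assignee, and each row carries at most nine inventor slots. The field names, the "F"/"M" gender coding, and the choice to emit a single row for a patent with no assignee are all assumptions for illustration, not the actual layout of the annual files.

```python
MAX_INVENTORS = 9  # the annual files keep only the first nine inventors


def annualized_rows(patent_id, inventors, assignees):
    """Return one wide row per assignee, each carrying up to nine inventors.

    `inventors` is an ordered list of (name, gender_flag) tuples and
    `assignees` is a list of company names; both layouts are assumptions.
    """
    kept = inventors[:MAX_INVENTORS]
    # Pad so every row has the same nine inventor slots.
    padded = kept + [(None, None)] * (MAX_INVENTORS - len(kept))
    rows = []
    for company in assignees or [None]:
        row = {"patent_id": patent_id, "assignee": company}
        for slot, (name, gender) in enumerate(padded, start=1):
            row[f"inventor_name_{slot}"] = name
            row[f"inventor_gender_{slot}"] = gender
        rows.append(row)
    return rows


# An eleven-person team on a patent granted to two companies.
team = [(f"Inventor {i}", "F" if i % 2 else "M") for i in range(1, 12)]
rows = annualized_rows("P1", team, ["Company A", "Company B"])
```

    Here the patent appears twice (once per company), and inventors ten and eleven are dropped, which is why the original bulk data is needed for very large teams.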



    We used the annualized data to generate the following graphs and to illustrate some of the potential uses for this release. In figure 1, we tally the number of patents for each country that have companies/assignees from different countries to define the international collaboration that occurred in 2020. The United States had the most international collaborations in 2020 with 1,828 patents. Behind the U.S., China and Germany were found to have the second- and third-most international collaborations.

    Figure 1. Bar graph listing the top 20 countries that collaborated internationally in 2020
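
    A tally like the one behind Figure 1 might be computed along the following lines. This is a minimal sketch that assumes the input is a list of (patent id, assignee country) pairs; the sample pairs are invented and the function name is hypothetical.

```python
from collections import Counter, defaultdict


def international_collaborations(patent_country_pairs):
    """Count, per country, patents whose assignees span more than one country."""
    countries_by_patent = defaultdict(set)
    for patent_id, country in patent_country_pairs:
        countries_by_patent[patent_id].add(country)

    tally = Counter()
    for countries in countries_by_patent.values():
        if len(countries) > 1:  # assignees from at least two countries
            tally.update(countries)
    return tally


pairs = [
    ("P1", "US"), ("P1", "DE"),   # U.S.-Germany collaboration
    ("P2", "US"), ("P2", "US"),   # two U.S. assignees: not international
    ("P3", "US"), ("P3", "CN"),   # U.S.-China collaboration
]
tally = international_collaborations(pairs)
```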


    Among the 1,828 U.S.-international collaborations, we see from the next figure (Figure 2) that 273 of them were collaborations with companies in Germany, the most frequent international collaborator of the U.S. China and Japan are not far behind with 249 and 236 collaborations, respectively, in 2020.

    Figure 2. Top 22 United States collaborators by assignee country in 2020


    Last, we calculate and rank the women’s inventor rate (WIR) by country of assignee. The following graph (Figure 3) shows the countries in the top 25 percent in terms of the number of granted patents in 2020. As you can see, Taiwan is firmly in first place with 32.2% of inventors identified as women in 2020. In second place is Spain with 22.6%, followed by France with 17.1%. The U.S. is just above the middle of the pack at 13.1%, with several European and Asian countries ahead of it.

    Figure 3. Bar graph showing the top 25% of countries with high WIR in their assignees
    *Note: Some names are more difficult to attribute with gender than others. There is particular difficulty in attributing gender to Asian names. For these reasons we have eliminated countries with an attribution rate of less than 90% as the WIR is deemed less reliable. Specifically, this eliminates China, The Cayman Islands, South Korea, India, Hong Kong, Singapore and Saudi Arabia from this graph. This information is available in the annual datasets.  
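
    The ranking and the 90% attribution cutoff described in the note could be sketched as follows. The exact WIR formula is not spelled out here, so this sketch assumes WIR is the number of inventors attributed as women divided by the number of inventors with any attributed gender; the "F"/"M"/None coding is likewise an assumption.

```python
MIN_ATTRIBUTION_RATE = 0.90  # countries below this are dropped, per the note


def women_inventor_rates(inventor_records):
    """Return {country: WIR} for countries with enough attributed genders.

    Each record is (country, gender) where gender is "F", "M", or None when
    no gender could be attributed (this coding is an assumption).
    """
    totals, attributed, women = {}, {}, {}
    for country, gender in inventor_records:
        totals[country] = totals.get(country, 0) + 1
        if gender is not None:
            attributed[country] = attributed.get(country, 0) + 1
        if gender == "F":
            women[country] = women.get(country, 0) + 1

    rates = {}
    for country, total in totals.items():
        n_attributed = attributed.get(country, 0)
        if n_attributed / total < MIN_ATTRIBUTION_RATE:
            continue  # WIR deemed unreliable for this country
        rates[country] = women.get(country, 0) / n_attributed
    return rates


# Invented records: country "AA" is fully attributed, "BB" only half.
records = (
    [("AA", "F")] * 2 + [("AA", "M")] * 8
    + [("BB", "F")] * 3 + [("BB", "M")] * 2 + [("BB", None)] * 5
)
rates = women_inventor_rates(records)
ranked = sorted(rates, key=rates.get, reverse=True)
```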


  • What's New with PatentsView - October 2021

    Who are the creative thinkers behind the latest innovations?

    Patenting is an evolving and informative phenomenon. Patenting rates are subject to the ebbs and flows of the financial economy, the devastation of natural disasters, and changes in the labor force (think World Wars and women leaving the home for the workplace). Research on patents, their inventors and owners is expansive and covers the entire world because, beyond what is being created and imagined, patenting has as much to do with the circumstances inventors face as it does with ingenuity.   

    We know of some classic examples, those famous around the world for their inventions that changed our daily lives and possibly the trajectory of humankind. After all, where would we be without the electricity (thank you, Alessandro Volta, Italy) that allows us to work day and night in the name of productivity? Veiled Marxism and broad brushstrokes aside, the subsequent inventions that harnessed the power of Volta’s discovery (Thomas Edison, USA and Joseph Swan, UK) expanded the number of hours in the day that people were able to be awake, active, and creating – a history-altering impact on civilization in pursuit of industrialization. The American inventor Thomas Edison had over one thousand patents in his name and spent much of his later life inventing fulltime in laboratories with other American inventors of the period including the division-of-labor transformer, Henry Ford. During these prolific years in Edison’s life, his home and children were tended by his first wife, Mary (married in 1871 until her death in 1884), and second wife, Mina (married in 1886).

    Between 1790, the year U.S. law changed to allow women to own intellectual property, and 1840, only twenty women had patents under their names at the United States patent office (source). Over the next century and a half, women inventor rates steadily increased, accelerated by their movement out of traditional home-related tasks and into higher education systems and workplaces. One such inventor, Grace Hopper (1906-1992), was one of the first women to graduate from Yale University with a PhD in Mathematics (1934), and in the 1950s she helped create the computer language COBOL, Common Business-Oriented Language. Hopper’s COBOL is still used today in business, administration, and finance operations for companies and governments. Hopper did not own a patent on COBOL as it was created before computer software technology was a patentable field of invention, another tribute to the changes in the U.S. patenting system. From the single patent filed by a woman inventor in 1804 to the thousands of women inventors in the U.S. patent system today, something massive has changed. In no small part, that massive change reflects a shift in the culture of the United States and beyond.

    We know from gender attribution algorithms used by researchers around the world that patenting rates are still not at parity (equal proportions) between men and women inventors despite the percentages of men and women being roughly equal in the world. Perhaps the patenting “playing field” is still leveling out after hundreds of years of differential opportunities, or perhaps there are new barriers to inventing today that prohibit some groups of people from creating at full “Edisonian capacity.”

    Entering this new millennium, patenting researchers are delving into the who of inventorship more than ever in the hopes of unlocking and supporting the next wave of innovation worldwide. Here at PatentsView, our team of developers, data scientists, and social science researchers are working together to bring the latest breakthroughs in disambiguation and attribution science to the public. After a summertime of doubling down on disambiguation algorithms and amping up on gender attribution efforts with our partners, PatentsView is closer than ever to the latest information on who is inventing.

    Part of the PatentsView mission is to deliver the best data we can as efficiently as we can. In this spirit, we are redesigning the application programming interface (API) to our database. The goal of the redesign is to allow for faster query results and a simplified querying mechanism. For more information on the API redesign, please visit this link.

    Want to explore gender data further? You can access our latest gender attribution data in the inventor table under bulk data downloads. Reference the data dictionary to see the variables available in the inventor table. To investigate where inventors are patenting from, use the Query Builder tool or write a script to query the API for the “last known location” field. What are you investigating with the help of PatentsView data this season? We’d love to know.

  • What's New With PatentsView - July 2021

    Since our last data update, PatentsView data scientists and developers have been hard at work rewriting disambiguation algorithms and streamlining our data pipeline processes for smoother and more replicable update cycles in future months and years to come. With this latest update, which includes patent data through March 30, 2021, we are now two full update cycles into use of our revised algorithms for disambiguating data. For more information on data changes, please visit our release notes page.

    As the data sets get larger and more complex and as new fields and attributes are added to the PatentsView database, our servers, domains, and other hardware must also be upgraded to continue to support our work. Our latest upgrade is the PatentsView application programming interface (affectionately known as the API). The PatentsView API serves 3,000–300,000 requests every day. While a majority of these requests succeed, over the past few years the number of requests that fail has increased due to the size of the data sets. To address this and to stay up to date with industry standards, PatentsView has begun the process of redesigning the API.

    For more information about API changes, please read on.

    API Redesign

    Design Goals

    • Enable a search-centric approach to the API rather than a querying/filter-based approach.
    • Achieve response times in the range of seconds rather than minutes.
    • Improve user experience by limiting the number and size of individual API requests from the server.
    • Align the API design with industry standards in terms of request and response format, headers, and documentation.


    The technology and design choices for the new API were made with the above goals in mind. The v0.1 API will apply this approach to a narrow scope of patent citations and application citations. As a result, the corresponding fields in the current API, shown below, will be discontinued.

    Discontinued Fields

    API Field Name | Group | Common Name | Type | Query | Description
    appcit_app_number | application_citations | Application Number | string | Y | Application ID (issued by USPTO) for application cited by the selected patent
    appcit_category | application_citations | Entity Category | string | Y | Entity that cited an application in the selected patent
    appcit_date | application_citations | Filing Date | date | Y | Filing date for application cited in the selected patent
    appcit_kind | application_citations | Kind Code | string | Y | Patent kind code of application cited by patent
    appcit_sequence | application_citations | Sequence | integer | N | Order in which a citation is cited by patent
    cited_patent_category | cited_patents | Patent Category | string | Y | Category of cited patent
    cited_patent_date | cited_patents | Patent Date | date | Y | Grant date of cited patent
    cited_patent_kind | cited_patents | Patent Kind | string | Y | Patent kind of cited patent (see patent_kind for details)
    cited_patent_number | cited_patents | Patent Number | string | Y | Patent number of cited patent
    cited_patent_sequence | cited_patents | Patent Sequence | string | N | Order in which patent is cited by the selected patent
    cited_patent_title | cited_patents | Patent Title | string | Y | Title of cited patent
    citedby_patent_category | citedby_patents | Patent Category | string | Y | Category of citing patent
    citedby_patent_date | citedby_patents | Patent Date | date | Y | Grant date of patent citing the selected patent
    citedby_patent_kind | citedby_patents | Patent Kind | string | Y | Patent kind of citing patent (see patent_kind for details)
    citedby_patent_number | citedby_patents | Patent Number | string | Y | Patent number of citing patent
    citedby_patent_title | citedby_patents | Patent Title | string | Y | Title of citing patent

    New API Fields

    Patent Citation Endpoint

    API Field Name | Group | Common Name | Type | Description
    patent_number | patent_citations | Patent Number | string | Patent of interest
    cited_patent_number | patent_citations | Cited Patent Number | string | Patent number cited by patent of interest (i.e., backward citation)
    citation_category | patent_citations | Citing Entity Type | string | Entity type (e.g., examiner, applicant, etc.) that made the citation on the patent of interest
    citation_date | patent_citations | Patent Date | date | Grant date of the cited patent
    citation_sequence | patent_citations | Patent Sequence | string | Order in which the cited patent is listed on the patent of interest

    Application Citation Endpoint

    API Field Name | Group | Common Name | Type | Description
    patent_number | application_citations | Patent Number | string | Patent of interest
    cited_application_number | application_citations | Cited Application Number | string | Application number of the application cited by patent of interest
    citation_category | application_citations | Citing Entity Type | string | Entity type (e.g., examiner, applicant, etc.) that made the citation on the patent of interest
    citation_date | application_citations | Filing Date | date | Filing date for application cited on the patent of interest
    citation_sequence | application_citations | Sequence | integer | Order in which the cited application is listed on the patent of interest


    To achieve the design goals related to performance, the scope of the citations endpoint has been reduced, as outlined below.

    1. Patent Fields

    What has changed: Patent-related information such as patent title, patent type, patent kind, etc., will not be available in the citations endpoint.

    How this affects users: API clients will need to make two requests, one to the citations endpoint to obtain the patent numbers and a second to the patent endpoint to get the patent-related information.
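
    On the client side, the two-request pattern amounts to a lookup followed by a join. The sketch below uses in-memory stand-ins for the two responses rather than real network calls; the response shapes, the patent_title field on the patent endpoint, and the patent numbers are assumptions for illustration.

```python
def enrich_citations(citation_rows, patent_rows):
    """Attach patent details to each backward citation.

    `citation_rows` stands in for a citations endpoint response and
    `patent_rows` for a patent endpoint response; both shapes are assumed.
    """
    details = {row["patent_number"]: row for row in patent_rows}
    enriched = []
    for cite in citation_rows:
        info = details.get(cite["cited_patent_number"], {})
        enriched.append({**cite, "cited_patent_title": info.get("patent_title")})
    return enriched


# Step 1: stand-in for the citations endpoint response (numbers only).
citations = [
    {"patent_number": "10000000", "cited_patent_number": "7000000"},
    {"patent_number": "10000000", "cited_patent_number": "8000000"},
]
# Step 2: the numbers a client would send to the patent endpoint next.
numbers_to_look_up = sorted({c["cited_patent_number"] for c in citations})
# Stand-in for the patent endpoint response (one number had no match).
patents = [{"patent_number": "7000000", "patent_title": "Example title"}]
enriched = enrich_citations(citations, patents)
```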

    2. Citedby and Cited Patents

    What has changed: Previously, users were able to send a patent number (or other queries) and obtain patent numbers that cite the requested patent (called forward citations) as well as the patent numbers that the requested patent has cited (called backward citations). With the new API, users will only be able to obtain patent numbers that the requested patent has cited (i.e., backward citations).

    How this affects users: API clients will need to send two requests:

    • once with the patent numbers of interest in the “patent_number” field to get the list of patents that the requested patent has cited (i.e., backward citations); and
    • again with patent numbers of interest in the “cited_patent_number” field to get the list of patents that cite the requested patent number (i.e., forward citations).
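
    The two requests described above differ only in which field carries the patent number. The sketch below builds both query objects; the exact query syntax of the v0.1 API is not described here, so the simple field-to-value form is an assumption based on the field names listed earlier, and the patent number is a placeholder.

```python
import json


def citation_queries(patent_number):
    """Build the two query objects: backward first, then forward citations."""
    backward = {"patent_number": patent_number}        # patents it cites
    forward = {"cited_patent_number": patent_number}   # patents citing it
    return backward, forward


backward, forward = citation_queries("10000000")
# Serialized queries a client might send, one per request.
payloads = [json.dumps(q) for q in (backward, forward)]
```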

    Bulk Requests

    To support the above changes, the new citations API and the current API will both support a “bulk” request wherein API clients can send up to 1,000 values in either patent number field. The maximum number of patents that can be sent will depend on the mechanism of request (POST vs. GET), and this maximum will be revisited at the end of the pilot phase.
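
    A client with more than 1,000 patent numbers would need to split them across several bulk requests. A minimal batching sketch follows; the request body shape (a list of values under the patent_number field) is an assumption, and the limit may change after the pilot phase.

```python
MAX_VALUES_PER_REQUEST = 1000  # upper bound stated for bulk requests


def batches(values, size=MAX_VALUES_PER_REQUEST):
    """Yield consecutive slices of `values`, each with at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(values), size):
        yield values[start:start + size]


# 2,500 placeholder patent numbers split into three request bodies.
patent_numbers = [str(n) for n in range(10_000_000, 10_002_500)]
bodies = [{"patent_number": chunk} for chunk in batches(patent_numbers)]
```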

    What Else Is New?

    Swagger-based API documentation will be released along with the v0.1 public release. A summary of the changes is as follows:

    • Developers will need to obtain an API key to access the API.
    • Each API key will be allowed 45 requests per minute.
    • GET request format remains unchanged.
    • POST requests will need to send JSON data (instead of string representation of JSON).
    • The response from the server will have the following:
      • an “error” field indicating if the request resulted in an error;
      • X-Status-Reason and X-Status-Reason-Code in case of an error; and
      • Retry-After header in case of throttled requests.
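
    The changes listed above suggest a client that sends JSON bodies, paces itself under 45 requests per minute, and honors Retry-After when throttled. The sketch below only constructs a request and computes a wait time; it sends nothing. The URL and the API key header name are placeholders, not documented values, so substitute whatever the Swagger documentation specifies.

```python
import json
import urllib.request

REQUESTS_PER_MINUTE = 45
MIN_INTERVAL = 60 / REQUESTS_PER_MINUTE  # about 1.33 seconds between requests

# Placeholders, not real values: substitute the documented URL and header name.
API_URL = "https://api.example.invalid/citations"
API_KEY_HEADER = "X-Api-Key"


def build_post(query, api_key):
    """Create a POST request whose body is JSON data, not a stringified form value."""
    body = json.dumps(query).encode("utf-8")
    headers = {"Content-Type": "application/json", API_KEY_HEADER: api_key}
    return urllib.request.Request(API_URL, data=body, headers=headers, method="POST")


def seconds_to_wait(response_headers, default=MIN_INTERVAL):
    """Honor Retry-After (in seconds) when present, else keep the steady pace."""
    value = response_headers.get("Retry-After")
    try:
        return max(float(value), 0.0) if value is not None else default
    except ValueError:
        return default  # e.g. an HTTP-date value, which this sketch ignores


request = build_post({"patent_number": "10000000"}, api_key="YOUR_KEY")
wait = seconds_to_wait({"Retry-After": "30"})
```

    A caller would sleep for the returned number of seconds before the next request; spacing requests by MIN_INTERVAL keeps a single key under the stated per-minute allowance.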


    Aug. 1: Citations Endpoints API (v0.1) released to pilot users

    Sept. 1: Citations Endpoints API (v0.1) released to public users

