🎨 IMPACT: A Large-scale Integrated Multimodal Patent Analysis and Creation Dataset for Design Patents

We introduce IMPACT (Integrated Multimodal Patent Analysis and CreaTion Dataset) for Design Patents.

Check out our paper on OpenReview: https://openreview.net/forum?id=l0Ydsl10ci 🔥

✒️ It is a large-scale multimodal patent dataset with detailed captions for design patent figures.

💥 Our dataset includes half a million design patents comprising 3.61 million figures with captions, drawn from patents granted by the United States Patent and Trademark Office (USPTO) over a 16-year period from 2007 to 2022.

Data

📗 The dataset can be viewed and downloaded from the Hugging Face Hub (dataset repo AI4Patents/IMPACT).

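import os
import shutil
from huggingface_hub import hf_hub_download

CSV_FILE = '2022.csv'
TARGET_DIR = 'data'  # local folder for the downloaded file
os.makedirs(TARGET_DIR, exist_ok=True)
# Download into the Hugging Face cache, then copy the file into TARGET_DIR
path = hf_hub_download(repo_id='AI4Patents/IMPACT', filename=CSV_FILE, repo_type="dataset")
destination = os.path.join(TARGET_DIR, CSV_FILE)
shutil.copy(path, destination)  # copy rather than rename so the cache entry stays valid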
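To fetch several years at once, the single-file snippet above can be wrapped in a loop. The following is a minimal sketch, not part of the repository: it assumes the dataset repo stores one CSV per grant year named `<year>.csv` (only `2022.csv` is confirmed by the snippet above), and `year_csv_names` and `download_years` are hypothetical helper names.

```python
import os
import shutil


def year_csv_names(first_year=2007, last_year=2022):
    """Return candidate CSV names, ASSUMING one '<year>.csv' file per grant year."""
    return [f"{year}.csv" for year in range(first_year, last_year + 1)]


def download_years(filenames, target_dir="data", repo_id="AI4Patents/IMPACT"):
    """Download each named file from the dataset repo and copy it into target_dir."""
    # Imported lazily so the naming helper above works without huggingface_hub.
    from huggingface_hub import hf_hub_download

    os.makedirs(target_dir, exist_ok=True)
    saved = []
    for name in filenames:
        cached = hf_hub_download(repo_id=repo_id, filename=name, repo_type="dataset")
        destination = os.path.join(target_dir, name)
        shutil.copy(cached, destination)  # copy so the Hub cache stays intact
        saved.append(destination)
    return saved
```

For example, `download_years(year_csv_names(2021, 2022))` would request `2021.csv` and `2022.csv`. If a file name does not exist in the repo, `hf_hub_download` raises an error, so it is worth checking the file list on the dataset page first.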

Patent Classification

python classification.py

PatentCLIP and multimodal retrieval tasks

🔥 PatentCLIP is based on CLIP. We use the open-source open_clip implementation for fine-tuning and inference.

PatentCLIP with IMPACT dataset

Please download the train and validation sets.

🤗 PatentCLIP-ViT-B checkpoint

🤗 PatentCLIP-Title-ViT-B checkpoint

Usage

Load a PatentCLIP model:

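import torch
import open_clip
device = 'cuda' if torch.cuda.is_available() else 'cpu'  # fall back to CPU without a GPU
model, _, preprocess = open_clip.create_model_and_transforms('hf-hub:ellen625/PatentCLIP_ViT_B', device=device)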
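Once the model is loaded, text-image retrieval amounts to embedding the query text and the candidate figures, then ranking figures by cosine similarity. The sketch below is illustrative rather than the repository's own demo code: `rank_images` and `embed_with_patentclip` are hypothetical helper names, and the embedding step assumes `torch`, `open_clip`, and Pillow are installed.

```python
import numpy as np


def rank_images(text_embedding, image_embeddings):
    """Rank images by cosine similarity to one text embedding (best first)."""
    text = np.asarray(text_embedding, dtype=np.float64)
    images = np.asarray(image_embeddings, dtype=np.float64)
    # Normalize so the dot product equals cosine similarity.
    text = text / np.linalg.norm(text)
    images = images / np.linalg.norm(images, axis=1, keepdims=True)
    scores = images @ text
    return np.argsort(-scores), scores


def embed_with_patentclip(image_paths, query, device="cpu"):
    """Embed figures and one text query with the PatentCLIP checkpoint."""
    # Heavy dependencies are imported lazily; rank_images only needs NumPy.
    import torch
    import open_clip
    from PIL import Image

    name = "hf-hub:ellen625/PatentCLIP_ViT_B"
    model, _, preprocess = open_clip.create_model_and_transforms(name, device=device)
    tokenizer = open_clip.get_tokenizer(name)
    model.eval()
    images = torch.stack(
        [preprocess(Image.open(p).convert("RGB")) for p in image_paths]
    ).to(device)
    tokens = tokenizer([query]).to(device)
    with torch.no_grad():
        image_features = model.encode_image(images)
        text_features = model.encode_text(tokens)
    return text_features[0].cpu().numpy(), image_features.cpu().numpy()
```

A typical call would pass the two arrays returned by `embed_with_patentclip` straight into `rank_images`; the first returned array lists figure indices from most to least similar.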

Demo on PatentCLIP and text-image retrieval

Text-image retrieval with PatentCLIP

Multimodal retrieval results

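| Setting | Dataset | Backbone | Text-Image R@1 | Text-Image R@5 | Text-Image R@10 | Image-Text R@1 | Image-Text R@5 | Image-Text R@10 |
|---|---|---|---|---|---|---|---|---|
| Zero-shot | Image-Title | ResNet50 | 0.52 | 2.10 | 3.32 | 0.20 | 0.72 | 1.64 |
| Zero-shot | Image-Title | ResNet101 | 1.02 | 3.20 | 4.72 | 0.30 | 0.82 | 1.28 |
| Zero-shot | Image-Title | ViT-B-32 | 1.06 | 3.54 | 5.56 | 0.38 | 1.62 | 2.60 |
| Zero-shot | Image-Title | ViT-L-14 | 2.78 | 7.38 | 10.40 | 1.16 | 4.30 | 7.32 |
| Zero-shot | Image-Caption | ResNet50 | 0.82 | 2.52 | 4.08 | 0.78 | 2.32 | 3.48 |
| Zero-shot | Image-Caption | ResNet101 | 1.44 | 4.52 | 6.48 | 0.98 | 2.98 | 4.96 |
| Zero-shot | Image-Caption | ViT-B-32 | 1.98 | 5.24 | 7.42 | 1.06 | 4.26 | 6.32 |
| Zero-shot | Image-Caption | ViT-L-14 | 4.46 | 10.74 | 15.16 | 3.42 | 8.90 | 12.88 |
| Finetuned | Image-Caption | ResNet50 | 5.38 | 15.52 | 22.70 | 5.90 | 16.60 | 23.86 |
| Finetuned | Image-Caption | ResNet101 | 7.44 | 20.60 | 28.48 | 7.02 | 19.70 | 27.58 |
| Finetuned | Image-Caption | ViT-B-32 | 10.24 | 25.56 | 35.06 | 9.88 | 25.90 | 35.08 |
| Finetuned | Image-Caption | ViT-L-14 | 20.58 | 43.14 | 53.00 | 20.44 | 42.34 | 52.56 |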
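The R@K columns report Recall@K. As a minimal sketch of how such a number can be computed, assuming the standard definition (the share of queries whose correct match appears in the top K results), exactly one correct candidate per query, and results expressed as percentages, none of which this README states explicitly: `recall_at_k` is a hypothetical helper, and row `i` of the similarity matrix is assumed to be the query whose correct candidate is column `i`.

```python
import numpy as np


def recall_at_k(similarity, k):
    """Percentage of queries whose true match (same index) ranks in the top k."""
    similarity = np.asarray(similarity, dtype=np.float64)
    # The score of the correct candidate for each query sits on the diagonal.
    true_scores = np.diag(similarity)[:, None]
    # Rank = number of candidates scoring strictly higher than the true match.
    ranks = (similarity > true_scores).sum(axis=1)
    return 100.0 * float((ranks < k).mean())
```

With a text-by-image similarity matrix, `recall_at_k(sim, 5)` would correspond to Text-Image R@5; passing the transpose gives the Image-Text direction.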

Acknowledgement

open_clip: the codebase PatentCLIP is built on.
LLaVA: used for caption generation.

Citation

If you use the code or data in this repo for your work, please consider citing our paper and starring this repo:

@inproceedings{patent2024impact,
    title={{IMPACT}: A Large-scale Integrated Multimodal Patent Analysis and Creation Dataset for Design Patents},
    author={Homaira Huda Shomee and Zhu Wang and Sathya N. Ravi and Sourav Medya},
    booktitle={The Thirty-eight Conference on Neural Information Processing Systems Datasets and Benchmarks Track},
    year={2024},
    url={https://openreview.net/forum?id=l0Ydsl10ci}
    }
