A Python command-line tool to fetch research papers from PubMed and identify papers with authors affiliated with pharmaceutical or biotech companies.
- Search PubMed using its full query syntax
- Identify authors with non-academic affiliations
- Export results to CSV with detailed paper information
- Command-line interface with multiple options
- Type-hinted Python code for better maintainability
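As an illustration of the affiliation check listed above, here is a minimal sketch of a keyword heuristic. The keyword lists and the `is_non_academic` function are hypothetical and chosen for the example; the tool's actual rules may differ.

```python
import re

# Hypothetical keyword lists for illustration; the tool's real rules may differ.
ACADEMIC_HINTS = ("university", "college", "institute", "school", "hospital", "academy")
COMPANY_HINTS = (
    "inc", "ltd", "llc", "gmbh", "corp",
    "pharma", "pharmaceutical", "pharmaceuticals", "biotech", "therapeutics",
)


def is_non_academic(affiliation: str) -> bool:
    """Return True when an affiliation string looks like a company."""
    # Compare whole lowercase words so "inc" does not match inside "Princeton".
    words = set(re.findall(r"[a-z]+", affiliation.lower()))
    # Academic hints win: an affiliation naming a university is treated as academic.
    if any(hint in words for hint in ACADEMIC_HINTS):
        return False
    return any(hint in words for hint in COMPANY_HINTS)


print(is_non_academic("Pfizer Inc., New York, NY, USA"))
print(is_non_academic("Department of Biology, Stanford University, CA"))
```

A keyword approach like this is simple but imperfect: a company whose affiliation string contains none of the listed words is classified as academic by default.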
- Python 3.13 or higher
- Poetry for dependency management
- Clone the repository:
git clone https://github.com/BiTsY2K/PubMedFetcher.git
cd PubMedFetcher
- Install dependencies using Poetry:
poetry install
This will install all required dependencies:
- requests (>=2.32.3)
- pandas (>=2.2.3)
- logging (>=0.4.9.6); note that the logging module is also part of the Python standard library
Run the tool from the command line with the get-papers-list command that Poetry installs:
poetry run get-papers-list "<your-search-query>"
- -h, --help: Display usage instructions
- -d, --debug: Print debug information during execution
- -f, --file FILENAME: Specify output file path (CSV format)
  - If not provided, results will be printed to console
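The options above could be declared with the standard library's `argparse` roughly as follows. This is a sketch, not the tool's actual parser; `build_parser` and the help strings are assumptions made for the example.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented options (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        prog="get-papers-list",
        description="Fetch PubMed papers with non-academic authors.",
    )
    # argparse adds -h/--help automatically.
    parser.add_argument("query", help="PubMed search query")
    parser.add_argument(
        "-d", "--debug", action="store_true",
        help="Print debug information during execution",
    )
    parser.add_argument(
        "-f", "--file", metavar="FILENAME", default=None,
        help="Output CSV path; results are printed to the console when omitted",
    )
    return parser


args = build_parser().parse_args(["CRISPR", "-f", "results.csv"])
print(args.query, args.debug, args.file)
```

Leaving `--file` as `None` by default gives the calling code a simple way to decide between writing a file and printing to the console.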
- Basic search with output to console:
poetry run get-papers-list "cancer immunotherapy"
- Search with file output:
poetry run get-papers-list "CRISPR" -f results.csv
- Search with debug information:
poetry run get-papers-list "antibody development" -d -f results.csv
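To show how a query string like the ones above might reach PubMed, the sketch below builds a request URL for the NCBI E-utilities `esearch` endpoint. It assumes the tool talks to E-utilities, which this README does not state explicitly; `build_search_url` and the `max_results` default are hypothetical. The URL is built with the standard library so the example runs without network access.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint (assumed to be what the tool queries).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_search_url(query: str, max_results: int = 20) -> str:
    """Return an esearch URL for a PubMed query (hypothetical helper)."""
    params = {
        "db": "pubmed",         # search the PubMed database
        "term": query,          # the query, in PubMed's own syntax
        "retmode": "json",      # ask for a JSON response
        "retmax": max_results,  # cap the number of returned IDs
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


print(build_search_url("cancer immunotherapy"))
```

The same `params` dictionary could be passed to `requests.get(ESEARCH_URL, params=params)` to perform the actual request.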
The tool generates a CSV file with the following columns:
- PubmedID: Unique identifier for the paper
- Title: Title of the paper
- Publication Date: Date the paper was published
- Non-academic Author(s): Names of authors affiliated with non-academic institutions
- Company Affiliation(s): Names of pharmaceutical/biotech companies
- Corresponding Author Email: Email address of the corresponding author
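The column layout above can be sketched with the standard library's `csv` module. The tool itself lists pandas for CSV export, so this is only an illustration of the output shape; the `papers_to_csv` helper, the placeholder record, the date format, and the "; " separator for multiple authors are all assumptions.

```python
import csv
import io

COLUMNS = [
    "PubmedID",
    "Title",
    "Publication Date",
    "Non-academic Author(s)",
    "Company Affiliation(s)",
    "Corresponding Author Email",
]


def papers_to_csv(papers: list[dict[str, str]]) -> str:
    """Serialize paper records to CSV text with the documented columns."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(papers)
    return buffer.getvalue()


# Placeholder record for illustration only; not real PubMed data.
sample = [{
    "PubmedID": "00000000",
    "Title": "Example title",
    "Publication Date": "2024-01-15",
    "Non-academic Author(s)": "A. Author; B. Author",
    "Company Affiliation(s)": "Example Pharma Inc.",
    "Corresponding Author Email": "author@example.com",
}]
print(papers_to_csv(sample))
```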
pubmedfetcher/
├── pubmedfetcher/
│   ├── pubmed_fetcher/
│   │   ├── __init__.py
│   │   ├── main.py
│   │   └── modules.py
│   ├── __init__.py
│   ├── types.py
│   └── tests/
│       ├── __init__.py
│       └── test_fetcher.py
├── LICENSE.md
├── README.md
├── poetry.lock
├── pyproject.toml
└── test_data.xml
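The repository includes a test_data.xml file, presumably a saved PubMed response used by the tests. As a hedged illustration of what parsing such data might involve, the sketch below reads author names and affiliations from a record in the standard PubMed XML layout. The sample record is made up, and `parse_articles` is a hypothetical helper, not a function from this project.

```python
import xml.etree.ElementTree as ET

# Minimal made-up record following the PubMed XML layout; not real data.
SAMPLE_XML = """
<PubmedArticleSet>
  <PubmedArticle>
    <MedlineCitation>
      <PMID>00000000</PMID>
      <Article>
        <ArticleTitle>Example title</ArticleTitle>
        <AuthorList>
          <Author>
            <LastName>Doe</LastName>
            <ForeName>Jane</ForeName>
            <AffiliationInfo>
              <Affiliation>Example Pharma Inc., Boston, MA, USA.</Affiliation>
            </AffiliationInfo>
          </Author>
        </AuthorList>
      </Article>
    </MedlineCitation>
  </PubmedArticle>
</PubmedArticleSet>
"""


def parse_articles(xml_text: str) -> list[dict]:
    """Extract PMID, title, and per-author affiliations from PubMed XML."""
    root = ET.fromstring(xml_text.strip())
    articles = []
    for article in root.iter("PubmedArticle"):
        authors = []
        for author in article.iter("Author"):
            # Either name part may be missing, so skip empty values.
            parts = [author.findtext("ForeName"), author.findtext("LastName")]
            name = " ".join(p for p in parts if p)
            affiliations = [a.text or "" for a in author.iter("Affiliation")]
            authors.append({"name": name, "affiliations": affiliations})
        articles.append({
            "pmid": article.findtext("MedlineCitation/PMID"),
            "title": article.findtext("MedlineCitation/Article/ArticleTitle"),
            "authors": authors,
        })
    return articles


print(parse_articles(SAMPLE_XML))
```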
- Development Tools:
  - Poetry for dependency management
  - GitHub for version control
  - Python's type hints for static typing
- Key Libraries:
  - requests: For API communication
  - pandas: For data manipulation and CSV export
  - logging: For debug information and error tracking
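As a sketch of how logging might be tied to the -d/--debug flag, the example below switches a logger between DEBUG and WARNING levels. The logger name "pubmedfetcher", the `configure_logging` helper, and the message format are assumptions for illustration, not the project's actual setup.

```python
import logging


def configure_logging(debug: bool) -> logging.Logger:
    """Return a logger that emits debug output only when requested."""
    logger = logging.getLogger("pubmedfetcher")  # hypothetical logger name
    logger.setLevel(logging.DEBUG if debug else logging.WARNING)
    # Attach the handler once so repeated calls do not duplicate output.
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
        )
        logger.addHandler(handler)
    return logger


logger = configure_logging(debug=True)
logger.debug("Searching PubMed for %r", "CRISPR")
```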
- Fork the repository
- Create a feature branch
- Commit your changes
- Push to the branch
- Create a Pull Request
This project is licensed under the MIT License; see the LICENSE.md file for details.
ROHIT RAI ([email protected])