PDF Text Finder is a TypeScript-based tool that reads a PDF document and searches for a specific string (e.g., "281"). The script uses the pdf-parse
library to extract text from PDF files and then performs a search for the specified text.
- Extracts text from PDF files.
- Searches for occurrences of a specified string (e.g., "281") in the document.
- Outputs the number of matches found.
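The counting step can be illustrated with a minimal sketch. It assumes a plain, case-sensitive substring match; `countOccurrences` is a hypothetical helper name used here for illustration and is not taken from pdf.ts, whose actual matching logic may differ.

```typescript
// Count non-overlapping occurrences of a literal string in a block of text.
// `countOccurrences` is a hypothetical helper, not a function from pdf.ts.
function countOccurrences(text: string, searchText: string): number {
  // Guard: an empty search string would never advance the index below.
  if (searchText.length === 0) return 0;

  let count = 0;
  let index = text.indexOf(searchText);
  while (index !== -1) {
    count += 1;
    // Resume the search just past the match that was found.
    index = text.indexOf(searchText, index + searchText.length);
  }
  return count;
}

// Made-up sample text standing in for text extracted from a PDF.
const sample = "Invoice 281 was paid. See also 1281 and 281-B.";
console.log(`Found ${countOccurrences(sample, "281")} occurrences of "281"`);
```

Because this is a substring match, "1281" also counts as a hit for "281". Matching whole words only would need a different approach, such as a regular expression with word boundaries.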
- TypeScript
- Node.js
- pdf-parse (for extracting text from PDF files)
Make sure you have Node.js and npm (or yarn) installed on your machine.
- Clone this repository:
    git clone [email protected]:gunash-portfolio/pdf-text-finder.git
    cd pdf-text-finder
- Install the dependencies:
    npm install
- Place the PDF file you want to scan in the project directory.
- Update the file path in pdf.ts with the path to your PDF file:
    const pdfPath = './path-to-your-file.pdf'; // Replace with your PDF file path
    const searchText = '281'; // You can change this to search for any text
- Run the script:
    npx ts-node pdf.ts
- The script prints the number of times the search string appears in the PDF, for example:
    Found 3 occurrences of "searchtext"
    Total occurrences of "searchtext": <number of matches>
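The extract-then-search flow described above can be sketched as follows. This is an illustration under stated assumptions, not the contents of pdf.ts. The names `ExtractedPdf`, `TextExtractor`, `countInPdf`, and `fakeExtract` are hypothetical. The extractor is passed in as a parameter so the sketch runs without a real PDF or any installed dependency.

```typescript
// Minimal shape this sketch relies on: an object carrying the extracted text.
interface ExtractedPdf {
  text: string;
}

// Any function that turns raw PDF bytes into extracted text (hypothetical type).
type TextExtractor = (data: Uint8Array) => Promise<ExtractedPdf>;

// Extract the text, then count literal, case-sensitive matches of searchText.
async function countInPdf(
  data: Uint8Array,
  searchText: string,
  extract: TextExtractor
): Promise<number> {
  if (searchText.length === 0) return 0; // nothing to search for
  const { text } = await extract(data);
  // Splitting on the search string yields one more piece than there are matches.
  return text.split(searchText).length - 1;
}

// Stand-in extractor with made-up text, so the sketch runs without a PDF file.
const fakeExtract: TextExtractor = async () => ({
  text: "Room 281, ext. 281, ref 1281",
});

countInPdf(new Uint8Array(), "281", fakeExtract).then((total) => {
  console.log(`Total occurrences of "281": ${total}`);
});
```

In a real run, the bytes would come from reading the file at `pdfPath`, and the extractor would be the PDF library's parse function. The classic 1.x API of pdf-parse exports a function that accepts a Buffer and resolves to an object with a `text` property, which fits the `TextExtractor` shape assumed here. Newer major versions of the library may expose a different interface, so check the installed version.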
- pdf-parse: A library for extracting text from PDF files.
- TypeScript: A superset of JavaScript that compiles to plain JavaScript, providing type safety and modern features.
- Node.js: A JavaScript runtime built on Chrome's V8 JavaScript engine.