This project is an AI-powered image captioning app that allows users to upload images and generates descriptive captions using Salesforce’s BLIP model. It integrates modern web development tools like ReactJS, Flask, and Hugging Face’s AI models to deliver an interactive and responsive experience.
- Image Upload: Upload images directly from your device.
- AI-Generated Captions: Uses BLIP to produce human-like captions.
- Responsive UI: Built with ReactJS for a clean and modern interface.
- ReactJS with Vite for fast development and HMR.
- Axios for API communication.
- Flask for serving the API.
- Hugging Face’s Transformers for AI model inference.
- BLIP (Bootstrapping Language-Image Pre-training) from Salesforce.
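
As a rough illustration of how the Transformers library can run BLIP, the sketch below captions a single image file. It is not taken from this repository: the checkpoint name `Salesforce/blip-image-captioning-base`, the helper names `caption_image` and `clean_caption`, and the `max_new_tokens` value are assumptions, and it needs `transformers`, `torch`, and `Pillow` installed.

```python
# Assumption: the README does not name a checkpoint; this is a public BLIP
# captioning checkpoint on the Hugging Face Hub.
MODEL_ID = "Salesforce/blip-image-captioning-base"


def clean_caption(text: str) -> str:
    """Collapse repeated whitespace and capitalize the first letter."""
    text = " ".join(text.split())
    return text[:1].upper() + text[1:]


def caption_image(path: str, max_new_tokens: int = 30) -> str:
    """Caption one image file with BLIP (hypothetical helper)."""
    # Imported lazily so this module can be imported without the heavy dependencies.
    from PIL import Image
    from transformers import BlipForConditionalGeneration, BlipProcessor

    # Loading on every call keeps the sketch short; a server would load once.
    processor = BlipProcessor.from_pretrained(MODEL_ID)
    model = BlipForConditionalGeneration.from_pretrained(MODEL_ID)

    image = Image.open(path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    caption = processor.decode(output_ids[0], skip_special_tokens=True)
    return clean_caption(caption)
```

Calling `caption_image("photo.jpg")` downloads the model weights on first use, so the first call can take a while.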
Start by cloning the repository from GitHub:
git clone git@github.com:allanninal/image-captioning-app.git
cd image-captioning-app
- Navigate to the backend folder:
  cd backend
- Create and activate a virtual environment:
  python -m venv venv
  source venv/bin/activate  # Linux/Mac
  venv\Scripts\activate  # Windows
- Install the dependencies from requirements.txt:
  pip install -r requirements.txt
- Run the Flask server:
  python app.py
- Navigate to the frontend folder:
  cd frontend
- Install dependencies:
  npm install
- Run the React development server:
  npm run dev
- Upload an Image: Use the app interface to upload an image.
- Generate Captions: Click the "Generate Caption" button to see AI-generated captions for the uploaded image.
- Explore Outputs: View descriptive and context-aware captions directly on the app.
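
The same upload can also be scripted without the browser UI. The sketch below uses only the Python standard library to send an image as `multipart/form-data` and read back a caption. The URL `http://localhost:5000/caption`, the form field name `image`, and the `caption` key in the JSON response are all assumptions; check `app.py` for the actual values.

```python
import json
import mimetypes
import os
import urllib.request
import uuid


def build_multipart(field: str, filename: str, data: bytes) -> tuple[bytes, str]:
    """Encode one file as multipart/form-data; return (body, content_type)."""
    boundary = uuid.uuid4().hex
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {mime}\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def request_caption(image_path: str, url: str = "http://localhost:5000/caption") -> str:
    """POST an image to the backend and return the caption (hypothetical API)."""
    with open(image_path, "rb") as f:
        data = f.read()
    # "image" is an assumed field name, not confirmed by the README.
    body, content_type = build_multipart("image", os.path.basename(image_path), data)
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["caption"]  # assumed response shape
```

With the Flask server running, `print(request_caption("photo.jpg"))` would print the caption, provided the assumed endpoint and field name match the backend.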
- The React frontend allows users to upload images and sends them to the Flask backend.
- Flask processes the image using Hugging Face's BLIP model and returns a caption.
- The frontend displays the returned caption next to the uploaded image.
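
The request flow above could look roughly like the following on the backend. This is a minimal sketch, not the repository's `app.py`: the `/caption` route, the `image` form field, the extension allow-list, and the `create_app` and `allowed_file` names are hypothetical, and the captioning function is passed in so the routing logic stays separate from the model.

```python
from typing import Callable

# Hypothetical allow-list; the real backend may accept other formats.
ALLOWED_EXTENSIONS = {"png", "jpg", "jpeg", "webp"}


def allowed_file(filename: str) -> bool:
    """Return True if the filename has an extension from the allow-list."""
    return "." in filename and filename.rsplit(".", 1)[1].lower() in ALLOWED_EXTENSIONS


def create_app(captioner: Callable[[bytes], str]):
    """Build a Flask app whose /caption route delegates to `captioner`."""
    # Imported lazily so the helper above is usable without Flask installed.
    from flask import Flask, jsonify, request

    app = Flask(__name__)

    @app.post("/caption")  # hypothetical route name
    def caption():
        upload = request.files.get("image")  # hypothetical form field name
        if upload is None or not allowed_file(upload.filename or ""):
            return jsonify(error="expected an image file in the 'image' field"), 400
        # The captioner receives the raw image bytes and returns a string.
        return jsonify(caption=captioner(upload.read()))

    return app
```

For a quick check without loading BLIP, `create_app(lambda data: "a placeholder caption").run(port=5000)` serves the route with a stub captioner. Because the Vite dev server runs on a different origin than Flask, the real backend also needs CORS headers or a dev-server proxy for the browser to accept the response.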
- Add drag-and-drop functionality for image uploads.
- Generate multiple captions for a single image.
- Enhance styling with Tailwind CSS or Material-UI.
- Deploy the app to platforms like AWS, Heroku, or Vercel.
This project is licensed under the MIT License. See the LICENSE file for details.
If you find this project helpful, consider supporting me on Ko-fi:
ko-fi.com/allanninal