img2text

Building an image-to-text system.

Image to Text Application

This is a Streamlit web application that takes an image and an optional text prompt, and uses Google's Generative AI model (gemini-1.5-flash) to generate content from them. The app can work from the image alone or combine the image with the text prompt to produce a response.
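As a rough illustration of the "image alone, or image plus prompt" behaviour, the request could be assembled along these lines. This is a minimal sketch, not the code in app.py: the function names `build_request_parts` and `generate_response` are hypothetical, and the sketch assumes the `google-generativeai` package's `configure`, `GenerativeModel`, and `generate_content` calls.

```python
def build_request_parts(image, prompt=""):
    """Return the list of parts to send to the model.

    The image is always included; the prompt is added only when it is
    non-empty after stripping whitespace.
    """
    prompt = (prompt or "").strip()
    return [prompt, image] if prompt else [image]


def generate_response(image, prompt, api_key, model_name="gemini-1.5-flash"):
    """Hypothetical helper: send the image (and optional prompt) to the model."""
    # Imported lazily so build_request_parts works without the SDK installed.
    import google.generativeai as genai

    genai.configure(api_key=api_key)
    model = genai.GenerativeModel(model_name)
    response = model.generate_content(build_request_parts(image, prompt))
    return response.text
```

Here `image` is assumed to be a Pillow image opened from the uploaded file. Keeping the part-building step separate makes the "prompt is optional" rule easy to check without calling the API.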

Features

Upload an image (jpg, jpeg, png formats)
Option to provide a text prompt
Generate content using Google's Generative AI (gemini-1.5-flash)
Display the uploaded image and the generated response on the page

Requirements

To run the app, you need the following installed:

Python 3.x
Streamlit
Pillow (the maintained fork of the Python Imaging Library)
python-dotenv for managing environment variables
google-generativeai library for the Generative AI model
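To confirm that the packages listed above are present in the active environment, a small check like the following may help. It is a sketch that assumes the PyPI distribution names `streamlit`, `Pillow`, `python-dotenv`, and `google-generativeai`; the helper name `missing_packages` is made up for illustration.

```python
from importlib import metadata

# Distribution names as published on PyPI (assumed, not taken from the repo).
REQUIRED = ["streamlit", "Pillow", "python-dotenv", "google-generativeai"]


def missing_packages(names=REQUIRED):
    """Return the distributions from `names` that are not installed."""
    missing = []
    for name in names:
        try:
            metadata.version(name)  # raises if the distribution is absent
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    absent = missing_packages()
    print("Missing:", ", ".join(absent) if absent else "none")
```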

Installation

Clone the repository:

git clone https://github.com/your-username/image-to-text-app.git
cd image-to-text-app

Create a virtual environment (optional but recommended):

python -m venv venv
source venv/bin/activate  # On Windows use `venv\Scripts\activate`

Install dependencies:
```
pip install -r Requirements.txt
```
Set up environment variables:
- Create a .env file in the root directory.
- Add your Google API key to the .env file:
```
GOOGLE_API_KEY=your-google-api-key-here
```
Run the Streamlit app:
```
streamlit run app.py
```

Usage

Launch the app and upload an image file (jpg, jpeg, or png).
Optionally, provide a text prompt to give additional context.
Click the Submit button to generate content based on the image and the prompt.
The AI-generated response will be displayed on the page.
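The steps above map onto a handful of Streamlit widgets. The following is a sketch of that flow under stated assumptions, not the actual app.py: `render_page` and the `generate` callback are hypothetical names, and the Streamlit module is passed in as `st` so the logic can be exercised without a running server.

```python
ALLOWED_TYPES = ["jpg", "jpeg", "png"]


def render_page(st, generate):
    """Draw the page with the Streamlit module `st`.

    `generate(uploaded_file, prompt)` is a caller-supplied function that
    returns the model's text response.
    """
    st.header("Image to Text Application")
    prompt = st.text_input("Prompt (optional)")
    uploaded = st.file_uploader("Choose an image", type=ALLOWED_TYPES)

    # Show the image as soon as it is uploaded.
    if uploaded is not None:
        st.image(uploaded, caption="Uploaded image")

    # Only call the model when the user clicks Submit.
    if st.button("Submit"):
        if uploaded is None:
            st.warning("Please upload an image first.")
        else:
            st.subheader("Response")
            st.write(generate(uploaded, prompt))
```

In a real script this would be called as `render_page(streamlit, generate=...)` after `import streamlit`, with `generate` wrapping the call to the Gemini model.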

Dependencies

Streamlit: For creating the web interface
Pillow (PIL): For handling image upload and display
python-dotenv: For managing environment variables like the Google API key
google-generativeai: For accessing Google's Generative AI model

API Configuration

Ensure that your Google API key has the correct permissions to access the generative model. You will need to configure the environment with the API key using a .env file as described above.
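Reading the key from the .env file could look roughly like this. It is a sketch assuming python-dotenv's `load_dotenv`; the helper name `read_api_key` is hypothetical, and the variable name passed to it must match the name used in your .env file and in app.py.

```python
import os


def read_api_key(var_name):
    """Load .env (if python-dotenv is available) and return the key.

    Raises RuntimeError when the variable is unset or empty, so a missing
    key fails early instead of at the first API call.
    """
    try:
        from dotenv import load_dotenv
        load_dotenv()  # reads key=value pairs from a .env file, if one is found
    except ImportError:
        pass  # fall back to variables already set in the environment

    key = os.getenv(var_name)
    if not key:
        raise RuntimeError(f"Environment variable {var_name!r} is not set")
    return key
```

The returned value can then be passed to `google.generativeai.configure(api_key=...)` before creating the model.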

License

This project is licensed under the MIT License. Feel free to modify and use the code as per your needs.

ShayanHussain1996/img2text