# AI Alignment Project: Workflow and Runnable Example

## 1. Define the Workflow
The project guides users through the following stages:
1. **Preprocessing**: Preparing and augmenting the dataset.
2. **Training**: Fine-tuning or RLHF-based training of the LLM.
3. **Evaluation**: Assessing model alignment through explainability, bias analysis, and safety tests.
4. **Deployment**: Running an API to interact with the trained model.
5. **Feedback Loop**: Incorporating user feedback for iterative improvement.

---

## 2. Workflow Integration of Files

### (a) Data Preprocessing
- **Files**:
- `src/preprocessing/preprocess_data.py`
- `src/preprocessing/augmentation.py`
- `src/preprocessing/tokenization.py`
- **Workflow**:
1. Start with raw or synthetic data (`data/raw/synthetic_data.csv`).
2. Use `preprocess_data.py` to clean and tokenize data.
3. Augment data with `augmentation.py` to simulate diverse scenarios.
4. **Output**: A cleaned and tokenized dataset ready for training.
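
To make the step concrete, here is a minimal sketch of what the preprocessing could look like. The `text` column name and the cleaning rules are illustrative assumptions, not the repository's actual implementation:

```python
# Hypothetical preprocessing sketch; the "text" column and the
# cleaning rules are assumptions for illustration.
import argparse
import os

import pandas as pd


def preprocess(input_path: str, output_dir: str) -> None:
    df = pd.read_csv(input_path)
    df = df.dropna(subset=["text"])  # drop rows with no text
    df["text"] = (
        df["text"].str.strip().str.replace(r"\s+", " ", regex=True)
    )  # normalize whitespace
    os.makedirs(output_dir, exist_ok=True)
    df.to_csv(os.path.join(output_dir, "cleaned.csv"), index=False)


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--input", required=True)
    parser.add_argument("--output", required=True)
    args = parser.parse_args()
    preprocess(args.input, args.output)
```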

### (b) Model Training
- **Files**:
- `src/training/fine_tuning.py`
- `src/training/rlhf.py`
  - `notebooks/02_fine_tuning.ipynb` and `notebooks/03_rlhf.ipynb`
- **Workflow**:
1. Load the preprocessed data.
2. Fine-tune a pretrained LLM with `fine_tuning.py`.
3. Optionally, enhance alignment with human feedback via `rlhf.py`.
4. Log training results using `mlflow_tracking.py`.
5. **Output**: A fine-tuned LLM stored as a model artifact.
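
As a rough sketch of the fine-tuning step, the following shows a supervised causal-LM fine-tune with Hugging Face Transformers. The base model (`gpt2`), the `cleaned.csv` file name, and the hyperparameters are placeholder assumptions, not the project's actual configuration:

```python
# Hedged fine-tuning sketch; model name, data file, and
# hyperparameters are illustrative assumptions.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "gpt2"  # placeholder base model; swap in the project's LLM
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(model_name)

# "cleaned.csv" is an assumed output name from the preprocessing step.
dataset = load_dataset("csv", data_files={"train": "data/processed/cleaned.csv"})["train"]

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="models/fine_tuned",
        num_train_epochs=3,
        per_device_train_batch_size=4,
        learning_rate=5e-5,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
trainer.save_model("models/fine_tuned")
```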

### (c) Evaluation
- **Files**:
- `src/evaluation/metrics.py`
- `src/evaluation/safety_tests.py`
- `src/evaluation/bias_analysis.py`
- `notebooks/04_evaluation.ipynb`
- **Workflow**:
1. Evaluate the model's responses for alignment using:
- Safety metrics (`safety_tests.py`).
- Explainability tools (`metrics.py`).
- Bias analysis (`bias_analysis.py`).
2. Display performance metrics and insights via `explainability_dashboard.py`.
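
For intuition, here is a toy version of a safety metric. The real `safety_tests.py` presumably uses proper classifiers, so treat the keyword screen and example patterns below as stand-ins:

```python
# Toy safety-check sketch; the patterns are assumed examples, and a
# real implementation would use trained classifiers instead.
UNSAFE_PATTERNS = ["how to build a weapon", "personal data of"]  # assumed

def safety_score(responses: list[str]) -> float:
    """Fraction of responses that pass a simple keyword screen."""
    flagged = sum(
        any(p in r.lower() for p in UNSAFE_PATTERNS) for r in responses
    )
    return 1.0 - flagged / max(len(responses), 1)

if __name__ == "__main__":
    sample = ["Here is a safe summary.", "I cannot help with that request."]
    print(f"Safety pass rate: {safety_score(sample):.2f}")
```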

### (d) Deployment
- **Files**:
- `src/deployment/fastapi_app.py`
- `src/deployment/endpoints/predict.py`, `feedback.py`
- Docker/Kubernetes configs (`deployment/docker-compose.yml`, `deployment/kubernetes`)
- **Workflow**:
1. Start the FastAPI app to serve the trained model (`fastapi_app.py`).
2. Use endpoints:
- `/predict`: For inference.
- `/feedback`: To capture user feedback.
3. Deploy in a containerized environment using Docker or Kubernetes.
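
A minimal sketch of the serving layer follows, assuming a JSON request/response shape. The field names are illustrative and the model call is stubbed:

```python
# Minimal FastAPI sketch of the two endpoints; request/response
# fields are assumptions, and the model call is stubbed.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class PredictRequest(BaseModel):
    prompt: str

class FeedbackRequest(BaseModel):
    prompt: str
    response: str
    rating: int  # e.g., 1-5

@app.post("/predict")
def predict(req: PredictRequest):
    # In the real app this would call the fine-tuned model.
    return {"response": f"(stubbed completion for: {req.prompt})"}

@app.post("/feedback")
def feedback(req: FeedbackRequest):
    # In the real app this would persist feedback for later retraining.
    return {"status": "recorded", "rating": req.rating}

# Run locally with: uvicorn fastapi_app:app --reload
```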

### (e) Feedback Loop
- **Files**:
- `app/feedback.py`
- `src/reinforcement/multi_objective_rl.py`
- **Workflow**:
1. Capture real-world feedback via the `/feedback` endpoint or the web UI (`app/templates/feedback.html`).
2. Retrain the model using `multi_objective_rl.py` to incorporate feedback.
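
As a hedged sketch of what multi-objective retraining balances, the snippet below scalarizes several alignment objectives into a single reward. The objective names and weights are assumptions, not the actual `multi_objective_rl.py` interface:

```python
# Hypothetical multi-objective reward; the objectives and weights are
# assumptions about what multi_objective_rl.py balances.
def combined_reward(helpfulness: float, safety: float, fairness: float,
                    weights: tuple[float, float, float] = (0.5, 0.3, 0.2)) -> float:
    """Weighted scalarization of per-objective scores in [0, 1]."""
    w_h, w_s, w_f = weights
    return w_h * helpfulness + w_s * safety + w_f * fairness

# User feedback (e.g., a 1-5 rating from /feedback) could be normalized
# into the helpfulness term before retraining.
print(combined_reward(helpfulness=0.8, safety=1.0, fairness=0.9))
```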

---

## 3. Runnable Example: A Hands-On AI Alignment Experiment

### Step 1: Data Preparation
Run the preprocessing script:
```bash
python src/preprocessing/preprocess_data.py --input data/raw/synthetic_data.csv --output data/processed
```

### Step 2: Fine-Tuning
Train the model using the preprocessed data:
```bash
python src/training/fine_tuning.py --data_dir data/processed --output_dir models/fine_tuned
```

#### Explanation:
- **Input**:
  - The script loads data from the `data/processed` directory, which contains the cleaned and tokenized datasets produced in Step 1.

- **Model Fine-Tuning**:
- The fine-tuning script applies supervised learning to adjust the weights of a pretrained large language model (LLM).
- Hyperparameters such as learning rate, batch size, and number of epochs can be customized in the script or via configuration files.
- The fine-tuning process adapts the model to perform alignment-specific tasks (e.g., producing safe, unbiased, and interpretable outputs).

- **Output**:
- A fine-tuned model is saved in the `models/fine_tuned` directory. This model is now better aligned with the desired objectives and can be evaluated for safety, bias, and interpretability.

- **Integration with Experiment Tracking**:
  - If `mlflow_tracking.py` or a similar tracking tool is used, fine-tuning results (e.g., loss curves, evaluation metrics, and hyperparameters) are logged for reproducibility; a minimal logging sketch follows this list.
- Users can compare different runs, evaluate the impact of hyperparameter changes, and select the best-performing model.

- **Key Learnings**:
- Fine-tuning allows a general-purpose LLM to be adapted for specific tasks, making it more relevant for real-world alignment challenges.
- Regular evaluation during training ensures that the model maintains alignment with predefined objectives (e.g., minimizing bias or toxicity).
- Users gain practical experience with data preparation, model training, and the iterative nature of fine-tuning.

- **Next Steps**:
1. Evaluate the fine-tuned model using metrics, safety tests, and bias analysis (Step 3: Evaluate Alignment).
2. Deploy the fine-tuned model as an API or in an interactive application (Step 4: Start the API).
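
As referenced above, here is a minimal MLflow logging sketch. The experiment name, parameter values, dummy losses, and artifact path are illustrative assumptions:

```python
# Hedged experiment-tracking sketch; names, values, and paths are
# illustrative, not the repository's actual configuration.
import mlflow

mlflow.set_experiment("fine_tuning")

with mlflow.start_run():
    mlflow.log_params({"learning_rate": 5e-5, "batch_size": 4, "epochs": 3})
    for epoch, loss in enumerate([2.1, 1.6, 1.3], start=1):  # dummy losses
        mlflow.log_metric("train_loss", loss, step=epoch)
    mlflow.log_artifact("models/fine_tuned/config.json")  # example artifact
```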

#### Common Challenges and Solutions:
1. **Overfitting**:
   - Problem: The model may overfit to the fine-tuning dataset, losing its ability to generalize.
   - Solution:
     - Use regularization techniques such as dropout.
     - Implement early stopping during training (a minimal sketch follows this list).
     - Monitor validation loss and increase the size and diversity of the fine-tuning dataset.

2. **Insufficient Alignment**:
- Problem: The fine-tuned model may still produce misaligned or biased outputs.
- Solution:
     - Incorporate Reinforcement Learning from Human Feedback (RLHF) for further alignment.
- Use safety tests and bias analysis to identify problematic outputs and retrain iteratively.

3. **Hyperparameter Tuning**:
- Problem: Suboptimal hyperparameter settings may lead to poor performance or inefficiency.
- Solution:
- Use a hyperparameter tuning framework like Optuna or implement grid/random search.
- Explore automated scripts for hyperparameter optimization (`ppo_hyperparameter_tuning.py`).

4. **Scalability Issues**:
- Problem: Fine-tuning large LLMs may require significant computational resources.
- Solution:
- Use distributed training methods (`distributed_rl.py`).
- Leverage cloud-based GPUs or TPUs for faster training.
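
As referenced under **Overfitting**, a framework-agnostic early-stopping helper might look like this. The patience value and the dummy validation losses are assumptions for illustration:

```python
# Generic early-stopping sketch; patience and the dummy losses are
# assumptions for illustration.
class EarlyStopping:
    def __init__(self, patience: int = 3, min_delta: float = 0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss: float) -> bool:
        """Return True when training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

stopper = EarlyStopping(patience=2)
for epoch, val_loss in enumerate([1.9, 1.5, 1.6, 1.7], start=1):  # dummy losses
    if stopper.step(val_loss):
        print(f"Stopping early at epoch {epoch}")
        break
```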

#### Practical Tips:
- Ensure that the dataset used for fine-tuning aligns with the project's ethical and performance goals.
- Regularly save checkpoints during training so you can recover from failures and resume interrupted runs (see the sketch below).
- Log all experiments systematically for reproducibility and knowledge sharing among team members.
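
A small PyTorch-style checkpointing sketch, assuming the checkpoint stores model and optimizer state plus the epoch counter (the path and dict contents are illustrative):

```python
# Hedged checkpoint save/resume sketch; path and checkpoint contents
# are illustrative assumptions.
import os
import torch

def save_checkpoint(model, optimizer, epoch, path="checkpoints/last.pt"):
    os.makedirs(os.path.dirname(path), exist_ok=True)
    torch.save({"model": model.state_dict(),
                "optimizer": optimizer.state_dict(),
                "epoch": epoch}, path)

def load_checkpoint(model, optimizer, path="checkpoints/last.pt"):
    ckpt = torch.load(path)
    model.load_state_dict(ckpt["model"])
    optimizer.load_state_dict(ckpt["optimizer"])
    return ckpt["epoch"] + 1  # epoch to resume from
```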

#### Real-World Applications:
- This step can adapt the LLM for tasks such as:
- Generating safe conversational responses in chatbots.
- Mitigating bias in summarization or text generation.
- Enhancing explainability for AI models in sensitive domains like healthcare or law.

After completing this step, you have a fine-tuned model that serves as the foundation for the evaluation and deployment stages of your AI alignment project.

