LipidomicsAnalystR é um pipeline abrangente baseado em R para a análise de dados de lipidômica, projetado para facilitar o estudo de perfis lipídicos e suas associações com desfechos clínicos. A ferramenta permite o processamento eficiente, normalização e análise estatística de grandes conjuntos de dados de lipidômica de fontes biológicas diversas, como plasma e soro.
- Normalização de Dados: Processos automatizados de normalização de dados lipidômicos.
- WGCNA: Realiza análise de rede de co-expressão de lipids ponderada (WGCNA) para identificar clusters de lipídios altamente correlacionados.
- PCA: Implementa análise de componentes principais (PCA) para identificar padrões e reduzir a dimensionalidade.
- Heatmaps: Gera heatmaps personalizáveis com base em classes de lipídios e variáveis clínicas.
- Volcano Plots: Fornece visualização de mudanças significativas nos lipídios com base em modelos de Regressao Linear Ajustados.
- Análise por Classes: Permite análises estratificadas por classes de lipídios.
- Compatibilidade: Compatível com grandes coortes de dados e facilmente integrável com variáveis clínicas para descoberta avançada de biomarcadores.
This repository contains scripts and data used for lipidomic analysis and visualization, focusing on the development and application of machine learning models for lipid-related datasets. The project is structured to facilitate easy navigation through different datasets and analytical scripts.
- data/: Contains datasets used for lipidomic analysis.
- Lipidomics_dataset_cleaned.csv: Cleaned lipidomics dataset.
- Lipidomics_raw_data.csv: Raw lipidomics data.
- Plasma_lipid_data.xlsx: Plasma lipid data used for specific analyses.
- metadata/: Contains metadata files related to the lipidomic datasets.
- scripts/: Contains R scripts for data processing and analysis.
- 00_data_cleaning.R: Script for cleaning lipidomic data.
- 01_data_analysis.R: Main script for data analysis, including statistical methods.
- 02_data_visualization.R: Script for generating visualizations.
- machine_learning/: Contains scripts for machine learning models applied to lipidomic data.
- ml_model_training.R: Script for training machine learning models.
- ml_model_evaluation.R: Script for evaluating model performance.
- results/: Contains the results generated from the analyses.
- Lipidomics_summary_statistics.csv: Summary statistics for lipidomics data.
- Machine_learning_results/: Contains the outputs from machine learning models.
- figures/: Contains figures generated during the analysis.
- Boxplots/: Contains boxplots for lipid distribution.
- Heatmaps/: Contains heatmaps for correlation matrices.
- PCA/: Principal Component Analysis plots.
- tests/: Contains test scripts and data for verifying analysis methods.
The analyses include statistical and machine learning approaches to investigate lipidomic patterns. Key steps include:
- Data Cleaning: Removing missing values and normalizing lipid concentrations.
- Data Visualization: Creating PCA, heatmaps, and boxplots for lipid distribution.
- Machine Learning: Training models to predict clinical outcomes based on lipidomic profiles.
We calculate various statistical measures, such as:
- Mean and Standard Deviation: Descriptive statistics for each lipid species.
- Correlation Analysis: Pearson and Spearman correlation analyses to explore relationships between lipid species.
- PCA: Dimension reduction techniques to identify key lipidomic signatures.
- Clone the repository:
git clone https://github.com/wdartora/LipidomicAnalystR.git
- Navigate to the project directory:
cd LipidomicAnalystR
- Open the R scripts in the
scripts/
folder to explore and run analyses.
Contributions are welcome! If you would like to contribute, please open issues or submit pull requests. Let's advance lipidomics together!
This project is licensed under the MIT License. See the LICENSE file for more details.
Developed by William Jones Dartora