Commit

Merge pull request #11 from kuanweih/neurips_arxiv
NeurIPS and arXiv
Joshua Yao-Yu Lin authored Nov 6, 2023
2 parents 72f5d78 + 86d3222 commit b7f7581
Showing 10 changed files with 3,395 additions and 20 deletions.
55 changes: 51 additions & 4 deletions README.md
@@ -1,11 +1,14 @@
# LenSiam
SimSiam with domain-specific augmentation for lensing task
**LenSiam** combines the SimSiam self-supervised learning architecture with a novel domain-specific augmentation for strong gravitational lens images.


Create conda env: ```. conda_env_setup.sh```
To create a conda env, run:
```
. create_conda_env.sh
```


To train Simsiam models, run
To train LenSiam models, run
```
python main.py --data_dir path/to/your/dataset -c configs/train.yaml
```
@@ -16,4 +19,48 @@ To calculate UMAP embeddings, run
```
python calc_umap.py -c configs/umap.yaml
```

Note: One of the datasets that are used for UMAP calculation is the HST real images. The dataset can be created by running the code in the repo of [kuanweih/lensed_quasar_database_scraper](https://github.com/kuanweih/lensed_quasar_database_scraper).
The [notebook](https://github.com/kuanweih/LenSiam/blob/main/notebooks/NeurIPs_umap_plots.ipynb) contains the code we used to make the figures in our NeurIPS 2023 workshop papers.


Note:
* One of the datasets used for the UMAP calculation consists of real HST images. It can be created by running the code in the [kuanweih/lensed_quasar_database_scraper](https://github.com/kuanweih/lensed_quasar_database_scraper) repository.

* We thank [Patrick Hua](https://github.com/PatrickHua) for their open-source code. The SimSiam code used in this work was adapted from their [SimSiam](https://github.com/PatrickHua/SimSiam) repository.


<br>

# Key Takeaways
**Figure 1**\
<img src="plots/LenSiam.png" width="750">

**(a)** The LenSiam architecture for this work.\
*We generate positive pairs of lens images through a domain-specific augmentation approach to learn the representation of strong gravitational lens images.*

**(b)** Example of two different source galaxies (top) and their lens images with the identical foreground lens model (bottom). The bottom images represent a positive lens image pair for our LenSiam models.\
*Our lens augmentation takes into account the domain knowledge of gravitational lensing. This allows LenSiam to learn consistent representations of foreground lens properties.*

**(c)** Example of applying the default augmentation to a lens image. The bottom augmented images represent a positive lens image pair for our baseline SimSiam models.\
*Commonly used random augmentations are problematic here because they can easily change the lens properties. For example, enlarging a lens image directly changes the Einstein radius.*
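
The effect of enlarging an image can be illustrated with a toy sketch. This is not code from this repository: the Gaussian ring standing in for a lens image, the nearest-neighbour crop-and-resize, and the flux-weighted radius used as a stand-in for the Einstein radius are all illustrative assumptions. A real $\theta_{\rm E}$ would come from lens modeling.

```python
import math


def ring_image(size, radius, width=1.5):
    """Toy 'lens image': a Gaussian ring of the given radius (pixels), centred in a size x size grid."""
    c = (size - 1) / 2.0
    return [
        [math.exp(-0.5 * ((math.hypot(x - c, y - c) - radius) / width) ** 2)
         for x in range(size)]
        for y in range(size)
    ]


def center_crop_resize(img, crop):
    """Crop the central crop x crop region and resize it back to the original size (nearest neighbour)."""
    size = len(img)
    start = (size - crop) // 2
    scale = crop / size

    def src(i):
        # Map an output pixel index to the source pixel index inside the crop.
        return start + min(crop - 1, int((i + 0.5) * scale))

    return [[img[src(y)][src(x)] for x in range(size)] for y in range(size)]


def measured_ring_radius(img):
    """Flux-weighted mean distance from the image centre, a crude proxy for the ring radius."""
    size = len(img)
    c = (size - 1) / 2.0
    total = sum(v for row in img for v in row)
    weighted = sum(
        v * math.hypot(x - c, y - c)
        for y, row in enumerate(img)
        for x, v in enumerate(row)
    )
    return weighted / total


original = ring_image(size=64, radius=10.0)
zoomed = center_crop_resize(original, crop=32)  # a 2x zoom, as a random-resized-crop might apply

r_before = measured_ring_radius(original)
r_after = measured_ring_radius(zoomed)
print(f"ring radius before zoom: {r_before:.1f} px, after 2x zoom: {r_after:.1f} px")
```

In this toy setup the measured ring radius roughly doubles under a 2x zoom, so the two augmented views would no longer share the same lens property, which is the problem the lens-aware augmentation is designed to avoid.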

<br>

**Figure 2**\
<img src="plots/umap_color_params.png" width="750">

The UMAPs are color-coded by the Einstein radius $\theta_{\rm E}$, the ellipticity $e_1$, and the radial power-law slope $\gamma$ in the left, middle, and right columns, respectively. The top row shows the UMAPs for LenSiam and the bottom row shows those for the baseline SimSiam.\
*The nonuniform distributions on the LenSiam UMAPs indicate that its backbone ResNet101 trained by the LenSiam SSL process does learn some key parameters such as* $\theta_{\rm E}$, $e_1$, *and* $\gamma$, *even though it has **NEVER** seen the true parameters during the entire training process.*
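
For context, the reason no true parameters are needed can be read off the standard SimSiam objective (Chen & He 2021). The form below assumes that LenSiam keeps this objective unchanged and only alters how positive pairs are built; the symbols follow the SimSiam paper rather than this repository. For two views of a positive pair, with projector outputs $z_1, z_2$ and predictor outputs $p_1, p_2$:

$$
\mathcal{L} = \frac{1}{2} D\big(p_1, \mathrm{stopgrad}(z_2)\big) + \frac{1}{2} D\big(p_2, \mathrm{stopgrad}(z_1)\big),
\qquad
D(p, z) = -\frac{p}{\lVert p \rVert_2} \cdot \frac{z}{\lVert z \rVert_2}
$$

The loss depends only on the two views of each pair, so labels such as $\theta_{\rm E}$, $e_1$, or $\gamma$ never enter training.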


<br>
<br>

**Downstream task:**
| Pretrained ResNet101 Models | Framework | $R^2$ (Einstein radius) |
|:----------:|:----------:|:----------:|
| ImageNet-1k (baseline) | Supervised | 0.360 |
| SimSiam (baseline) | Unsupervised | 0.426 |
| ***LenSiam (this work)*** | Unsupervised | ***0.586*** |

As an exploration, we test both the LenSiam and SimSiam learned representations on a downstream regression task as a proof of concept. We fine-tune each model to estimate the Einstein radius $\theta_{\rm E}$ on the [Lens challenge dataset](http://metcalf1.difa.unibo.it/blf-portal/gg_challenge.html), which simulates Euclid-like observations of strong lenses. To mimic the scarcity of real strong lensing data, we select a sub-sample of 1,000 images as the training set and 1,000 images as the test set. With LenSiam pre-trained models, we reach $R^2 = 0.586$ for the Einstein radius, compared with $0.426$ for the baseline SimSiam models and $0.360$ for the supervised-only models (ResNet101 pre-trained on ImageNet-1k). \
*We find that LenSiam pretraining does help the downstream regression task.*
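
For readers unfamiliar with the metric, the sketch below computes $R^2$ using the standard coefficient-of-determination definition, $1 - SS_{\rm res}/SS_{\rm tot}$. Whether the notebooks use exactly this definition is an assumption, and the `theta_true` and `theta_pred` values are made up for illustration; they are not results from this work.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - (residual sum of squares) / (total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical Einstein radii (arcsec) and model predictions, for illustration only.
theta_true = [0.8, 1.2, 1.5, 2.0, 2.4]
theta_pred = [0.9, 1.1, 1.6, 1.8, 2.5]

print(f"R^2 = {r2_score(theta_true, theta_pred):.3f}")
```

A perfect predictor gives $R^2 = 1$, while always predicting the mean of the true values gives $R^2 = 0$, so the values in the table above lie between those two reference points.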
11 changes: 0 additions & 11 deletions conda_env_setup.sh

This file was deleted.

11 changes: 11 additions & 0 deletions create_conda_env.sh
@@ -0,0 +1,11 @@
#!/bin/bash

# Create a new Conda environment
conda create -y --name lensiam python=3.8

# Activate the Conda environment
conda activate lensiam

# Install required packages via pip
pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --extra-index-url https://download.pytorch.org/whl/cu116
pip install -r requirements.txt
1,108 changes: 1,108 additions & 0 deletions notebooks/Visualizing_results_Einstein_radius_LenSiam.ipynb

Large diffs are not rendered by default.

1,108 changes: 1,108 additions & 0 deletions notebooks/Visualizing_results_Einstein_radius_SimSiam.ipynb

Large diffs are not rendered by default.

1,112 changes: 1,112 additions & 0 deletions notebooks/Visualizing_results_Einstein_radius_baseline.ipynb

Large diffs are not rendered by default.

File renamed without changes.
Binary file added plots/LenSiam.png
Binary file added plots/umap_color_params.png
10 changes: 5 additions & 5 deletions requirements.txt
@@ -1,6 +1,6 @@
tqdm
pyyaml
matplotlib
astropy
tqdm==4.66.1
pyyaml==6.0.1
matplotlib==3.7.3
astropy==5.2.2
transformers==4.18.0
umap-learn
umap-learn==0.5.4
