2022/08/11 Meeting notes #8

Open
Victoria-Samboco opened this issue Aug 11, 2022 · 25 comments

Comments

@Victoria-Samboco
Contributor

Victoria-Samboco commented Aug 11, 2022

Hi everyone
In today's meeting we discussed the next steps, now that we have succeeded in imaging the Sun.
Next steps:

For the UHF-band data, proceed with:

  1. 1GC
  2. 2GC
  3. Apply the Sun imaging step:
    3.1. Split_ms
    3.2. RA/Dec of the Sun
    3.3. Apply the chgcentre command
    3.4. Run wsclean with the -subtract-model parameter
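
A minimal sketch of steps 3.2 to 3.4 as command lines, assuming the Sun's RA/Dec (in degrees) is already known for the scan. The MS name, the coordinates and the helper names `deg_to_hms`, `deg_to_dms` and `sun_imaging_commands` are hypothetical; only the `chgcentre <ms> <ra> <dec>` call form and wsclean's `-subtract-model` and `-name` options are taken as given. The commands are only printed, not run.

```python
import shlex


def deg_to_hms(ra_deg):
    """Format a right ascension in degrees as e.g. '10h30m00.00s'."""
    total_cs = round((ra_deg % 360.0) / 15.0 * 3600.0 * 100)  # centiseconds of time
    total_cs %= 24 * 3600 * 100
    h, rem = divmod(total_cs, 3600 * 100)
    m, cs = divmod(rem, 60 * 100)
    return f"{h:02d}h{m:02d}m{cs / 100:05.2f}s"


def deg_to_dms(dec_deg):
    """Format a declination in degrees as e.g. '-07d30m00.00s'."""
    sign = "-" if dec_deg < 0 else ""
    total_cs = round(abs(dec_deg) * 3600.0 * 100)  # centi-arcseconds
    d, rem = divmod(total_cs, 3600 * 100)
    m, cs = divmod(rem, 60 * 100)
    return f"{sign}{d:02d}d{m:02d}m{cs / 100:05.2f}s"


def sun_imaging_commands(ms, sun_ra_deg, sun_dec_deg, image_prefix):
    """Build (but do not run) the rephasing and imaging commands."""
    rephase = ["chgcentre", ms, deg_to_hms(sun_ra_deg), deg_to_dms(sun_dec_deg)]
    # Other imaging parameters (image size, pixel scale, ...) are left out
    # on purpose; they still have to be chosen for a real run.
    image = ["wsclean", "-subtract-model", "-name", image_prefix, ms]
    return [rephase, image]


# Placeholder MS name and Sun position, for illustration only.
for cmd in sun_imaging_commands("target_only.ms", 157.5, -7.5, "sun"):
    print(shlex.join(cmd))
```
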
@Victoria-Samboco
Contributor Author

I have a doubt about the goal of this project. The goal is to develop a pipeline to image the Sun using any data from MeerKAT, right? But my doubt is why we want to image the Sun. Initially I understood that we wanted to image the Sun to characterise the level of solar contamination and then somehow eliminate it (the Sun) from the observations, but at the last meeting I understood that the objective is actually for the pipeline to be able to make images of the Sun for any MeerKAT data.

@o-smirnov
Collaborator

It is both.

  • Routine images of the Sun are interesting to solar astronomers, so it's a very useful scientific byproduct to have.

  • We have some data with severe contamination from the Sun in the main field. If we can image the Sun in these data, we can then proceed to try to subtract it.

@Victoria-Samboco
Contributor Author

Okay. That means that to image the Sun we have to change the phase centre of the ms to point to the Sun and subtract the model-data from corrected-data; we can then subtract the Sun and go back to the previous phase centre (e.g. where a specific target we are interested in is), but now without the Sun there because we have already subtracted it?

@IanHeywood
Collaborator

Further to what Oleg said, we are still not sure what observing conditions lead to the most severe solar contamination of the data.

Is there a particular separation angle between the telescope's pointing direction and the Sun for which solar interference is particularly bad? Can the Sun enter the data from behind the main dish, e.g. by direct illumination of the secondary reflector? Are sunset and sunrise particularly bad times to do observations, when the Sun is low on the horizon? You will be the first person to conduct a systematic investigation of these effects for MeerKAT.

If you can devise a scheme to quantify the severity of the solar interference for a large range of observational conditions then we can use that information when scheduling future observations on the telescope. I think the best way to approach these questions is to first gather lots of images of the Sun.
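
One quantity such a scheme would need is the angular separation between the pointing direction and the Sun. A minimal sketch using the standard great-circle (Vincenty) formula follows; the function name is hypothetical, and the coordinates in the example are placeholders rather than values from any observation.

```python
import math


def angular_separation_deg(ra1_deg, dec1_deg, ra2_deg, dec2_deg):
    """Great-circle separation in degrees between two RA/Dec positions."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1_deg, dec1_deg, ra2_deg, dec2_deg))
    dra = ra2 - ra1
    # Vincenty form: numerically stable for both small and large separations.
    num = math.hypot(
        math.cos(dec2) * math.sin(dra),
        math.cos(dec1) * math.sin(dec2)
        - math.sin(dec1) * math.cos(dec2) * math.cos(dra),
    )
    den = (math.sin(dec1) * math.sin(dec2)
           + math.cos(dec1) * math.cos(dec2) * math.cos(dra))
    return math.degrees(math.atan2(num, den))


# Placeholder positions: a pointing centre and a Sun position, both in degrees.
pointing = (50.0, -30.0)
sun = (350.0, -5.0)
print(f"Separation: {angular_separation_deg(*pointing, *sun):.2f} deg")
```

Tabulating this angle per scan, alongside the Sun's elevation, would give the axes against which any measure of contamination can be plotted.
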

Your work can also be fed into the scheduling plans for SKA-MID, which will have a very similar design to MeerKAT, albeit on a much larger scale. There is a direct community benefit to making the observing programs for these instruments more efficient, in terms of minimising data loss and therefore cost.

@IanHeywood
Collaborator

we have to change the phase centre of the ms to point to the Sun and subtract the model-data from corrected-data

I think in the workflow we have been assuming so far, we subtract the MODEL_DATA from CORRECTED_DATA prior to changing the phase centre to point at the Sun. The assumption is that the model visibilities will correspond to the emission in the original target field, following the 2GC (self-calibration) stage of the 'regular' processing pipeline.

we can then subtract the Sun and go back to the previous phase centre (e.g. where a specific target we are interested in is), but now without the Sun there because we have already subtracted it?

In principle I think this is correct. If you can deconvolve the Sun then a model can be constructed and subtracted from the visibilities. How well the deconvolution process will work remains to be seen, as the Sun will be smeared in time even across a single scan. There may also be some bookkeeping involved, for example we might want to create a custom SOLAR_DATA model column in the MS to preserve the original model...
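
A minimal sketch of that subtraction and bookkeeping, using plain Python lists of complex numbers in place of real MS columns. The dictionary layout and the function names are hypothetical; only the column names MODEL_DATA, CORRECTED_DATA and SOLAR_DATA come from the discussion above.

```python
def subtract_field_model(columns):
    """Return CORRECTED_DATA - MODEL_DATA: visibilities with the target-field model removed."""
    corrected = columns["CORRECTED_DATA"]
    model = columns["MODEL_DATA"]
    if len(corrected) != len(model):
        raise ValueError("CORRECTED_DATA and MODEL_DATA must have the same length")
    return [c - m for c, m in zip(corrected, model)]


def store_solar_model(columns, solar_model):
    """Keep the Sun model in its own column so MODEL_DATA is not overwritten."""
    updated = dict(columns)
    updated["SOLAR_DATA"] = list(solar_model)
    return updated


# Toy visibilities, for illustration only.
columns = {
    "CORRECTED_DATA": [3 + 1j, 2 - 2j],
    "MODEL_DATA": [1 + 1j, 2 + 0j],
}
residual = subtract_field_model(columns)
print(residual)

# Later, once a Sun model exists, store it without touching the field model.
columns = store_solar_model(columns, [0.5 + 0j, 0.25 + 0j])
print(sorted(columns))
```
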

Lots to explore here techniques-wise, and all of it is interesting (I hope you agree!) and potentially useful to the community. But I think the first thing to do is to implement your routine solar imaging pipeline.

@Victoria-Samboco
Contributor Author

Victoria-Samboco commented Aug 15, 2022 via email

@IanHeywood
Collaborator

I imagined many things

You should of course always feel free to bring your own ideas to the table!

@Victoria-Samboco
Contributor Author

Victoria-Samboco commented Aug 15, 2022 via email

@Victoria-Samboco
Contributor Author

Victoria-Samboco commented Aug 22, 2022

Greetings

I am doing the calibration step, and after 1GC I ran the 2GC (selfcal) step using this configuration:

selfcal:
  enable: true
  ncpu: 10
  img_npix: 6000
  img_cell: 1.5
  img_niter: 1000000
  img_nchans: 5
  img_robust: 0
  img_taper: '0'
  img_nwlayers_factor: 3
  cal_niter: 5
  cal_timeslots_chunk: -1
  start_iter: 1
  img_specfit_nrcoeff: 2
  img_multiscale: true
  img_multiscale_scales: ''
  img_nrdeconvsubimg: 1024
  image:
    enable: true
    cleanmask_thr: [30,20,15,5]
    clean_cutoff: [0.5,0.5,0.5,0.5]
    col: [DATA,CORRECTED_DATA,CORRECTED_DATA,CORRECTED_DATA]
  calibrate:
    enable: true
    model: ['1','2','3']
    gain_matrix_type: ['Fslope', 'GainDiagPhase', 'GainDiag']
    gsols_chan: [0,0,0]
    gsols_timeslots: [50,50,50]
  transfer_apply_gains:
    enable: false
  transfer_model:
    enable: false
  report: true

and I got these images:
[image: 1gc]
[image: 1gc]

Looking at them, I think the images are not good, so I think I can't continue with the following steps. But I don't know exactly what I have to do to get a better image. I thought I had to run selfcal again on the target data, but I am getting this error:
2022-08-22 14:07:03 CARACal INFO: obsconf: initializing
2022-08-22 14:07:03 CARACal INFO: obsinfo file 1583662427_sdp_l0.1024ch-J033230_280757-corr-obsinfo.txt exists, not regenerating
2022-08-22 14:07:03 CARACal INFO: summary file 1583662427_sdp_l0.1024ch-J033230_280757-corr-summary.json exists, not regenerating
2022-08-22 14:07:03 CARACal INFO: elevation plot 1583662427_sdp_l0.1024ch-J033230_280757-corr-elevation-tracks.png exists, not regenerating
2022-08-22 14:07:03 CARACal INFO: MS #0: 1583662427_sdp_l0.1024ch-J033230_280757-corr.ms
2022-08-22 14:07:03 CARACal INFO: 1 spectral windows, with NCHAN=1024
2022-08-22 14:07:03 CARACal INFO: CHAN_FREQ from 544257324.21875 Hz to 1087726074.21875 Hz with average channel width of 531250.0 Hz
2022-08-22 14:07:03 CARACal INFO: target (TARGET):
2022-08-22 14:07:03 CARACal INFO: J033230-280757 (ID=0) : 455.00 minutes | RA=53.13 deg, Dec=-28.13 deg
2022-08-22 14:07:03 CARACal ERROR: Can't find an appropriate FIELD for obsinfo: gcal: all. Please check this config setting. It may also be that your MS scan intents are not pupulated correctly, in which case you must set gcal to a list of explicit field names. [RuntimeError]
2022-08-22 14:07:03 CARACal INFO: More information can be found in the logfile at output/logs-20220822-140703/log-caracal.txt
2022-08-22 14:07:03 CARACal INFO: exiting with error code 1

So I am not sure what to do or how to do it.

@o-smirnov
Collaborator

I see 12 images, what exactly am I looking at?

At the colour scale you're showing, they look superficially OK, why do you think they're not so good? But please adjust the colour scale so that we see more of the background noise/artefacts, it will be easier to judge quality then.

@IanHeywood
Collaborator

I don't really know CARACal so @o-smirnov and @Kincaidr might be better placed to help, but:

2022-08-22 14:07:03 CARACal ERROR: Can't find an appropriate FIELD for obsinfo: gcal: all. Please check this config setting. It may also be that your MS scan intents are not pupulated correctly, in which case you must set gcal to a list of explicit field names. [RuntimeError]

I don't think the external calibrators should be involved for self-calibration operation, and indeed I'm guessing you are working from a MS that only contains target visibilities at this point ( 1583662427_sdp_l0.1024ch-J033230_280757-corr.ms).

The maps look good, so my best guess would be this:

  transfer_model:
    enable: false
  report: true

perhaps it's trying to report on something related to the gaincal at the very end, but the gaincal is missing from the MS. Perhaps try setting report: false. But again I'm not familiar with this pipeline.

As a general point, I think five (or four?) cycles of selfcal is unnecessary, and will only serve to increase the amount of time this takes, for no benefit.

cleanmask_thr: [30,20,15,5]

with this setting I think any apparent improvements between selfcal cycles will simply be due to deeper deconvolution, which you can readily do immediately. MeerKAT is (in my experience) stable enough and with a good enough PSF to just do a single cycle of selfcal, after which the map is DDE-limited. I think you can probably save a lot of time (and electricity) by changing this.

Other things I'm not sure about:

Is img_nrdeconvsubimg: 1024 deconvolving every channel?

It looks like the final (?) selfcal iteration is amplitude and phase (GainDiag): gain_matrix_type: ['Fslope', 'GainDiagPhase', 'GainDiag'], this might also not be ideal since this field has a dominant source at the edge of the primary beam, it will probably make the sources in the map centre look worse.

Again, others please comment.

Cheers.

@IanHeywood
Collaborator

I see 12 images, what exactly am I looking at?

My familiarity with the field gives me an advantage here. :)

@Victoria-Samboco are these the full band (MFS) images?

@o-smirnov
Collaborator

@Victoria-Samboco, for future reference: when you post YAML into an issue, put triple back-quotes with "yml" at the start like so:

\`\`\`yml

...and finish with a line of triple back-quotes. Your YAML will then render with nice syntax highlighting, and will preserve correct spacing. If you simply paste YAML into the comment box like you did initially, GitHub's auto-formatting strips the leading spaces and makes it very hard to read the code. I have already edited your comment to fix this. (Likewise, you can render Python code nicely by using triple-backquote-python).

I don't think the external calibrators should be involved for self-calibration operation, and indeed I'm guessing you are working from a MS that only contains target visibilities at this point ( 1583662427_sdp_l0.1024ch-J033230_280757-corr.ms).

Correct. So I would ignore that error.

  report: true

Is just for a final HTML report. I don't use this, the radiopadre notebooks are more informative anyway. (The built-in reports pre-dated radiopadre and have far fewer features).

cleanmask_thr: [30,20,15,5]

with this setting I think any apparent improvements between selfcal cycles will simply be due to deeper deconvolution, which you can readily do immediately. MeerKAT is (in my experience) stable enough and with a good enough PSF to just do a single cycle of selfcal, after which the map is DDE-limited. I think you can probably save a lot of time (and electricity) by changing this.

Agreed.

Other things I'm not sure about:

Is img_nrdeconvsubimg: 1024 deconvolving every channel?

No, this enables wsclean's -parallel-deconvolution option. I do not take responsibility for why it's been renamed and the sense of the option has been inverted (i.e. number of subimages rather than max subimage size).

It looks like the final (?) selfcal iteration is amplitude and phase (GainDiag): gain_matrix_type: ['Fslope', 'GainDiagPhase', 'GainDiag'], this might also not be ideal since this field has a dominant source at the edge of the primary beam, it will probably make the sources in the map centre look worse.

Yep, @IanHeywood's interpretation is correct, and I agree, GainDiag could be risky. In fact I would just do one round with FSlope, since that already includes phases anyway...

@Victoria-Samboco
Contributor Author

Victoria-Samboco commented Aug 22, 2022

Actually, there are not 12 images but 6: the second 6 are the full field and the first 6 are the same images zoomed in. The images look like this:

[image: 1gc]

The first is from the DATA column and the others are from CORRECTED_DATA for the different solvers.

I think the last 2 images here are the same thing; maybe it is repeated because the number of calibration iterations was 5:

cal_niter: 5

@Victoria-Samboco
Contributor Author

Victoria-Samboco commented Aug 22, 2022

@IanHeywood yes, these are full-band images.

So that means I don't need to use 'GainDiagPhase', 'GainDiag', I just use 'Fslope'?

@Kincaidr
Collaborator

perhaps it's trying to report on something related to the gaincal at the very end, but the gaincal is missing from the MS. Perhaps try setting report: false. But again I'm not familiar with this pipeline.

Yes, but the error has nothing to do with the selfcal step. @Victoria-Samboco re-ran the pipeline after changing the input ms file from the raw ms to the ms containing only targets (corr-ms). When running the pipeline from selfcal (or any worker) caracal automatically runs obsconfig worker at the start as a pre-validation step. So in this case it could not find the calibrator information.

So the reason why there are 6 images is because cal_niter = 5 (model + 5 calibration runs). However, what is a bit confusing is that the script only specifies 3 solvers: ['Fslope', 'GainDiagPhase', 'GainDiag']. So in order for it to do 5 runs from these solvers, does it by default use the last solver for the last 2 runs? So with cal_niter = 5 is it doing [Fslope, GainDiagPhase, GainDiag, GainDiag, GainDiag]? Can you confirm @o-smirnov?

@o-smirnov
Collaborator

When running the pipeline from selfcal (or any worker) caracal automatically runs obsconfig worker at the start as a pre-validation step. So in this case it could not find the calibrator information.

Yep, so the error should just be ignored (maybe post a note to the Caracal repo to adjust the error wording, to make it clear that this is only a problem if the crosscal worker is being run).

Can you confirm @o-smirnov ?

I can neither confirm nor deny, I didn't write the Crosscal worker. :) Best ask Kshitij. Or just read the logs to see what it actually did?

@Kincaidr
Collaborator

So that means I don't need to use 'GainDiagPhase', 'GainDiag', I just use 'Fslope'?

Yes, so the main changes are:

cal_niter: 1
cleanmask_thr: [30,15] (Can experiment here)
col: [DATA,CORRECTED_DATA] (model + 1 round f-slope)
gain_matrix_type: ['Fslope']

For the rest of the inputs, just make sure the arrays have consistent lengths: under the image: subsection they should have length 2, and under the calibrate: subsection they should have length 1.
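
A minimal sketch of that length check, assuming the selfcal section has been loaded into a plain dict. The function name is hypothetical and the rule (image lists of length cal_niter + 1, calibrate lists of length cal_niter) simply restates the suggestion above; it is not CARACal's own validation. The values for the keys not listed above are trimmed by hand from the original config and are only illustrative.

```python
def check_selfcal_lengths(selfcal):
    """Return a list of messages for list-valued options with unexpected lengths."""
    n = selfcal["cal_niter"]
    problems = []
    for section, expected in (("image", n + 1), ("calibrate", n)):
        for key, value in selfcal.get(section, {}).items():
            if isinstance(value, list) and len(value) != expected:
                problems.append(
                    f"{section}.{key}: length {len(value)}, expected {expected}"
                )
    return problems


revised = {
    "cal_niter": 1,
    "image": {
        "enable": True,
        "cleanmask_thr": [30, 15],
        "clean_cutoff": [0.5, 0.5],
        "col": ["DATA", "CORRECTED_DATA"],
    },
    "calibrate": {
        "enable": True,
        "model": ["1"],
        "gain_matrix_type": ["Fslope"],
        "gsols_chan": [0],
        "gsols_timeslots": [50],
    },
}

for message in check_selfcal_lengths(revised):
    print(message)
print("problems found:", len(check_selfcal_lengths(revised)))
```
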

@Victoria-Samboco
Contributor Author

Hi Prof Oleg and Ian

Below I have images of the model after selfcal (with auto-mask 10) and of the model after applying a deep mask with breizorro (mask 5). I would like to know why we have these black dots/regions (negative regions) on the sources.

[image: model_data]

Below are the images corresponding to these models, after selfcal (with mask 10) and after applying the deep mask (5).

[image: selfcal_deepmaskimages]

@IanHeywood
Collaborator

Even though the genuine emission in a Stokes I image is always positive, the clean components can be both positive and negative. Some negative components are generally fine, for example a mixture of (mostly) positive and (some) negative components may be required to faithfully characterise an extended source.

The issue above is that both the automasking and breizorro have interpreted the strong radial artefacts around the brightest source in the image as genuine emission. Clean has then deconvolved these regions, which will likely include negative values that are significantly lower than those that result from pure noise.

I stopped using automasking as I never found a set of parameters that struck a good balance between completeness, avoiding artefacts, and also being applicable to a wide range of fields. For my own processing I "waste" an imaging cycle with unconstrained deconvolution and then use something breizorro-like to make a mask.

The breizorro mask, having a lower threshold than the automasking, will be including a lot more genuine sources, which is why that mask looks a lot busier. However I am surprised by how many artefacts breizorro is including. Can you please post a link to the (1GC?) image that you are generating the mask from?

Something strange also appears in the lower left image, where the bright source in the lower right seems to have a new source appearing next to it, but I can't quite see what is going on with the images at this resolution.

@Victoria-Samboco
Contributor Author

Victoria-Samboco commented Aug 31, 2022

For the 1GC image the link is

/net/garfunkel/home/samboco/solarkat/1GC_UHF/output/continuum/image_2/

and for the deep mask image it is

/net/garfunkel/home/samboco/solarkat/1GC_UHF/output/continuum/deep_masking_image

@IanHeywood
Collaborator

/net/garfunkel/home/samboco/solarkat/1GC_UHF/output/continuum/image_2/

I suspect this is actually the 2GC image (wsclean is imaging the CORRECTED_DATA column) and the 1GC image is in ../image_1, can you please confirm?

If so, then my best explanation for the new source in the left hand image above is that there is a strong negative feature in the model which is being reinforced by self-calibration and appearing in the corrected data for the subsequent imaging cycle. You can see it in the image and profiles here:

[image: Screenshot 2022-08-31 at 13 41 24]

The deep mask image also has this spurious negative feature (and its associated artefacts), which is probably why the artefacts are much more pronounced in the breizorro map. For mask making with breizorro I would return to the 1GC image, iteratively raising the threshold until none of the artefacts are present, and then re-image the DATA column with the new, artefact-free mask in place and then repeat the self-calibration.

@Victoria-Samboco
Contributor Author

Oh yes sorry. The 1GC image is here
/net/garfunkel/home/samboco/solarkat/1GC_UHF/output/continuum/image_1/

@IanHeywood
Collaborator

Some general comments on the wsclean command that was used:

-mem 100 -absmem 100.0
Using both of these might be harmless, but I'm not sure which one takes precedence. If it's the former then you will be potentially using all the RAM on your machine for certain stages in the process, which might not be what you want on a shared computer.

-multiscale
I do not think you need multiscale clean for this field, as it is dominated by compact features, so disabling this will speed up the processing. However if you enable multiscale then I would advise also manually specifying scales with -multiscale-scales. Otherwise wsclean can try to use some very large scales which can end up being quite unstable and resulting in bad maps at the end. Even for the Galactic centre imaging, which is the most multiscale-worthy field I've ever dealt with, I don't think I used scales larger than maybe 27 synthesised beamwidths.

-parallel-deconvolution 320
This explains @o-smirnov's explanation a bit further up, but for a 10240 x 10240 image this will result in ~1000 facets being deconvolved in parallel, which is probably highly inefficient. From the wsclean docs: "Values of 1024-4096 work well for large images. Smaller values would split the image in very many subimages, increases the computational cost of splitting, whereas larger values might lead to too few subimages, and this might therefore lead to lesser parallelization." I use 2560 for images with the same number of pixels. You can probably make another time saving by changing this.
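
The ~1000 figure comes from simple tiling arithmetic. A rough sketch, assuming the image is split into a square grid of subimages no larger than the given size (how wsclean actually partitions the image may differ); the function name is hypothetical.

```python
import math


def approx_subimage_count(image_size_px, max_subimage_px):
    """Approximate number of square subimages needed to tile a square image."""
    per_axis = math.ceil(image_size_px / max_subimage_px)
    return per_axis * per_axis


print(approx_subimage_count(10240, 320))   # 32 x 32 = 1024
print(approx_subimage_count(10240, 2560))  # 4 x 4 = 16
```
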

If these are caracal defaults then you might want to confer with the dev team about them.

@IanHeywood
Collaborator

A further thought occurs. The strong, spurious negative source does not appear in your lower right image, which, if I understand correctly, represents data that were self-calibrated with the model being formed from the breizorro mask. This suggests that while breizorro enforces positivity when making masks, perhaps wsclean's internal automasking does not. The former behaviour seems a lot safer than the latter when making a Stokes I image.
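
To illustrate the difference, a minimal sketch of two thresholding rules applied to a toy list of pixel values. This is not breizorro's or wsclean's actual algorithm: the function names, the single global noise value and the toy numbers are all assumptions made for illustration.

```python
def positive_mask(pixels, noise, threshold):
    """Mask only pixels brighter than +threshold * noise."""
    return [p > threshold * noise for p in pixels]


def absolute_mask(pixels, noise, threshold):
    """Mask pixels whose absolute value exceeds threshold * noise."""
    return [abs(p) > threshold * noise for p in pixels]


# Toy values: one genuine bright pixel (12.0) and one spurious negative (-9.0).
pixels = [0.2, -0.1, 12.0, -9.0, 0.3]
noise = 1.0

print(positive_mask(pixels, noise, 5.0))  # the negative feature stays unmasked
print(absolute_mask(pixels, noise, 5.0))  # the negative feature enters the mask
```

With the second rule the deconvolution is allowed to place components on the negative feature, which is the kind of behaviour that would let it be reinforced by self-calibration.
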
