-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for Python3 #270
Conversation
Can one of the admins verify this patch? |
ok to test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks mostly good. Main concern is the following issues:
- range vs. xrange, esp those in nested loops have a high performance and memory impact when running in python 2.7
- iteritems vs. items, same here please use conditionals and six to check which to call depending on environment
- pickle is backed in python on python 2.7 and the default levels are different. This has a performance impact when running in python 2.7. Use a conditional and six when importing
- buildins don't exist in python 2.7 the test coverage probably excludes these files. As it stands those are not backwards compatible with python 2.7
Smells like a python-casacore bug |
What python and what python-casacore are you using. |
This is the KERN-3 casacore. Nothing has changed in the installation
process:
https://github.com/ratt-ru/CubiCal/blob/master/Dockerfile
…On Fri, May 17, 2019 at 12:46 PM Gijs Molenaar ***@***.***> wrote:
What python and what python-casacore are you using.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#270?email_source=notifications&email_token=AEIVPJU277YYY3GPPZLN563PV2EHHA5CNFSM4HNLUJCKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODVUNR4I#issuecomment-493410545>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AEIVPJVBVOX24RFFZUG752DPV2EHHANCNFSM4HNLUJCA>
.
--
-----------------------------------------------------------------------------------------------------
Benjamin Hugo
Junior Software Developer
SARAO
Black River Park, 2 Fir Street, Observatory, Cape Town, Western Cape, 7925
Contact: [+27] 0716293858 <+27%2071%20629%203858>
PhD. student,
Radio Astronomy Techniques and Technologies,
Department of Physics and Electronics,
Rhodes University
Skype: benna.cn
-----------------------------------------------------------------------------------------------------
|
This is due to unicode ending up in the python-casacore code. Only the current (python-)casacore has proper Python2 support for unicode 2. If you encounter errors like this the casacore call containing the string needs to have a Unfortunatly i'm now out of time to work on this. |
ok I will change the build system to check the latest revision from casacore and build from there. Unfortunately this can only happen next week as I'm busy packaging killMS at the moment. |
Ideally it works with both the old and new python-casacore, it is probably an easy fix to insert some |
hmm ok. I will debug this further next week |
Where do you run this container with what? I can't replicate your issue. |
On the Jenkins CI with the following command line
|
for me the tests are running. What specific test fails? |
Its the main acceptance test on 147, see https://jenkins.meqtrees.net/job/cubical-pr/70/console The latest commits don't build successfully |
Ok, my guess is that this is because you use KERN-3 which has casascore 2.4.1 in it. |
ok before this is merged we need to work on a fix for casacore 2. Some of the lofar packages do not work with the new casacore and we do need to be able to run ddfacet, killms and cubical on the same installation, otherwise it just becomes too messy for the users. |
Also we need to keep long term support for Ubuntu 16.04 |
See issue: casacore/python-casacore#174 |
Well we're not breaking it by merging surely. The codebase has been made py3-compatible, we merge it in and carry on the revolution on another branch? |
Ok this version runs through on 16.04 py2 and 18.04 py2 and py3. However I had to made changes to the way you compute time indicies since python3 does not accept floating point arrays as index arrays, see the ms_tile provider. I also removed the SIN projection since the new montblanc does this internally. Since we made so many changes to montblanc itself I've tested it using my small DDFacet use case and it subtracts very well as you can see on the pull request, so I doubt this is a montblanc py3 porting issue. The variation between 16.04 and 18.04 is on the e-3 level, but both now fail. Whether or not this is a substantial difference I don't know. This needs further investigation but i suggest @o-smirnov and @JSKenyon quote the relative error instead of the absolute difference so we can better understand the
and
|
An even better idea is to write out the model and compute the chi^2 between the model and the de-corrected residuals. For now I've implemented the mean relative error check in decibels . I've set the threshold to -30 dB on the mean relative error and -25 dB on the 95th percentile. This gives us some leeway in the tests. Currently on the ubuntu 18.04 system this stands at:
So I think we can merge this bastard @o-smirnov |
@JSKenyon I'll make 2 images to tripple check that this didn't break anything but -30dB is well within instantaneous instrumental noise, so I don't think these gain differences are substantial enough to worry |
cubical/data_handler/ms_tile.py
Outdated
tindx = np.add.accumulate(tindx) | ||
return tindx | ||
|
||
self._row_identifiers = ddid_index * n_bl * ntime + timestep_index(self.times) * n_bl + \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not convinced by this - in my opinion we are trying to fix something which should not be broken. We need to find the root cause otherwise we may have unexpected behaviour elsewhere. Doing a search for times only yields 25 results and I cannot see any reason for self.times to become non-integer. For reference, self.time_col contains float values from the TIME column of the MS. self.times is of the same shape but instead contains integer indices associating each time with the timeslot/integration to which it belongs. These indices are generated by uniquify, which is defined init.py in the data_handler folder.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It must be broken upstream from this method somewhere. I can revert the change and if you run it with py3 it will break
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the safest thing is to sprinkle asserts for the data types into the codebase as they can always be disabled with python if you want full performance
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah there's no way times
should be float. I vote to rename it though, as from the variable name it's not obvious at all that this means "timeslot index" (anyone new to the code base will just take it to mean the plural of "time").
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok I found the difference it was stemming from your n_bl calculation which I already fixed with //
The relative difference is still
*** mean relative diff between CORRECTED_DATA and DE_DATA is -35.98283290863037 dB
*** ninety fifth percentile relative diff between CORRECTED_DATA and DE_DATA is -31.854880104464907 dB
which points to this being the sin projection you previously employed
Or start using type annotations, mypy and pycharm will warn you.
Op wo 29 mei 2019 11:56 schreef Oleg Smirnov <[email protected]>:
… ***@***.**** commented on this pull request.
------------------------------
In cubical/data_handler/ms_tile.py
<#270 (comment)>:
> @@ -147,34 +147,42 @@ def load_montblanc_models(self, uvwco, loaded_models, model_source, cluster, imo
ddid_index, uniq_ddids, _ = data_handler.uniquify(self.ddid_col)
self._freqs = np.array([self.tile.dh.chanfreqs[ddid] for ddid in uniq_ddids])
-
- self._row_identifiers = ddid_index * n_bl * ntime + (self.times - self.times[0]) * n_bl + \
+
+ def timestep_index(times, tol=1.0e-9):
+ """ Compute the prescan operation to find the unique timestep idenfiers for
+ a TIME_CENTROID array """
+ tindx = np.zeros_like(times, dtype=np.int64)
+ tindx[1:] = np.abs(times[1:] - times[:-1]) > tol
+ tindx = np.add.accumulate(tindx)
+ return tindx
+
+ self._row_identifiers = ddid_index * n_bl * ntime + timestep_index(self.times) * n_bl + \
Yeah there's no way times should be float. I vote to rename it though, as
from the variable name it's not obvious at all that this means "timeslot
index" (anyone new to the code base will just take it to mean the plural of
"time").
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#270?email_source=notifications&email_token=AACPVJACIWR3ABBBMS3L5KLPXZHONA5CNFSM4HNLUJCKYY3PNVWWK3TUL52HS4DFWFIHK3DMKJSXC5LFON2FEZLWNFSXPKTDN5WW2ZLOORPWSZGOBZ7EDKY#discussion_r288486664>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AACPVJHXU3S4JW4SFOJGFFLPXZHONANCNFSM4HNLUJCA>
.
|
As discussed above, this looks to be a manifestation of another error.
Since the reference data was generated with the old code, the DE_DATA test (the only one using Montblanc) can be expected to fail, since the effective model source positions will have shifted slightly. If you're right, the old code uses slightly incorrect positions, and the new code will subtract "better". The way to verify that the new code is "better" (or at least "as correct") is to run the test with a full MS (in C and D config), and eyeball the residual images. |
@o-smirnov I've already verified that montblanc is subtracts correctly within MeerKAT resolution which is comparable to VLA C |
I still maintain the differences in projection is well within visibility error bars, however I'm going to merge in montblanc master into the branch and redo the DDF MK deep2 cleaning test, which is an order of magnitude higher resolution than VLA D configuration. |
@JSKenyon this last commit fixes your cythonization upon wheel building. It also gets invoked upon standard python setup.py install which ensures wheels can be build directly from the raw release code. I prefer this method because it means you can call pip install git+.... or pip install directly from source on pypi. It also means that your source distribution on pypi can have a direct corresponding tarball on github without the need to include c/cpp files in the one and not the other. |
Ok, I am sort of convinced that this is in a working state for both 2.7 and 3.6. However, post merge there are several additional fixes which need to be made. I will make issues for them, but for the sake of my memeory I will also mention them here. @o-smirnov currently postmortem flagging breaks in flag3_to_col and we need to verify flagging behaviour as my current MWE ends up with unflagged bad data. |
Postmortem has been broken for a long long time. We need a separate branch
to fix it. If it's even worth fixing, which I'm not too sure about.
What you should try is --g-max-prior-error and --g-max-post-error. These
are now off (0) by default. A setting of 0.1 is appropriate for well
behaved data with the occasional outlier.
Cheers,
Oleg
…---
Sent from my phone. Quality of spelling inversely proportional to finger
size.
On Fri, 31 May 2019, 11:28 JSKenyon, ***@***.***> wrote:
Ok, I am sort of convinced that this is in a working state for both 2.7
and 3.6. However, post merge there are several additional fixes which need
to be made. I will make issues for them, but for the sake of my memeory I
will also mention them here. @o-smirnov <https://github.com/o-smirnov>
currently postmortem flagging breaks in flag3_to_col
<https://github.com/ratt-ru/CubiCal/blob/342e7261117412d1abad03d5045aa5da6f815240/cubical/data_handler/ms_data_handler.py#L1280>
and we need to verify flagging behaviour as my current MWE ends up with
unflagged bad data.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#270?email_source=notifications&email_token=ABRLTP2ZB55U3KNS6VWIZW3PYDVSLA5CNFSM4HNLUJCKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODWUW3QQ#issuecomment-497642946>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABRLTPZ76MKEF6QRZ5UR3QDPYDVSLANCNFSM4HNLUJCA>
.
|
@o-smirnov that's just it - I did have those set. Perhaps my parset is outdated and there is some other setting I have missed. I made a local change to the postmortem flagging code to make it run, and that produced sensible results. I am just a little concerned that flags are not propagating as expected. But it is not needed in this PR - we can look at fixing up all the little issues after the merge. |
@JSKenyon can you submit all your changes so I can do one final check before we press the (red) merge button |
@bennahugo I have already pushed the minor changes I made. See the last two commits. So feel free to go ahead. |
I think this is mostly done now.
python2 with casacore 3.0 and python-casacore 3.0 works.
python3 with casacore 3.0 and python-casacore 3.0 gives an error:
Which is reported here:
casacore/python-casacore#138
python2 and python3 with the latest casacore and python-casacore give an error, which is reported here: #269
Both of them are in the same area of code, so I guess they are related.
This has been quite some work (again), so I hope this gets merged soon.
Oleg not sure what you want with your printing voodoo, but this is the result from the auto 2to3 convert things.