
Efficient simulation of multiple sources in a single exposure #53

Merged: dkirkby merged 39 commits into master from the MultipleSources branch on Mar 14, 2017

Conversation

@dkirkby (Member) commented on Jan 17, 2017

This PR is to address #6 and, in particular, to look at the feasibility of calculating fiberloss fractions for a whole focal plane on the fly.

@dkirkby changed the title from "Multiple sources" to "Add support for efficient simulation of multiple sources in a single exposure" on Jan 17, 2017
@dkirkby changed the title from "Add support for efficient simulation of multiple sources in a single exposure" to "Efficient simulation of multiple sources in a single exposure" on Jan 17, 2017
@dkirkby (Member, Author) commented on Jan 17, 2017

I have implemented a simple benchmark in a new command-line script quickfiberloss that calculates fiberloss fractions (vs. wavelength) for 5K targets randomly placed on the focal plane. The basic flow is:

import specsim.fiberloss
import specsim.simulator

simulator = specsim.simulator.Simulator(args.config)
for i in range(num_targets):
    fiberloss = specsim.fiberloss.calculate_fiber_acceptance_fraction(
        focal_x[i], focal_y[i], simulator.simulated['wavelength'],
        simulator.source, simulator.atmosphere, simulator.instrument)

I get the following timing results for the loop (i.e., not including the simulator start-up time):

  • ntargets = 10: 15.3 us / target
  • ntargets = 100: 13.8 us / target
  • ntargets = 1000: 13.4 us / target
  • ntargets = 5000: 13.5 us / target

In other words, this is surprisingly fast, with a total time for 5K targets of ~70ms!
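
For reference, an equivalent per-target timing can be obtained with a plain timer around that loop (a minimal sketch, not the actual quickfiberloss code):

import time

start = time.time()
for i in range(num_targets):
    fiberloss = specsim.fiberloss.calculate_fiber_acceptance_fraction(
        focal_x[i], focal_y[i], simulator.simulated['wavelength'],
        simulator.source, simulator.atmosphere, simulator.instrument)
elapsed = time.time() - start
# Report the average time per target in microseconds.
print('{0:.1f} us / target'.format(1e6 * elapsed / num_targets))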

@dkirkby (Member, Author) commented on Jan 17, 2017

Those timing results were completely wrong because the code was still using the fiberloss tables instead of galsim. The updated numbers look a lot more plausible:

  • n=10: 51 ms/target
  • n=100: 46 ms/target
  • n=1000: 47 ms/target
  • n=5000: 46 ms/target

The total time for 5K targets is now ~230 s (about 3 minutes 50 seconds).

This is without any caching of things that do not change (the atmospheric PSF and anti-aliased fiber aperture, for example; see the sketch after this list), so the next steps are to:

  • profile and see if there are any easy speedups.
  • study tradeoff of speed vs accuracy for the num_wlen and num_pixels parameters.
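
As a rough illustration of the caching idea mentioned above (a sketch only, not the specsim implementation; the Moffat profile and function name are placeholders):

import functools
import galsim

# The atmospheric PSF depends on wavelength (via the seeing FWHM) but not on
# the target position, so it can be built once and reused for every fiber
# instead of being reconstructed inside the per-target loop.
@functools.lru_cache(maxsize=None)
def atmospheric_psf(seeing_fwhm_arcsec):
    return galsim.Moffat(beta=3.5, fwhm=seeing_fwhm_arcsec)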

@dkirkby (Member, Author) commented on Jan 17, 2017

The profile is instructive:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    11003   11.441    0.001   20.700    0.002 astropy/units/core.py:1212(filter_units)
    11000   11.206    0.001   13.754    0.001 galsim/base.py:757(drawImage)
42977468/42977467    5.847    0.000    6.900    0.000 {isinstance}
  1166300    1.847    0.000    2.598    0.000 astropy/units/core.py:1186(has_bases_in_common)
   168248    1.508    0.000   10.626    0.000 astropy/units/quantity.py:291(__array_prepare__)
  5581525    1.505    0.000    1.505    0.000 {method 'match' of '_sre.SRE_Pattern' objects}
      250    1.346    0.005    3.970    0.016 astropy/io/ascii/core.py:683(process_lines)
     1000    1.326    0.001   64.245    0.064 specsim/fiberloss.py:18(calculate_fiber_acceptance_fraction)
   366521    1.279    0.000    3.765    0.000 astropy/units/quantity.py:567(_new_view)
2412895/2409763    1.194    0.000    1.336    0.000 {getattr}

It looks like we are spending more time in astropy.units than galsim.base...

@dkirkby (Member, Author) commented on Jan 18, 2017

Hardcoding the following values:

        # Lookup the RMS blur in focal-plane microns.
        ##blur_rms = instrument.get_blur_rms(wlen, angle_r).to(u.um).value
        blur_rms = 10.0
        ...
        # Lookup the seeing FWHM in arcsecs.
        ##seeing_fwhm = atmosphere.get_seeing_fwhm(wlen).to(u.arcsec).value
        seeing_fwhm = 1.1

and repeating the profile gives:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    11000   10.614    0.001   12.991    0.001 galsim/base.py:757(drawImage)
  5581525    1.691    0.000    1.691    0.000 {method 'match' of '_sre.SRE_Pattern' objects}
      250    1.493    0.006    4.409    0.018 astropy/io/ascii/core.py:683(process_lines)
  3275381    1.285    0.000    1.883    0.000 astropy/io/ascii/core.py:698(<genexpr>)
     1000    1.171    0.001   33.100    0.033 specsim/fiberloss.py:18(calculate_fiber_acceptance_fraction)
   135248    1.143    0.000    8.634    0.000 astropy/units/quantity.py:291(__array_prepare__)
        3    1.023    0.341    1.538    0.513 specsim/camera.py:68(__init__)
2397468/2397467    0.883    0.000    1.773    0.000 {isinstance}
   267521    0.876    0.000    2.518    0.000 astropy/units/quantity.py:567(_new_view)
1840895/1837763    0.863    0.000    0.983    0.000 {getattr}

This confirms that the astropy slowdown is mostly coming from those two calls.
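
A minimal sketch of the kind of fix this suggests, assuming get_blur_rms and get_seeing_fwhm accept array-valued wavelengths (the wavelength range below is illustrative):

import astropy.units as u
import numpy as np

# Do the unit conversions once on the full wavelength grid, outside the
# per-wavelength GalSim loop, then work with plain ndarrays.
wlen_grid = np.linspace(3550., 9850., num_wlen) * u.Angstrom
blur_rms = instrument.get_blur_rms(wlen_grid, angle_r).to(u.um).value
seeing_fwhm = atmosphere.get_seeing_fwhm(wlen_grid).to(u.arcsec).value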

@dkirkby (Member, Author) commented on Feb 27, 2017

Instructions for profiling a console-script entry point within IPython:

import cProfile
import specsim.quickfiberloss
# Profile a run with 1000 targets and save the stats to profile.out.
cProfile.run('specsim.quickfiberloss.main(["-n", "1000"])', 'profile.out')

To examine the profile results saved in profile.out:

import pstats
p = pstats.Stats('profile.out')
# Show the 10 functions with the largest total (self) time.
p.sort_stats('time').print_stats(10)
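
The same stats object can also be sorted by cumulative time, which is useful for finding the expensive high-level callers rather than the hot leaf functions:

p.sort_stats('cumulative').print_stats(10)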

@dkirkby (Member, Author) commented on Feb 28, 2017

After the latest commits, the profile is:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    11000   10.589    0.001   12.892    0.001 galsim/base.py:757(drawImage)
    11000   10.544    0.001   18.898    0.002 astropy/units/core.py:1212(filter_units)
 41524631    4.744    0.000    4.990    0.000 {isinstance}
  1166000    1.662    0.000    2.329    0.000 astropy/units/core.py:1186(has_bases_in_common)
    84841    0.682    0.000    0.682    0.000 {method 'reduce' of 'numpy.ufunc' objects}
1166824/1160767    0.659    0.000    1.064    0.000 astropy/units/core.py:2097(decompose)
  1166000    0.576    0.000    2.905    0.000 astropy/units/core.py:1195(has_bases_in_common_with_equiv)
     1000    0.572    0.001    1.205    0.001 specsim/fiberloss.py:35(__init__)
1075686/1020527    0.536    0.000    1.141    0.000 astropy/units/core.py:1934(decompose)
    57000    0.527    0.000    0.925    0.000 galsim/transform.py:107(__init__)

so we are still dominated by astropy units.

@dkirkby (Member, Author) commented on Mar 7, 2017

Checklist for closing this PR:

  • Resolve travis test failure with py34 (py27 & py35 are working).
  • Resolve travis build_docs -w test failure.
  • Update Simulator.simulate() to accept optional per-fiber arrays of (x,y), fiberloss and SED and return 2D output arrays. See #6 (Support parallel simulation of multiple sources) for details.
  • Add sphinx doc page summarizing new capabilities.

@dkirkby (Member, Author) commented on Mar 9, 2017

I think I have fixed the last travis problems and I am ready to merge now. I decided to push the new "batch simulate" method into a separate PR since there is already a lot to review here. I am assigning @sbailey, but anyone else is welcome to jump in.

@dkirkby (Member, Author) commented on Mar 9, 2017

Updated readthedocs for this PR are at http://specsim.readthedocs.io/en/multiplesources/, including:

  • user guide to new fiberloss calculations.
  • description of new quickfiberloss benchmarking script.
  • API docs for new specsim.fiberloss module.

@sbailey (Contributor) commented on Mar 11, 2017

Did you want to finish the unchecked punchlist item of "Update Simulator.simulate() to accept optional per-fiber arrays of (x,y), fiberloss and SED and return 2D output arrays. See #6 for details." before closing this, or do you want to merge this and then continue with simulating multiple sources at different focal plane locations in a separate branch/PR?

I don't see how to combine the fiberloss calculations with a spectral simulation here, even for a single target. Could you add a section to the User's Guide with the actual Python code to simulate a spectrum of flux vs. wavelength for an r=2 arcsec disk galaxy at x=100, y=50 on the DESI focal plane, and an example of how to override/replace the default fiberloss calculation with a different fiberloss vs. wavelength?

i.e. this code looks fine to merge, though I don't yet know how to use it to accomplish the original goal of this branch / PR. I'm not sure if that is because this PR lays the groundwork but the final integration is still to be done, or whether I'm just not reading the right part of the documentation/code, or whether that connection isn't documented yet.

@dkirkby (Member, Author) commented on Mar 11, 2017

You didn't miss anything. I decided to do the final connection in a new PR since it will probably break the public API and there are a lot of changes here already.

To simulate a single fiber with this code, you just need to change instrument.fiberloss.method from table to galsim (either by editing the yaml file or programmatically). But the new API will bypass the single-fiber mode to calculate output arrays for all fibers with one call.
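
For example, the programmatic route might look like this (a sketch, assuming the loaded configuration exposes the YAML nodes as attributes and that 'desi' is the config name in use):

import specsim.config
import specsim.simulator

config = specsim.config.load_config('desi')
config.instrument.fiberloss.method = 'galsim'  # default is 'table'
simulator = specsim.simulator.Simulator(config)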

@sbailey (Contributor) commented on Mar 13, 2017

If using method=galsim, how does the user specify where on the focal plane the object is?

For consideration: is it necessary to change the public specsim API to support multiple fibers simultaneously? Are there big performance benefits from reusing calculations? We could just wrap the existing single spectrum API in a multi-spectrum API on the desisim side. I think updating the specsim API could be useful, but I don't think it has to happen that way.

Documentation comment doesn't have to be a blocking factor for this PR, but it still applies: specsim provides great details on what it does, but lacks the big picture example of how to put the pieces together to do a simulation (unless using the command line quickspecsim). You may be so familiar with it that it is obvious to you, but I have to poke under the hood in quickspecsim or quickgen to figure that out.

@dkirkby (Member, Author) commented on Mar 14, 2017

I am merging now and will update the docs as part of the next PR after consultation with Stephen.

The scipy docs web server has been down for ~2 days now, so this may cause travis failures (since the docs build uses intersphinx).

@dkirkby merged commit e1258c7 into master on Mar 14, 2017
@dkirkby deleted the MultipleSources branch on March 14, 2017 at 21:37