There are four types of information flowing from specsim to quickgen:
- Wavelength grids for the simulation and each camera.
- Mean number of detected electrons from the source and sky and the corresponding ivars in downsampled bins for each camera.
- Perfectly calibrated mean flux in downsampled bins, with resolution applied for each camera.
- Sparse resolution matrix for each camera.
Quickgen also adds Gaussian noise to the specsim-calculated means, but I would rather do this in specsim (e.g., for consistency between quickgen and quickbrick). See desihub/specsim#3.
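For reference, the noise step amounts to drawing a Gaussian deviate with variance 1/ivar in each downsampled bin. A minimal sketch (the function name and the zero-ivar handling are my own, not quickgen's actual code):

```python
import numpy as np

def add_gaussian_noise(mean, ivar, rng=None):
    """Add Gaussian noise with variance 1/ivar to each mean value.

    Bins with ivar == 0 (no information) are left at their mean value;
    this is an illustrative convention, not necessarily quickgen's.
    """
    rng = np.random.default_rng(rng)
    sigma = np.zeros_like(mean, dtype=float)
    nonzero = ivar > 0
    sigma[nonzero] = ivar[nonzero] ** -0.5
    return mean + rng.normal(scale=1.0, size=mean.shape) * sigma
```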
from desisim.
On the issue of wavelength grids, we had some discussion in #86 and desihub/desimodel#7 is very relevant.
The coverage of each camera is implicitly specified by two arrays, throughput and resolution, and explicitly by the header keywords `WMIN_ALL` and `WMAX_ALL` in `desimodel/data/specpsf/psf-quicksim.fits`.
Specsim works with two different ranges for each camera: the physical extent of active pixels (`camera.ccd_slice`) and the slightly larger range of source wavelengths that can disperse into the CCD (`camera.response_slice`), where resolution tails are truncated at n sigma (nominally 5 sigma, set by the `instrument.cameras.*.constants.num_sigmas_clip` config parameters).
Here are the numbers, for reference:

```
Range        b(min)   b(max)   r(min)   r(max)   z(min)   z(max)
           Angstrom Angstrom Angstrom Angstrom Angstrom Angstrom
---------- -------- -------- -------- -------- -------- --------
Throughput 3533.000 5998.000 5564.000 7805.000 7360.000 9913.000
Header     3569.000 5949.000 5625.000 7741.000 7435.000 9834.000
CCD        3569.000 5949.000 5625.000 7741.000 7435.000 9834.000
Response   3565.400 5952.600 5621.700 7744.300 7431.000 9838.000
```
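The Response row follows from the CCD row by padding each edge with the clipped resolution tail: `num_sigmas_clip` times the Gaussian sigma implied by the resolution FWHM at that edge. A sketch of the arithmetic (the FWHM values passed in below are illustrative assumptions, not numbers read from desimodel):

```python
import numpy as np

# Conversion from Gaussian FWHM to sigma: 1 / (2 sqrt(2 ln 2))
FWHM_TO_SIGMA = 1.0 / (2.0 * np.sqrt(2.0 * np.log(2.0)))

def response_range(ccd_min, ccd_max, fwhm_min, fwhm_max, num_sigmas_clip=5.0):
    """Pad the CCD wavelength range by the clipped resolution tails.

    fwhm_min / fwhm_max are the resolution FWHM (Angstrom) at the two
    CCD edges. This is a sketch of the idea, not specsim's code.
    """
    lo = ccd_min - num_sigmas_clip * fwhm_min * FWHM_TO_SIGMA
    hi = ccd_max + num_sigmas_clip * fwhm_max * FWHM_TO_SIGMA
    return lo, hi
```

With an assumed edge FWHM of ~1.7A for the b camera, this reproduces the tabulated b-camera Response limits to within ~0.01A.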
Note that the `psf-quicksim.fits` header matches the actual non-zero range of the `fwhm_wave`, `neff_spatial` and `angstroms_per_row` arrays, which is what specsim actually uses to determine the CCD range (rather than the header keywords).
Note also that the throughput limits are sometimes used in desisim via calls like `desimodel.io.load_throughput('b').wavemin`.
However, the simulation should never actually need to use wavelengths outside of the tighter "Response" values above, since wavelengths outside this range can never disperse into the CCD, even if they have non-zero throughput.
The simulation wavelength is currently specified (in `specsim/data/config/desi.yaml`) as:

```yaml
wavelength_grid:
    unit: Angstrom
    min: 3500.3
    max: 9999.7
    step: 0.1
```
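For concreteness, this config materializes as a uniform grid of 64995 points; a sketch of the arithmetic (not specsim's internal code):

```python
import numpy as np

# Values from the desi.yaml wavelength_grid block above.
wmin, wmax, step = 3500.3, 9999.7, 0.1

# Number of steps between the limits, then an inclusive grid.
nsteps = int(round((wmax - wmin) / step))
wavelength = wmin + step * np.arange(nsteps + 1)
```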
The limits could be tightened to the response range above, 3565.0 - 9838.0, without changing the simulation results. However, there are two practical considerations:
- We might want to change the resolution model or clipping parameter (currently 5-sigma), which would change these values.
- The template flux needs to be defined over the full simulation range.
Simspec currently uses 3533.0 - 9913.0, so this is effectively now determining the minimum simulation wavelength limits used by `quickgen`.
Questions:
- Do we want to tighten the simulation limits?
- How were the simspec limits chosen? Are they likely to change?
- Is 0.1A binning fine enough for the simulation?
The wavelength grid for downsampled results in each camera is currently determined by (1) the global `downsampling` parameter, (2) the simulation wavelength grid, and (3) the CCD wavelength limits. For example, the first bin of the b-camera grid covers 3569.0 - 3569.5 (so is centered at 3569.25) with `downsampling = 5` and `step = 0.1`.
The current scheme is fragile: you might reasonably use a finer simulation grid, but that then redefines each camera's output grid. The output grid cannot be specified completely independently, since we want to downsample rather than resample, but a more robust parameterization would be to specify an `output_step` parameter, e.g. 0.5A. This insulates changes to the simulation grid from changes to the output grid, while still requiring that `output_step` is an exact multiple of the hi-res simulation grid step. This parameter could be defined separately for each camera to match each output grid to the resolution.
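The exact-multiple constraint can be enforced directly when downsampling; a hypothetical sketch of the proposed `output_step` scheme (the function and its names are mine, not existing specsim code):

```python
import numpy as np

def downsample(flux, step, output_step):
    """Downsample a hi-res spectrum by averaging groups of bins.

    output_step must be an (approximately) exact integer multiple of
    the simulation step, and the grid length must divide evenly.
    """
    ratio = output_step / step
    n = int(round(ratio))
    if n < 1 or abs(ratio - n) > 1e-8:
        raise ValueError("output_step must be an exact multiple of step")
    if len(flux) % n != 0:
        raise ValueError("grid length must divide evenly into output bins")
    # Average each consecutive group of n hi-res bins.
    return flux.reshape(-1, n).mean(axis=1)
```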
My concrete proposal is to replace the `downsampling` parameter with a per-camera `output_step` parameter with a default of 0.5A (to match the current config). Let me know soon if you don't think this is the right direction to go.
@sbailey On second thought, shouldn't specsim output physical pixels in each camera, calculated from `angstroms_per_row`? Or is there a reason to fill frame files with pseudo-pixels, as we currently do?
I am proceeding with the original integer downsampling scheme (with per-camera `output_pixel_size` parameters all set to 0.5A) and we can revisit the option to simulate physical pixels in a separate issue.
I propose to define the canonical simulation range as 3550 - 9850A. This fits within the throughput range used by simspec (3533 - 9913A), has enough padding beyond the 5-sigma response range (3565 - 9838A) to allow resolution model fiddling, and is nicely rounded for plotting, etc.
I just noticed that `quickgen` generates three independent noise realizations for the object and sky frames and the calibrated cframe. This seems clearly wrong for the object frame and cframe arrays, but I'm not so sure about the sky frame file. Assuming the sky array represents the skymodel described here, perhaps it shouldn't have any noise at all?
Wavelength grid
For most uses of specsim, having the final output grid match the grid that we would extract (e.g. 0.5 A) would be most useful. If using specsim to propagate photons for pixel sims, then we'd use a much finer grid (smaller than CCD pixels). I don't see a need for simulating on exactly the grid of the CCD pixels, i.e. the `output_pixel_size` parameter sounds good.
3533 - 9913 covers the maximal range seen by any fibers (no single fiber sees that entire range). The median range is 3561 - 9893 so I'd suggest a default simspec range of 3550 - 9900.
Noise
Sky should have some noise, since it will be measured from interpolating O(40) sky fibers. One could imagine some physical modeling or smoothing to minimize the noise, but it won't be perfect. The current code includes a factor of 25 fibers (quickgen line 292), which is semi-realistic since not all 40 sky fibers will contribute to the sky model for every science fiber.
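The scaling described here (reducing the sky-model variance by an effective number of contributing sky fibers) can be sketched as follows; the function name and interface are mine, while the default of 25 mirrors the factor at quickgen line 292:

```python
import numpy as np

def sky_model_ivar(sky_ivar_single, n_eff=25):
    """Inverse variance of an interpolated sky model.

    Averaging n_eff effectively-independent sky fibers reduces the
    variance by n_eff, i.e. multiplies the single-fiber ivar by n_eff.
    n_eff=25 is a rough stand-in for the O(40) sky fibers, since not
    all of them contribute to the model for every science fiber.
    """
    return n_eff * np.asarray(sky_ivar_single)
```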
Object frame vs. cframe: I think you mean the uncalibrated frame vs. the calibrated cframe? It would be more accurate to have them use calibrated/scaled versions of the same noise realization instead of completely different realizations. Flux calibration and sky subtraction will add additional noise that somewhat breaks that equivalence. In practice that hasn't mattered, since we've used quickgen runs for either frame or cframe outputs, but not both together. Still, it would be good to fix this as part of the quickgen/specsim cleanup.
I assume you are referring to throughput here, since that is the range of the throughput vector in desimodel:

> 3533 - 9913 covers the maximal range seen by any fibers (no single fiber sees that entire range)

Is this vector actually the max envelope of the individual fiber throughputs?
Specsim does not model variations of throughput, pixel size, etc. across the focal plane, so the simulation range should be chosen for a typical / median fiber. If we want to account for this variation in future, a next step could be to create configs `desi-central` and `desi-outer` that model extreme fibers with appropriate throughputs, pixel sizes, etc.
Even if the median fiber has non-zero throughput up to 9893A, the simulated z-CCD only extends up to 9834A, so flux above ~9838A will never contribute to the simulated output.
I am happy to use your proposed 3550 - 9900A but just want to be sure we agree that increasing from 9850 to 9900A will give identical results (with arrays padded by 500 extra zeros).
The attached plot shows the minimum and maximum wavelength coverage vs. fiber number:
- 3533 is the minimum wavelength seen by any fiber (the edge fibers). These only extend to 9833 in z.
- 9913 is the maximum wavelength seen by any fiber (the central fibers). These only go down to 3569 in b.
The limiting factor in the min/max is the light actually going off the edge of the CCD, not some throughput coefficient going to 0. It is not a coincidence that the throughput model covers 3533-9913 A, i.e. the maximum possible range.
psf-quicksim covers 3569 - 9834A, which was chosen as the worst-case scenario: every fiber sees at least that range. My proposal to have specsim cover the median fiber case of 3550 - 9900A only makes sense if we update psf-quicksim to also cover that range. It might be better to keep the current more conservative range so that analyses don't use specsim for the median case thinking that it applies to all fibers.
For the record, the plot was made with desimodel 0.4.2.dev32 and:
```python
import numpy as np
import matplotlib.pyplot as plt
import desimodel.io

b = desimodel.io.load_psf('b')
z = desimodel.io.load_psf('z')
fibers = np.arange(500)
# Minimum wavelength on the b CCD and maximum on the z CCD, per fiber.
wmin = b.wavelength(fibers, y=0)
wmax = z.wavelength(fibers, y=z.npix_y - 1)

plt.subplot(211)
plt.plot(fibers, wmax)
plt.ylabel('Maximum wavelength [A]')
plt.subplot(212)
plt.plot(fibers, wmin)
plt.ylabel('Minimum wavelength [A]')
plt.xlabel('Fiber number')
```
@sbailey This is illuminating! Are there expected throughput variations vs fiber number, or is falling off the CCD the main fiber-dependent effect?
I will stick with 3550 - 9850A for the current config, to accompany the current psf-quicksim, etc., but we can revisit this later by changing the baseline and/or creating alternate configs for specific extreme fibers.
If we ever do decide to simulate a fiber that reaches either limit (3533, 9913), we would need to extend the tabulated throughput slightly (~5A) to model photons that disperse across this edge, with a corresponding extension of the simspec coverage.
There will be random variations in the fiber throughput due to manufacturing, but it isn't a function of fiber number. Effects that are predictable:
- different angular size of the fiber vs. location on the focal plane -> different amounts of light make it into the fiber
- spectral resolution varies across the focal plane
- vignetting
We should be careful about feature creep for specsim: the original idea of quicksim was to be "good enough" for a "typical fiber". Adding some level of variation for different fibers could be good, but at some point it can also be OK to just say that the desired level of fidelity is beyond the scope of this particular simulation tool.
I agree about the feature creep and that specsim is only aiming to be good enough for a typical / worst case fiber.
There are some open issues to introduce some of these focal-plane variations, but we should be guided by how important the effects are and how easy they are to implement before tackling any of them.
The next question is how to calculate the perfectly calibrated flux in each camera. Currently quickgen and quickbrick approach this quite differently:
```
quickgen   = D . R . f
quickbrick = (D . R . (k f)) / (D . R . k)
```
where:
- D represents downsampling from 0.1 to 0.5A
- R represents resolution convolution on the 0.1A grid
- k is the flux-to-electrons calibration vector on the 0.1A grid
- f is the source flux above the atmosphere on the 0.1A grid
The quickbrick method makes more sense to me, as being closer to how we actually calibrate flux, so I propose to have quickgen use the same method.
(Neither of these corresponds to the decorrelated flux calculated by spectroperfectionism, but that's an issue for another day.)
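The quickbrick expression above can be sketched with dense arrays as follows (the function and its names are mine; the real code would use sparse matrices):

```python
import numpy as np

def quickbrick_calibrated_flux(f, k, R, n):
    """Perfectly calibrated downsampled flux, quickbrick style.

    f : source flux above the atmosphere on the hi-res grid
    k : flux-to-electrons calibration vector on the hi-res grid
    R : dense resolution matrix on the hi-res grid
    n : integer downsampling factor (e.g. 5 for 0.1A -> 0.5A)

    Implements (D . R . (k f)) / (D . R . k), where D averages
    consecutive groups of n hi-res bins.
    """
    def D(x):
        return x.reshape(-1, n).mean(axis=1)
    return D(R @ (k * f)) / D(R @ k)
```

With R = identity and k constant, this reduces to plain downsampling of f, as expected.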
agreed
The per-camera frame noise added by quickgen is more problematic since it samples the coadded ivar, so does not blow up as the camera throughput goes to zero. I am fixing this now.
What should we use for the downsampled resolution matrix?
`quickgen` integrates a Gaussian over 0.5A pixels, which gives the downsampled response to a delta function centered in each 0.5A bin. `quickbrick` averages the 0.1A resolution matrix over 0.5 x 0.5A blocks, which is roughly the convolution of the `quickgen` resolution with a 0.5A pixel boxcar.
Both programs should use the same resolution matrix, but which one?
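The quickbrick-style construction can be sketched with dense arrays (the row renormalization at the end is my addition so that each downsampled row still sums to one; the real matrices are sparse):

```python
import numpy as np

def downsample_resolution(R, n):
    """Average a hi-res resolution matrix over n x n blocks.

    R : dense m x m resolution matrix on the hi-res grid (m divisible by n)
    n : integer downsampling factor (e.g. 5 for 0.1A -> 0.5A)
    """
    m = R.shape[0]
    assert m % n == 0, "matrix size must divide evenly into blocks"
    # Mean over each n x n block of the hi-res matrix.
    blocks = R.reshape(m // n, n, m // n, n).mean(axis=(1, 3))
    # Renormalize so each downsampled row sums to one.
    return blocks / blocks.sum(axis=1, keepdims=True)
```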
I decided to use the `quickbrick` resolution matrix in both programs.