**Table of Contents**

- Project Timeline
- Binning Errors
- Reproducing Lee+12 IRIS
- Comparing the MLE without an intercept to Lee+12
- Taurus
- California
- Background HI

## Project Timeline

Below is the proposed project timeline. These should be hard deadlines.

**September 1st**: DGR, HI width, and intercept derived for each cloud.

**October 1st**: Complete Krumholz model analysis.

**November 1st**: Circulate paper to coauthors. Prelims on November 20th. Wait
until after November 20th to begin editing paper.

**December 20th**: Submit paper.

## Binning Errors

In the previous post I wrote that the total uncertainty of a binned pixel should include contributions from both the error on the mean and the standard deviation of the population. This is incorrect. The error on a binned/smoothed pixel should decrease as the bin grows, because the error on the mean becomes smaller. Thus the total uncertainty of the sample, $\sigma_{\rm tot}$, is just the error on the mean,

$$\sigma_{\rm tot} = \frac{1}{N} \sqrt{\sum_{i=1}^{N} \sigma_i^2},$$

where $\sigma_i$ is the standard deviation for each element and $N$ is the number of pixels in the bin. If the variances are the same, $\sigma_i = \sigma$, then $\sigma_{\rm tot} = \sigma / \sqrt{N}$, i.e., the uncertainty of the binned pixel scales as the inverse square root of the number of pixels included in the bin.
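As a quick numerical check of the scaling, here is a minimal sketch of the error on the mean for independent pixels, $\sqrt{\sum_i \sigma_i^2} / N$. The function name `binned_error` and the per-pixel error of 0.2 are illustrative assumptions, not values from the maps.

```python
import math

def binned_error(sigmas):
    """Error on the mean of independent pixels with 1-sigma errors `sigmas`."""
    n = len(sigmas)
    return math.sqrt(sum(s**2 for s in sigmas)) / n

sigma = 0.2  # hypothetical per-pixel error
for n in (1, 4, 16):
    # With equal errors this reduces to sigma / sqrt(n)
    print(n, binned_error([sigma] * n))
```

With equal per-pixel errors, the printed uncertainty halves each time the number of pixels in the bin quadruples.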

## Reproducing Lee+12 IRIS

We would like to compare our derivation of the DGR with the Lee et al. (2012) DGR. We follow similar steps to the paper, using the same data. Figure 1 shows two plots of the fitted $A_V$ vs. N(HI) relationship. The first plot is a scatter plot of the pixels used for the MLE, i.e. the unmasked pixels from the residual masking. The second plot is a contour plot of all pixels in the map. Each plot shows three fits: 1) the MLE fit (MLE fit), fit only to the unmasked pixels shown in the first plot, 2) the least squares fit to all the data in the plot (poly scatter fit), and 3) the least squares fit to the median data points in the plot (poly median fit).

The least squares fit to the pixels retained by the masking (poly scatter fit in the first plot) should be the same as the MLE fit. Indeed this is what we find.
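The agreement is expected if the likelihood is Gaussian in the $A_V$ errors, since maximizing it is then the same as minimizing the weighted sum of squared residuals. The sketch below illustrates this with synthetic data. The grid ranges, noise level, and input slope and intercept are made-up values, and the grid search is an assumed stand-in for the MLE code actually used.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the unmasked pixels (all values are made up).
nhi = rng.uniform(4.0, 12.0, 100)        # N(HI), assumed units of 10^20 cm^-2
av_err = np.full(nhi.size, 0.1)          # 1-sigma A_V errors, mag
av = 0.11 * nhi + 0.2 + rng.normal(0.0, av_err)

# Gaussian log-likelihood evaluated on a (DGR, intercept) grid.
dgrs = np.linspace(0.05, 0.20, 201)
intercepts = np.linspace(-0.5, 1.0, 201)
model = dgrs[:, None, None] * nhi + intercepts[None, :, None]
logl = -0.5 * np.sum(((av - model) / av_err) ** 2, axis=-1)

# Grid point with the maximum likelihood.
i, j = np.unravel_index(np.argmax(logl), logl.shape)
dgr_mle, intercept_mle = dgrs[i], intercepts[j]

# Weighted least squares; np.polyfit expects weights of 1/sigma.
dgr_lsq, intercept_lsq = np.polyfit(nhi, av, 1, w=1.0 / av_err)

print(dgr_mle, intercept_mle)
print(dgr_lsq, intercept_lsq)
```

The two estimates should differ only by roughly the grid spacing.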

The least squares fit to the median values of the unmasked data should be the same as in Figure 8 of Lee et al. (2012). I attempted to use the same median bins, but it is unclear to me what those bins are. I used N(HI) bins from 6.5 to 9 $\times 10^{20}$ cm$^{-2}$, spaced 0.3 $\times 10^{20}$ cm$^{-2}$ apart. Our fit for the DGR of 0.123 to the median bins is slightly higher than the Lee+12 DGR of 0.11. I am unsure where this discrepancy arises. Is it from the specific bins used? Were certain data points excluded in calculating the median bins?
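For reference, here is a minimal sketch of a median-bin fit of this kind. It rests on several assumptions: N(HI) is in units of $10^{20}$ cm$^{-2}$, the values 6.5 to 9 in steps of 0.3 are treated as bin edges, the fit goes through the origin, and the data are synthetic with a small fraction of high-$A_V$ outliers. The helper name `median_bin_fit` is hypothetical.

```python
import numpy as np

def median_bin_fit(nhi, av, bin_edges):
    """Median A_V in each N(HI) bin, then a least-squares slope through the origin."""
    centers, medians = [], []
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        in_bin = (nhi >= lo) & (nhi < hi)
        if in_bin.any():
            centers.append(0.5 * (lo + hi))
            medians.append(np.median(av[in_bin]))
    centers, medians = np.array(centers), np.array(medians)
    # Closed-form least-squares slope for a line with zero intercept.
    dgr = np.sum(centers * medians) / np.sum(centers**2)
    return dgr, centers, medians

rng = np.random.default_rng(1)
nhi = rng.uniform(5.0, 11.0, 5000)
av = 0.11 * nhi + rng.normal(0.0, 0.1, nhi.size)
av[rng.random(nhi.size) < 0.05] += 1.0   # infrequent high-A_V pixels

bin_edges = np.arange(6.5, 9.0, 0.3)     # assumed reading of the bins in the text
dgr, centers, medians = median_bin_fit(nhi, av, bin_edges)
print(dgr)
```

Shifting the bin edges or dropping sparsely populated bins changes the medians that enter the fit, so the choice of bins is one place a small offset in the DGR could come from.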

#### Figure 1

Top: scatter plot of the pixels used for the MLE, i.e. the unmasked pixels from the residual masking. Bottom: contour plot of all pixels in the map. Each plot shows three fits: 1) the MLE fit (MLE fit), fit only to the unmasked pixels shown in the top plot, 2) the least squares fit to all the data in the plot (poly scatter fit), and 3) the least squares fit to the median data points in the plot (poly median fit). The MLE DGR differs slightly from the median fit, mostly because the MLE is allowed to have an intercept and the median fit is not.

## Comparing the MLE without an intercept to Lee+12

Our masking method and MLE calculation of the DGR should give results very similar to a least squares fit to the median values of the entire unmasked dataset. This is because both methods should exclude the infrequent, high-$A_V$ pixels which would bias the fit. Figure 2 shows the same plots as Figure 1, except that the MLE does not fit for an intercept, only for the DGR.

We find that the MLE fit is completely consistent with the median bin fit, as demonstrated in the poly median fit to the entire, unmasked dataset. This is promising.

#### Figure 2

Same as Figure 1, except MLE fit has an intercept of 0 mag.

## Taurus

#### Figure 3

Taurus spectra. The median spectrum is shown with the model fit to the HI in purple, the fitted components in dashed black, and the HI velocity range used as the gray shaded region. The standard deviation of the HI spectrum is also shown. The velocity range is centered on the tallest fitted component.

#### Figure 4

Same as Figure 1, except for Taurus. The MLE and polynomial fits to the data are most influenced by the lower-$A_V$ data points with the smallest errors. The MLE fit agrees somewhat well with the poly median fit to the entire data set (bottom plot).

## California

#### Figure 5

Same as Figure 3, except for California. The selected HI width excludes much of the HI emission in the region.

#### Figure 6

Same as Figure 1, except for California. The MLE and polynomial fits to the data are most influenced by the lower-$A_V$ data points with the smallest errors. The MLE fit strongly disagrees with the poly median fit (bottom plot). This can likely be explained by the poor linear relationship between N(HI) and $A_V$ given the velocity range we used.

We can see that for California the relationship between $A_V$ and N(HI) is not very linear for diffuse pixels.

Perhaps we should explore selecting a larger HI width for California. We could adopt the method of Planck (2011), which selected the HI range based on the standard deviation of the HI spectrum in the region. For California this could mean selecting -20 to 15 km/s. More updates to come.
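To illustrate how much the chosen velocity range matters, here is a minimal sketch that integrates a synthetic HI cube over a narrow and a wide range. It assumes optically thin emission, for which N(HI) = $1.823 \times 10^{18} \int T_B \, dv$ cm$^{-2}$ with $T_B$ in K and $v$ in km/s. The cube, the component parameters, and the narrow range of -5 to 5 km/s are made up; only the -20 to 15 km/s range comes from the text above.

```python
import numpy as np

def column_density(cube, velocities, v_min, v_max):
    """N(HI) in units of 10^20 cm^-2 from a cube in K with axes (velocity, y, x).

    Assumes optically thin emission and uniformly spaced channels in km/s.
    """
    in_range = (velocities >= v_min) & (velocities <= v_max)
    dv = np.abs(velocities[1] - velocities[0])
    return 1.823e18 * cube[in_range].sum(axis=0) * dv / 1e20

def gaussian(v, amp, center, sigma):
    return amp * np.exp(-0.5 * ((v - center) / sigma) ** 2)

# Synthetic spectrum: a narrow "cloud" plus a broad unrelated component.
velocities = np.linspace(-50.0, 50.0, 101)            # km/s, 1 km/s channels
spectrum = gaussian(velocities, 20.0, 0.0, 3.0) + gaussian(velocities, 10.0, -8.0, 8.0)
cube = np.tile(spectrum[:, None, None], (1, 2, 2))    # same spectrum in a 2x2 map

nhi_narrow = column_density(cube, velocities, -5.0, 5.0)
nhi_wide = column_density(cube, velocities, -20.0, 15.0)
print(nhi_narrow[0, 0], nhi_wide[0, 0])
```

The wider range picks up the broad component that the narrow range mostly misses, which is the kind of emission a narrow HI width would exclude.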

## Background HI

For California we see that an enormous DGR and negative intercept are favored. This means two things: first, there is a lot of dust along the line of sight, likely much of it unassociated with California; second, the N(HI) created using a narrow HI width does not correlate well with the dust, hence the large negative intercept.

To account for this background we can try two things: fit a separate DGR for the unassociated HI, or remove an HI background.

### Removing an HI background

I ran the experiment of subtracting the fitted components of the California median spectrum (Figure 5) from the HI cube. I excluded the fitted component used to calculate the HI width, as this is our presumed cloud of interest, and subtracted the remaining components from every line of sight.
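The subtraction can be sketched as follows, assuming the fitted components are Gaussians described by (amplitude, center, width) and the cube has axes (velocity, y, x). The function names, the component values, and the choice of which component is the cloud are all hypothetical.

```python
import numpy as np

def gaussian(v, amp, center, sigma):
    return amp * np.exp(-0.5 * ((v - center) / sigma) ** 2)

def subtract_background(cube, velocities, components, cloud_index):
    """Subtract every fitted component except the cloud's from each line of sight."""
    background = np.zeros_like(velocities, dtype=float)
    for i, (amp, center, sigma) in enumerate(components):
        if i != cloud_index:
            background += gaussian(velocities, amp, center, sigma)
    # Broadcast the 1D background spectrum over the two spatial axes.
    return cube - background[:, None, None]

velocities = np.linspace(-50.0, 50.0, 101)   # km/s
# Made-up components: (amplitude in K, center in km/s, width in km/s).
components = [(20.0, 0.0, 3.0), (10.0, -8.0, 8.0), (5.0, 12.0, 5.0)]

# Synthetic cube where every line of sight equals the sum of all components.
total = sum(gaussian(velocities, *c) for c in components)
cube = np.tile(total[:, None, None], (1, 3, 3))

cloud_only = subtract_background(cube, velocities, components, cloud_index=0)
```

Because the same median-spectrum background is removed from every pixel, lines of sight with less background emission than the median can end up with negative brightness temperatures in a real cube.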

#### Figure 7

Same as Figure 6, except with an HI background subtraction. For ease of comparison the bottom panel of Figure 6 is shown at the bottom right. The component subtraction changed the fitted intercept by 1 mag, but did not change much of the structure in the N(HI) / $A_V$ distribution.

### Fitting for a separate component along the line of sight

We could also fit for multiple clouds along the line of sight as done in Martin et al. (2015) and Planck (2011). This would allow us to associate the excess of dust emission with HI not associated with California. This could help resolve the high DGR issue.

Our model would be represented as

$$A_V = {\rm DGR}_C \, N({\rm HI})_C + {\rm DGR}_B \, N({\rm HI})_B,$$

where the B subscript represents the background and the C subscript represents the cloud.
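A minimal sketch of such a fit, assuming the model is a sum of two linear terms, one DGR times the cloud N(HI) plus a second DGR times the background N(HI), with no intercept. It uses ordinary linear least squares on synthetic data. The function name, the input DGRs, and the noise level are illustrative assumptions, and this is not the method of Martin et al. (2015) or Planck (2011).

```python
import numpy as np

def fit_two_components(nhi_cloud, nhi_background, av):
    """Least-squares DGRs for A_V = DGR_C * N(HI)_C + DGR_B * N(HI)_B."""
    design = np.column_stack([nhi_cloud, nhi_background])
    (dgr_c, dgr_b), *_ = np.linalg.lstsq(design, av, rcond=None)
    return dgr_c, dgr_b

rng = np.random.default_rng(2)
nhi_cloud = rng.uniform(2.0, 10.0, 500)        # assumed units of 10^20 cm^-2
nhi_background = rng.uniform(5.0, 15.0, 500)
av = 0.10 * nhi_cloud + 0.05 * nhi_background + rng.normal(0.0, 0.05, 500)

dgr_c, dgr_b = fit_two_components(nhi_cloud, nhi_background, av)
print(dgr_c, dgr_b)
```

The fit only separates the two DGRs well when the cloud and background column densities are not strongly correlated across the map.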