Posts From Category: hidden

Hospital Readmission Description

Creating a Database

The local page will be at http://localhost:4000/hidden/2016/02/07/hospital-readmission

Setting up the server

After installing PostgreSQL, run

service postgresql start

sudo su - postgres

psql

select * from pg_shadow

gives us:

postgres=# select * from pg_shadow;
 usename  | usesysid | usecreatedb | usesuper | usecatupd | userepl | passwd | valuntil | useconfig
----------+----------+-------------+----------+-----------+---------+--------+----------+-----------
 postgres |       10 | t           | t        | t         | t       |        |          |
 ezbc     |    16384 | f           | f        | f         | f       |        |          |
(2 rows)

Change permissions for ezbc:

ALTER USER ezbc CREATEDB SUPERUSER;

Initialize a database cluster with

initdb -D data

Then run psql

psql postgres

Then allow the postgresql lock file to be written in /var/run with

sudo chmod a+w /var/run/postgresql

Start the database server with

postgres -D data

From here, follow these instructions to change database permissions for your user.

Then run the phoenix server:

mix phoenix.server

Create the database

Create an ecto user in postgres

createuser ecto --pwprompt

Create the database

createdb -Oecto -Eutf8 test_DB

Make sure you can login to your postgres database by running the following command:

psql test_DB --user ecto --password

Generate HTML scaffolding for a test resource with

mix phoenix.gen.html UserTest usertest name:string email:string bio:string number_of_pets:integer

Migrating the data

Loading a CSV into the database as a table.

Database links

http://www.phoenixframework.org/v0.13.1/docs/ecto-models

http://www.phoenixframework.org/docs/mix-tasks

Drop a database

psql postgres

DROP DATABASE hospital_readmission_server_dev;

Remove schema from priv/repo/migrations/

Create DB

mix ecto.create

The following command runs any pending migrations. Migrations are table structure changes, each described by a schema; the database keeps a record of every previously applied migration, so when migrating, Phoenix checks which migrations have already been performed.

mix ecto.migrate
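
Ecto tracks these in a schema_migrations table in the database; you can see which migrations have already been applied from psql with

select * from schema_migrations;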

Create a new migration.

mix ecto.gen.migration add_hospitals_table

Created file priv/repo/migrations/20160306220657_add_hospitals_table.exs

Now specify the columns in our new table, hospitals. Let's add just a few columns of different data types to get the idea.

defmodule HospitalReadmissionServer.Repo.Migrations.AddHospitalsTable do
  use Ecto.Migration

  def change do
    create table(:hospitals) do
      add :hospital_name, :string
      add :provider_number, :integer
      add :start_date, :datetime
      add :end_date, :datetime
      timestamps
    end
  end
end

Apply the schema change to the hospitals table by running

mix ecto.migrate

To view the table changes

psql hospital_readmission_server_dev

\d will display the tables in the database. \d hospitals will display the hospitals table columns:

     Column      |            Type             |                        Modifiers
-----------------+-----------------------------+---------------------------------------------------------
 id              | integer                     | not null default nextval('hospitals_id_seq'::regclass)
 hospital_name   | character varying(255)      |
 provider_number | integer                     |
 start_date      | date                        |
 end_date        | date                        |
 inserted_at     | timestamp without time zone | not null
 updated_at      | timestamp without time zone | not null

Phoenix creates the hospitals_id_seq sequence automatically. Running \d hospitals_id_seq gives us

      Sequence "public.hospitals_id_seq"
    Column     |  Type   |        Value        
---------------+---------+---------------------
 sequence_name | name    | hospitals_id_seq
 last_value    | bigint  | 1
 start_value   | bigint  | 1
 increment_by  | bigint  | 1
 max_value     | bigint  | 9223372036854775807
 min_value     | bigint  | 1
 cache_value   | bigint  | 1
 log_cnt       | bigint  | 0
 is_cycled     | boolean | f
 is_called     | boolean | f
Owned by: public.hospitals.id

which supplies the auto-incrementing id values for the hospitals table.

Copying the table

Create a migration to copy the CSV data into the hospitals database table.

mix ecto.gen.migration populate_hospitals
* creating priv/repo/migrations
* creating priv/repo/migrations/20160306223423_populate_hospitals.exs

Now we will have Ecto execute a SQL statement which copies the CSV into the hospitals table. Ecto needs to know the absolute location of the CSV file. Define the data location with

defmodule HospitalReadmissionServer.Repo.Migrations.PopulateHospitals do
  use Ecto.Migration

  def change do
    # Path.absname resolves relative to the current working directory;
    # use a double-quoted string (binary) rather than a charlist.
    abs_path = Path.absname "priv/repo/data_files/Hospital_Readmissions_Reduction_Program.csv"
    IO.puts abs_path

    execute """
      COPY hospitals(
        hospital_name,
        provider_number,
        start_date,
        end_date)
      FROM '#{abs_path}'
      WITH csv header;
    """
  end
end

in priv/repo/migrations/20160306223423_populate_hospitals.exs. Note that COPY ... FROM runs inside the postgres server process, so the server must be able to read the CSV at that path.

After the migration below, you can inspect the loaded rows by connecting with

psql hospital_readmission_server_dev

Perform the migration:

mix ecto.migrate

Now, to get the page up and running, adjust the schema for the hospitals table in web/models/hospital.ex

defmodule HospitalReadmissionServer.Hospital do
  use HospitalReadmissionServer.Web, :model

  schema "hospitals" do
    field :hospital_name, :string
    field :provider_number, :integer
    field :start_date, Ecto.DateTime
    field :end_date, Ecto.DateTime

    timestamps
  end

  @required_fields ~w(hospital_name)
  @optional_fields ~w(provider_number start_date end_date)

  @doc """
  Creates a changeset based on the `model` and `params`.

  If no params are provided, an invalid changeset is returned
  with no validation performed.
  """
  def changeset(model, params \\ :empty) do
    model
    |> cast(params, @required_fields, @optional_fields)
  end
end


Magellanic Modeling Ideas

Stellar Streams

Origin of Horizontal Branch Stars

Belokurov+15 found stellar streams, outlined in the post below, consisting of blue horizontal branch (BHB) stars extending nearly 50 kpc away from the LMC. A couple of origins for these streams could be

  • tidal dwarfs separate from the MCs

  • stellar streams stripped from the SMC

    • The composition and kinematics of these stars should be compared to those of the SMC in order to address if these HB stellar streams are tidal remnants of the SMC, or from another origin.

    • Check with Stephen whether it was possible for the SMC to have passed through the location of the stellar stream. Gurtina Besla’s orbits of the SMC should suffice to answer this question.

  • the gaseous stream formed stellar components

    • the locations of the stellar streams may be associated with the gas. Compare the gas kinematics and locations to the stellar components.

      • Stack HI data spatially and in velocity with respect to the stellar stream. See work by Nigra et al. (2012) for stacking techniques. Also look at the THINGS papers.

      • Smooth GASS HI to resolution of stellar streams to pull out diffuse emission.

Ionized H Gas

Another interesting question is whether an ionized gas component coincides with the stellar component. We should estimate the expected emission measure from the BHB stars to see what their ionizing effect on the surrounding cold gas would be.

Brianna Smart may have an idea of whether or not the WHAM observations are suitably reduced to associate WHAM with these structures.

Cold HI Gas

How far from the MCs do we expect cold gas to form? We expect there to be at least some cold gas, because Blair Savage has found H$_2$ absorption in the MS, which also means there must be some dust in the stream. Cold gas may be present offset from the galaxies under either scenario:

  • A tidal interaction or ram pressure pulls diffuse gas away from the clouds, then the diffuse gas cools into cold gas.

  • A tidal interaction or ram pressure directly pulls cold gas from the galaxies.

By decomposing the stream we could try to understand how the cold gas is distributed within the stream.

Future Direction

Whatever the project direction, the null result must be scientifically interesting.


Tracing Magellanic Stellar Stream Gas

Stellar Streams

Recently Belokurov et al. (2015) used blue horizontal branch stars (BHBs) identified in the Dark Energy Survey (DES) as distance tracers. They found four stellar filaments whose distances are consistent with being stellar streams from the LMC, one of which is unambiguously connected to the LMC. Only one stream overlaps with the Magellanic Stream (MS). See Figure 1 for a map of the four streams at different distances.

The stellar streams could potentially be associated with the gaseous stream in a scenario where the Milky Way halo ram-pressure strips the gas from the two galaxies.


Figure 1

Density of the BHB stars for different distances. Positions of the DES Year 1 satellites (red triangles) are shown as well as the locations and the designations of the newly detected structures. The gray outline represents the DES footprint. Arrows show the proper motion vectors of the LMC and the SMC, as measured by Kallivayalil et al. (2013) and corrected for the Solar reflex. Blue contours are the HI density of the Magellanic Stream as reported by Nidever et al. (2008).

A nearly vertical spray of BHB stars originating in the LMC is clearly visible. As confirmed by the density distribution, the two most prominent overdensities are i) a hook-like structure at X ∼ 0°, Y ∼ 20° and ii) the S4 cloud of stars at X ∼ 35° on the sky; the S4 cloud appears to overlap several newly discovered DES satellites (red triangles). Only S3 overlaps with the Magellanic Stream.


HI Streams

Hammer et al. (2015) examine the structure of the HI in the Magellanic system to argue that the Milky Way halo is stripping the Magellanic Stream gas via ram pressure.

H+15 decomposed the Galactic All-Sky Survey (GASS) HI data in a similar way to Nidever et al. (2008), fitting Gaussian components to each line of sight. The GASS data have three times the resolution of the LAB data used by Nidever et al. (2008). With these higher-resolution data, they are able to identify several features in the Magellanic Stream (MS), such as a bow shock, which allow for a more complete understanding of the MCs’ origin.

Bow Shock

H+15 disentangle the components of the MS to find many high-velocity clouds (HVCs) along a line perpendicular to the MS. This is consistent with a bow shock of colliding, heated gas.

Turbulence in the Stream

H+15 assess the turbulent shape of the stream to estimate the density of the surrounding medium. For flows with strong enough turbulence, as estimated by the Reynolds number, the fluid will form vortices. They conclude the sizes of the vortices along the stream are consistent with the hot halo gas density increasing along the stream.

The Leading Arm

The Leading Arm (LA) may be gas deposited by a MW dwarf Spheroidal (dSph) whose gas was rapidly stripped from the stars, leaving no associated stellar component.

Connecting the Scenario

H+15 derive a scenario where the MCs were on nearly parallel orbits. The MW halo begins to strip the gas away, giving rise to gaseous tails. These tails become turbulent within the hot halo, leading to vortices. The MCs collide, leading to a huge increase in gas density, which then produces a bow shock, consistent with the large number of HVCs present in the middle of the stream. The leading arm is a relic of another stripped system. See Figure 2 for a N(HI) map of the Magellanic system outlining this scenario.

H+15 run hydrodynamical simulations of this scenario to confirm it is consistent with observations. They find extremely good agreement between the simulated and observed N(HI) profiles of the MS.


Figure 2

Top: Figure 7 from H+15. Shows proposed scenario of the Magellanic system. Bottom: Same except for with approximate region of Figure 1 shown in black outline.



Predicting Airbnb User Travel

Competition Description

I decided to have a go at the Kaggle competition for predicting Airbnb users’ future travel.

The Data

The data description can be found here. Below is a description in my own words.

It may be beneficial later to combine the test data and the training data, then randomly create a train, test, and validation dataset from the total sample.

Training Data

train_users.csv - the training set of users

The Sample Data

The sample data consists of three segmented data sets

Test Data

test_users.csv - the test set of users

  • id: user id
  • date_account_created: the date of account creation
  • timestamp_first_active: timestamp of the first activity, note that it can be earlier than date_account_created or date_first_booking because a user can search before signing up
  • date_first_booking: date of first booking
  • gender
  • age
  • signup_method
  • signup_flow: the page a user came to sign up from
  • language: international language preference
  • affiliate_channel: what kind of paid marketing
  • affiliate_provider: where the marketing is e.g. google, craigslist, other
  • first_affiliate_tracked: the first marketing interaction the user had before signing up
  • signup_app
  • first_device_type
  • first_browser
  • country_destination: this is the target variable you are to predict

User Data

  • sessions.csv - web sessions log for users
    • user_id: to be joined with the column ‘id’ in users table
    • action
    • action_type
    • action_detail
    • device_type
    • secs_elapsed

Population Data

  • age_gender_bkts.csv - summary statistics of users’ age group, gender, country of destination

Country Data

  • countries.csv - summary statistics of destination countries in this dataset and their locations

First Examination of the Data

Predictions of the Data

We should convert the labels into dummy variables and individually predict for each country, then join the results together, along the lines of the sketch below.
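
As a rough sketch of this idea (assuming the train_users.csv columns described above; the features and classifier here are placeholders, not a tuned pipeline):

# One-vs-rest prediction over dummy-encoded destination labels.
import pandas as pd
from sklearn.linear_model import LogisticRegression

train = pd.read_csv("train_users.csv")

# Convert the labels into dummy variables: one binary column per country.
targets = pd.get_dummies(train["country_destination"])

# Placeholder feature matrix; real features need cleaning and encoding.
X = train[["age", "signup_flow"]].fillna(-1)

# Individually predict for each country, then join the results together.
probs = pd.DataFrame(index=train.index)
for country in targets.columns:
    clf = LogisticRegression(max_iter=1000)
    clf.fit(X, targets[country])
    probs[country] = clf.predict_proba(X)[:, 1]

# Pick the most probable destination per user.
predicted = probs.idxmax(axis=1)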


Tracing Magellanic Stellar Stream Gas

The Magellanic Stream

Below are points summarized by D’Onghia & Fox (2015)

  • The Stream presents a filamentary structure (Wakker 2001; Putman et al. 2003b) with at least two main filaments (Nidever et al. 2008), one with metallicity consistent with the LMC and one closer in metallicity to the SMC (Fox et al. 2013; Richter et al. 2013)

  • The Stream may be a young feature (1–2 Gyr; Besla et al. 2007)

  • There are two scenarios for the formation of the stream

    1. First Passage (Unbound): The LMC and SMC have collided with one another but have not yet completed a passage around the MW. This scenario reproduces the Bridge, but has trouble explaining the Stream’s gas content.

    2. Multiple Passage (Bound): The LMC and SMC are a strongly interacting pair, so the Stream would originate from material pulled out of the Clouds. Because the gravitational field of the LMC acts in the same manner on gas and stars, we expect a tail of stars pulled out from the SMC into the Stream, especially in the case of a head-on collision.

  • Ram-pressure-stripping

Extracting the GASS Cube

We would like to extract the relevant areas where stellar streams have been detected in Belokurov et al. (2015).


Figure 1

Stellar density in DES1 footprint. Coordinates are galactic offset from the LMC. The streams are detected to the northwest of the LMC + SMC. See Figure 2.



Figure 2

Stellar density in DES1 footprint. Coordinates are galactic offset from the LMC. The streams are detected


The LMC is at l, b = (277$^\circ$, -35$^\circ$). We need an HI cube extending from 70 deg west of the LMC to 10 deg east, and from 20 deg south to 40 deg north, i.e., about 80 deg in longitude by 60 deg in latitude. Allowing for some margin, this region is covered by a cube centered at l, b = (303$^\circ$, -25$^\circ$), 150 deg wide, and 100 deg tall.

However, the cube downloads are limited to 25 deg by 25 deg, so as a compromise I will center the cube at l, b = (280$^\circ$, -45$^\circ$) with 25$^\circ$ sides.

To encompass all HI structure in the stream we would need velocities between -400 and +400 km/s, nearly the full range of the GASS cube. However, we are interested in associating gas near the two Clouds, at the head of the stream, so to limit the size of the data cube we will include velocities between 0 and 400 km/s.

Further cubes with 25$^\circ$ by 25$^\circ$ sides will be centered at

 l [deg] | b [deg]
---------+---------
   280   |   -70
   305   |   -45
   305   |   -70

The average beam size of the data is 14.3$^\prime$ = 0.238333$^\circ$, which I set as the pixel size for the cubes.


Figure 3

N(HI) of the SMC + LMC from Murray et al. (2015). N(HI) map from Putman et al. (2003).


Structure of the MS

Putman 2003

Putman et al. (2003) identify the relevant HI velocities observed with Parkes in the HIPASS survey.

They identify a variety of prominent features, including:

  • bifurcation along the main Stream filament

  • dense, isolated clouds that follow the entire length of the Stream

  • head-tail structures

  • a complex filamentary web at the head of the Stream where gas is being freshly stripped away from the Small Magellanic Cloud and the Bridge

  • The concentration of gas at MS IV looks like a bow shock, suggesting that interaction with halo gas may be responsible for the appearance of the Stream in this region.

  • The head appears to emanate from the northern side of the Magellanic Bridge and SMC at velocities between $v_{\rm LSR}$ = 90 and 240 km/s.

Assuming a distance of 55 kpc to the entire stream, the head of the stream dominates the mass. The average N(HI) for the stream is around $10^{19}$ cm$^{-2}$ at 15$^\prime$ resolution.

They show the MS features in a cohesive diagram below.


Figure 4

N(HI) map from Putman et al. (2003).


Details About GASS Cube

The data website can be found here. Cubes can be downloaded here.

Decomposing the GASS Cube

We can use AGD to decompose each spectrum of the cube. To train the smoothing parameter I created 100 synthetic spectra with the RMS of the data and Gaussian velocity widths corresponding to random kinetic temperatures between 30 and 9,000 K. The amplitudes ranged from 5 × RMS to 25 × RMS, and each spectrum had up to 4 components.
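
For concreteness, here is a minimal sketch of that setup, assuming a purely thermal line width $\sigma = \sqrt{k T_k / m_H} \approx 0.09\sqrt{T_k}$ km/s for HI; the velocity grid, RMS value, and random seed are placeholders:

# Generate synthetic HI spectra for training the AGD smoothing parameter.
import numpy as np

rng = np.random.default_rng(0)
velocity = np.linspace(-200, 400, 600)   # km/s, placeholder grid
rms = 0.05                               # placeholder data RMS

def synthetic_spectrum():
    spectrum = rng.normal(0.0, rms, velocity.size)   # noise at the data RMS
    for _ in range(rng.integers(1, 5)):              # up to 4 components
        t_kin = rng.uniform(30, 9000)                # kinetic temperature [K]
        sigma = 0.09 * np.sqrt(t_kin)                # thermal width [km/s]
        amp = rng.uniform(5, 25) * rms               # 5-25 x RMS amplitude
        center = rng.uniform(velocity.min(), velocity.max())
        spectrum += amp * np.exp(-0.5 * ((velocity - center) / sigma) ** 2)
    return spectrum

spectra = [synthetic_spectrum() for _ in range(100)]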

We are now able to begin clustering components with one another. After decomposing the cube, which takes about 10 hours, we can examine the eigenvectors within the parameter space of the Gaussians. The relevant parameters across which we expect to find eigenvectors are:

  • Gaussian mean velocity
  • Gaussian velocity width
  • Gaussian amplitude
  • Gaussian x position
  • Gaussian y position

corresponding to 5 dimensions. We can transform these five parameters for each spectrum into an eigenspace, whereby we can cluster components by any correlation among the 5 parameters. This will be done in two steps: first decomposing the Gaussian parameters into their principal components, then clustering with k-means.
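
A minimal sketch of these two steps, assuming the fit results have been stacked into an (n, 5) array ordered as [mean velocity, width, amplitude, x, y]; the file name and cluster count are placeholders:

# PCA followed by k-means on the Gaussian component parameters.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

params = np.load("gauss_params.npy")          # shape (n_components, 5)

# Standardize so no single parameter dominates the eigenspace.
scaled = StandardScaler().fit_transform(params)

# Step 1: decompose the parameters into principal components.
eigen = PCA(n_components=5).fit_transform(scaled)

# Step 2: cluster the components in the eigenspace with k-means.
labels = KMeans(n_clusters=10, n_init=10).fit_predict(eigen)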

Principal Component Analysis

K-means

Relevant Sources

http://adsabs.harvard.edu/abs/2015MNRAS.453..338W

http://adsabs.harvard.edu/abs/2011MNRAS.418.1575P

http://adsabs.harvard.edu/abs/2008MNRAS.388L..29W

http://adsabs.harvard.edu/abs/2003ApJ...586..170P


Modeling Shield Galaxies

Choosing a model

There seem to be many models for HI data cubes of rotating galaxies. Here is a non-exhaustive list:

A significant amount of overhead will likely be necessary to prepare the input for each model and parse the output of the modeling software, so we should first decide which modeling software is best suited to the analysis.

Preparing the data

The maximum-likelihood estimation (MLE) technique requires independent observations in order to derive meaningful uncertainties on the model fits. For an HI data cube, this means each pixel should sample an independent beam (one pixel per beam), as sketched below.
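
One simple way to arrange this is to block-average the map so the pixel size matches the beam. A minimal sketch, where the rebinning factor (beam size over original pixel size) is a placeholder:

# Rebin a 2D map so that each output pixel spans roughly one beam.
import numpy as np

def rebin_to_beam(image, factor):
    """Block-average by factor = beam size / original pixel size."""
    ny, nx = image.shape[0] // factor, image.shape[1] // factor
    trimmed = image[: ny * factor, : nx * factor]
    return trimmed.reshape(ny, factor, nx, factor).mean(axis=(1, 3))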

The MLE technique also requires that the uncertainty of each observation be accurate, so the HI data cube needs uncertainty estimates for each pixel. Potential factors contributing to the uncertainty of an independent pixel include:

  • Flux calibration

  • System temperature variation


Table Proposal

In this post I outline the different possibilities for creating a table for a rapidly growing child, Violet.

What is the intended use of the table?

A custom table may be an expensive investment for a child who will outgrow it in a short time. The table may best be appreciated by all parties involved if, after the child’s use, it becomes a coffee table.

How tall should the table be?

We should make a compromise between the height of a coffee table and the height of a growing child. Reputable sources show female children grow quickly, reaching a height of three feet by the age of three years. Other sources describe coffee tables as being 16-18 inches tall. Indeed, this is the height of the coffee table in my home. I propose a height of 16 inches.

How wide and long should the table be?

Based on extensive research, toddlers play with small toys on tables. The table might accommodate such activities were it 30 inches long by 18 inches wide.

Accessories to the table

Previous conversations included discussion of a shelf under the table. A shelf of course provides additional storage space, but it may also hinder sitting at the table. If a shelf is to be installed, I suggest placing it near the top, only a few inches below the tabletop. Here is a rough example.

Style of the table

I can accommodate several styles of the table.

As an example, here is a photo of a previous coffee table I constructed out of cherry and maple. The style of the table is reminiscent of the Shaker style.

Type of wood

I recommend using maple or cherry, a combination of maple and cherry, or red or white oak. Below are examples of each.

Cost

Depending on the materials used and the style selected, the table will likely cost around $200 to $500. The most expensive to least expensive materials are cherry, maple, white oak, and red oak. The material cost will scale linearly with the size of the table. I charge $20 per hour.

Next steps

Next we should agree on the dimensions, whether or not to include a shelf, the style, and the type of wood for the table. Please comment below or email me. I can also make all of these decisions if this is an overload of choices.


Thesis Ideas

Applying Wavelets to the CHILES Dataset

The main goals of applying wavelet decomposition to the CHILES dataset are

  1. Denoise the HI cubes

  2. Compress the HI cubes to smaller data volumes

  3. Identify coherent structures in position and velocity

What is Wavelet Decomposition?

Wavelet decomposition is similar to Fourier decomposition, the main difference being that a wavelet basis function has finite width, while the sinusoidal basis functions of a Fourier decomposition do not. A more detailed outline of wavelets is given on Wikipedia. The variable width of the wavelet basis functions allows a wavelet decomposition to change resolution at different frequencies. This means that wavelets can reconstruct a function which has both smoothly varying and sharply varying features.

Noise Reduction

An excellent example of a wavelet decomposition’s ability to smooth out noise while still recovering both large-scale and small-scale features is by Robert Nowak at UW-Madison. Wavelet decomposition is also able to adapt to spatially varying noise (Martens & Lobanov 2014). Wavelet decomposition may be a useful tool for extracting the lowest signal-to-noise sources in CHILES. See Figure 3 of Miville-Deschenes et al. (2003) for an example of a denoised HI interferometric image of a molecular cloud.
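
As an illustration, here is a minimal wavelet-denoising sketch using PyWavelets with the standard universal (VisuShrink) threshold; the wavelet family, level, and thresholding rule are choices of this sketch, not of the studies above:

# Wavelet denoising via soft-thresholding of the detail coefficients.
import numpy as np
import pywt

def denoise(image, wavelet="db4", level=3):
    coeffs = pywt.wavedec2(image, wavelet, level=level)
    # Estimate the noise sigma from the finest diagonal detail coefficients.
    sigma = np.median(np.abs(coeffs[-1][-1])) / 0.6745
    threshold = sigma * np.sqrt(2.0 * np.log(image.size))
    # Soft-threshold the detail coefficients; keep the approximation as-is.
    denoised = [coeffs[0]] + [
        tuple(pywt.threshold(d, threshold, mode="soft") for d in details)
        for details in coeffs[1:]
    ]
    return pywt.waverec2(denoised, wavelet)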

Source Extraction

Martens & Lobanov (2014) developed an automated tool that uses wavelet decomposition to measure source structure. They applied their method to a radio jet, where they were able to associate structure across different velocities.

A tool like this could be useful for identifying diffuse HI around galaxies, to answer how galaxies obtain their gas as a function of redshift. Structure identification in HI cubes is particularly useful for identifying cold-mode accretion.

Compressing CHILES Data

One of the proposed data products for CHILES is HI cubes.

Assuming

  • 31.2 kHz channel width

  • 480 MHz bandwidth

  • Imaged primary beam of about 35$^\prime$ for L-band, with the B-array beam Nyquist-sampled, leading to a single-channel image size of 2,100 × 2,100 pixels.

  • 32-bit floating point values

The entire HI cube integrated over the full observation time would be 271 GB of data. Time series analysis would lead to much larger data volumes.
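
For reference, this follows directly from the assumptions above: 480 MHz / 31.2 kHz ≈ 15,385 channels, and 2,100 × 2,100 pixels × 15,385 channels × 4 bytes ≈ 2.7 × 10$^{11}$ bytes ≈ 271 GB.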

Wavelet decomposition can be used for compressing large volumes of data. A recent study showed how wavelet decomposition can be used for efficient compression of 3D data. This study also described how individual slices of the 3D data could be retrieved quickly.

Compressing the HI cubes could allow users to efficiently query individual slices of the HI cube. The user would simply query the region of sky and frequencies needed. The decomposed basis functions would be reconstructed into the queried data.
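
A minimal sketch of this compress-then-query idea with PyWavelets; the cube, wavelet, and retention level are placeholders, and a production scheme would store the thresholded coefficients sparsely and reconstruct only the requested region:

# Lossy wavelet compression of a (velocity, y, x) cube, then channel retrieval.
import numpy as np
import pywt

cube = np.random.rand(64, 128, 128)            # placeholder HI cube

coeffs = pywt.wavedecn(cube, "db2", level=2)
arr, slices = pywt.coeffs_to_array(coeffs)

# Keep only the largest 5% of coefficients; the zeros compress well on disk.
cutoff = np.percentile(np.abs(arr), 95.0)
arr[np.abs(arr) < cutoff] = 0.0

# Reconstruct from the retained coefficients and query a single channel.
restored = pywt.waverecn(pywt.array_to_coeffs(arr, slices), "db2")
channel = restored[10]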

Querying compressed data may expand the ability to do time series analysis.

Proposed Research

Here is a rough outline of what I could do with wavelet decomposition and the CHILES dataset:

  1. Denoise time-integrated CHILES dataset with wavelet decomposition.

  2. Compress HI cube. Write tool for reconstructing various parts of the HI cube from the wavelet bases.

  3. Explore source extraction to detect diffuse structure around galaxies in search of cold mode accretion.

  4. If successful with previous steps, denoise and compress the cube for different timescales.
