Difference between revisions of "SSW2022 Activities"

From CoolWiki
Revision as of 23:38, 5 October 2022

Introduction

The Sagan Summer Workshop (SSW) is held annually and is meant to be a week-long summer "school" for early career astronomers (advanced undergraduates and graduate students/postdocs). The workshops traditionally have a substantial hands-on component, and each year has a different theme. In 2022, the theme was Exoplanet Science in the Gaia Era. Several of the hands-on components from earlier in the week can be done using IRSA tools, so this is what we have reproduced here. See the SSW website for recordings of the talks that led into these hands-on sessions, as well as detailed instructions as to how to do these exercises using the Google Colab Notebooks provided by the workshop team.

Monday Afternoon, Part 1

  1. Query the Gaia Catalog of Nearby Stars (GCNS) for all stars within 20 pc. - as of the time I am writing this, the GCNS isn't available at IRSA, so you have to either go directly to the ESA Gaia Archive to get it, or get it from VizieR, like at this ftp site. The GCNS is large, but you can get it in csv format, which IRSA tools understand. To make this process easier, though, here is a truncated csv version of this catalog.
    1. Download and uncompress that copy, or download your own copy from VizieR.
    2. Load that csv catalog into IRSA Viewer by clicking on the catalogs tab, then "Load Catalog File."
    3. Filter down the catalog to only have the stars within 20 pc -- turn on filters and restrict the file to have only Plx>50. You should be left with ~2600 stars.
  2. Make an observed Gaia color-magnitude diagram for this sample.
    1. Use IRSA Viewer to make a plot of G vs. B-R -- the columns in the zipped csv file above are x=BPmag-RPmag and y=Gmag. Don't forget to reverse the y axis! Pin the plot so that you can compare it to other plots.
  3. Make an absolute Gaia color-magnitude diagram for this sample. - that is, correct for distance because you have the parallax!
    1. Use IRSA Viewer to make a new plot of absolute G vs. B-R. Hint: Gmag- (5*log10(1000/Plx) - 5).
    2. Pin it so that you can compare the first plot with this one.
    3. What are some things that are the same and different between these two plots? Why?
    4. What happens to the outliers if you plot absolute G vs. G-R instead of G vs. B-R? (Why?)
  4. Make an absolute SDSS color-magnitude diagram for this sample. - the SDSS photometry is included in the GCNS catalog, and you still have the parallax, of course.
    1. The SDSS filters are the columns "gmag", "rmag", "imag", and "zmag". Try g vs. g-i. You may have to cope with some outliers either by filtering the catalog or changing the limits of the plot.
    2. Why does the SDSS CMD look like it does? Why is it worse or better than the Gaia CMD?
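
The distance cut and the absolute-magnitude hint in the steps above can be sketched in a few lines of Python. This is only a minimal illustration, not the workshop's notebook code: the function names and the sample rows are invented for the example, and only the column meanings (Plx in mas, Gmag) come from the GCNS file.

```python
import math

def distance_pc(plx_mas):
    """Distance in parsecs from a parallax in milliarcseconds (simple inversion)."""
    if plx_mas <= 0:
        raise ValueError("parallax must be positive to invert it")
    return 1000.0 / plx_mas

def absolute_mag(app_mag, plx_mas):
    """Absolute magnitude, following the hint: m - (5*log10(1000/Plx) - 5)."""
    return app_mag - (5.0 * math.log10(distance_pc(plx_mas)) - 5.0)

# Hypothetical (Gmag, Plx in mas) rows, invented for illustration only.
rows = [(10.0, 100.0), (12.0, 50.0), (9.0, 20.0)]

# Plx > 50 mas is the same cut as distance < 20 pc.
nearby = [(g, plx) for g, plx in rows if plx > 50]
abs_mags = [absolute_mag(g, plx) for g, plx in nearby]
```

The same `absolute_mag` helper applies unchanged to the SDSS columns (gmag, rmag, imag, zmag), since the distance correction does not depend on the filter.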

Monday Afternoon, Part 2

  1. Query the Exoplanet Archive
    1. Go to the Exoplanet Archive, and find the Planetary Systems Composite Data Table.
    2. Go up to the upper left of the screen and click on "select columns." "Clear all" then select the columns corresponding to the columns in the SSW example -- see table below. Then click "update" and close the pop-up window.
    3. "Download table" to, well, download the table. Save it as an IPAC table file ("IPAC format") to make things easier for the next step.
| name in Google Colab | name in Exoplanet Archive UI |
| --- | --- |
| pl_name | Names / Planet Name |
| hostname | Names / Host Name |
| ra | System Data / Position / RA (pick the one in degrees, not sexagesimal) |
| dec | System Data / Position / Dec (pick the one in degrees, not sexagesimal) |
| sy_gaiamag | System Data / Photometry / Gaia Magnitude |
| st_teff | Stellar data / Stellar Effective Temperature |
| st_logg | Stellar data / Stellar Surface Gravity |
| st_met | Stellar data / Stellar Metallicity (in dex) |
| st_lum | Stellar data / Stellar Luminosity |
| st_rad | Stellar data / Stellar Radius |
| st_age | Stellar data / Stellar Age |

| Some critical columns in Gaia DR3 | definition |
| --- | --- |
| phot_g_mean_mag | G mag |
| phot_bp_mean_mag | Bp mag |
| phot_rp_mean_mag | Rp mag |
| parallax | Parallax in mas |
| distance_gspphot | Distance in pc |
| ebpminrp_gspphot | E(B-R), i.e., reddening in Bp-Rp |
| ag_gspphot | A_G, i.e., extinction in G |

  1. Crossmatch the Exoplanet Archive and Gaia DR3
    1. Load the IRSA Catalog Search Tool
    2. Select Gaia.
    3. Select Gaia Source Catalogue (DR3).
    4. Select "Multi-Object Search."
    5. Click on "Browse" to upload the IPAC Table file you just downloaded from the Exoplanet Archive.
    6. To match what they are doing in the Colab notebook most closely, leave "one-to-one matching" turned off, and give it a search radius of 1 arcsec.
  2. The Colab notebook at this point asks: "We now have a table with all the Gaia DR3 data for the host stars from the Exoplanet Archive. Not all host stars have a match in Gaia DR3 and how do we know the matches found are correct? Are all host stars uniquely matched to a Gaia source? Come up with a basic check of the cross-matches. Think about plots you could make to spot any matches that might be dubious. Identify exoplanets matched to more than one Gaia DR3 source, and/or bad matches. Find a way to filter these out."
    1. For us, we can bypass much of this, but let's try to explore the spirit of these questions. One of the columns returned by the Catalog Search Tool is called dist_x, which is the distance between the requested position and the match. You can explore its distribution in any number of ways, but the way that gives the easiest-to-interpret results may seem a little clunky.
      1. Click on the diskette icon on the data table in the Catalog Search Tool to save the cross matched results as an IPAC table file.
      2. Start a new session of IRSA Viewer and upload that IPAC table file.
      3. Once it loads, click on the gears in the plot pane.
      4. Make a histogram of the dist_x column: Add new plot, pick "histogram", enter dist_x, enter your desired parameters (or leave the defaults) and apply.
      5. Where do most of the matches fall? Does it make sense to keep matches up to 1 arcsec away, or is a smaller radius more appropriate?
    2. Let's go back a step, because we can bypass some of this. Return to your Catalog Search Tool matching, and this time, turn on "one-to-one matching" by clicking on the "One-to-one Match" checkbox, and give it a search radius of 1 arcsec. What this does is give you one line of output for each line of input, with the closest match from Gaia DR3 within the specified search radius. If there is no match, there will be a row of nulls for that input line. If there is more than one match within the radius, it will take the closest one.
      1. Click on the diskette icon on the data table in the Catalog Search Tool to save the cross matched results as an IPAC table file.
      2. Return to your prior session of IRSA Viewer and upload that new IPAC table file.
      3. Once it loads, click on the gears in the plot pane.
      4. Make a histogram of the dist_x column: Add new plot, pick "histogram", enter dist_x, enter your desired parameters (or leave the defaults) and apply.
      5. Now, you know for a fact that all the duplicate hits are gone. Are the matches with larger separations that you noticed before gone now too?
      6. What happens if you expand the search radius to, say, 3 arcsec? Are those matches likely legitimate matches?
    3. We still need to do a sanity check that the star the Exoplanet Archive thinks is the host and the Gaia DR3 source we matched to it are really the same object. We pulled the Gaia (DR2) magnitude from the Exoplanet Archive, and we pulled the Gaia DR3 magnitude now. The columns in our uploaded catalog have their original names plus "_01" appended to them. The columns retrieved from the matched catalog, in this case Gaia DR3, have their original names. In the version of the catalog we saved and uploaded to IRSA Viewer, the Gaia DR2 mag is therefore "sy_gaiamag_01", and the Gaia DR3 magnitude is therefore "phot_g_mean_mag". Make a plot that compares sy_gaiamag_01 and phot_g_mean_mag.
    4. What plot did you use? You may be tempted to try sy_gaiamag_01 vs. phot_g_mean_mag but can you find a better one?
    5. You probably have a situation where there are so many points that you only have a heatmap (binned greyscale) plot. How can you filter down the catalog so that you can identify individual objects that may be problematic? (Hint: maybe a filter something like abs("phot_g_mean_mag"-"sy_gaiamag_01") > 0.1?) Do you notice anything in common about these stars?
    6. I note that we do still have the problem where each planet is listed once, so a star with several planets is listed more than once (e.g., Kepler 108 appears once for Kepler 108b and once for 108c). These points should be plotted identically on top of each other in the plots; they just contribute to source counts. If you want to get rid of them, to first order, keep only those that have a "b" in the planet name, e.g., impose a filter "like '%b%'" on pl_name_01. This should omit all the planets that are 'c' or 'd' or 'e' ... you get the idea. It will, however, also keep any planet of any letter whose host name has a "b" in its root. (This actually helps because it weeds down the catalog enough to see individual points!)
  3. Plot the Gaia absolute CMD diagram for the exoplanet host stars, corrected for the effects of extinction. Make two plots, one using the parallaxes to calculate G and the other using Gaia DR3 distances. What might be the cause of the differences you see? Create a plot to investigate this.
    1. For this, we need to dig into the column definitions of the Gaia DR3 catalog -- see table above. You are going to want to plot on the x-axis B-R-E(B-R) to correct for the reddening, and on the y-axis, G-5*log10(d)+5-A_G, with d in pc. For the first request, you can invert the parallax to get the distance (watch your units: the parallax is in mas, so d=1000/parallax), and for the second request, use the distance provided in the catalog. Pin the plots to compare them side-by-side.
    2. There are definitely points that are different. Click on points in either plot to find the corresponding rows in the catalog. Do you notice anything about the most discrepant points? The words and the code in the Colab notebook solutions do different things, but one could explore filtering to omit negative parallaxes or those with parallax_over_error < 5.
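
The Part 2 checks and corrections above can be sketched in Python as follows. This is a hedged illustration, not the Colab solution: the helper names and the sample rows are hypothetical, and each row is assumed to be a plain dict keyed by the column names used above. The absolute magnitude follows the same convention as the Part 1 hint, Gmag - (5*log10(1000/Plx) - 5), with the extinction A_G subtracted as well.

```python
import math

def closest_matches(rows):
    """For each input planet, keep only the match with the smallest dist_x."""
    best = {}
    for row in rows:
        key = row["pl_name_01"]
        if key not in best or row["dist_x"] < best[key]["dist_x"]:
            best[key] = row
    return list(best.values())

def mag_mismatch(row, tol=0.1):
    """True when the archive and DR3 G magnitudes differ by more than tol mag."""
    return abs(row["phot_g_mean_mag"] - row["sy_gaiamag_01"]) > tol

def dereddened_color(row):
    """x-axis: (Bp - Rp) - E(Bp - Rp)."""
    return (row["phot_bp_mean_mag"] - row["phot_rp_mean_mag"]
            - row["ebpminrp_gspphot"])

def abs_g(row, use_parallax):
    """y-axis: G - 5*log10(d) + 5 - A_G, with d in pc."""
    if use_parallax:
        d_pc = 1000.0 / row["parallax"]  # parallax in mas -> distance in pc
    else:
        d_pc = row["distance_gspphot"]
    return (row["phot_g_mean_mag"] - 5.0 * math.log10(d_pc) + 5.0
            - row["ag_gspphot"])

def good_parallax(row, min_snr=5.0):
    """Drop negative parallaxes and those with parallax_over_error < min_snr."""
    return row["parallax"] > 0 and row["parallax_over_error"] >= min_snr

# Hypothetical rows, invented for illustration: two matches for one planet.
base = {
    "pl_name_01": "Demo-1 b", "sy_gaiamag_01": 10.05,
    "phot_g_mean_mag": 10.0, "phot_bp_mean_mag": 10.5,
    "phot_rp_mean_mag": 9.5, "ebpminrp_gspphot": 0.1, "ag_gspphot": 0.2,
    "parallax": 10.0, "parallax_over_error": 50.0, "distance_gspphot": 100.0,
}
sample = [dict(base, dist_x=0.8), dict(base, dist_x=0.1)]

kept = closest_matches(sample)
cmd_points = [
    (dereddened_color(r), abs_g(r, True), abs_g(r, False))
    for r in kept
    if good_parallax(r) and not mag_mismatch(r)
]
```

Plotting the second and third elements of each tuple against the first gives the two CMDs to compare; rows where the two absolute magnitudes disagree are the ones worth inspecting for poor parallaxes.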