#1
xilman (Bamboozled! · May 2003 · Down not across · 2⁹×3×7 posts)
Tom provided me with some images of the Pleiades. Pretty pictures derived from that data have been posted in another thread. Since then I've been learning how to do photometry.
First off, all the images with reasonably round stars were stacked together to increase the SNR. Then the DAOPHOT package in IRAF was used to find the stars in the DSLR green channel and perform aperture photometry --- an interesting learning exercise in its own right. This produced a list of around 20K "stars", some of which are doubtless peaks in the sky noise. Some genuine stars were missed because their images were saturated. Star coordinates are in pixels and magnitudes have an arbitrary zero offset.

A search of the NOMAD1 catalogue for RA,Dec coordinates and BVR magnitudes yielded 5429 entries when constrained to a 2-degree square with all magnitudes brighter than 17.0. IRAF's wcsctrans utility converted the catalogue's coordinates to pixels and then xyxymatch produced a list of correspondences between the two files. From this data I made a file of measured magnitudes (which I call M) matched with the corresponding B, V and R magnitudes --- 1239 of them.

Then the fun began. Some matches were clearly coincidences --- faint NOMAD1 stars matched with noise. Some were double identifications: two close NOMAD1 stars unresolved in Tom's images. To see what was going on, the file was loaded into R (the data-analysis package, not the photometry band!) and various least-squares fits performed. This was in itself a learning exercise because I really don't know how to drive R. As expected, the faintest matches showed enormous residuals, so I slung out everything fainter than V=14 and refitted.

The initial fits showed what background reading had led me to expect: the DSLR green channel is very well correlated with the V band but less so with B and R. Good. They also showed a number of outliers scattered pretty much at random, presumably due to false matches to faint stars or unresolved doubles, and quite a lot of scatter at the bright end, where it's much more difficult to measure the star images. Removing those yielded the following results. Code:
lm(formula = M ~ V, data = df)
Residuals:
Min 1Q Median 3Q Max
-0.66014 -0.15829 -0.02395 0.13325 0.88001
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.560642 0.086998 40.93 <2e-16 ***
V 1.000470 0.007241 138.17 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.2242 on 450 degrees of freedom
Multiple R-squared: 0.977, Adjusted R-squared: 0.9769
F-statistic: 1.909e+04 on 1 and 450 DF, p-value: < 2.2e-16
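To make the straight-line fit concrete outside R, here is a minimal stand-alone sketch of the same idea in Python/numpy. All the numbers below (star count, zero point, noise level) are invented stand-ins, not Tom's data; with the real matched table one would load M and V instead of simulating them.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic stand-in for the matched catalogue: true V magnitudes,
# and "instrumental" magnitudes M offset by an arbitrary zero point.
v = rng.uniform(6.0, 14.0, 500)                      # catalogue V magnitudes
zero_point = 3.56                                    # invented instrumental offset
m = v + zero_point + rng.normal(0.0, 0.22, v.size)   # M = V + ZP + noise

# Ordinary least squares, the same model as R's lm(M ~ V);
# polyfit with degree 1 returns [slope, intercept].
slope, intercept = np.polyfit(v, m, 1)

print(f"intercept = {intercept:.3f}, slope = {slope:.3f}")
```

If the detector is linear, the recovered slope should come out near 1 and the intercept near the (arbitrary) zero point, which is what the R fit above shows.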
Code:
lm(formula = M ~ V + V2, data = df)
Residuals:
Min 1Q Median 3Q Max
-0.58434 -0.14479 -0.01454 0.10051 0.85987
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 7.093384 0.433187 16.375 < 2e-16 ***
V 0.337277 0.080168 4.207 3.12e-05 ***
V2 0.030318 0.003652 8.302 1.21e-15 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.209 on 449 degrees of freedom
Multiple R-squared: 0.98, Adjusted R-squared: 0.9799
F-statistic: 1.102e+04 on 2 and 449 DF, p-value: < 2.2e-16
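The V2 regressor in the second fit is simply V squared entered as an extra column. A hedged numpy sketch of both design matrices (synthetic data with an invented mild nonlinearity, coefficients loosely borrowed from the fit above) shows how the quadratic term reduces the residual standard error:

```python
import numpy as np

rng = np.random.default_rng(0)
v = rng.uniform(6.0, 14.0, 500)
# Mild curvature plus noise, loosely mimicking a detector nonlinearity.
m = 7.1 + 0.34 * v + 0.03 * v**2 + rng.normal(0.0, 0.2, v.size)

def rse(X, y):
    """Residual standard error, as reported by R's summary.lm."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return np.sqrt(resid @ resid / (y.size - X.shape[1]))

# Design matrices for lm(M ~ V) and lm(M ~ V + V2) with V2 = V^2.
X1 = np.column_stack([np.ones_like(v), v])
X2 = np.column_stack([np.ones_like(v), v, v**2])

rse_lin, rse_quad = rse(X1, m), rse(X2, m)
print(f"RSE linear: {rse_lin:.4f}, quadratic: {rse_quad:.4f}")
```

The drop from 0.2242 to 0.209 in the R output is the same effect: the quadratic soaks up curvature the straight line cannot.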
A plot of the residuals from the linear fit is attached. It's broadly as I expected, with more scatter at the faint end, but the residuals are disappointingly large and there's an obvious correlation up to V=11 or so. Not sure how to proceed from here, but I suspect that paying closer attention to DAOPHOT would be a good start. Alternatively, perhaps the data just isn't good enough for photometry below V=11.

This whole series of harangues has been documenting my attempt to educate myself. I think I'm learning, slowly.
#2
"David" (Just call me Henry · Sep 2007 · Cambridge (GMT/BST) · 5880₁₀ posts)
One way of combating the heteroscedasticity you are seeing is log-transforming the dependent variable. Once the result is back-transformed you should be able to predict M better with V, if that is your aim.

It looks like you may be calculating the residuals by hand. A shortcut is the residuals() function. Plotting the linear-model object produces diagnostic plots that might help identify whether there are huge outliers skewing your linear model. I would imagine that some of the residuals come from misclassified stars, for example.

It does look like something changes significantly when V reaches ~11. Would the sensor lose accuracy once the "correct" V goes above 11? Assuming the camera were perfect you should expect a fit of 0 + 1*V. Given that the fitted slope is very close to this, I am surprised by the apparent slope below V=11. I suspect that there are two separate things going on, messing up this fit. Any modelling will only end up correcting errors, since we know that the slope should be 1.
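Before transforming anything, it is worth quantifying the heteroscedasticity itself. A Python sketch (the noise law here is invented purely for illustration) that splits the residuals into bright and faint halves and compares their scatter:

```python
import numpy as np

rng = np.random.default_rng(1)
v = rng.uniform(6.0, 14.0, 400)
# Invented noise law: scatter grows toward the faint end (heteroscedastic).
m = 3.56 + v + rng.normal(0.0, 0.05 * (v - 5.0), v.size)

slope, intercept = np.polyfit(v, m, 1)
resid = m - (slope * v + intercept)   # same numbers R's residuals(fit) gives

# Compare residual scatter in the bright and faint halves:
bright = np.std(resid[v < 10.0])
faint = np.std(resid[v >= 10.0])
print(f"scatter bright half: {bright:.3f}, faint half: {faint:.3f}")
```

If the faint-half scatter is much larger, a variance-stabilising transform or a weighted fit is worth trying; note that magnitudes are already logarithmic in flux, so a weighted least-squares fit may be the more natural of the two.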
#3
xilman (Bamboozled! · May 2003 · Down not across · 2⁹·3·7 posts)
Quote:
Not sure why you are surprised. The fit is Code:
V           1.000470   0.007241  138.17   <2e-16 ***
The fit should not have an intercept of zero because the photometry was with respect to an arbitrary zero magnitude. Without precise knowledge of the sky brightness in terms of photons per square arcsec per second, equally precise knowledge of the properties of the optics and detectors, and knowledge of the air mass between the star and the detector, it's impossible to perform absolute photometry. Differential photometry by its very nature typically has a non-zero constant offset.

At the faint end there will inevitably be much greater scatter because the estimated star photons become an ever smaller fraction of the estimated sky photons and of the intrinsic noise in the detector. Why there appears to be a breakdown in linearity escapes me at the moment. It may well arise from my inexperience at performing the initial aperture photometry. That's something I'm going to explore further.

Thanks for the tip about log transformation. Something else to explore. I can see that you know vastly more about R and data modelling than I do. Consider yourself on notice for further questioning.
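The arbitrary zero point can be made concrete. In differential photometry the constant offset is estimated from comparison stars of known magnitude and then subtracted from the target's instrumental magnitude. A toy sketch (all numbers invented):

```python
import numpy as np

rng = np.random.default_rng(2)
v_cat = rng.uniform(7.0, 12.0, 50)      # catalogue V of comparison stars
zp_true = 3.56                          # unknown instrumental zero point
m_inst = v_cat + zp_true + rng.normal(0.0, 0.1, v_cat.size)

# The zero point falls out as a robust average of M - V over the
# comparison stars (median is less sensitive to the odd bad match):
zp = np.median(m_inst - v_cat)

# Calibrate a new measurement differentially:
m_target = 14.2
v_target = m_target - zp
print(f"zero point = {zp:.3f}, target V = {v_target:.3f}")
```

This is exactly why the fitted intercept above (3.56) is meaningful only relative to this data set, not as an absolute calibration.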
#4
xilman (Bamboozled! · May 2003 · Down not across · 2⁹×3×7 posts)
Quote:
Code:
summary (foo)
Call:
lm(formula = M ~ V, data = df)
Residuals:
Min 1Q Median 3Q Max
-0.16135 -0.04791 -0.01078 0.04423 0.18402
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 4.803201 0.085455 56.21 <2e-16 ***
V 0.868028 0.009436 92.00 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.07683 on 60 degrees of freedom
Multiple R-squared: 0.993, Adjusted R-squared: 0.9928
F-statistic: 8463 on 1 and 60 DF, p-value: < 2.2e-16
The residual plot is attached. It looks fine to my untutored eye.

Last fiddled with by xilman on 2017-04-20 at 17:17
#5
"David" (Just call me Henry · Sep 2007 · Cambridge (GMT/BST) · 2³·3·5·7² posts)
Quote:
Care is needed when removing outliers: any fit will look good if all the outliers are removed.
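One way to make outlier removal defensible is iterative sigma-clipping against the fit itself, rather than an ad-hoc cut. A sketch of the idea (the data, contamination fraction, and thresholds here are all invented, not the actual reduction):

```python
import numpy as np

def sigma_clip_fit(x, y, nsigma=3.0, iters=5):
    """Iteratively fit y ~ x, dropping points beyond nsigma residual sigmas."""
    keep = np.ones(x.size, dtype=bool)
    for _ in range(iters):
        slope, intercept = np.polyfit(x[keep], y[keep], 1)
        resid = y - (slope * x + intercept)
        new_keep = np.abs(resid) < nsigma * np.std(resid[keep])
        if np.array_equal(new_keep, keep):
            break                      # converged: no points changed status
        keep = new_keep
    return slope, intercept, keep

rng = np.random.default_rng(3)
v = rng.uniform(6.0, 14.0, 300)
m = 3.56 + v + rng.normal(0.0, 0.2, v.size)
m[:15] += rng.uniform(1.0, 3.0, 15)    # gross outliers standing in for false matches

slope, intercept, keep = sigma_clip_fit(v, m)
print(f"kept {keep.sum()} of {v.size}, slope = {slope:.3f}")
```

The threshold is stated up front and applied uniformly, so the rejected points can be reported rather than silently discarded.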
#6
"David" (Just call me Henry · Sep 2007 · Cambridge (GMT/BST) · 2³×3×5×7² posts)
Quote:
I don't think we are going to win with this data. I suspect that a piecewise model is needed. Even then, wherever we divide the dataset there is still some appearance of a pattern. I would not like to get data like this from a medical trial.
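A piecewise (broken-stick) model with a knot near V=11 can be fitted as ordinary least squares by adding a hinge regressor. A sketch with invented slopes and noise, purely to show the construction:

```python
import numpy as np

def piecewise_fit(x, y, knot=11.0):
    """Continuous broken-stick fit: the slope is allowed to change at the knot."""
    # Basis: 1, x, and the hinge max(x - knot, 0), so the line bends at the knot
    # while remaining continuous there.
    X = np.column_stack([np.ones_like(x), x, np.maximum(x - knot, 0.0)])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta   # intercept, slope below knot, slope *change* above knot

rng = np.random.default_rng(4)
v = rng.uniform(6.0, 14.0, 400)
# Invented truth: slope 0.8 below V=11, slope 1.0 above, continuous at the knot.
m = 3.0 + 0.8 * v + 0.2 * np.maximum(v - 11.0, 0.0) + rng.normal(0.0, 0.1, v.size)

b0, b1, b2 = piecewise_fit(v, m)
print(f"slope below knot = {b1:.3f}, above = {b1 + b2:.3f}")
```

Whether the third coefficient is significantly non-zero is then a direct test of whether the relation really does bend at V~11.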
#7
"Chris Halsall" (If I May · Sep 2002 · Barbados · 2·5·7·139 posts)
Quote:
OpenCV was initially underwritten by Intel and is still being supported; it codifies known and published algorithms. The libraries are licensed under the BSD licence. Could this possibly help?
#8
xilman (Bamboozled! · May 2003 · Down not across · 2⁹×3×7 posts)
Quote:
General philosophy aside, I can make the data available for others to analyse, though I suspect not many here will want half a gig of NEF-format images. More likely is the table of matched NOMAD-1 values and DAOPHOT-reduced observational magnitudes with their estimated errors. Given a day or so I could also download neighbouring areas of NOMAD-1 data and match them with the observations; that should produce another few thousand entries.
#9
xilman (Bamboozled! · May 2003 · Down not across · 2⁹×3×7 posts)
Another thought occurred overnight. Some of the stars are almost certain to be variables. If their brightness at the time of observation was significantly different from their catalogued value, the residuals would be larger. There are (at least) two issues which could bias the selection. Variables which happened to be too faint to reach the cut-off would be removed from consideration despite meeting the NOMAD-1 criterion. Another is that I do not know, yet, whether NOMAD-1 records maximum, minimum or some kind of weighted mean magnitude for variables; whichever is chosen, there is an opportunity for bias.
#10
xilman (Bamboozled! · May 2003 · Down not across · 2A00₁₆ posts)
OK, here's the summary of the raw fit to four times as many catalogue entries. Raw in that no attempt has been made to remove outliers or data with magnitudes way fainter than can be plausibly justified.
To be clear, what I'm interested in is finding V given a value of M measured by aperture photometry on a stacked set of images. Code:
> summary (foo)
Call:
lm(formula = V ~ M, data = df)
Residuals:
Min 1Q Median 3Q Max
-6.0316 -0.5887 -0.4071 -0.0404 6.9252
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -2.06930 0.24615 -8.407 <2e-16 ***
M 0.93904 0.01507 62.326 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 1.151 on 2886 degrees of freedom
Multiple R-squared: 0.5737, Adjusted R-squared: 0.5736
F-statistic: 3884 on 1 and 2886 DF, p-value: < 2.2e-16
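The collapse of R² from ~0.98 in the cleaned fit to 0.57 here is just what heavy contamination does to a fundamentally tight relation. A toy demonstration (all numbers invented, not the NOMAD-1 data):

```python
import numpy as np

def r_squared(x, y):
    """R^2 of a simple y ~ x least-squares fit."""
    slope, intercept = np.polyfit(x, y, 1)
    resid = y - (slope * x + intercept)
    tss = (y - y.mean()) @ (y - y.mean())
    return 1.0 - (resid @ resid) / tss

rng = np.random.default_rng(5)
m = rng.uniform(10.0, 18.0, 2000)
v_clean = -2.0 + 0.94 * m + rng.normal(0.0, 0.25, m.size)
r2_clean = r_squared(m, v_clean)

# Contaminate 15% of the matches with grossly wrong magnitudes,
# standing in for false matches and unresolved doubles:
v_bad = v_clean.copy()
idx = rng.choice(m.size, 300, replace=False)
v_bad[idx] += rng.normal(0.0, 3.0, idx.size)
r2_bad = r_squared(m, v_bad)

print(f"R^2 clean: {r2_clean:.3f}, contaminated: {r2_bad:.3f}")
```

So a low raw R² need not mean the M-to-V relation is poor; it can mean a minority of matches are simply wrong, which argues for principled outlier rejection before judging the fit.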
Last fiddled with by xilman on 2017-04-23 at 19:19 Reason: Remember to upload images
#11
xilman (Bamboozled! · May 2003 · Down not across · 2⁹·3·7 posts)
ISTM from fit1.jpg that if I arbitrarily remove all data with abs(residual) greater than, say, 0.1, the remainder will enable me to predict V from M as required. I've absolutely no theoretical justification for making that decision. Any ideas and analysis are very welcome.
Last fiddled with by xilman on 2017-04-23 at 19:28 Reason: s/if if/if/