## RV Fits: The Keplerian Solution

Johannes Kepler

We looked at deriving the radial velocity equation a little here, but for the sake of this discussion we will revisit some of the material and expand on it. Furthermore, while modelling the radial velocity curve was described here, it was done from the perspective of statistics and data, as opposed to actually modelling the stellar reflex velocity arising from the gravitational influence of a planet. In hindsight, maybe it is this article which should have been titled “making waves.” I will seek to clarify the physical modelling of the radial velocity curve itself, and how the orbital orientation of the planet shapes the Keplerian radial velocity curve.

The radial velocity equation relies heavily on Kepler’s Laws of Planetary Motion, which govern the shapes and periods of the orbits of planets and stars. As such, any fit to radial velocity data using the radial velocity equation is called a Keplerian fit: a fit to RV data that is effectively a model of the motion of the star as expected from Kepler’s Laws.

If you recall, the radial velocity equation is given by

$\displaystyle V_r(t) = K[\cos(\epsilon+\omega)+e\cos(\omega)]+\gamma+\dot{\gamma}t+\ddot{\gamma}t^2$

where ω is the longitude of perihelion of the orbit, e is the eccentricity of the orbit, $\gamma$ is a term used to represent the radial velocity offset of the system (the radial velocity of the system arising simply from its motion through space), $\dot{\gamma}$ is any linear radial velocity trend in the data (in m s$^{-1}$ per unit time), and $\ddot{\gamma}$ is a term for a quadratic trend in the RV data – cases where the RV data are curved, but not in such a way that the period can be readily determined. This is typically observed when the aforementioned trend has been followed long enough that it begins to show some curvature. If there is no need for any of the $\gamma$ terms, they may be set to zero. Lastly, K is the amplitude of the RV curve; it corresponds directly to the mass of the planet through the semi-major axis of the star’s orbit around the system barycentre, and it also depends on the (unknown) inclination of the system and the eccentricity of the orbit.

$\displaystyle K=\frac{2\pi a_*\sin{i}}{P\sqrt{1-e^2}}$

The nature of $\epsilon$ was not expanded on in the aforementioned page. It represents the “eccentric anomaly” and can be understood as representing where the planet is in its orbit at a given time. For circular orbits (e = 0), the eccentric anomaly is simply equal to the mean anomaly: 0 at the start of the orbit and 2π radians (360°) after a full orbit. For non-circular orbits (e > 0), it must be calculated.

Eccentric Anomaly (source)

The relationship between the mean anomaly and the eccentric anomaly is given by

$\displaystyle M = \epsilon - e \sin{\epsilon}$

It turns out that Kepler’s equation is transcendental – it is impossible to solve for $\epsilon$ algebraically – so an iterative algorithm is needed to approximate it. A standard choice is Newton’s Method, given as

$\displaystyle x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}$

An initial guess, x0, is plugged into the right-hand side to produce x1, which is fed back in, and so on. Each step evaluates the function at the current guess and divides by the derivative evaluated at the same point. The iteration never returns an exact answer, but after a handful of iterations the estimate is close enough that further refinement is redundant (and for a poor choice of starting point, Newton’s Method can fail to converge at all, so some care is warranted). As you might imagine from the orbit diagram above, the mean anomaly itself is a fairly decent guess at the eccentric anomaly, so one is safe using it for x0. In the case of Kepler’s equation, we first set the expression equal to zero, $0 = \epsilon-e\sin{\epsilon}-M$, and then apply Newton’s Method:

$\displaystyle \epsilon_{n+1} = \epsilon_n - \frac{\epsilon_n -e\sin{\epsilon_n}-M}{1-e\cos{\epsilon_n}}$

This may be done until $\epsilon$ is determined to the desired precision.
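As a minimal sketch (assuming nothing beyond the equations above; the function name and tolerance are my own choices), the iteration looks like this in Python:

```python
import math

def eccentric_anomaly(M, e, tol=1e-10, max_iter=100):
    """Solve Kepler's equation M = E - e*sin(E) for the eccentric
    anomaly E via Newton's Method, starting from the mean anomaly."""
    E = M  # the mean anomaly is a decent initial guess
    for _ in range(max_iter):
        dE = (E - e * math.sin(E) - M) / (1.0 - e * math.cos(E))
        E -= dE
        if abs(dE) < tol:
            break
    return E
```

For moderate eccentricities this converges to machine precision in only a few iterations.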

Graphing the RV equation is rather straightforward with some computer scripting, though it can be somewhat challenging with a graphing calculator given the difficulty of determining $\epsilon$ to sufficient precision throughout the orbit (it turns out that astronomy uses both math and computers!). Let us examine how the shape of the RV curve varies with the orbital parameters (for a description of the various orbital parameters, see here). Obviously changing the period stretches the wavelength of the curve, and varying K changes the amplitude. Changing the eccentricity alters the shape of the orbit significantly and will therefore affect the y-axis symmetry of the RV curve.

Manifestation of e on the RV curve

The “jagged” appearance of RV curves arising from eccentric planets is distinctive and difficult to replicate through other phenomena such as pulsations. As such, the RV signals of eccentric planets are less ambiguously of planetary origin than those of planets in circular orbits.

Changes in the mean anomaly of the orbit simply effect a phase shift, moving the RV curve of the planet along the x-axis (time axis). Changing the longitude of perihelion, ω, rotates the planetary orbit within its plane in space; for a circular orbit this causes what appears to be a phase offset, but for eccentric planets it causes noticeable changes to the shape of the RV curve. Changing ω for eccentric orbits breaks the x-axis symmetry of the curve.

Manifestation of ω on the RV curve

As before, the shape of the RV curve here is hard to replicate through processes intrinsic to the star. This “sawtooth” shape gives much more reason to believe the RV signal is due to an orbiting companion rather than other phenomena such as starspots or pulsations.

Now multiple planets can be modelled by simply adding their RV curves together. For a system of n planets, the RV curve is given by
$\displaystyle V_r(t) = \gamma + \dot{\gamma}t + \ddot{\gamma}t^2 + \sum_{i=1}^n K_i[\cos(\epsilon_i + \omega_i)+e_i\cos(\omega_i)]$
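A sketch of the summed model in Python, following the RV equation as used here (with the eccentric anomaly in the cosine term, and the γ trends written as explicit functions of time); the function names and the dictionary of orbital elements are my own conventions:

```python
import math

def eccentric_anomaly(M, e, tol=1e-10):
    """Newton's Method for Kepler's equation, M = E - e*sin(E)."""
    E = M
    while True:
        dE = (E - e * math.sin(E) - M) / (1.0 - e * math.cos(E))
        E -= dE
        if abs(dE) < tol:
            return E

def radial_velocity(t, planets, gamma=0.0, gamma_dot=0.0, gamma_ddot=0.0):
    """Summed Keplerian RV at time t. Each planet is a dict with period
    P, amplitude K, eccentricity e, longitude of perihelion w (radians),
    and time of perihelion passage tp (same time units as t)."""
    v = gamma + gamma_dot * t + gamma_ddot * t**2
    for pl in planets:
        M = 2.0 * math.pi * (((t - pl["tp"]) / pl["P"]) % 1.0)  # mean anomaly
        eps = eccentric_anomaly(M, pl["e"])
        v += pl["K"] * (math.cos(eps + pl["w"]) + pl["e"] * math.cos(pl["w"]))
    return v
```

A single-planet system is just the n = 1 case; additional planets are additional entries in the list.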

Keplerian modelling of RV data is simple, fast, and computationally cheap. However, it leaves out very important effects which we will look at later, such as planet-planet interactions. That being said, Keplerian fitting is still the most widely used method of deriving extrasolar planet properties from RV data.

## Probability of Transit

Transiting Planets. Credit: NASA

Transiting planets are valuable opportunities to explore the properties of planetary atmospheres. Planet searches like Kepler that focus on fields of sky tend to reap rewards amongst dimmer stars simply because there are many more dim stars in a given patch of the sky than bright ones. Transiting planets around bright stars are of particular value, though, as the increased brightness makes the system easier to study.

Radial velocity surveys tend to monitor brighter stars since spectroscopy is even more severely limited by stellar brightness than photometry, but it is not limited to observing patches of sky – telescopes performing Doppler spectroscopy tend to observe a single object at a time due to technical and physical limitations. Radial velocity surveys are also much less sensitive to the inclination angle of a planet’s orbit with respect to the plane of the sky. The planet doesn’t have to transit to be spectroscopically detectable. As such, radial velocity surveys tend to generate discoveries of planet candidates with unknown inclinations and true masses, but around much brighter stars than those planets discovered by the transit method.

As such, planet candidates discovered by radial velocity – especially those in short orbital periods – are excellent targets for follow-up observations to attempt to detect transits. Transiting planets that were discovered first through radial velocity have been of great scientific interest due to their host stars’ brightness and thus ease of study. If more such systems are found, it would be of great benefit to understanding extrasolar planet atmospheres. While only a handful of transiting planets have been discovered first through radial velocity, they all orbit bright stars and are some of the best-characterised planets outside our solar system.

The probability that a planet will transit is, as has been discussed previously, given by
$\displaystyle P_{tr} = \frac{R_*}{a}$
where a is the semi-major axis of the planet’s orbit – the distance between the centre of the star and the centre of the planet. However, due to the inclination degeneracy – the recurring villain constantly plaguing radial velocity science – the star-planet separation is unknown. Remember that the period of the RV curve gives only the orbital period of the planet. If the orbital period is held constant, increasing the mass of the planet increases the star-planet separation: an increase in the total system mass requires a greater separation between the two bodies to preserve the same orbital period.

For example, suppose radial velocity observations of a star reveal the presence of a mp sin i = 1 ME planet candidate, but the inclination is actually extremely low, such that the true mass of the companion is in the stellar regime. Because the mutual gravitational attraction between two stars is much greater than that between a star and an Earth-mass planet at the same period, the two stars must have a wider separation – otherwise their orbital period would be smaller.

Mathematically, the true semi-major axis is given by
$\displaystyle a = \left(\frac{G[M_*+M_{\text{pl}}(i)]}{4\pi^2}\right)^{1/3}T^{2/3}$
where G is the gravitational constant, Mpl(i) is the mass of the planet at a given inclination i, and T is the orbital period of the system. It is worth noting that the true semi-major axis is not significantly different from the minimum semi-major axis as long as the mass of the star is much greater than the mass of the planet – which is typically the case.
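A quick numerical sketch of this relation for the Sun–Earth case (SI values; the function names are my own):

```python
import math

G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
M_sun = 1.989e30     # kg
R_sun = 6.957e8      # m
M_earth = 5.972e24   # kg

def semi_major_axis(M_star, M_planet, T):
    """Kepler's third law: a = (G[M* + Mpl] / 4 pi^2)^(1/3) * T^(2/3)."""
    return (G * (M_star + M_planet) / (4.0 * math.pi**2))**(1.0/3.0) * T**(2.0/3.0)

def transit_probability(R_star, a):
    """Geometric transit probability, P_tr = R* / a."""
    return R_star / a

T = 365.25 * 86400.0                       # one year, in seconds
a = semi_major_axis(M_sun, M_earth, T)     # ~1.5e11 m, i.e. ~1 AU
p_tr = transit_probability(R_sun, a)       # roughly half a percent
```

As expected, an Earth analogue recovers a ≈ 1 AU and a transit probability of only about 0.5%.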

The fact that the true semi-major axis is a function of the unknown inclination makes for an interesting clarification: the probability that a planet of unknown inclination will transit is not simply given by Rstar/a, but is only approximated by it. If we assume that the distribution of planet masses is uniform (extending through into the brown dwarf mass regime), then you would expect a planet with a minimum mass equal to Earth’s to have a much greater chance of being a bona fide planet than one with a minimum mass of 10 MJ, simply because there is a greater range of inclinations the former can have while still remaining in the planetary mass regime. Taking this a step further, even if both the Earth-mass planet candidate and the 10 Jupiter-mass planet candidate have the same orbital period, the probability that the latter transits ends up being lower, simply because of its high mass: since its inclination is unknown, the probability that its true mass is high enough to make the true semi-major axis noticeably larger than the minimum semi-major axis is much greater, resulting in a likely lower transit probability.

Except it turns out that the mass distribution of planets and brown dwarfs isn’t uniform. Earth-sized planets are significantly more common than Jupiter-sized planets, and super-Jupiters appear rare. It isn’t yet clear what the mass distribution of planets actually is, with significant uncertainty in the sub-Neptune regime, but it is clear that for a highly accurate estimate of the transit probability, the inclination distribution cannot be treated as completely random, as it is fundamentally tied to the planet mass distribution.

Planet Mass Distribution given by Ida & Lin (Left) and Mordasini (Right)

Consider the case of a super-Jovian planet candidate, perhaps with a minimum mass of 7 or 8 Jupiter-masses. Because a significant fraction of physically allowable inclinations would place the true mass of the planet into a mass regime that is in reality sparsely populated, it is less likely that the planet candidate’s orbit has those inclinations. It is thus more likely that the planet candidate’s orbit is edge-on than would be expected from the probability function of randomly oriented orbits. As such, the transit probability of a super-Jovian planet is actually boosted by ~20 – 50% over what you would expect from Ptr = Rstar/a. If this is the case, then we would expect to find a larger fraction of transiting planets in this mass regime than the standard transit probability function predicts. Indeed, this is what we see.

Candidate planets with minimum masses in the terrestrial planet regime are similarly affected, with boosted transit probabilities, owing to the fact that terrestrial planets are more common than higher-mass planets – arguing in favour of a higher inclination than a random inclination distribution would suggest.

On the other hand, planet or brown dwarf candidates of minimum masses in the most sparsely populated region of the mass distribution are unlikely to truly have that mass. They are quite likely in orbits with low inclinations and with much higher true masses. The transit probability for companion candidates with minimum masses in this mass regime are actually reduced from the standard transit probability function.

Geometric and a posteriori transit probabilities

In the table above, taken from this preprint, we see that the geometric transit probability, Ptr,0, can be much less than the a posteriori transit probability, Ptr. The transit probability for 55 Cnc e, for example, jumps up from 28% to 36%. With these higher a posteriori transit probabilities, these short-period low-mass planets should be followed-up for transits. If transits are found, it would be of significant benefit to the extrasolar planet field.

In summary, there are various additional effects that can cause the a posteriori transit probability to be significantly different from the geometric transit probability. Planets with only minimum masses known can be more accurately assigned a transit probability when taking into account the uneven planetary mass distribution. Low-mass planets and super-Jupiters are more likely to transit than their geometric transit probability suggests, because a significant range of the inclination space is consumed by planets of masses that are simply rare. These planet candidates are more promising targets for transit follow-up than, for example, Jupiter-mass planets or intermediate-mass brown dwarfs.

## Modelling the Transit Light Curve

The Sun (Credit: NASA)

We looked at a simplistic understanding of planetary transits and transit light curves (here). In the interests of accuracy, it is worth developing a more formal understanding of the transit light curve. Specifically, this post is aimed at actually modelling the transit light curve while taking into account some more complicated effects such as limb darkening.

The actual shape of the transit light curve can be mathematically described by a function of the sky-projected centre-to-centre distance between the star and the planet, z, expressed in units of the stellar radius. Here z = 0 corresponds to mid-transit of a centrally-crossing transit, the impact parameter, b, is the minimum value of z throughout the transit, and the transit starts at about z ~ 1. Let’s also define p as the ratio of the radii of the planet and the star, $p = R_p/R_*$. More accurately, first and fourth contacts occur at $z = 1 + p$, and second and third contacts at $z = 1 - p$.

Let us now define a function for the fraction of the stellar disc that is obscured by the planet. Normally you might be inclined to feel that $(R_p/R_*)^2$ suffices; however, it describes only the obscured fraction between second and third contacts, and is thus inadequate to describe the entire transit. We therefore construct the obscured fraction as a piecewise function of p and z, $\lambda(p,z)$, whose intervals describe different parts of the transit. The unobscured fraction – the relative flux we would observe from a uniformly bright star – is then $A(p,z) = 1 - \lambda(p,z)$.

$\displaystyle \lambda(p,z) = \begin{cases} 0, & 1 + p < z\\ \frac{1}{\pi} \left[ p^2\kappa_0+\kappa_1 - \sqrt{\frac{4z^2-(1+z^2-p^2)^2}{4}} \right], & \lvert 1 - p\lvert < z \le 1 + p\\ p^2, & z \le 1 - p\\ 1, & z \le p - 1 \end{cases}$

where $\kappa_1 = \cos^{-1}[(1 - p^2 + z^2)/2z]$ and $\kappa_0 = \cos^{-1}[(p^2 + z^2 - 1)/2pz]$. There’s a lot going on here, but it isn’t too complicated to understand. The first case, $1 + p < z$, describes when the planet is completely off the stellar disc – before first contact – hence there is zero obscured area. The second case is the most complicated and describes the time between first and second (or third and fourth) contacts. The third case is what we covered earlier, where the obscured fraction is simply the square of the ratio of the radii, $p^2$. The final case is only satisfied in the (unlikely) condition that the planet-to-star radius ratio exceeds 1, and the entire stellar disc is obscured.
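A direct Python translation of the piecewise function above, with the inverse-cosine arguments clamped to [−1, 1] to guard against floating-point rounding at the contact points:

```python
import math

def lam(p, z):
    """Obscured fraction of the stellar disc, lambda(p, z), with z the
    centre-to-centre separation in units of the stellar radius."""
    z = abs(z)
    if 1.0 + p < z:              # planet fully off the disc
        return 0.0
    if z <= p - 1.0:             # stellar disc fully covered (p > 1 only)
        return 1.0
    if z <= 1.0 - p:             # planet fully on the disc
        return p * p
    # ingress/egress: partial overlap between the two discs
    clamp = lambda x: max(-1.0, min(1.0, x))
    k0 = math.acos(clamp((p*p + z*z - 1.0) / (2.0*p*z)))
    k1 = math.acos(clamp((1.0 - p*p + z*z) / (2.0*z)))
    s = math.sqrt(max(0.0, (4.0*z*z - (1.0 + z*z - p*p)**2) / 4.0))
    return (p*p*k0 + k1 - s) / math.pi
```

The relative flux from a uniformly bright star is then simply 1 − lam(p, z).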

Calculating z as a function of time may be done with knowledge of the impact parameter and the Pythagorean theorem: $z(t) = \sqrt{b^2 + x(t)^2}$, where x is the planet’s projected distance along the transit chord (again in units of the stellar radius).

For a transiting planet with an impact parameter of zero and a radius ratio of $R_p/R_* = 0.1$ (producing a 1% transit depth), the above model produces the following light curve plot.

Simulated Light Curve

Real light curves are actually curved, however, whereas this model is clearly much more “rigid” in shape. The reason for this is an effect known as limb darkening, where the outer layers of the star do not scatter the light from underneath isotropically, causing less light from the limb of the star to reach the observer than from the centre of the stellar disc. For this reason, stars generally appear dimmer at their limbs and brighter at their centres. The magnitude of this difference depends on a number of factors intrinsic to each star.

Proper handling of limb darkening will be required to obtain high-fidelity fits to high-accuracy data. A particularly popular and effective model for limb darkening was presented by Claret (2000). We can consider a four-parameter limb-darkening law describing the relative brightness of a point on the star expressed as a function of the angle between the observer, the centre of the star, and a line between the stellar centre and the given point on the star.

$\displaystyle I(r) = 1 - \sum^4_{n=1} c_n(1 - \mu^{n/2})$

where $\mu = \cos{\theta} = \sqrt{1 - r^2}$ and each value of $c_n$ is called a “limb darkening coefficient.” At the centre of the stellar disc, $I(0) = 1$. Each star has a unique set of limb darkening coefficients, and they change depending on the wavelength in which the transit is observed (due to how the light-scattering properties of the stellar photosphere vary with wavelength). One could choose as many limb darkening terms (and thus coefficients) as desired, but a high number is rarely required to adequately fit transit light curve data.
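A small sketch of this law in Python (the function name is mine, and the coefficients used below are placeholders; real coefficients are tabulated per star and per bandpass):

```python
import math

def intensity(r, c):
    """Claret-style four-parameter limb-darkening law: relative
    intensity at normalised radial position r on the stellar disc
    (r = 0 at the centre, r = 1 at the limb), coefficients c = (c1..c4)."""
    mu = math.sqrt(max(0.0, 1.0 - r * r))   # mu = cos(theta)
    return 1.0 - sum(cn * (1.0 - mu ** ((n + 1) / 2.0))
                     for n, cn in enumerate(c))
```

For instance, with c = (0.5, 0.1, 0.1, 0.1) the disc centre has relative intensity 1, while the limb drops to 0.2.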

Limb darkening must be taken into account to get accurate measurements of radius ratios of transiting objects, and thus planetary radii. Because limb darkening concentrates the brightness of the star into the centre of the disc, a planet that would cause a 1% transit depth across a uniformly bright star would cause a deeper transit depth in a central transit across a limb-darkened star, and a shallower transit depth in a grazing transit across a limb-darkened star. Proper care in modelling the limb-darkening behaviour of the star is therefore required to ensure a good measurement of the planetary radius.

We see therefore that the brightness difference is not a pure function of the ratio of the stellar and planetary radii. $\Delta F = (R_p/R_*)^2$ is only an approximation, and only correct for the unrealistic case of a uniformly bright stellar disc.

So what can we do to get a more comprehensive representation of a light curve? Let’s modify the familiar equation for $\Delta F$ and consider brightnesses instead of pure area ratios. The limb-darkening law above describes the brightness of the star from centre to limb, so with some simple calculus we can integrate it over the stellar disc to obtain the total brightness of the star while taking the centre-to-limb variation into account. The actual transit depth of a planet at any given centre-to-centre distance (represented as z again) is then the ratio of the stellar brightness in the area covered by the planet to the total brightness of the stellar disc. We therefore draft a new function based on our original understanding of $\Delta F$, combining the obscured-area model above with the limb-darkening model for the brightness distribution of the stellar disc:

$\displaystyle F(p,z) = \left[ \int^1_0 dr 2rI(r) \right]^{-1} \int^1_0 dr I(r)\frac{d[A(p/r,z/r)r^2]}{dr}$
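A numerical sketch of this integral in Python (grid size and function names are my own choices; the λ and I functions from earlier are restated so the snippet is self-contained). The integral against $d[A(p/r,z/r)r^2]$ is evaluated as a Riemann–Stieltjes sum over a uniform grid in r:

```python
import math

def lam(p, z):
    # obscured fraction lambda(p, z), restated from the piecewise cases above
    z = abs(z)
    if 1.0 + p < z:
        return 0.0
    if z <= p - 1.0:
        return 1.0
    if z <= 1.0 - p:
        return p * p
    clamp = lambda x: max(-1.0, min(1.0, x))  # guard floating-point rounding
    k0 = math.acos(clamp((p*p + z*z - 1.0) / (2.0*p*z)))
    k1 = math.acos(clamp((1.0 - p*p + z*z) / (2.0*z)))
    s = math.sqrt(max(0.0, (4.0*z*z - (1.0 + z*z - p*p)**2) / 4.0))
    return (p*p*k0 + k1 - s) / math.pi

def intensity(r, c):
    # four-parameter limb-darkening law, restated from above
    mu = math.sqrt(max(0.0, 1.0 - r * r))
    return 1.0 - sum(cn * (1.0 - mu ** ((n + 1) / 2.0)) for n, cn in enumerate(c))

def flux(p, z, c, n=2000):
    """Evaluate F(p, z), the limb-darkened transit flux: integrate I(r)
    against d[A(p/r, z/r) r^2], where A(p, z) = 1 - lambda(p, z) is
    the unobscured fraction, and normalise by the total disc brightness."""
    dr = 1.0 / n
    norm = num = 0.0
    prev = 0.0  # A(p/r, z/r) r^2 -> 0 as r -> 0
    for i in range(1, n + 1):
        r = i * dr
        rm = r - 0.5 * dr                  # midpoint of this annulus
        I = intensity(rm, c)
        norm += 2.0 * rm * I * dr          # denominator: total disc brightness
        cur = (1.0 - lam(p / r, z / r)) * r * r
        num += I * (cur - prev)            # Stieltjes sum against d[A r^2]
        prev = cur
    return num / norm
```

For a uniform star (all coefficients zero) this recovers the simple 1 − p² depth; with limb darkening switched on, a central transit of the same planet comes out deeper, as described above.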

And it makes quite a difference, too. By incorporating limb darkening coefficients, we’re able to much more accurately simulate the transit of a planet across a limb-darkened star. The diagram below shows the same planet (with p = 0.1) transiting stars of different limb-darkening parameters. The shape and modification to the transit depth is readily apparent.

Effects of Limb-Darkening coefficients

## Great Mysteries Revealed

Giant squid finally caught on video

It was recently announced that Japanese scientists have managed to finally capture video footage of a giant squid. We’ve known for some time that they exist, but until now, they have never been seen like this in their native environment.

Previously discussed detection methods for extrasolar planets leave much to be desired. Doppler spectroscopy tells you only a minimum mass for the planet and its orbital period. Transit photometry tells you the size and orbital period of the planet, and is typically not capable of confirming the detected object as a planet (since the mass is unknown, a brown dwarf or low-mass star could masquerade as a planet). Gravitational microlensing tells you the planet’s mass but only a projected distance from the star, and usually nothing about the orbit. Astrometry tells you the planet’s true mass and orbit in three dimensions, but otherwise permits you no more information about the planet than Doppler spectroscopy. Combining these techniques will allow one to tease out more information, but it’s still awfully indirect. We know the planets are out there, but it would be nice to actually see them.

The ultimate detection method of the future, and the one that will provide the most information, is direct imaging. It sounds very straightforward – all it involves is a large telescope with a decent imaging contrast ratio – but there are daunting challenges involved, owing to the proximity of planets to their host stars and the brightness of those stars.

The first directly imaged exoplanet

As of the time of this writing, only a couple dozen or so planets have been imaged with 8 – 10 metre class telescopes as well as HST. For the most part, they were all imaged in the infrared, they are all very massive as far as planets go, and they are all widely separated from their stars – typically hundreds of AU.

The greatest problem is that stars are very bright and planets are comparatively very dim. In visible light, the brightness contrast between the Sun and Jupiter is a factor of a billion, though depending on the wavelength, system age, and planet mass, the contrast can be as low as ten thousand. The problem is made worse by the diffraction of the starlight across the focal plane of the telescope, which spreads the star’s light over a greater area of the image and produces a large amount of “noise.” A method of removing this excess light from a star is a requirement for directly imaging its planets.

One method of doing so is to use a coronagraph: an object in the telescope that blocks the light from the star, allowing one to see “around” it. While you might therefore expect a dark “hole” in the image where the star is, the diffraction of light around the coronagraph still produces a brighter, noisy area where the star would be. Imperfections in the coronagraph result in extra noise, and it is not clear that perfect coronagraphs are achievable. Furthermore, coronagraphs (even perfect ones) only attenuate the coherent part of the light’s wavefront (the shape of the waves of light); imperfections in the wavefront (called aberrations) can leak through the coronagraph to produce residual noise in the form of a speckled halo around the star. Methods exist to correct for this, such as wavefront correction with deformable mirrors, or calibrating the images for speckles. It’s not perfect, but it certainly removes a lot of the excess starlight.

Four planets at HR 8799

Some caution is necessary when discovering planets through direct imaging. A supposed planet next to a star could turn out to be a background star. Monitoring of the planet candidate over some time will be needed to determine if it is bound to the star. Since stars and their associated planets move through space together, there is a certain motion in the sky that the planet and the star will follow if they are bound, whereas the two will have vastly divergent motions in the sky if they are not.

The advantages of directly imaging extrasolar planets go beyond having pictures – it permits direct spectroscopy of the planet’s light. While it is possible to gather information about the atmosphere of a transiting planet at a great distance from its star, the fraction of planets at large separations that transit is quite low. Direct imaging opens up access to all planets with sufficient brightness and angular separation from their star.

I suspect in the future that this will be the most prevalent way of truly characterizing the planets in the solar neighborhood. Statistically speaking, few nearby Earths will transit, so we will require direct imaging to test their atmospheres for the presence of bio-markers that may be indicative of life.

## Making Waves

Photo by Mila Zinkova, 2007

We looked at how the Doppler effect can tell you about the radial velocity of a star (here), and how you can use this information to detect an orbiting companion (here). Let’s now look a bit more at some of the later parts of this process. You’ve already finished your observations, you’re done with the telescope time and you have a collection of data points. What do you do with them?

The first thing you will want to do is to create a model that fits the data. This is done using the equations in the post referred to in the above paragraph. The model will simply manifest itself as a plot of the radial velocity behaviour you would expect for a planet with a given mass and orbit. The closer your model fits the data, the more likely your model is a good representation of reality. For example, let’s look at radial velocity data of the giant star Iota Draconis (Credit: ESO).

Iota Draconis (HR 810) Radial Velocity Data

The individual points are the actual data, whereas the sine wave is the model of a 2.26 MJ planet with a 320-day period. We see that the proposed model fits the data well (being very close to the data), and is thus likely representative of reality. There are a few stray points and the model is not a perfect match to the data, but the data points can be prone to intrinsic noise, such as flares or granulation on the star’s surface, or instrumental noise with the spectrometer.

We see that the radial velocity data set covers several orbits of the planet around the star. As the planet continues to orbit the star (as it will do so for aeons) and the radial velocity data continues to come in (as we would hope would occur, more data is more opportunities to discover something new), the graph could become extremely long and complicated. For example, consider a set of radial velocity data for HD 192263 (source).

Unfolded HD 192263 RV data

What we can do is fold the entire data set by a certain period of time – specifically the orbital period (taking the time of each measurement modulo the orbital period). Since the model is periodic, repeating each orbital period, the data points (provided they share the periodicity you’re folding by) will fold up into a single, neat curve.

HD 192263 Folded RV data
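The folding itself is just modular arithmetic; a minimal sketch (function name my own):

```python
def fold(times, period, t0=0.0):
    """Phase-fold measurement times on a trial period: map each time
    to its orbital phase in [0, 1) via the modulo operation."""
    return [((t - t0) % period) / period for t in times]

# e.g. with a 5-day period, t = 12.5 d folds to phase 0.5
phases = fold([0.0, 3.2, 7.1, 12.5, 24.9], period=5.0)
```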

Here it is quite convincing that the model fits the data well. But how do we quantify how good a fit to the data our model is? We use a statistic called $\chi^2$ (read “chi squared”).

$\displaystyle \chi^2 = \sum \frac{(O-C)^2}{\sigma^2}$

That is, for each data point, square the difference between the observed value and the model (or “observed minus calculated”), divide by $\sigma^2$, the variance of the data, and total these up. The higher the $\chi^2$, the poorer the model fits the data. The variance, $\sigma^2$, can be determined with

$\displaystyle \sigma^2 = \frac{1}{N} \sum_{i=1}^{N}(v_i - \bar{v})^2$

which is simply the average of the squared differences of each point, $v_i$, from the average value of the points, $\bar{v}$. To put this in other terms (and this is more a discussion of general statistics than anything specific to radial velocity data), all of the radial velocity measurements can be averaged together. The average of the squared differences of each point from that average is the variance of the data, $\sigma^2$, and is a representation of how scattered the data is. Our $\chi^2$ value is simply the total of each point’s squared difference from our model, divided by the variance to compensate for highly scattered data. Models that are not rigidly affixed to the data points (high total $(O-C)^2$) are given more forgiveness if the data itself has a lot of variance ($\sigma^2$) anyway. If there is low variance, then only models that are close to the data points (low $(O-C)^2$) are given a low $\chi^2$ value.

We can obtain another value called the “reduced chi-squared,” which takes into account the number of freely adjustable variables in the model. This is obtained simply by dividing $\chi^2$ by the number of degrees of freedom, $d$ – the number of data points minus the number of free parameters.

$\displaystyle \chi^2_{red} = \frac{1}{d} \sum \frac{(O-C)^2}{\sigma^2} = \frac{\chi^2}{d}$

The need to do this stems from the fact that a quantitative analysis of data must be undertaken to seriously distinguish between two models whose goodness of fit to the data may not be as easily discriminated between in our “chi-by-eye” technique of just looking at them.

Large reduced chi-squared values indicate a poor model fit, and a value of 1 is typically sought but can be extremely difficult or impossible to achieve due to instrumental or intrinsic noise that are out of one’s ability to control. Values of the reduced chi squared that are less than 1 are typically a sign that you are “over-fitting” the model to the data. This can result from trying too hard to fit the model to noise, or overestimating the variance.
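A sketch of these statistics in Python, following the equations above (note that, as written here, they scale by a single variance for the whole data set rather than by per-point measurement uncertainties, as is common elsewhere):

```python
def variance(values):
    """sigma^2: the mean squared deviation from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def chi_squared(observed, calculated, var):
    """Sum of squared O - C residuals, scaled by the variance."""
    return sum((o - c) ** 2 for o, c in zip(observed, calculated)) / var

def reduced_chi_squared(observed, calculated, var, n_free_params):
    """chi^2 divided by the degrees of freedom, d = N - n_free_params."""
    d = len(observed) - n_free_params
    return chi_squared(observed, calculated, var) / d
```

A model identical to the data gives $\chi^2 = 0$; larger values indicate a poorer fit, as described above.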

HD192263 RV Residuals

Above are the residuals of the HD 192263 data set: the x-axis still shows the time of each measurement, but the y-axis now shows O-C instead of the radial velocity value of each data point. In this example, a trend in the data has been modelled out, which could potentially be due to a second orbiting companion far enough away that only a very small section of its orbit has been observed.

Other than the trend, there is no apparent structure in the residuals, and so there is no reason to suspect the existence of additional orbiting bodies.

One can calculate the standard deviation of the residuals (the square root of their variance) to get a measure of how scattered the residuals are. This is typically represented with $\sigma$, or written as 'rms' in the literature, and is an algebraically obvious transformation of one of the above equations,

$\displaystyle \sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{N}(v_i - \bar{v})^2}$

The radial velocity jitter of a star scales with both its mass and chromospheric activity. A quiet M dwarf may have a jitter as low as 2 to 3 metres per second, whereas a sun-like star may have a jitter on the order of 3 – 5 metres per second. F-type stars have jitters close to 10 metres per second, and the jitter sky-rockets from there, to the point that A-type stars are nearly unusable for detecting extrasolar planets through radial velocity. As of the time of this writing, no planets have ever been discovered around an A-type star through the use of the radial velocity method. So there is an amount of variance in the residuals we should expect, given the star we're studying. If the variance is much higher than expected for the star, then it could be evidence of either unusually high stellar chromospheric activity, or as-yet undetected planets with small amplitude signals.
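The rms scatter defined above is easy to compute directly. A small sketch (the function name is my own):

```python
import numpy as np

def rms_scatter(residuals):
    """Standard deviation (rms) of the O-C residuals about their mean."""
    v = np.asarray(residuals, dtype=float)
    return np.sqrt(np.mean((v - v.mean())**2))

# Residuals scattering by ~3 m/s would be unremarkable for a quiet sun-like
# star; a much larger value hints at activity or additional companions.
```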

The systems we have looked at thus far throughout this blog in the context of radial velocity have been systems whose radial velocity data graphs are dominated by a single periodicity. Indeed, our own solar system is like this, with Jupiter dominating the sun’s radial velocity profile. But not all planetary systems come so conveniently organised.

I leave you with an image of how daunting this can be and why we are all thankful that computers are here to help us: An image of the 2008 three-planet fit for HD 40307 (since discovered to have three more planets).

HD 40307 Three-Planet Fit

Update (15 Apr 2013): I have corrected a math error in the equation for variance and standard deviation. In both cases the $(v_i-\bar{v})$ term should have been squared.

## Dancing in the Dark

Epsilon Eridani, the nearest known planet-hosting star other than the Sun.

The motion of a star around the barycentre of a planetary system is not necessarily confined to detectability through indirect methods only. The barycentric motion of a star in a planetary system can, with sufficient precision, be directly observed. Logically, the larger the star's barycentric semi-major axis, the easier it would be to observe such a motion. However, very long, multi-decade period orbits are harder to observe simply because they progress on timescales comparable to the human lifespan. As such, astrometry is biased toward massive planets in intermediate-period orbits.

Often, the barycentric motion of a star will be tiny. Our Sun, for example, rarely strays more than two solar radii from the Solar System barycentre. Across the extreme distances involved, such displacements are difficult to resolve, leading to large error bars that can swamp out the real orbital motion. For this reason, it is often the case that a large number of measurements are required to build up confidence in the detection of an astrometric signal, much as with the Doppler spectroscopy method.

While the above plot is messy, it makes clear that the stellar orbit is far from face-on. It gives us a rough measurement of the stellar orbit’s inclination.

The advantages of astrometry justify the difficulty in performing it for planetary systems. By directly observing the barycentric motion of a star, it is possible to reconstruct the orbit of the orbiting planet fully in 3D (while distinguishing which node is the ascending node, $\it{\Omega}$, will require a combination of astrometric and radial velocity data). Because astrometry provides the full 3D orbit, the inclination may be measured and thus the true mass of the planet is found, resolving the inclination degeneracy that plagues Doppler spectroscopy.

Because the star is not the only thing in motion, some work is needed before jumping into searching for exoplanet-induced astrometric signals. Firstly, the proper motion (the natural drift of the position of the star in the sky as the result of each star’s independent galactocentric orbit) of the star must be modelled out. Additionally, the orbital motion of Earth around the Sun will cause another signal in the astrometric data that must be modelled out. After this is done, whatever remains must be the intrinsic motion of the star under the influence of other bodies.

To give some idea of comparison, from “above” the solar system, the sun’s astrometric signal would look like this.

The large yellow circle represents the diameter of the Sun, and the black line is the path it takes due to the influence of the planets in our solar system. Jupiter and Saturn dominate the astrometric signal; the other planets aren't detectable in this timespan. Notice that the amplitude of the astrometric variation is comparable to the diameter of the star itself. More specifically, for a star at a distance $D$ with a single orbiting companion, the astrometric amplitude can be estimated as

$\displaystyle \alpha = \arctan{\left( \frac{aq}{D(q+1)} \right)}$

where $q$ is the mass ratio between the planet and the star, $M_p / M_*$, and $a$ is given as

$\displaystyle a = (G(M_* + M_p))^{1/3} (P/2 \pi)^{2/3}$
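Plugging numbers into these two equations makes the scale of the problem clear. The sketch below, with rounded constants and all names my own, estimates the amplitude for a Jupiter analogue orbiting a solar-mass star seen from 10 parsecs:

```python
import math

G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30     # kg
M_JUP = 1.898e27     # kg
PC = 3.086e16        # metres per parsec
YEAR = 3.156e7       # seconds per year

def astrometric_amplitude(m_star, m_planet, period, distance):
    """Angular amplitude (radians) of a star's barycentric wobble."""
    q = m_planet / m_star                                        # planet/star mass ratio
    a = (G * (m_star + m_planet))**(1/3) * (period / (2 * math.pi))**(2/3)
    return math.atan(a * q / (distance * (q + 1)))

alpha = astrometric_amplitude(M_SUN, M_JUP, 11.86 * YEAR, 10 * PC)
micro_arcsec = math.degrees(alpha) * 3.6e9   # radians -> microarcseconds
```

This comes out to roughly 500 µas, which illustrates why astrometric planet detection demands such extraordinary precision.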

The need for precision thus makes itself apparent. So far, very few extrasolar planets have been detected this way. On the positive side, radial velocity candidates whose true masses are significantly higher than their RV-derived minimum masses have larger-than-expected astrometric amplitudes. This makes astrometry quite effective at determining whether exoplanet candidates are actually low-mass or even main-sequence stars. In most cases, astrometry can set an upper limit to the astrometric amplitude, which translates directly into an upper limit for the planet's mass. This can permit even a non-detection to secure the planetary nature of a candidate extrasolar planet.

## Microlensing

The observant eye will notice something strange about the orientations of some of these galaxies

One method for detecting extrasolar planets that does not get a significant amount of attention is the use of gravitational microlensing. The reasons for this may be that the method has not produced a long list of planet detections so far (indeed, as of the time of this writing, only fourteen planets have been found using this method in the nearly two decades of searching for them this way), or perhaps because it’s a fairly complicated topic. The concept of detecting planets through their transits across a star is fairly intuitive, and the idea of using Doppler spectroscopy can be made to make sense after considering it for a time, but gravitational microlensing is far outside the norm of human experience.

Stars are not stationary in space. The galaxy has two or three hundred billion stars swirling around in a disk. So from the perspective of any of them, stars appear to pass close to other stars. When a foreground object passes very close (along our line of sight) to a background light source, the foreground object's gravity may act as a lens, distorting the background image while magnifying it at the same time. This is called a "microlensing event." The background source is split into two separate images with a separation on the order of a milliarcsecond (too small to resolve with our current technology). Since the total area of the images is larger than the actual area of the source, the event results in an apparent brightening of the background source. Photometry of the event will show a symmetrical brightening and dimming of the source (called a "Paczynski curve"), the two images themselves being unresolved due to their extreme apparent proximity.

When an observer, a lens, and a source are well aligned, light from the source will be bent by the lens by a deflection angle expressed as

$\displaystyle \alpha = \frac{4GM}{r_E c^2}$

Where $r_E$ is the Einstein radius. This causes the apparent location of the source to be displaced from its true position by an angle $\theta_E$. In the case where the source passes directly behind the lens, the image of the source is distorted into a ring around the lens. This ring, projected onto the source plane, has a radius $\hat{r}_E$, which is called the "projected Einstein radius." If the alignment is not perfect, the symmetry is broken and instead of a ring, two images appear.

The difference between $\hat{r}_E$ and $\tilde{r}_E$ is that $\tilde{r}_E$ is the Einstein radius projected onto the observer plane, whereas $\hat{r}_E$ is the Einstein radius projected onto the source plane. This minor difference illustrates the relations between the observables ($\theta_E$, $\tilde{r}_E$) and the physical parameters ($M$, $\pi_{rel}$).

The geometry of a lensing event may be visualised as

Lensing geometry, conceptual and mathematical

Under small-angle approximation,

$\displaystyle \frac{\alpha}{\tilde{r}_E} = \frac{\theta_E}{r_E}$

so

$\displaystyle \tilde{r}_E \theta_E = \alpha r_E = \frac{4GM}{c^2}$

And because of the exterior-angle theorem,

$\displaystyle \theta_E = \frac{\tilde{r}_E}{D_l} - \frac{\tilde{r}_E}{D_s}$

Where $D_l$ and $D_s$ are the distances to the lens and the source, respectively. Therefore, the image displacement from the true position, $\theta_E$, satisfies

$\displaystyle \frac{\theta_E}{\tilde{r}_E} = \frac{\pi_{rel}}{AU}$

Where $\pi_{rel}$ is the relative parallax between the lens and the source.

The distance of the two images of the source from the lens is expressed as

$\displaystyle y_{\pm} = \pm\frac{1}{2} \left( \sqrt{u^2 + 4} \pm u \right)$

Where $u = \beta / \theta_E$ and $y = \theta / \theta_E$. The positive, or "major," image is always outside the Einstein ring, whereas the negative, or "minor," image is always inside the Einstein ring. The angular separation between these two images at the time of closest alignment is ~$2\theta_E$. Thus, for typical lens masses (0.1 – 1 $M_{\odot}$) and lens-source distances (1 – 10 kpc), $\theta_E \lesssim$ 1 mas, and so the images are unresolved.
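Multiplying the two relations above together gives $\theta_E^2 = \frac{4GM}{c^2}\frac{\pi_{rel}}{AU}$, which makes it easy to check the sub-milliarcsecond claim numerically. A sketch with rounded constants (the function name is mine):

```python
import math

G = 6.674e-11       # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8         # speed of light, m/s
M_SUN = 1.989e30    # kg
KPC = 3.086e19      # metres per kiloparsec

def einstein_radius(m_lens, d_lens, d_source):
    """Angular Einstein radius (radians): theta_E^2 = (4GM/c^2)(1/D_l - 1/D_s)."""
    return math.sqrt(4 * G * m_lens / C**2 * (1 / d_lens - 1 / d_source))

# a 0.5 M_sun lens at 4 kpc in front of a bulge source at 8 kpc
theta_e = einstein_radius(0.5 * M_SUN, 4 * KPC, 8 * KPC)
milli_arcsec = math.degrees(theta_e) * 3.6e6   # radians -> milliarcseconds
```

For these typical values the Einstein radius comes out well under a milliarcsecond, confirming that the two images cannot be resolved.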

The magnification of the source is dependent on the impact parameter of the event

We know the images are magnified because surface brightness is conserved: the magnification of each image is purely the ratio of the area of the image to the area of the source. The images are typically elongated tangentially by an amount $y_{\pm} /u$, but compressed radially by an amount $dy_{\pm} / du$. The magnification of each image is then expressed as

$\displaystyle A_{\pm} = \left| \frac{y_{\pm}}{u} \frac{dy_{\pm}}{du} \right| = \frac{1}{2} \left[ \frac{u^2+2}{u \sqrt{u^2+4}} \pm 1 \right]$

And their sum, the total magnification of the source, is given as

$\displaystyle A(u) = \frac{u^2+2}{u \sqrt{u^2+4}}$

Now, because the lens, source and observer are all in relative motion, a microlensing event will be time-resolved (which is why microlensing phenomena are called “events”). In the simplest case of uniform, rectilinear motion, the relative lens-source motion can be expressed as

$\displaystyle u(t) = \left[ u_0^2 + \left(\frac{t - t_0}{t_E} \right)^2 \right]^{1/2}$

Where $u_0$ is the minimum separation between the lens and source (in units of $\theta_E$), $t_0$ is the time of maximum magnification (corresponding to when $u = u_0$), and $t_E$ is the time scale to cross the angular Einstein ring radius, $t_E \equiv \theta_E / \mu_{rel}$, where $\mu_{rel}$ is the proper motion of the source relative to the lens. This form for $u$ gives rise to a smooth, symmetric microlensing event which, for events toward the Galactic bulge, tends to last on the order of a month, but can be anywhere from less than a day to years long. The magnification is $A > 1.34$ for $u \leq 1$, and so the events are substantial and easily detectable.
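The last two equations are all that is needed to generate a model Paczynski curve. A minimal sketch (the names are mine, not from any microlensing package):

```python
import numpy as np

def magnification(u):
    """Total magnification A(u) of the two unresolved images."""
    return (u**2 + 2) / (u * np.sqrt(u**2 + 4))

def paczynski_curve(t, t0, u0, t_e):
    """Smooth, symmetric single-lens light curve for rectilinear motion."""
    u = np.sqrt(u0**2 + ((t - t0) / t_e)**2)
    return magnification(u)

t = np.linspace(-60.0, 60.0, 1201)                   # days around the peak
curve = paczynski_curve(t, t0=0.0, u0=0.3, t_e=20.0)
```

Note that `magnification(1.0)` evaluates to about 1.34, matching the detectability threshold quoted above, and the curve is symmetric about $t_0$.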

It was suggested in 1991 (by Mao & Paczynski) that gravitational microlensing could be used to discover planets around the primary foreground lensing star. If the lensing star has a planetary companion that happens to be located near the paths of one or both of the two images created by the primary lens, then the planet's own gravity will contribute to the gravitational lensing and perturb the microlensing event when the image sweeps past the planet. The duration of this deviation is $\sim t_{E,p} = q^{1/2}t_E$, where $q = m_p/M$ is the mass ratio and $m_p$ is the mass of the planet. The magnitude of the perturbation depends on how close the source images pass by the planet. Given a range of microlensing event durations of 10 – 100 days, the perturbations caused by a planet may last from a few hours for an Earth-mass planet to a few days for a Jupiter-mass planet. The location of the perturbation relative to the peak of the primary event depends on the angle of the projected star-planet axis relative to the source trajectory, and on $d$, the instantaneous angular separation between the planet and host star in units of $\theta_E$.
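Those timescales are quick to verify from the $t_{E,p} = q^{1/2}t_E$ scaling. A sketch with rounded constants (the function name and the choice of a 30-day event with a 0.5 $M_{\odot}$ lens are mine):

```python
import math

M_SUN = 1.989e30     # kg
M_JUP = 1.898e27     # kg
M_EARTH = 5.972e24   # kg

def perturbation_days(m_planet, m_star, t_e_days):
    """Planetary deviation timescale t_Ep = sqrt(q) * t_E, with q = m_p / M."""
    return math.sqrt(m_planet / m_star) * t_e_days

# a 30-day event with a typical 0.5 M_sun lens star
t_jupiter = perturbation_days(M_JUP, 0.5 * M_SUN, 30.0)    # about 1.3 days
t_earth = perturbation_days(M_EARTH, 0.5 * M_SUN, 30.0)    # about 1.8 hours
```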

Appearance of a lensing event

Assuming the orientation of the source trajectory is random, and assuming the position of the planet about the star is random, the time of this perturbation is unpredictable. The probability of the planet being detectable is

$\displaystyle A(t_{0,p}) \frac{\theta_{E,p}}{\theta_E}$

Where $A(t_{0,p})$ is the unperturbed magnification of the image that is being perturbed at the time $t_{0,p}$ of the perturbation, and $\theta_{E,p} \equiv q^{1/2}\theta_E$ is the Einstein ring radius of the planet itself. Here, the factor $A$ accounts for the fact that the area of the image plane covered by magnified images is larger by their magnification.

Because the planet must be near one of the two images in order to produce a detectable perturbation, and these images are always located near the Einstein radius when the source is significantly magnified, the sensitivity of the microlensing method reaches its highest for planet-star separations of$r_E$, or$d \approx 1$. Microlensing can also detect planets well outside the Einstein ring ($d \gg 1$), though with less sensitivity. Since the magnification of the minor image decreases with position as $y^4$, microlensing is not generally sensitive to planets $d \ll 1$, i.e., close-in planets.

For a planet with a projected separation within a factor of 2 of $r_E$, the detection probability ranges from tens of percent for Jupiter-mass planets to a few percent for Earth-mass planets.

In rare cases where the lens and source are very well aligned (and are therefore high-magnification events), the two images of the source sweep along almost the entire Einstein ring of the lensing star. In this case, all planets with separations near $r_E$ are detected regardless of their orientation with respect to the source trajectory. These events have nearly 100% sensitivity to planets near the Einstein ring radius, including low-mass planets. But these events are rare.

When the lens is binary, the magnification is not a simple function of the angular separation between the lens and the source, and so the light curve of a lensing event of a background source will not be a smooth, symmetrical curve. The presence of two gravitationally lensing bodies creates an asymmetrical distribution of magnification regions on the plane of the sky, through which the background source, as it passes, will be magnified asymmetrically with time. These jagged regions of very high magnification are called "caustics."

When the source crosses one of these caustics, it is very strongly magnified and this constitutes an extremely powerful tool. For example, despite a distance of 15,000 light years, the source in the microlensing event MOA 2002-BLG-33 was able to be spatially resolved enough that its shape was determined(!) simply because it fortuitously crossed one of these caustics in a tight binary lensing system. This effectively delivered an angular resolution of 0.04 µas(!). As the star crosses a caustic, the caustic gives us a probe into the surface brightness variation across the stellar surface and therefore gives us a probe into the limb darkening of the star. And because the source can be made very bright through the magnification, spectroscopy can get a measurement of the stellar effective temperature and atmospheric abundances when this would not typically be feasible due to the source star’s great distance and hence low brightness. This video has a decent animation showing a source crossing a caustic in a binary lens system.

Single and binary lens magnification regions. Note the binary lens creates high-magnification regions called "caustics"

Another thing to note is that a binary lens always produces a magnification of all the images of $A \geq 3$ when the source is inside the caustic curve.

Binary lenses always produce either three or five images of the source, and will have one, two, or three closed and non-self-intersecting caustic curves. Which of these is the case depends on the mass ratio of the lens, $q \equiv m_1 / m_2$, and on the angular separation of the two lens components in units of the Einstein ring radius of the binary, $d \equiv |z_{m,1} - z_{m,2}|$. A triple lens will produce a maximum of ten images, but no fewer than four, with the number of images at any given time always changing by a multiple of two. This can lead to quite complicated magnification topologies, with nested and/or self-intersecting caustic curves. In general, the maximum number of images a lens of $N_l$ bodies will produce can be expressed as $5(N_l - 1)$ for $N_l \geq 2$.

Mathematically, these caustics are the set of positions for which a point source would have infinite magnification. Caustic curves have multiple concave segments called "folds," which meet at points called "cusps." Because caustic curves are closed, these caustic crossings come in pairs. Since most of the length of the caustic is made of fold caustics, both the caustic entry and exit are usually on folds. Once a fold caustic entry is observed, an exit is guaranteed, and is usually a fold exit.

Even if the mass ratio is extreme, a star+planet system still constitutes a binary lens here, for both bodies contribute to lensing the background source, and the science is effectively the same. Because the magnification strength of the caustics depends on the mass ratio between the two lenses, one can determine the mass of the second lens in the system, whether it be a planet or an additional star, with as much certainty as one can determine the mass of the host lensing star.

The concept of detecting planets through microlensing is highly technical and there are great practical challenges associated with it. Analysis of microlensing data is complicated and time-consuming, and once planets are unambiguously found, little is known about them, and they will certainly never be as well studied as closer planets in the solar neighbourhood. It was originally thought that the only information that can be gathered from microlensing detections would be the planet/star mass ratio and the projected separations of planets. Individual planet detections would be of little interest because the microlensing event gives very little information on the system, and follow-up spectroscopy would be difficult or impossible due to both the great distance of the system, and the fact that it would be guaranteed to be blended with the background star for some years. While experience with actual microlensing events has shown that it is possible to get considerably more information, it’s extremely difficult to extract and requires a large amount of data. The detection method does not bask in the glory that has been brought to the Doppler Spectroscopy method that is well established and successful or the transit method that Kepler is enjoying great success with, and resources like dedicated telescopes are scarce, as is manpower – not many seem interested in the detection method.

The detection of an extrasolar planet

Light curves that arise from binary lenses tend to have an extraordinarily diverse and complex range of phenomenology. Unlike transit light curves, where one might even say "if you've seen one, you've seen them all," light curves from microlensing events take a huge range of shapes, sizes and durations. This complicates the interpretation of observed light curves for multiple reasons. The features of binary and planetary microlensing light curves have no direct relationship with the parameters of the underlying model. Consider transit light curves or radial velocity plots. For both, there's a clear, predictable way in which they should behave, and so you can gain the parameters of the system through the clearly visible behaviour of the plots. The radius comes from the transit depth, the period comes from the frequency of the periodicity, and so on. Microlensing is not nearly as straightforward, so it is difficult to choose even an initial guess for the best-fit parameters of the system. Even when a solution is found, it is difficult to be sure that it's the best, and that some other, better one doesn't exist with considerably different parameters. Lens fits are therefore often degenerate, as a single event can have two decent fits with completely different parameters. For this reason, in some lensing events, it is hard to tell whether a planet or a star is responsible for the perturbed light curve. Also, slight changes in the values of the parameters can lead to severe changes in the resulting light curve, especially the sharp changes associated with the source coming near or crossing the caustic.

While this may be discouraging, it is important to realise that microlensing surveys do have advantages. Microlensing surveys are sensitive to planets in regions of parameter space that are difficult or impossible to probe with current instrumentation, especially low-mass planets beyond the ice line in a planetary system. Detecting an Earth-mass planet at 5 AU around a star of any mass would pose a considerable challenge with current instrumentation, and only microlensing has the capability to feasibly detect such an object. Microlensing is a great tool for planet statistics: being less constrained by planet mass and semi-major axis, it is able to detect planets of very low mass as well as free-floating planets. It can detect (and has detected) multi-planet systems in some cases, and could detect analogues of every planet in our solar system (with the possible exception of Mercury). This permits microlensing surveys to set constraints on the statistics of effectively all planets with a mass greater than or equal to that of Mars. Furthermore, it permits the detection of planets throughout the Galaxy, permitting discussions about how those planet statistics change across the Galaxy once a sufficient quantity of planets has been found. Additionally, it is possible, in principle, to use the method to discover extrasolar planets in nearby galaxies, including the Andromeda Galaxy.

The strategy for finding planets through gravitational microlensing has been the same since it was first proposed in 1992 (by Gould & Loeb), even if it has matured and developed considerably. Finding extrasolar planets this way works in a two-step method. Telescopes observe the sky and discover microlensing events occurring. Next, smaller dedicated observatories are alerted in real time to the existence of the event and focus their attention on gathering high-cadence data. Higher-magnification events are favoured for follow-up because they have the highest chance of revealing the presence of planets around the lensing star. Future microlensing surveys will be more optimised and will combine these steps into one: they will work in a sort of "survey mode," where wide-field telescopes will discover microlensing events and will switch to high-cadence observations throughout the duration of each event.

The uniqueness of the event means one only has one shot at discovering the planet. After the microlensing event is over, the planet may not be detectable again through this method for tens or hundreds of thousands of years, as stars do not often come close enough to background stars to produce detectable lensing. For this reason, microlensing surveys target dense star fields, frequently targeting the galactic disk and centre; the lens is frequently a main-sequence star or stellar remnant in the foreground disk or galactic bulge, whereas the background source is often a main-sequence or giant star, typically in the bulge. The uniqueness of microlensing events for a single target may raise a legitimate question about the security of the science involved in the detection of planets this way. If the microlensing event has a decent amount of data, then the mathematics lead to a clear and unambiguous planet detection.

For a binary (or higher-order) lens, relative orbital motion of the two components may be detectable as deviations from the light curve that would be expected under the usual static binary lens model. This gives you the projected velocity on the plane of the sky, because the motion of the second body around the primary can cause two types of change to the distribution of the caustics (and thus the magnification regions): the angular separation between the two bodies may increase or decrease, resulting in a fundamental change in the caustic structure and hence the magnification pattern; or the perpendicular component of the orbital velocity may be detected, which simply rotates the pattern of the lens on the sky. If one has a planet's projected velocity, and if the mass of the lens is known, and if one assumes a circular orbit for the planet, then a fairly well-constrained orbit may be obtained, including its inclination. For especially well-resolved events, the eccentricity can be estimated. As an outstanding example, the microlensing event OGLE-2006-BLG-109 revealed two planets that appear to be scaled-down Jupiter and Saturn analogues. For the outer planet, the eccentricity was well constrained. For both, an inclination was derived, and the system was found to be coplanar.

The light curve for the microlensing event OGLE-2006-BLG-109

The serious attempt to find planets with microlensing began in 1995, but no convincing planet detections were made until 2001. In this six year period, a relatively small number of microlensing events were reported each year (~50 – 100), but the theory and practice improved and in 2001, the OGLE group upgraded to a new camera with a higher cadence and a sixteen-fold improvement in field of view. This boosted the number of events being reported by an order of magnitude by 2002. The following year, the microlensing technique produced its first planet discovery, OGLE-2003-BLG-235Lb/MOA-2003-BLG-53Lb, which is often abbreviated to OGLE235-MOA53. The MOA group upgraded their telescope to a 1.8 metre, 2 square degree telescope in 2004. By 2007, the OGLE and MOA groups were reporting ~850 events per year.

The future of microlensing surveys is promising, even if the detection rate will probably not compete with radial velocity or the transit method for the foreseeable future. Next-generation ground-based surveys will have large fields of view, making it possible to monitor tens of millions of stars simultaneously with cadences of 10 – 20 minutes. This would produce thousands of detected microlensing events per year. Simultaneous high-cadence measurements of microlensing events, once identified, would permit the unambiguous discovery of Earth-mass planets. This would require telescopes spread across Earth to permit constant sky coverage. If Earth-mass planets with separations of several AU are common in the Galaxy, such a set-up should find several per year. Furthermore, if there exists at least one free-floating planet for every star in the Galaxy, the survey should detect them at a rate of hundreds per year. Already, microlensing surveys have placed constraints on the frequency of free-floating planets, finding them to be almost twice as common as main-sequence stars.

Ideally, microlensing surveys would benefit greatly from a space-based mission. A space-based microlensing mission could place robust constraints on the frequency of $\geq$ Mars-mass planets at separations $a \gtrsim$ 0.5 AU. The biases intrinsic to microlensing are completely unlike the biases of Doppler spectroscopy and transit photometry, and so this would provide a complementary and independent confirmation of such planet statistics.