## Probability of Transit

Transiting Planets. Credit: NASA

Transiting planets are valuable for exploring the properties of planetary atmospheres. Planet searches like Kepler that focus on fixed fields of sky tend to reap rewards amongst dimmer stars, simply because there are many more dim stars in a given patch of the sky than bright ones. Transiting planets around bright stars are of particular value, though, as the increased brightness makes the system easier to study.

Radial velocity surveys tend to monitor brighter stars, since spectroscopy is even more severely limited by stellar brightness than photometry. They are not limited to observing patches of sky, however: telescopes performing Doppler spectroscopy tend to observe a single object at a time due to technical and physical limitations. Radial velocity surveys are also much less sensitive to the inclination angle of a planet's orbit with respect to the plane of the sky. The planet doesn’t have to transit to be spectroscopically detectable. As such, radial velocity surveys tend to generate discoveries of planet candidates with unknown inclinations and true masses, but around much brighter stars than those of planets discovered by the transit method.

Planet candidates discovered by radial velocity, especially those in short orbital periods, are therefore excellent targets for follow-up observations to attempt to detect transits. Transiting planets that were first discovered through radial velocity have been of great scientific interest because their bright host stars make them easy to study. Finding more such systems would be of great benefit to understanding extrasolar planet atmospheres. While only a handful of transiting planets have been discovered first through radial velocity, they all orbit bright stars and are some of the best-characterised planets outside our solar system.

The probability that a planet will transit is, as has been discussed previously, given by
$\displaystyle P_{tr} = \frac{R_*}{a}$
where $R_*$ is the radius of the star and $a$ is the semi-major axis of the planet's orbit. This is the distance between the centre of the star and the centre of the planet. However, due to the inclination degeneracy – the recurring evil villain constantly plaguing radial velocity science – the star-planet separation is not known exactly. Remember that the period of the RV curve gives only the orbital period of the planet. If the orbital period is held constant, increasing the mass of the planet increases the star-planet separation: an increase in the total system mass requires greater separation between the two bodies to preserve the same orbital period.

For example, suppose radial velocity observations of a star reveal the presence of an $m_p \sin i = 1\,M_\oplus$ planet candidate, but the inclination is actually so low that the true mass of the companion is in the stellar regime. The mutual gravitational attraction between two stars is much greater than that between a star and an Earth-mass planet at the same separation. The two stars must therefore have a wider separation, otherwise their orbital period would be shorter than the one observed.

Mathematically, the true semi-major axis is given by
$\displaystyle a = \left(\frac{G[M_*+M_{\text{pl}}(i)]}{4\pi^2}\right)^{1/3}T^{2/3}$
where $G$ is the gravitational constant, $M_{\text{pl}}(i)$ is the mass of the planet at a given inclination $i$, and $T$ is the period of the system. It is worth noting that the true semi-major axis is not significantly different from the minimum semi-major axis as long as the mass of the star is much greater than the mass of the planet, which is typically the case.
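To put numbers on this, here is a minimal sketch of the two formulas above. The constants are approximate standard SI values rather than anything from this post, the function names are made up for illustration, and the orbit is assumed circular. It compares a one-year companion that really is an Earth-mass planet with one that is secretly a Sun-like star.

```python
import math

# Approximate physical constants in SI units (standard values, not from the post).
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30     # solar mass, kg
M_EARTH = 5.972e24   # Earth mass, kg
R_SUN = 6.957e8      # solar radius, m
AU = 1.496e11        # astronomical unit, m
DAY = 86400.0        # seconds per day


def semi_major_axis(m_star, m_planet, period):
    """Kepler's third law: a = (G (M_* + M_pl) / (4 pi^2))^(1/3) * T^(2/3)."""
    return (G * (m_star + m_planet) / (4.0 * math.pi ** 2)) ** (1.0 / 3.0) * period ** (2.0 / 3.0)


def geometric_transit_probability(r_star, a):
    """Geometric transit probability P_tr = R_* / a (circular orbit assumed)."""
    return r_star / a


period = 365.25 * DAY
a_planet = semi_major_axis(M_SUN, M_EARTH, period)  # companion is a true Earth-mass planet
a_star = semi_major_axis(M_SUN, M_SUN, period)      # companion is secretly a Sun-like star

print(f"Earth-mass companion: a = {a_planet / AU:.3f} AU, "
      f"P_tr = {geometric_transit_probability(R_SUN, a_planet):.4%}")
print(f"Solar-mass companion: a = {a_star / AU:.3f} AU, "
      f"P_tr = {geometric_transit_probability(R_SUN, a_star):.4%}")
```

Doubling the total mass at fixed period widens the orbit by a factor of $2^{1/3} \approx 1.26$, and the geometric transit probability drops by the same factor.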

The fact that the true semi-major axis is a function of the unknown inclination makes for an interesting clarification: the probability that a planet of unknown inclination will transit is not simply given by $R_*/a$, but is only approximated by it. If we assume that the distribution of planet masses is uniform (and extends into the brown dwarf mass regime), then you would expect a candidate with a minimum mass equal to Earth's to have a much greater chance of being a bona fide planet than a candidate with a minimum mass of 10 $M_J$, simply because there is a greater range of inclinations the former can have while still remaining in the planetary mass regime. Taking this a step further, even if the Earth-mass candidate and the 10 Jupiter-mass candidate have the same orbital period, the probability that the latter transits ends up being lower, simply because of its high mass. Since its inclination is unknown, there is a much higher probability that its mass is large enough for the true semi-major axis to be noticeably larger than the minimum semi-major axis, resulting in a likely lower transit probability.

Except it turns out that the mass distribution of planets and brown dwarfs isn’t uniform. Earth-sized planets are significantly more common than Jupiter-sized planets, and super-Jupiters appear rare. It isn’t clear yet what the mass distribution of planets actually is, with significant uncertainty in the sub-Neptune regime. It is clear, though, that for a highly accurate estimate of the transit probability, the inclination distribution cannot be treated as completely random, because it is fundamentally tied to the planet mass distribution.

Planet Mass Distribution given by Ida & Lin (Left) and Mordasini (Right)

Consider the case of a super-Jovian planet candidate, perhaps with a minimum mass of 7 or 8 Jupiter-masses. Because a significant fraction of physically allowable inclinations would place the planet's true mass in a regime that is in reality sparsely populated, it is less likely that the planet candidate’s orbit has one of those inclinations. It is thus more likely that the orbit is edge-on than would be expected from the probability function of randomly oriented orbits. As such, the transit probability of a super-Jovian planet is actually boosted by ~20 – 50% over what you would expect from $P_{tr} = R_*/a$. If this is the case, then we would expect to find a larger fraction of transiting planets in this mass regime than would be expected purely from the standard transit probability function. Indeed this is what we see.

Candidate planets with minimum masses in the terrestrial planet regime are similarly affected. Their transit probabilities are boosted because terrestrial planets are more common than higher-mass planets, which argues in favour of a higher inclination than the random inclination distribution function would suggest.

On the other hand, planet or brown dwarf candidates with minimum masses in the most sparsely populated region of the mass distribution are unlikely to truly have that mass. They are quite likely in orbits with low inclinations and have much higher true masses. The transit probability for companion candidates with minimum masses in this regime is actually reduced relative to the standard transit probability function.
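The direction of these effects can be reproduced with a toy calculation. The sketch below assumes a single truncated power law for the true-mass distribution, $dN/dM \propto M^{\alpha}$ up to a cap `m_max`. That is a stand-in chosen for simplicity, not either of the distributions shown in the figure above. With an isotropic prior on orientation, the posterior density in inclination then works out proportional to $\sin^{-\alpha} i$. The sketch also holds $R_*/a$ fixed, ignoring the small dependence of $a$ on the companion mass, and requires `m_min < m_max`. All names are hypothetical.

```python
import math


def _trapezoid(f, lo, hi, n=20000):
    """Composite trapezoid rule for f on [lo, hi]."""
    if hi <= lo:
        return 0.0
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for k in range(1, n):
        total += f(lo + k * h)
    return total * h


def posterior_transit_probability(rstar_over_a, m_min, m_max, alpha):
    """P(transit | m sin i) for an assumed mass prior dN/dM ~ M**alpha, M <= m_max."""
    # Mass cap: true mass m_min / sin(i) <= m_max, so sin(i) >= m_min / m_max.
    i_lo = math.asin(min(1.0, m_min / m_max))
    # A transit needs cos(i) <= R_*/a, with i = 90 degrees meaning edge-on.
    i_transit = max(i_lo, math.acos(min(1.0, rstar_over_a)))

    def weight(i):
        # Isotropic prior (sin i) times the mass prior evaluated at m_min / sin(i),
        # including the 1/sin(i) Jacobian; the sin(i) factors cancel.
        return math.sin(i) ** (-alpha)

    half_pi = math.pi / 2.0
    return _trapezoid(weight, i_transit, half_pi) / _trapezoid(weight, i_lo, half_pi)


geometric = 0.1  # R_*/a for a hypothetical short-period candidate
log_flat = posterior_transit_probability(geometric, 1.0, 1e12, alpha=-1.0)
steep = posterior_transit_probability(geometric, 1.0, 1e12, alpha=-2.0)
print(f"alpha = -1: {log_flat:.4f}   alpha = -2: {steep:.4f}")
```

In this toy model a distribution that is flat in log-mass ($\alpha = -1$) simply returns the geometric value. A steeper distribution, where higher true masses are rarer, pushes the probability above $R_*/a$.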

Geometric and a posteriori transit probabilities

In the table above, taken from this preprint, we see that the geometric transit probability, $P_{tr,0}$, can be much less than the a posteriori transit probability, $P_{tr}$. The transit probability for 55 Cnc e, for example, jumps from 28% to 36%. With these higher a posteriori transit probabilities, these short-period low-mass planets should be followed up for transits. If transits are found, it would be of significant benefit to the extrasolar planet field.

In summary, there are various additional effects that can cause the a posteriori transit probability to be significantly different from the geometric transit probability. Planets with only minimum masses known can be more accurately assigned a transit probability when taking into account the uneven planetary mass distribution. Low-mass planets and super-Jupiters are more likely to transit than their geometric transit probability because a significant range of the inclination space is consumed by planets of masses that are simply rare. These planet candidates are more promising targets for transit follow-up than, for example, Jupiter-mass planets or intermediate-mass brown dwarfs.

## The Real Ones

In the last post, we looked at recovering a periodic signal from a radial velocity plot and interpreting it as a planet. Now let’s look at a few of the complications involved in this.

A powerful statistical tool for getting an idea of what kind of periodic signals are in your radial velocity data set is the Lomb–Scargle periodogram (the mathematical details for the interested reader may be found here, but they are complex enough to warrant skipping over in the interests of maintaining reader attention and reasonable post length). In the interests of brevity, further references to the Lomb–Scargle periodogram will be shortened to simply “periodogram.”

The purpose of this periodogram is to give an indication of how likely an arbitrary periodicity is in a data set whose data points need not be equally spaced (as is frequently the case in astronomy for a variety of reasons). Periodicities that are strongly represented in the data are assigned a higher “power,” whereas periodicities that are not present or only weakly present are given a lower power.
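For readers who would rather see it than read the maths, here is a minimal pure-Python sketch of the classical Lomb–Scargle power, normalised by the data variance. The data are synthetic: 60 randomly timed points carrying a 7.3-day sinusoid plus noise, all made up for illustration. A real analysis would more likely reach for an established library implementation than this loop.

```python
import math
import random


def lomb_scargle(t, y, periods):
    """Classical Lomb-Scargle power at each trial period, normalised by the variance."""
    n = len(t)
    mean = sum(y) / n
    dy = [yi - mean for yi in y]
    var = sum(d * d for d in dy) / (n - 1)
    powers = []
    for p in periods:
        w = 2.0 * math.pi / p
        # The offset tau makes the sine and cosine terms orthogonal for this sampling.
        tau = math.atan2(sum(math.sin(2.0 * w * ti) for ti in t),
                         sum(math.cos(2.0 * w * ti) for ti in t)) / (2.0 * w)
        c = [math.cos(w * (ti - tau)) for ti in t]
        s = [math.sin(w * (ti - tau)) for ti in t]
        yc = sum(d * ci for d, ci in zip(dy, c))
        ys = sum(d * si for d, si in zip(dy, s))
        cc = sum(ci * ci for ci in c)
        ss = sum(si * si for si in s)
        powers.append((yc * yc / cc + ys * ys / ss) / (2.0 * var))
    return powers


# Synthetic, unevenly sampled data (made up): a 7.3-day signal plus Gaussian noise.
rng = random.Random(0)
times = sorted(rng.uniform(0.0, 100.0) for _ in range(60))
rv = [10.0 * math.sin(2.0 * math.pi * ti / 7.3) + rng.gauss(0.0, 1.0) for ti in times]

trial_periods = [2.0 + 0.05 * k for k in range(361)]  # 2 to 20 days
power = lomb_scargle(times, rv, trial_periods)
best_period = trial_periods[power.index(max(power))]
print(f"Strongest periodicity near {best_period:.2f} days")
```

Note that nothing here requires the observation times to be evenly spaced, which is exactly why this tool suits astronomical data.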

Let’s look at an example using a radial velocity data set for BD-08 2823 (source). If we calculate a periodogram for the data set, we come up with this:

BD-08 2823 RV Data Periodogram

The dashed line represents a 0.1% false alarm probability (FAP). Clear, obvious peaks are seen at 1 day, 230 days and ~700 days, implying that periodicities of 1 day, 230 days, and ~700 days are present in the data. Creating a one-planet model with a Saturn-mass planet at 238 days produces a nice fit. After subtracting this signal from the data, we’re left with the residuals. Now we can compute a periodogram of the residuals and see what’s left in the data.

BD-08 2823 Periodogram of Residuals

We see three noteworthy things. First and foremost is the emergence of a new peak in the periodogram that was not strongly present before at 5.6 days. We also see that the peak at 1 day remains. Lastly we see that the peak toward 700 days has weakened and moved further out. It would seem to suggest the 700-day signal is perhaps not real, or was an artifact of the 238-day signal.

Why was the 5.6-day signal not present in the first periodogram? The answer may lie in its mass: the planet has a mere 14 Earth-masses. Its RV signal is completely dominated by that of the Saturn-mass planet. The giant planet forces the shape of the RV diagram and the signal of the second planet is just dragged along, superimposed on the larger signal.
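The fit-and-subtract step can be illustrated with a toy version. The sketch below assumes circular orbits, so each planet contributes a pure sinusoid (real fits use full Keplerian curves). It uses made-up amplitudes of 20 and 3 m/s at 238 and 5.6 days, and it fits only the large signal by linear least squares at a fixed period. The helper names are hypothetical.

```python
import math
import random


def solve_linear(a, b):
    """Solve a small system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def fit_sinusoid(t, y, period):
    """Least-squares model a*sin(wt) + b*cos(wt) + c at a fixed period."""
    w = 2.0 * math.pi / period
    basis = [[math.sin(w * ti), math.cos(w * ti), 1.0] for ti in t]
    # Normal equations: (B^T B) x = B^T y
    ata = [[sum(r[j] * r[k] for r in basis) for k in range(3)] for j in range(3)]
    aty = [sum(r[j] * yi for r, yi in zip(basis, y)) for j in range(3)]
    coeffs = solve_linear(ata, aty)
    return [sum(c * b for c, b in zip(coeffs, r)) for r in basis]


def rms(values):
    return math.sqrt(sum(v * v for v in values) / len(values))


# Made-up two-planet data: a large 238-day signal hiding a small 5.6-day one.
rng = random.Random(1)
times = sorted(rng.uniform(0.0, 500.0) for _ in range(80))
big = [20.0 * math.sin(2.0 * math.pi * ti / 238.0) for ti in times]
small = [3.0 * math.sin(2.0 * math.pi * ti / 5.6 + 1.0) for ti in times]
rv = [b + s for b, s in zip(big, small)]

model = fit_sinusoid(times, rv, 238.0)
residuals = [v - m for v, m in zip(rv, model)]
print(f"rms before: {rms(rv):.2f} m/s, after subtracting the 238-day fit: {rms(residuals):.2f} m/s")
```

Once the dominant sinusoid is removed, what remains is on the scale of the small signal, which is why a periodogram of the residuals can show a peak that the original one buried.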

On the radial velocity data plot, the two-planet fit we have come to looks like this:

BD-08 2823 Two-Planet Fit

It is important to realise that the obvious sine curve is not simply drawn with a bold line: there is a second periodicity in there going up and down frantically, once every 5.6 days, superimposed on the 238-day signal of the Saturn-mass planet.

The fit has a reduced chi-squared of $\chi^2 = 3.2$, and a scatter of $\sigma_{O-C} = 4.3$ m s$^{-1}$. There’s no obvious structure to the residuals and the scatter is not terribly bad, so any new signals will likely indicate planets of low mass. Let’s check in on the periodogram of the residuals to the two-planet fit and see what may be left in the data.
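For reference, both statistics are easy to compute once you have the observed velocities, the model velocities and the per-point uncertainties. The sketch below uses one common set of definitions: chi-squared divided by the degrees of freedom, and the plain root-mean-square of observed minus calculated. The numbers are invented purely to show the arithmetic; they are not the BD-08 2823 data.

```python
import math


def reduced_chi_squared(observed, model, sigma, n_params):
    """Chi-squared per degree of freedom for a model with n_params fitted parameters."""
    dof = len(observed) - n_params
    chi2 = sum(((o - m) / s) ** 2 for o, m, s in zip(observed, model, sigma))
    return chi2 / dof


def rms_scatter(observed, model):
    """Root-mean-square of the residuals (observed minus calculated)."""
    residuals = [o - m for o, m in zip(observed, model)]
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))


# Invented velocities in m/s, each with an uncertainty of 0.1 m/s.
observed = [1.0, 2.0, 3.0, 4.0]
model = [1.1, 1.9, 3.2, 3.8]
sigma = [0.1, 0.1, 0.1, 0.1]

print(f"reduced chi-squared = {reduced_chi_squared(observed, model, sigma, n_params=2):.2f}")
print(f"O-C scatter = {rms_scatter(observed, model):.3f} m/s")
```

A reduced chi-squared well above 1 suggests that the model is missing something or that the error bars are underestimated.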

Periodogram of Residuals to 2-Planet Fit

That signal out toward a thousand days is stubbornly refusing to go away, despite a low $\chi^2$. It may either not be real, or it may be indicative of a low-amplitude signal with a rather long period.

Also noteworthy is that the periodicity at one day continues to exist, rather strongly. This periodicity is what’s known as an alias. Because the telescope observes only at night, the observations are roughly evenly spaced: the gaps between data points are very nearly a whole number of days. Therefore a sine curve with a period of 24 hours can be made to fit the data. To illustrate this, consider this (completely made up) data set:

Fake Signal

There’s no doubt that the data is well-fitted by the sine curve, but there is no real evidence that the periodicity proposed by it arises from a real, physical origin. What’s more, a sine curve with half this period could also equally well fit the data. So could a sine curve with a third of this period, and so on. There are mathematically an infinite number of aliases at ever-shortening periods that can be fit to this data.

Generally, if you observe a system with a sampling frequency of $f_o$, and there exists a true signal with a frequency of $f_t$, then aliases will exist at frequencies $f_t + i f_o$, where $i$ is an integer.
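A quick numerical check of this relation, assuming idealised sampling exactly once per night ($f_o = 1$ per day) and borrowing a 5.6-day period purely as an illustrative true signal. The function name is made up.

```python
import math


def alias_periods(true_period, sampling_period, orders):
    """Periods of the aliases at frequencies f_t + i * f_o for each integer order i."""
    f_t = 1.0 / true_period
    f_o = 1.0 / sampling_period
    result = {}
    for i in orders:
        f_alias = f_t + i * f_o
        if f_alias != 0.0:
            # A negative frequency is the same sinusoid with a flipped phase.
            result[i] = 1.0 / abs(f_alias)
    return result


aliases = alias_periods(5.6, 1.0, [-2, -1, 1, 2])
for order, period in sorted(aliases.items()):
    print(f"i = {order:+d}: alias period = {period:.4f} days")

# At the sampled times the true signal and its i = +1 alias are indistinguishable.
nights = range(60)
true_curve = [math.sin(2.0 * math.pi * (1.0 / 5.6) * t) for t in nights]
alias_curve = [math.sin(2.0 * math.pi * (1.0 / 5.6 + 1.0) * t) for t in nights]
max_diff = max(abs(a - b) for a, b in zip(true_curve, alias_curve))
print(f"Largest difference at the sampled times: {max_diff:.2e}")
```

Real observing times are never this regular, which is part of why real aliases show up as peaks of varying height rather than perfect copies.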

Therefore we see that these aliases are caused by the sampling rate. If we could get data between the data points already available, if we could double our observation frequency, we could break this degeneracy. But the problem for telescopes on Earth is that the star is not actually up in the sky more than half the day, and a given portion of the time it is up could be during daylight hours. Therefore the radial velocity data sets of most stars can be plagued with short-period aliases since there is typically a small window of a few hours to observe any given star. It must be noted that as the seasons change and the stars are in different places in the sky at night, that window of availability will shift around a bit, allowing one some leverage in breaking these degeneracies. Ultimately, telescopes in multiple locations around the world (or one in space) would sufficiently break these degeneracies.

A real example of aliasing can be seen in an Alpha Arietis data set. In this case, the alias is not nearly so straightforward. Two signals with periods of 0.445 days and 0.571 days can be modelled to fit the data.

Alpha Arietis RV Alias

So which of these two signals corresponds to an actual planet? It turns out neither of them does: these radial velocity variations are caused by pulsations of the star. The contraction and expansion of the star produce Keplerian-like signals in radial velocity data, too. That’s yet another thing to watch out for. This can be detected with simultaneous photometry of the star. If there is a photometric periodicity that is equivalent to your radial velocity periodicity, avoid claiming a planet at this period as if your academic credibility depends on it.

Additional observations could easily break the degeneracy between the two candidate periods, provided they are planned for times when the two model curves do not overlap.

We see therefore that it is important to keep in mind that a low FAP speaks only to whether or not the signal is really present in the data, and not to where or what it actually came from. The one-day periodicity is surely present in the data, but it is not of physical origin. It can also be extremely hard to tell whether or not a signal at a given period is actually an alias of another, true period. There are times when the peak of an alias in the periodogram can be higher than the peak of the real period. For reasons that include these, radial velocity fits must be considered fairly preliminary. New data may bring drastic revisions to the orbital periods of proposed planets if signals are exposed as aliases.

Confusion over aliases has occurred before in the literature. HD 156668 b and 55 Cnc e have both had their orbital periods considerably revised after it was realised that their published periods were, in fact, aliases. In the case of 55 Cnc e, the new, de-aliased orbital period ended up being vindicated after transits were detected. The GJ 581 data set is also severely limited by sampling aliases, which have spawned controversies over the possible existence of additional planets in that system.

In summary, periodograms are a useful tool to provide the user with a starting point when fitting Keplerian signals to radial velocity data, but they cannot distinguish real signals from aliases. Many observations with a diverse sampling rate are necessary to disentangle aliases from true planetary signals. Ultimately, a cautious approach to fitting signals to radial velocity data works best.