The Basics of Special Relativity
Special relativity is built upon two postulates:
- The speed of light in a vacuum is the same in all inertial frames
- The laws of physics are the same in all inertial frames
Definition: Inertial frames are coordinate systems travelling at constant velocities relative to each other. An observer’s rest frame is a coordinate system in which the observer isn’t moving.
The same cannot be said for accelerating frames: there, Newton’s second law F=ma holds only if a fictitious force is added explicitly to every physical law expressed in that frame, so the laws no longer take the same form. Such frames are therefore excluded from our formulation of the theory.
Despite what online sources like to depict, Einstein’s genius didn’t simply stem from baseless imagination. He, too, stood on the shoulders of giants, who laid the foundation for a conclusion that, in hindsight, might not seem so bewildering after all.
His ideas were grounded in the theory of electromagnetism developed in the 1800s, which was revolutionised by James Clerk Maxwell through his famous equations describing the fundamental interactions between electric (E) and magnetic (B) fields.
In a vacuum (without any currents or charges) they take the form
∇⋅E=0
∇⋅B=0
∇×B=μ₀ε₀∂ₜE
∇×E=-∂ₜB
where E=(E₁, E₂, E₃) and B=(B₁, B₂, B₃) are vectors (in a Cartesian coordinate system) representing the electric and magnetic fields, respectively.
For those unfamiliar with vector calculus, the divergence and curl of a vector v=(v₁, v₂, v₃) are defined as
∇⋅v=∂₁v₁+∂₂v₂+∂₃v₃
∇×v=(∂₂v₃-∂₃v₂, ∂₃v₁-∂₁v₃, ∂₁v₂-∂₂v₁)
where the subscripts 1, 2, 3 on ∂ denote differentiation with respect to the Cartesian coordinates x, y and z. (The first expression is a scalar; the second is a vector.)
Taking the curl of the last two Maxwell equations and using the identity ∇×(∇×v)=∇(∇⋅v)-∇²v, it can be readily verified that
∇×(∇×B)=∇(∇⋅B)-∇²B=∇×(μ₀ε₀∂ₜE)=-μ₀ε₀(∂ₜ)²B
∇×(∇×E)=∇(∇⋅E)-∇²E=∇×(-∂ₜB)=-μ₀ε₀(∂ₜ)²E
The equations are beautifully symmetrical. Inspecting the second of them and using ∇⋅E=0, we see that
∇×(∇×E)=-∇²E=-μ₀ε₀(∂ₜ)²E
admits solutions of the form E=E₀ cos[w((1/√(μ₀ε₀))t±x)] (the reader may easily check), where the constant amplitude E₀ is perpendicular to the x axis so that ∇⋅E=0 still holds. This describes an electric field wave travelling in the negative/positive x direction with speed v=1/√(μ₀ε₀), since adding to or subtracting from the argument of f(x)=cos(x) moves the cosine curve in the negative or positive x direction, respectively. (Of course, the wave may travel in any direction, and sine solutions are also allowed.)
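The claim that this cosine profile solves the wave equation can also be checked numerically. The sketch below is only an illustration: it follows a single field component, works in arbitrary units where the wave speed `C` and the constant `W` are made-up values, and estimates derivatives with central finite differences. All function names are ours.

```python
import math

C = 2.0  # wave speed 1/sqrt(mu0*eps0) in arbitrary units (assumed value)
W = 3.0  # the constant w inside the cosine (assumed value)


def field(x: float, t: float, sign: float = -1.0) -> float:
    """One component of E = cos[w(c t +/- x)] with unit amplitude."""
    return math.cos(W * (C * t + sign * x))


def second_derivative(f, s: float, h: float = 1e-4) -> float:
    """Central finite-difference estimate of f''(s)."""
    return (f(s + h) - 2.0 * f(s) + f(s - h)) / h**2


def wave_residual(x: float, t: float, sign: float = -1.0) -> float:
    """d2E/dx2 - (1/c^2) d2E/dt2, which should be close to zero."""
    d2x = second_derivative(lambda s: field(s, t, sign), x)
    d2t = second_derivative(lambda s: field(x, s, sign), t)
    return d2x - d2t / C**2


# Check both travel directions at an arbitrary point (x, t) = (0.7, 0.3).
for sign in (-1.0, +1.0):
    print(sign, wave_residual(0.7, 0.3, sign))
```

Both residuals come out tiny compared with the individual second derivatives, which is what one expects if the profile solves the wave equation up to discretisation error.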
Now both μ₀ and ε₀ are experimentally determined constants, but in a miraculous “coincidence” they give v=c≈299,792,458 m/s, the experimentally determined speed of light in a vacuum!
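The "coincidence" is easy to reproduce. The snippet below is a minimal sketch that plugs in the CODATA 2018 recommended values of the two constants (quoted here as an assumption, since the article gives no numbers) and evaluates 1/√(μ₀ε₀).

```python
import math

# CODATA 2018 recommended values in SI units (assumed inputs, not from the article)
MU_0 = 1.25663706212e-6       # vacuum permeability, N/A^2
EPSILON_0 = 8.8541878128e-12  # vacuum permittivity, F/m


def wave_speed(mu_0: float, epsilon_0: float) -> float:
    """Speed v = 1/sqrt(mu_0 * epsilon_0) of the electromagnetic wave."""
    return 1.0 / math.sqrt(mu_0 * epsilon_0)


v = wave_speed(MU_0, EPSILON_0)
print(f"v = {v:.6e} m/s")
```

The printed value agrees with 299,792,458 m/s to within the precision of the input constants.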
It was then established that light is in fact an electromagnetic wave. Because Maxwell’s equations hold regardless of the inertial coordinate system, its speed is c in all inertial frames.
This conclusion was tested by the Michelson-Morley experiment. If the speed of light weren’t the same in every frame, there would be a unique coordinate system in which light travels with the same speed in all directions. If Earth had a velocity relative to this unique coordinate system, light would “seem faster/slower” in the observer’s rest frame depending on the direction in which it was emitted.
Think of balls rolling along the ground, all at the same speed relative to the ground, watched from a moving train. From your point of view (the train’s rest frame), a ball rolling along the train’s direction of travel appears slowest, and one rolling in the opposite direction appears fastest.
The experiment showed no difference in the speed of light between different directions.
What does all of this entail?
First of all, it leads to a non-obvious way of switching between inertial frames. Consider a frame S with Cartesian coordinates x, y, z and a time coordinate t. A light source at the origin emits a flash at (x, y, z, t)=(0, 0, 0, 0), and the expanding light sphere is described by the equation
x²+y²+z²=c²t²
(a sphere about the origin with radius equal to the distance traversed by light in time t)
Consider another inertial frame S’, with coordinates (x’, y’, z’, t’), moving with speed v in the x direction with respect to S, with all its coordinate axes parallel to those of S. The origins of the two frames coincide at (0, 0, 0, 0).
Due to light speed invariance, an observer stationary in S’ sees the exact same light sphere as a stationary observer in S!
(x')²+(y')²+(z')²=c²(t')²
Now, our problem boils down to finding the transformation x→x’, y→y’, z→z’, t→t’ that maps the light sphere in S to the one in S’, allowing for a possible overall scale factor k². Or in symbols,
k²(x²+y²+z²-c²t²)=(x')²+(y')²+(z')²-c²(t')²=0
Before one dives into the tedious task of substituting into equations, some simplifying assumptions must be made.
- Motion of S’ in the x’ direction relative to S (a boost in the x direction) should not affect motion in directions orthogonal to x’, which means that ky=y' and kz=z'
- Motion parallel to the x axis in S should remain parallel to the x’ axis in S’, meaning that there should be no y, z components in the expression for x’.
- As c is a constant, t cannot be assumed equal to t’: even elapsed time is observer-dependent!
The remaining dependence is on x and t only, and it must be linear so that uniform motion in S remains uniform motion in S’. We may therefore formally write
x'=k(a₁x+a₂t)
t'=k(b₁t+b₂x)
- (Important) If one transforms from S to S’ and then from S’ back to S again, it amounts to multiplying y or z by k(v)k(-v), which should be 1. But k cannot depend on the sign of v (it could depend on the magnitude but not the sign, since space doesn’t have “memory” to know which direction is “forwards” or “backwards”), so k(v)²=1.
Thus, k(v)=1 (k=-1 would be an inversion, which we don’t want)
- In the limit v→0, the transformations must reduce to “no transformation at all”
lim(v→0)(a₁, a₂, b₁, b₂)=(1, 0, 1, 0)
- Consider one specific point, the origin of S’, for which x'=0. An observer in S sees that origin moving in the x direction with velocity v, so x=vt there, and it must hold that
0=a₁(vt)+a₂t
This gives a₂=-a₁v
Finally we get to solve the equations. Substituting
x'=a₁x-a₁vt
t'=b₁t+b₂x
(together with y'=y and z'=z) into x²+y²+z²-c²t²=(x')²+(y')²+(z')²-c²(t')² gives
z²+y²+(a₁x-a₁vt)²-c²(b₁t+b₂x)²=x²+y²+z²-c²t²
a₁²x²+a₁²v²t²-2a₁²vxt-c²b₁²t²-c²b₂²x²-2c²b₁b₂xt=x²-c²t²
This equation is true regardless of the values of x or t, so the coefficients of x², t² and xt on both sides must be equal.
a₁²-c²b₂²=1 (1)
a₁²v²-c²b₁²=-c² (2)
-a₁²v=c²b₁b₂ (3)
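(1) gives b₂=-√(a₁²-1)/c while (2) gives b₁=√(v²a₁²+c²)/c. (We take b₁ positive so that b₁→1 as v→0; equation (3) then forces b₂ to be negative for v>0.)
Substituting into (3) yields a₁²v=c²(√(a₁²-1)/c)(√(v²a₁²+c²)/c)
Squaring both sides, a₁⁴v²=(a₁²-1)(v²a₁²+c²)
Expanding the right-hand side leaves a₁²(c²-v²)=c², so
a₁=1/√(1-(v²/c²))
(taking the positive root so that a₁→1 as v→0)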
Plugging this into our expressions for b₁ and b₂ immediately gives
b₁=1/√(1-(v²/c²))
b₂=-(v/c²)a₁
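As a sanity check, the solved coefficients can be substituted back into the three coefficient equations (1), (2) and (3). The sketch below does this numerically in units where c=1 (an assumption made only to keep the floating-point numbers well scaled); the function names are ours.

```python
import math


def coefficients(v: float, c: float = 1.0):
    """Return (a1, b1, b2) as derived in the text for a boost with speed v."""
    a1 = 1.0 / math.sqrt(1.0 - v**2 / c**2)
    b1 = a1
    b2 = -(v / c**2) * a1
    return a1, b1, b2


def residuals(v: float, c: float = 1.0):
    """Left side minus right side of equations (1), (2) and (3)."""
    a1, b1, b2 = coefficients(v, c)
    eq1 = a1**2 - c**2 * b2**2 - 1.0
    eq2 = a1**2 * v**2 - c**2 * b1**2 + c**2
    eq3 = -(a1**2) * v - c**2 * b1 * b2
    return eq1, eq2, eq3


for v in (0.1, 0.6, 0.9):
    print(v, residuals(v))
```

Each residual should vanish up to rounding error for any |v|<c.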
Altogether, we obtain the Lorentz transformations
x'=𝛽(x-vt)
t'=𝛽(t-(v/c²)x)
y'=y
z'=z
where we define 𝛽=1/√(1-(v²/c²)). (This factor is more commonly written γ, with β reserved for the ratio v/c.)
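To make the transformation concrete, here is a small sketch of a boost along x, together with a check that it maps a point on the light sphere in S to a point on the light sphere in S’. The function names and the event tuple layout (x, y, z, t) are our own choices, and the example uses units where c=1.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def lorentz_factor(v: float, c: float = C) -> float:
    """The factor 1/sqrt(1 - v^2/c^2) appearing in the transformations."""
    if abs(v) >= c:
        raise ValueError("the boost speed must satisfy |v| < c")
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)


def boost_x(event, v: float, c: float = C):
    """Transform an event (x, y, z, t) from S to S', which moves at v along x."""
    x, y, z, t = event
    factor = lorentz_factor(v, c)
    return (factor * (x - v * t), y, z, factor * (t - v * x / c**2))


def interval(event, c: float = C) -> float:
    """x^2 + y^2 + z^2 - c^2 t^2, which is zero on the light sphere."""
    x, y, z, t = event
    return x * x + y * y + z * z - (c * t) ** 2


# A point on the light sphere in S (units with c = 1): 3^2 + 4^2 = 5^2.
event = (3.0, 4.0, 0.0, 5.0)
boosted = boost_x(event, 0.6, c=1.0)
print(boosted, interval(event, c=1.0), interval(boosted, c=1.0))
```

Boosting the result back with `boost_x(boosted, -0.6, c=1.0)` should recover the original event, mirroring the round-trip argument used to fix k.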
Two simple but important results follow: Length contraction and time dilation.
(For convenience, we use the same frames S and S’ defined earlier)
- Time dilation
A clock fixed at x'=x₀’ in S’ records two successive events separated by time T, meaning that the events have coordinates (x₀', 0, 0, t₀') and (x₀', 0, 0, t₀'+T)
Applying the Lorentz transformations, these events have time coordinates in S
t₁=𝛽(t₀'+(v/c²)x₀')
t₂=𝛽(t₀'+T+(v/c²)x₀')
(transforming from S’ to S requires a boost in the negative x direction, so we replace v with -v)
The time interval between these events in S is therefore t₂-t₁=𝛽T
For any nonzero speed below c, we have 𝛽>1, meaning that more time has passed between the events in frame S than in frame S’.
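A quick numerical illustration of this result, under assumed values: a clock moving at 0.8c, ticking once per unit of its own time, at an arbitrary position in S’. The sketch transforms both tick events to S with the inverse boost and compares the difference with the factor 1/√(1-v²/c²) times T. Speeds are given as fractions of c and the names are ours.

```python
import math


def lorentz_factor(speed_fraction: float) -> float:
    """1/sqrt(1 - v^2/c^2), with the speed given as the fraction v/c."""
    if not -1.0 < speed_fraction < 1.0:
        raise ValueError("speed must satisfy |v| < c")
    return 1.0 / math.sqrt(1.0 - speed_fraction**2)


def time_in_s(t_prime: float, x_prime: float, speed_fraction: float, c: float = 1.0) -> float:
    """Time coordinate in S of an event given in S' (inverse boost, v -> -v)."""
    v = speed_fraction * c
    return lorentz_factor(speed_fraction) * (t_prime + v * x_prime / c**2)


def dilated_interval(proper_interval: float, speed_fraction: float) -> float:
    """Interval measured in S between two ticks of a clock at rest in S'."""
    return lorentz_factor(speed_fraction) * proper_interval


# Assumed example values: clock at x' = 2, first tick at t' = 0.5, T = 1, v = 0.8c.
t1 = time_in_s(0.5, 2.0, 0.8)
t2 = time_in_s(0.5 + 1.0, 2.0, 0.8)
print(t2 - t1, dilated_interval(1.0, 0.8))
```

The two printed numbers agree, and the position x₀’ of the clock drops out of the difference, as in the derivation.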
- Length Contraction
A rod at rest in frame S’ has endpoints x₁’ and x₂’, so its length in S’ is l'=x₁’-x₂’. To measure its length in S, both endpoints must be located at the same time t in S. The Lorentz transformations give
x₁’=𝛽(x₁-vt)
x₂’=𝛽(x₂-vt)
Subtracting, l'=x₁’-x₂’=𝛽(x₁-x₂)=𝛽l, so the length of the rod measured in S is
l=x₁-x₂=l'/𝛽, meaning that all lengths along the direction of v are contracted by a factor of 1/𝛽 when measured from a frame in which the object is moving.
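The contraction can be illustrated by tracking where the two ends of the rod are in S at one and the same time t. The sketch below assumes a rod of rest length 1 lying along x’ with one end at x’=0, works in units where c=1, and uses hypothetical helper names of our own.

```python
import math


def lorentz_factor(speed_fraction: float) -> float:
    """1/sqrt(1 - v^2/c^2), with the speed given as the fraction v/c."""
    return 1.0 / math.sqrt(1.0 - speed_fraction**2)


def endpoint_in_s(x_prime: float, t: float, speed_fraction: float, c: float = 1.0) -> float:
    """Position in S, at time t in S, of a point fixed at x' in S'.

    Obtained by solving x' = factor * (x - v t) for x.
    """
    v = speed_fraction * c
    return x_prime / lorentz_factor(speed_fraction) + v * t


def measured_length(rest_length: float, speed_fraction: float, t: float = 0.0) -> float:
    """Length seen in S when both ends are located at the same time t."""
    front = endpoint_in_s(rest_length, t, speed_fraction)
    back = endpoint_in_s(0.0, t, speed_fraction)
    return front - back


print(measured_length(1.0, 0.6), 1.0 / lorentz_factor(0.6))
```

The measured length does not depend on which time t is chosen, only on the requirement that both ends are read off at the same t.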
Last but not least, we derive Einstein’s famous equation E=mc².
This equation is a direct consequence of the fact that an object’s mass isn’t invariant as one switches between inertial frames. Expanding the relation between the rest mass and the “moving” mass in powers of the velocity produces the kinetic energy, allowing us to interpret energy as a quantity that arises from both the rest mass and the motion of an object.
Consider a simple collision between two identical particles, labelled 1 and 2. The total kinetic energy need not be conserved, but the total momentum and the total mass must be conserved within each coordinate system. For convenience, we assume that all motion is along the x direction and that the x axes of all coordinate frames are aligned.
In particle 2’s rest frame (S), particle 1 moves with velocity u and has mass m=m(u) (a function of u) before they collide. A stationary observer in S observes particle 2 to have its rest mass m₀.
After the collision, the particles stick together and move with velocity U, their combined mass being M(U).
As seen from the centre of mass (CM) frame S’, the end product is at rest. (The CM frame is the frame in which the total momentum is 0.) S’ therefore moves with velocity U relative to S.
From our assumed conservation laws, it follows that, in frame S,
m(u)+m₀=M(U)
m(u)u+0=M(U)U
which combine to give m(u)=m₀(U/(u-U))
Particle 1 has velocity u relative to S, while S’ has velocity U relative to S. Using the relativistic composition of velocities (left as an exercise for the reader), any velocity v along the x axis in S transforms to
v'=(v-U)/(1-vU/c²) in frame S’
The velocities of particles 1 and 2 in S’ are then
u₁=(u-U)/(1-uU/c²)
u₂=(0-U)/(1-0)=-U
Since S’ is the CM frame and the two particles have the same rest mass, they must approach each other there with equal and opposite velocities, u₁=-u₂, so (u-U)/(1-uU/c²)=U
u=2U/(1+U²/c²)
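The relation between u and U can be checked directly against the velocity composition rule. The sketch below picks an assumed CM velocity U=0.5 in units where c=1, computes u from the formula, and confirms that the two particles then have equal and opposite velocities in S’. The function names are ours.

```python
def transform_velocity(v: float, frame_velocity: float, c: float = 1.0) -> float:
    """Velocity in S' of something moving at v in S, where S' moves at frame_velocity."""
    return (v - frame_velocity) / (1.0 - v * frame_velocity / c**2)


def incoming_velocity(cm_velocity: float, c: float = 1.0) -> float:
    """u = 2U / (1 + U^2/c^2): speed of particle 1 in the rest frame of particle 2."""
    return 2.0 * cm_velocity / (1.0 + cm_velocity**2 / c**2)


U = 0.5                        # assumed CM-frame velocity, in units of c
u = incoming_velocity(U)       # velocity of particle 1 in S
u1 = transform_velocity(u, U)  # particle 1 seen from S'
u2 = transform_velocity(0.0, U)  # particle 2 (at rest in S) seen from S'
print(u, u1, u2)
```

As a side check, `transform_velocity(1.0, 0.5)` returns the speed of light again, consistent with the first postulate.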
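We now solve for U using the quadratic formula: U=(c²/u)[1±√(1-(u²/c²))]. Only the minus sign is physical, since it gives U<c and U≈u/2 for small u.
Substituting into m(u)=m₀(U/(u-U)) gives m(u)=𝛽m₀, where 𝛽=1/√(1-(u²/c²)) is now evaluated at the particle’s velocity u.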
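Since the algebra in this substitution is easy to get wrong, a numerical cross-check may help. The sketch below takes the minus root for U, evaluates m₀U/(u-U), and compares it with m₀/√(1-u²/c²). It assumes units where c=1, a rest mass of 1 and a nonzero u; the names are ours.

```python
import math


def cm_velocity(u: float, c: float = 1.0) -> float:
    """Minus root of the quadratic for U (requires 0 < |u| < c)."""
    return (c**2 / u) * (1.0 - math.sqrt(1.0 - u**2 / c**2))


def moving_mass_from_collision(u: float, rest_mass: float = 1.0, c: float = 1.0) -> float:
    """m(u) = m0 * U / (u - U), from the two conservation laws."""
    big_u = cm_velocity(u, c)
    return rest_mass * big_u / (u - big_u)


def moving_mass_closed_form(u: float, rest_mass: float = 1.0, c: float = 1.0) -> float:
    """m(u) = m0 / sqrt(1 - u^2/c^2)."""
    return rest_mass / math.sqrt(1.0 - u**2 / c**2)


for u in (0.1, 0.5, 0.8, 0.99):
    print(u, moving_mass_from_collision(u), moving_mass_closed_form(u))
```

The two columns agree to rounding error across the range of speeds tried.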
For u small compared to c, one expands in a Maclaurin series to find the satisfactory result
m(u)= 𝛽m₀=m₀+(1/c²)(m₀u²/2)+O(u⁴/c⁴)
mc²= 𝛽m₀c²=m₀c²+m₀u²/2+O(u⁴/c²)
The second term is the familiar Newtonian kinetic energy, which implies that we should regard the energy E of the particle as E=mc²
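To see how good the low-speed expansion is, one can compare the full expression m₀c²/√(1-u²/c²) with rest energy plus Newtonian kinetic energy. The sketch below assumes units where c=1 and a rest mass of 1, and the function names are ours; the leftover difference should shrink like u⁴ as u decreases.

```python
import math


def total_energy(u: float, rest_mass: float = 1.0, c: float = 1.0) -> float:
    """E = m c^2 with m = m0 / sqrt(1 - u^2/c^2)."""
    return rest_mass * c**2 / math.sqrt(1.0 - u**2 / c**2)


def low_speed_approximation(u: float, rest_mass: float = 1.0, c: float = 1.0) -> float:
    """Rest energy plus Newtonian kinetic energy: m0 c^2 + m0 u^2 / 2."""
    return rest_mass * c**2 + 0.5 * rest_mass * u**2


for u in (0.01, 0.1, 0.5):
    gap = total_energy(u) - low_speed_approximation(u)
    print(u, gap)
```

At everyday speeds the gap is negligible, while at half the speed of light the Newtonian kinetic energy already misses a few percent of the rest energy.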