7.1 Math in Two Dimensions

7.1a Review of Plane Geometry

This section is a review of plane geometry. Plane geometry is about properties of points, lines, and figures in a plane (or more generally in other surfaces). The elementary concepts are points and straight lines. Lines that do not meet are said to be parallel.

Long ago, Euclid noticed the following things:

Every pair of points determines a unique straight line containing both.
Two lines are parallel if they never meet.
In a plane, every line and a point not on it determine a unique line parallel to the former containing the latter. (This is not true on all other surfaces.)
Every pair of lines that are not parallel have a unique point of intersection.

He derived all sorts of wonderful consequences from these statements.

If you cut a line in two at point \(C\), each half is called a ray, starting at \(C\).

If you cut at \(D\) any ray starting at \(C\), you get another ray starting at \(D\) and a line segment with ends at \(D\) and \(C\).

Two rays, \(a\) and \(b\) both starting at \(C\) and meeting only at \(C\), determine two angles at \(C\). Unless \(a\) and \(b\) are part of the same straight line, one of these is smaller than the other. We can call them \(aCb\) and \(bCa\). We can describe the angle \(aCb\) as corresponding to all the rays that emanate from \(C\) that are clockwise past \(a\) and before \(b\).

We can describe any point in the plane by its \(x\) and \(y\) coordinates. We always put the \(x\) coordinate first. Thus \((5,7)\) means the point whose \(x\) coordinate is \(5\) and \(y\) coordinate is \(7\). A line segment can then be described by the coordinates of its two ends. If the two end points both have the same \(y\) coordinate, the length of the segment is the absolute value of the difference between their \(x\) coordinates (and the same with \(x\) and \(y\) reversed). Thus the distance between points with coordinates \((5,7)\) and \((2,7)\) is \(3\).

We have to define the distance between points when the points differ from each other in both coordinates. We want distance to be a meaningful concept and one that does not depend on the coordinate system being used. The definition we choose has the important and necessary property that the length of a segment is the same no matter what direction we choose as the x direction; that direction is our choice. (Length and distance would not be intrinsic to the segments if they changed with that choice.)

And this is the definition: The square of the distance between two points is the sum of the squares of the differences in each coordinate. Distance is the positive square root of this sum. Thus, the distance squared between points having coordinates \((1,1)\) and \((4,5)\) is \(3^2 + 4^2\) or \(9 + 16\) which is \(25\); the distance between these points is \(5\).
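As a quick check of this definition, here is a minimal Python sketch. The function name `distance` and the representation of points as pairs are our own choices for illustration, not notation from the text.

```python
import math

def distance(p, q):
    """Distance between points p = (x1, y1) and q = (x2, y2)."""
    dx = q[0] - p[0]
    dy = q[1] - p[1]
    # Square root of the sum of the squared coordinate differences.
    return math.sqrt(dx * dx + dy * dy)

print(distance((1, 1), (4, 5)))  # 5.0
print(distance((5, 7), (2, 7)))  # 3.0
```

The second call reproduces the earlier special case in which the two points share a \(y\) coordinate.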

We call some particular length a unit length, and say that a segment has length \(x\) if it is \(x\) times as long as that unit. It really doesn't matter what the unit is; plane geometry is the same with any. (By the way, this is not so if we were dealing with the surface of a sphere instead of a plane.)

The next question is, how do we similarly describe angles? We can declare any angle to be a unit angle, and take the size of any other angle to be whatever multiple of that unit angle it is. Traditionally, there are two commonly used unit angles, and it is wise to be familiar with both.

Euclid (or some other ancient person) defined the angle obtained by going all the way around from one side of a ray to the other side of it, to be \(360\) degrees. (Why? I think the answer is that it is easy to divide 360 by small numbers; in fact 360 is divisible by all numbers from 1 to 10 except 7, and it is itself not a very big number.)

With that definition, a "straight line" angle, which occurs when \(a\) and \(b\) (the sides of the angle) point exactly opposite one another, is \(180\) degrees. A right angle, which is half of a straight line angle, is \(90\) degrees, and so on. Two segments or rays or lines that make a right angle at \(C\) are said to be perpendicular.

The second commonly used measure of angle involves a unit circle. This is the set of points that are all a unit distance from the central point \(C\). Then we can measure an angle by the length of the portion of the unit circle inside the angle. The distance all the way around the unit circle is its circumference, which is \(2\pi\). That means that a straight angle has size \(\pi\) and a right angle has size \(\frac{\pi}{2}\). The unit of angle here is called the radian.

What is a radian?

Well, \(\pi\) is close to \(\frac{22}{7}\). So \(2\pi\) is near \(\frac{44}{7}\) or roughly \(6.28\) radians. If we divide \(360\) by \(6.28\) we get that a radian is something near \(57\) degrees.

To be more specific, \(\frac{22}{7}\) is \(3.142857\ldots\) while \(\pi\) is \(3.141593\ldots\), so \(1\) radian is \(\frac{360}{2\pi}\), which is \(57.29578\) degrees, while \(\frac{360}{2 \cdot 22/7}\) is \(57.27273\).
(You are best off not trying to remember these details. It is enough to remember that the angle change in going around a circle is \(360\) degrees, and also is \(2\pi\) radians. This means that a radian is \(\frac{360}{2\pi}\) degrees. If you do not want to use a machine to get the answer above, you can replace \(\pi\) by \(\frac{22}{7}\) and approximate \(1\) radian by \(\frac{360*7}{44}\) which is \(\frac{630}{11}\) and you will be wrong by a little less than one twentieth of one percent.)
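If you do want a machine to do the conversion, a short Python sketch along the following lines should suffice. The names `DEGREES_PER_RADIAN`, `APPROX`, `to_degrees` and `to_radians` are ours, chosen for illustration.

```python
import math

DEGREES_PER_RADIAN = 360 / (2 * math.pi)  # conversion factor from the text
APPROX = 630 / 11                         # same factor with pi replaced by 22/7

def to_degrees(radians):
    return radians * DEGREES_PER_RADIAN

def to_radians(degrees):
    return degrees / DEGREES_PER_RADIAN

# Relative error made by using 22/7 in place of pi.
relative_error = (DEGREES_PER_RADIAN - APPROX) / DEGREES_PER_RADIAN

print(round(DEGREES_PER_RADIAN, 5))  # 57.29578
print(round(APPROX, 5))              # 57.27273
print(relative_error < 0.0005)       # True: under one twentieth of one percent
```

Python's standard library also provides `math.degrees` and `math.radians`, which do the same conversions.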

What did Euclid deduce from his postulates?

Here is one simple fact and its proof:

Fact: When lines meet, opposite angles are the same, and the sum of any two adjacent angles is \(\pi\) radians.

Proof: Suppose lines \(a\) and \(b\) meet at \(C\) and denote the ray of \(a\) on one side of point \(C\) by \(a\) and on the other side by \(a'\), and do similarly for \(b\). Then there are 4 angles at \(C\): they are \(aCb, bCa', a'Cb'\) and \(b'Ca\).

Then any consecutive pair of these, including \(b'Ca\) and \(aCb\), form a straight line angle, if added together.

This means that, for example, \(aCb\) and \(a'Cb'\) each give the same total, a straight line angle, when added to \(bCa'\), which implies that \(aCb\) and \(a'Cb'\) are the same.

Before mentioning any more conclusions, we make one more definition. Suppose we have an angle \(\theta\) that is less than a right angle.

7.1b The sine function

We choose a center point and draw a unit circle around it. (This is a circle whose radii have length \(1\).) We draw the angle in question at the center and choose one side of it to be the \(x\) axis (on which \(y\) is \(0\)). Let \(P\) be the point at which the other side of the angle meets the circle. Then we draw a line segment in the direction of the \(y\) axis that goes between the \(x\) axis and \(P\).

The \(y\) coordinate of the point \(P\), which is the length of that perpendicular line, is called the sine of the angle \(\theta\), written as \(\sin \theta\). Notice that the perpendicular line is a straight line down from \(P\) to the \(x\) axis, and so is shorter than the path from \(P\) to that axis along the circle, which is the size of the angle in radians. This means that the sine of a positive angle is always less than the angle when the latter is measured in radians. When the angle is small, the sine is pretty close to the angle size, measured in radians, because the straight path and the path on the circle are almost the same. Thus we have \(\sin 0 = 0\) and the derivative of \(\sin x\) at \(x = 0\) is \(1\) when angles are measured in radians.

Notice also that the sine starts at \(0\) when the angle is \(0\), and increases to \(1\) when the angle becomes \(\frac{\pi}{2}\). We can define it the same way for larger angles as well, as the \(y\) coordinate of \(P\). After \(\frac{\pi}{2}\) the sine decreases as the angle increases, and reaches \(-1\) when we get three quarters of the way around the circle, at angle \(\frac{3\pi}{2}\). Then it goes up again as \(\theta\) increases. Of course when the sine is negative it is minus the length of the perpendicular line, which corresponds to the fact that the \(y\) coordinate of \(P\) is negative there.

Angles are often described (in radians) as going from \(-\pi\) to \(\pi\) as you go around the circle. That way, \(0\) angle as usual corresponds to the positive \(x\) axis, but angles below the \(x\) axis have negative size. If so, the angles for which the sine is negative are negative angles. The sine is an odd function of its angle argument, which means:

\[\sin(-θ) = -\sin(θ)\]

The angle "complementary" to \(θ\) is the angle whose sides are the positive y-axis and the ray from the center of the circle (the origin here) through \(P\). The sine of the complementary angle to \(\theta\) is called the cosine of \(\theta\), written as \(\cos(θ)\). This cosine is the \(x\) coordinate of the point \(P\) on the unit circle. A glance at the picture shows that the cosine, the \(x\) coordinate of \(P\), is an even function of the angle \(\theta\). It is the same whether the angle is positive or negative.
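The properties of sine and cosine described so far can be spot checked numerically. The sketch below uses Python's `math` module; the sample angles and the variable names are arbitrary choices of ours, and a numerical check of this kind illustrates the claims without proving them.

```python
import math

# For positive angles in radians the sine is below the angle,
# and the ratio sin(x)/x approaches 1 as x shrinks.
for x in (1.0, 0.1, 0.01):
    assert math.sin(x) < x
    print(x, math.sin(x) / x)

theta = 0.7  # an arbitrary sample angle in radians

# Sine is odd, cosine is even.
odd_ok = math.isclose(math.sin(-theta), -math.sin(theta))
even_ok = math.isclose(math.cos(-theta), math.cos(theta))

# Cosine is the sine of the complementary angle.
complement_ok = math.isclose(math.cos(theta), math.sin(math.pi / 2 - theta))

print(odd_ok, even_ok, complement_ok)  # True True True
```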

Triangles

Another way to describe the sine of \(\theta\) when \(\theta\) is less than \(\frac{\pi}{2}\) is in terms of the triangle formed by the two sides \(a\) and \(b\) of the angle and any line perpendicular to one side.

This triangle has a right angle where that side (say \(b\)) meets the perpendicular line. The sine of \(\theta\) is then the length of the side of that right triangle opposite \(\theta\), divided by the length of the side opposite the right angle, which is called the hypotenuse of that triangle. (I think it is daffy to use Greek letters for angles, and to use words like hypotenuse for the longest side of a right triangle, but that’s what everyone does.)

The points on the unit circle all lie at distance \(1\) from the center, which means that the sum of the squares of the \(x\) component and \(y\) component of every point on it is \(1\). The \(y\) component is \(\sin(\theta)\); its square is \(\sin^2(\theta)\). We can deduce that the square of the \(x\) component, which is \(\cos^2(\theta)\), must be \(1- \sin^2(\theta)\). This tells us

\[\sin^2(\theta) + \cos^2(\theta) = 1\]

Suppose we pick three points not all on one line, and join them in pairs by line segments. They form a triangle. Each triangle has three sides and three interior angles. Two triangles are said to be congruent when the three interior angles of one have the same values as the interior angles of the other, and the three side lengths of one match those of the other. If the three angles are the same we call them similar triangles even when the lengths are different.

There are two interesting questions about triangles: First, what restrictions are there on the three side lengths and three angle sizes for them to form a triangle?

If we consider lengths alone, for them to form a triangle the largest side must be smaller than the sum of the other two sides. (Proof: The two smaller sides must meet opposite ends of the largest one; if their combined length is less than that of the largest side, they are not long enough to meet each other. If the sum of the lengths of the two smaller ones is exactly the length of the largest one, they all have to lie on one line, which does not describe a triangle.) This condition is called the Triangle Inequality.
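The condition is easy to test mechanically. Here is a minimal Python sketch; the function name `can_form_triangle` is a hypothetical helper of ours.

```python
def can_form_triangle(a, b, c):
    """True when three positive lengths can be the sides of a triangle."""
    shortest, middle, longest = sorted((a, b, c))
    # The longest side must be strictly smaller than the sum of the other two.
    return shortest > 0 and longest < shortest + middle

print(can_form_triangle(3, 4, 5))  # True
print(can_form_triangle(1, 2, 3))  # False: the sides would lie on one line
print(can_form_triangle(1, 1, 5))  # False: the short sides cannot meet
```

Sorting first means only one comparison is needed, since the inequality holds automatically when the side on the left is not the longest.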

We can prove that the sum of the angles in a triangle is \(\pi\) radians, but it requires use of the "parallel postulate". (The parallel postulate says that through any point not on a given line there is exactly one line parallel to that line.) This has to be, because the sum of the angles of a triangle is not \(\pi\) radians on the surface of a sphere. If you define each pair of antipodal points to be a single point, geometry on the surface of a sphere obeys all the others of Euclid's axioms and postulates.

If we consider angles alone, we have seen that in a right triangle, the other two angles are complementary, which means that the sum of their sizes, in radians, is \(\frac{\pi}{2}\). Thus the sum of all three angles of a right triangle is \(\pi\), the straight line angle.

This is so for any triangle:

The sum of the interior angles in any triangle is \(\pi\) radians.

Proof: Suppose we start with a triangle \(ABC\) whose largest angle is at point \(A\). If we draw a line segment from \(A\) to \(BC\) perpendicular to the latter meeting it at point \(P\), we have divided our triangle into two right triangles, and the sum of the angles of these two is \(2\pi\) radians. This sum consists of the interior angles at \(A, B\) and \(C\), and a straight line angle at \(P\). Since the angle at \(P\) is \(\pi\) radians in size, the interior angles sum to \(\pi\) as well.

Exercise 7.11: Suppose the perpendicular were drawn so that \(P\) was outside of the original triangle. Prove by similar reasoning that the sum of the angles of that triangle is \(\pi\) radians.

The second question is: How many of the six parameters (angle sizes and side lengths) are needed to determine all the size parameters of a triangle?

Euclid used constructions with ruler and compass to answer such questions, and these are lots of fun. But we can do even better using the concept of the sine. With it, we can actually figure out the missing information whenever the triangle is determined.

Obviously, if all we know of a triangle is one side length, there are lots of triangles that are not similar to one another that can have a side of that length. The same is true if we only know one angle. Knowing two angles tells us the third angle as well, since the sum of all three is \(\pi\) radians. So all triangles with the same two angles are similar, but the angles alone tell us nothing about the side lengths. Knowing two angles and the length of the side between any particular pair of angles determines all three lengths, as we shall see.

Knowing two side lengths alone does not determine angles at all; knowing two side lengths and the angle between them does, and so does knowing all three sides. There are elegant facts that allow us to determine all the missing information as we shall see.

Actually knowing two side lengths \(A\) and \(B\) with \(A\) greater than \(B\), and knowing the angle where the \(B\) side meets the third \(C\) side, determines everything as well, and we can find the missing information here also.

When the side lengths are \(A, B\) and \(C\), and \(A\) is greater than \(B\), and we know the angle \(\theta\) where the \(A\) side meets the \(C\) side, there are either \(0\) or two solutions, except for one special case (which happens when the side lengths \(A, B\) and \(C\) determine a right triangle so that \(A^2 = B^2 + C^2\)). To have a solution, \(B\) must be at least \(A \sin \theta\).

Knowing three side lengths determines the triangle completely. We will now prove all these statements by use of sines.

How We Find Missing Triangle Parameters

One tool for doing this is the Law of Sines. This is the statement that the size of side \(A\), divided by the size of side \(B\), is the sine of the angle opposite side \(A\) (this is the angle where \(B\) and \(C\) meet) divided by the sine of angle opposite side \(B\). If we know two angles we know the third, and their sines, so if we know any one side length, we know its ratio to all other side lengths and can calculate the other two side lengths.

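Proof of the Law of Sines: Given a triangle with side lengths \(A,B\) and \(C\), draw a line segment perpendicular to the \(C\) side from that side to the vertex opposite it. Writing \(AC\) for the angle where the \(A\) and \(C\) sides meet, and \(BC\) for the angle where the \(B\) and \(C\) sides meet, the length of that segment is \(A \sin(AC)\) and also is \(B \sin(BC)\), by the definition of the sine. This means \(\frac{A}{B}\) is \(\frac{\sin(BC)}{\sin(AC)}\), which is the statement above, since angle \(BC\) is opposite side \(A\) and angle \(AC\) is opposite side \(B\).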
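To see the Law of Sines used to recover missing side lengths, consider the following Python sketch. It assumes we know the two angles at the ends of side \(C\) (these are the angles opposite sides \(A\) and \(B\)) together with the length of \(C\). The function name and argument names are ours.

```python
import math

def solve_two_angles_and_side(angle_opp_a, angle_opp_b, side_c):
    """Return side lengths (A, B) given the angles opposite them (in radians)
    and the length of side C, which lies between those two angles."""
    # The three angles sum to pi, which gives the angle opposite C.
    angle_opp_c = math.pi - angle_opp_a - angle_opp_b
    if angle_opp_a <= 0 or angle_opp_b <= 0 or angle_opp_c <= 0:
        raise ValueError("angles do not describe a triangle")
    # Law of Sines: side / sin(opposite angle) is the same for all three sides.
    scale = side_c / math.sin(angle_opp_c)
    return scale * math.sin(angle_opp_a), scale * math.sin(angle_opp_b)

# A right triangle with hypotenuse 5 and acute angles asin(3/5) and asin(4/5).
a, b = solve_two_angles_and_side(math.asin(3 / 5), math.asin(4 / 5), 5)
print(round(a, 9), round(b, 9))  # 3.0 4.0
```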

Exercise 7.12: Draw yourself a picture with vertex labels instead of segment length labels and verify these statements.

The law of sines tells us that if we know all the angles of a triangle, then we know their sines and hence we know all the ratios between side lengths in it. We can thus deduce that similar triangles have the same ratios of side lengths of corresponding sides.

7.1c Vectors

Before describing how to find missing parameters in a triangle when we know three sides only, or two sides and an angle, we make one more definition. We have used the notation \((5,7)\) to describe a point with \(x\) coordinate \(5\) and \(y\) coordinate \(7\). In these terms a line segment is described by giving the two coordinates of each of its endpoints. This is cumbersome. For many purposes such as determining length, we don't really care where the segment starts; what is important to us is only the differences between each of its two coordinates at the two endpoints. These determine the length and the orientation of the segment.

Thus, given the line segment whose endpoints are \((1,2)\) and \((3,6)\), the differences at the two endpoints are \(2\) in the \(x\) coordinate and \(4\) in the \(y\) coordinate. We write this information as \(2\hat{i} + 4\hat{j}\), where \(\hat{i}\) and \(\hat{j}\) are called unit vectors in the \(x\) and \(y\) directions, and say that the given line segment has \(2\hat{i} + 4\hat{j}\) as its vector. Actually this notation describes the directed segment with direction from \((1,2)\) to \((3,6)\); the segment directed oppositely is the negative of this one.

In general, each directed line segment, say one from \((a,b)\) to \((c,d)\), defines a vector, namely \((c-a)\hat{i} + (d-b)\hat{j}\), with \(\hat{i}\) and \(\hat{j}\) unit vectors in the \(x\) and \(y\) directions respectively. This vector contains information relative to the segment important to us, but says nothing about where it starts, or what the \(x\) and \(y\) directions are, except that they are perpendicular to one another.

To see the use of this definition, suppose we have a triangle with vertex points \((1,2)\), \((3, 7)\) and \((6,2)\).

Each side of the triangle is described by two of these point descriptors, namely those of its two endpoints. And we often have no interest in where the triangle is located in the plane. Suppose we direct the segments to form a cycle.

The line \((1,2)\) to \((3,7)\) has vector \(2\hat{i} + 5\hat{j}\). The line \((3,7)\) to \((6,2)\) has vector \(3\hat{i} - 5\hat{j}\). The line \((6,2)\) to \((1,2)\) has vector \(-5\hat{i}\).

Notice that the sum of the vectors corresponding to this cycle is the \(0\) vector.

In general, the sum of a bunch of vectors that correspond to the line segments of a directed path is the vector from the beginning of that path to its end. In the case of a cycle these are the same point and the sum is thus the \(0\) vector.

Proof: In forming the sum vector, the intermediate coordinates get added from their incoming vector and subtracted from their outgoing one, and so drop out. Only the contributions from the endpoints remain.
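The cancellation can be watched happening on the triangle above. In this Python sketch a vector \(a\hat{i} + b\hat{j}\) is stored as the pair `(a, b)`; the helper name `edge_vectors` is ours.

```python
def edge_vectors(points):
    """Vectors of the directed segments p0->p1, p1->p2, ..., closing back to p0."""
    n = len(points)
    return [
        (points[(k + 1) % n][0] - points[k][0],
         points[(k + 1) % n][1] - points[k][1])
        for k in range(n)
    ]

triangle = [(1, 2), (3, 7), (6, 2)]
edges = edge_vectors(triangle)

# Add the vectors component by component.
total = (sum(e[0] for e in edges), sum(e[1] for e in edges))

print(edges)  # [(2, 5), (3, -5), (-5, 0)]
print(total)  # (0, 0)
```

The same function works for any closed polygon, not only triangles, since the argument in the proof does not depend on the number of segments.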

This information is implicit in the notation describing lines by points, but that notation has too much information, and is much harder to work with.

But the wonderful thing is, given two line segments we can easily extract important information from their vectors. The first bit of information is what is called their dot product: given \(a\hat{i} + b\hat{j}\), and \(c\hat{i} + d\hat{j}\), their dot product is \(ac + bd\). You multiply together like components and add them up. We have already seen that the dot product of a vector with itself gives the square of the length of its line segment. In general, as we will prove, the dot product of two different vectors gives the product of the length of the two segments, multiplied by the cosine of the angle between them. The angle between them is the angle you get if you line the two segments up with the same back vertex, directed away from it.

When the two segments form part of the boundary of a cycle triangle, the interior angle of the triangle is not the angle of size \(\theta\) between them, but instead has size \(\pi-\theta\), and the cosine of this angle is \(-\cos\theta\). Draw pictures and use them to verify this claim.
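Taking the stated formula for the dot product as given, the angle between two vectors and the corresponding interior angle can be computed as in this Python sketch. The function names are ours, and the two vectors are the first two sides of the cycle triangle with vertices \((1,2)\), \((3,7)\) and \((6,2)\).

```python
import math

def dot(v, w):
    """Multiply like components and add them up."""
    return v[0] * w[0] + v[1] * w[1]

def length(v):
    return math.sqrt(dot(v, v))

def angle_between(v, w):
    """Angle in radians between two nonzero vectors placed tail to tail."""
    c = dot(v, w) / (length(v) * length(w))
    return math.acos(max(-1.0, min(1.0, c)))  # clamp against rounding error

v, w = (2, 5), (3, -5)      # two consecutive sides of the cycle triangle
theta = angle_between(v, w)
interior = math.pi - theta  # interior angle at the shared vertex (3, 7)

print(dot(v, w))            # -19
print(math.degrees(theta), math.degrees(interior))
```

The dot product is negative here, so the angle between the directed sides is more than a right angle and the interior angle at \((3,7)\) is less than one.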

The proof of the evaluation of the dot product here comes from the fact that this product is an invariant; which means it does not depend on the orientation of the coordinate system.

How do you know the dot product is an invariant?

Claim: If we rotate our coordinates so that the unit vector \(\hat{i}\) is replaced by \(\hat{i}\cos\theta + \hat{j}\sin\theta\) and \(\hat{j}\) is replaced by \(-\hat{i}\sin\theta + \hat{j}\cos\theta\), the dot product between any two vectors does not change.

Exercise 7.13: Prove this for a vector \(\vec{v}\) that points in the \(x\) direction, and a general \(\vec{w}\) vector.

This means we can choose our coordinate system so that the first vector, \(\vec{v}\), whose length is \(|\vec{v}|\), points in the \(x\) direction, so that \(\vec{v}\) is \(|\vec{v}|\hat{i}\). The second vector \(\vec{w}\) similarly is \(|\vec{w}|(\cos\theta\hat{i} + \sin\theta\hat{j})\) when the angle between \(\vec{v}\) and \(\vec{w}\) is \(\theta\). The dot product of the two is \(|\vec{v}||\vec{w}|\cos\theta\) by its definition.
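A numerical spot check of the invariance claim may be reassuring, though it is no substitute for the proof asked for in the exercise. The Python sketch below rotates both vectors by the same angle, which has the same effect on their components as rotating the coordinate axes the other way. The helper names and sample values are ours.

```python
import math

def rotate(v, phi):
    """Rotate vector v counterclockwise by phi radians."""
    c, s = math.cos(phi), math.sin(phi)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])

def dot(v, w):
    return v[0] * w[0] + v[1] * w[1]

v, w = (2.0, 3.0), (4.0, -7.0)  # arbitrary sample vectors
original = dot(v, w)            # -13.0

# The dot product is unchanged when both vectors are rotated together.
for phi in (0.3, 1.0, 2.5):
    rotated = dot(rotate(v, phi), rotate(w, phi))
    assert math.isclose(rotated, original)

print(original)  # -13.0
```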

Law of Cosines: If three directed line segments form a cycle triangle, then their side lengths \(A\), \(B\) and \(C\) obey \(C^2 = A^2 + B^2 - 2AB \cos(\theta)\), where \(\theta\) is the interior angle of the triangle where the \(A\) and \(B\) segments meet.

Proof: We have seen that the sum of the vectors of all the sides of the triangle is the \(0\) vector. This means that the vector for the \(C\) segment is minus the sum of the \(\vec{A}\) and \(\vec{B}\) vectors.

The square of the length of \(C\), or \(C^2\), is then the square of the sum of the \(\vec{A}\) and \(\vec{B}\) vectors, which is the dot product of this sum with itself. This is \(A^2 + B^2 + 2AB\cos(\pi-\theta)\) (remember that when the interior angle where the \(A\) and \(B\) segments meet is \(\theta\), the angle between their vectors is \(\pi - \theta\)). The conclusion follows from the fact: \(\cos(\pi - \theta) = -\cos\theta\).

We can see immediately from this law that knowing \(A\) and \(B\) and \(\theta\) determines \(C\), and also knowing \(A\), \(B\) and \(C\) determines \(\cos(θ)\).

This law of cosines therefore allows us to deduce all side lengths and angles of the triangles given either three side lengths or two side lengths and the angle between the two corresponding sides.
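Both uses of the law can be sketched in a few lines of Python. The function names `third_side` and `angle_from_sides` are ours, and the inputs are assumed to describe a genuine triangle.

```python
import math

def third_side(a, b, theta):
    """Side C opposite the interior angle theta (radians) between sides A and B."""
    return math.sqrt(a * a + b * b - 2 * a * b * math.cos(theta))

def angle_from_sides(a, b, c):
    """Interior angle (radians) where sides A and B meet, opposite side C."""
    return math.acos((a * a + b * b - c * c) / (2 * a * b))

print(round(third_side(3, 4, math.pi / 2), 9))            # 5.0
print(round(math.degrees(angle_from_sides(3, 4, 5)), 9))  # 90.0

# Side lengths of the triangle with vertices (1,2), (3,7) and (6,2).
a, b, c = math.sqrt(29), math.sqrt(34), 5.0
angle_sum = (angle_from_sides(a, b, c)
             + angle_from_sides(b, c, a)
             + angle_from_sides(c, a, b))
print(math.isclose(angle_sum, math.pi))  # True
```

The last check recovers all three interior angles from the side lengths alone and confirms that they sum to \(\pi\) radians.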

The other case in which all the information can be deduced is when, in the formula above, we know \(C\) and \(A\) and \(\theta_{AB}\), which is the angle where \(A\) and \(B\) meet, and \(C\) is bigger than \(A\). Filling in the given information in the law of cosines yields a quadratic equation for \(B\). When \(A\) is less than \(C\), one of the two solutions to this equation is negative, so we can determine the unique solution by finding the one positive solution to the quadratic equation obtained.
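Rearranged, the law of cosines reads \(B^2 - 2A\cos(\theta_{AB})\,B + (A^2 - C^2) = 0\), and the quadratic formula gives \(B = A\cos\theta_{AB} \pm \sqrt{C^2 - A^2\sin^2\theta_{AB}}\). The Python sketch below evaluates both roots for one set of sample values that we picked ourselves; the function name is also ours.

```python
import math

def solve_for_b(a, c, theta_ab):
    """Both roots of B^2 - 2*A*cos(theta)*B + (A^2 - C^2) = 0,
    returned as (smaller, larger). Assumes C > A, so the roots are real."""
    root = math.sqrt(c * c - (a * math.sin(theta_ab)) ** 2)
    center = a * math.cos(theta_ab)
    return center - root, center + root

# Sample values (our choice): A = 3, C = 7, angle between A and B of pi/3.
negative, positive = solve_for_b(3.0, 7.0, math.pi / 3)
print(round(negative, 9), round(positive, 9))  # -5.0 8.0
```

With these values the equation is \(B^2 - 3B - 40 = 0\), which factors as \((B-8)(B+5) = 0\), so the triangle has \(B = 8\).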

Exercise 7.14: Define lengths for \(A\) and \(C\) and an angle \(\theta_{AB}\), obtain the quadratic equation for \(B\) and find its positive solution. Verify that the other solution is negative.

Cross Products and Areas

We have seen above how the dot product of two vectors (along with their dot products with themselves) conveys useful information about the segments they describe, namely the product of their lengths with the cosine of the angle between them.

There is another thing we can do with vectors in the plane called their cross product. This product depends not only on the directions of the line segments but on the order in which one places them. But it is quite simple.

In forming the dot product you multiply like components and add them. In forming the cross product you multiply unlike components and subtract them. Obviously the sign of what you get depends on which you subtract from which. This depends on you and not on the segments. But the magnitude of the cross product has real meaning. It is the area of the parallelogram formed from the two line segments as adjacent sides. This is twice the area of the triangle with these line segments as sides.

Proof: The area of the parallelogram is its base length multiplied by its height. If the base has length \(a\) and the other side, with length \(b\), forms angle \(\theta\) with the base, the height is \(b \sin\theta\), and the area is \(ab \sin\theta\). That is exactly what the magnitude of the cross product is if the side of length \(a\) points in the \(x\) direction. The conclusion follows from the invariance of the cross product under rotation of coordinates, which is proven exactly as one proves the invariance of the dot product.

Suppose we have the vectors \(2\hat{i} + 3\hat{j}\) and \(4\hat{i} - 7\hat{j}\). Their cross product is (up to sign) \(3*4 -2*(-7) = 12 + 14 = 26\). Thus the parallelogram they form has area \(26\), and the triangle they form has area half of this or \(13\).
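The same computation in Python, for the vectors with components \((2, 3)\) and \((4, -7)\), might look like this. The name `cross2` and the particular sign convention (first \(x\) times second \(y\), minus first \(y\) times second \(x\)) are our choices; the text leaves the sign open.

```python
def cross2(v, w):
    """Signed two dimensional cross product of v = (a, b) and w = (c, d)."""
    return v[0] * w[1] - v[1] * w[0]

v, w = (2, 3), (4, -7)

print(cross2(v, w))           # -26
print(cross2(w, v))           # 26: swapping the order flips the sign
print(abs(cross2(v, w)))      # 26, the parallelogram's area
print(abs(cross2(v, w)) / 2)  # 13.0, the triangle's area
print(cross2(v, v))           # 0
```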

The cross product of a vector with itself is \(0\).

By the way, dot and cross products can be formed in higher dimensions. In three dimensions, points have three components and so do vectors. The dot product is defined the same way in any dimension as the sum of the products of like components, and has the same meaning in all.

The cross product in two dimensions involves both components. In higher dimensions it is formed by taking two dimensional cross products with each pair of coordinates.

In three dimensions you can multiply \(x\) and \(y\) components and subtract and can do the same with \(y\) and \(z\) components and also with \(z\) and \(x\) coordinates. We make a sort of vector by making these in order the \(z\), \(x\) and \(y\) components of the cross product vector.

\[ (a_x\hat{i} + a_y\hat{j} + a_z\hat{k}) \times (b_x\hat{i} + b_y\hat{j} + b_z\hat{k}) = (a_xb_y - a_yb_x)\hat{k} + (a_yb_z-a_zb_y)\hat{i} + (a_zb_x-a_xb_z)\hat{j} \]

(The \(k\) term is the ordinary two dimensional \(x\), \(y\) cross product. You can determine the other terms by changing \(x\) to \(y\), \(y\) to \(z\) and \(z\) to \(x\), and also \(i\) to \(j\), \(j\) to \(k\), and \(k\) to \(i\), once, and also twice.)

The cross product of two vectors in three dimensions points perpendicular to the plane of the segments that these vectors represent. Its magnitude is the area of any parallelogram whose sides are represented by these vectors.

Given three vectors \(A\), \(B\) and \(C\), the dot product of \(C\) with \(A \times B\) is, up to sign, the volume of a parallelepiped whose sides are described by these vectors.
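These three dimensional statements can be tried out on a small example before proving them. In the Python sketch below, vectors are stored as triples of \(x\), \(y\), \(z\) components; the helper names and the sample vectors are ours.

```python
def cross3(a, b):
    """Three dimensional cross product of a and b, as (i, j, k) components."""
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by,   # i component
            az * bx - ax * bz,   # j component
            ax * by - ay * bx)   # k component

def dot3(a, b):
    """Sum of the products of like components."""
    return sum(p * q for p, q in zip(a, b))

a, b, c = (2, 0, 0), (1, 3, 0), (1, 1, 4)
n = cross3(a, b)

print(n)                       # (0, 0, 6)
print(dot3(n, a), dot3(n, b))  # 0 0: n is perpendicular to both a and b
print(abs(dot3(c, n)))         # 24
```

Here `a` and `b` span a parallelogram of area \(6\) in the \(xy\) plane, and `c` rises to height \(4\) above that plane, so a volume of \(24\) is what base times height predicts.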

Exercises: 1. Prove these two statements. (Hint: choose directions such that the vector \(A\) points in the \(x\) direction and \(B\) lies in the \(xy\) plane.)

2. Suppose two line segments are perpendicular to each other. What does all this imply about the dot product of their vectors? About the magnitude of their cross product?