18.02A
Information
Recitations: Mondays & Wednesdays, 10 AM in Room 2-132.
Office Hours: Wednesdays, 3-4 PM in Room 2-175.
The course instructor is John Bush. All course materials are on Canvas.
There are no MITx exercises in this course. Some homework is assigned from the textbook Multivariable Calculus, 6th Ed., by Edwards & Penney.
I will use this webpage to record what we discuss.
21-12-08
- Discuss how to graph polar equations of the form \(r = A \cos (n\theta)\), where \(A > 0\) and \(n\) is a positive integer, and relatedly, how to find the bounds for \(\theta\) in polar integration.
- §14.4, #4: Find the area bounded by one loop of the rose \(r = 2\cos(2\theta)\).
- §14.4, #30: Find the volume of the solid bounded by the paraboloid \(z = r^2\), the cylinder \(r = 2a\sin \theta\) (where \(a\) is a constant), and the plane \(z = 0\).
Addendum: Suggested practice from Edwards & Penney.
- §14.2, #12: Find the region of integration for \[\begin{align*} \int_0^\pi \int_0^{\sin x} y\,dy\,dx, \end{align*}\] and evaluate it.
- §14.2, #34: Find the region of integration for \[\begin{align*} \int_0^1 \int_{\arctan y}^{\pi/4} \sec x\,dx\,dy, \end{align*}\] show how to rewrite the integral in the opposite order, and finally, evaluate it. Hint: If \(0 \leq y \leq 1\), then \(x = \arctan y\) if and only if \(y = \tan x\).
- §14.5, #14: Find the mass and centroid of the plane lamina bounded by \(x = 0\) and \(x = 9 - y^2\) with density \(\delta(x, y) = x^2\).
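As a numerical sanity check for the rose-curve problem (§14.4, #4), here is a minimal sketch that estimates the polar area integral \(\int \tfrac{1}{2} r^2\,d\theta\) with the midpoint rule. The helper name `polar_area` and the choice of the loop \(-\pi/4 \leq \theta \leq \pi/4\) (where \(r = 2\cos(2\theta)\) vanishes at both endpoints) are assumptions made for illustration.

```python
import math

def polar_area(r, theta_lo, theta_hi, n=20000):
    """Midpoint-rule estimate of the integral of (1/2) r(theta)^2 d(theta)."""
    h = (theta_hi - theta_lo) / n
    total = 0.0
    for i in range(n):
        theta = theta_lo + (i + 0.5) * h
        total += 0.5 * r(theta) ** 2 * h
    return total

# One loop of r = 2 cos(2 theta): r vanishes at theta = -pi/4 and theta = pi/4.
loop_area = polar_area(lambda t: 2 * math.cos(2 * t), -math.pi / 4, math.pi / 4)
print(loop_area, math.pi / 2)  # the two values should agree closely
```

The same helper can be pointed at other curves of the form \(r = A\cos(n\theta)\) once the bounds of one loop are known.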
21-12-06
- Draw the following regions in the \(xy\)-plane, paying special attention to where the bounding curves intersect and which ones are on top of / below which others:
- \(R_1\) is where \(-1 \leq x \leq 1\) and \(0 \leq y \leq 2\).
- \(R_2\) is the region of finite area bounded by \(y = x\) and \(y = x^2\).
- \(R_3\) is the region of finite area bounded by \(y = 1 + x^2\) and \(y = 2x^2\).
- \(R_4\) is the region of finite area bounded by \(y^4 = x^2\) and \(|y| = 2\).
- The simplest double integrals can be written as a product of single-variable integrals. If \(f(x, y) = a(x)b(y)\), where \(a\) and \(b\) respectively do not depend on \(y\) and \(x\), then \[\begin{align*} \iint_{R_1} f(x, y) \,dA = \int_0^2 \int_{-1}^1 f(x, y)\,dx \,dy = \left( \int_{-1}^1 a(x)\,dx \right) \left(\int_0^2 b(y)\,dy\right). \end{align*}\] (In class, we took \(a(x) = \cos(\pi x/2)\) and \(b(y) = \sin(\pi y/2)\).)
- If your region of integration is not a rectangle, then the bounds of your inner integral may depend on the variable of the outer integral. For instance: \[\begin{align*} \iint_{R_3} f(x, y)\,dA = \int_{-1}^1 \int_{2x^2}^{1 + x^2} f(x, y)\,dy\,dx. \end{align*}\] Notice that it would be harder to express this integral in the form \(\int \int f \,dx\,dy\), with \(x\) on the inside and \(y\) on the outside.
- Sometimes both orders of integration are equally easy to write down. For instance, if \(T\) is the triangle whose vertices are \((0, 0)\), \((0, 1)\), \((1, 1)\), then \[\begin{align*} \iint_T f(x, y)\,dA = \int_0^1 \int_x^1 f(x, y)\,dy\,dx = \int_0^1 \int_0^y f(x, y)\,dx\,dy. \end{align*}\] However, depending on \(f\), one order of integration may be easier to evaluate than the other. For instance, take \(f(x, y) = e^{-y^2}\). The iterated integral \[\begin{align*} \int_0^1 \int_x^1 e^{-y^2}\,dy\,dx \end{align*}\] is not tractable (based on what we’ve covered), because \(e^{-y^2}\) has no elementary antiderivative in \(y\), whereas \[\begin{align*} \int_0^1 \int_0^y e^{-y^2}\,dx\,dy = \int_0^1 ye^{-y^2}\,dy, \end{align*}\] which we can evaluate using the substitution \(u = y^2\).
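To see that the two orders of integration for \(f(x, y) = e^{-y^2}\) over the triangle \(T\) really agree, here is a rough numerical sketch. The helper `midpoint` is a hypothetical name, and the closed form \((1 - e^{-1})/2\) comes from the substitution \(u = y^2\) in \(\int_0^1 ye^{-y^2}\,dy\).

```python
import math

def midpoint(f, lo, hi, n=400):
    """Midpoint-rule estimate of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * h) for i in range(n)) * h

def g(y):
    return math.exp(-y * y)

# Order dy dx: the inner integral has no elementary antiderivative, so estimate it.
dy_dx = midpoint(lambda x: midpoint(g, x, 1.0), 0.0, 1.0)

# Order dx dy: the inner integral is simply y * exp(-y^2).
dx_dy = midpoint(lambda y: y * g(y), 0.0, 1.0)

exact = (1 - math.exp(-1)) / 2
print(dy_dx, dx_dy, exact)  # all three should agree to several decimal places
```

The numerical route works in either order; the point of switching the order is that only one of them can be finished by hand.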
21-12-01
- General form of a Lagrange-multiplier problem: Find the extrema of a function \(f\), subject to a constraint \(g = 0\). The key observation is that at the extrema, the level curves of \(f\) must be tangent to the constraint curve, so the gradients of \(f\) and \(g\) must be parallel or anti-parallel: \[\begin{align*} \nabla f = \lambda \nabla g \end{align*}\] for some constant \(\lambda\) called the Lagrange multiplier. So find the solutions of this vector equation that also satisfy \(g = 0\).
- If a hexagon is inscribed in the unit circle such that its vertices are \((0, \pm 1)\) and \((a, \pm b)\) and \((-a, \pm b)\), where \(a, b > 0\), how can we pick \(a, b\) to maximize its area? Show that the area is given by \[\begin{align*} A = 4 \cdot \frac{1}{2}a(1 + b) = 2a(1 + b). \end{align*}\] The constraint is \(a^2 + b^2 = 1\). So here, take \(f = A\) and \(g = a^2 + b^2 - 1\). The gradient equation gives the system of equations \[\begin{align*} 2(1 + b) &= 2\lambda a,\\ 2a &= 2\lambda b. \end{align*}\] Therefore, \((1 + b)/a = a/b\), from which \(b + b^2 = a^2\). Plugging into the constraint gives \(2b^2 + b - 1 = 0\), which yields \(b = 1/2, -1\). Discarding the negative solution, \(b = 1/2\) and \(a = \sqrt{3}/2\). These values produce a regular hexagon.
- If a triangle has sides of length \(x_1, x_2, x_3\), and we set \(s = \frac{1}{2}(x_1 + x_2 + x_3)\), then its area is given by Heron’s formula \[\begin{align*} A = \sqrt{s(s - x_1)(s - x_2)(s - x_3)}. \end{align*}\] If the perimeter is held fixed, what triangle maximizes the area? It suffices to do a Lagrange-multiplier problem for \(A^2\) subject to the constraint \(s = 1\). The trick to avoid a mess is to simplify \(A^2\) using the constraint before computing partials: \[\begin{align*} A^2|_{s = 1} = (1 - x_1)(1 - x_2)(1 - x_3). \end{align*}\] The gradient equation gives the system of equations \[\begin{align*} (1 - x_2)(1 - x_3) &= \frac{1}{2}\lambda,\\ (1 - x_3)(1 - x_1) &= \frac{1}{2}\lambda,\\ (1 - x_1)(1 - x_2) &= \frac{1}{2}\lambda. \end{align*}\] Combining the first two equations, \((1 - x_2)(1 - x_3) = (1 - x_3)(1 - x_1)\). Since \(x_3 < s = 1\), we can divide out \(1 - x_3\) on both sides, giving \(x_1 = x_2\). Similarly, combining the last two equations, \(x_2 = x_3\). So \(x_1 = x_2 = x_3\), and the area is maximized by the equilateral triangle.
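The hexagon computation above can be cross-checked without Lagrange multipliers by eliminating \(a\) via the constraint and scanning over \(b\). This is only a brute-force sketch; the function name `hexagon_area` and the grid resolution are arbitrary choices.

```python
import math

def hexagon_area(b):
    """Area 2a(1 + b), with a determined by the constraint a^2 + b^2 = 1."""
    a = math.sqrt(1 - b * b)
    return 2 * a * (1 + b)

# Scan b over a fine grid in (0, 1) and keep the value with the largest area.
candidates = [i / 10000 for i in range(1, 10000)]
best_b = max(candidates, key=hexagon_area)
best_a = math.sqrt(1 - best_b ** 2)
print(best_b, best_a, hexagon_area(best_b))
```

The scan should land near \(b = 1/2\) and \(a = \sqrt{3}/2\), with area close to \(3\sqrt{3}/2\), the area of a regular hexagon inscribed in the unit circle.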
21-11-29
- Explain the reasoning behind the 2nd-derivative test. Writing \(\vec{r} = (x, y)\) and \(\Delta\vec{r} = (\Delta x, \Delta y)\), we see that any twice-differentiable function \(f(\vec{r})\) has a Taylor approximation \[\begin{align*}
f(\vec{r} + \Delta\vec{r})
&\approx f(\vec{r}) + f_x(\vec{r})\Delta x + f_y(\vec{r})\Delta y \\
&\qquad + \frac{1}{2}(f_{xx}(\vec{r})(\Delta x)^2 + f_{xy}(\vec{r}) (\Delta x)(\Delta y) + f_{yx}(\vec{r}) (\Delta y)(\Delta x) + f_{yy}(\vec{r})(\Delta y)^2).\nonumber
\end{align*}\] Writing \(H = \begin{pmatrix} f_{xx} &f_{xy} \\ f_{yx} &f_{yy}\end{pmatrix}\), we get \[\begin{align*}
f(\vec{r} + \Delta \vec{r})
&\approx f(\vec{r}) + \nabla f(\vec{r}) \cdot \Delta \vec{r} + \frac{1}{2}(\Delta \vec{r})^t H \Delta \vec{r}.
\end{align*}\] (Recall that \((-)^t\) means “transpose”.) At a critical point, the linear term vanishes. The sign of the quadratic term tells us the concavity of \(f\) at \(\vec{r}\) in the direction of \(\Delta \vec{r}\), so we want to know if different directions give different signs. Assuming \(f_{xy} = f_{yx}\), the matrix \(H\) is symmetric, so by the spectral theorem, it can be diagonalized: If \(H\) has eigenvalues \(\lambda, \mu\) corresponding to orthonormal eigenvectors \(\vec{u}, \vec{v}\), then \[\begin{align*}
(\Delta \vec{r})^t H \Delta \vec{r} = \lambda (\Delta \vec{u})^2 + \mu (\Delta \vec{v})^2,
\end{align*}\] where \(\Delta \vec{u}, \Delta \vec{v}\) are the components of \(\Delta \vec{r}\) in the basis formed by \(\vec{u}\) and \(\vec{v}\). Recall that the eigenvalues are related to the characteristic polynomial by \[\begin{align*}
\det(tI - H) = (t - \lambda)(t - \mu).
\end{align*}\] At the same time, \[\begin{align*}
\det(tI - H) = t^2 - (f_{xx} + f_{yy})t + \overbrace{f_{xx}f_{yy} - f_{xy}^2}^{\Delta}.
\end{align*}\] Matching coefficients of powers of \(t\), we get \[\begin{align*}
\lambda + \mu &= f_{xx} + f_{yy},\\
\lambda\mu &= \Delta.
\end{align*}\] So some possibilities are:
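- \(\Delta > 0\) and \(f_{xx} > 0\): This forces \(\lambda\) and \(\mu\) to be both positive, so the critical point is a local minimum. (Since \(\Delta > 0\) gives \(f_{xx}f_{yy} > f_{xy}^2 \geq 0\), the sum \(f_{xx} + f_{yy}\) has the same sign as \(f_{xx}\).)
- \(\Delta > 0\) and \(f_{xx} < 0\): This forces \(\lambda\) and \(\mu\) to be both negative, so the critical point is a local maximum.
- \(\Delta < 0\): This forces \(\lambda\) and \(\mu\) to have opposite signs, so the critical point is a saddle.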
- Discuss problem 1(b) from problem set 3. The key idea is that the velocity vector \((dx/dt, dy/dt)\) of the shark must point in the same direction as \(\nabla C = (C_x, C_y)\), giving us \(dy/dx = (dy/dt)/(dx/dt) = C_y/C_x\).
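The case analysis in the 2nd-derivative test above can be written as a small function. This is a sketch under the assumption \(f_{xy} = f_{yx}\); the name `classify` and the sample Hessians are hypothetical. It also returns the eigenvalues \(\lambda, \mu\) as the roots of \(t^2 - (f_{xx} + f_{yy})t + \Delta\), so you can compare their signs with the verdict.

```python
import math

def classify(fxx, fxy, fyy):
    """2nd-derivative test for the symmetric Hessian [[fxx, fxy], [fxy, fyy]]."""
    trace = fxx + fyy
    delta = fxx * fyy - fxy ** 2
    # Roots of t^2 - trace*t + delta. For a symmetric matrix the discriminant
    # equals (fxx - fyy)^2 + 4*fxy^2 >= 0; max() guards against rounding.
    root = math.sqrt(max(0.0, trace ** 2 - 4 * delta))
    eigenvalues = ((trace - root) / 2, (trace + root) / 2)
    if delta > 0:
        kind = "local minimum" if fxx > 0 else "local maximum"
    elif delta < 0:
        kind = "saddle"
    else:
        kind = "inconclusive"
    return kind, eigenvalues

print(classify(2, 0, 2))    # both eigenvalues positive
print(classify(-2, 1, -2))  # both eigenvalues negative
print(classify(2, 3, 2))    # eigenvalues of opposite sign
```

When \(\Delta = 0\) the quadratic term degenerates in some direction, which is why the sketch reports that case as inconclusive rather than guessing.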
21-11-24
Have a restful Thanksgiving.
- Discuss problems 5 and 6 from problem set 3.
- Critical points can be local maxima, local minima, saddle points, or something in between. Consider the family of functions \[\begin{align*} f(x, y) = x^2 + y^2 + txy, \end{align*}\] depending on \(t\). We calculate \(\Delta = 4 - t^2\). If \(|t|< 2\), then \(f\) is a paraboloid with a local minimum at the origin. If \(t = \pm 2\), then \(f\) is a parabolic sheet, flat along the \(x \pm y = 0\) direction. If \(|t| > 2\), then \(f\) is a saddle centered at the origin.
- Consider a rectangular box with (positive) dimensions \(x\), \(y\), \(z\). Its volume is \(V = xyz\), its surface area is \(A = 2(xy + xz + yz)\), and its total edge length is \(L = 4(x + y + z)\). In lecture, Professor Bush showed that if \(V\) is fixed, then \(A\) is minimized when the box is a cube. Let’s show that \(L\) is also minimized when the box is a cube. The key is that fixing volume reduces us to two free parameters. For instance, we can write \(z\) as the function \(z = \frac{V}{xy}\). Now we want to minimize \[\begin{align*} L = 4\left(x + y + \frac{V}{xy}\right) \end{align*}\] over all \(x, y > 0\). By setting \(\partial L/\partial x = 0\) and \(\partial L/\partial y = 0\), we find that the critical points occur when \(V = x^2 y\) and \(V = xy^2\) simultaneously. This forces \(x = y\), and hence \(z = V/(xy) = x = y\). (It is indeed a minimum because \(L_{xx} = 8Vx^{-3}y^{-1} > 0\) and \(L_{xx}L_{yy} - L_{xy}^2 = 48V^2x^{-4}y^{-4} > 0\).)
- Extra: Read about the classification of \(2\)nd-order partial differential equations.
- Extra: Read about the classification of Möbius transformations.
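For the box problem above, a quick numerical experiment can make the claim that the cube minimizes \(L\) more concrete. This sketch fixes a hypothetical volume \(V = 8\) (so the cube has side \(2\)) and compares \(L\) at the cube against nearby values of \((x, y)\); the perturbation sizes are arbitrary.

```python
def edge_length(x, y, volume):
    """Total edge length L = 4(x + y + V/(xy)) after eliminating z = V/(xy)."""
    return 4 * (x + y + volume / (x * y))

V = 8.0                 # hypothetical fixed volume; the cube then has side 2
side = V ** (1 / 3)
at_cube = edge_length(side, side, V)

# Perturb x and y slightly in every direction and record L each time.
steps = [-0.1, -0.01, 0.0, 0.01, 0.1]
nearby = [edge_length(side + dx, side + dy, V) for dx in steps for dy in steps]
print(at_cube, min(nearby))  # no perturbation should beat the cube
```

This only probes a neighborhood of the critical point; the 2nd-derivative test in the notes is what actually justifies the minimum.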
21-11-22
- When people talk about the radial direction at a point \((x_1, \ldots, x_n)\), they mean the unit vector pointing from the origin to \((x_1, \ldots, x_n)\).
- Let \(f(x, y) = \arctan(y/x)\). Compute \(\nabla f(x, y)\), \(|\nabla f(x, y)|\), and \(D_{\vec{u}} f(x, y)\), where \(\vec{u}\) is the radial direction. You should get \[\begin{align*} \nabla f = \left(\partial_x f, \partial_y f\right) = \left(-\frac{y}{x^2 + y^2}, \frac{x}{x^2 + y^2}\right), \end{align*}\] whence \(|\nabla f| = 1/\sqrt{x^2 + y^2}\) and \(D_{\vec{u}} f = \nabla f \cdot \vec{u} = 0\). There’s a more conceptual way to arrive at \(D_{\vec{u}} f\): Observe that \(f(x, y)\) equals the angle between the positive \(x\)-axis and \((x, y)\), so along any ray extending outward from the origin, \(f(x, y)\) is constant.
- Above, the level curves of \(f\) are the rays extending radially from the origin. By contrast, for a function like \(g(x, y) = 1/(1 + x^2 + y^2)\) that only depends on \(x\) and \(y\) via the radius \(\sqrt{x^2 + y^2}\), the level curves must be circles.
- A normal vector to a surface \(F(x, y, z) = C\) at a point \((a, b, c)\) is given by \(\nabla F(a, b, c)\), as this is normal to its tangent plane there.
- The surfaces \[\begin{align*} 2 &= x^2 + y^2 + z^2,\\ z &= x^2 + y^2 \end{align*}\] intersect along a curve \(C\): Find normal vectors to the surfaces along \(C\), and use them to compute the angle between the surfaces along \(C\). First, substituting \(x^2 + y^2 = z\) into the first equation shows that the intersection occurs along \(z^2 + z - 2 = 0\). So either \(z = 1\) or \(z = -2\). We need the former, because \(z = x^2 + y^2 \geq 0\). So altogether, \(C\) is described by the two equations \[\begin{align*} x^2 + y^2 = z = 1. \end{align*}\] Let \(F(x, y, z) = x^2 + y^2 + z^2\) and \(G(x, y, z) = x^2 + y^2 - z\). Then the surfaces are \(F = 2\) and \(G = 0\), so we can use the gradient formula to compute their normals along \(C\): \[\begin{align*} \nabla F &= (2x, 2y, 2z) = (2x, 2y, 2),\\ \nabla G &= (2x, 2y, -1). \end{align*}\] The angle \(\theta\) between the surfaces at a given point is the angle between their normals at that point, so using \(x^2 + y^2 = 1\), we arrive at \[\begin{align*} \cos \theta = \frac{\nabla F \cdot \nabla G}{|\nabla F||\nabla G|} = \frac{4 - 2}{\sqrt{8}\sqrt{5}} = \frac{1}{\sqrt{10}}. \end{align*}\] Apparently, \(\theta = \arccos(1/\sqrt{10})\) does not have an elementary formula.
- Extra: If you have \(m\) equations constraining \(n\) unknowns (and \(n \geq m\)), then the dimension of the solution set is \(\geq n - m\) (for a consistent linear system), and equality holds when the coefficients are sufficiently generic. But if \(x_1, x_2, y_1, y_2\) are real numbers and we set \[\begin{align*} x &= x_1 + x_2\sqrt{-1},\\ y &= y_1 + y_2\sqrt{-1}, \end{align*}\] then an equation of the form \(f(x, y) = 0\) cuts out a two-dimensional surface in four-dimensional \(x_1x_2y_1y_2\)-space: There’s one constraint coming from the real part, and another from the imaginary part. With this notation, what does the graph of \[\begin{align*} y^2 = x^3 \end{align*}\] look like in four-dimensional space? To learn more about surfaces like these, click on the drawing of a knot in the corner of the Math section of my website.
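For the problem above about the angle between the sphere \(x^2 + y^2 + z^2 = 2\) and the paraboloid \(z = x^2 + y^2\), the following sketch evaluates both gradients at several sample points of \(C\) and computes the cosine of the angle between them. The function names and the choice of eight sample points are assumptions for illustration.

```python
import math

def grad_F(x, y, z):
    return (2 * x, 2 * y, 2 * z)   # F = x^2 + y^2 + z^2

def grad_G(x, y, z):
    return (2 * x, 2 * y, -1.0)    # G = x^2 + y^2 - z

def cos_angle(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Sample points on C, the circle x^2 + y^2 = 1 at height z = 1.
cosines = []
for k in range(8):
    phi = 2 * math.pi * k / 8
    p = (math.cos(phi), math.sin(phi), 1.0)
    cosines.append(cos_angle(grad_F(*p), grad_G(*p)))
print(cosines, 1 / math.sqrt(10))
```

Every entry should match \(1/\sqrt{10}\), reflecting the fact that the angle between the surfaces is the same at every point of \(C\).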
21-11-17
- Review for test 1 (on Nov. 17). In recitation, we focused on the linear-algebra portions of the practice test: namely, problems 1, 3, and especially 4.
- The gradient of a function \(f(x_1, \ldots, x_n)\) is a vector field, i.e., a function that sends vectors to other vectors: \[\begin{align*} \nabla f (x_1, \ldots, x_n) = (\partial_{x_1} f, \ldots, \partial_{x_n} f). \end{align*}\] At any given point, it describes the direction in \(n\)-dimensional space in which \(f\) is increasing most quickly.
- The gradient of \(f(x, y) = x^2 - y^2\) is given by \(\nabla f (x, y) = (2x, -2y)\). We see that \(\nabla f(0, 1) = (0, -2)\), so at the point \((x, y) = (0, 1)\), the function is increasing most quickly in the negative \(y\)-direction. In fact, the graph of \(z = x^2 - y^2\) is a saddle.
- Addendum: Professor Bush says the only lecture 11 material on the test would be an “easy” question about gradients and/or directional derivatives. We did not get to the latter during recitation. For practice, please see §13.8 in Edwards–Penney, problems 1-20.
- Addendum: The general formula for the unit normal vector to \(\vec{r}(t) \in \mathbb{R}^3\) is \[\begin{align*} \vec{N} = \frac{\vec{T}'}{|\vec{T}'|}, \end{align*}\] where \(\vec{T}' = d\vec{T}/dt\), the derivative of the unit tangent vector with respect to time. In the special case where \(\vec{r}(t) = (x(t), y(t), 0)\), this simplifies to the formula involving \(\hat{k}\).
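Since directional derivatives were not covered in this recitation, here is a small sketch relating them to the gradient example \(f(x, y) = x^2 - y^2\) above. It approximates \(\nabla f\) by central differences and then forms \(D_{\vec{u}} f = \nabla f \cdot \vec{u}\) for a unit vector \(\vec{u}\). The helper names and the step size `h` are assumptions.

```python
import math

def f(x, y):
    return x * x - y * y

def numerical_gradient(func, x, y, h=1e-6):
    """Central-difference approximation to (f_x, f_y) at (x, y)."""
    fx = (func(x + h, y) - func(x - h, y)) / (2 * h)
    fy = (func(x, y + h) - func(x, y - h)) / (2 * h)
    return fx, fy

def directional_derivative(func, x, y, direction):
    """Dot the gradient with the unit vector in the given direction."""
    ux, uy = direction
    norm = math.hypot(ux, uy)
    gx, gy = numerical_gradient(func, x, y)
    return (gx * ux + gy * uy) / norm

grad = numerical_gradient(f, 0.0, 1.0)
steepest = directional_derivative(f, 0.0, 1.0, (0.0, -1.0))
sideways = directional_derivative(f, 0.0, 1.0, (1.0, 0.0))
print(grad, steepest, sideways)
```

At \((0, 1)\) the derivative in the negative \(y\)-direction should come out close to \(|\nabla f| = 2\), while the derivative in the \(x\)-direction should be close to \(0\).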
21-11-15
- Warm-up: Sketch the graphs of \[\begin{align*}z = x^2 + y^2,\quad z = \sqrt{x^2 + y^2},\quad z = \frac{4}{\pi}\arctan(xy).\end{align*}\] Qualitatively, what is the difference between the first two at \((x, y) = (0, 0)\)?
- Discuss \(1\)st- and \(2\)nd-order partial derivatives. Clairaut’s theorem says that \[\begin{align*} \frac{\partial^2 f}{\partial y \partial x}(a, b) = \frac{\partial^2 f}{\partial x \partial y}(a, b) \end{align*}\] as long as both \(\frac{\partial^2 f}{\partial y \partial x}\) and \(\frac{\partial^2 f}{\partial x \partial y}\) exist and are continuous at \((a, b)\). To illustrate the theorem, compute both sides for \(f(x, y) = xe^y\) and arbitrary \((a, b)\).
- We will abbreviate \(f_x = \partial f/\partial x\) and \(f_y = \partial f/\partial y\). Compute these partial derivatives for \(f(x, y) = \sqrt{x^2 + y^2}\) and \(f(x, y) = \frac{4}{\pi}\arctan(xy)\).
- We can use \(1\)st-order partial derivatives to compute tangent planes. As motivation, recall that the tangent line to \(y = g(x)\) at \((x, y) = (a, b)\) is given by \[\begin{align*} y = b + g'(a)(x - a). \end{align*}\] Similarly, the tangent plane to \(z = f(x, y)\) at \((x, y, z) = (a, b, c)\) is given by \[\begin{align*} z = c + f_x(a, b)(x - a) + f_y(a, b)(y - b). \end{align*}\] Compute the tangent planes to \(z = \sqrt{x^2 + y^2}\) at \((-3, 4, 5)\) and \(z = \frac{4}{\pi}\arctan(xy)\) at \((1, \sqrt{3}, 4/3)\). Does the first of these surfaces have a well-defined tangent plane at \((0, 0, 0)\)?
- More generally, the tangent plane to the surface cut out by \(F(x, y, z) = 0\) at \((x, y, z) = (a, b, c)\) is \[\begin{align*} F_x(a, b, c)(x - a) + F_y(a, b, c)(y - b) + F_z(a, b, c)(z - c) = 0. \end{align*}\] Check that this formula specializes to the previous one when \(F(x, y, z) = z - f(x, y)\).
- Discuss problem 7 on problem set 2. If a smooth (oriented) curve lies along a smooth surface, then its unit tangent vector at any point lies in the tangent plane to the surface at that point. In particular, if the curve is contained in two different surfaces, then its tangent vector at \(\vec{p}\) lies in the intersection of their tangent planes at \(\vec{p}\).
- Extra: For an example where the failure of continuity implies the failure of Clairaut’s identity, see Wikipedia.
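The tangent-plane formula \(z = c + f_x(a, b)(x - a) + f_y(a, b)(y - b)\) above can be tried out numerically. This sketch uses central differences for the partial derivatives and applies the formula to the cone \(z = \sqrt{x^2 + y^2}\) at \((-3, 4, 5)\); the helper name `tangent_plane` and the step size are assumptions.

```python
import math

def cone(x, y):
    return math.sqrt(x * x + y * y)

def tangent_plane(func, a, b, h=1e-6):
    """Return the tangent-plane function z(x, y) at (a, b) and the partials used."""
    c = func(a, b)
    fx = (func(a + h, b) - func(a - h, b)) / (2 * h)
    fy = (func(a, b + h) - func(a, b - h)) / (2 * h)

    def plane(x, y):
        return c + fx * (x - a) + fy * (y - b)

    return plane, (fx, fy)

plane, (fx, fy) = tangent_plane(cone, -3.0, 4.0)
print(fx, fy)           # analytically, f_x = x/r and f_y = y/r with r = 5
print(plane(0.0, 0.0))  # height of this tangent plane above the origin
```

The partials should come out near \(-3/5\) and \(4/5\), and the plane should pass (up to rounding) through the origin, which is the apex of the cone. The central differences are not meaningful at \((0, 0)\) itself, where the cone has a corner.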
21-11-10
- Review tangent and normal vectors: If \(\vec{r}(t) = (x(t), y(t))\) is position as a function of time \(t\), then \(\vec{v}(t) = (x'(t), y'(t))\) is velocity. The unit tangent vector to \(\vec{r}\) is \[\begin{align*} \vec{T} = \frac{\vec{v}}{|\vec{v}|}, \end{align*}\] and the unit normal vector to \(\vec{r}\) is \[\begin{align*} \vec{N} = \pm\hat{k} \times \vec{T}, \end{align*}\] where \(\hat{k}\) is the unit vector that points “out of the page” and the sign is positive, resp. negative, iff the trajectory is counterclockwise, resp. clockwise.
- By definition, the curvature of \(\vec{r}\) is the scalar \[\begin{align*} \kappa = \left|\frac{d\vec{T}}{ds}\right|, \end{align*}\] where \(s\) is arclength measured along the trajectory of \(\vec{r}\). In fact, since \(\vec{T}\) has unit length at every point, \(d\vec{T}/ds\) must be perpendicular to \(\vec{T}\), which in turn shows that \(d\vec{T}/ds\) is a scalar multiple of \(\vec{N}\). The unsigned curvature is the scaling factor: \[\begin{align*} \frac{d\vec{T}}{ds} = \kappa \vec{N}. \end{align*}\] In particular, when \(\frac{d\vec{T}}{ds}\) and \(\vec{N}\) are both nonzero, they point in the same direction.
- The unsigned curvature of a 2D trajectory \(\vec{r}(t) = (x(t), y(t))\) is given by \[\begin{align*} \kappa = \frac{1}{|\vec{v}|^3}(|\vec{v} \times \vec{a}|). \end{align*}\] The key step in deriving it is to write the acceleration vector \(\vec{a}\) as a sum of (perpendicular) tangential and centripetal components: \[\begin{align*} \vec{a} = \frac{d}{dt}\vec{v} = \frac{d}{dt}\left(\frac{ds}{dt}\vec{T}\right) = \frac{d^2s}{dt^2} \vec{T} + \left(\frac{ds}{dt}\right)^2 \frac{d\vec{T}}{ds}. \end{align*}\] Above, the last equality uses a combination of the product and chain rules. Using \(ds/dt = |\vec{v}|\) and \(d\vec{T}/ds = \kappa \vec{N}\), we arrive at \[\begin{align*} \vec{a} = \frac{d^2s}{dt^2} \vec{T} + \kappa |\vec{v}|^2 \vec{N}. \end{align*}\] We want to isolate \(\kappa\), so we take the cross product with \(\vec{v}\) to kill the tangential part: \[\begin{align*} \vec{v} \times \vec{a} = 0 + \vec{v} \times (\kappa |\vec{v}|^2 \vec{N}) = \kappa |\vec{v}|^3 \hat{k}. \end{align*}\] So we obtain \(|\vec{v} \times \vec{a}| = \kappa|\vec{v}|^3\). Rearranging gives the desired formula.
- The notes for lecture 9 discuss the signed curvature \[\begin{align*}k = \frac{1}{|\vec{v}|^3}(\hat{k} \cdot (\vec{v} \times \vec{a}))\end{align*}\] instead of the unsigned curvature \(\kappa\). The signed version is only defined for two-dimensional trajectories.
- For the circular trajectory \(\vec{r}(t) = (\alpha \cos(\omega t), \alpha \sin(\omega t))\), calculate \(\vec{v}\), \(\vec{a}\), \(|\vec{v}|\), \(\vec{T}\), \(\vec{N}\), and \(k\). Ultimately, you should find that \[\begin{align*}k = \kappa = \frac{1}{\alpha}.\end{align*}\] In general, the local radius of curvature is equal to \(1/\kappa\). What we see above is the special case where curvature, and radius of curvature, are constant. This example also shows that curvature can be arbitrarily large.
- When \(y = f(x)\), the signed curvature formula simplifies to \[\begin{align*}k = \frac{f''}{(1 + (f')^2)^{3/2}}.\end{align*}\] In the case where \(f(x) = \alpha x^3\) for some nonzero \(\alpha\), we get \[\begin{align*}k = \frac{6\alpha x}{(1 + 9\alpha^2x^4)^{3/2}}.\end{align*}\] Notice that as \(x\) grows large, \(k\) tends to \(0\). This makes sense, because the graph of a cubic looks more and more straight as we go further and further out horizontally.
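The signed-curvature formula \(k = \hat{k} \cdot (\vec{v} \times \vec{a}) / |\vec{v}|^3\) above is easy to evaluate directly. This sketch checks it on the circular trajectory and on the cubic \(y = \alpha x^3\) (parametrized by \(x\), so that \(\vec{v} = (1, f')\) and \(\vec{a} = (0, f'')\)). The sample values of \(\alpha\), \(\omega\), and \(t\) are hypothetical.

```python
import math

def signed_curvature(v, a):
    """k = (k-hat . (v x a)) / |v|^3 for planar velocity v and acceleration a."""
    cross_z = v[0] * a[1] - v[1] * a[0]
    return cross_z / math.hypot(v[0], v[1]) ** 3

# Circle of radius alpha, traversed counterclockwise at angular speed omega.
alpha, omega, t = 2.0, 3.0, 0.7   # hypothetical sample values
v = (-alpha * omega * math.sin(omega * t), alpha * omega * math.cos(omega * t))
a = (-alpha * omega ** 2 * math.cos(omega * t), -alpha * omega ** 2 * math.sin(omega * t))
k_circle = signed_curvature(v, a)

def k_cubic(alpha, x):
    """Signed curvature of the graph y = alpha * x^3 at the given x."""
    return signed_curvature((1.0, 3 * alpha * x ** 2), (0.0, 6 * alpha * x))

print(k_circle, 1 / alpha)
print(k_cubic(1.0, 1.0), k_cubic(1.0, 100.0))
```

The circle should give \(1/\alpha\) regardless of \(\omega\) and \(t\), and the cubic's curvature should shrink toward \(0\) as \(x\) grows.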
21-11-08
- To describe a line, we need a starting point \((x_0, y_0, z_0)\) and a direction vector \((a, b, c)\). The line then consists of the points \[\begin{align*}(x, y, z) = (x_0, y_0, z_0) + t (a, b, c)\end{align*}\] as \(t\) runs over all real numbers.
- Discuss a problem that Professor Bush meant to cover in lecture 7: Writing \(L\) for the line through the points \(p_1 = (0, -1, 1)\) and \(p_2 = (2, 1, 0)\), find the equation of \(L\), and then find the shortest distance from \(L\) to the origin. In recitation, we chose the starting point \(p_2\) and the direction vector \(p_1 - p_2 = (-2, -2, 1)\), giving \[\begin{align*}(x, y, z) = (2, 1, 0) + t(-2, -2, 1) = (2 - 2t, 1 - 2t, t).\end{align*}\] The distance from \((x, y, z)\) to the origin is \[\begin{align*}D = \sqrt{x^2 + y^2 + z^2} = \sqrt{9t^2 - 12t + 5}.\end{align*}\] To minimize \(D\), it suffices to minimize \(D^2\), which is a bit simpler. Solving \(d(D^2)/dt = 0\) for \(t\) yields \(t = 2/3\), so the distance is minimized at \((2/3, -1/3, 2/3)\), where it equals \(D = 1\).
- To describe a plane, we need a point \((x_0, y_0, z_0)\) and a normal vector \((A, B, C)\). The plane then consists of the points \((x, y, z)\) such that \[\begin{align*}(A, B, C) \cdot ((x, y, z) - (x_0, y_0, z_0)) = 0,\end{align*}\] which simplifies to \(Ax + By + Cz = Ax_0 + By_0 + Cz_0\).
- Writing \(P\) for the plane that passes through the points \(p_1 = (80, 0, 140)\) and \(p_2 = (540, 75, 0)\) and \(p_3 = (0, -63, 181)\), find the equation of \(P\), and then find the shortest distance from \(P\) to the point \((-12, -21, 1)\). To get a normal vector for \(P\), we can choose any two displacement vectors going between \(p_1\), \(p_2\), \(p_3\) (for instance, \(p_2 - p_1\) and \(p_3 - p_1\)), then form their cross product, as it will be perpendicular to both. One possible equation for the plane is \[\begin{align*}\frac{1}{4}x + \frac{1}{3}y + z = 160.\end{align*}\] To do the second part of the problem, it’s actually faster to use geometry than calculus. Note that the shortest line segment from \(P\) to \((-12, -21, 1)\) will lie perpendicular to \(P\), and hence, parallel or anti-parallel with any normal vector of \(P\). From the equation above, we see that \((1/4, 1/3, 1)\) is such a normal vector, so the line segment we want is part of the line \[\begin{align*}(x, y, z) = (-12, -21, 1) + t(\tfrac{1}{4}, \tfrac{1}{3}, 1) = (-12 + \tfrac{1}{4}t, -21 + \tfrac{1}{3}t, 1 + t).\end{align*}\] To find the point of intersection between this line and \(P\), we plug this formula into the equation of the plane and solve for \(t\). That gives \(t = 144\). We can now solve for the point of intersection, and thus, the distance between it and \((-12, -21, 1)\).
- What’s the equation of the plane of all points equidistant from \(p = (1, 0, -2)\) and \(q = (3, 4, 0)\)? The plane has to bisect the line segment between \(p\) and \(q\), so it passes through the midpoint \((2, 2, -1)\). It must also be perpendicular to that segment, so for the normal vector, we can just use the displacement from \(p\) to \(q\), which is \(q - p = (2, 4, 2)\). This gives \(2(x - 2) + 4(y - 2) + 2(z + 1) = 0\), which simplifies to \(x + 2y + z = 5\).
- Discuss a variant of the cycloid problem. In lecture 7, Professor Bush presented a cycloid curve traced out by a point on the edge of a rolling wheel, parametrized by the total amount of rotation \(\theta\) done by the point relative to the wheel’s axle: \[\begin{align*}(x, y) = (a\theta, a) + (-a\sin \theta, -a\cos \theta).\end{align*}\] Above, the first vector on the right describes the position of the axle as a function of \(\theta\). The second vector describes the position of the edge point relative to the axle.
- What if, instead of parametrizing the trajectory of a point on the edge, we parametrized a point only halfway along the radius? (For instance, the reflector on a bicycle wheel, which is normally placed on the spokes rather than on the tire.) I claim that the equation would change to \[\begin{align*}(x, y) = (a\theta, a) + (-\tfrac{a}{2}\sin \theta, -\tfrac{a}{2}\cos \theta).\end{align*}\] The first vector remains the same because the trajectory of the axle doesn’t change; the second vector changes by the substitution \(a \mapsto a/2\). The two contributions to the trajectory are independent of one another.
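The two distance problems above (from the line \(L\) to the origin, and from the plane \(\tfrac{1}{4}x + \tfrac{1}{3}y + z = 160\) to the point \((-12, -21, 1)\)) can both be done with dot products. The sketch below uses exact fractions; the variable names are hypothetical, and the line computation uses perpendicularity in place of the calculus argument from recitation.

```python
from fractions import Fraction
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Line through p2 = (2, 1, 0) with direction p1 - p2 = (-2, -2, 1).
start, direction = (2, 1, 0), (-2, -2, 1)
# The closest point to the origin satisfies (start + t*direction) . direction = 0.
t = Fraction(-dot(start, direction), dot(direction, direction))
closest = tuple(s0 + t * d for s0, d in zip(start, direction))
line_distance = math.sqrt(dot(closest, closest))

# Plane x/4 + y/3 + z = 160 and the point (-12, -21, 1).
normal, rhs = (Fraction(1, 4), Fraction(1, 3), Fraction(1)), 160
point = (-12, -21, 1)
# Parameter along the line point + s*normal at which it meets the plane.
s = (rhs - dot(normal, point)) / dot(normal, normal)
foot = tuple(p + s * n for p, n in zip(point, normal))
plane_distance = math.sqrt(dot(normal, normal)) * abs(s)

print(t, closest, line_distance)
print(s, foot, plane_distance)
```

The parameters should reproduce \(t = 2/3\) for the line and \(t = 144\) for the plane, matching the values found in recitation.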
21-11-03
- Let \(i = \sqrt{-1}\). Assuming \(a, b, c, d\) are real numbers, compare \[\begin{align*}(a + bi) + (c + di) \quad\text{and}\quad (a + bi) \cdot (c + di)\end{align*}\] against \[\begin{align*}\begin{pmatrix} a &-b\\ b &a\end{pmatrix} + \begin{pmatrix} c &-d\\ d &c\end{pmatrix} \quad\text{and}\quad \begin{pmatrix} a &-b\\ b &a\end{pmatrix} \cdot \begin{pmatrix} c &-d\\ d &c\end{pmatrix}.\end{align*}\] What do you notice?
- Review the definitions of eigenvalues and eigenvectors. Discuss their geometric meaning for the following matrices: \[\begin{align*}\begin{pmatrix} 3 &0\\ 0 &1\end{pmatrix},\quad \begin{pmatrix} 1 &0\\ 0 &1\end{pmatrix},\quad \begin{pmatrix} 0 &1\\ 1 &0\end{pmatrix}.\end{align*}\] These matrices are simple enough that we can read off the eigenvectors from the geometry of the linear transformation. In particular, the eigenvalues of a diagonal matrix are simply the diagonal entries.
- In general, the eigenvalues of a matrix \(A\) are the roots, in \(t\), of its characteristic polynomial \(\det(tI - A)\), counted with multiplicity. For a \(2 \times 2\) matrix, we see \[\begin{align*}A = \begin{pmatrix} a &b\\ c &d\end{pmatrix} \implies \det(tI - A) = t^2 - \mathrm{tr}(A)t + \det(A),\end{align*}\] where \(\mathrm{tr}(A) = a + d\) and \(\det(A) = ad - bc\). For example, \[\begin{align*}A = \begin{pmatrix} 2 &1\\ 1 &2\end{pmatrix} \implies \det(tI - A) = (t - 1)(t - 3),\end{align*}\] so the eigenvalues are \(1\) and \(3\).
- Above, if \((x, y)\) is an eigenvector of \(A\) with eigenvalue \(3\), then we must have \[\begin{align*}\begin{pmatrix} 2 &1\\ 1 &2\end{pmatrix} \begin{pmatrix}x \\ y\end{pmatrix} = 3 \begin{pmatrix}x \\ y\end{pmatrix}.\end{align*}\] Rearranging, we see this is equivalent to \[\begin{align*}\begin{pmatrix} -1 &1\\ 1 &-1 \end{pmatrix} \begin{pmatrix}x \\ y\end{pmatrix} = \begin{pmatrix}0 \\ 0\end{pmatrix}.\end{align*}\] The set of solutions is the line \(x = y\). So the nonzero vectors along this line are the eigenvectors with eigenvalue \(3\).
- Check that similarly, the nonzero vectors along the line \(x = -y\) are the eigenvectors with eigenvalue \(1\). In general, however, eigenlines need not be perpendicular.
- Discuss problems 2-3 from problem set 1.
- Extra: Google the phrase “Igon Value”.
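The recipe above for \(2 \times 2\) eigenvalues (roots of \(t^2 - \mathrm{tr}(A)t + \det(A)\)) translates directly into code. This is a minimal sketch restricted to real eigenvalues; the function names `eigenvalues_2x2` and `is_eigenvector` are hypothetical.

```python
import math

def eigenvalues_2x2(a, b, c, d):
    """Real eigenvalues of [[a, b], [c, d]] from t^2 - tr*t + det = 0."""
    tr, det = a + d, a * d - b * c
    disc = tr * tr - 4 * det
    if disc < 0:
        raise ValueError("eigenvalues are not real")
    root = math.sqrt(disc)
    return (tr - root) / 2, (tr + root) / 2

def is_eigenvector(matrix, vec, eigenvalue, tol=1e-12):
    """Check A*vec == eigenvalue*vec componentwise, up to a tolerance."""
    (a, b), (c, d) = matrix
    x, y = vec
    return (abs(a * x + b * y - eigenvalue * x) < tol
            and abs(c * x + d * y - eigenvalue * y) < tol)

A = ((2, 1), (1, 2))
print(eigenvalues_2x2(2, 1, 1, 2))
print(is_eigenvector(A, (1, 1), 3), is_eigenvector(A, (1, -1), 1))
```

For the running example, the vectors along \(x = y\) and \(x = -y\) should both pass the check with eigenvalues \(3\) and \(1\) respectively.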
21-11-01
- Discuss the system of linear equations: \[\begin{align*} \left\{\begin{array}{ll} -2x + 5y &= 4\\ -3x + 7y &= 0 \end{array}\right. \end{align*}\] We can solve it naively by isolating one variable at a time, to get \(x = 28\) and \(y = 12\). We can also rewrite it in matrix form: \[\begin{align*} A \begin{pmatrix} x\\ y\end{pmatrix} = \begin{pmatrix} 4\\ 0\end{pmatrix}, \quad \text{where $A = \begin{pmatrix} -2 &5\\ -3 &7\end{pmatrix}$.} \end{align*}\] Note that \(\det(A) = 1 \neq 0\), so \(A^{-1}\) exists. Indeed, \[\begin{align*} \begin{pmatrix} x\\ y\end{pmatrix} = A^{-1} \begin{pmatrix} 4\\ 0\end{pmatrix}, \quad \text{where $A^{-1} = \begin{pmatrix} 7 &-5\\ 3 &-2\end{pmatrix}$.} \end{align*}\] This again gives \((x, y) = (28, 12)\).
- In general, a system of \(m\) linear equations in \(n\) variables is equivalent to an equation of the form \[\begin{align*}A \begin{pmatrix} x_1\\ \vdots\\ x_n\end{pmatrix} = \begin{pmatrix} c_1\\ \vdots\\ c_m\end{pmatrix},\end{align*}\] where \(A\) is an \(m \times n\) matrix. In the square case where \(m = n\), the matrix is invertible if and only if the system has a unique solution for \(x_1, \ldots, x_n\).
- In our running example, \(A\) transforms the basis vectors \(e_1 = (1, 0)\) and \(e_2 = (0, 1)\) into the vectors \(v_1 = (-2, -3)\) and \(v_2 = (5, 7)\), respectively. Note that \(v_1\) and \(v_2\) are the columns of \(A\). In general, \(A\) transforms \((x, y) = xe_1 + ye_2\) into \(xv_1 + yv_2\).
- The matrix \(A^{-1}\) must undo the effect of the matrix \(A\), and vice versa. In our running example, the fact that \(A^{-1}\) sends \((4, 0) \mapsto (28, 12)\) tells us that \(A\) sends \((28, 12) \mapsto (4, 0)\).
- We can also think of a linear equation in \(x\) and \(y\) as a line. Solving a \(2 \times 2\) system is equivalent to finding where two lines intersect. By comparison, a linear equation in \(x, y, z\) is a plane. Solving a \(3 \times 3\) system is equivalent to finding where three planes intersect. That intersection could be a plane, a line, a point, or just empty.
- Let \(c\) be any constant. Note that the system \[\begin{align*} \left\{\begin{array}{ll} 2x + cz &= 0\\ x - y + 2z &= 0\\ x - 2y + 2z &= 0 \end{array}\right. \end{align*}\] represents three planes that all pass through the origin \((0, 0, 0)\). For what value(s) of \(c\) does a different solution exist? Note that in matrix form, the system is \[\begin{align*} A \begin{pmatrix} x\\ y\\ z\end{pmatrix} = \begin{pmatrix} 0\\ 0\\ 0\end{pmatrix}, \quad \text{where $A = \begin{pmatrix} 2 &0 &c\\ 1 &-1 &2 \\ 1 &-2 &2\end{pmatrix}$.} \end{align*}\] If \(\det(A) \neq 0\), then multiplying both sides on the left by \(A^{-1}\) would show that \((x, y, z) = (0, 0, 0)\) is the only solution. So we are looking for value(s) of \(c\) where \(\det(A) = 0\). It turns out \(\det(A) = 4 - c\), so the only answer is \(c = 4\).
- In the above example, can you write down an explicit non-origin solution? Are there infinitely many?
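Both computations in this recitation (solving the \(2 \times 2\) system through \(A^{-1}\), and finding where the \(3 \times 3\) determinant vanishes) can be checked mechanically. The sketch below assumes integer entries and uses exact fractions; `solve_2x2` and `det_3x3` are hypothetical helper names.

```python
from fractions import Fraction

def solve_2x2(a, b, c, d, rhs):
    """Solve [[a, b], [c, d]] (x, y)^t = rhs using the explicit 2x2 inverse."""
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    inverse = ((Fraction(d, det), Fraction(-b, det)),
               (Fraction(-c, det), Fraction(a, det)))
    x = inverse[0][0] * rhs[0] + inverse[0][1] * rhs[1]
    y = inverse[1][0] * rhs[0] + inverse[1][1] * rhs[1]
    return x, y

def det_3x3(m):
    """Cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

solution = solve_2x2(-2, 5, -3, 7, (4, 0))
dets = {c: det_3x3(((2, 0, c), (1, -1, 2), (1, -2, 2))) for c in range(0, 7)}
print(solution)
print(dets)  # the determinant for each sampled value of c
```

Among the sampled values of \(c\), the determinant should vanish only at \(c = 4\), consistent with \(\det(A) = 4 - c\).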
21-10-27
This recitation was audited by Dr. Jerry Orloff.
- There are analogies between the arithmetic of numbers and the arithmetic of square matrices, but there are also crucial differences. \[\begin{align*}\begin{array}{ll} \textbf{numbers} &\textbf{matrices}\\[1ex] 1 &I = \begin{pmatrix} 1&0&0 \\ 0&1&0 \\ 0&0&1 \end{pmatrix}\\[0.5ex] a \cdot 1 = a = 1 \cdot a &A I = A = I A\\[0.5ex] a \cdot b = b \cdot a &\text{$A B \neq B A$ in general!}\\[0.5ex] \text{If $a \neq 0$, then $a^{-1}$ exists} &\text{If $\det(A) \neq 0$, then $A^{-1}$ exists} \end{array}\end{align*}\] As an example where \(A B \neq B A\), in the \(2 \times 2\) setting, take \(A = \begin{pmatrix} 1&1 \\ 0&1 \end{pmatrix}\) and \(B = \begin{pmatrix} 1&0 \\ 1&1 \end{pmatrix}\).
- When \(\det(A) \neq 0\), we can compute \(A^{-1}\) in three steps. (See the example in the notes for lecture 3.)
- First, form the matrix of minors \(M\). By definition, \(M_{i,j}\) is the determinant of the matrix formed by removing the \(i\)th row and \(j\)th column from \(A\), but keeping the rows and columns in order.
- Next, form the cofactor matrix \(C\). By definition, \(C_{i,j} = (-1)^{i+j}M_{i,j}\).
- Finally, the inverse matrix is \[\begin{align*}A^{-1} = \frac{1}{\det(A)} C^t,\end{align*}\] where \(C^t\) is the transpose of the cofactor matrix, defined by \((C^t)_{i,j} = C_{j,i}\).
- If a matrix has some zero entries, then we can speed up the calculation of the determinant by doing the cofactor expansion along a row or column containing those zeros. Note that people sometimes write the zero entries in a matrix as blank spaces.
- The volume of the parallelepiped generated by vectors \(\vec{u}\), \(\vec{v}\), \(\vec{w}\) is \[\begin{align*}\det\begin{pmatrix} u_1 &u_2 &u_3 \\ v_1 &v_2 &v_3 \\ w_1 &w_2 &w_3\end{pmatrix} = (\vec{u} \times \vec{v}) \cdot \vec{w},\end{align*}\] up to sign. You can also write the vectors as columns of the matrix, rather than rows. Just be careful that if you swap the order of two vectors, then the sign of the determinant flips.
- Discuss problem 2 from 18.02A problem set 1. The tetrahedron need not be regular, i.e., its sides can be different lengths. Assign a vector to each vertex, and use the fact that the midpoint between two points is the average of the corresponding vectors.
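The three-step recipe above for \(A^{-1}\) (minors, cofactors, transpose divided by the determinant) can be sketched for \(3 \times 3\) integer matrices as follows. The example matrix is hypothetical and is not the one from the lecture 3 notes; indices are \(0\)-based in the code, which does not change the sign \((-1)^{i+j}\).

```python
from fractions import Fraction

def minor(a, i, j):
    """Determinant of the 2x2 matrix left after deleting row i and column j."""
    rows = [row for k, row in enumerate(a) if k != i]
    (p, q), (r, s) = [[x for k, x in enumerate(row) if k != j] for row in rows]
    return p * s - q * r

def inverse_3x3(a):
    # Cofactor matrix: C[i][j] = (-1)^(i+j) * M[i][j].
    cof = [[(-1) ** (i + j) * minor(a, i, j) for j in range(3)] for i in range(3)]
    det = sum(a[0][j] * cof[0][j] for j in range(3))  # expansion along row 0
    if det == 0:
        raise ValueError("matrix is not invertible")
    # Transpose the cofactor matrix and divide by the determinant.
    return [[Fraction(cof[j][i], det) for j in range(3)] for i in range(3)]

A = [[2, 0, 1], [1, -1, 2], [1, -2, 2]]  # hypothetical example matrix
A_inv = inverse_3x3(A)
product = [[sum(A[i][k] * A_inv[k][j] for k in range(3)) for j in range(3)]
           for i in range(3)]
print(A_inv)
print(product)  # should be the identity matrix
```

Multiplying \(A\) by the result is a cheap way to catch sign mistakes in the cofactors.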
21-10-25
- If \(\vec{v}\), \(\vec{a}\) are vectors and \(\vec{a}\) is nonzero, then the projection of \(\vec{v}\) onto \(\vec{a}\), denoted \(\mathrm{proj}_{\vec{a}}(\vec{v})\), is the unique scalar multiple of \(\vec{a}\) such that \[\begin{align*}\vec{v} \cdot \vec{a} = \mathrm{proj}_{\vec{a}}(\vec{v}) \cdot \vec{a}.\end{align*}\] By construction, \(\mathrm{proj}_{\vec{a}}(\vec{v}) = C\vec{a}\) for some constant \(C\). The equation above shows \[\begin{align*}C = \frac{\vec{v} \cdot \vec{a}}{\vec{a} \cdot \vec{a}} = \frac{\vec{v} \cdot \vec{a}}{|\vec{a}|^2}.\end{align*}\] For example, \(\mathrm{proj}_{(1, 1, 1)}(0, 0, 4) = (4/3, 4/3, 4/3)\).
- In practice, we’re often given explicit angles between vectors, so we can compute projection vectors using trigonometry, rather than the formula above. Discuss the example of a block on a frictionless incline of angle \(\alpha\), held in place against gravity by a force \(\vec{F}\) applied horizontally. If gravity is \(\vec{G}\), then it turns out we need \(|\vec{F}| = |\vec{G}|\tan \alpha\). Consider the limiting cases \(\alpha = 0^\circ\) and \(\alpha = 90^\circ\).
- The determinant of a \(2 \times 2\) matrix is the area of a parallelogram, up to sign. If the rotation from \((a, b)\) to \((c, d)\) is counterclockwise, then \[\begin{align*}\det \begin{pmatrix} a &b \\ c &d\end{pmatrix} = ad - bc\end{align*}\] is the area of the parallelogram with vertices \((0, 0)\), \((a, b)\), \((c, d)\), \((a + c, b + d)\).
- Sketch two proofs of this det = area result. The first proof involves fitting the parallelogram inside a rectangle. The second proof, which generalizes to higher dimensions, uses the observation that the determinant function is uniquely characterized by three axioms:
- The determinant of the identity matrix (\(1\)’s along the diagonal, \(0\)’s elsewhere) is \(1\).
- If a row or column of a matrix is scaled by a constant \(\lambda\), then the determinant scales by \(\lambda\).
- If we add a scalar multiple of one row, resp. column, of the matrix to a different row, resp. column, then the determinant is unchanged.
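To close out this recitation's formulas, here is a small sketch of the projection formula \(\mathrm{proj}_{\vec{a}}(\vec{v}) = \frac{\vec{v} \cdot \vec{a}}{|\vec{a}|^2}\vec{a}\) and of the signed area \(ad - bc\). It assumes integer coordinates so that exact fractions can be used; the function names are hypothetical.

```python
from fractions import Fraction

def project(v, a):
    """Projection of v onto a nonzero vector a: C * a with C = (v . a) / (a . a)."""
    scale = Fraction(sum(x * y for x, y in zip(v, a)), sum(x * x for x in a))
    return tuple(scale * x for x in a)

def signed_area(p, q):
    """ad - bc for p = (a, b) and q = (c, d); positive when p -> q is counterclockwise."""
    return p[0] * q[1] - p[1] * q[0]

print(project((0, 0, 4), (1, 1, 1)))  # each component equals 4/3
print(signed_area((1, 0), (0, 1)))    # unit square, counterclockwise
print(signed_area((0, 1), (1, 0)))    # same square, vectors swapped
print(signed_area((2, 0), (1, 3)))    # first vector scaled by 2, second sheared and scaled
```

Swapping the two vectors flips the sign of the area, which mirrors the orientation condition in the det = area statement.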