Picture yourself as an early calculus student
about to begin your first course: The months ahead of you hold within them a lot of hard work — some neat examples, some not-so-neat examples, beautiful connections to physics, not-so-beautiful piles of formulas to memorize, plenty of moments of getting stuck and banging your head into a wall, a few nice ‘aha’ moments sprinkled in as well, and some genuinely lovely graphical intuition to help guide you through it all.

But if the course ahead of you is anything like my first introduction to calculus, or any of the first courses that I’ve seen in the years since, there’s one topic that you will not see, but which I believe stands to greatly accelerate your learning. You see, almost all of the visual intuitions from that first year are based on graphs: the derivative is the slope of a graph, the integral is a certain area under that graph. But as you generalize calculus beyond functions whose inputs and outputs are simply numbers, it’s not always possible to graph the function you’re analyzing. There are all sorts of different ways you might visualize these things, so if all your intuitions for the fundamental ideas, like derivatives, are rooted too rigidly in graphs, it can make for a very tall and largely unnecessary conceptual hurdle between you and the more “advanced” topics, like multivariable calculus, complex analysis, and differential geometry.

What I want to share with you is a way to think about derivatives, which I’ll refer to as the transformational view, that generalizes more seamlessly into some of those more general contexts where calculus comes up. Then we’ll use this alternate view to analyze a certain fun puzzle about repeated fractions. But first, I just want to make sure that we’re all on the same page about what the standard visual is. If you were to graph a function which simply takes real numbers as inputs and outputs, one of the first things you learn in a calculus course is that the derivative gives you the slope of this graph.
What we mean by that is that the derivative of the function is a new function which, for every input x, returns that slope. I’d encourage you not to think of this derivative-as-slope idea as being the definition of the derivative. Instead, think of it as being more fundamentally about how sensitive the function is to tiny nudges around the input; the slope is just one way to think about that sensitivity, relevant only to this particular way of viewing functions. I have not just another video, but a full series on this topic, if it’s something you want to learn more about.

The basic idea behind the alternate visual for the derivative is to think of this function as mapping all of the input points on the number line to their corresponding outputs on a different number line. In this context, what the derivative gives you is a measure of how much the input space gets stretched or squished in various regions. That is, if you were to zoom in around a specific input and take a look at some evenly spaced points around it, the derivative of the function at that input is going to tell you how spread out or contracted those points become after the mapping.

Here, a specific example helps. Take the function x squared: it maps 1 to 1, 2 to 4, 3 to 9, and so on, and you can also see how it acts on all of the points in between. If you were to zoom in on a little cluster of points around the input 1, and then see where they land around the relevant output (which for this function also happens to be 1), you’d notice that they tend to get stretched out. In fact, it roughly looks like stretching out by a factor of 2, and the closer you zoom in, the more this local behavior looks just like multiplying by a factor of 2. This is what it means for the derivative of x squared at the input x = 1 to be 2; it’s what that fact looks like in the context of transformations.
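To make this concrete, here’s a small numerical sketch (my own illustration, not part of the video): estimate the local stretching factor by measuring how far apart two nearby points end up after the mapping.

```python
def local_stretch(f, x, h=1e-6):
    # Ratio of output spacing to input spacing for two points
    # straddling x; as h shrinks, this approaches f'(x).
    return (f(x + h) - f(x - h)) / (2 * h)

square = lambda x: x * x

# Near x = 1, the number line gets stretched by about 2.
print(local_stretch(square, 1.0))  # ≈ 2.0
```

The same helper reports a factor of about 6 near the input 3 and about 0.5 near the input 1/4.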
If you looked at a neighborhood of points around the input 3, they would get roughly stretched out by a factor of 6; this is what it means for the derivative of this function at the input 3 to equal 6. Around the input 1/4, a small region actually tends to get contracted, specifically by a factor of 1/2, and that’s what it looks like for a derivative to be smaller than 1. Now, the input 0 is interesting:
zooming in by a factor of 10, it doesn’t really look like a constant stretching or squishing. For one thing, all of the outputs end up on the right, positive side of things, and as you zoom in closer and closer, by 100x or by 1000x, it looks more and more like a small neighborhood of points around zero just gets collapsed into zero itself. This is what it looks like for the derivative to be zero: the local behavior looks more and more like multiplying the whole number line by zero. It doesn’t have to completely collapse everything to a point at any particular zoom level; instead, it’s a matter of what the limiting behavior is as you zoom in closer and closer.

It’s also instructive to take a look at the negative inputs here. Things start to feel a little cramped, since they collide with where all the positive input values go; this is one of the downsides of thinking of functions as transformations. But for derivatives, we only really care about the local behavior anyway: what happens in a small range around a given input. Here, notice that the inputs in a little neighborhood around, say, negative two don’t just get stretched out — they also get flipped around. Specifically, the action on such a neighborhood looks more and more like multiplying by negative four the closer you zoom in. This is what it looks like for the derivative of a function to be negative, and I think you get the point.
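Both of those limiting behaviors are easy to check numerically (again, an illustrative sketch of my own, not from the video): around zero, the image of a small interval shrinks faster than the interval itself, and around negative two, nearby points come out flipped as well as spread out.

```python
square = lambda x: x * x

# Around 0: the interval [-h, h] (width 2h) maps onto [0, h*h]
# (width h*h), so the spread ratio h/2 vanishes as we zoom in.
for h in [0.1, 0.01, 0.001]:
    outputs = [square(-h), square(0.0), square(h)]
    print((max(outputs) - min(outputs)) / (2 * h))  # 0.05, then 0.005, then 0.0005

# Around -2: the signed stretch factor is negative, reflecting
# that nearby points get flipped around as well as spread out.
h = 1e-6
signed = (square(-2 + h) - square(-2 - h)) / (2 * h)
print(signed)  # ≈ -4.0
```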
This is all well and good, but let’s see how it’s actually useful in solving a problem. A friend of mine recently asked me a pretty fun question about the infinite fraction 1 plus 1 divided by 1 plus 1 divided by 1 plus 1 divided by 1, on and on and on. Clearly, you watch math videos online, so maybe you’ve seen this before, but my friend’s question actually cuts to something you might not have thought about before, relevant to the view of derivatives that we’re looking at here.

The typical way you might evaluate an expression like this is to set it equal to x, and then notice that there’s a copy of the full fraction inside itself. So you can replace that copy with another x, and then just solve for x. That is, what you want is to find a fixed point of the function 1 plus 1 divided by x. But here’s the thing: there are actually two solutions for x, two special numbers where 1 plus 1 divided by that number gives you back the same thing. One is the golden ratio, phi (φ), around 1.618,
and the other is negative 0.618, which happens to be −1/φ. I like to call this other number phi’s little brother, since just about any property that phi has, this number also has. And this raises the question: would it be valid to say that the infinite fraction we saw is somehow also equal to phi’s little brother, −0.618?

Maybe you initially say, “Obviously not! Everything on the left-hand side is positive, so how could it possibly equal a negative number?” Well, first we should be clear about what we actually mean by an expression like this. One way you could think about it (and it’s not the only way; there’s freedom of choice here) is to imagine starting with some constant, like 1, then repeatedly applying the function 1 + 1/x, and then asking what this approaches as you keep going. Certainly, symbolically, what you get looks more and more like our infinite fraction, so maybe if you wanted it to equal a number, you should ask what this sequence of numbers approaches. And if that’s your view of things,
maybe you start off with a negative number, so it’s not so crazy for the whole expression to end up negative. After all, if you start with −1/φ, then applying the function 1 + 1/x gives you back the same number, −1/φ. So no matter how many times you apply it, you stay fixed at that value.

But even then, there is one reason you should probably view phi as the favorite brother in this pair. Try this: pull up a calculator of some kind, start with any random number, and plug it into the function 1 + 1/x; then plug that result into 1 + 1/x, and again, and again, and again. No matter what constant you start with, you eventually end up at 1.618… Even if you start with a negative number, even one that’s really, really close to phi’s little brother, eventually it shies away from that value and jumps back over to phi. So what’s going on here? Why is one of these fixed points favored above the other?

Maybe you can already see how the transformational understanding of derivatives is going to be helpful for understanding this setup, but for the sake of having a point of contrast, I want to show you how a problem like this is often taught using graphs. If you were to plug in some random input to this function, the y-value tells you the corresponding output. To think about plugging that output back into the function, you might first move horizontally until you hit the line y = x; that gives you a point whose x-value corresponds to your previous y-value. From there you can move vertically to see what output this new x-value has, and then you repeat: move horizontally to the line y = x to find a point whose x-value is the same as the output you just got, and then move vertically to apply the function again. Now personally, I think this is kind of an awkward way to think about repeatedly applying a function, don’t you?
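The calculator experiment from a moment ago is easy to sketch in code (a minimal Python sketch of my own; the function names are just for illustration): both fixed points solve x² = x + 1, yet iteration from essentially any seed lands on phi.

```python
import math

# The two fixed points of f(x) = 1 + 1/x solve x**2 = x + 1.
phi = (1 + math.sqrt(5)) / 2      # ≈  1.618..., the golden ratio
brother = (1 - math.sqrt(5)) / 2  # ≈ -0.618..., equal to -1/phi

def iterate(x, n=60):
    """Apply f(x) = 1 + 1/x to the seed x, n times."""
    for _ in range(n):
        x = 1 + 1 / x
    return x

# Any generic seed -- even one very close to the little brother --
# eventually gets pulled over to phi.
for seed in [2.0, -5.0, -0.62]:
    print(iterate(seed))  # each ≈ 1.618...
```

Only the exact value of the little brother stays put under iteration; every other seed drifts to phi, which is the asymmetry the rest of this section explains.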
I mean, it makes sense, but you kind of have to pause and think about it to remember which way to draw the lines. And you can, if you want, think through what conditions make this spiderweb process narrow in on a fixed point versus propagating away from it — in fact, go ahead, pause right now and try to think it through as an exercise. It has to do with slopes. Or, if you want to skip the exercise for something that I think gives a much more satisfying understanding, think about how this function acts as a transformation.

I’m going to start here by drawing a whole bunch of arrows to indicate where the various sample input points will go. And side note: don’t you think this gives a really neat emergent pattern? I wasn’t expecting this, but it was cool to see it pop up when animating. I guess the action of 1 divided by x gives this nice emergent circle, and then we’re just shifting things over by 1.

Anyway, I want you to think about what it means to repeatedly apply some function like 1 + 1/x in this context. Well, after letting it map all of the inputs to the outputs, you could consider those as the new inputs and then just apply the same process again, and then again, and however many times you want. Notice, in animating this with a few dots representing the sample points, it doesn’t take many iterations at all before all of those dots kind of clump in around 1.618. Now remember, we know that 1.618… and its little brother −0.618… each stay fixed in place during each iteration of this process. But zoom in on a neighborhood around phi: during the map, points in that region get contracted around phi, meaning that the function 1 + 1/x has a derivative with a magnitude that’s less than 1 at this input. In fact, this derivative works out to be around −0.38. So what that means is that each repeated application scrunches the neighborhood around this number smaller and smaller, like a gravitational pull towards phi.
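Where does that −0.38 come from? The derivative of f(x) = 1 + 1/x is f′(x) = −1/x², and since φ² = φ + 1 ≈ 2.618, the value at φ is about −0.382. A quick check (illustrative code of my own, not from the video):

```python
import math

phi = (1 + math.sqrt(5)) / 2

def f(x):
    return 1 + 1 / x

def f_prime(x):
    return -1 / (x * x)  # derivative of 1 + 1/x

print(f_prime(phi))  # ≈ -0.382: magnitude below 1

# Each application shrinks the distance to phi by a factor of about 0.382.
x = phi + 0.01
for _ in range(3):
    x = f(x)
    print(abs(x - phi))  # roughly 0.0038, then 0.0015, then 0.00056
```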
So now, tell me what you think happens in the neighborhood of phi’s little brother. Over there, the derivative actually has a magnitude larger than 1, so points near the fixed point are repelled away from it. When you work it out, you can see that they get stretched by more than a factor of 2 in each iteration. They also get flipped around, because the derivative is negative here, but the salient fact for the sake of stability is just the magnitude. Mathematicians would call the value on the right a stable fixed point, and the one on the left an unstable fixed point. Something is considered stable if, when you perturb it just a little bit, it tends to come back towards where it started rather than going away from it.

So what we’re seeing is a very useful little fact: the stability of a fixed point is determined by whether the magnitude of its derivative is bigger or smaller than 1. This explains why phi always shows up in the numerical play, where you’re just hitting enter on your calculator over and over, but phi’s little brother never does. Now, as to whether you want to consider phi’s little brother a valid value of the infinite fraction — well, that’s really up to you. Everything we just showed suggests that if you think of this expression as representing a limiting process, then because every possible seed value other than phi’s little brother gives you a sequence converging to φ, it does feel kind of silly to put them on equal footing with each other. But maybe you don’t think of it as a limit. Maybe the kind of math you’re doing lends itself to treating this as a purely algebraic object, like the solutions of a polynomial, which simply has multiple values.

Anyway, my point here is not that viewing derivatives as this change in density is somehow better than the graphical intuition on the whole; in fact, picturing an entire function this way can be kind of clunky and impractical compared to graphs.
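In code, the stability test is a single comparison (again, a sketch with my own naming): a fixed point attracts when the derivative’s magnitude there is below 1 and repels when it’s above 1.

```python
import math

phi = (1 + math.sqrt(5)) / 2  # the stable fixed point of 1 + 1/x
brother = -1 / phi            # the unstable one

def f(x):
    return 1 + 1 / x

def is_stable(fixed_point):
    # For f(x) = 1 + 1/x, the derivative magnitude is 1 / x**2.
    return 1 / fixed_point**2 < 1

print(is_stable(phi))      # True:  |f'(phi)|     ≈ 0.382
print(is_stable(brother))  # False: |f'(brother)| ≈ 2.618

# Perturb a point slightly off the little brother: it escapes
# the unstable fixed point and settles at phi instead.
x = brother + 1e-6
for _ in range(40):
    x = f(x)
print(x)  # ≈ 1.618...
```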
My point is that it deserves more of a mention in most introductory calculus courses, because it can help make a student’s understanding of the derivative a little more flexible. Like I mentioned, the real reason I’d recommend you carry this perspective with you as you learn new topics is not so much for what it does for your understanding of single-variable calculus; it’s for what comes after. There are many topics typically taught in a college math department which — how shall I put this lightly? — don’t exactly have a reputation for being super accessible. In the next video, I’m going to show you how a few ideas from these subjects with fancy-sounding names, like holomorphic functions and the Jacobian determinant, are really just extensions of the idea shown here. They really are some beautiful ideas, which I think can be appreciated from a wide range of mathematical backgrounds, and they’re relevant to a surprising number of seemingly unrelated ideas. So stay tuned for that.

Now, for the final animation, I just want to show you a little more of that time-dependent vector field I flashed earlier. But first, let’s look at some of the principles of learning from this video’s sponsor, Brilliant.org. There’s a lot of good stuff on this list,
but I want you to look at number two: effective math and science learning cultivates curiosity. I love the word choice here. It’s not just that you should be curious in one moment; it means creating a context where that curiosity is constantly growing. Just look at the infinite fraction example here. It would be one thing if you were curious about why the numbers bounce around the way they do, but hopefully the conclusion is not just to understand this one example. I would want you to start looking at all sorts of other infinite expressions and wonder if there’s some fixed point phenomenon in them, or wonder where else this view of derivatives can be conceptually helpful.

Brilliant.org is a site where you can learn math and science topics through active problem-solving, and if you go take a look, I think you’ll agree that they really do adhere to these learning principles. Coming from this video, you would probably enjoy their “Calculus Done Right” lessons, and they also have many other courses in various math and science topics. Much of it you can check out for free, but they also have a subscription service that gives you access to all sorts of nice guided problems. Going to Brilliant.org/3B1B lets them know that you came from this channel, and it can also get you 20% off their annual subscription.