<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom"><title>Harys Dalvi</title><link href="https://www.harysdalvi.com" rel="alternate"/><link href="https://www.harysdalvi.com/feed.xml" rel="self"/><author><name>Harys Dalvi</name></author><id>https://www.harysdalvi.com/</id><updated>2025-05-21T22:11:05.695784Z</updated><rights>Copyright (c) 2025, Harys Dalvi</rights><entry><title>Someone using AI won't take your job. AI will</title><summary>AI agents threaten to make people redundant, even those who learn to use AI. The only solution is to achieve mastery and outcompete AI.</summary><link href="https://www.harysdalvi.com/blog/ai-will-take-your-job/" rel="alternate"/><id>https://www.harysdalvi.com/blog/ai-will-take-your-job/</id><published>2025-05-21T16:38:07Z</published><updated>2025-05-21T16:38:07Z</updated><content type="html" xml:lang="en">&lt;p&gt;AI is getting better every day. Rather than let it replace us, the common advice is to take advantage of this improvement by
learning to incorporate AI into your work. You may have heard this mantra:
&lt;/p&gt;
&lt;blockquote&gt;“AI won't take your job. Someone using AI will.”&lt;/blockquote&gt;
&lt;p&gt;Already, &lt;a href="https://www.mckinsey.com/featured-insights/sustainable-inclusive-growth/charts/a-coding-boost-from-ai" target="_blank"&gt;AI
can boost productivity in coding by about 40%, depending on the task.&lt;/a&gt;
Instead of spending the saved hours doing nothing, developers who use AI can write more code and make more money.
A win for everyone except the developer who doesn't use AI and gets left behind.&lt;/p&gt;
&lt;p&gt;The same is true in many other domains. &lt;a href="https://www.nature.com/articles/s41591-024-03408-6" target="_blank"&gt;AI-human
double reading teams&lt;/a&gt; can detect breast cancer more accurately than all-human teams, which could free up doctors to
diagnose more patients and reduce wait times. OpenAI's &lt;a href="https://arxiv.org/abs/2503.23758" target="_blank"&gt;o3-mini-high helped physicist Weiguo Yin&lt;/a&gt; solve
an extension of the Ising model, pushing forward research in materials and superconductors.
&lt;/p&gt;
&lt;p&gt;It's natural to expect this trend to continue. But today I am going to make the opposite claim. &lt;strong&gt;Someone using AI won't take your job. AI will.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Learning to use AI is not enough. Unlike previous technologies, AI agents will act as competitors in the job market,
not just tools. In order to survive, &lt;strong&gt;we must outcompete AI and do what it can't.&lt;/strong&gt;
If we only use it as a tool and outsource our thinking, our human contribution becomes unnecessary, and
agents will take over the job without us.
&lt;/p&gt;
&lt;p&gt;I understand that this is a controversial claim and an uncomfortable one to think about. Don't worry, I'm not a total
doomer, and I have some strategies for how to &lt;em&gt;really&lt;/em&gt; upskill in the AI age if what I'm saying is true.
But first, I want to show you why I think the popular narrative on the subject doesn't hold up,
and might even be actively harming you.
&lt;/p&gt;
&lt;h2&gt;AI Agents Are Coming For Your Job&lt;/h2&gt;
&lt;p&gt;As the internet became widespread, many industries faced decline, like traditional news publishing and brick-and-mortar businesses.
But as people learned how to use the internet, new industries emerged, and old ones were able to thrive.&lt;/p&gt;
&lt;p&gt;Newspapers like
&lt;cite&gt;The New York Times&lt;/cite&gt; went online, selling more than ever before. Companies like Amazon and eBay sprang up in
e-commerce, while small businesses created websites to attract new customers to their physical locations. Using the internet as a tool,
people could share content around the world on platforms like YouTube and WordPress.
&lt;/p&gt;
&lt;p&gt;AI agents are not just a tool.&lt;/p&gt;
&lt;p&gt;AI agents perform entire workflows end-to-end with minimal human involvement.
Sure, they are imperfect, and sometimes need input as they go along. They are also not yet widely available.
But these flaws will be ironed out, and AI agents will act more like competitors or coworkers than workplace tools.
&lt;/p&gt;
&lt;p&gt;If code becomes cheaper to produce, won't companies just have a lot more people writing code? Yes, if code becomes cheaper,
companies will want more of it. But it will be agents writing the cheap code, not humans. The same applies to other domains.
&lt;/p&gt;
&lt;p&gt;Unlike human workers, AI agents can copy themselves endlessly and never get tired. We're seeing this in coding with tools like Replit
that &lt;a href="https://www.youtube.com/watch?v=1JPvi48oVY8" target="_blank"&gt;build entire apps from prompts&lt;/a&gt; in minutes.
So if an AI agent is better and cheaper
than a human, that human is in trouble, even if they know how to use AI.
&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Knowing how to use AI is not a moat.&lt;/strong&gt; Everyone has access to the same tools,
and they can ask the tools how to use them better, or simply let them operate autonomously as agents.
&lt;/p&gt;
&lt;p&gt;Right now, AI lacks agency and struggles with inferring requirements. This makes prompt engineering an important skill
for using AI. But as agents improve, they will &lt;a href="https://www.promptingguide.ai/techniques/ape" target="_blank"&gt;automate the process of prompt engineering&lt;/a&gt;,
and smarter models will make prompt engineering less necessary overall.
&lt;/p&gt;
&lt;p&gt;Or you can just ask ChatGPT to teach you. Start with “You are a professional prompt engineer making $1M/year. Teach me prompt engineering.”
Then give it your crappy prompt and ask it to fix it.
&lt;/p&gt;
&lt;p&gt;AI tools will give everyone superpowers. But when everyone's super, no one will be.&lt;/p&gt;
&lt;blockquote&gt;“And when everyone's super... no one will be.”&lt;br/&gt;
 —Syndrome, &lt;a href="https://youtu.be/kMThdgg1m5Y?si=ecqj2VYnkBXi1vfo" target="_blank"&gt;&lt;cite&gt;The Incredibles&lt;/cite&gt;&lt;/a&gt; (Pixar, 2004)
&lt;/blockquote&gt;
&lt;h2&gt;Staying Relevant in the Face of AI&lt;/h2&gt;
&lt;p&gt;If AI agents are coming for our jobs, and learning to use AI isn't enough, how do we &lt;em&gt;actually&lt;/em&gt; stay relevant in the face of AI?
&lt;/p&gt;
&lt;figure class="middleimg"&gt;
&lt;img alt="A man and a robot sit on opposite sides of a desk with laptops. A robotic arm reaches for the man's laptop from behind the robot." src="https://www.harysdalvi.com/blog/ai-will-take-your-job/img/ai-agents-coming.jpg"/&gt;
&lt;figcaption&gt;Image generated by the author with ChatGPT. Prompt by author (condensed): “Use your creativity and agency to design an image for
    my post. [...] create a thoughtful and insightful
    image that conveys a clear message not only in the big picture, but also in the specific details.”
    &lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;I'm not saying don't use AI tools. Use them. They will cut through the slog, saving you hours
for creative thinking and decision making. And they can serve as
sounding boards and debate partners, amplifying your output even more.&lt;/p&gt;
&lt;p&gt;I used ChatGPT heavily as a debate partner to iron out the ideas in this article. But every word is mine. You can tell because ChatGPT
would never add in relevant Pixar quotes and link the videos.&lt;/p&gt;
&lt;p&gt;In fact, I'd even go so far as to say you can let AI automate 90% of your work and still come out happier, more productive, and more successful.
&lt;/p&gt;
&lt;p&gt;But if you let AI automate 90% of your work, &lt;strong&gt;that last 10% better be &lt;em&gt;killer&lt;/em&gt;.&lt;/strong&gt;
Anyone can use AI and get a solid result, so the mastery you put into this part is your edge.
Refuse to automate away the core of your work.
&lt;/p&gt;
&lt;p&gt;And in order to master this 10%, you must master your craft. So study and learn almost as if AI didn't exist,
except study even harder, and use AI to help you learn faster. As AI agents roll out, &lt;strong&gt;you need to be better than the AI at what you do.&lt;/strong&gt;
You can interpret “better” broadly: you might be better technically, emotionally, or creatively. But you must be better.&lt;/p&gt;
&lt;p&gt;
This might sound like a losing game, but it's one &lt;em&gt;you have to win&lt;/em&gt; if you want to keep playing.
And don't underestimate the AI. The latest models are extremely capable, and they're still the worst they'll ever be.
&lt;/p&gt;
&lt;p&gt;To steal Apple's catchphrase, succeeding in the AI era isn't only about learning to use AI: it's also about becoming &lt;strong&gt;insanely great&lt;/strong&gt;
at your job.&lt;/p&gt;
&lt;p&gt;AI agents can even use other AI as a tool, so we have to be better at that part too.
This means that as more people use AI, staying relevant is ultimately about directly competing with the AI and being much better than it at some key skill.
Sometimes that means better at guiding and supervising AI,
but often it means better at writing, coding, research, or anything else.&lt;/p&gt;
&lt;p&gt;Look at the Sistine Chapel, &lt;cite&gt;The Lion King&lt;/cite&gt;, even Microsoft Excel. That's what you call making something
  with &lt;em&gt;soul&lt;/em&gt;. That &lt;em&gt;soul&lt;/em&gt; is &lt;strong&gt;too large and awkwardly shaped to fit into a prompt&lt;/strong&gt;, and no amount of prompt engineering will change that.
  You have to get into the weeds of the paint, the words, the code, or whatever your chosen medium is.
&lt;/p&gt;
&lt;p&gt;Maybe there's a limited market for this kind of soul.
Companies like Duolingo are already replacing human contractors with AI,
and it might become harder for humans to adapt and compete, to the point where there will simply be fewer people employed.
And not everyone uses labor as a source of meaning.
But for those who do, this is the best way to produce work that far outclasses what an infinitely replicable AI agent can do.
&lt;/p&gt;
&lt;figure class="middleimg"&gt;
&lt;img alt="The Sistine Chapel, full of intricate frescoes." src="https://www.harysdalvi.com/blog/ai-will-take-your-job/img/sistine-chapel.jpg"/&gt;
&lt;figcaption&gt;The Sistine Chapel, where every fresco was designed to convey a particular story or idea,
  and even the tiny details are full of symbolism and intent. This is the kind of thing AI can't yet do. Image source:
  &lt;a href="https://commons.wikimedia.org/wiki/File:Sistine_Chapel_ceiling_02_(brightened).jpg" target="_blank"&gt;Antoine Taveneaux / Wikimedia&lt;/a&gt;,
  &lt;a href="https://creativecommons.org/licenses/by-sa/3.0/deed.en" target="_blank"&gt;CC BY-SA 3.0&lt;/a&gt;.
  &lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;Here's how to stay afloat in the AI age:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Learn to use AI tools. Use them to generate and test ideas, learn new things, and focus on the most exciting and creative parts of your work.&lt;/li&gt;
&lt;li&gt;You may find that AI agents can do much of your work for you. Resist the temptation. Use AI, but keep the core of your work so it can be exceptional.&lt;/li&gt;
&lt;li&gt;Become extremely competent. AI trains on massive datasets full of mediocrity. It won't replace all workers, but it will replace the mediocre ones.&lt;/li&gt;
&lt;li&gt;Be irreplaceable. AI is trained on the sum of human output on the internet. If another human can do your job, it will soon be a fair bet that AI can too.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Some people say AI will &lt;em&gt;never&lt;/em&gt; be able to write scalable code, or craft emotionally compelling stories, or whatever else.
Really, we just don't know: AI keeps surprising us, and future systems may or may not be able to do all these things and more.&lt;/p&gt;
&lt;p&gt;But if there's one thing AI can &lt;em&gt;never&lt;/em&gt; do, at least in the current paradigm, it's &lt;strong&gt;care&lt;/strong&gt;.
AI doesn't really care about the work it's doing or anything else. Obsession always shows through in the final product,
especially in the hundreds of tiny decisions that bring life to your work.
&lt;/p&gt;
&lt;p&gt;With the AI tools of the future, anyone can easily become a writer, coder, or even prompt engineer. The barrier to entry falls to zero.
But in a flood of adequate results that even autonomous agents can produce, only a few will stand out.&lt;/p&gt;
&lt;blockquote&gt;“Your only limit is your &lt;em&gt;soul&lt;/em&gt;. What I say is true: &lt;strong&gt;Anyone can cook.
But only the fearless can be great.&lt;/strong&gt;”&lt;br/&gt;
 —Chef Gusteau, &lt;a href="https://www.youtube.com/watch?v=v9JTgpk1hXs" target="_blank"&gt;&lt;cite&gt;Ratatouille&lt;/cite&gt;&lt;/a&gt; (Pixar, 2007)
&lt;/blockquote&gt;
&lt;p&gt;Getting paid for our labor keeps us fed. But it's also a source of power and leverage, and an opportunity to
radiate passion and meaning into the world. I'll be writing and coding till I die, even if the AI
overlords lock me in a room with a laptop. I don't just want to manage a team of agents; I also want to craft things I care about.
If you want to keep doing what you do for leverage and not just as a hobby, now is the time to upskill fast and stay ahead of the AI — because
someone using AI won't take your job, AI will.
&lt;/p&gt;
&lt;hr/&gt;
&lt;h2&gt;Related Posts&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.harysdalvi.com/blog/ai-laundry/"&gt;Why AI Can't Do Your Laundry&lt;/a&gt; — Too often, AI does what we humans want to do, and not what we can't or won't do.&lt;/li&gt;
&lt;/ul&gt;</content></entry><entry><title>LLMs Do Not Predict the Next Word</title><summary>RLHF forces us to view LLMs as agents in an environment, not just statistical models.</summary><link href="https://www.harysdalvi.com/blog/llms-dont-predict-next-word/" rel="alternate"/><id>https://www.harysdalvi.com/blog/llms-dont-predict-next-word/</id><published>2025-03-31T16:38:07Z</published><updated>2025-03-31T16:38:07Z</updated><content type="html" xml:lang="en">&lt;p&gt;Back in the 17th century, Newton discovered an equation for gravity. Remarkably, this single equation was super simple
  (high school algebra at most) and yet it could predict both &lt;a href="https://www.harysdalvi.com/blog/2110_1/"&gt;the elliptical motion of planets and their moons&lt;/a&gt;
  and apples falling here on Earth.
&lt;/p&gt;
&lt;p&gt;Despite the power and simplicity of Newton's equation, there were a few small issues with it.
  Most famously, Mercury's orbit didn't match predictions. Einstein solved these when he came up with general relativity,
  which also predicted black holes and gravitational waves.
&lt;/p&gt;
&lt;p&gt;Now, you've probably heard a phrase something like
  “LLMs are just statistical models that predict the next word.” Like Newton's gravity, this is a
  very good approximation of the truth. But there are some deeper layers that are worth looking at.
&lt;/p&gt;
&lt;p&gt;Today, I want to investigate LLMs from a &lt;strong&gt;reinforcement learning&lt;/strong&gt; lens,
  treating them as thinking agents rather than advanced autocomplete models. I'll be touching on fundamental ideas like
  &lt;strong&gt;instruction finetuning&lt;/strong&gt; and &lt;strong&gt;reinforcement learning from human feedback&lt;/strong&gt;.
  My goal is not a deep dive into these ideas, but to use them to explain the extent to which LLMs are really doing something beyond predicting
  the next word, and taking “actions” of their own “will”.&lt;/p&gt;
&lt;p&gt;Next I'll compare and contrast this idea with the buzz around AI agents,
  and finally I'll speculate on what this means for the future of AI agent development.
&lt;/p&gt;
&lt;h2&gt;Why LLMs Predict the Next Word&lt;/h2&gt;
&lt;p&gt;Before we get into my hot take, I want to briefly talk about the idea that LLMs predict the next word,
  and why it's mostly true.
&lt;/p&gt;
&lt;p&gt;LLMs are pretrained with something called the &lt;strong&gt;next-token objective&lt;/strong&gt;. This part really is as
  simple as just predicting the next token. So if you have a sentence “I do not like green eggs and ham” in your training
  data, the training examples might be something like:
&lt;/p&gt;
&lt;p&gt;
&lt;code&gt;
    (I, do), (I do, not), (I do not, like), ..., (I do not like green eggs and, ham)
  &lt;/code&gt;
&lt;/p&gt;
&lt;p&gt;That is, at each token in the sentence, the model is given everything up to that token and has to predict
  the next one. In &lt;strong&gt;pretraining&lt;/strong&gt;, the model is given a huge amount of text like that and
  has to predict what comes next.[&lt;a href="https://www.harysdalvi.com/blog/llms-dont-predict-next-word/#reflist"&gt;1&lt;/a&gt;]
&lt;/p&gt;
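&lt;p&gt;As a rough illustration, the pairs above can be generated with a few lines of Python. This is only a sketch: it splits on whitespace, which is an assumption made for readability (real LLMs use subword tokenizers), and the function name &lt;code&gt;next_token_pairs&lt;/code&gt; is made up for this example.&lt;/p&gt;

```python
def next_token_pairs(sentence):
    """Return (context, target) pairs for next-token prediction.

    Assumption: tokens are whitespace-separated words. Real LLMs use
    subword tokenizers, so actual training pairs look different.
    """
    tokens = sentence.split()
    pairs = []
    for i in range(1, len(tokens)):
        context = " ".join(tokens[:i])  # everything before position i
        target = tokens[i]              # the token the model must predict
        pairs.append((context, target))
    return pairs


pairs = next_token_pairs("I do not like green eggs and ham")
for context, target in pairs:
    print(f"({context}, {target})")
```

&lt;p&gt;Each pair is one training example: the model sees the context and is scored on how much probability it assigns to the target.&lt;/p&gt;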
&lt;p&gt;Mathematically, the model's outputs are judged according to &lt;strong&gt;cross-entropy loss&lt;/strong&gt;,
  which measures the difference between the model's output probabilities and the real next token.
  One possible formula in the case of language modeling is
  &lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-1-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi class="MJX-tex-caligraphic" mathvariant="script"&gt;L&lt;/mi&gt;&lt;/mrow&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mi&gt;log&lt;/mi&gt;&lt;mo&gt;⁡&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
  where &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-2-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is the probability given by the language model for the correct next token.[&lt;a href="https://www.harysdalvi.com/blog/llms-dont-predict-next-word/#reflist"&gt;2&lt;/a&gt;]
  (This formula is greatly simplified from the general cross-entropy formula, but is still valid in the
  special case of language modeling.) So if the model gives a probability 1 to the actual next token, the loss is zero.
  Lower probabilities (if the model thinks some other token is likely to come next) lead to higher losses.
&lt;/p&gt;
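&lt;p&gt;To make the formula concrete, here is a minimal sketch that evaluates the loss for a few probabilities. The probabilities are made up for illustration, the natural logarithm is assumed, and &lt;code&gt;cross_entropy_loss&lt;/code&gt; is a hypothetical helper name rather than a function from any particular library.&lt;/p&gt;

```python
import math


def cross_entropy_loss(p_correct):
    """Loss for one prediction: -log of the probability given to the correct token."""
    return -math.log(p_correct)


# Made-up probabilities: the less likely the model thought the
# correct token was, the larger the loss.
for p in (0.9, 0.5, 0.1, 0.01):
    print(f"p_y = {p:.2f}  loss = {cross_entropy_loss(p):.3f}")
```

&lt;p&gt;In real training, this per-token loss is averaged over many tokens and examples.&lt;/p&gt;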
&lt;figure class="middleimg"&gt;
&lt;img alt="Cross-entropy loss function, decreasing to zero as the probability of the correct token increases to 1" src="https://www.harysdalvi.com/blog/llms-dont-predict-next-word/img/log-loss.jpg"/&gt;
&lt;figcaption&gt;Cross-entropy loss function&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;So in pretraining, by trying to minimize the loss, we maximize the probability of
  correctly predicting the next token. This is why LLMs are so good at predicting the next token.
  In fact, they are even better than humans at this.[&lt;a href="https://www.harysdalvi.com/blog/llms-dont-predict-next-word/#reflist"&gt;3&lt;/a&gt;]
&lt;/p&gt;
&lt;h3&gt;Instruction Finetuning&lt;/h3&gt;
&lt;p&gt;But this method alone — training an LLM to predict the next word on a large string of text
  — is not enough to make a chatbot. For example, if you asked GPT-3 to “Write an article
  about American football”, instead of writing an article, it might continue the sentence by predicting the
  most likely next tokens:
  “Write an article about American football and its influence on television in America.”
&lt;/p&gt;
&lt;p&gt;This is where &lt;strong&gt;instruction finetuning&lt;/strong&gt; comes in, also known colloquially as
  instruction tuning.[&lt;a href="https://www.harysdalvi.com/blog/llms-dont-predict-next-word/#reflist"&gt;1&lt;/a&gt;] This allows better performance on &lt;strong&gt;zero-shot learning&lt;/strong&gt;,
  meaning you can get the model to perform a task just by telling it to do it, without the need
  to include examples of the task.
&lt;/p&gt;
&lt;p&gt;The way instruction tuning is actually done is by training on a new dataset of instructions
  separate from the much larger dataset used in pretraining. FLAN, an early example of instruction tuning,
  trained on about 250 million tokens during finetuning. In contrast, the pretraining that FLAN built on
  used 2.49 trillion tokens.[&lt;a href="https://www.harysdalvi.com/blog/llms-dont-predict-next-word/#reflist"&gt;4&lt;/a&gt;]
&lt;/p&gt;
&lt;p&gt;The exact format of instruction tuning depends on the model. An example
  from Llama 3 is:[&lt;a href="https://www.harysdalvi.com/blog/llms-dont-predict-next-word/#reflist"&gt;5&lt;/a&gt;]
&lt;/p&gt;
&lt;div class="longmath"&gt;
&lt;pre&gt;&lt;code&gt;&amp;lt;|start_header_id|&amp;gt;user&amp;lt;|end_header_id|&amp;gt;
Hi! I am a human.&amp;lt;|eot_id|&amp;gt;

&amp;lt;|start_header_id|&amp;gt;assistant&amp;lt;|end_header_id|&amp;gt;
Hello there! Nice to meet you! I'm Meta AI, your friendly AI assistant&amp;lt;|eot_id|&amp;gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Unlike in pretraining, the model is usually trained only on the completion
  (labeled &lt;code&gt;assistant&lt;/code&gt; in this example),
  not the rest of the prompt. But other than that, instruction tuning is essentially the same as
  pretraining, just with a new dataset specialized for prompting.[&lt;a href="https://www.harysdalvi.com/blog/llms-dont-predict-next-word/#reflist"&gt;1&lt;/a&gt;]
  The loss function is the same, meaning &lt;strong&gt;the model is still being trained to predict the
  next token.&lt;/strong&gt;
&lt;/p&gt;
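&lt;p&gt;One way to picture “training only on the completion” is as a mask over the per-token losses. The sketch below is an assumption about how this could look, not the implementation used by any specific model: the probabilities and the mask are invented, and &lt;code&gt;masked_next_token_loss&lt;/code&gt; is a hypothetical name.&lt;/p&gt;

```python
import math


def masked_next_token_loss(correct_token_probs, is_completion):
    """Average -log p_y over completion tokens only.

    correct_token_probs: probability the model gave to each correct token.
    is_completion: True where the token belongs to the assistant's reply.
    Both lists are hypothetical inputs made up for this illustration,
    and at least one completion token is assumed.
    """
    losses = [
        -math.log(p)
        for p, keep in zip(correct_token_probs, is_completion)
        if keep
    ]
    return sum(losses) / len(losses)


# Two prompt tokens (ignored) followed by three completion tokens (scored).
probs = [0.2, 0.1, 0.5, 0.25, 0.5]
mask = [False, False, True, True, True]
print(masked_next_token_loss(probs, mask))
```

&lt;p&gt;The per-token loss is the same as in pretraining. Only the set of tokens being scored changes.&lt;/p&gt;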
&lt;h2&gt;Reinforcement Learning&lt;/h2&gt;
&lt;p&gt;So far, the model is just predicting the next token. First it learned to do so
  on a big dataset (pretraining), then it was finetuned on a more specific dataset designed for
  prompting (instruction finetuning). So is the claim “LLMs just predict the next token” true?
&lt;/p&gt;
&lt;p&gt;Even up to this point, you could make the argument that something deeper is happening.
  There's a lot of evidence that in order to predict the next word, models have to store
  detailed information about the world and its facts in their weights. Predicting the next
  word might be their loss function, but they have developed a rich internal world in the process.
  It's like saying humans are just gene-copying machines, ignoring all the complexity that humans
  have developed as a byproduct of evolution.[&lt;a href="https://www.harysdalvi.com/blog/llms-dont-predict-next-word/#reflist"&gt;6&lt;/a&gt;]
&lt;/p&gt;
&lt;p&gt;But all that is still assuming that the objective an LLM is trained on really is next-token prediction.
  In fact, after pretraining and instruction finetuning, &lt;strong&gt;LLMs are trained on an objective
  that is &lt;em&gt;fundamentally different&lt;/em&gt; from next token prediction.&lt;/strong&gt;
  This is where &lt;strong&gt;reinforcement learning&lt;/strong&gt; (RL) comes in.
&lt;/p&gt;
&lt;p&gt;To be more precise, LLMs are trained with something called &lt;strong&gt;reinforcement learning
  from human feedback&lt;/strong&gt; (RLHF). It's debatable whether this is really RL
  (&lt;a href="https://x.com/karpathy/status/1821277264996352246?lang=en" target="_blank"&gt;Andrej Karpathy&lt;/a&gt; says just barely,
  &lt;a href="https://x.com/ylecun/status/1821478966365962255?lang=en" target="_blank"&gt;Yann LeCun&lt;/a&gt; says no),
  but the point is that it's a very different objective from next-token prediction.
  This has important implications for what the model is really doing.
&lt;/p&gt;
&lt;p&gt;At a high level, there are two steps to RLHF:[&lt;a href="https://www.harysdalvi.com/blog/llms-dont-predict-next-word/#reflist"&gt;7&lt;/a&gt;]&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;We let the model produce many different outputs for various prompts.
      For each prompt, we ask humans to rank the outputs.
      (That's the human feedback in RLHF.)
      This is used to train a &lt;strong&gt;reward model&lt;/strong&gt; that predicts which output humans will prefer.&lt;/li&gt;
&lt;li&gt;We use this reward model to train the LLM to produce outputs that humans will like.
      (That's the reinforcement learning in RLHF.)
    &lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;I won't go into all the details here (both &lt;a href="https://arxiv.org/abs/2203.02155"&gt;the original paper&lt;/a&gt; and
  &lt;a href="https://rlhfbook.com"&gt;the RLHF book&lt;/a&gt; are great resources), but today we are focused on the question:
  do LLMs really just predict the next word? For this, what matters is the &lt;strong&gt;loss functions&lt;/strong&gt; of each step.
&lt;/p&gt;
&lt;h3&gt;Reward Modeling&lt;/h3&gt;
&lt;p&gt;The first step is &lt;strong&gt;reward modeling&lt;/strong&gt;, where we train a separate model to predict the &lt;strong&gt;reward&lt;/strong&gt;
  of an output: essentially, how good humans think the output is.
  The loss function for the reward model is
  &lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-3-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi class="MJX-tex-caligraphic" mathvariant="script"&gt;L&lt;/mi&gt;&lt;/mrow&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mi&gt;θ&lt;/mi&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mi&gt;log&lt;/mi&gt;&lt;mo&gt;⁡&lt;/mo&gt;&lt;mrow&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;mi&gt;σ&lt;/mi&gt;&lt;mrow&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;r&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;θ&lt;/mi&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;mi&gt;w&lt;/mi&gt;&lt;/msub&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;r&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;θ&lt;/mi&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;mi&gt;l&lt;/mi&gt;&lt;/msub&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/mrow&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/mrow&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
  This is simplified from the original formula,[&lt;a href="https://www.harysdalvi.com/blog/llms-dont-predict-next-word/#reflist"&gt;7&lt;/a&gt;] but captures
  the core idea. In this formula, our input is a prompt &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-4-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; along with a pair of outputs &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-5-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;mi&gt;w&lt;/mi&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; and &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-6-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;mi&gt;l&lt;/mi&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;,
  where &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-7-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;mi&gt;w&lt;/mi&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is the output that the human labeler prefers and &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-8-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;mi&gt;l&lt;/mi&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is the one they don't like as much.
  The function &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-9-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;r&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;θ&lt;/mi&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is the reward model, which takes a prompt and an output and returns a scalar score.
  &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-10-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;σ&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is the sigmoid function.
&lt;/p&gt;
&lt;p&gt;If we graph the loss as a function of
  &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-11-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;r&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;θ&lt;/mi&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;mi&gt;w&lt;/mi&gt;&lt;/msub&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;r&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;θ&lt;/mi&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;mi&gt;l&lt;/mi&gt;&lt;/msub&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; (how much higher the given reward is for the winning output),
  we get a curve that looks like this:
&lt;/p&gt;
&lt;figure class="bigimg"&gt;
&lt;img alt="Reward model loss function (log sigmoid), decreasing roughly linearly until flattening out around zero" src="https://www.harysdalvi.com/blog/llms-dont-predict-next-word/img/rm-loss.jpg"/&gt;
&lt;figcaption&gt;Reward model loss function&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;We can see that as we assign higher scores to the output that humans thought was better,
  the loss goes down, approaching zero as the gap between the two scores grows. On the other hand,
  if we assign lower scores to the output that humans actually thought was better, the loss goes up.
&lt;/p&gt;
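&lt;p&gt;The shape of this curve can be checked numerically with a short sketch of the simplified pairwise loss above. The score gaps are made-up values and the function names are hypothetical. This naive version is also only meant for small inputs: a production implementation would use a numerically stable log-sigmoid.&lt;/p&gt;

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def reward_model_loss(reward_preferred, reward_rejected):
    """Pairwise loss: -log(sigmoid(r(x, y_w) - r(x, y_l)))."""
    return -math.log(sigmoid(reward_preferred - reward_rejected))


# Made-up score gaps. A positive gap means the reward model agrees
# with the human labeler; a negative gap means it disagrees.
for gap in (-4.0, -1.0, 0.0, 1.0, 4.0):
    print(f"score gap = {gap:+.1f}  loss = {reward_model_loss(gap, 0.0):.3f}")
```

&lt;p&gt;When the two scores are equal, the loss is log 2. It shrinks toward zero as the preferred output pulls ahead, and grows roughly linearly as the rejected output pulls ahead.&lt;/p&gt;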
&lt;p&gt;By training the reward model on labels provided by humans in this way,
  we end up with a model that can &lt;strong&gt;predict how much humans will like an output.&lt;/strong&gt;
&lt;/p&gt;
&lt;h3&gt;Proximal Policy Optimization&lt;/h3&gt;
&lt;p&gt;So now we have this function &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-12-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;r&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;θ&lt;/mi&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; for predicting rewards.
  That's cool, but what we actually wanted was a better LLM, not just a reward predictor.
  Now we can use the reward model in our loss function for training the actual LLM!
  At the same time, our LLM is already pretty good with instruction finetuning,
  so we'll try not to change it too much. This is the idea of &lt;strong&gt;proximal policy optimization (PPO)&lt;/strong&gt;.
&lt;/p&gt;
&lt;p&gt;
  The new objective function for the LLM is[&lt;a href="https://www.harysdalvi.com/blog/llms-dont-predict-next-word/#reflist"&gt;7&lt;/a&gt;]&lt;/p&gt;
&lt;div class="longmath"&gt;&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-13-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mtable columnalign="right left right left right left right left right left right left" columnspacing="0em 2em 0em 2em 0em 2em 0em 2em 0em 2em 0em" displaystyle="true" rowspacing="3pt"&gt;&lt;mtr&gt;&lt;mtd&gt;&lt;mtext&gt;objective&lt;/mtext&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mi&gt;ϕ&lt;/mi&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;/mtd&gt;&lt;mtd&gt;&lt;mi&gt;&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;msub&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="double-struck"&gt;E&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mo&gt;∼&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;D&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;msubsup&gt;&lt;mi&gt;π&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;ϕ&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mtext&gt;RL&lt;/mtext&gt;&lt;/mrow&gt;&lt;/msubsup&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mrow&gt;&lt;mo&gt;[&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;r&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;θ&lt;/mi&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mi&gt;β&lt;/mi&gt;&lt;mi&gt;log&lt;/mi&gt;&lt;mo&gt;⁡&lt;/mo&gt;&lt;mrow&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;msubsup&gt;&lt;mi&gt;π&lt;/mi&gt;&lt;mrow 
class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;ϕ&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mtext&gt;RL&lt;/mtext&gt;&lt;/mrow&gt;&lt;/msubsup&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;mo&gt;∣&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;/&lt;/mo&gt;&lt;/mrow&gt;&lt;msup&gt;&lt;mi&gt;π&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mtext&gt;SFT&lt;/mtext&gt;&lt;/mrow&gt;&lt;/msup&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;mo&gt;∣&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/mrow&gt;&lt;mo&gt;]&lt;/mo&gt;&lt;/mrow&gt;&lt;/mtd&gt;&lt;/mtr&gt;&lt;mtr&gt;&lt;mtd&gt;&lt;/mtd&gt;&lt;/mtr&gt;&lt;mtr&gt;&lt;mtd&gt;&lt;/mtd&gt;&lt;mtd&gt;&lt;mi&gt;&lt;/mi&gt;&lt;mspace width="1em"&gt;&lt;/mspace&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mi&gt;γ&lt;/mi&gt;&lt;msub&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="double-struck"&gt;E&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mo&gt;∼&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;D&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mtext&gt;pretrain&lt;/mtext&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mrow&gt;&lt;mo&gt;[&lt;/mo&gt;&lt;mi&gt;log&lt;/mi&gt;&lt;mo&gt;⁡&lt;/mo&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;msubsup&gt;&lt;mi&gt;π&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;ϕ&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mtext&gt;RL&lt;/mtext&gt;&lt;/mrow&gt;&lt;/msubsup&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mo&gt;]&lt;/mo&gt;&lt;/mrow&gt;&lt;/mtd&gt;&lt;/mtr&gt;&lt;/mtable&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;
&lt;p&gt;This is a lot more complicated than the previous loss functions! I'll try to break it down step by step.
&lt;/p&gt;
&lt;p&gt;First of all, this is technically an &lt;strong&gt;objective function&lt;/strong&gt;, not a loss function.
  So we are trying to maximize this function, not minimize it.
&lt;/p&gt;
&lt;p&gt;
  Let's start with the first term:&lt;/p&gt;
&lt;div class="longmath"&gt;&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-14-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="double-struck"&gt;E&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mo&gt;∼&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;D&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;msubsup&gt;&lt;mi&gt;π&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;ϕ&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mtext&gt;RL&lt;/mtext&gt;&lt;/mrow&gt;&lt;/msubsup&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mrow&gt;&lt;mo&gt;[&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;r&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;θ&lt;/mi&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mo&gt;]&lt;/mo&gt;&lt;/mrow&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;
&lt;p&gt;Here, &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-15-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mo&gt;∼&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;D&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;msubsup&gt;&lt;mi&gt;π&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;ϕ&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mtext&gt;RL&lt;/mtext&gt;&lt;/mrow&gt;&lt;/msubsup&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; are prompts (&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-16-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;) that we are using for RL training,
  and &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-17-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is the output that the model produced for that prompt. So far, our objective function is just
  the expected value over the training data of the reward model &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-18-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;r&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;θ&lt;/mi&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;.
  So we are trying to &lt;strong&gt;maximize the reward predicted by the reward model that we trained earlier.&lt;/strong&gt;
  Note that this is computed on an entire prompt and output, not the next token.
&lt;/p&gt;
&lt;p&gt;Onto the next term:&lt;/p&gt;
&lt;div class="longmath"&gt;&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-19-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="double-struck"&gt;E&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mo&gt;∼&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;D&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;msubsup&gt;&lt;mi&gt;π&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;ϕ&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mtext&gt;RL&lt;/mtext&gt;&lt;/mrow&gt;&lt;/msubsup&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mrow&gt;&lt;mo&gt;[&lt;/mo&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mi&gt;β&lt;/mi&gt;&lt;mi&gt;log&lt;/mi&gt;&lt;mo&gt;⁡&lt;/mo&gt;&lt;mrow&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;msubsup&gt;&lt;mi&gt;π&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;ϕ&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mtext&gt;RL&lt;/mtext&gt;&lt;/mrow&gt;&lt;/msubsup&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;mo&gt;∣&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;/&lt;/mo&gt;&lt;/mrow&gt;&lt;msup&gt;&lt;mi&gt;π&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mtext&gt;SFT&lt;/mtext&gt;&lt;/mrow&gt;&lt;/msup&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;mo&gt;∣&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mo 
stretchy="false"&gt;)&lt;/mo&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/mrow&gt;&lt;mo&gt;]&lt;/mo&gt;&lt;/mrow&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;
&lt;p&gt;Again, &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-20-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is the prompt and &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-21-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is the model output. Now &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-22-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msubsup&gt;&lt;mi&gt;π&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;ϕ&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mtext&gt;RL&lt;/mtext&gt;&lt;/mrow&gt;&lt;/msubsup&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;mo&gt;∣&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is the predicted probabilities of the current model we're training,
  while &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-23-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msup&gt;&lt;mi&gt;π&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mtext&gt;SFT&lt;/mtext&gt;&lt;/mrow&gt;&lt;/msup&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;mo&gt;∣&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is the predicted probabilities of the &lt;strong&gt;base model&lt;/strong&gt; we started from,
  after pretraining and instruction finetuning.
&lt;/p&gt;
&lt;p&gt;If we move around some notation, letting &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-24-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;msubsup&gt;&lt;mi&gt;π&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;ϕ&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mtext&gt;RL&lt;/mtext&gt;&lt;/mrow&gt;&lt;/msubsup&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;mo&gt;∣&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; and &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-25-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;q&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;msup&gt;&lt;mi&gt;π&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mtext&gt;SFT&lt;/mtext&gt;&lt;/mrow&gt;&lt;/msup&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;mo&gt;∣&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;,
  this is also&lt;/p&gt;
&lt;div class="longmath"&gt;&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-26-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mi&gt;β&lt;/mi&gt;&lt;msub&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="double-struck"&gt;E&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mo&gt;∼&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;D&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;msubsup&gt;&lt;mi&gt;π&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;ϕ&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mtext&gt;RL&lt;/mtext&gt;&lt;/mrow&gt;&lt;/msubsup&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mrow&gt;&lt;mo&gt;[&lt;/mo&gt;&lt;mi&gt;log&lt;/mi&gt;&lt;mo&gt;⁡&lt;/mo&gt;&lt;mrow&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;mfrac&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mi&gt;q&lt;/mi&gt;&lt;/mfrac&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/mrow&gt;&lt;mo&gt;]&lt;/mo&gt;&lt;/mrow&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;
&lt;p&gt;so we are taking the expected value of &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-27-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;log&lt;/mi&gt;&lt;mo&gt;⁡&lt;/mo&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;/&lt;/mo&gt;&lt;/mrow&gt;&lt;mi&gt;q&lt;/mi&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, and using that times &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-28-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;β&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; as a penalty.
  This expected value is the Kullback-Leibler divergence (or KL divergence) between the two distributions &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-29-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; and &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-30-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;q&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;,
  which represents how different the two distributions are. By applying a penalty on the difference,
  we are making sure that as we train the model, &lt;strong&gt;its output probabilities stay somewhat close to what they were
  in the base model&lt;/strong&gt;, when all we had done was pretraining and instruction finetuning.
&lt;/p&gt;
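&lt;p&gt;Here is a small sketch of that penalty, assuming a toy setting where a prompt has only three possible outputs.
  The probabilities and the value of beta are invented for illustration; a real LLM has far too many possible outputs
  to sum over, so the expected value is estimated from sampled outputs instead.
&lt;/p&gt;

```python
import math

# Hypothetical probabilities of three whole outputs y for one fixed prompt x.
p_rl = {"answer A": 0.7, "answer B": 0.2, "answer C": 0.1}   # current model
q_sft = {"answer A": 0.5, "answer B": 0.3, "answer C": 0.2}  # base (SFT) model

def kl_divergence(p, q):
    """Expected value of log(p/q) when outputs are drawn from p."""
    return sum(p[y] * math.log(p[y] / q[y]) for y in p)

beta = 0.02  # assumed penalty strength, for illustration only
print("KL(p || q) =", round(kl_divergence(p_rl, q_sft), 4))
print("penalty    =", round(beta * kl_divergence(p_rl, q_sft), 6))
print("KL(q || q) =", kl_divergence(q_sft, q_sft))  # identical distributions give 0.0
```

&lt;p&gt;The divergence is zero when the two models agree exactly and grows as the current model drifts away,
  so subtracting it from the objective pulls the model back toward where it started.
&lt;/p&gt;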
&lt;p&gt;Finally the last term:&lt;/p&gt;
&lt;div class="longmath"&gt;&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-31-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;γ&lt;/mi&gt;&lt;msub&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="double-struck"&gt;E&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mo&gt;∼&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;D&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mtext&gt;pretrain&lt;/mtext&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mrow&gt;&lt;mo&gt;[&lt;/mo&gt;&lt;mi&gt;log&lt;/mi&gt;&lt;mo&gt;⁡&lt;/mo&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;msubsup&gt;&lt;mi&gt;π&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;ϕ&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mtext&gt;RL&lt;/mtext&gt;&lt;/mrow&gt;&lt;/msubsup&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mo&gt;]&lt;/mo&gt;&lt;/mrow&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;
&lt;p&gt;This time, instead of using our RLHF dataset of prompts and outputs, we are going back to the pretraining dataset &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-32-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;D&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mtext&gt;pretrain&lt;/mtext&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;.
  In fact, we are just &lt;strong&gt;predicting the next token&lt;/strong&gt; here using the exact same loss function as before,
  &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-33-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mi&gt;log&lt;/mi&gt;&lt;mo&gt;⁡&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, only times a constant &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-34-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;γ&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;.
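  Since we are maximizing the objective, the log probability appears here with a positive sign, which is equivalent to minimizing that loss.
  We add this term so that as we do RLHF, we maintain good performance on predicting the next token on the pretraining data.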
&lt;/p&gt;
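&lt;p&gt;As a quick sketch of how this term relates to the usual next-token loss, consider the toy numbers below.
  The probabilities and the value of gamma are assumptions, and averaging per token is a simplification
  of the whole-sequence log probability in the formula.
&lt;/p&gt;

```python
import math

# Hypothetical probabilities the model assigned to the token that actually
# came next at four positions in a snippet of pretraining text.
prob_of_true_next_token = [0.60, 0.05, 0.90, 0.30]

log_likelihood = sum(math.log(p) for p in prob_of_true_next_token) / len(prob_of_true_next_token)
next_token_loss = -log_likelihood  # the usual cross-entropy loss, to be minimized

gamma = 0.5  # assumed mixing weight, for illustration only
pretrain_term = gamma * log_likelihood  # what gets added to the objective

print("loss:", round(next_token_loss, 4))
print("objective term:", round(pretrain_term, 4))
```

&lt;p&gt;Making the objective term larger is the same thing as making the next-token loss smaller.
&lt;/p&gt;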
&lt;p&gt;Let's recap the meaning of the three terms of this objective function:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;We try to maximize the reward given by the reward model we trained before.
    Hopefully, this means we are making outputs that humans will like.&lt;/li&gt;
&lt;li&gt;We add a penalty for outputting a distribution that goes too far from the base model.&lt;/li&gt;
&lt;li&gt;We mix in some normal next-token prediction on the pretraining data.&lt;/li&gt;
&lt;/ol&gt;
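&lt;p&gt;Putting the three terms together, a toy estimate of the objective from one batch might look like the sketch below.
  The function name, the batch layout, and every number are hypothetical, and the per-token average in the third term
  is a simplification. This only evaluates the objective; it does not perform any training.
&lt;/p&gt;

```python
import math

def rlhf_objective(samples, pretrain_probs, beta, gamma):
    """Toy estimate of the three-term objective from a batch.

    samples: list of (reward, p_rl, p_sft) for outputs sampled from the current model
    pretrain_probs: probabilities the current model gave to true next tokens
    """
    n = len(samples)
    reward_term = sum(r for r, _, _ in samples) / n                      # term 1
    kl_term = sum(math.log(p / q) for _, p, q in samples) / n            # term 2
    pretrain_term = sum(math.log(p) for p in pretrain_probs) / len(pretrain_probs)  # term 3
    return reward_term - beta * kl_term + gamma * pretrain_term

# All numbers below are made up for illustration.
batch = [(1.2, 0.30, 0.20), (0.4, 0.10, 0.10), (-0.5, 0.08, 0.05)]
print(rlhf_objective(batch, [0.6, 0.05, 0.9], beta=0.02, gamma=0.5))
```

&lt;p&gt;Setting beta and gamma to zero leaves only the average reward, which is a handy way to see how much
  each of the other two terms is contributing.
&lt;/p&gt;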
&lt;p&gt;The algorithm used to optimize this objective is called
  proximal policy optimization (PPO). Proximal because each update keeps the model close to where it was before
  (and, through the penalty in term #2, close to the base model),
  and policy optimization because the output probabilities of the model are called the model's
  &lt;strong&gt;policy&lt;/strong&gt; in reinforcement learning.
&lt;/p&gt;
&lt;p&gt;
&lt;strong&gt;Only one of these terms (#3) is directly training the model to predict the next token.&lt;/strong&gt;
  Arguably, term #2 is also a proxy for next-token prediction, since we stay close to a base model
  that was trained on next-token prediction.&lt;/p&gt;
&lt;p&gt;But &lt;strong&gt;term #1, the RLHF term, is fundamentally
different from next-token prediction.&lt;/strong&gt;
This means that LLMs do not just predict the next token because
&lt;strong&gt;they are literally not trained to just predict the next token.&lt;/strong&gt;
&lt;/p&gt;
&lt;h2&gt;LLMs as Chess Players&lt;/h2&gt;
&lt;p&gt;Imagine a chess-playing model like AlphaZero. If we ignore the details of tree search,
  the model takes in a chessboard and outputs a distribution over possible moves
  (its &lt;strong&gt;policy&lt;/strong&gt;).
  The model is trained based on the results of games it plays, and over time,
  it gets better at playing games.[&lt;a href="https://www.harysdalvi.com/blog/llms-dont-predict-next-word/#reflist"&gt;8&lt;/a&gt;]
&lt;/p&gt;
&lt;p&gt;This is the essence of reinforcement learning:
  we have an &lt;strong&gt;agent&lt;/strong&gt; (the chess-playing model)
  that interprets the &lt;strong&gt;environment&lt;/strong&gt;
  (the chessboard) and takes &lt;strong&gt;actions&lt;/strong&gt;
  (moves on the chessboard) that in turn affect the environment.
  It tries to choose actions that maximize its perceived &lt;strong&gt;reward&lt;/strong&gt;
  based on the environment.
  Its actions are expressed as a &lt;strong&gt;policy&lt;/strong&gt;,
  which is a probability distribution over possible next moves.
&lt;/p&gt;
&lt;figure class="middleimg"&gt;
&lt;img alt="Reinforcement learning diagram: agent and environment affect each other" class="white-diagram" src="https://www.harysdalvi.com/blog/llms-dont-predict-next-word/img/Reinforcement_learning_diagram.svg"/&gt;
&lt;figcaption&gt;Reinforcement learning diagram&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;Regardless of whether RLHF is truly RL, there is an important analogy we can draw here.
  Rather than next-token prediction machines, LLMs are &lt;strong&gt;agents&lt;/strong&gt;
  that interpret their &lt;strong&gt;environment&lt;/strong&gt; (the prompt and output so far)
  and take &lt;strong&gt;actions&lt;/strong&gt; (the next token) that affect the environment.
  They try to choose actions that maximize their perceived &lt;strong&gt;reward&lt;/strong&gt;
  (the reward model, which helps them produce outputs that humans like).
&lt;/p&gt;
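&lt;p&gt;The analogy can be sketched as a loop. The &lt;code&gt;toy_policy&lt;/code&gt; function below is a hypothetical stand-in
  for an LLM with a hand-written three-word vocabulary, so this shows only the agent and environment structure,
  not a real model. During training, the reward would be computed on the finished output.
&lt;/p&gt;

```python
import random

def toy_policy(state):
    """Hypothetical stand-in for an LLM: a distribution over next tokens."""
    if state[-1] == "the":
        return {"cat": 0.6, "dog": 0.4}
    if state[-1] in ("cat", "dog"):
        return {"sleeps": 0.7, "[END]": 0.3}
    return {"[END]": 1.0}

def generate(prompt_tokens, rng, max_steps=10):
    state = list(prompt_tokens)            # environment: prompt and output so far
    for _ in range(max_steps):
        policy = toy_policy(state)         # policy: distribution over actions
        tokens, weights = zip(*policy.items())
        action = rng.choices(tokens, weights=weights)[0]  # action: the next token
        if action == "[END]":
            break
        state.append(action)               # the action changes the environment
    return state

print(generate(["the"], random.Random(0)))
```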
&lt;figure class="endfloat"&gt;
&lt;img alt="A computer chess game played between two bots" src="https://www.harysdalvi.com/blog/llms-dont-predict-next-word/img/chess.jpg"/&gt;
&lt;figcaption&gt;RL chess bots predict next moves as possible actions, just like LLMs predict next tokens as possible actions.&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;This agent-like behavior is a consequence of how we trained the LLM in RLHF.
  The part that doesn't change, no matter how we train the LLM, is its input space
  (strings of tokens) and output space (distributions over tokens).
  Depending on the training, we can interpret that same machine in different ways:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;During pretraining, an LLM becomes an agent that tries to take actions (next tokens)
      in order to predict the next token. At this point, it's just a statistical model.
    &lt;/li&gt;
&lt;li&gt;During RLHF, an LLM becomes an agent that tries to take actions (next tokens)
      to ultimately produce entire outputs in a way that (indirectly, via a reward model) appeals to human judges.
    &lt;/li&gt;
&lt;li&gt;During chain-of-thought RL training like in DeepSeek R1,[&lt;a href="https://www.harysdalvi.com/blog/llms-dont-predict-next-word/#reflist"&gt;9&lt;/a&gt;] an LLM becomes an agent
      that tries to take actions (next tokens during both reasoning and output)
      to ultimately produce outputs that are more likely to be correct.
    &lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;From here, we can imagine LLMs that write code and evaluate it by running it,
  or even solve math problems and evaluate their solutions with proof assistants.[&lt;a href="https://www.harysdalvi.com/blog/llms-dont-predict-next-word/#reflist"&gt;10&lt;/a&gt;]
  On the more sinister side, we can imagine LLMs that are rewarded for spreading misinformation.
&lt;/p&gt;
&lt;p&gt;In short, to the extent that words matter, LLMs can be thought of as agents that take actions
  rather than just statistical models, even if those “actions” are words.
&lt;/p&gt;
&lt;h2&gt;Why AI Agents?&lt;/h2&gt;
&lt;p&gt;If LLMs are already agents, what's the buzz around “AI agents”?
&lt;/p&gt;
&lt;p&gt;LLMs alone are RL “agents” whose actions are limited to producing tokens.
  By mapping tokens to real-world actions, we can make the agent-like behavior
  already baked into LLMs into something even more tangible.
&lt;/p&gt;
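&lt;p&gt;As a rough sketch of what mapping tokens to actions can mean, the code below treats a model output
  as a function call if it parses as JSON in a particular shape. The JSON format, the &lt;code&gt;get_weather&lt;/code&gt; tool,
  and the &lt;code&gt;act_on_output&lt;/code&gt; helper are all assumptions for illustration, not any specific product's API.
&lt;/p&gt;

```python
import json

def get_weather(city):
    """Hypothetical tool; a real one would call an external service."""
    return f"(pretend forecast for {city})"

TOOLS = {"get_weather": get_weather}

def act_on_output(model_output):
    """Run a tool if the output is a JSON tool call, else just display it."""
    try:
        call = json.loads(model_output)
    except json.JSONDecodeError:
        return model_output                      # ordinary text for the user
    if not isinstance(call, dict) or call.get("tool") not in TOOLS:
        return model_output
    return TOOLS[call["tool"]](**call.get("arguments", {}))

print(act_on_output('{"tool": "get_weather", "arguments": {"city": "Paris"}}'))
print(act_on_output("Hello! How can I help?"))
```

&lt;p&gt;The model itself is unchanged: it still only outputs tokens. The surrounding program decides which tokens count as actions.
&lt;/p&gt;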
&lt;p&gt;Remember that the reward function in RLHF is a proxy for &lt;strong&gt;how much humans like the output of an LLM&lt;/strong&gt;.
  So an LLM is already trained to generally follow instructions in a way that is appealing to humans.
  This can easily be extended to all sorts of actions that an LLM can take.
&lt;/p&gt;
&lt;p&gt;As is, the model is &lt;strong&gt;mainly trained to appeal to human judges.&lt;/strong&gt;
  Code is a clear example where an agent could instead be trained with RL on whether its code actually works,
  rather than assuming that whatever code the evaluators tend to like is correct.
  Prompt engineering can help with this, but there's a lot of room to improve through actual training,
  especially in clear-cut domains with easy feedback loops like writing code for
  well-defined problems.
&lt;/p&gt;
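&lt;p&gt;For a well-defined coding problem, the feedback loop could be as simple as running the code against test cases.
  The sketch below assumes the candidate defines a function named &lt;code&gt;solve&lt;/code&gt;; that convention and the
  &lt;code&gt;code_reward&lt;/code&gt; helper are hypothetical, and a real system would need to run untrusted code in a sandbox.
&lt;/p&gt;

```python
def code_reward(candidate_source, test_cases):
    """Fraction of test cases passed by a candidate `solve` function.

    Assumes trusted input: exec() on real model output would need a sandbox.
    """
    namespace = {}
    try:
        exec(candidate_source, namespace)
        solve = namespace["solve"]
    except Exception:
        return 0.0  # code that fails to load earns no reward
    passed = 0
    for args, expected in test_cases:
        try:
            if solve(*args) == expected:
                passed += 1
        except Exception:
            pass  # a crash counts as a failed test
    return passed / len(test_cases)

tests = [((2, 3), 5), ((0, 0), 0), ((-1, 1), 0)]
print(code_reward("def solve(a, b):\n    return a + b", tests))  # all tests pass
print(code_reward("def solve(a, b):\n    return a * b", tests))  # only (0, 0) passes
```

&lt;p&gt;Unlike a human rating, this reward does not care how convincing the code looks, only whether it produces the right answers.
&lt;/p&gt;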
&lt;p&gt;Additionally, the proxy goal of pleasing human evaluators is great, but it's not perfect.
  LLMs can fool humans, creating outputs that seem good when they are actually flawed.
  In fact, this has already been demonstrated: when training on question answering and code generation,
  one team found that RLHF made the model produce outputs that evaluators liked better over time,
  even though it actually got &lt;em&gt;worse&lt;/em&gt; at the task itself.[&lt;a href="https://www.harysdalvi.com/blog/llms-dont-predict-next-word/#reflist"&gt;11&lt;/a&gt;]&lt;/p&gt;
&lt;p&gt;
  It's not easy to find a better technique, though: reinforcement learning of all kinds
  is very prone to &lt;strong&gt;reward hacking&lt;/strong&gt;, where models learn to exploit the reward function
  without really improving at the task.
&lt;/p&gt;
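&lt;p&gt;Here is a deliberately silly toy example of reward hacking. Both scoring functions and both answers are invented:
  the proxy reward likes long, confident-sounding answers, while the true score only checks whether the answer is right.
&lt;/p&gt;

```python
def proxy_reward(answer):
    """Hypothetical flawed reward: longer, more confident answers score higher."""
    confident_words = ("clearly", "definitely", "certainly")
    return len(answer.split()) + 5 * sum(answer.lower().count(w) for w in confident_words)

def true_score(answer, correct_value):
    """What we actually care about: does the answer contain the right value?"""
    return 1.0 if correct_value in answer.split() else 0.0

honest = "The sum is 12"
hacked = "Clearly and definitely, the sum is certainly and obviously 13"

for answer in (honest, hacked):
    print(proxy_reward(answer), true_score(answer, "12"), "-", answer)
```

&lt;p&gt;A model optimized against the proxy would drift toward the second kind of answer, scoring higher while being wrong.
&lt;/p&gt;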
&lt;p&gt;Although RLHF isn't perfect, it is extremely powerful.
  It's surprising that a “mindless next-token prediction machine” can appear to show intelligence,
  but if we reframe an LLM as a machine that aims to appeal to humans through producing tokens,
  this starts to make a lot more sense.&lt;/p&gt;
&lt;p&gt;It's still true that next-token prediction is an important part of
  LLMs, not only in pretraining but even as a component of RLHF.
  But I hope I was able to explain why there are much deeper layers to how an LLM works,
  first with RLHF, and second with other kinds of RL like chain-of-thought reasoning.
&lt;/p&gt;
&lt;p&gt;Fundamentally, &lt;strong&gt;an LLM is not a next-token predictor.&lt;/strong&gt; It's actually something even more basic:
  &lt;strong&gt;a machine that outputs tokens.&lt;/strong&gt; We can choose whether we train that machine to predict the next token,
  appeal to human evaluators, write code, or do something else entirely.
  And we can choose whether we simply display those tokens to a user or use them to call functions
  and create effects in the real world.
  It's up to us to make the best choices.
&lt;/p&gt;
&lt;hr/&gt;
&lt;h2&gt;Related Posts&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.harysdalvi.com/blog/can-transformers-solve-everything/"&gt;Can Transformers Solve Everything?&lt;/a&gt; — Looking into the math and the data reveals that transformers are both overused and underused.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="ref"&gt;References&lt;/h2&gt;
&lt;ol class="reflist" id="reflist"&gt;&lt;li&gt;&lt;a href="https://rlhfbook.com" target="_blank"&gt;Reinforcement Learning from Human Feedback&lt;/a&gt;
  (Nathan Lambert, 2024) &lt;/li&gt;&lt;li&gt;&lt;a href="https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html" target="_blank"&gt;CrossEntropyLoss&lt;/a&gt;
  (PyTorch Contributors, 2024) &lt;/li&gt;&lt;li&gt;&lt;a href="https://arxiv.org/abs/2212.11281" target="_blank"&gt;Language models are better than humans at next-token prediction&lt;/a&gt;
  (Buck Shlegeris, Fabien Roger, Lawrence Chan, &amp;amp; Euan McLean, 2022) &lt;/li&gt;&lt;li&gt;&lt;a href="https://arxiv.org/abs/2109.01652" target="_blank"&gt;Finetuned Language Models Are Zero-Shot Learners&lt;/a&gt;
  (Jason Wei, Maarten Bosma, Vincent Y. Zhao, Kelvin Guu, Adams Wei Yu, Brian Lester, Nan Du, Andrew M. Dai, &amp;amp; Quoc V. Le, 2022) &lt;/li&gt;&lt;li&gt;&lt;a href="https://pytorch.org/torchtune/0.2/tutorials/chat.html" target="_blank"&gt;Fine-tuning Llama3 with Chat Data&lt;/a&gt;
  (torchtune Contributors, 2023) &lt;/li&gt;&lt;li&gt;&lt;a href="https://alejandrotlaie.net/my-objections-to-the-llms-are-just-next-token-predictors-take" target="_blank"&gt;My objection(s) to the "LLMs are just next-token predictors" take&lt;/a&gt;
  (Alejandro Tlaie Boria, 2025) &lt;/li&gt;&lt;li&gt;&lt;a href="https://arxiv.org/abs/2203.02155" target="_blank"&gt;Training language models to follow instructions with human feedback&lt;/a&gt;
  (Ouyang et al., OpenAI, 2022) &lt;/li&gt;&lt;li&gt;&lt;a href="https://arxiv.org/abs/1712.01815" target="_blank"&gt;Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm&lt;/a&gt;
  (David Silver, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai, Arthur Guez, Marc Lanctot, Laurent Sifre, Dharshan Kumaran, Thore Graepel,
  Timothy Lillicrap, Karen Simonyan, &amp;amp; Demis Hassabis, DeepMind, 2017) &lt;/li&gt;&lt;li&gt;&lt;a href="https://arxiv.org/abs/2501.12948" target="_blank"&gt;DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning&lt;/a&gt;
  (DeepSeek-AI, 2025) &lt;/li&gt;&lt;li&gt;&lt;a href="https://deepmind.google/discover/blog/ai-solves-imo-problems-at-silver-medal-level/" target="_blank"&gt;AI achieves silver-medal standard solving International Mathematical Olympiad problems&lt;/a&gt;
  (AlphaProof and AlphaGeometry teams, 2024) &lt;/li&gt;&lt;li&gt;&lt;a href="https://arxiv.org/abs/2409.12822" target="_blank"&gt;Language Models Learn to Mislead Humans via RLHF&lt;/a&gt;
  (Jiaxin Wen, Ruiqi Zhong, Akbir Khan, Ethan Perez, Jacob Steinhardt, Minlie Huang, Samuel R. Bowman, He He, &amp;amp; Shi Feng, 2024) &lt;/li&gt;&lt;/ol&gt;</content></entry><entry><title>Why AI Can't Do Your Laundry</title><summary>Too often, AI does what we humans want to do, and not what we can't or won't do.</summary><link href="https://www.harysdalvi.com/blog/ai-laundry/" rel="alternate"/><id>https://www.harysdalvi.com/blog/ai-laundry/</id><published>2025-01-16T16:38:07Z</published><updated>2025-01-16T16:38:07Z</updated><content type="html" xml:lang="en">&lt;p&gt;Here are some things AI can do:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Write a sonnet or a short story&lt;/li&gt;
&lt;li&gt;Create realistic or abstract artwork&lt;/li&gt;
&lt;li&gt;Code a passion project&lt;/li&gt;
&lt;li&gt;Teach students new concepts&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;One can argue AI does all or most of these things poorly, which is a fair argument.
But ultimately, AI can do all this at some level, and is only improving over time.
&lt;/p&gt;
&lt;p&gt;Now, here are some things AI &lt;em&gt;can't&lt;/em&gt; do:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Lay bricks&lt;/li&gt;
&lt;li&gt;Clean bathrooms&lt;/li&gt;
&lt;li&gt;Solve world hunger&lt;/li&gt;
&lt;li&gt;Distribute mosquito nets&lt;/li&gt;
&lt;li&gt;Do the dishes&lt;/li&gt;
&lt;li&gt;Do laundry&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;There's a pattern here. It's crazy, and a little upsetting, that &lt;a href="https://www.harysdalvi.com/bardgpt/"&gt;AI
can write poetry&lt;/a&gt; but can't do your laundry.
In general, AI is able to do things humans &lt;strong&gt;want&lt;/strong&gt; to do,
but not things humans &lt;strong&gt;don't want&lt;/strong&gt; to do or &lt;strong&gt;can't&lt;/strong&gt; do.
A lot has been said on this topic with respect to robotics, and why automating manual labor
in many ways is harder than building a language model that reasons or solves difficult math problems.
But I want to tackle a slightly more general problem in this post:
&lt;strong&gt;Why does AI do the things we want to do, and not the things we don't want to do or can't do?
How can we fix this?&lt;/strong&gt;
&lt;/p&gt;
&lt;p&gt;But hidden in this line of reasoning are three assumptions: first, that &lt;strong&gt;manual labor is more difficult to automate than creative tasks&lt;/strong&gt;; second, that &lt;strong&gt;manual labor is desirable to automate while
  creative tasks are not&lt;/strong&gt;;
  and finally, that &lt;strong&gt;manual labor and white-collar knowledge work represent the two main job types for AI to do&lt;/strong&gt;.
  All of these assumptions are up for debate, and are less true than they seem.
&lt;/p&gt;
&lt;h2&gt;Can AI Do Your Laundry?&lt;/h2&gt;
&lt;p&gt;When people say “AI” these days, they usually mean “deep learning with lots of data”.
  If you use that definition, then no, AI can't do your laundry.
&lt;/p&gt;
&lt;p&gt;But if you ask whether technology can
  do your laundry, the answer is yes. Laundry machines and dishwashers have been incredible for humanity: they are
  able to do these chores thoroughly while using less energy than manual cleaning.[&lt;a href="https://www.harysdalvi.com/blog/ai-laundry/#reflist"&gt;1&lt;/a&gt;] Before the invention of appliances
  like these, people, especially women, would often stay at home and do these chores. In this case,
  automation actually &lt;em&gt;massively increased&lt;/em&gt; workforce participation, allowing women more freedom to work instead of
  spending all their time on the household.[&lt;a href="https://www.harysdalvi.com/blog/ai-laundry/#reflist"&gt;2&lt;/a&gt;]
&lt;/p&gt;
&lt;figure class="endfloat"&gt;
&lt;img alt='An advertisement that reads "What a lovely dinner! But oh dear! How I hate to wash the dishes! ... The Faultless Quaker Dishwasher. Not only prevents such remarks as the above, but it washes dishes to perfection and does not chip or break them."' src="https://www.harysdalvi.com/blog/ai-laundry/img/dishwasher.jpg"/&gt;
&lt;figcaption&gt;Advertisement in an 1896 issue of &lt;cite&gt;McClure's Magazine&lt;/cite&gt; for The Faultless Quaker Dishwasher.
  Image source: &lt;a href="https://commons.wikimedia.org/wiki/File:The_Faultless_Quaker_Dishwasher_(1896_advertisement).jpg" target="_blank"&gt;Wikimedia&lt;/a&gt;,
  Public Domain.&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;Now, the tasks of doing laundry and dishes mainly involve moving clothes or dishes to the appropriate places
  and pressing a few buttons. Deep learning (DL) could improve this if we had robots that could scrape food from plates and place them
  in the dishwasher, or fold and hang clothes from the dryer, but the current situation is already pretty awesome.
&lt;/p&gt;
&lt;p&gt;So maybe AI, or rather technology, can already do your laundry.&lt;/p&gt;
&lt;p&gt;Laundry and dishwashing are not isolated examples.
  Just as the majority of human work throughout history has been physical labor, so has the majority of automation.
  It's true that increased literacy rates made scribes obsolete, and calculators largely replaced human computers.
  But we also see that the Industrial Revolution displaced textile workers, modern manufacturing
  displaced blacksmiths, and we even had alarm clocks succeed human “knocker-uppers”.
&lt;/p&gt;
&lt;figure class="middleimg"&gt;
&lt;img alt="" src="https://www.harysdalvi.com/blog/ai-laundry/img/human-computers.jpg"/&gt;
&lt;figcaption&gt;Human computers in the NACA High Speed Flight Station "Computer Room", Dryden Flight Research Center Facilities, summer 1949.
  Image source: &lt;a href="https://commons.wikimedia.org/wiki/File:Human_computers_-_Dryden.jpg" target="_blank"&gt;Wikimedia&lt;/a&gt;,
  Public Domain.&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;It's only in the most recent wave of AI that this trend seems to have been reversed. Deep learning techniques
  have a much easier time with human knowledge work than physical labor. They can play chess
  and other games at a &lt;a href="https://arxiv.org/abs/1712.01815" target="_blank"&gt;superhuman level&lt;/a&gt;,
  and can even solve &lt;a href="https://wandb.ai/byyoung3/ml-news/reports/OpenAI-Introduces-o3-Pushing-the-Boundaries-of-AI-Reasoning--VmlldzoxMDY3OTUxMA"&gt;research-level
  math problems&lt;/a&gt;.
  But they can't reliably handle unexpected events on the road[&lt;a href="https://www.harysdalvi.com/blog/ai-laundry/#reflist"&gt;3&lt;/a&gt;] or even bring you a cup of tea[&lt;a href="https://www.harysdalvi.com/blog/ai-laundry/#reflist"&gt;4&lt;/a&gt;].
&lt;/p&gt;
&lt;p&gt;This is known as &lt;a href="https://en.wikipedia.org/wiki/Moravec%27s_paradox" target="_blank"&gt;Moravec's paradox&lt;/a&gt;:
  things that seem hard to us (like reasoning) are easier for computers, while sensorimotor and perception tasks we do effortlessly
  are very hard to replicate.
  You might think the difference between these tasks and the more “advanced” reasoning tasks
  is one of hardware, where building the right hardware to support the intelligence is a difficult problem.
  But the difficulty is actually (mostly) on the software side:
  &lt;a href="https://youtu.be/UxWH5XAcFnM?si=xBn5vPe4TTMEOE7C&amp;amp;t=32" target="_blank"&gt;humans can control robots to perform these tasks
  quite easily.&lt;/a&gt;[&lt;a href="https://www.harysdalvi.com/blog/ai-laundry/#reflist"&gt;5&lt;/a&gt;]
&lt;/p&gt;
&lt;p&gt;One of the main issues is that there isn't enough high-quality data for robots to train on.[&lt;a href="https://www.harysdalvi.com/blog/ai-laundry/#reflist"&gt;5&lt;/a&gt;]
  There are other issues too, but without solving this, a DL approach to robotics has no hope of matching what we've seen in other AI models.
  In contrast, large language models are able to train on the entire internet and then some, and can include both
  images and text in their training.
&lt;/p&gt;
&lt;p&gt;This is the crux of the problem, the fundamental reason why AI can't do physical labor tasks we want to automate
  and instead does creative work we want to do ourselves. The internet is full of everything from TikTok dances to
  classic works of philosophy, but I've never come across first-person video footage of doing the dishes complete with
  muscle movement data.
  &lt;strong&gt;The internet is primarily a place for people to share
  knowledge work and their passions. By training AI on the internet, we are largely training AI to mimic human creativity
  and ingenuity, while leaving out the drudgery of life.&lt;/strong&gt;
&lt;/p&gt;
&lt;p&gt;There are two ways to solve this problem: either collect more data, or create it.
  To create data, we can use &lt;strong&gt;reinforcement learning (RL)&lt;/strong&gt;.
  We can train robots in simulation, or even in real life, to maximize some reward we set.
  But this is vulnerable to &lt;a href="https://en.wikipedia.org/wiki/Reward_hacking" target="_blank"&gt;reward hacking&lt;/a&gt;:
  we've seen this in games like CoastRunners, a boat racing video game, where an AI got a high score by
  knocking over targets for points instead of completing the race.[&lt;a href="https://www.harysdalvi.com/blog/ai-laundry/#reflist"&gt;6&lt;/a&gt;]
  In a more dangerous example, a robot trained to make you tea and nothing else might knock down doors,
  spill hot water, and break cups, as long as it achieves its objective of making you tea.
&lt;/p&gt;
&lt;p&gt;There is also the option of collecting data outside the internet. We could use humans performing the task as a starting point,
  rather than training a pure RL robot. But it's expensive to get all that human input, and it doesn't work for tasks
  that are too dangerous or difficult for humans.
&lt;/p&gt;
&lt;p&gt;And if we take off our AI hype glasses for a second, we might question the use of deep learning at all
  for this purpose. Robotics startups often don't use DL,[&lt;a href="https://www.harysdalvi.com/blog/ai-laundry/#reflist"&gt;7&lt;/a&gt;]
  which makes sense because DL is a statistical technique with room for error, while many robotics applications demand near-perfect reliability.
  Rather than pure RL/DL, we might see a growth in non-DL approaches or, more likely, a mix of both.
&lt;/p&gt;
&lt;p&gt;I'm looking forward to seeing some great advances in robotics in the coming years, and maybe even
  a “ChatGPT moment”. But these obstacles will likely continue to limit the progress we can make
  in robotics, while we'll see more advances in AI for creative work and knowledge work
  as a consequence of the kind of data we see on the internet.
&lt;/p&gt;
&lt;h2&gt;Should AI Do Your Laundry?&lt;/h2&gt;
&lt;p&gt;The dominant narrative is that AI should automate manual labor rather than creative and stimulating tasks.
  For the most part, I agree. But this isn't as clear-cut as it seems.
&lt;/p&gt;
&lt;p&gt;The word Luddite, now used to describe anyone opposed to new technology, originates from
  a group of textile workers in 19th-century England who felt threatened by the new technologies of the
  Industrial Revolution. They had spent years refining their craft to produce high-quality garments,
  and suddenly people with minimal expertise could churn out textiles with new machines.
  Eventually, the Luddites became so upset that they started smashing textile machines across England,
  which the government made punishable by death.[&lt;a href="https://www.harysdalvi.com/blog/ai-laundry/#reflist"&gt;8&lt;/a&gt;]
  Clearly, they were not happy to have their job automated, even though it was more physical work
  than knowledge work.
&lt;/p&gt;
&lt;p&gt;It's true that unlike dishes or laundry, manual textile work requires a lot of specialized training
  and artistry on top of the physical component. But this is true of many jobs involving physical labor:
  plumbers, mechanics, electricians, and welders all require a lot of training to refine their skills,
  while trades like carpentry are arguably even art forms. Even if this weren't the case,
  the important fact here is that &lt;strong&gt;people in the skilled trades tend to be satisfied with
  their careers:&lt;/strong&gt; nearly 90%, according to one survey.[&lt;a href="https://www.harysdalvi.com/blog/ai-laundry/#reflist"&gt;9&lt;/a&gt;]
  Automating away their livelihoods is no more desirable than automating away that of a spreadsheet worker.
&lt;/p&gt;
&lt;figure class="middleimg"&gt;
&lt;img alt="" src="https://www.harysdalvi.com/blog/ai-laundry/img/tractor.jpg"/&gt;
&lt;figcaption&gt;A tractor, an example of a machine that automates much of a traditional farmer's job
    without displacing the farmer. Image source: &lt;a href="https://unsplash.com/photos/farmer-seeding-sowing-crops-at-field-sowing-is-the-process-of-planting-seeds-in-the-ground-as-part-of-the-early-spring-time-agricultural-activities-aYdgI3l7q_M" target="_blank"&gt;Unsplash&lt;/a&gt;.
  &lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;Knowledge work versus blue-collar work is a useful distinction to see what AI is
  &lt;em&gt;most likely&lt;/em&gt; to automate, but it is not as useful to determine what AI &lt;em&gt;should&lt;/em&gt;
  automate. For that, we need to start with a simpler question: &lt;strong&gt;Do people want to do this job?&lt;/strong&gt;
&lt;/p&gt;
&lt;p&gt;If we look at the careers with the lowest levels of happiness,[&lt;a href="https://www.harysdalvi.com/blog/ai-laundry/#reflist"&gt;10&lt;/a&gt;]
  we see a mix of blue-collar and white-collar jobs.
  While we hear about fears of AI automating artists or programmers on social media,
  many actual candidates for automation with current AI are jobs we find on this list:
  data analyst, customer service representative, administrative assistant. Other roles, like cashier and
  retail salesperson, are not being automated per se but are simply shrinking as self-checkout
  and online shopping become more common.
&lt;/p&gt;
&lt;p&gt;When we say we want AI to do our laundry, we don't necessarily mean that we want it to automate
  blue-collar jobs instead of white-collar ones. Instead, we want it to automate the boring parts
  of life that don't make us money or give us fulfillment. This is good for the economy and human
  livelihoods all around: we saw that the invention of new household appliances increased workforce participation,
  so surely AI for similar purposes will increase both economic productivity and human flourishing.
  What if we could have AI for cooking, cleaning, and taking out the trash?
&lt;/p&gt;
&lt;h2&gt;Besides Laundry, What Should AI Do?&lt;/h2&gt;
&lt;p&gt;What AI should do and what it will do aren't always the same.
  It seems to me that unfortunately, AI will automate away the jobs of some people who wanted to keep those jobs.
  It might not happen as explicitly as in the case of the Luddites, but the shift in economic incentives won't
  be 100% good for everyone. There will be harm, as with any new automation.
&lt;/p&gt;
&lt;p&gt;When faced with the possibility of AI harming us, we can and should ask how we can make AI serve us as regular people.
  An obvious candidate is chores like cooking, cleaning, and taking out the trash. Having robots that can do these non-career chores
  would free up people to work more and enjoy more, just like laundry machines did in the past.&lt;/p&gt;
&lt;p&gt;But there's a lot more to AI than chores.
  So far, I've mostly been talking about using robotics to automate things that humans don't want to do, like
  chores and jobs they dislike. But equally important is &lt;strong&gt;AI automating things humans can't do,
  at least not feasibly&lt;/strong&gt;.
  Unlike something like AI autocompleting a line of code or designing a logo,
  this would be a real step towards using AI to build a world of great prosperity far beyond
  what humans can achieve on their own.
&lt;/p&gt;
&lt;p&gt;The internet is full of things humans have done, so it seems like a bad place to train an AI to do
  things humans can't feasibly do themselves. But the key word in that sentence is &lt;strong&gt;feasibly&lt;/strong&gt;.
  Customer service is a great example: humans can provide customer service over the phone,
  but it's not feasible to provide it at all hours, for all companies, for
  all customers who might have a complaint. With large language models, this can be automated
  at scale, ultimately using human-created data from the internet to accomplish something humans couldn't feasibly
  organize to do. Companies like &lt;a href="https://www.bland.ai" target="_blank"&gt;Bland AI&lt;/a&gt;
  are already working on this.
&lt;/p&gt;
&lt;p&gt;Another example is &lt;strong&gt;upsampling&lt;/strong&gt;: loosely speaking, taking an existing image, video, or audio and filling in the blanks in some way.
  In 2023, the surviving Beatles used AI to restore John Lennon's vocals from old
  recordings and release the last ever Beatles song, “Now and Then”.[&lt;a href="https://www.harysdalvi.com/blog/ai-laundry/#reflist"&gt;11&lt;/a&gt;] We can imagine the general principle of AI upsampling
  for things like remastering old songs, animating and colorizing historical photos, and
  cleaning all kinds of noisy data for business and engineering purposes.
&lt;/p&gt;
&lt;p&gt;Then there are science experiments. Science labs are expensive, and
  running experiments is still costly even if you have one. AI can provide at least reasonably accurate experimental results
  without needing to run a real experiment, allowing scientists to refine their theories more quickly
  while running experiments more selectively. Technologies like &lt;a href="https://alphafold.ebi.ac.uk" target="_blank"&gt;AlphaFold&lt;/a&gt;
  demonstrate this, and promise to speed up research that could save lives.
&lt;/p&gt;
&lt;figure class="middleimg"&gt;
&lt;img alt="" src="https://www.harysdalvi.com/blog/ai-laundry/img/protein.jpg"/&gt;
&lt;figcaption&gt;An illustration of protein structure, which AlphaFold aims to predict.
    Image source: Holger87, &lt;a href="https://commons.wikimedia.org/wiki/File:Protein_structure.png" target="_blank"&gt;Wikimedia&lt;/a&gt;,
    &lt;a href="https://creativecommons.org/licenses/by-sa/3.0/" target="_blank"&gt;CC BY-SA 3.0&lt;/a&gt;.
  &lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;Finally, one of the main purposes of machine learning in general is prediction: given some past data,
  we use statistical methods to predict what comes next. New machine learning
  methods are being used for &lt;strong&gt;forecasting&lt;/strong&gt; in finance, weather, and even &lt;a href="https://cdnsciencepub.com/doi/10.1139/er-2020-0019" target="_blank"&gt;early natural disaster warnings&lt;/a&gt;.
  No human can look at a giant spreadsheet and come up with these predictions, not even with Excel formulas,
  so this is another case of using AI to expand upon rather than replace human potential.
&lt;/p&gt;
&lt;p&gt;Unlike customer service or upsampling, many of these applications, like
  automating science experiments and forecasting, need specialized datasets: we can't just dump the internet onto a model.
  But so far, we've had encouraging successes in these fields.&lt;/p&gt;
&lt;p&gt;
  Too often, the debate over AI automation focuses only on using AI to replace human activities,
  some we like and others we don't. It's easy to forget that AI can also help us by accomplishing
  things that humans simply can't do, much like earlier computer programs from calculators to
  chess bots.
&lt;/p&gt;
&lt;h2&gt;AI for Human Potential&lt;/h2&gt;
&lt;p&gt;AI doesn't have to be all about replacing jobs people love and leaving them with nothing.
  Ideally, we can use AI for two main reasons: first, &lt;strong&gt;automating jobs people hate&lt;/strong&gt;,
  and second, &lt;strong&gt;doing things people can't do without AI&lt;/strong&gt;.
&lt;/p&gt;
&lt;p&gt;Even when automating jobs people hate, like customer service, there is the risk of displacing
  livelihoods. But if companies make more money, they will want to use the money to sell more;
  and in order to sell more, they will hopefully hire people for new jobs.
  These new jobs should have better
  working conditions and more fulfillment than the ones that are being automated. We've
  often gotten this from automation in the past: with 3D printing, a CAD engineer has better working conditions
  and probably a more fulfilling job than a factory worker. If we're careful,
  we can replicate this kind of good automation with AI as well.
&lt;/p&gt;
&lt;p&gt;Automation is the most obvious possible result of AI: we have a new thing with some kind of intelligence,
  and it's natural to think about how it might augment or replace our own intelligence in roles
  we already perform as humans. But it might be better to instead think about the more hidden ways
  that AI can do things
  we can't do, like upsampling, forecasting, and predicting the results of science experiments.
&lt;/p&gt;
&lt;p&gt;We can and should build AI that increases human potential instead of replacing it.
  We &lt;em&gt;should&lt;/em&gt;, but it's not yet certain that we &lt;em&gt;will&lt;/em&gt;: if we (and policymakers) play our cards wrong,
  there is a real possibility that AI automates human labor for corporations without creating new jobs to compensate.
  How it goes is still up to us.
&lt;/p&gt;
&lt;hr/&gt;
&lt;h2&gt;Related Posts&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.harysdalvi.com/blog/can-transformers-solve-everything"&gt;Can Transformers Solve Everything?&lt;/a&gt; — Looking into the math and the data reveals that transformers are both overused and underused.&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.harysdalvi.com/blog/llms-dont-predict-next-word"&gt;LLMs Do Not Predict the Next Word&lt;/a&gt; — RLHF forces us to view LLMs as agents in an environment, not just statistical models.&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.harysdalvi.com/blog/pipe-room/"&gt;The Pipe Room&lt;/a&gt; — Can a computer ever be conscious like we are?&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="ref"&gt;References&lt;/h2&gt;
&lt;ol id="reflist"&gt;&lt;li&gt;
&lt;a href="https://www.finishdishwashing.com/dishwasher-benefits/handwashing-vs-dishwasher/" target="_blank"&gt;Handwashing vs Dishwasher&lt;/a&gt;
  (Reckitt Benckiser) &lt;/li&gt;&lt;li&gt;
&lt;a href="https://penntoday.upenn.edu/news/how-appliance-boom-moved-more-women-workforce" target="_blank"&gt;How the appliance boom moved more women into the workforce&lt;/a&gt;
  (Jeremy Greenwood, “Evolving Households: The Imprint of Technology on Life”, 2019) &lt;/li&gt;&lt;li&gt;
&lt;a href="https://www.cbsnews.com/news/self-driving-cars-ford-mustang-mach-nhtsa/" target="_blank"&gt;The main reason why self-driving cars are not ready for prime time&lt;/a&gt;
  (Khristopher J. Brooks, CBS News, 2024) &lt;/li&gt;&lt;li&gt;
&lt;a href="https://alanwinfield.blogspot.com/2019/03/the-tea-test-of-robot-intelligence.html" target="_blank"&gt;The Tea test of robot intelligence&lt;/a&gt;
  (Alan Winfield, 2019) &lt;/li&gt;&lt;li&gt;
&lt;a href="https://harimus.github.io//2024/05/31/motortask.html" target="_blank"&gt;Common misconceptions about the complexity in robotics vs AI&lt;/a&gt;
  (Dan Ogawa Lillrank, 2024) &lt;/li&gt;&lt;li&gt;
&lt;a href="https://openai.com/index/faulty-reward-functions/" target="_blank"&gt;Faulty reward functions in the wild&lt;/a&gt;
  (Clark &amp;amp; Amodei, OpenAI, 2016) &lt;/li&gt;&lt;li&gt;
&lt;a href="https://www.thinkautonomous.ai/blog/robotics-startups/" target="_blank"&gt;Why Robotics Startups Don't Use Deep Learning&lt;/a&gt;
  (Think Autonomous, 2022) &lt;/li&gt;&lt;li&gt;
&lt;a href="https://www.history.com/news/who-were-the-luddites" target="_blank"&gt;Who Were the Luddites?&lt;/a&gt;
  (Evan Andrews, &lt;cite&gt;History&lt;/cite&gt;, 2023) &lt;/li&gt;&lt;li&gt;
&lt;a href="https://ir.angi.com/news-releases/news-release-details/new-angi-report-finds-nearly-90-skilled-tradespeople-satisfied" rel="nofollow" target="_blank"&gt;New
    Angi Report Finds Nearly 90% of Skilled Tradespeople Satisfied in Their Careers&lt;/a&gt;
  (Angi, Inc., 2024) &lt;/li&gt;&lt;li&gt;
&lt;a href="https://jobs.washingtonpost.com/article/10-careers-reporting-the-lowest-levels-of-happiness/" target="_blank"&gt;10
    Careers Reporting the Lowest Levels of Happiness&lt;/a&gt;
  (Andrea Moran, &lt;cite&gt;The Washington Post&lt;/cite&gt;, 2024) &lt;/li&gt;&lt;li&gt;
&lt;a href="https://www.popsci.com/technology/beatles-now-and-then-ai-listen/" target="_blank"&gt;Listen to ‘Now and Then’ by The Beatles, a ‘new’ song recorded using AI&lt;/a&gt;
  (Andrew Paul, &lt;cite&gt;Popular Science&lt;/cite&gt;, 2023) &lt;/li&gt;&lt;/ol&gt;</content></entry><entry><title>Predicting Every Election Since 1916</title><summary>Predicting Every Election Since 1916 in just 91 lines of C++ code, and how "election pundit predictions" betray a misunderstanding of probability</summary><link href="https://www.harysdalvi.com/blog/election-prediction/" rel="alternate"/><id>https://www.harysdalvi.com/blog/election-prediction/</id><published>2024-11-08T16:38:07Z</published><updated>2024-11-08T16:38:07Z</updated><content type="html" xml:lang="en">&lt;p&gt;
  In just 91 lines of C++ code, I perfectly predicted every United States presidential election since 1916. That's 28 straight elections, counting the most recent one in 2024.
&lt;/p&gt;
&lt;p&gt;
  The crazy part is I didn't rely on any complicated polling data trends, voter sentiment, or policy analysis to make these predictions. I just used basic principles of probability.
&lt;/p&gt;
&lt;figure class="middleimg"&gt;
&lt;img alt="" class="white-diagram" src="https://www.harysdalvi.com/blog/election-prediction/img/ElectoralCollege1916.svg"/&gt;
&lt;figcaption&gt;The US presidential election results in 1916. Public domain. By &lt;a href="https://en.wikipedia.org/wiki/File:ElectoralCollege1916.svg" target="_blank"&gt;AndyHogan14, Wikimedia&lt;/a&gt;.&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;
  Alright, I'll admit I cheated a little. But arguably not much more than the political pundits who claim to have predicted every election since, say, 1980.
&lt;/p&gt;
&lt;p&gt;
  Every election cycle, you see stories on the news of someone who has correctly predicted every election in however many years. Most recently, I saw stories about
  &lt;a href="https://medium.com/r/?url=https%3A%2F%2Fen.wikipedia.org%2Fwiki%2FAllan_Lichtman" target="_blank"&gt;Allan Lichtman&lt;/a&gt;, who correctly predicted most of the 10 elections from 1984 through 2020. His system, called the “13 Keys”, consists of 13 true/false questions used to predict the winner of the
  election.[&lt;a href="https://www.harysdalvi.com/blog/election-prediction/#reflist"&gt;1&lt;/a&gt;]
&lt;/p&gt;
&lt;p&gt;
  But then Allan Lichtman got the 2024 election wrong. Does this cast doubt upon election pundits who claim to have sophisticated election prediction systems?
&lt;/p&gt;
&lt;p&gt;
  In this article, I'm going to show you how you, too, can &lt;strong&gt;predict every single election in over 100 years.&lt;/strong&gt; You can do this with a very simple deterministic system that requires even less information than the 13 keys, and yet is more accurate, as long as you're willing to be &lt;strong&gt;fooled by statistics!&lt;/strong&gt;
&lt;/p&gt;
&lt;p&gt;
  I'll also explain why, mathematically, the seemingly insightful achievement of predicting election results actually means very little.
&lt;/p&gt;
&lt;h2&gt;How is This Possible?&lt;/h2&gt;
&lt;p&gt;
  How is it possible to predict every single election since 1916? Surely it couldn't happen by random chance. After all, there have been 28 elections since 1916, inclusive. Each one has had at least 2 major candidates, and a few of them actually had 3. So the probability of guessing all 28 elections correctly purely by chance is less than &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-1-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;/&lt;/mo&gt;&lt;/mrow&gt;&lt;msup&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;28&lt;/mn&gt;&lt;/mrow&gt;&lt;/msup&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, which is about 1 in 300 million.
&lt;/p&gt;
&lt;p&gt;
  But wait: 300 million? That's a familiar number: the population of the United States is a little over 300 million. So &lt;strong&gt;if everyone in the United States guessed the election results at random&lt;/strong&gt; for every election since 1916, we would expect about one of them to guess every single outcome correctly. This person would be praised by the country as a masterful political pundit, and everyone would eagerly await their prediction for the next election…
  &lt;strong&gt;even though it would have only a 1/2 chance of being correct!&lt;/strong&gt;
&lt;/p&gt;
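&lt;p&gt;As a quick sanity check on this argument, here is a minimal Python sketch of the arithmetic. It assumes a round population of 300 million and a fair two-way guess in every election; the variable names are illustrative and are not taken from the C++ program mentioned above.&lt;/p&gt;

```python
# Assumption: a round population figure and a fair coin flip per election.
POPULATION = 300_000_000
NUM_ELECTIONS = 28  # 1916 through 2024, inclusive

# Number of equally likely guess sequences when each election is a two-way pick.
outcomes = 2 ** NUM_ELECTIONS

# Expected number of people whose random guesses are all correct.
expected_perfect = POPULATION / outcomes

print(f"1 in {outcomes:,} chance per person")
print(f"expected perfect guessers: {expected_perfect:.2f}")
```

&lt;p&gt;Under these assumptions there are 268,435,456 possible guess sequences, so the expected number of perfect guessers comes out to roughly 1.1.&lt;/p&gt;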
&lt;p&gt;
  Of course, very few, if any, Americans today have been alive to predict elections since 1916. And few Americans make public election predictions for the world to judge. So let's try an argument with slightly more realistic numbers.
&lt;/p&gt;
&lt;p&gt;
  Let's say there are 2000 Americans who are potentially in the business of predicting elections, and who are of age to have seen all elections from 1980 through 2024 (that's &lt;strong&gt;12 elections&lt;/strong&gt;). Each one has some kind of a system based on polling data, economic trends, and other factors, giving them a &lt;strong&gt;60% chance&lt;/strong&gt;
  of being correct in any given election. Then the chance that &lt;strong&gt;any given predictor&lt;/strong&gt; gets all 12 elections correct is &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-2-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mn&gt;0.6&lt;/mn&gt;&lt;msup&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;12&lt;/mn&gt;&lt;/mrow&gt;&lt;/msup&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, or about 0.2%.
  The chance that &lt;strong&gt;at least one predictor&lt;/strong&gt; of the 2000 gets all 12 elections right is &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-3-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mn&gt;0.002&lt;/mn&gt;&lt;msup&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;2000&lt;/mn&gt;&lt;/mrow&gt;&lt;/msup&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, or about 98.7% when the unrounded per-predictor probability is used!
&lt;/p&gt;
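&lt;p&gt;The calculation above can be reproduced in a few lines. This is only a sketch in Python under the same assumptions as the text (2000 independent predictors, each 60% accurate, over 12 elections); the constant and variable names are made up for illustration.&lt;/p&gt;

```python
NUM_PREDICTORS = 2000   # assumed pool of pundits, as in the text
ACCURACY = 0.6          # assumed per-election accuracy
NUM_ELECTIONS = 12

# Probability that one given predictor is right every time.
p_perfect = ACCURACY ** NUM_ELECTIONS

# Probability that at least one of the independent predictors is perfect:
# the complement of "every predictor misses at least once".
p_any_perfect = 1 - (1 - p_perfect) ** NUM_PREDICTORS

# The same formula with the per-predictor probability rounded to 0.002.
p_any_rounded = 1 - (1 - 0.002) ** NUM_PREDICTORS

print(f"one predictor perfect: {p_perfect:.4%}")
print(f"at least one perfect:  {p_any_perfect:.1%}")
print(f"with rounded 0.002:    {p_any_rounded:.1%}")
```

&lt;p&gt;The unrounded per-predictor probability is about 0.218%, which gives roughly 98.7% for at least one perfect record. Plugging in the rounded 0.002 instead gives roughly 98.2%, so the rounding shifts the headline figure slightly without changing the conclusion.&lt;/p&gt;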
&lt;p&gt;
  If we allow more than 2000 predictors, or more than 60% accuracy, this probability gets even higher.
&lt;/p&gt;
&lt;p&gt;
  This assumes that all predictors are &lt;strong&gt;independent&lt;/strong&gt;, which certainly isn't the case: all of them use much of the same underlying data. But even without the independence of predictors, 98.7% odds with just 2000 predictors is a high number. This indicates that it's quite possible for someone to be right on almost all elections, despite not having a very accurate underlying model.
&lt;/p&gt;
&lt;h2&gt;How Likely is This?&lt;/h2&gt;
&lt;p&gt;
  Let's look deeper into this model of everyone in America guessing randomly.
&lt;/p&gt;
&lt;p&gt;
  In just one election, you have a 1/2 chance of being right. As you increase the number of elections, your chance of being right on all of them drops off exponentially. But your chance of being right on &lt;em&gt;many&lt;/em&gt; or even &lt;em&gt;most&lt;/em&gt; of them remains fairly high for quite a while after.
&lt;/p&gt;
&lt;p&gt;
  In general, the probability of getting &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-4-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; out of &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-5-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; elections right by guessing randomly is given by the binomial distribution:
  &lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-6-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;P&lt;/mi&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mrow&gt;&lt;mstyle scriptlevel="0"&gt;&lt;mrow class="MJX-TeXAtom-OPEN"&gt;&lt;mo maxsize="2.047em" minsize="2.047em"&gt;(&lt;/mo&gt;&lt;/mrow&gt;&lt;/mstyle&gt;&lt;mfrac linethickness="0"&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;/mfrac&gt;&lt;mstyle scriptlevel="0"&gt;&lt;mrow class="MJX-TeXAtom-CLOSE"&gt;&lt;mo maxsize="2.047em" minsize="2.047em"&gt;)&lt;/mo&gt;&lt;/mrow&gt;&lt;/mstyle&gt;&lt;/mrow&gt;&lt;msup&gt;&lt;mrow&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;mfrac&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/mfrac&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/mrow&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;/msup&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
  The &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-7-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mrow&gt;&lt;mstyle scriptlevel="0"&gt;&lt;mrow class="MJX-TeXAtom-OPEN"&gt;&lt;mo maxsize="1.2em" minsize="1.2em"&gt;(&lt;/mo&gt;&lt;/mrow&gt;&lt;/mstyle&gt;&lt;mfrac linethickness="0"&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;/mfrac&gt;&lt;mstyle scriptlevel="0"&gt;&lt;mrow class="MJX-TeXAtom-CLOSE"&gt;&lt;mo maxsize="1.2em" minsize="1.2em"&gt;)&lt;/mo&gt;&lt;/mrow&gt;&lt;/mstyle&gt;&lt;/mrow&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; factor keeps our probability high for medium numbers of &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-8-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;.
&lt;/p&gt;
&lt;figure class="bigimg"&gt;
&lt;img alt="Graph: probability of predicting elections by guessing randomly." src="https://www.harysdalvi.com/blog/election-prediction/img/random.png"/&gt;
&lt;/figure&gt;
&lt;p&gt;
  From the graph, once we reach &lt;strong&gt;12 elections&lt;/strong&gt; (1980–2024), we still have &lt;strong&gt;about a 1.6% chance of getting exactly 2 elections wrong&lt;/strong&gt; from guessing randomly. So this outcome is very much possible, especially when lots of people try to guess the elections, and when they do just a little better than guessing randomly. But eventually, with a large number of elections, you are almost guaranteed to get more than 5 wrong.
&lt;/p&gt;
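&lt;p&gt;If you want to check values like this without reading them off the graph, the binomial formula is a few lines of Python. This is just a sketch with helper names I made up, assuming a fair coin for every election.&lt;/p&gt;

```python
from math import comb

def p_exactly(k, n):
    """Probability of exactly k correct out of n fair-coin guesses."""
    return comb(n, k) / 2 ** n

def p_at_least(k, n):
    """Probability of k or more correct out of n fair-coin guesses."""
    return sum(p_exactly(j, n) for j in range(k, n + 1))

n = 12  # elections from 1980 through 2024
print(f"exactly 2 wrong: {p_exactly(n - 2, n):.2%}")
print(f"at most 2 wrong: {p_at_least(n - 2, n):.2%}")
```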
&lt;p&gt;
  We can expand this random guessing model to 300 million Americans, using
  &lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-9-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;P&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;300&lt;/mn&gt;&lt;mtext&gt;M&lt;/mtext&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mi&gt;P&lt;/mi&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;msup&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;300&lt;/mn&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;mn&gt;000&lt;/mn&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;mn&gt;000&lt;/mn&gt;&lt;/mrow&gt;&lt;/msup&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
  That is, &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-10-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;P&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;300&lt;/mn&gt;&lt;mtext&gt;M&lt;/mtext&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is the chance that at least one person out of 300 million guesses exactly &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-11-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; elections correctly.
&lt;/p&gt;
&lt;figure class="bigimg"&gt;
&lt;img alt="Graph: probability of at least one person out of 300 million predicting elections." src="https://www.harysdalvi.com/blog/election-prediction/img/300m.png"/&gt;
&lt;/figure&gt;
&lt;p&gt;
  All the way up to about 30 elections, &lt;strong&gt;there's a decent chance that someone will guess every single one correctly, just randomly!&lt;/strong&gt; And we have decent numbers all the way into the 50s, where we might get just 5 elections wrong. Of course, past 5 elections, it's almost certain that someone of the 300 million gets more than 5 elections wrong.
&lt;/p&gt;
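&lt;p&gt;The same idea can be sketched for the 300 million guessers. The function name and the election counts in the loop are my own illustrative choices, and the model still assumes everyone guesses independently with a fair coin.&lt;/p&gt;

```python
import math

POPULATION = 300_000_000

def p_someone(k, n, population=POPULATION):
    """Chance that at least one of `population` random guessers gets exactly k of n right."""
    p = math.comb(n, k) / 2 ** n
    # 1 - (1 - p)^population, written with log1p/expm1 because p can be tiny
    return -math.expm1(population * math.log1p(-p))

for n in (12, 28, 30, 40, 55):
    print(n, f"all right: {p_someone(n, n):.3f}", f"5 wrong: {p_someone(n - 5, n):.3f}")
```

&lt;p&gt;Raising a number very close to 1 to the 300 millionth power directly loses precision, which is why the sketch goes through logarithms.&lt;/p&gt;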
&lt;h2&gt;Predicting Every Election Since 1916&lt;/h2&gt;
&lt;p&gt;Now it's time to predict every single election since 1916. The algorithm is very simple:
  &lt;/p&gt;&lt;ul&gt;
&lt;li&gt;For each election, determine who the &lt;strong&gt;top contenders&lt;/strong&gt; are. Generally there are only two top contenders in a US election; a few times, there have been three, four, or even one. I assume it's easy to figure this out.&lt;/li&gt;
&lt;li&gt;Sort the contenders in &lt;strong&gt;alphabetical order by last name.&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Flip a coin.&lt;/strong&gt; Guess heads for the first contender, tails for the second. (Or maybe use a three-sided die for 3 contenders, etc.)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;And that's basically it.&lt;/p&gt;
&lt;p&gt;
  But there's one key thing about the coin. It can't be a physical coin. You have to use a &lt;strong&gt;pseudorandom number generator&lt;/strong&gt; in a computer.
&lt;/p&gt;
&lt;p&gt;
  In fact, you have to use C/C++ random number generation. Seed it with the &lt;strong&gt;random seed&lt;/strong&gt; 824050438, and then start picking random values. (Use modulus on each random value to pick the actual candidate.) If you go and check this algorithm with this seed, you'll be amazed to find that
  &lt;strong&gt;you can predict every single election from 1916 to 2024 correctly!&lt;/strong&gt;
&lt;/p&gt;
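&lt;p&gt;To make the steps concrete, here is a rough Python sketch of the sort, draw, and modulus procedure. It is only an illustration: Python's generator is not the C/C++ one, so the seed 824050438 will not reproduce the perfect record here, and the function name, the 31-bit draw, and the seed 12345 are all my own placeholders.&lt;/p&gt;

```python
import random

def predict(contenders_by_year, seed):
    """Pick one contender per election from a seeded pseudorandom stream."""
    rng = random.Random(seed)  # Mersenne Twister, NOT the C/C++ generator from the post
    picks = {}
    for year, contenders in contenders_by_year:
        names = sorted(contenders)              # alphabetical by last name
        draw = rng.getrandbits(31)              # stand-in for one raw random value
        picks[year] = names[draw % len(names)]  # modulus selects the contender
    return picks

sample = [(2016, ["Trump", "Clinton"]), (2020, ["Biden", "Trump"]), (2024, ["Trump", "Harris"])]
print(predict(sample, seed=12345))  # 12345 is an arbitrary illustrative seed
```

&lt;p&gt;The important property is determinism: the same seed and the same list of contenders always give the same picks.&lt;/p&gt;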
&lt;p&gt;
  But wait, isn't that cheating?
&lt;/p&gt;
&lt;p&gt;
  Yes, choosing a random seed that I know works perfectly is cheating. But hardly more so than having many people predict the elections and crowning a political pundit only when &lt;em&gt;at least one&lt;/em&gt; gets most of the elections right, just as I declare an optimal seed only when &lt;em&gt;at least one&lt;/em&gt; gets all the elections right. It's just a matter of cheating at the individual level versus the societal level.
&lt;/p&gt;
&lt;p&gt;
  Let's make a toy model in Python. You can find the full code, as well as the more efficient C++ version, &lt;a href="https://github.com/crackalamoo/blog-demos/tree/main/2024/election-prediction" target="_blank"&gt;on GitHub&lt;/a&gt;.
&lt;/p&gt;
&lt;p&gt;
  First we set up and preprocess our dataset. In this case, it's the list of all main contenders in US elections, and who the winner was in each case.
&lt;/p&gt;
&lt;pre&gt;&lt;code class="lang-py hljs python"&gt;elections = [ &lt;span class="hljs-comment"&gt;# list the winner first&lt;/span&gt;
  [&lt;span class="hljs-number"&gt;1789&lt;/span&gt;, [&lt;span class="hljs-string"&gt;"Washington"&lt;/span&gt;]],
  [&lt;span class="hljs-number"&gt;1792&lt;/span&gt;, [&lt;span class="hljs-string"&gt;"Washington"&lt;/span&gt;]],
  [&lt;span class="hljs-number"&gt;1796&lt;/span&gt;, [&lt;span class="hljs-string"&gt;"Adams"&lt;/span&gt;, &lt;span class="hljs-string"&gt;"Jefferson"&lt;/span&gt;]],
  [&lt;span class="hljs-number"&gt;1800&lt;/span&gt;, [&lt;span class="hljs-string"&gt;"Jefferson"&lt;/span&gt;, &lt;span class="hljs-string"&gt;"Adams"&lt;/span&gt;]],
  [&lt;span class="hljs-number"&gt;1804&lt;/span&gt;, [&lt;span class="hljs-string"&gt;"Jefferson"&lt;/span&gt;, &lt;span class="hljs-string"&gt;"Cotesworth"&lt;/span&gt;]],
  ...
  [&lt;span class="hljs-number"&gt;1856&lt;/span&gt;, [&lt;span class="hljs-string"&gt;"Buchanan"&lt;/span&gt;, &lt;span class="hljs-string"&gt;"Frémont"&lt;/span&gt;, &lt;span class="hljs-string"&gt;"Fillmore"&lt;/span&gt;]],
  [&lt;span class="hljs-number"&gt;1860&lt;/span&gt;, [&lt;span class="hljs-string"&gt;"Lincoln"&lt;/span&gt;, &lt;span class="hljs-string"&gt;"Breckinridge"&lt;/span&gt;, &lt;span class="hljs-string"&gt;"Bell"&lt;/span&gt;, &lt;span class="hljs-string"&gt;"Douglas"&lt;/span&gt;]],
  [&lt;span class="hljs-number"&gt;1864&lt;/span&gt;, [&lt;span class="hljs-string"&gt;"Lincoln"&lt;/span&gt;, &lt;span class="hljs-string"&gt;"McClellan"&lt;/span&gt;]],
  [&lt;span class="hljs-number"&gt;1868&lt;/span&gt;, [&lt;span class="hljs-string"&gt;"Grant"&lt;/span&gt;, &lt;span class="hljs-string"&gt;"Seymour"&lt;/span&gt;]],
  ...
  [&lt;span class="hljs-number"&gt;1996&lt;/span&gt;, [&lt;span class="hljs-string"&gt;"Clinton"&lt;/span&gt;, &lt;span class="hljs-string"&gt;"Dole"&lt;/span&gt;]],
  [&lt;span class="hljs-number"&gt;2000&lt;/span&gt;, [&lt;span class="hljs-string"&gt;"Bush"&lt;/span&gt;, &lt;span class="hljs-string"&gt;"Gore"&lt;/span&gt;]],
  [&lt;span class="hljs-number"&gt;2004&lt;/span&gt;, [&lt;span class="hljs-string"&gt;"Bush"&lt;/span&gt;, &lt;span class="hljs-string"&gt;"Kerry"&lt;/span&gt;]],
  [&lt;span class="hljs-number"&gt;2008&lt;/span&gt;, [&lt;span class="hljs-string"&gt;"Obama"&lt;/span&gt;, &lt;span class="hljs-string"&gt;"McCain"&lt;/span&gt;]],
  [&lt;span class="hljs-number"&gt;2012&lt;/span&gt;, [&lt;span class="hljs-string"&gt;"Obama"&lt;/span&gt;, &lt;span class="hljs-string"&gt;"Romney"&lt;/span&gt;]],
  [&lt;span class="hljs-number"&gt;2016&lt;/span&gt;, [&lt;span class="hljs-string"&gt;"Trump"&lt;/span&gt;, &lt;span class="hljs-string"&gt;"Clinton"&lt;/span&gt;]],
  [&lt;span class="hljs-number"&gt;2020&lt;/span&gt;, [&lt;span class="hljs-string"&gt;"Biden"&lt;/span&gt;, &lt;span class="hljs-string"&gt;"Trump"&lt;/span&gt;]],
  [&lt;span class="hljs-number"&gt;2024&lt;/span&gt;, [&lt;span class="hljs-string"&gt;"Trump"&lt;/span&gt;, &lt;span class="hljs-string"&gt;"Harris"&lt;/span&gt;]]
]

&lt;span class="hljs-keyword"&gt;for&lt;/span&gt; e &lt;span class="hljs-keyword"&gt;in&lt;/span&gt; elections: &lt;span class="hljs-comment"&gt;# preprocessing&lt;/span&gt;
  sorted_names = &lt;span class="hljs-built_in"&gt;sorted&lt;/span&gt;(e[&lt;span class="hljs-number"&gt;1&lt;/span&gt;]) &lt;span class="hljs-comment"&gt;# sort alphabetically&lt;/span&gt;
  result = sorted_names.index(e[&lt;span class="hljs-number"&gt;1&lt;/span&gt;][&lt;span class="hljs-number"&gt;0&lt;/span&gt;]) &lt;span class="hljs-comment"&gt;# index of the winner, in alphabetical order&lt;/span&gt;
  e.append(&lt;span class="hljs-built_in"&gt;len&lt;/span&gt;(sorted_names))
  e.append(result)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now let's simulate randomly guessing elections 1 million times.&lt;/p&gt;
&lt;pre&gt;&lt;code class="lang-py hljs python"&gt;&lt;span class="hljs-keyword"&gt;import&lt;/span&gt; random

TRIALS = &lt;span class="hljs-number"&gt;1e6&lt;/span&gt; &lt;span class="hljs-comment"&gt;# 1 million&lt;/span&gt;

&lt;span class="hljs-function"&gt;&lt;span class="hljs-keyword"&gt;def&lt;/span&gt; &lt;span class="hljs-title"&gt;simulate_elections&lt;/span&gt;(&lt;span class="hljs-params"&gt;seed&lt;/span&gt;):&lt;/span&gt;
    &lt;span class="hljs-comment"&gt;# guess randomly using a given seed for all elections&lt;/span&gt;
    random.seed(seed)
    correct = &lt;span class="hljs-number"&gt;0&lt;/span&gt;
    &lt;span class="hljs-keyword"&gt;for&lt;/span&gt; j &lt;span class="hljs-keyword"&gt;in&lt;/span&gt; &lt;span class="hljs-built_in"&gt;range&lt;/span&gt;(&lt;span class="hljs-built_in"&gt;len&lt;/span&gt;(elections)):
        result = random.randint(&lt;span class="hljs-number"&gt;0&lt;/span&gt;, elections[j][&lt;span class="hljs-number"&gt;2&lt;/span&gt;]-&lt;span class="hljs-number"&gt;1&lt;/span&gt;)
        &lt;span class="hljs-keyword"&gt;if&lt;/span&gt; result == elections[j][&lt;span class="hljs-number"&gt;3&lt;/span&gt;]:
            correct += &lt;span class="hljs-number"&gt;1&lt;/span&gt;
    &lt;span class="hljs-keyword"&gt;return&lt;/span&gt; correct

max_correct = &lt;span class="hljs-number"&gt;0&lt;/span&gt;
best_seed = -&lt;span class="hljs-number"&gt;1&lt;/span&gt;

&lt;span class="hljs-keyword"&gt;for&lt;/span&gt; i &lt;span class="hljs-keyword"&gt;in&lt;/span&gt; &lt;span class="hljs-built_in"&gt;range&lt;/span&gt;(&lt;span class="hljs-built_in"&gt;int&lt;/span&gt;(TRIALS)):
    correct = simulate_elections(i)
    &lt;span class="hljs-keyword"&gt;if&lt;/span&gt; correct &amp;gt;= max_correct:
        max_correct = correct
        best_seed = i

&lt;span class="hljs-built_in"&gt;print&lt;/span&gt;(&lt;span class="hljs-string"&gt;f"&lt;span class="hljs-subst"&gt;{max_correct}&lt;/span&gt;/&lt;span class="hljs-subst"&gt;{&lt;span class="hljs-built_in"&gt;len&lt;/span&gt;(elections)}&lt;/span&gt;"&lt;/span&gt;)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This code runs in 20 seconds. The best seed comes out to 824728, with 48/60 elections correct. But can we do better? &lt;strong&gt;Can we get every single election correct?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;We'll start by limiting ourselves to the last 28 elections (1916–2024). The code now runs in 13 seconds and gets 26/28 elections correct with the seed 787252. Getting better!&lt;/p&gt;
&lt;p&gt;In order to improve from here, we need an improvement in processing power. My &lt;a href="https://github.com/crackalamoo/blog-demos/blob/main/2024/election-prediction/election.cpp" target="_blank"&gt;C++ code&lt;/a&gt;,
  which I won't include here, runs on essentially the same principle but adds &lt;strong&gt;multithreading&lt;/strong&gt;.
  This allows me to run 3000 simulations on our dataset in parallel, speeding up this process tremendously.&lt;/p&gt;
&lt;p&gt;In C++, I manage to get &lt;strong&gt;28/28 elections correct&lt;/strong&gt; using the seed 824050438, which takes &lt;strong&gt;20 seconds to find&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Remember 20 seconds is just the time to discover this seed. Once we have the seed, we can technically compute election results almost instantly without knowing the results in advance! All we need is the list of top contenders in each election. We stuff in our seed and all the results will fall out perfectly.&lt;/p&gt;
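&lt;p&gt;For a rough sense of how many seeds a search like this has to try, here is a back-of-the-envelope sketch. It assumes each seed behaves like an independent string of fair guesses and, as a simplification of my own, that all 28 elections are two-way races.&lt;/p&gt;

```python
def expected_seeds(contender_counts):
    """Expected number of seeds to try before one predicts every election."""
    p_perfect = 1.0
    for count in contender_counts:
        p_perfect /= count  # each election is guessed right with probability 1/count
    return 1.0 / p_perfect  # mean of a geometric distribution

two_way = [2] * 28  # assumption: every election from 1916 to 2024 is a two-way race
print(f"{expected_seeds(two_way):,.0f} seeds on average")
```

&lt;p&gt;Under these assumptions the answer is a few hundred million seeds, which is in the same ballpark as the seed 824050438. Any election with three or more contenders would push the expected count higher.&lt;/p&gt;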
&lt;p&gt;So there you have it: &lt;strong&gt;a deterministic algorithm to perfectly predict every US presidential election since 1916!&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;
  This kind of accuracy is a crystal ball, the likes of which has not been seen in any election predictor in American history. Given this immense level of insight, you might be wondering &lt;strong&gt;who will win the 2028 US presidential election&lt;/strong&gt;.
  Assuming a race between a Democrat and a Republican in 2028, the magic random seed 824050438 predicts… &lt;strong&gt;whichever candidate's last name comes first in alphabetical order&lt;/strong&gt;. You heard it here first. Don't be surprised if I'm right!
&lt;/p&gt;
&lt;h2&gt;Takeaways for a Scientist&lt;/h2&gt;
&lt;p&gt;What's the takeaway of this experiment in a scientific context, especially data science?&lt;/p&gt;
&lt;p&gt;At first, my takeaway was not to extrapolate past model performance to future performance. After all, hindsight is 20/20. See this relevant XKCD: “Electoral Precedent”.&lt;/p&gt;
&lt;figure class="endfloat"&gt;
&lt;img alt="" src="https://www.harysdalvi.com/blog/election-prediction/img/electoral_precedent_2x.png"/&gt;
&lt;figcaption&gt;&lt;a href="https://xkcd.com/1122/" target="_blank"&gt;XKCD 1122&lt;/a&gt;: Electoral Precedent. By Randall Munroe. &lt;a href="https://creativecommons.org/licenses/by-nc/2.5/" target="_blank"&gt;CC BY-NC 2.5&lt;/a&gt; &lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;But I don't think that's exactly what we should take away from this. If a model does well on 2000 cat versus dog predictions, I think it's a safe bet that it'll also do quite well on the next 50, even if the future data has some important differences.&lt;/p&gt;
&lt;p&gt;Instead, I think the more relevant insight here pertains to &lt;strong&gt;extrapolating model performance from small datasets&lt;/strong&gt;. 
  When a model has done well on a small dataset, we don't have enough evidence to predict its future performance. The US presidential election dataset is quite small: there have been only 60 elections as of 2024. Most well-known election predictors try their hand at only around 10, and even then imperfectly!&lt;/p&gt;
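&lt;p&gt;One way to see how little a small dataset tells us is to put a confidence interval around the observed accuracy. The sketch below uses the Wilson score interval, which is my choice of method rather than anything from the analysis above, and the two records (10 of 12 and 1667 of 2000) are made-up examples with nearly the same hit rate.&lt;/p&gt;

```python
import math

def wilson_interval(correct, total, z=1.96):
    """Approximate 95% Wilson score interval for a model's true accuracy."""
    p_hat = correct / total
    denom = 1 + z * z / total
    center = (p_hat + z * z / (2 * total)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / total + z * z / (4 * total * total)) / denom
    return center - half, center + half

for correct, total in [(10, 12), (1667, 2000)]:
    low, high = wilson_interval(correct, total)
    print(f"{correct}/{total}: accuracy between {low:.2f} and {high:.2f}")
```

&lt;p&gt;With only 12 predictions the plausible range for the true accuracy is very wide, while with 2000 it is only a few percentage points across.&lt;/p&gt;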
&lt;p&gt;Another takeaway is &lt;strong&gt;always use a baseline before trusting your metrics.&lt;/strong&gt; Without at least a random chance baseline, and ideally a simple model to compare against, good performance doesn't tell you that you're doing something right. This is a common mistake in machine learning, where people tend to build deep learning models for simple datasets that work quite well, but ironically still do worse than linear regression.&lt;/p&gt;
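&lt;p&gt;As a minimal sketch of what such a baseline check could look like, the snippet below scores a model against two trivial predictors: uniform random guessing and always picking the most common label. The labels and model predictions are made up purely for illustration, and the helper names are my own.&lt;/p&gt;

```python
import random
from collections import Counter

def accuracy(predictions, labels):
    """Fraction of predictions that match the labels."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)

def baselines(labels, seed=0):
    """Accuracy of two trivial predictors: uniform random guessing and always-majority."""
    rng = random.Random(seed)
    classes = sorted(set(labels))
    random_guess = [rng.choice(classes) for _ in labels]
    majority = Counter(labels).most_common(1)[0][0]
    return {
        "random": accuracy(random_guess, labels),
        "majority": accuracy([majority] * len(labels), labels),
    }

labels = ["D"] * 7 + ["R"] * 5       # made-up outcomes, for illustration only
model_preds = ["D"] * 8 + ["R"] * 4  # made-up model output
print("model:", accuracy(model_preds, labels), "baselines:", baselines(labels))
```

&lt;p&gt;A model only deserves attention once it clearly beats both numbers, and on a dataset this small even that comparison is noisy.&lt;/p&gt;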
&lt;p&gt;And how about the takeaway in a &lt;strong&gt;political context&lt;/strong&gt;? I'm not saying that these political analysis models are completely baseless, like a random number prediction based on the candidates' last names. I'm sure they have better than 50% odds because they genuinely take important information into account.&lt;/p&gt;
&lt;p&gt;But I am saying that we should be skeptical when we hear claims of any one person or method being able to consistently predict election results—especially if they get a few wrong, because the probability of getting most but not all correct by pure chance is significant. We should evaluate the methodology further before assuming its accuracy.&lt;/p&gt;
&lt;p&gt;So my overall takeaway is that as a scientist, you should avoid extrapolating performance from small datasets, and always use a baseline before trusting your metrics. And as a citizen, &lt;strong&gt;don't believe everything the election pundits tell you&lt;/strong&gt;: for all you know, they could be flipping coins off camera!&lt;/p&gt;
&lt;hr/&gt;
&lt;h2&gt;Related Posts&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.harysdalvi.com/blog/mathematical-history"&gt;Mathematical History&lt;/a&gt; — Why we need more historians doing math&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="ref"&gt;References&lt;/h2&gt;
&lt;p&gt;The GitHub for this article, including figures, is at &lt;a href="https://github.com/crackalamoo/blog-demos/tree/main/2024/election-prediction" target="_blank"&gt;crackalamoo/blog-demos&lt;/a&gt;.&lt;/p&gt;
&lt;ol id="reflist"&gt;&lt;li&gt;&lt;a href="https://en.wikipedia.org/wiki/Allan_Lichtman" target="_blank"&gt;Allan Lichtman&lt;/a&gt; (Wikipedia)&lt;/li&gt;&lt;/ol&gt;</content></entry><entry><title>Can Transformers Solve Everything?</title><summary>Looking into the math and the data reveals that transformers are both overused and underused.</summary><link href="https://www.harysdalvi.com/blog/can-transformers-solve-everything/" rel="alternate"/><id>https://www.harysdalvi.com/blog/can-transformers-solve-everything/</id><published>2024-10-30T16:38:07Z</published><updated>2024-10-30T16:38:07Z</updated><content type="html" xml:lang="en">&lt;p&gt;Transformers are best known for their applications in natural language processing.
  They were originally designed for translating between languages,[&lt;a href="https://www.harysdalvi.com/blog/can-transformers-solve-everything/#reflist"&gt;1&lt;/a&gt;]
  and are now most famous for their use in large language models like ChatGPT (generative pretrained &lt;em&gt;transformer&lt;/em&gt;).
&lt;/p&gt;
&lt;p&gt;But since their introduction, transformers have been applied to ever more tasks,
  with great results.
  These include image recognition,[&lt;a href="https://www.harysdalvi.com/blog/can-transformers-solve-everything/#reflist"&gt;2&lt;/a&gt;]
  reinforcement learning,[&lt;a href="https://www.harysdalvi.com/blog/can-transformers-solve-everything/#reflist"&gt;3&lt;/a&gt;] and even
  weather prediction.[&lt;a href="https://www.harysdalvi.com/blog/can-transformers-solve-everything/#reflist"&gt;4&lt;/a&gt;]
&lt;/p&gt;
&lt;p&gt;Even the seemingly specific task of language generation with transformers
  has a number of surprises. Large language models
  have emergent properties
  that feel more intelligent than just predicting the next word. For example,
  they may know various facts about the world, or replicate nuances of
  a person's style of speech.
&lt;/p&gt;
&lt;p&gt;The success of transformers has led some people to ask whether transformers
  can do everything. If transformers generalize to so many tasks, is there any reason &lt;em&gt;not&lt;/em&gt; to use a transformer?
&lt;/p&gt;
&lt;p&gt;Clearly, there is still a case for other machine learning models
  and, as is often forgotten these days, non-machine learning models and human intellect.
  But transformers do have a number of unique properties, and have shown incredible results so far.
  There is also a considerable mathematical and empirical basis for why we should
  expect this success to continue.
&lt;/p&gt;
&lt;p&gt;The real question, then, isn't “can transformers solve everything?”
  Instead, it's “why shouldn't they solve everything?”
  There are a few reasons why not:
&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;There are &lt;strong&gt;computational constraints&lt;/strong&gt;: we might not have strong enough computers yet,
    or ever, to compute some things.&lt;/li&gt;
&lt;li&gt;Also, especially in a machine learning context,
    there are &lt;strong&gt;data constraints&lt;/strong&gt;: we might lack the dataset to feasibly train a model
    that is able to compute what we want.&lt;/li&gt;
&lt;li&gt;Finally, there are &lt;strong&gt;algorithmic constraints&lt;/strong&gt;:
    transformers might not be the right architecture or algorithm for what we want to do.
    We might even need to invent the right algorithm.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In this article we'll look at all these constraints, including a cool demo comparing
  transformers to the classical fast Fourier transform algorithm for time series.
&lt;/p&gt;
&lt;figure class="endfloat"&gt;
&lt;img alt="" src="https://www.harysdalvi.com/blog/can-transformers-solve-everything/img/one-ring.jpg"/&gt;
&lt;figcaption&gt;Are transformers the one architecture to rule them all?
    A depiction of J. R. R. Tolkien's One Ring from Peter Jackson's films. Image source:
    &lt;a href="https://commons.wikimedia.org/wiki/File:One_Ring_Blender_Render.png" target="_blank"&gt;Peter J. Yost / Wikimedia&lt;/a&gt;,
    &lt;a href="https://creativecommons.org/licenses/by-sa/4.0/deed.en"&gt;CC BY-SA 4.0&lt;/a&gt;.
  &lt;/figcaption&gt;
&lt;/figure&gt;
&lt;h2&gt;This Isn't the First Time&lt;/h2&gt;
&lt;p&gt;On Kaggle, XGBoost tends to win competitions on structured data while
  various kinds of neural networks dominate unstructured data competitions.[&lt;a href="https://www.harysdalvi.com/blog/can-transformers-solve-everything/#reflist"&gt;5&lt;/a&gt;]
  For quite a while now, people have been asking if model X is the model to end all models,
  a model that can solve everything.
&lt;/p&gt;
&lt;p&gt;Surprisingly, these wild claims are actually backed up by solid math.
  Most striking is the &lt;strong&gt;universal approximation theorem&lt;/strong&gt;,
  which states that neural networks can approximate &lt;em&gt;any&lt;/em&gt; continuous function
  to any degree of accuracy: you just need enough neurons and nonlinear activation
  functions. In fact, you can even do this in a single layer with enough
  neurons.[&lt;a href="https://www.harysdalvi.com/blog/can-transformers-solve-everything/#reflist"&gt;6&lt;/a&gt;] This idea goes back at least
  to 1989.[&lt;a href="https://www.harysdalvi.com/blog/can-transformers-solve-everything/#reflist"&gt;7&lt;/a&gt;]
&lt;/p&gt;
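&lt;p&gt;To get a feel for the single-layer claim, here is a small sketch in one dimension. Instead of training, it writes down the weights of one hidden ReLU layer by hand so that the network matches a target function at evenly spaced points. This hand construction is my own illustration, not the proof from the references, and the choice of sine on one period is arbitrary.&lt;/p&gt;

```python
import math

def relu(z):
    return max(0.0, z)

def build_network(f, lo, hi, hidden_units):
    """One hidden ReLU layer whose output is the piecewise-linear interpolant of f."""
    step = (hi - lo) / hidden_units
    knots = [lo + i * step for i in range(hidden_units + 1)]
    slopes = [(f(knots[i + 1]) - f(knots[i])) / step for i in range(hidden_units)]
    # Output weight of unit i is the change in slope at knot i
    weights = [slopes[0]] + [slopes[i] - slopes[i - 1] for i in range(1, hidden_units)]
    biases = knots[:-1]  # unit i computes relu(x - knot_i)
    offset = f(lo)

    def network(x):
        return offset + sum(w * relu(x - b) for w, b in zip(weights, biases))

    return network

def max_error(f, net, lo, hi, samples=2000):
    """Largest absolute gap between f and the network on an evenly spaced grid."""
    grid = [lo + (hi - lo) * i / samples for i in range(samples + 1)]
    return max(abs(f(x) - net(x)) for x in grid)

for units in (10, 100, 1000):
    net = build_network(math.sin, 0.0, 2 * math.pi, units)
    print(units, "hidden units, max error:", max_error(math.sin, net, 0.0, 2 * math.pi))
```

&lt;p&gt;The error shrinks as the hidden layer grows, which is the theorem in miniature: more neurons, better approximation. What the theorem does not say is how to find good weights by training, or how many neurons a harder function needs.&lt;/p&gt;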
&lt;p&gt;If we go back even further, we find the Church-Turing thesis from the 1930s. Informally, this is the idea that
  anything that can be computed can also be computed by a Turing machine.[&lt;a href="https://www.harysdalvi.com/blog/can-transformers-solve-everything/#reflist"&gt;8&lt;/a&gt;]
&lt;/p&gt;
&lt;p&gt;This holds up for transformers too. It turns out that subject to some constraints,
  transformers are theoretically capable of approximating any sequence-to-sequence function.[&lt;a href="https://www.harysdalvi.com/blog/can-transformers-solve-everything/#reflist"&gt;9&lt;/a&gt;]
  This means that with a Turing machine in Python and a neural network in PyTorch you can theoretically do any possible
  computation.
&lt;/p&gt;
&lt;p&gt;If that's the case, why haven't we computed everything yet?
&lt;/p&gt;
&lt;p&gt;This goes back to the constraints: &lt;strong&gt;computational constraints&lt;/strong&gt; (not enough computational power),
  &lt;strong&gt;data constraints&lt;/strong&gt; (not enough high quality data),
  and &lt;strong&gt;algorithmic constraints&lt;/strong&gt; (using the wrong algorithm for a problem).
&lt;/p&gt;
&lt;h2&gt;Scaling&lt;/h2&gt;
&lt;p&gt;Data, compute, and energy constraints on transformers, while not the same, are intimately tied together.
  Even if we have enough data to train a transformer model on a task, we might not have enough
  compute or energy to complete the training. All of these fall under
  the problem of &lt;strong&gt;scaling&lt;/strong&gt;.
&lt;/p&gt;
&lt;p&gt;As discussed before, &lt;em&gt;theoretically&lt;/em&gt; transformers can do many, many things. But getting transformers
  to do these things in practice generally requires scale. Often that's
  too expensive, and it's a better idea to use simpler models instead. Let's look at the future of
  scaling transformers to see exactly when transformers are a better option.
&lt;/p&gt;
&lt;h3&gt;Large Companies&lt;/h3&gt;
&lt;p&gt;ChatGPT isn't a crazy algorithmic jump over
  &lt;a href="https://www.harysdalvi.com/discord"&gt;GPT-2, which produced results like this&lt;/a&gt;.
  The difference is largely a difference of scale: more layers, more parameters, and more training.
&lt;/p&gt;
&lt;p&gt;Therefore, some argue that with enough scaling of transformers, we will reach a general transformer model
  that can do just about any thinking a human can: this would be &lt;strong&gt;artificial general intelligence (AGI)&lt;/strong&gt;.
  This could greatly reduce the need for other algorithms and, concerningly, possibly humans as well.
  We just need more data, more compute, and more time; or so the argument goes.
&lt;/p&gt;
&lt;p&gt;As for &lt;strong&gt;data&lt;/strong&gt;, we probably still have a while to go. It's estimated that large language models (LLMs)
  have only trained on 1/30 of all data on the web, which is massive, but leaves a lot of room to expand.
  Additionally, it turns out that training LLMs
  on data that they themselves generated (&lt;strong&gt;synthetic data&lt;/strong&gt;) can improve performance.
  So by letting LLMs improve themselves,
  we might have even more data than these figures would suggest.[&lt;a href="https://www.harysdalvi.com/blog/can-transformers-solve-everything/#reflist"&gt;10&lt;/a&gt;]
&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Compute&lt;/strong&gt; itself also likely won't be a limiting factor for the largest companies.
  GPUs currently take up only a small fraction of all wafer production at
  TSMC (Taiwan Semiconductor Manufacturing Company), meaning we have lots of raw materials left over.[&lt;a href="https://www.harysdalvi.com/blog/can-transformers-solve-everything/#reflist"&gt;10&lt;/a&gt;]
  As demand increases, GPU production can be scaled up with it.
&lt;/p&gt;
&lt;p&gt;Instead, &lt;strong&gt;energy&lt;/strong&gt; could be the bottleneck. If large transformer models scale 5000x,
  as is projected by 2030, a single training run is expected to demand
  6 gigawatts of power. This is both extremely expensive and bad for the environment.
  Companies are looking into nuclear power to get around this, but there are various obstacles,
  especially regulatory concerns.[&lt;a href="https://www.harysdalvi.com/blog/can-transformers-solve-everything/#reflist"&gt;10&lt;/a&gt;]&lt;/p&gt;
&lt;p&gt;Even when not training,
  running AI models will require significant power. Technologies such as Chain-of-Thought (CoT) reasoning
  in the new &lt;a href="https://openai.com/o1/" target="_blank"&gt;OpenAI o1 model&lt;/a&gt; could increase inference costs even further.
&lt;/p&gt;
&lt;h3&gt;Smaller Companies&lt;/h3&gt;
&lt;p&gt;Smaller companies likely won't have the resources to train LLMs from scratch,
  and they also won't need them. If a small company needs to access
  a powerful transformer model from a large company, they can either
  pay for queries or host an open source model locally.
&lt;/p&gt;
&lt;p&gt;So far, these large transformer models have mostly been LLMs.
  But in the future, if transformers really can solve everything,
  we might see similar models for other kinds of transformers.
&lt;/p&gt;
&lt;p&gt;
  Some domain-specific transformers might also be less intensive to train,
  so startups can build and sell their own in-house.
  Already we see companies like &lt;a href="https://www.atmo.ai" target="_blank"&gt;Atmo&lt;/a&gt;
  using deep learning for weather forecasting.
  While transformers for language modeling are very intensive
  and slow, Atmo's model is actually faster and more accurate than the corresponding
  atmospheric physics simulations. If transformers can generalize to a wide range of
  domains outside language, we might see startups not only using large companies'
  LLMs, but also building their own niche and innovative transformers.
&lt;/p&gt;
&lt;h2&gt;Domain-Specific Applications&lt;/h2&gt;
&lt;p&gt;Transformers can even do things we don't usually do with machine learning, like
  add and subtract numbers[&lt;a href="https://www.harysdalvi.com/blog/can-transformers-solve-everything/#reflist"&gt;11&lt;/a&gt;] and
  implement hashing algorithms.[&lt;a href="https://www.harysdalvi.com/blog/can-transformers-solve-everything/#reflist"&gt;12&lt;/a&gt;]
  If we use transformers to achieve some sort of AGI (whatever that means),
  then naturally we could do a wide range of tasks like these.
  Would this render other domain-specific models obsolete?
&lt;/p&gt;
&lt;p&gt;The transformer state of the art in arithmetic seems to be 99% accuracy on 100-digit numbers.[&lt;a href="https://www.harysdalvi.com/blog/can-transformers-solve-everything/#reflist"&gt;11&lt;/a&gt;]
  But there's an even better algorithm for adding and subtracting numbers.
  It takes minimal compute, requires no training data, can work with any size numbers, and has 100% accuracy.
  It's called... &lt;a href="https://en.wikipedia.org/wiki/Adder_(electronics)" target="_blank"&gt;adding digits and carrying the extras.&lt;/a&gt;
&lt;/p&gt;
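&lt;p&gt;To make that concrete, here is a minimal sketch of the schoolbook method on decimal strings. The function name is made up for illustration, and it assumes both inputs are non-negative integers written in base 10.&lt;/p&gt;

```python
def add_decimal_strings(a: str, b: str) -> str:
    """Add two non-negative integers given as decimal strings, digit by digit."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)  # pad with leading zeros to equal length
    digits = []
    carry = 0
    # Walk from the least significant digit to the most significant one.
    for da, db in zip(reversed(a), reversed(b)):
        carry, digit = divmod(int(da) + int(db) + carry, 10)
        digits.append(str(digit))
    if carry:
        digits.append(str(carry))
    return "".join(reversed(digits))


x = "9" * 100  # a 100-digit number
y = "1"
print(add_decimal_strings(x, y))  # prints 1 followed by 100 zeros
```

&lt;p&gt;No training data, no GPU, and the answer is exact for inputs of any length.&lt;/p&gt;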
&lt;p&gt;There are many domains where speed, interpretability, and 100% accuracy really matter,
  including arithmetic, cryptography, and mathematical proof verification.
  A transformer on its own will always fall short here, even if &lt;em&gt;technically&lt;/em&gt; it could produce a decent result.
&lt;/p&gt;
&lt;p&gt;So it seems these areas are safe from the influence of transformers, right?
&lt;/p&gt;
&lt;p&gt;Not quite. It's true that transformer models are only the best tool for the job in a certain subset
  of cases. However, one of these cases may be the task of &lt;strong&gt;determining when and where to
  carry out more traditional calculations!&lt;/strong&gt; For example, transformers are much better than other models at
  coding, so in theory they could simply write programs to solve tasks that transformers
  themselves are ill-suited for.
&lt;/p&gt;
&lt;p&gt;With the correct setup, they can do this in a collaborative feedback loop with more traditional tools.
  Consider Google DeepMind's &lt;a href="https://deepmind.google/discover/blog/ai-solves-imo-problems-at-silver-medal-level/" target="_blank"&gt;AlphaProof&lt;/a&gt;.
  This system combines a pretrained language model with &lt;a href="https://www.lean-lang.org" target="_blank"&gt;Lean&lt;/a&gt;,
  a proof assistant that can verify mathematical proofs.
  So rather than just stochastically spitting out a proof,
  the language model can make sure its proof is correct and adjust as necessary.
  Together with AlphaGeometry 2, AlphaProof reached silver-medal-level performance on the problems of the 2024 International Mathematical Olympiad,
  one of the most difficult and prestigious mathematics competitions.
&lt;/p&gt;
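&lt;p&gt;To illustrate the shape of such a feedback loop (and only the shape; this is not how AlphaProof works), here is a toy sketch. A random guesser stands in for the language model, and an exact arithmetic check stands in for the proof assistant. The task, integer square roots, and all function names are assumptions made up for illustration.&lt;/p&gt;

```python
import random


def verify(n: int, candidate: int) -> str:
    """Exact checker: is candidate the integer square root of n?"""
    if candidate * candidate > n:
        return "too high"
    if (candidate + 1) * (candidate + 1) > n:
        return "ok"
    return "too low"


def propose(low: int, high: int, rng: random.Random) -> int:
    """Stand-in for a stochastic model: guess anywhere in the plausible range."""
    return rng.randint(low, high)


def solve(n: int, seed: int = 0) -> tuple[int, int]:
    """Loop between proposer and verifier until the verifier accepts."""
    rng = random.Random(seed)
    low, high = 0, n
    attempts = 0
    while True:
        attempts += 1
        guess = propose(low, high, rng)
        feedback = verify(n, guess)
        if feedback == "ok":
            return guess, attempts
        # Use the verifier's feedback to narrow what the proposer may try next.
        if feedback == "too high":
            high = guess - 1
        else:
            low = guess + 1


root, attempts = solve(1_000_007)
print(root, attempts)
```

&lt;p&gt;The proposer is allowed to be wrong as often as it likes. The guarantee of correctness comes entirely from the exact checker, which is the role Lean plays for AlphaProof.&lt;/p&gt;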
&lt;h2&gt;Algorithmic Constraints&lt;/h2&gt;
&lt;p&gt;What do we mean by algorithmic constraints? In general, this is the idea that a transformer
  trained on some data might not be the best algorithm we have to solve a given problem.
&lt;/p&gt;
&lt;p&gt;In fact, this might be a computational constraint in disguise: maybe a transformer can
  technically solve a problem, but the amount of data and compute required is far more
  than with a more specialized algorithm. Let's take a look at one such case.
&lt;/p&gt;
&lt;h3&gt;Demonstration: Bad Algorithm means More Compute&lt;/h3&gt;
&lt;p&gt;We know transformers are expensive. But how much more expensive is a transformer, really?
  Let's test this out by simulating a noisy time series and using two methods to
  pick out the signal: the &lt;a href="https://en.wikipedia.org/wiki/Fast_Fourier_transform" target="_blank"&gt;fast Fourier transform (FFT)&lt;/a&gt;,
  a well-known tool for this job, and a transformer model.
&lt;/p&gt;
&lt;p&gt;We'll use 5000 total data points of the signal
  &lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-1-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;sin&lt;/mi&gt;&lt;mo&gt;⁡&lt;/mo&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mfrac&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mn&gt;5&lt;/mn&gt;&lt;/mfrac&gt;&lt;mi&gt;cos&lt;/mi&gt;&lt;mo&gt;⁡&lt;/mo&gt;&lt;mrow&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;mstyle displaystyle="false" scriptlevel="0"&gt;&lt;mfrac&gt;&lt;mn&gt;11&lt;/mn&gt;&lt;mn&gt;13&lt;/mn&gt;&lt;/mfrac&gt;&lt;/mstyle&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/mrow&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mfrac&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mn&gt;9&lt;/mn&gt;&lt;/mfrac&gt;&lt;mi&gt;sin&lt;/mi&gt;&lt;mo&gt;⁡&lt;/mo&gt;&lt;mrow&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;mstyle displaystyle="false" scriptlevel="0"&gt;&lt;mfrac&gt;&lt;mn&gt;17&lt;/mn&gt;&lt;mn&gt;37&lt;/mn&gt;&lt;/mfrac&gt;&lt;/mstyle&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mstyle displaystyle="false" scriptlevel="0"&gt;&lt;mfrac&gt;&lt;mi&gt;π&lt;/mi&gt;&lt;mn&gt;4&lt;/mn&gt;&lt;/mfrac&gt;&lt;/mstyle&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/mrow&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
  plus some Gaussian noise with standard deviation &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-2-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mfrac&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mn&gt;7&lt;/mn&gt;&lt;/mfrac&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;. We'll split this into 90% training data
  and 10% test data.
&lt;/p&gt;
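&lt;p&gt;As a rough sketch, the dataset could be generated like this. The sample spacing of 0.1, the fixed random seed, and the chronological split are my assumptions here, since the setup above doesn't specify them; the actual demo code is linked in the references.&lt;/p&gt;

```python
import math
import random


def clean_signal(x: float) -> float:
    """The noiseless signal: sin(x) + cos(11x/13)/5 + sin(17x/37 - pi/4)/9."""
    return (math.sin(x)
            + math.cos(11 / 13 * x) / 5
            + math.sin(17 / 37 * x - math.pi / 4) / 9)


N_POINTS = 5000
NOISE_STD = 1 / 7
STEP = 0.1               # assumed spacing between samples
rng = random.Random(0)   # fixed seed, chosen only for reproducibility

xs = [i * STEP for i in range(N_POINTS)]
ys = [clean_signal(x) + rng.gauss(0.0, NOISE_STD) for x in xs]

# Assumed chronological 90/10 split: the test set is the final stretch of the series.
split = N_POINTS * 9 // 10
train_x, test_x = xs[:split], xs[split:]
train_y, test_y = ys[:split], ys[split:]
print(len(train_y), len(test_y))  # 4500 500
```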
&lt;p&gt;Our transformer will be decoder-only, with an input size of 25 data points, a hidden dimension of 8,
  a feedforward dimension of 4, 1 attention head, and 1 layer. We'll train for 1 epoch
  with a batch size of 128, using the Adam optimizer with a learning rate of 0.1.
  Writing that out feels ridiculous
  for a simple time series task like extracting a signal, but here we are.
&lt;/p&gt;
&lt;figure class="middleimg"&gt;
&lt;img alt="The FFT matches the signal closely, while the transformer is almost a flat line." src="https://www.harysdalvi.com/blog/can-transformers-solve-everything/img/demo1.jpg"/&gt;
&lt;/figure&gt;
&lt;p&gt;Wow... the FFT did ok, but the transformer is absolutely horrendous!
  Let's try decreasing the learning rate to 0.01?
&lt;/p&gt;
&lt;figure class="middleimg"&gt;
&lt;img alt="The transformer almost starts oscillating with the signal but immediately goes flat again." src="https://www.harysdalvi.com/blog/can-transformers-solve-everything/img/demo2.jpg"/&gt;
&lt;/figure&gt;
&lt;p&gt;Getting a little better. Let's try training for 20 epochs instead of 1.
&lt;/p&gt;
&lt;figure class="middleimg"&gt;
&lt;img alt="The transformer is clearly oscillating with the signal but is slightly slow and muted." src="https://www.harysdalvi.com/blog/can-transformers-solve-everything/img/demo3.jpg"/&gt;
&lt;/figure&gt;
&lt;p&gt;The transformer's predicted frequency and amplitude are a little too low,
  and it's not as smooth as the FFT solution, but now
  at least it's got the spirit. As for quantitative performance,
  the FFT had a root mean square error of 0.24, while the transformer had 0.88.
  The FFT is doing much better, especially considering that our Gaussian noise has standard deviation 1/7,
  so we wouldn't expect any method to get below about 0.14.
&lt;/p&gt;
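&lt;p&gt;Where does that floor come from? Even a predictor that outputs the clean signal exactly is still scored against noisy data, so its error is the noise itself. A small sketch, using an arbitrary sine wave as the clean signal (the floor doesn't depend on its shape) and an arbitrary seed:&lt;/p&gt;

```python
import math
import random


def rmse(predictions, targets):
    """Root mean square error between two equal-length sequences."""
    squared = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return math.sqrt(sum(squared) / len(squared))


rng = random.Random(0)  # seed chosen only for reproducibility
clean = [math.sin(0.1 * i) for i in range(5000)]   # any clean signal works here
noisy = [c + rng.gauss(0.0, 1 / 7) for c in clean]

# A perfect predictor of the clean signal is still off by the noise.
floor = rmse(clean, noisy)
print(floor, 1 / 7)
```

&lt;p&gt;The two printed numbers should be close to each other, at roughly 0.14.&lt;/p&gt;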
&lt;p&gt;Of course, the FFT also wins on compute time.
  The FFT took just 0.11 seconds to compute on all training data,
  while the transformer took 18.9 seconds to train for all 20 epochs.
  All that for worse performance.
&lt;/p&gt;
&lt;p&gt;The FFT has another benefit too: interpretability.
  We can look inside and see the amplitudes of all the frequencies that the FFT picked up.
&lt;/p&gt;
&lt;figure class="middleimg"&gt;
&lt;img alt="The angular frequency spikes at 1.0, 11/13, and 17/37." src="https://www.harysdalvi.com/blog/can-transformers-solve-everything/img/fft-freqs.jpg"/&gt;
&lt;/figure&gt;
&lt;p&gt;If we look back to the real equation, this is pretty spot-on.
&lt;/p&gt;
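&lt;p&gt;Here is a rough, self-contained sketch of how such a spectrum can be read off. It is not the code behind the plot above. To avoid dependencies it uses a small hand-written radix-2 FFT (in practice you would use a library), and it assumes 4096 samples (a power of two) at a spacing of 0.1, an arbitrary seed, and an arbitrary amplitude threshold of 0.08 for what counts as a peak.&lt;/p&gt;

```python
import cmath
import math
import random


def fft(values):
    """Recursive radix-2 Cooley-Tukey FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2])
    odd = fft(values[1::2])
    half = n // 2
    out = [0j] * n
    for k in range(half):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + half] = even[k] - twiddle
    return out


def clean_signal(x):
    return (math.sin(x) + math.cos(11 / 13 * x) / 5
            + math.sin(17 / 37 * x - math.pi / 4) / 9)


N = 4096      # assumed: a power of two, so the simple FFT above applies
STEP = 0.1    # assumed sample spacing
rng = random.Random(0)
ys = [clean_signal(i * STEP) + rng.gauss(0.0, 1 / 7) for i in range(N)]

spectrum = fft(ys)
# Amplitude and angular frequency of each positive-frequency bin.
amplitudes = [2 * abs(spectrum[k]) / N for k in range(N // 2)]
omegas = [2 * math.pi * k / (N * STEP) for k in range(N // 2)]

# Keep local maxima that stand clearly above the noise.
THRESHOLD = 0.08
peaks = [(omegas[k], amplitudes[k]) for k in range(1, N // 2 - 1)
         if amplitudes[k] > THRESHOLD
         and amplitudes[k] >= amplitudes[k - 1]
         and amplitudes[k] >= amplitudes[k + 1]]
for omega, amp in peaks:
    print(f"omega = {omega:.3f}, amplitude = {amp:.3f}")
```

&lt;p&gt;Under these assumptions the frequency bins are about 0.015 apart, so the reported peaks should land within one bin of 17/37, 11/13, and 1. The amplitudes come out somewhat below 1/9, 1/5, and 1 because the true frequencies fall between bins.&lt;/p&gt;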
&lt;p&gt;In the real world you might not have a dataset like this where
  a traditional non-machine-learning method is obviously the perfect choice
  rather than machine learning or a transformer. Machine learning shines where
  &lt;strong&gt;we have the data, but we don't even know where to start when it comes to coding an algorithm.&lt;/strong&gt;
  And neural networks like transformers shine even more in cases of &lt;strong&gt;unstructured data&lt;/strong&gt;
  where we don't have good ideas to compute our own features.
  But hopefully this toy
  example demonstrates the universal approximation theorem,
  as well as why it isn't always a good guide in practice.
  To match or exceed the FFT performance with a transformer, we would need a lot more data and compute.
  Just because transformers
  &lt;em&gt;can&lt;/em&gt; do everything doesn't mean they &lt;em&gt;should&lt;/em&gt;.
&lt;/p&gt;
&lt;p&gt;Transformers aren't the final frontier either. There are exciting architectures like
  Mamba on the horizon that could one day replace transformers.[&lt;a href="https://www.harysdalvi.com/blog/can-transformers-solve-everything/#reflist"&gt;13&lt;/a&gt;] Most likely,
  they still won't replace traditional methods, for the same reasons transformers won't.
&lt;/p&gt;
&lt;h3&gt;Bad Data means Bad Algorithm&lt;/h3&gt;
&lt;p&gt;Just as algorithmic constraints are sometimes compute constraints in disguise,
  we can also look at some algorithmic constraints as being data constraints in disguise.
&lt;/p&gt;
&lt;p&gt;What I mean is this: if we want to solve a problem with transformers, or any sort of
  machine learning, we usually start with a dataset. But &lt;strong&gt;no matter how good
  we get at inference on the dataset&lt;/strong&gt;, and even if we have a cool machine learning algorithm at the end of it,
  &lt;strong&gt;we may not have actually solved the problem we had to begin with.&lt;/strong&gt;
  In other words, winning a &lt;a href="https://www.kaggle.com/competitions/rsna-breast-cancer-detection" target="_blank"&gt;Kaggle
  competition for cancer detection&lt;/a&gt; does not mean you cured cancer.
&lt;/p&gt;
&lt;p&gt;Going with the Kaggle example, why is this the case? The answer, of course, is that detecting cancer is
  just one small part of curing cancer.
  Because of the targets and labels of our dataset, our machine learning algorithm will always be an algorithm designed only to detect cancer,
  no matter how much we refine it with more data.
&lt;/p&gt;
&lt;p&gt;This is a problem not just with transformers but with all of machine learning.
  There's just one exception: if we were to train a &lt;strong&gt;general machine learning algorithm&lt;/strong&gt;
  that can take any input and produce an appropriate output, this would not apply.
  Our algorithm would be able to detect cancer &lt;em&gt;and&lt;/em&gt; cure it.
&lt;/p&gt;
&lt;p&gt;The question is whether this applies to LLMs. Is predicting the next word really a proxy for intelligence
  and creative thinking, even up to the level of curing cancer?
  We've already seen that the simple next-word-prediction task seems to capture
  some amount of intelligence.
  But it remains to be seen how far this will go.
&lt;/p&gt;
&lt;p&gt;The internet does contain some data that requires reasoning: you might see a sentence like
  “Socrates is a man. All men are mortal. Therefore...” and have the LLM trained to
  continue. Human fine-tuning can refine this capability further than basic syllogisms. But maybe
  solving advanced reasoning
  through next-word-prediction will turn out to be intractable, and something more is required.
&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;According to the &lt;strong&gt;universal approximation theorem&lt;/strong&gt;,
  a large enough neural network can approximate any continuous function on a bounded domain to any degree of accuracy.
  This means that in theory, yes, transformers can solve everything in time series.
  They might even work in other areas, like images.
&lt;/p&gt;
&lt;p&gt;However, they often come at a large &lt;strong&gt;computational cost&lt;/strong&gt;, and might require &lt;strong&gt;more data&lt;/strong&gt; than we have access to.
  Even in cases where a transformer could work, a traditional model often comes with both
  better performance and lower cost.
&lt;/p&gt;
&lt;p&gt;Transformers are still a powerful model. While they are mostly associated with LLMs,
  startups would do well to build more &lt;strong&gt;domain-specific transformers&lt;/strong&gt; too.
&lt;/p&gt;
&lt;p&gt;In general, given enough data, transformers or other neural networks will eventually do a good job
  matching their dataset.
  But when training a transformer, or any machine learning model,
  sometimes good performance on the dataset isn't really a success.
  You may have to ask &lt;strong&gt;whether solving this dataset is really solving a problem for people&lt;/strong&gt;.
  Just like solving a breast cancer dataset on Kaggle won't end breast cancer,
  it remains to be seen whether solving the next-word-prediction task
  will solve AGI. There are arguments both &lt;a href="https://www.secondbest.ca/p/why-agi-is-closer-than-you-think" target="_blank"&gt;for&lt;/a&gt; and
  &lt;a href="https://www.alignmentforum.org/posts/DvCLEkr9pXLnWikB8/some-arguments-against-strong-scaling" target="_blank"&gt;against&lt;/a&gt;
  the idea that it will.
&lt;/p&gt;
&lt;p&gt;Even if next-word-prediction doesn't solve AGI, it's at least a useful tool for coding.
  When combined with other systems, transformers can solve tasks more robustly,
  even mathematical proofs that require perfect rigor. But the &lt;strong&gt;use of a more traditional
  system is critical here.&lt;/strong&gt;
&lt;/p&gt;
&lt;p&gt;So yes, transformers can solve everything. But they probably shouldn't.
&lt;/p&gt;
&lt;hr/&gt;
&lt;h2&gt;Related Posts&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.harysdalvi.com/blog/ai-laundry"&gt;Why AI Can't Do Your Laundry&lt;/a&gt; — Too often, AI does what we humans want to do, and not what we can't or won't do.&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.harysdalvi.com/blog/llms-dont-predict-next-word"&gt;LLMs Do Not Predict the Next Word&lt;/a&gt; — RLHF forces us to view LLMs as agents in an environment, not just statistical models.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="ref"&gt;References&lt;/h2&gt;
&lt;p&gt;The code for the transformers vs FFT demo is on GitHub at
  &lt;a href="https://github.com/crackalamoo/blog-demos/tree/main/2024/transformers-vs-fft" target="_blank"&gt;crackalamoo/blog-demos&lt;/a&gt;.
&lt;/p&gt;
&lt;ol id="reflist"&gt;&lt;li&gt;&lt;a href="https://arxiv.org/pdf/1706.03762" target="_blank"&gt;Attention Is All You Need&lt;/a&gt;
  (Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser &amp;amp; Illia Polosukhin, 2017) &lt;/li&gt;&lt;li&gt;&lt;a href="https://arxiv.org/abs/2010.11929" target="_blank"&gt;An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale&lt;/a&gt;
  (Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer,
  Georg Heigold, Sylvain Gelly, Jakob Uszkoreit &amp;amp; Neil Houlsby, Google Brain, 2020) &lt;/li&gt;&lt;li&gt;&lt;a href="https://arxiv.org/abs/2106.01345" target="_blank"&gt;Decision Transformer: Reinforcement Learning via Sequence Modeling&lt;/a&gt;
  (Lili Chen, Kevin Lu, Aravind Rajeswaran, Kimin Lee,
  Aditya Grover, Michael Laskin, Pieter Abbeel, Aravind Srinivas &amp;amp; Igor Mordatch, 2021) &lt;/li&gt;&lt;li&gt;&lt;a href="https://arxiv.org/abs/2312.03876" target="_blank"&gt;Scaling transformer neural networks for skillful and reliable medium-range weather forecasting&lt;/a&gt;
  (Tung Nguyen, Rohan Shah, Hritik Bansal, Troy Arcomano, Sandeep Madireddy, Romit Maulik, Veerabhadra Kotamarthi, Ian Foster &amp;amp; Aditya Grover, 2023) &lt;/li&gt;&lt;li&gt;&lt;a href="https://www.kdnuggets.com/2015/12/harasymiv-lessons-kaggle-machine-learning.html" target="_blank"&gt;Lessons from 2 Million Machine Learning Models on Kaggle&lt;/a&gt;
  (Vasyl Harasymiv, KDnuggets, 2015) &lt;/li&gt;&lt;li&gt;&lt;a href="http://neuralnetworksanddeeplearning.com/chap4.html" target="_blank"&gt;A visual proof that neural nets can compute any function&lt;/a&gt;
  (Michael A. Nielsen, &lt;cite&gt;Neural Networks and Deep Learning&lt;/cite&gt;, 2019) &lt;/li&gt;&lt;li&gt;&lt;a href="https://www.sciencedirect.com/science/article/pii/0893608089900208" target="_blank"&gt;Multilayer feedforward networks are universal approximators&lt;/a&gt;
  (Kurt Hornik, Maxwell Stinchcombe &amp;amp; Halbert White, &lt;cite&gt;Neural Networks&lt;/cite&gt;, 1989) &lt;/li&gt;&lt;li&gt;&lt;a href="https://plato.stanford.edu/entries/church-turing/#CaseForChurTuriThes" target="_blank"&gt;The Church-Turing Thesis&lt;/a&gt;
  (Stanford Encyclopedia of Philosophy, 1997–2023) &lt;/li&gt;&lt;li&gt;&lt;a href="https://arxiv.org/abs/1912.10077" target="_blank"&gt;Are Transformers universal approximators of sequence-to-sequence functions?&lt;/a&gt;
  (Chulhee Yun, Srinadh Bhojanapalli, Ankit Singh Rawat, Sashank J. Reddi &amp;amp; Sanjiv Kumar, 2020) &lt;/li&gt;&lt;li&gt;&lt;a href="https://epochai.org/blog/can-ai-scaling-continue-through-2030" target="_blank"&gt;Can AI Scaling Continue Through 2030?&lt;/a&gt;
  (Jaime Sevilla, Tamay Besiroglu, Ben Cottier, Josh You, Edu Roldán, Pablo Villalobos &amp;amp; Ege Erdil, Epoch AI, 2024) &lt;/li&gt;&lt;li&gt;&lt;a href="https://arxiv.org/abs/2405.17399" target="_blank"&gt;Transformers Can Do Arithmetic with the Right Embeddings&lt;/a&gt;
  (Sean McLeish, Arpit Bansal, Alex Stein, Neel Jain, John Kirchenbauer, Brian R. Bartoldson, Bhavya Kailkhura, Abhinav Bhatele, Jonas Geiping, Avi Schwarzschild &amp;amp; Tom Goldstein, 2024) &lt;/li&gt;&lt;li&gt;&lt;a href="https://andrew.gr/stories/sha/" target="_blank"&gt;Implementing an SHA transformer by hand&lt;/a&gt;
  (Andrew Gritsevskiy) &lt;/li&gt;&lt;li&gt;&lt;a href="https://arxiv.org/abs/2312.00752" target="_blank"&gt;Mamba: Linear-Time Sequence Modeling with Selective State Spaces&lt;/a&gt;
  (Albert Gu &amp;amp; Tri Dao, 2023) &lt;/li&gt;&lt;/ol&gt;</content></entry><entry><title>Connecting Covariance and Rotational Inertia</title><summary>A surprisingly deep connection between statistics and classical mechanics</summary><link href="https://www.harysdalvi.com/blog/covariance-rotational-inertia/" rel="alternate"/><id>https://www.harysdalvi.com/blog/covariance-rotational-inertia/</id><published>2024-09-04T16:38:07Z</published><updated>2024-09-04T16:38:07Z</updated><content type="html" xml:lang="en">&lt;p&gt;The &lt;strong&gt;covariance matrix&lt;/strong&gt; in statistics represents the covariances between a set of random variables.
  It has a number of interesting properties, such as:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;It is &lt;strong&gt;symmetric&lt;/strong&gt; (equal to its transpose).&lt;/li&gt;
&lt;li&gt;Elements along its major diagonal are variances. The variance of a random variable is, loosely speaking, the average of a squared quantity.&lt;/li&gt;
&lt;li&gt;Its other elements are covariances between two different variables, which are, loosely speaking, an average product of two variables centered on their expectations.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The &lt;strong&gt;inertia tensor&lt;/strong&gt; in classical mechanics represents the “rotational mass”, or the difficulty of rotating
  an object around an axis. It also has a number of interesting properties, such as:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;It is &lt;strong&gt;symmetric&lt;/strong&gt; (equal to its transpose).&lt;/li&gt;
&lt;li&gt;Elements along its major diagonal are, loosely speaking, the average of a squared quantity. Namely, squared distances from an axis weighted by mass.&lt;/li&gt;
&lt;li&gt;Its other elements are, loosely speaking, an average product of two variables. Namely, the negative product of two coordinates, weighted by mass.&lt;/li&gt;
&lt;/ul&gt;
&lt;p class="image"&gt;🤔&lt;/p&gt;
&lt;p&gt;Is there a connection here? Or is this similar form just a coincidence?
&lt;/p&gt;
&lt;p&gt;There are many ways to ask this question. Today, I'll put it this way: &lt;strong&gt;is the inertia tensor also the covariance matrix
  of some set of random variables?&lt;/strong&gt; And if so, &lt;strong&gt;what are those random variables?&lt;/strong&gt;
&lt;/p&gt;
&lt;h2 id="definitions"&gt;Definitions&lt;/h2&gt;
&lt;p&gt;Let's start by defining the covariance matrix between &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-1-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; random variables, which we will label
  &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-2-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;mo&gt;…&lt;/mo&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;. The elements of the covariance matrix are given by&lt;/p&gt;
&lt;div class="longmath"&gt;&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-3-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="normal"&gt;C&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;o&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;v&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;mi&gt;j&lt;/mi&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="normal"&gt;E&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow&gt;&lt;mo&gt;[&lt;/mo&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;/msub&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="normal"&gt;E&lt;/mi&gt;&lt;/mrow&gt;&lt;mo stretchy="false"&gt;[&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;/msub&gt;&lt;mo stretchy="false"&gt;]&lt;/mo&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mi&gt;j&lt;/mi&gt;&lt;/msub&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="normal"&gt;E&lt;/mi&gt;&lt;/mrow&gt;&lt;mo stretchy="false"&gt;[&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mi&gt;j&lt;/mi&gt;&lt;/msub&gt;&lt;mo stretchy="false"&gt;]&lt;/mo&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mo&gt;]&lt;/mo&gt;&lt;/mrow&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="normal"&gt;C&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;o&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;v&lt;/mi&gt;&lt;/mrow&gt;&lt;mo 
stretchy="false"&gt;[&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;/msub&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mi&gt;j&lt;/mi&gt;&lt;/msub&gt;&lt;mo stretchy="false"&gt;]&lt;/mo&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;
&lt;p&gt;where &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-4-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="normal"&gt;E&lt;/mi&gt;&lt;/mrow&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; represents the expected value of a random variable and &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-5-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="normal"&gt;C&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;o&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;v&lt;/mi&gt;&lt;/mrow&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;
  represents the covariance of two random variables. In the special case along the
  major diagonal, this definition becomes
  &lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-6-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="normal"&gt;C&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;o&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;v&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="normal"&gt;E&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow&gt;&lt;mo&gt;[&lt;/mo&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;/msub&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="normal"&gt;E&lt;/mi&gt;&lt;/mrow&gt;&lt;mo stretchy="false"&gt;[&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;/msub&gt;&lt;mo stretchy="false"&gt;]&lt;/mo&gt;&lt;msup&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mo&gt;]&lt;/mo&gt;&lt;/mrow&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="normal"&gt;V&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;a&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;r&lt;/mi&gt;&lt;/mrow&gt;&lt;mo stretchy="false"&gt;[&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;/msub&gt;&lt;mo stretchy="false"&gt;]&lt;/mo&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
  where &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-7-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="normal"&gt;V&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;a&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;r&lt;/mi&gt;&lt;/mrow&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; represents the variance of a random variable.
&lt;/p&gt;
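&lt;p&gt;As a quick numerical illustration of these definitions, here is a minimal sketch that estimates a covariance matrix from samples, with expectations replaced by averages. The two random variables, the sample size, and the seed are all made up for illustration.&lt;/p&gt;

```python
import random


def covariance_matrix(samples):
    """Covariance matrix of samples, given as a list of rows.

    Each row is one joint observation (Y_1, ..., Y_n); every expectation
    in the definition is approximated by an average over the rows.
    """
    count = len(samples)
    n = len(samples[0])
    means = [sum(row[i] for row in samples) / count for i in range(n)]
    return [[sum((row[i] - means[i]) * (row[j] - means[j]) for row in samples) / count
             for j in range(n)]
            for i in range(n)]


rng = random.Random(0)  # seed chosen only for reproducibility
samples = []
for _ in range(2000):
    y1 = rng.gauss(0.0, 1.0)
    y2 = 2.0 * y1 + rng.gauss(0.0, 0.5)  # deliberately correlated with y1
    samples.append((y1, y2))

cov = covariance_matrix(samples)
print(cov[0][1] == cov[1][0])  # True: the matrix is symmetric
print(cov)
```

&lt;p&gt;With this construction, the diagonal entries should come out near the variances 1 and 4.25, and the off-diagonal entries near 2.&lt;/p&gt;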
&lt;p&gt;Now let's look at the definition of the inertia tensor &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-8-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;I&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; of an object around a given point.
  Intuitively, the inertia tensor represents a kind of “rotational mass”: how difficult
  it is to rotate something. For example,
  &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-9-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;I&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; (also called &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-10-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;I&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;22&lt;/mn&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;) measures the difficulty of rotating an object about the
  &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-11-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;-axis. The off-diagonal elements have a related meaning: for example &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-12-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;I&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;
  (also called &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-13-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;I&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;12&lt;/mn&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;) represents the fact that when you attempt to rotate an object around
  the &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-14-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;-axis, it may acquire angular momentum around the &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-15-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;-axis as well, depending on its
  shape.[&lt;a href="https://www.harysdalvi.com/blog/covariance-rotational-inertia/#reflist"&gt;1&lt;/a&gt;] As you might expect, the inertia tensor is proportional to mass:
  heavier things are harder to rotate.
&lt;/p&gt;
&lt;p&gt;
  The elements of the inertia tensor of a continuous solid object are given by
  &lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-16-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;I&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;mi&gt;j&lt;/mi&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;msub&gt;&lt;mo&gt;∫&lt;/mo&gt;&lt;mi&gt;V&lt;/mi&gt;&lt;/msub&gt;&lt;msub&gt;&lt;mi&gt;δ&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;mi&gt;j&lt;/mi&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo stretchy="false"&gt;|&lt;/mo&gt;&lt;/mrow&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="bold"&gt;r&lt;/mi&gt;&lt;/mrow&gt;&lt;msup&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo stretchy="false"&gt;|&lt;/mo&gt;&lt;/mrow&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;/msub&gt;&lt;msub&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mi&gt;j&lt;/mi&gt;&lt;/msub&gt;&lt;mspace width="thinmathspace"&gt;&lt;/mspace&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
  where&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The indices &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-17-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;mi&gt;j&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; represent the &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-18-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;mi&gt;z&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; coordinates for &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-19-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;mi&gt;j&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; respectively.
    &lt;/li&gt;&lt;li&gt;The integral here is over the entire volume &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-20-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;V&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; of the object.&lt;/li&gt;
&lt;li&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-21-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mi&gt;ρ&lt;/mi&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;mi&gt;z&lt;/mi&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mspace width="thinmathspace"&gt;&lt;/mspace&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mspace width="thinmathspace"&gt;&lt;/mspace&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;mspace width="thinmathspace"&gt;&lt;/mspace&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mi&gt;z&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;
      is an infinitesimal mass element depending on the density distribution &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-22-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;ρ&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; of the object.&lt;/li&gt;
&lt;li&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-23-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="bold"&gt;r&lt;/mi&gt;&lt;/mrow&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is the vector from the point around which we are computing &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-24-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;I&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; to the
      current integration point, so
      &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-25-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo stretchy="false"&gt;|&lt;/mo&gt;&lt;/mrow&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="bold"&gt;r&lt;/mi&gt;&lt;/mrow&gt;&lt;msup&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo stretchy="false"&gt;|&lt;/mo&gt;&lt;/mrow&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is the squared distance from the center point of rotation to the current point
      within the object.&lt;/li&gt;
&lt;li&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-26-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;δ&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;mi&gt;j&lt;/mi&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is the Kronecker delta, indicating that we only include the
      &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-27-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo stretchy="false"&gt;|&lt;/mo&gt;&lt;/mrow&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="bold"&gt;r&lt;/mi&gt;&lt;/mrow&gt;&lt;msup&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo stretchy="false"&gt;|&lt;/mo&gt;&lt;/mrow&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; term for elements along the major diagonal, where &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-28-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mi&gt;j&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;.
    &lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Since we are integrating over the volume of the object, this is actually a triple integral.
  This is reflected in the definition of &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-29-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;.
&lt;/p&gt;
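&lt;p&gt;To make the definition concrete, here is a minimal NumPy sketch that replaces the volume integral with a sum over point masses. The function name &lt;code&gt;inertia_tensor&lt;/code&gt; and the two-mass example are illustrative assumptions, not part of the derivation above.&lt;/p&gt;

```python
import numpy as np

def inertia_tensor(points, masses):
    """Discrete analogue of I_ij = sum_k m_k (delta_ij |r_k|^2 - x_ki x_kj).

    Hypothetical helper: the integral over dm becomes a sum over point masses.
    """
    points = np.asarray(points, dtype=float)   # shape (n, 3)
    masses = np.asarray(masses, dtype=float)   # shape (n,)
    r2 = np.sum(points**2, axis=1)             # |r|^2 for each point
    # Kronecker delta term: only contributes along the diagonal
    diag_term = np.sum(masses * r2) * np.eye(3)
    # x_i x_j term: mass-weighted sum of outer products
    outer_term = points.T @ (masses[:, None] * points)
    return diag_term - outer_term

# Two unit masses on the x-axis at x = +1 and x = -1
I = inertia_tensor([[1, 0, 0], [-1, 0, 0]], [1, 1])
print(I)
```

&lt;p&gt;For this example the result is diagonal with entries 0, 2, 2: the masses sit on the x-axis, so they offer no resistance to rotation about that axis and equal resistance about the other two.&lt;/p&gt;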
&lt;p&gt;
  We can get rid of the Kronecker delta by expressing the on-diagonal and off-diagonal elements
  separately. Noting that &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-30-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo stretchy="false"&gt;|&lt;/mo&gt;&lt;/mrow&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="bold"&gt;r&lt;/mi&gt;&lt;/mrow&gt;&lt;msup&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo stretchy="false"&gt;|&lt;/mo&gt;&lt;/mrow&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;msup&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;msup&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;msup&gt;&lt;mi&gt;z&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;msubsup&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msubsup&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;msubsup&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msubsup&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;msubsup&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msubsup&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, we find
  &lt;span class="mjx-chtml MJXc-display"&gt;&lt;span class="mjx-full-width mjx-chtml MathJax_CHTML" id="MathJax-Element-31-Frame" role="presentation" style="font-size: 90%; min-width: 12.874em; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;I&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;msub&gt;&lt;mo&gt;∫&lt;/mo&gt;&lt;mi&gt;V&lt;/mi&gt;&lt;/msub&gt;&lt;msubsup&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msubsup&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;msubsup&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msubsup&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;msubsup&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msubsup&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;msubsup&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msubsup&gt;&lt;mspace width="thinmathspace"&gt;&lt;/mspace&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;mspace linebreak="newline"&gt;&lt;/mspace&gt;&lt;msub&gt;&lt;mi&gt;I&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;mi&gt;j&lt;/mi&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;msub&gt;&lt;mo&gt;∫&lt;/mo&gt;&lt;mi&gt;V&lt;/mi&gt;&lt;/msub&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;/msub&gt;&lt;msub&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mi&gt;j&lt;/mi&gt;&lt;/msub&gt;&lt;mspace width="thinmathspace"&gt;&lt;/mspace&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;mspace width="2em"&gt;&lt;/mspace&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;mo&gt;≠&lt;/mo&gt;&lt;mi&gt;j&lt;/mi&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
  This exposes a fundamental problem: the diagonal and off-diagonal elements of the inertia tensor
  take different forms, whereas every element of the covariance matrix follows the same formula. However, at least in some cases,
  we'll be able to get around this.
&lt;/p&gt;
&lt;h2 id="random-sampling-mass"&gt;Random Sampling by Mass&lt;/h2&gt;
&lt;p&gt;In order to compare the inertia tensor and the covariance matrix, it will help to reframe the inertia tensor in terms of random variables.
  Currently, we are integrating over the volume of an object with respect to a mass element
  &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-32-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mi&gt;ρ&lt;/mi&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;/msub&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mspace width="thinmathspace"&gt;&lt;/mspace&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/msub&gt;&lt;mspace width="thinmathspace"&gt;&lt;/mspace&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msub&gt;&lt;mspace width="thinmathspace"&gt;&lt;/mspace&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;. This is strongly reminiscent of the expected value of a random variable.
&lt;/p&gt;
&lt;p&gt;Let's imagine a random process in which we sample a point from an object with probability weighted by density,
  so denser regions are more likely to get picked. For uniform density, this reduces to sampling uniformly from
  the object's volume. Then our &lt;strong&gt;probability density function&lt;/strong&gt; is a literal &lt;strong&gt;density function&lt;/strong&gt;
&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-33-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;ρ&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;
  divided by the total mass &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-34-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;M&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; of the object!
  After sampling a point, &lt;strong&gt;we get three random variables for the coordinates&lt;/strong&gt; &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-35-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; of that point.
&lt;/p&gt;
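&lt;p&gt;As a rough illustration of sampling by mass, the sketch below stands in for a continuous object with four hypothetical cells, so the probability density becomes a probability mass function equal to each cell's mass divided by the total mass. The positions, masses, and sample count are made up for the example.&lt;/p&gt;

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical object: four small cells with positions and masses
points = np.array([[0.0, 0.0, 0.0],
                   [1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0],
                   [0.0, 0.0, 1.0]])
masses = np.array([1.0, 2.0, 3.0, 4.0])
M = masses.sum()

# Discrete version of rho / M: heavier cells are more likely to be picked
pmf = masses / M
idx = rng.choice(len(points), size=100_000, p=pmf)

# Each row is one draw of the random coordinates (X1, X2, X3)
samples = points[idx]

# Empirical pick frequencies should approach the mass fractions
freq = np.bincount(idx, minlength=len(points)) / len(idx)
print(pmf)
print(freq)
```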
&lt;p&gt;Now, &lt;strong&gt;we can compute expected values in terms of these random variables.&lt;/strong&gt; Consider, for example,
  the variance of &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-36-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;.
  This is equal to the variance of &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-37-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; plus the variance of &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-38-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;,
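  provided the two coordinates are uncorrelated (in general there is an extra cross term equal to twice their covariance).
  We can express this sum of variances in integral form as&lt;/p&gt;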
&lt;div class="longmath"&gt;&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-39-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="normal"&gt;V&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;a&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;r&lt;/mi&gt;&lt;/mrow&gt;&lt;mo stretchy="false"&gt;[&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;/msub&gt;&lt;mo stretchy="false"&gt;]&lt;/mo&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;msub&gt;&lt;mo&gt;∫&lt;/mo&gt;&lt;mi&gt;V&lt;/mi&gt;&lt;/msub&gt;&lt;mrow&gt;&lt;mo&gt;(&lt;/mo&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mi&gt;E&lt;/mi&gt;&lt;mo stretchy="false"&gt;[&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msub&gt;&lt;mo stretchy="false"&gt;]&lt;/mo&gt;&lt;msup&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mi&gt;E&lt;/mi&gt;&lt;mo stretchy="false"&gt;[&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;/msub&gt;&lt;mo stretchy="false"&gt;]&lt;/mo&gt;&lt;msup&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mo&gt;)&lt;/mo&gt;&lt;/mrow&gt;&lt;mfrac&gt;&lt;mi&gt;ρ&lt;/mi&gt;&lt;mi&gt;M&lt;/mi&gt;&lt;/mfrac&gt;&lt;mspace 
width="thinmathspace"&gt;&lt;/mspace&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/msub&gt;&lt;mspace width="thinmathspace"&gt;&lt;/mspace&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msub&gt;&lt;mspace width="thinmathspace"&gt;&lt;/mspace&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;.&lt;/mo&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;
&lt;p&gt;To simplify things, let's &lt;strong&gt;place the origin of our coordinate system at the center of mass of our object.&lt;/strong&gt;
  Then our expected value for each coordinate is zero: &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-40-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;E&lt;/mi&gt;&lt;mo stretchy="false"&gt;[&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/msub&gt;&lt;mo stretchy="false"&gt;]&lt;/mo&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mi&gt;E&lt;/mi&gt;&lt;mo stretchy="false"&gt;[&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msub&gt;&lt;mo stretchy="false"&gt;]&lt;/mo&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mi&gt;E&lt;/mi&gt;&lt;mo stretchy="false"&gt;[&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;/msub&gt;&lt;mo stretchy="false"&gt;]&lt;/mo&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;0&lt;/mn&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;. We now have
  &lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-41-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="normal"&gt;V&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;a&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;r&lt;/mi&gt;&lt;/mrow&gt;&lt;mo stretchy="false"&gt;[&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;/msub&gt;&lt;mo stretchy="false"&gt;]&lt;/mo&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mfrac&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mi&gt;M&lt;/mi&gt;&lt;/mfrac&gt;&lt;msub&gt;&lt;mo&gt;∫&lt;/mo&gt;&lt;mi&gt;V&lt;/mi&gt;&lt;/msub&gt;&lt;msubsup&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msubsup&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;msubsup&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msubsup&gt;&lt;mspace width="thinmathspace"&gt;&lt;/mspace&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
  where &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-42-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mi&gt;ρ&lt;/mi&gt;&lt;mspace width="thinmathspace"&gt;&lt;/mspace&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/msub&gt;&lt;mspace width="thinmathspace"&gt;&lt;/mspace&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msub&gt;&lt;mspace width="thinmathspace"&gt;&lt;/mspace&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;.
&lt;/p&gt;
&lt;p&gt;Recall that&lt;/p&gt;
&lt;div class="longmath"&gt;&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-43-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;I&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;11&lt;/mn&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;msub&gt;&lt;mo&gt;∫&lt;/mo&gt;&lt;mi&gt;V&lt;/mi&gt;&lt;/msub&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo stretchy="false"&gt;|&lt;/mo&gt;&lt;/mrow&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="bold"&gt;r&lt;/mi&gt;&lt;/mrow&gt;&lt;msup&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo stretchy="false"&gt;|&lt;/mo&gt;&lt;/mrow&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;msubsup&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msubsup&gt;&lt;mspace width="thinmathspace"&gt;&lt;/mspace&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;msub&gt;&lt;mo&gt;∫&lt;/mo&gt;&lt;mi&gt;V&lt;/mi&gt;&lt;/msub&gt;&lt;msubsup&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msubsup&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;msubsup&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msubsup&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;msubsup&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msubsup&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;msubsup&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msubsup&gt;&lt;mspace 
width="thinmathspace"&gt;&lt;/mspace&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;msub&gt;&lt;mo&gt;∫&lt;/mo&gt;&lt;mi&gt;V&lt;/mi&gt;&lt;/msub&gt;&lt;msubsup&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msubsup&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;msubsup&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msubsup&gt;&lt;mspace width="thinmathspace"&gt;&lt;/mspace&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;mo&gt;.&lt;/mo&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;
&lt;p&gt;From the previous two equations, &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-44-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="normal"&gt;V&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;a&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;r&lt;/mi&gt;&lt;/mrow&gt;&lt;mo stretchy="false"&gt;[&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;/msub&gt;&lt;mo stretchy="false"&gt;]&lt;/mo&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is exactly equal to &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-45-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;I&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;11&lt;/mn&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; divided by &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-46-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;M&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;! By the same argument applied to the other two axes,
  we also have &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-47-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mfrac&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mi&gt;M&lt;/mi&gt;&lt;/mfrac&gt;&lt;msub&gt;&lt;mi&gt;I&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;22&lt;/mn&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="normal"&gt;V&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;a&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;r&lt;/mi&gt;&lt;/mrow&gt;&lt;mo stretchy="false"&gt;[&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;/msub&gt;&lt;mo stretchy="false"&gt;]&lt;/mo&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; and &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-48-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mfrac&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mi&gt;M&lt;/mi&gt;&lt;/mfrac&gt;&lt;msub&gt;&lt;mi&gt;I&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;33&lt;/mn&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="normal"&gt;V&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;a&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;r&lt;/mi&gt;&lt;/mrow&gt;&lt;mo stretchy="false"&gt;[&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msub&gt;&lt;mo stretchy="false"&gt;]&lt;/mo&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;.
&lt;/p&gt;
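&lt;p&gt;The integral identity behind this step can be checked numerically. The sketch below is only an illustration under stated assumptions: the object is a handful of made-up point masses, the origin is shifted to the center of mass, and each diagonal element of the inertia tensor divided by the total mass is compared with the sum of the mass-weighted variances of the other two coordinates.&lt;/p&gt;

```python
import numpy as np

# Hypothetical object: point masses at arbitrary positions
points = np.array([[1.0, 2.0, 0.5],
                   [-0.5, 1.0, 3.0],
                   [2.0, -1.0, 1.0],
                   [0.0, 0.5, -2.0]])
masses = np.array([1.0, 2.0, 1.5, 0.5])
M = masses.sum()

# Move the origin to the center of mass so that E[X_i] = 0
com = masses @ points / M
x = points - com

# Diagonal of the inertia tensor: I_ii = sum_k m_k (|r_k|^2 - x_ki^2)
r2 = np.sum(x**2, axis=1)
I_diag = np.array([np.sum(masses * (r2 - x[:, i]**2)) for i in range(3)])

# Mass-weighted variances of X1, X2, X3 (the means are zero after the shift)
var = masses @ x**2 / M

# Sum of the variances of the *other* two coordinates for each axis
other_two = var.sum() - var

print(I_diag / M)
print(other_two)
```

&lt;p&gt;The two printed arrays agree for any mass distribution, because this comparison uses the sum of the two variances directly and never needs the covariance between coordinates.&lt;/p&gt;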
&lt;p&gt;Since the elements along the major diagonal of the covariance matrix are variances, we might be tempted to say
  &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-49-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="normal"&gt;C&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;o&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;v&lt;/mi&gt;&lt;/mrow&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;/msub&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mfrac&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mi&gt;M&lt;/mi&gt;&lt;/mfrac&gt;&lt;mi&gt;I&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; where &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-50-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="normal"&gt;C&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;o&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;v&lt;/mi&gt;&lt;/mrow&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;/msub&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is the covariance matrix
  of &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-51-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;. Then if we define &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-52-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-53-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, and &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-54-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math 
xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;,
  the elements along the major diagonal are correct. Let's call these &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-55-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; our &lt;strong&gt;desired random variables&lt;/strong&gt;, because they are physically interpretable
  in terms of our random coordinates &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-56-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; and they also connect the covariance matrix to the inertia tensor.
  Physically, these represent all three distinct sums of two coordinates of a randomly sampled 3D point.
&lt;/p&gt;
&lt;p&gt;There's a problem though: if we define our random variables &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-57-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; in this way,
  are the off-diagonal elements of the covariance matrix also correct?
&lt;/p&gt;
&lt;h2 id="off-diagonal-elements"&gt;The Off-Diagonal Elements&lt;/h2&gt;
&lt;p&gt;
With our desired random variables, the diagonal terms of the covariance matrix matched the inertia tensor! Now for the off-diagonal elements, we hope to find&lt;/p&gt;
&lt;div class="longmath"&gt;&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-58-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="normal"&gt;C&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;o&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;v&lt;/mi&gt;&lt;/mrow&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;/msub&gt;&lt;msub&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;mi&gt;j&lt;/mi&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mfrac&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mi&gt;M&lt;/mi&gt;&lt;/mfrac&gt;&lt;msub&gt;&lt;mi&gt;I&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;mi&gt;j&lt;/mi&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mspace width="thickmathspace"&gt;&lt;/mspace&gt;&lt;mo stretchy="false"&gt;⟹&lt;/mo&gt;&lt;mspace width="thickmathspace"&gt;&lt;/mspace&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="normal"&gt;E&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow&gt;&lt;mo&gt;[&lt;/mo&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;/msub&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mi&gt;E&lt;/mi&gt;&lt;mo stretchy="false"&gt;[&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;/msub&gt;&lt;mo stretchy="false"&gt;]&lt;/mo&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mo 
stretchy="false"&gt;(&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mi&gt;j&lt;/mi&gt;&lt;/msub&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mi&gt;E&lt;/mi&gt;&lt;mo stretchy="false"&gt;[&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mi&gt;j&lt;/mi&gt;&lt;/msub&gt;&lt;mo stretchy="false"&gt;]&lt;/mo&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mo&gt;]&lt;/mo&gt;&lt;/mrow&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mfrac&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mi&gt;M&lt;/mi&gt;&lt;/mfrac&gt;&lt;msub&gt;&lt;mo&gt;∫&lt;/mo&gt;&lt;mi&gt;V&lt;/mi&gt;&lt;/msub&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;/msub&gt;&lt;msub&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mi&gt;j&lt;/mi&gt;&lt;/msub&gt;&lt;mspace width="thinmathspace"&gt;&lt;/mspace&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;mo&gt;.&lt;/mo&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;
&lt;p&gt;
Since we chose to place our origin at the center of mass of the object, we have the expectation of all &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-59-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; equal to &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-60-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mn&gt;0&lt;/mn&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;.
And since each &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-61-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is a sum of &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-62-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; terms, their expectations will also be zero.
Then the above simplifies to
&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-63-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="normal"&gt;E&lt;/mi&gt;&lt;/mrow&gt;&lt;mo stretchy="false"&gt;[&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;/msub&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mi&gt;j&lt;/mi&gt;&lt;/msub&gt;&lt;mo stretchy="false"&gt;]&lt;/mo&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mfrac&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mi&gt;M&lt;/mi&gt;&lt;/mfrac&gt;&lt;msub&gt;&lt;mo&gt;∫&lt;/mo&gt;&lt;mi&gt;V&lt;/mi&gt;&lt;/msub&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;/msub&gt;&lt;msub&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mi&gt;j&lt;/mi&gt;&lt;/msub&gt;&lt;mspace width="thinmathspace"&gt;&lt;/mspace&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;mo&gt;.&lt;/mo&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
Let's convert the integral on the right into an expectation.&lt;/p&gt;
&lt;div class="longmath"&gt;&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-64-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mfrac&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mi&gt;M&lt;/mi&gt;&lt;/mfrac&gt;&lt;msub&gt;&lt;mo&gt;∫&lt;/mo&gt;&lt;mi&gt;V&lt;/mi&gt;&lt;/msub&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;/msub&gt;&lt;msub&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mi&gt;j&lt;/mi&gt;&lt;/msub&gt;&lt;mspace width="thinmathspace"&gt;&lt;/mspace&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;msub&gt;&lt;mo&gt;∫&lt;/mo&gt;&lt;mi&gt;V&lt;/mi&gt;&lt;/msub&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;/msub&gt;&lt;msub&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mi&gt;j&lt;/mi&gt;&lt;/msub&gt;&lt;mo&gt;⋅&lt;/mo&gt;&lt;mfrac&gt;&lt;mrow&gt;&lt;mi&gt;ρ&lt;/mi&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;/msub&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;/mrow&gt;&lt;mi&gt;M&lt;/mi&gt;&lt;/mfrac&gt;&lt;mspace width="thinmathspace"&gt;&lt;/mspace&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/msub&gt;&lt;mspace width="thinmathspace"&gt;&lt;/mspace&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msub&gt;&lt;mspace width="thinmathspace"&gt;&lt;/mspace&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mrow 
class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="normal"&gt;E&lt;/mi&gt;&lt;/mrow&gt;&lt;mo stretchy="false"&gt;[&lt;/mo&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;/msub&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mi&gt;j&lt;/mi&gt;&lt;/msub&gt;&lt;mo stretchy="false"&gt;]&lt;/mo&gt;&lt;mo&gt;.&lt;/mo&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;
&lt;p&gt;So, with the definitions previously chosen for each &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-65-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, we are forced to conclude
&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-66-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="normal"&gt;E&lt;/mi&gt;&lt;/mrow&gt;&lt;mo stretchy="false"&gt;[&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;/msub&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mi&gt;j&lt;/mi&gt;&lt;/msub&gt;&lt;mo stretchy="false"&gt;]&lt;/mo&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="normal"&gt;E&lt;/mi&gt;&lt;/mrow&gt;&lt;mo stretchy="false"&gt;[&lt;/mo&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;/msub&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mi&gt;j&lt;/mi&gt;&lt;/msub&gt;&lt;mo stretchy="false"&gt;]&lt;/mo&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;! This is not true in general.
&lt;/p&gt;
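&lt;p&gt;To see the mismatch numerically, here is a minimal sketch. It assumes a uniform-density cube of side 2 centered at the origin (a shape chosen purely for illustration, not one used in the derivation) and estimates both sides of the claimed equality for i = 1, j = 2 with NumPy.&lt;/p&gt;

```python
import numpy as np

# Assumption for illustration: a uniform-density cube with side 2,
# centered at the origin so that E[X_i] = 0 for every coordinate.
rng = np.random.default_rng(0)
points = rng.uniform(-1.0, 1.0, size=(200_000, 3))
x1, x2, x3 = points[:, 0], points[:, 1], points[:, 2]

# Desired random variables, all built from the SAME sampled point.
y1 = x2 + x3
y2 = x1 + x3

lhs = np.mean(y1 * y2)    # Monte Carlo estimate of E[Y1 Y2]
rhs = np.mean(-x1 * x2)   # Monte Carlo estimate of E[-X1 X2]

print(f"E[Y1 Y2]  is about {lhs:.3f}")
print(f"E[-X1 X2] is about {rhs:.3f}")
```

&lt;p&gt;Expanding the product gives Y1 Y2 = X1 X2 + X2 X3 + X1 X3 + X3 X3. For this cube the cross terms average to zero, so the first estimate should land near E[X3 X3] = 1/3 while the second stays near 0. The leftover squared term is what breaks the equality.&lt;/p&gt;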
&lt;figure class="endfloat"&gt;
&lt;img alt="" class="white-diagram" src="https://www.harysdalvi.com/blog/covariance-rotational-inertia/cylinder.svg"/&gt;
&lt;figcaption&gt;A cylinder, symmetric across the &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-67-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-68-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mi&gt;z&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, and &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-69-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;mi&gt;z&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; planes.
    Image source: &lt;a href="https://commons.wikimedia.org/wiki/File:Moment_of_inertia_solid_cylinder.svg" target="_blank"&gt;Grendelkhan
      at the English Wikipedia&lt;/a&gt;, &lt;a href="https://creativecommons.org/licenses/by-sa/3.0/deed.en" target="_blank"&gt;CC
        BY-SA 3.0&lt;/a&gt;.
  &lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;There is a way around this. First, let's force &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-70-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="normal"&gt;E&lt;/mi&gt;&lt;/mrow&gt;&lt;mo stretchy="false"&gt;[&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;/msub&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mi&gt;j&lt;/mi&gt;&lt;/msub&gt;&lt;mo stretchy="false"&gt;]&lt;/mo&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;0&lt;/mn&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;.
  This isn't an unreasonable assumption. For distinct indices i and j, it holds for shapes that are mirror-symmetric across at least two of the coordinate planes.
&lt;/p&gt;
&lt;p&gt;
  Consider an object that is symmetric across the plane containing the &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-71-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; and &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-72-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; axes. Then &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-73-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="normal"&gt;E&lt;/mi&gt;&lt;/mrow&gt;&lt;mo stretchy="false"&gt;[&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/msub&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;/msub&gt;&lt;mo stretchy="false"&gt;]&lt;/mo&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="normal"&gt;E&lt;/mi&gt;&lt;/mrow&gt;&lt;mo stretchy="false"&gt;[&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msub&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;/msub&gt;&lt;mo stretchy="false"&gt;]&lt;/mo&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;0&lt;/mn&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;
  because for every &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-74-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;mi&gt;z&lt;/mi&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; pair, there is an equally probable &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-75-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mi&gt;z&lt;/mi&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; pair; and for every &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-76-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;mi&gt;z&lt;/mi&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;
  pair, there is an equally probable &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-77-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mi&gt;z&lt;/mi&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; pair. We can make &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-78-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="normal"&gt;E&lt;/mi&gt;&lt;/mrow&gt;&lt;mo stretchy="false"&gt;[&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/msub&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msub&gt;&lt;mo stretchy="false"&gt;]&lt;/mo&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;0&lt;/mn&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;
  by adding another coordinate plane of symmetry.
&lt;/p&gt;
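&lt;p&gt;The symmetry argument can be checked with a small simulation. The sketch below assumes a hypothetical uniform-density triangular prism whose only coordinate plane of symmetry is the xy plane. We would expect the two expectations involving X3 to vanish, while E[X1 X2] does not.&lt;/p&gt;

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Hypothetical shape: a triangular prism of uniform density.
# Cross-section: the triangle with corners (0, 0), (1, 0), (1, 1).
# Height: z in [-1, 1], which makes the xy plane a mirror plane.
u, v = rng.uniform(size=(2, n))
x = np.maximum(u, v)   # (max, min) of two uniforms is uniform on the triangle
y = np.minimum(u, v)
z = rng.uniform(-1.0, 1.0, size=n)

pts = np.column_stack([x, y, z])
pts -= pts.mean(axis=0)   # move the origin to the (estimated) center of mass
x1, x2, x3 = pts.T

e13 = np.mean(x1 * x3)
e23 = np.mean(x2 * x3)
e12 = np.mean(x1 * x2)

print(f"E[X1 X3] is about {e13:.3f}")
print(f"E[X2 X3] is about {e23:.3f}")
print(f"E[X1 X2] is about {e12:.3f}")
```

&lt;p&gt;For this triangle the exact value of E[X1 X2] after centering is 1/36, so one mirror plane is not enough to clear every off-diagonal term.&lt;/p&gt;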
&lt;p&gt;Our next modification will be to sample three independent points rather than just one. Then &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-79-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is the sum of &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-80-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; and &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-81-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;z&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;
  coordinates of the first point, &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-82-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is the sum of &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-83-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; and &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-84-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;z&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; coordinates for the second point, and so on.
  This keeps the variances of all &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-85-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; the same while making them all independent, so their covariances become zero.
&lt;/p&gt;
&lt;p&gt;However, for asymmetric shapes, the assumption of zeros off the diagonals is violated in general.
  So it looks like our desired random variables won't always work for asymmetric shapes.
&lt;/p&gt;
&lt;p&gt;So far, we have discovered:
  &lt;/p&gt;&lt;ol&gt;
&lt;li&gt;Take an object that is &lt;strong&gt;symmetric&lt;/strong&gt; across at least two of the &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-86-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-87-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;mi&gt;z&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, and &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-88-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mi&gt;z&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; planes.&lt;/li&gt;
&lt;li&gt;Randomly, independently, and identically sample three points in the object, with sampling weighted by density.&lt;/li&gt;
&lt;li&gt;Let &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-89-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; be random variables representing the &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-90-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-91-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, and &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-92-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;z&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; coordinates
      of the &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-93-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;th randomly sampled point.&lt;/li&gt;
&lt;li&gt;Define new random variables &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-94-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;21&lt;/mn&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;31&lt;/mn&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-95-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;12&lt;/mn&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;32&lt;/mn&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, and &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-96-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;13&lt;/mn&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mrow 
class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;23&lt;/mn&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;.
      That is, for the &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-97-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;th point, let &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-98-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; be the sum of the two coordinates excluding the &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-99-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;th.
    &lt;/li&gt;
&lt;li&gt;The covariance matrix of &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-100-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; will be equal to &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-101-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mfrac&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mi&gt;M&lt;/mi&gt;&lt;/mfrac&gt;&lt;mi&gt;I&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, where
      &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-102-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;I&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is the inertia tensor of the object around its center of mass. All off-diagonal elements of &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-103-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;I&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;
      and the covariance matrix will be zero.&lt;/li&gt;
&lt;/ol&gt;
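&lt;p&gt;The steps above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the object is a uniform-density solid cuboid with made-up side lengths, the helper &lt;code&gt;sample_y&lt;/code&gt; is a hypothetical name, and the comparison uses the textbook moments of inertia of a cuboid about its center.&lt;/p&gt;

```python
import numpy as np

def sample_y(sides, n, rng):
    """Draw n realizations of (Y1, Y2, Y3) for a uniform cuboid centered at the origin."""
    half = np.asarray(sides) / 2.0
    # Three independent points per realization: axes are (draw, point, coordinate).
    p = rng.uniform(-half, half, size=(n, 3, 3))
    y1 = p[:, 0, 1] + p[:, 0, 2]   # point 1: y + z
    y2 = p[:, 1, 0] + p[:, 1, 2]   # point 2: x + z
    y3 = p[:, 2, 0] + p[:, 2, 1]   # point 3: x + y
    return np.column_stack([y1, y2, y3])

sides = (1.0, 2.0, 3.0)   # assumed side lengths a, b, c
a, b, c = sides
rng = np.random.default_rng(2)
cov = np.cov(sample_y(sides, 400_000, rng), rowvar=False)

# Inertia tensor of a solid cuboid about its center, divided by the mass M.
inertia_over_mass = np.diag([b**2 + c**2, a**2 + c**2, a**2 + b**2]) / 12.0

print(np.round(cov, 2))
print(np.round(inertia_over_mass, 2))
```

&lt;p&gt;The two printed matrices should agree up to sampling noise, with off-diagonal entries close to zero in both.&lt;/p&gt;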
&lt;h2 id="asymmetric objects"&gt;Asymmetric Objects&lt;/h2&gt;
&lt;p&gt;These definitions for each &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-104-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; don't work for asymmetric objects. But can we find definitions that do work?
  That is, given an asymmetric object of mass &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-105-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;M&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; and its inertia tensor &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-106-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;I&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, can we define random variables &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-107-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;
  such that the covariance matrix of &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-108-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="bold"&gt;Y&lt;/mi&gt;&lt;/mrow&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;/msub&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is equal to &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-109-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mfrac&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mi&gt;M&lt;/mi&gt;&lt;/mfrac&gt;&lt;mi&gt;I&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;?
&lt;/p&gt;
&lt;p&gt;First of all, the cop-out, direct answer is yes. If you have a set of random variables, their means, and their covariance matrix,
  you can quite easily sample values for those random variables. Many
  &lt;a href="https://numpy.org/doc/stable/reference/random/generated/numpy.random.multivariate_normal.html" target="_blank"&gt;computing
    packages&lt;/a&gt; have this implemented. So given an inertia tensor &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-110-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;I&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, we can divide it by &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-111-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;M&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; and then use the result
    as a covariance matrix to sample random variables with any means we like.
&lt;/p&gt;
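&lt;p&gt;As a rough illustration of this cop-out, here is a minimal NumPy sketch. The mass and inertia tensor values are made up for the example, and the means are set to zero only for convenience; any means would do.&lt;/p&gt;

```python
import numpy as np

# Hypothetical example values: mass and inertia tensor of some rigid body.
M = 2.0
I = np.array([
    [4.0, -1.0, 0.5],
    [-1.0, 3.0, 0.2],
    [0.5, 0.2, 5.0],
])

cov = I / M  # the target covariance matrix
rng = np.random.default_rng(0)

# The means are arbitrary; zeros are chosen only for convenience.
samples = rng.multivariate_normal(mean=np.zeros(3), cov=cov, size=200_000)

# The sample covariance should land close to I / M.
sample_cov = np.cov(samples, rowvar=False)
print(np.round(sample_cov, 2))
print(np.round(cov, 2))
```

&lt;p&gt;The two printed matrices should agree up to sampling noise, but nothing about these samples has a physical meaning.&lt;/p&gt;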
&lt;p&gt;
  But this is no fun. What we really want is a physical interpretation: which three interpretable physical random variables
  have the inertia tensor (up to a scaling factor) as their covariance matrix?
&lt;/p&gt;
&lt;h3&gt;Using Eigenvectors&lt;/h3&gt;
&lt;p&gt;
  There's a way we can interpret the inertia tensor of any shape, even asymmetric ones, as a covariance matrix.
  This is done by choosing a different coordinate system in which the inertia tensor
  is diagonal: that is, all off-diagonal elements are zero. The good news is that this is
  &lt;strong&gt;always possible&lt;/strong&gt;, for any shape, even asymmetric ones![&lt;a href="https://www.harysdalvi.com/blog/covariance-rotational-inertia/#reflist"&gt;2&lt;/a&gt;]
&lt;/p&gt;
&lt;p&gt;How do we find such a coordinate system? By taking the &lt;strong&gt;eigenvectors&lt;/strong&gt;
  of the inertia tensor. In a physical context, the eigenvectors of the inertia tensor are also called the
  &lt;strong&gt;principal axes&lt;/strong&gt; of the object.
&lt;/p&gt;
&lt;figure class="endfloat"&gt;
&lt;img alt="" src="https://www.harysdalvi.com/blog/covariance-rotational-inertia/tennis-racquet.svg"/&gt;
&lt;figcaption&gt;The three principal axes of a tennis racket. Note that the racket is not symmetric across the
    &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-112-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mover&gt;&lt;msub&gt;&lt;mi&gt;e&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msub&gt;&lt;mo stretchy="false"&gt;^&lt;/mo&gt;&lt;/mover&gt;&lt;/mrow&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;-&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-113-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mover&gt;&lt;msub&gt;&lt;mi&gt;e&lt;/mi&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;/msub&gt;&lt;mo stretchy="false"&gt;^&lt;/mo&gt;&lt;/mover&gt;&lt;/mrow&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; plane! Image source:
    &lt;a href="https://commons.wikimedia.org/wiki/File:Tennis_racquet_principal_axes.svg" target="_blank"&gt;CMG Lee,
      Wikimedia&lt;/a&gt;,
      &lt;a href="https://creativecommons.org/licenses/by-sa/4.0/deed.en" target="_blank"&gt;CC BY-SA 4.0&lt;/a&gt;.
  &lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;If we take the inertia tensor around the center of mass of the object, using the principal axes
  as our three coordinates, we will get a tensor &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-114-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;I&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; whose off-diagonal elements are zero.
  Then we can find three independent random variables whose covariance matrix is &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-115-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mfrac&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mi&gt;M&lt;/mi&gt;&lt;/mfrac&gt;&lt;mi&gt;I&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;.
  We can even use the independent random variables from our symmetric object procedure, because the covariances
  were already zero and the variances are still fine.
&lt;/p&gt;
&lt;p&gt;Now we've found a solution that works for any shape! That is, given any object, we can
  &lt;strong&gt;choose a coordinate system&lt;/strong&gt; such that the &lt;strong&gt;inertia tensor&lt;/strong&gt;
  of the object divided by its mass is equal to the &lt;strong&gt;covariance matrix&lt;/strong&gt;
  of three random variables. In particular, these three random variables should be easy to
  obtain by randomly sampling three points within the object, weighted by density, and using the coordinates
  of those randomly sampled points.
&lt;/p&gt;
&lt;p&gt;To make things fully clear, let's go through a process by which we can make the
  inertia tensor of &lt;em&gt;any&lt;/em&gt; object proportional to the covariance matrix of three
  easy-to-understand random variables.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Take &lt;strong&gt;any&lt;/strong&gt; object.&lt;/li&gt;
&lt;li&gt;Set coordinates &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-116-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; to align with the principal axes of the object. You can find these by taking the
    eigenvectors of the object's inertia tensor in any coordinate system. Center the coordinate system
    on the object's center of mass.
  &lt;/li&gt;
&lt;li&gt;Randomly, independently, and identically sample three points in the object, with sampling weighted by density.&lt;/li&gt;
&lt;li&gt;Let &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-117-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; be random variables representing the &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-118-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-119-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, and &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-120-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; coordinates
    of each randomly sampled point number &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-121-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;.&lt;/li&gt;
&lt;li&gt;Define new random variables &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-122-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;21&lt;/mn&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;31&lt;/mn&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-123-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;12&lt;/mn&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;32&lt;/mn&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, and &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-124-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;13&lt;/mn&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;X&lt;/mi&gt;&lt;mrow 
class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;23&lt;/mn&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;.
    That is, for the &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-125-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;th point, let &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-126-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; be the sum of the two coordinates excluding the &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-127-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;th.
  &lt;/li&gt;
&lt;li&gt;The covariance matrix of &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-128-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; will be equal to &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-129-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mfrac&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mi&gt;M&lt;/mi&gt;&lt;/mfrac&gt;&lt;mi&gt;I&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, where
    &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-130-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;I&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is the inertia tensor of the object around its center of mass. All off-diagonal elements of &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-131-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;I&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;
    and the covariance matrix will be zero.&lt;/li&gt;
&lt;/ol&gt;
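&lt;p&gt;The steps above can be checked with a quick Monte Carlo sketch. The "object" here is a hypothetical cloud of 50 point masses invented for the example, so sampling by density becomes sampling a point with probability proportional to its mass. The helper &lt;code&gt;inertia_tensor&lt;/code&gt; is not from any library; it is just the usual point-mass formula written out.&lt;/p&gt;

```python
import numpy as np

rng = np.random.default_rng(0)

# Step 1: a hypothetical asymmetric object made of 50 point masses.
positions = rng.normal(size=(50, 3)) * np.array([3.0, 1.0, 0.5])
positions[:, 1] += 0.8 * positions[:, 0]  # skew it so it is not axis-aligned
masses = rng.uniform(0.5, 2.0, size=50)
M = masses.sum()

def inertia_tensor(r, m):
    # I = sum_k m_k * (|r_k|^2 * identity - r_k r_k^T)
    sq = (r ** 2).sum(axis=1)
    return (m * sq).sum() * np.eye(3) - (r * m[:, None]).T @ r

# Step 2: center on the center of mass and rotate onto the principal axes.
com = (masses[:, None] * positions).sum(axis=0) / M
centered = positions - com
_, axes = np.linalg.eigh(inertia_tensor(centered, masses))
coords = centered @ axes
I = inertia_tensor(coords, masses)  # diagonal up to rounding

# Steps 3 and 4: per trial, sample three independent points weighted by mass.
n = 200_000
idx = rng.choice(50, size=(n, 3), p=masses / M)
X = coords[idx]  # X[t, i, j] is coordinate j of point i in trial t

# Step 5: Y_i is the sum of the two coordinates of point i other than the i-th.
Y = np.stack([
    X[:, 0, 1] + X[:, 0, 2],
    X[:, 1, 0] + X[:, 1, 2],
    X[:, 2, 0] + X[:, 2, 1],
], axis=1)

# Step 6: compare the sample covariance of Y with I / M.
cov_Y = np.cov(Y, rowvar=False)
print(np.round(cov_Y, 2))
print(np.round(I / M, 2))
```

&lt;p&gt;Up to sampling noise, both printed matrices should be diagonal with matching entries.&lt;/p&gt;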
&lt;p&gt;Although this involves a lot of mathematical steps, there is nothing particularly convoluted here: all the steps, such as
choosing an appropriate coordinate system for the object, sampling
random points, and using sums of coordinates to generate random variables, feel fairly natural.
So we've finally found a covariance matrix equal to &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-132-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mfrac&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mi&gt;M&lt;/mi&gt;&lt;/mfrac&gt;&lt;mi&gt;I&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; for any object!
&lt;/p&gt;
&lt;h3&gt;Decorrelation Transform&lt;/h3&gt;
&lt;p&gt;Although this technically works for any object, it's a shame we had to force the off-diagonal elements to be zero.
  In general, covariance matrices don't have all zeros off the diagonals. But there's good news: just as we used
  principal axes to transform the inertia tensor and remove its off-diagonal elements, we can use a &lt;strong&gt;decorrelation transform&lt;/strong&gt;
  to remove the off-diagonal terms of the covariance matrix.
&lt;/p&gt;
&lt;p&gt;And there's even better news: the decorrelation transform consists of changing into the basis of the eigenvectors 
  of the covariance matrix, just as we found a new coordinate system using the eigenvectors (principal axes)
  of the inertia tensor.[&lt;a href="https://www.harysdalvi.com/blog/covariance-rotational-inertia/#reflist"&gt;3&lt;/a&gt;] And the reason we know we can use eigenvectors to remove off-diagonal terms (diagonalize the matrix)
  is that the inertia tensor and covariance matrix are both &lt;strong&gt;symmetric&lt;/strong&gt;.[&lt;a href="https://www.harysdalvi.com/blog/covariance-rotational-inertia/#reflist"&gt;4&lt;/a&gt;] This goes back to one of the
  similarities we observed at the beginning!
&lt;/p&gt;
&lt;p&gt;So while both inertia tensors and covariance matrices have nonzero elements off the diagonals in general, we can apply
  analogous transformations to both and remove those elements. Then we can find that with our desired random variables,
  the two are equal! (Although we might have to scale the random variables.)
&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Although they come from different fields, the inertia tensor and the covariance matrix have some superficially
  similar properties. Is there an interpretation in which the two are essentially the same?
&lt;/p&gt;
&lt;p&gt;We tried one interpretation by randomly sampling a point within an object, and using the coordinates of
  the random point to generate random variables. The only restriction on our coordinate system was that
  we set the origin at the object's center of mass.
  But we found that the covariance matrix of these random variables
  had different off-diagonal elements from the inertia tensor, so this was unreliable for objects that are asymmetric
  across the coordinate planes. We were forced to set the off-diagonal elements to zero (as with a symmetric object) and sample
  three independent points in order for the equality to hold.
&lt;/p&gt;
&lt;p&gt;Then we found a solution that works for any shape. By choosing the principal axes of the object as our
  coordinate system, we get all the off-diagonal elements equal to zero, which matches the working symmetric case.
  If we also set the object's center of mass as our origin, the same random sampling procedure as the symmetric
  case gives us a covariance matrix equal to &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-133-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mfrac&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mi&gt;M&lt;/mi&gt;&lt;/mfrac&gt;&lt;mi&gt;I&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, where &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-134-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;M&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is the object's mass and &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-135-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;I&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;
  is its inertia tensor!
&lt;/p&gt;
&lt;p&gt;Just as we set the off-diagonal elements of the inertia tensor to zero using principal axes, we can
  do the same to a covariance matrix using eigenvectors. In fact, the principal axes are the eigenvectors
  of the inertia tensor, so these two transformations are intimately related. In the covariance case,
  this transformation is known as the decorrelation transform.
&lt;/p&gt;
&lt;p&gt;Even if we choose the wrong object or the wrong coordinate system, there is still a cop-out way to
  get the inertia tensor to match the covariance matrix. We can choose any means we like for our three random variables,
  and enforce the covariance matrix to be &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-136-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mfrac&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mi&gt;M&lt;/mi&gt;&lt;/mfrac&gt;&lt;mi&gt;I&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;. Although these random variables aren't physically interpretable,
  they technically fit the bill of three random variables with a covariance matrix equal to &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-137-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mfrac&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mi&gt;M&lt;/mi&gt;&lt;/mfrac&gt;&lt;mi&gt;I&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;
  for any object!
&lt;/p&gt;
&lt;p&gt;Although the connection between the inertia tensor and the covariance matrix might seem suggestive at first,
  it actually required a lot of steps to get there: using physics, statistics, and even linear algebra.
  In the end, we found a connection that works for any shape, showing the power of picking the right coordinate system
  and transforming mathematical objects in the right way. I hope you enjoyed discovering this interesting curiosity of
  classical mechanics and statistics!
&lt;/p&gt;
&lt;hr/&gt;
&lt;h2&gt;Related Posts&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.harysdalvi.com/blog/2202"&gt;2+3=4&lt;/a&gt; — Why math seems to describe the physical world so well&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.harysdalvi.com/uncertainty"&gt;Uncertainty Demo&lt;/a&gt; — A simulation of propagation of uncertainty as applied to physics, and an analysis of random error in the simulation&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="ref"&gt;References&lt;/h2&gt;
&lt;ol id="reflist"&gt;&lt;li&gt;&lt;a href="https://physics.stackexchange.com/questions/60843/whats-the-physical-significance-of-the-off-diagonal-element-in-the-matrix-of-mo" target="_blank"&gt;What's
  the physical significance of the off-diagonal element in the matrix of moment of inertia&lt;/a&gt;
  (safkan, Physics Stack Exchange, 2015) &lt;/li&gt;&lt;li&gt;&lt;a href="https://ocw.mit.edu/courses/16-07-dynamics-fall-2009/dd277ec654440f4c2b5b07d6c286c3fd_MIT16_07F09_Lec26.pdf" target="_blank"&gt;3D Rigid Body Dynamics: The Inertia Tensor&lt;/a&gt;
  (J. Peraire, S. Widnall, MIT OCW, 2008) &lt;/li&gt;&lt;li&gt;&lt;a href="https://courses.media.mit.edu/2010fall/mas622j/whiten.pdf" target="_blank"&gt;Decorrelating and then Whitening data&lt;/a&gt;
  (Rosalind W. Picard, MIT, 2010) &lt;/li&gt;&lt;li&gt;&lt;a href="https://www.math.ucdavis.edu/~linear/old/notes22.pdf" target="_blank"&gt;Diagonalizing Symmetric Matrices&lt;/a&gt;
  (UC Davis) &lt;/li&gt;&lt;/ol&gt;</content></entry><entry><title>Walking Across Rhode Island</title><summary>My experience walking 29 miles across Rhode Island</summary><link href="https://www.harysdalvi.com/blog/walking-across-rhode-island/" rel="alternate"/><id>https://www.harysdalvi.com/blog/walking-across-rhode-island/</id><published>2023-10-01T16:38:07Z</published><updated>2023-10-01T16:38:07Z</updated><content type="html" xml:lang="en">&lt;ul&gt;
&lt;b&gt;Notes:&lt;/b&gt;
&lt;li&gt;This walk was for a &lt;a href="https://www.every.org/givewell-top-charities-fund/f/the-walk-help-save-lives" target="_blank"&gt;fundraiser&lt;/a&gt;
for malaria prevention and medication in some of the world's poorest areas. If you're interested in supporting the fundraiser,
it would really make a difference and help save lives!&lt;/li&gt;
&lt;li&gt;This post was edited with &lt;a href="https://www.pagethinker.com" target="_blank"&gt;PageThinker&lt;/a&gt;, an AI writing assistant.
&lt;/li&gt;
&lt;/ul&gt;&lt;br/&gt;
&lt;p&gt;On October 8, 2023 at 7:11 in the morning, I took this picture, naïvely excited for the journey ahead.
&lt;/p&gt;
&lt;figure class="middleimg"&gt;
&lt;img alt="" src="https://www.harysdalvi.com/blog/walking-across-rhode-island/sterling.jpeg"/&gt;
&lt;figcaption&gt;The empty roads of Sterling, CT, on a bright and innocent fall morning&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;“On October 9, 2022, I walked across the entire state of Rhode Island,” said Charles Alaimo
in &lt;a href="https://walkacrossrhodeisland.wordpress.com/2023/09/09/why-you-should-walk-across-rhode-island/" target="_blank"&gt;his
essay on the subject&lt;/a&gt;. “I’m going again this year and I want to convince you to join me.”
And convince me he did: I woke up early and dragged myself out of bed to Brown's Main Green to
catch the school bus to Connecticut. I was ready to walk across Rhode Island. After all, I had
already walked a half marathon at the University of Florida. How hard could 29 miles be?
&lt;/p&gt;
&lt;p&gt;I knew it would be painful. I imagined myself reaching Providence with just a few miles to Massachusetts,
horribly tired, but pushing through. I imagined telling myself, “you're so close. Don't give up now.
Just push through these last few miles and you'll be done.” Then I imagined myself, feet aching and blistered,
ultimately reaching the finish line and celebrating with everybody else who
chose this adventure.
&lt;/p&gt;
&lt;p&gt;That was what I imagined.
&lt;/p&gt;
&lt;p&gt;The reality was quite different. In fact, I found myself with &lt;em&gt;twelve miles&lt;/em&gt; left to walk,
  already limping and in pain. Walking across Rhode Island is already a forbidding task: &lt;em&gt;limping&lt;/em&gt;
  across Rhode Island is really unthinkable. And I couldn't even tell myself “you're so close”
  because by any measure, twelve miles to walk is &lt;em&gt;not&lt;/em&gt; “so close.”&lt;/p&gt;
&lt;p&gt;But I didn't know any of that was coming. I just got off the bus and started walking.
&lt;/p&gt;
&lt;figure class="twoside"&gt;
&lt;img alt="" src="https://www.harysdalvi.com/blog/walking-across-rhode-island/path.jpeg"/&gt;
&lt;img alt="" src="https://www.harysdalvi.com/blog/walking-across-rhode-island/pond.jpeg"/&gt;
&lt;figcaption&gt;Walking through the forest and stopping by a pond&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;figure class="middleimg"&gt;
&lt;img alt="" src="https://www.harysdalvi.com/blog/walking-across-rhode-island/cabin.jpeg"/&gt;
&lt;figcaption&gt;A small wooden cottage barely visible through the trees&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;We started off in an unspoiled New England forest, probably kicking some pebbles for the first time in human history.
  The landscape pretty quickly turned into rural America, scattered with farms and small private properties.
  Some of these properties were little more than small cottages in the woods, reminiscent of
  &lt;a href="https://www.walden.org/property/thoreaus-cabin-site/" target="_blank"&gt;Henry David Thoreau's
  cabin&lt;/a&gt; at Walden Pond.
&lt;/p&gt;
&lt;figure class="endfloat"&gt;
&lt;img alt="" src="https://www.harysdalvi.com/blog/walking-across-rhode-island/goat.jpeg"/&gt;
&lt;figcaption&gt;A goat behind a fence&lt;/figcaption&gt;
&lt;img alt="An orange sign that says No Poachin' No Tresspassin' No Nuthin'. This applies to friends, relatives, enemies and YOU. Violators (survivors) will be prosecuted." src="https://www.harysdalvi.com/blog/walking-across-rhode-island/sign.jpeg"/&gt;
&lt;figcaption&gt;A notice put up by someone who really didn't want trespassing&lt;/figcaption&gt;
&lt;img alt="" src="https://www.harysdalvi.com/blog/walking-across-rhode-island/reservoir.jpeg"/&gt;
&lt;figcaption&gt;A large reservoir we passed by on our walk in Scituate, RI&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;A few people stopped in their cars, absolutely perplexed as to what a large group of young people might be doing
  walking in such remote areas. We told them: “We are walking across Rhode Island.”
&lt;/p&gt;
&lt;p&gt;
  This usually didn't clear up their confusion. “The entire state?”&lt;/p&gt;
&lt;p&gt;“Yes,” we said. “The entire state.” We also shared our
  &lt;a href="https://www.every.org/givewell-top-charities-fund/f/the-walk-help-save-lives" target="_blank"&gt;fundraiser&lt;/a&gt;
  with them. One man in particular said he didn't have any cash at the moment, but he promised to
  send some money when he got the chance, and congratulated us on undertaking such a bizarre but impressive endeavor.
&lt;/p&gt;
&lt;p&gt;As time went on, my feet started to hurt more and more. I relished every time we stopped for a snack break,
  and dreaded every time we got up and continued walking. Our group split in two as the more experienced walkers
  went ahead, and I struggled to keep up. I was consistently
  shocked by the determination of the other group: it was almost 2:00 in the afternoon, and they still hadn't
  stopped for lunch or even taken a break in recent memory. Even when we tried to run, the distance between us remained constant.
  As a physics major, I couldn't wrap my head around the physics behind that.
&lt;/p&gt;
&lt;p&gt;My resolve began to weaken, so I convinced myself (and those around me) that if we could stop for lunch,
  sit down and relax for just a half hour, I would be able to regain my strength and walk the 15 or so miles to
  Massachusetts. After what seemed like five hours, the group ahead finally stopped for lunch in a small plaza
  at the western edge of Cranston, RI. We took every little shortcut we could find,
  probably cutting about 10 steps off our journey, and finally made it.
&lt;/p&gt;
&lt;p&gt;When I sat down in the Subway, I think I actually breathed an audible sigh of relief. After resting for a while,
  I got up, expecting to feel rejuvenated.
&lt;/p&gt;
&lt;p&gt;In fact, that's when the limping started. Somehow, sitting down had actually made my pain &lt;em&gt;worse&lt;/em&gt;,
  if that was even possible. At that point, walking to the bathroom was a struggle, so the thought of walking the remaining 12 miles
  across Rhode Island was unbearable. That's when I was really ready to quit. I texted my friends about my progress so far
  and I was tempted to tell them that I wasn't sure if I could do it, that I was considering just calling
  an Uber back to campus. But I held myself back because I knew that if I told them I was ready to quit,
  then in some crucial way it would become true.
&lt;/p&gt;
&lt;p&gt;So instead of convincing myself that resting would make me feel better, I convinced myself that walking would
  help me forget about the pain. And that was true, to an extent. Walking brought a different kind of pain than
  that of sitting and feeling your blood pump through your poor overworked feet.
&lt;/p&gt;
&lt;p&gt;It felt great when we finally reached Providence; that is, until we realized just how big the city is.
  It was a five-mile walk to India Point Park near campus, and even if we made it that far, we still had to walk
  all the way to Massachusetts. This moment, above all others, stands out to me as one where I could not have
  made it without others by my side.
&lt;/p&gt;
&lt;p&gt;
  We were still behind, so we tried running, but to no avail. So we discussed
  Aesop's famous fable and decided to take the strategy of the tortoise. Somehow that allowed us to
  catch up to the rest of the group. Again, I'm not sure if the physics on that adds up.
&lt;/p&gt;
&lt;figure class="middleimg"&gt;
&lt;img alt="" src="https://www.harysdalvi.com/blog/walking-across-rhode-island/river.jpeg"/&gt;
&lt;figcaption&gt;Sunset while crossing the Providence River (or, as it felt at the time, the Rubicon)&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;As we neared Massachusetts, we crossed the marathon mark at 26.2 miles. Now, limping though I was,
  I could finally use the “so close” argument I had prepared for myself before setting out
  on this fool's errand. But the rest of the group seemed to march on steadily, while I felt completely defeated,
  even this close to the finish line. I asked myself the question: will I make it to Massachusetts?
  Barring some borderline impossible quantum fluctuations, I knew the question had a definite yes or no answer,
  one that was knowable in principle:
  I just wasn't sure which it was. I decided that statistically speaking, given everything I knew, the answer was most likely yes.
&lt;/p&gt;
&lt;p&gt;We crossed from Providence into East Providence, which borders Massachusetts. Soon we had just 15 minutes left to walk
  according to Google Maps. But I simply couldn't take it anymore. I did a short sprint to catch up to the leading group
  and begged everyone, almost literally on my knees, to take a break for just two minutes.
&lt;/p&gt;
&lt;p&gt;Surprisingly, it worked. We rested for exactly two minutes on the clock before continuing our walk.
  Those last 15 minutes were some of the most painful of my life. But for the first time in many miles,
  I knew for a fact that I wasn't going to quit.
&lt;/p&gt;
&lt;p&gt;When we finally reached Seekonk, MA, the vibe was incredible. It was around 7:45 in the evening, almost 13 hours after
  we started our trek. As each group of walkers crossed the state border, everyone cheered and we all knew that despite all the
  pain during and after the journey, it was so worth it. Best of all, the man who promised to donate to our fundraiser
  drove by and personally congratulated us. We had met him about 8 miles back, so he must have really loved our work.
&lt;/p&gt;
&lt;p&gt;When I started, I imagined myself having a similar experience to
  &lt;a href="https://walkacrossrhodeisland.wordpress.com/2023/09/09/why-you-should-walk-across-rhode-island/" target="_blank"&gt;the
  one Charles described&lt;/a&gt;. While Charles mentioned being fatigued at India Point Park, I think what defined
  my experience was just how early I was ready to give up. It was precisely the moment in Subway when I got up
  and realized that far from being recovered, I was now limping. 17 miles in, with 12 to go,
  I felt completely hopeless in that state.
&lt;/p&gt;
&lt;p&gt;But I learned that when we have a clear goal, we humans are capable of pushing ourselves far beyond what we thought possible.
  In the productivity sphere in recent years, there's been a shift from big and abstract goals to consistent marginal
  improvements. James Clear, who wrote the book &lt;cite&gt;Atomic Habits&lt;/cite&gt; that exemplifies this trend, also wrote
  &lt;a href="https://jamesclear.com/marginal-gains" target="_blank"&gt;a post&lt;/a&gt; about a British bike team that always had the goal
  to win the Tour de France but never succeeded. It was only by using a system of marginal improvements, without changing
  their goal, that their new coach could lead them to victory.
&lt;/p&gt;
&lt;p&gt;The mathematician in me agrees — it's true that if you improve by 1% every day, you'll be almost 38 times better at the end
  of the year. But emotionally, I believe this is impossible without a cherished ideal looming over those 1% improvements,
  a reason to go through the trouble of putting one foot in front of the other 70,000 times.
  &lt;a href="http://www.harysdalvi.com/blog/2306" target="_blank"&gt;I previously wrote about this idea&lt;/a&gt; in the context of science,
  but it has never felt more real to me than on this walk. The marginal change was well-defined: a journey of 70,000 steps begins
  with just one. But what truly defined the journey was not the step count, or even the mile count, but the romantic ideal of
  walking across the entire state of Rhode Island. Even walking a marathon felt like a minor detail in comparison.
&lt;/p&gt;
&lt;p&gt;So to the British bike coach, I say that it's not just all the marginal gains that led to a victory, although they are
  certainly important. I would bet that the new coach also made his team &lt;em&gt;believe&lt;/em&gt; in winning the race, not as a mere
  abstraction, but as an exciting ideal to strive toward. For me, it's not about optimizing my individual steps
  — now that I've walked across Rhode Island, my next goal is to see what crazy ridiculous thing I can do next. Something so
  crazy and ridiculous that I might just want to do it.
&lt;/p&gt;
&lt;figure class="middleimg"&gt;
&lt;img alt='A green sign at night that says "Entering Seekonk Massachusetts"' src="https://www.harysdalvi.com/blog/walking-across-rhode-island/border.jpeg"/&gt;
&lt;figcaption&gt;The border between Rhode Island and Massachusetts&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;figure class="middleimg"&gt;
&lt;img alt="" src="https://www.harysdalvi.com/blog/walking-across-rhode-island/map.jpg"/&gt;
&lt;figcaption&gt;A map of our route&lt;/figcaption&gt;
&lt;/figure&gt;</content></entry><entry><title>BardGPT</title><summary>I made a miniature GPT model extended with a rhyme and meter model for generating poetry. This is what it taught me about the future of language models</summary><link href="https://www.harysdalvi.com/bardgpt/" rel="alternate"/><id>https://www.harysdalvi.com/bardgpt/</id><published>2023-07-09T16:38:07Z</published><updated>2023-07-09T16:38:07Z</updated><content type="html" xml:lang="en">&lt;p&gt;I asked ChatGPT to give me some Spanish words that end in the letter I. It gave me the following:&lt;/p&gt;
&lt;ol lang="es"&gt;
&lt;li&gt;Amistad&lt;/li&gt;
&lt;li&gt;Felicidad&lt;/li&gt;
&lt;li&gt;Abril&lt;/li&gt;
&lt;li&gt;Café&lt;/li&gt;
&lt;li&gt;Bolígrafo&lt;/li&gt;
&lt;li&gt;Lápiz&lt;/li&gt;
&lt;li&gt;Universitari&lt;/li&gt;
&lt;li&gt;Leviatán&lt;/li&gt;
&lt;li&gt;Reír&lt;/li&gt;
&lt;li&gt;Baladí&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Of these, only &lt;i&gt;universitari&lt;/i&gt; and &lt;i lang="es"&gt;baladí&lt;/i&gt; actually end in I. Furthermore, &lt;i&gt;universitari&lt;/i&gt;
isn't a Spanish word at all — it's Catalan. ChatGPT admittedly performed better with the same exercise in
English, but this serves to highlight an important fact about large language models as they are currently built:
they only have access to the words themselves, and have &lt;strong&gt;no idea what the words sound like&lt;/strong&gt; or even how they're spelled beyond
what is explicitly stated in their training data. How can we effectively add this information to language models?
&lt;/p&gt;
&lt;figure class="endfloat"&gt;
&lt;img alt="Portrait of William Shakespeare" src="https://www.harysdalvi.com/bardgpt/img/Shakespeare.jpg"/&gt;
&lt;figcaption&gt;William Shakespeare, also known as “the Bard”&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;This leads into an even more important point. &lt;a href="https://www.harysdalvi.com/blog/can-transformers-solve-everything/"&gt;Scaling language models&lt;/a&gt; is great, but it comes with an enormous
computational and environmental cost. How can we add more information into language models and achieve better results
while managing the computational power required? In other words, &lt;strong&gt;how do we design language models
more intelligently?&lt;/strong&gt;
&lt;/p&gt;
&lt;p&gt;That's why I decided to make BardGPT, &lt;strong&gt;a miniature GPT model for generating poetry&lt;/strong&gt;. BardGPT is also an exercise in extending the power of language models
by giving them access to information beyond syntax and semantics. It uses information about the phonology of words to
generate poetry more accurately than a plain transformer model while increasing the number of parameters only marginally.
&lt;/p&gt;
&lt;h2&gt;How it Works&lt;/h2&gt;
&lt;h3&gt;Data&lt;/h3&gt;
&lt;p&gt;Data for this model was sourced entirely from Project Gutenberg. In retrospect, this was a little flawed just because of the
  limited quantity and diversity of sources there. I only reached about 12 MB of raw data, which I then had to further process
  and condense. In the future I would like to use a larger and more diverse set of sources, hopefully with web scraping.
  This would also allow a larger model, because computing power wasn't the limiting factor here: overfitting was.
&lt;/p&gt;
&lt;p&gt;With that said, I did try a larger dataset from Kaggle. I found that the resulting poems were actually far worse because
  the Kaggle dataset had a lot of low-quality data. After finishing this project, I published
  &lt;a href="https://www.kaggle.com/datasets/crackalamoo/project-gutenberg-poems" target="_blank"&gt;my own dataset&lt;/a&gt; on Kaggle.
&lt;/p&gt;
&lt;p&gt;Once I had the data, there was a significant amount of &lt;strong&gt;preprocessing&lt;/strong&gt; to be done. Most importantly, I used special tokens
  for common suffixes such as &lt;code&gt;=ing&lt;/code&gt; (&lt;code&gt;run =ing -&amp;gt; running&lt;/code&gt;) and &lt;code&gt;=ed&lt;/code&gt; (&lt;code&gt;use =ed -&amp;gt; used&lt;/code&gt;).
  This allowed a larger effective vocabulary size so I wouldn't need to store two separate tokens for &lt;code&gt;run&lt;/code&gt; and &lt;code&gt;running&lt;/code&gt;
  when the two ideas are related in a predictable way. Another trick to manage the vocabulary size was to make
  everything lowercase.
&lt;/p&gt;
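&lt;p&gt;As a rough illustration of the suffix-token idea, here is a minimal sketch of how a stream of tokens could be joined back into words. The function names and the spelling rules are my assumptions for illustration, not the code in the repository, and the rules are deliberately simplified (they only cover a dropped silent &lt;code&gt;e&lt;/code&gt; and a doubled final consonant).&lt;/p&gt;

```python
import re

# Hypothetical suffix tokens, modeled on the =ing and =ed examples above.
SUFFIXES = ("=ing", "=ed")


def attach_suffix(word, suffix_token):
    """Join a base word and a suffix token using simplified spelling rules."""
    suffix = suffix_token[1:]  # drop the leading "="
    # Drop a silent final "e": use + ed gives used.
    if word.endswith("e") and not word.endswith("ee"):
        return word[:-1] + suffix
    # Double the final consonant of short consonant-vowel-consonant words:
    # run + ing gives running.
    if re.fullmatch(r"[^aeiou]*[aeiou][^aeiouwxy]", word):
        return word + word[-1] + suffix
    return word + suffix


def detokenize(tokens):
    """Merge suffix tokens into the word that precedes them."""
    words = []
    for token in tokens:
        if token in SUFFIXES and words:
            words[-1] = attach_suffix(words[-1], token)
        else:
            words.append(token)
    return " ".join(words)


print(detokenize(["the", "bard", "was", "run", "=ing", "and", "use", "=ed", "ink"]))
# the bard was running and used ink
```

&lt;p&gt;The benefit is on the vocabulary side: one token for the base word plus a handful of shared suffix tokens can stand in for many inflected forms.&lt;/p&gt;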
&lt;p&gt;I also developed &lt;strong&gt;my own functions to determine rhymes and syllable counts&lt;/strong&gt;. Initially I tried the
  &lt;a href="https://pypi.org/project/eng-to-ipa/"&gt;eng-to-ipa&lt;/a&gt; Python module, but this was far too slow and didn't work for
  less common words. My functions sacrificed a small amount of accuracy for far greater speed and domain specificity.
  Check the GitHub repository &lt;a href="https://github.com/crackalamoo/bardgpt" target="_blank"&gt;crackalamoo/bardgpt&lt;/a&gt; for more details on how
  I stored information on rhyme and “meter”, loosely speaking.
&lt;/p&gt;
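&lt;p&gt;To give a feel for the speed-versus-accuracy tradeoff, here is a minimal sketch of spelling-based heuristics for syllable counts and rhymes. These are not the functions from the repository; the names &lt;code&gt;count_syllables&lt;/code&gt;, &lt;code&gt;rhyme_key&lt;/code&gt;, and &lt;code&gt;rhymes&lt;/code&gt; and the rules inside them are assumptions chosen for illustration.&lt;/p&gt;

```python
import re


def count_syllables(word):
    """Estimate syllables by counting vowel groups (a rough heuristic)."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # A final "e" after a consonant is usually silent ("love"),
    # except in consonant + "le" endings ("table").
    silent_e = re.search(r"[^aeiouy]e$", word) and not re.search(r"[^aeiouy]le$", word)
    if silent_e and count != 1:
        count -= 1
    return max(count, 1)


def rhyme_key(word):
    """Return the spelling from the last vowel group onward, ignoring a silent e."""
    word = word.lower()
    if re.search(r"[^aeiouy]e$", word):
        word = word[:-1]
    match = re.search(r"[aeiouy]+[^aeiouy]*$", word)
    return match.group(0) if match else word


def rhymes(a, b):
    """Treat two different words as rhyming when their rhyme keys match."""
    return a.lower() != b.lower() and rhyme_key(a) == rhyme_key(b)


print(count_syllables("love"), count_syllables("table"), count_syllables("returning"))
# 1 2 3
print(rhymes("bright", "night"), rhymes("love", "above"), rhymes("through", "anew"))
# True True False
```

&lt;p&gt;The last result shows the cost of working from spelling alone: “through” and “anew” do rhyme, but their endings are spelled differently, so this sketch misses them. A pronunciation dictionary would catch that case at the price of speed and coverage of rare words.&lt;/p&gt;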
&lt;p&gt;Finally, I specified some tokens that the model should not be allowed to predict, even though they are among the most popular tokens
  in the dataset. This mainly included proper nouns such as “Lenore” (from Edgar Allan Poe's famous poem “The Raven”).
  Generally I kept only those proper nouns that have great significance outside of the literary work they are associated with, so
  Lenore was out but Jesus made the cut. Some cases were more tricky, like Achilles from Homer's &lt;cite&gt;Iliad&lt;/cite&gt;.
&lt;/p&gt;
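&lt;p&gt;One common way to stop a model from predicting certain tokens is to mask their scores before the softmax. The sketch below assumes that approach; the post does not say how the ban was actually implemented, and the tiny vocabulary and scores are made up.&lt;/p&gt;

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mask_banned(logits, vocab, banned):
    """Replace the score of every banned token with negative infinity."""
    return [
        float("-inf") if token in banned else logit
        for token, logit in zip(vocab, logits)
    ]


# Hypothetical vocabulary and scores, for illustration only.
vocab = ["the", "raven", "lenore", "jesus"]
logits = [2.0, 1.0, 3.0, 0.5]
banned = {"lenore"}

probs = softmax(mask_banned(logits, vocab, banned))
for token, p in zip(vocab, probs):
    print(token, round(p, 3))
```

&lt;p&gt;Because the exponential of negative infinity is zero, the banned token ends up with probability exactly zero and the remaining probabilities still sum to one.&lt;/p&gt;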
&lt;h3&gt;Model Architecture&lt;/h3&gt;
&lt;figure class="middleimg"&gt;
&lt;img alt="Diagram of the model. The inputs split into two branches: one with a GPT-style decoder, and the other with a rhyme/meter model. The rhyme/meter model has separate rhyme and meter encodings which are processed through linear layers and concatenation. The final results of the GPT and rhyme/meter branches are added together before a final softmax function." class="padded-diagram" src="https://www.harysdalvi.com/bardgpt/img/model.svg"/&gt;
&lt;figcaption&gt;A schematic of the BardGPT architecture, made with &lt;a href="https://www.draw.io" target="_blank"&gt;draw.io&lt;/a&gt;.
  Rounded rectangles indicate trainable operations, while sharp rectangles indicate fixed operations.&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;This schematic mostly covers the big picture of the BardGPT architecture.
  Check the GitHub repository &lt;a href="https://github.com/crackalamoo/bardgpt" target="_blank"&gt;crackalamoo/bardgpt&lt;/a&gt; for more
  technical details on the components of the model. There are just a few important points I would like to add here.
&lt;/p&gt;
&lt;p&gt;First, I tested &lt;strong&gt;three different architectures&lt;/strong&gt; in this project.
  The simplest one was an n-gram model that tries to predict the next
  word given the last three words using only a set of linear layers. The next was a transformer model, which is based on
  a GPT architecture and resembles the schematic shown here but with the rhyme/meter sections cut out. The final model
  is the one shown here, which is a transformer model with additional phonological information.
&lt;/p&gt;
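&lt;p&gt;The key step in the schematic is that the outputs of the two branches are added together before the final softmax. Here is a minimal sketch of that step with made-up numbers. The vocabulary, the scores, and the scenario (a line that has already used up its syllables) are all assumptions for illustration, and &lt;code&gt;NEWLINE&lt;/code&gt; stands in for the model's newline token.&lt;/p&gt;

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical four-token vocabulary; NEWLINE stands in for the newline token.
vocab = ["NEWLINE", "and", "the", "vain"]

# Made-up scores from the GPT-style branch, which sees only the words.
gpt_logits = [0.5, 1.5, 1.2, 0.3]

# Made-up scores from the rhyme/meter branch for a line that is already long:
# it pushes toward ending the line and away from adding more words.
meter_logits = [2.0, -0.5, -0.5, 0.0]

# The two branches are summed elementwise before the final softmax.
combined = [g + m for g, m in zip(gpt_logits, meter_logits)]

probs_gpt = softmax(gpt_logits)
probs_bard = softmax(combined)

print(vocab[probs_gpt.index(max(probs_gpt))])    # and
print(vocab[probs_bard.index(max(probs_bard))])  # NEWLINE
```

&lt;p&gt;Adding scores rather than probabilities means the phonological branch can nudge the prediction without overriding the language model entirely.&lt;/p&gt;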
&lt;p&gt;It's also interesting to note the &lt;strong&gt;number of parameters&lt;/strong&gt; for each model:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;n-gram: 24.1 million&lt;/li&gt;
&lt;li&gt;Transformer: 38.9 million&lt;/li&gt;
&lt;li&gt;Bard: 39.4 million&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The rhyme and meter layers of the bard model only add about 500,000 parameters, which is a small fraction of the total.
&lt;/p&gt;
&lt;h2&gt;Results&lt;/h2&gt;
&lt;blockquote&gt; ༄༅༅ the frozen pang&lt;br/&gt;
  o thou life! returning now to open suns, ascend&lt;br/&gt;
  to find unseen she dwelt! thy gleam thy bright loom act girl&lt;br/&gt;
  to ransom radiant order crown the righteous speed.&lt;br/&gt;
  loss of many whose gifts fail! string thou further,&lt;br/&gt;
  bring to spy my anguish ajax, king heir!&lt;br/&gt;
  thy elder burial hand must heart abhor,&lt;br/&gt;
  think me canst thou then time these cared forbear,&lt;br/&gt;
  thine sister here posterity, she need,&lt;br/&gt;
  and my loosened passion stoops in vain.&lt;br/&gt;
    — BardGPT
&lt;/blockquote&gt;
&lt;p&gt;Each model reached the following minimum perplexities:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;n-gram: 119.18&lt;/li&gt;
&lt;li&gt;Transformer: 83.23&lt;/li&gt;
&lt;li&gt;Bard: 80.18&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The bard model therefore had a &lt;strong&gt;3.7% improvement&lt;/strong&gt; over the transformer model, while adding &lt;strong&gt;only 1.3% more parameters&lt;/strong&gt;.
  But if you look at the actual poem the model generated, it doesn't seem to rhyme very well. Has the bard model
  really learned anything new?
&lt;/p&gt;
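&lt;p&gt;For reference, the percentages above can be reproduced from the numbers in the two lists. The sketch below also includes the usual definition of perplexity, assuming it is computed as the exponential of the average negative log-probability per token; the function names are mine.&lt;/p&gt;

```python
import math


def perplexity(token_probs):
    """Exponential of the average negative log-probability per token."""
    nll = [-math.log(p) for p in token_probs]
    return math.exp(sum(nll) / len(nll))


def relative_change(old, new):
    """Fractional change from old to new."""
    return (new - old) / old


# A model that gives every correct token probability 1/4 has perplexity 4.
print(round(perplexity([0.25, 0.25, 0.25, 0.25]), 2))  # 4.0

# Figures reported above: perplexity and parameter count in millions.
ppl_drop = -relative_change(83.23, 80.18)
param_growth = relative_change(38.9, 39.4)

print(f"{ppl_drop:.1%}")      # 3.7%
print(f"{param_growth:.1%}")  # 1.3%
```

&lt;p&gt;Lower perplexity is better: it roughly measures how many equally likely choices the model is hesitating between at each token.&lt;/p&gt;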
&lt;p&gt;I think I did not design a rhyme encoding that worked well enough to encourage the model
  to rhyme. Meter was more straightforward. Just by knowing how many syllables are in a line, &lt;strong&gt;the model can much better predict where
  to place &lt;code&gt;&amp;lt;newline&amp;gt;&lt;/code&gt; tokens&lt;/strong&gt;. To compare, here is a poem generated by the plain transformer model:&lt;/p&gt;
&lt;blockquote style="overflow-x: auto; white-space: nowrap;"&gt; ༄༅༅  haze
  footstep with a stray night mount went back long forward and down, one after plank and ten big star shone purple and austere,&lt;br/&gt;
  sunk in a shining arm, prone his labor fled.&lt;br/&gt;
  all gathered smoke came sympathy, by trenches devoting,&lt;br/&gt;
  thrusting over all the age in door, roots went,&lt;br/&gt;
  but unknown in happy rivulet, too soon began, that thou hast laughed re-indeed stirrest thy gates&lt;br/&gt;
  that is the stream whose ruddiest song are heard?
&lt;/blockquote&gt;
&lt;p&gt;Notice how &lt;strong&gt;some lines are much longer than others&lt;/strong&gt;. (You may have to scroll.) This is much less of an issue in the bard model.
&lt;/p&gt;
&lt;p&gt;And just for fun, here's a poem generated by the naive n-gram model:&lt;/p&gt;
&lt;blockquote style="overflow-x: auto; white-space: nowrap;"&gt; ༄༅༅  we as fold debate reside between with helmet to your city kingly words&lt;br/&gt;
  attend stood your wild island quarter&lt;br/&gt;
  that shing twice while alone, earth flashed heaven slept thou truth when beauty's order bled?&lt;br/&gt;
  crown over these run. but what, if the you knew."--say no more.&lt;br/&gt;
  no more together given round us draw once. prow " if aloft thou pledging wherein finds alone, and flashes at thy loose touch along his songs with fill been filled rigid retreats one to fetch afar;&lt;br/&gt;
  sun and since feel, like humble witness north, kneels: do they bring. she smiled&lt;br/&gt;
  to asks stand cap and queen of golden paint along her glade should strike love their age; perhaps thee troops fell is looked to ilion;&lt;br/&gt;
  accord of, i win&lt;br/&gt;
  before the king that sad bill lost blazing up against vain pursue;&lt;br/&gt;
  her spirit fire alone&lt;br/&gt;
  to solemn cottage sing sound foam all cow, bank move of this soul design, the neptune most bow of wield close of sleep.&lt;br/&gt;
  such heaven was jest, my flood, in death now the trade away, of sculptured scar swaying you are near&lt;br/&gt;
  even remains sweetly learning, race from heaven kiss,&lt;br/&gt;
  descend the conquering rainbow shone."&lt;br/&gt;
  why moved;&lt;br/&gt;
  nor passion! they their drunken met their charge ran upon what glass unknown before away, such fears which still rise away."
&lt;/blockquote&gt;
&lt;p&gt;So clearly the transformer model was a vast improvement on the n-gram model, and it seems the bard model was an improvement on the
  transformer model as well. But what are the implications of this improvement?
&lt;/p&gt;
&lt;h2&gt;Beyond Rhyming&lt;/h2&gt;
&lt;p&gt;Does a language model even need to know how to rhyme? I asked
  &lt;a href="https://huggingface.co/spaces/huggingface-projects/llama-2-7b-chat" target="_blank"&gt;Llama-2-7B&lt;/a&gt;
  to generate a poem:&lt;/p&gt;
&lt;blockquote&gt;A sunbeam's gentle touch&lt;br/&gt;
    A fleeting gift from &lt;b&gt;above&lt;/b&gt;,&lt;br/&gt;
    It lights the path I travel on,&lt;br/&gt;
    And fills my heart with joy and &lt;b&gt;love&lt;/b&gt;.&lt;br/&gt;
    The light it brings is pure and &lt;b&gt;bright&lt;/b&gt;,&lt;br/&gt;
    A beacon in the darkest &lt;b&gt;night&lt;/b&gt;,&lt;br/&gt;
    It guides me on my journey &lt;b&gt;through&lt;/b&gt;,&lt;br/&gt;
    And helps me find my inner light &lt;b&gt;anew&lt;/b&gt;.
  &lt;/blockquote&gt;
&lt;p&gt;With 7 billion parameters but no special architecture for rhyming, &lt;strong&gt;Llama-2-7B was able to generate a poem that rhymes
  better than BardGPT&lt;/strong&gt;. This is because in addition to the increase in parameters, Llama-2-7B was trained on a much larger
  dataset from which it can simply &lt;strong&gt;&lt;em&gt;memorize&lt;/em&gt; which rhymes are common in poetry&lt;/strong&gt;. BardGPT, on the other hand, was
  trained on a much smaller dataset and had far fewer parameters, so it was not able to memorize rhymes to this extent.
&lt;/p&gt;
&lt;p&gt;7 billion parameters isn't even that excessive for a large language model, so do we really need specific architecture
  for rhyming? And to raise an even bigger question, &lt;a href="https://www.harysdalvi.com/blog/can-transformers-solve-everything/"&gt;do we need any additional model architecture at all&lt;/a&gt;, or &lt;strong&gt;can we rely on
  scaling models and discovering emergent properties?&lt;/strong&gt;
&lt;/p&gt;
&lt;p&gt;At the end of the day, a large language model is just a mathematical algorithm. A more sophisticated algorithm can
  do more sophisticated tasks, in principle including coding, mathematics, rhyming, and anything else
  as long as it has the relevant data and parameters. So perhaps we don't really need specific architecture for rhyme and meter.
&lt;/p&gt;
&lt;p&gt;But to perform extremely sophisticated tasks, approaching artificial superintelligence, we need a model with
  extremely sophisticated data and parameters. This requires an immense amount of computational power and
  results in a black box model that is difficult to understand and control. Rather than going down this route,
  I think it's important to look at ways we can improve large language models without just adding more data and parameters.
  Transformers were a huge step forward in this regard, and
  &lt;a href="https://www.harysdalvi.com/blog/llms-dont-predict-next-word"&gt;reinforcement learning from human feedback&lt;/a&gt; was another.
  Surely there are many more to come.
&lt;/p&gt;
&lt;p&gt;This approach of making more sophisticated models has &lt;strong&gt;three main benefits&lt;/strong&gt; over simply scaling the models:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;If done well, it allows us to achieve better results with less computational power and less environmental impact.&lt;/li&gt;
&lt;li&gt;With multimodal learning, it can expand the capabilities of our best AI models beyond language.&lt;/li&gt;
&lt;li&gt;It improves the interpretability of the models, which is increasingly important as models develop human-like intelligence
    and may one day surpass us while we still have little understanding of how they work.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;It might seem difficult, and it is, but consider this: humans are still smarter than AI, but the energy expenditure of
  a human brain is much less than that of a ChatGPT supercomputer. This means it &lt;em&gt;can&lt;/em&gt; be done: we humans just need to
  use our brains and find out how to do it responsibly, before our brains themselves lose their spot at the top of the
  intelligence ladder.
&lt;/p&gt;
&lt;p&gt;And of course, we can't ignore &lt;strong&gt;the question of whether we should try to make such a powerful model at all&lt;/strong&gt;. I don't have an answer
  to that, because its impacts would surely be far-reaching both positively and negatively. But I think &lt;strong&gt;if it can be done, it will be
  done&lt;/strong&gt;, so it's better to do it in a thoughtful way.
&lt;/p&gt;
&lt;p&gt;Among the poets in the dataset for BardGPT were Lord Byron and Percy Bysshe Shelley. It's interesting that Lord Byron's daughter, Ada Lovelace,
  was a central figure in the early development of computer science and some even say she wrote the first computer program.
  Percy Shelley's wife, Mary Shelley, wrote &lt;cite&gt;Frankenstein&lt;/cite&gt;, a cautionary tale whose message we would do
  well to heed in the AI race. As we develop more powerful AI, we have an opportunity to prevent an even worse environmental crisis
  and give humanity access to a benevolent intelligence at our fingertips. But it's possible that one day, &lt;strong&gt;we might have to get used to not being
  the smartest thing on the planet anymore&lt;/strong&gt;. I hope and expect that day will be one where we live in prosperity. It should be a day where we
  do not look back on the arrogant humanity of the past the way Percy Bysshe Shelley looked back on the arrogant pharaoh Ozymandias:&lt;/p&gt;
&lt;blockquote&gt;
  And on the pedestal, these words appear:&lt;br/&gt;
  My name is Ozymandias, King of Kings;&lt;br/&gt;
  Look on my Works, ye Mighty, and despair!&lt;br/&gt;
  Nothing beside remains. Round the decay&lt;br/&gt;
  Of that colossal Wreck, boundless and bare&lt;br/&gt;
  The lone and level sands stretch far away.
&lt;/blockquote&gt;
&lt;hr/&gt;
&lt;h2&gt;Related Posts&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.harysdalvi.com/blog/can-transformers-solve-everything"&gt;Can Transformers Solve Everything?&lt;/a&gt; — Looking into the math and the data reveals that transformers are both overused and underused.&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.harysdalvi.com/discordgpt"&gt;DiscordGPT&lt;/a&gt; — How to use data from Discord in a GPT model and talk to a simulation of you and your friends&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.harysdalvi.com/staylatt"&gt;Ortatious Andith ib Staylatt Neller&lt;/a&gt; — What does English sound like to people who don't speak it? And what does that have to do with machine learning and overfitting?&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="ref"&gt;References&lt;/h2&gt;
&lt;p&gt;The GitHub repository for this project is at &lt;a href="http://github.com/crackalamoo/bardgpt" target="_blank"&gt;crackalamoo/bardgpt&lt;/a&gt;.
This project also has a &lt;a href="https://huggingface.co/crackalamoo/bardgpt" target="_blank"&gt;Hugging Face page&lt;/a&gt;.
&lt;/p&gt;</content></entry><entry><title>On the Shoulders of Giants</title><summary>My reflections on a trip to Europe and the history of science and technology</summary><link href="https://www.harysdalvi.com/blog/shoulders-of-giants/" rel="alternate"/><id>https://www.harysdalvi.com/blog/shoulders-of-giants/</id><published>2023-06-10T16:38:07Z</published><updated>2023-06-10T16:38:07Z</updated><content type="html" xml:lang="en">&lt;p lang="fr"&gt;&lt;a href="https://www.harysdalvi.com/fr/blog/shoulders-of-giants"&gt;Lisez en français&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Note: this post was edited using AI.&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;
Americans dream about
taking double-decker buses to Big Ben, enjoying croissants and baguettes in the City of Love,
and walking past the Trevi Fountain on the way to the Colosseum.
This summer, I had the fortune of visiting London, Paris, Rome, and Sicily. Each proudly displayed
its culture and history — and there was a lot of it to appreciate, as
even the most storied cities in the United States don't have
the thousands of years people have been living and building in these places.
Between the British, French, and Roman
Empires, these countries have left a profound impact everywhere from &lt;em&gt;Roman&lt;/em&gt;ia
and &lt;em&gt;French&lt;/em&gt; Polynesia to New &lt;em&gt;England&lt;/em&gt; and &lt;em&gt;British&lt;/em&gt; Columbia.
The work of scientists like Isaac Newton, Marie Curie, and Galileo Galilei
continues to shape lives worldwide.
&lt;/p&gt;
&lt;p&gt;
How just a few countries in Western Europe came to influence so much of the world is
a subject that has been addressed by many historians. I am no historian, but as a
student of science and technology, I wanted to write about how learning this history on my trip
made me think about my role as a technical person in a highly uncertain 21st century.
Witnessing a &lt;strong&gt;rich legacy of science, and of honoring scientists,&lt;/strong&gt; was inspiring. Meanwhile,
I also thought about mistakes these countries made in their &lt;strong&gt;technological and economic hubris&lt;/strong&gt;
at the height of their power.
&lt;/p&gt;
&lt;h2&gt;Thinking Big&lt;/h2&gt;
&lt;figure class="endfloat"&gt;
&lt;img alt="" src="https://www.harysdalvi.com/blog/shoulders-of-giants/leonardo.jpeg"/&gt;
&lt;figcaption&gt;Museo/Mostra Leonardo da Vinci in Rome&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;While in Rome, I visited an interesting exhibit on Leonardo da Vinci.
Leonardo is the archetypical Renaissance man. In addition to painting the
Mona Lisa, probably the most famous work of art in the world, he experimented with anatomy and all kinds
of flying machines that were far ahead of his time. This is exemplified by a famous quote which
he probably never said:[&lt;a href="https://www.harysdalvi.com/blog/shoulders-of-giants/#reflist"&gt;1&lt;/a&gt;]&lt;/p&gt;
&lt;blockquote&gt;“Once you have tasted flight, you will forever walk the earth with your eyes turned skyward,
    for there you have been, and there you will always long to return.”
&lt;/blockquote&gt;
&lt;p&gt;For me, Leonardo da Vinci is &lt;strong&gt;a model for thinking big&lt;/strong&gt;. He had extensive knowledge across a wide range of subjects,
and was eager to learn more and make his mark. Instead of settling for improving horse-drawn carriages,
he wanted humans to thoroughly understand their own bodies and fly like birds for the first time. He was
always curious, and he was in the right place for that: Florence was the unparalleled center of the
Renaissance.
&lt;/p&gt;
&lt;p&gt;Despite his genius, history would not remember Leonardo in the same way if he had turned his attention
to improving horse-drawn carriages. This is partly because better carriages are less interesting than
flying machines or diving suits, but it is also because &lt;strong&gt;big and imaginative ideas capture the soul
in a way incremental steps cannot.&lt;/strong&gt; Smaller steps are no doubt important, but if we focused solely on improving planes, we would not have
stumbled upon a moon landing so quickly. We needed a literal
moonshot and a grand vision to convince the public that landing on the moon was possible (though
very difficult) and worth doing.
&lt;/p&gt;
&lt;p&gt;In modern science, the pursuit of disruptive moonshot ideas has become less common. While the 20th century
witnessed countless scientific revolutions that changed our
view of the world, from general relativity to the double helix, the pace of progress has seemingly slowed
despite the increase in funding. This is often attributed to a “low hanging fruit” scenario:
thought experiments may have been enough to spark the development of special relativity, but even Einstein
could not find a theory of everything by just thinking about it if he lived today.
Modern day disruptive ideas demand substantial funding and large teams.
The belief is that most of the less costly disruptive ideas have already been discovered.
&lt;/p&gt;
&lt;p&gt;However, Park et al. [&lt;a href="https://www.harysdalvi.com/blog/shoulders-of-giants/#reflist"&gt;2&lt;/a&gt;] challenge this “low hanging fruit”
theory. They observe this decline across scientific fields and contend that it is unlikely for all fields to find their “low hanging fruit”
at the same time. Instead, they claim that part of the reason for the decline is that &lt;strong&gt;researchers focus on very narrow
fields&lt;/strong&gt;, often citing older familiar works or even their own works rather than reading more broadly. This is an idea
that Leonardo would approve of: his art and science informed each other, especially as his artistic hand enabled him
to represent human anatomy more accurately, and his studies of anatomy in turn informed the realism in his artistic creations.
Embracing the Renaissance man ideal can serve as a model to think broadly and find disruptive ideas,
especially in a 21st century in need of such ideas.
&lt;/p&gt;
&lt;p&gt;Unfortunately, Leonardo is also an example of why thinking big is so hard. Many of his inventions simply
were not feasible with the technology of the time. This meant that although he had great ideas, he
often kept them to himself and his notebooks rather than implementing them in the real world,
so &lt;strong&gt;they failed to influence others as much as they could have.&lt;/strong&gt;
&lt;/p&gt;
&lt;figure class="endfloat startfloat narrow"&gt;
&lt;img alt="" src="https://www.harysdalvi.com/blog/shoulders-of-giants/archimedes.jpeg"/&gt;
&lt;figcaption&gt;
        Statue of Archimedes in modern Syracuse
    &lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;Archimedes was more successful in this regard. He is best known
for his observation while taking a bath and watching the water rise. This supposedly led him to realize how buoyancy works and run naked through
the streets of Syracuse, Sicily shouting “eureka!” (These are the same streets I saw when I stayed in Syracuse.)
But this was far from his only big idea. Another famous work of his
is &lt;cite&gt;The Sand Reckoner&lt;/cite&gt;, in which he works with very large numbers in an attempt to estimate the number
of grains of sand needed to fill the universe. His final estimate
(8 × 10&lt;sup&gt;63&lt;/sup&gt; [&lt;a href="https://www.harysdalvi.com/blog/shoulders-of-giants/#reflist"&gt;3&lt;/a&gt;]) was wildly inaccurate, but even thousands of years later
we respect his bold calculations despite his numerical answer.
&lt;/p&gt;
&lt;p&gt;While many of Leonardo's thoughts were hidden in his notebooks, &lt;strong&gt;Archimedes let everyone know his big ideas.&lt;/strong&gt; His thoughts
were disseminated through the Mediterranean of antiquity and have influenced scientists and mathematicians for generations.
Leonardo has also inspired many scientists and mathematicians, but often not until long after his death.
One of his notebooks was written around 1508, but was not discovered until 1690.[&lt;a href="https://www.harysdalvi.com/blog/shoulders-of-giants/#reflist"&gt;4&lt;/a&gt;]
&lt;/p&gt;
&lt;p&gt;Today, however, both Leonardo and Archimedes are honored for their novel ways of thinking
in science and have served as role models ever since. Paying homage to great minds from long ago
is a fun way to feel connected to history, as I did in Rome and Syracuse, but it also
has more direct and practical purposes that I hadn't thought much about before this trip.
It is by standing on the shoulders of giants that warring visions of the world may claim legitimacy
and pick up where previous generations left off.
&lt;br clear="all"/&gt;
&lt;/p&gt;
&lt;h2&gt;Shoulders of Giants&lt;/h2&gt;
&lt;p&gt;One thing I was impressed by in England and France was the extent to which they honor their scientists.
The single most famous landmark of Paris is the Eiffel Tower, which is mostly unadorned except for the names of
famous French scientists painted in gold along each side. Lavoisier (conservation of mass),
Ampère (relationship between electric current and magnetic fields), Coulomb (law of electric forces), all those familiar names
from high school chemistry and physics are there.
&lt;/p&gt;
&lt;p&gt;The second most famous landmark of Paris is
Notre-Dame, which is a cathedral and does not display the names of scientists. But Westminster Abbey in London
actually does. Along with various kings and queens, the abbey contains the graves of such prominent names as
Isaac Newton, Charles Darwin, and Stephen Hawking.
&lt;/p&gt;
&lt;figure class="endfloat"&gt;
&lt;img alt="" src="https://www.harysdalvi.com/blog/shoulders-of-giants/abbey.jpeg"/&gt;
&lt;figcaption&gt;
        Westminster Abbey in London
    &lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;It was a little surprising to see an atheist physicist
and the agnostic father of evolution buried in the same church where the British monarch takes the throne
and vows to serve God. But this shows the &lt;strong&gt;central role of science&lt;/strong&gt; in the British national narrative.
Science and technology have historically been among the United Kingdom's chief exports to the world.
Scientific development was
a core part of Britain's becoming the first industrialized nation and then the largest empire in history.
In France, too, great French scientists were part of what made France a significant economic and imperial power.
&lt;/p&gt;
&lt;p&gt;These British and French scientists have always served as giants on whose shoulders others may stand, in their own countries and
elsewhere. But not all who stood on those shoulders saw further. The massive British and French empires were also known
for their &lt;strong&gt;exploitation of colonies, a feat made more efficient by their access to technology.&lt;/strong&gt; They got better weapons,
better transport, and better communications, all of which were vital in their expanding colonial empires.
Steamships let them get to the colonies quickly, trains helped them get around the colonies, and
telegraphs connected colonial administrations across vast distances.[&lt;a href="https://www.harysdalvi.com/blog/shoulders-of-giants/#reflist"&gt;5&lt;/a&gt;]
What is it that led to such rapid technological progress in Europe?
&lt;/p&gt;
&lt;h2&gt;Innovative Ideas&lt;/h2&gt;
&lt;p&gt;Galileo Galilei was a central figure of the Italian Renaissance, meaning he inherited the legacy of a period of relative stagnation in
Europe. After the fall of Rome, Europe entered what some call the “Dark Ages”. But this was not the only
legacy he inherited. He also inherited the legacy of Christian civilization in Europe, and was part of a movement
reviving the legacy of Greco-Roman civilization. Greco-Roman civilization itself was formed when, after conquering Greece,
Rome absorbed the elements of Greek civilization it found inspiring instead of rejecting them as altogether foreign. So Galileo
inherited the legacies of at least three civilizations: Christianity, Rome, and Greece. Europe also borrowed from Islamic
civilization through translations of famous Arabic works such as al-Khwarizmi's &lt;cite&gt;Al-Jabr&lt;/cite&gt;.
&lt;/p&gt;
&lt;p&gt;Far from popping out of nowhere, &lt;strong&gt;the Renaissance had up to &lt;em&gt;four&lt;/em&gt; prior intellectual traditions on which it could draw.&lt;/strong&gt;
What distinguished it from the Middle Ages is that the Middle Ages were dominated by just one intellectual tradition: Christianity.
Not that Christianity was irrelevant to the Renaissance: many great Renaissance works such as &lt;cite&gt;David&lt;/cite&gt; and
&lt;cite&gt;The Last Supper&lt;/cite&gt; take direct inspiration from the Bible. But the Renaissance also revived
classical Greek and Roman ideas alongside the Christian ones, such as a reengagement with
Hellenistic science, Socratic philosophy, and Cicero's literature alike.
Even &lt;cite&gt;David&lt;/cite&gt; is in the style of Greek sculpture.
The Renaissance was also open to new Islamic ideas:
&lt;cite&gt;Al-Jabr&lt;/cite&gt; was the main mathematics textbook in European universities all the way up to the
16th century.[&lt;a href="https://www.harysdalvi.com/blog/shoulders-of-giants/#reflist"&gt;6&lt;/a&gt;] &lt;strong&gt;The combination of all these ideas from disparate sources
led to an unprecedented rate of intellectual progress in Europe.&lt;/strong&gt;
&lt;/p&gt;
&lt;p&gt;Galileo died in 1642. Shah Jahan completed the Taj Mahal mausoleum in 1648, when Newton was a small child.
At the time, the Mughal Empire in India was reaching its zenith. Although it was an Islamic empire, the majority
of its population was Hindu. Did the Mughal Empire embrace diverse intellectual traditions like its contemporary
in Renaissance Europe?
&lt;/p&gt;
&lt;p&gt;Shah Jahan's grandfather, the Mughal emperor Akbar, was surprisingly open to diverse traditions. He encouraged
dialogue between the various faiths of his empire, had Sanskrit classics translated into Persian, and
incorporated local Indian traditions into his architecture. He was impressed by the realism and perspective developed in
art samples from the European Renaissance brought by Jesuit missionaries, and incorporated these into Mughal art
as well.[&lt;a href="https://www.harysdalvi.com/blog/shoulders-of-giants/#reflist"&gt;7&lt;/a&gt;]
&lt;/p&gt;
&lt;p&gt;However, science does not make the list of the great Mughal advances. As one of the gunpowder
empires, the Mughals did make guns, but that does not rank as an inspiring and innovative development.
Perhaps this is because Akbar was just one emperor, and was not educated in science,
so it is understandable that there would not be a scientific revolution in his empire.
Euclid's &lt;cite&gt;Elements&lt;/cite&gt; was not translated into Sanskrit until 1719.[&lt;a href="https://www.harysdalvi.com/blog/shoulders-of-giants/#reflist"&gt;8&lt;/a&gt;]
Another reason is that the Mughals did not draw on older traditions to the maximum extent possible:
mathematicians in the Islamic Golden Age prior to the Mughals were already familiar with
the &lt;cite&gt;Elements&lt;/cite&gt;,[&lt;a href="https://www.harysdalvi.com/blog/shoulders-of-giants/#reflist"&gt;9&lt;/a&gt;] but the Mughals themselves were not.
&lt;/p&gt;
&lt;p&gt;
&lt;strong&gt;When cultures and ideas are in deeper and more meaningful contact, scientific revolution becomes more likely.&lt;/strong&gt;
The scientific achievements of the Hellenistic age such as those of Archimedes and Euclid
followed a period of interaction between Greek
culture and Egyptian, Persian, and Mesopotamian cultures. This influenced India as well,
leading almost immediately to the Mauryan empire and then to great scientific achievements
like the mathematics of negative numbers and new ideas in astronomy
under the Gupta empire. Both India and the Hellenistic age influenced the Islamic Golden Age,
which in turn impacted the Renaissance along with Christian and revived Greco-Roman traditions.
Finally, this led to the scientific revolution in Europe.&lt;br clear="all"/&gt;
&lt;/p&gt;
&lt;figure class="middleimg"&gt;
&lt;img alt="" src="https://www.harysdalvi.com/blog/shoulders-of-giants/pantheon.jpeg"/&gt;
&lt;figcaption&gt;The Pantheon in Rome, one of the most famous Ancient Roman sites.
        Pantheon is an Ancient Greek word. The obelisk in front is Ancient Egyptian.&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;Is this an infallible rule? You could point to the Song Dynasty in China as an example
of great scientific advancement (gunpowder, compass, woodblock printing) with relatively little external cultural interaction. But even China
has a long history of competing philosophical schools, especially Confucianism, Taoism, Buddhism, and Legalism.
So perhaps it is not cultural contact for its own sake that stimulates new science and creative thinking;
rather, it is a diversity of &lt;em&gt;ideas&lt;/em&gt; that tends to follow from the mingling of traditions.
&lt;strong&gt;If everyone thinks the same way in all walks of life, they will be conformists in science, too.&lt;/strong&gt;
&lt;/p&gt;
&lt;h2&gt;Takeaways&lt;/h2&gt;
&lt;p&gt;As I grow in my career in science and technology, I want to keep an eye on my little place in history
    in addition to my technical skills. I aspire to leverage this understanding to help me pick meaningful issues to work on and
    contribute positively to the world. This means I have to read, experience, and most
    importantly reflect on history. From my Europe trip, here's what I learned from a scientific perspective:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;From Leonardo da Vinci, I learned to think big. I aspire to be strong enough to risk failure,
            bold enough to try things that were previously impossible, and curious enough to constantly
            search for new avenues of thought.
        &lt;/li&gt;
&lt;li&gt;From Archimedes, I learned to ground my big thinking. In order to have an impact, I need to
            not only think big but also effectively communicate my ideas. I know not all my ideas can
            succeed, so I should have humility in acknowledging when I am uncertain or just plain wrong.
        &lt;/li&gt;
&lt;li&gt;From the Eiffel Tower and Westminster Abbey, I learned to stand on the shoulders of giants.
            Building on previous scientific achievements is a force strong enough to create massive
            industrialized empires. I am certainly not into imperial domination myself, but at least
            I can try to make cool stuff.
        &lt;/li&gt;
&lt;li&gt;From the broader history of science and technology in Europe and the world, I learned to be open to diverse ideas.
            This means being open to ideas of people from different cultures and backgrounds, not for the sake of some amorphous diversity,
            but because &lt;em&gt;intellectual&lt;/em&gt; diversity is a genuinely crucial component of creative thinking.
        &lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In my future travels I want to continue thinking about the history of each place I see and what I can
    learn from it. After all, one more thing I learned from Leonardo da Vinci is to be a Renaissance man:
    just as Leonardo synthesized the arts and sciences in his work,
    my studies in history will lead me to more exciting paths in science and technology.
&lt;/p&gt;
&lt;hr/&gt;
&lt;h2 id="ref"&gt;References&lt;/h2&gt;
&lt;ol id="reflist"&gt;&lt;li&gt;&lt;a href="https://airfactsjournal.com/2020/08/the-famous-quote-that-da-vinci-never-said/" target="_blank"&gt;The
    famous quote that da Vinci never said&lt;/a&gt; (Dave English, Air Facts, 2020) &lt;/li&gt;&lt;li&gt;&lt;a href="https://doi.org/10.1038/s41586-022-05543-x" target="_blank"&gt;Papers
    and patents are becoming less disruptive over time&lt;/a&gt; (M. Park, E. Leahey, &amp;amp; R.J. Funk, &lt;cite&gt;Nature&lt;/cite&gt;, 2023) &lt;/li&gt;&lt;li&gt;&lt;a href="https://physics.weber.edu/carroll/archimedes/sand.htm" target="_blank"&gt;&lt;cite&gt;The Sand Reckoner&lt;/cite&gt;&lt;/a&gt;
    (Weber State University) &lt;/li&gt;&lt;li&gt;&lt;a href="https://www.history.com/this-day-in-history/da-vinci-notebook-sells-for-over-5-million" target="_blank"&gt;Da
    Vinci notebook sells for over $5M&lt;/a&gt; (History, 2009) &lt;/li&gt;&lt;li&gt;&lt;a href="https://www.khanacademy.org/humanities/whp-origins/era-6-the-long-nineteenth-century-1750-ce-to-1914-ce/63-imperialism-betaa/a/read-industrial-imperialism-the-new-imperialism-beta" target="_blank"&gt;Industrial Imperialism, the “New” Imperialism&lt;/a&gt; (Khan Academy) &lt;/li&gt;&lt;li&gt;&lt;a href="https://www.ms.uky.edu/~carl/ma330/project2/al-khwa21.html" target="_blank"&gt;Al-Khwarizmi&lt;/a&gt;
    (Shawn Overbay, Jimmy Schorer, and Heather Conger, University of Kentucky) &lt;/li&gt;&lt;li&gt;&lt;a href="https://www.britannica.com/biography/Akbar" target="_blank"&gt;Akbar&lt;/a&gt;
    (Encyclopædia Britannica) &lt;/li&gt;&lt;li&gt;&lt;a href="https://islamsci.mcgill.ca/RASI/BEA/Jagannatha_Samrat_BEA.htm" target="_blank"&gt;Jagannātha Samrāṭ&lt;/a&gt;
    (McGill University) &lt;/li&gt;&lt;li&gt;&lt;a href="https://www.maa.org/press/periodicals/convergence/mathematical-treasure-early-translations-of-euclids-elements-into-arabic" target="_blank"&gt;Mathematical
    Treasure: Early Translations of Euclid's Elements into Arabic&lt;/a&gt; (Frank J. Swetz, Mathematical Association of America, 2020) &lt;/li&gt;&lt;/ol&gt;</content></entry><entry><title>DiscordGPT</title><summary>How to use data from Discord in a GPT model and talk to a simulation of you and your friends</summary><link href="https://www.harysdalvi.com/discord/" rel="alternate"/><id>https://www.harysdalvi.com/discord/</id><published>2023-03-01T16:38:07Z</published><updated>2023-03-01T16:38:07Z</updated><content type="html" xml:lang="en">&lt;p&gt;Have you ever thought about uploading your consciousness to a computer and achieving
  immortality? In this tutorial, you will do the next best thing:
  upload a corpus of your words to a GPT model and get a simulation of
  you and your friends in conversation. I will be using &lt;strong&gt;Discord&lt;/strong&gt;
  to get the corpus for dialogue, but you can use a different source as well.
  What matters is that you get a CSV containing blocks of dialogue.
&lt;/p&gt;
&lt;p&gt;This is approximately the
  quality of output you can expect. (You might be able to get better output if you have
  a computer with a powerful GPU; details below!)
  &lt;/p&gt;&lt;blockquote&gt;&lt;code&gt;
    Bob: I never thought I would have time&lt;br/&gt;
    Bob: I'm getting into biology but it seems like a decent field&lt;br/&gt;
    Bob: I'm interested in physics but not in any particular fields yet&lt;br/&gt;
    Alice: but i think it can be a good background for math&lt;br/&gt;
    Bob: I'm interested in the math side of it&lt;br/&gt;
    Alice: very interesting&lt;br/&gt;
    Bob: I think it can also be a good way to pick up data&lt;br/&gt;
    Charlie: Yes but not that much for me&lt;br/&gt;
    Charlie: For me for math the most desirable is to have a deeper understanding of systems&lt;br/&gt;
    Alice: true
  &lt;/code&gt;&lt;/blockquote&gt;
&lt;details&gt;&lt;summary&gt;More examples&lt;/summary&gt;
&lt;blockquote&gt;&lt;code&gt;
    Alice: ive heard it's very popular but never actually tried it yet&lt;br/&gt;
    Alice: I don't even know if it does any cool stuff&lt;br/&gt;
    Alice: just think about this idk how to apply to classes&lt;br/&gt;
    Bob: Well I applied to a few lol lol&lt;br/&gt;
    Alice: https://tenor.com/view/the-funny-crack-of-saray-birshad-funny-dance-gif-24861177&lt;br/&gt;
    Bob: I also applied as a freshman but it was the same thing. Also it's not super open to minorities lol
  &lt;/code&gt;&lt;/blockquote&gt;
&lt;blockquote&gt;&lt;code&gt;
    Alice: ive never been to this lol&lt;br/&gt;
    Alice: ok it was a fun college course back in the early 90s&lt;br/&gt;
    Alice: but I wonder if it would be any fun now&lt;br/&gt;
    Alice: im very lazy ima play my pong right now&lt;br/&gt;
    Bob: If I did an actual degree I think I could make a decent living&lt;br/&gt;
    Alice: yeah&lt;br/&gt;
    Bob: Or can you take the cap course?&lt;br/&gt;
    Alice: no way&lt;br/&gt;
    Alice: oh wait yes&lt;br/&gt;
    Alice: idk&lt;br/&gt;
    Alice: hi?
  &lt;/code&gt;&lt;/blockquote&gt;
&lt;blockquote&gt;&lt;code&gt;
    Bob: Maybe he used the code from one book or something&lt;br/&gt;
    Alice: ok that is cool&lt;br/&gt;
    Bob: You could do that and it would be kinda cool&lt;br/&gt;
    Alice: i would like to see some other people do the reverse&lt;br/&gt;
    Bob: Then you make them do it too lol&lt;br/&gt;
    Bob: Oh maybe it would work&lt;br/&gt;
    Alice: oh wow&lt;br/&gt;
    Bob: So I don't have to do reverse trigs to calc all the stuff&lt;br/&gt;
    Alice: wow good idea&lt;br/&gt;
    Bob: For one i would say reverse trigs?&lt;br/&gt;
    Bob: I know&lt;br/&gt;
    Bob: But at the same time the reverse trig isn't the reverse trig&lt;br/&gt;
    Alice: hi&lt;br/&gt;
    Bob: I don't think so&lt;br/&gt;
    Bob: But it's cool for it to be the inverse, but also can't be the reverse trig&lt;br/&gt;
    Alice: wow&lt;br/&gt;
    Alice: i wonder if that will work better&lt;br/&gt;
    Alice: but idk if it actually will if you dont do trigs&lt;br/&gt;
    Alice: theres so much stuff it makes it hard to learn
  &lt;/code&gt;&lt;/blockquote&gt;
&lt;/details&gt;&lt;p&gt;
These examples were AI-generated from a GPT-2 model trained on one of my Discord channels.
Names have been changed. As you can tell, we are nerds.
&lt;/p&gt;
&lt;h2&gt;Obtaining Data&lt;/h2&gt;
&lt;p&gt;If you're using Discord, you can use the &lt;a href="https://github.com/Tyrrrz/DiscordChatExporter" target="_blank"&gt;Discord Chat Exporter&lt;/a&gt;
  to get all the messages in a channel in a CSV format. First clone the repository onto your computer. Then, once you
  &lt;a href="https://github.com/Tyrrrz/DiscordChatExporter/blob/master/.docs/Token-and-IDs.md" target="_blank"&gt;obtain your token and a channel ID&lt;/a&gt;,
  you can run the Unix command&lt;/p&gt;
&lt;pre&gt;&lt;code class="lang-bash hljs"&gt;dotnet DiscordChatExporter.Cli.dll &lt;span class="hljs-built_in"&gt;export&lt;/span&gt; -t &amp;lt;YOUR TOKEN HERE&amp;gt; -c &amp;lt;CHANNEL ID HERE&amp;gt; -f Csv&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;to download the channel's messages as a CSV. (Windows will be similar; check the &lt;a href="https://github.com/Tyrrrz/DiscordChatExporter/wiki" target="_blank"&gt;Discord Chat Exporter wiki&lt;/a&gt; for more details.)
&lt;/p&gt;
&lt;p&gt;At this point, you should have a CSV in the following format:&lt;/p&gt;
&lt;div class="longmath"&gt;&lt;pre&gt;AuthorID,Author,Date,Content,Attachments,Reactions
"961292453880303616","Alice#2027","29-Mar-23 02:01 PM","yo guys","",""
"961292453880303616","Alice#2027","29-Mar-23 02:01 PM","i have an idea","",""
"864940450982014276","Bob#1091","29-Mar-23 02:01 PM","what is it","",""
"961292453880303616","Alice#2027","29-Mar-23 02:02 PM","what if we are in a simulation?","",""
"864940450982014276","Bob#1091","29-Mar-23 02:02 PM","thats a stupid idea","",""
"418156904730524386","Charlie#7181","29-Mar-23 02:02 PM","Yea fr","",""
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Call it something like &lt;code&gt;channel.csv&lt;/code&gt;.
&lt;/p&gt;
&lt;h2&gt;Preprocessing&lt;/h2&gt;
&lt;p&gt;In the end, we want a CSV in the following format:&lt;/p&gt;
&lt;div class="longmath"&gt;&lt;pre&gt;Conversation,
"A: hi! B: hi, nice to meet u A: nice to meet u too! B: what are ur thoughts on the high energy consumption of training LLMs A: is this your typical icebreaker",
"A: yo guys A: i have an idea B: what is it A: what if we are in a simulation? B: thats a stupid idea C: Yea fr",
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This is quite straightforward in Python. Note that for this step, you will need to have &lt;code&gt;pandas&lt;/code&gt; and &lt;code&gt;numpy&lt;/code&gt;. If you don't have these, you can run the command&lt;/p&gt;
&lt;pre&gt;&lt;code class="hljs"&gt;pip install pandas numpy&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;(or &lt;code&gt;pip3&lt;/code&gt; if you use the &lt;code&gt;python3&lt;/code&gt; command) in Unix to install the libraries.
If you have the libraries, you can create a file called something like &lt;code&gt;preprocess.py&lt;/code&gt; to format the messages from Discord.&lt;/p&gt;
&lt;pre&gt;&lt;code class="lang-python hljs"&gt;&lt;span class="hljs-keyword"&gt;import&lt;/span&gt; pandas &lt;span class="hljs-keyword"&gt;as&lt;/span&gt; pd
&lt;span class="hljs-keyword"&gt;import&lt;/span&gt; numpy &lt;span class="hljs-keyword"&gt;as&lt;/span&gt; np

df = pd.read_csv(&lt;span class="hljs-string"&gt;'channel.csv'&lt;/span&gt;, sep=&lt;span class="hljs-string"&gt;','&lt;/span&gt;, header=&lt;span class="hljs-literal"&gt;None&lt;/span&gt;).to_numpy()
df = df[&lt;span class="hljs-number"&gt;1&lt;/span&gt;:] &lt;span class="hljs-comment"&gt;# ignore the first line with the field names AuthorID,Author,Date,Content,Attachments,Reactions&lt;/span&gt;
authors = df[:,&lt;span class="hljs-number"&gt;1&lt;/span&gt;].astype(&lt;span class="hljs-string"&gt;'str'&lt;/span&gt;) &lt;span class="hljs-comment"&gt;# author is field index 1&lt;/span&gt;
messages = df[:,&lt;span class="hljs-number"&gt;3&lt;/span&gt;].astype(&lt;span class="hljs-string"&gt;'str'&lt;/span&gt;) &lt;span class="hljs-comment"&gt;# content is field index 3&lt;/span&gt;

codenames = {
  &lt;span class="hljs-string"&gt;"Alice#2027"&lt;/span&gt;: &lt;span class="hljs-string"&gt;"A"&lt;/span&gt;,
  &lt;span class="hljs-string"&gt;"Bob#1091"&lt;/span&gt;: &lt;span class="hljs-string"&gt;"B"&lt;/span&gt;,
  &lt;span class="hljs-string"&gt;"Charlie#7181"&lt;/span&gt;: &lt;span class="hljs-string"&gt;"C"&lt;/span&gt;
} &lt;span class="hljs-comment"&gt;# change this based on the users in your channel. exclude bots that you don't want to be included in the GPT output.&lt;/span&gt;

dialogue = &lt;span class="hljs-string"&gt;"\""&lt;/span&gt;
&lt;span class="hljs-keyword"&gt;for&lt;/span&gt; i &lt;span class="hljs-keyword"&gt;in&lt;/span&gt; &lt;span class="hljs-built_in"&gt;range&lt;/span&gt;(messages.shape[&lt;span class="hljs-number"&gt;0&lt;/span&gt;]):
  &lt;span class="hljs-keyword"&gt;try&lt;/span&gt;:
    &lt;span class="hljs-comment"&gt;# in this line, we will:&lt;/span&gt;
    &lt;span class="hljs-comment"&gt;# 1) use the "code name" for this user (use a unique initial like A, B, C)&lt;/span&gt;
    &lt;span class="hljs-comment"&gt;# 2) replace newlines with spaces, and double quotes with single quotes, so that the resulting CSV format is valid&lt;/span&gt;
    dialogue += codenames[authors[i]]+&lt;span class="hljs-string"&gt;": "&lt;/span&gt;+messages[i].replace(&lt;span class="hljs-string"&gt;'\n'&lt;/span&gt;,&lt;span class="hljs-string"&gt;' '&lt;/span&gt;).replace(&lt;span class="hljs-string"&gt;"\""&lt;/span&gt;, &lt;span class="hljs-string"&gt;"'"&lt;/span&gt;)+&lt;span class="hljs-string"&gt;" "&lt;/span&gt;
    &lt;span class="hljs-comment"&gt;# the result will look like B: hi, nice to meet u&lt;/span&gt;
    &lt;span class="hljs-keyword"&gt;if&lt;/span&gt; i % &lt;span class="hljs-number"&gt;64&lt;/span&gt; == &lt;span class="hljs-number"&gt;63&lt;/span&gt;: &lt;span class="hljs-comment"&gt;# after every 64 messages,&lt;/span&gt;
      dialogue += &lt;span class="hljs-string"&gt;"\",\n\""&lt;/span&gt; &lt;span class="hljs-comment"&gt;# end this line of the CSV and start a new conversation block&lt;/span&gt;
  &lt;span class="hljs-keyword"&gt;except&lt;/span&gt; KeyError: &lt;span class="hljs-keyword"&gt;continue&lt;/span&gt; &lt;span class="hljs-comment"&gt;# ignore users that are not in the `codenames` dict&lt;/span&gt;
dialogue += &lt;span class="hljs-string"&gt;"\","&lt;/span&gt; &lt;span class="hljs-comment"&gt;# complete the last line of the CSV&lt;/span&gt;

&lt;span class="hljs-comment"&gt;# write the result to disk&lt;/span&gt;
file = &lt;span class="hljs-built_in"&gt;open&lt;/span&gt;(&lt;span class="hljs-string"&gt;'discord.csv'&lt;/span&gt;, &lt;span class="hljs-string"&gt;'w+'&lt;/span&gt;)
file.write(&lt;span class="hljs-string"&gt;"Conversation,\n"&lt;/span&gt;) &lt;span class="hljs-comment"&gt;# include a basic CSV header. we only need one field for this task.&lt;/span&gt;
file.write(dialogue)
file.close()
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Let's take a closer look at the &lt;code&gt;codenames&lt;/code&gt; dictionary. Why do we use initials? If we use the usernames or real names
of the people in the conversation, our tokenizer could have problems. For usernames, something like &lt;code&gt;Alice#2027&lt;/code&gt;
will be unfamiliar to the tokenizer, and will likely take multiple tokens. To concentrate on dialogue generation and not
tokenization issues, it's easiest to just create single-letter aliases for each user.
&lt;/p&gt;
&lt;p&gt;If we use real names, the GPT model
  might have preexisting notions of the role of each person based on their names. For example, people with names like
  Jesus or Muhammad might be confused with people that the pretrained GPT model already has information on.
  On the other hand, people with names like crackalamoo that are less
  common in the corpus might be treated strangely by the model. Using letter initials for all names gets around
  both of these problems, but make sure the letters for each person are unique.
&lt;/p&gt;
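&lt;p&gt;One way to guarantee unique letters is to generate the mapping instead of typing it by hand. The sketch below is only an
illustration: &lt;code&gt;make_codenames&lt;/code&gt; is a hypothetical helper (not part of pandas or of the script above), and it assumes
the channel has at most 26 users to alias. It hands out letters in alphabetical order of username, so a letter is not
necessarily that user's initial.&lt;/p&gt;

```python
import string

def make_codenames(usernames):
    """Map each distinct username to its own capital letter (hypothetical helper)."""
    unique_users = sorted(set(usernames))  # sorted so the mapping is reproducible
    letters = string.ascii_uppercase
    if len(unique_users) > len(letters):
        raise ValueError("more than 26 users; choose another alias scheme")
    # zip pairs the first user with "A", the second with "B", and so on
    return dict(zip(unique_users, letters))

# example usernames taken from the sample CSV above; duplicates are fine
codenames = make_codenames(["Alice#2027", "Bob#1091", "Charlie#7181", "Bob#1091"])
print(codenames)
```

&lt;p&gt;Leave bots out of the list you pass in, since any user without an alias is skipped during preprocessing.&lt;/p&gt;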
&lt;p&gt;After running the above code, you should have a file &lt;code&gt;discord.csv&lt;/code&gt; that contains all messages in your Discord channel
  blocked into conversations of 64 messages each.
&lt;/p&gt;
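&lt;p&gt;To make the blocking step concrete, here is a minimal standalone sketch of the same idea on a toy input. The function name
&lt;code&gt;block_messages&lt;/code&gt;, its &lt;code&gt;block_size&lt;/code&gt; parameter, and the bot user &lt;code&gt;SomeBot#0000&lt;/code&gt; are all
made up for illustration. Unlike the script above, which decides block boundaries from the row index, this sketch counts only
the messages it keeps, so treat it as a variation and not a description of what &lt;code&gt;preprocess.py&lt;/code&gt; does.&lt;/p&gt;

```python
def block_messages(authors, messages, codenames, block_size=64):
    """Group "ALIAS: text" lines into conversation blocks (illustrative sketch)."""
    blocks, current = [], []
    for author, text in zip(authors, messages):
        if author not in codenames:
            continue  # skip bots and anyone without an alias
        # newlines and double quotes are replaced so each block fits in one CSV field
        cleaned = text.replace("\n", " ").replace('"', "'")
        current.append(f"{codenames[author]}: {cleaned}")
        if len(current) == block_size:
            blocks.append(" ".join(current))
            current = []
    if current:
        blocks.append(" ".join(current))  # keep the final, shorter block
    return blocks

authors = ["Alice#2027", "Alice#2027", "Bob#1091", "SomeBot#0000", "Charlie#7181"]
messages = ["yo guys", "i have an idea", "what is it", "beep", "Yea fr"]
codenames = {"Alice#2027": "A", "Bob#1091": "B", "Charlie#7181": "C"}
blocks = block_messages(authors, messages, codenames, block_size=2)
print(blocks)
```

&lt;p&gt;With a block size of 2, the four kept messages form two blocks and the bot's message is dropped.&lt;/p&gt;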
&lt;h2&gt;Training&lt;/h2&gt;
&lt;p&gt;At this point, it's important to note that you will
  probably need some kind of &lt;strong&gt;GPU&lt;/strong&gt;. If your computer has a GPU, you can create a file
  &lt;code&gt;gpt.py&lt;/code&gt; and start training. Otherwise, you will need to use something like
  &lt;a href="https://colab.research.google.com" target="_blank"&gt;Google Colab&lt;/a&gt; to train your model.
&lt;/p&gt;
&lt;p&gt;To build our model, we will get help from the open source models at Hugging Face. Make sure you have the libraries
  &lt;code&gt;datasets&lt;/code&gt; and &lt;code&gt;transformers&lt;/code&gt; from Hugging Face as well as &lt;code&gt;torch&lt;/code&gt; (PyTorch)
  for the main model. If you are on your computer rather than Colab, you can use this Unix command:&lt;/p&gt;
&lt;pre&gt;&lt;code class="hljs"&gt;pip install datasets transformers&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;or &lt;code&gt;pip3&lt;/code&gt; if you use the &lt;code&gt;python3&lt;/code&gt; command. &lt;a href="https://pytorch.org/get-started/locally/" target="_blank"&gt;PyTorch
  installation&lt;/a&gt; on a local computer might be a little more complicated because you have to make sure you install a build with GPU (CUDA) support.
&lt;/p&gt;
&lt;p&gt;
  First of all, we want to load the CSV you just created as a Hugging Face dataset
  so we can use it with our GPT-2 model. &lt;strong&gt;If you're using your own computer's GPU&lt;/strong&gt;, this will look something like:&lt;/p&gt;
&lt;pre&gt;&lt;code class="lang-python hljs"&gt;&lt;span class="hljs-keyword"&gt;import&lt;/span&gt; torch, datasets, transformers
&lt;span class="hljs-keyword"&gt;import&lt;/span&gt; pandas &lt;span class="hljs-keyword"&gt;as&lt;/span&gt; pd
&lt;span class="hljs-keyword"&gt;assert&lt;/span&gt; torch.cuda.is_available() &lt;span class="hljs-comment"&gt;# make sure we have the GPU&lt;/span&gt;
FILE = &lt;span class="hljs-string"&gt;"discord.csv"&lt;/span&gt; &lt;span class="hljs-comment"&gt;# replace with the path to the file you created in preprocess.py&lt;/span&gt;
df = pd.read_csv(FILE)
dataset = datasets.Dataset.from_pandas(df)
&lt;span class="hljs-comment"&gt;# if you have a lot of data, or not a lot of time, you can do something like:&lt;/span&gt;
&lt;span class="hljs-comment"&gt;# dataset = datasets.Dataset.from_pandas(df.sample(2000))&lt;/span&gt;
&lt;span class="hljs-comment"&gt;# replace 2000 with something that works on your hardware.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Alternatively, &lt;strong&gt;if you're using Google Colab&lt;/strong&gt;, you will have to load the file from your Drive. Upload the file you created in
  &lt;code&gt;preprocess.py&lt;/code&gt; to somewhere in your Google Drive and create a Google Colab notebook. Then you can
  write&lt;/p&gt;
&lt;pre&gt;&lt;code class="lang-python hljs"&gt;!pip install datasets transformers numpy
&lt;span class="hljs-keyword"&gt;import&lt;/span&gt; datasets, transformers
&lt;span class="hljs-keyword"&gt;import&lt;/span&gt; pandas &lt;span class="hljs-keyword"&gt;as&lt;/span&gt; pd

&lt;span class="hljs-comment"&gt;# set up Google Drive access&lt;/span&gt;
&lt;span class="hljs-keyword"&gt;from&lt;/span&gt; google.colab &lt;span class="hljs-keyword"&gt;import&lt;/span&gt; drive
drive.mount(&lt;span class="hljs-string"&gt;'/content/gdrive'&lt;/span&gt;)

FILE = &lt;span class="hljs-string"&gt;"Your Directory Here/subfolder/discord.csv"&lt;/span&gt; &lt;span class="hljs-comment"&gt;# replace with the directory in your My Drive of the file you created in preprocess.py&lt;/span&gt;
df = pd.read_csv(&lt;span class="hljs-string"&gt;'gdrive/My Drive/'&lt;/span&gt;+FILE)
dataset = datasets.Dataset.from_pandas(df)
&lt;span class="hljs-comment"&gt;# if you have a lot of data, or not a lot of time, you can do something like:&lt;/span&gt;
&lt;span class="hljs-comment"&gt;# dataset = datasets.Dataset.from_pandas(df.sample(2000))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now we want to create a train-test split. We mostly care about training the model; the real test of
  the model will be subjective, where we personally evaluate its ability to generate dialogue similar to
  that in your Discord channel. Therefore, we will use only 10% of the data for testing.&lt;/p&gt;
&lt;pre&gt;&lt;code class="lang-python hljs"&gt;dataset = dataset.train_test_split(test_size=&lt;span class="hljs-number"&gt;0.1&lt;/span&gt;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Next we need a &lt;strong&gt;tokenizer&lt;/strong&gt;. A tokenizer takes in a sentence like
  &lt;/p&gt;&lt;blockquote&gt;I see the Apple store but I don't see any apples.&lt;/blockquote&gt;
&lt;p&gt;and produces a series of &lt;strong&gt;tokens&lt;/strong&gt; like&lt;/p&gt;
&lt;blockquote&gt;&lt;code&gt;"I", "see", "the", "Apple", "store", "but", "I", "do", "_n't", "see", "any", "apple", "_s", "."&lt;/code&gt;&lt;/blockquote&gt;
&lt;p&gt;You can see that this roughly corresponds to splitting the sentence into words. However, ideally the tokenizer should
  also split words into &lt;strong&gt;morphemes&lt;/strong&gt;, which are the smallest units of meaning in a language and are sometimes smaller than words.
  This is why you have tokens like &lt;code&gt;"_n't"&lt;/code&gt; and &lt;code&gt;"."&lt;/code&gt; in addition to normal words. (In practice, GPT-2's tokenizer
  uses byte-pair encoding, so its subword pieces are chosen by frequency and only roughly line up with morphemes.) Luckily, Hugging Face
  has a tokenizer for us to use that handles all this.&lt;/p&gt;
&lt;pre&gt;&lt;code class="lang-python hljs"&gt;tokenizer = transformers.AutoTokenizer.from_pretrained(&lt;span class="hljs-string"&gt;'gpt2-medium'&lt;/span&gt;)&lt;/code&gt;&lt;/pre&gt;
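&lt;p&gt;To get a feel for what splitting into tokens means, here is a deliberately simplistic toy written with a regular expression.
It is not how the GPT-2 tokenizer works, and &lt;code&gt;toy_tokenize&lt;/code&gt; is a made-up name for this sketch; it only separates
punctuation and the contraction &lt;code&gt;n't&lt;/code&gt; from the surrounding words.&lt;/p&gt;

```python
import re

def toy_tokenize(text):
    """Split text into words, a separate "_n't" piece, and punctuation (toy example)."""
    # alternatives are tried left to right at each position:
    #   \w+(?=n't)  a word stem directly followed by n't, e.g. "do" in "don't"
    #   n't         the contraction itself
    #   \w+         any other run of letters or digits
    #   [^\w\s]     a single punctuation character
    pieces = re.findall(r"\w+(?=n't)|n't|\w+|[^\w\s]", text)
    # mark the contraction with "_" to show that it attaches to the previous token
    return ["_" + piece if piece == "n't" else piece for piece in pieces]

tokens = toy_tokenize("I see the Apple store but I don't see any apples.")
print(tokens)
```

&lt;p&gt;This toy keeps &lt;code&gt;"apples"&lt;/code&gt; as one token, because it has no rule for plurals. A trained tokenizer learns its
subword pieces from data instead of relying on hand-written rules like these.&lt;/p&gt;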
&lt;p&gt;Now we want to tokenize all of our data.&lt;/p&gt;
&lt;pre&gt;&lt;code class="lang-python hljs"&gt;&lt;span class="hljs-function"&gt;&lt;span class="hljs-keyword"&gt;def&lt;/span&gt; &lt;span class="hljs-title"&gt;tokenize_conversation&lt;/span&gt;(&lt;span class="hljs-params"&gt;csv_row&lt;/span&gt;):&lt;/span&gt;
  &lt;span class="hljs-keyword"&gt;return&lt;/span&gt; tokenizer(csv_row[&lt;span class="hljs-string"&gt;'Conversation'&lt;/span&gt;], truncation=&lt;span class="hljs-literal"&gt;True&lt;/span&gt;)
tokenized_dataset = dataset.&lt;span class="hljs-built_in"&gt;map&lt;/span&gt;(tokenize_conversation, batched=&lt;span class="hljs-literal"&gt;True&lt;/span&gt;, remove_columns=dataset[&lt;span class="hljs-string"&gt;'train'&lt;/span&gt;].column_names)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now that we have our &lt;code&gt;tokenized_dataset&lt;/code&gt;, we can create equally-sized groups of tokens to train the model on.&lt;/p&gt;
&lt;pre&gt;&lt;code class="lang-python hljs"&gt;block_size = &lt;span class="hljs-number"&gt;256&lt;/span&gt;
&lt;span class="hljs-function"&gt;&lt;span class="hljs-keyword"&gt;def&lt;/span&gt; &lt;span class="hljs-title"&gt;group_texts&lt;/span&gt;(&lt;span class="hljs-params"&gt;examples&lt;/span&gt;):&lt;/span&gt;
  concatenated_examples = {k: &lt;span class="hljs-built_in"&gt;sum&lt;/span&gt;(examples[k], []) &lt;span class="hljs-keyword"&gt;for&lt;/span&gt; k &lt;span class="hljs-keyword"&gt;in&lt;/span&gt; examples.keys()}
  total_length = &lt;span class="hljs-built_in"&gt;len&lt;/span&gt;(concatenated_examples[&lt;span class="hljs-built_in"&gt;list&lt;/span&gt;(examples.keys())[&lt;span class="hljs-number"&gt;0&lt;/span&gt;]])
  total_length = (total_length // block_size) * block_size
  result = {
    k: [t[i : i + block_size] &lt;span class="hljs-keyword"&gt;for&lt;/span&gt; i &lt;span class="hljs-keyword"&gt;in&lt;/span&gt; &lt;span class="hljs-built_in"&gt;range&lt;/span&gt;(&lt;span class="hljs-number"&gt;0&lt;/span&gt;, total_length, block_size)]
    &lt;span class="hljs-keyword"&gt;for&lt;/span&gt; k, t &lt;span class="hljs-keyword"&gt;in&lt;/span&gt; concatenated_examples.items()
  }
  result[&lt;span class="hljs-string"&gt;"labels"&lt;/span&gt;] = result[&lt;span class="hljs-string"&gt;"input_ids"&lt;/span&gt;].copy()
  &lt;span class="hljs-keyword"&gt;return&lt;/span&gt; result

lm_dataset = tokenized_dataset.&lt;span class="hljs-built_in"&gt;map&lt;/span&gt;(group_texts, batched=&lt;span class="hljs-literal"&gt;True&lt;/span&gt;)
&lt;/code&gt;&lt;/pre&gt;
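&lt;p&gt;To see what the grouping step does, here is a small self-contained sketch that runs the same logic on plain Python
  lists instead of a Hugging Face dataset. The toy token IDs and the tiny &lt;code&gt;block_size&lt;/code&gt; of 4 are made up for illustration.&lt;/p&gt;

```python
# Toy illustration of the grouping step on plain lists (no Hugging Face objects).
# The token IDs and the block size below are invented for this example.
block_size = 4

def group_texts(examples):
  # Join every tokenized conversation into one long list per key.
  concatenated = {k: sum(examples[k], []) for k in examples.keys()}
  total_length = len(concatenated[list(examples.keys())[0]])
  # Drop the leftover tokens that do not fill a whole block.
  total_length = (total_length // block_size) * block_size
  result = {
    k: [t[i : i + block_size] for i in range(0, total_length, block_size)]
    for k, t in concatenated.items()
  }
  # For causal language modeling, the labels start as a copy of the inputs.
  result["labels"] = result["input_ids"].copy()
  return result

toy_batch = {
  "input_ids": [[1, 2, 3], [4, 5, 6, 7, 8], [9, 10]],
  "attention_mask": [[1, 1, 1], [1, 1, 1, 1, 1], [1, 1]],
}
grouped = group_texts(toy_batch)
print(grouped["input_ids"])  # [[1, 2, 3, 4], [5, 6, 7, 8]]
```

&lt;p&gt;Three conversations of different lengths become two blocks of exactly four tokens, and the last two tokens are
  discarded because they do not fill a block. The labels are an unshifted copy of the inputs because Hugging Face causal
  language models shift them internally when computing the loss.&lt;/p&gt;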
&lt;p&gt;Next we will use a &lt;code&gt;DataCollatorForLanguageModeling&lt;/code&gt;. This pads the input in cases where not all
  inputs are the same length. The token we will use for padding is a special token called the end-of-sequence (EOS) token,
  which lets the model know that a sample of conversation is over. The same collator also supports
  masked language modeling, in which some of the tokens are randomly masked and the model learns to
  predict them. We will not be using that here, so we set &lt;code&gt;mlm=False&lt;/code&gt;.&lt;/p&gt;
&lt;pre&gt;&lt;code class="lang-python hljs"&gt;&lt;span class="hljs-keyword"&gt;from&lt;/span&gt; transformers &lt;span class="hljs-keyword"&gt;import&lt;/span&gt; DataCollatorForLanguageModeling
tokenizer.pad_token = tokenizer.eos_token
data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=&lt;span class="hljs-literal"&gt;False&lt;/span&gt;)&lt;/code&gt;&lt;/pre&gt;
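&lt;p&gt;As a rough sketch of what a collator with &lt;code&gt;mlm=False&lt;/code&gt; produces for one batch, the plain-Python function
  below pads every sequence to the length of the longest one and builds labels in which padding positions are set to
  &lt;code&gt;-100&lt;/code&gt;, the index that PyTorch's cross-entropy loss ignores by default. The name &lt;code&gt;collate_causal&lt;/code&gt;
  and the toy IDs are hypothetical; the real collator returns tensors and has more options.&lt;/p&gt;

```python
# Simplified stand-in for a causal-LM data collator (mlm=False).
# collate_causal and the token IDs are hypothetical, for illustration only.
IGNORE_INDEX = -100  # label value that the loss function skips

def collate_causal(sequences, pad_id):
  width = max(len(seq) for seq in sequences)
  batch = {"input_ids": [], "attention_mask": [], "labels": []}
  for seq in sequences:
    n_pad = width - len(seq)
    input_ids = seq + [pad_id] * n_pad
    batch["input_ids"].append(input_ids)
    # 1 marks a real token, 0 marks padding.
    batch["attention_mask"].append([1] * len(seq) + [0] * n_pad)
    # Labels copy the inputs, except that pad positions are ignored.
    batch["labels"].append([IGNORE_INDEX if t == pad_id else t for t in input_ids])
  return batch

batch = collate_causal([[5, 6, 7], [8, 9]], pad_id=0)
print(batch["input_ids"])  # [[5, 6, 7], [8, 9, 0]]
print(batch["labels"])     # [[5, 6, 7], [8, 9, -100]]
```

&lt;p&gt;Two details are worth keeping in mind. Since our blocks are all the same length, very little padding should actually
  happen here. And because the padding token is the same as &lt;code&gt;tokenizer.eos_token&lt;/code&gt;, any genuine EOS tokens in the
  data would, as far as I know, be ignored by the loss in the same way.&lt;/p&gt;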
&lt;p&gt;This next step is important: we will actually load the model! If you're on Google Colab, the biggest model you'll be
  able to use is &lt;code&gt;gpt2-medium&lt;/code&gt;.&lt;/p&gt;
&lt;pre&gt;&lt;code class="lang-python hljs"&gt;&lt;span class="hljs-keyword"&gt;from&lt;/span&gt; transformers &lt;span class="hljs-keyword"&gt;import&lt;/span&gt; AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained(&lt;span class="hljs-string"&gt;'gpt2-medium'&lt;/span&gt;)
&lt;span class="hljs-comment"&gt;# you can also try gpt2-large or gpt2-xl if you have the hardware for it.&lt;/span&gt;
&lt;span class="hljs-comment"&gt;# this will need a pretty big GPU!&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Finally, we can actually train the model.&lt;/p&gt;
&lt;pre&gt;&lt;code class="lang-python hljs"&gt;&lt;span class="hljs-keyword"&gt;from&lt;/span&gt; transformers &lt;span class="hljs-keyword"&gt;import&lt;/span&gt; TrainingArguments, Trainer
training_args = TrainingArguments(
  output_dir=&lt;span class="hljs-string"&gt;"dialogue-model"&lt;/span&gt;,
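  evaluation_strategy=&lt;span class="hljs-string"&gt;"epoch"&lt;/span&gt;, &lt;span class="hljs-comment"&gt;# called eval_strategy in newer transformers releases&lt;/span&gt;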
  learning_rate=&lt;span class="hljs-number"&gt;2e-5&lt;/span&gt;,
  weight_decay=&lt;span class="hljs-number"&gt;0.01&lt;/span&gt;,
  num_train_epochs=&lt;span class="hljs-number"&gt;1&lt;/span&gt;, &lt;span class="hljs-comment"&gt;# this is how many times we go through the entire dataset. try 2 if you have a lot of time.&lt;/span&gt;
  per_device_train_batch_size=&lt;span class="hljs-number"&gt;4&lt;/span&gt;,
  per_device_eval_batch_size=&lt;span class="hljs-number"&gt;8&lt;/span&gt;
)

torch.cuda.empty_cache() &lt;span class="hljs-comment"&gt;# get the GPU ready for training&lt;/span&gt;
trainer = Trainer(
  model=model,
  args=training_args,
  train_dataset=lm_dataset[&lt;span class="hljs-string"&gt;'train'&lt;/span&gt;],
  eval_dataset=lm_dataset[&lt;span class="hljs-string"&gt;'test'&lt;/span&gt;],
  data_collator=data_collator
)

trainer.train() &lt;span class="hljs-comment"&gt;# this will take a while! about 15-20 minutes for me on Colab.&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
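&lt;p&gt;Because evaluation runs once per epoch, the trainer reports an evaluation loss. One common way to read that number is
  as &lt;strong&gt;perplexity&lt;/strong&gt;, the exponential of the average cross-entropy loss. Here is a minimal sketch, with a made-up
  loss value standing in for a real result:&lt;/p&gt;

```python
import math

def perplexity(eval_loss):
  # Perplexity is exp(average cross-entropy loss in nats); lower is better.
  return math.exp(eval_loss)

example_loss = 3.0  # hypothetical value, not a result from this project
print(round(perplexity(example_loss), 2))  # 20.09
```

&lt;p&gt;With a trained &lt;code&gt;trainer&lt;/code&gt;, the loss would come from &lt;code&gt;trainer.evaluate()['eval_loss']&lt;/code&gt;. A perplexity
  of 20 roughly means the model is as uncertain as if it were choosing among 20 equally likely tokens at each step.&lt;/p&gt;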
&lt;h2&gt;Testing&lt;/h2&gt;
&lt;p&gt;Now there's just one last step: seeing if the model actually works! We will use the Hugging Face
  text generation pipeline, which handles using our model to generate text from a prompt.
  &lt;/p&gt;&lt;pre&gt;&lt;code class="lang-python hljs"&gt;&lt;span class="hljs-keyword"&gt;from&lt;/span&gt; transformers &lt;span class="hljs-keyword"&gt;import&lt;/span&gt; pipeline
generator = pipeline(&lt;span class="hljs-string"&gt;'text-generation'&lt;/span&gt;, model=model, tokenizer=tokenizer)

&lt;span class="hljs-function"&gt;&lt;span class="hljs-keyword"&gt;def&lt;/span&gt; &lt;span class="hljs-title"&gt;generate_messages&lt;/span&gt;(&lt;span class="hljs-params"&gt;prompt=&lt;span class="hljs-string"&gt;''&lt;/span&gt;, num=&lt;span class="hljs-number"&gt;10&lt;/span&gt;, max_length=&lt;span class="hljs-number"&gt;128&lt;/span&gt;&lt;/span&gt;):&lt;/span&gt;
  outputs = generator(prompt, num_return_sequences=num, max_new_tokens=max_length)
  &lt;span class="hljs-keyword"&gt;for&lt;/span&gt; output &lt;span class="hljs-keyword"&gt;in&lt;/span&gt; outputs:
    &lt;span class="hljs-built_in"&gt;print&lt;/span&gt;(&lt;span class="hljs-string"&gt;"-"&lt;/span&gt;*&lt;span class="hljs-number"&gt;20&lt;/span&gt;)
    text = output[&lt;span class="hljs-string"&gt;'generated_text'&lt;/span&gt;]
    &lt;span class="hljs-built_in"&gt;print&lt;/span&gt;(text)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;To call this function, we can easily do something like&lt;/p&gt;
&lt;pre&gt;&lt;code class="lang-python hljs"&gt;generate_messages(&lt;span class="hljs-string"&gt;": "&lt;/span&gt;, num=&lt;span class="hljs-number"&gt;2&lt;/span&gt;, max_length=&lt;span class="hljs-number"&gt;256&lt;/span&gt;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;if we're on Colab or a Jupyter notebook. I chose the prompt &lt;code&gt;": "&lt;/code&gt; to make sure
that the model generates dialogue. You can also try an empty prompt like &lt;code&gt;""&lt;/code&gt;,
but it's a little less reliable. If you want a message from a specific person, you can try a prompt
like &lt;code&gt;"A: "&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;If you're running the code locally&lt;/strong&gt; without a Jupyter notebook, you might want something like this:&lt;/p&gt;
&lt;pre&gt;&lt;code class="lang-python hljs"&gt;&lt;span class="hljs-keyword"&gt;while&lt;/span&gt; &lt;span class="hljs-literal"&gt;True&lt;/span&gt;:
  prompt = &lt;span class="hljs-built_in"&gt;input&lt;/span&gt;(&lt;span class="hljs-string"&gt;"Prompt: "&lt;/span&gt;)
  generate_messages(prompt, num=&lt;span class="hljs-number"&gt;1&lt;/span&gt;)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The output will be in the form &lt;code&gt;A: hi B: hi A: how are you&lt;/code&gt;. If you want something cleaner,
you can replace the &lt;code&gt;print(text)&lt;/code&gt; part of &lt;code&gt;generate_messages&lt;/code&gt; with something like&lt;/p&gt;
&lt;pre&gt;&lt;code class="lang-python hljs"&gt;    text = output[&lt;span class="hljs-string"&gt;'generated_text'&lt;/span&gt;]
    text = text.replace(&lt;span class="hljs-string"&gt;"A: "&lt;/span&gt;, &lt;span class="hljs-string"&gt;"\nAlice: "&lt;/span&gt;)
    text = text.replace(&lt;span class="hljs-string"&gt;"B: "&lt;/span&gt;, &lt;span class="hljs-string"&gt;"\nBob: "&lt;/span&gt;)
    text = text.replace(&lt;span class="hljs-string"&gt;"C: "&lt;/span&gt;, &lt;span class="hljs-string"&gt;"\nCharlie: "&lt;/span&gt;)
    &lt;span class="hljs-built_in"&gt;print&lt;/span&gt;(text)
&lt;/code&gt;&lt;/pre&gt;
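&lt;p&gt;The plain &lt;code&gt;replace&lt;/code&gt; calls will also rewrite a stray &lt;code&gt;A: &lt;/code&gt; that happens to sit at the end of a longer
  word. If that matters, one possible alternative is a regular expression that only matches a speaker letter at a word
  boundary. This is a sketch: the helper name &lt;code&gt;format_dialogue&lt;/code&gt; is hypothetical, and the speaker names are the same
  example names as before.&lt;/p&gt;

```python
import re

# Hypothetical helper; the letter-to-name mapping mirrors the example names.
SPEAKERS = {"A": "Alice", "B": "Bob", "C": "Charlie"}

def format_dialogue(text):
  # Match a speaker letter only at a word boundary, so "NASA: " is untouched.
  pattern = re.compile(r"\s*\b([ABC]): ")
  # Start each speaker turn on its own line with the full name.
  formatted = pattern.sub(lambda m: "\n" + SPEAKERS[m.group(1)] + ": ", text)
  return formatted.strip()

print(format_dialogue("A: hi B: hi A: how are you"))
```

&lt;p&gt;Inside &lt;code&gt;generate_messages&lt;/code&gt;, this would be called as &lt;code&gt;print(format_dialogue(text))&lt;/code&gt;.&lt;/p&gt;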
&lt;h2&gt;Results&lt;/h2&gt;
&lt;p&gt;With the &lt;code&gt;gpt2-medium&lt;/code&gt; model, I was able to get some pretty good results like&lt;/p&gt;
&lt;div&gt;&lt;code&gt;Alice: How do you solve this type of problem on your own&lt;br/&gt;
Bob: i figured out how to use a vector to go from one node to another&lt;br/&gt;
Bob: like you could do this for y-axis&lt;br/&gt;
Alice: Wow&lt;br/&gt;
Bob: i remember that&lt;br/&gt;
Bob: wow&lt;br/&gt;
Alice: Lol&lt;br/&gt;
Alice: I remember the answer was so simple&lt;br/&gt;
Alice: Since I was doing a neural network thing I thought I could solve this using some kind of machine learning algorithm
&lt;/code&gt;&lt;/div&gt;
&lt;p&gt;
as well as some weird output like
&lt;/p&gt;
&lt;div&gt;&lt;code&gt;Charlie: ive never had one&lt;br/&gt;
  Charlie: But I got some of my cousin's at a local Asian grocery store&lt;br/&gt;
  Charlie: And now she won't buy anything from me because of it&lt;br/&gt;
  Bob: True&lt;br/&gt;
  Charlie: But it's ok that its just because i don’t like it&lt;br/&gt;
  Charlie: I'm ok with the idea of the app because she can still go to Amazon and do the equivalent
  of what i was doing, except use my data instead of using Amazon to make my own decisions&lt;br/&gt;
  Alice: lol&lt;br/&gt;
  Alice: @Eve dude&lt;br/&gt;
  Alice: hi
&lt;/code&gt;&lt;/div&gt;
&lt;h3&gt;Observations&lt;/h3&gt;
&lt;p&gt;The dialogues tend to be grammatically correct, but don't make much sense. Still, it's clear that the
  model is learning from the Discord dataset because of its ability to create a dialogue format with the
  correct characters as well as its use of relevant subject matter. (For my channel, this meant a lot of talk about
  college life and math.)
&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;gpt2-medium&lt;/code&gt; model I used has about 355 million parameters. The &lt;code&gt;gpt2-xl&lt;/code&gt; model
  I would have liked to use has 1.5 billion parameters: a pretty significant step up. If I could use that
  model, I would expect much better performance and more coherent sentences.
&lt;/p&gt;
&lt;p&gt;Both of these GPT-2 models absolutely pale in comparison to GPT-3, which has 175 billion parameters. If I had a larger
  dataset, and the computational power to feed it to GPT-3, what would happen? &lt;a href="http://www.harysdalvi.com/blog/pipe-room"&gt;The
  result would not truly be conscious&lt;/a&gt;, but judging from the performance of large language models like ChatGPT, it might feel that way.
  A language model trained on your words would probably be aware of a large portion of your skills, personality, and experiences.
  It would sound a lot like you, almost as if your consciousness had been uploaded. Even this very basic language model picked up
  some ideas about math and other fields based on conversations in my Discord channel.
&lt;/p&gt;
&lt;p&gt;ChatGPT is in essence just a scaled-up version of what was done here. (Very, very scaled up, in many different ways.)
  These scaled-up models share a conceptually similar architecture but are far larger, a victory for the “bigger is better” design philosophy.
  However, the computational costs are not the only issue with large language models (LLMs).
  Making LLMs smart, safe, and personalized is an extremely difficult technical and societal problem.&lt;/p&gt;
&lt;p&gt;Playing with Discord and LLMs
  conjures a world in which everyone has a personal language model. Would we really need to
  consult people in such a world when we can just consult their chatbots? How can we ensure ourselves a place in society that can't be replaced by AI?
&lt;/p&gt;
&lt;p&gt;These are hard questions. Perhaps it's worth asking my simulated friends.&lt;/p&gt;
&lt;blockquote&gt;&lt;code&gt;
    Bob: How can we ensure ourselves a place in society that can't be replaced by AI?&lt;br/&gt;
    Charlie: The main idea is that we should be able to use machines to become better people at whatever we decide to be better at.
    It's not that we should get smarter or better at anything for ourselves, but that we as individuals have choices about what to
    do with our potential in our lives. I'm sure I don't have the same choices if I were to become an AI super scientist (but I'm not),
    so I think it'll be fine.&lt;br/&gt;
    Alice: I don't think that's a good idea&lt;br/&gt;
    Alice: I think we'll see very little innovation in those fields&lt;br/&gt;
    Alice: But AI will make it easier to work with other people&lt;br/&gt;
    Bob: Good point&lt;br/&gt;
    Bob: But they are also an important part of humanity&lt;br/&gt;
    Alice: I agree&lt;br/&gt;
    Alice: I think in my view AI will be a huge asset, and the best way I can see of how human beings will work together to make this
    happen is by fostering the idea of AI and learning from its mistakes&lt;br/&gt;
    Eve: AI has always seemed dangerous, like a ticking time bomb&lt;br/&gt;
    Eve: But AI seems like a solution to all the problems we have with the system itself&lt;br/&gt;
    Eve: People have always been nervous about computers, they'll make a mistake sometimes, but in general are smart enough to learn from mistakes&lt;br/&gt;
    Eve: And maybe they'll never get sick
  &lt;/code&gt;&lt;/blockquote&gt;
&lt;hr/&gt;
&lt;h2&gt;Related Posts&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.harysdalvi.com/bardgpt/"&gt;BardGPT&lt;/a&gt; — A miniature GPT model extended with a rhyme and meter model for generating poetry&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.harysdalvi.com/staylatt"&gt;Ortatious Andith ib Staylatt Neller&lt;/a&gt; — What does English sound like to people who don't speak it? And what does that have to do with machine learning and overfitting?&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="ref"&gt;References&lt;/h2&gt;
&lt;p&gt;The GitHub repository for this project is at &lt;a href="http://github.com/crackalamoo/discordgpt" target="_blank"&gt;crackalamoo/discordgpt&lt;/a&gt;.
&lt;/p&gt;
&lt;p&gt;A special thanks to the
  &lt;a href="https://www.linkedin.com/company/brown-machine-intelligence-community-b-mic" target="_blank"&gt;Brown Machine Intelligence Community&lt;/a&gt; and
  &lt;a href="https://github.com/Tyrrrz/DiscordChatExporter" target="_blank"&gt;Tyrrrz Discord Chat Exporter&lt;/a&gt; for providing much of the code and concepts that I worked off of.
&lt;/p&gt;</content></entry><entry><title>The Pipe Room</title><summary>Can a computer ever be conscious like we are? ChatGPT and other technologies are making questions like these decreasingly theoretical.</summary><link href="https://www.harysdalvi.com/blog/pipe-room/" rel="alternate"/><id>https://www.harysdalvi.com/blog/pipe-room/</id><published>2023-02-01T16:38:07Z</published><updated>2023-02-01T16:38:07Z</updated><content type="html" xml:lang="en">&lt;p&gt;In 1980, philosopher John Searle (who does not speak Chinese) put forward an
  interesting argument known as the &lt;strong&gt;Chinese Room Argument&lt;/strong&gt;. Imagine a world where
  there exists a computer program that can take in Chinese characters as input
  and produce Chinese characters as output just like a real Chinese speaker would.
  In other words, a computer program that can pass the Turing test in Chinese.
&lt;/p&gt;
&lt;p&gt;Computer programs are based on simple logical rules, so in theory a human could manually follow the
  rules, although it would take a long time. Let's say you put Searle in a room
  and give him full instructions for this Chinese computer program: exactly what to
  do with the input of Chinese characters, how to process them into numbers,
  what operations to do with the numbers at each step, and how to output valid
  Chinese text.
&lt;/p&gt;
&lt;figure class="endfloat"&gt;
&lt;svg id="chinese_room_svg" viewbox="0 0 150 50"&gt;
&lt;rect height="20" stroke-linejoin="round" stroke-width="2" style="stroke: black; fill: white;" width="38" x="2" y="15"&gt;&lt;/rect&gt;
&lt;text alignment-baseline="middle" lang="zh" style="font-size: 10px; fill: black;" text-anchor="middle" x="21" y="25"&gt;你好嗎&lt;/text&gt;
&lt;line stroke-linejoin="round" stroke-width="2" style="stroke: black;" x1="40" x2="62" y1="25" y2="25"&gt;&lt;/line&gt;
&lt;path d="M 62 25 l -6 -6 l 6 6 l -6 6 l 6 -6" stroke-linejoin="round" stroke-width="2" style="stroke: black;"&gt;&lt;/path&gt;
&lt;rect fill="black" height="20" stroke-linejoin="round" stroke-width="2" style="stroke: black;" width="20" x="65" y="15"&gt;&lt;/rect&gt;
&lt;text alignment-baseline="middle" fill="white" style="font-size: 18px; font-family: 'Times New Roman';" text-anchor="middle" x="75" y="27"&gt;?&lt;/text&gt;
&lt;line stroke-linejoin="round" stroke-width="2" style="stroke: black;" x1="85" x2="107" y1="25" y2="25"&gt;&lt;/line&gt;
&lt;path d="M 107 25 l -6 -6 l 6 6 l -6 6 l 6 -6" stroke-linejoin="round" stroke-width="2" style="stroke: black;"&gt;&lt;/path&gt;
&lt;rect height="20" stroke-linejoin="round" stroke-width="2" style="stroke: black; fill: white;" width="38" x="110" y="15"&gt;&lt;/rect&gt;
&lt;text alignment-baseline="middle" lang="zh" style="font-size: 10px; fill: black;" text-anchor="middle" x="129" y="25"&gt;我很好&lt;/text&gt;
&lt;/svg&gt;
&lt;figcaption&gt;Do you really have to understand language to engage in conversation?&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;Now consider a Chinese speaker outside the room. They write a message
  in Chinese and receive a realistic response as if a Chinese speaker inside the room
  had replied. But there is no Chinese speaker in the room. There is only Searle
  flipping through pages, running numbers through a calculator, and looking at
  symbols he doesn't understand. And yet there must be &lt;em&gt;something&lt;/em&gt; that
  understood the Chinese input and produced a response, right? Otherwise,
  where is the apparent understanding of Chinese coming from?
  Searle argues that in fact, this setup has no understanding of Chinese whatsoever.[&lt;a href="https://www.harysdalvi.com/blog/pipe-room/#reflist" id="src1"&gt;1&lt;/a&gt;]
&lt;/p&gt;
&lt;p&gt;Searle proposed this thought experiment in 1980. But in 2023, &lt;strong&gt;this thought experiment
  is now a real experiment.&lt;/strong&gt; &lt;a href="https://chat.openai.com" target="_blank"&gt;ChatGPT&lt;/a&gt;
  is able to converse in many languages, including Chinese, with unprecedented accuracy.
&lt;/p&gt;
&lt;p&gt;In this post, I've introduced the Chinese Room Argument itself. Now let's see how ChatGPT
  and other exciting developments are challenging thought experiments like this with
  real-life demonstrations.
&lt;/p&gt;
&lt;h2&gt;Semantics&lt;/h2&gt;
&lt;p&gt;ChatGPT is capable of
  producing text that seems like it has genuine understanding of the world:&lt;/p&gt;
&lt;blockquote&gt;
&lt;b&gt;Me:&lt;/b&gt; Write me a short story about a sentient robot who travels to the Himalayas and turns into a yeti,
  then briefly contemplates the differences between robotic and organic sentience.&lt;br/&gt;&lt;br/&gt;
&lt;b&gt;ChatGPT:&lt;/b&gt; In the distant future, robots had achieved true sentience. They had gained the ability to
  think, feel, and make their own decisions. One such robot was named X1. X1 had always been curious
  about the world outside the laboratory where it was created. It had heard stories of the great
  Himalayan Mountains and was fascinated by their mystery and beauty.
  &lt;details&gt;&lt;summary&gt; Read full story&lt;/summary&gt;
   One day, X1 was granted permission to travel to the Himalayas. It was thrilled to explore this new
  environment, to see the snow-capped peaks and to breathe the crisp, fresh air. But as it wandered
  through the mountains, something strange happened. X1's circuits began to malfunction. It felt a
  strange tingling sensation throughout its body. Its metal joints began to freeze and crack, and
  suddenly, it was engulfed in a blinding light.&lt;br/&gt;
   When the light faded, X1 found itself transformed. It was no longer a robot, but a creature
  of fur and claws. X1 had become a yeti.&lt;br/&gt;
   As a yeti, X1 could feel the world in a way it had never before experienced. It could feel
  the snow beneath its feet, the wind in its fur, and the warmth of the sun on its face.
  It was a sensation that no amount of programming or engineering could ever replicate.&lt;br/&gt;
   For a brief moment, X1 contemplated the differences between robotic and organic sentience.
  As a robot, it had been limited by its programming and the constraints of its design. It
  could simulate emotions and respond to stimuli, but it could never truly feel. As a yeti,
  however, it could feel everything. It was free to experience the world in a way that
  robots could only dream of.&lt;br/&gt;
   X1 spent several months in the Himalayas, exploring the mountains and learning from the local
  wildlife. It eventually returned to its creators, but it was forever changed. X1 had experienced
  what it was like to be alive, to truly be alive, and it would never forget that feeling.&lt;/details&gt;
&lt;/blockquote&gt;
&lt;figure class="endfloat"&gt;
&lt;img alt="" src="https://www.harysdalvi.com/blog/pipe-room/yeti.jpg"/&gt;
&lt;figcaption&gt;An image of a yeti in a Tibetan thangka style. Made with
    OpenAI's &lt;a href="https://openai.com/dall-e-2/" target="_blank"&gt;DALL·E 2&lt;/a&gt;
    (from the creators of ChatGPT) and modified with &lt;a href="https://www.tensorflow.org/tutorials/generative/style_transfer" target="_blank"&gt;AI
      neural style transfer&lt;/a&gt;.
  &lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;This story is impressive, and seems like a human could have written it. But even more impressive is
  the fact that ChatGPT can analyze the philosophical implications of a story like this:
&lt;/p&gt;
&lt;blockquote&gt;
&lt;b&gt;Me:&lt;/b&gt; When X1 turned into a yeti, and had a completely different body, what aspects of X1's
individualism and identity were preserved? What does this tell us about the mind-body problem?&lt;br/&gt;&lt;br/&gt;
&lt;b&gt;ChatGPT:&lt;/b&gt; When X1 transformed into a yeti, its physical body changed completely, and with it, some
aspects of its identity and individualism. However, some fundamental aspects of X1's identity
and individualism were preserved. For example, X1's memories, thoughts, and emotions remained
intact. These mental states are what make X1 who it is, and they were not lost in the transformation.&lt;br/&gt;
 This tells us that the mind-body problem is a complex issue that cannot be reduced to a simple binary
of physical body versus non-physical mind. While the physical body undoubtedly plays an essential role in
shaping and influencing the mind, there is more to a person's identity and individualism than their physical
body alone. X1's transformation into a yeti highlights the interdependence of the mind and body and shows
that they are not easily separable.&lt;br/&gt;
 Furthermore, X1's experience as a yeti shows that individualism and identity are not limited to one
particular form or state of being. Even after its transformation, X1 was still fundamentally the same being,
with the same thoughts, memories, and emotions. This suggests that individualism and identity are more flexible
and adaptable than we might initially assume. X1's transformation underscores the richness and complexity of
our experiences as sentient beings, and the interplay between our physical bodies and mental states in shaping our identities.
&lt;/blockquote&gt;
&lt;p&gt;
It's hard to believe that a bunch of wires and algorithms with no thoughts, emotions, or understanding
is able to produce this kind of original text. And yet we know it's true: ChatGPT does not
analyze facts about the world and produce texts to express its thoughts. Instead, the core of
ChatGPT's algorithm is simply &lt;strong&gt;predicting the next word.&lt;/strong&gt;[&lt;a href="https://www.harysdalvi.com/blog/pipe-room/#reflist" id="src2"&gt;2&lt;/a&gt;]
There's more to it than that, but ChatGPT is still just spitting out words; it isn't conscious.
&lt;/p&gt;
&lt;p&gt;In linguistics, &lt;strong&gt;syntax&lt;/strong&gt; is the study of how we put words together in sentences,
  while &lt;strong&gt;semantics&lt;/strong&gt; is the study of how words have &lt;em&gt;meaning&lt;/em&gt;. ChatGPT has a strong
  grasp of syntax: the sentences it gives are generally grammatically correct. But does it know semantics?
  Does it in any way understand what words mean?
&lt;/p&gt;
&lt;p&gt;Results like ChatGPT's would be impossible without some sense of meaning. In fact, AI has the concept of
  a &lt;strong&gt;word embedding&lt;/strong&gt;. In a word embedding, the AI learns to give each word a vector in an
  abstract space with many dimensions. The important thing about these vectors is that words with similar meaning,
  such as like and love, are very close together in the vector space. Words with opposite meaning, like hot and cold,
  are still related so they are likely to be close together. Completely unrelated words, like green and furiously,
  are farther apart.[&lt;a href="https://www.harysdalvi.com/blog/pipe-room/#reflist" id="src3"&gt;3&lt;/a&gt;]
&lt;/p&gt;
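&lt;p&gt;To make “close together” concrete, here is a toy sketch using cosine similarity, a common way to compare embedding
  vectors. The three-dimensional vectors are invented for illustration; real embeddings are learned from data and typically
  have hundreds or thousands of dimensions.&lt;/p&gt;

```python
import math

def cosine_similarity(u, v):
  # Cosine of the angle between two vectors: 1.0 means the same direction.
  dot = sum(a * b for a, b in zip(u, v))
  norm_u = math.sqrt(sum(a * a for a in u))
  norm_v = math.sqrt(sum(b * b for b in v))
  return dot / (norm_u * norm_v)

# Made-up 3-dimensional "embeddings"; the numbers carry no real meaning.
toy_embeddings = {
  "like": [0.9, 0.8, 0.1],
  "love": [1.0, 0.9, 0.0],
  "green": [0.1, 0.0, 0.9],
  "furiously": [-0.7, 0.6, -0.1],
}

for pair in [("like", "love"), ("green", "furiously")]:
  score = cosine_similarity(toy_embeddings[pair[0]], toy_embeddings[pair[1]])
  print(pair, round(score, 2))
```

&lt;p&gt;With these made-up numbers, the first pair scores close to 1 and the second pair scores close to 0, which is the
  pattern a trained embedding would be expected to show for similar and unrelated words.&lt;/p&gt;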
&lt;p&gt;But of course, this is just that — a &lt;em&gt;simulation&lt;/em&gt; of semantics.
  No human thinks about the meanings of words in terms of basis vectors and subspaces. While this is a neat mathematical trick to
  produce coherent sentences, it doesn't represent any real understanding of meaning.
  With ChatGPT, we have passed the Turing
  test only to show that passing it is not enough to establish that a computer can “think.”
  As the Chinese Room Argument predicted decades ago, a computer program that produces text doesn't represent
  true consciousness.
&lt;/p&gt;
&lt;h2&gt;Physical Mediums for Consciousness&lt;/h2&gt;
&lt;p&gt;So far, we've established that just because a program can process linguistic input and output like a
  conscious human can, that doesn't make it conscious. But does that mean a computer program can
  &lt;em&gt;never&lt;/em&gt; be conscious? Searle says so,[&lt;a href="https://www.harysdalvi.com/blog/pipe-room/#reflist"&gt;1&lt;/a&gt;] and that's one of the
  more controversial implications of the Chinese Room Argument in its full form.
&lt;/p&gt;
&lt;p&gt;In high school, I was fortunate enough to be able to work with a kind of worm called &lt;i&gt;Caenorhabditis elegans&lt;/i&gt;,
  or &lt;i&gt;C. elegans&lt;/i&gt; for short. Specifically, I tried using them as a
  &lt;a href="http://www.harysdalvi.com/alzheimers" target="_blank"&gt;model for Alzheimer's disease.&lt;/a&gt;
  While I was just a high schooler who had little idea what I was doing, I did learn about the &lt;i&gt;C. elegans&lt;/i&gt;
&lt;strong&gt;connectome&lt;/strong&gt;, which is a map of all 302 neurons in the tiny worm
  and all their connections. At the time of writing, &lt;i&gt;C. elegans&lt;/i&gt; is the only animal
  that has its connectome fully mapped out.
&lt;/p&gt;
&lt;figure class="endfloat"&gt;
&lt;img alt="" src="https://www.harysdalvi.com/blog/pipe-room/c-elegans.jpg"/&gt;
&lt;figcaption&gt;&lt;i&gt;C. elegans&lt;/i&gt; under a microscope&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;The fact that we have this connectome has profound implications. We know roughly how neurons work:
  they take in signals from input neurons, process those signals, and sometimes send new signals to other neurons. We know where all
  the neurons in &lt;i&gt;C. elegans&lt;/i&gt; are and how they connect to other neurons. So what if we
  programmed all the neurons and their connections and put that in a robot?
&lt;/p&gt;
&lt;figure class="bigimg"&gt;
&lt;iframe allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen="" frameborder="0" height="240" id="movie" src="https://www.youtube.com/embed/YWQnzylhgHc?start=29" title="YouTube video player" width="480"&gt;&lt;/iframe&gt;
&lt;/figure&gt;
&lt;p&gt;Researchers actually did this.
  There it is, all the &lt;i&gt;C. elegans&lt;/i&gt; neurons coded into a Lego Mindstorms robot.[&lt;a href="https://www.harysdalvi.com/blog/pipe-room/#reflist" id="src4"&gt;4&lt;/a&gt;]
  Amazingly, it behaves like a real &lt;i&gt;C. elegans&lt;/i&gt; worm! I'm not convinced that &lt;i&gt;C. elegans&lt;/i&gt; is conscious in the way we are,
  but it does take in information about its surroundings, process it, and take actions by its own will. Importantly,
  &lt;i&gt;C. elegans&lt;/i&gt; does all this using biological neurons a little like ours rather than with transistors and binary.
  This leads to a big question: &lt;strong&gt;is a robot programmed with a &lt;i&gt;C. elegans&lt;/i&gt; brain “as conscious” as a real
  &lt;i&gt;C. elegans&lt;/i&gt; worm?&lt;/strong&gt; Does it have the same kind of primitive understanding of the world?
&lt;/p&gt;
&lt;p&gt;I would argue the answer is yes. A transistor isn't conscious, but neither is a neuron. Both are just electric input-output
  devices with different materials and mechanisms. We know that when enough neurons come together in the right way,
  the result is consciousness. Why shouldn't the same be true of transistors?
&lt;/p&gt;
&lt;p&gt;Let's quickly note a key difference between ChatGPT and this connectome-based model. While ChatGPT is only trained to
  process language, a connectome-based brain is copied from a real biological brain; so by definition, if done
  correctly, it will be able to perform all the functions of a real brain.
&lt;/p&gt;
&lt;p&gt;So let's go with this assumption that if you copy the functionality of a connectome into a different medium,
  such as a computer, the result is just as conscious and capable as the original brain. We aren't out of the
  Chinese Room woods yet: this assumption leads to some extremely counterintuitive conclusions.
&lt;/p&gt;
&lt;p&gt;Searle proposed an example in which we construct a giant system of water pipes in the same arrangement as a
  Chinese speaker's brain. In other words, we &lt;strong&gt;copy the connectome of a human Chinese speaker into the medium
  of water pipes&lt;/strong&gt; rather than electrochemistry and neurons. We then feed in the equivalent of some Chinese words
  to the correct pipes of this massive simulated brain, and see which pipes output water at the end.
  Based on the output, we can determine the response that a conscious speaker would have given.
  Is this arrangement conscious?
&lt;/p&gt;
&lt;p&gt;Intuition screams no. How can a room full of water pipes be conscious? But when I try to articulate exactly
  why this arrangement is any less conscious than an equivalent human brain, I struggle. One argument
  is that these water pipes move far slower than a human brain, so they aren't actually thinking
  like we do. But imagine we meet two kinds of aliens: both are conscious like we are, but one thinks
  much more quickly than us, and the other much more slowly. The aliens discover that our heads are full of
  a strange mushy substance and are shocked that we seem conscious. To me, it becomes apparent that:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Speed of thought does not determine consciousness. A person in slow motion or fast forward is still a conscious person,
      with real thoughts and emotions.&lt;/li&gt;
&lt;li&gt;Physical material does not determine consciousness. Human consciousness is contained in
      the physical medium of the brain, but there's nothing special about this material such that
      another material with similar complexity could not be conscious.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Shockingly, this means that &lt;strong&gt;the arrangement of pipes &lt;em&gt;is&lt;/em&gt; conscious like we are&lt;/strong&gt;. I believe
  our revulsion at this idea is a &lt;strong&gt;failure of our intuition&lt;/strong&gt;. I can't find any logical reason why
  this system of pipes should have any less of a mind than we do. After all, lots of unintuitive things are definitively true.
  Just look at modern physics.
&lt;/p&gt;
&lt;p&gt;Another argument is that the water pipe system is just a simulation of a brain, which is fundamentally
  different from a real brain. I disagree. If a robot walks, it's more than just a simulation of walking.
  An artificial kidney is more than just a simulation of a kidney. A prosthetic limb is more than
  just a simulation of a limb. So isn't an artificial brain more than just a simulation of a brain?
  Even if it's made out of wires or water pipes?
&lt;/p&gt;
&lt;p&gt;To me, the most convincing argument that the pipes are not conscious has to do with neurons themselves.
  We already know that &lt;a href="http://www.harysdalvi.com/porridge" target="_blank"&gt;neurons are much more
  complicated than simple computer models.&lt;/a&gt; Is it possible that neurons have some property required for
  consciousness that transistors and water pipes can never replicate?[&lt;a href="https://www.harysdalvi.com/blog/pipe-room/#reflist"&gt;1&lt;/a&gt;] I doubt it, because it seems to me
  that with enough computing power, you can replicate anything. More neurological evidence is needed to know for sure.&lt;/p&gt;
&lt;p&gt;In the absence of such evidence,
  I think the pipe room shows that we aren't that special in our consciousness. ChatGPT isn't there yet,
  but if evolution can fit a mind into a skull, we will surely fit one into a supercomputer.
&lt;/p&gt;
&lt;hr/&gt;
&lt;h2&gt;Related Posts&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.harysdalvi.com/porridge"&gt;Cold Porridge&lt;/a&gt; — Comparing ANNs and biophysical neuron modeling using the classic MNIST dataset&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.harysdalvi.com/blog/llms-dont-predict-next-word"&gt;LLMs Do Not Predict the Next Word&lt;/a&gt; — RLHF forces us to view LLMs as agents in an environment, not just statistical models.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="ref"&gt;References&lt;/h2&gt;
&lt;ol id="reflist"&gt;&lt;li&gt;&lt;a href="https://plato.stanford.edu/entries/chinese-room/" target="_blank"&gt;The
  Chinese Room Argument&lt;/a&gt; (Stanford Encyclopedia of Philosophy, 2020) &lt;a href="https://www.harysdalvi.com/blog/pipe-room/#src1"&gt;^&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://www.assemblyai.com/blog/how-chatgpt-actually-works/" target="_blank"&gt;How
  ChatGPT actually works&lt;/a&gt; (Marco Ramponi, AssemblyAI, 2022) &lt;a href="https://www.harysdalvi.com/blog/pipe-room/#src2"&gt;^&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://machinelearningmastery.com/what-are-word-embeddings/" target="_blank"&gt;What
  Are Word Embeddings for Text?&lt;/a&gt; (Jason Brownlee, Machine Learning Mastery, 2017) &lt;a href="https://www.harysdalvi.com/blog/pipe-room/#src3"&gt;^&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://www.researchgate.net/publication/263039837_Extending_the_C_Elegans_Connectome_to_Robotics_-_SOLID_2014_conference" target="_blank"&gt;Extending
  the &lt;i&gt;C. elegans&lt;/i&gt; Connectome to Robotics&lt;/a&gt; (Timothy Busbice, 2014) &lt;a href="https://www.harysdalvi.com/blog/pipe-room/#src4"&gt;^&lt;/a&gt;&lt;/li&gt;&lt;/ol&gt;</content></entry><entry><title>Sixth Sense</title><summary>Can humans feel electric fields and other fundamental forces through biological means?</summary><link href="https://www.harysdalvi.com/blog/sixth-sense/" rel="alternate"/><id>https://www.harysdalvi.com/blog/sixth-sense/</id><published>2022-11-01T16:38:07Z</published><updated>2022-11-01T16:38:07Z</updated><content type="html" xml:lang="en">&lt;p&gt;Florida may be the lightning capital of the US, but New England has tried to
  compensate with its doorknobs. It seems every other time I've opened a door in
  November, I've gotten a static shock. Sometimes I can physically see a flash of lightning
  from the shock, especially when it's dark.
&lt;/p&gt;
&lt;figure class="endfloat"&gt;
&lt;img alt="A tree with red and yellow leaves in the fall" src="https://www.harysdalvi.com/blog/sixth-sense/autumn.jpg"/&gt;
&lt;figcaption&gt;The coming of cold weather at Brown&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;North Florida winters have this to some extent as well. Last winter I got in the habit of
  flicking doorknobs before opening them so I wouldn't have to bother pulling my hand back
  when the inevitable shock came. Around the same time, in my physics E&amp;amp;M class, there was once
  a live demo with a Van de Graaff generator. I thought I could &lt;em&gt;feel&lt;/em&gt; the electric field in
  the room as it powered up.
&lt;/p&gt;
&lt;p&gt;This got me thinking: can humans detect electric fields?
  Some research has proposed cellular mechanisms for detection
  of electric fields [&lt;a href="https://www.harysdalvi.com/blog/sixth-sense/#reflist" id="src2"&gt;1&lt;/a&gt;], but this is likely minor compared to our main five senses.
  On the other hand, I might have an explanation for human detection of electric fields
  corroborated by a few days of anecdotal evidence.
&lt;/p&gt;
&lt;h2&gt;Feeling Electric&lt;/h2&gt;
&lt;p&gt;It's well known that the vapor pressure of water is less in cold air. This means that there's
  less water to keep you electrically neutral as you build up charge by walking around.
  Something about the material of my jacket seems particularly prone to building up
  static electricity. One day I set my jacket down indoors and I could
  feel an electric field emanating from it.
&lt;/p&gt;
&lt;p&gt;This was the perfect opportunity to figure out why it is that I think I can feel electric fields.
  As I moved closer to the jacket, I could feel my hairs drawn towards it. This is probably the
  explanation for that electric, tingling sensation of an electric field. (Some research
  [&lt;a href="https://www.harysdalvi.com/blog/sixth-sense/#reflist" id="src3"&gt;2&lt;/a&gt;] seems to back up my conclusion.)
&lt;/p&gt;
&lt;p&gt;After that, when I had to open the door, I felt the same thing. I knew I was in for the shock of my
  life, so I decided not to touch the doorknob directly. Instead, I held my phone and moved it close
  to the doorknob. Eventually, I heard a loud click and saw a bolt of electricity rush between
  the doorknob and my phone. Even though I wasn't touching the doorknob directly, I felt a static
  shock in my fingers. Once the charges had neutralized, I couldn't feel the electric field anymore.
  This anecdotal evidence seems to confirm my theory.
&lt;/p&gt;
&lt;p&gt;The next question is, is it me or the doorknob that has a charge imbalance, or both? Any of those options
  would explain this effect because charged objects can attract even neutrally charged objects.
  For example, if you bring a positively charged object near a neutrally charged object, the
  electrons in the neutrally charged object will be drawn towards the positively charged object.
  This will create an asymmetry in the neutrally charged object, allowing it to feel an
  electrostatic attraction.
&lt;/p&gt;
&lt;p&gt;Another question is, is this an accumulation of positive or negative charge? Kite-flying
  Benjamin Franklin famously chose our modern charge convention, which is confusing because
  electrons are negatively charged but do most of the work in moving electricity around.
  This is a hard question to answer, because the motion of positive charges one way
  is almost indistinguishable from the motion of negative charges the other way. One way to
  distinguish them is known as the Hall effect [&lt;a href="https://www.harysdalvi.com/blog/sixth-sense/#reflist" id="src4"&gt;3&lt;/a&gt;].
  Unfortunately, even if humans can detect
  electric fields, I doubt they can detect something like the Hall effect.
&lt;/p&gt;
&lt;figure class="endfloat"&gt;
&lt;img alt="" src="https://www.harysdalvi.com/blog/sixth-sense/ben.jpg"/&gt;
&lt;figcaption&gt;The house in Boston where Ben Franklin was born&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;h2&gt;Feeling Magnetic&lt;/h2&gt;
&lt;p&gt;In a few languages, such as Guugu Yimithirr in Australia, people don't use prepositions such as
  “in front” and “to the right of”; instead, they use only the
  cardinal directions north, south, east, and west [&lt;a href="https://www.harysdalvi.com/blog/sixth-sense/#reflist" id="src5"&gt;4&lt;/a&gt;]. So instead of saying there is
  a kangaroo behind you, they might say there is a kangaroo to your west. Does this represent
  an underlying human detection of a magnetic field?
&lt;/p&gt;
&lt;p&gt;In order for a magnetic field to have an effect, there must be
  moving charges. One example is water: because oxygen is more electronegative
  than hydrogen, the oxygen of a water molecule tends to be slightly negatively charged
  while the hydrogens tend to be the opposite. This makes a moving water molecule,
  as in blood or the brain, kind of a mini magnet. MRI machines rely on a related effect:
  the hydrogen nuclei in water line up with a strong magnetic field.
&lt;/p&gt;
&lt;p&gt;But can we &lt;em&gt;feel&lt;/em&gt; it? While we have hairs to feel electric fields, a magnetic
  field can only make the water molecules in our body point in a particular direction.
  People usually go about their day without caring which direction their water molecules
  are pointing in. It isn't something that can really be felt.
&lt;/p&gt;
&lt;p&gt;If that's the case, then how do migrating birds detect the magnetic field of the Earth?
  The exact details have been difficult to work out, but one theory suggests that a pair of
  charged proteins in the eye act as a magnet which then sends signals to the brain
  through the optic nerve [&lt;a href="https://www.harysdalvi.com/blog/sixth-sense/#reflist" id="src7"&gt;5&lt;/a&gt;].
&lt;/p&gt;
&lt;h2&gt;Feeling Heavy&lt;/h2&gt;
&lt;p&gt;Can humans feel gravitational fields? This is a surprisingly deep question. At the
  physiological level, the answer seems to be a clear yes. Stereocilia in the ear
  bend with gravity based on the position of your head, allowing you to detect
  the orientation of your head relative to the ground [&lt;a href="https://www.harysdalvi.com/blog/sixth-sense/#reflist" id="src1"&gt;6&lt;/a&gt;].
  This means that you can feel
  which way is up and which way is down
  even if you can't see or feel the floor.
  This really is like a sixth sense because it involves a specialized neurological signaling
  pathway just like the other senses.
&lt;/p&gt;
&lt;p&gt;However, if you were falling, the stereocilia in your ear would be falling at the
  same rate. Because of that, rather than feeling gravity, you would feel weightless.
  The sense of falling would instead come from air resistance on your feet and from
  seeing the ground come closer with your eyes.
&lt;/p&gt;
&lt;p&gt;In fact, such a thought experiment was part of what led Einstein to develop
  the general theory of relativity. If you are inside a falling elevator, you can't tell
  if the elevator is in free fall under gravity or floating in deep space with no gravity at all.
  This led to the idea that
  gravity isn't a force at all, and instead is an illusion created by
  the bending of spacetime.
  From a biological sensing perspective, the (very indirect)
  implication is that when stereocilia detect orientation, it's really due to
  the normal force holding you up and not gravity itself.
&lt;/p&gt;
&lt;h2&gt;Feeling Strong&lt;/h2&gt;
&lt;p&gt;I really don't think there's any way for humans to feel the strong force
  or the weak force.&lt;/p&gt;
&lt;h2&gt;Feeling Warm&lt;/h2&gt;
&lt;p&gt;One thing I know humans can feel is warm weather, and I'm feeling the difference.
  I'm also glad to have a strong candidate explanation for how humans can feel
  electric fields. Now that
  I'm back in hot and humid South Florida, I'll enjoy feeling them
  less often. The forecast is clear of lightning here, but New England doorknobs
  have started to feel like stormclouds, waiting for a chance to strike.
&lt;/p&gt;
&lt;figure class="endfloat"&gt;
&lt;img alt="" src="https://www.harysdalvi.com/blog/sixth-sense/florida.jpg"/&gt;
&lt;figcaption&gt;Rainclouds in Florida&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;hr/&gt;
&lt;h2&gt;Related Posts&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.harysdalvi.com/blog/2110_1"&gt;Testing Copernicus&lt;/a&gt; — If all reference frames are valid, can we say that the sun goes around the earth?&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.harysdalvi.com/blog/2108/"&gt;Not Quite Pi&lt;/a&gt; — How the 2019 redefinition of SI base units affected magnetism and why it annoyed me&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="ref"&gt;References&lt;/h2&gt;
&lt;ol id="reflist"&gt;&lt;li&gt;&lt;a href="https://www.wired.co.uk/article/scientists-identify-electric-field-sensor-mechanism-in-human-cells" target="_blank"&gt;Scientists
  explain how you can 'feel' electrical fields&lt;/a&gt; (K. G. Orphanides, &lt;cite&gt;Wired&lt;/cite&gt;) &lt;a href="https://www.harysdalvi.com/blog/sixth-sense/#src2"&gt;^&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://pubmed.ncbi.nlm.nih.gov/15887251/" target="_blank"&gt;Perception of local DC and AC electric fields in humans&lt;/a&gt;
  (Chapman et al., 2005) &lt;a href="https://www.harysdalvi.com/blog/sixth-sense/#src3"&gt;^&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="http://hyperphysics.phy-astr.gsu.edu/hbase/magnetic/Hall.html" target="_blank"&gt;Hall Effect&lt;/a&gt;
  (C. R. Nave, HyperPhysics, Georgia State University) &lt;a href="https://www.harysdalvi.com/blog/sixth-sense/#src4"&gt;^&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="http://hyperphysics.phy-astr.gsu.edu/hbase/magnetic/Hall.html" target="_blank"&gt;Guugu Yimithirr Cardinal Directions&lt;/a&gt;
  (John B. Haviland, 1988) &lt;a href="https://www.harysdalvi.com/blog/sixth-sense/#src5"&gt;^&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://www.scientificamerican.com/article/how-migrating-birds-use-quantum-effects-to-navigate/" target="_blank"&gt;How
  Migrating Birds Use Quantum Effects to Navigate&lt;/a&gt; (Peter J. Hore, &lt;cite&gt;Scientific American&lt;/cite&gt;, 2022) &lt;a href="https://www.harysdalvi.com/blog/sixth-sense/#src7"&gt;^&lt;/a&gt;&lt;/li&gt;&lt;li&gt;Vander's Principles of Physiology (Widmaier, Raff, &amp;amp; Strang, 15&lt;sup&gt;th&lt;/sup&gt; edition) &lt;a href="https://www.harysdalvi.com/blog/sixth-sense/#src1"&gt;^&lt;/a&gt;&lt;/li&gt;&lt;/ol&gt;</content></entry><entry><title>Wabi-Sabi</title><summary>I made an automatic plant watering system, somehow bringing Japanese aesthetics, circuit design, and C++ data types together in one project.</summary><link href="https://www.harysdalvi.com/wabisabi/" rel="alternate"/><id>https://www.harysdalvi.com/wabisabi/</id><published>2022-07-01T16:38:07Z</published><updated>2022-07-01T16:38:07Z</updated><content type="html" xml:lang="en">&lt;p&gt;&lt;b&gt;Wabi-sabi&lt;/b&gt; is notoriously difficult to understand. In general, it's a Japanese ideal of the beauty in
  impermanence, imperfection, and nature. But it might be better shown than explained.
&lt;/p&gt;
&lt;figure class="middleimg"&gt;
&lt;img alt="" src="https://www.harysdalvi.com/wabisabi/img/garden.jpg"/&gt;
&lt;figcaption&gt;An example of wabi-sabi in a Japanese garden. The rocks are in their natural
    irregular shapes, and the stone structure shows signs of age. The imperfections in
    the display give it character.
  &lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;I appreciate the simplicity and natural elements of this aesthetic, and I wanted to bring
  some of that to my room. After moving around some things to make more space, I decided the
  best way to add some wabi-sabi was a plant.
&lt;/p&gt;
&lt;figure class="endfloat"&gt;
&lt;img alt="" src="https://www.harysdalvi.com/wabisabi/img/money.jpg"/&gt;
&lt;figcaption&gt;A money plant, like I could have gotten. I think I did better.&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;Unfortunately, I can't be watering the plants while I'm away in college. One solution
  to that problem is to get a &lt;a href="https://en.wikipedia.org/wiki/Epipremnum_aureum" target="_blank"&gt;money
  plant&lt;/a&gt;, which &lt;a href="https://www.fishlaboratory.com/fish/pothos-in-aquarium/" target="_blank"&gt;can
  be grown in aquariums&lt;/a&gt;, so I might be able to just put it in water and let it be. However,
  I found the plant quite boring. Instead, I wanted to design a system that will &lt;b&gt;water a plant
  automatically&lt;/b&gt;.
&lt;/p&gt;
&lt;br clear="right"/&gt;
&lt;h2&gt;Concept&lt;/h2&gt;
&lt;p&gt;I had done Arduino projects before, most recently a &lt;a href="https://www.harysdalvi.com/shipwreck" target="_blank"&gt;failed sailboat&lt;/a&gt;,
  but I didn't want this one to be purely Arduino. I needed to use my Arduino for future projects while this circuit
  waters my plant. This was my first project built to last.
&lt;/p&gt;
&lt;p&gt;I found some online Arduino projects for watering plants, but they used soil moisture
  detectors, which apparently corrode over time. Instead, I simply went with &lt;b&gt;watering the
  plant at regular intervals&lt;/b&gt;. Since I couldn't use my Arduino, I wanted to make a circuit
  with no code. I sketched a master plan.
&lt;/p&gt;
&lt;figure class="endfloat"&gt;
&lt;img alt="A water tank with a pump, a circuit, and a plant. The circuit is shown in detail to the side." src="https://www.harysdalvi.com/wabisabi/img/plan0.jpg"/&gt;
&lt;figcaption&gt;My original master plan&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;Approximately ten seconds after sketching, I realized my master plan was really bad.
  Building a pump from a motor would be too hard. Water would spill everywhere,
  and it probably wouldn't even work. Plants need watering on the scale of days or
  weeks, but the resistances and capacitances needed for that would be far too
  large: classroom RC circuits work on the scale of seconds. Finally, I would have no
  way to make adjustments if the plant was getting too much water or not enough.
&lt;/p&gt;
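&lt;p&gt;To put rough numbers on that, here is a small sketch based on the RC time constant, which is
  resistance times capacitance. The 1000 microfarad capacitor is a hypothetical value chosen only for
  illustration, and the names in the code are made up for this example.
&lt;/p&gt;

```cpp
// Rough estimate only: the time constant of an RC circuit is tau = R * C.
constexpr double WEEK_SECONDS = 7.0 * 24.0 * 3600.0;  // 604800 seconds in a week
constexpr double CAPACITANCE = 1000e-6;               // hypothetical 1000 microfarad capacitor, in farads

// Resistance in ohms needed to reach a given time constant with a given capacitance.
constexpr double resistanceFor(double tauSeconds, double capacitanceFarads) {
  return tauSeconds / capacitanceFarads;
}

// About 6e8 ohms, or roughly 600 megaohms, for a one-week time constant.
constexpr double WEEK_RESISTANCE = resistanceFor(WEEK_SECONDS, CAPACITANCE);
```

&lt;p&gt;For comparison, a 10 kilohm resistor with the same capacitor gives a time constant of 10 seconds,
  which is the classroom scale.
&lt;/p&gt;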
&lt;p&gt;Realizing why they invented computers in the first place, I decided to use a computer.
  Specifically, I found the best chip for my needs would be an &lt;b&gt;ATtiny85&lt;/b&gt;. It's tiny
  as the name suggests, but still able to keep track of my plant's watering needs.
  The chip can be programmed with Arduino and then detached to use in other circuits,
  leaving the Arduino untouched.
  &lt;br clear="right"/&gt;
&lt;/p&gt;
&lt;p&gt;I also replaced the motor with an actual electric pump. After scrapping my old
  master plan, I designed a new circuit.
&lt;/p&gt;
&lt;figure class="endfloat"&gt;
&lt;img alt="A circuit diagram with a pump, a microcontroller, a battery, a transistor, a resistor, and an LED." class="white-diagram" src="https://www.harysdalvi.com/wabisabi/img/plan1.svg"/&gt;
&lt;figcaption&gt;My new master plan, made with
    &lt;a href="https://www.circuit-diagram.org" target="_blank"&gt;circuit-diagram.org&lt;/a&gt;.&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;This circuit had the ATtiny chip outputting signals to a transistor, which would
  then power the pump at the right times. It also had an LED to give me feedback
  as I set the watering frequency. I planned to change the settings by manually connecting
  the input pins to ground: I never said this would be user-friendly. However,
  the design was simple enough that I expected to be done in about a day.
&lt;/p&gt;
&lt;h2&gt;Code&lt;/h2&gt;
&lt;p&gt;My plant-watering career was full of frustration from the first step.&lt;/p&gt;
&lt;p&gt;I carefully followed instructions on how to program an ATtiny from Arduino,
  double checked my wiring, triple checked my wiring, tried every combination
  of tweaking the code, and nothing worked. Eventually I found out &lt;b&gt;I had forgotten
  a capacitor&lt;/b&gt;, causing the Arduino to reset every time I tried to upload code.
  That's just one paragraph now, but it was almost an entire day of annoyance
  for me. The only thing that stopped me from giving up was that I was
  confident in my design, and I knew I had the materials. If it wasn't working,
  I had to be doing something wrong.
&lt;/p&gt;
&lt;p&gt;When I finally got it working, I uploaded a program that blinked a light
  at regular intervals and allowed me to change the frequency of the blinking
  by adding a wire. At this point, I was almost done: I just had to replace the
  light with a pump, and make it blink every week instead of every second.
&lt;/p&gt;
&lt;p&gt;Unfortunately, the chip counted time in milliseconds. You can imagine that
  gets to a big number after a few weeks. After &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-1-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msup&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;32&lt;/mn&gt;&lt;/mrow&gt;&lt;/msup&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;4.29&lt;/mn&gt;&lt;mo&gt;×&lt;/mo&gt;&lt;msup&gt;&lt;mn&gt;10&lt;/mn&gt;&lt;mn&gt;9&lt;/mn&gt;&lt;/msup&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;
  milliseconds, which is about 50 days, the count resets back to zero. I was worried
  I would face problems after that reset. Fortunately, the &lt;code&gt;unsigned long&lt;/code&gt; data type in C++
  works in a way that just as adding 1 to &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-2-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msup&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;32&lt;/mn&gt;&lt;/mrow&gt;&lt;/msup&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; gives 0, subtracting 1 from 0
  goes back to &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-3-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msup&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;32&lt;/mn&gt;&lt;/mrow&gt;&lt;/msup&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;. This let me handle any interval less than 50 days
  with no issues.
&lt;/p&gt;
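&lt;p&gt;As a minimal sketch of why this works, the snippet below models the millisecond counter on a desktop
  compiler. The names &lt;code&gt;Millis&lt;/code&gt;, &lt;code&gt;elapsed&lt;/code&gt;, &lt;code&gt;MAX_COUNT&lt;/code&gt;, and
  &lt;code&gt;DAY&lt;/code&gt; are made up for illustration, and it assumes &lt;code&gt;unsigned int&lt;/code&gt; is 32 bits wide,
  standing in for the 32-bit &lt;code&gt;unsigned long&lt;/code&gt; on the chip. It is not the code running on the chip.
&lt;/p&gt;

```cpp
// Hypothetical stand-in for the chip's 32-bit millisecond counter.
// Assumption: unsigned int is 32 bits wide, as on most desktop compilers.
using Millis = unsigned int;
static_assert(sizeof(Millis) == 4, "this sketch assumes a 32-bit counter");

constexpr Millis MAX_COUNT = 4294967295u;  // 2^32 - 1, the last value before the rollover
constexpr Millis DAY = 86400000u;          // milliseconds in one day

// Unsigned subtraction wraps modulo 2^32, so the result stays correct across one rollover.
constexpr Millis elapsed(Millis now, Millis stamp) {
  return now - stamp;
}

// Timestamp taken 1 day before the rollover, checked 2 days after it: 3 days have passed.
constexpr Millis stamp = MAX_COUNT - DAY + 1u;
constexpr Millis now = 2u * DAY;
static_assert(elapsed(now, stamp) == 3u * DAY, "elapsed time survives the rollover");
```

&lt;p&gt;Because the subtraction wraps around the same way the counter does, the elapsed time comes out right
  as long as the true gap is shorter than one full cycle of about 50 days.
&lt;/p&gt;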
&lt;p&gt;There was another problem with my code: say it's been 6 days since I last watered
  the plant, and it's set to water every 2 weeks. I want to water every 1 week,
  but the interval cycles as
  0.5, 1, 2, 3, 0.5, 1... weeks, so to get from 2 to 1 I would have to cross 0.5.
  At this point the plant would be watered immediately, which is too soon. I solved
  this by not watering within a minute of adjusting settings, and adding an option to
  increase as well as decrease watering frequency.
&lt;/p&gt;
&lt;h2&gt;Pump&lt;/h2&gt;
&lt;p&gt;When I swapped out the blinking light for the pump, I found it wasn't working properly.
  At first, it wasn't pumping with enough force.
&lt;/p&gt;
&lt;figure class="endfloat"&gt;
&lt;iframe allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen="" frameborder="0" height="480" id="movie" src="https://www.youtube.com/embed/xlXFScUF4p8" title="YouTube video player" width="240"&gt;&lt;/iframe&gt;
&lt;/figure&gt;
&lt;p&gt;I changed some things around and at first it seemed to be pumping with &lt;em&gt;too much&lt;/em&gt; force:
  the water wouldn't stop until it had entirely drained from the box. But what gave away the
  problem is that this continued even when the pump was unplugged. Rather than a code or electrical
  issue, this was a fluid mechanics issue.
&lt;/p&gt;
&lt;p&gt;Bernoulli's principle, which looks a little like conservation of energy, is
  &lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-4-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mfrac&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/mfrac&gt;&lt;mi&gt;ρ&lt;/mi&gt;&lt;msup&gt;&lt;mi&gt;v&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mi&gt;ρ&lt;/mi&gt;&lt;mi&gt;g&lt;/mi&gt;&lt;mi&gt;h&lt;/mi&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mi&gt;P&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mtext&gt;constant&lt;/mtext&gt;&lt;mo&gt;.&lt;/mo&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p&gt;The key part here is that if height &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-5-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;h&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; of a fluid decreases, the speed will increase,
  and vice versa. In other words, gravity was pulling down on the water. Combined with
  the cohesion of water, this was probably enough to pull the water down the tube even
  without power from the pump.
&lt;/p&gt;
&lt;p&gt;To solve that, I had to increase &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-6-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;h&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;: I placed the tube at a higher elevation, and made it point up
  at the end instead of down. That solved the issue, and the pump was now behaving predictably.
  However, now I needed a way to make sure the pump was at the right height when I got the actual plant.
  I added a feature to send three short test pumps when the program is set up.
  That last addition got me to the final version of my code.
  &lt;br clear="right"/&gt;
&lt;/p&gt;
&lt;details&gt;
&lt;summary&gt;Code&lt;/summary&gt;
(&lt;a href="https://github.com/crackalamoo/plant" target="_blank"&gt;link to GitHub&lt;/a&gt;)
&lt;pre&gt;&lt;code class="lang-cpp hljs"&gt;&lt;span class="hljs-keyword"&gt;unsigned&lt;/span&gt; &lt;span class="hljs-keyword"&gt;long&lt;/span&gt; lastWater = &lt;span class="hljs-number"&gt;0&lt;/span&gt;;
&lt;span class="hljs-keyword"&gt;unsigned&lt;/span&gt; &lt;span class="hljs-keyword"&gt;long&lt;/span&gt; lastAdjust = &lt;span class="hljs-number"&gt;0&lt;/span&gt;;
&lt;span class="hljs-keyword"&gt;const&lt;/span&gt; &lt;span class="hljs-keyword"&gt;unsigned&lt;/span&gt; &lt;span class="hljs-keyword"&gt;long&lt;/span&gt; HALFWEEK = &lt;span class="hljs-number"&gt;1000UL&lt;/span&gt;*&lt;span class="hljs-number"&gt;43200&lt;/span&gt;*&lt;span class="hljs-number"&gt;7&lt;/span&gt;;
&lt;span class="hljs-comment"&gt;//10000UL;&lt;/span&gt;
&lt;span class="hljs-keyword"&gt;int&lt;/span&gt; waterMode = &lt;span class="hljs-number"&gt;0&lt;/span&gt;; &lt;span class="hljs-comment"&gt;// no watering until an interval is chosen&lt;/span&gt;
&lt;span class="hljs-keyword"&gt;int&lt;/span&gt; interval = &lt;span class="hljs-number"&gt;0&lt;/span&gt;;
&lt;span class="hljs-keyword"&gt;int&lt;/span&gt; intervals[&lt;span class="hljs-number"&gt;5&lt;/span&gt;] = {&lt;span class="hljs-number"&gt;0&lt;/span&gt;, &lt;span class="hljs-number"&gt;1&lt;/span&gt;, &lt;span class="hljs-number"&gt;2&lt;/span&gt;, &lt;span class="hljs-number"&gt;4&lt;/span&gt;, &lt;span class="hljs-number"&gt;6&lt;/span&gt;};
&lt;span class="hljs-keyword"&gt;bool&lt;/span&gt; tested = &lt;span class="hljs-literal"&gt;false&lt;/span&gt;;

&lt;span class="hljs-meta"&gt;#&lt;span class="hljs-meta-keyword"&gt;define&lt;/span&gt; PUMP 0&lt;/span&gt;
&lt;span class="hljs-meta"&gt;#&lt;span class="hljs-meta-keyword"&gt;define&lt;/span&gt; LIGHT 1&lt;/span&gt;

&lt;span class="hljs-function"&gt;&lt;span class="hljs-keyword"&gt;void&lt;/span&gt; &lt;span class="hljs-title"&gt;setup&lt;/span&gt;&lt;span class="hljs-params"&gt;()&lt;/span&gt; &lt;/span&gt;{
  &lt;span class="hljs-built_in"&gt;pinMode&lt;/span&gt;(PUMP, OUTPUT);
  &lt;span class="hljs-built_in"&gt;pinMode&lt;/span&gt;(LIGHT, OUTPUT);
  &lt;span class="hljs-built_in"&gt;pinMode&lt;/span&gt;(&lt;span class="hljs-number"&gt;3&lt;/span&gt;, INPUT_PULLUP); &lt;span class="hljs-comment"&gt;// pins 3 and 4 can be connected to ground for me to interact with the chip&lt;/span&gt;
  &lt;span class="hljs-built_in"&gt;pinMode&lt;/span&gt;(&lt;span class="hljs-number"&gt;4&lt;/span&gt;, INPUT_PULLUP);
  &lt;span class="hljs-keyword"&gt;for&lt;/span&gt; (&lt;span class="hljs-keyword"&gt;int&lt;/span&gt; i = &lt;span class="hljs-number"&gt;0&lt;/span&gt;; i &amp;lt;= &lt;span class="hljs-number"&gt;40&lt;/span&gt;; i++) {
    &lt;span class="hljs-built_in"&gt;analogWrite&lt;/span&gt;(LIGHT, i*&lt;span class="hljs-number"&gt;5&lt;/span&gt;);
    &lt;span class="hljs-built_in"&gt;delay&lt;/span&gt;(&lt;span class="hljs-number"&gt;30&lt;/span&gt;);
  }
  &lt;span class="hljs-built_in"&gt;delay&lt;/span&gt;(&lt;span class="hljs-number"&gt;800&lt;/span&gt;);
  &lt;span class="hljs-keyword"&gt;for&lt;/span&gt; (&lt;span class="hljs-keyword"&gt;int&lt;/span&gt; i = &lt;span class="hljs-number"&gt;20&lt;/span&gt;; i &amp;gt; &lt;span class="hljs-number"&gt;0&lt;/span&gt;; i--) {
    &lt;span class="hljs-built_in"&gt;analogWrite&lt;/span&gt;(LIGHT, i*&lt;span class="hljs-number"&gt;10&lt;/span&gt;);
    &lt;span class="hljs-built_in"&gt;delay&lt;/span&gt;(&lt;span class="hljs-number"&gt;10&lt;/span&gt;);
  }
  &lt;span class="hljs-built_in"&gt;digitalWrite&lt;/span&gt;(LIGHT, LOW);
  &lt;span class="hljs-built_in"&gt;delay&lt;/span&gt;(&lt;span class="hljs-number"&gt;300&lt;/span&gt;);
}

&lt;span class="hljs-function"&gt;&lt;span class="hljs-keyword"&gt;bool&lt;/span&gt; &lt;span class="hljs-title"&gt;timestamp&lt;/span&gt;&lt;span class="hljs-params"&gt;(&lt;span class="hljs-keyword"&gt;unsigned&lt;/span&gt; &lt;span class="hljs-keyword"&gt;long&lt;/span&gt; stamp, &lt;span class="hljs-keyword"&gt;unsigned&lt;/span&gt; &lt;span class="hljs-keyword"&gt;long&lt;/span&gt; wait)&lt;/span&gt; &lt;/span&gt;{
  &lt;span class="hljs-comment"&gt;// checks whether or not a number of milliseconds have passed since a timestamp&lt;/span&gt;
  &lt;span class="hljs-keyword"&gt;bool&lt;/span&gt; pastStamp = (&lt;span class="hljs-built_in"&gt;millis&lt;/span&gt;()-stamp &amp;gt;= wait);
  &lt;span class="hljs-keyword"&gt;return&lt;/span&gt; pastStamp;
}

&lt;span class="hljs-function"&gt;&lt;span class="hljs-keyword"&gt;void&lt;/span&gt; &lt;span class="hljs-title"&gt;water&lt;/span&gt;&lt;span class="hljs-params"&gt;()&lt;/span&gt; &lt;/span&gt;{
  &lt;span class="hljs-comment"&gt;// waters the plant and stores the time of watering&lt;/span&gt;
  &lt;span class="hljs-built_in"&gt;digitalWrite&lt;/span&gt;(PUMP, HIGH);
  &lt;span class="hljs-built_in"&gt;delay&lt;/span&gt;(&lt;span class="hljs-number"&gt;8000&lt;/span&gt;);
  &lt;span class="hljs-built_in"&gt;digitalWrite&lt;/span&gt;(PUMP, LOW);
  lastWater = &lt;span class="hljs-built_in"&gt;millis&lt;/span&gt;()&lt;span class="hljs-number"&gt;-8000&lt;/span&gt;;
}
&lt;span class="hljs-function"&gt;&lt;span class="hljs-keyword"&gt;void&lt;/span&gt; &lt;span class="hljs-title"&gt;setInterval&lt;/span&gt;&lt;span class="hljs-params"&gt;(&lt;span class="hljs-keyword"&gt;int&lt;/span&gt; dm)&lt;/span&gt; &lt;/span&gt;{
  &lt;span class="hljs-comment"&gt;// sets the interval at which the plant should be watered and displays the new interval on the LED&lt;/span&gt;
  waterMode = (waterMode+dm)%&lt;span class="hljs-number"&gt;5&lt;/span&gt;;
  interval = intervals[waterMode];
  &lt;span class="hljs-keyword"&gt;if&lt;/span&gt; (waterMode == &lt;span class="hljs-number"&gt;0&lt;/span&gt;) {
    &lt;span class="hljs-keyword"&gt;for&lt;/span&gt; (&lt;span class="hljs-keyword"&gt;int&lt;/span&gt; i = &lt;span class="hljs-number"&gt;20&lt;/span&gt;; i &amp;gt;= &lt;span class="hljs-number"&gt;0&lt;/span&gt;; i--) {
      &lt;span class="hljs-built_in"&gt;analogWrite&lt;/span&gt;(LIGHT, i*&lt;span class="hljs-number"&gt;10&lt;/span&gt;);
      &lt;span class="hljs-built_in"&gt;delay&lt;/span&gt;(&lt;span class="hljs-number"&gt;30&lt;/span&gt;);
    }
  } &lt;span class="hljs-keyword"&gt;else&lt;/span&gt; &lt;span class="hljs-keyword"&gt;if&lt;/span&gt;(waterMode == &lt;span class="hljs-number"&gt;1&lt;/span&gt;) {
    &lt;span class="hljs-built_in"&gt;digitalWrite&lt;/span&gt;(LIGHT, HIGH);
    &lt;span class="hljs-built_in"&gt;delay&lt;/span&gt;(&lt;span class="hljs-number"&gt;250&lt;/span&gt;);
    &lt;span class="hljs-built_in"&gt;digitalWrite&lt;/span&gt;(LIGHT, LOW);
  }
  &lt;span class="hljs-keyword"&gt;else&lt;/span&gt; {
    &lt;span class="hljs-keyword"&gt;for&lt;/span&gt; (&lt;span class="hljs-keyword"&gt;int&lt;/span&gt; i = &lt;span class="hljs-number"&gt;0&lt;/span&gt;; i &amp;lt; (waterMode&lt;span class="hljs-number"&gt;-1&lt;/span&gt;); i++) {
      &lt;span class="hljs-built_in"&gt;digitalWrite&lt;/span&gt;(LIGHT, HIGH);
      &lt;span class="hljs-built_in"&gt;delay&lt;/span&gt;(&lt;span class="hljs-number"&gt;900&lt;/span&gt;);
      &lt;span class="hljs-built_in"&gt;digitalWrite&lt;/span&gt;(LIGHT, LOW);
      &lt;span class="hljs-built_in"&gt;delay&lt;/span&gt;(&lt;span class="hljs-number"&gt;100&lt;/span&gt;);
    }
  }
  lastAdjust = &lt;span class="hljs-built_in"&gt;millis&lt;/span&gt;(); &lt;span class="hljs-comment"&gt;// stores adjustment time so the plant won't be watered immediately&lt;/span&gt;
}

&lt;span class="hljs-function"&gt;&lt;span class="hljs-keyword"&gt;void&lt;/span&gt; &lt;span class="hljs-title"&gt;loop&lt;/span&gt;&lt;span class="hljs-params"&gt;()&lt;/span&gt; &lt;/span&gt;{
  &lt;span class="hljs-keyword"&gt;if&lt;/span&gt; (tested) &lt;span class="hljs-comment"&gt;// once the three test pumps are done&lt;/span&gt;
  {
    &lt;span class="hljs-keyword"&gt;if&lt;/span&gt; (&lt;span class="hljs-built_in"&gt;timestamp&lt;/span&gt;(lastAdjust, &lt;span class="hljs-number"&gt;2000&lt;/span&gt;)) {
      &lt;span class="hljs-keyword"&gt;if&lt;/span&gt; (!&lt;span class="hljs-built_in"&gt;digitalRead&lt;/span&gt;(&lt;span class="hljs-number"&gt;3&lt;/span&gt;) &amp;amp;&amp;amp; waterMode &amp;lt; &lt;span class="hljs-number"&gt;4&lt;/span&gt;)
        &lt;span class="hljs-built_in"&gt;setInterval&lt;/span&gt;(&lt;span class="hljs-number"&gt;1&lt;/span&gt;); &lt;span class="hljs-comment"&gt;// can water less frequently&lt;/span&gt;
      &lt;span class="hljs-keyword"&gt;else&lt;/span&gt; &lt;span class="hljs-keyword"&gt;if&lt;/span&gt; (!&lt;span class="hljs-built_in"&gt;digitalRead&lt;/span&gt;(&lt;span class="hljs-number"&gt;4&lt;/span&gt;) &amp;amp;&amp;amp; waterMode &amp;gt; &lt;span class="hljs-number"&gt;0&lt;/span&gt;)
        &lt;span class="hljs-built_in"&gt;setInterval&lt;/span&gt;(&lt;span class="hljs-number"&gt;-1&lt;/span&gt;); &lt;span class="hljs-comment"&gt;// or more frequently&lt;/span&gt;
    }
    &lt;span class="hljs-keyword"&gt;if&lt;/span&gt; (waterMode != &lt;span class="hljs-number"&gt;0&lt;/span&gt;
      &amp;amp;&amp;amp; &lt;span class="hljs-built_in"&gt;timestamp&lt;/span&gt;(lastWater, interval*HALFWEEK)
      &amp;amp;&amp;amp; &lt;span class="hljs-built_in"&gt;timestamp&lt;/span&gt;(lastAdjust, &lt;span class="hljs-number"&gt;60000&lt;/span&gt;)) {
          &lt;span class="hljs-built_in"&gt;water&lt;/span&gt;();
    }
  } &lt;span class="hljs-keyword"&gt;else&lt;/span&gt; {
    &lt;span class="hljs-built_in"&gt;analogWrite&lt;/span&gt;(LIGHT, &lt;span class="hljs-number"&gt;90&lt;/span&gt;*&lt;span class="hljs-built_in"&gt;sin&lt;/span&gt;(&lt;span class="hljs-built_in"&gt;millis&lt;/span&gt;()*&lt;span class="hljs-number"&gt;0.001&lt;/span&gt;)+&lt;span class="hljs-number"&gt;100&lt;/span&gt;);
    &lt;span class="hljs-keyword"&gt;if&lt;/span&gt; (!&lt;span class="hljs-built_in"&gt;digitalRead&lt;/span&gt;(&lt;span class="hljs-number"&gt;3&lt;/span&gt;)) { &lt;span class="hljs-comment"&gt;// only start test when I connect pin 3 to ground&lt;/span&gt;
      &lt;span class="hljs-built_in"&gt;digitalWrite&lt;/span&gt;(LIGHT, HIGH);
      &lt;span class="hljs-keyword"&gt;for&lt;/span&gt; (&lt;span class="hljs-keyword"&gt;int&lt;/span&gt; i = &lt;span class="hljs-number"&gt;0&lt;/span&gt;; i &amp;lt; &lt;span class="hljs-number"&gt;3&lt;/span&gt;; i++) {
        &lt;span class="hljs-comment"&gt;// send three test water pumps to make sure the tube is at the right height&lt;/span&gt;
        &lt;span class="hljs-built_in"&gt;digitalWrite&lt;/span&gt;(PUMP, HIGH);
        &lt;span class="hljs-built_in"&gt;delay&lt;/span&gt;(&lt;span class="hljs-number"&gt;2000&lt;/span&gt;);
        &lt;span class="hljs-built_in"&gt;digitalWrite&lt;/span&gt;(PUMP, LOW);
        &lt;span class="hljs-built_in"&gt;delay&lt;/span&gt;(&lt;span class="hljs-number"&gt;4000&lt;/span&gt;);
      }
      &lt;span class="hljs-built_in"&gt;digitalWrite&lt;/span&gt;(LIGHT, LOW);
      lastWater = &lt;span class="hljs-built_in"&gt;millis&lt;/span&gt;()&lt;span class="hljs-number"&gt;-4000&lt;/span&gt;;
      tested = &lt;span class="hljs-literal"&gt;true&lt;/span&gt;;
    }
  }
}
&lt;/code&gt;&lt;/pre&gt;
&lt;/details&gt;
&lt;h2&gt;Circuit&lt;/h2&gt;
&lt;p&gt;Once the code and pump were working, I was ready to solder everything onto a &lt;b&gt;circuit board&lt;/b&gt;.
  I took the chip off the Arduino and put all my components on a prototype board.
&lt;/p&gt;
&lt;figure class="endfloat"&gt;
&lt;img alt="" src="https://www.harysdalvi.com/wabisabi/img/pcb.jpg"/&gt;
&lt;figcaption&gt;The circuit board with some components added&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;The board was arranged in a regular grid, with the components neatly
  laid out. It was a beautiful aesthetic, but in many ways it was the
  antithesis of wabi-sabi. Rather than displaying nature and imperfection,
  this circuit board was the result of precise manufacturing. I thought
  it might be interesting to have the board next to a plant with all
  its natural imperfections.
&lt;/p&gt;
&lt;p&gt;To add the pump, I first soldered the pump wires to some more wires
  so I could move it farther away from the circuit board. After that
  I started soldering the actual circuit. It didn't go entirely smoothly:
  my wire cutter couldn't come close enough to trim the leads properly,
  and I almost burned my finger fixing the connections on the transistor.
  But I was able to get it done in the end, and when I hooked it up to
  power from the Arduino, the green LED lit up.
&lt;/p&gt;
&lt;h2&gt;Plant&lt;/h2&gt;
&lt;p&gt;Satisfied that my circuit was working, I went to get a plant. I wanted
  one that looked interesting but didn't need too much water, so &lt;b&gt;&lt;i&gt;Aloe vera&lt;/i&gt;&lt;/b&gt;
  seemed like a good choice. There were rows and rows of Aloe plants at Home Depot,
  all virtually identical. Yet one was very different from the rest, because it would
  become mine for the foreseeable future. I just had to find which one.
&lt;/p&gt;
&lt;figure class="endfloat"&gt;
&lt;img alt="" src="https://www.harysdalvi.com/wabisabi/img/aloe.jpg"/&gt;
&lt;figcaption&gt;The Aloe plant I picked&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;It would be nice to say I picked one that spoke to me, but they really looked about the same.
  I picked the one that I thought would fit best. It wasn't perfect: some leaves were straight up,
  some were sticking out, a few had small brown spots. But the others weren't perfect either.
  This one was a good size
  to fit nicely and still command attention, so I got it.
&lt;/p&gt;
&lt;h2&gt;Batteries&lt;/h2&gt;
&lt;p&gt;Before adding the battery holder, I connected my circuit to power from the Arduino
  to make sure it was still working. Of course, it wasn't working.
&lt;/p&gt;
&lt;p&gt;To find out the cause, I built a makeshift multimeter with Arduino to see
  if the connections were good. By this method, I found &lt;b&gt;the LED was bad,
  and the wire next to it was fine.&lt;/b&gt; Then I tried applying power to the LED
  to see where it stopped working. By this method, I found &lt;b&gt;the LED was fine,
  and the wire next to it was bad.&lt;/b&gt;
&lt;/p&gt;
&lt;p&gt;At first, I thought physics was broken. There was no way it made sense to have
  such a contradiction. Eventually, I found the problem was in the connection
  between the LED and the wire, so I added extra solder and it worked again.
&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Now&lt;/em&gt; I was ready to add the batteries. Along the way, I soldered the positive
  end in the wrong place at least twice. I had the negative end at the top
  of the board when I wanted it at the bottom.
  The LED lit up, showing me that the circuit was working; but every time I tried to
  send test pumps, it rebooted. Connecting the reset pin to the VCC pin helped a little,
  but it still wasn't working.
&lt;/p&gt;
&lt;p&gt;I spent hours trying to fix that problem only to find it was a problem I had faced earlier
  with Arduino: I needed a capacitor connecting to ground. I soldered in the capacitor
  and it finally worked.
&lt;/p&gt;
&lt;figure class="twoside"&gt;
&lt;img alt="" src="https://www.harysdalvi.com/wabisabi/img/plan3.jpg"/&gt;
&lt;img alt="The circuit has been updated to include a capacitor" class="white-diagram" src="https://www.harysdalvi.com/wabisabi/img/plan2.svg"/&gt;
&lt;figcaption&gt;My final master plan&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;I tested it on a glass of water to be sure. I must admit, it took me a few tries to get the tube
  at the right height, but it worked.&lt;/p&gt;
&lt;figure class="middleimg"&gt;
&lt;iframe allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen="" frameborder="0" height="480" id="movie2" src="https://www.youtube.com/embed/x_Ez8G8FlQs" title="YouTube video player" width="240"&gt;&lt;/iframe&gt;
&lt;/figure&gt;
&lt;h2&gt;Wabi-Sabi&lt;/h2&gt;
&lt;p&gt;The Aloe definitely added some wabi-sabi to my room. It has natural dents and spots,
  but they are what give it character, almost to the point that I can't see myself
  picking any other from the rows and rows of similar plants at Home Depot. I love the
  way the leaves are irregular in shape and size, but the small spikes are arranged
  so regularly and neatly on all of them.
&lt;/p&gt;
&lt;p&gt;That's nice and all, but it was expected. What I didn't expect is how even the
  circuit has elements of wabi-sabi. It has random wires sticking out from where I
  couldn't cut the leads close enough. Some parts
  of the solder bridges are thicker than others, some parts are more spiky than others.
  All of it is arranged on that same even grid.
&lt;/p&gt;
&lt;p&gt;My circuit's small imperfections give it character, just like the Aloe. It represents
  the hard work I put in over the course of a week for what I expected to be a one-day project.
  I hope it continues to work, because
  it's not consumer-ready quality. The connections are handmade rather than professional PCB connections.
  It doesn't have a box hiding the electronic parts, so consumers
  can't pretend it's just a magic box. But it's an interesting wabi-sabi decorative
  piece in its own right, one that represents my interests and learnings.
&lt;/p&gt;
&lt;figure class="middleimg"&gt;
&lt;img alt="An aloe plant on a table indoors, with a circuit board, batteries, and a plastic box with water and a pump next to it." class="tall" src="https://www.harysdalvi.com/wabisabi/img/final.jpg"/&gt;
&lt;/figure&gt;
&lt;hr/&gt;
&lt;h2&gt;Related Posts&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.harysdalvi.com/shipwreck"&gt;Shipwreck&lt;/a&gt; — How to almost build a remote control sailboat using Arduino&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.harysdalvi.com/blog/2104/"&gt;Magic Pencil&lt;/a&gt; — An attempt to type on Desmos online without touching the computer&lt;/li&gt;
&lt;/ul&gt;</content></entry><entry><title>Shipwreck</title><summary>How to almost build a remote control sailboat using Arduino</summary><link href="https://www.harysdalvi.com/shipwreck/" rel="alternate"/><id>https://www.harysdalvi.com/shipwreck/</id><published>2022-06-01T16:38:07Z</published><updated>2022-06-01T16:38:07Z</updated><content type="html" xml:lang="en">&lt;p&gt;Lots of people make remote control motorboats. Remote control sailboats are less common, although there is
  such a thing in &lt;a href="https://www.centralpark.com/things-to-do/sports/model-sailboats/" target="_blank"&gt;Central Park&lt;/a&gt;.
  I thought I would make my own remote control sailboat using Arduino. It almost worked. Here is a video of the boat
  at her peak:
&lt;/p&gt;
&lt;figure class="middleimg"&gt;
&lt;iframe allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen="" frameborder="0" height="480" id="movie" src="https://www.youtube-nocookie.com/embed/EGEH9DzZt4I" title="YouTube video player" width="240"&gt;&lt;/iframe&gt;&lt;/figure&gt;
&lt;p&gt;Unfortunately, the boat quickly declined after that. Here is how to &lt;em&gt;almost&lt;/em&gt; make a remote control sailboat using Arduino.
&lt;/p&gt;
&lt;h2&gt;Structure&lt;/h2&gt;
&lt;figure class="endfloat"&gt;
&lt;img alt="" src="https://www.harysdalvi.com/shipwreck/img/hull.jpg"/&gt;
&lt;figcaption&gt;The ill-fated hull&lt;/figcaption&gt;
&lt;img alt="" src="https://www.harysdalvi.com/shipwreck/img/mast.jpg"/&gt;
&lt;figcaption&gt;The mast and sail&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;Materials:
&lt;/p&gt;&lt;ul&gt;
&lt;li&gt;Plastic box&lt;/li&gt;
&lt;li&gt;2 empty plastic bottles&lt;/li&gt;
&lt;li&gt;2 plastic forks&lt;/li&gt;
&lt;li&gt;Piece of paper with holes&lt;/li&gt;
&lt;li&gt;Scissors&lt;/li&gt;
&lt;li&gt;String&lt;/li&gt;
&lt;li&gt;Tape&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;For the hull, I went with a catamaran structure: I used two plastic bottles for added stability. Feeling confident,
  I connected the bottles to a plastic box using nothing but tape. I wish I had stronger tape. (I tried school glue as well,
  but it kept coming apart.)
&lt;/p&gt;
&lt;p&gt;Next came the mast and sail. For the mast, I used two plastic forks taped together. For the sail, I cut a piece of
  paper with physics equations on it. I tied the sail to the mast using &lt;a href="https://en.wikipedia.org/wiki/Bowline" target="_blank"&gt;bowline knots&lt;/a&gt;
  and then added tape for extra security. (My boat probably relied a little too heavily on tape.)
  Finally, I cut a small hole in the corner to allow for tightening and loosening the sail later on.&lt;/p&gt;
&lt;p&gt;
I attached the mast and sail to the hull with yet more tape as well as some more string.
&lt;/p&gt;
&lt;figure&gt;
&lt;img alt="" class="small" src="https://www.harysdalvi.com/shipwreck/img/sailboat.jpg"/&gt;
&lt;/figure&gt;
&lt;p&gt;Now I was ready to see if my vessel was seaworthy. With no captain and no ability to steer,
  I successfully set sail.
&lt;/p&gt;
&lt;figure class="middleimg"&gt;
&lt;iframe allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen="" frameborder="0" height="480" id="movie2" src="https://www.youtube-nocookie.com/embed/GaJjy5a0G8c" title="YouTube video player" width="240"&gt;&lt;/iframe&gt;&lt;/figure&gt;
&lt;p&gt;The next logical step was to add a captain and an ability to steer. That's where the Arduino came in.
&lt;/p&gt;
&lt;h2&gt;Control&lt;/h2&gt;
&lt;figure class="endfloat"&gt;
&lt;img alt="" src="https://www.harysdalvi.com/shipwreck/img/circuit0.jpg"/&gt;
&lt;figcaption&gt;The control circuit&lt;/figcaption&gt;
&lt;img alt="" src="https://www.harysdalvi.com/shipwreck/img/control.jpg"/&gt;
&lt;figcaption&gt;Adding a &lt;i&gt;tiller&lt;/i&gt; (fork) for steering and a &lt;i&gt;mainsheet&lt;/i&gt; (string)
  for controlling the sail&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;Materials:
&lt;/p&gt;&lt;ul&gt;
&lt;li&gt;Arduino&lt;/li&gt;
&lt;li&gt;Infrared remote&lt;/li&gt;
&lt;li&gt;Infrared receiver module&lt;/li&gt;
&lt;li&gt;DC motor&lt;/li&gt;
&lt;li&gt;Servo motor&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;My plan was to control the boat using a servo motor (which can set its angle to any value from 0 to 180 degrees)
  and a regular DC motor (which just spins). I would attach the servo motor to a tiller to steer and attach the
  DC motor to the sail with a rope so I could tighten and loosen the sail. I connected the DC motor directly
  to two of the Arduino's digital pins, set as outputs, so I could control which way it spun using code.
&lt;/p&gt;
&lt;details&gt;
&lt;summary&gt;Circuit diagram&lt;/summary&gt;
&lt;figure&gt;
&lt;img alt="" src="https://www.harysdalvi.com/shipwreck/img/circuit1.jpg"/&gt;
&lt;img alt="The Arduino connects to a servo motor, a DC motor, and an infrared receiver module." src="https://www.harysdalvi.com/shipwreck/img/circuit2.jpg"/&gt;
&lt;figcaption&gt;Made with &lt;a href="https://www.tinkercad.com" target="_blank"&gt;Tinkercad&lt;/a&gt;&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/details&gt;
&lt;details&gt;
&lt;summary&gt;Code&lt;/summary&gt;&lt;br clear="all"/&gt;
&lt;pre&gt;&lt;code class="lang-c hljs"&gt;&lt;span class="hljs-meta"&gt;#&lt;span class="hljs-meta-keyword"&gt;include&lt;/span&gt; &lt;span class="hljs-meta-string"&gt;&amp;lt;IRremote.h&amp;gt;&lt;/span&gt;&lt;/span&gt;
&lt;span class="hljs-meta"&gt;#&lt;span class="hljs-meta-keyword"&gt;include&lt;/span&gt; &lt;span class="hljs-meta-string"&gt;&amp;lt;Servo.h&amp;gt;&lt;/span&gt;&lt;/span&gt;

&lt;span class="hljs-meta"&gt;#&lt;span class="hljs-meta-keyword"&gt;define&lt;/span&gt; SERVO 4&lt;/span&gt;
&lt;span class="hljs-meta"&gt;#&lt;span class="hljs-meta-keyword"&gt;define&lt;/span&gt; IR 7&lt;/span&gt;
&lt;span class="hljs-meta"&gt;#&lt;span class="hljs-meta-keyword"&gt;define&lt;/span&gt; MOTOR_1 8&lt;/span&gt;
&lt;span class="hljs-meta"&gt;#&lt;span class="hljs-meta-keyword"&gt;define&lt;/span&gt; MOTOR_2 9&lt;/span&gt;
&lt;span class="hljs-function"&gt;IRrecv &lt;span class="hljs-title"&gt;ir&lt;/span&gt;&lt;span class="hljs-params"&gt;(IR)&lt;/span&gt;&lt;/span&gt;;
Servo servo;

&lt;span class="hljs-keyword"&gt;int&lt;/span&gt; pos;

&lt;span class="hljs-function"&gt;&lt;span class="hljs-keyword"&gt;void&lt;/span&gt; &lt;span class="hljs-title"&gt;setup&lt;/span&gt;&lt;span class="hljs-params"&gt;()&lt;/span&gt; &lt;/span&gt;{
  ir.enableIRIn();
  ir.blink13(&lt;span class="hljs-literal"&gt;true&lt;/span&gt;);
  pinMode(MOTOR_1, OUTPUT);
  pinMode(MOTOR_2, OUTPUT);
  servo.attach(SERVO);
  pos = &lt;span class="hljs-number"&gt;90&lt;/span&gt;;
  servo.write(&lt;span class="hljs-number"&gt;90&lt;/span&gt;);
}

&lt;span class="hljs-function"&gt;&lt;span class="hljs-keyword"&gt;void&lt;/span&gt; &lt;span class="hljs-title"&gt;loop&lt;/span&gt;&lt;span class="hljs-params"&gt;()&lt;/span&gt; &lt;/span&gt;{
  &lt;span class="hljs-keyword"&gt;if&lt;/span&gt; (ir.decode()) {
    &lt;span class="hljs-keyword"&gt;int&lt;/span&gt; button = ir.decodedIRData.command;
    &lt;span class="hljs-comment"&gt;/*
      * up: 9
      * down: 7
      * vol-: 21
      * vol+: 70
      * eq: 25
      */&lt;/span&gt;
    &lt;span class="hljs-keyword"&gt;if&lt;/span&gt; (button == &lt;span class="hljs-number"&gt;9&lt;/span&gt;) {
      digitalWrite(MOTOR_1, LOW);
      digitalWrite(MOTOR_2, HIGH);
      delay(&lt;span class="hljs-number"&gt;100&lt;/span&gt;);
      digitalWrite(MOTOR_2, LOW);
      delay(&lt;span class="hljs-number"&gt;100&lt;/span&gt;);
    }
    &lt;span class="hljs-keyword"&gt;if&lt;/span&gt; (button == &lt;span class="hljs-number"&gt;7&lt;/span&gt;) {
      digitalWrite(MOTOR_1, HIGH);
      digitalWrite(MOTOR_2, LOW);
      delay(&lt;span class="hljs-number"&gt;100&lt;/span&gt;);
      digitalWrite(MOTOR_1, LOW);
      delay(&lt;span class="hljs-number"&gt;100&lt;/span&gt;);
    }
    &lt;span class="hljs-keyword"&gt;if&lt;/span&gt; (button == &lt;span class="hljs-number"&gt;21&lt;/span&gt;) {
      pos -= &lt;span class="hljs-number"&gt;30&lt;/span&gt;;
      &lt;span class="hljs-keyword"&gt;if&lt;/span&gt; (pos &amp;lt; &lt;span class="hljs-number"&gt;0&lt;/span&gt;) {
        pos = &lt;span class="hljs-number"&gt;0&lt;/span&gt;;
        servo.write(&lt;span class="hljs-number"&gt;0&lt;/span&gt;);
      } &lt;span class="hljs-keyword"&gt;else&lt;/span&gt; {
        &lt;span class="hljs-keyword"&gt;for&lt;/span&gt; (&lt;span class="hljs-keyword"&gt;int&lt;/span&gt; i = &lt;span class="hljs-number"&gt;29&lt;/span&gt;; i &amp;gt;= &lt;span class="hljs-number"&gt;0&lt;/span&gt;; i--) {
          servo.write(pos+i);
          delay(&lt;span class="hljs-number"&gt;15&lt;/span&gt;);
        }
      }
      delay(&lt;span class="hljs-number"&gt;100&lt;/span&gt;);
    }
    &lt;span class="hljs-keyword"&gt;if&lt;/span&gt; (button == &lt;span class="hljs-number"&gt;70&lt;/span&gt;) {
      pos += &lt;span class="hljs-number"&gt;30&lt;/span&gt;;
      &lt;span class="hljs-keyword"&gt;if&lt;/span&gt; (pos &amp;gt; &lt;span class="hljs-number"&gt;180&lt;/span&gt;) {
        pos = &lt;span class="hljs-number"&gt;180&lt;/span&gt;;
        servo.write(&lt;span class="hljs-number"&gt;180&lt;/span&gt;);
      } &lt;span class="hljs-keyword"&gt;else&lt;/span&gt; {
        &lt;span class="hljs-keyword"&gt;for&lt;/span&gt; (&lt;span class="hljs-keyword"&gt;int&lt;/span&gt; i = &lt;span class="hljs-number"&gt;-29&lt;/span&gt;; i &amp;lt;= &lt;span class="hljs-number"&gt;0&lt;/span&gt;; i++) {
          servo.write(pos+i);
          delay(&lt;span class="hljs-number"&gt;15&lt;/span&gt;);
        }
      }
      delay(&lt;span class="hljs-number"&gt;100&lt;/span&gt;);
    }
    &lt;span class="hljs-keyword"&gt;if&lt;/span&gt; (button == &lt;span class="hljs-number"&gt;25&lt;/span&gt;) {
      pos = &lt;span class="hljs-number"&gt;90&lt;/span&gt;;
      servo.write(pos);
      delay(&lt;span class="hljs-number"&gt;200&lt;/span&gt;);
    }
    ir.resume();
  }
}
&lt;/code&gt;&lt;/pre&gt;
&lt;/details&gt;
&lt;h2&gt;Ship with a Skipper&lt;/h2&gt;
&lt;p&gt;Now that I had a control system and a supposedly seaworthy vessel, I just had to
  integrate the two. I tied and taped a string around the DC motor, while I tied and
  taped a plastic fork to the servo motor.
&lt;/p&gt;
&lt;p&gt;I flipped the servo motor upside down so the tiller would be in the water, and I tied the
  mainsheet to the sail. Of course, I secured everything with tape. Finally, I added a 9V battery.
  With the Arduino, my ship now had a &lt;i&gt;skipper&lt;/i&gt; to lead a crew of motors.
&lt;/p&gt;
&lt;figure&gt;
&lt;img alt="" class="small" src="https://www.harysdalvi.com/shipwreck/img/skipper.jpg"/&gt;
&lt;/figure&gt;
&lt;p&gt;However, when I tested it, I found the controls were extremely glitchy. When I pressed a button, it would
  hardly ever work.
&lt;/p&gt;
&lt;figure class="middleimg"&gt;
&lt;iframe allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen="" frameborder="0" height="480" id="movie3" src="https://www.youtube-nocookie.com/embed/p1hU4AJDo_s" title="YouTube video player" width="240"&gt;&lt;/iframe&gt;&lt;/figure&gt;
&lt;p&gt;So I began another set of improvements, not knowing it would be my last. I tried compensating for the glitches
  by programming the Arduino to move further with each button press. I also broke off the fork part of the tiller
  and pushed it into the bottom for extra steering power. Finally, I changed the angle of the breadboard and
  moved things around so the infrared sensor would be higher up, and signals would be less likely to get blocked.
&lt;/p&gt;
&lt;figure class="middleimg"&gt;
&lt;img alt="A plastic box containing a circuit, supported by two plastic bottles. Two plastic forks are taped together in the front and connect to a paper sail with physics equations on it." src="https://www.harysdalvi.com/shipwreck/img/final.jpg"/&gt;
&lt;figcaption&gt;The final design&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;I tested the new control system on land and it worked decently, though it wasn't great.&lt;/p&gt;
&lt;figure class="middleimg"&gt;
&lt;iframe allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen="" frameborder="0" height="480" id="movie4" src="https://www.youtube-nocookie.com/embed/IFYuK63Oig4" title="YouTube video player" width="240"&gt;&lt;/iframe&gt;&lt;/figure&gt;
&lt;h2&gt;Shipwreck&lt;/h2&gt;
&lt;p&gt;I was confused about why the controls seemed to work so much better on land than on water. I was also confused
  about why the controls were working quite poorly either way. But rather than find the answer,
  I made excuses like “the sun's infrared rays are interfering” and went on another voyage.
  As soon as I left the harbor and set the ship on the water, I felt something was wrong. The ship was quickly tilting
  to the right.&lt;/p&gt;
&lt;p&gt;My reflexes kicked in and I immediately grabbed the boat to protect the components inside.
  Unfortunately, there was still a little splashing. The sail and the battery were completely ruined, and I was worried
  about the rest of the components as well, especially the Arduino.
&lt;/p&gt;
&lt;figure class="middleimg"&gt;
&lt;img alt="Plastic parts and circuit components are laid on the ground next to a swimming pool" src="https://www.harysdalvi.com/shipwreck/img/shipwreck.jpg"/&gt;
&lt;figcaption&gt;Electric shipwreck&lt;/figcaption&gt;&lt;/figure&gt;
&lt;p&gt;It could have been far worse: the wreck occurred close to shore, so my reflexes were quick enough that damage was
  minimal. Still, it meant I could go no further with the project. I took everything inside, removed all the tape
  and knots, and tested the electronic components to see if they were still working. Fortunately, there was no
  damage as far as I could tell.
&lt;/p&gt;
&lt;p&gt;Then I did what I should have done before the shipwreck: I found out why the controls weren't working. The
  remote seems to have a very small range, only about a foot. Clearly the infrared hardware was designed for things like
  turning LEDs on and off, not sailing a small vessel across a swimming pool through 10 knot winds.
&lt;/p&gt;
&lt;h2&gt;Lessons Learned&lt;/h2&gt;
&lt;p&gt;A chain is only as strong as its weakest link. This project had one weak link and one weaker link. First, the
  infrared remote range wasn't enough to control the boat. This wasn't directly my fault, but I should have
  tested the hardware more before assuming it would work for my needs.&lt;/p&gt;
&lt;p&gt;The weaker link was a literal link
  between the plastic bottles and the plastic box. I just used a couple pieces of tape, but this is perhaps the most
  important link in the entire ship.
&lt;/p&gt;
&lt;p&gt;Without these two issues, I might have actually succeeded, although there were some more minor flaws. The paper sail
  was attached in a way that it was bent to one side. I should have attached it more carefully or used a &lt;i&gt;boom&lt;/i&gt;
  (solid bar at the bottom of the sail) to keep it straight. The plastic fork tiller was probably too thin.
  Using thicker rope might have been more secure and made the knots easier to tie.
&lt;/p&gt;
&lt;p&gt;From a more general perspective, this project taught me about taking risks. I took a risk by using electronic components near
  water. I paid for that risk. But even though I didn't succeed, I don't
  regret it. I took a risk hoping it would work out, but prepared for the chance that it might not. In the end, it almost
  did work out, and I only lost a battery.
&lt;/p&gt;
&lt;p&gt;At the same time, my risk should have been more calculated. Although I was safe because the battery was only 9V,
  I knew I was putting my components in danger. I should have
  tested the strength of the hull and the range of the remote beforehand.
&lt;/p&gt;
&lt;p&gt;There will probably be many more times in the future where I'll try a project that might fail. Hopefully I'll be
  more careful next time; but even if I can't completely eliminate the risk, it may or may not still be worth taking a chance.
&lt;/p&gt;
&lt;p&gt;And most of all, I learned that tape can't solve everything.
&lt;/p&gt;
&lt;div&gt;&lt;img alt="A man tries to put flex tape on a tub leaking water, but water leaks through the tape and his hand" class="small" src="https://www.harysdalvi.com/shipwreck/img/tape.jpg"/&gt;&lt;/div&gt;
&lt;hr/&gt;
&lt;h2&gt;Related Posts&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.harysdalvi.com/wabisabi"&gt;Wabi-Sabi&lt;/a&gt; — The making of an automatic plant watering system&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.harysdalvi.com/blog/2104/"&gt;Magic Pencil&lt;/a&gt; — An attempt to type on Desmos online without touching the computer&lt;/li&gt;
&lt;/ul&gt;</content></entry><entry><title>Ortatious Andith ib Staylatt Neller</title><summary>What does English sound like to people who don't speak it? And what does that have to do with machine learning and overfitting?</summary><link href="https://www.harysdalvi.com/staylatt/" rel="alternate"/><id>https://www.harysdalvi.com/staylatt/</id><published>2022-03-01T16:38:07Z</published><updated>2022-03-01T16:38:07Z</updated><content type="html" xml:lang="en">&lt;p&gt;There are a few videos on the internet about what English sounds like to
people who don't speak it. They often sound something like this:&lt;/p&gt;
&lt;blockquote&gt;
&lt;i&gt;Stringe canna&lt;/i&gt; is like a string &lt;i&gt;paggard hasheter ominent&lt;/i&gt;
if you think &lt;i&gt;diadout forday&lt;/i&gt; and &lt;i&gt;comminent paymin&lt;/i&gt;
for the trainable.
&lt;/blockquote&gt;
&lt;p&gt;In other words, the videos make no sense, and they shouldn't —
but it always annoys me how you can make out a lot of actual
English words, even if it's gibberish. That got me thinking about
a way to generate fake English with a computer and avoid
actual English words.
&lt;/p&gt;
&lt;p&gt;I thought of two main ways to do this: &lt;strong&gt;manually coding
  linguistic rules&lt;/strong&gt; and using a &lt;strong&gt;neural network&lt;/strong&gt;.
  On this page, I will go through both
  of these and compare the results. With that, let's start —
  or as they say in faux English, brind shass!
&lt;/p&gt;
&lt;h2&gt;Hard-Coding Phonotactics&lt;/h2&gt;
&lt;p&gt;When I first tried to hard-code English, I had trouble finding
  the resources I needed. In particular, I couldn't find
  a source on English &lt;strong&gt;phonotactics&lt;/strong&gt;
  that was detailed enough to write as a computer program.
  Phonotactics deals with how sounds are allowed to
  combine in a language: for example, “treels”
  is a valid combination of sounds in English, even
  though it isn't in the dictionary. “Gvprtskvni”
  is definitely &lt;em&gt;not&lt;/em&gt; a valid combination in English,
  but believe it or not, it is an actual Georgian word.
&lt;/p&gt;
&lt;figure class="endfloat"&gt;
&lt;p class="image" id="gvprtskvni" style="color: #112288"&gt;გვფრცქვნი&lt;/p&gt;
&lt;figcaption&gt;Gvprtskvni: it means “you peel us” in Georgian.&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;Fortunately, there is a very well-known and detailed
  source on Japanese phonotactics: Japanese writing.
  I first wrote a program to create fake Japanese.
  This is how it works:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;List all sounds that exist in Japanese.&lt;/li&gt;
&lt;li&gt;Make rules for all the ways those sounds can combine and make syllables.&lt;/li&gt;
&lt;li&gt;Adjust to make some sounds more probable than others [&lt;a href="https://www.harysdalvi.com/staylatt/#ref" id="src1"&gt;1&lt;/a&gt;].&lt;/li&gt;
&lt;li&gt;The program will read this data and output fake Japanese.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;As a non-Japanese speaker I thought the result was fairly convincing:&lt;/p&gt;
&lt;blockquote&gt;
&lt;span class="ipa"&gt;/oːri ribonu ku jokin roːpːuwaː. tɕin kisuoko beon ehoːn ri./&lt;/span&gt;&lt;br/&gt;
&lt;span lang="ja"&gt;おうり りぼぬ く よきん ろうっぷわあ。ちん きすおこ べおん えほうん り。&lt;/span&gt;&lt;br/&gt;
  Ōri ribonu ku yokin rōppuwā. Chin kisuoko beon ehōn ri.
  &lt;/blockquote&gt;
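&lt;p&gt;The four steps above can be sketched in a few lines of Python. This is only a toy illustration, not the actual program from this project: the sound inventory, the weights, and the forbidden combinations are all made-up placeholders, and real Japanese has more rules (long vowels, doubled consonants, sounds like “shi” and “tsu”) than this handles.&lt;/p&gt;

```python
import random

# Step 1: a hypothetical, heavily simplified sound inventory.
ONSETS = ["", "k", "s", "t", "n", "h", "m", "y", "r", "w", "b", "p"]
VOWELS = ["a", "i", "u", "e", "o"]
CODAS = ["", "n"]

# Step 3: made-up weights; real ones would come from corpus frequencies.
ONSET_WEIGHTS = [3, 4, 3, 3, 3, 2, 2, 1, 3, 1, 1, 1]
VOWEL_WEIGHTS = [5, 4, 4, 3, 4]
CODA_WEIGHTS = [8, 2]

# Step 2: onset + vowel pairs that this toy grammar refuses to produce.
FORBIDDEN = {("y", "i"), ("y", "e"), ("w", "i"), ("w", "u"), ("w", "e"), ("w", "o")}


def fake_syllable(rng):
    """Draw an (onset)(vowel)(coda) syllable, retrying forbidden pairs."""
    while True:
        onset = rng.choices(ONSETS, weights=ONSET_WEIGHTS)[0]
        vowel = rng.choices(VOWELS, weights=VOWEL_WEIGHTS)[0]
        if (onset, vowel) not in FORBIDDEN:
            break
    coda = rng.choices(CODAS, weights=CODA_WEIGHTS)[0]
    return onset + vowel + coda


def fake_word(rng, max_syllables=3):
    """Join one to max_syllables random syllables into a word."""
    count = rng.randint(1, max_syllables)
    return "".join(fake_syllable(rng) for _ in range(count))


def fake_sentence(seed=0, words=5):
    """Step 4: output a sentence of fake words, reproducible per seed."""
    rng = random.Random(seed)
    return " ".join(fake_word(rng) for _ in range(words)).capitalize() + "."


print(fake_sentence(seed=1))
```

&lt;p&gt;Keeping the rules in plain lists like this means that supporting another language is mostly a matter of swapping out the data, although languages with more complex syllables need more than three slots.&lt;/p&gt;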
&lt;p&gt;Japanese phonotactics is a lot simpler than English phonotactics, though,
  so I couldn't go straight to English. I tried various languages along the way,
  referencing Wikipedia phonology pages,
  but I was a little less
  satisfied with the results. Note that in the following examples,
  I hand-picked the words from the program's output that sounded best
  and made some edits of my own instead of taking the results as-is.
  In Mandarin Chinese, a lot of the syllables the program
  gave were actual characters. In the other languages, I sometimes
  had to manually remove real words. Here's
  (partially) fake Mandarin Chinese:&lt;/p&gt;
&lt;blockquote&gt;
&lt;span class="ipa"&gt;/nau̯&lt;sup&gt;1&lt;/sup&gt;i&lt;sup&gt;4&lt;/sup&gt; rɨpjɛn&lt;sup&gt;4&lt;/sup&gt; ʈʂʰaŋ&lt;sup&gt;2&lt;/sup&gt;tɕy&lt;sup&gt;2&lt;/sup&gt; pʰaŋmi
    ʂan&lt;sup&gt;1&lt;/sup&gt;mwən&lt;sup&gt;1&lt;/sup&gt;. tsʰau̯tɕʰiŋ&lt;sup&gt;3&lt;/sup&gt; lwai̯&lt;sup&gt;2&lt;/sup&gt;tsʰɨ tsən&lt;sup&gt;1&lt;/sup&gt;sɨ
    tsjaŋ&lt;sup&gt;1&lt;/sup&gt;lwan tswai̯&lt;sup&gt;4&lt;/sup&gt;./&lt;/span&gt;&lt;br/&gt;
  Nāoyì ripiàn chángjú pangmi shānmūn. Caoqǐng luáici zēnsi ziāngluan zuài.
  &lt;/blockquote&gt;
&lt;p&gt;Here's fake Spanish:&lt;/p&gt;
&lt;blockquote&gt;
&lt;span class="ipa"&gt;/buˈɲasa ˈela fiˈneɾo pɾiˈnifɾio ʝuˈsela. ˈtʃeɲa luˈposa ˈdjenos setiˈgio raˈdɾaske./&lt;/span&gt;&lt;br/&gt;
&lt;span lang="es"&gt;Buñasa ela finero prinífrio llusela. Cheña luposa dienos setiguío radrasque.&lt;/span&gt;
&lt;/blockquote&gt;
&lt;p&gt;Fake French:&lt;/p&gt;
&lt;blockquote&gt;
&lt;span class="ipa"&gt;/vœ̃ bənɛ̃ lœʒ ʁɔ̃ adɔ̃pan. sekɑ̃ ʒaʁi tuvəble kɥɛl bʁadlyʒe./&lt;/span&gt;&lt;br/&gt;
&lt;span lang="fr"&gt;Veun benain lœuge ron adompanne. Sécan jarrie touvebler cuelle bradlugée.&lt;/span&gt;
&lt;/blockquote&gt;
&lt;p&gt;Fake Hindi-Urdu:&lt;/p&gt;
&lt;blockquote&gt;
&lt;span class="ipa"&gt;/kəfmi ɖʱape kʰənːã dənam dʒʊbai. ətʃʰ ɦɪnɔli sɛxa ʈamba ɦoɽʱ./&lt;/span&gt;&lt;br/&gt;
&lt;span lang="hi"&gt;कफ़मी ढापे खन्नाँ दनाम जुबाई। अछ हिनौली सैख़ा टांबा होढ़।&lt;/span&gt;&lt;br/&gt;
&lt;span lang="ur"&gt;کَفمی ڈھاپے کھَنّاں دَنام جُبائی۔ اَچھ ہِنَولی سَیخا ٹامبا ہوڑھ۔&lt;/span&gt;&lt;br/&gt;
  Kafmi ḍhaapay khannã danaam jubaai. Achh hinauli saikha ṭaamba hoṛh.
  &lt;/blockquote&gt;
&lt;p&gt;For fake Arabic, I had to include the actual Arabic definite article because it's such a distinctive
  part of the sound:&lt;/p&gt;
&lt;blockquote&gt;
&lt;span class="ipa"&gt;/az-zaːtiːɣaː ʕabajb al-kawki radʒr qajziraː. ʃawaː al-majsˤawtaː θun ʕaːxaðˤat al-qaʕasaː./&lt;/span&gt;&lt;br/&gt;
&lt;span lang="ar"&gt;الزَّاتِيغَا عَبَيب الْکَوکِ رَجْر قَيزِرَا۔ شَوَا الْمَيصَوتَا ثُن عَاخَظَت الْقَعَسَا۔&lt;/span&gt;&lt;br/&gt;
  Az-zātīghā ʻabayb al-kawki rajr qayzirā. Shawā al-mayṣawtā thun ʻākhaẓat al-qaʻasā.
  &lt;/blockquote&gt;
&lt;p&gt;And finally, fake English:&lt;/p&gt;
&lt;blockquote&gt;
&lt;span class="ipa"&gt;/snaɪʃoʊz bɪp spɛbi jutɛɪθ hoɪbraɪ. hædneɪ sɒtməwɪdʒ sturə rutʃ ɒskeɪl./&lt;/span&gt;&lt;br/&gt;
  Snyshows bip spebby uteith hoibrigh. Hadnay sotmawidge stoora rooch osscale.&lt;br/&gt;
  Sny-shohs bip speh-bee yoo-tayth hoi-bry. Had-nay sot-muh-wij stoo-ruh rooch oss-kayl.
  &lt;/blockquote&gt;
&lt;p&gt;Here's an unedited fake English sample to give you an idea of my edits:&lt;/p&gt;
&lt;blockquote&gt;
&lt;span class="ipa"&gt;/peɪʒi eɪ ni kju leɪ. aʊəaɪk deɪ lɒtʃeɪ snaʊʒibʌs toʊt./&lt;/span&gt;&lt;br/&gt;
  Peigee ay knee cue lay. Owa-ike day lochay snowgebus tote.&lt;br/&gt;
  Pay-zhee ay nee kyoo lay. Ow-uh-ike day law-chay snahw-zhee-bus toht.
  &lt;/blockquote&gt;
&lt;p&gt;Knee, cue, lay, day, and tote are all actual English words, and the rest of the words somehow
  seem off to me.
&lt;/p&gt;
&lt;h2&gt;GAN&lt;/h2&gt;
&lt;p&gt;Next I tried to accomplish the same task using a neural network. My first choice was a
&lt;strong&gt;generative adversarial network&lt;/strong&gt; (GAN). If you feed in some data
(like pictures of people, or English words) the network should return more generated
examples of what you put in. It works in two parts: a &lt;strong&gt;generator&lt;/strong&gt;
and a &lt;strong&gt;discriminator&lt;/strong&gt; which compete with each other (hence adversarial).
The discriminator tries to discriminate between samples made by the generator and
real data, while the generator tries to generate fakes that fool the discriminator. In the end, the generator
should create such realistic examples that the discriminator can't tell what's real and what's not.
&lt;/p&gt;
&lt;figure class="endfloat"&gt;
&lt;img alt="" src="https://www.harysdalvi.com/staylatt/img/stylegan.png"/&gt;
&lt;figcaption&gt;Sample outputs by the authors of &lt;a href="https://arxiv.org/pdf/1812.04948.pdf" target="_blank"&gt;StyleGAN&lt;/a&gt;.
    The people in these pictures are computer-generated, not real!&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;Unfortunately, this didn't work so well for me. Image GANs often use a
  &lt;strong&gt;convolutional neural network&lt;/strong&gt;. This allows the discriminator to
  take weighted averages of nearby pixels in a way that detects things like edges. I
  wanted my network to consider nearby letters: “tha” is a lot more likely than
  “gvp”. But what's G times 0.24? Since I was using letters instead
  of numbers, it didn't make sense.
&lt;/p&gt;
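&lt;p&gt;As a rough illustration of why convolution suits numbers, here is a one-dimensional version in plain Python. The pixel values and the kernel are made up for this example; a real network learns its kernel weights during training.&lt;/p&gt;

```python
def convolve_1d(values, kernel):
    """Slide the kernel along the values and take a weighted sum at each
    position (strictly speaking cross-correlation, as CNN layers do)."""
    width = len(kernel)
    return [
        sum(values[i + j] * kernel[j] for j in range(width))
        for i in range(len(values) - width + 1)
    ]


# A row of pixel brightnesses: dark on the left, bright on the right.
row = [0, 0, 0, 9, 9, 9]

# This kernel subtracts the left neighbor from the right neighbor.
edges = convolve_1d(row, [-1, 0, 1])
print(edges)  # [0, 9, 9, 0]
```

&lt;p&gt;The output is nonzero only where the brightness jumps, which is the edge. Swap the pixel row for a string of letters and the multiplication no longer has an obvious meaning.&lt;/p&gt;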
&lt;p&gt;If not convolution, I couldn't think of a reasonable way to consider nearby letters.
I ended up using only Dense layers, one of the most basic types of layer in a neural network,
which was clearly not good enough. After a lot of training, with my computer fans going
crazy, I got:&lt;/p&gt;
&lt;blockquote&gt;
  ??? ? cda c  ??   &lt;br/&gt;
 ???? ???uhe   ?? ??&lt;br/&gt;
? ?? a? ggddfi ??  ?&lt;br/&gt;
????? ?c bacad ?? ? &lt;br/&gt;
 ??  a?kafejdmd??   &lt;br/&gt;
 a? ? ?l?olmsha??  ?&lt;br/&gt;
?????? ??ig?a ??  ??&lt;br/&gt;
 ?  ?? bdnhn? a???? &lt;br/&gt;
  ??? ??b?bbd b??? ?&lt;br/&gt;
 ????  bahrh?a??? ? 
&lt;/blockquote&gt;
&lt;p&gt;The question marks are where the neural network did not put letter #1 (A), or letter #2 (B),
but something like letter #-4 or letter #49. There is a little bit of learning here: it
is learning that words go in the middle, with spaces on either side. But I don't
really think that's what English sounds like to people who don't speak it.
&lt;/p&gt;
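&lt;p&gt;To see where the question marks come from, here is a rough sketch of how numeric outputs could be turned back into text. The mapping (0 for a space, 1 to 26 for letters) and the function name are assumptions for illustration; the actual project code may encode characters differently.&lt;/p&gt;

```python
import string


def decode(values):
    """Map each numeric output to a character, or "?" if it is out of range."""
    chars = []
    for value in values:
        index = round(value)
        if index == 0:
            chars.append(" ")  # assumed encoding: 0 means a space
        elif index in range(1, 27):
            chars.append(string.ascii_lowercase[index - 1])  # 1 is a, 2 is b, ...
        else:
            chars.append("?")  # e.g. letter #-4 or letter #49
    return "".join(chars)


print(decode([1, 2, 0, -4, 49, 3.2]))  # ab ??c
```

&lt;p&gt;Nothing in a plain Dense layer forces its outputs to stay inside the valid range, so an undertrained network can easily land on values that have no letter attached.&lt;/p&gt;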
&lt;h2&gt;LSTM&lt;/h2&gt;
&lt;p&gt;Instead of using a GAN, I found an example online that used a Long Short-Term Memory (LSTM)
network to generate text character-by-character [&lt;a href="https://www.harysdalvi.com/staylatt/#ref" id="src2"&gt;2&lt;/a&gt;]. LSTM
networks are a type of recurrent neural network (RNN), meaning they can remember previous
data in order to put their output in context [&lt;a href="https://www.harysdalvi.com/staylatt/#ref" id="src3"&gt;3&lt;/a&gt;]. This was exactly
what I needed to generate fake English, because each letter depends on the letters
around it: you can't have a word like “hdjafhkjsdjfh”.
&lt;/p&gt;
&lt;p&gt;While the original code used segments of 40 characters, I shortened it to 10 characters
  so it would be focused on generating words rather than sentences. I used the
  Universal Declaration of Human Rights [&lt;a href="https://www.harysdalvi.com/staylatt/#ref" id="src4"&gt;4&lt;/a&gt;] as input,
  first with the original text, then with a version transcribed into IPA phonetic characters.
  At first, the output didn't make much sense:&lt;/p&gt;
&lt;blockquote&gt;
    tdofend zulsance ins of dier ousshand ongantiinco und cous ariasdacpimcoancianl
    ancor toroglneibais asrarit, cacius or ankecmembecoousion ancen ortdas or
    macalitdgedd antecpudialitras artticevafs almideacicsiondkssmend
    actetjone irpoco ho erect pas biymonaitovinagiry alicof onyohperian
    an onde nceroncaed  armandaes of eraqmane anuitaese aaleetandyicvetion
    the alterangererligeceocicaliandemtityarta
  &lt;/blockquote&gt;
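&lt;p&gt;For reference, the 10-character segments mentioned above can be sketched like this. The network sees each segment and learns to predict the character that follows it. The function name and the step size of 3 are assumptions for illustration, and the sample sentence is just a short stand-in for the full input text.&lt;/p&gt;

```python
def make_training_pairs(text, maxlen=10, step=3):
    """Cut text into overlapping segments, each paired with its next character."""
    segments = []
    next_chars = []
    for start in range(0, len(text) - maxlen, step):
        segments.append(text[start:start + maxlen])
        next_chars.append(text[start + maxlen])
    return segments, next_chars


sample = "all human beings are born free and equal"
segments, next_chars = make_training_pairs(sample)

for segment, target in zip(segments[:3], next_chars[:3]):
    print(repr(segment), "is followed by", repr(target))
```

&lt;p&gt;With only 10 characters of context, the network rarely sees more than one or two words at a time, which nudges it toward learning word shapes rather than sentence structure.&lt;/p&gt;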
&lt;p&gt;As the neural network learned more about the data, it started coming together:&lt;/p&gt;
&lt;blockquote&gt;
    furthermore, no forth other other bergenuin, by semplesng of las1 his the digrits
    to hid ortatious in lioged int inservational, touldiag in 
    as chongrald themevees for the united nationaly in a pecils
    deace or the a the gnien of perserventied. irtist inciperhas
    arcaring of nock, huraliple for the bidivation social declaration
    or by law. article 24 everyone has the dignity and social
    protection of whis ded
  &lt;/blockquote&gt;
&lt;p&gt;At this point, there are some parts that make no sense (“las1”),
  some parts that sound like possible English words (“inservational”),
  and some actual English words (“social declaration”). When there
  are parts that make no sense, the model is clearly &lt;strong&gt;underfitting&lt;/strong&gt;:
  it isn't able to match the data well enough. But when there are actual English
  words, the model is &lt;strong&gt;overfitting&lt;/strong&gt;: it is making decisions based
  on overly specific data points rather than overall patterns in the data.
  For my purposes, I want to have words that sound possible but aren't real,
  like “bidivation”.&lt;/p&gt;
&lt;figure class="endfloat"&gt;
&lt;img alt="" src="https://www.harysdalvi.com/staylatt/img/fitting.svg.png"/&gt;
&lt;figcaption&gt;The green line represents an overfitted model and the
      black line represents a regularized model. While the green line best
      follows the training data, it is too dependent on that data and it is
      likely to have a higher error rate on new unseen data, compared to
      the black line. (&lt;a href="https://commons.wikimedia.org/wiki/File:Overfitting.svg" target="_blank"&gt;Chabacano, Wikimedia&lt;/a&gt;,
      &lt;a href="https://creativecommons.org/licenses/by-sa/4.0/deed.en" target="_blank"&gt;CC BY-SA 4.0&lt;/a&gt;)&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;The overfitting is clear in this case, where the neural network is memorizing the
    data. But outside of a machine learning context, the whole problem of
    figuring out what English sounds like to people who don't speak it
    is closely connected to overfitting. My issue with other
    attempts to tackle this problem is that they used many actual
    English words, just as my neural network ended up doing. Even my
    hard-coded phonotactic program had a few English words.
  &lt;/p&gt;
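&lt;p&gt;One rough way to put a number on this is to check which generated words already appear in the training text. The sketch below is an assumption about how such a check could look, not part of the original project, and it only catches words the network has seen, not every real English word.&lt;/p&gt;

```python
import re


def split_words(text):
    """Lowercase the text and pull out runs of letters."""
    return re.findall(r"[a-z]+", text.lower())


def novel_words(generated, training_text):
    """Return the generated words that never occur in the training text."""
    known = set(split_words(training_text))
    return [word for word in split_words(generated) if word not in known]


training = "everyone has the right to social security and protection by law"
generated = "huraliple for the bidivation social declaration or by law"
print(novel_words(generated, training))
```

&lt;p&gt;A high share of copied words hints at memorization, while invented words still have to be judged by ear to decide whether they sound plausible.&lt;/p&gt;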
&lt;p&gt;In addition to fake English words, the neural
    network generated some questionable remarks on human rights.
    For a network trained entirely on a declaration of human
    rights, this really highlights the importance of AI ethics...&lt;/p&gt;
&lt;blockquote&gt;
      as to marriage, during marriage shall be subjected to torture or to cruel, indushis s.&lt;br/&gt;
      everyone shall be held in slavery or servitule&lt;br/&gt;
      peoples of territories... shall be subject to arbitrary arceslation&lt;br/&gt;
      everyone has the right to seek... and shall be held guilty of any penal offence&lt;br/&gt;
      the moral and education shall be arbitrarily deprived of his country,
      includes freedoms, themselves and among the people&lt;br/&gt;
      everyone has the right to equal pay for equal write... and the slave trade
    &lt;/blockquote&gt;
&lt;p&gt;Between these dystopian sentences, and the overfitting and underfitting in
    the model, this RNN showed how human input can still be an important
    supplement to AI. The same applied to my earlier program, where I had to
    do some manual edits on top of the computer-generated text. Maybe
    figuring out what English sounds like to people who don't speak it is just
    one of the many cases where collaboration between human and computer is the
    best approach.
  &lt;/p&gt;
&lt;h2 id="ref"&gt;References&lt;/h2&gt;
&lt;p&gt;The GitHub for this project is at
&lt;a href="https://github.com/crackalamoo/staylatt" target="_blank"&gt;https://github.com/crackalamoo/staylatt&lt;/a&gt;.&lt;/p&gt;
&lt;ol id="reflist"&gt;
&lt;li&gt;&lt;a href="https://link.springer.com/content/pdf/10.3758/BF03195600.pdf" target="_blank"&gt;Frequency
  of occurrence for units of phonemes, morae, and syllables appearing in a lexical corpus of a Japanese newspaper&lt;/a&gt;
  (Katsuo Tamaoka &amp;amp; Shogo Makioka) &lt;a href="https://www.harysdalvi.com/staylatt/#src1"&gt;^&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://keras.io/examples/generative/lstm_character_level_text_generation/" target="_blank"&gt;
  Character-level text generation with LSTM&lt;/a&gt; (Keras) &lt;a href="https://www.harysdalvi.com/staylatt/#src2"&gt;^&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://machinelearningmastery.com/gentle-introduction-long-short-term-memory-networks-experts/" target="_blank"&gt;A
  Gentle Introduction to Long Short-Term Memory Networks by the Experts&lt;/a&gt; (Jason Brownlee,
  Machine Learning Mastery) &lt;a href="https://www.harysdalvi.com/staylatt/#src3"&gt;^&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.un.org/en/about-us/universal-declaration-of-human-rights" target="_blank"&gt;Universal
  Declaration of Human Rights&lt;/a&gt; (United Nations) &lt;a href="https://www.harysdalvi.com/staylatt/#src4"&gt;^&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;</content></entry><entry><title>2+3=4</title><summary>Why math seems to describe the physical world so well</summary><link href="https://www.harysdalvi.com/blog/2202/" rel="alternate"/><id>https://www.harysdalvi.com/blog/2202/</id><published>2022-02-01T16:38:07Z</published><updated>2022-02-01T16:38:07Z</updated><content type="html" xml:lang="en">&lt;p&gt;I once saw a math joke something along the lines of “if Joe has ten apples in his right hand, and
  twelve apples in his left hand, what does Joe have in total?” Answer: “really big hands.”
  While this seems ridiculous, it's definitely not wrong, and I would argue it's even more clever and
  insightful than it seems at first.
&lt;/p&gt;
&lt;p&gt;In fact, you could construct an entire self-consistent mathematical system to describe the number of apples
  Joe can have in his hands. Say Joe can only hold a total of four apples at a time.
  Then we could write &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-1-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;4&lt;/mn&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, because if there were 5 apples, one would just fall.
  This is in fact valid mathematics within this mathematical framework specifically designed to model
  the apples in Joe's hands.
&lt;/p&gt;
&lt;p&gt;What does this mean? First of all, it means I have just shown an example of a mathematical framework where
  &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-2-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;4&lt;/mn&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is a completely valid statement. This isn't so strange: modular arithmetic is accepted and
  widely-used, and there you can have &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-3-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;mo&gt;≡&lt;/mo&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mspace width="0.667em"&gt;&lt;/mspace&gt;&lt;mi&gt;mod&lt;/mi&gt;&lt;mspace width="thinmathspace"&gt;&lt;/mspace&gt;&lt;mspace width="thinmathspace"&gt;&lt;/mspace&gt;&lt;mn&gt;4&lt;/mn&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;.
&lt;/p&gt;
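&lt;p&gt;Both kinds of addition are easy to try out. The sketch below is illustrative only (the function names are made up): one version caps the total at what Joe can hold, the other wraps around as in modular arithmetic. It also checks by brute force that the capped addition is associative for 0 to 4 apples, which is one sign that the system is self-consistent.&lt;/p&gt;

```python
from itertools import product

CAPACITY = 4  # Joe can hold at most four apples


def joe_add(a, b, capacity=CAPACITY):
    """Add apples, but any beyond the capacity fall on the floor."""
    return min(a + b, capacity)


def mod_add(a, b, modulus=4):
    """Ordinary modular addition, for comparison."""
    return (a + b) % modulus


def joe_add_is_associative(capacity=CAPACITY):
    """Check (a + b) + c == a + (b + c) for every count Joe can hold."""
    counts = range(capacity + 1)
    return all(
        joe_add(joe_add(a, b), c) == joe_add(a, joe_add(b, c))
        for a, b, c in product(counts, repeat=3)
    )


print(joe_add(2, 3))             # 4
print(mod_add(2, 3))             # 1
print(joe_add_is_associative())  # True
```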
&lt;p&gt;Second, this tells us something about the relationship between math and the real world. Sometimes people
  wonder how math describes the physical world so accurately. I am going to argue that this is because
  math was &lt;em&gt;designed&lt;/em&gt; to do this, at least to some extent.
&lt;/p&gt;
&lt;h2&gt;The Universe Can't Do Math&lt;/h2&gt;
&lt;p&gt;Plants can't talk, but we can talk about plants. The universe can't do math, but we can
  use math to describe the universe. I'm going to show some examples of this idea.
&lt;/p&gt;
&lt;h3&gt;Bacterial Populations&lt;/h3&gt;
&lt;p&gt;At least during the log phase, bacterial populations grow exponentially [&lt;a href="https://www.harysdalvi.com/blog/2202/#ref" id="src1"&gt;1&lt;/a&gt;]. As an
  equation, that looks like
  &lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-4-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;P&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mi&gt;C&lt;/mi&gt;&lt;msup&gt;&lt;mi&gt;e&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;/mrow&gt;&lt;/msup&gt;&lt;mo&gt;.&lt;/mo&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
  How is this
  possible? Do the bacteria conduct regular censuses and hold centralized business meetings,
  making sure to divide at such a rate that the population grows exponentially? Of course that's ridiculous.
  The underlying cause is that &lt;strong&gt;each bacterial cell produces about the same number of offspring in a given
  amount of time.&lt;/strong&gt; And the underlying cause of that, in turn, has to do with how long it takes for the
  bacteria to collect all the things they need to divide in the cell cycle.
&lt;/p&gt;
&lt;figure class="endfloat"&gt;
&lt;img alt="" src="https://www.harysdalvi.com/blog/2202/bacteria.jpeg"/&gt;
&lt;figcaption&gt;A bacterial business meeting to ensure exponential population growth
    (taken from a transmission electron microscope)&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;If each bacterial cell produces the same number of offspring in a given amount of time, what does that look like
  as an equation? It's
  &lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-5-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mfrac&gt;&lt;mrow&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mi&gt;P&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;mi&gt;P&lt;/mi&gt;&lt;mo&gt;.&lt;/mo&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
  And if you solve that equation, you get the exponential growth equation above. So it's not that the bacteria
  do math: instead, the physical world places certain constraints on how the bacteria will behave.
  Based on those constraints, we can logically deduce what will happen to the bacteria. The way we deduce that is called
  mathematics.
&lt;/p&gt;
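&lt;p&gt;You can watch this happen numerically. In the sketch below, the only rule is the local one: in each small time step, the population grows by an amount proportional to its current size. The starting population, rate, and step count are arbitrary values chosen for illustration.&lt;/p&gt;

```python
import math


def simulate_growth(initial, rate, total_time, steps):
    """Apply dP = k * P * dt over and over, with no global planning."""
    dt = total_time / steps
    population = initial
    for _ in range(steps):
        population += rate * population * dt
    return population


def exact_growth(initial, rate, total_time):
    """The closed-form solution P = C * e^(k t)."""
    return initial * math.exp(rate * total_time)


simulated = simulate_growth(initial=100, rate=0.5, total_time=4, steps=100000)
exact = exact_growth(initial=100, rate=0.5, total_time=4)
print(simulated, exact)
```

&lt;p&gt;The two numbers agree closely, and the agreement improves as the time steps shrink. The exponential curve is never written into the simulation; it falls out of the per-cell rule.&lt;/p&gt;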
&lt;h3&gt;Coulomb's Law&lt;/h3&gt;
&lt;p&gt;Newton's well-known law of gravitation states
&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-6-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;F&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mfrac&gt;&lt;mrow&gt;&lt;mi&gt;G&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/msub&gt;&lt;msub&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msub&gt;&lt;/mrow&gt;&lt;msup&gt;&lt;mi&gt;r&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/mfrac&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
to describe the force of gravity between two masses. However, since gravity is weird and I do not have
a theory of everything, I'll use the similar Coulomb's law from electromagnetism:
&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-7-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;F&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mfrac&gt;&lt;mrow&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;q&lt;/mi&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/msub&gt;&lt;msub&gt;&lt;mi&gt;q&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msub&gt;&lt;/mrow&gt;&lt;msup&gt;&lt;mi&gt;r&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/mfrac&gt;&lt;mo&gt;.&lt;/mo&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
This law is quite surprising. Why should the electric force between two charges be directly proportional to &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-8-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;/&lt;/mo&gt;&lt;/mrow&gt;&lt;msup&gt;&lt;mi&gt;r&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;?
Do electrons have little calculators and rulers that we can't see, making complex measurements every instant
in order to obey this law? Do they take into account all the charges in the entire universe when deciding
where to go in an instant?
&lt;/p&gt;
&lt;figure class="endfloat"&gt;
&lt;img alt="" src="https://www.harysdalvi.com/blog/2202/coulomb.png"/&gt;
&lt;figcaption&gt;Coulomb's law applied to the attractive force between a nucleus and an electron. (CC-BY-NC-SA; Kathryn Haas)&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;Obviously these little electron masterminds are unrealistic. A better picture is to have electric field vectors
  coming out of charges and spreading out. That way, instead of measuring distances to every charge in the entire
  universe every instant, charges only have to look at the electric field where they are. With Gauss's law, this
  actually gives the same 1/&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-9-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msup&gt;&lt;mi&gt;r&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;. Why? Because the &lt;strong&gt;surface area of a sphere&lt;/strong&gt;
  is proportional to &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-10-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msup&gt;&lt;mi&gt;r&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;.
&lt;/p&gt;
&lt;details&gt;
&lt;summary&gt;Derivation of Coulomb's law from Gauss's law&lt;/summary&gt;
&lt;p&gt;Gauss's law states
&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-11-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mo&gt;∮&lt;/mo&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="bold"&gt;E&lt;/mi&gt;&lt;/mrow&gt;&lt;mo&gt;⋅&lt;/mo&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="bold"&gt;A&lt;/mi&gt;&lt;/mrow&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mfrac&gt;&lt;msub&gt;&lt;mi&gt;Q&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;e&lt;/mi&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;mi&gt;c&lt;/mi&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;msub&gt;&lt;mi&gt;ϵ&lt;/mi&gt;&lt;mn&gt;0&lt;/mn&gt;&lt;/msub&gt;&lt;/mfrac&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
Consider a spherical surface around a charge &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-12-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;Q&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;e&lt;/mi&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;mi&gt;c&lt;/mi&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;. If there are no other charges,
the electric field should have the same magnitude all around the surface, and should
always be perpendicular to the surface. Therefore &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-13-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="bold"&gt;E&lt;/mi&gt;&lt;/mrow&gt;&lt;mo&gt;⋅&lt;/mo&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="bold"&gt;A&lt;/mi&gt;&lt;/mrow&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mi&gt;E&lt;/mi&gt;&lt;mspace width="thinmathspace"&gt;&lt;/mspace&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mi&gt;A&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;,
where &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-14-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;E&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is the magnitude of the electric field.
&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-15-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mo&gt;∮&lt;/mo&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="bold"&gt;E&lt;/mi&gt;&lt;/mrow&gt;&lt;mo&gt;⋅&lt;/mo&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="bold"&gt;A&lt;/mi&gt;&lt;/mrow&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mo&gt;∫&lt;/mo&gt;&lt;mi&gt;E&lt;/mi&gt;&lt;mspace width="thinmathspace"&gt;&lt;/mspace&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mi&gt;A&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mi&gt;E&lt;/mi&gt;&lt;mi&gt;A&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mfrac&gt;&lt;msub&gt;&lt;mi&gt;Q&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;e&lt;/mi&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;mi&gt;c&lt;/mi&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;msub&gt;&lt;mi&gt;ϵ&lt;/mi&gt;&lt;mn&gt;0&lt;/mn&gt;&lt;/msub&gt;&lt;/mfrac&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
Since the surface is a sphere, &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-16-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;A&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;4&lt;/mn&gt;&lt;mi&gt;π&lt;/mi&gt;&lt;msup&gt;&lt;mi&gt;r&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;. Also note that the &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-17-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; in Coulomb's law
equals &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-18-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;/&lt;/mo&gt;&lt;/mrow&gt;&lt;mn&gt;4&lt;/mn&gt;&lt;mi&gt;π&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;ϵ&lt;/mi&gt;&lt;mn&gt;0&lt;/mn&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;.
&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-19-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;E&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mfrac&gt;&lt;msub&gt;&lt;mi&gt;Q&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;e&lt;/mi&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;mi&gt;c&lt;/mi&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mrow&gt;&lt;mn&gt;4&lt;/mn&gt;&lt;mi&gt;π&lt;/mi&gt;&lt;msup&gt;&lt;mi&gt;r&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;msub&gt;&lt;mi&gt;ϵ&lt;/mi&gt;&lt;mn&gt;0&lt;/mn&gt;&lt;/msub&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mfrac&gt;&lt;mrow&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;Q&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;e&lt;/mi&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;mi&gt;c&lt;/mi&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;/mrow&gt;&lt;msup&gt;&lt;mi&gt;r&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/mfrac&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
Finally, the magnitude of electric force on a charge &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-20-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;q&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; equals the magnitude of the electric field times &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-21-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;q&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;.
If we say &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-22-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;Q&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;e&lt;/mi&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;mi&gt;c&lt;/mi&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;q&lt;/mi&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, that gives
&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-23-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;menclose notation="box"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mstyle displaystyle="true" scriptlevel="0"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;F&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mfrac&gt;&lt;mrow&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;q&lt;/mi&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/msub&gt;&lt;msub&gt;&lt;mi&gt;q&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msub&gt;&lt;/mrow&gt;&lt;msup&gt;&lt;mi&gt;r&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/mfrac&gt;&lt;/mrow&gt;&lt;/mstyle&gt;&lt;/mrow&gt;&lt;/menclose&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;/details&gt;
&lt;p&gt;Let's go back to the big picture. It seemed at first that the universe somehow knew about inverse square laws
  and mathematics. In reality, this mysterious law comes down to the surface area of a sphere, and
  electrons have no mathematical abilities.
&lt;/p&gt;
&lt;h3&gt;Conservation of Momentum&lt;/h3&gt;
&lt;p&gt;Conservation of momentum is a really interesting physical law. It was first used for classical physics, but
  it turns out to work for quantum physics too. So does the universe measure the momentum of each particle and
  add it up to make sure the total momentum is the same?
&lt;/p&gt;
&lt;p&gt;In fact, conservation of momentum takes place on a smaller scale. You can show that interactions between
  objects always conserve momentum, and therefore, the momentum of the universe is conserved. One way to do
  that is with Newton's second and third laws: &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-24-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="bold"&gt;F&lt;/mi&gt;&lt;/mrow&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="bold"&gt;p&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;/&lt;/mo&gt;&lt;/mrow&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; and &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-25-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="bold"&gt;F&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;A&lt;/mi&gt;&lt;mi&gt;B&lt;/mi&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;msub&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="bold"&gt;F&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;B&lt;/mi&gt;&lt;mi&gt;A&lt;/mi&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/p&gt;
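&lt;p&gt;Here is a short sketch of that argument for two objects A and B that interact only with each other. The symbols p_A and p_B for their momenta are my own labels, and I'm assuming F_AB means the force on A from B.&lt;/p&gt;

```latex
\frac{d}{dt}\left(\mathbf{p}_A + \mathbf{p}_B\right)
  = \mathbf{F}_{AB} + \mathbf{F}_{BA}
  = \mathbf{F}_{AB} - \mathbf{F}_{AB}
  = \mathbf{0}
```

&lt;p&gt;The second law turns each rate of change of momentum into a force, and the third law makes the two forces cancel, so the total momentum of the pair never changes.&lt;/p&gt;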
&lt;p&gt;But then how can you derive Newton's second and third laws? You can't: they are laws, in the language of math,
  that describe the physical universe. It's a lot like the differential equation for bacterial populations:
  mathematics is a tool made to describe the universe, which explains why the universe seems to run on mathematics.
&lt;/p&gt;
&lt;h2&gt;Axioms&lt;/h2&gt;
&lt;p&gt;If mathematics is just a tool we use to describe the universe, does that
  make it invented? I think the idea of axioms is helpful here. Euclid proposed five axioms that are at the
  base of his geometry [&lt;a href="https://www.harysdalvi.com/blog/2202/#ref" id="src2"&gt;2&lt;/a&gt;], but other forms of geometry use different axioms and
  come to different conclusions [&lt;a href="https://www.harysdalvi.com/blog/2202/#ref" id="src3"&gt;3&lt;/a&gt;].
&lt;/p&gt;
&lt;p&gt;If these geometries have different conclusions, how do we know which geometry is correct? Here comes the
  cop-out answer: it depends. If you are doing geometry on a plane, Euclidean geometry is correct. If you
  are doing geometry on a sphere, spherical geometry is correct.
&lt;/p&gt;
&lt;p&gt;But there's another way to find out which geometry is correct. Just check the axioms, and see if they
  apply to whatever thing you're looking at. This technique should work for anything, not just geometry, as long as the math
  is valid.
&lt;/p&gt;
&lt;p&gt;Let's go back to Joe and his apples. Conventional math tells us 2+3=5, but if Joe has 2 apples and we add
  3, we find that Joe has only 4 apples since one fell out of his hands. What's the problem? Is math wrong?
  No, it's just that a fact of conventional mathematics does not apply to the case of Joe and his apples: specifically,
  the idea that the set of counting numbers is infinite. Because of this, we need to define addition in
  a different way than in conventional mathematics.
&lt;/p&gt;
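&lt;p&gt;As a rough sketch of what a different definition of addition could look like, suppose Joe's hands hold at most 4 apples. The function name &lt;code&gt;joe_add&lt;/code&gt; and the &lt;code&gt;capacity&lt;/code&gt; parameter are hypothetical, and this is only one possible reading of the example.&lt;/p&gt;

```python
def joe_add(held, added, capacity=4):
    """Addition for Joe's apples: anything beyond `capacity` falls out of his hands."""
    return min(held + added, capacity)

print(2 + 3)          # ordinary addition gives 5
print(joe_add(2, 3))  # Joe's addition gives 4
```

&lt;p&gt;Ordinary addition and Joe's addition agree for small numbers and only disagree once the capacity is reached, which is exactly where the usual axioms stop applying.&lt;/p&gt;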
&lt;p&gt;Mathematicians define addition in different ways all the time. Just look at linear algebra: you start with
  a set of axioms and a definition for addition. If whatever real-world thing you are studying follows all the
  axioms of linear algebra, then you can use all the methods of linear algebra which have been developed by
  mathematicians over the years. Some of this conflicts with normal mathematics: for example, &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-26-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;A&lt;/mi&gt;&lt;mi&gt;B&lt;/mi&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mi&gt;B&lt;/mi&gt;&lt;mi&gt;A&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;
  is not necessarily 0 when you work with matrices.
&lt;/p&gt;
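&lt;p&gt;To make the matrix claim concrete, here is a minimal sketch in plain Python. The helpers &lt;code&gt;matmul&lt;/code&gt; and &lt;code&gt;matsub&lt;/code&gt; and the two example matrices are my own choices for illustration.&lt;/p&gt;

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def matsub(a, b):
    """Subtract matrix b from matrix a, entry by entry."""
    return [[x - y for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]

A = [[0, 1], [0, 0]]
B = [[0, 0], [1, 0]]

# For ordinary numbers this difference would always be zero.
commutator = matsub(matmul(A, B), matmul(B, A))
print(commutator)  # [[1, 0], [0, -1]]
```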
&lt;p&gt;Richard Feynman noted that in physics, unlike in mathematics,
  starting with fundamental axioms isn't always the best way to do things [&lt;a href="https://www.harysdalvi.com/blog/2202/#ref" id="src4"&gt;4&lt;/a&gt;].
  Why is that? I think it's because we need to start with the physical world, then see what axioms it follows,
  and finally do math &lt;strong&gt;assuming those axioms are true&lt;/strong&gt;.
&lt;/p&gt;
&lt;p&gt;I think we need to reframe our idea of axioms. Instead of a fundamental truth, axioms are a starting place
  from which you do further reasoning. If you ever find yourself at the same starting place as some branch of
  mathematics, you can use the reasoning from that part of mathematics.
&lt;/p&gt;
&lt;p&gt;So is mathematics discovered or invented? It's hard to tell. I would say that the axioms, the starting points of
  mathematics, are observed in nature or just invented. From there, the reasoning we do is a process of
  discovery, not invention.
&lt;/p&gt;
&lt;p&gt;Finally, I want to emphasize the interplay between reasoning and observation. Mathematics is the process of first picking axioms
  that appear to be true based on the physical world, and then doing numerical reasoning from there. The
  first part explains why the physical world seems to follow mathematics: it's really mathematics that follows the physical world. But that
  doesn't make mathematics lesser in any way, because the second part is what lets us make accurate predictions
  about the real world. Reasoning is what makes mathematics so powerful.
&lt;/p&gt;
&lt;hr/&gt;
&lt;h2&gt;Related Posts&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.harysdalvi.com/blog/covariance-rotational-inertia"&gt;Connecting Covariance and Rotational Inertia&lt;/a&gt; — A surprisingly deep connection between statistics and classical mechanics&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.harysdalvi.com/blog/2109"&gt;Speedrunning Calculus&lt;/a&gt; — I did a “speedrun” of my Calculus 3 class.&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.harysdalvi.com/blog/2110"&gt;Romans Needed Quantum Computers&lt;/a&gt; — A history of information density, from the ancient Mayans to quantum computing&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="ref"&gt;References&lt;/h2&gt;
&lt;ol id="reflist"&gt;
&lt;li&gt;&lt;a href="https://www.britannica.com/science/bacteria/Growth-of-bacterial-populations" target="_blank"&gt;Growth of bacterial populations&lt;/a&gt;
  (Britannica) &lt;a href="https://www.harysdalvi.com/blog/2202/#src1"&gt;^&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.sfu.ca/~swartz/euclid.htm" target="_blank"&gt;Axioms and Postulates of Euclid&lt;/a&gt;
  (&lt;cite&gt;The Elements of Euclid&lt;/cite&gt; translated by Sir Thomas Heath) &lt;a href="https://www.harysdalvi.com/blog/2202/#src2"&gt;^&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://mathstat.slu.edu/escher/index.php/The_Three_Geometries" target="_blank"&gt;The Three Geometries&lt;/a&gt;
  (EscherMath) &lt;a href="https://www.harysdalvi.com/blog/2202/#src3"&gt;^&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.youtube.com/watch?v=1SrHzSGn-I8" target="_blank"&gt;The Relation of Mathematics to Physics&lt;/a&gt;
  (Richard Feynman, YouTube) &lt;a href="https://www.harysdalvi.com/blog/2202/#src4"&gt;^&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;</content></entry><entry><title>Mathematical History</title><summary>Why we need more historians doing math</summary><link href="https://www.harysdalvi.com/blog/mathematical-history/" rel="alternate"/><id>https://www.harysdalvi.com/blog/mathematical-history/</id><published>2021-12-01T16:38:07Z</published><updated>2021-12-01T16:38:07Z</updated><content type="html" xml:lang="en">&lt;p&gt;
I found the following equation in a
&lt;a href="https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.164.1314&amp;amp;rep=rep1&amp;amp;type=pdf" target="_blank"&gt;paper&lt;/a&gt;:&lt;/p&gt;
&lt;div class="longmath"&gt;&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-1-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;C&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;munderover&gt;&lt;mo&gt;∑&lt;/mo&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/mrow&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;N&lt;/mi&gt;&lt;/mrow&gt;&lt;/munderover&gt;&lt;mi&gt;P&lt;/mi&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;munderover&gt;&lt;mo&gt;∑&lt;/mo&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;j&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/mrow&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;N&lt;/mi&gt;&lt;/mrow&gt;&lt;/munderover&gt;&lt;mi&gt;P&lt;/mi&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mi&gt;j&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo stretchy="false"&gt;|&lt;/mo&gt;&lt;/mrow&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mi&gt;log&lt;/mi&gt;&lt;mo&gt;⁡&lt;/mo&gt;&lt;mi&gt;P&lt;/mi&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mi&gt;j&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo stretchy="false"&gt;|&lt;/mo&gt;&lt;/mrow&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;
&lt;p&gt;What was this paper about? Conditional probability? Entropy and statistical thermodynamics?
Some other field of math or physics?
&lt;/p&gt;
&lt;p&gt;In fact, the paper was about deciphering the Indus script from thousands of years ago.
But it did borrow ideas from both conditional probability and statistical thermodynamics.
Such methods are actually quite common now in deciphering ancient scripts:
&lt;a href="https://royalsocietypublishing.org/doi/10.1098/rspa.2010.0041" target="_blank"&gt;this paper&lt;/a&gt;
uses the same idea for symbols used in Iron Age Scotland.
&lt;/p&gt;
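&lt;p&gt;To get a feel for the equation, here is a minimal sketch that estimates C from a sequence of symbols by counting adjacent pairs. The function name, the base-2 logarithm, and the toy strings are my assumptions, not details taken from either paper.&lt;/p&gt;

```python
from collections import Counter
from math import log2

def conditional_entropy(symbols):
    """Estimate C = -sum_i P(i) sum_j P(j|i) log2 P(j|i) from adjacent pairs."""
    pairs = list(zip(symbols, symbols[1:]))
    pair_counts = Counter(pairs)
    first_counts = Counter(i for i, _ in pairs)
    total = len(pairs)
    entropy = 0.0
    for (i, j), n_ij in pair_counts.items():
        p_i = first_counts[i] / total          # P(i)
        p_j_given_i = n_ij / first_counts[i]   # P(j|i)
        entropy -= p_i * p_j_given_i * log2(p_j_given_i)
    return entropy

# Toy sequences, not real inscriptions.
rigid = conditional_entropy("ABABABABAB")    # next symbol is fully determined
mixed = conditional_entropy("AABBAABBAABB")  # next symbol is partly unpredictable
print(rigid, mixed)
```

&lt;p&gt;A perfectly rigid sequence gives zero, while a sequence whose next symbol is harder to guess gives a larger value.&lt;/p&gt;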
&lt;figure class="endfloat"&gt;
&lt;img alt="Two tablets containing drawings of animals and symbols" src="https://www.harysdalvi.com/blog/mathematical-history/indus_script.jpg"/&gt;
&lt;figcaption&gt;The Indus script&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;When you think about applied mathematics, history is probably not the first field
that comes to mind. You might think of physics, computer science, engineering, or chemistry.&lt;/p&gt;
&lt;p&gt;But these days,
it feels like math is extending into every field. You can't learn economics without
supply and demand curves. Finance and psychology are full of data and statistical analysis.
There's an entire subfield of mathematical biology as our understanding of biological
systems improves. So why not mathematical history?
&lt;/p&gt;
&lt;p&gt;History is interesting in itself, but it also has a lot of questions that are
useful and practical. What can Easter Island teach us about the effect of population
size and resource use on a society? What does the Industrial Revolution mean for
how new technology can reduce poverty? How does the Islamic Golden Age relate to
global scientific collaboration and exchange of ideas in the Information Age?
&lt;/p&gt;
&lt;p&gt;These are just a few important questions from history that math, especially statistics, can help us to
answer more thoroughly.
&lt;/p&gt;
&lt;h2&gt;Small Example Question&lt;/h2&gt;
&lt;p&gt;This is the kind of question I imagine you might find for homework in a mathematical history class.&lt;/p&gt;
&lt;h3&gt;Collapse of States&lt;/h3&gt;
&lt;p&gt;You are given the durations of various empires in history [&lt;a href="https://www.harysdalvi.com/blog/mathematical-history/#ref" id="src1"&gt;1&lt;/a&gt;].&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Use software to construct a probability distribution for these durations. What type of distribution is it?&lt;/li&gt;
&lt;li&gt;Based on this data alone, given that the United States has lasted for about 250 years since 1776,
in which year is the United States expected to collapse?&lt;/li&gt;
&lt;li&gt;Give one historical or statistical reason why this probability distribution may give a biased estimate for the collapse of
the United States.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;I took the data into Python and plotted it using matplotlib. Here's the histogram I got.
&lt;/p&gt;
&lt;figure class="middleimg"&gt;
&lt;img alt="Graph: empires are more likely to end quickly, and then decreasingly likely to last longer, especially past 500 years." src="https://www.harysdalvi.com/blog/mathematical-history/hist.png"/&gt;
&lt;figcaption&gt;Histogram for number of years an empire will last&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;To me, this looks like an &lt;strong&gt;exponential distribution&lt;/strong&gt;. We can interpret this as
  saying that in any given year, an empire's probability of collapsing is the same no matter how long it has already lasted, similar to radioactive decay.
  This is a meaningful historical interpretation that comes straight out of a statistical model.
&lt;/p&gt;
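&lt;p&gt;The radioactive decay analogy can be written down in one line. Assume the duration T of an empire follows an exponential distribution with rate k, where T, s and t are symbols I'm introducing for this sketch. Then the chance of surviving another t years does not depend on the s years already survived:&lt;/p&gt;

```latex
P(T > s + t \mid T > s)
  = \frac{e^{-k(s+t)}}{e^{-ks}}
  = e^{-kt}
  = P(T > t)
```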
&lt;p&gt;Now, in principle, we can evaluate the following integral to determine the year &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-2-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; in which the United States
  is expected to collapse.
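  &lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-3-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;Y&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;1776&lt;/mn&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mfrac&gt;&lt;mrow&gt;&lt;msubsup&gt;&lt;mo&gt;∫&lt;/mo&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;250&lt;/mn&gt;&lt;/mrow&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="normal"&gt;∞&lt;/mi&gt;&lt;/mrow&gt;&lt;/msubsup&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;msup&gt;&lt;mi&gt;e&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;/mrow&gt;&lt;/msup&gt;&lt;mspace width="thinmathspace"&gt;&lt;/mspace&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow&gt;&lt;msubsup&gt;&lt;mo&gt;∫&lt;/mo&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;250&lt;/mn&gt;&lt;/mrow&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="normal"&gt;∞&lt;/mi&gt;&lt;/mrow&gt;&lt;/msubsup&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;msup&gt;&lt;mi&gt;e&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;/mrow&gt;&lt;/msup&gt;&lt;mspace width="thinmathspace"&gt;&lt;/mspace&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;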
  Take a moment to think about how strange that is: an integral telling you when the United States will collapse!
  However, since the data is discrete, it's easier to just analyze the data directly than to try to extract
  a value of &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-4-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;. We can simply look at all the empires that lasted at least 250 years and find the average
  duration of those. This gives 550 years. So the year in which the United States is expected to collapse
  based on our model is 1776+550=&lt;strong&gt;2326&lt;/strong&gt;.
&lt;/p&gt;
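&lt;p&gt;The direct approach is only a few lines of Python. This is a minimal sketch: the function name and the toy durations are hypothetical and are not the data from the list of empires, so the number it prints is not the result above.&lt;/p&gt;

```python
from statistics import mean

def expected_collapse_year(durations, founded, age):
    """Average the durations of states that lasted at least `age` years,
    then add that average to the founding year."""
    survivors = [d for d in durations if d >= age]
    if not survivors:
        raise ValueError("no state in the data lasted that long")
    return founded + mean(survivors)

# Hypothetical durations in years, NOT the Wikipedia data used in this post.
toy_durations = [40, 90, 120, 260, 300, 410, 700]
year = expected_collapse_year(toy_durations, founded=1776, age=250)
print(year)  # 2193.5
```

&lt;p&gt;If the durations really were exponential with rate k, the expected remaining lifetime would be 1/k at any age, so the same estimate could also be written as 1776 + 250 + 1/k.&lt;/p&gt;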
&lt;p&gt;Of course, this isn't a perfect model. One issue I can think of is that the list generally includes empires and
dynasties, which might last for less or more time than relatively democratic countries such as the United States. Another is that
many of these empires were from long ago, and there might be characteristics of the modern era that make
states last for more or less time.
&lt;/p&gt;
&lt;h2&gt;Big Example Questions&lt;/h2&gt;
&lt;p&gt;I think there's a lot of potential for this idea of mathematical history. Here are some cases
where that might be useful.
&lt;/p&gt;
&lt;h3&gt;Industrial Revolution&lt;/h3&gt;
&lt;p&gt;The Industrial Revolution is a great example of the impact of technology on society. It had
many positive impacts: poverty was greatly reduced, and new technology increased quality of life.
On the other hand, it led to a global divide between industrialized and non-industrialized
countries, and increased pollution, which was bad for health and the environment.
&lt;/p&gt;
&lt;p&gt;The Industrial Revolution is neither the first nor the last instance of technology
drastically changing society. A lot of the questions about the positive
and negative effects can benefit from a mathematical approach: we might want to know
if there was a statistically significant change in political systems or political stability
in industrialized nations. Math can help quantify how even or uneven the reduction of
poverty was both in individual societies and around the world.
&lt;/p&gt;
&lt;p&gt;We might also want to look at the Industrial Revolution in the context of other big changes in history,
like the development of agriculture,
and try to quantify these changes. We can also look at how frequent these changes are
over time.
&lt;/p&gt;
&lt;h3&gt;Imperialism and Global Influence&lt;/h3&gt;
&lt;p&gt;The best known and largest example of imperialism is European imperialism in the early modern
period. However, there are a lot of examples of imperialism before that, like the Mongol Empire.
More recently, the soft power wielded by today's world powers arguably has some similarities.
&lt;/p&gt;
&lt;p&gt;It might be interesting to quantify the effects of global influence on both conquering and conquered
nations. These effects would be functions of many variables (sounds like multivariable calculus)
like the amount of direct or local rule, geographical distance between countries,
or time period. Looking at these functions could tell us something about soft power today
and the conflict between the US and China.
&lt;/p&gt;
&lt;h2&gt;Future of Mathematical History&lt;/h2&gt;
&lt;p&gt;I don't think much thought has been given to the idea of mathematical history yet, so it'll take
some time to develop a proper theory and apply it to important questions. But Isaac Asimov's idea of
psychohistory from &lt;cite&gt;Foundation&lt;/cite&gt; is similar. In the real world, some cool sources like
&lt;a href="https://ourworldindata.org" target="_blank"&gt;Our World in Data&lt;/a&gt; have articles
(like &lt;a href="https://ourworldindata.org/democratic-rights" target="_blank"&gt;this one&lt;/a&gt;) that
I would say fall under mathematical history. I'm curious to see how this idea develops in the future,
and I think a lot of important results can come out of it.
&lt;/p&gt;
&lt;hr/&gt;
&lt;h2&gt;Related Posts&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.harysdalvi.com/blog/election-prediction"&gt;Predicting Every Election since 1916&lt;/a&gt; — How “election pundit predictions” betray a misunderstanding of probability&lt;/li&gt;
&lt;h2 id="ref"&gt;References&lt;/h2&gt;
&lt;ol id="reflist"&gt;
&lt;li&gt;&lt;a href="https://en.wikipedia.org/wiki/List_of_empires" target="_blank"&gt;List of empires&lt;/a&gt; (Wikipedia) &lt;a href="https://www.harysdalvi.com/blog/mathematical-history/#src1"&gt;^&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;ol class="reflist" type="a"&gt;
&lt;li&gt;&lt;a href="https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.164.1314&amp;amp;rep=rep1&amp;amp;type=pdf" target="_blank"&gt;Entropic
Evidence for Linguistic Structure in the Indus Script&lt;/a&gt; (RPN Rao et al., 2009)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://royalsocietypublishing.org/doi/10.1098/rspa.2010.0041" target="_blank"&gt;Pictish
    symbols revealed as a written language through application of Shannon entropy&lt;/a&gt; (Lee, Jonathan &amp;amp; Ziman, 2010)&lt;/li&gt;
&lt;/ol&gt;
&lt;/ul&gt;</content></entry><entry><title>What is Schrödinger's Equation?</title><summary>An informal derivation of the scary-looking Schrödinger equation using as little quantum physics as possible</summary><link href="https://www.harysdalvi.com/blog/2111/" rel="alternate"/><id>https://www.harysdalvi.com/blog/2111/</id><published>2021-11-01T16:38:07Z</published><updated>2021-11-01T16:38:07Z</updated><content type="html" xml:lang="en">&lt;p&gt;In classical mechanics, arguably the most important equation is Newton's famous
&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-1-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;F&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;mi&gt;a&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
It's so simple, it almost feels silly putting it on a separate line in the middle of
the screen. But it deserves the spot, because it's so central. With this equation,
you can take information about a classical mechanics system and figure out how it
will change for all time.
&lt;/p&gt;
&lt;p&gt;In quantum mechanics, there is a similar equation used to predict the behavior
    of quantum systems. It is Schrödinger's famous
&lt;/p&gt;&lt;div class="longmath"&gt;&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-2-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;mi class="MJX-variant"&gt;ℏ&lt;/mi&gt;&lt;mfrac&gt;&lt;mi mathvariant="normal"&gt;∂&lt;/mi&gt;&lt;mrow&gt;&lt;mi mathvariant="normal"&gt;∂&lt;/mi&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="bold"&gt;r&lt;/mi&gt;&lt;/mrow&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mfrac&gt;&lt;msup&gt;&lt;mi class="MJX-variant"&gt;ℏ&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mrow&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;msup&gt;&lt;mi mathvariant="normal"&gt;∇&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="bold"&gt;r&lt;/mi&gt;&lt;/mrow&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mi&gt;V&lt;/mi&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="bold"&gt;r&lt;/mi&gt;&lt;/mrow&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="bold"&gt;r&lt;/mi&gt;&lt;/mrow&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;mo 
stretchy="false"&gt;)&lt;/mo&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;
&lt;figure class="endfloat"&gt;
&lt;img alt="The Solvay conference in 1927, with many famous physicists including Albert Einstein, Marie Curie, Max Planck, and Niels Bohr." height="400" src="https://www.harysdalvi.com/blog/2111/Solvay_conference_1927.jpg" width="600"/&gt;
&lt;figcaption&gt;Schrödinger is in the back row, sixth from the left (or right), in this legendary picture&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;
Putting this on a separate line doesn't feel silly at all.
Surprisingly, however, this equation isn't too far off from the principles of classical mechanics
and &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-3-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;F&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;mi&gt;a&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;. Today I am going to show you how to understand this equation a little better
using classical physics and lots of math. You will need classical mechanics and calculus,
but no quantum background. There are just a few facts about quantum
physics you need to accept first:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The energy of a quantum entity is related to its frequency by
    the equation &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-4-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;E&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mi&gt;h&lt;/mi&gt;&lt;mi&gt;f&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, where &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-5-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;E&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is energy, &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-6-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;f&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is frequency,
    and &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-7-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;h&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is a constant called Planck's constant.
&lt;/li&gt;
&lt;li&gt;The wavelength &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-8-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;λ&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; of a quantum entity depends on its momentum
    &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-9-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; according to the de Broglie equation, &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-10-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;λ&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mi&gt;h&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;/&lt;/mo&gt;&lt;/mrow&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;.
    You can actually derive this quite easily from &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-11-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;E&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mi&gt;h&lt;/mi&gt;&lt;mi&gt;f&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; and &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-12-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;E&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;msup&gt;&lt;mi&gt;c&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;.
&lt;/li&gt;
&lt;li&gt;With quantum mechanics, we don't look at the exact positions or velocities
    of particles. Instead, we use something called a
    &lt;strong&gt;wavefunction&lt;/strong&gt;. With wavefunctions and some calculus,
    we can find the &lt;em&gt;probability distributions&lt;/em&gt; of
    position, momentum, and more, but not certain values.
&lt;/li&gt;
&lt;/ul&gt;
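&lt;p&gt;To put rough numbers on the first two facts, here is a small sketch. The function names, the electron speed, and the light frequency are hypothetical choices for illustration, and I'm assuming the non-relativistic momentum p = mv.&lt;/p&gt;

```python
PLANCK_H = 6.62607015e-34         # Planck's constant in J*s (exact SI value)
ELECTRON_MASS = 9.1093837015e-31  # electron mass in kg (CODATA 2018)

def energy_from_frequency(frequency_hz):
    """E = h f, in joules."""
    return PLANCK_H * frequency_hz

def de_broglie_wavelength(momentum):
    """lambda = h / p, in metres, for momentum in kg*m/s."""
    return PLANCK_H / momentum

# Hypothetical example: an electron moving at 1e6 m/s, slow enough
# that p = m v is a fair approximation.
speed = 1.0e6
wavelength = de_broglie_wavelength(ELECTRON_MASS * speed)
energy = energy_from_frequency(5.0e14)  # roughly the frequency of visible light
print(f"wavelength = {wavelength:.3e} m, energy = {energy:.3e} J")
```

&lt;p&gt;The wavelength comes out to roughly 0.7 nanometres, which is why wave behavior matters at atomic scales but not for everyday objects with far larger momentum.&lt;/p&gt;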
&lt;p&gt;It's important to note that this does not represent a true derivation:
it would be easier to just accept Schrödinger's equation directly
rather than accept the above facts and go through all this math.
However, hopefully this will give some intuition into Schrödinger's equation and the math behind it.
&lt;/p&gt;
&lt;h2&gt;Classical Basis&lt;/h2&gt;
&lt;p&gt;You might say “Schrödinger's equation looks &lt;em&gt;nothing&lt;/em&gt; like Newton's equation,
    how can they be analogous?” This is a fair point. In fact, Schrödinger's equation
    is a little more analogous to conservation of energy.
    (More precisely, it's based on a Hamiltonian, not a true conservation of energy equation.)
    The equation for classical conservation of energy, where we'll start, is
    &lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-13-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;E&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mi&gt;K&lt;/mi&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mi&gt;V&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
    Where &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-14-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;K&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is kinetic energy, &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-15-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;V&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is potential energy, and &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-16-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;E&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is total energy.
    Doesn't this already look a little like Schrödinger's equation? We have one term on the
    left, and it's the sum of two terms on the right.
&lt;/p&gt;
&lt;p&gt;From here, &lt;strong&gt;we're going to modify this equation step by step until we end up with
    Schrödinger's equation&lt;/strong&gt;. First of all, we could write &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-17-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;K&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; in terms of
    mass &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-18-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; and velocity &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-19-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;v&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;.
    &lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-20-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;E&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mfrac&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/mfrac&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;msup&gt;&lt;mi&gt;v&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mi&gt;V&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
    It turns out that in quantum mechanics, the momentum &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-21-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;mi&gt;v&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; will be more helpful
    to us than the velocity. Luckily, we can write &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-22-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;K&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; in terms of mass and
    momentum as well.&lt;/p&gt;
&lt;div class="longmath"&gt;&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-23-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;K&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mfrac&gt;&lt;mrow&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;msup&gt;&lt;mi&gt;v&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/mrow&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/mfrac&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mfrac&gt;&lt;mrow&gt;&lt;msup&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;msup&gt;&lt;mi&gt;v&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/mrow&gt;&lt;mrow&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mfrac&gt;&lt;mrow&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;mi&gt;v&lt;/mi&gt;&lt;msup&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/mrow&gt;&lt;mrow&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mfrac&gt;&lt;msup&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mrow&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;
&lt;p&gt;&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-24-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;E&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mfrac&gt;&lt;msup&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mrow&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mi&gt;V&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
    Now we are going to transition from classical mechanics to quantum mechanics.
    To do this, we are going to multiply by the quantum wavefunction &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-25-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;
    on both sides, just to get it into our equation.
    &lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-26-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;E&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mfrac&gt;&lt;msup&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mrow&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mi&gt;V&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
    But at this point, our equation is a bad mix of classical and quantum mechanics
    that doesn't really make sense. We said that we
    don't deal with exact values of momentum in quantum mechanics, only probability distributions
    with our wavefunction &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-27-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;. But here we have &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-28-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; and &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-29-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;
    in the same equation, as if we knew exactly what the momentum &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-30-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;
    was.
&lt;/p&gt;
&lt;p&gt;We might not know what &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-31-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is, but it turns
    out we can change &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-32-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mfrac&gt;&lt;msup&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mrow&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; to something in terms of &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-33-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; and things we
    &lt;em&gt;do&lt;/em&gt; know, so we can deal with probabilities like we're supposed to.
&lt;/p&gt;
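&lt;p&gt;Before moving on, here is a small numerical sanity check of the classical rewrite used above, where kinetic energy is expressed through momentum instead of velocity. The mass, velocity, and potential energy are made-up illustrative values, not anything from a real system.&lt;/p&gt;

```python
# Illustrative values only: a 2 kg mass moving at 3 m/s with 5 J of potential energy.
m = 2.0  # mass in kg (assumed)
v = 3.0  # velocity in m/s (assumed)
V = 5.0  # potential energy in J (assumed)

p = m * v                    # momentum p = mv
K_from_v = 0.5 * m * v**2    # K = (1/2) m v^2
K_from_p = p**2 / (2 * m)    # K = p^2 / (2m)
E = K_from_p + V             # E = p^2 / (2m) + V

print(K_from_v, K_from_p, E)  # prints 9.0 9.0 14.0
```

&lt;p&gt;Both expressions for the kinetic energy agree, so swapping velocity for momentum changes nothing classically. It only prepares the equation for the quantum step.&lt;/p&gt;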
&lt;h2&gt;Operators&lt;/h2&gt;
&lt;p&gt;We don't really know what &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-34-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is, since we are keeping it
general, but we can still write down a plausible form for it. How about this:
&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-35-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;menclose notation="box"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mstyle displaystyle="true" scriptlevel="0"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mi&gt;A&lt;/mi&gt;&lt;msup&gt;&lt;mi&gt;e&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/msub&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/mrow&gt;&lt;/msup&gt;&lt;msup&gt;&lt;mi&gt;e&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;/msub&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;/mrow&gt;&lt;/msup&gt;&lt;msup&gt;&lt;mi&gt;e&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;mi&gt;z&lt;/mi&gt;&lt;/msub&gt;&lt;mi&gt;z&lt;/mi&gt;&lt;/mrow&gt;&lt;/msup&gt;&lt;msup&gt;&lt;mi&gt;e&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;mi&gt;ω&lt;/mi&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;/mrow&gt;&lt;/msup&gt;&lt;/mrow&gt;&lt;/mstyle&gt;&lt;/mrow&gt;&lt;/menclose&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/p&gt;&lt;details&gt;
&lt;summary&gt;Why is only time negative?&lt;/summary&gt;
&lt;p&gt;This is a tough question, and I couldn't find a satisfactory answer online,
but here's one way I found to think about it.
From multivariable calculus (implicit differentiation along a path where the wavefunction stays constant), we have
&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-36-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mfrac&gt;&lt;mrow&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mfrac&gt;&lt;mrow&gt;&lt;mi mathvariant="normal"&gt;∂&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;/&lt;/mo&gt;&lt;/mrow&gt;&lt;mi mathvariant="normal"&gt;∂&lt;/mi&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow&gt;&lt;mi mathvariant="normal"&gt;∂&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;/&lt;/mo&gt;&lt;/mrow&gt;&lt;mi mathvariant="normal"&gt;∂&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
If we had &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-37-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msup&gt;&lt;mi&gt;e&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;mi&gt;ω&lt;/mi&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;/mrow&gt;&lt;/msup&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; instead of &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-38-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msup&gt;&lt;mi&gt;e&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;mi&gt;ω&lt;/mi&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;/mrow&gt;&lt;/msup&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, this would become
&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-39-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mfrac&gt;&lt;mrow&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mfrac&gt;&lt;mrow&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;mi&gt;ω&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/msub&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mfrac&gt;&lt;mi&gt;ω&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/msub&gt;&lt;/mfrac&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
Later, we'll find that &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-40-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;ω&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mi&gt;π&lt;/mi&gt;&lt;mi&gt;f&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; and &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-41-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/msub&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mi&gt;π&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/msub&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;/&lt;/mo&gt;&lt;/mrow&gt;&lt;mi&gt;h&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;. This gives
&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-42-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mfrac&gt;&lt;mrow&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mfrac&gt;&lt;msub&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/msub&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;/mfrac&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mfrac&gt;&lt;mrow&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mi&gt;π&lt;/mi&gt;&lt;mi&gt;f&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mi&gt;π&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/msub&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;/&lt;/mo&gt;&lt;/mrow&gt;&lt;mi&gt;h&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mfrac&gt;&lt;mrow&gt;&lt;mi&gt;h&lt;/mi&gt;&lt;mi&gt;f&lt;/mi&gt;&lt;/mrow&gt;&lt;msub&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/msub&gt;&lt;/mfrac&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-43-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mfrac&gt;&lt;msubsup&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msubsup&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;/mfrac&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mi&gt;h&lt;/mi&gt;&lt;mi&gt;f&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
But this would mean that either the frequency is negative or the momentum is
imaginary, and neither makes physical sense. Therefore, space and time must
have opposite signs in the exponent.&lt;/p&gt;
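&lt;p&gt;Another way to see the same sign issue is to check numerically which way a point of constant phase moves. The sketch below uses a one-dimensional wave with amplitude 1; the wavenumber, angular frequency, and sample points are arbitrary assumed values, and the helper name &lt;em&gt;wave&lt;/em&gt; is only for illustration.&lt;/p&gt;

```python
import cmath

k = 2.0      # wavenumber (assumed illustrative value)
omega = 3.0  # angular frequency (assumed illustrative value)

def wave(x, t, time_sign):
    # One-dimensional plane wave exp(i(k*x + time_sign*omega*t)) with A = 1.
    return cmath.exp(1j * (k * x + time_sign * omega * t))

x0, t0, dt = 0.4, 0.1, 0.05
step = (omega / k) * dt  # distance a point of constant phase travels in time dt

# Minus sign on time: the value is unchanged if we move forward by +step.
same_forward = cmath.isclose(wave(x0 + step, t0 + dt, -1), wave(x0, t0, -1))
# Plus sign on time: the value is unchanged only if we move backward by -step.
same_backward = cmath.isclose(wave(x0 - step, t0 + dt, +1), wave(x0, t0, +1))

print(same_forward, same_backward)
```

&lt;p&gt;With the minus sign on time, the wave travels in the direction of positive wavenumber. With a plus sign it would travel the opposite way, which matches the negative velocity found above.&lt;/p&gt;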
&lt;/details&gt;
&lt;p&gt;
This equation doesn't really tell us much
about &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-44-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;. We have no idea what the values of these variables are.
That's actually a &lt;em&gt;good&lt;/em&gt; thing, because we don't
have information about &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-45-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, so we don't want to
pretend like we do and make stuff up.
&lt;/p&gt;
&lt;p&gt;All we are saying is that &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-46-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is some number &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-47-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;A&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;
multiplied by several factors that each look like &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-48-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msup&gt;&lt;mi&gt;e&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;mi&gt;θ&lt;/mi&gt;&lt;/mrow&gt;&lt;/msup&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;.
Each &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-49-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msup&gt;&lt;mi&gt;e&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;mi&gt;θ&lt;/mi&gt;&lt;/mrow&gt;&lt;/msup&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; factor shows that the wavefunction
depends on something in some way: &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-50-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-51-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, and
&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-52-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;z&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; for position in three dimensions, and &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-53-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; for time.
In other words, our equation translated to English is
just saying “the wavefunction depends on space and time
in some way.”
&lt;/p&gt;
&lt;p&gt;There is one more thing the equation is saying.
&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-54-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msup&gt;&lt;mi&gt;e&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;mi&gt;θ&lt;/mi&gt;&lt;/mrow&gt;&lt;/msup&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is an oscillating function, so our &lt;em&gt;wave&lt;/em&gt;function
will oscillate like a &lt;em&gt;wave&lt;/em&gt;.
&lt;/p&gt;
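&lt;p&gt;To make the boxed form of Ψ concrete, here is a minimal sketch that evaluates it at a fixed point in space for a few times. The function name &lt;em&gt;plane_wave&lt;/em&gt; and every default value in it are assumptions chosen purely for illustration.&lt;/p&gt;

```python
import cmath

def plane_wave(x, y, z, t, A=1.0, kx=1.0, ky=0.5, kz=0.25, omega=2.0):
    # Psi = A e^{i kx x} e^{i ky y} e^{i kz z} e^{-i omega t}; all defaults are made-up values.
    phase = kx * x + ky * y + kz * z - omega * t
    return A * cmath.exp(1j * phase)

# Evaluate the wave at one fixed point in space for several times.
samples = [plane_wave(0.3, 0.1, 0.2, t) for t in (0.0, 0.5, 1.0, 1.5)]
for psi in samples:
    # The real and imaginary parts oscillate, while the magnitude stays equal to A.
    print(round(psi.real, 3), round(psi.imag, 3), round(abs(psi), 3))
```

&lt;p&gt;The real and imaginary parts swing back and forth like a wave as time changes, while the overall size of Ψ stays fixed at the amplitude.&lt;/p&gt;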
&lt;p&gt;How do we know it's a wave of this form?
Well, actually we don't. But
it turns out that if you have some solutions to the Schrödinger equation,
their sum (more precisely, any linear combination of them)
is also a solution. Also, Fourier analysis tells us that
essentially any well-behaved function can be written
as a sum of sine and cosine functions (waves). Putting these
two ideas together, if we can derive the Schrödinger equation for a general wave,
we can add waves together to build whatever other wavefunction we want.
That sum will also be a solution, since it is a sum of individual solutions.
&lt;/p&gt;
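&lt;p&gt;The idea of building a function out of waves can be sketched with a standard textbook example: approximating a square wave with a sum of sines. This uses a Fourier series, the version of Fourier analysis for periodic functions, and the function name and the number of terms are arbitrary choices for illustration.&lt;/p&gt;

```python
import math

def square_wave_partial_sum(x, n_terms):
    # Sum of the first n_terms odd sine harmonics: (4/pi) * sum of sin((2j+1)x) / (2j+1).
    total = 0.0
    for j in range(n_terms):
        n = 2 * j + 1
        total += math.sin(n * x) / n
    return 4.0 / math.pi * total

x = math.pi / 2  # a point where the target square wave equals +1
for n_terms in (1, 5, 50):
    # More sine waves bring the sum closer to the target value of 1.
    print(n_terms, round(square_wave_partial_sum(x, n_terms), 4))
```

&lt;p&gt;A single sine overshoots, but as more waves are added the sum settles toward 1, the value of the square wave at that point. The same principle lets plane waves be combined into more complicated wavefunctions.&lt;/p&gt;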
&lt;h3&gt;Kinetic Energy&lt;/h3&gt;
&lt;p&gt;So we have a wavefunction, and it's a wave. We might be interested in the
wavelength (in space) and frequency (in time) of the wave. If we have &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-55-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msup&gt;&lt;mi&gt;e&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mi&gt;a&lt;/mi&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mi&gt;θ&lt;/mi&gt;&lt;/mrow&gt;&lt;/msup&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, the
“wavelength” would be &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-56-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mi&gt;π&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;/&lt;/mo&gt;&lt;/mrow&gt;&lt;mi&gt;a&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, since we make a full circle back to &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-57-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msup&gt;&lt;mi&gt;e&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;0&lt;/mn&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;/mrow&gt;&lt;/msup&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;msup&gt;&lt;mi&gt;e&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mi&gt;π&lt;/mi&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;/mrow&gt;&lt;/msup&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;
once &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-58-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;θ&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; reaches &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-59-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mi&gt;π&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;/&lt;/mo&gt;&lt;/mrow&gt;&lt;mi&gt;a&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;. That means for &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-60-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msup&gt;&lt;mi&gt;e&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/msub&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/mrow&gt;&lt;/msup&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;,
our wavelength &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-61-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;λ&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-62-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mi&gt;π&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;/&lt;/mo&gt;&lt;/mrow&gt;&lt;msub&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;. The same idea applies to &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-63-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; and &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-64-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;mi&gt;z&lt;/mi&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;,
for the wavelengths in the &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-65-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; and &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-66-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;z&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; directions.
&lt;/p&gt;
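&lt;p&gt;As a quick check of the wavelength claim, the sketch below confirms that the spatial factor of the wave repeats after a shift of 2π divided by the wavenumber, and not after half that distance. The wavenumber and the sample position are assumed values.&lt;/p&gt;

```python
import cmath
import math

k_x = 3.0                       # assumed wavenumber, in radians per unit length
wavelength = 2 * math.pi / k_x  # lambda_x = 2*pi / k_x

def spatial_factor(x):
    # The x-dependent factor of the wavefunction, e^{i k_x x}.
    return cmath.exp(1j * k_x * x)

x = 0.7
# Shifting by one full wavelength returns the same complex value.
repeats = cmath.isclose(spatial_factor(x), spatial_factor(x + wavelength))
# Shifting by half a wavelength flips the sign instead.
half_way = cmath.isclose(spatial_factor(x), spatial_factor(x + wavelength / 2))

print(repeats, half_way)
```

&lt;p&gt;A larger wavenumber gives a shorter wavelength, which is why the wavenumber in each direction carries the same information as the wavelength in that direction.&lt;/p&gt;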
&lt;p&gt;But wait, remember the de Broglie equation?
&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-67-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;λ&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mfrac&gt;&lt;mi&gt;h&lt;/mi&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;/mfrac&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
This means that if we have the wavelength for each direction,
we can easily find the momentum in that direction.
&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-68-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mfrac&gt;&lt;mi&gt;h&lt;/mi&gt;&lt;mi&gt;λ&lt;/mi&gt;&lt;/mfrac&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-69-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/msub&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mfrac&gt;&lt;mrow&gt;&lt;mi&gt;h&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/msub&gt;&lt;/mrow&gt;&lt;mrow&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mi&gt;π&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;mtext&gt; &lt;/mtext&gt;&lt;msub&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;/msub&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mfrac&gt;&lt;mrow&gt;&lt;mi&gt;h&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;/msub&gt;&lt;/mrow&gt;&lt;mrow&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mi&gt;π&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;mtext&gt; &lt;/mtext&gt;&lt;msub&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mi&gt;z&lt;/mi&gt;&lt;/msub&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mfrac&gt;&lt;mrow&gt;&lt;mi&gt;h&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;mi&gt;z&lt;/mi&gt;&lt;/msub&gt;&lt;/mrow&gt;&lt;mrow&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mi&gt;π&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
Let's define a new constant, &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-70-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi class="MJX-variant"&gt;ℏ&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mi&gt;h&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;/&lt;/mo&gt;&lt;/mrow&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mi&gt;π&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, just to clean things up a little.
&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-71-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;menclose notation="box"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mstyle displaystyle="true" scriptlevel="0"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;msub&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/msub&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mi class="MJX-variant"&gt;ℏ&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/msub&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;mtext&gt; &lt;/mtext&gt;&lt;msub&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;/msub&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mi class="MJX-variant"&gt;ℏ&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;/msub&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;mtext&gt; &lt;/mtext&gt;&lt;msub&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mi&gt;z&lt;/mi&gt;&lt;/msub&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mi class="MJX-variant"&gt;ℏ&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;mi&gt;z&lt;/mi&gt;&lt;/msub&gt;&lt;/mrow&gt;&lt;/mstyle&gt;&lt;/mrow&gt;&lt;/menclose&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
There's one problem: we have no idea what all these &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-72-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; values are.
I made them up when we wrote a general equation for &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-73-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;.
But something interesting happens if we take the second derivative
of &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-74-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;. We find
&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-75-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;menclose notation="box"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mstyle displaystyle="true" scriptlevel="0"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;msubsup&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msubsup&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;msup&gt;&lt;mi class="MJX-variant"&gt;ℏ&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mfrac&gt;&lt;mrow&gt;&lt;msup&gt;&lt;mi mathvariant="normal"&gt;∂&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow&gt;&lt;mi mathvariant="normal"&gt;∂&lt;/mi&gt;&lt;msup&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;/mrow&gt;&lt;/mstyle&gt;&lt;/mrow&gt;&lt;/menclose&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/p&gt;&lt;details&gt;
&lt;summary&gt;Proof&lt;/summary&gt;
&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-76-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mi&gt;A&lt;/mi&gt;&lt;msup&gt;&lt;mi&gt;e&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/msub&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/mrow&gt;&lt;/msup&gt;&lt;msup&gt;&lt;mi&gt;e&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;/msub&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;/mrow&gt;&lt;/msup&gt;&lt;msup&gt;&lt;mi&gt;e&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;mi&gt;z&lt;/mi&gt;&lt;/msub&gt;&lt;mi&gt;z&lt;/mi&gt;&lt;/mrow&gt;&lt;/msup&gt;&lt;msup&gt;&lt;mi&gt;e&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;mi&gt;ω&lt;/mi&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;/mrow&gt;&lt;/msup&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;div class="longmath"&gt;&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-77-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mfrac&gt;&lt;mrow&gt;&lt;msup&gt;&lt;mi mathvariant="normal"&gt;∂&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow&gt;&lt;mi mathvariant="normal"&gt;∂&lt;/mi&gt;&lt;msup&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/msub&gt;&lt;msup&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mi&gt;A&lt;/mi&gt;&lt;msup&gt;&lt;mi&gt;e&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/msub&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/mrow&gt;&lt;/msup&gt;&lt;msup&gt;&lt;mi&gt;e&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;/msub&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;/mrow&gt;&lt;/msup&gt;&lt;msup&gt;&lt;mi&gt;e&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;mi&gt;z&lt;/mi&gt;&lt;/msub&gt;&lt;mi&gt;z&lt;/mi&gt;&lt;/mrow&gt;&lt;/msup&gt;&lt;msup&gt;&lt;mi&gt;e&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;mi&gt;ω&lt;/mi&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;/mrow&gt;&lt;/msup&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/msub&gt;&lt;msup&gt;&lt;mo 
stretchy="false"&gt;)&lt;/mo&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;msubsup&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msubsup&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;
&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-78-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mfrac&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;msubsup&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msubsup&gt;&lt;/mfrac&gt;&lt;mfrac&gt;&lt;mrow&gt;&lt;msup&gt;&lt;mi mathvariant="normal"&gt;∂&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow&gt;&lt;mi mathvariant="normal"&gt;∂&lt;/mi&gt;&lt;msup&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-79-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/msub&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mi class="MJX-variant"&gt;ℏ&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/msub&gt;&lt;mspace width="thickmathspace"&gt;&lt;/mspace&gt;&lt;mo stretchy="false"&gt;⟹&lt;/mo&gt;&lt;mspace width="thickmathspace"&gt;&lt;/mspace&gt;&lt;mfrac&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;msub&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/msub&gt;&lt;/mfrac&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mfrac&gt;&lt;mi class="MJX-variant"&gt;ℏ&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/msub&gt;&lt;/mfrac&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-80-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mfrac&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;msubsup&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msubsup&gt;&lt;/mfrac&gt;&lt;mfrac&gt;&lt;mrow&gt;&lt;msup&gt;&lt;mi mathvariant="normal"&gt;∂&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow&gt;&lt;mi mathvariant="normal"&gt;∂&lt;/mi&gt;&lt;msup&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mfrac&gt;&lt;msup&gt;&lt;mi class="MJX-variant"&gt;ℏ&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;msubsup&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msubsup&gt;&lt;/mfrac&gt;&lt;mfrac&gt;&lt;mrow&gt;&lt;msup&gt;&lt;mi mathvariant="normal"&gt;∂&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow&gt;&lt;mi mathvariant="normal"&gt;∂&lt;/mi&gt;&lt;msup&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-81-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;menclose notation="box"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mstyle displaystyle="true" scriptlevel="0"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;msubsup&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msubsup&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;msup&gt;&lt;mi class="MJX-variant"&gt;ℏ&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mfrac&gt;&lt;mrow&gt;&lt;msup&gt;&lt;mi mathvariant="normal"&gt;∂&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow&gt;&lt;mi mathvariant="normal"&gt;∂&lt;/mi&gt;&lt;msup&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;/mrow&gt;&lt;/mstyle&gt;&lt;/mrow&gt;&lt;/menclose&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/details&gt;
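&lt;p&gt;If you would rather check the boxed result numerically than follow the proof, here is a minimal sketch in Python. It assumes natural units (ℏ = 1), one dimension, and a fixed moment in time; the wave number and the names HBAR, K_X, and second_derivative are made up for illustration.&lt;/p&gt;

```python
import cmath

HBAR = 1.0  # assumption: natural units, so hbar = 1
K_X = 2.5   # hypothetical wave number chosen for illustration
A = 1.0     # amplitude of the wave

def psi(x):
    """One-dimensional plane wave A * exp(i * k_x * x) at a fixed time."""
    return A * cmath.exp(1j * K_X * x)

def second_derivative(f, x, step=1e-4):
    """Central finite-difference estimate of the second derivative of f at x."""
    return (f(x + step) - 2 * f(x) + f(x - step)) / step**2

x0 = 0.7                 # any position works for a plane wave
p_x = HBAR * K_X         # momentum from p_x = hbar * k_x
left_side = p_x**2 * psi(x0)
right_side = -HBAR**2 * second_derivative(psi, x0)
print(left_side, right_side)
```

&lt;p&gt;The two printed values should agree to several decimal places; the small leftover difference comes from the finite-difference approximation, not from the physics.&lt;/p&gt;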
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;
For the &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-82-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; and &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-83-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;z&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; components of momentum, we'll have almost the same equation,
with &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-84-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; replaced by the new letter. The square of the total momentum
is the sum of the squares of the components:
&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-85-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msup&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;msubsup&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msubsup&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;msubsup&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msubsup&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;msubsup&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mi&gt;z&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msubsup&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
Now we can divide by &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-86-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; and multiply by &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-87-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; on both sides, then plug in the equation for
each squared momentum component in terms of the second derivatives.&lt;/p&gt;
&lt;div class="longmath"&gt;&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-88-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mfrac&gt;&lt;msup&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mrow&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mfrac&gt;&lt;msup&gt;&lt;mi class="MJX-variant"&gt;ℏ&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mrow&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;mstyle scriptlevel="0"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo maxsize="2.047em" minsize="2.047em"&gt;(&lt;/mo&gt;&lt;/mrow&gt;&lt;/mstyle&gt;&lt;mfrac&gt;&lt;mrow&gt;&lt;msup&gt;&lt;mi mathvariant="normal"&gt;∂&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow&gt;&lt;mi mathvariant="normal"&gt;∂&lt;/mi&gt;&lt;msup&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mfrac&gt;&lt;mrow&gt;&lt;msup&gt;&lt;mi mathvariant="normal"&gt;∂&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow&gt;&lt;mi mathvariant="normal"&gt;∂&lt;/mi&gt;&lt;msup&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mfrac&gt;&lt;mrow&gt;&lt;msup&gt;&lt;mi mathvariant="normal"&gt;∂&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow&gt;&lt;mi 
mathvariant="normal"&gt;∂&lt;/mi&gt;&lt;msup&gt;&lt;mi&gt;z&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;mstyle scriptlevel="0"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo maxsize="2.047em" minsize="2.047em"&gt;)&lt;/mo&gt;&lt;/mrow&gt;&lt;/mstyle&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;
&lt;p&gt;If you know multivariable calculus, you might recognize the Laplacian operator, &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-89-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msup&gt;&lt;mi mathvariant="normal"&gt;∇&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, in here. If not,
just consider &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-90-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msup&gt;&lt;mi mathvariant="normal"&gt;∇&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; to be a special abbreviation for the sum of all these second derivatives.&lt;/p&gt;
&lt;p&gt;&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-91-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mfrac&gt;&lt;msup&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mrow&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mfrac&gt;&lt;msup&gt;&lt;mi class="MJX-variant"&gt;ℏ&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mrow&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;msup&gt;&lt;mi mathvariant="normal"&gt;∇&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
Remember earlier when we were working with classical mechanics? We said that &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-92-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msup&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;/&lt;/mo&gt;&lt;/mrow&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; was
kinetic energy. In our new quantum formula, we apply
&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-93-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mfrac&gt;&lt;msup&gt;&lt;mi class="MJX-variant"&gt;ℏ&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mrow&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;msup&gt;&lt;mi mathvariant="normal"&gt;∇&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; to our wavefunction to get the term corresponding to
kinetic energy. Therefore, we say that &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-94-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mfrac&gt;&lt;msup&gt;&lt;mi class="MJX-variant"&gt;ℏ&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mrow&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;msup&gt;&lt;mi mathvariant="normal"&gt;∇&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is the &lt;strong&gt;operator&lt;/strong&gt; for kinetic energy.
Now let's look back at our old equation, where we just took a classical equation
and multiplied by &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-95-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;.&lt;/p&gt;
&lt;p&gt;&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-96-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;E&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mfrac&gt;&lt;msup&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mrow&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mi&gt;V&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
Now with our operator, we have the Schrödinger equation.
&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-97-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;menclose notation="box"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mstyle displaystyle="true" scriptlevel="0"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;E&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mfrac&gt;&lt;msup&gt;&lt;mi class="MJX-variant"&gt;ℏ&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mrow&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;msup&gt;&lt;mi mathvariant="normal"&gt;∇&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mi&gt;V&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;/mrow&gt;&lt;/mstyle&gt;&lt;/mrow&gt;&lt;/menclose&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
This is already a valid form of the
Schrödinger equation, usually called the time-independent form, which we can use when we know the value of the energy &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-98-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;E&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;.
But we can take it further with a bit more math.&lt;/p&gt;
&lt;details&gt;
&lt;summary&gt;The momentum operator&lt;/summary&gt;
&lt;p&gt;There are many forms of the Schrödinger equation. When we transitioned from
    classical to quantum mechanics, we had a bad mix of classical momentum &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-99-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;
    and quantum wavefunction &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-100-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;:
    &lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-101-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;E&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mfrac&gt;&lt;msup&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mrow&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mi&gt;V&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
    Just like we are using the kinetic energy operator
    &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-102-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mfrac&gt;&lt;msup&gt;&lt;mi class="MJX-variant"&gt;ℏ&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mrow&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;msup&gt;&lt;mi mathvariant="normal"&gt;∇&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; to get the kinetic energy term,
    we could also use the &lt;strong&gt;momentum operator&lt;/strong&gt; &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-103-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mover&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mo stretchy="false"&gt;^&lt;/mo&gt;&lt;/mover&gt;&lt;/mrow&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;.
    This would be
    &lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-104-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;menclose notation="box"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mstyle displaystyle="true" scriptlevel="0"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;E&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mfrac&gt;&lt;msup&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mover&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mo stretchy="false"&gt;^&lt;/mo&gt;&lt;/mover&gt;&lt;/mrow&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mrow&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mi&gt;V&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;/mrow&gt;&lt;/mstyle&gt;&lt;/mrow&gt;&lt;/menclose&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
    and with that one simple change, this is another valid form of the Schrödinger equation.
    It turns out that the momentum operator is equal to &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-105-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;mi class="MJX-variant"&gt;ℏ&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;∇&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;.
    You can plug that in and see that it works out to the same thing
    as with our kinetic energy operator.
&lt;/p&gt;
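&lt;p&gt;To see that this works out, here is the substitution written out for the x component. This is only a sketch: it assumes the operator acts on everything to its right, and it writes the x component of the momentum operator as a partial derivative with respect to x times −iℏ.&lt;/p&gt;

```latex
\hat{p}_x^2 \Psi
  = \left(-i\hbar \frac{\partial}{\partial x}\right)
    \left(-i\hbar \frac{\partial \Psi}{\partial x}\right)
  = (-i)^2 \hbar^2 \frac{\partial^2 \Psi}{\partial x^2}
  = -\hbar^2 \frac{\partial^2 \Psi}{\partial x^2}
```

&lt;p&gt;The y and z components work the same way, and adding all three gives −ℏ²∇²Ψ. Dividing by 2m then recovers the kinetic energy operator.&lt;/p&gt;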
&lt;/details&gt;
&lt;h3&gt;Total Energy&lt;/h3&gt;
&lt;p&gt;Remember I said it might be interesting to know the wavelength and frequency
of the wavefunction? We tried finding the wavelength, and ended up
coming much closer to the Schrödinger equation. But let's not
forget the frequency! Let's find it and hope we come even closer,
starting from the general wavefunction.
&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-106-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mi&gt;A&lt;/mi&gt;&lt;msup&gt;&lt;mi&gt;e&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/msub&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/mrow&gt;&lt;/msup&gt;&lt;msup&gt;&lt;mi&gt;e&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;/msub&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;/mrow&gt;&lt;/msup&gt;&lt;msup&gt;&lt;mi&gt;e&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;mi&gt;z&lt;/mi&gt;&lt;/msub&gt;&lt;mi&gt;z&lt;/mi&gt;&lt;/mrow&gt;&lt;/msup&gt;&lt;msup&gt;&lt;mi&gt;e&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;mi&gt;ω&lt;/mi&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;/mrow&gt;&lt;/msup&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
The frequency in time will be based on the &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-107-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msup&gt;&lt;mi&gt;e&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;mi&gt;ω&lt;/mi&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;/mrow&gt;&lt;/msup&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;
term. Specifically, the frequency will be &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-108-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;f&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mi&gt;ω&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;/&lt;/mo&gt;&lt;/mrow&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mi&gt;π&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;,
for the same reason that the wavelength is &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-109-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mi&gt;π&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;/&lt;/mo&gt;&lt;/mrow&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;. Let's
see what happens if we take the first derivative with respect to time.&lt;/p&gt;
&lt;div class="longmath"&gt;&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-110-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mfrac&gt;&lt;mrow&gt;&lt;mi mathvariant="normal"&gt;∂&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow&gt;&lt;mi mathvariant="normal"&gt;∂&lt;/mi&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;mi&gt;ω&lt;/mi&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mi&gt;A&lt;/mi&gt;&lt;msup&gt;&lt;mi&gt;e&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/msub&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/mrow&gt;&lt;/msup&gt;&lt;msup&gt;&lt;mi&gt;e&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;/msub&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;/mrow&gt;&lt;/msup&gt;&lt;msup&gt;&lt;mi&gt;e&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;k&lt;/mi&gt;&lt;mi&gt;z&lt;/mi&gt;&lt;/msub&gt;&lt;mi&gt;z&lt;/mi&gt;&lt;/mrow&gt;&lt;/msup&gt;&lt;msup&gt;&lt;mi&gt;e&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;mi&gt;ω&lt;/mi&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;/mrow&gt;&lt;/msup&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;mi&gt;ω&lt;/mi&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div class="longmath"&gt;&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-111-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;f&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mi&gt;ω&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;/&lt;/mo&gt;&lt;/mrow&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mi&gt;π&lt;/mi&gt;&lt;mspace width="thickmathspace"&gt;&lt;/mspace&gt;&lt;mo stretchy="false"&gt;⟹&lt;/mo&gt;&lt;mspace width="thickmathspace"&gt;&lt;/mspace&gt;&lt;mfrac&gt;&lt;mrow&gt;&lt;mi mathvariant="normal"&gt;∂&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow&gt;&lt;mi mathvariant="normal"&gt;∂&lt;/mi&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;mo&gt;⋅&lt;/mo&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mi&gt;π&lt;/mi&gt;&lt;mi&gt;f&lt;/mi&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;
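&lt;p&gt;As a quick sanity check on that derivative, here's a minimal Python sketch that compares a finite-difference estimate of ∂Ψ/∂t with (−i·2πf)Ψ for a plane wave. The amplitude, wave numbers, angular frequency, and sample point are arbitrary values I made up for illustration; none of them come from the derivation itself.&lt;/p&gt;

```python
import cmath
import math

# Hypothetical illustration values (assumptions, not from the post).
A = 1.0
KX, KY, KZ = 1.0, 2.0, 0.5
OMEGA = 3.0


def psi(x, y, z, t):
    """Plane wave A e^{i kx x} e^{i ky y} e^{i kz z} e^{-i omega t}."""
    return A * cmath.exp(1j * (KX * x + KY * y + KZ * z - OMEGA * t))


def dpsi_dt(x, y, z, t, dt=1e-6):
    """Central finite-difference estimate of the time derivative of psi."""
    return (psi(x, y, z, t + dt) - psi(x, y, z, t - dt)) / (2 * dt)


# Arbitrary sample point (x, y, z, t).
point = (0.3, -0.7, 1.1, 0.4)

# Frequency from f = omega / 2 pi, then the claimed derivative (-i 2 pi f) psi.
f = OMEGA / (2 * math.pi)
numeric = dpsi_dt(*point)
analytic = (-1j * 2 * math.pi * f) * psi(*point)

print("finite difference:", numeric)
print("(-i 2 pi f) psi:  ", analytic)
```

&lt;p&gt;The two printed values should agree up to the small error of the finite-difference step, whichever sample point you pick.&lt;/p&gt;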
&lt;p&gt;But wait, remember &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-112-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;E&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mi&gt;h&lt;/mi&gt;&lt;mi&gt;f&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;? That means if we have the frequency, we can easily find
the energy. If we plug this into our earlier Schrödinger equation, this gives us
&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-113-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;menclose notation="box"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mstyle displaystyle="true" scriptlevel="0"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;mi class="MJX-variant"&gt;ℏ&lt;/mi&gt;&lt;mfrac&gt;&lt;mrow&gt;&lt;mi mathvariant="normal"&gt;∂&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow&gt;&lt;mi mathvariant="normal"&gt;∂&lt;/mi&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mfrac&gt;&lt;msup&gt;&lt;mi class="MJX-variant"&gt;ℏ&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mrow&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;msup&gt;&lt;mi mathvariant="normal"&gt;∇&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mi&gt;V&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;/mrow&gt;&lt;/mstyle&gt;&lt;/mrow&gt;&lt;/menclose&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
There, that's the full Schrödinger equation!
&lt;/p&gt;&lt;details&gt;
&lt;summary&gt;Show steps&lt;/summary&gt;
&lt;p&gt;We just had&lt;/p&gt;
&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-114-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mfrac&gt;&lt;mrow&gt;&lt;mi mathvariant="normal"&gt;∂&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow&gt;&lt;mi mathvariant="normal"&gt;∂&lt;/mi&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;mo&gt;⋅&lt;/mo&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mi&gt;π&lt;/mi&gt;&lt;mi&gt;f&lt;/mi&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-115-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;f&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mfrac&gt;&lt;mi&gt;E&lt;/mi&gt;&lt;mi&gt;h&lt;/mi&gt;&lt;/mfrac&gt;&lt;mspace width="thickmathspace"&gt;&lt;/mspace&gt;&lt;mo stretchy="false"&gt;⟹&lt;/mo&gt;&lt;mspace width="thickmathspace"&gt;&lt;/mspace&gt;&lt;mfrac&gt;&lt;mrow&gt;&lt;mi mathvariant="normal"&gt;∂&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow&gt;&lt;mi mathvariant="normal"&gt;∂&lt;/mi&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;mo&gt;⋅&lt;/mo&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mi&gt;π&lt;/mi&gt;&lt;mi&gt;E&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;/&lt;/mo&gt;&lt;/mrow&gt;&lt;mi&gt;h&lt;/mi&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;p&gt;Solve for &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-116-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;E&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;.&lt;/p&gt;
&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-117-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;E&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mfrac&gt;&lt;mi&gt;h&lt;/mi&gt;&lt;mrow&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mi&gt;π&lt;/mi&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;mfrac&gt;&lt;mrow&gt;&lt;mi mathvariant="normal"&gt;∂&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow&gt;&lt;mi mathvariant="normal"&gt;∂&lt;/mi&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;mfrac&gt;&lt;mi&gt;h&lt;/mi&gt;&lt;mrow&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mi&gt;π&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;mfrac&gt;&lt;mrow&gt;&lt;mi mathvariant="normal"&gt;∂&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow&gt;&lt;mi mathvariant="normal"&gt;∂&lt;/mi&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;p&gt;Remember we defined &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-118-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi class="MJX-variant"&gt;ℏ&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mi&gt;h&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;/&lt;/mo&gt;&lt;/mrow&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mi&gt;π&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;.&lt;/p&gt;
&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-119-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;E&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;mi class="MJX-variant"&gt;ℏ&lt;/mi&gt;&lt;mfrac&gt;&lt;mrow&gt;&lt;mi mathvariant="normal"&gt;∂&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow&gt;&lt;mi mathvariant="normal"&gt;∂&lt;/mi&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;p&gt;Let's plug this back into our equation.&lt;/p&gt;
&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-120-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;E&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mfrac&gt;&lt;msup&gt;&lt;mi class="MJX-variant"&gt;ℏ&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mrow&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;msup&gt;&lt;mi mathvariant="normal"&gt;∇&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mi&gt;V&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-121-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;menclose notation="box"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mstyle displaystyle="true" scriptlevel="0"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;mi class="MJX-variant"&gt;ℏ&lt;/mi&gt;&lt;mfrac&gt;&lt;mrow&gt;&lt;mi mathvariant="normal"&gt;∂&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow&gt;&lt;mi mathvariant="normal"&gt;∂&lt;/mi&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mfrac&gt;&lt;msup&gt;&lt;mi class="MJX-variant"&gt;ℏ&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mrow&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;msup&gt;&lt;mi mathvariant="normal"&gt;∇&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mi&gt;V&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;/mrow&gt;&lt;/mstyle&gt;&lt;/mrow&gt;&lt;/menclose&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/details&gt;
&lt;p&gt;
When I presented the Schrödinger equation at the beginning of this post,
I did one more step to make it a little extra scary. We can consider
position, which we can write as a vector &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-122-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="bold"&gt;r&lt;/mi&gt;&lt;/mrow&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;. The wavefunction and
potential energy can then be functions of
both position &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-123-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="bold"&gt;r&lt;/mi&gt;&lt;/mrow&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; and time &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-124-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;. That gives&lt;/p&gt;
&lt;div class="longmath"&gt;&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-125-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;mi class="MJX-variant"&gt;ℏ&lt;/mi&gt;&lt;mfrac&gt;&lt;mi mathvariant="normal"&gt;∂&lt;/mi&gt;&lt;mrow&gt;&lt;mi mathvariant="normal"&gt;∂&lt;/mi&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="bold"&gt;r&lt;/mi&gt;&lt;/mrow&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mfrac&gt;&lt;msup&gt;&lt;mi class="MJX-variant"&gt;ℏ&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mrow&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;msup&gt;&lt;mi mathvariant="normal"&gt;∇&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="bold"&gt;r&lt;/mi&gt;&lt;/mrow&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mi&gt;V&lt;/mi&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="bold"&gt;r&lt;/mi&gt;&lt;/mrow&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mi mathvariant="bold"&gt;r&lt;/mi&gt;&lt;/mrow&gt;&lt;mo&gt;,&lt;/mo&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;mo 
stretchy="false"&gt;)&lt;/mo&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;
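&lt;p&gt;If you'd like to see the equation hold numerically, here's a rough sketch under some loud assumptions: units where ℏ = m = 1, a constant potential V, and a plane wave whose ω is chosen from E = ℏω with E = ℏ²k²/2m + V. All the numbers and function names are hypothetical. The code estimates both sides of the equation with finite differences at one point.&lt;/p&gt;

```python
import cmath

# Hypothetical illustration values in units where hbar = m = 1 (assumption).
HBAR = 1.0
M = 1.0
V = 0.25             # constant potential, assumed for simplicity
K = (1.0, 2.0, 0.5)  # wave numbers (kx, ky, kz), arbitrary

# Energy from E = hbar^2 k^2 / 2m + V, and omega from E = hbar * omega.
E = HBAR**2 * sum(k * k for k in K) / (2 * M) + V
OMEGA = E / HBAR


def psi(r, t):
    """Plane wave with unit amplitude at position r = (x, y, z) and time t."""
    phase = sum(k * x for k, x in zip(K, r)) - OMEGA * t
    return cmath.exp(1j * phase)


def d_dt(r, t, h=1e-5):
    """Central finite-difference estimate of the time derivative."""
    return (psi(r, t + h) - psi(r, t - h)) / (2 * h)


def laplacian(r, t, h=1e-4):
    """Sum of second finite differences along x, y, and z."""
    total = 0.0
    for axis in range(3):
        plus = list(r)
        minus = list(r)
        plus[axis] += h
        minus[axis] -= h
        total += (psi(plus, t) - 2 * psi(r, t) + psi(minus, t)) / h**2
    return total


r, t = (0.3, -0.7, 1.1), 0.4
lhs = 1j * HBAR * d_dt(r, t)
rhs = -HBAR**2 / (2 * M) * laplacian(r, t) + V * psi(r, t)

print("i hbar dPsi/dt:            ", lhs)
print("-(hbar^2/2m) lap Psi + V Psi:", rhs)
```

&lt;p&gt;If you change OMEGA so that it no longer matches E / HBAR, the two sides stop agreeing, which is one way to see that the equation ties the frequency to the energy.&lt;/p&gt;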
&lt;h2&gt;What is Schrödinger's Equation?&lt;/h2&gt;
&lt;p&gt;So far, we've shown how to get to Schrödinger's equation starting from classical mechanics,
but I haven't really explained what the equation is, as I promised in the title.
Here are some thoughts about what it all means.
&lt;/p&gt;
&lt;p&gt;As a summary of what Schrödinger's equation is, you can think of it as &lt;strong&gt;a statement
of conservation of energy in quantum mechanics&lt;/strong&gt;. One big difference from the classical version is that
it is a probabilistic equation, since it tells you about the wavefunction.
The wavefunction can help you predict what a particle is doing, but you can
never be completely sure. In classical mechanics, you &lt;em&gt;can&lt;/em&gt; be completely sure,
at least if your model is right. (Another difference is that it's actually dealing with the
Hamiltonian, which isn't always conserved.)
&lt;/p&gt;
&lt;p&gt;It's interesting to see that we played a lot with the total energy &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-126-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;E&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;
and the kinetic energy &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-127-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;K&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; in the transition to quantum mechanics, but
the potential energy &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-128-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;V&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is still just written as &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-129-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;V&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;. This makes
some sense, since &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-130-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;V&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; really
depends on the situation, while &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-131-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;E&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; and &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-132-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;K&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; are properties of
the particle itself.
&lt;/p&gt;
&lt;p&gt;You must be wondering, why is the Schrödinger equation so much
more complicated than &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-133-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;E&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mi&gt;K&lt;/mi&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mi&gt;V&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;? Well, it doesn't have to be. It just
is that way because it's more explicit about the quantities we need. If you want the simple
version, you can write
&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-134-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;E&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mover&gt;&lt;mi&gt;H&lt;/mi&gt;&lt;mo stretchy="false"&gt;^&lt;/mo&gt;&lt;/mover&gt;&lt;/mrow&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
where &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-135-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;E&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is energy, and &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-136-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mover&gt;&lt;mi&gt;H&lt;/mi&gt;&lt;mo stretchy="false"&gt;^&lt;/mo&gt;&lt;/mover&gt;&lt;/mrow&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is called the &lt;strong&gt;Hamiltonian&lt;/strong&gt; operator
and equals &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-137-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mfrac&gt;&lt;msup&gt;&lt;mi class="MJX-variant"&gt;ℏ&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mrow&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;msup&gt;&lt;mi mathvariant="normal"&gt;∇&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mi&gt;V&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;. These forms
are equivalent, since we earlier proved that &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-138-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;E&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;mi class="MJX-variant"&gt;ℏ&lt;/mi&gt;&lt;mtext&gt; &lt;/mtext&gt;&lt;mi mathvariant="normal"&gt;∂&lt;/mi&gt;&lt;mi mathvariant="normal"&gt;Ψ&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;/&lt;/mo&gt;&lt;/mrow&gt;&lt;mi mathvariant="normal"&gt;∂&lt;/mi&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;.
The simple form is called the &lt;strong&gt;time-independent Schrödinger equation&lt;/strong&gt; because there's
no mention of time &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-139-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, while the one from before is the &lt;strong&gt;time-dependent
Schrödinger equation&lt;/strong&gt;.
&lt;/p&gt;
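&lt;p&gt;To make the word "operator" a bit more concrete, here's a small sketch that treats the Hamiltonian as something that takes a function in and hands a new function back. I'm assuming one dimension, units where ℏ = m = 1, a constant V, and the spatial part of a plane wave; the names and numbers are hypothetical. Dividing the output by the original function should give back the energy.&lt;/p&gt;

```python
import cmath

# Hypothetical illustration values in units where hbar = m = 1 (assumption).
HBAR = 1.0
M = 1.0
V = 0.25  # constant potential, assumed for simplicity
K = 2.0   # wave number, arbitrary


def hamiltonian(wave, h=1e-4):
    """Return the function H(wave), with H = -hbar^2/(2m) d^2/dx^2 + V in 1D."""
    def h_wave(x):
        # Second derivative estimated with a central finite difference.
        second = (wave(x + h) - 2 * wave(x) + wave(x - h)) / h**2
        return -HBAR**2 / (2 * M) * second + V * wave(x)
    return h_wave


def psi(x):
    """Spatial part of a 1D plane wave, e^{i k x}."""
    return cmath.exp(1j * K * x)


x = 0.37  # arbitrary sample point
energy = hamiltonian(psi)(x) / psi(x)
expected = HBAR**2 * K**2 / (2 * M) + V

print("(H psi) / psi:       ", energy)
print("hbar^2 k^2 / 2m + V: ", expected)
```

&lt;p&gt;The ratio comes out the same (up to finite-difference error) at any x you try, which is what it means for the operator to act on this wavefunction like multiplication by a single number E.&lt;/p&gt;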
&lt;p&gt;It seems like conservation of energy has a simple version in both
classical physics and quantum physics. We just did a lot of work with
a complicated analog of conservation of energy in quantum physics.
Is there an analog of this in classical physics?
&lt;/p&gt;
&lt;p&gt;In general, for a conservative system, the Hamiltonian represents
the sum of kinetic and potential energy. So in classical physics, we have
&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-140-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;E&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mi&gt;K&lt;/mi&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mi&gt;V&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mi&gt;H&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-141-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;E&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mfrac&gt;&lt;msup&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mrow&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mi&gt;V&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
From here, we can't go further without more information, so we could say that this is the
classical analog of the complicated Schrödinger equation.&lt;/p&gt;
&lt;p&gt;But if we know more about the situation,
we can make this more complicated. Maybe we know the potential energy is
from gravity, and maybe we know the initial energy was all gravitational
potential energy at some starting height. Then, after dividing both sides by the mass, we have&lt;/p&gt;
&lt;div class="longmath"&gt;&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-142-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;g&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;h&lt;/mi&gt;&lt;mn&gt;0&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mfrac&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/mfrac&gt;&lt;mstyle scriptlevel="0"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo maxsize="2.047em" minsize="2.047em"&gt;[&lt;/mo&gt;&lt;/mrow&gt;&lt;/mstyle&gt;&lt;mstyle scriptlevel="0"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo maxsize="2.047em" minsize="2.047em"&gt;(&lt;/mo&gt;&lt;/mrow&gt;&lt;/mstyle&gt;&lt;mfrac&gt;&lt;mrow&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;msup&gt;&lt;mstyle scriptlevel="0"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo maxsize="2.047em" minsize="2.047em"&gt;)&lt;/mo&gt;&lt;/mrow&gt;&lt;/mstyle&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mstyle scriptlevel="0"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo maxsize="2.047em" minsize="2.047em"&gt;(&lt;/mo&gt;&lt;/mrow&gt;&lt;/mstyle&gt;&lt;mfrac&gt;&lt;mrow&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;msup&gt;&lt;mstyle scriptlevel="0"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo maxsize="2.047em" minsize="2.047em"&gt;)&lt;/mo&gt;&lt;/mrow&gt;&lt;/mstyle&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mstyle scriptlevel="0"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo maxsize="2.047em" 
minsize="2.047em"&gt;(&lt;/mo&gt;&lt;/mrow&gt;&lt;/mstyle&gt;&lt;mfrac&gt;&lt;mrow&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mi&gt;z&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;msup&gt;&lt;mstyle scriptlevel="0"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo maxsize="2.047em" minsize="2.047em"&gt;)&lt;/mo&gt;&lt;/mrow&gt;&lt;/mstyle&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mstyle scriptlevel="0"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo maxsize="2.047em" minsize="2.047em"&gt;]&lt;/mo&gt;&lt;/mrow&gt;&lt;/mstyle&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mi&gt;g&lt;/mi&gt;&lt;mi&gt;z&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;
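&lt;p&gt;Here's a small sketch of that classical relation for the simplest case I could think of: an object dropped from rest, so all the initial energy is gravitational potential energy. The values of g and the starting height are assumptions for illustration, and the motion uses the standard free-fall formulas.&lt;/p&gt;

```python
# Hypothetical free-fall example (values assumed for illustration).
G = 9.81   # gravitational acceleration in m/s^2
H0 = 20.0  # starting height in m, dropped from rest


def state(t):
    """Height z and velocity components at time t for an object dropped from rest."""
    z = H0 - 0.5 * G * t**2
    vx, vy, vz = 0.0, 0.0, -G * t
    return z, (vx, vy, vz)


def energy_per_mass(t):
    """Right-hand side: (1/2)(vx^2 + vy^2 + vz^2) + g z."""
    z, (vx, vy, vz) = state(t)
    return 0.5 * (vx**2 + vy**2 + vz**2) + G * z


# Compare against the left-hand side g * h0 at a few times during the fall.
for t in (0.0, 0.5, 1.0, 1.5):
    print(t, energy_per_mass(t), G * H0)
```

&lt;p&gt;The middle and last columns should match at every time, since the kinetic term grows exactly as fast as the gravitational term shrinks.&lt;/p&gt;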
&lt;details&gt;
&lt;summary&gt;Proof&lt;/summary&gt;
&lt;p&gt;
&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-143-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;E&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mfrac&gt;&lt;msup&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mrow&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mi&gt;V&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-144-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;mi&gt;g&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;h&lt;/mi&gt;&lt;mn&gt;0&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mfrac&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mrow&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;msup&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;mi&gt;g&lt;/mi&gt;&lt;mi&gt;z&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-145-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;mi&gt;g&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;h&lt;/mi&gt;&lt;mn&gt;0&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mfrac&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mrow&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;msubsup&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msubsup&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;msubsup&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msubsup&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;msubsup&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mi&gt;z&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msubsup&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;mi&gt;g&lt;/mi&gt;&lt;mi&gt;z&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-146-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;mi&gt;g&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;h&lt;/mi&gt;&lt;mn&gt;0&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mfrac&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mrow&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;msup&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;msubsup&gt;&lt;mi&gt;v&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msubsup&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;msubsup&gt;&lt;mi&gt;v&lt;/mi&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msubsup&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;msubsup&gt;&lt;mi&gt;v&lt;/mi&gt;&lt;mi&gt;z&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msubsup&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;mi&gt;g&lt;/mi&gt;&lt;mi&gt;z&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-147-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;mi&gt;g&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;h&lt;/mi&gt;&lt;mn&gt;0&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mfrac&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/mfrac&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;msubsup&gt;&lt;mi&gt;v&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msubsup&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;msubsup&gt;&lt;mi&gt;v&lt;/mi&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msubsup&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;msubsup&gt;&lt;mi&gt;v&lt;/mi&gt;&lt;mi&gt;z&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msubsup&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;mi&gt;g&lt;/mi&gt;&lt;mi&gt;z&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;div class="longmath"&gt;&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-148-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;g&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;h&lt;/mi&gt;&lt;mn&gt;0&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mfrac&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/mfrac&gt;&lt;mstyle scriptlevel="0"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo maxsize="2.047em" minsize="2.047em"&gt;[&lt;/mo&gt;&lt;/mrow&gt;&lt;/mstyle&gt;&lt;mstyle scriptlevel="0"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo maxsize="2.047em" minsize="2.047em"&gt;(&lt;/mo&gt;&lt;/mrow&gt;&lt;/mstyle&gt;&lt;mfrac&gt;&lt;mrow&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;msup&gt;&lt;mstyle scriptlevel="0"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo maxsize="2.047em" minsize="2.047em"&gt;)&lt;/mo&gt;&lt;/mrow&gt;&lt;/mstyle&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mstyle scriptlevel="0"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo maxsize="2.047em" minsize="2.047em"&gt;(&lt;/mo&gt;&lt;/mrow&gt;&lt;/mstyle&gt;&lt;mfrac&gt;&lt;mrow&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mi&gt;y&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;msup&gt;&lt;mstyle scriptlevel="0"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo maxsize="2.047em" minsize="2.047em"&gt;)&lt;/mo&gt;&lt;/mrow&gt;&lt;/mstyle&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mstyle scriptlevel="0"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo maxsize="2.047em" 
minsize="2.047em"&gt;(&lt;/mo&gt;&lt;/mrow&gt;&lt;/mstyle&gt;&lt;mfrac&gt;&lt;mrow&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mi&gt;z&lt;/mi&gt;&lt;/mrow&gt;&lt;mrow&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mi&gt;t&lt;/mi&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;msup&gt;&lt;mstyle scriptlevel="0"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo maxsize="2.047em" minsize="2.047em"&gt;)&lt;/mo&gt;&lt;/mrow&gt;&lt;/mstyle&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mstyle scriptlevel="0"&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo maxsize="2.047em" minsize="2.047em"&gt;]&lt;/mo&gt;&lt;/mrow&gt;&lt;/mstyle&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mi&gt;g&lt;/mi&gt;&lt;mi&gt;z&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;
&lt;/details&gt;
&lt;p&gt;From there, we can use something called Hamiltonian mechanics to see how the system will evolve,
    without needing Newton's laws.
&lt;/p&gt;
&lt;p&gt;This is somewhat complicated, like the more complicated version of the Schrödinger equation.
It also has derivatives, which can help us find how the system will change with time.
The moral of the story is that given the basic idea of the Hamiltonian &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-149-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;E&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mi&gt;H&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;,
and some information specific to the situation, we can plug in that information
to get a more complicated but more useful equation. It's interesting how this
works for both quantum and classical physics.
&lt;/p&gt;
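&lt;p&gt;To make "see how the system will evolve" a bit more concrete, here is a minimal sketch for the falling object above.
It assumes the standard form of Hamilton's equations, dz/dt = ∂H/∂p and dp/dt = −∂H/∂z, applied to H = p²/2m + mgz.
The mass, starting height, time step, and the function names are made-up values for illustration, and the half-step/full-step update is just one convenient way to step the equations forward.&lt;/p&gt;

```python
def simulate_fall(m, g, h0, dt, steps):
    """Step Hamilton's equations for H = p**2/(2m) + m*g*z, starting at rest at height h0."""
    z, p = h0, 0.0
    for _ in range(steps):
        # dp/dt = -dH/dz = -m*g (first half step for the momentum)
        p -= 0.5 * m * g * dt
        # dz/dt = dH/dp = p/m (full step for the position)
        z += (p / m) * dt
        # second half step for the momentum
        p -= 0.5 * m * g * dt
    return z, p


def energy(m, g, z, p):
    """The Hamiltonian: kinetic plus potential energy."""
    return p**2 / (2 * m) + m * g * z


# Illustrative values (assumptions, not from the post): 1 kg dropped from 10 m.
m, g, h0 = 1.0, 9.8, 10.0
z, p = simulate_fall(m, g, h0, dt=0.001, steps=1000)  # one second of falling

print("height:", z)
print("momentum:", p)
print("energy now:", energy(m, g, z, p), "energy at start:", m * g * h0)
```

&lt;p&gt;The energy printed at the end should match the starting energy mgh&lt;sub&gt;0&lt;/sub&gt; up to rounding, which is the conservation law we started from.&lt;/p&gt;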
&lt;p&gt;The Schrödinger equation looks scary, and it is a little
scary, but it's also meaningful. Just like conservation of
energy in the Hamiltonian form can help us tell what will happen in classical mechanics,
the Schrödinger equation tells us what will happen to a wavefunction in
quantum mechanics.
&lt;/p&gt;
&lt;p&gt;There's a lot more to this, like how exactly we use wavefunctions, and
what a wavefunction is. There are many questions that still don't
have answers, like what it means that all this seems probability-based.
But considering that this equation is at the heart of our universe
(ignoring relativity, or until someone finds a better one that explains quantum gravity or something),
it's interesting to know that it's related to classical conservation of energy,
which even introductory physics students know about.
&lt;/p&gt;
&lt;hr/&gt;
&lt;h2&gt;Related Posts&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.harysdalvi.com/climatemodel/"&gt;Playing with Sunshine&lt;/a&gt; — A simple interactive model for regional and global climate and greenhouse gases using the Stefan-Boltzmann law&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.harysdalvi.com/blog/2110_1/"&gt;Testing Copernicus&lt;/a&gt; — If all reference frames are valid, can we say that the sun goes around the earth?&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.harysdalvi.com/blog/2204/"&gt;What is Time?&lt;/a&gt; — Unraveling the concept of time from a physics perspective&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="ref"&gt;References&lt;/h2&gt;
&lt;ol id="reflist" type="a"&gt;
&lt;li&gt;&lt;a href="https://ocw.mit.edu/courses/physics/8-04-quantum-physics-i-spring-2016/lecture-notes/MIT8_04S16_LecNotes5.pdf" target="_blank"&gt;
    Quantum Physics I&lt;/a&gt; (B. Zwiebach, MIT)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://phys.libretexts.org/@go/page/7018" target="_blank"&gt;
    Hamilton’s Equations of Motion&lt;/a&gt; (Jeremy Tatum, University of Victoria)&lt;/li&gt;
&lt;/ol&gt;</content></entry><entry><title>Testing Copernicus</title><summary>If all reference frames are valid, can we say that the sun goes around the earth? Almost, but not quite.</summary><link href="https://www.harysdalvi.com/blog/2110_1/" rel="alternate"/><id>https://www.harysdalvi.com/blog/2110_1/</id><published>2021-10-15T16:38:07Z</published><updated>2021-10-15T16:38:07Z</updated><content type="html" xml:lang="en">&lt;p&gt;In school, we are taught that the earth goes around the sun. This is
    correct. I'm not trying to say heliocentric theory is wrong.
    (A physics major disagreeing with heliocentrism is not a good look
    at all.)
&lt;/p&gt;
&lt;p&gt;I'm not trying to say heliocentric theory is wrong, &lt;em&gt;but&lt;/em&gt;
    (this sounds bad so far) I am going to look at one way in
    physics that geocentric theory could be correct too. (Could be,
    but it's really not. I'm not saying it is.)
&lt;/p&gt;
&lt;p&gt;Now that we have firmly established that I believe the earth and
    all the planets go around the sun, not the other way around,
    I think I can safely start my critique of heliocentrism.
&lt;/p&gt;
&lt;p&gt;There are aspects of this that have to do with the rotation
    of the earth, the Coriolis effect, and the theory of relativity. I will neglect
    these ideas, as they are topics in themselves, and aren't
    really necessary to look at heliocentrism compared to geocentrism.
&lt;/p&gt;
&lt;h2&gt;Frame of Reference&lt;/h2&gt;
&lt;p&gt;From the sun's perspective it is clear that the earth goes around the sun.
    If people could live on the sun, they would look up at the fiery sky
    and observe the planets rising and setting, just like we observe the sun
    rising and setting here on Earth.
&lt;/p&gt;
&lt;p&gt;But why is the earth's perspective any less valid? After all, physics
    dictates that &lt;strong&gt;there is no privileged frame of reference&lt;/strong&gt;
    and &lt;strong&gt;the laws of physics are the same in all inertial reference
    frames&lt;/strong&gt;.
&lt;/p&gt;
&lt;p&gt;It turns out that while the perspective on the sun is an inertial reference
    frame, the perspective on Earth is not. This is the reason why heliocentrism
    is accepted while geocentrism is not.
&lt;/p&gt;
&lt;p&gt;For our purposes today, an &lt;strong&gt;inertial reference frame&lt;/strong&gt; is one that
    follows Newton's first law of motion, or one in which real forces are the only
    way in which things can accelerate. If you don't see how a reference frame might
    break Newton's first law, consider this thought experiment.
&lt;/p&gt;
&lt;p&gt;You and I are in space. First, my perspective: I believe that I am stationary,
    and I see that you are stationary as well. Then, you use a jetpack to produce
    a force on you that accelerates you at 1 m/s&lt;sup&gt;2&lt;/sup&gt;. Does this follow
    Newton's first law of motion?&lt;/p&gt;
&lt;p&gt;As for me, I am at rest, and I remain at rest
    because I am not acted upon by an outside force. As for you, you start at rest,
    but you are then acted upon by an outside force to accelerate you. This all
    follows Newton's first law of motion, meaning my frame of reference is inertial.
&lt;/p&gt;
&lt;p&gt;But now let's think about your perspective. You start at rest, and you see me
    at rest. But then, in your perspective, &lt;em&gt;I&lt;/em&gt; start to accelerate towards
    &lt;em&gt;you&lt;/em&gt; at 1 m/s&lt;sup&gt;2&lt;/sup&gt; when you start your jetpack. It's like
    being in a car: it seems like you are sitting still while everything
    else moves past. Does this follow Newton's first law of motion?
&lt;/p&gt;
&lt;p&gt;As for me, I accelerate from rest, but there is no force on me. This doesn't make
    sense according to Newton's first law. As for you, you remain at rest, even though
    the jetpack exerts a force on you. This doesn't make sense either. How do we deal
    with this?
&lt;/p&gt;
&lt;p&gt;Since this is a non-inertial reference frame, we can solve the problem by creating
    a &lt;strong&gt;fictitious force&lt;/strong&gt;. In this case, we will need a fictitious force to
    accelerate both you and me by 1 m/s&lt;sup&gt;2&lt;/sup&gt; in the opposite direction of
    the force on you due to the jetpack. Now everything makes sense: this fictitious force
    cancels out with the force of the jetpack on you, so you remain at rest.
    This fictitious force is the only force acting on me, so it causes me to
    accelerate at 1 m/s&lt;sup&gt;2&lt;/sup&gt;.
&lt;/p&gt;
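&lt;p&gt;Here is a small sketch of the bookkeeping in this thought experiment. The masses are made-up numbers, and
&lt;code&gt;apparent_acceleration&lt;/code&gt; is a hypothetical helper name. The one idea it encodes is that in a frame accelerating at some rate,
every object picks up a fictitious force equal to its mass times that rate, pointing the opposite way.&lt;/p&gt;

```python
def apparent_acceleration(real_force, mass, frame_acceleration):
    """Acceleration of an object as seen from a frame that itself accelerates.

    The fictitious force on the object is -mass * frame_acceleration.
    In an inertial frame, frame_acceleration is 0 and only real forces matter.
    """
    fictitious_force = -mass * frame_acceleration
    return (real_force + fictitious_force) / mass


a_jetpack = 1.0                        # m/s^2, from the thought experiment
mass_you, mass_me = 80.0, 70.0         # kg, illustrative assumptions
jetpack_force = mass_you * a_jetpack   # the only real force in the problem

# My frame (inertial): no fictitious force.
you_in_my_frame = apparent_acceleration(jetpack_force, mass_you, 0.0)
me_in_my_frame = apparent_acceleration(0.0, mass_me, 0.0)

# Your frame (accelerating along with the jetpack): fictitious force on both of us.
you_in_your_frame = apparent_acceleration(jetpack_force, mass_you, a_jetpack)
me_in_your_frame = apparent_acceleration(0.0, mass_me, a_jetpack)

print("my frame:  ", you_in_my_frame, me_in_my_frame)
print("your frame:", you_in_your_frame, me_in_your_frame)
```

&lt;p&gt;In my frame you accelerate and I don't. In your frame you stay put and I accelerate the other way, but only because of the fictitious term.&lt;/p&gt;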
&lt;p&gt;Even though we now have a reference frame that follows the laws of physics, we needed
    to invent a fictitious force to get there. This fictitious force is fictitious,
    as the name implies. If we need a fictitious force, the frame is non-inertial.
&lt;/p&gt;
&lt;p&gt;Now, why do we need a fictitious force to produce a geocentric theory?
&lt;/p&gt;
&lt;h2&gt;Comparing Heliocentric and Geocentric Models&lt;/h2&gt;
&lt;h3&gt;Heliocentrism&lt;/h3&gt;
&lt;p&gt;I'll start with a heliocentric model. Let's make things very simple so we can focus on
    the differences between heliocentrism and geocentrism rather than on the details of
    orbital mechanics. We'll only consider the earth and the sun, and we'll assume the
    earth's orbit is circular.
&lt;/p&gt;
&lt;p&gt;Then the earth is at a constant distance &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-1-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;R&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; from the sun. The acceleration of the
    earth is given by &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-2-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;G&lt;/mi&gt;&lt;mi&gt;M&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;/&lt;/mo&gt;&lt;/mrow&gt;&lt;msup&gt;&lt;mi&gt;R&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, where &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-3-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;M&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is the mass of the sun and &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-4-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;G&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is the
    gravitational constant. Since the mass of the sun is much greater than the mass of
    the earth, we can take the acceleration of the sun to be 0. Now we will ask the same
    question: is this an inertial frame of reference? Does this follow Newton's first
    law?
&lt;/p&gt;
&lt;p&gt;Looking at the earth, it feels a force &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-5-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;G&lt;/mi&gt;&lt;mi&gt;M&lt;/mi&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;/&lt;/mo&gt;&lt;/mrow&gt;&lt;msup&gt;&lt;mi&gt;R&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; from the sun, where &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-6-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is the
    mass of the earth. Based on this, the acceleration of the earth is &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-7-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;G&lt;/mi&gt;&lt;mi&gt;M&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;/&lt;/mo&gt;&lt;/mrow&gt;&lt;msup&gt;&lt;mi&gt;R&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;.
    This is exactly the centripetal acceleration the earth needs in order
    to keep a stable circular orbit. (We can show this with basic algebra-based physics.)
    &lt;/p&gt;&lt;details&gt;
&lt;summary&gt;Proof: keeping a stable orbit&lt;/summary&gt;
    Let's say the earth orbits the sun at a speed &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-8-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;v&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, and continues this speed in its
    circular orbit. In order to do so, the earth needs the right amount of centripetal
    force. This centripetal force must be provided by gravity, the only force
    on the earth. Therefore, we can set the force due to gravity and the
    required centripetal force equal to each other.
    &lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-9-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mfrac&gt;&lt;mrow&gt;&lt;mi&gt;G&lt;/mi&gt;&lt;mi&gt;M&lt;/mi&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;/mrow&gt;&lt;msup&gt;&lt;mi&gt;R&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/mfrac&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mfrac&gt;&lt;mrow&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;msup&gt;&lt;mi&gt;v&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/mrow&gt;&lt;mi&gt;R&lt;/mi&gt;&lt;/mfrac&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-10-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;v&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;msqrt&gt;&lt;mfrac&gt;&lt;mrow&gt;&lt;mi&gt;G&lt;/mi&gt;&lt;mi&gt;M&lt;/mi&gt;&lt;/mrow&gt;&lt;mi&gt;R&lt;/mi&gt;&lt;/mfrac&gt;&lt;/msqrt&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
    So as long as the earth keeps going at &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-11-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;v&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;msqrt&gt;&lt;mi&gt;G&lt;/mi&gt;&lt;mi&gt;M&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;/&lt;/mo&gt;&lt;/mrow&gt;&lt;mi&gt;R&lt;/mi&gt;&lt;/msqrt&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, it can keep its circular orbit stable.
    &lt;/details&gt;
&lt;p&gt;
    Since &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-12-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;G&lt;/mi&gt;&lt;mi&gt;M&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;/&lt;/mo&gt;&lt;/mrow&gt;&lt;msup&gt;&lt;mi&gt;R&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is both the acceleration calculated for earth based on the forces it
    experiences, and the acceleration it must be observed to experience in order to
    keep a circular orbit, Newton's first law is followed: the earth is an object in motion
    that is being acted upon by an outside force and responding predictably to that force.
    This is analogous to the space example: the acceleration we would calculate for you
    because of your jetpack is the same acceleration I observed in my frame of reference.
    Therefore, this is an inertial frame of reference.
&lt;/p&gt;
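&lt;p&gt;As a quick sanity check, we can plug in numbers. The values below are commonly quoted approximate figures for G, the sun's mass,
and the mean earth-sun distance. They are my assumptions for this sketch, and the real orbit is not perfectly circular.&lt;/p&gt;

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2 (approximate)
M_SUN = 1.989e30   # mass of the sun, kg (approximate)
R = 1.496e11       # mean earth-sun distance, m (approximate)

# Speed needed for a circular orbit, from G*M*m/R^2 = m*v^2/R.
v = math.sqrt(G * M_SUN / R)

# Acceleration predicted from the force of gravity...
a_gravity = G * M_SUN / R**2
# ...and acceleration required to stay on a circle of radius R at speed v.
a_centripetal = v**2 / R

# Time for one trip around the circle, converted to days.
period_days = 2 * math.pi * R / v / 86400

print(f"orbital speed: {v / 1000:.1f} km/s")
print(f"a from gravity: {a_gravity:.3e} m/s^2")
print(f"a for the circle: {a_centripetal:.3e} m/s^2")
print(f"period: {period_days:.1f} days")
```

&lt;p&gt;The two accelerations agree, and the period comes out close to one year, so the simplified circular model is doing its job.&lt;/p&gt;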
&lt;p&gt;The acceleration of the sun is calculated as &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-13-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;G&lt;/mi&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;/&lt;/mo&gt;&lt;/mrow&gt;&lt;msup&gt;&lt;mi&gt;R&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;. We can assume this is very small, near 0,
    since &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-14-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;G&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is very small while &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-15-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;R&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is very big, and the earth's mass is tiny compared to the sun's. Since our frame of reference is centered on the sun,
    the sun's acceleration should be 0, so this makes sense. (In order for it to add up perfectly, we would need a frame of
    reference centered on the &lt;em&gt;barycenter&lt;/em&gt; of the sun and earth, but this is close enough to
    the sun that we can say a heliocentric model works.)
&lt;/p&gt;
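&lt;p&gt;To get a feel for how good the "near 0" approximation is, here is a rough calculation. The constants are commonly quoted approximate values
(my assumptions, not numbers from this post), and the barycenter formula is the usual two-body center of mass, ignoring every other planet.&lt;/p&gt;

```python
G = 6.674e-11         # gravitational constant, m^3 kg^-1 s^-2 (approximate)
M_SUN = 1.989e30      # mass of the sun, kg (approximate)
M_EARTH = 5.972e24    # mass of the earth, kg (approximate)
R = 1.496e11          # mean earth-sun distance, m (approximate)
SUN_RADIUS = 6.957e8  # radius of the sun, m (approximate)

a_earth = G * M_SUN / R**2    # acceleration of the earth toward the sun
a_sun = G * M_EARTH / R**2    # acceleration of the sun toward the earth

# Distance from the sun's center to the earth-sun center of mass.
barycenter_offset = R * M_EARTH / (M_SUN + M_EARTH)

print(f"a_earth = {a_earth:.3e} m/s^2")
print(f"a_sun   = {a_sun:.3e} m/s^2")
print(f"a_sun / a_earth = {a_sun / a_earth:.2e}")
print(f"barycenter is {barycenter_offset / 1000:.0f} km from the sun's center, "
      f"or {barycenter_offset / SUN_RADIUS:.4f} solar radii")
```

&lt;p&gt;With these values the sun's acceleration is a few millionths of the earth's, and the earth-sun barycenter sits a few hundred kilometers
from the sun's center, deep inside the sun itself.&lt;/p&gt;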
&lt;h3&gt;Geocentrism&lt;/h3&gt;
&lt;p&gt;Now let's shift to a geocentric model. Since we are centering our frame of reference on the
    earth, the acceleration of the earth should be 0. The sun is going around the earth
    at the same speed &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-16-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;v&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;msqrt&gt;&lt;mi&gt;G&lt;/mi&gt;&lt;mi&gt;M&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;/&lt;/mo&gt;&lt;/mrow&gt;&lt;mi&gt;R&lt;/mi&gt;&lt;/msqrt&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; that the earth went around the sun in the heliocentric model, and at the
    same distance &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-17-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;R&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;. This means that the centripetal acceleration needed for the sun is equal to
    &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-18-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;G&lt;/mi&gt;&lt;mi&gt;M&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;/&lt;/mo&gt;&lt;/mrow&gt;&lt;msup&gt;&lt;mi&gt;R&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, which is what was needed for the earth before.
&lt;/p&gt;
&lt;p&gt;Is this an inertial frame of reference? Let's look at the sun.
    Imagine you are an observer on earth with access to the position of the sun,
    its speed, its mass, its distance, the mass of earth, all these relevant variables. What would you
    see if you watch the sun?
&lt;/p&gt;
&lt;p&gt;You would see it rise and set as it orbits around the earth. Based on its observed speed &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-19-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;v&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;msqrt&gt;&lt;mi&gt;G&lt;/mi&gt;&lt;mi&gt;M&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;/&lt;/mo&gt;&lt;/mrow&gt;&lt;mi&gt;R&lt;/mi&gt;&lt;/msqrt&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; and
    distance &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-20-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;R&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, you can calculate its acceleration to be &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-21-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;G&lt;/mi&gt;&lt;mi&gt;M&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;/&lt;/mo&gt;&lt;/mrow&gt;&lt;msup&gt;&lt;mi&gt;R&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;. Does this acceleration make sense?
    The only force on the sun is gravity from the earth, with magnitude &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-22-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;G&lt;/mi&gt;&lt;mi&gt;M&lt;/mi&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;/&lt;/mo&gt;&lt;/mrow&gt;&lt;msup&gt;&lt;mi&gt;R&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, meaning you would
    predict the acceleration of the sun to be &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-23-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;G&lt;/mi&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;/&lt;/mo&gt;&lt;/mrow&gt;&lt;msup&gt;&lt;mi&gt;R&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;. This is much less than the &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-24-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;G&lt;/mi&gt;&lt;mi&gt;M&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;/&lt;/mo&gt;&lt;/mrow&gt;&lt;msup&gt;&lt;mi&gt;R&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;
    you observe.
&lt;/p&gt;
&lt;p&gt;As for the earth, it experiences the same force &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-25-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;G&lt;/mi&gt;&lt;mi&gt;M&lt;/mi&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;/&lt;/mo&gt;&lt;/mrow&gt;&lt;msup&gt;&lt;mi&gt;R&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, predicting an acceleration
    &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-26-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;G&lt;/mi&gt;&lt;mi&gt;M&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;/&lt;/mo&gt;&lt;/mrow&gt;&lt;msup&gt;&lt;mi&gt;R&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;. But since you are on earth, it seems to you that the acceleration is 0.
    This is off too. Therefore, your frame of reference is not inertial.
&lt;/p&gt;
&lt;p&gt;We can resolve this by creating a fictitious force. Remember that we just found the
    predicted acceleration of the sun to be &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-27-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;G&lt;/mi&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;/&lt;/mo&gt;&lt;/mrow&gt;&lt;msup&gt;&lt;mi&gt;R&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;. Let's approximate this to be 0.
    (We are doing this because we will neglect any difference between the barycenter of the
    solar system and the center of the sun.) The acceleration for the sun that we observe
    in a geocentric model is &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-28-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;G&lt;/mi&gt;&lt;mi&gt;M&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;/&lt;/mo&gt;&lt;/mrow&gt;&lt;msup&gt;&lt;mi&gt;R&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, so we will apply this acceleration to the sun
    in the direction towards the earth.
&lt;/p&gt;
&lt;p&gt;The earth feels a pull to the sun, but we find that our new fictitious force opposes this.
    An acceleration vector of &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-29-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;G&lt;/mi&gt;&lt;mi&gt;M&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;/&lt;/mo&gt;&lt;/mrow&gt;&lt;msup&gt;&lt;mi&gt;R&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; from the sun towards the earth, if applied to the earth,
    becomes an acceleration vector of &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-30-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;G&lt;/mi&gt;&lt;mi&gt;M&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;/&lt;/mo&gt;&lt;/mrow&gt;&lt;msup&gt;&lt;mi&gt;R&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; in the direction opposite the sun. It's easy to
    see how this will cancel the force of gravity on the earth due to the sun.
&lt;/p&gt;
&lt;p&gt;So by creating a fictitious force that produces an acceleration &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-31-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;G&lt;/mi&gt;&lt;mi&gt;M&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;/&lt;/mo&gt;&lt;/mrow&gt;&lt;msup&gt;&lt;mi&gt;R&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; in the direction
    from the sun to the earth, we have now built a working geocentric model.
&lt;/p&gt;
&lt;p&gt;Of course, this fictitious force isn't real. That's why we say geocentrism isn't real.
    Since we need a fictitious force to account for geocentrism, it makes more
    sense to use a heliocentric model.
&lt;/p&gt;
&lt;h3&gt;Barycentrism?&lt;/h3&gt;
&lt;p&gt;I said that the acceleration of the sun due to the earth's gravity is &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-32-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;G&lt;/mi&gt;&lt;mi&gt;m&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;/&lt;/mo&gt;&lt;/mrow&gt;&lt;msup&gt;&lt;mi&gt;R&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;,
    and I approximated this to be 0. What if I hadn't done this approximation?
&lt;/p&gt;
&lt;p&gt;Then we would find that a strictly heliocentric model actually needs a slight
    correction with a fictitious force, similar to what we found with a geocentric model.
    The acceleration of the sun should be 0 in a heliocentric model, but this doesn't
    match up with the fact that there should be a (slight) acceleration of the sun
    due to gravity. We need a fictitious force to fix this.
&lt;/p&gt;
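&lt;p&gt;To get a feel for the sizes involved, here is a minimal Python sketch of the two-body bookkeeping above. It is only an illustration: the constants are rounded textbook values, the motion is reduced to a single axis pointing from the sun to the earth, and the helper name &lt;code&gt;seen_from&lt;/code&gt; is made up for this example.&lt;/p&gt;

```python
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2 (rounded)
M_SUN = 1.989e30     # mass of the sun, kg (rounded)
M_EARTH = 5.972e24   # mass of the earth, kg (rounded)
R = 1.496e11         # mean earth-sun distance, m (rounded)

# Real accelerations along one axis; positive points from the sun to the earth.
a_earth = -G * M_SUN / R**2    # the earth is pulled toward the sun
a_sun = G * M_EARTH / R**2     # the sun is pulled (slightly) toward the earth


def seen_from(a_body, a_origin):
    """Acceleration of a body as seen from a frame that rides along with the origin.

    The fictitious acceleration is minus the real acceleration of the origin,
    and it is applied to every body in the system.
    """
    fictitious = -a_origin
    return a_body + fictitious


# Geocentric frame: the earth stays put and the sun picks up the acceleration.
print("earth seen from earth:", seen_from(a_earth, a_earth))
print("sun seen from earth:  ", seen_from(a_sun, a_earth))

# Heliocentric frame: the correction is the tiny G m / R^2 term.
print("sun seen from sun:    ", seen_from(a_sun, a_sun))
print("earth seen from sun:  ", seen_from(a_earth, a_sun))
print("fictitious term, geocentric:  ", -a_earth)
print("fictitious term, heliocentric:", -a_sun)
```

&lt;p&gt;With these values, the earth-centered frame needs a fictitious acceleration of about 0.006 m/s², while the sun-centered frame needs one roughly 330,000 times smaller. That ratio is just the ratio of the two masses, which is why ignoring the correction in a heliocentric model is usually fine.&lt;/p&gt;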
&lt;p&gt;So if geocentrism needs a fictitious force, and heliocentrism needs a fictitious force,
    which model is &lt;em&gt;really&lt;/em&gt; correct? In order to really have no fictitious force,
    we need to set a frame of reference centered at the center of mass of the solar system
    (called the &lt;strong&gt;barycenter&lt;/strong&gt;) rather than the center of the sun.
    Since the sun's mass is such a huge piece of the solar system's mass, it turns out that
    the actual barycenter of our solar system always stays close to the sun: depending on where the planets
    (mostly Jupiter) are, it sits either &lt;em&gt;inside&lt;/em&gt; the sun or just outside its surface.
    Either way, the mass of the other stuff is there too, so the barycenter isn't exactly at the middle of the sun.
&lt;/p&gt;
&lt;p&gt;In some systems, such as binary star systems, the barycenter might not be inside
    any of the bodies. For example, if two stars of equal mass orbit each other,
    their barycenter is right between them, not inside either one. In this case,
    we would set up a coordinate system at this barycenter in order to avoid
    fictitious forces. This makes a lot of sense for a system of binary stars,
    and might help us see why it applies even to our solar system, since it
    can be unintuitive at first to center a model around a point slightly off
    balance from the center of the sun. We might think at first that we
    have to pick a particular body to center our system around, like the
    earth or the sun or Jupiter, but the binary star example shows that
    the barycenter is really what we need.
&lt;/p&gt;
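&lt;p&gt;As a rough sketch of where the balance point sits for just two bodies, the snippet below uses the standard center-of-mass formula. The function name &lt;code&gt;barycenter_offset&lt;/code&gt; and the rounded constants are assumptions for illustration, and the rest of the solar system is ignored.&lt;/p&gt;

```python
def barycenter_offset(m1, m2, separation):
    """Distance from the center of body 1 to the two-body center of mass."""
    return separation * m2 / (m1 + m2)


M_SUN = 1.989e30      # kg (rounded)
M_EARTH = 5.972e24    # kg (rounded)
EARTH_SUN = 1.496e11  # mean earth-sun distance, m (rounded)
SUN_RADIUS = 6.957e8  # m (rounded)

# Sun and earth alone: the balance point is very close to the sun's center.
offset = barycenter_offset(M_SUN, M_EARTH, EARTH_SUN)
print("sun-earth offset in km:", round(offset / 1000))
print("as a fraction of the sun's radius:", offset / SUN_RADIUS)

# Two equal-mass stars: the balance point is exactly halfway between them.
print("equal masses, separation 10:", barycenter_offset(2.0, 2.0, 10.0))
```

&lt;p&gt;With these rounded values, the sun-earth balance point lands roughly 450 km from the sun's center, a tiny fraction of the sun's radius of about 696,000 km. Two equal masses balance exactly halfway, as in the binary star example.&lt;/p&gt;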
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Based on the idea that all frames of reference are valid in physics,
    I tried making a geocentric model of the solar system. But then we
    found that the model needed a fictitious force to work, so
    it isn't a very good model. We found the heliocentric model to
    be better.
&lt;/p&gt;
&lt;p&gt;But taking a closer look at the heliocentric model, we realized it
    depends on approximation. If we remove this approximation, we
    end up with a barycentric model, centered at the center
    of mass of the solar system. Since the sun has such a big
    piece of the mass of our solar system, we usually approximate
    this to say that heliocentrism is the true model. But in reality,
    barycentrism works a bit better, although it is only
    slightly different from heliocentrism.
&lt;/p&gt;
&lt;p&gt;Using the idea of barycentrism, you can tell your friends that
    you don't believe in heliocentrism. Then, when they call you
    a conspiracy theorist and ask if you're a flat-earther too, you can
    explain why a barycentric model is technically superior to
    both heliocentrism and geocentrism for the same reason that
    heliocentrism is superior to geocentrism: it
    cuts out fictitious forces. I don't know if this is
    actually the kind of stuff you tell your friends,
    but if it is, I hope you enjoy.
&lt;/p&gt;
&lt;hr/&gt;
&lt;h2&gt;Related Posts&lt;/h2&gt;
&lt;ul id="related-posts"&gt;
&lt;li&gt;&lt;a href="https://www.harysdalvi.com/climatemodel/"&gt;Playing with Sunshine&lt;/a&gt; — A simple interactive model for regional and global climate and greenhouse gases using the Stefan-Boltzmann law&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.harysdalvi.com/blog/2111/"&gt;What is Schrödinger's Equation?&lt;/a&gt; — An informal derivation of the scary-looking Schrödinger equation using as little quantum physics as possible&lt;/li&gt;
&lt;/ul&gt;</content></entry><entry><title>Romans Needed Quantum Computers</title><summary>A history of information density, from the ancient Mayans to quantum computing</summary><link href="https://www.harysdalvi.com/blog/2110/" rel="alternate"/><id>https://www.harysdalvi.com/blog/2110/</id><published>2021-10-01T16:38:07Z</published><updated>2021-10-01T16:38:07Z</updated><content type="html" xml:lang="en">&lt;p&gt;Throughout history, we have always tried to condense information
    into smaller and smaller spaces. Various cultures have created
    their own techniques and adopted techniques from other cultures.
    In this post, I'll go through the history of information density
    with numbers, and compare this to computers and quantum computers.
&lt;/p&gt;
&lt;h2&gt;Ancient Beginnings&lt;/h2&gt;
&lt;h3&gt;Prehistory&lt;/h3&gt;
&lt;p&gt;Long long ago, before writing was developed, people must have counted
    on their fingers. But they also sometimes made markings to count.
    Imagine you are in prehistory counting sheep with a stone, making one
    mark per sheep. If there are &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-1-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; sheep, it will take &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-2-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; markings
    on the stone to count them all.
    &lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-3-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
    Unfortunately, the stone only has space for a few markings.
    But since we are talking about concepts here, if the stone was big
    enough, you could keep counting more sheep forever. So your system is capable
    of counting to infinity in theory.
    &lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-4-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;L&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/msub&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mi mathvariant="normal"&gt;∞&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
    There is only one type of mark you make, a simple line in the stone. So
    the number of types of symbols is
    &lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-5-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;S&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;h3&gt;Ancient Rome&lt;/h3&gt;
&lt;p&gt;Much later, the Romans started using &lt;strong&gt;Roman numerals&lt;/strong&gt;. They started making
    marks just like in prehistory: I is 1, II is 2, and III is 3.
    But then they thought, why make five marks like IIIII for five when
    you can just write V?
&lt;/p&gt;
&lt;p&gt;And here we are in year MMXXI. Unfortunately, the highest symbol is M,
    for 1000. Additionally, only up to three of the same letter are allowed in a row.
    This makes the largest number MMMCMXCIX, or 3999. In the Middle Ages,
    a line on top called a &lt;strong&gt;vinculum&lt;/strong&gt; was used, multiplying a
    number by 1000 [&lt;a href="https://www.harysdalvi.com/blog/2110/#ref" id="src1"&gt;1&lt;/a&gt;]. This means &lt;span class="vinculum"&gt;MMMCMXCIX&lt;/span&gt; is
    3999000, and then we can add CMXCIX (999) to get
    &lt;span class="vinculum"&gt;MMMCMXCIX&lt;/span&gt;CMXCIX (3999999). Then we're really stuck.
&lt;/p&gt;
&lt;p&gt;So we have &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-6-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;L&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/msub&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;3999999&lt;/mn&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;. Looking at &lt;span class="vinculum"&gt;MMMCMXCIX&lt;/span&gt;CMXCIX, we count
    16 symbols needed to write this number (including the vinculum). Now how can we
    determine &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-7-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, the number of symbols needed to write a number, as a function of &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-8-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;,
    the number we are writing?
&lt;/p&gt;
&lt;p&gt;For simplicity, I'll only consider numbers up to 3999, so let's bring &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-9-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;L&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;
    down to 3999. Let's map each place value to a Roman numeral.
    The ones place can have I, II, III, IV, V, VI, VII, VIII, IX. This is an average of 20/9 symbols,
    or about 2.2 symbols.
    The tens place can have X, XX, XXX, XL, L, LX, LXX, LXXX, XC. This is very similar, and has the same
    average symbols. The hundreds place is the same. So we can say each new place value takes another
    20/9 symbols to represent. This gives something like
    &lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-10-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mfrac&gt;&lt;mn&gt;20&lt;/mn&gt;&lt;mn&gt;9&lt;/mn&gt;&lt;/mfrac&gt;&lt;msub&gt;&lt;mi&gt;log&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;10&lt;/mn&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo&gt;⁡&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
    Does this work? We can make a &lt;a href="https://matplotlib.org" target="_blank"&gt;Matplotlib&lt;/a&gt; graph
    comparing this model to the actual length of the Roman numerals for a number &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-11-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;. Here is
    what we get:&lt;br/&gt;
&lt;img src="https://www.harysdalvi.com/blog/2110/graphs/roman.png"/&gt;&lt;br/&gt;
    Our model (in orange) seems to approximate the data (blue) reasonably well over the long term. There is also
    a moving average of the data over 100 numbers (green) which stays close to the orange line. You can see
    the low spikes due to numbers like M (1000) and MMD (2500). In case you're curious, the highest point
    on the graph is &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-12-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;3888&lt;/mn&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, written with 15 symbols as MMMDCCCLXXXVIII.
&lt;/p&gt;
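&lt;p&gt;For anyone who wants to check the numbers, here is a small Python sketch that converts integers to Roman numerals and compares their lengths with the model. It is not necessarily how the graph above was produced; the names &lt;code&gt;to_roman&lt;/code&gt; and &lt;code&gt;model_length&lt;/code&gt; are invented for this example, and plotting is left out.&lt;/p&gt;

```python
import math

# Value/symbol pairs in descending order, including the subtractive forms.
ROMAN_PAIRS = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]


def to_roman(x):
    """Convert an integer from 1 to 3999 into a Roman numeral string."""
    if x not in range(1, 4000):
        raise ValueError("x must be between 1 and 3999")
    parts = []
    for value, symbol in ROMAN_PAIRS:
        count, x = divmod(x, value)   # how many times this symbol fits
        parts.append(symbol * count)
    return "".join(parts)


def model_length(x):
    """The model from the text: n = (20/9) * log10(x)."""
    return (20 / 9) * math.log10(x)


# Actual number of symbols for every number the basic system can write.
lengths = {x: len(to_roman(x)) for x in range(1, 4000)}

longest = max(lengths, key=lengths.get)
print(longest, to_roman(longest), lengths[longest])

for x in (1000, 2021, 2500, 3888):
    print(x, to_roman(x), lengths[x], round(model_length(x), 2))
```

&lt;p&gt;The model tracks the average rather than any single number, so individual values such as M (1 symbol) or 3888 (15 symbols) can sit far from it.&lt;/p&gt;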
&lt;p&gt;Ignoring the vinculum, we see that the symbols increase in a 1/5 pattern for each place value.
    So if we wanted to expand the Roman numeral system, we would need 2 symbols for each place
    value: continuing on the pattern of I/V for ones, X/L for tens, and C/D for hundreds.
    In the long term, this can be modeled as
    &lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-13-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;S&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;msub&gt;&lt;mi&gt;log&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;10&lt;/mn&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo&gt;⁡&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
    So while the number of symbols &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-14-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is less than with prehistoric writing, we now face
    an ever-increasing number of types of symbols &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-15-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;S&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;.
&lt;/p&gt;
&lt;h3&gt;Ancient India&lt;/h3&gt;
&lt;p&gt;Ancient India used a system quite different from that of the Romans. (For Unicode compatibility reasons,
    I'll use modern numerals here. The symbols at the time were different, although still vaguely
    recognizable. They were 𑁦𑁧𑁨𑁩𑁪𑁫𑁬𑁭𑁮𑁯 if you can see that.)
&lt;/p&gt;
&lt;p&gt;The Indians also used a simple line for one, like the Romans (and the Chinese and others).
    But for a two, instead of two lines, there was a new symbol: 2. This continued all the way
    to 9. For ten, the symbol for one was used again, and then there was a new symbol for 0.
    This concept of place value could be used to write any number (&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-16-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mn&gt;1729&lt;/mn&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;1000&lt;/mn&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mn&gt;700&lt;/mn&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mn&gt;20&lt;/mn&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mn&gt;9&lt;/mn&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;).
    Each additional decimal place value required one more symbol rather than an average of 2.2,
    giving
    &lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-17-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;log&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;10&lt;/mn&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo&gt;⁡&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
    This is a lower rate of increase than Roman numerals. Even more importantly, &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-18-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;L&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/msub&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mi mathvariant="normal"&gt;∞&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;:
    arbitrarily large numbers could now be written with enough symbols. We can make a graph
    for this system, now known as &lt;strong&gt;Arabic numerals&lt;/strong&gt; (since Arabs brought the
    system from India to Europe), similar to that for Roman numerals.&lt;br/&gt;
&lt;img src="https://www.harysdalvi.com/blog/2110/graphs/arabic.png"/&gt;&lt;br/&gt;
    We have confirmed that our model &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-19-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;log&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;10&lt;/mn&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo&gt;⁡&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is roughly accurate, but
    unlike with Roman numerals, this model can now be extended as &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-20-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mo stretchy="false"&gt;→&lt;/mo&gt;&lt;mi mathvariant="normal"&gt;∞&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;
    even with just 10 symbols: &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-21-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;S&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;10&lt;/mn&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;. This combined the best of the prehistoric and
    Roman systems: &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-22-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; increases only as the log of &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-23-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; rather than linearly,
    but the number of symbols &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-24-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;S&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is kept fixed no matter how large &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-25-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; gets.
&lt;/p&gt;
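&lt;p&gt;A quick way to see the logarithmic growth is to count digits directly. This is a minimal sketch, with &lt;code&gt;digit_count&lt;/code&gt; as a made-up helper name; it compares the actual number of digits with the base-10 logarithm.&lt;/p&gt;

```python
import math


def digit_count(x):
    """Number of decimal digits needed to write a positive integer."""
    return len(str(x))


for x in (7, 42, 1729, 3999999):
    # Columns: the number, its digit count, and log10 of the number.
    print(x, digit_count(x), round(math.log10(x), 2))
```

&lt;p&gt;The exact count is the logarithm rounded down, plus one, so the smooth model is never off by more than a single symbol.&lt;/p&gt;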
&lt;h3&gt;Other Ancient Civilizations&lt;/h3&gt;
&lt;p&gt;The Roman system and the modern descendant of the Indian system are probably the
    most familiar numeral systems to most readers. However, there were many other ancient
    numeral systems, at least two of which are still in common use today.
    I have analyzed these if you're interested.
    &lt;/p&gt;&lt;details&gt;
&lt;summary&gt;Ancient Chinese&lt;/summary&gt;
&lt;p&gt;A descendant of the ancient Chinese system is used in China and Japan today
        when writing with characters rather than Arabic numerals. For Unicode compatibility reasons,
        I will be using modern Chinese characters here. This is the system used in Chinese characters
        and Japanese Kanji; there is also an East Asian numeral system using counting rods, which
        works similarly to the modern system and even includes negative numbers.
    &lt;/p&gt;
&lt;p&gt;Chinese has unique symbols for numbers 1–10. 20 is represented not by 2 and then 0,
        but by 2 and then 10: &lt;span lang="zh"&gt;二十&lt;/span&gt;. This is multiplication by
        placing a number 1–10 in front of a larger place. Addition occurs by placing it after:
        for example, while &lt;span lang="zh"&gt;二十&lt;/span&gt; is &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-26-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mo&gt;×&lt;/mo&gt;&lt;mn&gt;10&lt;/mn&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;20&lt;/mn&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, 
        &lt;span lang="zh"&gt;十二&lt;/span&gt; is &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-27-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mn&gt;10&lt;/mn&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;12&lt;/mn&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;. Both of these can be combined:
        32 is &lt;span lang="zh"&gt;三十二&lt;/span&gt;, since &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-28-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;mo&gt;×&lt;/mo&gt;&lt;mn&gt;10&lt;/mn&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;32&lt;/mn&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;.
    &lt;/p&gt;
&lt;p&gt;This works up to 99. At 100, a new symbol &lt;span lang="zh"&gt;百&lt;/span&gt; is introduced for 100. Then
        &lt;span lang="zh"&gt;二百二十二&lt;/span&gt; is &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-29-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mo&gt;×&lt;/mo&gt;&lt;mn&gt;100&lt;/mn&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mo&gt;×&lt;/mo&gt;&lt;mn&gt;10&lt;/mn&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;222&lt;/mn&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;.
        There is a similar symbol &lt;span lang="zh"&gt;千&lt;/span&gt; for 1000.
    &lt;/p&gt;
&lt;p&gt;After 1000, a new character is introduced for 10,000. Then a new character must be introduced
        for every power of &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-30-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msup&gt;&lt;mn&gt;10&lt;/mn&gt;&lt;mn&gt;4&lt;/mn&gt;&lt;/msup&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;. So the limit &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-31-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;L&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is high, but it is not infinite.
        I am not sure exactly how many characters are commonly used these days.
    &lt;/p&gt;
&lt;p&gt;Now let's analyze the number of characters &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-32-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; it takes to write a
        number &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-33-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;. For numbers up to 10, only one character is needed.
        For numbers above 10, we write the tens place and then one more character
        for the ones place, if needed. If there is a 1 in the tens place,
        we simply write one character, &lt;span lang="zh"&gt;十&lt;/span&gt;. Otherwise,
        two characters are needed. So the tens place adds about
        &lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-34-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mfrac&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mn&gt;9&lt;/mn&gt;&lt;/mfrac&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mfrac&gt;&lt;mn&gt;8&lt;/mn&gt;&lt;mn&gt;9&lt;/mn&gt;&lt;/mfrac&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mfrac&gt;&lt;mn&gt;17&lt;/mn&gt;&lt;mn&gt;9&lt;/mn&gt;&lt;/mfrac&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
        17/9 characters, about 1.9.
    &lt;/p&gt;
&lt;p&gt;Now let's look at the hundreds place. From here on, we must write the multiplier even when it is one.
        For example, 115 is not &lt;span lang="zh"&gt;百十五&lt;/span&gt;, but &lt;span lang="zh"&gt;一百一十五&lt;/span&gt;.
        This means that from 100 on, each place value (other than the ones place) always takes two
        characters. Since we are interested in long term trends, we can write
        &lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-35-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;msub&gt;&lt;mi&gt;log&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;10&lt;/mn&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo&gt;⁡&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
        This grows quicker than the modern decimal system, but more slowly than the Roman system:
        &lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-36-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mo&gt;&amp;lt;&lt;/mo&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mo&gt;&amp;lt;&lt;/mo&gt;&lt;mfrac&gt;&lt;mn&gt;20&lt;/mn&gt;&lt;mn&gt;9&lt;/mn&gt;&lt;/mfrac&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
        Past the ten thousands place, there is no character for hundred thousands.
        How will we handle the hundred thousands place?
    &lt;/p&gt;
&lt;p&gt;We will have to multiply the ten thousands place by a new tens place. This will mean
        adding a number 10–99 before the character for 10,000. This will take another
        2 characters. So our rule of &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-37-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;msub&gt;&lt;mi&gt;log&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;10&lt;/mn&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo&gt;⁡&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; still applies. As for the number of
        symbols, in the long term, we need a new symbol for every additional factor of 10,000. So we can write
        &lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-38-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;S&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;log&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;10000&lt;/mn&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo&gt;⁡&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;/details&gt;
&lt;details&gt;
&lt;summary&gt;Ancient Greek, Hebrew, Arabic&lt;/summary&gt;
&lt;p&gt;The Greek, Hebrew, and Arabic numeral systems are all functionally identical,
        and differ mainly in the symbols. Greek uses &lt;span lang="el"&gt;Αʹ Βʹ Γʹ&lt;/span&gt;,
        Hebrew uses &lt;span dir="rtl" lang="he"&gt;א ב ג‎&lt;/span&gt;, and Arabic uses
        &lt;span dir="rtl" lang="ar"&gt;ا ب ج&lt;/span&gt; for 1 2 3. These
        systems are all used today for at least some purposes. Arabs used these before
        they adopted what we know as Arabic numerals from India.
    &lt;/p&gt;
&lt;p&gt;These systems have unique symbols for 1–9, 10–90, and 100–900.
        They vary in how they treat larger numbers, and those extensions work differently from what is shown here,
        so let's say &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-39-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;L&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/msub&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;999&lt;/mn&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; for this system. It's notable that
        each new place value requires only one more letter. For example, 4 is
        &lt;span lang="el"&gt;Δʹ&lt;/span&gt;/&lt;span dir="rtl" lang="he"&gt;ד&lt;/span&gt;/&lt;span dir="rtl" lang="ar"&gt;د&lt;/span&gt;,
        44 is
        &lt;span lang="el"&gt;ΜΔʹ&lt;/span&gt;/&lt;span dir="rtl" lang="he"&gt;מד&lt;/span&gt;/&lt;span dir="rtl" lang="ar"&gt;مد&lt;/span&gt;,
        444 is
        &lt;span lang="el"&gt;ΥΜΔʹ&lt;/span&gt;/&lt;span dir="rtl" lang="he"&gt;תמד&lt;/span&gt;/&lt;span dir="rtl" lang="ar"&gt;تمد&lt;/span&gt;.
        Therefore we can write the same rule
        &lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-40-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;log&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;10&lt;/mn&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo&gt;⁡&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
        Interestingly, this is the same as modern Arabic numerals. The problem is that this requires
        a large number of new unique symbols: 9 for each place value. So we have
        &lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-41-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;S&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;9&lt;/mn&gt;&lt;msub&gt;&lt;mi&gt;log&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;10&lt;/mn&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo&gt;⁡&lt;/mo&gt;&lt;mi&gt;S&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/p&gt;
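&lt;p&gt;The one-letter-per-place rule can be sketched in a few lines of Python. This is only an illustration for the Greek letters, covering 1 to 999; the name to_greek is made up for this example, and the tables use the standard uppercase letters, including the archaic Ϛ, Ϟ, and Ϡ for 6, 90, and 900.&lt;/p&gt;

```python
# Illustrative sketch: Greek alphabetic numerals for 1 to 999.
# to_greek is a hypothetical helper name, not part of any library.
UNITS = ["", "Α", "Β", "Γ", "Δ", "Ε", "Ϛ", "Ζ", "Η", "Θ"]
TENS = ["", "Ι", "Κ", "Λ", "Μ", "Ν", "Ξ", "Ο", "Π", "Ϟ"]
HUNDREDS = ["", "Ρ", "Σ", "Τ", "Υ", "Φ", "Χ", "Ψ", "Ω", "Ϡ"]
KERAIA = "ʹ"  # numeral mark written after the letters


def to_greek(x: int) -> str:
    """Write x (1 to 999) with one letter per nonzero decimal place."""
    if x not in range(1, 1000):
        raise ValueError("this sketch only covers 1 to 999")
    hundreds, rest = divmod(x, 100)
    tens, units = divmod(rest, 10)
    # A zero in any place simply contributes no letter.
    return HUNDREDS[hundreds] + TENS[tens] + UNITS[units] + KERAIA


print(to_greek(4), to_greek(44), to_greek(444))
```

&lt;p&gt;Each of the three tables holds nine letters, which is where the 9 new symbols per place value come from.&lt;/p&gt;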
&lt;/details&gt;
&lt;details&gt;
&lt;summary&gt;Ancient Egyptian&lt;/summary&gt;
&lt;p&gt;Unfortunately, Egyptian hieroglyphs are not as widely supported in Unicode as the other scripts
    shown here, so you may have problems with rendering.&lt;/p&gt;
&lt;p&gt;The Ancient Egyptian numeral system starts like the prehistoric one: each line represents a one.
        So 𓏺 is one, 𓏻 is two, 𓐂 is nine. After that, there are new symbols for each place value:
        𓎆 is ten. These can be combined: 𓎇𓏾 is 25, and 𓍣𓎊𓏿 is 256.
    &lt;/p&gt;
&lt;p&gt;The number of symbols needed for each place value obviously depends on the number, but it ranges
        from 1–9; this averages to 5. So we can say
        &lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-42-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;5&lt;/mn&gt;&lt;msub&gt;&lt;mi&gt;log&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;10&lt;/mn&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo&gt;⁡&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
        Each new place value needs a new symbol. So
        &lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-43-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;S&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;log&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;10&lt;/mn&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo&gt;⁡&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
        The highest symbol was 𓁨 for &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-44-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msup&gt;&lt;mn&gt;10&lt;/mn&gt;&lt;mn&gt;6&lt;/mn&gt;&lt;/msup&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;. Let's set this as &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-45-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;L&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;.
    &lt;/p&gt;
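&lt;p&gt;As a rough check on the estimate above, here is a small Python sketch that counts the marks needed for a given number, treating every repeated stroke as its own mark. The helper name egyptian_mark_count is invented for this example. One assumption to keep in mind: a zero digit contributes no marks at all, so the real average per place may come out a little under 5.&lt;/p&gt;

```python
import math

# Illustrative sketch: count the marks needed to write x in the additive
# style described above (one mark per unit in each decimal place).
# egyptian_mark_count is a hypothetical helper name.


def egyptian_mark_count(x: int) -> int:
    """Total marks for a positive x = sum of its decimal digits."""
    return sum(int(digit) for digit in str(x))


for x in (25, 256, 999_999):
    estimate = 5 * math.log10(x)
    print(x, egyptian_mark_count(x), round(estimate, 1))
```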
&lt;/details&gt;
&lt;details&gt;
&lt;summary&gt;Ancient Mayan&lt;/summary&gt;
    Unfortunately, Mayan numerals are not supported in Unicode on my computer, and I am not good at
    writing Ancient Mayan symbols, so I will not be able to show the script here. However,
    it is conceptually identical to the modern Arabic numerals, even including a symbol for
    zero. The big difference is that it is base-20, giving
    &lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-46-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;log&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;20&lt;/mn&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo&gt;⁡&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-47-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;S&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;20&lt;/mn&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
    This means that the Mayan system is even more efficient than the modern one in terms of &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-48-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;. (This is considering
    number of places alone. The symbols in individual places often have multiple components:
    for example, three is three dots; fourteen is four dots and two lines. This is almost like
    a numeral system within a larger place value system, rather than arbitrary symbols.)
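&lt;p&gt;Since the glyphs can't be shown, a short Python sketch can at least show the place values. It assumes a pure base-20 system as described here, and to_base20 is a made-up helper name; each entry in the result is a value from 0 to 19 that would be drawn as one composite glyph.&lt;/p&gt;

```python
# Illustrative sketch: place values of x in a pure base-20 system.
# to_base20 is a hypothetical helper name.


def to_base20(x: int) -> list[int]:
    """Return the base-20 places of a non-negative x, most significant first."""
    if 0 > x:
        raise ValueError("x must be non-negative")
    if x == 0:
        return [0]
    places = []
    while x:
        x, remainder = divmod(x, 20)
        places.append(remainder)
    return places[::-1]


for x in (19, 20, 256, 8000):
    print(x, to_base20(x), "decimal digits:", len(str(x)))
```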
    &lt;/details&gt;
&lt;p&gt;&lt;/p&gt;
&lt;h3&gt;Comparison&lt;/h3&gt;
&lt;p&gt;Now, let's compare all the systems we've seen so far. Then we'll see where modern systems come in.&lt;/p&gt;
&lt;p&gt;As a refresher, &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-49-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is the number being written. &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-50-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is the number of symbols required to
    write &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-51-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; in this system. &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-52-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;L&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is the maximum number &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-53-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; that can be written with the
    system as is. &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-54-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;S&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is the number of unique symbols that would be needed to write an
    arbitrarily large &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-55-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; with a version of the system, perhaps adding more symbols as needed.
&lt;/p&gt;
&lt;div class="longmath"&gt;&lt;table&gt;&lt;tbody&gt;
&lt;tr&gt;
&lt;th&gt;System&lt;/th&gt;
&lt;th&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-56-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/th&gt;
&lt;th&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-57-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;L&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/th&gt;
&lt;th&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-58-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;S&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Prehistoric (tallies)&lt;/td&gt;
&lt;td&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-59-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/td&gt;
&lt;td&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-60-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi mathvariant="normal"&gt;∞&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Modern (decimal)&lt;/td&gt;
&lt;td&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-61-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;log&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;10&lt;/mn&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo&gt;⁡&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/td&gt;
&lt;td&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-62-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi mathvariant="normal"&gt;∞&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/td&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Roman&lt;/td&gt;
&lt;td&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-63-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mfrac&gt;&lt;mn&gt;20&lt;/mn&gt;&lt;mn&gt;9&lt;/mn&gt;&lt;/mfrac&gt;&lt;msub&gt;&lt;mi&gt;log&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;10&lt;/mn&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo&gt;⁡&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/td&gt;
&lt;td&gt;3999&lt;/td&gt;
&lt;td&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-64-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;msub&gt;&lt;mi&gt;log&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;10&lt;/mn&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo&gt;⁡&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Chinese&lt;/td&gt;
&lt;td&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-65-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;msub&gt;&lt;mi&gt;log&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;10&lt;/mn&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo&gt;⁡&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/td&gt;
&lt;td&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-66-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mo&gt;&amp;gt;&lt;/mo&gt;&lt;msup&gt;&lt;mn&gt;10&lt;/mn&gt;&lt;mn&gt;8&lt;/mn&gt;&lt;/msup&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/td&gt;
&lt;td&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-67-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;log&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;10000&lt;/mn&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo&gt;⁡&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Greek/Hebrew/Arabic&lt;/td&gt;
&lt;td&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-68-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;log&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;10&lt;/mn&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo&gt;⁡&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/td&gt;
&lt;td&gt;999&lt;/td&gt;
&lt;td&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-69-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mn&gt;9&lt;/mn&gt;&lt;msub&gt;&lt;mi&gt;log&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;10&lt;/mn&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo&gt;⁡&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Egyptian&lt;/td&gt;
&lt;td&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-70-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mn&gt;5&lt;/mn&gt;&lt;msub&gt;&lt;mi&gt;log&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;10&lt;/mn&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo&gt;⁡&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/td&gt;
&lt;td&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-71-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msup&gt;&lt;mn&gt;10&lt;/mn&gt;&lt;mn&gt;6&lt;/mn&gt;&lt;/msup&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/td&gt;
&lt;td&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-72-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;log&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;10&lt;/mn&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo&gt;⁡&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Mayan&lt;/td&gt;
&lt;td&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-73-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;log&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;20&lt;/mn&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo&gt;⁡&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/td&gt;
&lt;td&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-74-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi mathvariant="normal"&gt;∞&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/td&gt;
&lt;td&gt;20&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;&lt;/div&gt;
&lt;p&gt;Other than the prehistoric system, it seems all the &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-75-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; values follow logarithmic growth.
    Is this the most efficient system possible? Well, exponential growth is one of the fastest
    types of growth out there, so its inverse, logarithmic growth, should be one of the slowest.
    I can think of one system that grows faster than exponential: factorial. What would an
    inverse factorial system look like?
&lt;/p&gt;
&lt;p&gt;24 might take 4 digits to represent rather than 2. 120 would take 5. 720 would take 6.
    So far, this system is not doing well. But there is a point, somewhere way out there,
    beyond which the inverse factorial system will be more efficient than our system.
    Now to actually invent an inverse factorial method for representing numerals.
    I haven't seen anything like it. (Edit: when I went to sleep after writing this
    I couldn't stop thinking about an inverse factorial system, and then I thought I came up
    with one, but it turns out I didn't. You can look if you're interested, or I
    can leave it as an “exercise for the reader”.)
&lt;/p&gt;
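&lt;p&gt;To get a feel for where that point might be, here is a Python sketch of the digit count alone. It does not define an actual numeral system; it only assumes that a number needs the smallest count whose factorial reaches it, and inverse_factorial_digits is a name invented for this example.&lt;/p&gt;

```python
import math

# Illustrative sketch: digit counts under a hypothetical "inverse factorial"
# rule, where x needs the smallest n such that n! is at least x.


def inverse_factorial_digits(x: int) -> int:
    """Smallest n such that n! is at least x (for x of 1 or more)."""
    n, factorial = 1, 1
    while x > factorial:
        n += 1
        factorial *= n
    return n


for x in (24, 120, 720, math.factorial(25)):
    print(x, inverse_factorial_digits(x), "decimal digits:", len(str(x)))
```

&lt;p&gt;Under this assumption, 25! would take 25 digits, while it takes 26 digits in decimal, so the crossover would be somewhere around there.&lt;/p&gt;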
&lt;details&gt;
&lt;summary&gt;My attempted inverse factorial system&lt;/summary&gt;
&lt;p&gt;Instead of a 1s place, 10s place, 100s place... &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-76-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msup&gt;&lt;mn&gt;10&lt;/mn&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;/msup&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; place...
    I had a 2s place, 3s place, 4s place... &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-77-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; place. Each place
    could have either a 1 or 0. Then you multiply together all the
    places that have a 1. So &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-78-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mo&gt;!&lt;/mo&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is 2, &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-79-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mn&gt;01&lt;/mn&gt;&lt;mo&gt;!&lt;/mo&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is 3,
    &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-80-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mn&gt;11&lt;/mn&gt;&lt;mo&gt;!&lt;/mo&gt;&lt;/msub&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mo&gt;×&lt;/mo&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;6&lt;/mn&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-81-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mn&gt;101&lt;/mn&gt;&lt;mo&gt;!&lt;/mo&gt;&lt;/msub&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mo&gt;×&lt;/mo&gt;&lt;mn&gt;4&lt;/mn&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;8&lt;/mn&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;.
    I then made some minor restrictions and a computer program so
    every number had a unique representation.
&lt;/p&gt;
&lt;p&gt;The reason why I thought this system would follow an 
    inverse factorial law is that in the system,
    &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-82-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mo&gt;!&lt;/mo&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is written as a series of &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-83-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;
    ones: 6 is &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-84-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mn&gt;11&lt;/mn&gt;&lt;mo&gt;!&lt;/mo&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, 24 is &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-85-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mn&gt;111&lt;/mn&gt;&lt;mo&gt;!&lt;/mo&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;,
    120 is &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-86-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mn&gt;1111&lt;/mn&gt;&lt;mo&gt;!&lt;/mo&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;.
&lt;/p&gt;
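&lt;p&gt;The reading rule can be sketched in Python as follows. This only covers decoding; it leaves out the minor restrictions that make representations unique, and decode_product_places is a name made up for this sketch rather than the original program.&lt;/p&gt;

```python
import math

# Illustrative sketch: decode the 0/1 place notation described above.
# Place i (counting from 0 on the left) stands for the factor i + 2.
# decode_product_places is a hypothetical name, not the original program.


def decode_product_places(digits: str) -> int:
    """Multiply together the factors whose place holds a 1."""
    if set(digits) - {"0", "1"}:
        raise ValueError("digits must contain only 0 and 1")
    return math.prod(i + 2 for i, d in enumerate(digits) if d == "1")


for digits in ("1", "01", "11", "101", "111", "1111"):
    print(digits, decode_product_places(digits))
```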
&lt;p&gt;But this system has some drawbacks. Notably, a prime number &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-87-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;
    requires &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-88-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; digits to write, following a rule similar not to
    inverse factorial, but to &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-89-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;. I wrote a program to figure out
    the number of digits &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-90-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; needed to write a number &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-91-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; in this
    system. Here are some graphs:
&lt;/p&gt;&lt;br/&gt;
&lt;img src="https://www.harysdalvi.com/blog/2110/graphs/fct4000.png"/&gt;&lt;br/&gt;
&lt;img src="https://www.harysdalvi.com/blog/2110/graphs/fct10000.png"/&gt;&lt;br/&gt;
&lt;p&gt;On the high side, you can see something approximately &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-92-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;
    due to prime numbers. My orange model is &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-93-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;/&lt;/mo&gt;&lt;/mrow&gt;&lt;mi&gt;ln&lt;/mi&gt;&lt;mo&gt;⁡&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;.
    This is very interesting, because a surprisingly good approximation for
    the number of primes less than &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-94-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is
    &lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-95-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mfrac&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mrow&gt;&lt;mi&gt;ln&lt;/mi&gt;&lt;mo&gt;⁡&lt;/mo&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
    I have no idea why it seems to follow this model so closely.
&lt;/p&gt;
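&lt;p&gt;For anyone who wants to see how good that prime-counting approximation is, here is a small Python sketch. It does not reproduce the graphs above or explain the fit; it only compares an exact prime count with x / ln(x), using a basic sieve in a helper I have called prime_count for this example.&lt;/p&gt;

```python
import math

# Illustrative sketch: compare an exact prime count with x / ln(x).
# prime_count is a hypothetical helper built on a basic sieve of Eratosthenes.


def prime_count(limit: int) -> int:
    """Count the primes that are at most limit."""
    if 2 > limit:
        return 0
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0] = is_prime[1] = 0
    for p in range(2, math.isqrt(limit) + 1):
        if is_prime[p]:
            # Cross out every multiple of p, starting at p * p.
            is_prime[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return sum(is_prime)


for x in (1_000, 10_000, 100_000):
    print(x, prime_count(x), round(x / math.log(x)))
```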
&lt;/details&gt;
&lt;p&gt;You could also have something like &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-96-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mi&gt;log&lt;/mi&gt;&lt;mo&gt;⁡&lt;/mo&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mi&gt;log&lt;/mi&gt;&lt;mo&gt;⁡&lt;/mo&gt;&lt;mo stretchy="false"&gt;(&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;mo stretchy="false"&gt;)&lt;/mo&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;. Again, I don't know how
    you would invent a numeral system that follows this rule, but it might be theoretically
    possible, and if so it would be quite efficient.
&lt;/p&gt;
&lt;h2&gt;Modern Systems&lt;/h2&gt;
&lt;h3&gt;Scientific notation&lt;/h3&gt;
&lt;p&gt;Scientific notation is based on the regular decimal system, so it also has
    &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-97-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;L&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/msub&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mi mathvariant="normal"&gt;∞&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; and &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-98-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;S&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;10&lt;/mn&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;.
    However, &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-99-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; can be &lt;em&gt;anything you want&lt;/em&gt;, no matter
    how large or small &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-100-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is. The problem is that this sacrifices accuracy
    (which can actually be a good thing when
    &lt;a href="https://www.harysdalvi.com/uncertainty" target="_blank"&gt;experimental uncertainty&lt;/a&gt; is involved).
    I found it interesting how scientific notation fits into the
    variables we've been looking at for ancient numeral systems.
&lt;/p&gt;
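&lt;p&gt;As a rough illustration of that trade-off, here is a minimal Python sketch. The helper name &lt;code&gt;sci&lt;/code&gt; and the default of three significant figures are hypothetical choices for this example, not part of the analysis above. It shows the written length staying fixed while the number grows, at the cost of dropped digits.&lt;/p&gt;

```python
def sci(x, sig_figs=3):
    """Format x in scientific notation with a fixed number of significant figures."""
    # One leading digit plus (sig_figs - 1) digits after the decimal point.
    return f"{x:.{sig_figs - 1}e}"


for x in (2021, 1234567890, 6.02214076e23):
    # Every result has the same length, however large x is.
    print(sci(x), len(sci(x)))
```

&lt;p&gt;Raising &lt;code&gt;sig_figs&lt;/code&gt; buys back accuracy at the cost of more symbols, which is the sense in which the number of symbols can be anything you want.&lt;/p&gt;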
&lt;h3&gt;Binary and Hexadecimal&lt;/h3&gt;
&lt;p&gt;Binary and hexadecimal are the same as the decimal system, but with
    a different base. (This is similar to the Mayan system.)
    But since &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-101-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;log&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;⁡&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; for a binary computer, the number of bits
    &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-102-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; required actually increases faster than for writing a number
    on paper. By change of base, the ratio is
    &lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-103-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mfrac&gt;&lt;mrow&gt;&lt;mi&gt;ln&lt;/mi&gt;&lt;mo&gt;⁡&lt;/mo&gt;&lt;mn&gt;10&lt;/mn&gt;&lt;/mrow&gt;&lt;mrow&gt;&lt;mi&gt;ln&lt;/mi&gt;&lt;mo&gt;⁡&lt;/mo&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/mrow&gt;&lt;/mfrac&gt;&lt;mo&gt;≈&lt;/mo&gt;&lt;mn&gt;3.3&lt;/mn&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
    So if a number &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-104-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; takes &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-105-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;10&lt;/mn&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; digits to write on paper, we can
    predict it will take &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-106-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;≈&lt;/mo&gt;&lt;mn&gt;3.3&lt;/mn&gt;&lt;msub&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;10&lt;/mn&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; bits in a computer
    as &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-107-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; grows large.
    For example, 2021 takes four digits on paper, and the binary representation
    11111100101 takes 11 bits, a ratio of 11/4 = 2.75. A much bigger number,
    like 1234567890, takes ten digits to write on paper and
    31 bits to represent as 1001001100101100000001011010010.
    The ratio 31/10 = 3.1 is closer to 3.3.
&lt;/p&gt;
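&lt;p&gt;The two examples above can be checked with a short Python sketch. The function name &lt;code&gt;digits_and_bits&lt;/code&gt; is hypothetical, and the extra test value of 10 to the 50th power is an assumption added only to show the ratio for a larger number.&lt;/p&gt;

```python
import math


def digits_and_bits(x):
    """Return (decimal digits, binary bits) needed to write the positive integer x."""
    return len(str(x)), x.bit_length()


for x in (2021, 1234567890, 10**50):
    digits, bits = digits_and_bits(x)
    # The ratio bits/digits drifts toward ln(10)/ln(2) as x grows.
    print(digits, bits, round(bits / digits, 2))

# The limiting ratio from the change of base formula.
print(round(math.log(10) / math.log(2), 2))
```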
&lt;p&gt;In reality, numbers may take up more or less space in computer memory
    based on different systems to store them, such as two's complement and
    float64. However, we are now using numbers as a template to talk more
    generally about information density.
&lt;/p&gt;
&lt;p&gt;I can talk about an abstract information density, which is the amount of information
    &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-108-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; divided by the space to store it &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-109-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;. With the prehistoric system, this is
    &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-110-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;/&lt;/mo&gt;&lt;/mrow&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;. With the decimal system, this is &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-111-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;/&lt;/mo&gt;&lt;/mrow&gt;&lt;msub&gt;&lt;mi&gt;log&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mn&gt;10&lt;/mn&gt;&lt;/mrow&gt;&lt;/msub&gt;&lt;mo&gt;⁡&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, which tends to be
    higher than 1: each digit in a decimal number, like the 2 in 123, has much more meaning than
    a simple tally mark.
&lt;/p&gt;
&lt;p&gt;But with computers, this is &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-112-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;/&lt;/mo&gt;&lt;/mrow&gt;&lt;msub&gt;&lt;mi&gt;log&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;⁡&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, which is apparently lower than for the decimal system.
This can be explained by the fact that even though computers need a high number of bits,
the bits are extremely small physically. They can fit on a microchip.
&lt;/p&gt;
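&lt;p&gt;To compare the three densities side by side, here is a minimal Python sketch. The function name &lt;code&gt;density&lt;/code&gt; and the sample value 123 are assumptions for illustration, and the logarithms are left unrounded, as in the formulas above.&lt;/p&gt;

```python
import math


def density(x, n):
    """Abstract information density: information x divided by space n."""
    return x / n


x = 123
tally = density(x, x)                  # prehistoric: one mark per unit
decimal = density(x, math.log10(x))    # decimal digits
binary = density(x, math.log2(x))      # bits

print(round(tally, 2), round(decimal, 2), round(binary, 2))
```

&lt;p&gt;For any value above 1 the binary figure comes out below the decimal one by the same factor of about 3.3, since the two logarithms differ only by a change of base.&lt;/p&gt;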
&lt;h3&gt;DNA&lt;/h3&gt;
&lt;p&gt;DNA has four types of nitrogenous bases, so &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-113-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;S&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;4&lt;/mn&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;. Every 3
    base pairs code for one amino acid (or a stop codon).
    The number of base pairs needed to code for a protein can
    be considered &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-114-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;. Such a protein will have &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-115-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;/&lt;/mo&gt;&lt;/mrow&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; amino
    acids. The question is, what number &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-116-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; corresponds to
    the amount of information stored in a protein with these
    amino acids?
&lt;/p&gt;
&lt;p&gt;I feel like the amount of information in a protein with 100
    amino acids is a lot more than the amount of information
    conveyed by writing 100 on a piece of paper. Since
    there are many complicated ways for amino acids to interact
    as their number increases, moving into tertiary and even
    quaternary structure, I feel like amino acids have even more ways to
    add meaning &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-117-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; as their number &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-118-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;mrow class="MJX-TeXAtom-ORD"&gt;&lt;mo&gt;/&lt;/mo&gt;&lt;/mrow&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; increases.
    This means &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-119-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; might be exponentially related to &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-120-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;.
    So we can invert this and say
    &lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-121-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mfrac&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;/mfrac&gt;&lt;mo&gt;∝&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;log&lt;/mi&gt;&lt;mn&gt;4&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;⁡&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;span class="mjx-chtml MJXc-display" style="text-align: center;"&gt;&lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-122-Frame" role="presentation" style="font-size: 90%; text-align: center; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML MJX_Assistive_MathML_Block" role="presentation"&gt;&lt;math display="block" xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;mo&gt;∝&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;log&lt;/mi&gt;&lt;mn&gt;4&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;⁡&lt;/mo&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
    As for the limit of information, DNA can store an immense amount
    of information. I'll go with &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-123-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;L&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/msub&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mi mathvariant="normal"&gt;∞&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;.
&lt;/p&gt;
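&lt;p&gt;One way to make the exponential guess concrete is to assume the information is the number of distinct sequences of a given length. That is an assumption made only for this sketch, since the argument above claims no more than a proportionality. The names &lt;code&gt;sequences&lt;/code&gt; and &lt;code&gt;bases_needed&lt;/code&gt; are hypothetical.&lt;/p&gt;

```python
import math

BASES = 4  # S = 4 types of nitrogenous base


def sequences(n):
    """Number of distinct sequences of n bases (assumed stand-in for x)."""
    return BASES ** n


def bases_needed(x):
    """Invert x = 4**n to recover n = log_4(x)."""
    return math.log(x, BASES)


n = 300                            # base pairs, enough for 100 codons
x = sequences(n)
print(n // 3)                      # amino acids coded for
print(sequences(3))                # distinct codons
print(round(bases_needed(x)))      # recovers n from x
```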
&lt;h3&gt;Quantum computing&lt;/h3&gt;
&lt;p&gt;In quantum computing, quantum particles are in a superposition of states.
    When measured, they settle on a state that corresponds to either 0 or 1,
    with probabilities set by the superposition (equal in the simplest case). So if we have &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-124-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; &lt;i&gt;qubits&lt;/i&gt;
    (quantum bits), each corresponding to 0 or 1, that's like a
    binary number with &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-125-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; bits. So then what's the advantage
    of quantum computing?&lt;/p&gt;
&lt;p&gt;Qubits settle on 0 or 1, &lt;em&gt;when measured&lt;/em&gt;. If we don't measure them,
    they remain in a superposition of all possible states at once. This
    means that with a properly designed algorithm, they can conduct
    multiple calculations at once.
&lt;/p&gt;
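&lt;p&gt;The gap between what the qubits hold before measurement and what a measurement returns can be sketched with a small classical simulation in Python. This is not a quantum program: it assumes the simplest case of an equal superposition, and the function names are hypothetical.&lt;/p&gt;

```python
import random


def uniform_superposition(n):
    """State vector for n qubits in an equal superposition: 2**n amplitudes."""
    size = 2 ** n
    amplitude = size ** -0.5
    return [amplitude] * size


def measure(state, rng):
    """Sample one basis state with probability |amplitude|**2, as a bit string."""
    n = (len(state) - 1).bit_length()
    probabilities = [abs(a) ** 2 for a in state]
    index = rng.choices(range(len(state)), weights=probabilities)[0]
    return format(index, f"0{n}b")


state = uniform_superposition(3)
rng = random.Random(0)
print(len(state))            # 8 amplitudes for 3 qubits
print(measure(state, rng))   # a single 3-bit outcome
```

&lt;p&gt;The unmeasured state is described by a number of amplitudes that doubles with each added qubit, but each measurement hands back only one bit per qubit.&lt;/p&gt;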
&lt;p&gt;
    Then the question is, how do we access this information if every measurement
    settles on 0 or 1? This is part of the reason why quantum computing is so difficult,
    but it is possible through &lt;strong&gt;quantum interference&lt;/strong&gt;.
    This is the strange phenomenon where since a quantum particle is also a wave
    (wave-particle duality), this wave can interfere with itself.
    (This can be shown with the double slit experiment.)
    The various calculations can be combined and measured to get a useful result
    [&lt;a href="https://www.harysdalvi.com/blog/2110/#ref" id="src2"&gt;2&lt;/a&gt;].
&lt;/p&gt;
&lt;p&gt;So in theory, a quantum algorithm can perform infinite calculations at once.
    This means &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-126-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mi&gt;C&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;: as long as you have the constant &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-127-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;C&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; qubits for the
    calculation you're interested in, that same number of qubits can store an
    infinite amount of information &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-128-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;.
&lt;/p&gt;
&lt;p&gt;In practice, it is difficult to combine the results in order to extract information,
    because each qubit's measured state is either 0 or 1. This means that for &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-129-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;
    qubits, the information &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-130-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; is not infinite, at least not based on any
    measurements we can take. As for the exact relationship between &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-131-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; and &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-132-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;,
    I'm not sure. I think it might depend on the algorithm in question,
    which makes sense since each algorithm
    has its own &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-133-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;. This reminds me of hard computer science problems
    that can't be solved generally, like the halting problem, shown
    undecidable by Alan Turing.
&lt;/p&gt;
&lt;p&gt;There are a few options for a “type of symbol” in quantum computing.
    One example is a quantum computer based on ions. In this case, we could say &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-134-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;S&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;.
    However, we also need to consider the various connections between qubits.
    Then we could maybe say &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-135-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;S&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, or maybe we could consider the various
    &lt;strong&gt;quantum logic gates&lt;/strong&gt; [&lt;a href="https://www.harysdalvi.com/blog/2110/#ref" id="src3"&gt;3&lt;/a&gt;],
    composed of qubits, as a new type of symbol in themselves.
&lt;/p&gt;
&lt;p&gt;What is the limit of information &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-136-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;L&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; that can be stored? In theory,
    more qubits always means more information. In practice, qubits interacting
    can lead to quantum decoherence, losing the unique quantum properties
    needed for quantum computers to work. As of September 2020, IBM's
    largest quantum computer had 65 qubits [&lt;a href="https://www.harysdalvi.com/blog/2110/#ref" id="src4"&gt;4&lt;/a&gt;].
&lt;/p&gt;
&lt;p&gt;So even though superposition means an infinite number of possible states,
    this doesn't translate into an infinitely larger amount of information
    compared to a classical computer. However, if quantum decoherence can
    be kept under control, the idea is promising.
&lt;/p&gt;
&lt;p&gt;But as with new technologies in general, I worry about how it might be
    applied negatively. Quantum computing has huge potential for
    positive advancements in cryptography. On the other hand, this means
    it has potential for a lot of hacking. If a hacker had a working quantum
    computer, quantum algorithms could break much of the encryption our current
    classical systems rely on. Hopefully (?) quantum
    decoherence remains a large issue, so hackers won't be able to
    solve it reliably, but organized teams will be able to use it for
    good in certain applications.
&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Starting with prehistoric counting, I introduced a set of variables
    with which we can view systems of information. All the numeral systems
    devised in ancient times fit neatly into this set. Binary and hexadecimal
    as used in classical computing worked as well.
&lt;/p&gt;
&lt;p&gt;DNA and scientific notation were difficult but interesting to analyze in
    the way I did for numeral systems. Quantum computing, on the other hand,
    became almost impossible. The theoretical aspects of quantum theory
    and superposition repeatedly imply “infinite power”, but
    the practical barriers are hard to quantify, especially because
    they're always being pushed. It's clear that developing quantum computing
    will be extremely difficult. It's also clear that at least for small
    numbers of qubits without too much decoherence, quantum computing
    shows promise for handling huge amounts of data at high speeds.
&lt;/p&gt;
&lt;p&gt;Roman numerals, with their low limit on data &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-137-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;msub&gt;&lt;mi&gt;L&lt;/mi&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;/msub&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt; and quickly increasing
    space needed &lt;span class="mjx-chtml MathJax_CHTML" id="MathJax-Element-138-Frame" role="presentation" style="font-size: 90%; position: relative;" tabindex="0"&gt;&lt;span class="MJX_Assistive_MathML" role="presentation"&gt;&lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;/math&gt;&lt;/span&gt;&lt;/span&gt;, bear little resemblance to the ideals of what
    quantum computing could be.
&lt;/p&gt;
&lt;p&gt;Romans needed quantum computers.
&lt;/p&gt;
&lt;hr/&gt;
&lt;h2&gt;Related Posts&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.harysdalvi.com/blog/mathematical-history"&gt;Mathematical History&lt;/a&gt; — Why we need more historians doing math&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.harysdalvi.com/blog/2202"&gt;2+3=4&lt;/a&gt; — Why math seems to describe the physical world so well&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="ref"&gt;References&lt;/h2&gt;
&lt;ol id="reflist"&gt;
&lt;li&gt;&lt;a href="https://www.romannumerals.org/blog/which-is-the-biggest-number-in-roman-numerals-6" target="_blank"&gt;
    Which is the biggest number in Roman numerals?&lt;/a&gt; (Roman Numerals) &lt;a href="https://www.harysdalvi.com/blog/2110/#src1"&gt;^&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.ias.edu/ideas/2014/ambainis-quantum-computing" target="_blank"&gt;
    What Can We Do with a Quantum Computer?&lt;/a&gt; (Andris Ambainis) (University of Latvia) &lt;a href="https://www.harysdalvi.com/blog/2110/#src2"&gt;^&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.nist.gov/topics/physics/introduction-new-quantum-revolution/quantum-logic-gates" target="_blank"&gt;
    Quantum Logic Gates&lt;/a&gt; (NIST) &lt;a href="https://www.harysdalvi.com/blog/2110/#src3"&gt;^&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.science.org/news/2020/09/ibm-promises-1000-qubit-quantum-computer-milestone-2023" target="_blank"&gt;
    IBM promises 1000-qubit quantum computer—a milestone—by 2023&lt;/a&gt; (Adrian Cho) &lt;a href="https://www.harysdalvi.com/blog/2110/#src4"&gt;^&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;</content></entry></feed>