On one axis you have more hardware coming in. On the other hand, you have an explosion of innovation in AI. And so what happened with both TensorFlow and PyTorch is that the explosion of innovation in AI has led to, it's not just about multiplication and convolution, these things have now like 2,000 different operators. And on the other hand you have, I don't know how many pieces of hardware there are out there. It's a lot. Part of my thesis, part of my belief of where computing goes, if you look out 10 years from now, is it's not going to get simpler. Physics isn't going back to where we came from. It's only going to get weirder from here on out, right? And so to me the exciting part about what we're building is it's about building that universal platform, which the world can continue to get weird, because again, I don't think it's avoidable, it's physics. But we can help lift people, scale, do things with it, and they don't have to rewrite their code every time a new device comes out. And I think that's pretty cool.

The following is a conversation with Chris Lattner, his third time on this podcast. As I've said many times before, he's one of the most brilliant engineers in modern computing, having created the LLVM compiler infrastructure project, the Clang compiler, the Swift programming language,
a lot of key contributions to TensorFlow and TPUs as part of Google. He served as vice president of Autopilot software at Tesla, was a software innovator and leader at Apple, and now he co-created a new full-stack AI infrastructure for distributed training, inference, and deployment on all kinds of hardware, called Modular, and a new programming language called Mojo that is a superset of Python, giving you all the usability of Python but with the performance of C and C++. In many cases, Mojo code has demonstrated over 30,000x speedup over Python. If you love machine learning, if you love Python, you should definitely give Mojo a try. This programming language, this new AI framework and infrastructure, and this conversation with Chris is mind-blowing. I love it. It gets pretty technical at times, so I hope you hang on for the ride. This is the Lex Fridman Podcast. To support it, please check out our sponsors in the description. And now, dear friends, here's Chris Lattner.

It's been, I think, two years since we last talked, and in that time you somehow went and co-created a new programming language called Mojo. So it's optimized for AI, it's a superset of Python. Let's look at the big picture. What is the vision for Mojo?
Well, so I mean, I think you have to zoom out. So I've been working on a lot of related technologies for many, many years. So I've worked on LLVM and a lot of things in mobile and servers and things like this. But the world's changing, and what's happened with AI is we have new GPUs and new machine learning accelerators and other ASICs and things like that that make AI go real fast. At Google I worked on TPUs. That's one of the biggest, largest-scale deployed systems that exist for AI. And really what you see is, if you look across all of the things that are happening in the industry, there's this new compute platform coming. And it's not just about CPUs or GPUs or TPUs or NPUs or IPUs or whatever, all the PUs, right? It's about how do we program these things, right? And so for software folks like us, it doesn't do us any good if there's this amazing hardware that we can't use. And one of the things you find out really quick is that having the theoretical capability of programming something, and then having the world's power and the innovation of all the smart people in the world get unleashed on something, can be quite different. And so really where Mojo came
from was starting from a problem of: we need to be able to take machine learning, take the infrastructure underneath it, and make it way more accessible, way more usable, way more understandable by normal people and researchers and other folks that are not themselves experts in GPUs and things like this. And then through that journey we realized, hey, we need syntax for this, we need to do a programming language.

So one of the main features of the language, I say so fully in jest, is that it allows you to have the file extension be an emoji, the fire emoji, which is one of the first emojis used as a file extension I've ever seen in my life. And then you ask yourself the question, why in the 21st century are we not using Unicode for file extensions? This, I mean, it's an epic decision. I think clearly the most important decision you made. But you could also just use .mojo as the file extension.

Well, so, okay, so take a step back. I mean, come on, Lex, do you think that the world's ready for this? This is a big moment in the world, right?

This is, we'll release this onto the world. This is innovation. I mean, it really is kind of brilliant. Emojis are such a big part of our daily lives. Why is it not in programming?

Well, and like, you take a step back and
look at what file extensions are, right? They're basically metadata, right? And so why are we spending all the screen space on them and all this stuff? Also, you know, you have them stacked up next to text files and PDF files and whatever else. Like, if you're going to do something cool, you want to stand out, right? And emojis are colorful, they're visual, they're beautiful, right?

What's been the response so far? Is there support on, like, Windows, on the operating system, in displaying, like, File Explorer?

Yeah, the one problem I've seen is that Git doesn't escape it right. And so it thinks that the fire emoji is unprintable, and so it prints out weird hex things if you use the command-line git tool. But everything else, as far as I'm aware, works fine. And I have faith that Git can be improved.

So GitHub is fine?

GitHub is fine, yep. GitHub is fine, Visual Studio Code, Windows, like all this stuff, totally ready, because people have internationalization in their normal part of their paths. So this is just taking the next step, right? Somewhere between, "Oh wow, that makes sense, cool, I like new things," to, "Oh my god, you're killing my baby, what are you talking about, this can never be, I can never handle this, how am I going to type this?" Like, all these things.
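
As a rough illustration of the Git behavior mentioned above, here is a minimal Python sketch. It assumes the "weird" output comes from Git's default path quoting, which prints bytes outside ASCII as backslash octal escapes. The helper name quote_path_like_git is hypothetical and only approximates that quoting (it ignores quotes, backslashes, and control characters).

```python
def quote_path_like_git(path):
    """Hypothetical helper: octal-escape non-ASCII bytes, roughly the way
    Git's default path quoting displays them."""
    out = []
    for byte in path.encode("utf-8"):
        if byte < 0x80:
            out.append(chr(byte))          # plain ASCII is shown as-is
        else:
            out.append("\\%03o" % byte)    # other bytes become \ooo escapes
    return '"' + "".join(out) + '"'


name = "hello.\U0001F525"                  # a file named hello.<fire emoji>
utf8_bytes = name.encode("utf-8")          # the emoji takes four bytes in UTF-8
quoted = quote_path_like_git(name)
print(utf8_bytes)
print(quoted)
```

The file name itself is ordinary UTF-8, which is why editors and file browsers display it without trouble; only the escaped display form looks odd.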
So this is something where I think that the world will get there. We don't have to bet the whole farm on this. I think we can provide both paths, but I think it'll be great.

When can we have emojis as part of the code, I wonder?

Yeah, so I mean, lots of languages provide that. So I think that we have partial support for that. It's probably not fully done yet, but yeah, you can do that. For example, in Swift you can do that for sure. So an example we gave at Apple was the dogcow. So that's a classical Mac heritage thing, and so you use the dog and the cow emoji together and that could be your variable name. But of course the internet went and made pile of poop for everything. So, you know, if you want to name your function pile of poop, then you can totally go to town and see how that gets through code review.

Okay, so let me just ask a bunch of random questions. So is Mojo primarily designed for AI, or is it a general-purpose programming language?

Yeah, good question. So it's AI first, and so AI is driving a lot of the requirements. And so Modular is building and designing and driving Mojo forward. And it's not because it's an interesting project theoretically to build. It's because we need it. At Modular, we're really
tackling the AI infrastructure landscape and the big problems in AI, and the reasons it is so difficult to use and scale and adopt and deploy, and like all these big problems in AI. And so we're coming at it from that perspective. Now, when you do that, when you start tackling these problems, you realize that the solution to these problems isn't actually an AI-specific solution. And so while we're doing this, we're building Mojo to be a fully general programming language. And that means that you can obviously tackle GPUs and CPUs and these AI things, but it's also a really great way to build NumPy and other things like that. Or, you know, if you look at what many Python libraries are today, often they're a layer of Python for the API and they end up being C and C++ code underneath them. That's very true in AI, that's true in lots of other domains as well. And so anytime you see this pattern, that's an opportunity for Mojo to help simplify the world and help people have one thing.

To optimize through simplification, by having one thing. So you mentioned Modular. Mojo is the programming language, Modular is the whole software stack.

So just over a year ago we started this company called Modular. Okay, what
Modular is about is taking AI and up-leveling it into the next generation, right? And so if you take a step back, what's gone on in the last five, six, seven, eight years is that we've had things like TensorFlow and PyTorch and these other systems come in. You've used them, you know this. And what's happened is these things have grown like crazy. They get tons of users, it's in production deployment scenarios, it's being used to power so many systems. I mean, AI's all around us now. It used to be controversial years ago, but now it's a thing. But the challenge with these systems is that they haven't always been thought out with current demands in mind. And so you think about it: where were LLMs eight years ago? Well, they didn't exist, right? AI has changed so much, and a lot of what people are doing today is very different than when these systems were built. And meanwhile the hardware side of this has gotten into a huge mess. There's tons of new chips and accelerators, and every big company's announcing a new chip every day, it feels like. And so between that, you have this moving system on one side, a moving system on the other side, and it just turns into this gigantic mess, which makes it very difficult for people to actually use AI, particularly
in production deployment scenarios. And so that's what Modular is doing: we're helping build out that software stack to help solve some of those problems, so then people can be more productive and get more AI research into production. Now, what Mojo does is it's a really, really, really important piece of that. And so that is, you know, part of that engine and part of the technology that allows us to solve these problems.

So Mojo is a programming language that allows you to do the higher-level programming, the low-level programming, like do all kinds of programming in that spectrum that gets you closer and closer to the hardware.

So take a step back. So Lex, what do you love about Python?

Oh boy. Where do I begin? What is love? What do I love about Python?

You're a guy who knows love, I know this.

Yes. How intuitive it is. How it feels like I'm writing natural language English. How, when I can not just write but read other people's code, somehow I can understand it faster. It's more condensed than other languages, like ones I'm really familiar with, like C++ and C. There's a bunch of sexy little features. We'll probably talk about some of them, but list comprehensions and stuff like this.

And don't forget the entire ecosystem of all the packages.

Oh yeah, that's probably huge. There's always something. If you want to do anything, there's always a package.

Yeah, so it's not just the ecosystem of the packages, and the ecosystem of the humans that do it. That's a really interesting dynamic, because I think something about the usability and the ecosystem makes the thing viral. It grows, and then it's a virtuous cycle, I think.

Well, there's many things that went into that. So I think that ML was very good for Python. And so I think that TensorFlow and PyTorch and these systems embracing Python really took and helped Python grow. But I think that the major thing underlying it is that Python's like the universal connector, right? It really helps bring together lots of different systems, so you can compose them and build out larger systems without having to understand how it works. But then, what is the problem with Python?

Well, I guess you could say several things, but probably that it's slow.

I think that's usually what people
complain about, right? And so, slow. I mean, other people complain about tabs and spaces versus curly braces or whatever, but I mean, those people are just wrong, because it is actually just better to use indentation.

Wow, strong words. So actually, on a small tangent, let's actually take that, let's take all kinds of tangents.

Oh, come on, Lex, you can push me on it, I can take it.

Design. Listen, I've recently left Emacs for VS Code. The kind of hate mail I had to receive, because on the way to doing that I also said I've considered Vim, and chose not to, and went with VS Code.

You're touching on deep religions, right?

Anyway, tabs is an interesting design decision. And so you've really written a new programming language here. Yes, it is a superset of Python, but you can make a bunch of different interesting decisions here.

Totally, yeah.

And you chose actually to stick with Python in terms of some of the syntax.

Well, so let me explain why, right? So, I mean, you can explain this in many rational ways. I think that the indentation is beautiful, but that's not a rational explanation, right? But I can defend it rationally, right? So first of all, Python, one, has millions of programmers. It is huge, it's everywhere, it owns machine
learning, right? So factually, it is the thing, right? Second of all, if you look at it, C code, C++ code, Java, whatever, Swift, curly brace languages also run through formatting tools and get indented. And so if they're not indented correctly, first of all, it will twist your brain around. It can lead to bugs. There's notorious bugs that have happened across time where the indentation was wrong or misleading and it wasn't formatted right, and so it turned into an issue, right? And so what ends up happening in modern large-scale codebases is people run automatic formatters. So now what you end up with is indentation and curly braces. Well, if you're going to have the notion of grouping, why not have one thing, right? And get rid of all the clutter and have a more beautiful thing, right? Also, you look at many of these languages, it's like, okay, well, you can have curly braces or you can omit them if there's one statement, or you just enter this entire world of complicated design space that objectively you don't need if you have Python-style indentation.

So yeah, I would love to actually see statistics on errors made because of indentation. Like, how many errors are made in Python versus in C++ that have to do with basic formatting, all that kind of
stuff. I would love to see.

I think it's probably pretty minor, because once you, like, you use VS Code, I do too. So if you get VS Code set up, it does the indentation for you generally, right? And so, you know, it's actually really nice to not have to fight it. And then what you can see is the editor's telling you how your code will work by indenting it, which I think is pretty cool.

I honestly don't think I've ever, I don't remember having an error in Python because I indented stuff wrong.

So, I mean, I think that there's, again, this is a religious thing, and so I can joke about it. And I love to kind of, you know, I realize that this is such a polarizing thing and everyone wants to argue about it, and so I like poking at the bear a little bit, right? But frankly, right, come back to the first point: Python, one, like, it's huge, it's in AI, it's the right thing. For us, like, we see Mojo as being an incredible part of the Python ecosystem. We're not looking to break Python or change it or, quote unquote, fix it. We love Python for what it is. Our view is that Python is just not done yet. And so if you look at, you know, you mentioned Python being slow. Well, there's a couple of different things that go into that, which we can talk about if you want.
But one of them is it just doesn't have those features that you would use to do C-like programming. And so if you say, okay, well, I'm forced out of Python into C for certain use cases, well, then what we're doing is we're saying, okay, well, why is that? Can we just add those features that are missing from Python back up to Mojo? And then you can have everything that's great about Python, all the things you're talking about that you love, plus not be forced out of it when you do something a little bit more computationally intense or weird or hardware or whatever it is that you're doing.

Well, a million questions I want to ask, but at a high level again, is it compiled or is it an interpreted language? So Python is just-in-time compilation. What's Mojo?

So Mojo, a complicated answer, does all the things. So it's interpreted, it's JIT compiled, and it's statically compiled. And so this is for a variety of reasons. So one of the things that makes Python beautiful is that it's very dynamic, and because it's dynamic, one of the things they added is that it has this powerful metaprogramming feature. And so if you look at something like PyTorch or TensorFlow, or, I mean, even a simple use case, like you define a class that has the plus method, right? You can
overload the dunder methods, like dunder add, for example, and then the plus method works on your class. And so it has very nice and very expressive dynamic metaprogramming features. In Mojo, we want all those features to come in. Like, we don't want to break Python, we want it all to work. But the problem is you can't run those super dynamic features on an embedded processor or on a GPU, right? Or if you could, you probably don't want to, just because of the performance. And so we entered this question of saying, okay, how do you get the power of this dynamic metaprogramming into a language that has to be super efficient in specific cases? And so what we did was we said, okay, we'll take that interpreter. Python has an interpreter in it, right? Take that interpreter and allow it to run at compile time. And so now what you get is you get compile-time metaprogramming. And so this is super interesting and super powerful, because one of the big advantages you get is you get Python-style expressive APIs, you get the ability to have overloaded operators. And if you look at what happens inside of, like, PyTorch, for example, with automatic differentiation and eager mode, like all these things, they're using these really dynamic and powerful features at runtime. But we can take those features and lift them so they run
at compile time.

So, because C++ does metaprogramming with templates, but it's really messy.

It's super messy. It was accidentally, I mean, different people have different interpretations. My interpretation is that it was made accidentally powerful. It was not designed to be Turing complete, for example, but that was discovered kind of along the way, accidentally. And so there have been a number of languages in the space, and so they usually have templates or code instantiation, code copying features of various sorts. Some more modern languages, or some newer languages, let's say, like, you know, they're fairly unknown, like Zig for example, say, okay, well, let's take all of those types of things you can do at runtime and allow them to happen at compile time. And so one of the problems with C++, I mean, which is one of, one of the problems with C++...

There we go, strong words.

Oh, that's okay. I mean, everybody hates me for a variety of reasons anyways, I'm sure, right? I've written...

That's the way they show love.

I have written enough C++ code to earn a little bit of
grumpiness with C++. But one of the problems with it is that the metaprogramming system, templates, is just a completely different universe from the normal runtime programming world. And so if you do metaprogramming and programming, it's just like a different universe: different syntax, different concepts, different stuff going on. And so, again, one of our goals with Mojo is to make things really easy to use, easy to learn, and so there's a natural stepping stone. And so as you do this, you say, okay, well, I have to do programming at runtime, I have to do programming at compile time. Why are these different things?

How hard is that to pull off? Because that sounds, to me, as a fan of metaprogramming in C++ even, how hard is it to pull that off? That sounds really, really exciting, because you can do the same style of programming at compile time and at runtime. That's really, really exciting.

Yep. And so, I mean, in terms of the compiler implementation details, it's hard. I won't be shy about that, it's super hard. It requires, I mean, what Mojo has underneath the covers is a completely new approach to the design of the compiler itself. And so this builds on these technologies like MLIR that you mentioned, but it also includes other
things, like caching and other interpreters and JIT compilers and other stuff like that.

So you have, like, an interpreter inside the compiler?

Within the compiler, yes. And so it really takes the standard model of programming languages and kind of twists it and unifies it with the runtime model, right? Which I think is really cool. And to me, the value of that is that, again, many of these languages have metaprogramming features, like they grow macros or something, right? Lisp, right?

Yes.

I know your roots, right? You know, and this is a powerful thing, right? And so, you know, if you go back to Lisp, one of the most powerful things about it is that it said that the metaprogramming and the programming are the same, right? And so that made it way simpler, way more consistent, way easier to understand and reason about, and it made it more composable. So if you build a library, you can use it both at runtime and compile time, which is pretty cool.

Yeah, and for machine learning, I think metaprogramming, I think we could generally say, is extremely useful. And so you get features, I mean, I'll jump around, but the feature of autotuning and adaptive compilation just blows my mind.

Yeah, well, so, okay, so let's come back to that. All right, so what is, what is
machine learning, like, or what is a machine learning model? Like, you take a PyTorch model off the internet, right? It's really interesting to me, because what PyTorch and what TensorFlow and all these frameworks are kind of pushing compute into is they're pushing into, like, this abstract specification of a compute problem, which then gets mapped in a whole bunch of different ways, right? And so this is why it became a metaprogramming problem: you want to be able to say, cool, I have this neural net, now run it with batch size a thousand, right? Do a mapping across batch. Or, okay, I want to take this problem, now run it across a thousand CPUs or GPUs, right? And so, like, this problem of just describe the compute and then map it and do things and transform it, like, actually it's very profound, and that's one of the things that makes machine learning systems really special.

Maybe can you describe autotuning and how do you pull off, I mean, I guess adaptive compilation is what we're talking about as metaprogramming. How do you pull off autotune? I mean, is that as profound as I think it is? It seems like a really, like, you know, we mentioned list comprehensions. To me, from a quick glance at Mojo, which by the way I have to absolutely, like, dive in,
as I realize how amazing this is, I absolutely must dive in, that looks like just an incredible feature for machine learning people.

Yeah, well, so what is autotune? So take a step back. Autotuning is a feature in Mojo. It's not, so very, very little of what we're doing is actually research. Like, many of these ideas have existed in other systems and other places, and so what we're doing is we're pulling together good ideas, remixing them, and making them into hopefully a beautiful system, right? And so with autotuning, the observation is that it turns out hardware systems and algorithms are really complicated. Turns out maybe you don't actually want to know how the hardware works, right? A lot of people don't, right? And so there are lots of really smart hardware people, I know a lot of them, where they know everything about, okay, the cache size is this and the number of registers is that, and if you use this length of vector, it's going to be super efficient because it maps directly onto what it can do, and like all this kind of stuff. Or the GPU has SMs and it has a warp size of whatever, right? All the stuff that goes into these things. Or the tile size of a TPU is 128. Like, these factoids, right? My belief is that most normal people, and I love hardware people also, I'm not
trying to offend literally everybody on the internet, but most programmers actually don't want to know this stuff, right? And so if you come at it from the perspective of how do we allow people to build both more abstracted but also more portable code, because, you know, it could be that the vector length changes or the cache size changes, it could be that the tile size of your matrix changes, or the number, you know, an A100 versus an H100 versus a Volta versus whatever GPU have different characteristics, right? A lot of the algorithms that you run are actually the same, but the parameters, these magic numbers you have to fill in, end up being really fiddly numbers that an expert has to go figure out. And so what autotuning does is it says, okay, well, guess what, there's a lot of compute out there, right? So instead of having humans go randomly try all the things, or do a grid search, or go search some complicated multi-dimensional space, how about we have computers do that, right? And so what autotuning does is you can say, hey, here's my algorithm. If it's a matrix operation or something like that, you can say, okay, I'm going to carve it up into blocks, I'm going to do those blocks in parallel, and
I want this with 128 things that I'm running on, I want to cut it this way or that way or whatever, and you can say, hey, go see which one's actually empirically better on the system.

And then the result of that you cache for that system?

Yep.

You save it?

And so come back to twisting your compiler brain, right? So not only does the compiler have an interpreter that's used to do metaprogramming, that compiler, that interpreter, that metaprogramming now has to actually take your code and go run it on a target machine, see which one it likes the best, and then stitch it in, and then keep going, right?

So part of the compilation is machine specific?

Yeah. Well, so, I mean, this is an optional feature, right? So you don't have to use it for everything. But yeah, one of the things that we're in the quest of is ultimate performance, right? Ultimate performance is important for a couple of reasons, right? So if you're an enterprise, you're looking to save cost and compute and things like this. Ultimate performance translates to, you know, fewer servers. If you care about the environment, hey, better performance leads to more efficiency. I mean, you could joke and say, like, you know, Python's bad for the environment.
Right? And so if you move to Mojo, it's like at least 10x better just out of the box, and then keep going, right? But performance is also interesting because it leads to better products. And so in the space of machine learning, right, if you reduce the latency of a model so that it runs faster, so every time you query the server running the model it takes less time, well, then the product team can go and make the model bigger. Well, that actually makes it so you have a better experience as a customer. And so a lot of people care about that.

So for autotune, for, like, tile size, you mentioned 128 for TPUs, you would specify, like, a bunch of options to try, just in the code, it's a simple statement, and then you can just set and forget and know, depending on wherever it compiles, it'll actually be the fastest?

Yeah, exactly. And the beauty of this is that it helps you in a whole bunch of different ways, right? So if you're building, so often what will happen is that, you know, you've written a bunch of software yourself, right? You wake up one day, you say, I have an idea, I'm going to go code up some code. I get it to work, I forget about it, and move on with life. I come back six months or a year or two years or three years later, you dust it off and you go
use it again in a new environment, and maybe your GPU is different, maybe you're running on a server instead of a laptop, maybe whatever, right? And so the problem now is you say, okay, well, I mean, again, not everybody cares about performance, but if you do, you say, okay, well, I want to take advantage of all these new features. I don't want to break the old thing, though, right? And so the typical way of handling this kind of stuff before is, you know, if you're talking about C++ templates or you're talking about C with macros, you end up with #ifdefs, you get like all these weird things that get layered in, make the code super complicated, and then how do you test it, right? It becomes this crazy complexity, multi-dimensional space that you have to worry about, and, you know, that just doesn't scale very well.

Actually, let me just jump around before I go to some specific features. Like, the increase in performance here that we're talking about can be just insane. You write that Mojo can provide a 35,000x speedup over Python. How does it do that?

Yeah, so it can even do more, but we'll get to that. So first of all, when we say that, we're talking about what's called CPython. It's the default Python that everybody uses. When you type python3, that's like
typically the one you use, right? CPython is an interpreter. And so interpreters, they have an extra layer of, like, bytecodes and things like this that they have to go read, parse, interpret, and it makes them kind of slow from that perspective. And so one of the first things we do is we move to a compiler. And so just moving to a compiler, getting the interpreter out of the loop, is a 2 to 5 to 10x speedup, depending on the code. So just out of the gate, just using more modern techniques, right? Now, if you do that, one of the things you can do is you can start to look at how CPython started to lay out data. And so one of the things that CPython did, and this isn't part of the Python spec necessarily, but this is just sets of decisions, is that if you take an integer, for example, it'll put it in an object, because in Python everything's an object. And so they do the very logical thing of keeping the memory representation of all objects the same. So all objects have a header, they have, like, payload data. And what this means is every time you pass around an object, you're passing around a pointer to the data. Well, this has overhead. It turns out that modern computers don't like chasing
pointers very much and things like this. It means that you have to allocate the data. It means you have to reference count it, which is another way that Python uses to keep track of memory. And so this has a lot of overhead. And so if you say, okay, let's try to get that out of the heap, out of a box, out of an indirection, and into the registers, that's another 10x.

So it adds up. If you're reference counting every single thing you create, that adds up.

Yeah. And if you look at, you know, people complain about the Python GIL. This is one of the things that hurts parallelism. That's because of the reference counting, right? And so the GIL and reference counting are very tightly intertwined in Python. It's not the only thing, but it's very tightly intertwined. And so then you lean into this and you say, okay, cool, well, modern computers, they can do more than one operation at a time, and so they have vectors. What is a vector? Well, a vector allows you to take, instead of taking one piece of data, doing an add or multiply, and then picking up the next one, you can now do 4 or 8 or 16 or 32 at a time, right? Well, Python doesn't expose that because of reasons. And so now you can say, okay, well, you can adopt
that. Now you have threads, now you have additional things, like you control the memory hierarchy. So what Mojo allows you to do is start taking advantage of all these powerful things that have been built into the hardware over time, and the library gives very nice features, so you can say, just parallelize this, do this in parallel. So it's very, very powerful weapons against slowness, which is why people have been, I think, having fun just taking code and making it go fast, because it's kind of an adrenaline rush to see how fast you can get things.

Before I talk about some of the interesting stuff with parallelization and all that, let's first talk about the basics. We talked about indentation, right? So this thing looks like Python. It's sexy and beautiful like Python, as I mentioned. Is it a typed language? What's the role of types?

Yeah, good question. So Python has types. It has strings, it has integers, it has dictionaries and all that stuff, but they all live at runtime. And because all those types live at runtime in Python, you never, or you don't have to, spell them. Python also has this whole typing thing going on now, and a
lot of people use it. I'm not talking about that. That's kind of a different thing; we can go back to that if you want. But typically you just say, I have a def, and my def takes two parameters, I'm going to call them a and b, and I don't have to write a type. Okay, so that is great, but what that does is it forces what's called a consistent representation. So these things have to be a pointer to an object with the object header, and they all have to look the same. And then when you dispatch a method, you go through all the same paths no matter what the receiver, whatever that type is.

So what Mojo does is it allows you to have more than one kind of type. It allows you to say, okay, cool, I have an object, and objects behave like Python does, and so it's fully dynamic, and that's all great. For many things, classes, that's all very powerful and very important. But if you want to say, hey, it's an integer and it's 32 bits or 64 bits or whatever it is, or it's a floating point value and it's 64 bits, well, then the compiler can take that and use it to do way better optimization. And it turns out, again, getting rid of the indirections is huge. It means you can get better code
completion, because the compiler knows what the type is and so knows what operations work on it, and that's actually pretty huge. So what Mojo does is allow you to progressively adopt types into your program. You can start, again, it's compatible with Python, and then you can add however many types you want, wherever you want them, and if you don't want to deal with it, you don't have to deal with it. And so one of our opinions on this is it's not that types are the right thing or the wrong thing; it's that they're a very useful thing.

Which was kind of optional. It's not strict typing; you don't have to specify a type.

Exactly.

Okay, so starting from the thing that Python's kind of reaching towards right now, with trying to inject types into it.

Yeah, with a very different approach, but yes.

What's the different approach? I'm actually one of the people that have not been using types very much in Python.

Okay, why did you say...

It's just, well, because I know the importance. It's like adults use strict typing, and so I refuse to grow up in that sense. It's a kind of rebellion. But I just know that it probably reduces the amount of errors, even just, forget about performance
improvements, it probably reduces errors when you do strict typing.

Yeah, so I mean, I think it's interesting if you look at that, right? And the reason I'm giving you a hard time is that there's this cultural norm, this pressure, this, like, there has to be a right way to do things. Like, you know, grown-ups only do it one way, and if you don't do that, you should feel bad. Like some people feel like Python's a guilty pleasure or something, and that's like, when I get serious I need to go rewrite it.

Right, yeah, exactly.

I mean, cool, I understand history and I understand kind of where this comes from, but I don't think it has to be a guilty pleasure. And so if you look at that, you say, why do you have to rewrite it? Well, you have to rewrite it to deploy. Well, why do you want to deploy? Well, you care about performance, you care about predictability, or you want a tiny thing on the server that has no dependencies, or, you know, you have objectives you're trying to attain. So what if Python can achieve those objectives?

So if you want types, well, maybe you want types because you want to make sure you're passing the right thing. Sure, you can add a type. If you don't care, you're prototyping some stuff, you're hacking some things out, you're pulling some random code off the internet, it should just work, right? And you shouldn't be pressured, you shouldn't feel bad about doing the right thing or the thing that feels good. Now, if you're on a team, you're working at some massive internet company and you have 400 million lines of Python code, well, they may have a house rule that you use types, because it makes it easier for different humans to talk to each other and understand what's going on, and bugs at scale. So there are lots of good reasons why you might want to use types, but that doesn't mean that everybody should use them all the time.

So what Mojo does is it says, cool, well, allow people to use types, and if you use types, you get nice things out of it: you get better performance and things like this. But Mojo is a fully compatible superset of Python, and so that means it has to work without types. It has to support all the dynamic things, it has to support all the packages, it has to support for comprehensions, list comprehensions, and things like this. And so that starting point, I think, is really important. And again, you can look at why I care so much about this, and there are many different aspects of that, one of which is the world went through a very challenging
migration from Python 2 to Python 3. This migration took many years and it was very painful for many teams, and there were a lot of things that went on in that. I'm not an expert in all the details, and I honestly don't want to be. I don't want the world to have to go through that. And, you know, people can ignore Mojo, and if it's not their thing, that's cool, but if they want to use Mojo, I don't want them to have to rewrite all their code.

Yeah, I mean, just the superset part, there's so much brilliant stuff here, that definitely is incredible. We'll talk about that. First of all, how's the typing implemented differently in Python versus Mojo? So this heterogeneous flexibility, you said it's differently implemented.

Yeah, so I'm not a full expert in the whole backstory on types in Python, so I'll give you that; I can give you my understanding. My understanding is, basically, like many dynamic languages, the ecosystem went through a phase where people went from writing scripts to writing large-scale, huge code bases in Python, and at scale it kind of helps to have types. People want to be able to reason about interfaces: what do you expect, a
string or an int, or, like, these basic things. And so what the Python community started doing is it started saying, okay, let's have tools on the side, checker tools, that go and enforce invariants, check for bugs, try to identify things. These are called static analysis tools, generally, and so these tools run over your code and try to look for bugs. What ended up happening is there were so many of these things, so many different weird patterns and different approaches to specifying the types and different things going on, that the Python community realized and recognized, hey, there's a thing here. And so what they started to do is they started to standardize the syntax for adding types to Python.

Now, one of the challenges that they had is that they're coming from kind of this fragmented world where there are lots of different tools, they have different trade-offs and interpretations, and the types mean different things. And so if you look at types in Python, according to the Python spec, the types are ignored. So according to the Python spec, you can write pretty much anything in a type position, okay? Technically, you can write any
expression. Okay, now that's beautiful, because you can extend it, you can do cool things, you can build your own tools, you can build your own house linter or something like that. But it's also a problem, because any existing Python program may be using different tools, and they have different interpretations. So if you adopt somebody's package into your ecosystem and try to run the tool you prefer, it may throw out tons of weird errors and warnings and problems just because it's incompatible with how these things work. Also, because they're added late and they're not checked by the Python interpreter, it's always kind of more of a hint than it is a requirement. Also, the CPython implementation can't use them for performance, and so it's really...

That's a big one, right? So you can't utilize them for the compilation, for the just-in-time compilation.

Okay, exactly. And this all comes back to the design principle: they're kind of hints, the definition is a little bit murky, it's unclear exactly what the interpretation is in a bunch of cases, and so because of that, even if you want to, it's really difficult to use them to say, it is going to be an int, and if it's not, it's a problem. A lot of code would break if you did that. So,
in Mojo, right, you can still use those kinds of type annotations, it's fine. But in Mojo, if you declare a type and you use it, then it means it is going to be that type, and the compiler helps you check that and enforce it, and it's safe. It's not a best-effort kind of thing. So if you try to shove a string-type thing into an integer...

You get an error from the compiler, at compile time. Nice. Okay, what kind of basic types are there?

Yeah, so Mojo is pretty hardcore in terms of what it tries to do in the language. The philosophy there is that, again, if you look at Python, Python's a beautiful language because it's so extensible. All of the different things in Python, like for loops and plus and all these things, can be accessed through these dunder methods. So you have to say, okay, if I make something that is super fast, I can go all the way down to the metal, why do I need to have integers built into the language? So what Mojo does is it says, okay, well, we can have this notion of structs. So we have classes in Python; now you can have structs. Classes are dynamic, structs are static. Cool, we can get high performance, we can
write C++ kind of code with structs if you want. These things mix and work beautifully together. But what that means is that you can go and implement strings and ints and floats and arrays and all that kind of stuff in the language. And so that's really cool, because to me, as an idealizing compiler-language type of person, what I want to do is get magic out of the compiler and put it in the libraries. Because if we can build an integer that's beautiful and has an amazing API and does all the things you'd expect an integer to do, and you don't like it, maybe you want a big integer, maybe you want a sideways integer, I don't know what all the space of integers is, then you can do that, and it's not a second-class citizen.

And so if you look at certain other languages, like C++, one I also love and use a lot, int is hard-coded in the language, but complex is not. And so isn't it kind of weird that you have this std::complex class, but you have int, and complex tries to look like a natural numeric type and things like this, but integers and floating point have these special promotion rules and other
things like that that are magic and hacked into the compiler, and because of that you can't actually make something that works like the built-in types.

Is there something provided as a standard? Because, you know, because it's AI first, numerical types are so important here. So is there something like a nice standard implementation of integers and floats?

Yeah, so we're still building all that stuff out. We provide integers and floats and all that kind of stuff. We also provide buffers and tensors and things like that that you'd expect in an ML context. Honestly, we need to keep designing and redesigning and working with the community to build that out and make that better. That's not our strength right now. Give us six months or a year and I think it'll be way better. But the power of putting it in the library means we can have teams of experts that aren't compiler engineers that can help us design and refine and drive this forward.

So one of the exciting things we should mention here is that this is new and fresh. This cake is unbaked. It's almost baked, you can tell it's delicious, but it's not fully ready to be consumed.
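As an aside on the point made a little earlier, that Python operators such as plus are routed through dunder methods and so a numeric type can live in a library rather than in the compiler, here is a minimal sketch in plain Python. It is not Mojo code and not Modular's implementation; the `Int8` class and its wraparound behavior are hypothetical, chosen only to show a user-defined integer that works with `+` like a built-in one.

```python
class Int8:
    """Hypothetical library-defined 8-bit signed integer (illustration only)."""

    def __init__(self, value):
        # Wrap any Python int into the signed 8-bit range [-128, 127].
        self.value = (value + 128) % 256 - 128

    def __add__(self, other):
        # `a + b` calls a.__add__(b); accept Int8 or a plain int.
        other_value = other.value if isinstance(other, Int8) else other
        return Int8(self.value + other_value)

    # `1 + Int8(...)` falls back to __radd__ when int.__add__ gives up.
    __radd__ = __add__

    def __eq__(self, other):
        other_value = other.value if isinstance(other, Int8) else other
        return self.value == other_value

    def __repr__(self):
        return f"Int8({self.value})"


total = Int8(100) + Int8(100)  # 200 wraps around to -56
mixed = 1 + Int8(127)          # 128 wraps around to -128
```

Nothing here needed compiler support: the type's arithmetic rules are ordinary library code, which is the property being described, although in CPython every such object still pays the boxing and dispatch overhead discussed above.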
Yep, that's very fair. It is very useful, but it's very useful if you're a super low-level programmer right now, and what we're doing is we're working our way up the stack. So the way I would look at Mojo today, in May 2023, is that it's like a 0.1. I think that a year from now it's going to be way more interesting to a variety of people. But what we decided to do is release it early so that people can get access to it and play with it, and we can build it with the community. We have a big roadmap, fully published, being transparent about this, and a lot of people are involved in this stuff. So what we're doing is we're really optimizing for building this thing the right way, and building it the right way is kind of interesting working with the community, because everybody wants it yesterday. So sometimes, you know, there are some dynamics there, but I think it's good, it's the right thing.

So there's a Discord also, so the dynamics are pretty interesting. Sometimes the community probably can be very chaotic and introduce a lot of stress. Guido famously quit over the stress of the walrus operator. I mean, yeah, you know, it broke, maybe.

Exactly.

And so, like, it could be very
stressful to develop. But can you just, tangent upon a tangent, is it stressful to work through the design of various features here, given that the community is so richly involved?

Well, so I've been doing open development and community stuff for decades now. Somehow this has happened to me. So I've learned some tricks, but the thing that always gets me is I want to make people happy. And so this is maybe not all people all happy all the time, but generally I want people to be happy. And so the challenge is that, again, we're tapping into some deep-seated, long-standing tensions and pressures, both in the Python world but also in the AI world, in the hardware world, and things like this, and so people just want us to move faster. And so, again, our decision was, let's release this early, let's get people used to it, or access to it, and play with it, and let's build it in the open. We could have had the language monks sitting in the cloister up on the hilltop, beavering away trying to build something, but in my experience you get something that's way better if you work with the community. And so, yes, it can be frustrating, it can be challenging for lots
of people involved. And, I mean, if you mention our Discord, we have over 10,000 people on the Discord, 11,000 people or something. Keep in mind, we released Mojo like two weeks ago. So it's very effective, so it's very cool. But what that means is that 10, 11,000 people all will want something different. And so what we've done is we've tried to say, okay, cool, here's our roadmap, and the roadmap isn't completely arbitrary. It's based on, here's the logical order in which to build these features or add these capabilities and things like that. And what we've done is we've spun really fast on bug fixes, and so we actually have very few bugs, which is cool, I mean, actually, for a project in this state. But then what we're doing is we're dropping in features very deliberately.

I mean, this is fun to watch, because you've got the two gigantic communities of, like, hardware, like systems engineers, and then you have the machine learning Python people that are, like, higher level. And it's just two, like, armies, like...

They've been at war. Yeah, they've been at war, right? And so here's a Tolkien novel or something. Okay, so here's a test. Again, it's super funny for something that's only been out for two weeks, right? People are so impatient,
right? But okay, cool, let's fast forward a year. In a year's time Mojo will be actually quite amazing and solve tons of problems and be very good. People still have these problems, right? And so you look at this, and the way I look at this, at least, is to say, okay, well, we're solving big, long-standing problems. To me, again, working on many different problems, I want to make sure we do it right. There's a responsibility you feel, because if you mess it up... There are very few opportunities to do projects like this and have them really have impact on the world. If we do it right, then maybe we can take those feuding armies and actually heal some of those wounds.

Yeah, this feels like a speech by George Washington or Abraham Lincoln or something.

And you look at this and it's like, okay, well, how different are we? We all want beautiful things, we all want something that's nice, we all want to be able to work together, we all want our stuff to be used. And so if we can help heal that... Now, I'm not optimistic that all people will use Mojo and they'll stop using C++. That's not my goal. But if we can heal some of that, I think that'd be pretty
cool.

Yeah, and we start by putting the people who like braces into the gulag.

No. So there are proposals for adding braces to Mojo, and we just, you know, we tell them no. Okay, politely.

Yeah, anyway, so there are a lot of amazing features on the roadmap and those already implemented. It'd be awesome if I could just ask you a few things. So the other performance improvement comes from immutability. So what's this var and this let thing that we've got going on? What's immutability?

Yeah, so one of the things that is useful, and it's not always required, but it's useful, is knowing whether something can change out from underneath you. So in Python, you have a pointer to an array, and you pass that pointer to the array around to things. If you pass it into a function, they may take that and squirrel it away in some other data structure. And so you get your array back and you go to use it, and now somebody else is putting stuff in your array. How do you reason about that? It gets to be very complicated and leads to lots of bugs. And so one of the things that, again, this is not something Mojo forces on you, but something that Mojo enables, is a thing called value semantics. And what value semantics do is they take
collections like arrays, like dictionaries, also tensors and strings and things like this that are much higher level, and make them behave like proper values. And so it makes it look like, if you pass these things around, you get a logical copy of all the data. So if I pass you an array, it's your array, you can go do what you want to it, you're not going to hurt my array. Now, that is an interesting and very powerful design principle. It defines away a ton of bugs. You have to be careful to implement it in an efficient way.

Is there a performance hit that's significant?

Generally not, if you implement it the right way, but it requires a lot of very low-level getting-the-language-right bits.

I assumed there'd be a huge performance hit, because the benefit is really nice, because you don't get into that...

Absolutely. Well, the trick is you can't do copies. So you have to provide the behavior of copying without doing the copy.

Yeah, how do you do that? How do you do that?

It's not magic, it's actually pretty cool. Well, so first, before we talk about how that works, let's talk about how it works in Python. So in Python, you define a Person class, or maybe a Person class is a bad idea. You define a
Database class, right? And the Database class has an array of records, something like that. And so the problem is that if you pass a record or class instance into the database, it'll take hold of that object and then it assumes it has it. And if you're passing an object in, you have to know that that database is going to take it, and therefore you shouldn't change it after you put it in the database. You kind of have to know that. You just have to kind of know that. And so you roll out version one of the database, you just kind of have to know that. Of course, Lex uses his own database, right? Because you built it, you understand how this works. Somebody else joins the team, they don't know this. And so now they suddenly get bugs, you're having to maintain the database, you shake your fist, you argue. The tenth time this happens, you're like, okay, we have to do something different. And so what you do is you go change your Python code, and you change your Database class to copy the record every time you add it. And so what ends up happening is you say, okay, I will do what's called a defensive copy inside the database, and then that way, if somebody passes something in, I will have my own copy of it, and they can go do whatever and they're not going to break my thing.
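The aliasing bug and the defensive-copy fix just described can be sketched in a few lines of Python. This is only an illustration under assumptions: the `Database` class, its `add_shared` and `add_copied` methods, and the dictionary used as a record are hypothetical names, not code from the conversation.

```python
import copy


class Database:
    """Toy in-memory store; all names here are hypothetical."""

    def __init__(self):
        self.records = []

    def add_shared(self, record):
        # Version one: keeps the caller's object itself, so any later
        # mutation by the caller silently changes what the database holds.
        self.records.append(record)

    def add_copied(self, record):
        # Defensive copy: the database keeps its own independent record.
        self.records.append(copy.deepcopy(record))


record = {"name": "Ada", "tags": ["admin"]}
db = Database()
db.add_shared(record)
db.add_copied(record)

# The caller keeps using the record after handing it to the database.
record["tags"].append("banned")

shared_tags = db.records[0]["tags"]  # sees the caller's change
copied_tags = db.records[1]["tags"]  # unaffected by it
```

The defensive copy removes the bug, but it pays for a full copy on every insert whether or not the caller ever mutates the record, which is the cost that lazy, copy-on-demand value semantics aims to avoid.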
Okay, these are usually the two design patterns. If you look in PyTorch, for example, this is cloning a tensor. There's a specific thing and you have to know where to call it, and if you don't call it in the right place, you get these bugs. And this is state of the art.

So a different approach, and it's used in many languages, I've worked with it in Swift, is you say, okay, well, let's provide value semantics. And so we want to provide the view that you get a logically independent copy, but we want to do that lazily. And so what we do is, you say, okay, if you pass something into a function, it doesn't actually make a copy. What it actually does is it just increments a reference to it. And if you pass it around, you stick it in your database, it can go in the database or not, and then you come back out of the stack, nobody's copied anything. You come back out of the stack, and then the caller lets go of it, well, then you've just handed it off to the database. You've transferred it and there are no copies made. Now, on the other hand, if your co-worker goes and hands you a record and you pass it in, you stick it in the database, and then you go to town and you start modifying it, what happens is you
get a copy lazily, on demand. And so what this does is give you copies only when you need them. So it defines away the bugs, but it also generally reduces the number of copies in practice.

But the implementation details are tricky here.

Yes.

So this is something with reference counting, but to make it performant across a number of different kinds of objects...

Yeah, well, so you need a couple of things. This concept has existed in many different worlds, and so, again, it's not novel research at all. The magic is getting the design right so that you can do this in a reasonable way. And so there are a number of components that go into this. One is, when you're passing around, so we were talking about Python and reference counting and the expense of doing that, when you're passing values around, you don't want to do extra reference counting for no good reason. And so you have to make sure that you're efficient and you transfer ownership instead of duplicating references and things like that, which is a very low-level problem. You also have to adopt this, and you have to build these data structures. And so if you say, you know, Mojo has to be compatible with Python, so of course the default
list is a reference-semantic list that works the way you'd expect in Python, but then you have to design a value-semantic list. And so you just have to implement that, and then you implement the logic within. And so the role of the language here is to provide all the low-level hooks that allow the author of the type to be able to get and express this behavior without forcing it into all cases or hard-coding this into the language itself.

But there's ownership, so you're constantly transferring, you're tracking who owns the thing?

Yes, and so there's a whole system called ownership. This is related to work done in the Rust community. Also, the Swift community's done a bunch of work, and there's a bunch of different other languages that have all kind of... C++ actually has copy constructors and destructors and things like that. And, I mean, C++ has everything, so it has move constructors, it has this whole world of things. And so this is a body of work that's kind of been developing for many, many years now. And so Mojo takes some of the best ideas out of all these systems and remixes them in a nice way, so that you get the power of something like the Rust programming language, but you don't have to deal with it when you don't want to,
which is a major thing in terms of teaching and learning and being able to use and scale these systems.

How does that play with argument conventions? What are they? Why are they important? How does the value semantics, how does the transfer of ownership work with the arguments when they're passed differently?

Yeah, so if you go deep into systems programming land, and again, this is not something for everybody, but if you go deep into systems programming land, what you encounter is these types that get weird. So if you're used to Python, you think about everything, I can just copy it around, I can go change it and mutate it and do these things, and it's all cool. If you get into systems programming land, you get into these things like, I have an atomic number, or I have a mutex, or I have a uniquely owned database handle, things like this. So these types, you can't necessarily copy. Sometimes you can't necessarily even move them to a different address. And so what Mojo allows you to do is it allows you to express, hey, I don't want to get a copy of this thing, I want to actually just get a reference to it. And by doing that, what you can say is, okay, if I'm defining something weird like an atomic number or something, it has to be... So an atomic
number is an area in memory that multiple threads can access at a time without locks. And so the definition of an atomic number is that multiple different things have to be poking at it, therefore they have to agree on where it is. So you can't just move it out from underneath one, because it kind of breaks what it means. And so that's an example of a type that you can't copy, you can't move; once you create it, it has to be where it was.

Now, if you look at many other examples, like a database handle, so okay, well, what happens? How do you copy a database handle? Do you copy the whole database? That's not something you necessarily want to do. There are a lot of types like that, where you want to be able to say that they are uniquely owned, and so there's always one of this thing, or if I create a thing, I don't copy it. And so what Mojo allows you to do is it allows you to say, hey, I want to pass around a reference to this thing without copying it. And so it has borrowed conventions, so you can say, you can use it but you don't get to change it. You can pass it by mutable reference, and if you do that, then you get a reference to it, but you can change it. And so it manages all that kind of stuff.

So it's
just a really nice implementation of, like, C++ has, you know, the different kinds of pointers, smart pointers, different kinds of smart pointers that you can explicitly define. This allows you... but you're saying that's more like the weird case versus the common case?

Well, it depends on where... I mean, I don't think I'm a normal person, so, yes, I mean, I'm not one to call other people weird. But, you know, if you talk to a typical Python programmer, you're typically not thinking about this. This is a lower level of abstraction. Now, if you talk to a C++ programmer, certainly if you talk to a Rust programmer, again, they're not weird, they're delightful, these are all good people, right? Those folks will think about this all the time. And so I look at this as, there's a spectrum between very deep low-level systems, I'm going to go poke the bits and care about how they're laid out in memory, all the way up to application and scripting and other things like this. And so it's not that anybody's right or wrong, it's about how do we build one system that scales.

By the way, the idea of an atomic number has been something that always
brought me deep happiness, because the flip side of that, the idea that threads can just modify stuff asynchronously, the whole idea of concurrent programming, is a source of infinite stress for me.

Well, so this is where you jump into, you know, again, you zoom out and get out of programming languages or compilers and just look at what the industry has done. My mind is constantly blown by this. And you look at Moore's law. Moore's law has this idea that, like, computers, for a long time single-thread performance just got faster and faster and faster and faster for free. But then physics and other things intervened, and power consumption, like, other things started to matter. And so what ended up happening is we went from single-core computers to multi-core, then we went to accelerators. This trend towards specialization of hardware is only going to continue. And so for years, us programming language nerds and compiler people have been saying, okay, well, how do we tackle multi-core? For a while it was, multi-core is the future, we have to get on top of this thing. And then it was, multi-core is the default, what are we doing with this thing? And then it's like, there are chips
with hundreds of cores in them. What happened? And so I'm super inspired by the fact that, in the face of this, those machine learning people invented this idea of a tensor. And what is a tensor? A tensor is, like, an arithmetic and algebraic concept. It's like an abstraction around a gigantic parallelizable data set. And because of that, and because of things like TensorFlow and PyTorch, we're able to say, okay, we'll express the math of the system. This enables you to do automatic differentiation, it enables you to do all these cool things, and it's an abstract representation. Well, because you have that abstract representation, you can now map it onto these parallel machines without having to control, okay, put that right here, put that right there, put that right there. And this has enabled an explosion in terms of AI, compute, accelerators, all the stuff. And so that's super, super exciting.

What about the deployment, the execution across multiple machines? So you write that the Modular compute platform dynamically partitions models with billions of parameters and distributes their execution across multiple machines, enabling unparalleled efficiency...
(the use of "unparalleled" in that sentence...) anyway, enabling unparalleled efficiency, scale, and reliability for the largest workloads. So how do you do this abstraction of distributed deployment of large models?
Yeah, so one of the really interesting tensions... so there's a whole bunch of stuff that goes into that. I'll pick a random walk through it. If you go back and replay the history of machine learning, right, I mean the brief, most recent history of machine learning, because this is, as you know, very deep. I knew Lex when he had an AI podcast.
Yes. Right, yeah.
So if you look at just TensorFlow and PyTorch, which is pretty recent history in the big picture, right, TensorFlow is all about graphs. PyTorch, I think pretty unarguably, ended up winning. And why did it win? Mostly because of usability, right? And the usability of PyTorch is, I think, huge. And I think, again, that's a huge testament to the power of taking abstract, theoretical, technical concepts and bringing them to the masses, right? Now, the challenge with the TensorFlow versus the PyTorch design points was that TensorFlow's kind of difficult to use for researchers, but
it was actually pretty good for deployment. PyTorch is really good for researchers; it's kind of not super great for deployment, right? And so I think we as an industry have been struggling. And if you look at what deploying a machine learning model today means, it's that you'll have researchers who are, I mean, wicked smart, of course, but they're wicked smart at model architecture and data and calculus. Like, they're wicked smart in various domains. They don't want to know anything about the hardware or deployment or C++ or things like this, right? And so what's happened is you get people who train the model, they throw it over the fence, and they have people that try to deploy the model. Well, every time you have a team A that does X, they throw it over the fence, and team B does Y, you have a problem, because of course it never works the first time. And so you throw it over the fence, they figure out, okay, it's too slow, it won't fit, it doesn't use the right operator, the tool crashes, whatever the problem is. Then they have to throw it back over the fence. And every time you throw a thing over a fence, it takes three weeks of project managers and meetings and things like this. And so what we've seen today is
getting models in production can take weeks or months. Like, it's not atypical, right? I talk to lots of people, and you talk about, like, a VP of software at some internet company trying to deploy a model, and they're like, why do I need a team of 45 people? It's so easy to train a model, why can't I deploy it, right? And if you dig into this, every layer is problematic. So if you look at the language piece, I mean, this is the tip of the iceberg. It's a very exciting tip of the iceberg for folks, but you've got Python on one side and C++ on the other side. Python doesn't really deploy. I mean, it can, theoretically, technically, in some cases, but often a lot of production teams will want to get things out of Python because they get better performance and control and whatever else. So Mojo can help with that. If you look at serving, so you talk about gigantic models. Well, a gigantic model won't fit on one machine, right? And so now you have this model, it's written in Python, it has to be rewritten in C++. Now it also has to be carved up so that half of it runs on one machine, half of it runs on another machine, or maybe it runs on 10 machines. Well, so now suddenly the complexity is
exploding, right? And the reason for this is that if you look into TensorFlow, PyTorch, these systems, they weren't really designed for this world, right? They were designed for, you know, back in the day when we were starting and doing things, where it was a different, much simpler world. Like, you want to run ResNet-50 or some ancient model architecture like this. It was a completely different world.
Trained on one GPU.
Exactly. Yeah, AlexNet, right, the major breakthrough. And the world has changed, right? And so now the challenge is that TensorFlow, PyTorch, these systems, they weren't actually designed for LLMs. Like, that was not a thing. And so where TensorFlow actually has amazing power in terms of scale and deployment and things like that, and I think Google is, I mean, maybe not unmatched, but they're incredible in terms of their capabilities at gigantic scale, many researchers are using PyTorch, right? And so PyTorch doesn't have those same capabilities. And so what Modular can do is it can help with that. Now, if you take a step back and say, like, what is Modular doing, right? So Modular has, like, a bitter enemy that we're fighting against in the industry. And it's one of
these things where everybody knows it, but nobody is usually willing to talk about it.
The bitter enemy.
The bitter thing that we have to destroy, that we're all struggling with, and it's all around, it's like fish can't see water: it's complexity.
Sure, yes. Complexity, right. That was very philosophical.
And so if you look at it, yes, it is on the hardware side. Yes, all these accelerators, all these software stacks that go with the accelerators, all this massive complexity over there. You look at what's happening on the modeling side: massive amount of complexity. Like, things are changing all the time. People are inventing. Turns out the research is not done, right? And so people want to be able to move fast. Transformers are amazing, but there's a ton of diversity even within transformers, and what's the next transformer, right? And you look into serving: also huge amounts of complexity. It turns out that all the cloud providers, right, have all their very weird but very cool hardware for networking and all this kind of stuff, and it's all very complicated. People aren't using that. You look at classical serving, right? There's this whole world of
people who know how to write high-performance servers with zero-copy networking and all this fancy asynchronous I/O and all these fancy things in the serving community. Very little of that has pervaded into the machine learning world, right? And why is that? Well, it's because, again, these systems have been built up over many years. They haven't been rethought. There hasn't been a first-principles approach to this. And so what Modular is doing is we're saying, okay, we've built many of these things. So I've worked on TensorFlow and TPUs and things like that. Other folks on our team have worked on PyTorch core. We've worked on ONNX Runtime. We've worked on many of these other systems, and systems like the Apple accelerators and all that kind of stuff. Like, our team is quite amazing. And so one of the things that roughly everybody at Modular is grumpy about is that when you're working on one of these projects, you have a first-order goal: get the hardware to work, get the system to enable one more model, get this product out the door, enable the specific workload, or make it solve this problem for this product team, right? And nobody's been given a chance to actually do that step back. And so we as an industry, we didn't take two steps
forward, we took, like, 18 steps forward in terms of all this really cool technology across compilers and systems and runtimes and heterogeneous computing, like all this kind of stuff. And all this technology has been, I wouldn't say beautifully designed, but it's been proven in different quadrants. Like, you look at Google with TPUs: massive, huge exaflops of compute strapped together into machines that researchers are programming in Python in a notebook. That's huge. That's amazing. That's incredible, right? It's incredible. And so you look at the technology that goes into that, and the algorithms are actually quite general. And so lots of other hardware out there and lots of other teams out there don't have the sophistication, or maybe the years working on it, or the budget, or whatever, that Google does, right? And so they should be getting access to the same algorithms, but they just don't have that, right? And so what Modular's doing is we're saying, cool, this is not research anymore. Like, we've built autotuning in many systems. We've built programming languages, right? And so, like, I've implemented C++, I've implemented Swift, I've implemented many of these things. And so it's hard, but it's not research. And you
look at accelerators. Well, we know there's a bunch of different weird kinds of accelerators, but they actually cluster together, right? And you look at GPUs. Well, there's a couple of major vendors of GPUs, and they maybe don't always get along, but their architectures are very similar. You look at CPUs. CPUs are still super important for the deployment side of things. You see new architectures coming out from all the cloud providers and things like this, and they're all super important to the world, right? But they don't have the 30 years of development that the entrenched people do, right? And so what Modular can do is we're saying, okay, all this complexity, it's not bad complexity, it's actually innovation, right? And so it's innovation that's happening, and it's for good reasons. But I have sympathy for the poor software people, right? I mean, again, I'm generally a software person too. I love hardware, but software people want to build applications and products and solutions that scale over many years. They don't want to build a solution for one generation of hardware with one vendor's tools, right? And because of this, they need something that scales with them. They need something that works on cloud and mobile, right? Because, you know, their product
manager said, hey, I want it to have lower latency, and it's better for personalization, or whatever they decide, right? Products evolve. And so the challenge with the machine learning technology and the infrastructure we have today in the industry is that it's all these point solutions. And because there are all these point solutions, it means that as your product evolves, you have to switch to different technology stacks or switch to a different vendor. And what that does is that slows down progress.
So basically, a lot of the things we've developed in those little silos for machine learning tasks, you want to make that the first-class citizen of a general-purpose programming language that can then be compiled across all these kinds of hardware.
Well, so it's not really about a programming language. I mean, the programming language is a component of the mission, right? And the mission is, not literal, but our joking mission is to save the world from terrible AI software. So if you look at this mission, you need a syntax. So yeah, you need a programming language, right? And we wouldn't have to build the programming language if one existed, right? So if Python was already good
enough, then cool, we would've just used it, right? We're not just doing very large-scale, expensive engineering projects for the sake of it. Like, it's to solve a problem, right? It's also about accelerators. It's also about exotic numerics and bfloat16 and matrix multiplications and convolutions and this kind of stuff. Within the stack there are things like kernel fusion. That's an esoteric but really important thing that leads to much better performance and much more general research hackability together.
Right, and that's enabled by the ASICs, that's enabled by certain hardware. So it's like, where's the dance between... there's several questions here. Like, how do you add a piece of hardware to the stack? If I have this genius invention of a specialized accelerator, how do I add that to the Modular framework? And also, how does Modular, as a standard, start to define the kind of hardware that should be developed?
Yeah, so let me take a step back and talk about the status quo, okay? And so if you go back to TensorFlow 1, PyTorch 1, this kind of time frame, and these have all evolved and gotten way more complicated, so let's go back to the glorious simple days, right? These
things basically were CPUs and CUDA. And so what you do is you say, go do a dense layer, and a dense layer has a matrix multiplication in it, right? And so when you say that, you say, go do this big operation, a matrix multiplication, and if it's on a GPU, kick off a CUDA kernel. If it's on a CPU, go do, like, an Intel algorithm or something like that with the Intel MKL. Okay, now that's really cool if you're either NVIDIA or Intel, right? But then more hardware comes in, right? And on one axis you have more hardware coming in. On the other hand, you have an explosion of innovation in AI. And so what happened with both TensorFlow and PyTorch is that the explosion of innovation in AI has led to, it's not just about multiplication and convolution, these things now have, like, 2,000 different operators. And on the other hand, you have, I don't know how many pieces of hardware there are out there. It's a lot. It's not even hundreds, it's probably thousands, okay? And across all of edge, and across, like, all the different things that are used at scale. Yeah, exactly. I mean, so it's not just, like, AI's everywhere. Yeah, it's not a handful of TPU alternatives. Correct, it's every phone, often with many different chips inside of it from
different vendors, right? Like, AI is everywhere. It's a thing, right?
Why are they all making their own chips? Like, why is everybody making their own thing? Well, so is that a good thing, first of all? So, Chris's philosophy on hardware.
Yeah, right. So my philosophy is that there isn't one right solution, right? And so I think that, again, we're at the end of Moore's law, specialization happens. If you're training GPT-5, you want some crazy supercomputer data center thingy. If you're making a smart camera that runs on batteries, you want something that looks very different. If you're building a phone, you want something that looks very different. If you have something like a laptop, you want something that looks maybe similar, but at a different scale, right? And so AI ends up touching all of our lives, robotics, right, and lots of different things. And so as you look into this, these have different power envelopes, there's different trade-offs in terms of the algorithms, there's new innovations in sparsity and other data formats and things like that. And so hardware innovation, I think, is a really good thing, right? And what I'm interested in is unlocking that innovation. There's also, like, analog and quantum and like
all the really weird stuff, right? And so if somebody can come up with a chip that uses analog computing and it's 100x more power efficient, think what that would mean in terms of the daily impact on the products we use. That would be huge. Now, if you're building an analog computer, you may not be a compiler specialist, right? These are different skill sets, right? And so you can hire some compiler people if you're running a big company, maybe, but it turns out these are really, like, an exotic new generation of compilers. Like, this is a different thing, right? And so if you take a step back out and come back to what is the status quo, the status quo is that if you're Intel or you're NVIDIA, you keep up with the industry and you chase, and okay, there's 1,900, now there's 2,000, now there's 2,100, and you have a huge team of people that are trying to keep up and tune and optimize. And even when one of the big guys comes out with a new generation of their chip, they have to go back and rewrite all these things, right? So really, it's only powered by having hundreds of people that are all frantically trying to keep up. And what that does is that keeps out the little guys, and sometimes the not-so-little guys, the big guys that are also just not in
those dominant positions. And so what has been happening, and so a lot of, you talk about the rise of new exotic crazy accelerators, is people have been trying to turn this from a "let's go write lots of special kernels" problem into a compiler problem. And so we, and I contributed to this as well, we as an industry went into a, like, "let's go make this a compiler problem" phase, let's call it. And much of the industry is still in this phase, by the way. So I won't say this phase is over. And so the idea is to say, look, okay, what a compiler does is it provides a much more general, extensible, hackable interface for dealing with the general case, right? And so within machine learning algorithms, for example, people figured out that, hey, if I do a matrix multiplication and I do a ReLU, right, the classic activation function, it is way faster to do one pass over the data and then do the ReLU on the output where I'm writing out the data, because ReLU is just a maximum operation, right, max with zero. And so it's an amazing optimization to take matmul and ReLU and squish them together in one operation. Now I have matmul-ReLU. Well, wait a second. If I do that, now I just went from having, you know, two operators to three. But now I figure out, okay, well, there's a
lot of activation functions. What about leaky ReLU? What about, like, a million things that are out there, right? And so as I start fusing these in, now I get permutations of all these algorithms, right? And so what the compiler people said is, they said, hey, cool, well, I will go enumerate all the algorithms, and I will enumerate all the pairs, and I will actually generate a kernel for you. And I think that this has been very, very useful for the industry. This is one of the things that powers Google TPUs. PyTorch 2 is rolling out really cool compiler stuff with Triton, this other technology, and things like this. And so the compiler people are kind of coming into their fore and saying, like, awesome, this is a compiler problem, we'll compiler it. Here's the problem: not everybody's a compiler person. I love compiler people, trust me, right? But not everybody can or should be a compiler person. It turns out that there are people that know analog computers really well, or they know some GPU internal architecture thing really well, or they know some crazy sparse numeric interesting algorithm that is at the cusp of research, but they're not compiler people. And so one of the challenges with this new wave of technology trying to turn everything
into a compiler: once again, it's excluded a ton of people. And so you look at, what does Mojo do, what does the Modular stack do? It brings programmability back into this world. Like, it enables, I wouldn't say normal people, but, like, a different kind of delightful nerd that cares about numerics, or cares about hardware, or cares about things like this, to be able to express that in the stack and extend the stack without having to actually go hack the compiler itself.
To extend the stack on the algorithm side.
Yeah.
And then on the hardware side.
Yeah. So again, go back to the simplest example of Int, right? And so what both Swift and Mojo and other things like this did is we said, okay, pull magic out of the compiler and put it in the standard library, right? So what Modular is doing with the engine that we're providing, and this very deep technology stack, right, which goes into heterogeneous runtimes and a whole bunch of really cool things, this whole stack allows that stack to be extended and hacked and changed by researchers and by hardware innovators and by people who know things that we don't know. Because, you know, Modular has some smart people, but we don't have all the smart people, it turns
out, right?
What are heterogeneous runtimes?
Yeah, so what is heterogeneous, right? So heterogeneous just means many different kinds of things together. And so the simplest example you might come up with is a CPU and a GPU. And so it's a simple heterogeneous computer to say, I'll run my data loading and pre-processing and other algorithms on the CPU, and then once I get it into the right shape, I shove it into the GPU, I do a lot of matrix multiplications and convolutions and things like this, and I get it back out and I do some reductions and summaries, and then I shove it across the wire, across the network, to another machine, right? And so you've got now what are effectively two computers, a CPU and a GPU, talking to each other, working together in a heterogeneous system. But that was 10 years ago, okay? You look at a modern cell phone. A modern cell phone, you've got CPUs, and they're not just CPUs, there's, like, big.LITTLE CPUs, and so there's multiple different kinds of CPUs that are, again, working together. They're multi-core. You've got GPUs, you've got neural network accelerators, you've got dedicated hardware blocks for media, so for video decode and JPEG decode and things
like this. And so you've got this massively complicated system, and this isn't just cell phones. Every laptop these days is doing the same thing. And all these blocks can run at the same time and need to be choreographed, right? And so again, one of the cool things about machine learning is it's moving things to, like, data flow graphs and higher levels of abstraction and tensors and these things, so that it doesn't specify, here's how to do the algorithm. It gives the system a lot more flexibility in terms of how to translate or map it or compile it onto the system that you have. And so what you need, you know, at the bottom part of the layer there, is a way for all these devices to talk to each other. And so this is one thing that, you know, I'm very passionate about. I mean, you know, I'm a nerd. But all these machines and all these systems are effectively parallel computers running at the same time, sending messages to each other. And so they're all fully asynchronous. Well, this is actually a small version of the same problem you have in a data center, right? In a data center, you now have multiple different machines, sometimes very specialized, sometimes with GPUs or TPUs in one node, and sometimes with disks in other nodes.
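As a rough illustration of the point about devices behaving like parallel computers that send messages to each other asynchronously, here is a minimal Python sketch. It is only an assumption-laden toy: the "CPU" and "accelerator" are plain coroutines joined by queues, the names `cpu_preprocess`, `accelerator_matmul`, and `run_pipeline` are hypothetical, and nothing here reflects how Modular's runtime is actually built.

```python
import asyncio

async def cpu_preprocess(batches, to_accel):
    # "CPU" stage (hypothetical): get each batch into the right shape, then hand it off.
    for batch in batches:
        scaled = [x / 10.0 for x in batch]
        await to_accel.put(scaled)
    await to_accel.put(None)  # sentinel: no more work is coming

async def accelerator_matmul(to_accel, to_host, weights):
    # "Accelerator" stage (hypothetical): a dot product stands in for the heavy math.
    while True:
        batch = await to_accel.get()
        if batch is None:
            await to_host.put(None)  # forward the sentinel to the host
            break
        await to_host.put(sum(x * w for x, w in zip(batch, weights)))

async def run_pipeline(batches, weights):
    # A bounded queue is a crude stand-in for a link with limited capacity.
    to_accel = asyncio.Queue(maxsize=2)
    to_host = asyncio.Queue()
    producer = asyncio.create_task(cpu_preprocess(batches, to_accel))
    worker = asyncio.create_task(accelerator_matmul(to_accel, to_host, weights))
    results = []
    while True:
        item = await to_host.get()
        if item is None:
            break
        results.append(item)
    await asyncio.gather(producer, worker)
    return results

if __name__ == "__main__":
    print(asyncio.run(run_pipeline([[10, 20], [30, 40]], [1.0, 2.0])))
```

Both stages run concurrently and only interact through messages, which is the same shape of problem whether the two ends are blocks on one chip or nodes in a data center.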
And so you get a much larger-scale heterogeneous computer. And so what ends up happening is you have this, like, multi-layer abstraction of hierarchical parallelism, hierarchical asynchronous communication. And making that, again, my enemy is complexity, by getting that away from being different specialized systems at every different part of the stack, and having more consistency and uniformity, I think we can help lift the world and make it much simpler and actually get used.
But how do you leverage the strengths of the different specialized systems? So looking inside the smartphone, there's, what, like, I don't know, five, six computers essentially inside a smartphone. How do you do that without making it explicit, trying to minimize making it explicit, which computer is supposed to be used for which operation?
Yeah, so there's a pretty well-known algorithm, and what you're doing is you're looking at two factors. You're looking at the factor of sending data from one thing to another, right? Because it takes time to get it from that side of the chip to that side of the chip and things like this. And then you're looking at what is the time it takes to do an operation on
a particular block. So take CPUs. CPUs are fully general, they can do anything, right? But then you have a neural net accelerator that's really good at matrix multiplications, okay? And so you say, okay, well, if my workload is all matrix multiplications, I start up, I send the data over to the neural net thing, it goes and does matrix multiplications, when it's done it sends me back the result, all is good, right? And so the simplest thing is just saying, do matrix operations over there, right? But then you realize it gets a little bit more complicated, because you can do matrix multiplications on a GPU, you can do it on a neural net accelerator, you can do it on a CPU, and they'll have different trade-offs and costs. And it's not just matrix multiplication. And so what you actually look at is, you look at, I have generally a graph of compute, I want to do a partitioning, I want to look at the communication, the bisection bandwidth, and, like, the overhead and the sending of all these different things, and build a model for this, and then decide, okay, it's an optimization problem: where do I want to place this compute?
This is the old-school theoretical computer science problem of scheduling. And then, presumably it's possible to somehow
magically include autotune into this?
Absolutely. So, I mean, in my opinion, and this is an opinion, not everybody would agree with this, but in my opinion, the world benefits from simple and predictable systems at the bottom that you can control. But then once you have a predictable execution layer, you can build lots of different policies on top of it, right? And so one policy can be that the human programmer says, do that here, do that here, do that here, do that here, and fully manually controls everything, and the system should just do it, right? Then you quickly get in the mode of, like, I don't want to have to tell it to do it. And so the next logical step that people typically take is they write some terrible heuristic: oh, if it's a matrix multiplication, do it over there, or if it's floating point, do it on the GPU, if it's integer, do it on the CPU, something like that, right? And then you get into this mode of, like, people care more and more and more, and you say, okay, well, let's actually make the heuristic better. Let's get into autotune. Let's actually do a search of the space to decide, well, what is actually better, right? Well, then you get into this problem where you realize this is not a small space. This is a many
dimensional, hyperdimensional space that you cannot exhaustively search. So do you know of any algorithms that are good at searching very complicated spaces for...
Don't tell me you're going to turn this into a machine learning problem.
So then you turn it into a machine learning problem, and then you have a space of genetic algorithms and reinforcement learning and all these...
Can you include that into the stack, into the Modular stack? Where does it sit? Where does it live? Is it a separate thing, or is it part of the compilation?
So you start from simple and predictable models, and so you can have full control, and you can have coarse-grained knobs that, like, nudge the system, so you don't have to do this. But if you really care about getting the best, you know, the last ounce out of a problem, then you can use additional tools. And the cool thing is you don't want to do this every time you run a model. You want to figure out the right answer and then cache it. And once you do that, you can say, okay, cool, I can get up and running very quickly, I can get good execution out of my system, I can decide if something's important, and if it's important, I can go throw a bunch of machines at it and do a big
expensive search over the space using whatever technique I feel like, it's really up to the problem. And then when I get the right answer, cool, I can just start using it, right? And so you can get out of this trade-off between, okay, am I going to spend forever doing a thing, or do I get up and running quickly, and is it a quality result? Like, these are actually not in contention with each other if the system's designed to scale.
You started and did a little bit of a whirlwind overview of how you get the 35,000x speedup or more over Python. Jeremy Howard did a really great presentation about sort of the basic, like, look at the code, here's how you get the speedup. Like you said, that's something that probably developers can do for their own code to see how you can get these gigantic speedups. But can you maybe speak to the machine learning task in general? How do you make some of this code fast, and specifics? Like, what would you say is the main bottleneck for machine learning tasks? So are we talking about matmul, matrix multiplication? How do you make that fast?
So, I mean, if you just look at the Python problem, right, you can say, how do I make Python faster? There's been a lot of people that have
been working on the, okay, how do I make Python 2x faster, or 10x faster, or something like that, right? And there's been a ton of projects in that vein, right? Mojo started from the, what can the hardware do? Like, what is the limit of physics? What is the speed of light? Like, how fast can this thing go? And then, how do I express that, right? And so it wasn't anchored relatively on make Python a little bit faster. It's saying, cool, I know what the hardware can do, let's unlock that, right?
Now, can I just say how gutsy that is, to be in the meeting and, as opposed to trying to see how do we get the improvement, it's like, what can the physics do?
I mean, maybe I'm a special kind of nerd, but you look at that: what is the limit of physics, how fast can these things go, right? When you start looking at that, typically it ends up being a memory problem, right? And so today, particularly with these specialized accelerators, the problem is that you can do a lot of math within them, but you get bottlenecked sending data back and forth to memory, whether it be local memory or distant memory or disk or whatever it is. And that bottleneck, particularly as the training sizes get large, as you start doing tons
of inferences all over the place, like, that becomes a huge bottleneck for people, right? So again, what happened is we went through a phase of many years where people took the special case and hand-tuned it and tweaked it and tricked it out, and they knew exactly how the hardware worked, and they knew the model, and they made it fast. Didn't generalize. And so you can make, you know, ResNet-50 or AlexNet or something, Inception v1, like, you can do that, right? Because the models are small, they fit in your head, right? But as the models get bigger and more complicated, as the machines get more complicated, it stops working, right? And so this is where things like kernel fusion come in. So what is kernel fusion? This is this idea of saying, let's avoid going to memory, and let's do that by building a new hybrid kernel, a numerical algorithm that actually keeps things in the accelerator instead of having to write all the way out to memory, right? What's happened with these accelerators now is you get multiple levels of memory. Like, in a GPU, for example, you'll have global memory and local memory and all these things. If you zoom way into how hardware works, the register file is actually a
memory. So the registers are like an L0 cache. And so a lot of taking advantage of the hardware ends up being fully utilizing the full power in all of its capability. And this has a number of problems, right? One of which is, again, the complexity disaster, right? There's too much hardware. Even if you just say, let's look at the chips from one line of vendor, like Apple or Intel or whatever it is, each version of the chip comes out with new features, and they change things so that it takes more time or less time to do different things. And you can't rewrite all the software whenever a new chip comes out, right? And so this is where you need a much more scalable approach. And this is what Mojo and what the Modular stack provides: it provides this infrastructure and the system for factoring all this complexity, and then allowing people to express algorithms, you talk about autotuning, for example, express algorithms in a more portable way, so that when a new chip comes out, you don't have to rewrite it all. So to me, like, I kind of joke, like, what is a compiler? Well, there's many ways to explain that. You convert thing A into thing B, and you convert source code to machine code. Like, you can
talk about many, many things that compilers do, but to me, it's about a bag of tricks. It's about a system and a framework that you can hang complexity on. It's a system that can then generalize, and it can work on problems that are bigger than fit in one human's head, right? And so what that means, what a good stack and what the Modular stack provides, is the ability to walk up to it with a new problem and it'll generally work quite well. And that's something a lot of machine learning infrastructure and tools and technologies don't have. The typical state of the art today is you walk up, particularly if you're deploying, if you walk up with a new model, you try to push it through the converter, and the converter crashes. That's crazy. The state of ML tooling today is not anything that a C programmer would ever accept, right? And it's always been this kind of flaky set of tooling that's never been integrated well and never worked together. And because it's not designed together, it's built by different teams, it's built by different hardware vendors, it's built by different systems, it's built by different internet companies that are trying to solve their problems, right? And so that means that we get this fragmented, terrible mess of
complexity. So, I mean, the specifics of it: Jeremy showed this. There's the vectorize function, which I guess is built into Mojo? Vectorize, as he showed, is built into the library. Into the library, in the library. Vectorize, parallelize: vectorize is more low level, parallelize is higher level. There's the tiling thing, which is how he demonstrated the autotune, I think. So think about this in, like, hierarchical levels of abstraction, right? And so if you zoom all the way into a compute problem, you have one floating point number, right? So then you say, okay, I can do things one at a time in an interpreter. It's pretty slow, right? So I can get to doing one at a time in a compiler, like in C. Then I can get to doing 4 or 8 or 16 at a time with vectors. That's called vectorization. Then you can say, hey, I have a whole bunch of different... You know, what a multi-core computer is, is it's basically a bunch of computers, right? So they're all independent computers that can talk to each other, and they share memory. And so now what
parallelize does is it says, okay, run multiple instances of this on different computers, and now they can all work together on a problem, right? And so what you're doing is you're saying, keep going out to the next level out, and as you do that, how do I take advantage of this? So tiling is a memory optimization, right? It says, okay, let's make sure that we're keeping the data close to the compute part of the problem instead of sending it all back and forth through memory every time I load a block. And the size of the block, that's how you get to the autotune, to make sure it's optimized? Yeah. Well, so all of these details matter so much to get good performance. This is another funny thing about machine learning and high performance computing that is very different than the C compilers we all grew up with, where, you know, if you get a new version of GCC or a new version of Clang or something like that, maybe something will go one percent faster, right? And so compiler engineers will work really, really, really hard to get half a percent out of your C code, something like that. But when you're talking about an accelerator or an AI application, or you're talking about these kinds of algorithms. Now, these are things people used to write in Fortran, for example.
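To make the tiling and autotune idea above concrete, here is a minimal sketch in plain Python. It is not Mojo and not the Modular API; the function names (`matmul_naive`, `matmul_tiled`, `autotune_tile`) and the candidate tile sizes are assumptions made up for illustration. It shows the same matrix product computed one element at a time and block by block, plus a toy "autotuner" that simply times each candidate block size and keeps the fastest.

```python
import time


def matmul_naive(a, b):
    """Multiply two square matrices (lists of lists), one element at a time."""
    n = len(a)
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            total = 0.0
            for k in range(n):
                total += a[i][k] * b[k][j]
            c[i][j] = total
    return c


def matmul_tiled(a, b, tile):
    """Same product, computed block by block.

    Each (i0, k0, j0) block touches only a tile-sized piece of a, b and c,
    which is the access pattern that keeps data close to the compute.
    """
    n = len(a)
    c = [[0.0] * n for _ in range(n)]
    for i0 in range(0, n, tile):
        for k0 in range(0, n, tile):
            for j0 in range(0, n, tile):
                for i in range(i0, min(i0 + tile, n)):
                    row_a, row_c = a[i], c[i]
                    for k in range(k0, min(k0 + tile, n)):
                        a_ik, row_b = row_a[k], b[k]
                        for j in range(j0, min(j0 + tile, n)):
                            row_c[j] += a_ik * row_b[j]
    return c


def autotune_tile(a, b, candidates=(4, 8, 16, 32)):
    """Hypothetical autotuner: time every candidate tile size, keep the fastest."""
    timings = {}
    for tile in candidates:
        start = time.perf_counter()
        matmul_tiled(a, b, tile)
        timings[tile] = time.perf_counter() - start
    return min(timings, key=timings.get)
```

In an interpreter the timings say little about real cache behavior, so the winner here is mostly noise. The point is the structure: the block size is a "magic number" that is searched for on the target machine rather than hard-coded.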
Right? If you get it wrong, it's not five percent or one percent, it could be 2x or 10x, right? If you think about it, you really want to make use of the full memory you have, the cache for example. But if you use too much space, it doesn't fit in the cache, and now you're going to be thrashing all the way back out to main memory. And these can be 2x, 10x major performance differences. And so this is where getting these magic numbers and these things right is really actually quite important. So you mentioned that Mojo is a superset of Python. Can you run Python code as if it's Mojo code? Yes, yes. And this has two sides to it. So Mojo's not done yet, so I'll give you the disclaimer: it's not done yet. But already we see people that take small pieces of Python code, move it over, they don't change it, and you can get 12x speedups. Like, somebody was just tweeting about that yesterday, which is pretty cool, right? And again, interpreters, compilers, right? And so without changing any code, and also this is not JIT compiling or doing anything fancy, this is just basic stuff, moving straight over. Now Mojo will continue to grow out, and as it grows out
it will have more and more and more features, and our North Star is to be a full superset of Python, and so you can bring over basically arbitrary Python code and have it just work. And it may not always be 12x faster, but it should be at least as fast, and way faster in many cases. This is the goal, right? Now, it'll take time to do that, and Python is a complicated language. There's not just the obvious things, but there's also non-obvious things that are complicated, like we have to be able to talk to CPython packages, to talk to the C API, and there's a bunch of pieces. So you have to, I mean, to make explicit the obvious that may not be so obvious until you think about it: to run Python code, that means you have to run all the Python packages and libraries? Yeah. Yeah. So that means, what's the relationship between Mojo and CPython, the interpreter that presumably would be tasked with getting those packages to work? Yep. So in the fullness of time, Mojo will solve for all the problems, and you'll be able to move Python packages over and run them in Mojo. Without the CPython? Without CPython, someday. Yeah, right. It's not today, but someday. And that'll be a beautiful day, because then you'll get a whole
bunch of advantages and you'll get massive speedups and things like this. But you can do that one at a time, right? You can move packages one... Exactly. But we're not willing to wait for that. Python is too important, the ecosystem is too broad. We want to both be able to build Mojo out, and we also want to do it the right way, without intense time pressure. We're obviously moving fast, but... And so what we do is we say, okay, well, let's make it so you can import an arbitrary existing package. Arbitrary, including, like, you write your own on your local disk or whatever. It's not like a standard... like, an arbitrary package. And import that using CPython, because CPython already runs all the packages, right? And so what we do is we built an integration layer where we can actually use CPython (again, I'm practical) to actually just load and use all the existing packages as they are. The downside of that is you don't get the benefits of Mojo for those packages, right? And so they'll run as fast as they do in the traditional CPython way. But what that does is that gives you an incremental migration path. And so if you say, hey, cool, well, here's a, you know, the Python ecosystem is vast, I want all of
it to just work, but there's certain things that are really important. And so if I'm doing weather forecasting or something, well, I want to be able to load all the data, I want to be able to work with it, and then I have my own crazy algorithm inside of it. Well, normally I'd write that in C++. If I can write it in Mojo and have one system that scales, well, that's way easier to work with. Is it hard to do that, to have that layer that's running CPython? Because is there some communication back and forth? Yes, it's complicated. I mean, this is what we do, so we make it look easy, but it is complicated. But what we do is we use the existing CPython interpreter, so it's running its own bytecodes, and that's how it provides full compatibility. And then it gives us CPython objects, and we use those objects as is. And so that way we're fully compatible with all the CPython objects, and, you know, it's not just the Python part, it's also the C packages, the C libraries underneath them, because they're often hybrid. And so we can fully run and we're fully compatible with all that. And the way we do that is that we have to play by the rules, right? And so we keep objects in that
representation when they're coming from that world. What's the representation that's being used in memory? You'd have to know a lot about how the CPython interpreter works. It has, for example, reference counting, but also different rules on how to pass pointers around and things like this. Super low level, fiddly, and it's not like Python, it's like how the interpreter works, okay? And so that gets all exposed out, and then you have to define wrappers around the low-level C code, right? And so what this means is you have to know not only C, which is a different world from Python, obviously, not only Python, but the wrappers, but the interpreter and the wrappers and the implementation details and the conventions, and it's just this really complicated mess. And when you do that, now suddenly you have a debugger that debugs Python, and it can't step into C code, right? So you have this two-world problem, right? And so by pulling this all into Mojo, what you get is you get one world. You get the ability to say, cool, I have untyped, very dynamic, beautiful, simple code. Okay, I care about performance, for whatever reason, right? There's lots of reasons you might care. And so then you add types, you can
parallelize things, you can vectorize things, you can use these techniques, which are general techniques to solve a problem. And then you can do that by staying in the system. And if you have that one Python package that's really important to you, you can move it to Mojo, and you get massive performance benefits on that, and other advantages. You know, if you like static types, it's nice if they're enforced, some people like that, right, rather than being hints. So there's other advantages too. And then you can do that incrementally as you go. So one different perspective on this would be: why Mojo instead of making CPython faster or redesigning CPython? Yeah, well, I mean, you can argue Mojo is redesigning CPython. But why not make CPython faster and better and other things like that? There's lots of people working on that. So actually there's a team at Microsoft that is really improving... I think CPython 3.11 came out in October or something like that, and it was, you know, 15 percent faster, 20 percent faster across the board, which is pretty huge given how mature Python is and things like this. And so that's awesome, I love it.
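The earlier remark about static types being nice "if they're enforced ... rather than being hints" can be illustrated in plain CPython. This is only a sketch under assumptions: `scale`, `enforce_ints`, and `strict_scale` are hypothetical names invented here, and the decorator is a deliberately crude runtime check, not how Mojo enforces types. It shows that standard Python records annotations without checking them at call time.

```python
def scale(x: int, factor: int) -> int:
    """Annotated as int -> int, but CPython does not check this at call time."""
    return x * factor


def enforce_ints(func):
    """Hypothetical decorator that turns the int hints into a runtime check."""
    def wrapper(*args):
        for arg in args:
            if not isinstance(arg, int):
                raise TypeError(f"expected int, got {type(arg).__name__}")
        return func(*args)
    return wrapper


# Same function, but now a non-int argument is rejected instead of accepted.
strict_scale = enforce_ints(scale)
```

Calling `scale("ab", 2)` runs without complaint and returns a string, while `strict_scale("ab", 2)` raises `TypeError`. An enforced type system moves that failure from run time to compile time.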
It doesn't run on GPU, it doesn't do AI stuff, like it doesn't do vectors, doesn't do things... I mean, 20 percent is good, 35,000 times is better, right? So, like, they're definitely... I'm a huge fan of that work, by the way, and it composes well with what we're doing. And so it's not like we're fighting or anything like that. It's actually just goodness for the world, but it's just a different path, right? And again, we're not working forwards from making Python a little bit better, we're working backwards from what is the limit of physics. What's the process of porting Python code to Mojo? What's involved in that process? Is there tooling for that? Not yet. So we're missing some basic features right now, and so we're continuing to drop out new features on, like, a weekly basis. But, you know, in the fullness of time, give us a year and a half, maybe two years. Is it an automatable process? So when we're ready, it'll be very automatable, yes. Is it automatable? Like, is it possible to automate, in the general case, the Python-to-Mojo conversion? Yeah. Well, you're saying it's possible? Well, so, and this is why, I mean, among other reasons, why we use tabs. Yes. Right. So first of all, by being a
superset, yep, it's like C versus C++. Can you move C code to C++? Yes. Yeah, right. You can move C code to C++, and then you can adopt classes, you can adopt templates, you can adopt references or whatever C++ features you want after you move C code to C++. Like, you can't use templates in C, right? And so if you leave it as C, fine, you can't use the cool features, but it still works, right? And C and C++ code work together. And so that's the analogy here, right? It's not that Python is bad and Mojo is good, right? Mojo just gives you superpowers, right? And so if you want to stay with Python, that's cool. But the tooling should be actually very beautiful and simple, because we're doing the hard work of defining a superset. Right. So there's several things to say there, but also the conversion tooling should probably give you hints as to, like, how you can improve the code. Yeah, exactly. Once you're in the new world, then you can build all kinds of cool tools to say, like, hey, should you adopt this feature? And we haven't built those tools yet, but I fully expect those tools will exist, and
then you can, like, you know, quote, modernize your code, or however you want to look at it, right? So I mean, one of the things that I think is really interesting about Mojo is that there have been a lot of projects to improve Python over the years, everything from, you know, getting Python to run on the Java virtual machine, to PyPy, which is a JIT compiler. There's tons of these projects out there that have been working on improving Python in various ways. They fall into one of two camps. So PyPy is a great example of a camp that is trying to be compatible with Python. Even there, not really: it doesn't work with all the C packages and stuff like that. But they're trying to be compatible with Python. There's also another category of these things where they're saying, well, Python is too complicated, and, you know, I'm going to cheat on the edges. Like, integers in Python can be an arbitrary-size integer. If you care about it fitting in and going fast on a register in a computer, that's really annoying, right? And so you can choose to pass on that, right? You can say, well, people don't really use big integers that often, therefore I'm going to just not do it, and it'll be fine. Not a Python superset. Or you can do
the hard thing and say, okay, this is Python. You can't be a superset of Python without being a superset of Python. And that's a really hard technical problem, but it's, in my opinion, worth it, right? And it's worth it because it's not about any one package, it's about this ecosystem, it's about what Python means for the world. And it also means we don't want to repeat the Python 2 to Python 3 transition. Like, we want people to be able to adopt this stuff quickly. And so by doing that work, we can help lift people. Yeah, the challenge, it's a really interesting technical, philosophical challenge of really making a language a superset of another language. That's breaking my brain a little bit. Well, it paints you into corners. So again, I'm very happy with Python, right? So all joking aside, I think that the indentation thing is not the actual important part of the problem. Yes. Right. But the fact that Python has amazing dynamic metaprogramming features, and they translate to beautiful static metaprogramming features, I think is profound. I think that's huge, right? And so Python, I've talked with Guido about this, it was not designed to do what we're doing. That was
not the reason they built it this way. But because they really cared and they were very thoughtful about how they designed the language, it scales very elegantly in this space. But if you look at other languages, for example C and C++, right, if you're building a superset, you get stuck with the design decisions of the subset, right? And so, you know, C++ is way more complicated because of C and the legacy than it would have been if they would have theoretically designed a from-scratch thing. And there's lots of people right now that are trying to make C++ better and re-syntax C++. It's going to be great, we'll just change all the syntax. But if you do that, now suddenly you have zero packages. You don't have compatibility. So, if you could just linger on that, what are the biggest challenges of keeping that superset status? What are the things you're struggling with? Does it all boil down to having a big integer? No, I mean, it's one of the other things. Usually it's the long tail of weird things. So let me give you a war story. Okay. So a war story in this space
is, you go way back in time. A project I worked on is called Clang. Clang, what it is, is a C and C++ parser, right? And when I started working on Clang, it must have been like 2006 or something, 2007 or 2006 when I first started working on it, right? It's funny how time flies. Yeah. I started that project and I'm like, okay, well, I want to build a C parser, a C++ parser for LLVM. It's going to be the... GCC is yucky, you know, this is me in earlier times. It's yucky, it's unprincipled, it has all these weird features, like all these bugs. It's yucky. So I'm going to build a standards-compliant C and C++ parser. It's going to be beautiful, it'll be amazing, well engineered, all the cool things an engineer wants to do. And so I started implementing, building it out, building it out, building it out, and then I got to #include <stdio.h>. And all of the headers in the world use all the GCC stuff. Okay. And so again, come back away from theory, back to reality, right? I was at a fork in the road. I could have built an amazingly beautiful academic thing that nobody would ever use, or I could say, well, it's yucky in various ways, all these design mistakes,
accidents of history, the legacy. At that point GCC was like over 20 years old, which, by the way, now LLVM's over 20 years old. Yeah. It's funny how time catches up to you, right? And so you say, okay, well, what is easier, right? I mean, as an engineer, it's actually much easier for me to go implement long-tail compatibility weird features, even if they're distasteful, and just do the hard work and, like, figure it out, reverse engineer, understand what it is, write a bunch of test cases, like, try to understand the behavior. It's way easier to do all that work as an engineer than it is to go talk to all C programmers and argue with them and try to get them to rewrite their code. Yeah. Right. Because that breaks a lot more things. Yeah. And you have realities like nobody actually even understands how the code works, because it was written by the person who quit 10 years ago, right? And so software is kind of frustrating that way, but that's how the world works, right? Yeah, unfortunately it can never be this perfect, beautiful thing. Well, there are occasions in which you get to build, like, you know, you invent a new data structure or something like that, or there's this beautiful algorithm that just, like, makes you super happy, right? I
love that moment. But when you're working with people and you're working with code and dusty-deck code bases and things like this, right, it's not about what's theoretically beautiful, it's about what's practical, what's real, what people will actually use. And I don't meet a lot of people that say, I want to rewrite all my code just for the sake of it. By the way, there could be interesting possibilities, and we'll probably talk about it, where AI can help rewrite some code. That might be a farther-out future, but it's a really interesting one, how that could be a tool in the battle against this monster of complexity that you mentioned. Yeah. Guido, the benevolent dictator for life of Python, what does he think about Mojo? Have you talked to him much about it? I have talked with him about it. He found it very interesting. We actually talked with Guido before it launched, and so he was aware of it before it went public. I have a ton of respect for Guido for a bunch of different reasons. You talked about the walrus operator, and, like, Guido's pretty amazing in terms of steering such a huge and diverse community and
like, driving it forward, and I think Python is what it is thanks to him, right? And so to me it was really important, starting to work on Mojo, to get his feedback and get his input and get his eyes on this, right? Now, a lot of what Guido was and is, I think, concerned about is how do we not fragment the community. Yeah. We don't want a Python 2 to Python 3 thing. Like, that was really painful for everybody involved. And so we spent quite a bit of time talking about that and some of the tricks I learned from Swift, for example. So in the migration to Swift, we managed to not just convert Objective-C into a slightly prettier Objective-C, which we did, we then converted, not entirely, but almost an entire community to a completely different language, right? And so there's a bunch of tricks that you learn along the way that are directly relevant to what we do. And so this is where, for example, you leverage CPython while bringing up the new thing. Like, that approach is, I think, proven, and comes from experience. And so Guido was very interested in, like, okay, cool. Like, I think that Python is really his legacy, it's his baby. I have tons of respect for that. Incidentally, I see Mojo
as a member of the Python family. I'm not trying to take Python away from Guido and from the Python community. And so to me it's really important that we're a good member of that community. And so, yeah, I think that, again, you would have to ask Guido this, but I think that he was very interested in this notion of, like, cool, Python gets beat up for being slow. Maybe there's a path out of that, right? And, you know, if the future is Python, right, I mean, look at the far outside case on this, right? And I'm not saying this is Guido's perspective, but, you know, there's this path of saying, like, okay, well, suddenly Python can go all the places it's never been able to go before, right? And that means that Python can go even further and can have even more impact on the world. So in some sense, Mojo could be seen as Python 4.0? I would not say that. I think that would drive a lot of people really crazy. Because of the PTSD of the 3.0, 2.0. I'm willing to annoy people about Emacs versus Vim, or tabs versus spaces. That one, I don't know, that might be a little bit far even for me, like, my skin may not be that thick. But the point is the step to it being a superset and allowing all
these capabilities, I think, is the evolution of a language. It feels like an evolution of a language. So he's interested by the ideas that you're playing with, but also concerned about the fragmentation. So what are the ideas you've learned, what are you thinking about, how do we avoid fragmenting the community, where the Pythonistas and the, I don't know what to call the Mojo people. Magicians. The magicians, yeah, I like it. Can coexist happily and share code and basically just have these big code bases that are using CPython and more and more moving towards Mojo? Well, so again, these are lessons I learned from Swift, and here we face very similar problems, right? In Swift, you have Objective-C, super dynamic. They have very different syntax, right? But you're talking to people who have large-scale code bases. I mean, Apple's got the biggest, largest-scale code base of Objective-C code, right? And so, you know, none of the companies, none of the iOS developers, none of the other developers want to rewrite everything all at once. And so you want to be able to adopt things a piece at a time. And so a thing that I found that worked very well in the Swift community was saying, okay, cool, and this is when Swift was very young, you say, okay, you have a million line
of code Objective-C app. Don't rewrite it all, but when you implement a new feature, go implement that new class using Swift, right? And so now this turns out to be a very wonderful thing for an app developer, but it's a huge challenge for the compiler team and the systems people that are implementing it. That's right. And this comes back to: what is this trade-off between doing the hard thing that enables scale versus doing the theoretically pure and ideal thing, right? And so Swift adopted and built a lot of different machinery to deeply integrate with the Objective-C runtime, and we're doing the same thing with Python, right? Now, what happened in the case with Swift is that Swift as a language got more and more and more mature over time, right? And incidentally, Mojo is a much simpler language than Swift in many ways, and so I think that Mojo will develop way faster than Swift for a variety of reasons. But as the language gets more mature, in parallel with that, you have new people starting new projects, right? And so when the language is mature and somebody's starting a new project, that's when they say, okay, cool, I'm not dealing with a million lines of code, I'll just start and use the new
thing for my whole stack. Now, the problem is, again, you come back to where we're communities and we're people that work together. You build a new subsystem or a new feature or a new thing in Swift, or you build a new thing in Mojo, then you want it to end up being used on the other side, right? And so then you need to work on integration back the other way. And so it's not just Mojo talking to Python, it's also Python talking to Mojo, right? And so what I would love to see, and I don't want to see this next month, right, but what I want to see over the course of time, is I would love to see people that are building these packages, like, you know, NumPy or TensorFlow, these packages that are half Python, half C++, say, okay, cool, I want to get out of this Python and C++ world into a unified world, and so I can move to Mojo. But I can't give up on my Python clients, because these libraries get used by everybody, and they're not all going to switch all at once, and maybe never, right? Well, so the way we should do that is we should vend Python interfaces to the Mojo types. And that's what we did in Swift, and it worked great. I mean, it was a huge implementation challenge for the
compiler people, right? But there's only a dozen of those compiler people, and there are millions of users. And so it's a very expensive, capital-intensive, like, skill-set-intensive problem, but once you solve that problem, it really helps adoption. It really helps the community progressively adopt technologies. And so I think that this approach will work quite well with the Python and the Mojo world. So for a package, port it to Mojo and then create a Python interface? Yep. So just to linger on these packages, NumPy, PyTorch, and TensorFlow, how do they play nicely together? So is Mojo supposed to be... let's talk about the machine learning ones. Is Mojo's kind of vision to replace PyTorch or TensorFlow, to incorporate them? What's the relationship in this? All right, so take a step back. So I wear many hats. So you're angling in on the Mojo side. Yes. Mojo is a programming language, and so it can help solve the C, C++, Python feud that's happening. The fire emoji got me, I'm sorry. We should be talking about Modular. Yes, yes, yes. Okay, so the fire emoji is amazing, I love it. It's a big deal. The other side of this is the fire emoji is in service of solving some big AI problems. Yes. Right. And so the big AI problems are, again, this fragmentation,
this hardware nightmare, this explosion of new potential, but that's not getting felt by the industry, right? And so when you look at how does the Modular engine help TensorFlow and PyTorch, right, it's not replacing them, right? In fact, when I talk to people, again, they don't like to rewrite all their code. You have people that are using a bunch of PyTorch, a bunch of TensorFlow. They have models that they've been building over the course of many years, right? And when I talk to them, there's a few exceptions, but generally they don't want to rewrite all their code, right? And so what we're doing is we're saying, okay, well, you don't have to rewrite all your code. What happens is the Modular engine goes in there and goes underneath TensorFlow and PyTorch. It's fully compatible, and it just provides better performance, better predictability, better tooling. It's a better experience that helps lift TensorFlow and PyTorch and make them even better. I love Python, I love TensorFlow, I love PyTorch, right? This is about making the world better, because we need AI to go further. But if I have a process that trains a model, and I have a process that performs inference on that model, and I have the model itself, what should I do with that in the
long arc of history? In terms of, if I use PyTorch to train it, should I rewrite stuff in Mojo, if I care about performance? Well, so, I mean, again, it depends. So if you care about performance, then writing it in Mojo is going to be way better than writing it in Python. But if you look at LLM companies, for example, so you look at OpenAI, rumored, and you look at many of the other folks that are working on many of these LLMs and other, like, innovative machine learning models, on the one hand they're innovating in the data collection and the model, billions of parameters, and the model architecture and the RLHF and, like, all the cool things that people are talking about. But on the other hand, they're spending a lot of time writing CUDA kernels. So you say, wait a second, how much faster could all this progress go if they were not having to handwrite all these CUDA kernels, right? And so there are a few technologies that are out there, and people have been working on this problem for a while, and they're trying to solve subsets of the problem, again kind of fragmenting the space. And so what Mojo provides for these kinds of companies is the ability to say, cool, I can have a unifying theory,
right? Again, the better-together, the unifying theory, the two-world problem or the three-world problem or the N-world problem, like, this is the thing that is slowing people down. And so as we help solve this problem, I think it'll be very helpful for making this whole cycle go faster. So obviously we've talked about the transition from Objective-C to Swift. You've designed this programming language, and you've also talked quite a bit about the use of Swift for machine learning contexts. Why have you decided to move away from maybe an intense focus on Swift for the machine learning context versus sort of designing a new programming language that happens to be a superset of Python? This is an irrational set of life choices I make? Did you go to the desert and did you meditate on it? Okay, all right. No, it was bold, it was bold and needed. And I think, I mean, it's just bold, and sometimes to take those leaps is a difficult leap to take. Yeah. Well, so, okay, I mean, I think there's a couple of different things. So actually I left Apple back in 2017, like January 2017. So it's been a number of years since I left Apple, and the reason I left Apple was to do AI. Okay. So, and again, I won't comment on Apple and AI, but at the time, right, I wanted to get into and understand
the technology, understand the applications, the workloads. And so I was like, okay, I'm going to go dive deep into applied AI and then the technology underneath it, right? I found myself at Google. And that was like when TPUs were, yep, waking up. Exactly. And so I found myself at Google, and Jeff Dean, who's a rock star, as you know, right, and in 2017 TensorFlow was, like, really taking off and doing incredible things. And I was attracted to Google to help them with the TPUs, right? And TPUs are an innovative hardware accelerator platform that has now, I mean, I think, proven massive scale and, like, done incredible things, right? And so one of the things that this led into is a bunch of different projects, which I'll skip over, right? One of which was this Swift for TensorFlow project, right? And so that project was a research project, and so the idea of that is to say, okay, well, let's look at innovative new programming models where we can get a fast programming language, we can get automatic differentiation into the language. Let's push the boundaries of these things in a research setting, right? Now, that project, I think, lasted two, three years. There's some really cool outcomes of that. So one of the things that's
really interesting is I published a talk at an LLVM conference in 2018, again, that seems like so long ago, about graph program extraction, which is basically the thing that's in PyTorch 2. And so PyTorch 2, with all this Dynamo thing, it's all about this graph program extraction thing from Python bytecodes. And so a lot of the research that was done ended up going out through the industry and influencing things, and I think it's super exciting and awesome to see that. But the Swift for TensorFlow project itself did not work out super well. And so there's a couple of different problems with that, one of which is that, you may have noticed, Swift is not Python. There's a few people that write Python code. Yes. And so it turns out that all of ML is pretty happy with Python. It's actually a problem that other programming languages have as well, that they're not Python. We'll probably maybe briefly talk about Julia, which is a very interesting, beautiful programming language, but it's not Python. Exactly. Well, and so, like, if you're saying I'm going to solve a machine learning problem where all the programmers are Python programmers, yeah, and you say the first thing you
have to do is switch to a different language, well, your new thing may be good or bad or whatever, but if it's a new thing the adoption barrier is massive. It's still possible. Still possible, yeah, absolutely. The world changes and evolves, and there's definitely room for new and good ideas, but it just makes it so much harder, right? And so, lesson learned: Swift is not Python, and people are not always in search of learning a new thing for the sake of learning a new thing, and if you want to be compatible with all the world's code, turns out, meet the world where it is, right? Second thing is that, you know, a lesson learned is that Swift, as a very fast and efficient language, kind of like Mojo but a different take on it still, really worked well with eager mode. And so eager mode is something that PyTorch does, and it proved out really well, and it enables really expressive and dynamic and easy-to-debug programming. TensorFlow at the time was not set up for that. Let's say that was not, the timing is also important in this world. Yeah, yeah. And TensorFlow is a good thing and it
has many, many strengths, but you could say Swift for TensorFlow is a good idea except for the Swift and except for the TensorFlow part. Swift because it's not Python, and TensorFlow because it wasn't set up for eager mode at the time. Yeah, it was 1.0. Exactly, yeah. And so one of the things about that is, in the context of it being a research project, I'm very happy with the fact that we built a lot of really cool technology, we learned a lot of things, and I think the ideas went on to have influence in other systems like PyTorch. A few people use that, I hear, right? And so I think that's super cool, and for me personally I learned so much from it, right? And I think a lot of the engineers that worked on it also learned a tremendous amount, and so, you know, I think that's just really exciting to see. And, you know, I'm sorry that the project didn't work out, I wish it did, of course, right? But, you know, it's a research project, and so you're there to learn from it. But it's interesting to think about the evolution of programming as we come up with this whole new set of algorithms in machine learning, in artificial intelligence, and what's going
to win out, because it could be a new programming language. Yeah. It could be, I mean, I just mentioned Julia; I think there's a lot of ideas behind Julia that Mojo shares. What are your thoughts about Julia in general? So I will have to say that when we launched Mojo, one of the biggest things I didn't predict was the response from the Julia community. And so, okay, let me take a step back. I've known the Julia folks for a really long time; they were an adopter of LLVM a long time ago, they've been pushing the state of the art in a bunch of different ways, and Julia is a really cool system. I had always thought of Julia as being mostly a scientific-computing-focused environment, right? And I thought that was its focus. I neglected to understand that one of their missions is to, like, help make Python work end to end, and so I think that was my error for not understanding that, and so I could have been maybe more sensitive to that. But there's major differences between what Mojo's doing and what Julia is doing. So, as you say, Julia
is not Python, right? And so one of the things that a lot of the Julia people came out and said is, like, okay, well, if we put a ton more energy and a ton more money or engineering or whatever into Julia, maybe that would be better than starting Mojo, right? Well, I mean, maybe that's true, but it still wouldn't make Julia into Python. So if you work backwards from the goal of, let's build something for Python programmers without requiring them to relearn syntax, then Julia just isn't there, right? I mean, that's a different thing, right? And so if you anchor on, I love Julia and I want Julia to go further, then you can look at it from a different lens. But the lens we were coming at it from was: hey, everybody is using Python, Python's syntax isn't broken, let's take what's great about Python and make it even better. And so it's just a different starting point. So I think Julia is a great language, the community is a lovely community, they're doing really cool stuff, but it's just a slightly different angle. But it does seem that Python is quite sticky. Is there some philosophical, almost, thing you could say about why Python, by many measures, seems to be the most popular programming
language in the world? Well, I can tell you things I love about it; maybe that's one way to answer the question, right? So: huge package ecosystem, super lightweight and easy to integrate. It has very low startup time, right? So what startup time, you mean learning curve or what? Yeah, so if you look at certain other languages, you know, you say like Go, and it just takes a, like Java, for example, it takes a long time to compile all the things, and then the VM starts up and the garbage collector kicks in and then it revs its engines, and then it can plow through a lot of internet stuff or whatever, right? Python is like scripting, like it just goes, right? Python has very low compile time, so you're not sitting there waiting. Python integrates into notebooks in a very elegant way that makes exploration super interactive, and it's awesome, right? Python is also almost the glue of computing, because it has such a simple object representation; a lot of things plug into it. That dynamic metaprogramming thing we were talking about also enables really expressive and beautiful APIs, right? So there's lots of reasons that you can look at technical things that Python has done and say, like, okay, well, this is actually
a pretty amazing thing, and any one of those you can neglect; people all just talk about indentation and ignore the fundamental things. But then you also look at the community side, right? So Python owns machine learning. Machine learning is pretty big. Yeah, and it's growing. And it's growing, right? And it's growing in importance, right? And there's a reputation of prestige to machine learning, to where, if you're a new programmer, you're thinking about which programming language do I use; well, I should probably care about machine learning, therefore let me try Python, and it kind of builds and builds. And even go back before that: my kids learn Python, probably not because I'm telling them to learn Python, but because, were they rebelling against you or what? No, no, well, they also learn Scratch, right, and things like this too, but it's because Python is taught everywhere, right? Because it's easy to learn, right, and because it's pervasive, right? Back in my day we learned Java and C++. Yeah, but uphill both directions. But yes, I guess Python is the main language of teaching software engineering in schools now. Yeah, well, if you look at this, there's these growth cycles, right? If you look at what causes things to become
popular and then gain in popularity, there's reinforcing feedback loops and things like this, and I think Python has done, again, the whole community has done a really good job of building those growth loops and helping propel the ecosystem. And I think that, again, you look at what you can get done with just a few lines of code, it's amazing. So this kind of self-building loop, it's interesting to understand, because when you look at Mojo, what it stands for, some of the features, it seems sort of clear that this is a good direction for programming languages to evolve in the machine learning community, but it's still not obvious that it will, because of this, whatever, the engine of popularity, of virality. Is there something you could speak to, like, how do you get people to switch? Yeah, well, I mean, I think that the viral growth loop is to switch people to Unicode. Yeah. I think the Unicode file extensions are what I'm betting on; I think that's going to be the thing. Yeah, tell the kids that you could use the fire emoji. Exactly, exactly. Well, in all seriousness, I mean, I think there's really, I'll give you two opposite answers. One is
I hope, if it's useful, if it solves problems, and if people care about those problems being solved, they'll adopt the tech, right? That's kind of the simple answer. And when you're looking to get tech adopted, the question is: is it solving an important problem people need solved, and is the adoption cost low enough that they're willing to make the switch and cut over and do the pain up front so they can actually do it, right? And so hopefully Mojo will be that for a bunch of people. And, you know, people building these hybrid packages are suffering; it's really painful, and so I think that we have a good shot at helping people. But the other side is, like, it's okay if people don't use Mojo. It's not my job to say everybody should do this. I'm not saying Python is bad. I hope Python, CPython, all these implementations, because the Python ecosystem is not just CPython, it's also a bunch of different implementations with different trade-offs, and this ecosystem is really powerful and exciting, as are other programming languages. It's not like TypeScript or something is going to go away, right? And so there's not a winner-take-all thing, and so I hope that Mojo is exciting and
useful to people, but if it's not, that's also fine. But I also wonder what the use case for why you should try Mojo would be. So practically speaking, yeah, it seems like, so there's entertainment, there's a dopamine hit of saying, holy, this is 10 times faster, this little piece of code is 10 times faster in Mojo out of the box, before you get to 35,000. Exactly. I mean, just even that, that's the dopamine hit that every programmer sort of dreams of, the optimization. It's also the drug that can pull you in and have you waste way too much of your life optimizing and over-optimizing, right? But so what do you see it would be, like, commonly? This is very hard to predict, of course, but, you know, if you look 10 years from now and Mojo's super successful, what do you think would be the thing where people try it and then use it regularly and it kind of grows and grows and grows? Well, let's say, you talk about the dopamine hit, and so again, humans are not one thing, and some people love rewriting their code and learning new things and throwing themselves in the deep end and trying out a new thing. In my experience, most people don't.
Like, they're too busy, they have other things going on. By number, most people don't want, like, this, I want to rewrite all my code. But even those people, the too-busy people, the people that don't actually care about the language, that just care about getting stuff done, those people do like learning new things, right? And so you talk about the dopamine rush of 10x faster: wow, that's cool, I want to do that again. Well, it's also like, here's the thing I've heard about in a different domain, and I don't have to rewrite all my code, I can learn a new trick, right? Well, that's called growth, you know? And so one thing that I think is cool about Mojo, and again this will take a little bit of time, for example for the blog posts and the books and all that kind of stuff to develop and the language to get further along, but what we're doing, you talk about types, like, you can say, look, you can start with the world you already know and you can progressively learn new things and adopt them where it makes sense. If you never do that, that's cool, you're not a bad person. If you get really excited about it and want to go all the way in the deep end and want to rewrite everything and
like, whatever, that's cool, right? But I think the middle path is actually the more likely one, where, you know, you come out with a new idea and you discover, wow, that makes my code way simpler, way more beautiful, way faster, way whatever, and I think that's what people like. Now, if you fast forward and you said, like, 10 years out, right, I can give you a very different answer on that, which is, I mean, if you go back and look at what computers looked like 20 years ago, every 18 months they got faster for free, right? 2x faster every 18 months; it was like clockwork, it was free, right? You go back 10 years ago and we entered this world where suddenly we had multi-core CPUs and we had GPUs. And if you squint and turn your head, what a GPU is, it's just a many-core, very simple CPU thing, kind of, right? And so 10 years ago it was CPUs and GPUs and graphics. Today we have CPUs, GPUs, graphics, and AI, because it's so important, because the compute is so demanding, because of the smart cameras and the watches and all the different places the AI needs to work in our lives. It's caused this explosion of hardware. And so part of my thesis, part of my belief of where computing goes, if you
look out 10 years from now, is it's not going to get simpler. Physics isn't going back to where we came from; it's only going to get weirder from here on out, right? And so to me the exciting part about what we're building is it's about building that universal platform, which the world can continue to get weird, because again, I don't think it's avoidable, it's physics, but we can help lift people, scale, do things with it, and they don't have to rewrite their code every time a new device comes out. And I think that's pretty cool. And so if Mojo can help with that problem, then I think that it will be hopefully quite interesting and quite useful to a wide range of people, because there's so much potential, and, you know, maybe analog computers will become a thing or something, right? And we need to be able to get into a mode where we can move this programming model forward, but do so in a way where we're lifting people and growing them instead of forcing them to rewrite all their code and exploding them. Do you think there will be a few major libraries that go Mojo first? Well, so I mean the Modular engine's all Mojo. So again, come back to, like, we're not building Mojo because it's fun, we're building Mojo because we had to, to solve these accelerators. That's the
origin story. But I mean ones that are currently in Python. Yeah, so I think that a number of these projects will. And so one of the things, again this is just my best guess, like, each of the package maintainers also has, I'm sure, plenty of other things going on; people really don't like rewriting code just for the sake of rewriting code. But sometimes people are excited about adopting a new idea. Yeah. It turns out that while rewriting code is generally not people's first thing, redesigning something while you rewrite it, and using a rewrite as an excuse to redesign, can lead to the 2.0 of your thing that's way better than the 1.0, right? And so I have no idea, I can't predict that, but there's a lot of these places where, again, if you have a package that is half C and half Python, right, you just solve the pain, make it easier to move things faster, make it easier to debug and evolve your tech; adopting Mojo kind of makes sense to start with, and then it gives you this opportunity to rethink these things. So the two big gains are that there's a performance gain, and then there's the portability to all kinds of different devices. And there's safety, right? So you talk about real types.
I mean, not saying this is for everybody, but that's actually a pretty big thing, right? Yeah, types are. And so there's a bunch of different aspects of what value Mojo provides. And so, I mean, it's funny for me, like, I've been working on these kinds of technologies and tools for too many years now. But you look at Swift, right, and we talked about Swift for TensorFlow, but Swift as a programming language, right? Swift is now 13 years old from when I started it. Yeah, because I started it in 2010, if I remember. And so that project, and I was involved with it for 12 years or something, right, that project has gone through its own really interesting story arc, right? And it's a mature, successful system used by millions of people, right? Certainly not dead yet, right? But also, going through that story arc, I learned a tremendous amount about building languages, about building compilers, about working with community and things like this, and so that experience I'm helping channel and bring directly into Mojo. And, you know, other systems, same thing; apparently I like building and iterating and evolving things. And so you look at this LLVM thing I worked on 20 years ago, you look at MLIR, right? And so a lot of
the lessons learned in LLVM got fed into MLIR, and I think that MLIR is a way better system than LLVM was. And, you know, Swift is a really good system and it's amazing, but I hope that Mojo will take the next step forward in terms of design. In terms of running Mojo, people can play with it; what's Mojo Playground? Yeah. And from the interface perspective and from the hardware perspective, what's this incredible thing running on? Yeah, so right now, so here we are two weeks after launch. Yes. We decided that, okay, we have this incredible set of technology that we think might be good, but we have not given it to lots of people yet, and so we were very conservative and said, let's put it in a notebook so that if it crashes we can do something about it, we can monitor and track that, right? So again, things are still super early, but we're having like one person a minute sign up, with over 70,000 people two weeks in. It's kind of crazy. So you can sign up to the Playground and you can use it in the cloud. Yeah, in your browser. And so what that's running on, a notebook, yeah, what that's running on is cloud VMs, and so you share a machine
with a bunch of other people, but it turns out there's a bunch of them now because there's a lot of people. And so what you're doing is you're getting free compute and you're getting to play with this thing in kind of a limited, controlled way, so that we can make sure that it doesn't totally crash and be embarrassing, right? Yeah. So now a lot of the feedback we've gotten is people want to download it and run it locally, so we're working on that right now. And so that's the goal, to be able to download it locally? Yeah, that's what everybody expects, and so we're working on that right now, and we just want to make sure that we do it right. And I think this is one of the lessons I learned from Swift also, by the way, is that when we launched Swift, gosh, it feels like forever ago, it was 2014, I mean, it was super exciting. We, the team, had worked on Swift for a number of years in secrecy, okay? And four years into this development, roughly, of working on this thing, at that point about 250 people at Apple knew about it. Yeah. Okay, so secret; Apple's good at secrecy, and it was a secret project. And so we launched this at WWDC, a bunch of hoopla and excitement, and said developers are going to be able
to develop and submit apps to the App Store in three months. Okay, well, several interesting things happened, right? So first of all we learned that it had a lot of bugs and it was not actually production quality, and it was extremely stressful in terms of trying to get it working for a bunch of people. And so what happened was we went from zero to, you know, I don't know how many developers Apple had at the time, but a lot of developers, overnight, and they ran into a lot of bugs, and it was really embarrassing and it was very stressful for everybody involved, right? It was also very exciting, because everybody was excited about that. The other thing I learned is that when that happened, roughly every software engineer at Apple who did not know about the project, their head exploded when it was launched, because they didn't know it was coming. And so they're like, wait, what is this? I signed up to work for Apple because I love Objective-C; why is there a new thing, right? And so now what that meant practically is that the push from launch to, first of all, the fall, but then to 2.0 and 3.0 and like all the way forward, was super painful for the engineering team and myself. It was very stressful. The developer community was very grumpy about it, because they're like, okay, well,
wait a second, you're changing and breaking my code, and, like, we have to fix the bugs, and it was just a lot of tension and friction on all sides. There's a lot of technical debt in the compiler, because we had to run really fast: you have to go implement the thing and unblock the use case and do the thing, and, you know, it's not right, but you never have time to go back and do it right. And I'm very proud of the Swift team, because they've come, I mean we, but they came so far and made so much progress over this time since launch. It's pretty incredible, and Swift is a very, very good thing, but I just don't want to do that again, right? And so, iterate more through the development process. And so what we're doing is we're not launching it when it's hopefully 0.9 with no testers; we're launching it and saying it's 0.1, right? And so we're setting expectations of saying, like, okay, well, don't use this for production, right? If you're interested in what we're doing, we'll do it in an open way and we can do it together, but don't use it in production yet. Like, we'll get there, but let's do it the right way. And I'm also saying we're not in a race. The thing that I want to do is build the world's best thing. Yeah. Right, because if you do it right and it lifts the
industry, it doesn't matter if it takes an extra two months. Yeah. Like, two months is worth waiting. And so doing it right and not being overwhelmed with technical debt and things like this is, like, again, war wounds, lessons learned, whatever you want to say, I think is absolutely the right thing to do, even though right now people are very frustrated that, you know, you can't download it or it doesn't have feature X or something like this. And so what have you learned in the little bit of time since it's been released into the wild, that people have been complaining about, feature X or Y or Z? What have they been complaining about, what have they been excited about? Like, almost detailed things versus big; I think everyone would be very excited about the big vision. Yeah, yeah, well, so I mean I've been very pleased, and in fact, I mean, we've been massively overwhelmed with response, which is a good problem to have. It's kind of like a success disaster, in a sense, right? And so, I mean, if you go back in time, when we started Modular, which is just not yet a year and a half ago, so it's still a pretty new company, new team, small but very good team of people, like, we started with extreme conviction that
there's a set of problems we need to solve, and if we solve them then people will be interested in what we're doing, right? But again, you're building in basically secret, right? You're trying to figure it out. Creation's a messy process; you're having to go through different paths and understand what you want to do and how to explain it. Often when you're doing disruptive and new kinds of things, just knowing how to explain it is super difficult, right? And so when we launched, we hoped people would be excited, but, you know, I'm an optimist, but I also don't want to get ahead of myself. And so when people found out about Mojo, I think their heads exploded a little bit, right? And, you know, here's, I think, a pretty credible team that has built some languages and some tools before, and so they have some lessons learned and are tackling some of the deep problems in the Python ecosystem and giving it the love and attention that it should be getting, and I think people got very excited about that. And so if you look at that, I mean, I think people are excited about ownership and taking a step beyond Rust, right? There's people that are very excited about that. There's people that are excited about, you know, just like, I made Game of Life go 400 times faster,
right, and things like that, and that's really cool. There are people that are really excited about the, okay, I really hate writing stuff in C++, save me. Like systems engineers, they're like stepping up, like, yes. So that's me, by the way, also. I really want to stop writing C++. But I get third-person excitement when people tweet, here, I made this code, Game of Life or whatever, faster, and you're like, yeah. And also, well, I would also say that, let me cast blame out to people who deserve it. Sure. These terrible people who convinced me to do some of this. Yes. Jeremy Howard. Yes, that guy. Well, he's been pushing for this kind of thing; he's wanted this for years. Yeah, he's wanted this for a long time. He's wanted this for years. And so for people who don't know Jeremy Howard, he's like one of the most legit people in the machine learning community. He has grassroots, he really teaches, he's an incredible educator, he's an incredible teacher, but also legit in terms of a machine learning engineer himself. Yeah. I think he's been running fast.ai and looking, I think, for exactly what you've done. Exactly. And so, I mean, the first time, so I met Jeremy pretty early on, but the first time I sat
up and I'm like, this guy is ridiculous, is when I was at Google and we were bringing up TPUs and we had a whole team of people, and there was this competition called DAWNBench of who can train ImageNet fastest, right? Yes. And Jeremy and one of his researchers crushed Google. Yeah. Not through the sheer force of the amazing amount of compute and the number of TPUs and stuff like that, but he just decided that progressive image resizing was the right way to train the model, with fewer epochs, faster, and make the whole thing go vroom, right? Yep. And I'm like, this guy is incredible, right? So you can say, anyways, come back to, you know, where's Mojo coming from: Chris finally listened to Jeremy. It's all his fault. Well, there's a kind of very refreshing, pragmatic view that he has about machine learning that, I don't know if it's like this mix of a desire for efficiency but ultimately grounded in a desire to make machine learning more accessible to a lot of people. I don't know what that is. I guess that's coupled with efficiency and performance, but it's not just obsessed about performance. Well, so a lot of AI and AI research ends up being that it has to go fast enough to get scale. So a
lot of people don't actually care about performance, particularly on the research side, until it allows them to have a bigger data set, right? And so suddenly now you care about distributed compute and all this exotic HPC; like, you don't actually want to know about that, you just want to be able to do more experiments faster and do so with bigger data sets, right? And so Jeremy has been really pushing limits. And one of the things I'll say about Jeremy, and there's many things I could say about Jeremy because I'm a fanboy of his, but it fits in his head. And Jeremy actually takes the time, where many people don't, to really dive deep into why is the beta parameter of the Adam optimizer equal to this. Yeah. Right, and he'll go survey and understand what are all the activation functions and the trade-offs, and why is it that everybody that does, you know, this model picks that thing. So the why, not just trying different values; like, really, what is going on here? Right, and so as a consequence of that, again, he makes time, but he spends time to understand things at a depth that a lot of people don't, and as you say, he then brings it and teaches people. And his mission is to help lift, you know, his website says making AI uncool again, like, it's about, forget about
the hype, it's actually practical and useful, let's teach people how to do this, right? Now the problem Jeremy struggled with is he's pushing the envelope, right? Research isn't about doing the thing that is staying on the happy path or the well-paved road, right? And so a lot of the systems today have been these really fragile, fragmented things, or special-cased in this happy path, and if you fall off the happy path you get eaten by an alligator. So what about, so Python has this giant ecosystem of packages, and there's a package repository. Do you have ideas of how to do that well for Mojo? Yeah. How to do a repository of packages? Well, so that's another really interesting problem that I knew about, but I didn't understand how big of a problem it was: Python packaging. A lot of people have very big pain points and a lot of scars with Python packaging. Oh, you mean, so there's several things: building and distributing, yes, managing dependencies and versioning and all this stuff. So from the perspective of, if you want to create your own package. Yes. Yeah, and then, or you want to build on top of a bunch of other people's packages and then they get updated and it's like this. Now, I'm
not an expert in this, so I don't know the answer. I think this is one of the reasons why it's great that we work as a team and there's other really good and smart people involved. But one of the things I've heard from smart people who've done a lot of this is that the packaging becomes a huge disaster when you get Python and C together. And so if you have this problem where you have code split between Python and C, now not only do you have to package the C code, you have to build the C code. C doesn't have a package manager, right? C doesn't have a dependency versioning management system, right? And so I'm not experienced in the state of the art and all the different Python package managers, but my understanding is that's a massive part of the problem, and I think Mojo solves that part of the problem directly, head on. Now, one of the things I think we'll do with the community, and this is, again, we're not solving all the world's problems at once, we have to be kind of focused to start with, is that I think that we will have an opportunity to reevaluate packaging, right? And so I think that we can come back and say, okay, well, given the new tools and technologies and the cool things we have that we've built up, because we have not just syntax, we have
an entirely new compiler stack that works in a new way, maybe there's other innovations we can bring together, and maybe we can help solve that problem. So, almost a tangent to that question: from the user perspective of packages, it was always surprising to me that it was not easier to sort of explore and find packages, you know, with pip install. It's an incredible ecosystem; it's just interesting that it wasn't made, it's still, I think, not made easier to discover packages, to do, yeah, like, search and discovery, as YouTube calls it. Well, I mean, it's kind of funny, because this is one of the challenges of these, like, intentionally decentralized communities. And so I don't know what the right answer is for Python. I mean, there are many people that, or I don't even know the right answer for Mojo. So there are many people that would have much more informed opinions than I do. But it's interesting, if you look at this, right, open source communities, you know, there's Git. Git is fully decentralized, anybody can do it any way they want, but then there's GitHub, right? And GitHub, a centralized, commercial, in that case, thing, really helped
pull together and help solve some of the discovery problems and help build a more consistent community. And so maybe there's opportunities for something like a GitHub. Yeah. Although even GitHub, I might be wrong on this, but the search and discovery for GitHub is not that great; like, I still use Google Search. Yeah, well, I mean, maybe that's because GitHub doesn't want to replace Google Search, right? And I think there is room for specialized solutions to specific problems. But sure, I don't know, I don't know the right answer for GitHub either; I think they can go figure that out. But the point is to have an interface that's usable, that's accessible to people of all different skill levels. Well, and again, like, what are the benefits of standards, right? Standards allow you to build this next level up of ecosystem, the next level of infrastructure, the next level of things. And so again, come back to, I hate complexity. C plus Python is complicated. It makes everything more difficult to deal with, it makes it difficult to port, move code around, work with; all these things get more complicated. And so, I mean, I'm not an expert, but maybe Mojo can help a little bit by helping reduce the amount of C in this ecosystem and make it therefore scale better. So any kind of
packages that are hybrid in nature would be a natural fit to move to Mojo. Which is a lot of them, by the way.

Yeah, a lot of them, especially the ones doing some interesting stuff computation-wise.

Let me ask you about some features. So we talked about, obviously, the indentation, and that it's a typed language, or optionally typed. Is that the right way to say it?

It's either optionally or progressively or aggressively, I think so. People have very strong opinions on the right word to use. I don't know.

I look forward to your letters. So there's var versus let. Let is for constants; var is an optional...

Yeah, var makes it mutable, so you can reassign.

Okay. Then there's function overloading.

Oh, okay, yeah. I mean, there are a lot of sources of happiness for me, but function overloading...

I guess, is that for performance? Why does Python not have function overloading?

So I can speculate. Python is a dynamic language. Python and Objective-C are actually very similar worlds if you ignore syntax. And so Objective-C is straight-line derived
from Smalltalk, a really venerable, interesting language that much of the world has forgotten about, but the people that remember it generally love it. And the way that Smalltalk works is that every object has a dictionary in it, and the dictionary maps from the name of a function, or the name of a value within an object, to its implementation. And so the way you call a method in Objective-C is you say: the way I call foo is I go look up foo, I get a pointer to the function back, and then I call it. Okay, that's how Python works. And so now the problem with that is that in the dictionary within a Python object, all the keys are strings, and it's a dictionary, so you can only have one entry per name.

You think it's as simple as that?

I think it's as simple as that.

And so why do they never fix this? Like, why do they not change it to not be a dictionary, or do other things?

Well, you don't really have to in Python, because it's dynamic. And so you can say: I get into the function; now, if I got passed an integer, do some dynamic tests for it; if it's a string, go do another thing. There's an additional challenge, which is, even if you did support overloading, you're saying, okay,
well, here's a version of a function for integers and a function for strings. Well, even if you could put it in that dictionary, you'd have to have the caller do the dispatch. And so every time you call the function, you'd have to say: is it an integer, is it a string? And so you have to figure out where to do that test. And so in a dynamic language, overloading is something you generally don't have to have.

But now you get into a typed language. And, you know, in Python, if you subscript with an integer, then you typically get one element out of a collection. If you subscript with a range, you get a different thing out. And so often in typed languages you'll want to be able to express the fact that, cool, I have different behavior depending on what I actually pass into this thing. If you can model that, it can make it safer and more predictable and faster and all these things.

It somehow feels safer, yes, but it also feels empowering in terms of clarity. Like, you don't have to design whole different functions.

Yeah. Well, this is also one of the challenges with the existing Python typing systems: in practice, like, you take subscript, in practice a lot of these functions don't have one signature, right? They actually have different behavior in
different cases. And so this is why it's difficult to retrofit this into existing Python code and make it play well with typing. You kind of have to design for that.

Okay, so there's an interesting distinction that people who program Python might be interested in: def versus fn. So it's two different ways to define a function, and fn is a stricter version of def. What's the coolness that comes from the strictness?

So here you get into what the trade-off is with the superset.

Yes.

Okay, so with a superset, you have to, or you really want to, be compatible. If you're doing a superset, you've decided compatibility with existing code is the important thing, even if some of the decisions they made were maybe not what you'd choose.

Yeah.

Okay, so that means you put a lot of time into compatibility, and it means that you get locked into decisions of the past, even if they may not have been a good thing. Now, systems programmers typically like to control things, right? And they want to make sure that, you know, not in all cases, of course, and even systems programmers are not one thing, but often you want predictability. And so one of the things that Python has, for example, as you know, is that if you define a variable,
you just say x equals four: I have a variable named x. Now I say some long name equals 17; print out some long name. Oops, but I typoed it, right? Well, the Python compiler doesn't know in all cases what you're defining and what you're using, and did you typo the use of it or the definition? And so for people coming from typed languages, again, I'm not saying they're right or wrong, but that drives them crazy, because they want the compiler to tell them: you typoed the name of this thing. And so what fn does is it turns on, as you say, a strict mode. And so it says, okay, well, you have to actually intentionally declare your variables before you use them. That gives you more predictability, more error checking, and things like this. But you don't have to use it. And this is a way that Mojo is compatible, because defs work the same way that defs have always worked, but it provides a new alternative that gives you more control and allows certain kinds of people that have a different philosophy to be able to express that and get that.

But usually, if you're writing Mojo code from scratch, you'll be using fn?

It depends. Again, it depends on your
mentality, right? It's not that def is Python and fn is Mojo. Mojo has both, and it loves both. It really depends on...

It's just strict.

Yeah, exactly. Are you playing around and scripting something out? Is it a one-off throwaway script? Cool, Python is great at that.

I'll still be using fn, but yeah.

Well, so...

I love strictness.

Okay, well, so, control, power. You also like suffering, right?

Yes, they go hand in hand.

How many pull-ups?

I have lost count at this point.

So, I mean, that's cool. I love you for that. And I love other people who like strict things, right? But I don't want to say that that's the right thing, because Python's also very beautiful for hacking around and doing stuff and research and these other cases where you may not want that.

You see, I just feel like, maybe I'm wrong with that, but it feels like strictness leads to faster debugging. So in terms of going, even on a small project, from zero to completion, it's just... I guess it depends how many bugs you generate, usually.

Well, so, I mean, again, lessons learned in looking at the ecosystem. I think if you study some of these languages over time, like the Ruby community, for example. Now, Ruby is a pretty well
developed, pretty established community, but along their path they really invested in unit testing. So I think that the Ruby community has really pushed forward the state of the art of testing, because they didn't have a type system that caught a lot of bugs at compile time. And so you can have the best of both worlds: you can have good testing and good types and things like this. But I thought it was really interesting to see how certain challenges get solved. And in Python, for example, the interactive notebook kind of experiences are really amazing. If you typo something, it doesn't matter; it just tells you, that's fine. And so I think that the trade-offs are very different if you're building a large-scale production system versus building and exploring in a notebook.

And speaking of control, the hilarious thing, if you look at code I write just for myself for fun, is it's littered with asserts everywhere.

Okay.

It's kind of, yeah, you would like to ask... it's basically saying, in a dictatorial way, this should be true now, otherwise everything stops.

And that is the sign, I love you man, but that is a sign of somebody who likes control. Yeah.
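As an illustration of the typo scenario discussed above, here is a minimal Python sketch. The function and variable names are hypothetical and not from the conversation. A misspelled assignment inside a def silently creates a new variable, so nothing complains until an assert or a unit test checks the result.

```python
def total_with_typo(prices):
    """Sum a list, but with a misspelled variable on the update line."""
    running_total = 0
    for p in prices:
        # Typo: 'runing_total' is a brand-new variable, so Python accepts it
        # and 'running_total' is never updated.
        runing_total = running_total + p
    return running_total


def total_fixed(prices):
    """Same function with the name spelled consistently."""
    running_total = 0
    for p in prices:
        running_total = running_total + p
    return running_total


if __name__ == "__main__":
    prices = [3, 4, 5]
    # The buggy version runs without raising and quietly returns the wrong sum.
    print(total_with_typo(prices))
    # An assert says "this should be true now, otherwise everything stops".
    assert total_fixed(prices) == sum(prices)
```

A strict mode that requires variables to be declared before use, as described for fn, would reject the misspelled assignment up front; in plain Python the same bug is usually caught by a test or an assert like the one in the usage section.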
And so yes, I think that you'll like... I think you'll like Mojo.

Therapy session. Yes, I definitely will. Speaking of asserts, exceptions are called errors. Why is it called errors?

So, I mean, we use the same... we're the same as Python, right? But we implement it in a very different way. And so if you look at other languages, like, we'll pick on C++, our favorite, right? C++ has this thing called zero-cost exception handling. Okay, and this is, in my opinion, something to learn lessons from.

It's a nice, polite way of saying it.

And so zero-cost exception handling, the way it works is that it's called zero-cost because if you don't throw an exception, there's supposed to be no overhead for the non-error code. And so it takes the error path out of the common path. It does this by making throwing an error extremely expensive. And so if you actually throw an error with a C++ compiler using exceptions, it has to go look up in tables on the side and do all this stuff. And so throwing an error could be like 10,000 times more expensive than returning from a function. Also, it's called zero-cost exceptions, but it's not zero cost by any stretch of the imagination, because it massively blows
out your code, your binary. It also adds a whole bunch of different paths because of destructors and other things like that that exist in C++, and it reduces the number of optimizations. It has all these effects. And so this thing that was called zero-cost exceptions, it really ain't. Okay, now if you fast-forward to newer languages, and this includes Swift and Rust and Go and now Mojo... well, and Python's a little bit different because it's interpreted, and so it's got a little bit of a different thing going on. But if you look at compiled languages, many newer languages say, okay, well, let's not do that zero-cost exception handling thing. Let's actually treat throwing an error the same as returning a variant: returning either the normal result or an error. Now, programmers generally don't want to deal with all the typing machinery and pushing around a variant. And so you use all the syntax that Python gives us, for example try and catch, and functions that raise, and things like this. You can put a raises decorator on your functions, stuff like this, if you want to control that. And then the language can provide syntax for it, but
under the hood, the way the computer executes it, throwing an error is basically as fast as returning something.

Interesting. So it's exactly the same way from a compiler perspective.

And so this is actually, I mean, it's a fairly nerdy thing, right? Which is why I love it. But this has a huge impact on the way you design your APIs. So in C++, huge communities turn off exceptions because the cost is just so high. The zero-cost cost is so high. And so that means you can't actually use exceptions in many libraries. And even for the people that do use it, well, okay, how and when do you want to pay the cost? If I try to open a file, should I throw an error? Well, what if I'm probing around looking for something, right? I'm looking it up in many different paths. Well, if it's really slow to do that, maybe I'll add another function that doesn't throw an error; it returns an error code instead. And now I have two different versions of the same thing. And so it causes you to fork your APIs. And so, you know, one of the things I learned from Apple, and I still love, is that the art of API design is actually really profound. I think this is something that Python's also done a pretty good job at in terms of building out this
large-scale package ecosystem. It's about having standards and things like this. And so, you know, we wouldn't want to enter a mode where there's this theoretical feature that exists in the language but people don't use it in practice. Now, I'll also say one of the other really cool things about this implementation approach is that it can run on GPUs and it can run on accelerators and things like this. That standard zero-cost exception thing would never work on an accelerator. And so this is also part of how Mojo can scale all the way down to little embedded systems and to running on GPUs and things like that.

Can you actually say... is there some high-level way to describe the challenge of exceptions and how they work in code during compilation? So it's just this idea of percolating up a thing, an error?

Yeah. So the way to think about it is: think about a function that doesn't return anything, just as a simple case. And so you have function one calls function two, calls function three, calls function four, and along that call stack there are try blocks. And so if you have function one calls function two, function two has
a try block, and then within it, it calls function three. Well, what happens if function three throws? Well, actually, start simpler: what happens if it returns? Well, if it returns, it's supposed to go back out and continue executing, and then fall off the bottom of the try block and keep going, and all's good. If the function throws, you're supposed to exit the current function and then get into the except clause, and then do whatever code's there, and then keep falling on and going on. And so the way that a compiler like Mojo works is that the call to that function, which happens in the except block, calls the function, and then instead of returning nothing, it actually returns, you know, a variant between nothing and an error. And so if you return normally, fall off the bottom or do a return, you return nothing. And if you throw an error, you return the variant that is 'I'm an error'. So when you get to the call, you say, okay, cool, I called a function; hey, I know locally I'm in a try block. And so I call the function and then I check to see what it returns. Aha, if it's that error thing, jump to the except block.

And that's all done for you behind the
scenes?

Exactly. And so the compiler does all this for you. And I mean, one of the things, if you dig into how this stuff works in Python, it gets a little bit more complicated, because you have finally blocks, which you now need to go into and do some stuff, and then those can also throw and return.

Wait, what? Nesting?

And like, this stuff matters for compatibility. Like, there's nesting, there are with clauses, and so with clauses are kind of like finally blocks with some special stuff going on. And so there's nesting in general.

Nesting of anything, nesting of functions, should be illegal. It just feels like it adds a level of complexity.

Lex, I'm merely an implementer. This is, again, one of the trade-offs you get when you decide to build a superset: you get to implement a full-fidelity implementation of the thing that you decided is good. And so, yeah, I mean, we can complain about the reality of the world and shake our fist, but...

It always feels like you shouldn't be allowed to do that, like to declare functions inside functions.

What happened to Lex the Lisp guy?

No, I understand that, but Lisp is what I used to do in college.

So now you've grown up.
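To make the lowering described in the exceptions discussion above more concrete, here is a small Python sketch in which a "raising" function is modeled as one that returns a variant: either a normal result or an error. This is an illustration only, not how the Mojo compiler is actually implemented; `Ok`, `Err`, `function_two`, and `function_three` are hypothetical names chosen to mirror the call chain in the conversation.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Ok:
    value: object = None  # the normal result ("nothing" in the simple case)


@dataclass
class Err:
    message: str  # stands in for the error object


Result = Union[Ok, Err]


def function_three(should_fail: bool) -> Result:
    # "Throwing" is just returning the error arm of the variant.
    if should_fail:
        return Err("function three failed")
    return Ok()


def function_two(should_fail: bool) -> str:
    # Hand-written equivalent of a try block around the call to
    # function_three, followed by an except clause.
    result = function_three(should_fail)
    if isinstance(result, Err):
        # The callee returned the error arm: jump to the "except" code.
        return "handled: " + result.message
    # Normal return: fall off the bottom of the "try" body and keep going.
    return "completed normally"


if __name__ == "__main__":
    print(function_two(False))
    print(function_two(True))
```

The point of the sketch is that the error path costs one ordinary return plus one check at the call site, with no side tables to search, which is why this style can also run on devices where table-based unwinding is not available.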
You know, we've all done things in college we're not proud of.

Okay, yeah. I was gonna say, you're afraid of me, you're taking on the whole internet. It worked as a joke in my head.

And yeah, it was right. So nested functions are, joking aside, actually really great for certain things, right? And so these are also called closures. Closures are pretty cool, and you can pass callbacks. There are a lot of good patterns.

So speaking of which, I don't think you have nested functions implemented yet in Mojo.

We don't have lambda syntax, but we do have nested functions.

Yeah. So there are a few things on the roadmap that it would be cool to just fly through, because it's interesting to see how many features there are in a language, small and big, that they have to implement. So first of all, there's tuple support, and that has to do with some very specific aspect of it, like the parentheses or not parentheses?

Yeah, this is just a totally syntactic thing.

A syntactic thing, okay. But it's cool still. So keyword arguments in functions.

Yeah, so this is where in Python you can say: call a function, x equals four, and x is the name of the argument.

That's a nice sort of
self-documenting feature.

Yeah, I mean, and again, this isn't rocket science to implement. That's just the laundry list; it's just on the list. The bigger features are things like traits. So traits are when you want to define abstract... so when you get into typed languages, you need the ability to write generics. And so you want to say: I want to write this function, and I want it to work on all things that are arithmetic-like. Well, what does arithmetic-like mean? Well, arithmetic-like is a categorization of a bunch of types. And so, again, you can define it many different ways, and I'm not going to go into ring theory or something, but you can say it's arithmetic-like if you can add, subtract, multiply, and divide it, for example. And so what you're saying is there's a set of traits that apply to a broad variety of types. And so there are all these types that are arithmetic-like, all these tensors and floating point and integer, and there's this category of types. And then I can define, on an orthogonal axis, algorithms that then work against types that have those properties. And so this is, again, a widely known thing. It's been implemented in Swift and Rust and many languages, so it's not a... Haskell,
which is where everybody learns their tricks from. But we need to implement that, and that will enable a new level of expressivity.

So classes.

Yeah, classes are a big deal.

It's a big deal, still to be implemented. Like you said, lambda syntax. And there's detailed stuff like whole-module import, support for top-level code at file scope, and then global variables also, so being able to have variables outside of a top level.

Well, and so this comes back to where Mojo came from and the fact that it's 0.1, right? And so Modular is building an AI stack, and an AI stack has a bunch of problems: working with hardware and writing high-performance kernels and doing this kernel fusion thing I was talking about and getting the most out of the hardware. And so we've really prioritized and built Mojo to solve Modular's problem. Now, our North Star is to build out and support all the things, and so we're making incredible progress. By the way, Mojo's only like seven months old, so that's another interesting thing.

I mean, part of the reason I wanted to mention some of these things is, like, there's a lot to do, and it's pretty cool how
you just, sometimes, take for granted how much there is in a programming language, how many cool features you rely on. And this is kind of a nice reminder when you lay it out as a to-do list.

Yeah. And so, I mean, but also, if you look into it, it's amazing how much is also there. And you take it for granted that a value, if you define it, will get destroyed automatically. That little feature itself is actually really complicated given the way the ownership system has to work. And the way that works within Mojo is a huge step forward from what Rust and Swift have done.

Can you say that again? When a value, when you define it, gets destroyed?

Yeah. So, like, say you have a string, right? So you define a string on the stack, okay, whatever that means, like in your local function. And so you say, whether it be in a def or otherwise, you just say x equals 'hello world'. Well, if your string type requires you to allocate memory, then when it's destroyed, you have to deallocate it. So in Python and Mojo, you define that with the __del__ method. Where does that get run? Well, it gets run sometime between the last use of the value and the end of the program. And here you get into garbage collection, you
get into all these long-debated, you talk about religions and trade-offs and things like this. This is a hugely, hotly contested world. If you look at C++, the way this works is that if you define a variable, or a set of variables, within a function, they get destroyed in a last-in, first-out order. So it's like nesting. This has a huge problem, because if you have a big scope and you define a whole bunch of values at the top, and then you use them, and then you do a whole bunch of code that doesn't use them, they don't get destroyed until the very end of that scope. And so this also destroys tail calls, which is a good functional programming thing, right? This has a bunch of different impacts on, you know, you talk about reference counting optimizations and things like this, a bunch of very low-level things. And so what Mojo does is it has a different approach on that from any language I'm familiar with, where it destroys them as soon as possible. And by doing that, you get better memory use, you get better predictability, you get tail calls that work, you get a bunch of other things, you get better ownership tracking. There are a bunch of these very simple things that are very fundamental that are already built in
there in Mojo today, that are the things that nobody talks about generally, but when they don't work right, you find out and you have to complain about.

Is it trivial to know what's the soonest possible to delete a thing that's not going to be used again?

Yeah, well, I mean, it's generally trivial. It's after the last use of it. So if you define x as a string, and then you have some use of x somewhere in your code...

Within that scope, I mean, within the scope that's accessible?

Yeah, exactly. So you can only use something within its scope. And so then it doesn't wait until the end of the scope to delete it; it destroys it after the last use.

So there's kind of some very eager machine that's just sitting there and deleting?

Yeah, and it's all in the compiler, so it's not at runtime, which is also cool. And this is actually non-trivial, because you have control flow, right? And so it gets complicated pretty quickly, and so getting this right was not...

Also, you have to insert deletes in a lot of places, potentially.

Yeah, exactly. So the compiler has to reason about this. And this is where, again, it's experience building languages and not getting this right. So again, you get another chance to do it, and you get basic things like this right. But it's extremely powerful when
you do that right. And so there are a bunch of things like that that kind of combine together. And this comes back to: you get a chance to do it the right way, so do it the right way, and make sure that every brick you put down is really good, so that when you put more bricks on top of it, they stack up to something that's beautiful.

Well, there's also, like, how many design discussions do there have to be about particular details, like implementation of particular small features? Because the features that seem small, I bet some of them might require really big design decisions.

Yeah. Well, so, I mean, let me give you another example of this. Python has a feature called async/await. So it's a new feature, I mean, in the long arc of history it's a relatively new feature, that allows way more expressive asynchronous programming. Okay, again, Python's a beautiful thing, and they did things that are great for Mojo for completely different reasons. The reason that async/await got added to Python, as far as I know, is because Python doesn't support threads. Okay, and so Python doesn't support threads, but you want to work with
networking and other things like that that can block. I mean, Python does support threads; it's just not its strength. And so they added this feature called async/await. It's also seen in other languages like Swift and JavaScript and many other places as well. Async/await in Mojo is amazing, because we have a high-performance heterogeneous compute runtime underneath the covers that then allows non-blocking I/O, so you get full use of your accelerator. That's huge. It turns out it's actually a really important part of fully utilizing a machine. You talk about design discussions: that took a lot of discussions, right? And it probably will require more iteration. And so my philosophy with Mojo is that, you know, we have a small team of really good people that are pushing forward, and they're very good at the extremely deep knowing of how the compiler and runtime and all the low-level stuff works together, but they're not perfect. It's the same thing as the Swift team, right? And this is one of the reasons we released Mojo much earlier: so we can get feedback. And we've already renamed a keyword due to community feedback.

Which one?

We used an ampersand, and
now it's named inout. We're not renaming existing Python keywords, because that breaks compatibility, right? We're renaming things we're adding, and making sure that they are designed well. We get usage experience, we iterate and work with the community, because, again, if you scale something really fast and everybody writes all their code and they start using it in production, then it's impossible to change. And so you want to learn from people, you want to iterate and work on that early on. And this is where design discussions are actually quite important.

Could you incorporate an emoji into the language, into the main language?

Do you have a favorite one?

Well, I really like, in terms of humor, like ROFL, whatever, rolling on the floor laughing. So that could be like a... what would be the use case for that? Like an except, throw an exception of some sort, I don't know.

You should totally file a feature request.

Or maybe a heart one. It has to be a heart one. People have told me that I'm insane, so this is... I'm liking this. I'm gonna use the viral nature of the internet to actually get this passed.

I mean, it's funny, you come back to the flame emoji file
extension, right? You know, we have the option to use the flame emoji, which, just even that concept, because for example the people at GitHub say, now I've seen everything.

Yeah, there's something, it's kind of reinvigorating. It's like, oh, that's possible? That's really cool. For some reason that makes everything else... I'm really excited.

The world is ready for this stuff, right? And so, you know, when we have a package manager, we'll clearly have to innovate by having the compiled package thing be the little box with the bow on it, right? I mean, it has to be done.

It has to be done. Is there some stuff on the roadmap that you're particularly stressed about or excited about, that you're thinking about a lot?

I mean, as of today's snapshot, which will be obsolete tomorrow, the lifetime stuff is really exciting. And so lifetimes give you safe references to memory without dangling pointers. And so this has been done in languages like Rust before, and so we have a new approach, which is really cool. I'm very excited about that. That'll be out to the community very soon. The traits feature is really a big deal, and so that's blocking a lot of API design. And so there's that; I think that's really exciting.
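The "arithmetic-like" idea from the traits discussion earlier in the conversation can be roughly approximated in today's Python with `typing.Protocol`. This is an analogy only, not Mojo's trait syntax, which had not been released at the time of this conversation; `ArithmeticLike` and `sum_of_squares` are hypothetical names.

```python
from fractions import Fraction
from typing import Protocol, Sequence, TypeVar, runtime_checkable


@runtime_checkable
class ArithmeticLike(Protocol):
    """Anything you can add, subtract, multiply, and divide."""

    def __add__(self, other): ...
    def __sub__(self, other): ...
    def __mul__(self, other): ...
    def __truediv__(self, other): ...


T = TypeVar("T", bound=ArithmeticLike)


def sum_of_squares(values: Sequence[T], zero: T) -> T:
    # The algorithm is written once, against the category of types,
    # and works for ints, floats, Fractions, and so on.
    total = zero
    for v in values:
        total = total + v * v
    return total


if __name__ == "__main__":
    print(sum_of_squares([1, 2, 3], 0))
    print(sum_of_squares([Fraction(1, 2), Fraction(1, 2)], Fraction(0)))
    # Strings support + and * but not - or /, so they are not arithmetic-like.
    print(isinstance("text", ArithmeticLike))
```

Note that a Python Protocol is only checked by an external type checker or by `isinstance`, which only looks at whether the methods exist; a compiled language with traits can enforce the same contract at compile time.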
A lot of it is these kind of table-stakes features. One of the things that is, again, also lessons learned with Swift is that programmers in general like to add syntactic sugar. And so it's like, oh well, this annoying thing, like in Python you have to spell __add__. Why can't I just use plus? def plus, come on, why can't I just do that? And so, try a little bit of syntactic sugar: it makes sense, it's beautiful, it's obvious. We're trying not to do that, for two different reasons. One of which is that, again, lesson learned with Swift: Swift has a lot of syntactic sugar, which may be a good thing, maybe not, I don't know. But because it's such an easy and addictive thing to do, sugar, like, makes your blood go crazy, right? The community will really dig into that and want to do a lot of that, and I think it's very distracting from building the core abstractions. Second is, we want to be a good member of the Python community. And so we want to work with the broader Python community, and yeah, we're pushing forward a bunch of systems programming features, and we need to
build them out to understand them. But once we get a long ways forward, I want to make sure that we go back to the Python community and say, okay, let's do some design reviews, let's actually talk about this stuff, let's figure out how we want this stuff all to work together. And syntactic sugar just makes all that more complicated.

And yeah, list comprehensions are yet to be implemented, and, my favorite, I mean, dictionaries.

Yeah, but nonetheless it's actually still quite interesting and useful.

As you mentioned, Modular is very new, Mojo is very new, it's a relatively small team that's building up this gigantic stack, this incredible stack that's going to perhaps define the future of development of our AI overlords.

We just hope it will be useful.

As do all of us. So what have you learned from this process of building up a team? Maybe one question is: how do you hire great programmers, great people that operate in this compiler, hardware, machine learning, software, interface design space, and maybe are a little bit fluid in what they can do?

Okay, so, language design too. So building a company is just as interesting, in different ways, as
building a language: different skill sets, different things, but super interesting. And I've built a lot of teams in a lot of different places. If you zoom in from the big problem into recruiting, well, so here's our problem. I'll be very straightforward about this. We started Modular with a lot of conviction about: we understand the problems, we understand the customer pain points, we need to work backwards from the suffering in the industry, and if we solve those problems, we think it'll be useful for people. But the problem is that the people we need to hire, as you say, are all these super-specialized people that have jobs at big tech worlds, right? And, you know, I don't think we have product-market fit in the way that a normal startup does. We don't have product-market fit challenges, because right now everybody's using AI and so many of them are suffering and they want help. And so, again, we started with strong conviction. Now, again, you have to hire and recruit the best, and the best all have jobs. And so what we've done is we said, okay, well, let's build an amazing culture. Start with that. That's usually not something a company starts with. Usually you hire a bunch of people and then
people start fighting and it turns into a gigantic mess, and then you try to figure out how to improve your culture later. My co-founder, Tim, in particular is super passionate about making sure that that's right, and we've spent a lot of time early on to make sure that we can scale.

Can you comment on that before we get to the second? What makes for a good culture?

So, I mean, there are many different cultures, and I have learned many things from several very unique, almost famously unique cultures. And some of them I learned what to do, and some of them I learned what not to do. Okay, and so we want an inclusive culture. I believe in amazing people working together. And so I've seen cultures where you have amazing people and they're fighting each other. I've seen amazing people and they're told what to do, like, thou shalt line up and do what I say; it doesn't matter if it's the right thing, do it. And neither of these is the... and I've seen people that have no direction; they're just kind of floating in different places, and they want to be amazing, they just don't know how. And so a lot of it starts with having a clear vision.
Right and so we have a clear vision of What we're doing and um so I kind of Grew up at Apple in my engineering life Right and so a lot of the Apple DNA Rubbed off on me my co-founder Tim also Is like a strong product guy and so what We learned is you know inside Apple That you don't work from building cool Technology you don't work from like come Up with a cool product and think about the Features you'll have in the big check Boxes and stuff like this Because if you go talk to customers they Don't actually care about your product They don't care about your technology What they care about is their problems Right and if your product can help solve Their problems well hey they might be Interested in that right and so if you Speak to them about their problems if You understand and you have compassion You understand what people are working With then you can work backwards to Building an amazing product so the vision is Finding the problem and then you can Work backwards in solving technology got It and at Apple like it's I think pretty Famously said that you know for every You know there's a hundred no's for Every yes I would refine that to say that there's a Hundred not yet for every yes but Famously if you go back to the iPhone For example right the iPhone one every I
Mean many people laughed at it because It didn't have 3G it didn't have copy And paste Right and then a year later okay finally It has 3G but it still doesn't have copy And paste it's a joke nobody will ever Use this product blah blah blah blah Blah blah right well year three it had Copy and paste and people stopped Talking about it right and so and so Being laser focused and having Conviction and understanding what the Core problems are and giving the team The space to be able to build the right Tech is really important Um also I mean you come back to Recruiting you have to pay well right so We have to pay industry leading salaries And have good benefits and things like This that's a big piece uh we're a Remote first company and so we have to Uh Uh so remote first has a very strong set Of pros and cons on the one hand you can Hire people from wherever they are and You can attract amazing talent even if They live in strange places or unusual Places on the other hand you have time Zones On the other hand you have like Everybody on the internet will fight if They don't understand each other and so We've had to learn how to like have a System where we actually fly people in
And we get the whole company together Periodically and then we get work groups Together and we plan and execute Together and there's like an intimacy to The in-person brainstorming yeah I guess You lose but maybe you don't maybe if You get to know each other well and you Trust each other maybe you can do that Yeah well so when the pandemic first hit I mean I'm curious about your experience Too the first thing I missed was having Whiteboards yeah right in those design Discussions where like I can high high Intensity work through things get things Done work through the problem of the day Understand where you're on figure out And solve the problem and move forward Yeah Um but we figured out ways to work Around that now with you know all these Uh screen sharing and other things like That that we do the thing I miss now is Sitting down at a lunch table with the Team yeah the spontaneous things like Those the the coffee the coffee bar Things and the and the bumping into each Other and getting to know people outside Of the transactional solve a problem Over Zoom okay and I think there's There's just a lot of stuff that um I'm Not an expert at this I don't know who Is hopefully there's some people but There's stuff that somehow is missing on Zoom
Even with the Whiteboard if you look at That If you have a room with one person at The Whiteboard and there's like three Other people at a table There's uh first of all there's a social Aspect to that where you're just Shooting the a little bit almost Like yeah as people just kind of coming In and yeah that but also while Like it's a breakout discussion that Happens for like seconds at a time maybe An inside joke or it's like this Interesting Dynamic that happens that Zoom you're bonding yeah you're bonding You're bonding but through that bonding You get the excitement there's certain Ideas are like complete and You'll see that in the faces of others That you won't see necessarily on zoom In like something it feels like that Should be possible to do Without being in person well I mean Being in person is a very different Thing yeah I don't it's worth it but you Can't always do it and so again we're Still learning and we're also learning As like Humanity with this new reality Right but um but what we found is that Getting people together whether it be a Team or the whole company or whatever Is it worth the expense because people Work together and are happier After that like it just it just like
There's a massive period of time where You like go out and things start getting Frayed pull people together and then you Realize that we're all working together We see things the same way we work Through the disagreement or the Misunderstanding we're talking across Each other and then you work much better Together and so things like that I think Are really quite important what about uh People that are kind of specialized in Very different aspects of the stack Working together what are some Interesting challenges there yeah well So I mean I mean there's lots of Interesting people as you can tell I'm You know hard to deal with too But you're one of the most lovable the Uh uh so one of the so there's different Philosophies in building teams uh for me And so some people say hire 10x Programmers and that's the only thing That whatever that means right Um what I believe in is building Well-balanced teams teams that have People that are different in them like If you have all generals and no troops Or all troops and no generals or you Have all people that think in one way And not the other way what you get is You get a very biased and skewed and Weird situation where people end up Being unhappy and so what I like to do Is I like to build teams of people where
They're not all the same you know we do Have teams and they're focused on like Runtime or compiler GPU or accelerator or Whatever the specialty is but people Bring a different take and have a Different perspective and I look for People that complement each other and Particularly if you look at leadership Teams and things like this you don't Want everybody thinking the same way you Want people bringing different Perspectives and experiences and so I Think that's really important that's Team but what about building a a company As ambitious as modular so what are some Interesting questions there oh I mean so Many like so um one of the things I love About okay so modular is the first Company I built from scratch Um Uh one of the first things that was Profound was I'm not cleaning up Somebody else's mess right and so if you Look at and that's liberating to Something it's super liberating and Um and also many of the projects I've Built in the past have not been core to The product of the company Swift is not Apple's product Right MLIR is not Google's revenue Machine or whatever right it's not it's It's important but it's like working on The accounting software for you know the The retail giant or something right it's
It's it's like enabling infrastructure And technology and so at modular the the Tech we're building is Here to solve people's problems like it Is directly the thing that we're giving To people and so this is a really big Difference and what it means for me as a Leader but also for many of our Engineers is they're working on the Thing that matters and that's actually Pretty I mean again for for compiler People and things like that that's That's usually not the case right and so That's that's also pretty exciting and And quite nice but the um one of the Ways that this manifests is it makes it Easier to make decisions And so one of the challenges I've had in Other worlds is it's like okay well Community matters Somehow for the goodness of the world Like or open source matters Theoretically but I don't want to pay For a t-shirt Right or some Swag like well t-shirts Cost 10 bucks each you can have 100 T-shirts for a thousand dollars to a Mega Corp a thousand dollars is Uncountably can't count that low right But justifying it and getting a t-shirt By the way if you'd like a t-shirt Why would 100 Like a t-shirt are you joking you can Have a fire Emoji t-shirt is that I will
I will treasure this I will pass it down To my grandchildren and so you know it's It's very liberating to be able to Decide I think that Lex should have a T-shirt Right and it becomes very simple Like Lex It's this uh this is awesome Um so I have to ask you about the One of the interesting developments with Large language models Is that they're able to generate code Uh recently really well I guess to a degree that maybe a I don't know if you understand but I Have I struggle to understand because it It forces me to ask questions about the Nature of programming of the nature of Thought Because the uh language models are able To predict the kind of code I was about To write so well yep that it makes me Wonder like how unique my brain is and Where the valuable ideas actually come From like how much do I contribute in Terms of uh Ingenuity Innovation to code I write or Design and that kind of stuff When you stand on the shoulders of Giants are you really doing anything and What LLMs are helping you do is they Help you stand on the shoulders of Giants new program there's mistakes
They're interesting that you learn from But I just would love to get your Opinion first high level yeah of what You think about this impact of large Language models when they do program Synthesis when they generate code yeah Well so Um I don't know where it all goes yeah Um I'm an optimist and I'm a human Optimist right I think that things I've Seen are that a lot of the LLMs are Really good at crushing LeetCode Projects and they can reverse the linked List like crazy well it turns out There's a lot of Instances of that on the internet and It's a pretty stock thing and so if you Want to see Standard questions answered LLMs can Memorize all the answers and that can be Amazing and also they do generalize out From that and so there's good work on That but um but I think that if in my Experience building things building Something like you talk about Mojo where You talk about these things where you Talk about building an applied solution To a problem it's also about working With people It's about understanding the problem What is the product that you want to Build what are the use cases what are the Customers you can't just go survey all
The customers because they'll tell you That they want a faster horse maybe they Need a car right and so a lot of it Comes into Um you know I don't feel like we have to Compete with LLMs I think they'll Help automate a ton of the mechanical Stuff out of the way and just like you Know I think we all try to scale through Delegation and things like this Delegating rote things to an LLM I Think is an extremely valuable Approach that will help us all scale and Be more productive but I think it's a It's a fascinating companion but I'd say I don't think that means that we're Going to be done with coding But there's power in it as a companion From there I could I would love to zoom In onto Mojo a little bit do you think Uh do you think about that do you think About LLMs generating Mojo code And helping sort of like when you design New programming language it almost seems Like man it would be nice to sort of Um Almost as a way to learn how I'm Supposed to use this thing For them to be trained on some of the Mojo code so I do lead an AI company so Maybe there will be a Mojo LLM at some Point uh but if your question is like How do we make a language to be suitable For LLMs yeah I think that the
Um I think the cool thing about LLMs is you Don't have to And so if you look at what is English or Any of these other terrible languages That we as humans deal with on a Continuous basis they're never designed For machines and yet they're the Intermediate representation they're The Exchange format that we humans use to Get stuff done right and so these Programming languages they're an Intermediate representation between the Human and the computer or the human and The compiler roughly right and so I Think the LLMs will have no problem Learning whatever keyword we pick maybe The fire Emoji is gonna oh maybe that's Gonna break it it doesn't tokenize no The reverse of that it will actually Enable it because one of the issues I Could see with being a super set of Python is there would be Confusion by The gray area So we'll be mixing stuff But well I'm a human Optimist I'm also An LLM optimist I think that will solve That problem but the uh um but but you Look at that and you say okay well Reducing the rote thing right turns out Compilers are very particular and they Really want things they really want the Indentation to be right they really want The colon to be there on your else or
Else it'll complain right I mean Compilers can do better at this but um LLMs can totally help solve that problem And so I'm very happy about the new uh Predictive coding and copilot type Features and things like this because I Think it'll all just make us more Productive it's still messy and fuzzy And uncertain unpredictable so but is There a future you see given how big of A leap GPT-4 was where you start to see Something like LLMs inside a compiler Uh I mean you could do that yeah Absolutely I mean I think that would be Interesting there's otherwise well well I mean it would be very expensive so Compilers run fast and they're very Efficient and LLMs are currently very Expensive there's on device LLMs and There's other things going on and so Maybe there's an answer there Um I think that one of the things that I Haven't seen enough of is that So LLMs to me are amazing when you tap Into the creative potential of the Hallucinations right and so if you're Building doing creative brainstorming or Creative writing or things like that the Hallucinations work in your favor Um If you're writing code that has to be Correct because you're going to ship it In production then maybe that's not Actually a feature
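The remark above about compilers insisting on the indentation and on the colon after else can be seen directly in Python. What follows is a minimal sketch, not something from the conversation: the snippets and the helper name check are made up for illustration, and it relies only on Python's built-in compile function, which raises SyntaxError (IndentationError is a subclass of it) when source text does not parse.

```python
# Illustrative snippets only; none of this code comes from the conversation.
GOOD = "if x > 0:\n    y = 1\nelse:\n    y = -1\n"
MISSING_COLON = "if x > 0:\n    y = 1\nelse\n    y = -1\n"  # no colon after else
BAD_INDENT = "if x > 0:\ny = 1\n"  # body of the if is not indented


def check(source):
    """Return None if the source parses, otherwise the error's class name."""
    try:
        # compile() only parses and byte-compiles; it never runs the code,
        # so the undefined name x is not a problem here.
        compile(source, "<example>", "exec")
    except SyntaxError as err:  # also catches IndentationError
        return type(err).__name__
    return None


for name, src in [("good", GOOD),
                  ("missing colon", MISSING_COLON),
                  ("bad indent", BAD_INDENT)]:
    print(name, "->", check(src))
```

The exact wording of the error message varies between Python versions, so the sketch reports only the exception class. This is the kind of rote, mechanical complaint that a predictive coding tool could plausibly fix before the compiler ever sees it.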
And so I think that there there has been Research and there has been work on Building algebraic reasoning systems and Kind of like figuring out more things That feel like proofs and so I think That there could be interesting work in Terms of building more reliable scale Systems and that could be interesting But if you chase that rabbit hole down The question then becomes how do you Express your intent to the machine and So maybe you want an LLM to provide the spec But you have a different kind of net That then actually implements the code Right so it's used for documentation and And inspiration versus the actual Implementation yeah potentially Since uh if successful modular will be The thing that runs I say so jokingly Our AI overlords but AI systems that are Used across Uh I know it's a cliche term but uh Internet of things so of course so so I'll joke and say like AGI should be Written in Mojo yeah AGI you're joking But it's also possible that it's not a Joke uh that a lot of the ideas behind Mojo is uh seems like the the natural Set of ideas that would enable at scale Training and inference of AI systems Um so just I have to ask you about the Big philosophical question about human Civilization so folks like uh uh Eliezer Yudkowsky are really concerned
About the threat of AI do you think About The the good And the bad that can happen at scale Deployment of AI systems well so I've I've thought a lot about it and there's A lot of different parts to this problem Everything from job displacement to Skynet things like this and so you can Zoom into sub parts of this problem Um I'm not super optimistic about AGI being Solved next year I don't think that's going to happen Personally so you have a kind of Zen-like calm about because there's a Nervousness because the leap of GPT-4 Seems so big sure that's like we're Almost we're there's some kind of Transition here period you're thinking Well so so I mean there's a couple of Things going on there one is Um I'm sure GPT five and seven and 19 Will be also huge leaps Um they're also getting much more Expensive to run and so there may be a Limiting function in terms of just Expense on the one hand and train like That that could be a limiter that slows Things down but I think the bigger Limiter is outside of like Skynet takes Over and I don't spend any time thinking About that because if Skynet takes over And kills us all then I'll be dead so I
Don't worry about that so you know I Mean that's just okay I have other Things worry about I'll just focus on Yeah I'll focus and not worry about that One Um but I think that the the other thing I'd say is that AI moves quickly but humans move slowly And we adapt slowly and so what I expect To happen is just like any technology Diffusion like the promise and then the Application takes time to roll out and So I think that I'm not even too worried About autonomous cars defining away all The taxi drivers remember autonomy is Supposed to be solved by 2020. yeah boy Do I so and um and so like I think that On the one hand we can see amazing Progress but on the other hand we can See that uh you know the the reality is A little bit more complicated and it may Take longer to roll out than than you Might expect well that's in the physical Space I I do think in the digital space Is a the stuff that's built on top of Llms that runs You know the millions of apps that could Be built on top of them And they could be run on millions of Devices millions of types of devices I I just think That the rapid effect it has in human Civilization could be truly Transformative to it yeah you don't even
Know well so that predict well and there I think it depends on are you an Optimist or a pessimist yeah or a Masochist Um just to clarify uh optimist about Human civilization me too and so I look At that as saying okay cool well yeah I Do right and so some people say oh my God it's going to destroy us all how do We prevent that I I kind of look at it From a is it going to unlock us all Right you talk about coding it's going To make so I don't have to do all the Repetitive stuff Well suddenly that's a very optimistic Way to look at it and you look at what a Lot of a lot of these technologies have Done to improve our lives and I want That to go faster What do you think the future of Programming looks like in the next 10 20 30 50 years With LLMs and uh with with Mojo with modular like your vision for Devices the hardware to compilers to This to the different stacks of software Yeah well so what I want I mean coming Coming back to my arch nemesis right It's complexity right so again me being The Optimist if we drive down complexity We can make these tools these Technologies these cool Hardware widgets Accessible to way more people right and So what I'd love to see is more
Personalized experiences more uh things The research getting into production Instead of being lost at NeurIPS right And so and like the the the these things That impact people's lives by entering Products And so one of the things that I'm a Little bit concerned about is right now Um the big companies are investing huge Amounts of money and are driving the top Line of AI capability forward really quickly But if it means that you have to have 100 million dollars to train a model or More 100 billion dollars right well That's gonna make it very concentrated With very few people in the world that Can actually do this stuff I would much Rather see Lots of people across the industry Be able to participate and use this Right and you look at this you know I Mean a lot of great research has been Done in the health world and looking at Like detecting pathologies and doing Radiology with AI and like doing all These things well the problem today is That to deploy and build these systems You have to be an expert in radiology And an expert in AI And if we can break down the barriers so That more people can use AI techniques It's more like programming python Which roughly everybody can do if they Want to right then I think that we'll
Get a lot more practical application of These techniques and a lot more niche Cool but narrower demands I think that's That's going to be really cool do you Think we'll have more or less Programmers in the world than now well so Um I think we'll have more more Programmers but they may not consider Themselves to be programmers that'd be a Different name for you right I mean do You consider somebody that uses uh you Know I think that arguably the most Popular programming language is Excel Yeah Right yeah and so do they consider Themselves to be programmers maybe not I Mean some of them make crazy macros and Stuff like that but but but what what The you mentioned Steve Jobs it's the uh Bicycle for the mind it allows you to go Faster right and so I think that as we Look forward right what is AI I look at It as hopefully a new programming Paradigm it's like object-oriented Programming right if you want to write a Cat detector you don't use for Loops it Turns out that's not the right tool for The job right and so right now Unfortunately because I mean it's not Unfortunate but it's just kind of where Things are AI is this weird different Thing that's not integrated into Programming languages and normal tool Chains and all the Technology is really
Weird and doesn't work right and you Have to babysit it and every time you Switch Hardware it's different shouldn't Be that way when you change that when You fix that suddenly again the tools Technologies can be way easier to use You can start using them for many more Things and so that that's that's what I Would be excited about What kind of advice could you give to Somebody in high school right now or Maybe early college who's curious about Programming And Feeling like the world is changing Really quickly here yeah what kind of Stuff to learn what kind of stuff to Work on Should they finish college should they Go work at a company there build a thing What do you think well so I mean one of The things I'd say is that um you'll be Most successful if you work on something You're excited by And so don't get the book and read the Book Cover to cover and study and memorize and Recite and flash card and go build Something like go solve a problem go Build the thing that you wanted to exist Go build an app go build train a model Like go build something and actually use It and set a goal for yourself and if You do that then you'll you know there's
A success there's the adrenaline rush There's the achievement there's the Unlock that I think is where you know if You keep setting goals and you keep Doing things and Building Things Learning by building is really powerful Um in terms of career advice I mean Everybody's different it's very hard to Give generalized experience generalized Advice Um I'll speak as you know a compiler nerd If everybody's going Left sometimes it's pretty cool to go Right yeah and so just because Everybody's doing a thing it doesn't Mean you have to do the same thing and Follow the herd in fact I think that Sometimes the most exciting paths through Life lead to being curious about things That nobody else actually focuses on Right and turns out that understanding Deeply parts of the problem that people Want to take for granted makes you Extremely valuable and specialized in Ways that the herd is not and so again I Mean there's lots of room for Specialization lots of room for uh Generalists there's lots of room for Different kinds and parts of the problem But but I think that it's you know just Because everything everybody's doing one Thing doesn't mean you should Necessarily do it and now the herd is Using python so if you want to be a
Rebel Go check out Mojo and uh help Chris and The rest of the world fight the arch Nemesis of complexity because simple is Beautiful there you go Because you're an incredible person You've uh you've been so kind to me ever Since we met you've been extremely Supportive I'm forever grateful for that Thank you for being who you are for Being legit for being kind for fighting This um Um really interesting problem of how to Make AI accessible to a huge number of People huge number of devices yeah well So Lex you're a pretty special person Too right and so I think that you know One of the funny things about you is That besides being curious and pretty Damn smart you're actually willing to Push on things and you're you're I think That you've got an agenda to like make The world think Which I think is a pretty good agenda It's a pretty good one uh thank you so Much for talking today Chris yeah thanks Lex Thanks for listening to this Conversation with Chris Lattner to Support this podcast please check out Our sponsors in the description and now Let me leave you with some words from Isaac Asimov I do not fear computers
I fear the lack of them Thank you for listening and hope to see You next time