Python can run Mojo now
Chris Lattner mentioned that Python can actually call Mojo code now. I love this idea (!) as I'm definitely in the market for a simple compiled language that can offer Python some really fast functions. So I gave it a quick spin.
Setup
The setup is much simpler than I remember; you can use uv for it now.
uv pip install modular --index-url https://dl.modular.com/public/nightly/python/simple/
After that you can declare a .mojo file that looks like this:
# mojo_module.mojo
from python import PythonObject
from python.bindings import PythonModuleBuilder
import math
from os import abort

@export
fn PyInit_mojo_module() -> PythonObject:
    try:
        var m = PythonModuleBuilder("mojo_module")
        m.def_function[factorial]("factorial", docstring="Compute n!")
        return m.finalize()
    except e:
        return abort[PythonObject](String("error creating Python Mojo module:", e))

fn factorial(py_obj: PythonObject) raises -> PythonObject:
    # Raises an exception if `py_obj` is not convertible to a Mojo `Int`.
    var n = Int(py_obj)
    var result = 1
    for i in range(1, n + 1):
        result *= i
    return result
And you can then load it from Python via:
# main.py
import max.mojo.importer
import os
import sys
import time
import math
sys.path.insert(0, "")
import mojo_module
start = time.time()
print(mojo_module.factorial(10))
end = time.time()
print(f"Time taken: {end - start} seconds for mojo")
start = time.time()
print(math.factorial(10))
end = time.time()
print(f"Time taken: {end - start} seconds for python")
This was the output:
3628800
Time taken: 3.0279159545898438e-05 seconds for mojo
3628800
Time taken: 5.0067901611328125e-06 seconds for python
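A side note on the measurement itself: a single call that takes a few microseconds is hard to time reliably with time.time(), and the print call sits inside the timed region here. A minimal sketch of a steadier measurement using the standard library's timeit module follows. The helper name best_time and its repeat/number defaults are my own choices for illustration, and only math.factorial is timed so the snippet runs without the Mojo module.

```python
import math
import timeit


def best_time(fn, *args, repeat=5, number=10_000):
    """Return the best per-call time in seconds over `repeat` runs of `number` calls."""
    timer = timeit.Timer(lambda: fn(*args))
    totals = timer.repeat(repeat=repeat, number=number)
    # The minimum is the run least disturbed by other activity on the machine.
    return min(totals) / number


if __name__ == "__main__":
    per_call = best_time(math.factorial, 10)
    print(f"math.factorial(10): {per_call:.3e} seconds per call")
    # With mojo_module importable, the same helper should apply unchanged:
    # best_time(mojo_module.factorial, 10)
```

Taking the minimum over several repeats is a common convention for micro-benchmarks; it does not change the conclusion of the post, it only makes tiny timings less noisy.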
This all works, but at the time of writing I was able to spot some rough edges. If I increase the factorial input to 100, the Mojo result comes back as 0.
0
Time taken: 2.7894973754882812e-05 seconds for mojo
188267717688892609974376770249160085759540364871492425887598231508353156331613598866882932889495923133646405445930057740630161919341380597818883457558547055524326375565007131770880000000000000000000000000000000
Time taken: 9.298324584960938e-06 seconds for python
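This is most likely integer overflow on the Mojo side: Mojo's Int is a fixed-width machine integer, while Python integers have arbitrary precision. 100! contains the factor 2 a total of 97 times, so in a 64-bit integer it would wrap around to exactly 0. The docs mention that this whole stack is pretty early, and I guess this is a sign of that.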
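To check the overflow theory without touching Mojo, here is a small pure-Python sketch that mimics fixed-width arithmetic. It assumes the Mojo Int is 64 bits wide on this machine and silently wraps on overflow (two's complement); the names wrap_signed and factorial_fixed_width are made up for this illustration.

```python
import math

BITS = 64  # assumption: Mojo's Int is 64 bits wide on this machine


def wrap_signed(value, bits=BITS):
    """Reduce an arbitrary-precision int to a two's-complement signed integer of `bits` bits."""
    value &= (1 << bits) - 1
    if value >= 1 << (bits - 1):
        value -= 1 << bits
    return value


def factorial_fixed_width(n, bits=BITS):
    """Factorial where every multiplication wraps, like fixed-width integer arithmetic."""
    result = 1
    for i in range(1, n + 1):
        result = wrap_signed(result * i, bits)
    return result


if __name__ == "__main__":
    print(factorial_fixed_width(10))   # 3628800, still fits in 64 bits
    print(factorial_fixed_width(20) == math.factorial(20))  # True, 20! is the last one that fits
    print(factorial_fixed_width(100))  # 0
```

Under these assumptions the simulated result for 100 is 0, which matches what the Mojo function returned. Anything from 21! onwards no longer fits in a signed 64-bit integer, so the Mojo version would only be trustworthy for n up to 20.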
Another example
Given that the overflow is probably the issue here, I figured I'd run one extra example just to see if we could measure a speed increase. So I went with a naive prime number counting example. This is the Mojo code:
from python import PythonObject
from python.bindings import PythonModuleBuilder
import math
from os import abort

@export
fn PyInit_mojo_module() -> PythonObject:
    try:
        var m = PythonModuleBuilder("mojo_module")
        m.def_function[count_primes]("count_primes", docstring="Count primes up to n")
        return m.finalize()
    except e:
        return abort[PythonObject](String("error creating Python Mojo module:", e))

fn count_primes(py_obj: PythonObject) raises -> PythonObject:
    var n = Int(py_obj)
    var count: Int = 0
    for i in range(2, n + 1):
        var is_prime: Bool = True
        for j in range(2, i):
            if i % j == 0:
                is_prime = False
                break
        if is_prime:
            count += 1
    return count
This is the Python code. Note that I also added a NumPy implementation for comparison.
import numpy as np
import max.mojo.importer
import os
import sys
import time
import math
sys.path.insert(0, "")
import mojo_module

def count_primes(n):
    count = 0
    for i in range(2, n + 1):
        is_prime = True
        for j in range(2, i):
            if i % j == 0:
                is_prime = False
                break
        if is_prime:
            count += 1
    return count

def count_primes_numpy(n):
    if n < 2:
        return 0
    candidates = np.arange(2, n + 1)
    is_prime_mask = np.ones(len(candidates), dtype=bool)
    # For each position in our candidates array
    for idx, candidate in enumerate(candidates):
        if candidate == 2:  # 2 is prime
            continue
        # Create divisors array [2, 3, ..., candidate-1]
        divisors = np.arange(2, candidate)
        # Vectorized divisibility check
        has_divisor = np.any(candidate % divisors == 0)
        if has_divisor:
            is_prime_mask[idx] = False
    return np.sum(is_prime_mask)

n = 20_000
start = time.time()
print(count_primes(n))
end = time.time()
print(f"Time taken: {end - start} seconds for python")
start = time.time()
print(count_primes_numpy(n))
end = time.time()
print(f"Time taken: {end - start} seconds for numpy")
start = time.time()
print(mojo_module.count_primes(n))
end = time.time()
print(f"Time taken: {end - start} seconds for mojo")
The results look promising!
2262
Time taken: 0.44585609436035156 seconds for python
2262
Time taken: 0.25995898246765137 seconds for numpy
2262
Time taken: 0.011101961135864258 seconds for mojo
There are better algorithms for prime counting than what I am doing here, so take this with a truck of salt, but this is very exciting. Mojo is a lot easier to learn than Rust, and I still seem to get the function speedup that I want (need).
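For reference, one of those better algorithms is the Sieve of Eratosthenes. The sketch below is plain Python and not something I benchmarked for this post; the function name count_primes_sieve is just for illustration. It mainly shows that the choice of algorithm matters at least as much as the choice of language.

```python
def count_primes_sieve(n):
    """Count the primes <= n with a Sieve of Eratosthenes."""
    if n < 2:
        return 0
    # is_prime[k] is 1 while k is still considered prime.
    is_prime = bytearray([1]) * (n + 1)
    is_prime[0] = is_prime[1] = 0
    p = 2
    while p * p <= n:
        if is_prime[p]:
            # Cross out multiples of p, starting at p*p
            # (smaller multiples were already crossed out by smaller primes).
            for multiple in range(p * p, n + 1, p):
                is_prime[multiple] = 0
        p += 1
    return sum(is_prime)


if __name__ == "__main__":
    print(count_primes_sieve(20_000))  # 2262, same count as the naive versions above
```

The naive loop does on the order of n squared divisions in the worst case, while the sieve only crosses out multiples of each prime up to the square root of n.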
At the time of writing, the main downside is that the Modular stack is still early, but there also seems to be light support for building these extensions.
In short: it's not ready for prime-time just yet, but I'm more optimistic now that the dream is getting within reach!