The economics
What it takes to build, and why it's worth doing once.
The case for an open Torah transcription model isn't sentiment — it's arithmetic. The audio is vast and grows daily; the closed way of transcribing it is a bill that never stops; and an open model that runs on your own device turns that recurring cost into something the community pays once. Here is the honest version, rough numbers and all.
The scale, honestly
Hundreds of thousands of hours — and rising every day.
No one has an exact count, and we won't pretend to. But the order of magnitude is clear: the recorded Torah already sitting in archives, on shiur platforms, and on aging media runs to hundreds of thousands of hours, and hundreds more hours are recorded every day. The mix of languages is uneven: Hebrew is by far the largest, then English, then Yiddish, with Aramaic and Yeshivish forming a long, hard tail. The proportions below are a rough sketch, not a measurement.
Rough estimate — illustrative proportions, not measured figures.
The cost of the closed way
A bill that recurs — per language, per organization, forever.
Proprietary transcription exists and works; that it exists at all is proof the need is real. The best of it is priced low at scale — sofer.ai, for example, is around $0.60 an hour. But multiply even a low per-hour rate by hundreds of thousands of hours and you have a large one-time bill just to clear the back-catalog.
And it doesn't clear. New audio is recorded every day, so the bill renews daily. Each language is its own effort, so it renews per language. And because it's a paid service rather than a shared asset, every organization that wants its own archive transcribed pays the bill again from scratch. The total isn't a purchase — it's a subscription the whole Torah world keeps paying, with the result sitting behind a paywall.
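To make the arithmetic concrete, here is a minimal sketch in Python. Only the $0.60-per-hour rate comes from the text above; the 300,000-hour back-catalog and the 300 new hours per day are assumed stand-ins for "hundreds of thousands" and "hundreds more every day", and the function names are hypothetical. Swap in your own volumes.

```python
# Figure from the text: roughly $0.60 per audio hour.
# Held in whole cents so the arithmetic stays exact.
RATE_CENTS_PER_HOUR = 60

# ASSUMPTIONS for illustration only: the text gives orders of magnitude, not counts.
BACKLOG_HOURS = 300_000   # stand-in for "hundreds of thousands of hours"
NEW_HOURS_PER_DAY = 300   # stand-in for "hundreds more every day"


def backlog_cost_usd(hours=BACKLOG_HOURS):
    """One-time cost of transcribing the existing archive."""
    return hours * RATE_CENTS_PER_HOUR / 100


def yearly_recurring_cost_usd(hours_per_day=NEW_HOURS_PER_DAY):
    """Cost of keeping up with one year of newly recorded audio."""
    return hours_per_day * 365 * RATE_CENTS_PER_HOUR / 100


def total_cost_usd(years):
    """Back-catalog plus `years` of new audio, for a single buyer."""
    return backlog_cost_usd() + years * yearly_recurring_cost_usd()


if __name__ == "__main__":
    print(f"Back-catalog:  ${backlog_cost_usd():,.0f}")
    print(f"Each year:     ${yearly_recurring_cost_usd():,.0f}")
    print(f"After 5 years: ${total_cost_usd(5):,.0f}")
```

Under these assumed volumes the back-catalog comes to $180,000 and every further year adds $65,700. The figures scale linearly with whatever hours you substitute, and they repeat for each buyer who pays separately.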
Why open + on-device collapses it
Build the model once; the cost of using it falls to almost nothing.
An open model is a different kind of thing. The weights and the code are published under a permissive license, and inference runs on the user's own phone or laptop. There is no per-hour fee, no per-seat license, no metered API — so once the model exists, the marginal cost of transcribing one more hour is essentially the electricity it takes to run a phone.
That's the whole argument for paying to build it once. A closed service turns transcription into a toll you pay forever. An open, on-device model builds roads, not an app: public infrastructure that every yeshiva, archive, and developer can run for free, without anyone's permission.
What it actually takes to build
The surprising part: it's cheaper than people assume.
The instinct is that training an AI model means millions of dollars and a data center. For this project, that instinct is wrong — and understanding why is the heart of the case.
- We never train from scratch. We fine-tune open models that already exist — ivrit.ai and Whisper — and adapt them to the Torah register. That alone turns a multi-million-dollar problem into a low-six-figure one.
- Compute is a rounding error. The GPU time for the whole project runs from a few thousand dollars to roughly $25K — not the bottleneck, not even close.
- For canonical Hebrew, the transcript already exists. The pointed text of Tanach, Mishnah, and Gemara is already digitized in Sefaria and Dicta. So instead of paying to write the words down, we align known text to the audio — a genuine Torah-specific advantage no one else has. (Honest caveat: chanted and cantillated audio — leining, te'amim — still needs careful hand review.)
- The budget is people, not machines. The real cost is Torah expertise and careful annotation — the human judgment that makes a transcript trustworthy — rather than hardware.
- The hard languages cost real money. Yiddish, Aramaic, and Yeshivish have almost no open data, and the only people who can annotate them well are scarce experts. That work is multi-year and several times larger than the Hebrew core — it's where the serious money goes.
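The "align known text to the audio" idea can be sketched in a few lines. This is a simplified illustration, not the project's pipeline: it assumes a rough first-pass transcript with word timestamps already exists (the `draft` list here is invented sample data), and it uses Python's standard `difflib` to match that draft against the canonical text, standing in for a real forced aligner. Words that match inherit their timestamps; everything else is left untimed and flagged for hand review.

```python
from difflib import SequenceMatcher


def align_known_text(canonical_words, draft):
    """Attach timestamps from a rough draft to the known, correct text.

    canonical_words: list of words from the digitized source text.
    draft: list of (word, start_sec, end_sec) from a rough first pass
           (hypothetical format, assumed for this sketch).
    Returns a list of (canonical_word, start_sec, end_sec); the times are
    None wherever the draft did not match and a person should review.
    """
    draft_words = [word for word, _, _ in draft]
    matcher = SequenceMatcher(a=canonical_words, b=draft_words, autojunk=False)

    aligned = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Draft agrees with the canonical text: reuse its timing.
            for offset in range(i2 - i1):
                _, start, end = draft[j1 + offset]
                aligned.append((canonical_words[i1 + offset], start, end))
        else:
            # Mismatch or missing words: keep the canonical words, no timing.
            # Extra draft words with no canonical counterpart are dropped.
            for word in canonical_words[i1:i2]:
                aligned.append((word, None, None))
    return aligned


if __name__ == "__main__":
    # Transliterated sample; the draft mishears one word and skips another.
    canonical = ["bereshit", "bara", "elohim", "et", "hashamayim"]
    draft = [
        ("bereshit", 0.0, 0.6),
        ("bara", 0.6, 1.0),
        ("elokim", 1.0, 1.7),
        ("hashamayim", 2.0, 2.9),
    ]
    for word, start, end in align_known_text(canonical, draft):
        status = "needs review" if start is None else f"{start:.1f}-{end:.1f}s"
        print(f"{word:12s} {status}")
```

The design choice worth noting is that the canonical text always wins: the draft contributes timing only, so its mistakes never reach the final transcript, and the untimed words form the review queue for a human annotator.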
The one-time-gift case
One focused effort builds the durable, open assets.
A founding effort doesn't have to do everything at once. It has to build the two things the whole community then uses for free: a Hebrew on-device model that works, and the open benchmark that lets anyone measure and improve it. Those are durable, open assets — built once, owned by no one and everyone, never re-purchased.
Reaching that milestone is a focused, low-six-figure effort, modest against a bill the closed world would otherwise keep paying, piece by piece, every single day. The full multilingual vision is a larger, multi-year undertaking, and its cost is overwhelmingly the expert people who can do Yiddish, Aramaic, and Yeshivish well. We're glad to walk through detailed figures with funders considering that work.
One gift. Free for the whole Torah world. Forever.