An open-source initiative for the Torah world
Most Torah ever taught lives only as audio that no one can search, read, translate, or preserve. TorahScribe is building the free, open, on-device speech-to-text that changes that — accurate for Hebrew, Yiddish, Aramaic, and Yeshivish, and owned by the whole community, forever.
A communal public good in formation — seeking founding funders, data partners, and builders.
The problem
Every day, in batei midrash and shuls and on the phone, an ocean of Torah is spoken — and most of it vanishes into audio files that can't be searched, quoted, reviewed, printed for Shabbos, or made accessible to those who can't hear. The general transcription tools that exist mangle it: they confuse Hebrew, Aramaic, and English, and miswrite pesukim and Gemara phrases.
A few proprietary services have started to solve this commercially — proof the need is real. But a capability this essential to limud haTorah shouldn't live behind a paywall or depend on one company. It should be open infrastructure: free to use, free to build on, and able to run anywhere.
Why now
Until recently, accurate speech-to-text meant sending your audio to someone else's servers. That era is ending. Modern open models can run directly on a phone or laptop, with no internet — which matters for three reasons:
An open model on your device can't be shut down, rate-limited, paywalled, or lost. It works in a beis medrash with no wifi, on an old phone, forever.
Sensitive recordings — a beis din, a private chaburah, a family's testimony — never have to leave the device or touch a company's servers.
The base models are good enough now and freely licensed. If the community doesn't build the open Torah layer today, the capability calcifies inside closed silos.
What we're building
Open model weights and open code under a permissive license. Anyone — a yeshiva, an archive, a developer, even another service — can use and build on it without permission or fees.
Designed to run offline on a phone or laptop, not just in the cloud — so it's private, free to run, and works anywhere.
Built to handle what generic tools fail on: Hebrew, Aramaic, Hasidic Yiddish, and Yeshivish — with the names, pesukim, and Gemara phrases written correctly.
Stewarded as shared infrastructure, with rabbinic oversight for accuracy and the sensitivity that sacred texts deserve. Institutions keep ownership of their own transcripts.
Who it serves
Automatic captions for the deaf and hard-of-hearing — opening shiurim that have never been accessible to them.
Turn vast audio libraries into searchable, quotable, printable Torah text, findable by topic, source, or phrase.
Convert decades of gedolim's recordings into permanent, searchable text — a zikaron that outlives the tape.
Accurate transcripts make translation tractable — extending a rav's Torah to learners in other languages.
Let dozens of Torah organizations build on one shared engine instead of each paying to solve transcription again.
A yeshiva can run it on its own archive and own its own transcripts — no per-seat dependency on any vendor.
How it's built
Roadmap
Publish the first open Torah transcription benchmark and measure where today's models really stand on shiurim, daf yomi, and davening.
Adapt an open Hebrew model to Torah content and ship a working demo, including a phone transcribing a shiur with no internet connection.
Extend to Hasidic Yiddish and Talmudic Aramaic through data partnerships with the archives and communities that hold this audio.
Harden quality, release open weights and tools, and establish TorahScribe as durable communal infrastructure.
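For readers wondering what "measuring where today's models stand" could look like in practice, one common yardstick for speech-to-text is word error rate (WER): the number of word substitutions, deletions, and insertions needed to turn a model's output into a trusted human transcript, divided by the length of that transcript. The sketch below is only an illustration under assumptions, not TorahScribe's actual benchmark method. The function name and the sample phrases are hypothetical, and a real benchmark would also need to decide how to normalize nikud, punctuation, and spelling variants before comparing.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative only: splits on whitespace and does no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in the output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one wrong word and one dropped word out of four.
human_transcript = "amar rabbi akiva hakol"
model_output = "amar rabbi akiba"
print(word_error_rate(human_transcript, model_output))
```

A lower score is better, and 0.0 means the output matches the human transcript word for word. Reporting the score separately for shiurim, daf yomi, and davening would show where generic models struggle most.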
Get involved
TorahScribe is a non-commercial, open initiative in formation. We're looking for the founding partners who understand that shared infrastructure is the highest-leverage investment in the Torah world's future.
Seed the benchmark, the first model, and the on-device demo. This is roads, not an app — built once, used by everyone.
Archives, yeshivos, and shiur platforms that can share audio and, in return, get their own libraries made searchable and accessible.
ML engineers, Hebrew/Yiddish/Aramaic linguists, and Torah scholars who want to lend their expertise.