
I am continuing with my proofreading and language learning efforts and have been working on tooling for it.

= Proofreading =

https://github.com/adhyeta-org-in/adhyeta-tools

provides image extraction from PDFs, OCR, and a basic but pleasant proofreading web UI.

Qwen 3/3.5 is good enough for OCR on books in Indic scripts. So that is what I am using. But you can configure the model that you want to use.
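Since the model is configurable, the OCR call is just a standard OpenAI-style vision request. A minimal sketch of how one might build such a request; the model name, prompt text, and helper function here are illustrative assumptions, not the actual adhyeta-tools code:

```python
import base64


def build_ocr_request(image_path: str, model: str = "qwen3-vl") -> dict:
    """Build an OpenAI-compatible vision request for OCR on one page image.

    The model name and prompt are placeholders; swap in whichever
    vision-capable model you have configured.
    """
    with open(image_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode("ascii")
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Transcribe the Devanagari text on this page exactly."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{b64}"}},
            ],
        }],
    }
```

The same payload works against any endpoint that speaks the OpenAI chat API, which is what makes the model choice a config option rather than a code change.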

I may add a Tesseract back end as well if necessary.

= Language Learning =

I have tried a few parallel text readers and was not satisfied by any of them. My website (https://www.adhyeta.org.in/) had a simple baked-in interface that I deleted soon after I developed it. However, this weekend, I sat down with Claude and designed one to my liking. I also ported the theming and other goodies from the website to this local reader. This will serve as a test bed for the Reader on the website itself.

LLMs now produce wonderful translations for most works. You can take an old Bengali book, have Claude/Gemini OCR a few pages and then also have it translate the content to English/Sanskrit. Then load it into the Reader and you are good to go!

I will release the Reader this month. Claude is nice, but I do not like the way it writes code. It often misses edge cases and even some basic things, and I have to remind it to handle them. So I want to refactor/rearrange some stuff and test the functionality end-to-end before I put it online.


India has a lot of languages, and people need access to something that allows them to do basic stuff with them. I don't think relying on the US is a long-term solution.

An example. I am into proofreading and language learning and am forced to rely on Claude/Gemini to extract text from old books because of the lack of good Indian models. I started with regular Tesseract, but its accuracy outside of the Latin alphabet is not that great. Qwen 3/3.5 is good with the Bombay style of Devanagari but craps the bed with the Calcutta style. And neither are great with languages like Bengali. In contrast, Claude can extract Bengali text from terrible scans and old printing with something like 99+ percent accuracy.

Models specifically targeted at Indian languages and content will perform better within that context, I feel.


llama-cpp provides an API server as well via llama-server (and a competent web GUI too).
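llama-server exposes an OpenAI-compatible endpoint, so a client needs nothing beyond the standard library. A minimal sketch, assuming a server already running locally; the model path and port are placeholders:

```python
import json
import urllib.request

# Start the server first (model path is illustrative):
#   llama-server -m ./model.gguf --port 8080


def chat_payload(prompt: str) -> bytes:
    """Encode a single-turn request for the OpenAI-compatible chat endpoint."""
    return json.dumps({
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }).encode("utf-8")


def ask(prompt: str,
        url: str = "http://localhost:8080/v1/chat/completions") -> str:
    """Send the prompt to llama-server and return the assistant's reply."""
    req = urllib.request.Request(
        url,
        data=chat_payload(prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]
```

Because the endpoint mimics the OpenAI API, any existing client library pointed at `localhost:8080` works unchanged.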


This is a typical technical solution to a sociopolitical problem. The powers-that-be are not comfortable with the free-for-all that exists on the internet. All these laws are meant to fix that squeaky wheel, one ball-bearing at a time.

"Children" gets the Right to march behind you unquestioningly. "Misinformation/Nazis" does the same for the Left. This is now a perfect recipe for a shit sandwich.


I agree. But if you find a different way to protect the children, that normal people can understand and relate to ("It's like buying beer"), and still maintain privacy, you take away at least one leg of support for what a lot of states really want to do (remove anonymity).

It's better than the fatalism in your comment IMO.


What is the useful life of something like this compared to an RCC structure? Do you have to keep painting them to protect them from rust?

You do see steel used in mobile towers etc., because you may not be able to place an RCC structure of that height on top of a building not designed for those loads. And in single-story workshops/sheds.


The useful life is less than RCC. An RCC structure is good for about 50 years, steel for 25. The 4x4 is even less, about 15 years.

No, you paint them initially when you build the structure. It's quite hard to paint afterwards.

Also, if you look closely, the steel is welded rather than bolted. Newer buildings are bolted nowadays, which increases their useful life.

Example: https://imgur.com/a/f4z84dx

This is the building currently under construction that I mentioned. Notice (1) the two layers of paint, and (2) bolts used instead of welds, compared with the steel-structure photos in the essay.


The issues with subscriptions to streaming services are manifold (if you ignore the gargantuan waste of time that mindless TV-watching is):

- the UI is deliberately crap

- the library is deliberately incomplete

- accessing content is deliberately complicated

I had an experience with this recently. My phone provider bundles 20+ OTT services into a single plan, within a single app that runs on your TV/phone/browser. The kicker: you can add stuff to a watch list, but the watch list is never exposed anywhere. While they want you to pay for stuff, they do not want you to be choosy about it.

YT has, to my mind, the best user interface of all the services I have tried.


Jellyfin is quite good.


Nice! His Shakespeare generator was one of the first projects I tried after ollama. The goal was to understand what LLMs were about.

I have been on an LLM binge this last week or so trying to build a from-scratch training and inference system with two back ends:

- CPU (backed by JAX)

- GPU (backed by wgpu-py). This is critical for me as I am unwilling to deal with the nonsense that is rocm/pytorch. Vulkan works for me. That is what I use with llama-cpp.

I got both back ends working last week, but the GPU back end was buggy. So this week has been about fixing bugs, refactoring the WGSL code, and making things more efficient.

I am using LLMs extensively in this process and they have been a revelation. Use a nice refactoring prompt and they are able to fix things one by one, resulting in something fully functional and type-checked by Astral's ty.
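To give a flavor of what such a backend has to implement: a minimal sketch of single-head scaled dot-product attention in plain numpy. This is illustrative only, not the actual project code; the same arithmetic would be expressed in jax.numpy on the CPU path and hand-written WGSL on the GPU path:

```python
import numpy as np


def attention(q: np.ndarray, k: np.ndarray, v: np.ndarray) -> np.ndarray:
    """Single-head scaled dot-product attention.

    q, k, v: arrays of shape (seq_len, d_model). This is the kind of
    kernel each backend (JAX on CPU, WGSL on GPU) must reproduce.
    """
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)                  # (seq, seq) similarity
    scores -= scores.max(axis=-1, keepdims=True)   # subtract max for stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax: rows sum to 1
    return weights @ v                             # weighted sum of values
```

Keeping the reference implementation this small is what makes it practical to diff the two backends' outputs when hunting GPU bugs.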


Unwilling to deal with pytorch? You couldn't possibly hobble yourself any more if you tried.


If you want to train/sample large models, then use what the rest of the industry uses.

My use case is different. I want something that I can run quickly on one GPU without worrying about whether it is supported or not.

I am interested in convenience, not in squeezing out the last bit of performance from a card.


You wildly misunderstand pytorch.


What is there to misunderstand? It doesn't even install properly most of the time on my machine. You have to use a specific python version.

I gave up on all tools that depend on it for inference. llama-cpp compiles cleanly on my system for Vulkan. I want the same simplicity to test model training.


pytorch is as easy as you are going to find for your exact use case. If you can't handle the requirement of a specific version of python, you are going to struggle in software land. ChatGPT can show you the way.


I have been doing this for 25 years and no longer have the patience to deal with stuff like this. I am never going to install Arch from scratch by building the configuration by hand ever again. The same with pytorch and rocm.

Getting them to work and recognize my GPU without passing arcane flags was a problem. I could at least avoid the pain with llama-cpp because of its Vulkan support. pytorch apparently doesn't have a Vulkan backend, so I decided to roll my own wgpu-py one.


FWIW, I've been experimenting with LLMs for the last couple of years, and have exclusively built everything I do around llama.cpp exactly because of the issues you highlight. "gem install hairball" has gone way too far, and I appreciate shallow dependency stacks.


Fair enough I guess. I think you'll find the relatively minor headache worth it. Pytorch brings a lot to the table.


I suspect the OP's issues might be mostly related to the ROCm version of PyTorch. AMD still can't get this right.


Probably - but the answer is to avoid ROCm, not pytorch.


Avoiding ROCm means buying a new Nvidia GPU. Some people would like to keep using the hardware they already have.


The cost to deal with rocm is > cost of a consumer nvidia gpu by orders of magnitude.


If you’re not writing/modifying the model itself but only training, fine-tuning, and running inference, ONNX Runtime now supports these with basically any execution provider backend, without needing to get into dependency version hell.
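ONNX Runtime works by falling through a list of execution providers in preference order, so the same model file runs on whatever hardware backend is actually available. A sketch of that selection logic; `pick_provider` is a hypothetical helper, though the provider names are the real ones ONNX Runtime registers:

```python
# Preference order: try GPU backends first, fall back to CPU.
# These are real ONNX Runtime execution-provider names.
PREFERRED = [
    "CUDAExecutionProvider",
    "ROCMExecutionProvider",
    "CoreMLExecutionProvider",
    "CPUExecutionProvider",
]


def pick_provider(available: list[str]) -> str:
    """Return the most preferred execution provider that is available."""
    for ep in PREFERRED:
        if ep in available:
            return ep
    raise RuntimeError("no usable execution provider")


# With onnxruntime installed, the real call is roughly:
#   import onnxruntime as ort
#   sess = ort.InferenceSession(
#       "model.onnx",
#       providers=[pick_provider(ort.get_available_providers())],
#   )
```

The point is that the dependency on a specific GPU stack lives in the provider, not in your code, which is what sidesteps the version-pinning problem.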


What are your thoughts on using JAX? I've used TensorFlow and Pytorch and I feel like I'm missing out by not having experience with JAX. But at the same time, I'm not sure what the advantages are.


I only used it to build the CPU back end. It was a fair bit faster than the previous numpy back end. One good thing about JAX (unlike numpy) is that it also gives you access to a GPU back end if you have the appropriate stuff installed.


> The CEO is also more puritan than the pope himself considering the amount of censorship it has.

In that case, you should try OpenAI's gpt-oss!

Both models are pretty fast for their size and I wanted to use them to summarize stories and try out translation. But it keeps checking everything against "policy" all the time! I created a jailbreak that works around this, but it still wastes a few hundred tokens talking about policy before it produces useful output.


Surely someone has abliterated it by now


I started using this a couple of days ago. It is a fully functional replacement for what I have been doing with WhatsApp. About 30-40% of my network is on it now and I have also created our Sanskrit channel on it.

What it is missing:

- E2E encryption for text messages

- Communities as a container for groups

- Chat exports

- UPI payment integration

Also, the servers are under pressure so messages can get delayed sometimes.

But Vembu has promised continuous development. So let's see.

I am using it regularly and do hundreds of messages every day across groups and contacts.


I have been planning to put out a quarterly Sanskrit newsletter for some time now, and was dreading having to deal with LaTeX. For basic stuff, LibreOffice PDF export works. But that is not a plain text workflow.

I then discovered typst and it is a breath of fresh air. Unicode/Devanāgarī support out of the box, no installing gigabytes of packages, near-instant compilation.

My compliments to those who got this done.


Where can we sign up for the newsletter?


I will post it on our website as well as reddit when it is ready. I am taking my time to ensure that it does not become a one-off thing and can continue for many quarters.

- https://www.adhyeta.org.in/

- https://old.reddit.com/r/adhyeta/

