
It's less about "crimes" and more about a moral or ethical boundary that people feel is being crossed.

Yeah, think of it as a moral crime. Someone can avoid taxes completely legally, but that doesn't make it fair or right.

What is structured water?

Yeah okay... Surprised to see this as the top comment.

> Hexagonal water, also known as gel water, structured water, cluster water,[1] H3O2 or H3O2 is a term used in a marketing scam[2][3] that claims the ability to create a certain configuration of water that is better for the body.[4]

> The concept of hexagonal water clashes with several established scientific ideas. Although water clusters have been observed experimentally, they have a very short lifetime: the hydrogen bonds are continually breaking and reforming at timescales shorter than 200 femtoseconds.[7] This contradicts the hexagonal water model's claim that the particular structure of water consumed is the same structure used by the body.

https://en.wikipedia.org/wiki/Hexagonal_water


Though funnily enough, you can make real 'structured water' at home in your freezer. Making your ice crystals hexagonal is theoretically possible, but it's really, really hard to grow monocrystalline water ice. That might be a really interesting niche hobby, though.

See https://www.youtube.com/watch?v=VA710QYxEu0 for the latter.


Well yes, that’s in a solid state. Lots of crystals have hexagonal structures since it’s the optimal packing distribution.

If “structured water” just means that there are tiny ice crystals in water, sure that’s very plausible, but I doubt it would have much of an effect.

PS: Trying to grow crystals of different challenging structures does sound like an awesome hobby.


Oh, the pseudo-science 'structured water' is absolutely bonkers. I just went off on a mildly interesting tangent.

It's a so-called fourth phase of water (liquid, but with some crystalline organization) that supposedly grows on hydrophilic surfaces by absorbing ultraviolet and infrared light, and organizes into a honeycomb-like lattice similar to ice, but lacking the H+ binding layers that would make it rigid. It has higher viscosity than bulk water, and a net-negative charge.

Yes, it's a relatively recent concept (decades) pursued mostly by Gerald Pollack at University of Washington and not widely replicated, though there is some replication that has prompted critical review (https://pmc.ncbi.nlm.nih.gov/articles/PMC7404113/). It's also downstream of work by Albert Szent-Györgyi (Nobel prize for vitamin C) and Gilbert Ling. And, of course, there are a bunch of folks Pollack distances himself from commercializing the concept.

From the horse's mouth: https://www.pollacklab.org/research

If I had a coloring book for every person who cited wikipedia as a reliable source on cutting-edge science... I'd have Christmas presents for a bunch of people I don't know!


Hey, I'm really interested in your pipeline techniques. I've got some PDFs I need processed, but processing them in the cloud with the big providers requires redaction.

Wondering if a local model or a self-hosted one would work just as well.


I run llama.cpp with Qwen3-VL-8B-Instruct-Q4_K_S.gguf plus mmproj-F16.gguf for OCR and translation. I also run llama.cpp with Qwen3-Embedding-0.6B-GGUF for embeddings. Drupal 11 with ai_provider_ollama and a custom provider ai_provider_llama (heavily derived from ai_provider_ollama), with PostgreSQL and pgvector.

People on site scan the documents and upload them for archival. The directory monitor looks for new files in the archive directories, and once a new file is available, it is uploaded to Drupal. Once new content is created in Drupal, Drupal triggers the translation and embedding process through llama.cpp. Qwen3-VL-8B is also used for chat and RAG. The client is familiar with Drupal and CMSes in general and wanted to stay in a similar environment. If you are starting fresh, I would recommend looking at docling.
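For what it's worth, the "directory monitor looks for new files" step could be sketched roughly like this. The names (`scan_for_new_files`, the `upload` callback) are mine, not the actual pipeline's; the real upload would presumably POST to a Drupal REST/JSON:API endpoint.

```python
import time
from pathlib import Path

def scan_for_new_files(archive_dir, seen):
    """Return files in archive_dir not yet in `seen`, and mark them seen."""
    new_files = []
    for path in sorted(Path(archive_dir).glob("*.pdf")):
        if path.name not in seen:
            seen.add(path.name)
            new_files.append(path)
    return new_files

def watch(archive_dir, upload, interval=5.0):
    """Poll archive_dir forever; hand each new file to the upload callback."""
    seen = set()
    while True:
        for path in scan_for_new_files(archive_dir, seen):
            upload(path)  # hypothetical: POST the file to Drupal here
        time.sleep(interval)
```

A real deployment would likely use inotify or a systemd path unit instead of polling, but the shape is the same.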


Are you linking any of the processes using the Drupal AI module suite?

Yes, they are all linked using Drupal's AI modules. I have an OpenCV application that removes the old paper look, enhances the contrast and fixes the orientation of the images before they hit llama.cpp for OCR and translation.
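The contrast-enhancement idea, stripped down to a toy pure-Python version: the real step presumably uses OpenCV routines (e.g. CLAHE) on full images, while `stretch_contrast` here just min-max rescales a flat list of grayscale values.

```python
def stretch_contrast(pixels):
    """Linearly rescale grayscale values (0-255) to span the full range.

    A toy stand-in for the OpenCV contrast-enhancement step described
    above; operates on a flat list of pixel intensities.
    """
    lo, hi = min(pixels), max(pixels)
    if hi == lo:
        return list(pixels)  # flat image: nothing to stretch
    return [round((p - lo) * 255 / (hi - lo)) for p in pixels]
```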

Disclaimer: I'm an AI novice relative to many here. FWIW, last weekend I spent a couple of hours setting up self-hosted n8n with Ollama and gemma3:4b [EDIT: not Qwen-3.5], using PDF content extraction for my PoC. 100% local workflow, no runtime dependency on cloud providers. I doubt it'd scale very well (MacBook Air M4, measly 16 GB RAM), but it works as intended.

For those who wish to do OCR on photos, like receipts, or PDFs or anything really, Paperless-NGX works amazingly well and runs on a potato.

How do you extract the content? OCR? PDF to text, then feed into Qwen?

I tried something similar where I needed a bunch of tables extracted from a PDF over about 40 pages. It was crazy slow on my MacBook, and inaccurate.


If you have a basic ARM MacBook, GLM-OCR is the best single model I have found for OCR with good table extraction/formatting. It's a compact 0.9B-parameter model, so it'll run on systems with only 8 GB of RAM.

https://github.com/zai-org/GLM-OCR

Use mlx-vlm for inference:

https://github.com/zai-org/GLM-OCR/blob/main/examples/mlx-de...

Then you can run a single command to process your PDF:

  glmocr parse example.pdf

  Loading images: example.pdf
  Found 1 file(s)
  Starting Pipeline...
  Pipeline started!
  GLM-OCR initialized in self-hosted mode
  Using Pipeline (enable_layout=true)...

  === Parsing: example.pdf (1/1) ===
My test document contains scanned pages from a law textbook. It's two columns of text with a lot of footnotes. It took 60 seconds to process 5 pages on a MBP with M4 Max chip.

After it's done, you'll have a directory output/example/ that contains .md and .json files. The .md file will contain a markdown rendition of the complete document. The .json file will contain individual labeled regions from the document along with their transcriptions. If you get all the JSON objects with

  "label": "table"
from the JSON file, you can get an HTML-formatted table from each "content" section of these objects.
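That filtering step could look like this. I'm assuming, as a simplification, that the .json file is a flat list of region objects with "label" and "content" keys; the actual GLM-OCR output may nest things differently.

```python
import json

def extract_tables(json_text):
    """Collect the HTML 'content' of every region labeled as a table."""
    regions = json.loads(json_text)
    return [r["content"] for r in regions if r.get("label") == "table"]

# Illustrative input mimicking the labeled-region structure:
sample = json.dumps([
    {"label": "text",  "content": "Some paragraph."},
    {"label": "table", "content": "<table><tr><td>1</td></tr></table>"},
])
print(extract_tables(sample))
```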

It might still be inaccurate -- I don't know how challenging your original tables are -- but it shouldn't be terribly slow. The tables it produced for me were good.

I have also built more complex workflows that use a mixture of OCR-specialized models and general-purpose VLM models like Qwen 3.5, along with software to coordinate and reconcile operations, but GLM-OCR by itself is the best first thing to try locally.


Thanks! Just tried it on a 40-page PDF. It seems to work for single images, but the large PDF gives me connection timeouts.

I also get connection timeouts on larger documents, but it automatically retries and completes. All the pages are processed when I'm done. However, I'm using the Python client SDK for larger documents rather than the basic glmocr command line tool. I'm not sure if that makes a difference.

Yeah, looks like the CLI retries as well. I was able to get it working with a higher timeout.

Cool! For GLM-OCR, do you use "Option 2: Self-host with vLLM / SGLang" and in that case, am I correct that there is no internet connection involved and hence connection timeouts would be avoided entirely?

When you self-host, there's still a client/server relationship between your self-hosted inference server and the client that manages the processing of individual pages. You can get timeouts depending on the configured timeouts, the speed of your inference server, and the complexity of the pages you're processing. But you can let the client retry and/or raise the initial timeout limit if you keep running into timeouts.
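The retry behavior can be sketched generically like this. To be clear, this is not GLM-OCR's actual SDK API, just an illustration of retry-with-backoff wrapped around a per-page request.

```python
import time

def with_retries(fn, attempts=3, base_delay=1.0, retry_on=(TimeoutError,)):
    """Call fn(), retrying with exponential backoff on timeout-like errors.

    Raises the last error if every attempt fails.
    """
    for attempt in range(attempts):
        try:
            return fn()
        except retry_on:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))
```

Raising the client's configured timeout has the same effect as a larger `base_delay` here: slower pages get more headroom before the client gives up.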

That said, this is already a small and fast model when hosted via MLX on macOS. If you run the inference server with a recent NVIDIA GPU and vLLM on Linux, it should be significantly faster. The big advantage of vLLM for OCR models is its continuous batching capability. Using other OCR models that I couldn't self-host on macOS, like DeepSeek 2 OCR or Chandra 2, vLLM gave dramatic throughput improvements on big documents via continuous batching when I processed 8-10 pages at a time. This is with a single 4090 GPU.


1. Correction: I'd planned to use Qwen-3.5 but ended up using gemma3:4b.

2. The n8n workflow passes a given binary pdf to gemma, which (based on a detailed prompt) analyzes it and produces JSON output.

See https://github.com/LinkedInLearning/build-with-ai-running-lo... if you want more details. :)


Python pdftools to convert the pages to images, and Tesseract to OCR them into text files. Fast, free, and it runs on CPU.
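A rough sketch of that pipeline, shelling out to poppler's pdftoppm and then Tesseract; it assumes both CLI tools are installed, and the helper names are mine.

```python
import subprocess
from pathlib import Path

def pdftoppm_cmd(pdf_path, out_prefix, dpi=300):
    """poppler's pdftoppm: rasterize each PDF page to out_prefix-NN.png."""
    return ["pdftoppm", "-r", str(dpi), "-png", str(pdf_path), str(out_prefix)]

def tesseract_cmd(image_path, out_base):
    """tesseract writes its transcription to out_base.txt."""
    return ["tesseract", str(image_path), str(out_base)]

def ocr_pdf(pdf_path, work_dir):
    """Rasterize a PDF, OCR each page image, and return the joined text."""
    work = Path(work_dir)
    work.mkdir(parents=True, exist_ok=True)
    subprocess.run(pdftoppm_cmd(pdf_path, work / "page"), check=True)
    texts = []
    for png in sorted(work.glob("page*.png")):
        out_base = png.with_suffix("")  # tesseract appends .txt itself
        subprocess.run(tesseract_cmd(png, out_base), check=True)
        texts.append(out_base.with_suffix(".txt").read_text())
    return "\n".join(texts)
```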

Seconded, would also love to hear your story if you would be willing


Let's see, this is a low speed 2x16GB DDR4 kit for $300.

The closest option on the pcpartpicker chart was about $75 as a stable price. So that one's only a 4x increase.

Versus DDR5 where... it looks like a 5x increase to me? I'm seeing a jump from 200USD up to 1000USD. Edit: Oh there's an extra jump in the last month on the CAD version but not the USD version.


That was like $80 last year.

A few years ago I did a bit more of a crude flow.

Play the footage on a TV in a dark room. Place a 4K camera on a tripod and record the TV, feeding the TV's audio into the camera's audio port.

Worked perfectly.


Actually not a terrible way to go from interlaced to progressive footage, depending on the TV and camera.

I'd love to see some data for how much it has improved via this process in the last week


It would be the same as Kimi K2.5, the underlying model.


Surely this Musk project will happen in the time he says it will.


Musk has lots of experience with ignoring lack of experience; should speed the process up.


This is sarcasm, right?


It's probably because normal people usually don't buy routers; they get one included with their internet subscription. So the people who do buy them have a specific need that normal routers don't meet.


It's a travel router, which power users buy to get good connectivity away from home and the office. A hotel won't offer you that (and chances are they'll try to rip you off on their WiFi).


Assuming you can find an Ethernet port to supply it, that is. Most hotels don't make them easy to find and use, if they even have them.

More common is that you use the travel router to connect to the hotel WiFi and then share out that connection. It's slower than connecting directly, but it's great for family travel since you can name your travel SSID the same as your home network: all your usual devices will connect automatically, and will use any whole-connection VPN you have set up (most of the GL.iNets will do WireGuard, OpenVPN, and Tailscale straight out of the box, and they'll let you into LuCI or SSH to configure the underlying OpenWrt directly for anything else). And, of course, it's just one device for hotels that try to limit the number of devices you use.


As far as travel and hotels go, another huge benefit is that the router enables devices without captive portal support. On a recent trip I could use:

- the Fi base station for my dogs' trackers (huge for me)

- a FireTV stick (no need to trust that hotel streaming apps will clear your credentials like they claim)

Also I can WireGuard back home automatically for select IP ranges (no need to configure WireGuard separately on many of my devices)
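Selective routing like that is typically done through WireGuard's AllowedIPs setting, which controls which destination ranges get sent through the tunnel. A sketch with illustrative addresses and placeholder keys:

```ini
# wg0.conf on the travel router (all addresses are illustrative)
[Interface]
PrivateKey = <router-private-key>
Address = 10.8.0.2/32

[Peer]
PublicKey = <home-server-public-key>
Endpoint = home.example.net:51820
# Only traffic for the home LAN and VPN subnet goes through the tunnel;
# everything else uses the hotel uplink directly.
AllowedIPs = 192.168.1.0/24, 10.8.0.0/24
PersistentKeepalive = 25
```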


Yeah, and especially the satisfaction of making a user delighted to use your thing. Fixing bugs, making things faster, adding new features: for me personally, I do it because it feels really good when a customer loves using the thing I've built.

Whether I've done the manual coding work myself or prompted an LLM to make these things happen, I still chose what to work on and whether it was worthy of the users' time.


I use my wired sex toys usually. Ethernet works really well.


10G Ethernet has improved their performance so greatly.

