Hacker News | bluehark's comments

How large was the dataset used for post-training?


We used two types of datasets for post-training: supervised finetuning data and preference data for the RLHF stage. You can actually use fewer than 1M samples to significantly boost the aesthetics. Quality matters A LOT. Quantity helps with generalisation and stability of the checkpoints though.


How is the data collected?


The highest quality finetuning data was hand curated internally. I would say our post-training pipeline is quite similar to the Seedream 2.0 to 3.0 series from ByteDance. Like them, we use extensive quality filters and internal models to get the highest quality possible. Even then, we still hand curate a final subset.


Do you have an NVIDIA optimized version? Similar to how RTX accelerated FLUX.1 Kontext: https://blogs.nvidia.com/blog/rtx-ai-garage-flux-kontext-nim...


We have not added a separate RTX-accelerated version for FLUX.1 Krea, but the model is fully compatible with the existing FLUX.1 dev codebase. I don't think we made a separate ONNX export for it though. A 4- to 8-bit quantized version with SVDQuant would be a nice follow-up, so that the checkpoint is friendlier to consumer-grade hardware.
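
For a rough sense of why lower-bit weights matter on consumer GPUs, here is a back-of-the-envelope sketch. It assumes a 12B-parameter model (the commonly cited size for FLUX.1 dev; treat it as an assumption here) and counts weight storage only, ignoring activations, text encoders, the VAE, and any per-method overhead such as quantization scales. It does not implement SVDQuant itself.

```python
# Back-of-the-envelope weight memory at different precisions.
# ASSUMPTION: 12e9 parameters; only weight storage is counted.
PARAMS = 12e9


def weight_gib(params: float, bits: int) -> float:
    """Return weight storage in GiB for a given bit width."""
    return params * bits / 8 / 2**30


for bits in (16, 8, 4):
    print(f"{bits:>2}-bit: {weight_gib(PARAMS, bits):.1f} GiB")
```

Under these assumptions, going from 16-bit to 4-bit weights cuts the weight footprint to a quarter, which is roughly the difference between needing a workstation card and fitting on a mid-range consumer GPU.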


I highly recommend pypika by Kayak: https://github.com/kayak/pypika

I have used it in multiple projects and found it strikes the right balance between ORMs and writing raw SQL. It's also easily extensible and takes care of the many edge cases and nuances of rolling your own SQL generator.


Yup, after my own initial research I found it as well, and I like it a lot. I talk more about it here: https://death.andgravity.com/own-query-builder#sqlbuilder-py...


> and takes care of the many edge cases and nuances of rolling your own SQL generator

Want to elaborate?


For one, it can output more than one flavor of SQL: https://pypika.readthedocs.io/en/latest/3_advanced.html#hand...

Since SQL is ever-so-slightly different across databases, I imagine trying to cover all of them as a single dev is a nightmare (especially if that's not the problem you're trying to solve).

I wrote my own query builder because I know for sure I'm only targeting SQLite. The second I need my feed reader library to work with another database engine I'm dumping my own for something more serious – either a full blown database abstraction layer like SQLAlchemy or Peewee (likely without the ORM part), or something simpler like PyPika or python-sql.[1]

[1]: I talk more about them here: https://death.andgravity.com/own-query-builder#sqlbuilder-py...
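
To make the "ever-so-slightly different" point concrete, here is a minimal sketch of one such difference: identifier quoting. SQLite and PostgreSQL accept standard double quotes, while MySQL uses backticks by default. The `quote_ident` and `select` helpers are hypothetical names for illustration only. This is not PyPika's API, just a toy showing the kind of detail a multi-dialect builder has to track.

```python
# Toy illustration of per-dialect identifier quoting. Not PyPika's API;
# quote_ident() and select() are hypothetical helpers.
QUOTE_CHARS = {
    "sqlite": '"',
    "postgresql": '"',
    "mysql": "`",
}


def quote_ident(name: str, dialect: str) -> str:
    """Quote an identifier, doubling any embedded quote character."""
    q = QUOTE_CHARS[dialect]
    return q + name.replace(q, q + q) + q


def select(table: str, columns: list[str], dialect: str) -> str:
    """Build a simple SELECT statement for the given dialect."""
    cols = ", ".join(quote_ident(c, dialect) for c in columns)
    return f"SELECT {cols} FROM {quote_ident(table, dialect)}"


print(select("entries", ["id", "title"], "sqlite"))
print(select("entries", ["id", "title"], "mysql"))
```

Quoting is only the easy part. Pagination syntax, upserts, and type names also diverge between engines, which is why targeting a single database keeps a hand-rolled builder manageable.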


wow....$125/user/mo


Average dev salary at my org is north of 100k, but I do work in the US, so let's say we have a developer earning half that. At $125 a month, it works out to around 3% of monthly compensation (not including taxes and benefits). The tool only has to improve productivity by a tiny amount to be worth it.
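
The arithmetic behind that 3% figure, as a quick sketch. The salary and seat price come from the comment above; the 160 working hours per month used for the break-even estimate is an added assumption.

```python
# Seat cost as a share of gross monthly pay, using the figures from the comment.
ANNUAL_SALARY = 50_000       # "half" of a ~100k salary
SEAT_COST_PER_MONTH = 125    # quoted price per user per month
HOURS_PER_MONTH = 160        # ASSUMPTION: roughly 40 hours x 4 weeks

monthly_pay = ANNUAL_SALARY / 12
share = SEAT_COST_PER_MONTH / monthly_pay

# Hours of developer time per month the tool would need to save to break even.
break_even_hours = share * HOURS_PER_MONTH

print(f"share of monthly pay: {share:.1%}")
print(f"break-even: {break_even_hours:.1f} hours/month")
```

On these numbers the seat pays for itself if it saves a little under five hours a month, before counting taxes and benefits, which would lower the threshold further.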


I think it'd be really hard to prove that any single tool gives you a specific productivity boost. Most engineers have a tool-set in which all the tools work together nicely, and taking one out or adding one might break the whole setup. Established engineers usually don't work in a vacuum; they already have their setups in place, so a 3% extra cost might be very hard to justify for unclear benefits, if any. I'm not trying to make a definitive argument, just some food for thought.


That is not how it works. Having a water cooler in the office increases productivity 100x vs not having water, but that does not mean you should pay millions for one. That is a myth invented by SaaS vendors and consultants to justify their sky-high prices. The value offered of course factors into the price, but so do many other things (scarcity of the materials and resources needed to produce the good, cost of production, maintenance cost, the price of competing products, risk of vendor lock-in, etc.).


>That is not how it works,

Plenty of places pay X for tools that add more than X in productivity value.

In fact, nearly every tool I have ever gotten at a company worked like this. Most of them are also willing to test pricey tools to see if they would pay off, and when they do, the company starts buying such tools.

If you don't work at such a place, look for a place that values developer time.


I worked for a Fortune 20 company, so you can drop the patronizing tone. Paper and a pencil also increase productivity by a lot (perhaps more than any tool); that does not mean you need to pay 5% of your developers' monthly salary for them.


But, you likely would pay 5% of salary (or more) for a paper and pencil (to continue with your analogy) if you had no other choice and there was no alternative tool that could substitute. So I'm not sure what point you are trying to make.


So you are repeating my original point, congratulations: productivity is not the only thing that factors into the price. Go back and read it.


Unless… You and me we launch a blockchain-AI-SaaS startup selling pencils by subscription!


Create a landing page. I will write a couple of posts saying that Sam Altman and Paul Graham are the smartest guys this century and we will soon be launching a Show HN, new YC company.


Keurig cups cost more per dev per day than JetBrains tools do...


Get your devs a French press


Just on this topic: dedicated napping spaces might be the single biggest productivity boost per dollar you can buy for an in-office dev team.

Oh and noise cancelling headphones.


It's the enterprise version. They can afford it. Besides, it's not like everyone in the org will have a seat, only the people doing data science.


I work in a very large enterprise that could definitely benefit from such a product, but at that cost I'm not even going to try mentioning it, it's never going to pass.

We have engineers in the thousands so that would be a budget in the millions per year, for a tool for which it's hard to demonstrate the productivity benefit over alternatives. Not going to happen.

It's not that the company can't afford it, but enterprises are strict when it comes to spending money. There are lots of processes, checks, and people that sign off.
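
A quick sketch of how the per-seat price scales to the "millions per year" mentioned above. The $125/user/month price is from the thread; the seat counts are hypothetical, since the comment only says "in the thousands".

```python
# Annual licence cost for a given number of seats.
# The default price is the $125/user/month quoted in the thread;
# the seat counts below are hypothetical examples.
def annual_cost(seats: int, price_per_seat_month: float = 125.0) -> float:
    """Return the yearly cost in dollars for the given seat count."""
    return seats * price_per_seat_month * 12


for seats in (1_000, 5_000):
    print(f"{seats:>5} seats: ${annual_cost(seats):,.0f} per year")
```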


On-prem solutions like this (even in a private cloud) always seem to be pricier, and this one includes dedicated support too. They have a cheaper $19/month plan and a free tier at https://datalore.jetbrains.com/


This is not unreasonable for data science products.


Free version lets you create up to 3 (!) notes: https://www.serenity.re/en/notes/pricing



It's nice to see a hack turn into an API extension that addresses the underlying problem.

