
Ah, I don't know much about multimodal models, but I wonder what they'd make of pixel art representing the factory, where each pixel is a point on the grid and each color is a specific entity, perhaps ignoring things such as bots flying about. Probably easier to comprehend than an actual screenshot?


The thing is that when you create a dense ASCII representation, any gain you might make from the spatial relationships is lost to: a) the tokeniser not working on individual characters (remember strawberrry), and b) the increased number of 'dead' tokens that encode very little.

Our sparse encoding seems to confuse the models less - even though it certainly isn't perfect.
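
As a rough illustration of the trade-off (a sketch only: the entity names, coordinates, and both output formats are made up for this example and are not the actual encoding discussed here), a dense grid spends one character on every tile, empty or not, while a sparse listing only mentions tiles that hold something:

```python
# Hypothetical entities as (name, x, y); invented purely for illustration.
entities = [("furnace", 2, 1), ("belt", 5, 3), ("inserter", 6, 3)]


def dense_ascii(entities, width, height):
    """Render one character per tile; empty tiles become '.'."""
    grid = [["." for _ in range(width)] for _ in range(height)]
    for name, x, y in entities:
        grid[y][x] = name[0].upper()
    return "\n".join("".join(row) for row in grid)


def sparse_listing(entities):
    """Render one line per entity; empty tiles are never written out."""
    return "\n".join(f"{name} at ({x}, {y})" for name, x, y in entities)


dense = dense_ascii(entities, width=8, height=4)
sparse = sparse_listing(entities)

# Characters in the dense form that carry no entity information.
empty_tiles = dense.count(".")

print(dense)
print(sparse)
print(f"{empty_tiles} of {8 * 4} tiles are empty filler")
```

How many tokens each form costs depends on the tokeniser, so the sketch only counts filler characters rather than claiming a token figure.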


I mean, at some point you compress the board state down to Dwarf Fortress, with an extended ASCII representation for each grid state (maybe 2 bytes each?).


Lots of questions here - you need item, orientation, info about pipes (2 directions), belts (3 or 4 colors x 2 directions). Do you want circuits as well?
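
To see how tight 2 bytes gets, here is a minimal bit-packing sketch. The field layout (10 bits of item id, 2 bits of orientation, 2 bits of belt tier, 2 bits spare) and the function names are assumptions made up for this example, not anything proposed in the thread:

```python
import struct

# Hypothetical layout of one 16-bit cell:
#   bits 4-13: item id (0..1023)
#   bits 2-3:  orientation (0..3, e.g. N/E/S/W)
#   bits 0-1:  belt tier / color (0..3)
#   bits 14-15 are left spare.
ITEM_BITS, ORIENT_BITS, TIER_BITS = 10, 2, 2


def pack_cell(item_id, orientation, tier):
    """Pack the three fields into a single 16-bit integer."""
    if not 0 <= item_id < (1 << ITEM_BITS):
        raise ValueError("item_id out of range")
    if not 0 <= orientation < (1 << ORIENT_BITS):
        raise ValueError("orientation out of range")
    if not 0 <= tier < (1 << TIER_BITS):
        raise ValueError("tier out of range")
    return (item_id << (ORIENT_BITS + TIER_BITS)) | (orientation << TIER_BITS) | tier


def unpack_cell(value):
    """Recover (item_id, orientation, tier) from a packed 16-bit integer."""
    tier = value & ((1 << TIER_BITS) - 1)
    orientation = (value >> TIER_BITS) & ((1 << ORIENT_BITS) - 1)
    item_id = (value >> (ORIENT_BITS + TIER_BITS)) & ((1 << ITEM_BITS) - 1)
    return item_id, orientation, tier


def cell_to_bytes(value):
    """Serialize a packed cell as exactly 2 little-endian bytes."""
    return struct.pack("<H", value)


packed = pack_cell(item_id=37, orientation=3, tier=2)
raw = cell_to_bytes(packed)
roundtrip = unpack_cell(struct.unpack("<H", raw)[0])
```

With this layout only 2 bits remain, which is why circuit connections would likely need a separate structure rather than fitting into the per-cell code.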



