
Do you have experience implementing ETL pipelines in Go? I think it'd be a better fit for us over our current language, but I'm curious to hear from people who've actually done it.


Yes. It works fairly well. That said, I've got a feeling that things would be a lot easier to change around if we weren't using it: we end up writing a lot of code to do relatively simple things.


I do this at my job. Disclaimer: I’m a web dev (“architect”) who does some lightweight data engineering tasks to facilitate views in some of my apps.

My pipelines are very simple (no DAG-like dependencies across pipelines). I could just have separate scripts, but instead I have a monorepo of pipelines that implement an interface with Extract, Transform, Load methods. I run this as a single process that runs pipelines on a schedule and has an HTTP API for manually triggering pipelines.

At some point I felt guilty that I was doing something nobody else seems to do, and that I had rolled my own poor-man’s orchestrator. I played around with Dagster and it was pretty nice, but I decided it was overkill for my needs (however, I definitely think the actual data analysis team at my company should switch from Jenkins to Dagster heh…)

On a separate note, all of my pipelines Load into Elasticsearch, which I’m using as a data warehouse. I’ve realized this is another unconventional decision I’ve made, but it also seems to work well for my use-cases.


What is your current language, and have you considered doing it in SQL?

I don't think Go will be the right choice. This is just not its strength.


It depends on what you're doing, right? The commenter here replied to me, and we're processing really large data files that are deliberately not in a SQL database due to size; only artefacts of these files eventually make it into a time series DB. For us Go works well and is performant without any great difficulty. For domain-specific analytics we generally use Python, and Go just calls out to an API to do them.


You are right. And I am mostly a "T" guy, so I guess my answer was mostly about the transform.

For extracting the data, Go is probably a very good choice. But for transforming, pretty often not, although your use case may be suitable.

In the end, the question was very open-ended.



