markmacardle's comments

  Location: Berlin, Germany
  Remote: No
  Willing to relocate: No
  Technologies: Python, SQL, Terraform, Bash, GCP, AWS, Docker, BigQuery, Snowflake, dbt, Airflow
  Résumé/CV: https://www.linkedin.com/in/markmacardle/
  Email: markmacardle1 at gmail dot com
Hey, I'm a data engineer with 4.5 years of experience. I've just finished a master's in Computer Science, so I'm now looking for my next role. My past experience includes building a data platform on GCP and working in an AWS-based team. I'm originally from Ireland but have been living in Berlin for 2.5 years now and speak B2-level German.


I think your TLDR is accurate, but I think the mechanism is likely even simpler than Google looking at the last page visit.

The docs are written by and for experienced programmers. They're very dense with information but light on examples and comments that explain things for dummies.

Python's popularity means the vast majority of its users are novices who would find the docs hard to read. Geeksforgeeks et al. are providing content for them, and so are actually a better resource to show the majority of searchers (especially given that it's a basic string formatting question, something very likely to be searched for by novices).


I think two things are conflated in the question: helping others for no money or only a token gesture of thanks (e.g. helping a friend paint their house for a beer), and doing a side job for cash (e.g. house painting for acquaintances on the weekend at cheap rates).

I'm pretty sure the first is totally fine in any country. I'm also pretty sure the second would be considered a tax dodge in any country, as it's just income you don't declare. In Ireland, a cash job like that is called a "nixer".


What is the motivation behind attacks like this? Is whoever put in the effort to do this just trying to annoy GitLab's customers? What do they gain from that?


Judging from the example email, the name of the spammer is "fixspam gitlab", so I'm guessing they want GitLab to fix spam.


It's intended for syncing data sources to a data warehouse for analysis. Even if you're only syncing one data source, if you want analysis to be easy later you'll likely want to sync all fields available from the API (as you probably don't know which fields are of interest before analysis).

If you imagine doing this for Stripe, say, there's a huge number of fields available in different objects (charges, invoices, subscriptions, etc.) and you need to add these as columns to your relational database or data warehouse. Unnesting may also be needed. That's very tedious work, and on top of it you need to run, monitor and maintain the extraction process.

Even a small company could easily have 10+ data sources containing dozens of tables that they want to sync to a warehouse, so this quickly becomes unmanageable. Hence companies like Stitchdata, Fivetran and now Airbyte selling it as a service.
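
To give a rough idea of the unnesting step mentioned above, here's a minimal Python sketch that flattens a nested API object into one row of warehouse-style columns. The `flatten` helper and the `charge` payload are made up for illustration; the payload is not the real Stripe schema, and real connectors also have to deal with arrays, type changes and schema drift.

```python
def flatten(record, parent_key="", sep="_"):
    """Flatten nested dicts into a single-level dict of column name -> value."""
    flat = {}
    for key, value in record.items():
        # Build the column name from the path of keys, e.g. a_b_c
        column = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested objects and merge their columns in
            flat.update(flatten(value, column, sep))
        else:
            flat[column] = value
    return flat


# Hypothetical charge-like payload (illustrative only)
charge = {
    "id": "ch_123",
    "amount": 2000,
    "billing_details": {
        "name": "Jane",
        "address": {"city": "Berlin", "country": "DE"},
    },
}

row = flatten(charge)
print(sorted(row))  # the column names this one object would need
```

Multiply that by every object type in every source and you can see why people pay for it.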

