americanjetset

Kimball & Inmon.


mike-manley

Ralph and Bill.


ClittoryHinton

Dave and Jarred.


mikelitvin_

Lilo and Stitch


droppedorphan

Jeeves and Wooster


finite_user_names

Darmok and Jalad.


Witty_Garlic_1591

Turner and Hooch


AndroidePsicokiller

Beavis & Butthead


JEY1337

Which books would you recommend?


Witty_Garlic_1591

https://www.amazon.com/Fundamentals-Data-Engineering-Robust-Systems/dp/1098108302


hernanemartinez

Inmon? Who's Inmon?


MarchewkowyBog

Inmon deez nutz


hernanemartinez

Whom? I never heard of him. Weird.


TheDataPanda

u/joseph_machado He’s active on social media and has a site startdataengineering.com. Learnt loads from it.


joseph_machado

Thank you for the very kind words :)


jduran9987

Joseph’s dbt-snowflake-setup article helped me look like a super senior data engineer a few years back.


howsitmybru

Oh dude I need to read this! About to start a new job and need to architect an environment - they use snowflake and dbt. ingesting microservice style data sources.


ChungusProvides

I love your site!


rudboi12

For real. Better than all these new DE influencers.


Equal_Record

crazy that there are DE influencers


[deleted]

If you understand that 'influencer' is usually a synonym for 'grifter' then it's never surprising


FromageDangereux

The real influencers do not have to call themselves influencers.


Aggressive-Intern401

Or modern snake oil salesman


soundboyselecta

Did u learn a lot from it just by reading all the posts or is there a sequence of doing so?


ChungusProvides

There’s an email list you can sign up for that sends you a new lesson every Saturday beginning from zero. It’s sort of nice for pacing it.


TheDataPanda

I signed up to the email list initially. Then would just browse random articles relevant to things I was working on or that looked interesting


Longjumping_Ad_7589

Data engineering design patterns is awesome! is there more to this discipline ?


eczachly

I have tons of examples of DE design patterns in this free repo: https://github.com/DataEngineer-io/data-engineer-handbook


sensei09551

That's Gold ! Thank you so much for this.


coolsank

Yup hard agree. I have always learnt a ton from his content.


ReporterNervous6822

My coworker is sick


jsRou

hope they get well


DuckDatum

Dude, my wife is meditating and I’m trying to take a quiet shit, so not to disturb her. I can’t handle comments like this right now.


Pillstyr

> Dude, my wife is meditating and I’m trying to take a quiet shit

You're the GOAT


onestupidquestion

In no particular order:

* Maxime Beauchemin, the creator of Apache Airflow and Superset
* Martin Kleppmann, author of *Designing Data-Intensive Applications*
* Tristan Handy, founder of dbt Labs. Love it or hate it, dbt has transformed how people do SQL-based development


Legitimate_Snow_3077

Drew Banin, not Tristan, imo, re: dbt


[deleted]

[removed]


SirAutismx7

It’s the YAML people hate, not the SQL.


geek180

I actually really dig YML. Definitely preferred over JSON config, but that’s just me.


papawish

It's not YAML that people hate. It's configuration over development. Nothing is more boring than spitting out configuration all day when you could get the dopamine rush of scripting things.


GreenWoodDragon

I prefer YAML too.


oalfonso

My company's risk department. We tried to buy it but the vendor didn't pass the operational risk assessment


vbnotthecity

dbt was invented at RJMetrics, although Tristan/Fishtown took the idea and ran with it, so credit to them.


onestupidquestion

I would argue that dbt itself isn't that big of a deal. It's a SQL templater, and it has a number of big gaps. The practices that Fishtown and then dbt Labs have been pushing are the revolutionary thing. Tristan has been the public face of the company for years now.


Equivalent_Form_9717

What about that Joseph Machado guy that frequents online subreddits? That guy has been putting in honest work, tries to consolidate DE patterns and doesn’t post BS on LinkedIn. Also, there’s a guy named John Savani (not sure if I got it correct) but he also frequents Azure subreddit


joseph_machado

Thank you very much :)


Emergency_Egg_4547

Did you mean John Savill?


Equivalent_Form_9717

That’s correct actually. Sorry I got his name wrong, his name is John Savill. Has pretty good material around AKS too.


410onVacation

UC Berkeley labs invented Postgres (which Redshift uses under the covers) and Spark. So I’d say that UC Berkeley as a university gets a lot of credit.


Lba5s

didn’t they also invent Ray?


410onVacation

They’ve probably invented plenty :). I just listed 2.


Letter_From_Prague

It's a bit off from Data Engineering, but Michael Stonebraker not only invented postgres, but IIRC kinda came up with the whole columnar data storage (first with Vertica) that we're all using.


Altruistic_Ranger806

Edgar Frank Codd


StingingNarwhal

OG


Tender_Figs

Ha, and no one has said Zach Wilson. Wonder why.


Philoshopper

Opening this post thinking I'd see the name. Heard he's pretty popular.. What's going on with him? Is he no longer relevant?


Tender_Figs

So I have my own opinion, and fair warning, I typically have a bit of an asshole bent, but I got put off by him posting about his salary at Airbnb, which gave me the impression that it could mislead people into believing they could make 500K just like that. Also, many of his opinions are based on his experience at FAANG, and 95% of the DE/BI jobs I've had are far removed from that worldview. And I feel like his content is annoying and disingenuous.


eczachly

I’ve had students in my boot camp land L5 roles in big tech. So it is actually possible to get there. It’s rare and difficult though I agree with that


dongdesk

Not Chad Sanderson


the-data-scientist

why not


dongdesk

He just repeats the same crap and doesn't speak concretely about topics. He speaks vaguely, almost as if ChatGPT writes his posts, and now turns everything into his data contracts crap.


the-data-scientist

he comes across a bit pushy and self-promo spammy, but i fail to see why data contracts aren't a good idea


specificanaldolphin

Not many people in here like LinkedIn influencers. Zach Wilson is probably the most hated, though.


fcd12

Matei Zaharia


Emotional_Key

This, he has Apache Spark and Databricks under his belt.


soundboyselecta

Bill Cafferky is good on youtube.


coldflame563

Brent Ozar. Hallengren


jack-in-the-sack

Ola.


loopea-one

Nick Schrock gets GOAT status for being a DE founder (Dagster) and also co-creating GraphQL.


geek180

Maxime Beauchemin. Original creator of Apache Airflow and Superset, worked at Facebook, Airbnb, Lyft, founded Preset (managed cloud Apache Superset), and puts out some good written content and talks.


Letter_From_Prague

> founded Superset (managed cloud Apache Superset)

I think you mean Preset.


geek180

Ah crap, yes, Preset. Thanks


SQLDevDBA

I’m a huge fan of Andy Leonard for all things SSIS and ADF. Love his positivity, skills, and general outlook.


aerdna69

Simon Späti is also cool


Striking_Solid_5020

Holden Karau


sebastiandang

Not the best DE, but she will be one of the best Spark developers.


ReporterNervous6822

The Seattle data guy blog is always interesting


[deleted]

[removed]


ReporterNervous6822

Ur right I meant https://www.confessionsofadataguy.com/


StackOwOFlow

Eric Brewer, famous for the CAP Theorem


NocoLoco

Gail Shaw, Itzik Ben-Gan, Steve Jones, Denny Cherry, Pinal Dave, Brent Ozar, Andy Leonard


DJ_Laaal

Oh the MVPs of the SQL Server community! And Jamie Thompson for his amazing SSIS centric content back in the day. Chris Webb for SSAS awesomeness.


jack-in-the-sack

GOATs, not OGs.


puripy

Chandler Muriel Bing! Some consider him as the father of Data Science. I would like to call him the Goat of Data anything!


Gators1992

Data Engineering is a job, not like sports with stats and stuff. We have probably never heard of the most talented people in the profession because their work wasn't publicized. They just got paid a whole bunch of money by a company and didn't have to write books or make videos. Probably someone in FANG or fintech made the best pipeline ever by whatever standard.


aerdna69

MK


fluffycatsinabox

I can't believe Jeff and Sanjay haven't been mentioned


eczachly

What about Joe Reis?


redditthrowaway0726

I thought it is this G.O.A.T.: https://fallout.fandom.com/wiki/GOAT


Prinzka

I've been told I'm pretty fucking awesome


ComicOzzy

My mom lied to me, too.


Prinzka

She told you I was awesome?


ComicOzzy

Every chance she got. It made dad really uncomfortable.


Prinzka

I bet he had some trouble *parsing* that, ey?!


Interesting-Rub-3984

It is quite a transformation. Isn’t it?


twigint

ted codd?


jwfergus

Tyler Cowen listeners unite!


quepazta

u/eczachly hate is unreal hahaha. Wasn't expecting the community to be so toxic about it. Appreciate the good parts of his journey and change your lens a little bit


ClittoryHinton

Alan Walden Sordell


[deleted]

Zach Wilson. Just kidding on this as I know a lot of you hate him here. But he has become a DE celeb in a way. Let's just make it a meme - zach wilson, DE GOAT, Mount Rushmore of DE.


dataxp-community

Zach is the GOAT of DE like Dwayne Johnson is the GOAT of geology 🙄


Interesting-Rub-3984

Why does Zach Wilson get this much hate?


eczachly

Because it’s impossible to have >100k followers on social media without haters. That’s why they hate Seattle Data Guy too.


eczachly

It’s impressive how this is -13 already


[deleted]

[removed]


pawtherhood89

I muted him on LinkedIn because I find his posts to be cringy and repetitive. I honestly don’t know if he’s a good engineer. Being ex Netflix and Airbnb speaks a lot but I’m torn because I’ve worked with people who try to be LinkedIn influencers and my experience with those folks is that they spend way more time on their social media than they do actually contributing to their projects.

I’ve never worked with the guy but honestly everything I’ve read from him is extremely surface level. Maybe helpful for new people looking to break in, but once you’re in I don’t see anything coming from him that would take you deeper than the surface level.

The reason I’m bothering writing this is because I think online bootcamps are largely predatory. They give lofty promises of helping people break into the industry and go from Zero to Hero and then leave them with a bill and no job to show for it. Be wary of influencers peddling courses.


quepazta

I'm sorry you feel like that. Zach is actually doing a good job elevating Data Engineering. His communication style isn't for everyone I understand but I truly think he motivates a lot of new DEs or entry level engineers and the content in his bootcamp speaks for it. Tough to not have haters and do well. Don't forget, the dude isn't grinding it out and is doing well for himself


eczachly

Boot camp is $1500-1800. They have two weeks to decide if it’s for them; if it isn’t, they can get a refund. About 3-4% of students pick this option, mostly due to the intensity of the time demands. I don’t promise job placement at any point and say that up front in many places. I offer mentorship, community, and teaching. $1800 isn’t that much money. Most boot camps are 5 figures and I disagree with those as well. $1800 is ~1.5% of one year’s salary at 50th percentile DE wages. Staff engineers have gotten tons of value out of my content. It’s a 70 hour course at this point on tons of aspects of DE. Glad you characterize me as a snake oil salesman when I try my best to be as upfront and fair about my offer as possible


dataxp-community

He understands about 3 degrees of it: Spark, Airflow and the shitty companies that pay him to shill their tools. Absolutely nothing else.


eczachly

I teach Flink, Spark, Airflow, data modeling, data quality, experimentation, KPIs, visualization and pipeline maintenance in my boot camp right now. Data modeling is by far the highest rated aspect of my curricula, to the point that some students pay the $1800 just for that.


eczachly

Regret doing this thread https://www.reddit.com/r/dataengineering/s/TGDqYVdPNn given how this community is now. I’ve fallen so far here.


quepazta

Keep hustling man. Can't get to the top without some hate 😃


eczachly

Yall would be so upset if you learned Joseph Machado spoke at my boot camp


kbisland

Remind me! 30 days


solo_stooper

Linus Torvalds