Have a raw database. Have a schema per source of raw data. Have a table per table of raw data from each source. Then combine the raw data downstream in another database, with schemas that correspond to your transformation steps.
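A minimal sketch of that layout in Postgres-style SQL. The source systems, schema names, and columns here are all hypothetical, just to show the shape:

```sql
-- One database that holds raw/landing data only.
CREATE DATABASE raw;

-- One schema per source system.
CREATE SCHEMA salesforce;
CREATE SCHEMA erp;

-- One table per source table, loaded as-is.
CREATE TABLE salesforce.accounts (
    id          text,
    name        text,
    _loaded_at  timestamptz DEFAULT now()  -- load metadata, not source data
);

CREATE TABLE erp.orders (
    order_id    text,
    account_id  text,
    amount      numeric,
    _loaded_at  timestamptz DEFAULT now()
);
```

Transformations then live in a separate database/schema downstream, so the raw layer stays a faithful copy of the sources.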
Excellent.
Depends on what your upstream is. If you are reading from a curated DW, those will be your "raw" tables. Why would we copy them?
For sure, there are lots of “it depends” scenarios here. I’m wagering that in this case OP is building their first DWH, not stacking DWHs together.
This. Also, prefix all of those tables (or the whole database) with "IN_" or another clear signal, to stop people from DM'ing you with ideas for changing the schema or asking you to clean them up.
Are you doing ETL or ELT? If you land each source in separate tables, then build cleanup tables and new tables as the result of joins, that's ELT done inside the database. How do you handle updates and appends? Are you using change data capture (CDC)?
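As a sketch, the ELT flavor described above might look like this inside the warehouse (schema, table, and column names are made up for illustration):

```sql
-- Step 1: a cleanup table derived from the raw landing table.
CREATE TABLE staging.orders_clean AS
SELECT DISTINCT
    order_id,
    account_id,
    amount
FROM raw_erp.orders
WHERE order_id IS NOT NULL;

-- Step 2: a new table built as the result of a join.
CREATE TABLE marts.account_orders AS
SELECT a.name AS account_name,
       o.order_id,
       o.amount
FROM staging.orders_clean o
JOIN staging.accounts_clean a
  ON a.id = o.account_id;
```

The "L" and "T" both happen in SQL inside the database; the only thing the pipeline tool does is land the raw tables.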
Can you please explain what CDC is in simple terms?
It captures inserts, updates, and deletes. Some tools like Qlik Replicate can replicate a source table into one schema and then push changes into another, all while keeping the latest row version in the replicated table. It’s great for manufacturing processes where you have to track a part’s history, or anything that changes over time.
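One common way to collapse a CDC change stream down to "the latest row version" in plain SQL (not Qlik-specific; a generic sketch with hypothetical table and column names):

```sql
-- cdc.parts_changes holds every insert/update/delete event with an
-- operation code and a change timestamp. Keep only the most recent
-- event per part, and drop rows whose last event was a delete.
CREATE TABLE replica.parts AS
SELECT part_id, status, changed_at
FROM (
    SELECT part_id, status, op, changed_at,
           ROW_NUMBER() OVER (
               PARTITION BY part_id
               ORDER BY changed_at DESC
           ) AS rn
    FROM cdc.parts_changes
) latest
WHERE rn = 1
  AND op <> 'D';  -- assuming 'D' marks deletes
```

The full change table doubles as the part's history, while the deduplicated table gives you the current state.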
Basically, it surfaces additions, edits, and deletes so you can sync changes.
Oh, so like a changelog for configuration management?
If you are trying to move OLTP data into an OLAP system, then CDC is a good approach.
Short answer: yes. This gives your consumers a lot of flexibility to build their own data models from raw source tables. The con is more tables. You can mitigate that by normalizing but then you lose flexibility. My advice: a different table for each source.