feat: place snapshot meta-columns first (#1481) #1486
Since this changes the default column ordering for snapshot tables, best to add this behind a flag (refer to behaviour flag, you can keep default as |
I have made a comment on the original issue that there might be a better (less intrusive) way of accomplishing the desired behavior.
Hi @aarushisingh04 , I have verified an alternative approach for using |
@sd-db understood, and it's definitely better for me to have learnt of a cleaner approach! all good 👍
fixes #1481
description
databricks collects statistics and enables automatic liquid clustering only on the first 32 columns of a delta table. previously snapshot meta-columns were appended after user columns, pushing them out of that window on wide tables.
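To make the effect on wide tables concrete, here is a minimal Python sketch of the ordering problem described above. It is only an illustration, not the macro's actual logic: the helper names and the 40-column source table are hypothetical, the meta-column names are dbt's defaults, and the 32-column window is taken from the description.

```python
# Illustrative sketch only: models column order, not the real dbt macro.
STATS_WINDOW = 32  # leading columns covered by statistics, per the description

# dbt's default snapshot meta-column names (configurable in real projects).
META_COLUMNS = ["dbt_scd_id", "dbt_updated_at", "dbt_valid_from", "dbt_valid_to"]


def snapshot_column_order(user_columns, meta_columns=META_COLUMNS, meta_first=True):
    """Return the column order of a newly built snapshot table (hypothetical helper)."""
    user = list(user_columns)
    meta = list(meta_columns)
    return meta + user if meta_first else user + meta


def columns_in_stats_window(columns, window=STATS_WINDOW):
    """Return the columns that fall inside the first `window` positions."""
    return set(columns[:window])


# Hypothetical wide source table with 40 user columns.
wide_table = [f"col_{i}" for i in range(40)]

old_order = snapshot_column_order(wide_table, meta_first=False)  # meta appended
new_order = snapshot_column_order(wide_table, meta_first=True)   # meta first

old_covered = [c for c in META_COLUMNS if c in columns_in_stats_window(old_order)]
new_covered = [c for c in META_COLUMNS if c in columns_in_stats_window(new_order)]

print(old_covered)  # []
print(new_covered)  # ['dbt_scd_id', 'dbt_updated_at', 'dbt_valid_from', 'dbt_valid_to']
```

With the meta-columns appended, a 40-column source pushes all four of them past position 32, so none is covered. Placing them first keeps them inside the window regardless of how wide the source is.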
this PR
- updates `databricks__build_snapshot_table` in `snapshot_helpers.sql` to place the meta-columns first in the SELECT list, before the user columns expanded by `*`
- allows `dbt_valid_to` and `dbt_scd_id` to benefit from automatic liquid clustering out of the box

the `hard_deletes = 'new_record'` path is handled correctly, placing `dbt_is_deleted` alongside the other meta-columns at the front. custom column names configured via `snapshot_meta_column_names` are respected through the existing `get_snapshot_table_column_names()` dispatch, consistent with the rest of `snapshot_helpers.sql`.
Note
this change only affects newly created snapshot tables. existing snapshots will continue to work unchanged with the old column order. if you want the new column order on an existing snapshot for the statistics/clustering benefit, you can recreate it with `dbt snapshot --full-refresh`, but be aware this will permanently drop all historical SCD2 data. back up the table first.
checklist
stated issue
`CHANGELOG.md` and added information about my change to the "dbt-databricks next" section.