Public Schedule Face-to-Face & Online Instructor-Led Training - View dates & book

flexiblechanging csvs

Forum home » Delegate support and help forum » Microsoft Power BI training and help » Flexible/changing csvs

Flexible/changing csvs

ResolvedVersion 365

Flexible/changing csvs

Hi,

We’re starting to implement Power BI in our organisation, and I’d really appreciate your advice.

We use a third-party provider for our case management system. The proposed setup is to export data as an Excel file, which is automatically dropped into a SharePoint folder and then picked up by Power BI.

As advised on the previous posts, we are already pushing back on
1) Asking for multiple logical tables instead of a single wide extract file (https://www.stl-training.co.uk/post-43016-single-wide-extract-vs.html#43016)
2) No incremental load, just a full daily extract (https://www.stl-training.co.uk/post-43037-incremental-load-vs-full.html)

There is one more thing they have proposed which sounded amazing, but with deeper thinking, I am not entirely sure.

The system provider is proposing to deliver “flexible” CSV extracts. In practical terms, this means that if a new field is added in the case management system, it will automatically appear as a new column in the CSV extract. This sounded great because it meant that we do not need to keep making requests to their report team for modifying the csv’s every time there is a change and it happens automatically.

• However, from a Power BI perspective, what are the things I need to be mindful of?
• In your experience, is this kind of fully flexible / schema-changing extract considered good practice for Power BI models?
• Are there recommended approaches to balance flexibility and stability, for example: allowing new fields to be added but ensuring no renaming/removal of existing fields

Any guidance would be really helpful.

Many thanks.

Best regards,
Vandana

RE: Flexible/changing csvs

Hi Vandana,

Thanks for the forum question and good luck with your Power BI development.

This is a great question—and you’re right to pause before accepting “fully flexible” extracts. What sounds convenient upstream can create fragility downstream in Power BI if not controlled.
Below is a practical, experience-based view of the trade-offs and how to handle this safely.
________________________________________
1) What to be mindful of in Power BI
Schema changes can silently break transformations
Power Query (M) is schema-sensitive. Most steps reference column names explicitly.
Typical issues:
• If a column is renamed → transformations fail
• If a column is removed → refresh errors
• If column order changes → usually fine, but can break index-based logic
• If new columns appear → usually ignored unless you use dynamic logic
Example:
= Table.SelectColumns(Source, {"CaseID", "Status", "CreatedDate"})
If “Status” disappears, refresh fails.
________________________________________
Data types and model layer impact
Power BI doesn’t just load data—it builds a semantic model.
If schema changes:
• New columns won’t automatically appear in visuals
• Measures depending on renamed fields will break
• Relationships may fail if key columns change
• Data types might be inferred differently (dangerous with CSV)
________________________________________
CSV-specific risks
CSV is already a weak format compared to structured sources:
• No enforced schema
• No metadata (types, keys, constraints)
• Easy to corrupt with formatting inconsistencies
Adding schema drift on top increases risk significantly.
________________________________________
Refresh reliability becomes harder to guarantee
Enterprise BI best practice values predictable refreshes.
Flexible schema introduces:
• Uncontrolled changes
• Higher chance of refresh failures
• More time spent troubleshooting data issues
________________________________________
2) Is fully flexible / schema-changing good practice?
Short answer: No, not for production BI models.
It’s generally considered:
• Acceptable for exploratory or data lake ingestion layers
• Risky for curated BI datasets
• Poor practice for governed reporting environments
In mature BI architectures, you typically separate:
• Raw ingestion (flexible)
• Curated model (stable schema)
Your provider is essentially proposing to merge these layers, which creates instability.
________________________________________
3) A better balance: flexibility with control
You’re already thinking in the right direction. The goal is:
Allow evolution without breaking stability
Here are proven approaches:
________________________________________
A) Contract-based schema (strongly recommended)
Define a data contract with the provider:
Allowed:
• Adding new columns
Not allowed:
• Renaming existing columns
• Removing columns
• Changing data types
This gives you:
• Backward compatibility
• Safe evolution
________________________________________
B) Split datasets: raw vs curated
Pattern:
1. Raw layer (flexible)
o Load CSV as-is
o Minimal transformations
o Schema can drift
2. Curated layer (stable)
o Reference only known columns
o Apply business logic
o Feed reports
This way:
• New columns don’t break anything
• You opt-in to new fields deliberately
________________________________________
C) Defensive Power Query design
You can make your queries more resilient.
Use column existence checks
= Table.SelectColumns(Source, {"CaseID", "Status"}, MissingField.Ignore)
Avoid hard failures
• Use try … otherwise
• Use Record.HasFields for dynamic logic
________________________________________
D) Explicit schema enforcement step
At some point, “freeze” the structure:
ExpectedColumns = {"CaseID", "Status", "CreatedDate"},
CleanedTable = Table.SelectColumns(Source, ExpectedColumns, MissingField.Ignore)
This:
• Protects your model
• Ignores unexpected columns
• Prevents schema creep
________________________________________
E) Change notification process
Even with flexibility, you need governance:
Ask the provider for:
• Notification when fields are added
• Schema change log
• Versioning if possible
Without this, changes become invisible until something breaks.
________________________________________
F) Consider Parquet or structured alternatives (if possible)
If you can influence it:
• Parquet or database export is far better than CSV
• Supports schema more reliably
• Better performance and typing
________________________________________
4) Practical recommendation for your situation
Given everything you’ve described, the best position to take is:
1. Accept adding new columns automatically
2. Reject uncontrolled schema changes (rename/remove/type change)
3. Implement a stable curated model in Power BI
4. Treat the CSV as a semi-raw input, not your reporting layer
________________________________________
5) Simple rule of thumb
Think of it like this:
• Flexibility is good at the edges (data ingestion)
• Stability is essential at the core (reporting model)
Your provider is optimizing for convenience. You need to optimize for reliability and trust in reporting.
________________________________________
6) One-line answer you can take back to the vendor
You could say:
“We’re happy for new fields to be added automatically, but we require backward-compatible schema—no renaming, deletion, or data type changes to existing fields, and advance notice of changes.”
________________________________________
Below is a robust, production-style Power Query (M) template for safely ingesting flexible / changing CSV files from a SharePoint folder.
This pattern is designed to:
• Tolerate new columns appearing
• Avoid breaking when columns are missing
• Enforce a stable schema for your model
• Handle multiple files (latest file pattern)
• Keep transformation logic resilient
________________________________________
Here is an example of a robust Power Query template that safely handles evolving CSVs without breaking your dataset.

Power Query Template: Safe CSV Ingestion
1) Parameters (optional but recommended)
Create a parameter for your folder path:
FolderPath = "https://yourtenant.sharepoint.com/sites/yoursite/Shared Documents/Data/"
________________________________________
2) Get latest CSV from SharePoint folder
let
Source = SharePoint.Files(FolderPath, [ApiVersion = 15]),

FilterCsv = Table.SelectRows(Source, each Text.EndsWith([Name], ".csv")),

SortByDate = Table.Sort(FilterCsv, {{"Date modified", Order.Descending}}),

LatestFile = SortByDate{0},

FileContent = LatestFile[Content]
in
FileContent
________________________________________
3) Read CSV safely
let
Csv = Csv.Document(
FileContent,
[
Delimiter = ",",
Encoding = 65001,
QuoteStyle = QuoteStyle.Csv
]
),

PromoteHeaders = Table.PromoteHeaders(Csv, [PromoteAllScalars = true])
in
PromoteHeaders
________________________________________
4) Standardise column names (optional but helpful)
let
CleanNames = Table.TransformColumnNames(
PromoteHeaders,
each Text.Trim(Text.Clean(_))
)
in
CleanNames
________________________________________
5) Define your expected schema (core stability layer)
This is the key control step.
ExpectedColumns = {
"CaseID",
"Status",
"CreatedDate",
"ClosedDate",
"Owner"
}
________________________________________
6) Safely select only known columns
This avoids breaking if columns are missing and ignores unexpected new ones.
let
SelectedColumns = Table.SelectColumns(
CleanNames,
ExpectedColumns,
MissingField.Ignore
)
in
SelectedColumns
________________________________________
7) Apply data types defensively
Use try ... otherwise so refresh won’t fail.
let
Typed =
Table.TransformColumns(
SelectedColumns,
{
{"CaseID", each try Text.From(_) otherwise null, type text},
{"Status", each try Text.From(_) otherwise null, type text},
{"CreatedDate", each try Date.From(_) otherwise null, type date},
{"ClosedDate", each try Date.From(_) otherwise null, type date},
{"Owner", each try Text.From(_) otherwise null, type text}
}
)
in
Typed
________________________________________
8) Optional: Capture new/unexpected columns (for monitoring)
This is a best practice for governance.
let
AllColumns = Table.ColumnNames(CleanNames),

NewColumns = List.Difference(AllColumns, ExpectedColumns),

NewColumnsTable = Table.FromList(NewColumns, Splitter.SplitByNothing(), {"NewColumn"})
in
NewColumnsTable
You can:
• Load this as a separate table
• Use it for alerting or QA
________________________________________
9) Final output (curated table)
let
Final = Typed
in
Final
________________________________________
What this template protects you from
Safe from:
• New columns appearing in CSV
• Missing columns (no refresh failure)
• Dirty column names
• Type conversion errors
• File replacement in SharePoint
Still requires control:
• Column renames (will behave like missing columns)
• Business logic changes
• Key field integrity
________________________________________
Recommended architecture with this template
Use two layers:
Raw query (keep everything)
• Loads full CSV
• No schema enforcement
• Hidden from report users
Curated query (the template above)
• References raw query
• Applies ExpectedColumns logic
• Feeds model
________________________________________
Optional enhancement: Add a schema check flag
You can add a simple warning flag:
let
MissingCols = List.Difference(ExpectedColumns, Table.ColumnNames(CleanNames)),
SchemaStatus =
if List.Count(MissingCols) > 0
then "Missing columns: " & Text.Combine(MissingCols, ", ")
else "OK"
in
SchemaStatus
________________________________________
Key takeaway
This pattern gives you:
• Flexibility at ingestion (provider can add fields)
• Stability in your model (you control what is used)
• Visibility of changes (optional monitoring step)
________________________________________



Kind regards

Jens Bonde
Microsoft Office Specialist Trainer

Tel: 0207 987 3777
STL - https://www.stl-training.co.uk
98%+ recommend us

London's leader with UK wide delivery in Microsoft Office training and management training to global brands, FTSE 100, SME's and the public sector

RE: Flexible/changing csvs

Hi Jens,

Many thanks for your response and advice, really appreciate it.

Kind regards,
Vandana

 

Training courses

Welcome. Please choose your application (eg. Excel) and then post your question.

Our Microsoft Qualified trainers will then respond within 24 hours (working days).

Frequently Asked Questions
What does 'Resolved' mean?

Any suggestions, questions or comments? Please post in the Improve the forum thread.

Power BI tip:

General Shortcuts

Ctrl + S: Save the report or dashboard.
Ctrl + Z: Undo the last action.
Ctrl + Y: Redo the last undone action.
Ctrl + C: Copy selected items.
Ctrl + X: Cut selected items.
Ctrl + V: Paste copied or cut items.
Ctrl + A: Select all items.
Ctrl + N: Create a new report or dashboard.

View all Power BI hints and tips

Connect with us:

0207 987 3777

Call for assistance

Request Callback

We will call you back

Server loaded in 0.11 secs.