Incremental Load vs Full extract
Hi,
We’re starting to implement Power BI in our organisation, and I’d really appreciate your advice.
We use a third-party provider for our case management system. The proposed setup is to export data as an Excel file, which is automatically dropped into a SharePoint folder and then picked up by Power BI.
Initially, they suggested a single wide extract instead of multiple logical tables (see post (https://www.stl-training.co.uk/post-43016-single-wide-extract-vs.html#43016). We are pushing back on this for multiple extracts.
I wanted to get your advice on another aspect the system provider has recently introduced — incremental data loads.
They are proposing the following approach:
a) Initial full extract of all data
b) Then subsequent files containing only new and modified records (rather than a full daily extract)
I understand this is a common approach in larger data environments, but I’d really value your guidance on whether this is appropriate in our scenario.
Our current data volumes (for context)
~700 records in the main (parent/master) table
~20–30 child tables (assessments, needs, experiences, etc.)
Each child table currently ~2,000–5,000 rows
Growth maybe 300 new cases per year
Questions
1) At what point is data considered “large enough” to justify incremental loading in Power BI rather than a full refresh approach?
2) For a dataset of this size, would you recommend: Continuing with full daily extracts, or Implementing an incremental load pattern from the outset?
3) From a practical perspective, what are the main advantages and disadvantages of incremental loading in Power BI, particularly in terms of: Complexity of setup and maintenance, Risk of data inconsistencies (e.g. missed updates/deletes), Debugging and support overhead
Given that I’m the sole person building and maintaining the model, would incremental loading typically be considered over-engineering at this stage, or still a good design choice for future-proofing?
As mentioned in my earlier post, we’ve also been advised to avoid a single wide extract and instead use separate logical tables aligned with the underlying data model. I’m therefore trying to ensure we make the right decision now on:
Data structure (multiple tables vs wide file)
Data refresh approach (full vs incremental)
…so that the model remains simple, reliable, and maintainable long-term.
Any guidance or real-world examples would be really helpful.
Thanks in advance,
Vandana
RE: Incremental Load vs Full extract
Hi Vandana,
Thank you for the forum question.
________________________________________
1. When is incremental loading justified?
Incremental loading becomes appropriate when at least one of the following applies:
• Data volumes reach millions of rows
• Full refresh times become slow (typically more than 5–10 minutes)
• The source system cannot handle repeated full extracts
• Historical data grows large but changes infrequently
• There is a need to optimise refresh windows or infrastructure load
Your current dataset is very small:
• Total rows likely under 100,000
• Growth is modest
• Refresh will likely take seconds
Conclusion: your data is far below the threshold where incremental loading is necessary.
________________________________________
2. What should you do in your case?
You should use full daily extracts.
This is the better approach for your situation because:
Simplicity
• No need to build logic for merging, updates, or deletes
• Straightforward Power Query transformations
Reliability
• Each refresh is a complete and consistent view of the data
• No risk of missing updates or partial data
Maintainability
• Easy to troubleshoot and explain
• Easier for others to take over later if needed
Performance
• Your dataset is small enough that full refresh cost is negligible
Incremental loading adds no meaningful benefit at your scale.
________________________________________
3. Advantages and disadvantages of incremental loading
Advantages
• Reduces data processing for very large datasets
• Faster refresh when only a small portion of data changes
• Less load on source systems in high-volume environments
Disadvantages
Complexity
• You must handle inserts, updates, and deletes explicitly
• Often requires staging logic or additional data layers
Risk of inconsistencies
• Missed updates can lead to stale data
• Duplicate or conflicting records may appear
• Deletes are particularly difficult to handle correctly
Debugging difficulty
• Errors are harder to trace back to a specific load
• Troubleshooting becomes time-consuming
Maintenance overhead
• Requires monitoring and validation
• More moving parts to maintain as a single developer
________________________________________
4. Is incremental loading over-engineering in your case?
Yes.
For your current data size and growth:
• It introduces unnecessary complexity
• It increases risk without delivering tangible benefits
• It makes the model harder to maintain as a solo developer
Incremental loading is not future-proofing in this context. It is premature optimisation.
________________________________________
5. Data structure decision (important confirmation)
You are correct to push for multiple logical tables instead of a single wide extract.
Multiple tables are better because:
• They align with a star schema
• Improve model performance
• Reduce data duplication
• Make relationships clearer
• Simplify DAX and reporting
A single wide file tends to create long-term issues in maintainability and performance.
________________________________________
Final recommendation
For your environment:
• Use multiple logical tables aligned with the underlying data model
• Use full daily extracts
• Keep the architecture simple
Only reconsider incremental loading if your dataset grows to millions of rows or refresh performance becomes a real issue.
________________________________________
If you want to go further, the next useful step would be defining a clean star schema for your specific tables and relationships.
Kind regards
Jens Bonde
Microsoft Office Specialist Trainer
Tel: 0207 987 3777
STL - https://www.stl-training.co.uk
98%+ recommend us
London's leader with UK wide delivery in Microsoft Office training and management training to global brands, FTSE 100, SME's and the public sector


