Power BI is Microsoft's enterprise business intelligence platform designed to connect data from multiple external systems, unify it inside a single governed data model, and deliver interactive dashboards. Most organizations store operational data across separate platforms: spreadsheets, relational databases, cloud warehouses, and third-party APIs, none of which communicate with each other natively. Power BI's core function is to bridge those disconnected systems into one consolidated, refreshable reporting layer.
A data source in Power BI is any external system that holds data: a file, a relational database, a cloud platform, or an API endpoint. Power BI accesses each source through a data connector, a built-in, authenticated integration that retrieves data securely and loads it into the report environment.
Power BI currently supports over 150 native connectors, covering the majority of business tools in active use. All connectors are accessed through the Get Data menu in Power BI Desktop. The core processing pipeline is consistent regardless of source type:
Connect → Transform → Model → Visualize
Each stage is distinct. Connection establishes the authenticated link to the source. Transformation cleans and reshapes raw data in Power Query Editor. Modeling defines relationships between tables. Visualization renders data into reports and dashboards. Separating these stages ensures every step is governed, repeatable, and automatically applied on each refresh cycle.
File-based connectors cover Excel workbooks (.xlsx), CSV exports, JSON files, and XML datasets. These sources connect directly without requiring server infrastructure, database credentials, or gateway configuration.
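To make the mechanics concrete, here is a minimal Power Query M sketch of a CSV connection, assuming a hypothetical local path and a comma-delimited, UTF-8 file; Power BI Desktop generates equivalent code automatically when you use the connector dialog:

```m
let
    // Hypothetical path; replace with your own file location
    Source = Csv.Document(
        File.Contents("C:\Data\sales_export.csv"),
        [Delimiter = ",", Encoding = TextEncoding.Utf8]
    ),
    // The first row of the file holds column names, so promote it to headers
    Promoted = Table.PromoteHeaders(Source, [PromoteAllScalars = true])
in
    Promoted
```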
Files stored on OneDrive or SharePoint support automatic hourly refresh. Power BI detects file updates and rebuilds the semantic model without manual intervention. In 2026, Power BI Desktop uses a modernized file picker experience as the default, replacing the legacy interface that was retired in April 2026. Files stored on a local drive do not support automatic cloud refresh.
Relational databases store structured transactional data: orders, customers, inventory, and financial records. Power BI connects natively to SQL Server, MySQL, PostgreSQL, Oracle, and Microsoft Access without requiring any file export.
For databases hosted on local servers, the Power BI On-Premises Data Gateway is required. The gateway functions as a secure, encrypted tunnel between the on-premises database and the Power BI cloud service. Without a running gateway, scheduled refreshes fail, and no on-premises data can be reached. The gateway must be installed on a continuously running machine and verified before report development begins.
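As a reference point, a minimal M query against SQL Server might look like the sketch below. The server, database, schema, and table names are placeholders; the same pattern applies whether the server is reached directly or through the gateway:

```m
let
    // Placeholder server and database names
    Source = Sql.Database("sql-prod-01.contoso.local", "SalesDB"),
    // Navigate to a specific table in the database
    Orders = Source{[Schema = "dbo", Item = "Orders"]}[Data],
    // Simple filters like this typically fold back to the server as T-SQL,
    // so only the matching rows cross the gateway
    Recent = Table.SelectRows(Orders, each [OrderDate] >= #date(2025, 1, 1))
in
    Recent
```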
Note: The Simba Vertica ODBC driver entered deprecation in February 2026. Organizations using the Vertica connector should transition to the updated Vertica ODBC driver, which is generally available and continues to receive investment.
Cloud-hosted data warehouses, including Azure SQL Database, Azure Synapse Analytics, Google BigQuery, Amazon Redshift, Snowflake, and Databricks, connect natively to Power BI without a gateway, because both systems operate within cloud infrastructure.
Microsoft's Azure ecosystem provides the deepest integration, including Single Sign-On (SSO) authentication, role-based access control inheritance, and Direct Lake mode through Microsoft Fabric. Direct Lake reads data directly from OneLake, Microsoft Fabric's unified storage layer, delivering Import Mode query speeds without physically duplicating data into Power BI's internal storage engine. In 2026, Direct Lake also supports calculated columns in preview, enabling more advanced in-place modeling directly on OneLake data.
Power BI includes pre-built connectors for widely used online platforms, including Salesforce, Google Analytics, Microsoft Dynamics 365, SharePoint Online, LinkedIn Ads, and Mailchimp. The QuickBooks Online connector was retired in March 2026 as part of Microsoft's ongoing effort to streamline its connector portfolio and maintain the highest standards of secure data connectivity.
For platforms without a dedicated native connector, Power BI's Web connector, combined with the Power Query M language, enables custom REST API integrations handling authentication, parameter passing, JSON parsing, and pagination entirely within the query editor environment.
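The sketch below illustrates that pattern under stated assumptions: a hypothetical endpoint that returns a JSON array of records and paginates via page/per_page query parameters. Keeping the base URL static and supplying RelativePath and Query separately matters, because dynamically built URLs block scheduled refresh in the Power BI Service:

```m
let
    // Hypothetical API; adapt the endpoint and parameter names to your provider
    BaseUrl = "https://api.example.com",
    PageSize = 100,

    // Fetch one page of results as a list of records
    GetPage = (page as number) =>
        Json.Document(
            Web.Contents(
                BaseUrl,
                [
                    RelativePath = "v1/orders",
                    Query = [page = Text.From(page), per_page = Text.From(PageSize)],
                    // Placeholder token; prefer the connector's built-in
                    // authentication over hard-coded credentials
                    Headers = [Authorization = "Bearer <token>"]
                ]
            )
        ),

    // Request successive pages until the API returns an empty page
    Pages = List.Generate(
        () => [page = 1, data = GetPage(1)],
        each List.Count([data]) > 0,
        each [page = [page] + 1, data = GetPage([page] + 1)],
        each [data]
    ),
    Combined = List.Combine(Pages),
    AsTable = Table.FromRecords(Combined)
in
    AsTable
```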
For high-volume APIs, routing data through an intermediary staging database such as Azure SQL via a scheduled pipeline provides a more reliable architecture. This approach decouples data retrieval from the report refresh cycle and avoids API rate limit conflicts.
When continuous data updates are required, Power BI supports streaming through Azure Stream Analytics, Push datasets, and DirectQuery against live databases. These configurations enable dashboards to reflect data changes within seconds rather than waiting for a scheduled refresh cycle.
Streaming datasets have defined technical constraints: they do not support the full DAX function library and are incompatible with certain visual types. Real-time architecture requires deliberate planning at the data model design stage; it cannot be retrofitted onto a standard Import Mode report.
The connection mode defines the relationship between Power BI and its data source: how data is stored, how queries execute, and how frequently the data reflects changes in the source system. Selecting the correct mode per source is the most architecturally significant decision in any Power BI project.
| Mode | How It Works | Best Use Case | Key Constraint |
|---|---|---|---|
| Import Mode | Data is copied into Power BI's columnar storage engine | Most standard business reporting | Data is only as fresh as the last scheduled refresh |
| DirectQuery | Every visual sends a live query to the source database | Real-time operational monitoring | Query speed depends on the source database performance |
| Live Connection | Power BI connects directly to an Analysis Services model | Enterprise BI with centralized models | Transformations must be done in the upstream model |
| Direct Lake | Data is read from OneLake without duplication | Microsoft Fabric environments | Requires Microsoft Fabric capacity |
| Composite Mode | Import and DirectQuery sources coexist in one model | Mixed freshness requirements | Typically needs aggregation tables for performance |
Import Mode loads data into Power BI's internal VertiPaq engine, a compressed, in-memory columnar store optimized for fast aggregation queries. Dashboard performance is highest in Import Mode, but data is only as current as the most recent refresh.
DirectQuery sends each visual's query directly to the source at render time. There is no local data copy, meaning every report view reflects the source system's current state. Query performance is bounded by the source database's capacity to handle concurrent analytical queries.
Direct Lake, available through Microsoft Fabric, reads columnar Parquet files directly from OneLake at in-memory speeds without an import copy step. It eliminates the Import/DirectQuery trade-off and is the recommended architecture for large-scale deployments in 2026. The April 2026 update introduced Direct Lake calculated columns in preview, further closing the gap between Direct Lake and traditional Import Mode capabilities.
Composite Mode allows tables within the same data model to use different connection modes. Import tables are stored locally for speed; DirectQuery tables are queried live. Aggregation tables are typically configured to route queries efficiently between the two layers.
Open Power BI Desktop and click Get Data on the Home ribbon. The dialog categorizes all available connectors by type: File, Database, Power Platform, Azure, Online Services, and Other. Select the appropriate connector and authenticate.
After selecting a source, click Transform Data rather than Load. This opens Power Query Editor, the transformation environment where data is cleaned and shaped before entering the data model. Loading without transformation embeds raw data quality issues directly into the model, making them significantly harder to correct.
Inside Power Query Editor, click New Source on the Home ribbon to add additional sources. Each source loads as an independent query in the left panel. Power BI imposes no limit on the number of concurrent source connections within a single report.
Each source must be cleaned individually before combination. Key operations include correcting data type assignments, removing duplicate rows, renaming columns to consistent labels across sources, filtering irrelevant records, and normalizing categorical values. Key column mismatches between sources, whether in naming, data type, or formatting, cause relationship failures in the data model that produce incorrect or null results.
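A typical per-source cleaning sequence in M might look like this sketch, where RawOrders and all column names are illustrative:

```m
let
    // Align column names with the labels used by the other sources
    Renamed = Table.RenameColumns(RawOrders, {{"cust_id", "CustomerID"}, {"amt", "Amount"}}),
    // Correct data types so key columns match their counterparts elsewhere
    Typed = Table.TransformColumnTypes(
        Renamed,
        {{"OrderDate", type date}, {"Amount", type number}, {"CustomerID", type text}}
    ),
    // Remove exact duplicate rows
    Deduplicated = Table.Distinct(Typed),
    // Filter out records that should never reach the data model
    Filtered = Table.SelectRows(Deduplicated, each [Status] <> "Cancelled")
in
    Filtered
```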
Append Queries stacks two or more sources with identical column structures vertically into a single unified table, used when consolidating parallel datasets such as sales data from multiple regions or time periods.
Merge Queries performs a horizontal join between two sources on a shared key column, used when enriching one dataset with fields from another, such as joining a product name dimension from Excel onto a transaction table from SQL Server. The merge type (inner, left outer, right outer, full outer) determines how unmatched rows are handled.
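Each operation maps onto a single M function, as in this sketch; NorthSales, SouthSales, and Products stand in for existing queries in the same report:

```m
let
    // Append: stack region queries with identical column structures
    AllSales = Table.Combine({NorthSales, SouthSales}),

    // Merge: left outer join to enrich transactions with product names
    Merged = Table.NestedJoin(
        AllSales, {"ProductID"},
        Products, {"ProductID"},
        "ProductInfo", JoinKind.LeftOuter
    ),
    // Expand only the columns needed from the joined table
    Expanded = Table.ExpandTableColumn(Merged, "ProductInfo", {"ProductName"})
in
    Expanded
```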
Navigate to the Model view in Power BI Desktop. Relationships are defined by connecting shared key columns between tables. Each relationship requires a defined cardinality (one-to-one, one-to-many, many-to-many) and a cross-filter direction. One-to-many is the standard cardinality for most dimensional models. Cross-filter direction controls how filter context propagates across tables when a slicer or visual interaction is applied. All DAX measures that aggregate across multiple tables depend entirely on correct relationship definitions.
Click Publish to upload the report to the Power BI Service. In the Service, open the dataset's Settings and configure a scheduled refresh frequency appropriate to each source's data update cadence. For on-premises sources, the correct gateway must be assigned. For cloud sources, credentials must be entered in Data Source Settings. A published report without a configured refresh retains the data snapshot from the moment of publication indefinitely.
Unplanned connection mode mixing. Combining Import and DirectQuery sources in a Composite model typically calls for explicit aggregation table configuration to keep performance acceptable. Not all DAX functions operate consistently across mixed-mode tables. Mode selection must be determined per source before any connections are made.
Gateway not configured for on-premises sources. The On-Premises Data Gateway must be installed, configured as a Windows service, and actively running on a persistent machine. Any scheduled refresh that cannot reach the gateway fails silently: no error is visible to report consumers, only to the dataset owner via refresh history.
Semantic model exceeding capacity limits. Power BI shared capacity (Pro tier) enforces a 1 GB compressed semantic model limit. Reports importing large transactional history or high-frequency telemetry data regularly exceed this ceiling. The resolution is to apply incremental refresh policies, switch large tables to DirectQuery, or migrate to Power BI Premium or Microsoft Fabric capacity.
Incorrect relationship cardinality. A one-to-many relationship incorrectly configured as many-to-many changes how filters propagate between tables, which can silently inflate or distort aggregate measures. All key metrics must be cross-validated against source system values before any report is shared with stakeholders.
No scheduled refresh after publishing. A report published without a refresh schedule is a static snapshot. Configuring the refresh schedule is a required post-publication step, not an optional one.
Legacy dataflow experience retirement. Dataflows created using the legacy experience will stop refreshing after July 31, 2026, and must be recreated in the current Dataflows Gen2 interface for the reports that depend on them to continue updating.
Copilot with Conversational Chat (Generally Available). Copilot now operates across connected data models and supports full conversational chat, including inside reports on Power BI Mobile as of April 2026. Users can ask natural language questions, receive DAX query results with plain-language explanations, and refine follow-up queries in a continuous chat thread. This removes DAX authoring as a prerequisite skill for analytical exploration and works consistently across desktop, web, and mobile.
Direct Lake Calculated Columns (Preview). The April 2026 update introduced Direct Lake calculated columns and user-context-aware calculated columns in preview. This allows analysts to define computed fields directly within a Direct Lake model without switching to Import Mode, significantly expanding the analytical capability of Fabric-native architectures.
PBIR as the Default File Format. From January 2026, Power BI adopted the Power BI Enhanced Report format (PBIR) as the default for all new and edited reports. PBIR aligns metadata between PBIX and PBIP files, enabling cleaner Git-based version control workflows and more auditable reporting environments. Reports in the legacy format are automatically converted when they are edited and saved.
Dataflows Gen2 for Centralized ETL. Dataflows Gen2 are reusable transformation pipelines built and maintained in the Power BI Service. A single Dataflow can serve multiple reports across an organization, ensuring shared data dimensions such as customer records or product hierarchies are transformed consistently from one centrally governed definition. Changes to the source structure are corrected once in the Dataflow and propagate automatically to all connected reports. The legacy Dataflows Gen1 experience is being retired, with entry points removed on May 31, 2026.
Fabric Copilot Capacity. Beginning February 2026, Fabric Copilot capacity is enabled by default for all tenants. This allows organizations to consolidate Copilot usage across Power BI Desktop, Pro workspaces, and Premium per-user workspaces into a single designated capacity, simplifying governance and cost management for AI-assisted analytics.
Incremental Refresh for API Sources. The Web connector supports incremental refresh for REST API-sourced data. On each refresh cycle, Power BI requests only records created or modified since the last successful refresh, rather than reloading the full dataset, reducing API call volume, cutting refresh duration, and avoiding provider-side rate limits.
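A minimal sketch of that setup is below, assuming a hypothetical endpoint with modified_since/modified_before parameters. RangeStart and RangeEnd are the reserved datetime parameters Power BI's incremental refresh feature requires, and the query must also filter a datetime column on them so the service can assign rows to partitions:

```m
let
    // The service substitutes RangeStart/RangeEnd per partition on each refresh
    Response = Json.Document(
        Web.Contents(
            "https://api.example.com",
            [
                RelativePath = "v1/orders",
                Query = [
                    modified_since = DateTime.ToText(RangeStart, "yyyy-MM-ddTHH:mm:ssZ"),
                    modified_before = DateTime.ToText(RangeEnd, "yyyy-MM-ddTHH:mm:ssZ")
                ]
            ]
        )
    ),
    // Assumes the API returns a JSON array of records
    AsTable = Table.FromRecords(Response),
    Typed = Table.TransformColumnTypes(AsTable, {{"ModifiedAt", type datetime}}),
    // Keep one boundary inclusive and the other exclusive to avoid
    // duplicating rows across adjacent partitions
    Filtered = Table.SelectRows(Typed, each [ModifiedAt] >= RangeStart and [ModifiedAt] < RangeEnd)
in
    Filtered
```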
Determine connection mode per source before building. Changing a table's connection mode after relationships and DAX measures have been built often requires full model reconstruction. Mode selection is an architectural decision made before development begins.
Perform all data transformation inside Power Query. Transformations applied in Power Query are recorded as an ordered, documented sequence of steps that execute automatically on every refresh. Any preparation done manually within a source file is undocumented, nonreproducible, and lost when the source file is updated or replaced.
Use Dataflows Gen2 for shared dimensions. Any dataset used across multiple reports (territory hierarchies, product categories, customer master data) should be maintained as a Dataflow. This enforces a single definition and eliminates divergence between reports that each apply different transformations to the same underlying data.
Migrate to PBIR for version-controlled environments. Teams managing reports through Git-based workflows should ensure all reports are saved in the PBIR format. This enables clean diffs, branch-based collaboration, and auditable change histories that are not possible with the legacy PBIX format.
Validate all key measures against source systems before publishing. Each critical metric must be manually verified against the originating source system before distribution. A discrepancy between Power BI's output and a direct database query indicates an incorrect relationship, cardinality setting, or filter context issue that must be resolved before stakeholder access.
Power BI's data connectivity architecture, spanning file imports, relational databases, cloud warehouses, online services, REST APIs, and real-time streams, is built around one design principle: every data source in an organization should be accessible, combinable, and automatically refreshable from within a single governed reporting environment.
The 2026 advancements reinforce this architecture at every layer. Direct Lake calculated columns extend Fabric-native modeling. PBIR standardizes file management for enterprise teams. Copilot's conversational chat brings AI-assisted exploration to mobile and desktop alike. Fabric Copilot capacity centralizes AI governance. And the retirement of legacy experiences (Dataflows Gen1, the old file picker, and deprecated connectors) signals a clear shift toward a unified, modern platform.
Understanding each layer of this architecture remains the technical foundation for building Power BI reports that are accurate, scalable, and operationally trustworthy in 2026 and beyond: what each connection mode means, how queries combine sources, how relationships define analytical logic, and how refresh configurations maintain data currency.