Google Analytics 4 – Ensuring Continuity of KPI Monitoring in BigQuery

Article Analytics 20.06.2024
By Simon Gay

Simon Gay is a Senior Data Analyst Consultant at Converteo and an expert in Analytics Engineering (GCP, BigQuery). He helps our clients use and take ownership of their marketing data for analysis.

Clients frequently approach us with the following question: how can I ensure the transition between my BigQuery UA datasets and GA4?

It is indeed crucial for teams to be able to track their KPIs without interruption.

Universal Analytics, in its 360 version, will shut down for good in July. However, many clients wish to keep using it until the very end, and there is significant resistance to making the move to GA4.

We discuss two ways to overcome the barriers to adopting the tool:

  • Understanding the various pitfalls of the GA4 data model in BigQuery
  • Properly re-modeling data to align with Universal Analytics

 

Understanding the GA4 Data Model in BigQuery

Exporting GA4 data to BigQuery gives access to raw, granular data, which opens up reporting and analysis opportunities for Data Marketing teams.

Two Different Data Models

Unlike Universal Analytics, which features a sometimes complex exported data model based on a combination of hits and sessions, GA4 adopts a more “flat” data model where each row represents an event. Thus, we have two different tools and two different models.

For each KPI that we want to track continuously, it is essential to first look at how it is calculated in each model, to determine whether comparing the two is conceptually sound.

In this regard, we recommend reading GA4 BigQuery, which provides numerous pre-built SQL queries for basic indicators.
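To make the difference between the two models concrete, here is a minimal sketch in Python rather than SQL. The rows are invented toy data; only the field names (fullVisitorId, visitId, hits, user_pseudo_id, event_name) come from the public export schemas. It shows that the same KPI, page views, is computed by walking nested hits in the UA model and by filtering flat rows in the GA4 model.

```python
# Toy rows only: field names follow the public UA and GA4 export schemas,
# but every value below is invented for illustration.

# Universal Analytics export: one row per session, hits nested inside it.
ua_sessions = [
    {"fullVisitorId": "u1", "visitId": 1001,
     "hits": [{"type": "PAGE"}, {"type": "EVENT"}, {"type": "PAGE"}]},
    {"fullVisitorId": "u2", "visitId": 1002,
     "hits": [{"type": "PAGE"}]},
]

# GA4 export: one row per event, with no session row at all.
ga4_events = [
    {"user_pseudo_id": "a1", "event_name": "session_start"},
    {"user_pseudo_id": "a1", "event_name": "page_view"},
    {"user_pseudo_id": "a1", "event_name": "page_view"},
    {"user_pseudo_id": "a2", "event_name": "page_view"},
]


def ua_pageviews(sessions):
    """Count page hits by walking the nested hits of each session row."""
    return sum(1 for s in sessions for h in s["hits"] if h["type"] == "PAGE")


def ga4_pageviews(events):
    """Count page views by filtering the flat event rows."""
    return sum(1 for e in events if e["event_name"] == "page_view")


print(ua_pageviews(ua_sessions), ga4_pageviews(ga4_events))
```

The two functions answer the same business question, but each one has to be written against its own model, which is why every KPI deserves this comparison before any reconciliation.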

…and Pitfalls to Know in the GA4 Data Model

The data export model of GA4 presents certain challenges that are important to overcome. Here are the main ones we encounter in our projects:

Type management: GA4 automatically stores each parameter value in a column that depends on its type (string, integer, float or double). Returning the correct value of a parameter can therefore be a bit tricky in SQL, so it is essential to master this: the functions COALESCE() and SAFE_CAST() are your best allies. The key functions to know are listed in this Medium article and in this one specifically for date management.
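The logic behind COALESCE() and SAFE_CAST() can be sketched outside SQL. The Python below is an illustration only: param_value and safe_cast_int are hypothetical helper names, the event is invented, and the typed columns (string_value, int_value, float_value, double_value) are those of the GA4 export schema. safe_cast_int is a rough analogue of SAFE_CAST(... AS INT64), and its edge cases (rounding of decimals, for instance) differ from BigQuery.

```python
def param_value(event, key):
    """Return the value of event parameter `key`, whichever typed column holds it."""
    for param in event.get("event_params", []):
        if param["key"] == key:
            value = param["value"]
            # Same idea as COALESCE(): the first non-null typed column wins.
            for column in ("string_value", "int_value", "float_value", "double_value"):
                if value.get(column) is not None:
                    return value[column]
    return None


def safe_cast_int(value):
    """Rough analogue of SAFE_CAST(value AS INT64): None instead of an error."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


# Invented event: "quantity" was collected as a string, "coupon" is not numeric.
event = {
    "event_name": "purchase",
    "event_params": [
        {"key": "ga_session_id", "value": {"int_value": 1718870000}},
        {"key": "quantity", "value": {"string_value": "3"}},
        {"key": "coupon", "value": {"string_value": "n/a"}},
    ],
}

print(param_value(event, "ga_session_id"))
print(safe_cast_int(param_value(event, "quantity")))
print(safe_cast_int(param_value(event, "coupon")))
```

Looking at every typed column avoids silently losing a parameter that was collected with an unexpected type, and the safe cast keeps one malformed value from failing the whole query.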

 

  • Data modification: Data in the daily table can be modified up to 72 hours after it becomes available. Consequently, your calculation routines must re-read the source tables for the last 3 days if you want complete figures.
  • Uniqueness of session_id: ga_session_id is not unique on its own. A unique session ID is obtained by combining ga_session_id with user_pseudo_id.
  • Consent mode management: Non-consented hits (privacy_info.analytics_storage = "No") are included in the export, with only a minimal number of associated parameters, so be sure not to count them.
  • Absence of session dimension: Since the model is event-based, unlike Universal Analytics, you will need to “sessionize” certain dimensions.
  • No integration with Ads tools: Reconciliation must be done manually.
  • No attribution data: You need to recompute it yourself, especially if you want to reproduce a “last-click non-direct” logic.
  • Events sent in batch have the same event timestamp: If you need to reconstruct the order of events, it is up to you to create a distinct timestamp at collection time, until Google changes this behavior.
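Three of the pitfalls listed above (session uniqueness, non-consented hits and sessionization) can be sketched together. This is a minimal Python illustration, not a production routine. The rows are invented and already flattened: ga_session_id, source and medium are assumed to have been extracted from the event parameters beforehand, and the rule "first non-null source in the session" is one possible sessionization choice among others.

```python
def session_key(event):
    """Unique session ID: ga_session_id alone can collide across users."""
    return f'{event["user_pseudo_id"]}-{event["ga_session_id"]}'


def consented(event):
    """False when analytics_storage was denied for this event."""
    storage = (event.get("privacy_info") or {}).get("analytics_storage")
    return str(storage).lower() != "no"


def sessionize(events):
    """Build one record per session, carrying the first non-null source/medium."""
    sessions = {}
    for e in sorted(events, key=lambda e: e["event_timestamp"]):
        if not consented(e):
            continue  # non-consented hits must not be counted
        s = sessions.setdefault(
            session_key(e), {"source": None, "medium": None, "events": 0})
        s["events"] += 1
        if s["source"] is None and e.get("source") is not None:
            s["source"], s["medium"] = e["source"], e.get("medium")
    return sessions


# Invented rows: u1 and u2 share the same ga_session_id on purpose.
events = [
    {"user_pseudo_id": "u1", "ga_session_id": 100, "event_timestamp": 1,
     "source": None, "medium": None,
     "privacy_info": {"analytics_storage": "Yes"}},
    {"user_pseudo_id": "u1", "ga_session_id": 100, "event_timestamp": 2,
     "source": "google", "medium": "cpc",
     "privacy_info": {"analytics_storage": "Yes"}},
    {"user_pseudo_id": "u2", "ga_session_id": 100, "event_timestamp": 1,
     "source": "newsletter", "medium": "email",
     "privacy_info": {"analytics_storage": "Yes"}},
    {"user_pseudo_id": None, "ga_session_id": None, "event_timestamp": 5,
     "source": None, "medium": None,
     "privacy_info": {"analytics_storage": "No"}},
]

sessions = sessionize(events)
for key, record in sessions.items():
    print(key, record)
```

With ga_session_id alone, the two users above would have been merged into a single session. The combined key keeps them apart, and the non-consented row is ignored.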

Once we have clearly defined the KPIs we wish to track and their calculation method, we can address the question of reconciling them between UA and GA4.

 

Modeling and Reconciling Data Between GA4 and UA

To work with GA4 data, we recommend subdividing it into subsets of tables based on a logical scope: users, sessions, ecommerce, events, pages, etc.

The main benefit of this subdivision is to maximize performance and reduce costs: each request no longer has to query the main table, which can contain tens of millions of rows.

For example: our Data Marketing team tracks the KPI “Sessions per User” in Universal Analytics. We create a “ga4_user” table that calculates this KPI for GA4; then we merge this KPI with its historical data from Universal Analytics.
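The “Sessions per User” example can be sketched as follows. This is a simplified Python illustration of the logic, not the SQL, DBT or Dataform model itself. The function names, the toy rows and the UA history values are all invented, and the GA4 rows are assumed to be already flattened and filtered.

```python
def ga4_sessions_per_user(events):
    """Daily sessions per user from flat GA4-style rows."""
    per_day = {}
    for e in events:
        day = per_day.setdefault(
            e["event_date"], {"users": set(), "sessions": set()})
        day["users"].add(e["user_pseudo_id"])
        # A session is identified by the (user, ga_session_id) pair.
        day["sessions"].add((e["user_pseudo_id"], e["ga_session_id"]))
    return {d: len(v["sessions"]) / len(v["users"]) for d, v in per_day.items()}


def merge_history(ua_history, ga4_kpi):
    """Stack UA history and GA4 values into one series, tagged by origin."""
    rows = [{"date": d, "origin": "UA", "sessions_per_user": v}
            for d, v in ua_history.items()]
    rows += [{"date": d, "origin": "GA4", "sessions_per_user": v}
             for d, v in ga4_kpi.items()]
    return sorted(rows, key=lambda r: (r["date"], r["origin"]))


# Invented data: UA history already aggregated, GA4 still at event level.
ua_history = {"20240630": 1.4}
ga4_events = [
    {"event_date": "20240701", "user_pseudo_id": "u1", "ga_session_id": 1},
    {"event_date": "20240701", "user_pseudo_id": "u1", "ga_session_id": 2},
    {"event_date": "20240701", "user_pseudo_id": "u2", "ga_session_id": 1},
]

merged = merge_history(ua_history, ga4_sessions_per_user(ga4_events))
for row in merged:
    print(row)
```

Keeping an origin column in the merged series lets the dashboard show exactly where the switch from UA to GA4 happens instead of hiding it.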

At Converteo, we have chosen to rely on standard market modeling tools to carry out this part: DBT and Dataform.

Two public examples exist to demonstrate the capabilities of these tools:

  • For DBT, the most interesting resource is the Velir/dbt-ga4 package, which allows for the recreation of tables at the user/session/page level and associates conversion rates, for example.
  • For Dataform, I appreciate the work done by Artem Korneev, particularly regarding attribution model management.

Our role at Converteo is to build and deliver all the pre-calculated tables, turnkey, to our clients’ teams, so that they can keep monitoring their KPIs without a break.

 

The Limits of the Approach

Let’s take an example of a “session” table with various associated KPIs:

  • Checkout access rate
  • Number of sessions by source/medium

Once the data is modeled and reconciled, we can visualize it in the tool of our choice (Looker Studio/Tableau). However, the curves will not align exactly because sessions are calculated differently between the tools.

Even though it may be tempting to want to “correct” the curves with a Data Science model, our experience suggests that this approach is quite futile.

That is why it is essential, from the framing phase, to carefully choose the KPIs and to challenge the level of confidence in reconciling GA4 and UA through a Sanity Check.
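One possible shape for such a Sanity Check, sketched in Python under stated assumptions: both tools were collecting over the same overlap period, the KPI is available per day in each, and the 10 % tolerance is an arbitrary illustrative threshold, not a recommendation. The function name and all values are invented.

```python
def sanity_check(ua_series, ga4_series, tolerance=0.10):
    """Compare a KPI on the days where UA and GA4 both have a value.

    Returns the mean relative gap and whether it stays within `tolerance`,
    or None when there is no usable overlap.
    """
    common_days = sorted(set(ua_series) & set(ga4_series))
    gaps = [abs(ga4_series[d] - ua_series[d]) / ua_series[d]
            for d in common_days if ua_series[d]]
    if not gaps:
        return None
    mean_gap = sum(gaps) / len(gaps)
    return {"days": len(gaps), "mean_gap": mean_gap,
            "comparable": mean_gap <= tolerance}


# Invented overlap period: sessions are close, checkout access rate is not.
ua_sessions = {"20240601": 1000, "20240602": 2000}
ga4_sessions = {"20240601": 960, "20240602": 1930, "20240603": 500}

ua_checkout_rate = {"20240601": 0.30, "20240602": 0.28}
ga4_checkout_rate = {"20240601": 0.21, "20240602": 0.20}

print(sanity_check(ua_sessions, ga4_sessions))
print(sanity_check(ua_checkout_rate, ga4_checkout_rate))
```

A KPI that fails the check is a candidate to be dropped from the reconciled view, or shown as two separate series, rather than being artificially corrected.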

The Indispensable BigQuery

The use of BigQuery for GA4 offers a wide range of possibilities in terms of marketing analysis: data quality analysis, clustering, and the recreation of custom visualizations.

It also presents a great opportunity for organizations to advance their use of modeling tools like DBT and Dataform. Utilizing BigQuery, integrated into the GCP stack, along with all the innovations around Gemini, is clearly a marker of digital maturity and a direction our profession is heading towards.

The advantage is that it is never too late to get started or to improve your skills, although it is essential to understand the conditions: the success of a GA4 x BigQuery project requires both experience—due to the numerous inherent problems with the data model—and a significant commitment from teams to maintain data quality over time.

With contributions from Pierre Adrien Lair.

 
