Customers Contact TR

Mastering LookML: A Practical Guide to High-Performance Data Modeling

Good LookML does not just answer questions; it makes them faster, cleaner, and easier to ask. If you have worked with Looker long enough, you know that LookML is more than just a modeling language; it is the foundation that turns raw data into reliable, business-ready insights. But building a scalable LookML model is not just about making things “work.” It is about writing clean, consistent, and maintainable code that your future self and your teammates will thank you for.


In this guide, we will go beyond the basics. We will break down Dimension and Measure syntax, Explore and Join definitions, and advanced techniques for Derived Tables, Persistent Derived Tables (PDTs), and caching strategies. Along the way, you will see not just what each parameter does, but exactly how to write it in LookML.


Whether you are building your first model or refactoring a complex analytics layer, these patterns and best practices will help you write LookML that is clear, reusable, and optimized for performance.


1. Dimension and Measure Syntax


1.1 Defining Dimensions

A dimension represents a data attribute (column) from your source. When defining one, follow these best practices:

  • Naming convention: Use lowercase letters and underscores for spaces (e.g., full_name).
  • Code organization: Add new dimensions after existing ones in the view file for easier navigation.

Example:


dimension: first_name {
     sql: ${TABLE}.first_name ;;
     }


Key parameters for dimensions:

  • type
    • string – Default for text values.
    • number – For numeric data and math computations.
    • yesno – Boolean output (“Yes”/”No”).
    • tier – Bucket values into ranges.
  • dimension_group – Creates multiple dimensions from one field.
    • type: time – For date/time fields (requires timeframes)
    • Reference with ${created_date}.
    • type: duration – For time differences (requires intervals, sql_start, sql_end)
    • Use instead of hard-coded DATE_DIFF() for flexibility.

1.2 Defining Measures

A measure is an aggregated version of your data.


Common measure types:

  • sum – Adds values.
  • average – Calculates the mean.
  • count – Counts rows, uses primary key by default.
  • count_distinct – Counts unique values.

Example:


measure: total_sales {
     type: sum
     sql: ${sale_price} ;;
     }


Additional options:

  • value_format_name – Display format (e.g., usd_0 for currency).
  • filters – Apply conditions for a measure.

Example:


measure: new_user_count {
     type: count
     filters: [users.is_new_user: "Yes"]
     }


  • Derived from other measures: In LookML, you are not limited to building measures directly from raw fields; you can also create new measures from existing measures, as long as the new measure’s type is set to number.

    This is particularly useful when you want to calculate ratios, percentages, or other KPIs without repeating aggregation logic. By referencing other measures instead of re-writing SQL, you ensure your calculations stay consistent and automatically reflect any future changes to the original measures.

Example:


measure: profit_margin {
     type: number
     sql: ${total_profit} / ${total_revenue} ;;
     value_format_name: percent_2
     }


In this example, total_profit and total_revenue are both pre-defined measures. By dividing them, you create a calculated KPI that is:

  • Easier to maintain, no repeated SQL.
  • Less error-prone, inherits logic from the original measures.
  • Automatically updated if the source measures change.

💡 Pro Tip: Always reference measures with ${measure_name} instead of ${TABLE}.column_name. When doing this, it keeps your logic centralized and reduces the risk of mismatched calculations.


1.3 Referencing Fields

LookML uses substitution operators to reuse logic and reduce hard-coding.


  • ${field_name} – References an existing dimension/measure in the same view.

Example:


sql: concat(${first_name}, " ", ${last_name}) ;;


  • ${TABLE}.column_name – Direct database column reference, use when first defining a field.
  • ${view_name.field_name} – Reference fields from another joined view.

2. Explore and Join Syntax


2.1 Defining Explores

An Explore is a starting point for analysis, containing pre-joined views.


Example:


explore: orders { }


The Explore name must match an existing view name.


2.2 Defining Joins

Joins link additional views into an Explore.


Example:


join: users {
     type: left_outer
     sql_on: ${order_items.user_id} = ${users.id} ;;
     relationship: many_to_one
     }


Key join parameters:

  • type: left_outer (default), inner, full_outer, cross.
  • sql_on: Join condition using LookML field references.
  • relationship: Defines cardinality for symmetric aggregation (one_to_one, one_to_many, many_to_one, many_to_many).
  • from: Used for join aliases when joining the same view multiple times.

2.3 Primary Keys and Filtering


  • primary_key: yes – Always define one in each view for symmetric aggregation.
  • sql_always_where – Adds a hidden, unremovable WHERE clause.
  • sql_always_having – Same as above, but in HAVING clause.
  • always_filter – Visible, required filter users cannot remove.
  • conditionally_filter – Visible filter that can be removed if another filter is applied.

Summary Table: Filtering Types


Feature SQL Clause Applied In Visible to User? Editable by User? Use Case
sql_always_where WHERE ❌ No ❌ No Apply a permanent row-level filter to restrict data (e.g., status = 'active').
sql_always_having HAVING ❌ No ❌ No Apply a permanent aggregate-level filter (e.g., COUNT(*) > 0).
always_filter Looker UI filter ✅ Yes ⚠️ Yes
(can change value but cannot remove filter)
Provide a default, required filter that guides analysis but still allows adjustments.
conditionally_filter Conditional logic in LookML ✅ Yes ✅ Yes Show a filter only if another specified filter is applied, useful for dependent filtering logic.

3. Derived Table Syntax

Derived tables are custom tables defined in LookML that do not exist in the underlying database. They let you build reusable transformations directly in your model without altering the source schema.


💡 To learn more about best practices for Looker, check out our previous blog here.


3.1 SQL-Derived Tables (as Views)

A SQL-Derived Table (SDT) is defined with raw SQL inside LookML. You can create them via SQL Runner or directly in the IDE, and Looker will generate a .view file for you. You can treat this file like any other view in your project.


Why use SQL-Derived Tables?

  • Ideal for transformations that are simpler to express in SQL than in LookML.
  • Useful when you need a one-off pre-aggregation.
  • Helpful for complex joins or CASE logic that do not translate well into Explore joins.

Example:


view: order_facts {
  derived_table: {
    sql:
      SELECT
        order_id,
        SUM(sale_price) AS total_sales
      FROM order_items
      GROUP BY order_id ;;
    }
  primary_key: order_id
  measure: total_sales {
    type: sum
    sql: ${TABLE}.total_sales ;;
  }
}


3.2 Native Derived Tables (NDTs)

Native Derived Tables are written entirely in LookML and leverage existing LookML objects (dimensions, measures, filters).


Why use NDTs?

  • Avoids repeating SQL logic; everything references LookML fields.
  • Automatically inherits updates when the source fields change.
  • Easier to maintain and portable across models.

Example:


view: top_customers {
  derived_table: {
    explore_source: orders {
      column: customer_id { field: orders.customer_id }
      column: total_spent { field: orders.total_spent }
      filters: [orders.order_status: "completed"]
    }
  }
  primary_key: customer_id
}


Key Parameters:

  • explore_source – The Explore to build from (equivalent to SQL FROM).
  • column – Output field in the derived table; can be renamed or aliased.
  • filters – Add WHERE/HAVING conditions.
  • bind_filters – Pass user filters from Explore into the NDT.
  • derived_column – Create a new calculated column inside the NDT.
  • expression_custom_filter – Build complex filter logic with AND/OR.

3.3 Persistent Derived Tables (PDTs)

PDTs store derived tables in your database, so queries run against a materialized version instead of regenerating every time. This is critical for performance-heavy queries or when you need consistent snapshots of data.


Why use PDTs?

  • Speeds up queries that would otherwise require large scans or expensive joins.
  • Ensures consistency across users by serving the same materialized data.
  • Useful for cascading logic where derived tables depend on each other.

Persistence Strategies:


Option 1 – Using datagroup_trigger: Recommended; rebuilds based on caching policy.


view: order_summary {
  derived_table: {
    sql:
      SELECT customer_id, COUNT(*) AS order_count
      FROM orders
      GROUP BY customer_id ;;
  }
  datagroup_trigger: orders_datagroup
}


Option 2 – Using sql_trigger_value: Rebuilds when query result changes.


sql_trigger_value: SELECT MAX(updated_at) FROM orders ;;


Option 3 – Using persist_for: Keeps PDT fresh for a fixed time window.


persist_for: "8 hours"


3.4 Ephemeral Derived Tables (EDTs)

Ephemeral Derived Tables are temporary; they do not persist in the database. Instead, they exist only within the SQL query Looker generates, often as a Common Table Expression (CTE).


Why use EDTs?

  • Great for lightweight, intermediate transformations.
  • Avoids database storage costs.
  • Perfect for logic that is only needed once and will not be reused.

Example:


view: temporary_calc {
  derived_table: {
    sql:
      SELECT user_id, COUNT(*) AS event_count
      FROM events
      GROUP BY user_id ;;
    ephemeral: yes
  }
}


Summary Table: Derived Table Types


Type Defined In Persists in Database? Use Case
SQL-Derived Table Raw SQL in .view ❌ No One-off transformations or complex SQL logic
Native Derived Table (NDT) LookML ❌ No Reusable LookML-based aggregations
Persistent Derived Table (PDT) SQL or LookML ✅ Yes Heavy queries or large aggregations needing caching
Ephemeral Derived Table (EDT) SQL in .view with ephemeral: yes ❌ No Temporary calculations or CTE-style queries

4. Caching and Datagroup Syntax

Datagroups are Looker’s way of controlling when cached query results or Persistent Derived Tables (PDTs) are refreshed. Think of them as centralized cache policies that multiple Explores and PDTs can share, ensuring your data stays fresh, performant, and consistent without overloading the database.


Why they matter:

  • Performance: Reduce redundant queries and save database costs.
  • Consistency: All dependent Explores and PDTs refresh on the same schedule.
  • Control: Tie cache refresh to a database change (via sql_trigger) rather than an arbitrary time.

4.1 Defining a Datagroup

Datagroups define cache refresh rules.


Best Practice:

Always define both max_cache_age (time-based refresh) and sql_trigger (event-based refresh).

  • max_cache_age ensures data will not be stale for too long.
  • sql_trigger detects actual source table changes and refreshes immediately when needed.

Example:


datagroup: orders_datagroup {
  max_cache_age: "1 hour"
  sql_trigger: SELECT MAX(order_id) FROM orders ;;
}


  • max_cache_age: Forces a refresh every hour, even if no data change is detected.
  • sql_trigger: Checks the highest order_id; if it changes, the cache refreshes immediately.

4.2 Applying Datagroups

  • At the Explore level: Control Explore-level caching.


explore: orders {
  persist_with: orders_datagroup
}


  • For PDT refresh: Tie PDT rebuilds to a datagroup.


datagroup_trigger: orders_datagroup


  • Fixed-time caching (no datagroup): When datagroups are not supported.


persist_for: "2 hours"


Summary Table: Datagroup vs. persist_for


Feature Detects Data Changes? Time-Based Refresh Use Case
Datagroup ✅ Yes (via sql_trigger) ✅ Yes (max_cache_age) Dynamic freshness tied to DB changes, shared cache policy.
persist_for ❌ No ✅ Yes Fixed refresh when datagroups are not supported.

Mastering LookML – Final Summary Table


Component Purpose SQL Output Best Practice
Explore Combines views into an analysis starting point FROM + JOIN Only join what is needed, use sql_always_where for security filters
Dimensions Groupable fields, no aggregation SELECT + GROUP BY Use business-friendly names, avoid raw DB column names
Measures Aggregated metrics SELECT with AGG() Reuse calculations across explores, avoid hardcoding in SQL
Dimension Filters Filters applied before aggregation WHERE Use for raw-level filtering to improve query speed
Measure Filters Filters applied after aggregation HAVING Only use when aggregation is necessary
SQL-Derived Tables (SDTs) Custom SQL logic in LookML Depends on SQL Use for complex logic not possible in LookML fields
Native Derived Tables (NDTs) Derived tables using LookML references Auto-generated SQL Easier to maintain, automatically adapts to schema changes
Persistent Derived Tables (PDTs) Cached tables in DB for speed Physical table in DB Use for heavy queries, refresh with datagroups
Ephemeral Derived Tables (EDTs) Temporary tables for query context CTE in query Great for lightweight logic without persistence
Datagroups Control cache refresh rules N/A Always set both max_cache_age & sql_trigger
Persistence Controls Keep PDTs fresh N/A Match refresh to data update frequency

⭐⭐⭐


LookML is not just a modeling language; it is the bridge between your raw data and the business insights that drive action. By mastering concepts like Explores, dimensions, measures, derived tables, and datagroups, you unlock the ability to deliver fast, accurate, and consistent analytics at scale. Whether you are optimizing query performance, enforcing governance rules, or building reusable metrics, these best practices help ensure every dashboard and data exploration tells the right story, at the right time.


Ready to elevate your Looker experience? Our team can help you design a scalable LookML framework, optimize performance, and enable your business users to self-serve insights without compromising on governance.


Get in touch today and let’s turn your data into a competitive advantage.


Author: Umniyah Abbood

Date Published: Aug 21, 2025



Discover more from Kartaca

Subscribe now to keep reading and get access to the full archive.

Continue reading