Hi!
Our team at ChatterBlast uses dbt, and we actually started our journey on Dataform before realizing it was way more pain than it was worth.
Dataform was nice in the sense that I could work on it directly in the same UI as our tables in BigQuery, but it was maddening trying to schedule jobs (I’m not well-versed in much on the coding spectrum) or to understand which models were tied to which. I probably didn’t explore the product enough, but Google does a horrible job of making it easy to use, if it ever gets there.
When I jumped to dbt, we never looked back. BigQuery now acts as a Supermetrics data lake, where the important stuff gets built into staging models (think of a model for organic Instagram post data that also includes new followers by day), and from there we either join everything together (lots of use cases here) or branch off into individual client marts for things like content category analysis, benchmarking, and specific-timeframe analysis.
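For a rough idea, here’s a minimal sketch of what one of those staging models could look like. The source, table, and column names are made up for illustration, not our actual schema:

```sql
-- models/staging/stg_instagram_organic.sql
-- Hypothetical staging model over a Supermetrics-loaded raw table:
-- light renaming and typing, plus daily new-follower counts.
with source as (

    select * from {{ source('supermetrics', 'instagram_organic_posts') }}

),

renamed as (

    select
        post_id,
        post_created_at as posted_at,
        date(post_created_at) as post_date,
        post_type,
        impressions,
        reach,
        likes,
        comments,
        new_followers  -- new followers by day, joined upstream in the extract
    from source

)

select * from renamed
```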
I think it would be awesome to see templates and recipes, especially for those just starting out. I was completely blind to the benefits, but after working with it for the past year, I feel like I lost so much time and value without it.
This is so cool to hear that you’re using the dbt/Supermetrics combo. The organic Instagram data model is especially interesting because we get some clients who really want to see the change in followers, and just getting that data through the API without a tool like dbt can be a challenge.
Does your team use a lot of custom queries, or do you mostly use the more “standard” metrics and dimensions (clicks, impressions, video views, etc.)? It would be great to know what sorts of data points and platforms would have really helped you when you were first starting out with the tool.
I can totally understand how Dataform can be too cryptic for new users. I find the learning curve of many cloud services is way too steep for people who don’t already bring relevant experience from somewhere else, even though cloud services are seemingly supposed to make computing tasks easier and more widely available…
We managed to deploy our first data modelling task in Dataform the other week, and for what it’s worth, it’s performing well and doing its job. But we had years of dbt experience to use as a jumping-off point, and we still ran into problems during setup.
For me the big win with dbt is that it’s target agnostic: the same code should run on any supported database, so you avoid platform lock-in. You can even go totally free and self-host dbt Core if that’s an option (although the learning curve to do this is again very steep; I tried it myself). The huge community of dbt users is also a big plus; it’s easy to find other users and get advice online.
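As a small illustration of the target-agnostic point, dbt ships cross-database macros that compile to the right dialect for whatever target you run against, so a model like this (model and column names made up) should work unchanged on BigQuery, Snowflake, Postgres, and so on:

```sql
-- models/marts/daily_engagement.sql
-- dbt.date_trunc / dbt.dateadd / dbt.current_timestamp are cross-database
-- macros: dbt renders them into the correct SQL for the active target.
select
    {{ dbt.date_trunc('day', 'posted_at') }} as post_day,
    sum(impressions) as total_impressions
from {{ ref('stg_social_posts') }}
where posted_at >= {{ dbt.dateadd('day', -90, dbt.current_timestamp()) }}
group by 1
```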
@gcfmineo creating dbt model templates has been on my ‘wish-I-could-do-this’ list for a long time now, hence my question: I can justify putting time into the project once I know more people would be interested.
My consulting team and I have built dozens of dbt-based marketing data warehouses and we’ve templatised a lot. I’d love to share this knowledge with a wider customer audience.
To @courtneywaganer - YES. Instagram was a huge challenge when they limited follower data to a 28-day window, and filling that gap became a huge value point for clients who wanted longer lookback windows. We even moved access & data onboarding earlier in our client engagement timelines so there would be less shock and awe, as most clients have no idea about the API limits across products.
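For anyone curious, the way around the 28-day window is basically to snapshot each extract into your own history. A hypothetical incremental model (made-up source and column names) might look like:

```sql
-- models/staging/stg_instagram_followers_daily.sql
-- Each run appends only the days not yet stored, so the warehouse keeps
-- accumulating follower history long past the API's 28-day window.
{{ config(materialized='incremental', unique_key='snapshot_date') }}

select
    date(snapshot_ts) as snapshot_date,
    followers_total,
    new_followers
from {{ source('supermetrics', 'instagram_followers') }}

{% if is_incremental() %}
where date(snapshot_ts) > (select max(snapshot_date) from {{ this }})
{% endif %}
```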
For queries, I’ve often found it helpful to pull as much standard query data as possible and build custom queries for whatever isn’t available immediately, or isn’t easily accessible. For example, on Facebook Insights I only use standard queries for post data and follower data. For LinkedIn, the standard query for followers is perfect, but I need two custom queries for post data, since the API separates video and static content.

Aligning all platforms, paid and organic, into one table really forced us to learn how each platform defines each dimension and metric, and how to reconcile them all in a way that creates consistency. Storage in BigQuery is also ridiculously cheap, so it never hurts to pull more than you need and weed it out later once the use case is supported.
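As a sketch of that LinkedIn case, the two custom-query extracts just get coerced into one shared column set and unioned (again, all names are illustrative):

```sql
-- models/staging/stg_linkedin_posts.sql
-- Align the video and static custom-query extracts to one schema.
with video as (

    select
        post_id,
        posted_at,
        'video' as content_type,
        impressions,
        clicks,
        video_views
    from {{ source('supermetrics', 'linkedin_video_posts') }}

),

static as (

    select
        post_id,
        posted_at,
        'static' as content_type,
        impressions,
        clicks,
        cast(null as {{ dbt.type_int() }}) as video_views  -- n/a for static posts
    from {{ source('supermetrics', 'linkedin_static_posts') }}

)

select * from video
union all
select * from static
```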
To @bartschneider - My journey was about three years long: the first year was spent scraping raw tables into dashboards, thinking of them like big Excel docs and fearing SQL; the second year was spent toying with attempts at blends and transformations while I learned SQL; and once SQL made sense, the third year was spent upgrading to dbt and going crazy optimizing models and storage. There is a huge gap in documentation, training, and resources for marketing-specific data management use cases, and even competitors in the space focus on more technical data engineering examples. Those templates would be gold in general, whether for me three years ago, not understanding how to use SQL, or for me now, knowing that the SQL I’ve scraped together could probably be optimized even more with the right guidance.