Is log-level data the new gold rush? Very possibly. As the digital advertising industry continues to advocate for transparency, publishers are pushing partners for more granular data so they can peel back the layers and verify what’s going on under the hood. In theory, it’s a perfect way to double-check that everything is happening as it’s supposed to. In reality, the nature of log-level data complicates this effort. In this post, we explain what log-level data is, the challenges of working with it, and how to start using it to optimize your advertising revenue.
What is Log-Level Data?
Log-level data, in the most general sense, is data coming from server logs where each row of data is tied to a specific event. This event could be an impression, an ad request, a bid request, or anything else that the given platform logs. Log-level data is the most granular data because each row describes a single event and isn’t aggregated or joined.
Log-level data can come from various platforms. Within the digital advertising space, some of the more common systems that provide log-level data are ad servers and DSPs. SSPs and exchanges are increasingly offering log-level data to their customers as well.
Below is an example of a log-level data file provided by Google from their Google Ad Manager (DFP) Data Transfer Files. This row of data may look like the usual data coming from an ad server UI, but the main difference is that there are no columns for metrics such as Ad Requests, Impressions, Clicks, etc. This is because each row of data corresponds to one event, such as an Ad Request, Impression, or Click.
What are other names for Log-Level Data?
In the digital advertising space, log-level data is also sometimes referred to as bid-level data, impression-level data, or log data. This can be a bit confusing because bid-level and impression-level data are themselves types of log-level data. Bid-level data is a type of log data where every row in the file corresponds to a single bid request or a single bid response. Similarly, impression-level data is a type of log data where every row corresponds to a single impression.
What are some of the challenges with Log-Level Data?
There are several intrinsic characteristics of log-level data that make working with it a bit more challenging than standard data sets.
1. Data Volume
The first challenge is simply the volume of data. Because the data is provided in a raw, unaggregated form, the files are significantly larger. For example, if you subscribe to Google Ad Manager’s “NetworkRequests” Data Transfer File feed, every hour you will get a file with one row for every single ad request that came into your ad server. If you have 20 million ad requests a day, your files will total 20 million rows a day. In comparison, a standard report from a UI or API aggregates those same 20 million ad requests into anywhere from a few rows a day upward, depending on the dimensions you report on. This volume adds a layer of complexity and cost to every step of the process that deals with the log data.
2. Data Processing
The second challenge with log-level data is that you will need to build or buy systems to gather, transform, and report on it. You will need a system to store the log-level data, because most vendor platforms that provide these files retain them for only days or a few weeks. This retention period is much shorter than the one vendors offer for aggregated data, which ranges from months to years. Once storage is in place, the data needs to be processed and transformed. This is the step that turns the log data into the aggregated tabular data we are accustomed to using for analysis. A common low-budget way to transform and aggregate data is to import it into Excel and build the aggregation and transformation logic there. With log data, however, Excel is not a viable option because of its 1,048,576-row limit. At 20 million ad requests a day, or roughly 833,000 an hour on average, you would exceed Excel’s capacity in under two hours of log data.
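To make the transformation step concrete, here is a minimal Python sketch of turning event rows into an aggregated table. The column names (event_time, event_type, ad_unit) and the sample rows are hypothetical and do not reflect any vendor's actual file schema. The rows are streamed one at a time, so the same approach works on files too large to open in a spreadsheet.

```python
import csv
import io
from collections import Counter

# Hypothetical log rows: one row per event, with no metric columns.
SAMPLE_LOG = """event_time,event_type,ad_unit
2024-01-01T00:00:05,ad_request,homepage_top
2024-01-01T00:00:05,impression,homepage_top
2024-01-01T00:00:09,ad_request,article_side
2024-01-01T00:00:12,ad_request,homepage_top
2024-01-01T00:00:12,impression,homepage_top
2024-01-01T00:00:15,click,homepage_top
"""


def aggregate_events(lines):
    """Stream rows and count events per (ad_unit, event_type)."""
    counts = Counter()
    for row in csv.DictReader(lines):
        counts[(row["ad_unit"], row["event_type"])] += 1
    return counts


def to_table(counts):
    """Pivot the counts into one entry per ad unit, one metric per event type."""
    table = {}
    for (ad_unit, event_type), n in counts.items():
        table.setdefault(ad_unit, Counter())[event_type] = n
    return table


# In practice you would pass an open file handle instead of an in-memory string.
table = to_table(aggregate_events(io.StringIO(SAMPLE_LOG)))
for ad_unit in sorted(table):
    metrics = table[ad_unit]
    print(ad_unit, metrics["ad_request"], metrics["impression"], metrics["click"])
```

The result has one row per ad unit with Ad Requests, Impressions, and Clicks as columns, which is the shape a UI report would normally hand you.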
3. Joining Data Feeds
The third challenge is that you will need to build rules and logic for joining the different log-level data feeds you receive. For example, if you are a publisher looking to track an initial ad request and follow it as the corresponding bid requests are sent out, bid responses are received, and an impression is served, you will need to create the logic that joins those events. In other words, you would receive separate log-level data feeds for ad requests, bid requests, bid responses, and impressions served, and these feeds would then need to be joined. Contrast this with the data provided through vendor UIs or APIs, where you can get metrics such as ad requests, impressions, and clicks in a single data table that is easy to collect and analyze. That is possible because the vendor has already processed, aggregated, and joined the data from the various events before presenting it to the end user.
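As a rough illustration of that join logic, the Python sketch below follows each ad request through its bid responses and any resulting impression. It assumes every feed carries a shared request_id key. That key, the other field names, and the sample rows are all hypothetical, and real feeds may need more involved matching rules.

```python
from collections import defaultdict

# Hypothetical feeds, each with one row per event.
ad_requests = [
    {"request_id": "r1", "ad_unit": "homepage_top"},
    {"request_id": "r2", "ad_unit": "article_side"},
]
bid_responses = [
    {"request_id": "r1", "bidder": "dsp_a"},
    {"request_id": "r1", "bidder": "dsp_b"},
    {"request_id": "r2", "bidder": "dsp_a"},
]
impressions = [
    {"request_id": "r1", "winning_bidder": "dsp_a"},
]


def join_feeds(ad_requests, bid_responses, impressions):
    """Left-join bid responses and impressions onto each ad request."""
    # Index the secondary feeds by the shared key so each lookup is cheap.
    bids_by_request = defaultdict(list)
    for bid in bid_responses:
        bids_by_request[bid["request_id"]].append(bid)
    impression_by_request = {imp["request_id"]: imp for imp in impressions}

    joined = []
    for req in ad_requests:
        rid = req["request_id"]
        imp = impression_by_request.get(rid)
        joined.append({
            "request_id": rid,
            "ad_unit": req["ad_unit"],
            "bid_count": len(bids_by_request.get(rid, [])),
            "served": imp is not None,
            "winning_bidder": imp["winning_bidder"] if imp else None,
        })
    return joined


joined = join_feeds(ad_requests, bid_responses, impressions)
for row in joined:
    print(row)
```

Keeping every ad request in the output, even when no impression was served, is a deliberate choice here: unfilled requests are often exactly what a publisher wants to investigate.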
Should I use Log-Level Data?
The first thing to ask is, What problem am I trying to solve, and how will log-level data provide insights into this problem and drive actions? Then ask, Do I need this log-level data to solve my problem or is some form of this data already available in an aggregated and joined form (through a UI, API, or scheduled report)? The answers will depend on the vendor or platform you are using.
For example, Index Exchange offers Client Audit Logs that provide impression-level data including dimensions such as Advertiser, Buyer, and DSP. These fields are not available in Index’s standard UI or API, so the Client Audit Log feed provides obvious incremental value: it allows publishers to perform buy-side analysis to identify the demand path for their revenue. That analysis can answer questions such as, What are the top advertisers that have increased spend in the last month through the open exchange so I can target them for direct deals? or, Which DSPs are bringing unique demand?
On the other hand, if the only incremental data you get from a log-level file is the geo code of the user who saw the ad, you should decide if that additional piece of information is worth the cost of collecting and processing the log-level data.
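To show what the advertiser question from the Index Exchange example might look like in practice, here is a small Python sketch that ranks advertisers by month-over-month spend increase. The row layout (month, advertiser, revenue in micro-dollars) and the sample values are invented for illustration and are not the actual Client Audit Log format.

```python
from collections import defaultdict

# Hypothetical impression-level rows: (month, advertiser, revenue in micro-dollars).
impression_rows = [
    ("2024-01", "advertiser_a", 4000),
    ("2024-01", "advertiser_a", 3000),
    ("2024-01", "advertiser_b", 5000),
    ("2024-02", "advertiser_a", 9000),
    ("2024-02", "advertiser_b", 2000),
    ("2024-02", "advertiser_c", 1000),
]


def spend_by_month(rows):
    """Sum revenue per advertiser within each month."""
    totals = defaultdict(lambda: defaultdict(int))
    for month, advertiser, revenue in rows:
        totals[month][advertiser] += revenue
    return totals


def top_increases(rows, previous, current, limit=3):
    """Return advertisers whose spend grew from `previous` to `current`."""
    totals = spend_by_month(rows)
    advertisers = set(totals[previous]) | set(totals[current])
    changes = [
        (adv, totals[current].get(adv, 0) - totals[previous].get(adv, 0))
        for adv in advertisers
    ]
    increased = [(adv, delta) for adv, delta in changes if delta > 0]
    # Largest increase first; ties broken alphabetically for stable output.
    return sorted(increased, key=lambda item: (-item[1], item[0]))[:limit]


result = top_increases(impression_rows, "2024-01", "2024-02")
print(result)
```

An analysis like this is only possible when the advertiser dimension is present on each impression row, which is the incremental value the log-level feed adds over an aggregated report.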
How do I start using log-level data?
The process and cost for getting access to log-level data feeds vary from vendor to vendor. Reach out to your vendors to find out whether and how you can get access to log-level files, and know that most vendors charge an additional fee for that access. In addition, the level of standardization, documentation, and support for log-level data varies widely from vendor to vendor. Some vendors have not standardized their log-level data feeds and offer different feeds to different customers.
After working with your vendors to get access to the files, come up with a plan to gather, transform, and report on the data. For a good overview of this process specifically for digital advertising, check out our interview on Best Practices for Data Aggregation or watch our on-demand webinar here.
Want to learn more about the log-level feeds available in Google Ad Manager?
We will be releasing a report of the numerous log-level feeds that Google Ad Manager offers along with insights into the value of these feeds and how they can be used. If you are interested in receiving this report, sign up here.
Need help processing and analyzing your log-level data? Contact us here.