This post is a continuation of the SSIS GoogleAnalyticsSource component description started in Google Analytics for SSIS (SQL Server Integration Services).
So by now you should know how to extract some basic information out of the Google Analytics.
Now lets take a look into the more advanced options of this component:
Custom Dimensions & Metrics
If you are a regular Google Analytics user, you might have changed a lot of changes that the platform has suffered in the years since its inception – Google is always tweaking and improving it. Sometimes those changes are breaking the name of the existing Dimesions and Metrics and so your package one day might get error from the changed name of the Dimension, for example. To solve this, you are able to edit the names of the Dimensions and Metrics by using XML files.
If you go to SSIS GoogleAnalyticsSource release downloads page, you should notice that besides the component itself, there are 2 more download at the moment – Dimensions.xml & Metrics.xml
These are the files that contain information about Google Analytics dimensions and metrics, if something has changed – they might be updated by the authors, or if you are in the hurry – you can go and modify them yourself.
At the principal options screen you have an opportunity to upload the new XML files with updated information. Very neat! :)
Simply explained, Segments are the views for the Google Analytics. They allow to limit a type of the information that will be pre-filtered and inside the Google Analytics Platform – focusing on subsets of sessions and users.
You are allowed to select only 1 active segment, which is logical, since once can’t turn head into more than 1 direction. :)
I hope that the future version shall allow the inclusion of the custom segments. :)
Filters: Dimension & Metrics
I think about the M functionalities that is included in Power Query – filter pushing, which basically means that if you are applying a filter to your data that your are reading from a relation data source, than it will be included as a part of the original query predicate. This filters in Google Analytics component are similar functionalities.
There are 4 nice properties that are not included into the Options screens of the component, but which are editable once you open “Custom Properties” of the SSIS GoogleAnalyticsSource:
– Max Results
– Time Out
– Max Results
This property allows you to control the maximum number of rows that Google Analytics shall deliver after request. Capped at 5000 by default, you might want to increase or decrease it based on your current project’s need.
Threads is a property that allows you to control your service allocated resources for data processing. I would argue that in the most cases this property should be left by default unless there is an explicit need for a change. You don’t want to spent all your server resources on processing data from 1 source. :)
– Time out
Capped by default at 120 seconds, this property allows you to control the amount of time your package might spend waiting for Google Analytics platform to respond on the request.
This property allows to make decision if the excesses of information should be removed if the string for example is too long.
Be aware that Google Analytics Platform gives you the final information only after 48 hours, which means that todays or yesterday’s statistics are not guaranteed to be correct.
You will need to have a paying account in order to have close-to-real-time information.
Take a look at this great component and give some feedback in order to make it event better!