Show the Latest Data

Sometimes you need to just show the latest data. This is often the case for metrics like inventory or some percentages where adding things up just does not make sense. Most analytic tools like Oracle Analytics allows authors to pick from various aggregation methods as a solution, but when building a project you don’t always want this to change for all cases. Users could also leverage date filters and this can be done at various levels in the project, but as new data changes report maintenance goes up!

With all this in mind I have recently introduces the concept of Relative Time into my data sets using a custom column and a function. I am doing this as part of a Data Flow but this concept can be leveraged elsewhere in the product.

Let’s break things down a bit and lay out the use case:

  • This is a self service use case. I am loading a new data file almost every day with data updated to the most recent. This can also be down with a database or other source that gets automatic updates.
  • At times the data updates get a bit behind and they may not have the latest data. Using today – 1 day or anything like that is not predictable.
  • I only care about the most recent days data in my analysis. This could be most recent week, hour, or even minute, but in my case it is day.

So what I have done is created a column that calculates the difference between today (Current Date) and the date column (ObservationDate) in my data set. I ordered the function so I would get number of days away represented in negatives so the closest day is the highest number, in this case -1. Doing it the opposite way would have been my approach for solving the problem of ‘oldest date’. You will see what I mean in the next step.

TIMESTAMPDIFF(SQL_TSI_DAY,CURRENT_DATE, ObservationDate)

Next apply a rank function to the previous function to have each row in my data set report it’s rank. As you can see below that ranks things and because there are a lot of ties in the data set the same rank is repeated many times. Also note that with -1 being my largest number it is going to be ranked #1.

Rank(TIMESTAMPDIFF(SQL_TSI_DAY,CURRENT_DATE, ObservationDate))

So now I know that my most recent time period is 3/27/2020 and I can clean things up so my naming convention is clear for my users consuming the report.

CASE WHEN Rank(TIMESTAMPDIFF(SQL_TSI_DAY,CURRENT_DATE, ObservationDate))=1 THEN 'Most Recent Day' ELSE 'Older' END

With this in place, using filters on my visualization or other levels, I can add the column value ‘Most Recent Day’ for the metrics where I just want the latest data! Once this is done I never have to update my data, I just load new data and the visualizations in my project always show the most recent data.

Also this is a great trick for making your data searchable and ready for voice for Oracle Analytics Day by Day use cases. More to come on this topic so stay tuned.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.