![]() REDSHIFT WINDOW FUNCTIONS CODELet’s say I had a bug in my code and I accidentally adding a bunch of duplicate trades to my trades table and wanted to write a query to clean things up. This might not seem especially interesting at first, but once you have this tool at your disposal you’ll find lots of uses for it.įor instance, let’s say you have duplicated data that you want to filter out, but your table has no primary key to that you can use to figure out which records are duplicated. Ranking and numberingĪside from aggregation, Window Functions can also do ranking or numbering of rows. This will essentially ‘filter’ our aggregation to only include rows that have the same value for symbol as our current row. We also include a partition by symbol clause. In the over clause, we’re still sorting by date and we also have a frame clause. This is different from a normal sum aggregation you would do in Redshift, because you don’t need to add a group by clause, but instead you’re adding an over clause. We can use the sum window function to do this: selectĪs you can see, we’re using the sum aggregation as a window function to calculate our running total. Now let’s say you want to list out all my trades over time, with a running total of your profit per symbol. It’s pretty simple and it has ts (timestamp), symbol and profit columns. Let’s say you also have trades table where you keep track of all your stock trading. Window functions also lets you do more complex aggregations on frames of data. This tells our window function to use all the rows preceding the current row, after sorting on the date field. This time, we also include a frame clause: rows unbounded preceding. Say you always want to calculate the difference of your daily closing balance with the first balance you had when opening the account, you can use the first_value window function to do this: select If we were to order differently in the outer query, the lag function would still calculate gain correctly.Īnother pair of functions that are pretty useful for looking ahead or backwards are first_value and last_value. One interesting thing to note is that the window function does not rely on the ordering of the query itself. We can specify the ordering of rows within the window function in the over(order by date) clause. The lag function allows us to grab the value of a column in a ‘previous’ row ( lead does the opposite, by accessing the next row). The lag window function allow us to simplify things, without having to do a self-join. Left join balances previous_day on dateadd(day, -1, b.date) = previous_day.date , b.balance - previous_day.balance as gain One way to do this would be something like this: select Now let’s say for each day you would like to calculate your loss/gain over the previous day’s balance. Let’s say you have a simple table of timeseries data, for instance the closing balance of your stock market account per day. Two of the most useful window functions are the lead and lag functions. The most basic window functions help you access data from ‘previous’ or ‘following’ rows, much like you can do in a spreadsheet. In this post, I’ll walk you through some of the Window Functions that I’ve found the most useful and hopefully help you gain a better intuition on how they work in general. Window functions work in much the same way, and thus can be super powerful for the same kind of operations you can perform in a spreadsheet and even beyond that. Cells in a spreadsheet can reference arbitrary ranges of other cells relative to themselves and use their values as input to functions. Like the name suggests, Window Functions let you operate on frame or ‘window’ of data and return a value for each row in that result set.Īn easy way to understand this is to think about a spreadsheet. One of the coolest things I learned about in my Redshift journey has been Window Functions.Īlthough Window functions aren’t a novel feature and exists in other SQL databases, they are a really powerful tool to have in your analysis toolbelt and fits in really well with Redshift. ![]()
0 Comments
Leave a Reply. |
AuthorWrite something about yourself. No need to be fancy, just an overview. ArchivesCategories |