If you have multiple metrics that you want to weigh together, it may make sense to normalize the data so one metric (that is measured on a larger scale), doesn't disproportionately sway the composite score.
This SQL snippet (which uses CTEs), is one way you can normalize data in SQL. all_plays simply counts the number of plays per user_id. parameters allows us to pull the maximum and minimum number of plays per user_id.
Finally we normalize and rescale the data in the final select statement, with the minimum games played set to 0 and the maximum set to 1. Everything else falls in between these 2 extremes.
Any other methods you prefer to normalize data?
with all_plays as ( select user_id , count(*) as number_gameplays from gameplays group by user_id ) , parameters as ( select max(number_gameplays) as max_plays , min(number_gameplays) as min_plays from all_plays ) select user_id , (max_plays - number_gameplays) * 1.0 / (max_plays - min_plays) as normalized_plays from all_plays, parameters
Please sign in to leave a comment.