Creator of DiceDB, ex-Google Dataproc, ex-Amazon Fast Data, ex-Director of Engg. SRE and Data Engineering at Unacademy. I spark engineering curiosity through my no-fluff engineering videos on YouTube and my courses
YouTube’s analytics team is studying the evolution of tech content creators. They have historical data showing how creators’ content changes over time in terms of technical depth and entertainment value. Each video is rated on two scales, technical depth and entertainment value, and each creator posts one video every week.
You have a dataset of 100 creators spread across 52 weeks. Each line in the dataset contains the <tech value, entertainment value> of the previous video and the <tech value, entertainment value> of the next video posted by the same creator. Analyzing this will show you how the content evolves over time.
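Now you are given a different list of 30 creators and their current state of content <tech value, entertainment value>. Among these 30 creators, figure out which creator will have the highest technical depth after 4 weeks, which will have the highest entertainment value after 4 weeks, and which creators switch from tech-focused to entertainment-focused or from entertainment-focused to more tech-focused.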
You can output the index of the creator in the list of 30 creators (starting with 0).
Here’s the code for reference and some notes on the solution below.
We need to use the data to calculate the transformation matrix. The transformation matrix will be a 2x2 matrix that tells how much the current tech depth and entertainment value influence the future tech depth and entertainment value.
To generate this matrix, we use the least squares regression method. Either follow the link above or ask your favourite LLM tool to build an understanding. Applying this method to the data gives us the following matrix.
[[0.70500624 0.19902547]
[0.09087316 0.89926622]]
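As a rough sketch of how such a matrix could be estimated, the snippet below fits a 2x2 matrix with NumPy's `np.linalg.lstsq`. The original dataset is not reproduced here, so the data is synthetic and the names `fit_transition_matrix`, `A_true`, `prev_states`, and `next_states` are assumptions made for illustration only. It also assumes the convention that the next state equals the matrix multiplied by the previous state.

```python
import numpy as np

def fit_transition_matrix(prev_states, next_states):
    """Least-squares estimate of A such that next ≈ A @ prev for every pair."""
    X = np.asarray(prev_states, dtype=float)  # shape (n, 2): previous videos
    Y = np.asarray(next_states, dtype=float)  # shape (n, 2): next videos
    # lstsq solves X @ B ≈ Y. Each row satisfies y = A @ x, so B equals A.T.
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return B.T

# Synthetic stand-in for the real dataset (assumption: values are made up).
rng = np.random.default_rng(0)
A_true = np.array([[0.7, 0.2],
                   [0.1, 0.9]])
prev_states = rng.uniform(0.0, 10.0, size=(500, 2))
noise = rng.normal(0.0, 0.01, size=(500, 2))
next_states = prev_states @ A_true.T + noise

A_est = fit_transition_matrix(prev_states, next_states)
print(np.round(A_est, 3))
```

With the real data, `prev_states` and `next_states` would be the two halves of each line in the dataset.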
Now that we have the transformation matrix, we can use it to predict the future state of any creator. The idea is to multiply the transformation matrix with the current state of the creator to get the future state.
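Written out, the one-step prediction looks roughly like this. The symbols t_n and e_n (tech depth and entertainment value of the n-th video) are notation introduced here for illustration, and the matrix entries are rounded from the values above.

```latex
\begin{bmatrix} t_{n+1} \\ e_{n+1} \end{bmatrix}
= A \begin{bmatrix} t_n \\ e_n \end{bmatrix},
\qquad
A \approx \begin{bmatrix} 0.705 & 0.199 \\ 0.091 & 0.899 \end{bmatrix},
\qquad
\begin{bmatrix} t_{n+k} \\ e_{n+k} \end{bmatrix}
= A^{k} \begin{bmatrix} t_n \\ e_n \end{bmatrix}
```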
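To compute the k-th state, we have two options: multiply the current state by the transformation matrix k times, or diagonalize the matrix using its eigenvalues and eigenvectors, as the predict function below does. The second option is better because once the matrix is diagonalized, raising it to the k-th power only requires raising its eigenvalues to the k-th power.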
import numpy as np

def predict(A, x0, k):
    # Diagonalize A = V diag(w) V^-1, so that A^k = V diag(w^k) V^-1.
    eigenvalues, eigenvectors = np.linalg.eig(A)
    return eigenvectors @ np.diag(eigenvalues ** k) @ np.linalg.inv(eigenvectors) @ x0
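To check that the two options agree, here is a small sketch that compares repeated multiplication, the eigendecomposition approach, and NumPy's built-in `np.linalg.matrix_power`. The starting state `x0` is a made-up creator, and the names `predict_loop` and `predict_eig` are used only for this illustration.

```python
import numpy as np

# Transformation matrix from the least squares fit above.
A = np.array([[0.70500624, 0.19902547],
              [0.09087316, 0.89926622]])

def predict_loop(A, x0, k):
    """Option 1: apply the transformation k times."""
    x = np.asarray(x0, dtype=float)
    for _ in range(k):
        x = A @ x
    return x

def predict_eig(A, x0, k):
    """Option 2: compute A^k through the eigendecomposition.

    Assumes A is diagonalizable, which holds for this matrix.
    """
    w, V = np.linalg.eig(A)
    return V @ np.diag(w ** k) @ np.linalg.inv(V) @ x0

x0 = np.array([8.0, 2.0])  # hypothetical creator: <tech value, entertainment value>

loop_result = predict_loop(A, x0, 4)
eig_result = predict_eig(A, x0, 4)
power_result = np.linalg.matrix_power(A, 4) @ x0

print(loop_result, eig_result, power_result)
```

All three results should match up to floating-point error. For a 2x2 matrix and k = 4 the difference in cost is negligible, so the eigendecomposition mostly pays off when k is large or when many different values of k are needed.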
Applying predict to all 30 creators (in the test set), we get the final state for each, and then compute:
np.argmax(final_state[:, 0]) to get the creator with the highest technical depth after 4 weeks
np.argmax(final_state[:, 1]) to get the creator with the highest entertainment value after 4 weeks
argmin of the initial and final states to tell which creators switched from tech-focused to entertainment-focused and from entertainment-focused to more tech-focused
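The final step described above can be sketched as follows. The 30 creators here are randomly generated placeholders, not the real test list. The snippet also assumes one reading of "switched": a creator is tech-focused when the tech value is the larger of the two, and entertainment-focused otherwise. With only two columns, taking the argmax of each row carries the same information as taking the argmin.

```python
import numpy as np

A = np.array([[0.70500624, 0.19902547],
              [0.09087316, 0.89926622]])

# Hypothetical test list: 30 creators, columns are <tech value, entertainment value>.
rng = np.random.default_rng(42)
initial_state = rng.uniform(0.0, 10.0, size=(30, 2))

# Each row x becomes A^4 @ x; for row vectors that is x @ (A^4).T.
A4 = np.linalg.matrix_power(A, 4)
final_state = initial_state @ A4.T

most_technical = int(np.argmax(final_state[:, 0]))
most_entertaining = int(np.argmax(final_state[:, 1]))

# Dominant dimension per creator: 0 = tech-focused, 1 = entertainment-focused.
initial_focus = np.argmax(initial_state, axis=1)
final_focus = np.argmax(final_state, axis=1)

tech_to_ent = np.flatnonzero((initial_focus == 0) & (final_focus == 1))
ent_to_tech = np.flatnonzero((initial_focus == 1) & (final_focus == 0))

print("Highest technical depth after 4 weeks:", most_technical)
print("Highest entertainment value after 4 weeks:", most_entertaining)
print("Switched tech -> entertainment:", tech_to_ent.tolist())
print("Switched entertainment -> tech:", ent_to_tech.tolist())
```

The printed indices start at 0, matching the output format the problem asks for.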