Movie Que
schema,
activity('Get Metadata Combined File').output.firstRow.properties.schema)
--------------------------------------------------------------------------
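from pyspark.sql import SparkSession

# Create (or reuse) a SparkSession for the recommendation job
spark = SparkSession.builder \
    .appName("MovieRecommendation") \
    .getOrCreate()

# ratings_df and users_df are assumed to have been loaded earlier (the loading step is not shown here)
# Inspect the schema and the first five rows of each DataFrame
ratings_df.printSchema()
ratings_df.show(5)
users_df.printSchema()
users_df.show(5)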
Here are some movie recommendation questions along with corresponding
SparkSQL queries:
Question: What are the top 10 highest-rated movies for a specific user (e.g., user
with ID 1)?
SparkSQL Query:
SELECT m.movieId, m.title, rec.rating
FROM movies m
JOIN (
    SELECT movieId, rating
    FROM (
        -- Rank each user's ratings from highest to lowest
        SELECT userId, movieId, rating,
               row_number() OVER (PARTITION BY userId ORDER BY rating DESC) AS rn
        FROM ratings
    ) tmp
    WHERE userId = 1 AND rn <= 10
) rec ON m.movieId = rec.movieId
-- Return the rows in rank order
ORDER BY rec.rating DESC
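To sanity-check the ranking logic without a Spark cluster, the same query shape can be run against a tiny in-memory table. The sketch below is only an illustration. It uses Python's built-in sqlite3 module as a stand-in for SparkSQL (an assumption made for portability; window functions need SQLite 3.25 or newer), the sample rows are made up, and TOP_N is set to 2 instead of 10 so the cut-off is visible. A final ORDER BY is included so the rows come back in rank order.

```python
import sqlite3

# In-memory database standing in for the Spark tables (hypothetical sample data)
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movies (movieId INTEGER, title TEXT);
    CREATE TABLE ratings (userId INTEGER, movieId INTEGER, rating REAL);
    INSERT INTO movies VALUES (1, 'Movie A'), (2, 'Movie B'), (3, 'Movie C');
    INSERT INTO ratings VALUES (1, 1, 3.0), (1, 2, 5.0), (1, 3, 4.0), (2, 1, 4.5);
""")

TOP_N = 2  # illustrative cut-off; the query above uses 10

# Rank each user's ratings, keep the top N for user 1, then join to get titles
top_rated = conn.execute("""
    SELECT m.movieId, m.title, rec.rating
    FROM movies m
    JOIN (
        SELECT movieId, rating
        FROM (
            SELECT userId, movieId, rating,
                   row_number() OVER (PARTITION BY userId ORDER BY rating DESC) AS rn
            FROM ratings
        ) tmp
        WHERE userId = ? AND rn <= ?
    ) rec ON m.movieId = rec.movieId
    ORDER BY rec.rating DESC
""", (1, TOP_N)).fetchall()

print(top_rated)
```

Note that row_number() breaks ties arbitrarily when two ratings are equal, so which of several equally rated movies makes the cut is not guaranteed unless a second sort key (for example movieId) is added to the window's ORDER BY.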
Recommendations Based on Similar Users:
Question: Which movies have not been rated by a specific user (e.g., user with ID
1)?
SparkSQL Query:
SELECT m.movieId, m.title
FROM movies m
LEFT JOIN ratings r ON m.movieId = r.movieId AND r.userId = 1
WHERE r.userId IS NULL
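The LEFT JOIN plus IS NULL pattern above is an anti-join: every movie is kept, and only those with no matching rating row for the user survive the filter. A minimal sketch of that behaviour follows, again using Python's built-in sqlite3 as a stand-in for SparkSQL (an assumption, so it runs without Spark) and made-up sample rows.

```python
import sqlite3

# In-memory database standing in for the Spark tables (hypothetical sample data)
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movies (movieId INTEGER, title TEXT);
    CREATE TABLE ratings (userId INTEGER, movieId INTEGER, rating REAL);
    INSERT INTO movies VALUES (1, 'Movie A'), (2, 'Movie B'), (3, 'Movie C');
    INSERT INTO ratings VALUES (1, 1, 4.0), (2, 2, 5.0), (2, 3, 3.5);
""")

# Movies with no rating row for user 1: the user filter sits in the ON clause,
# so unmatched movies keep a NULL r.userId and pass the WHERE test
unrated = conn.execute("""
    SELECT m.movieId, m.title
    FROM movies m
    LEFT JOIN ratings r ON m.movieId = r.movieId AND r.userId = ?
    WHERE r.userId IS NULL
    ORDER BY m.movieId
""", (1,)).fetchall()

print(unrated)
```

The condition r.userId = 1 belongs in the ON clause. Moving it into WHERE would discard the NULL rows produced by the outer join, and the query would return nothing.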
These are just a few examples of questions you can ask and corresponding SparkSQL
queries you can execute to perform movie recommendation tasks. You can modify these
queries based on your specific requirements and the structure of your data.
---------------------