Sna Lab Report (21mic7199)
CSE4008
Lab Report
Done by
R. Sai Vighnesh
21MIS7157
Week:1-4
Data Visualization
Histogram:
Output:
Line graph:
Output:
Bar Plot:
Output:
Scatter Plot:
Output:
Area Plot:
Output:
Box Plot:
Output:
Week:5-6
YouTube Data Analysis
1)Importing necessary libraries:
5)
6)Task 1: Total views of trending videos
Output:
Output:
Output:
Output:
Output:
Output:
Output:
Output:
Week:7-8
Analyze the Facebook data for sentiment analysis
DATASET: “pseudo_facebook.csv”
B. PREPROCESS THE DATASET. FIND NULL VALUES AND DUPLICATES, AND PROCESS THOSE VALUES.
OUTPUT:
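The preprocessing step in task B can be sketched as follows. This is a minimal illustration, not the code used in the report: it assumes pandas, builds a tiny in-memory stand-in instead of reading pseudo_facebook.csv, and the column names (userid, age, gender, friend_count) are assumptions. Dropping rows is only one possible policy for handling the values found.

```python
import pandas as pd

# Tiny stand-in for pseudo_facebook.csv; the column names are assumptions.
df = pd.DataFrame({
    "userid": [1, 2, 2, 3, 4],
    "age": [14, 21, 21, 35, 52],
    "gender": ["male", "female", "female", None, "male"],
    "friend_count": [0, 12, 12, 40, 7],
})

# Count missing values per column.
null_counts = df.isnull().sum()

# Count rows that exactly repeat an earlier row.
duplicate_rows = int(df.duplicated().sum())

# One possible policy: drop repeated rows first, then rows with missing values.
clean = df.drop_duplicates().dropna().reset_index(drop=True)

print(null_counts)
print("duplicate rows:", duplicate_rows)
print(clean)
```

With the real file, the DataFrame would instead come from pd.read_csv("pseudo_facebook.csv"), and filling missing values (for example with fillna) may be preferable to dropping rows, depending on the analysis.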
OUTPUT:
E. DISPLAY THE AGE GROUPS AND ANALYZE THEIR COUNTS.
OUTPUT:
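One way to form and count age groups for task E is to bin the ages with pandas. The sketch below is illustrative only: the sample ages, the bin edges, and the group labels are all assumptions and are not taken from the report or the dataset.

```python
import pandas as pd

# Sample ages standing in for the dataset's age column (assumed values).
ages = pd.Series([13, 17, 22, 29, 34, 45, 58, 71], name="age")

# Bin edges and labels are illustrative assumptions.
bins = [0, 18, 30, 45, 60, 120]
labels = ["<=18", "19-30", "31-45", "46-60", "61+"]

# pd.cut uses right-inclusive bins by default, so 18 falls in "<=18".
age_group = pd.cut(ages, bins=bins, labels=labels)

# Count users per group, ordered from youngest to oldest group.
counts = age_group.value_counts().sort_index()
print(counts)
```

The resulting counts series can be passed directly to a bar plot to visualize the distribution of age groups.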
OUTPUT:
H. WHICH GENDER HAS THE GREATEST NUMBER OF FRIENDS? USE A SCATTER PLOT.
OUTPUT:
I. WHO HAS THE HIGHEST FRIEND COUNT? USE A BAR PLOT.
OUTPUT:
J. WHICH AGE GROUP HAS THE HIGHEST NUMBER OF LIKES RECEIVED? USE A COUNT PLOT.
OUTPUT:
OUTPUT:
L. WHICH MALE USER HAS THE HIGHEST NUMBER OF LIKES RECEIVED?
OUTPUT:
M. IN WHICH MONTHS ARE FACEBOOK USERS BORN?
OUTPUT:
N. WHICH FEMALE AGE GROUPS HAVE RECEIVED THE GREATEST NUMBER OF LIKES THROUGH MOBILE AND WEB?
OUTPUT:
Week:9-10
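from pathlib import Path

import scrapy


class QuotesSpider(scrapy.Spider):
    # Spider name used on the command line
    name = "quotes"
    start_urls = [
        "https://quotes.toscrape.com/page/1/",
        "https://quotes.toscrape.com/page/2/",
    ]

    def parse(self, response):
        # The page number is the second-to-last URL segment, e.g. "1"
        page = response.url.split("/")[-2]
        filename = f"quotes-{page}.html"
        # Save the raw response body as quotes-<page>.html
        Path(filename).write_bytes(response.body)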
c) If we run the spider that we created (scrapy crawl quotes):
We get the files quotes-1.html and quotes-2.html, each containing the content of its respective URL, as the parse method in quotes_spider.py instructs.
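The file names come from the URL handling inside parse, which can be checked in isolation without running Scrapy. The helper name page_filename below is hypothetical and exists only for this illustration; it repeats the two lines of logic from the spider.

```python
def page_filename(url: str) -> str:
    """Build the output file name the same way the spider's parse method does."""
    # Splitting ".../page/1/" on "/" ends with [..., "page", "1", ""],
    # so index -2 is the page number.
    page = url.split("/")[-2]
    return f"quotes-{page}.html"


urls = [
    "https://quotes.toscrape.com/page/1/",
    "https://quotes.toscrape.com/page/2/",
]
names = [page_filename(u) for u in urls]
print(names)
```

Note that the indexing relies on the trailing slash: for a URL ending in /page/1 with no final slash, index -2 would select "page" rather than the page number.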
d) Let's look at the selection and extraction of data by running this command in cmd:
scrapy shell https://quotes.toscrape.com/page/1/
e) Storing the scraped data:
Use this command: scrapy crawl quotes -O quotes.json
This creates a quotes.json file that stores all the scraped data (the -O option overwrites the file if it already exists).