Twigposterfinal

Uploaded by

api-278681960

The document describes TWIG, a tool that collects and analyzes data from multiple sources, including social media, news RSS feeds, images, and videos. Its RSS algorithm has six steps: 1) create a data structure to store relevant feeds, 2) enter a keyword and a list of news-agency URLs, 3) retrieve RSS feeds using an API, 4) store the feeds in a list, 5) run a web crawler to retrieve the articles, and 6) write the data to a text file. The goals are to implement APIs to collect social media data, develop a crawler to collect RSS feeds and articles, create modules to geocode and detect topics in the data, and provide a user interface. The output is filtered data with metadata that can be used for tasks such as social unrest anticipation and trend analysis.

Why TWIG?

Captures a broad range of public opinion in a given region
Provides a unified dataset drawn from both social media and news agencies, giving a more complete picture
Allows spatial analysis of human behavior in a given region
Identifies hotspots of social unrest, or of any specified topic, around the world

RSS-ALGORITHM

1. Create a data structure to store relevant feeds
2. Enter the keyword and the list of URLs for the leading news agencies
3. Use an API to retrieve the RSS feeds, each entry containing a title, an author, a URL, and a summary of the article
4. Store the RSS feed entries in the list
5. Run the web crawler to retrieve each article using the URLs obtained
6. Write all the data obtained to a text file
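
The six RSS-ALGORITHM steps could be sketched in Python roughly as follows. This is a minimal sketch under assumptions the poster does not state: the names `FeedEntry`, `http_get`, `parse_feed`, and `collect` are hypothetical, feeds are assumed to be RSS 2.0 XML, keyword filtering is a plain case-insensitive substring match, and the "crawler" simply downloads the raw page behind each entry's URL. TWIG's actual API and crawler are not described in the source.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Callable, Iterable, List
from urllib.request import urlopen


@dataclass
class FeedEntry:
    """Step 1: structure holding one relevant feed item (hypothetical name)."""
    title: str
    author: str
    url: str
    summary: str
    article: str = ""  # filled in by the crawl in step 5


def http_get(url: str) -> str:
    """Download a URL and return its body as text (assumes UTF-8 content)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def parse_feed(xml_text: str, keyword: str) -> List[FeedEntry]:
    """Step 3: read RSS 2.0 <item> elements and keep those mentioning keyword."""
    entries = []
    for item in ET.fromstring(xml_text).iter("item"):
        entry = FeedEntry(
            title=(item.findtext("title") or "").strip(),
            # Assumption: the plain RSS 2.0 <author> tag; some feeds use others.
            author=(item.findtext("author") or "").strip(),
            url=(item.findtext("link") or "").strip(),
            summary=(item.findtext("description") or "").strip(),
        )
        if keyword.lower() in f"{entry.title} {entry.summary}".lower():
            entries.append(entry)
    return entries


def collect(keyword: str, feed_urls: Iterable[str], out_path: str,
            fetch: Callable[[str], str] = http_get) -> List[FeedEntry]:
    """Steps 2-6: keyword and feed URLs in, text file of entries out."""
    entries: List[FeedEntry] = []
    for feed_url in feed_urls:  # step 2: URLs supplied by the caller
        entries.extend(parse_feed(fetch(feed_url), keyword))  # steps 3-4
    for entry in entries:  # step 5: fetch the page behind each entry
        if entry.url:
            entry.article = fetch(entry.url)
    with open(out_path, "w", encoding="utf-8") as out:  # step 6
        for e in entries:
            out.write(f"{e.title}\n{e.author}\n{e.url}\n{e.summary}\n{e.article}\n\n")
    return entries
```

Calling `collect("protest", feed_urls, "feeds.txt")` with a list of feed URLs would run all six steps. The `fetch` parameter is there so the download step can be swapped for a stub when no network is available.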

Objectives
The objectives of this work are to:
1. Implement APIs to collect data from multiple social media sites
2. Develop a web crawler to collect RSS feeds and their associated news articles
3. Develop and implement a GeoCoder module that associates each piece of data with a location, covering not only text-based sources but also images and videos
4. Develop and implement a Topic Detector module for all types of datasets that helps categorize each piece of data gathered from the web
5. Provide a fully functioning User Interface that allows one to gather information from multiple sources on the web
Figure 1: TWIG User Interface

Anticipated Uses of the Data Collected

1. Social unrest anticipation
2. Natural disaster relief
3. Trend analysis for a certain topic of interest
4. Regional analysis
5. Story building

Acknowledgement
We would like to thank Dr. Mei Chen, Dr. Lok Lew-Yan Voon, and The Citadel Foundation for their support in this work.

[System diagram labels]
GUI
Social Media (Twitter, Reddit, Tumblr)
RSS Feeds (Leading newspapers worldwide)
Images and Videos (Flickr, Instagram, YouTube)
TWIG Engine
GeoCoder and Topic Detector
Filtered Data
MetaData Extractor (Location, Date/Time, Topic, Description, Author)