Full Proposal
● Functionality
● Timeline
● Review/Conclusion
Functionality
Orwebb will be a privacy-focused consumer tool, primarily using Twitter, that reports on and
examines the implications of the end user's digital footprint. With it, I hope to offer the user
either more security or more knowledge about how much their digital footprint overlaps with their
current identity, whether there are any security risks (such as multiple accounts linked to the
same Twitter ID), how they are related to their family, and how their tweets affect the overall
environment of the internet.

I plan on reaching this goal by using the Twitter API together with data mining and data
scraping. Data mining is the process of analyzing data, usually from large databases, to find
patterns, whereas data scraping is the extraction of specific pieces of information (such as
names or ages) from a larger source so that they can then be analyzed. Using the Twitter API and
Python, I plan on doing just that: by aggregating large amounts of data, I can identify anything
from current trends to the spread of influenza.

However, the first thing I really want to build is something I call a “Relational Engine”: an
analysis program for Orwebb that can determine whether certain users are closely related (such as
father and son, or even whole family trees). If I am able to construct this, I could view not
only the impact of the individual user, but also the impact the user has on their family and the
impact their family has on them.

Twitter does have rules on what data I can and cannot collect; for example, I cannot scrape
geolocation data from individual tweets. I could instead take the location from the user's main
profile and test its accuracy against the user's other tweets. (For example, if a user said in a
tweet that they were moving to a location, and their profile location is different, that location
would not be displayed.) In doing this, I hope to show the end user how public their data is, and
what they can and should do to change it.
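As a rough sketch of how the Relational Engine might weigh evidence that two accounts belong to
relatives, the Python below adds up three simple signals: a shared surname in the display name, a
mutual follow, and tweets that mention the other account alongside a kinship word. The Account
class, the KINSHIP_WORDS list, and the scoring itself are all assumptions made for illustration;
none of them come from the Twitter API, and the data is assumed to have been collected already.

```python
import re
from dataclasses import dataclass, field

# Hypothetical kinship vocabulary; a real engine would need a much richer list.
KINSHIP_WORDS = {"dad", "father", "mom", "mother", "son", "daughter", "brother", "sister"}


@dataclass
class Account:
    """Illustrative container for data already collected about one user."""
    handle: str
    display_name: str
    following: set = field(default_factory=set)
    tweets: list = field(default_factory=list)


def surname(account):
    """Return the last word of the display name, lowercased ('' if empty)."""
    parts = account.display_name.split()
    return parts[-1].lower() if parts else ""


def kinship_mentions(source, target):
    """Count tweets by `source` that mention @target together with a kinship word."""
    mention = "@" + target.handle.lower()
    count = 0
    for text in source.tweets:
        words = set(re.findall(r"[@\w]+", text.lower()))
        if mention in words and words & KINSHIP_WORDS:
            count += 1
    return count


def relation_score(a, b):
    """Sum the evidence that accounts `a` and `b` are related (assumed weights of 1)."""
    score = 0
    if surname(a) and surname(a) == surname(b):
        score += 1  # shared surname
    if b.handle in a.following and a.handle in b.following:
        score += 1  # mutual follow
    score += kinship_mentions(a, b) + kinship_mentions(b, a)
    return score


if __name__ == "__main__":
    alice = Account("alice_smith", "Alice Smith", {"bob_smith"},
                    ["Happy birthday to my dad @bob_smith!"])
    bob = Account("bob_smith", "Bob Smith", {"alice_smith"}, ["Lunch was great"])
    print(relation_score(alice, bob))
```

A higher score only means more circumstantial evidence. A shared surname alone, for instance,
says very little, so any real threshold would need to be tuned against known examples.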
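The profile-location check described above could look something like the following minimal
sketch. It assumes the profile location and tweet texts have already been retrieved as plain
strings, and it only recognizes phrases of the form "moving to X" or "moved to X" where X is
capitalized. The function names and the MOVE_PATTERN regular expression are hypothetical, and
real tweets would need far more robust parsing.

```python
import re

# Hypothetical pattern: catches "moving to Denver" or "moved to New York".
MOVE_PATTERN = re.compile(
    r"\b(?:[Mm]oving|[Mm]oved)\s+to\s+([A-Z][\w.]*(?:\s+[A-Z][\w.]*)*)"
)


def claimed_locations(tweets):
    """Return every place name that follows 'moving to' / 'moved to' in the tweets."""
    found = []
    for text in tweets:
        for match in MOVE_PATTERN.finditer(text):
            found.append(match.group(1).strip())
    return found


def normalize(place):
    """Lowercase a place name and drop everything except ASCII letters and spaces."""
    return re.sub(r"[^a-z ]", "", place.lower()).strip()


def displayable_location(profile_location, tweets):
    """Return the profile location, or None if a tweet contradicts it."""
    if not profile_location:
        return None
    profile = normalize(profile_location)
    for claim in claimed_locations(tweets):
        # "denver" is contained in "denver co", so that counts as consistent.
        if normalize(claim) not in profile:
            return None
    return profile_location


if __name__ == "__main__":
    print(displayable_location("Denver, CO", ["We are moving to Denver next month!"]))
    print(displayable_location("Denver, CO", ["Just moved to Boston."]))
```

Returning None here stands in for "that location would not be displayed"; a fuller version might
instead report the conflict to the user so they can see which tweet revealed it.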
I plan on doing this for a multitude of reasons. Primarily, not only am I interested in the
topic of data analytics, but this project also allows me to “throw myself into” the larger realm
of advanced Python and advanced-level programming. Since I do plan on doing this for a career, I
wish to amass as much experience as possible in order to make my skills that much
Timeline
Goal Set 1
● Learn and review Python concepts in order to utilize the Twitter API
● Learn how to utilize the Twitter API and how to begin extracting rudimentary amounts of
data
Goal Set 2
● Once research on how to use the API is complete, begin extracting progressively more data
Goal Set 3
Currently, my greatest concern is the computational power required for all of the analytics I
plan to perform. If needed, however, I can either cut back on the data collected to improve
performance or optimize my code to lighten the load on the user's machine. It may be effective to
offer a “Low Spec” mode to cater to users who do not have the most powerful hardware. Other than
that, my concern about accessing the Twitter API has been addressed, since I have received API
access, so currently everything does look suitable
Conclusion
Python, which so far seems to be one of my favorite languages to use. In doing so, I hope to gain
a stronger command of programming in general, and to create this program and be proud of it.
Other than that, working with tools such as the Twitter API and applying advanced programming
skills can better prepare me for the future I plan on having. It will allow me not only to gain
greater intuition about the world of programming, but also to show the skill I already have, and