Computer Science Final
Analysis
Amy Fullington
Block #4 Computer Science
Introduction
★ Goodreads is a social platform utilized by book readers to rate and discuss works of
literature with friends and like-minded followers
★ Over 20,000 books located in the database
★ Allows members to update book progress, rate and review books, join virtual book
clubs, etc.
★ Historically, gender discrimination has skewed authorship toward male
authors
★ Genre also has a large impact on a book's traits
★ The way a book presents itself determines the audience attracted and reviews
received
★ Purpose of study was to determine the gender disparities in book writing as well as
the effect certain attributes of books have on the audience and on each other
Methodology
★ Dataset downloaded to Excel from Kaggle (“goodreads books/author data” by Ben
Roden)
★ 22,892 books recorded x 20 categories
★ 6 megabytes
★ Genres in the “genre_1” column converted to numbers 0-43
★ Genders in “author_gender” column converted into 0’s (female) and 1’s (male)
★ All other necessary data numerical, so no conversion necessary
★ Saved as a CSV (Comma Delimited) file in the corresponding Python project
★ I was drawn to this dataset because I have always been an avid reader and was first
introduced to Goodreads by my English teacher in eighth grade.
Genre Number
Fiction 1
Mystery 2
Romance 3
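The conversion described in the methodology could be sketched in Python roughly as follows. This is a minimal illustration, not the code used in the study: the helper name `encode_rows`, the sample rows, and the rule of numbering genres in order of first appearance are assumptions. Only the column names `genre_1` and `author_gender` and the 0 (female) / 1 (male) coding come from the slides.

```python
# Hypothetical sample rows; the real dataset has 22,892 rows.
SAMPLE_ROWS = [
    {"genre_1": "Fiction", "author_gender": "female"},
    {"genre_1": "Romance", "author_gender": "male"},
    {"genre_1": "Fiction", "author_gender": "male"},
]

# Gender coding as stated in the slides.
GENDER_CODES = {"female": 0, "male": 1}


def encode_rows(rows):
    """Replace genre and gender text with numbers.

    Genres are numbered in order of first appearance (an assumption;
    the slides do not say how the codes were assigned).
    """
    genre_codes = {}
    encoded = []
    for row in rows:
        genre = row["genre_1"]
        if genre not in genre_codes:
            genre_codes[genre] = len(genre_codes)
        gender = row["author_gender"].strip().lower()
        encoded.append({
            "genre_1": genre_codes[genre],
            "author_gender": GENDER_CODES[gender],
        })
    return encoded, genre_codes


encoded_rows, genre_codes = encode_rows(SAMPLE_ROWS)
print(genre_codes)
print(encoded_rows)
```

Keeping the returned `genre_codes` dictionary makes it possible to translate the numbers back into genre names when labeling charts.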
What genre is most commonly written
by each gender?
Female - Romance, Fantasy, Young Adult
Male - Fiction, Fantasy, Romance
Discussion
★ Female top genre was romance
○ Writers tend to write about what interests them
○ Women tend to be more romantic than men
○ Stereotypes also play a role (women's expression)
○ More comfortable putting their names on romances
★ Male top genre (besides fiction) was fantasy
○ Historical context
○ Gender norms
○ More comfortable writing this genre
○ Lure of money
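One way to arrive at a ranking like the one above is to count genres separately for each gender. The sketch below assumes the data has been reduced to (gender, genre) pairs. The function name `top_genres_by_gender` and the sample pairs are hypothetical and are not taken from the dataset.

```python
from collections import Counter, defaultdict

# Hypothetical (gender, genre) pairs standing in for the dataset's columns.
SAMPLE_BOOKS = [
    ("female", "Romance"), ("female", "Romance"), ("female", "Fantasy"),
    ("male", "Fiction"), ("male", "Fiction"), ("male", "Fantasy"),
    ("male", "Fiction"), ("female", "Young Adult"), ("female", "Romance"),
]


def top_genres_by_gender(books, n=3):
    """Return the n most frequent genres for each gender."""
    counts = defaultdict(Counter)
    for gender, genre in books:
        counts[gender][genre] += 1
    return {
        gender: [genre for genre, _ in counter.most_common(n)]
        for gender, counter in counts.items()
    }


top = top_genres_by_gender(SAMPLE_BOOKS, n=2)
print(top)
```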
What book genre tends to receive the
highest ratings?
Poetry, Animals, and Religion books receive the highest average rating
Discussion
★ Highest rated genre overall is poetry
○ Reason lies in the genre's followers
○ Very specific genre that specific people read
○ Kind people = good reviews
★ Backed up by runner-up genres - animals and religion
○ Narrow audience
○ People who read about these subjects tend to enjoy them
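A per-genre average like this can be computed by grouping values by genre and taking the mean of each group. The sketch below is an illustration under assumptions: the function name `mean_by_genre` is hypothetical and the sample ratings are made up, not taken from the dataset.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (genre, rating) pairs; the numbers are invented for illustration.
SAMPLE_RATINGS = [
    ("Poetry", 4.5), ("Poetry", 4.25), ("Fiction", 3.75),
    ("Fiction", 4.0), ("Religion", 4.25), ("Fiction", 3.5),
]


def mean_by_genre(pairs):
    """Group values by genre and return (genre, mean) sorted from highest to lowest."""
    grouped = defaultdict(list)
    for genre, value in pairs:
        grouped[genre].append(value)
    averages = {genre: mean(values) for genre, values in grouped.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


ranking = mean_by_genre(SAMPLE_RATINGS)
for genre, average in ranking:
    print(f"{genre}: {average:.2f}")
```

The same helper works for any numeric column, so passing (genre, page count) pairs would rank genres by average length instead.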
Which genres' books tend to be the
longest?
Anthologies, Biographies, and Historical novels average the most pages
Discussion
★ Longest books are anthologies
○ Average around 600 pages
○ Collections of literary works chosen by the creators,
meaning one anthology contains many works.
○ Increases size
★ Historical novels and biographies cover intricate subjects
○ History = elaborate when considering individual countries
and the world alike
○ Biographies = someone's entire life in one book
Does book length correlate with book
rating? How about the number of
ratings?
Neither average rating nor number of ratings correlates with a book’s length
Discussion
★ No meaningful correlation for either comparison
○ r-values of 0.11209898 and 0.04089959, both close to 0
○ 1 is a perfect positive correlation, -1 is a perfect
negative correlation, and 0 is no linear correlation
○ Books have differing audiences
○ Opinions differ when considering one book
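For reference, an r-value of this kind is a Pearson correlation coefficient, which can be computed by hand as shown below. This is a sketch, not the study's code: the function name `pearson_r` and the small lists are hypothetical, chosen so that the three reference cases (1, -1, and 0) are easy to see.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(spread_x * spread_y)


# Hypothetical page counts and made-up comparison values.
pages = [100, 200, 300, 400]
perfect = pearson_r(pages, [2 * p for p in pages])      # rises exactly with length
inverse = pearson_r(pages, [-p for p in pages])         # falls exactly with length
unrelated = pearson_r(pages, [4.0, 3.0, 3.0, 4.0])      # no linear trend

print(perfect, inverse, unrelated)
```

Values such as 0.11 and 0.04 sit much closer to the "unrelated" case than to either extreme.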
Out of the top 50 rated books, how
many of them were written by male
authors? Female authors?
The split is approximately even between female and male authors who wrote a
top fifty rated book
Discussion
★ The top fifty rated books have an even split between female and
male authors
○ 24 female (48.0%)
○ 26 male (52.0%)
○ Suggests that an author's gender has no direct influence on how
highly a book is rated
○ No gender is preferred over the other overall
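A split like this can be obtained by sorting books by rating, keeping the top N, and counting authors by gender. The sketch below is illustrative only: the function name `gender_split`, the key `average_rating`, and the five sample books are assumptions (`author_gender` is the column named in the methodology).

```python
from collections import Counter

# Hypothetical books; ratings and genders are invented for illustration.
SAMPLE = [
    {"average_rating": 4.8, "author_gender": "female"},
    {"average_rating": 4.7, "author_gender": "male"},
    {"average_rating": 4.6, "author_gender": "male"},
    {"average_rating": 4.1, "author_gender": "female"},
    {"average_rating": 3.9, "author_gender": "female"},
]


def gender_split(books, top_n=50):
    """Return {gender: (count, percent)} for the top_n highest-rated books."""
    ranked = sorted(books, key=lambda b: b["average_rating"], reverse=True)[:top_n]
    counts = Counter(b["author_gender"] for b in ranked)
    total = len(ranked)
    return {
        gender: (count, round(100 * count / total, 1))
        for gender, count in counts.items()
    }


split = gender_split(SAMPLE, top_n=3)
print(split)
```

Applied to the counts reported above, the same arithmetic gives 24 of 50 = 48.0% and 26 of 50 = 52.0%.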
QUESTIONS?