- Change the type of the scrape_id column to number.
- Change the values t and f to True and False for readability in the following columns: host_is_superhost, host_identity_verified, has_availability, instant_bookable.
- Delete the following unnecessary or duplicated columns: last scraped, source, description (blank), host_url, host_about, host_thumbnail_url, host_picture_url, host_has_profile_pic, neighbourhood, neighbourhood_group_cleansed (blank), bathrooms, amenities, minimum_nights, maximum_nights, minimum_nights_avg_ntm and maximum, calendar_updated, availability_30_60_90, number_of_reviews_ltm and Id30, licence (blank), calculated_host columns, reviews_per_month.
- Change the price column to a number type for better accuracy in SQL.
- Clean the reviews column of inaccurate reviews that are above the rating scale.
- Change the column types of the percentage columns.
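The value conversions listed above can be sketched in Python as a rough illustration. This is not the tool or method actually used for this step. The function names are hypothetical, and the raw formats (`$1,250.00` for prices, `95%` for rates, a rating scale that tops out at 5) are assumptions about the export, not something these notes confirm.

```python
from typing import Optional


def parse_flag(value: str) -> Optional[bool]:
    """Map 't'/'f' to True/False; any other value becomes None."""
    return {"t": True, "f": False}.get(value.strip().lower())


def parse_price(value: str) -> Optional[float]:
    """Turn a price string such as '$1,250.00' into a float (assumed format)."""
    cleaned = value.replace("$", "").replace(",", "").strip()
    return float(cleaned) if cleaned else None


def parse_percent(value: str) -> Optional[float]:
    """Turn '95%' into 0.95; blanks and 'N/A' become None (assumed format)."""
    cleaned = value.strip().rstrip("%")
    if cleaned in ("", "N/A"):
        return None
    return float(cleaned) / 100


def valid_rating(value: Optional[float], scale_max: float = 5.0) -> Optional[float]:
    """Keep a rating only if it lies on the scale; scale_max=5.0 is an assumption."""
    if value is None or not 0 <= value <= scale_max:
        return None
    return value
```

Returning None for unparseable values keeps bad rows visible as nulls instead of silently turning them into zeros.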
* Work done in SQL
- Convert the id column to bigint so that ids display correctly.
- Separate the host location into host city and host country columns.
- Reformat the columns that hold f and t to false and true.
- Fix the verifications column so that it holds the count of verifications instead of their types.
- Fill the null host_neighborhood rows with the values from the neighbourhood_cleansed column.
- Convert the rate columns (percentages) to categorical values, then delete the original ones.
- Add an availability percentage column.
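The row-level logic behind the SQL steps above might look like the following Python sketch. It is not the SQL that was run. All function names are hypothetical, and several details are assumptions: that host location is stored as `City, Country`, that verifications is stored as a list literal such as `['email', 'phone']`, and that availability is counted in days out of a 365-day window.

```python
import ast
from typing import Optional, Tuple


def split_location(location: Optional[str]) -> Tuple[Optional[str], Optional[str]]:
    """Split 'London, United Kingdom' into (city, country).

    Assumption: first part is the city, last part is the country;
    a single value is kept as the city with no country.
    """
    if not location:
        return None, None
    parts = [p.strip() for p in location.split(",") if p.strip()]
    if not parts:
        return None, None
    country = parts[-1] if len(parts) > 1 else None
    return parts[0], country


def count_verifications(raw: Optional[str]) -> int:
    """Count the entries in a list literal like "['email', 'phone']" (assumed format)."""
    if not raw:
        return 0
    try:
        items = ast.literal_eval(raw)
    except (ValueError, SyntaxError):
        return 0
    return len(items) if isinstance(items, (list, tuple)) else 0


def fill_neighbourhood(host_value: Optional[str], cleansed_value: str) -> str:
    """Fall back to neighbourhood_cleansed when the host value is missing."""
    return host_value if host_value else cleansed_value


def availability_pct(available_days: int, window_days: int = 365) -> float:
    """Share of the window the listing is available, as a percentage."""
    return round(100 * available_days / window_days, 1)
```

In SQL the neighbourhood fallback corresponds to a `COALESCE` over the two columns; the other helpers would need string functions that differ between database engines.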
* Analyzing what makes an Airbnb successful on three different criteria:
- The best price
  - What is the most profitable neighbourhood in London
  - What is the most profitable room_type
  - Do the number of bedrooms and beds influence the price
  - Do hosts with superhost status earn more
  - Does the instant bookable status increase the price of the property
  - Which number of accommodations typically earns the most money
  - Do long-standing members earn more than newer members
  - BONUS: Do hosts from the same country earn more than foreigners
- The highest ratings
  - Which neighbourhoods typically get the best reviews
  - Do the response and acceptance rates lead to better reviews
  - Does the host_response time lead to better reviews
  - Which room type has the best reviews
- The highest activity
  - Which neighbourhoods typically have the most bookings
  - Does more verification lead to more bookings
  - Which number of accommodates do people generally gravitate to
  - Does the superhost status make the Airbnb more bookable
  - Does the verification of the host's identity lead to more bookings
  - Do hosts with multiple properties generally have more bookings
  - Does the availability of the property affect its activity
* Work done in Tableau
* Tables to import into an Excel workbook
- Most expensive, best reviewed and most booked neighbourhood
- Which room type is the most expensive, most enjoyed and most booked
- The price of the Airbnb depending on the number of beds
- Superhost status
- Instant bookable
- Older members' revenue
- Acceptance/response rate and time
- Number of verifications effect on bookings
- Identity verification effect on bookings
- Multiple properties effect on bookings
- Number of accommodations effect on bookings
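Most of the questions and summary tables above come down to a grouped average: group the listings by one attribute (neighbourhood, room type, superhost flag) and average a measure (price, rating, bookings). A minimal Python sketch of that pattern follows. The helper names are hypothetical, and this is not how the Tableau or Excel tables were actually built.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List, Mapping, Tuple


def average_by(rows: Iterable[Mapping], group_key: str, value_key: str) -> Dict[str, float]:
    """Average value_key per distinct group_key, skipping rows with a missing value."""
    groups: Dict[str, List[float]] = defaultdict(list)
    for row in rows:
        value = row.get(value_key)
        if value is not None:
            groups[row[group_key]].append(value)
    return {group: mean(values) for group, values in groups.items()}


def ranked(averages: Mapping[str, float]) -> List[Tuple[str, float]]:
    """Sort the groups from highest to lowest average."""
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)
```

For example, `ranked(average_by(listings, "room_type", "price"))` would order room types by average price, where `listings` is a list of row dictionaries built from the cleaned table.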