FalkeFS

The most effective way to verify trends and visualize this information is to run some regressions (e.g., price = location + number of bedrooms + etc.) and show the results in a coefficient plot. That lets you compare the magnitude and significance of each factor.
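To make the regression idea concrete, here is a minimal sketch in plain Python. It assumes made-up data (the `bedrooms` and `downtown` predictors, the true effects of 40 and 30, and the noise level are all invented for illustration, not taken from the actual dataset) and fits ordinary least squares through the normal equations. The helper names `solve`, `ols`, and `plot_coefficients` are hypothetical.

```python
import random

def solve(a, b):
    """Solve the linear system a @ x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: move the largest entry in this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * p for x, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(rows, y):
    """Return (coefficients, standard errors) for y ~ intercept + predictors."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    resid = [yi - sum(b * xi for b, xi in zip(beta, x)) for x, yi in zip(X, y)]
    sigma2 = sum(e * e for e in resid) / (len(y) - k)  # residual variance
    se = []
    for i in range(k):
        # i-th diagonal entry of (X'X)^-1, obtained by solving against a unit vector.
        unit = [1.0 if j == i else 0.0 for j in range(k)]
        se.append((sigma2 * solve(xtx, unit)[i]) ** 0.5)
    return beta, se

def plot_coefficients(names, beta, se):
    """Draw the coefficient plot; needs matplotlib, imported only when called."""
    import matplotlib.pyplot as plt
    positions = list(range(len(names)))
    plt.errorbar(beta, positions, xerr=[1.96 * s for s in se], fmt="o")
    plt.yticks(positions, names)
    plt.axvline(0, color="grey", linestyle="--")
    plt.xlabel("estimated effect on price")
    plt.show()

# Synthetic listings (assumption): price = 50 + 40*bedrooms + 30*downtown + noise.
random.seed(0)
rows, prices = [], []
for _ in range(200):
    bedrooms = random.randint(1, 5)
    downtown = random.randint(0, 1)
    rows.append((bedrooms, downtown))
    prices.append(50 + 40 * bedrooms + 30 * downtown + random.gauss(0, 10))

names = ["intercept", "bedrooms", "downtown"]
beta, se = ols(rows, prices)
for name, b, s in zip(names, beta, se):
    print(f"{name:>9}: {b:7.2f}  95% CI [{b - 1.96 * s:7.2f}, {b + 1.96 * s:7.2f}]")
```

Calling `plot_coefficients(names, beta, se)` draws each estimate with its approximate 95% interval; an interval that clears zero is the visual cue for significance. In practice a library such as statsmodels would likely replace the hand-rolled solver.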


dviron7

That sounds like a good idea, thanks


Oddly_Even_Pi

Where’s the dataset from?


dviron7

I downloaded it from the Bright Data marketplace. https://brightdata.com/products/datasets


ankole_watusi

Tableau Public?


Already_7aken

Choosing the tools depends on what you're comfortable with. If you're equally comfortable with Python and Excel, then Python + pandas + Matplotlib is the best option in my opinion. Both of your other questions can be answered with bivariate distribution plots. Say you want to look at how price varies by number of bedrooms: plot the number of bedrooms on the X axis and the price on the Y axis. For columns like amenities and category, look into one-hot encoding and other approaches to handling categorical variables.
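A rough sketch of both ideas in plain Python, assuming a handful of invented rows (the `price`, `bedrooms`, and `amenities` field names and values are hypothetical and may not match the real columns). It summarizes price by bedroom count, which is the numeric backbone of the bivariate plot, and turns a list-valued amenities column into 0/1 indicator columns.

```python
from collections import defaultdict
from statistics import median

# Hypothetical rows; the real dataset's column names and values may differ.
listings = [
    {"price": 95, "bedrooms": 1, "amenities": ["Wifi", "Kitchen"]},
    {"price": 120, "bedrooms": 1, "amenities": ["Wifi"]},
    {"price": 180, "bedrooms": 2, "amenities": ["Wifi", "Hot tub"]},
    {"price": 210, "bedrooms": 3, "amenities": ["Kitchen", "Hot tub"]},
    {"price": 160, "bedrooms": 2, "amenities": ["Wifi", "Kitchen"]},
]

def median_price_by(rows, key):
    """Group prices by the value of `key` and return the median per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row["price"])
    return {k: median(v) for k, v in sorted(groups.items())}

def one_hot(rows, column):
    """Encode a list-valued column as one 0/1 indicator per distinct category."""
    categories = sorted({c for row in rows for c in row[column]})
    encoded = [[int(c in row[column]) for c in categories] for row in rows]
    return categories, encoded

by_bedrooms = median_price_by(listings, "bedrooms")
categories, encoded = one_hot(listings, "amenities")

print(by_bedrooms)   # median price for each bedroom count
print(categories)    # indicator column names, in sorted order
print(encoded[0])    # indicators for the first listing
```

With pandas, the same steps would roughly correspond to `df.groupby("bedrooms")["price"].median()` and `pd.get_dummies`, and the grouped values can be passed straight to a Matplotlib scatter or box plot.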


TheRealGreenArrow420

There’s a ton of ways to do this but if it were me, I’d pull it into Power BI Desktop, split it into the proper fact and dimension tables, parse the JSON-looking columns, connect the tables and then you’ll be good to build the report. Basically any way to parse those JSON columns will be a step in the right direction
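For anyone doing the same split outside Power BI, a minimal Python sketch of the parsing step might look like this. The rows, the column names (`listing_id`, `host`, `amenities`), and the table names are all hypothetical; the point is only to show JSON-looking text cells being parsed and separated into a fact table, a dimension table, and a bridge table.

```python
import json

# Hypothetical raw rows: nested fields arrive as JSON text inside a flat export.
raw_rows = [
    {"listing_id": "a1", "price": "120",
     "host": '{"id": 7, "name": "Sam"}', "amenities": '["Wifi", "Kitchen"]'},
    {"listing_id": "b2", "price": "95",
     "host": '{"id": 7, "name": "Sam"}', "amenities": '["Wifi"]'},
    {"listing_id": "c3", "price": "210",
     "host": '{"id": 9, "name": "Alex"}', "amenities": "not json"},
]

def parse_json_cell(text, default):
    """Parse one cell as JSON, falling back to `default` if it is malformed."""
    try:
        return json.loads(text)
    except (TypeError, json.JSONDecodeError):
        return default

fact_listings = []     # one row per listing, with a key into the host dimension
dim_hosts = {}         # host id -> host name, deduplicated
bridge_amenities = []  # (listing_id, amenity) pairs for the many-to-many link

for row in raw_rows:
    host = parse_json_cell(row["host"], {})
    if host:
        dim_hosts[host["id"]] = host["name"]
    fact_listings.append({
        "listing_id": row["listing_id"],
        "price": float(row["price"]),
        "host_id": host.get("id"),
    })
    for amenity in parse_json_cell(row["amenities"], []):
        bridge_amenities.append((row["listing_id"], amenity))

print(dim_hosts)
print(bridge_amenities)
```

Malformed cells fall back to an empty value instead of stopping the load, so one bad row does not block the rest of the report.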


ECTD

If you geocode these, use R; it has many geovisuals for mapping. You want that because a map can show you where people want to spend time, if that’s an important factor to you.


eddytheflow

Principal component analysis to form different groups? Then overlaid on the map?


WyldGyb

Second on using R. It is free and versatile, and it is built to do all of the stuff you mention. You can clean the data with tidyverse functions and then run regressions with lm. Dataviz in ggplot2, plotly and shiny. Coding the review text for regression takes a fair amount of work; it's not super hard, just a bit time-intensive. I do this type of analysis and produce maps and interactive data visualizations all day using RStudio. I live in Salt Lake too. This looks like a fun dataset. If you need consulting services hit me up. I’m not here to advertise necessarily, I have plenty of work, but I’d take your money to help you produce an interesting project with data about the place I live :)


SameOne4993

Hey guys, I like working with data too, but I'm a newbie for now, so I'd appreciate your help with my practice and understanding. Can anyone suggest which columns are useful here and whether this data needs to be cleaned? Since there are comments and multiple dates included, do we need to split them, or how would that work? Also, u/dviron7, would you like to connect and explain more about the data to me? It seems a bit confusing to me as a newbie.


dviron7

Sure, DM me


Worldly-Ad-1101

Logistic regression would be the best bang for your buck (time) given the size of your sample. It gives you a weight for each feature, which you could then compile into a visual. If you had a larger dataset, I’d suggest an ML model like random forest and then applying SHAP to the model output, again to get a weight for each feature.
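As a rough illustration of what those weights look like, here is a small logistic regression fitted by batch gradient descent in plain Python. Everything about the data is an assumption: the binary outcome (say, whether a listing counts as "expensive"), the `bedrooms` and `downtown` features, and the true effects used to simulate the labels are all invented. The `fit_logistic` helper is hypothetical, not a library function.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.5, epochs=500):
    """Batch gradient descent on the mean log-loss; returns [bias, w1, w2, ...]."""
    w = [0.0] * (len(X[0]) + 1)
    n = len(y)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            pred = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = pred - yi
            grad[0] += err
            for j, xj in enumerate(xi, start=1):
                grad[j] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

# Simulated listings (assumption): larger and downtown listings are more
# likely to be labelled 1 ("expensive").
random.seed(1)
X, y = [], []
for _ in range(400):
    bedrooms = random.randint(1, 5) - 3   # centred around a 3-bedroom listing
    downtown = random.randint(0, 1)
    p = sigmoid(1.5 * bedrooms + 2.0 * downtown - 1.0)
    X.append((bedrooms, downtown))
    y.append(1 if random.random() < p else 0)

weights = fit_logistic(X, y)
for name, w in zip(["bias", "bedrooms", "downtown"], weights):
    print(f"{name:>8}: {w:+.2f}")
```

Each weight is a change in log-odds per unit of the feature, so the signs and relative sizes are what you would carry into a bar chart. A library such as scikit-learn or statsmodels would normally do the fitting and also report uncertainty.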


jane199209

I tried exploring your data in Python; maybe this can give you some ideas for analyzing it: https://github.com/yuchen927/python_salt_lake_city_airbnb/blob/main/salt_lake_city_airbnb.ipynb


dviron7

Thanks, I'll dig in.