RedFlagDeals is a forum where users can post product sales that they come across. The "All Hot Deals" section of the forum was scraped for relevant information on July 17, 2020.
I have supplied a kernel showing how to clean the data and will follow up with some analyses for identifying promising deals. I will continue updating the dataset with new posts from the forum if there is sufficient interest, which I will gauge by the number of downloads and upvotes.
Three tables are supplied.
Each row in the main table corresponds to a post. Columns hold post information such as the title, the net vote score (up-votes minus down-votes), a link to the referenced deal, and more.
The comments table stores all comments made in response to the scraped posts. Titles in the 'title' column serve as foreign keys and link comments to the corresponding posts found in the main table.
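As a rough illustration of that link, the sketch below attaches comments to their posts by matching on 'title'. It uses a few made-up rows rather than the real files; the `votes` and `comment` field names and the helper functions are assumptions for the example, not the actual column names. Keep in mind that if two posts share the same title, a title-based match cannot tell them apart.

```python
from collections import defaultdict

# Hypothetical sample rows. Only the 'title' column is named in the
# description above; 'votes' and 'comment' are placeholder field names.
posts = [
    {"title": "Laptop 20% off", "votes": 12},
    {"title": "Free shipping on shoes", "votes": -3},
]
comments = [
    {"title": "Laptop 20% off", "comment": "Great price, thanks!"},
    {"title": "Laptop 20% off", "comment": "Sold out already."},
    {"title": "Unknown post", "comment": "Orphaned comment."},
]


def group_comments_by_title(comment_rows):
    """Collect comment texts under the title of the post they reply to."""
    grouped = defaultdict(list)
    for row in comment_rows:
        grouped[row["title"]].append(row["comment"])
    return grouped


def attach_comments(post_rows, comment_rows):
    """Return a copy of each post with a 'comments' list added (left join on title)."""
    grouped = group_comments_by_title(comment_rows)
    return [
        {**post, "comments": grouped.get(post["title"], [])}
        for post in post_rows
    ]


posts_with_comments = attach_comments(posts, comments)
for post in posts_with_comments:
    print(post["title"], len(post["comments"]))
```

Posts without comments keep an empty list, and comments whose title matches no post are simply dropped, which mirrors a left join from the main table.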
Lastly, a cleaned version of the main table is supplied for those who do not want to deal with data wrangling. The corresponding code can be found in the Kernel section.
Once the main table has been cleaned, the dataset should be fairly simple to analyze and may contain some interesting deals. Since links to the sales are included, you may come across offerings that interest you.
The comments table can be used for natural language processing and more robust sentiment analysis. You may want to consider applying PCA.
Happy sales hunting!
Some questions you may want to answer: