Vaibhav Mali's Documents

Sales and workload in retail industry

June 29, 2021
Uploaded through Retail - Category: Data Files » Structured Data » Labeled Data - Tags: #retail #retail_sales #retail_work

Context

Raw data for real analytical use cases in many industries and companies is frequently provided as Excel files. These files usually cannot be fed directly into machine learning models; they must first be cleaned and preprocessed, and many different pitfalls can arise along the way. As a result, data preprocessing takes up a substantial part of a data scientist's daily work.

This dataset is an Excel spreadsheet that closely follows a real case but contains only simulated figures, in order to protect data and business results. The form and structure of the file correspond to what a data scientist could encounter in a company. Such a file can result from a download from a financial controlling system, e.g. SAP.

Content

The data includes the number of goods sold (product units), the associated turnover, and the hours worked. This information is grouped by month, store, and department of the retailer. In addition, the sales area of each department and the opening hours of the store are provided.

Possible objectives

The following goals of data cleansing might be addressed:

  • Import the Excel file
  • Inspect the dataset
  • Check data types and make meaningful modifications
  • Handle missing values and data gaps
  • Find and resolve data inconsistencies
  • Rename columns for easier use
  • Join the tables into a single one
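
The cleaning steps above can be sketched in pandas roughly as follows. This is a minimal illustration, not the dataset's actual layout: the two small tables stand in for sheets of the workbook, and every column name, value, and the "#NV" error marker are assumptions made up for the example. Filling gaps with a per-department median is likewise just one possible choice.

```python
import pandas as pd

# Toy stand-ins for two sheets of the workbook. With the real file, the sheets
# would come from pd.read_excel(path, sheet_name=None); every column name and
# value below is invented for illustration.
raw = pd.DataFrame({
    "Month": ["10.2016", "10.2016", "11.2016", "11.2016"],
    "Store ID": [1, 1, 1, 1],
    "Dept. ID": [1, 2, 1, 2],
    "Sales units": ["1200", "#NV", "1350", "800"],  # text column with an error marker
    "Hours own": [340.0, 120.0, None, 110.0],       # one data gap
})
areas = pd.DataFrame({
    "Store ID": [1, 1],
    "Dept. ID": [1, 2],
    "Area (m2)": [250.0, 90.0],
})

# Hypothetical mapping from spreadsheet headers to code-friendly names.
RENAMES = {
    "Month": "month",
    "Store ID": "store_id",
    "Dept. ID": "dept_id",
    "Sales units": "sales_units",
    "Hours own": "hours_own",
    "Area (m2)": "area_m2",
}


def clean(raw: pd.DataFrame, areas: pd.DataFrame) -> pd.DataFrame:
    """Rename columns, fix data types, fill gaps, and join the two tables."""
    df = raw.rename(columns=RENAMES)
    area = areas.rename(columns=RENAMES)

    # Fix data types: month strings become timestamps, text becomes numbers.
    df["month"] = pd.to_datetime(df["month"], format="%m.%Y")
    # Unparseable entries such as "#NV" become NaN instead of raising.
    df["sales_units"] = pd.to_numeric(df["sales_units"], errors="coerce")

    # Handle gaps: fill missing hours with the median of the same department.
    dept_median = df.groupby("dept_id")["hours_own"].transform("median")
    df["hours_own"] = df["hours_own"].fillna(dept_median)

    # Join into a single table; validate guards against duplicate keys in `area`.
    return df.merge(area, on=["store_id", "dept_id"], how="left",
                    validate="many_to_one")


cleaned = clean(raw, areas)
print(cleaned.dtypes)        # inspect the resulting data types
print(cleaned.isna().sum())  # remaining missing values per column
```

In this sketch the unparseable sales figure is deliberately left as a missing value so that it stays visible during inspection rather than being silently imputed.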

Furthermore, the data can be explored for correlations between different features and/or used to fit a regression model.
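
As a rough illustration of such an analysis, the sketch below computes a Pearson correlation and fits a straight line with NumPy. The column names `hours_worked` and `sales_units` and all four data points are invented for the example; with the real file, the cleaned and joined table would take their place.

```python
import numpy as np
import pandas as pd

# Invented toy values; they are not taken from the dataset.
df = pd.DataFrame({
    "hours_worked": [100.0, 200.0, 300.0, 400.0],
    "sales_units": [1000.0, 2100.0, 2900.0, 4000.0],
})

# Pairwise Pearson correlation between all numeric columns.
corr = df.corr(method="pearson")
r = corr.loc["hours_worked", "sales_units"]

# Simple linear regression: sales_units ~ slope * hours_worked + intercept.
slope, intercept = np.polyfit(df["hours_worked"], df["sales_units"], deg=1)

print(f"correlation: {r:.3f}")
print(f"fit: sales_units = {slope:.2f} * hours_worked + {intercept:.2f}")
```

A least-squares line is only a starting point; with several features, a multiple regression from a dedicated statistics or machine learning library would be the more natural next step.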

  • License Type: Open Data Commons
  • Data Original Source Attribution: https://www.kaggle.com/dgluesen/sales-and-workload-data-from-retail-industry