Australian Wine Importers (AWI) asked you to develop a data
mining method of classifying imported wines based on:
• Price category (as 5 equal-size bins)
AWI provided you with a sample of 130,000 wine tasting
results, which include:
• Taster name and Twitter handle;
• Wine "title" (name + vintage);
• Country, Province and Region;
• Variety and Winery;
• Description and Designation (text data);
• Price (US$) and Points (taster's rating).
As there are a great many tasting results, AWI would like to gain
preliminary insight into the wines' origin and their
marketability.
The following questions are of interest to AWI:
A) What is the best source of wine in terms of the optimum price-to-rating ratio?
B) What is the expected price range category of newly imported wines?
C) Can all wine tasters in the data set be trusted with their tasting results?
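One possible way to approach question A is to compute a points-per-dollar ratio for each tasting result and summarise it by country. The sketch below is only an illustration under stated assumptions: the column names `country`, `price` and `points`, the helper `rating_per_dollar_by_country`, the choice of the median as the summary, and the toy rows are all hypothetical and not taken from the AWI data.

```python
import pandas as pd


def rating_per_dollar_by_country(df, min_reviews=1):
    """Rank countries by median points-per-dollar (one reading of question A).

    Assumes hypothetical columns 'country', 'price' and 'points'.
    """
    # Keep only rows where price and points are present and price is positive.
    valid = df.dropna(subset=["price", "points"])
    valid = valid[valid["price"] > 0]
    valid = valid.assign(points_per_dollar=valid["points"] / valid["price"])

    # Median ratio and number of usable reviews per country.
    summary = valid.groupby("country")["points_per_dollar"].agg(
        median_ratio="median", n_reviews="count"
    )

    # Optionally ignore countries with too few reviews to be meaningful.
    summary = summary[summary["n_reviews"] >= min_reviews]
    return summary.sort_values("median_ratio", ascending=False)


# Made-up toy rows, for illustration only (not real tasting results).
toy = pd.DataFrame(
    {
        "country": ["Australia", "Australia", "France", "France", "Chile"],
        "price": [20.0, 40.0, 60.0, None, 10.0],
        "points": [88, 92, 90, 95, 85],
    }
)
ranking = rating_per_dollar_by_country(toy)
print(ranking)
```

Raising `min_reviews` guards against a country ranking highly on the strength of one or two cheap wines; whether "source" should mean country, province, region or winery is left open by the brief.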
AWI wants you to clean up and explore the wine tasting data, and to develop and evaluate a
classifier that determines the price range for new wines while minimizing classification
errors.
In technical terms:
Your project objectives form a learning portfolio. The first objective (LP1) is to acquire and explore the available data, visualise and report any significant characteristics of the non-text data, and prepare the data for further processing.
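As part of preparing the data, the price would need to be turned into the five-category target. The sketch below is a minimal illustration, assuming a hypothetical `price` column and reading "equal size" as bins holding roughly equal numbers of wines (equal frequency). If equal-width price intervals are intended instead, `pd.cut` would be the analogous call. The helper name `add_price_category` and the toy prices are made up.

```python
import pandas as pd


def add_price_category(df, n_bins=5):
    """Drop rows without a price and add an ordinal 'price_category' column.

    Category 0 is the cheapest bin. Assumes a hypothetical 'price' column.
    """
    # Rows with no price cannot be assigned a target category.
    cleaned = df.dropna(subset=["price"]).copy()

    # qcut builds equal-frequency bins; duplicates="drop" merges bins whose
    # edges coincide, which can happen when many wines share one price.
    cleaned["price_category"] = pd.qcut(
        cleaned["price"], q=n_bins, labels=False, duplicates="drop"
    )
    return cleaned


# Made-up toy prices, for illustration only (not real tasting results).
toy_prices = pd.DataFrame(
    {"price": [5, 10, 15, 20, 25, 30, 40, 60, 100, None, 300]}
)
binned = add_price_category(toy_prices)
print(binned["price_category"].value_counts().sort_index())
```

Wine prices are typically skewed towards the low end, so equal-frequency bins tend to give a more balanced classification target than equal-width bins; the choice should be stated explicitly in the report.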