As a chocolate enthusiast 🥰, I would like to learn more about what makes a great chocolate bar. In this project, I will explore how the ratings of chocolates from different countries vary and how the ingredients used in their production may influence these ratings.
To begin with, I would like to learn about which companies produced highly-rated chocolate bars so that I may purchase their products in the future.
To answer this question,
I plan to compute the average rating of each chocolate company and identify the top 10 companies with the highest averages.
I have decided to use a lollipop chart,
which is a modified form of the basic bar chart.
While bar charts are effective for displaying information using line marks and position channels,
the top 10 companies in this case all have high ratings,
which could result in a cluttered figure with multiple bars of similar height.
Therefore, the lollipop chart provides a better visual effect.
Since there is only one value to be displayed, I will use the same color for all lollipops.
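The aggregation behind this chart can be sketched in a few lines of Python. This is only an illustration of the idea, not the code used for the project: the sample rows, the `top_companies` helper, and the company names are all made up for the example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample rows of (company, rating); the real dataset is not shown here.
rows = [
    ("Company A", 3.75), ("Company A", 4.00),
    ("Company B", 3.25), ("Company B", 3.50),
    ("Company C", 3.00),
]

def top_companies(rows, n=10):
    """Return the n companies with the highest mean rating, best first."""
    ratings_by_company = defaultdict(list)
    for company, rating in rows:
        ratings_by_company[company].append(rating)
    averages = {c: mean(r) for c, r in ratings_by_company.items()}
    # Sort by average rating (descending); break ties alphabetically.
    return sorted(averages.items(), key=lambda item: (-item[1], item[0]))[:n]

top = top_companies(rows, n=2)
for company, average in top:
    print(f"{company}: {average:.2f}")
```

Each (company, average) pair then becomes one lollipop: a thin line from the axis to the average, capped with a circle.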
From the plot, we can observe that Ocelot and Heirloom Cacao Preservation (Zokoko) have the highest average ratings.
However, both companies are new to me. I hope to try their products during the upcoming spring break.
Feel free to view detailed information (and 🍫 images) by hovering over each lollipop (each circle)!
Photo credit: All images of chocolates were obtained from the official websites of the chocolate brands, as well as from store information pages on Amazon, Yelp, Cocoa Runners, and Theobroma Cacao. Two images are from the free images website Pixabay.
To answer this question, I decided to use scatterplots with point marks, position channels (horizontal and vertical), and a color channel. Scatterplots are helpful for exploring the correlation between two variables. I also made the dots semi-transparent, so deeper colors indicate more overlapping data points.
I assumed that a higher cocoa percentage would lead to a higher rating, because a higher percentage usually indicates a better-quality chocolate bar 🍫. However, after examining the scatterplot, I found no obvious pattern between cocoa percentage and rating. A chocolate bar with a low cocoa percentage can have a high rating, and a chocolate bar with a high cocoa percentage may even have a lower rating! Most highly rated chocolate bars seem to have a cocoa percentage between 65% and 80%.
It appears that the complexity of making a chocolate bar has little impact on its rating. The chocolate ratings are almost evenly distributed among various ingredient counts, indicating that the number of ingredients has little correlation with the chocolate's rating.
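One way to back up a visual impression like "no obvious pattern" is to compute a Pearson correlation coefficient for each pair of variables. The sketch below is an assumption about how that check could look, not part of the original analysis; the `pearson` helper and the sample numbers are invented for illustration.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / (spread_x * spread_y)

# Hypothetical values, NOT taken from the real dataset.
cocoa_percent = [55, 65, 70, 70, 75, 80, 90, 100]
rating = [3.5, 3.0, 3.75, 2.75, 3.5, 3.0, 3.25, 2.5]

r = pearson(cocoa_percent, rating)
print(f"Pearson r = {r:.2f}")
```

A value near 0 would agree with the scatterplot showing no clear linear trend, while values near 1 or -1 would indicate a strong linear relationship. The same function could be applied to ingredient count versus rating.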
Feel free to click the button to switch between the two plots.
I am also curious about the most popular tastes among the chocolate bars recommended by the chocolate experts (rating >= 3). Maybe next time I can try these flavors too! 🥰 I would also like to count the total occurrences of each popular flavor and see how many times it appears as the first, second, third, and fourth taste of a chocolate bar.
To answer this question, I decided to use a stacked bar chart with line marks, position channels (horizontal and vertical), and a color channel. Stacked bar charts are much denser than pie charts, and here they are especially helpful for comparing the composition of different tastes. Since first taste, second taste, third taste, and fourth taste are categories, I use a few arbitrarily chosen chocolate colors (color hue) to represent each subcategory.
From the stacked bar chart below, we can observe that nutty is the most popular taste in a great chocolate bar. This makes sense: nuts are a great complement to dark chocolate! We can also see that the composition of the counts varies from taste to taste. Although creamy 🍦 is not the most common taste overall, it has the highest first-taste count. Cocoa has the highest third-taste and fourth-taste counts.
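The counts that feed a stacked bar chart like this one can be sketched as follows. This is a minimal illustration under assumed data: the `bars` rows, the `taste_counts` helper, and the assumption that each bar lists its tastes in order are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical rows of (rating, tastes in the order they are listed).
bars = [
    (3.5, ["creamy", "nutty", "cocoa"]),
    (3.0, ["nutty", "fruity"]),
    (2.5, ["sour", "nutty"]),  # below the rating threshold, so it is skipped
    (4.0, ["creamy", "cocoa", "nutty", "sweet"]),
]

def taste_counts(bars, min_rating=3.0, max_positions=4):
    """Count each taste overall and by position (1 = first taste, and so on)."""
    total = Counter()
    by_position = defaultdict(Counter)  # taste -> Counter({position: count})
    for rating, tastes in bars:
        if rating < min_rating:
            continue
        for position, taste in enumerate(tastes[:max_positions], start=1):
            total[taste] += 1
            by_position[taste][position] += 1
    return total, by_position

total, by_position = taste_counts(bars)
for taste, count in total.most_common():
    print(taste, count, dict(by_position[taste]))
```

Each taste becomes one bar whose height is its total count, and the per-position counts become the stacked segments inside that bar.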
In question 1, we learned which companies produce some of the best chocolate bars. In this question, I would like to further explore the rating distributions of highly rated chocolate companies. To do this, I find the three companies with the highest average ratings among companies that have more than 20 chocolate bars. I decided to use box plots with line and point marks and position and color channels, since box plots help us compare the spread of the data. I use a sequential color scale to represent each rating data point. Brighter colors indicate higher ratings.
We can observe that companies with a high average rating generally tend to have a high overall performance. Most of their ratings fall close to the center. However, their distributions may vary.
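The selection step and the numbers a box plot summarizes can be sketched like this. It is a rough illustration under stated assumptions: the helper names and sample rows are hypothetical, and the example uses a tiny `min_bars` threshold so the toy data passes the filter (the analysis above uses 20).

```python
from collections import defaultdict
from statistics import mean, quantiles

def top_by_average(rows, min_bars=20, n=3):
    """Pick the n best companies by mean rating among those with more than min_bars bars."""
    groups = defaultdict(list)
    for company, rating in rows:
        groups[company].append(rating)
    eligible = {c: r for c, r in groups.items() if len(r) > min_bars}
    ranked = sorted(eligible, key=lambda c: (-mean(eligible[c]), c))
    return {c: eligible[c] for c in ranked[:n]}

def five_number_summary(ratings):
    """Minimum, quartiles, and maximum: the values a box plot draws."""
    q1, median, q3 = quantiles(ratings, n=4)
    return {"min": min(ratings), "q1": q1, "median": median,
            "q3": q3, "max": max(ratings)}

# Hypothetical rows of (company, rating), NOT taken from the real dataset.
rows = (
    [("A", r) for r in (3.0, 3.5, 3.5, 4.0)]
    + [("B", r) for r in (2.5, 3.0, 3.0)]
    + [("C", 4.0)]  # only one bar, so it is filtered out
)

selected = top_by_average(rows, min_bars=2, n=2)
summaries = {c: five_number_summary(r) for c, r in selected.items()}
for company, summary in summaries.items():
    print(company, summary)
```

Note that plotting libraries may compute quartiles with a slightly different interpolation rule than `statistics.quantiles`, so the box edges in a chart can differ a little from these values.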
After learning about chocolate flavors, I also want to learn more about the ingredients of great chocolate bars. Specifically, do certain regions tend to produce better cocoa beans?
To answer this question, we can find the most common cocoa bean origin countries among the chocolate bars recommended by the chocolate experts (rating >= 3.5 out of 5). I decided to use a map with point marks and position and size channels. Each point represents a country, with the circle size encoding the number of highly rated chocolate bars from that country. Since there is only one value to display, I use the same color for all circles.
We can observe from the map that Venezuela is the bean origin country with the highest number of tasty chocolate bars. Many countries in Middle America seem to have plentiful high-quality cocoa beans that produce great chocolate!
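The per-country counts and the size encoding can be sketched as follows. The sample rows and both helpers are hypothetical. Scaling the radius by the square root of the count is a common convention (so that circle area, not radius, is proportional to the count) and is an assumption here, not necessarily what the map above does.

```python
from collections import Counter
from math import sqrt

# Hypothetical rows of (bean origin, rating); the ratings are invented.
bars = [
    ("Venezuela", 3.5), ("Venezuela", 3.75), ("Venezuela", 4.0),
    ("Venezuela", 3.5), ("Venezuela", 3.0),
    ("Peru", 3.5), ("Peru", 3.25),
]

def high_rated_counts(bars, min_rating=3.5):
    """Count bars per origin country whose rating meets the threshold."""
    return Counter(origin for origin, rating in bars if rating >= min_rating)

def circle_radius(count, max_count, max_radius=20.0):
    """Scale radius with sqrt(count) so circle AREA is proportional to count."""
    return max_radius * sqrt(count / max_count)

counts = high_rated_counts(bars)
max_count = max(counts.values())
radii = {origin: circle_radius(c, max_count) for origin, c in counts.items()}
for origin, radius in radii.items():
    print(origin, counts[origin], round(radius, 1))
```

With area-proportional scaling, a country with four times as many highly rated bars gets a circle with twice the radius, which keeps large counts from visually overwhelming the map.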