Google Data Analytics Capstone Project - BellaBeat

Introduction

This is the case study that served as my capstone project for Google’s Data Analytics Course. I aimed to use as many of the skills I learned in that course as possible while completing this project, including spreadsheets, SQL, and RStudio. I chose this case study in particular for its focus on exercise and physical fitness, topics in which I have a deep interest. Beyond the Data Analytics Certificate, I hope that this project will help me learn how to make better use of my own Fitbit data.

You are a junior data analyst working on the marketing analyst team at Bellabeat, a high-tech manufacturer of health-focused products for women. Bellabeat is a successful small company, but they have the potential to become a larger player in the global smart device market. Urška Sršen, cofounder and Chief Creative Officer of Bellabeat, believes that analyzing smart device fitness data could help unlock new growth opportunities for the company. You have been asked to focus on one of Bellabeat’s products and analyze smart device data to gain insight into how consumers are using their smart devices. The insights you discover will then help guide marketing strategy for the company. You will present your analysis to the Bellabeat executive team along with your high-level recommendations for Bellabeat’s marketing strategy.

Urška Sršen and Sando Mur founded Bellabeat, a high-tech company that manufactures health-focused smart products. Sršen used her background as an artist to develop beautifully designed technology that informs and inspires women around the world. Collecting data on activity, sleep, stress, and reproductive health has allowed Bellabeat to empower women with knowledge about their own health and habits. Since it was founded in 2013, Bellabeat has grown rapidly and quickly positioned itself as a tech-driven wellness company for women.

By 2016, Bellabeat had opened offices around the world and launched multiple products. Bellabeat products became available through a growing number of online retailers in addition to their own e-commerce channel on their website. The company has invested in traditional advertising media, such as radio, out-of-home billboards, print, and television, but focuses on digital marketing extensively. Bellabeat invests year-round in Google Search, maintaining active Facebook and Instagram pages, and consistently engages consumers on Twitter. Additionally, Bellabeat runs video ads on YouTube and display ads on the Google Display Network to support campaigns around key marketing dates.

Sršen knows that an analysis of Bellabeat’s available consumer data would reveal more opportunities for growth. She has asked the marketing analytics team to focus on a Bellabeat product and analyze smart device usage data in order to gain insight into how people are already using their smart devices. Then, using this information, she would like high-level recommendations for how these trends can inform Bellabeat marketing strategy.

Sršen asks you to analyze smart device usage data in order to gain insight into how consumers use non-Bellabeat smart devices. She then wants you to select one Bellabeat product to apply these insights to in your presentation. These questions will guide your analysis:

  • What are some trends in smart device usage?
  • How could these trends apply to Bellabeat customers?
  • How could these trends help influence Bellabeat marketing strategy?

You will produce a report with the following deliverables:

  • A clear summary of the business task
  • A description of all data sources used
  • Documentation of any cleaning or manipulation of data
  • A summary of your analysis
  • Supporting visualizations and key findings
  • Your top high-level content recommendations based on your analysis

Business Task

We have been tasked with discovering trends in smart device usage, then applying those findings to help customers and inform Bellabeat’s marketing strategy.

The data were collected from thirty-four users who gave informed consent to have their data analyzed. The data have been anonymized. The project wants to evaluate the data on the following criteria:

  • Reliable - This data is not very reliable. In addition to the problems listed below, there is little useful information about the individuals, and it is not clear why these individuals were chosen.
  • Original - This data originates from a third party.
  • Comprehensive - Only 34 individuals are involved and the data is full of gaps. Some datasets are missing entire days of data.
  • Current - This data was gathered in 2016. Granted, health data doesn’t have an exact expiration date and can still be useful years afterward, but one has to wonder why more recent data couldn’t be found, especially since this is supposed to inform the business decisions of a company about to enter a new market.
  • Cited - It’s not clear how the Kaggle user who uploaded the data got them in the first place.

Overall, the quality of the data is quite poor. I searched for similar datasets that might make up for these deficiencies, but none were forthcoming even after extensive searching.

Here is a list of the individual datasets along with the columns from each one:

  • dailyActivity_merged - Id, Activity Date, Total Steps, Total Distance, Tracker Distance, Logged Activities Distance, Very Active Distance, Moderately Active Distance, Lightly Active Distance, Sedentary Active Distance, Very Active Minutes, Fairly Active Minutes, Lightly Active Minutes, Sedentary Minutes, Calories
  • dailyCalories_merged - Id, ActivityDay, Calories
  • dailyIntensities_merged - Id, Activity Day, Sedentary Minutes, Lightly Active Minutes, Fairly Active Minutes, Very Active Minutes, Sedentary Active Distance, Light Active Distance, Moderately Active Distance, Very Active Distance
  • dailySteps_merged - Id, Activity Day, Step Total
  • heartrate_seconds_merged - Id, Time, Value
  • hourlyCalories_merged - Id, Activity Hour, Calories
  • hourlyIntensities_merged - Id, Activity Hour, Total Intensity, Average Intensity
  • hourlySteps_merged - Id, Activity Hour, Step Total
  • minuteCaloriesNarrow_merged - Id, Activity Minute, Calories
  • minuteCaloriesWide_merged - Id, Activity Hour, Calories per Minute (60)
  • minuteIntensitiesNarrow_merged - Id, Activity Minute, Intensity
  • minuteIntensitiesWide_merged - Id, Activity Hour, Intensity per Minute (60)
  • minuteMETSNarrow_merged - Id, Activity Minute, METs
  • minuteSleep_merged - Id, Date, Value, Log Id
  • minuteStepsNarrow_merged - Id, Activity Minute, Steps
  • minuteStepsWide_merged - Id, Activity Hour, Steps per Minute (60)
  • sleepDay_merged - Id, Sleep Day, Total Sleep Records, Total Minutes Asleep, Total Time In Bed
  • weightLogInfo_merged - Id, Date, Weight Kg, Weight Pounds, Fat, BMI, Is Manual Report, Log Id

Most of the data is organized by the increments of time in which it was gathered (hourly, daily, etc.), so I’ll evaluate and process the data on these terms as well.

Processing

Daily data

The daily data was cleaned and partially processed with Google Sheets. This data comes from the following sets:

  • dailyActivity_merged
  • dailyCalories_merged
  • dailyIntensities_merged
  • dailySteps_merged
  • sleepDay_merged

The dailyActivity_merged file is already exhaustive, containing much of the data in the other daily datasets. As such, the following datasets were removed from the analysis for being redundant: dailyCalories_merged, dailyIntensities_merged, and dailySteps_merged. From this point, it was a simple matter of using Google Sheets to root out duplicate rows and null values, none of which were found.

The only daily data that wasn’t already incorporated in the dailyActivity_merged file was the sleepDay_merged dataset. As an avid fitness enthusiast myself, I know that quality sleep can be just as important as exercise and diet. It seemed obvious to do whatever I could to combine these two datasets in hopes of gaining new insights.

I removed three duplicate rows in the sleepDay_merged dataset. With the COUNTUNIQUE function, I also noticed that there were only 24 unique users in the dataset, as opposed to the 34 in the dailyActivity_merged dataset. I also noticed that users didn’t track their sleep every night. Furthermore, I changed the title of the “value” column to “sleepValue” to clarify its origin.
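The checks described here were done in Google Sheets. As a rough equivalent, the following base R sketch shows the same two steps (dropping duplicate rows and counting unique users) on a small made-up table. The data frame and its values are illustrative assumptions, not the real sleepDay_merged contents.

```r
# Toy stand-in for sleepDay_merged; the values are invented for illustration.
sleep_day <- data.frame(
  Id = c(1, 1, 1, 2, 2),
  SleepDay = c("2016-04-12", "2016-04-12", "2016-04-13",
               "2016-04-12", "2016-04-14"),
  TotalMinutesAsleep = c(420, 420, 390, 455, 480)
)

# Count, then drop, rows that are exact copies of an earlier row.
n_duplicates <- sum(duplicated(sleep_day))
sleep_clean <- sleep_day[!duplicated(sleep_day), ]

# Rough equivalent of COUNTUNIQUE on the Id column.
n_users <- length(unique(sleep_clean$Id))

# Nights logged per user, to see how patchy the tracking is.
nights_per_user <- table(sleep_clean$Id)
```

Comparing `n_users` between the sleep table and the activity table is one quick way to spot the kind of gap in user coverage noted above.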

All told, cleaning the data through Sheets was incredibly simple and I’ll continue using it for some of my analysis.

Finally, I also added the final draft of both spreadsheets to the SQL database to compare them against the rest of the data. Before doing this, I made sure to change the date format so that it would match SQL’s DATE datatype. Using the following SQL query, I was able to merge both datasets together:

The resulting data was then exported as dailyMerged.csv.
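The query itself is not reproduced on this page. As a hedged illustration of this kind of join, here is a minimal base R sketch that matches activity rows to sleep rows on user and date. It assumes an inner join (only days with both activity and sleep records are kept). The toy values and the output location are made up, and the column names follow the dataset listing above.

```r
# Toy stand-ins for the cleaned daily activity and sleep tables.
daily_activity <- data.frame(
  Id = c(1, 1, 2, 2),
  ActivityDate = as.Date(c("2016-04-12", "2016-04-13",
                           "2016-04-12", "2016-04-13")),
  TotalSteps = c(9000, 7500, 4000, 12000),
  Calories = c(2100, 1950, 1700, 2600)
)
sleep_day <- data.frame(
  Id = c(1, 2, 2),
  SleepDay = as.Date(c("2016-04-12", "2016-04-12", "2016-04-13")),
  TotalMinutesAsleep = c(420, 380, 450)
)

# Inner join on user and date (assumption: unmatched days are dropped).
daily_merged <- merge(
  daily_activity, sleep_day,
  by.x = c("Id", "ActivityDate"),
  by.y = c("Id", "SleepDay")
)

# Write the result out; a temporary directory is used here for illustration.
write.csv(daily_merged, file.path(tempdir(), "dailyMerged.csv"),
          row.names = FALSE)
```

Passing `all.x = TRUE` to `merge()` would instead keep every activity row and leave the sleep columns as NA where no sleep was logged.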

Hourly Data

The hourly data consists of the following:

  • hourlyCalories_merged
  • hourlyIntensities_merged
  • hourlySteps_merged

This data is too unwieldy to work with in spreadsheets, so it will be processed using tools like RStudio and SQL. However, Google Sheets was sufficiently capable of carrying out some of the necessary cleaning. Before exporting the datasets from Google Sheets, I checked them for duplicate rows and reformatted the dates so BigQuery would accept them as the DATETIME data type.

I renamed the datasets when I uploaded them to BigQuery to remove the extraneous “_merged” modifier. For example, the minuteSleep_merged dataset became “minuteSleep” and hourlySteps_merged became “hourlySteps”.

Using SQL, I joined the hourly data together with the following query:

The resulting data was then exported as hourlyMerged.csv.
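The hourly query is likewise not shown here. A minimal base R sketch of a three-way join on user and hour might look like the following. The toy values are invented, the timestamps are kept as text for simplicity, and the inner-join behaviour is an assumption.

```r
# Toy stand-ins for hourlyCalories, hourlyIntensities and hourlySteps.
hours <- c("2016-04-12 00:00:00", "2016-04-12 01:00:00", "2016-04-12 00:00:00")
hourly_calories <- data.frame(Id = c(1, 1, 2), ActivityHour = hours,
                              Calories = c(81, 61, 55))
hourly_intensities <- data.frame(Id = c(1, 1, 2), ActivityHour = hours,
                                 TotalIntensity = c(20, 8, 0),
                                 AverageIntensity = c(0.33, 0.13, 0))
hourly_steps <- data.frame(Id = c(1, 1, 2), ActivityHour = hours,
                           StepTotal = c(373, 160, 0))

# Fold the three tables together, joining on Id and ActivityHour each time.
hourly_merged <- Reduce(
  function(x, y) merge(x, y, by = c("Id", "ActivityHour")),
  list(hourly_calories, hourly_intensities, hourly_steps)
)
```

`Reduce()` keeps the join logic in one place, so adding a fourth hourly table would only mean adding it to the list.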

Second and Minute Data

The second and minute data consist of the following:

  • minuteCaloriesNarrow_merged
  • minuteCaloriesWide_merged
  • minuteIntensitiesNarrow_merged
  • minuteIntensitiesWide_merged
  • minuteMETSNarrow_merged
  • minuteSleep_merged
  • minuteStepsNarrow_merged
  • minuteStepsWide_merged
  • heartrate_seconds_merged

These datasets are too unwieldy even for SQL or R. As nice as it is in theory to have everything in fine-grained detail, such data need to justify the trouble required to clean, process, and analyze them, which doesn’t appear to be the case with the second and minute data. If there’s no way to aggregate and average this data into more manageable units of time, I’ll have to set it aside for the moment, or at least until I have access to more computing power.
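For what it is worth, rolling minute-level rows up into hourly totals is a short operation in base R. The sketch below is only an illustration on invented data, with column names taken from the minuteStepsNarrow_merged listing above. Whether it would run comfortably on the full files is untested here.

```r
# Toy stand-in for minuteStepsNarrow_merged.
minute_steps <- data.frame(
  Id = c(1, 1, 1, 1, 2, 2),
  ActivityMinute = as.POSIXct(
    c("2016-04-12 00:00:00", "2016-04-12 00:01:00", "2016-04-12 00:59:00",
      "2016-04-12 01:00:00", "2016-04-12 00:00:00", "2016-04-12 00:30:00"),
    format = "%Y-%m-%d %H:%M:%S", tz = "UTC"
  ),
  Steps = c(10, 0, 5, 20, 7, 3)
)

# Truncate each timestamp to the start of its hour.
minute_steps$ActivityHour <- format(minute_steps$ActivityMinute,
                                    "%Y-%m-%d %H:00:00")

# Sum steps per user per hour; use FUN = mean for measures like intensity.
hourly_steps <- aggregate(Steps ~ Id + ActivityHour,
                          data = minute_steps, FUN = sum)
```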

Weight Log

The scant amount of data in weightLogInfo_merged makes it difficult to comfortably incorporate into the study. Only eight of the already paltry thirty-four participants logged their weight, and of those, only two did so more than five times. This is especially disappointing considering that studies of large populations are one of the few areas where the controversial BMI metric undeniably shines.

Analyze and Visualize

Daily data analysis

A few quick charts in Google Sheets suggest that the sleep tracking data doesn’t correlate strongly with any of the other data, whether that’s calories, steps, or active or sedentary minutes at any intensity. This is disappointing, though unsurprising considering how unreliable and sparse the sleep data is. Perhaps more insight can be gleaned by using R.

Other charts measuring more banal observations indicate that at least the commonly logged data is internally consistent. For example, the number of steps taken each day correlates strongly with calories burned, as does the total distance traveled.

As I’d hoped, I was able to get more insight from bringing the data into RStudio. Although there isn’t a strong correlation between activity and sleep (-0.1815268), it does appear there is a moderate negative correlation between sleep and non-active minutes (-0.5869577). This suggests that sleep has a stronger effect on whether or not an individual will be active the next day:
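For reference, coefficients like these can be computed with base R's `cor()`. The sketch below uses invented numbers, so its results will not match the figures quoted above. The column names follow the summaries shown later in this section, and the use of Pearson correlation is an assumption.

```r
# Toy stand-in for the merged daily table.
daily <- data.frame(
  TotalMinutesAsleep = c(420, 380, 450, 300, 500, 410),
  ActiveMinutes = c(40, 55, 30, 60, 20, 45),
  non_ActiveMinutes = c(900, 960, 880, 1020, 840, 910)
)

# Pearson correlation, skipping any rows with missing values.
sleep_vs_active <- cor(daily$TotalMinutesAsleep, daily$ActiveMinutes,
                       use = "complete.obs")
sleep_vs_inactive <- cor(daily$TotalMinutesAsleep, daily$non_ActiveMinutes,
                         use = "complete.obs")
```

A coefficient only describes how two columns move together on the same rows, so it cannot say on its own which one influences the other.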

With all this daily data, it seemed prudent to aggregate the data by day of the week and see what trends I could find. With the following script, I was able to collate data based on day of the week (this codeblock is very long, so I put most of it behind an expandable section):

The code for the other days of the week follows.

    # Wednesday

    day_Wednesday <- dailyMerged1 %>%
      filter(dailyMerged1$DayOfWeek == "Wednesday") %>%
      select(-c(Id, ActivityDate))
    summary(day_Wednesday)

                              Min.      1st Qu.   Median    Mean      3rd Qu.   Max.
    Calories                  1377      1789      2207      2378      2942      4079
    TotalSleepRecords         1.000     1.000     1.000     1.152     1.000     3.000
    TotalMinutesAsleep        152.0     392.0     444.5     434.7     477.0     658.0
    TotalTimeInBed            260       425       469       470       525       679
    TotalSteps                356       5318      8686      8023      10516     15108
    TotalDistance             0.250     3.748     6.175     5.720     7.418     12.190
    TrackerDistance           0.250     3.748     6.175     5.720     7.418     12.190
    LoggedActivitiesDistance  0.00000   0.00000   0.00000   0.09091   0.00000   2.00000
    ActiveDistance            0.000     0.000     1.805     2.062     2.910     9.810
    non_ActiveDistance        0.250     2.417     3.590     3.652     5.062     7.110
    ActiveMinutes             0.00      0.00      33.50     38.08     58.50     130.00
    non_ActiveMinutes         320.0     878.5     924.5     922.4     977.8     1138.0
    DayOfWeek                 Length: 66, Class: character, Mode: character

    day_Wed_list <-
      list(
        list("Total_Steps_Ave" = ~ mean(day_Wednesday$TotalSteps, na.rm = TRUE)),
        list("Active_Minutes_Ave" = ~ mean(day_Wednesday$ActiveMinutes, na.rm = TRUE)),
        list("Sedentary_Minutes_Ave" = ~ mean(day_Wednesday$non_ActiveMinutes, na.rm = TRUE)),
        list("Calories_Ave" = ~ mean(day_Wednesday$Calories, na.rm = TRUE)),
        list("Total_Hours_Asleep_Ave" = ~ mean(day_Wednesday$TotalMinutesAsleep / 60, na.rm = TRUE))
      )
    day_Wed_summary <- summary_table(day_Wednesday, day_Wed_list)
    print.default(day_Mon_summary)

                            day_Monday (N = 46)
    Total_Steps_Ave         9273.217391
    Active_Minutes_Ave      49.804348
    Sedentary_Minutes_Ave   940.782609
    Calories_Ave            2431.978261
    Total_Hours_Asleep_Ave  6.991667
    attr(,"rgroups")
    [1] 1 1 1 1 1
    attr(,"n")
    [1] 46
    attr(,"class")
    [1] "qwraps2_summary_table" "matrix" "array"

    # Thursday

    day_Thursday <- dailyMerged1 %>%
      filter(dailyMerged1$DayOfWeek == "Thursday") %>%
      select(-c(Id, ActivityDate))
    summary(day_Thursday)

                              Min.      1st Qu.   Median    Mean      3rd Qu.   Max.
    Calories                  257       1788      2168      2307      2868      4900
    TotalSleepRecords         1.000     1.000     1.000     1.031     1.000     2.000
    TotalMinutesAsleep        59.0      377.2     423.5     401.3     467.2     545.0
    TotalTimeInBed            65.0      416.0     457.0     434.9     492.8     568.0
    TotalSteps                17        4363      8752      8184      10971     19542
    TotalDistance             0.010     2.925     6.355     5.773     7.735     15.010
    TrackerDistance           0.010     2.925     6.355     5.745     7.735     15.010
    LoggedActivitiesDistance  0.0000    0.0000    0.0000    0.1562    0.0000    4.0000
    ActiveDistance            0.000     0.000     1.360     1.912     3.072     7.720
    non_ActiveDistance        0.010     2.652     3.610     3.699     4.827     7.700
    ActiveMinutes             0.00      0.00      23.00     38.72     66.25     184.00
    non_ActiveMinutes         2.0       873.0     951.5     901.3     993.2     1299.0
    DayOfWeek                 Length: 64, Class: character, Mode: character

    day_Thur_list <-
      list(
        list("Total_Steps_Ave" = ~ mean(day_Thursday$TotalSteps, na.rm = TRUE)),
        list("Active_Minutes_Ave" = ~ mean(day_Thursday$ActiveMinutes, na.rm = TRUE)),
        list("Sedentary_Minutes_Ave" = ~ mean(day_Thursday$non_ActiveMinutes, na.rm = TRUE)),
        list("Calories_Ave" = ~ mean(day_Thursday$Calories, na.rm = TRUE)),
        list("Total_Hours_Asleep_Ave" = ~ mean(day_Thursday$TotalMinutesAsleep / 60, na.rm = TRUE))
      )
    day_Thur_summary <- summary_table(day_Thursday, day_Thur_list)
    print.default(day_Thur_summary)

                            day_Thursday (N = 64)
    Total_Steps_Ave         8183.515625
    Active_Minutes_Ave      38.718750
    Sedentary_Minutes_Ave   901.312500
    Calories_Ave            2306.671875
    Total_Hours_Asleep_Ave  6.688281
    attr(,"rgroups")
    [1] 1 1 1 1 1
    attr(,"n")
    [1] 64
    attr(,"class")
    [1] "qwraps2_summary_table" "matrix" "array"

    # Friday

    day_Friday <- dailyMerged1 %>%
      filter(dailyMerged1$DayOfWeek == "Friday") %>%
      select(-c(Id, ActivityDate))
    summary(day_Friday)

                              Min.      1st Qu.   Median    Mean      3rd Qu.   Max.
    Calories                  403       1850      2196      2330      2846      4044
    TotalSleepRecords         1.00      1.00      1.00      1.07      1.00      2.00
    TotalMinutesAsleep        82.0      355.0     405.0     405.4     465.0     658.0
    TotalTimeInBed            85.0      386.0     448.0     445.1     510.0     961.0
    TotalSteps                42        5563      8198      7901      10465     16556
    TotalDistance             0.030     3.680     5.630     5.512     7.110     11.470
    TrackerDistance           0.030     3.680     5.630     5.512     7.110     11.470
    LoggedActivitiesDistance  0.00000   0.00000   0.00000   0.07018   0.00000   2.00000
    ActiveDistance            0.000     0.000     0.880     1.722     3.150     6.140
    non_ActiveDistance        0.03      2.67      3.77      3.78      4.91      7.24
    ActiveMinutes             0.00      0.00      21.00     35.74     61.00     169.00
    non_ActiveMinutes         6.0       899.0     987.0     965.8     1032.0    1332.0
    DayOfWeek                 Length: 57, Class: character, Mode: character

    day_Fri_list <-
      list(
        list("Total_Steps_Ave" = ~ mean(day_Friday$TotalSteps, na.rm = TRUE)),
        list("Active_Minutes_Ave" = ~ mean(day_Friday$ActiveMinutes, na.rm = TRUE)),
        list("Sedentary_Minutes_Ave" = ~ mean(day_Friday$non_ActiveMinutes, na.rm = TRUE)),
        list("Calories_Ave" = ~ mean(day_Friday$Calories, na.rm = TRUE)),
        list("Total_Hours_Asleep_Ave" = ~ mean(day_Friday$TotalMinutesAsleep / 60, na.rm = TRUE))
      )
    day_Fri_summary <- summary_table(day_Friday, day_Fri_list)
    print.default(day_Fri_summary)

                            day_Friday (N = 57)
    Total_Steps_Ave         7901.403509
    Active_Minutes_Ave      35.736842
    Sedentary_Minutes_Ave   965.771930
    Calories_Ave            2329.649123
    Total_Hours_Asleep_Ave  6.757018
    attr(,"rgroups")
    [1] 1 1 1 1 1
    attr(,"n")
    [1] 57
    attr(,"class")
    [1] "qwraps2_summary_table" "matrix" "array"

    # Saturday

    day_Saturday <- dailyMerged1 %>%
      filter(dailyMerged1$DayOfWeek == "Saturday") %>%
      select(-c(Id, ActivityDate))
    summary(day_Saturday)

                              Min.      1st Qu.   Median    Mean      3rd Qu.   Max.
    Calories                  1373      1863      2363      2507      3073      4501
    TotalSleepRecords         1.000     1.000     1.000     1.193     1.000     2.000
    TotalMinutesAsleep        61.0      340.0     426.0     419.1     507.0     775.0
    TotalTimeInBed            69.0      382.0     470.0     459.8     539.0     961.0
    TotalSteps                1202      5079      10144     9871      13238     22770
    TotalDistance             0.780     3.420     7.710     7.016     9.240     17.540
    TrackerDistance           0.780     3.420     7.710     7.016     9.240     17.540
    LoggedActivitiesDistance  0         0         0         0         0         0
    ActiveDistance            0.000     0.000     2.010     2.747     4.160     13.320
    non_ActiveDistance        0.590     2.730     3.770     4.266     5.330     9.480
    ActiveMinutes             0.00      0.00      44.00     50.28     80.00     252.00
    non_ActiveMinutes         402.0     850.0     911.0     927.2     998.0     1371.0
    DayOfWeek                 Length: 57, Class: character, Mode: character

    day_Sat_list <-
      list(
        list("Total_Steps_Ave" = ~ mean(day_Saturday$TotalSteps, na.rm = TRUE)),
        list("Active_Minutes_Ave" = ~ mean(day_Saturday$ActiveMinutes, na.rm = TRUE)),
        list("Sedentary_Minutes_Ave" = ~ mean(day_Saturday$non_ActiveMinutes, na.rm = TRUE)),
        list("Calories_Ave" = ~ mean(day_Saturday$Calories, na.rm = TRUE)),
        list("Total_Hours_Asleep_Ave" = ~ mean(day_Saturday$TotalMinutesAsleep / 60, na.rm = TRUE))
      )
    day_Sat_summary <- summary_table(day_Saturday, day_Sat_list)
    print.default(day_Sat_summary)

                            day_Saturday (N = 57)
    Total_Steps_Ave         9871.122807
    Active_Minutes_Ave      50.280702
    Sedentary_Minutes_Ave   927.210526
    Calories_Ave            2506.894737
    Total_Hours_Asleep_Ave  6.984503
    attr(,"rgroups")
    [1] 1 1 1 1 1
    attr(,"n")
    [1] 57
    attr(,"class")
    [1] "qwraps2_summary_table" "matrix" "array"

    # Sunday

    day_Sunday <- dailyMerged1 %>%
      filter(dailyMerged1$DayOfWeek == "Sunday") %>%
      select(-c(Id, ActivityDate))
    summary(day_Sunday)

                              Min.      1st Qu.   Median    Mean      3rd Qu.   Max.
    Calories                  1214      1698      2027      2277      2676      4552
    TotalSleepRecords         1.000     1.000     1.000     1.182     1.000     3.000
    TotalMinutesAsleep        58.0      380.0     481.0     452.7     550.5     700.0
    TotalTimeInBed            61.0      436.0     527.0     503.5     602.5     961.0
    TotalSteps                655       3688      6543      7298      10334     17298
    TotalDistance             0.430     2.600     4.330     5.185     7.020     14.380
    TrackerDistance           0.430     2.600     4.330     5.185     7.020     14.380
    LoggedActivitiesDistance  0         0         0         0         0         0
    ActiveDistance            0.000     0.000     0.000     1.893     3.520     11.150
    non_ActiveDistance        0.430     2.260     3.230     3.289     4.035     6.730
    ActiveMinutes             0.00      0.00      0.00      38.91     58.50     275.00
    non_ActiveMinutes         566.0     758.5     868.0     887.7     945.5     1379.0
    DayOfWeek                 Length: 55, Class: character, Mode: character

    day_Sun_list <-
      list(
        list("Total_Steps_Ave" = ~ mean(day_Sunday$TotalSteps, na.rm = TRUE)),
        list("Active_Minutes_Ave" = ~ mean(day_Sunday$ActiveMinutes, na.rm = TRUE)),
        list("Sedentary_Minutes_Ave" = ~ mean(day_Sunday$non_ActiveMinutes, na.rm = TRUE)),
        list("Calories_Ave" = ~ mean(day_Sunday$Calories, na.rm = TRUE)),
        list("Total_Hours_Asleep_Ave" = ~ mean(day_Sunday$TotalMinutesAsleep / 60, na.rm = TRUE))
      )
    day_Sun_summary <- summary_table(day_Sunday, day_Sun_list)
    print.default(day_Sun_summary)

                            day_Sunday (N = 55)
    Total_Steps_Ave         7297.854545
    Active_Minutes_Ave      38.909091
    Sedentary_Minutes_Ave   887.672727
    Calories_Ave            2276.600000
    Total_Hours_Asleep_Ave  7.545758
    attr(,"rgroups")
    [1] 1 1 1 1 1
    attr(,"n")
    [1] 55
    attr(,"class")
    [1] "qwraps2_summary_table" "matrix" "array"
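The per-day blocks above repeat the same filter-and-summarise steps for each weekday. As a hedged alternative, the same five averages can be produced for all days in a single call. This is a sketch in base R on invented data, not the script used for this project. The column names follow the summaries above, and the weekday names are mapped by hand so the result does not depend on the system locale.

```r
# Toy stand-in for dailyMerged1 (two Mondays and two Tuesdays).
daily <- data.frame(
  ActivityDate = as.Date(c("2016-04-11", "2016-04-12",
                           "2016-04-18", "2016-04-19")),
  TotalSteps = c(9000, 8000, 7000, 10000),
  ActiveMinutes = c(50, 40, 30, 60),
  non_ActiveMinutes = c(940, 900, 960, 880),
  Calories = c(2400, 2300, 2200, 2500),
  TotalMinutesAsleep = c(420, 450, 390, 480)
)

# "%u" gives the ISO weekday number (Monday = 1), used to index English names.
day_names <- c("Monday", "Tuesday", "Wednesday", "Thursday",
               "Friday", "Saturday", "Sunday")
daily$DayOfWeek <- factor(
  day_names[as.integer(format(daily$ActivityDate, "%u"))],
  levels = day_names
)
daily$HoursAsleep <- daily$TotalMinutesAsleep / 60

# One row of averages per weekday, in calendar order.
day_summary <- aggregate(
  cbind(TotalSteps, ActiveMinutes, non_ActiveMinutes, Calories, HoursAsleep) ~
    DayOfWeek,
  data = daily, FUN = mean
)
```

Keeping all weekdays in one table also makes it harder to print one day's summary under another day's heading by accident.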

Using ggplot, I made several charts describing my findings:

Interesting how both Tuesday and Monday are in the top three in each chart, even Average Sedentary Minutes. It seems that people are more likely to go easy on Fridays and stay up later. Overall, the differences between the days of the week aren’t as large as one might expect, but these differences are still notable enough to consider.

Hourly Data Analysis

I imported hourlyMerged.csv into RStudio, where I could compare the relationship between calories, steps, and intensity. I first split the activity hour column into activityDate and time, converted both to the appropriate data types, then added columns for the day of the week and the time of day:
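A minimal base R sketch of these steps is given below, on invented rows. The timestamp format and the six-hour boundaries for night, morning, afternoon, and evening are assumptions made for illustration, since the original cut-offs are not stated here.

```r
# Toy stand-in for hourlyMerged.csv.
hourly <- data.frame(
  Id = c(1, 1, 1, 1),
  ActivityHour = c("2016-04-12 02:00:00", "2016-04-12 08:00:00",
                   "2016-04-12 14:00:00", "2016-04-12 20:00:00"),
  StepTotal = c(0, 600, 900, 400)
)

# Parse the timestamp once, then split it into a date and a time of day.
stamp <- as.POSIXct(hourly$ActivityHour,
                    format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
hourly$activityDate <- as.Date(stamp)
hourly$time <- format(stamp, "%H:%M:%S")

# Day of the week, mapped by hand to avoid locale-dependent names.
day_names <- c("Monday", "Tuesday", "Wednesday", "Thursday",
               "Friday", "Saturday", "Sunday")
hourly$dayOfWeek <- day_names[as.integer(format(hourly$activityDate, "%u"))]

# Assumed bins: night 00-05, morning 06-11, afternoon 12-17, evening 18-23.
hour_of_day <- as.integer(format(stamp, "%H"))
hourly$timeOfDay <- cut(
  hour_of_day,
  breaks = c(0, 6, 12, 18, 24),
  labels = c("night", "morning", "afternoon", "evening"),
  right = FALSE
)
```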

There appeared to be no strong correlation between intensity and either steps or calories burned:

In this case, I thought it might be worth analyzing what time of day participants were most active: night, morning, afternoon, and evening.

I then put all the summaries together to get a better idea of what time of day people were most active:
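One way to build such a combined summary is sketched below in base R, on invented values. It assumes a timeOfDay column already exists and averages each measure within it. It is an illustration rather than the script used here.

```r
# Toy hourly rows already labelled with a time of day.
periods <- c("night", "morning", "afternoon", "evening")
hourly <- data.frame(
  timeOfDay = factor(rep(periods, each = 2), levels = periods),
  StepTotal = c(0, 20, 400, 600, 700, 900, 500, 300),
  Calories = c(60, 64, 100, 120, 130, 150, 110, 90),
  TotalIntensity = c(0, 1, 12, 18, 20, 26, 15, 9)
)

# Average each measure within each time of day.
time_summary <- aggregate(
  cbind(StepTotal, Calories, TotalIntensity) ~ timeOfDay,
  data = hourly, FUN = mean
)

# Rank the periods from most to least active by average steps.
time_summary <- time_summary[order(-time_summary$StepTotal), ]
```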

Finally, let’s once again put those data into charts:

No surprise that nighttime ranks last for physical activity. People are slightly yet significantly more likely to be active during the afternoon than in the evening or morning.

1. What are some trends in smart device usage? The clearest trend is that FitBit simply isn’t collecting enough data.

2. How could these trends apply to Bellabeat customers? If the customers can’t log this data easily, then they’re missing out on a lot of very useful insights.

3. How could these trends help influence Bellabeat marketing strategy? These gaps in FitBit’s data are a great opportunity for BellaBeat to step in and provide something their competitors have thus far been unable to provide.

Recommendations

  • For example, if Friday is a day when customers are consistently less active and more sedentary, the app should encourage them to engage in some light physical activity.
  • Body weight is important data for anyone trying to improve their health. Whatever is preventing customers from regularly logging their weight needs to be uncovered and corrected.
  • Weight needs to be measured at regular intervals and under the same conditions for the data to be helpful. An individual’s weight can fluctuate dramatically throughout the day, so a person can appear to be up to ten pounds (4.5 kg) heavier or lighter depending on when they logged their weight. This is one of the rare cases when too many readings can spoil the utility of the dataset. Perhaps the customer can be encouraged to set a weekly timer to remind them to weigh themselves under similar conditions. If they want to weigh themselves outside of these time frames, they should receive a dialog box confirming that they understand the issues with doing so.
  • Perhaps there can be hardware integration with smart scales, much in the same way as BellaBeat’s Spring water bottle automatically tracks hydration.
  • Sleep is just as important to a healthy life as exercise and a balanced diet. BellaBeat should strongly encourage their customers to go to bed at a time that suits their schedules while wearing one of the company’s products. BellaBeat can really stand out in their field if they can use abundant and accurate sleep data to help their customers.
  • It’s also possible that the particular device used to track sleep wasn’t conducive to tracking sleep. For example, maybe some customers find wearing a wristwatch to bed to be too uncomfortable to be worth it. In such a case, this might require a hardware solution.
  • Even when the data have been gathered amply and correctly, they seem to be disconnected from each other. Sure, steps taken correlate strongly with calories burned, but that’s a banal observation. It’s very hard to see how someone can look at this FitBit data at a glance and use it to change their habits and routines. BellaBeat should not only gather better data than FitBit, but also leverage that data better to provide interesting and actionable insights to their customers.
  • The customer should feel like their data is being gathered and deployed to improve their lives. This can incentivize them to more regularly log their data and wear the tracking devices while sleeping. Without this sense of purpose, the customer will stop engaging with the device seriously.

Acknowledgments

I’d like to thank Ed Garcia for his guidance on how to divide the data into days of the week and times of the day.

Google Data Analytics Certificate. Capstone project: Bellabeat

César Muro Cabral

Introduction

This capstone project is part of the last course of the Google Data Analytics Professional Certificate.

In this case study, we answer key business questions for Bellabeat, a high-tech manufacturer of health-focused products for women.

The tool used for this analysis is the R programming language used with the integrated development environment RStudio.

Bellabeat is a successful small company, but they have the potential to become a larger player in the global smart device market. Urška Sršen, cofounder and Chief Creative Officer of Bellabeat, believes that analyzing smart device fitness data could help unlock new growth opportunities for the company. We have been asked to focus on one of Bellabeat’s products and analyze smart device data to gain insight into how consumers are using their smart devices. The insights we discover will then help guide marketing strategy for the company.

The primary Bellabeat products are:

Bellabeat app: The Bellabeat app provides users with health data related to their activity, sleep, stress, menstrual cycle, and mindfulness habits. This data can help users better understand their current habits and make healthy decisions. The Bellabeat app connects to their line of smart wellness products.

Leaf: Bellabeat’s classic wellness tracker can be worn as a bracelet, necklace, or clip. The Leaf tracker connects to the Bellabeat app to track activity, sleep, and stress.

Time: This wellness watch combines the timeless look of a classic timepiece with smart technology to track user activity, sleep, and stress. The Time watch connects to the Bellabeat app to provide you with insights into your daily wellness.

Spring: This is a water bottle that tracks daily water intake using smart technology to ensure that you are appropriately hydrated throughout the day. The Spring bottle connects to the Bellabeat app to track your hydration levels.

Our data analysis will be divided into six consecutive phases: ask, prepare data, process and clean data, analyze, share, and act.

In this first phase, we define the problem by asking the right questions and identifying the stakeholders and their expectations.

We can identify the key stakeholders as:

Urška Sršen: Bellabeat’s cofounder and Chief Creative Officer.

Sando Mur: Mathematician and Bellabeat’s cofounder; key member of the Bellabeat executive team.

Bellabeat marketing analytics team: A team of data analysts responsible for collecting, analyzing, and reporting data that helps guide Bellabeat’s marketing strategy. You joined this team six months ago and have been busy learning about Bellabeat’s mission and business goals, as well as how you, as a junior data analyst, can help Bellabeat achieve them.

Sršen asks us to analyze smart device usage data in order to gain insight into how consumers use non-Bellabeat smart devices. She then wants you to select one Bellabeat product to apply these insights to in your presentation. These questions will guide your analysis:

What are some trends in smart device usage?

How could these trends apply to Bellabeat customers?

How could these trends help influence Bellabeat marketing strategy?

You will produce a report with the following deliverables:

A clear summary of the business task

A description of all data sources used

Documentation of any cleaning or manipulation of data

A summary of your analysis

Supporting visualizations and key findings

Your top high-level content recommendations based on your analysis

Collecting and preparing the data

In this phase, we collect the data and assess the data sources to check their validity.

Sršen encourages you to use public data that explores smart device users’ daily habits, and she points you to the FitBit Fitness Tracker Data. This Kaggle dataset was generated by respondents to a survey distributed via Amazon Mechanical Turk between 03.12.2016 and 05.12.2016. Thirty eligible Fitbit users consented to the submission of personal tracker data, including minute-level output for physical activity, heart rate, and sleep monitoring.

The complete dataset is divided into 18 CSV files, each of which uses an Id column as its key. The structured data is in long format. The files hold daily, hourly, and minute-level records for the period 03.12.2016-05.12.2016. Notice that there are only two months of data, which is a short period.

Kaggle already summarizes the integrity of this dataset in its usability score. The data has a usability rating of 10/10, with a score of 100% in completeness, credibility, and compatibility.

Moreover, the data is released into the public domain under the CC0 1.0 Universal license, which allows us to modify and distribute it freely.

Loading the data in RStudio

I downloaded the CSV files to a folder on my personal computer. Let us load the necessary R libraries and read the data into data frames.

First, we build a character vector with the names of the files in the folder using list.files(); then, with lapply(), we load all the CSV files into a list of data frames instead of reading them one by one. Finally, with the set_names() and file_path_sans_ext() functions, we name each data frame after its file.
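As a minimal sketch of this loading step, the code below runs on two tiny made-up CSV files written to a temporary folder rather than on the real Kaggle files. The folder name and the values are assumptions for illustration, and base R's names<-() stands in for set_names() so that no extra package is needed.

```r
# Hypothetical folder: a temporary directory with two tiny CSVs stands in
# for the downloaded Kaggle files so the sketch runs on its own.
data_dir <- file.path(tempdir(), "fitbit_demo")
dir.create(data_dir, showWarnings = FALSE)
write.csv(data.frame(Id = c(1, 2), Calories = c(1985, 1797)),
          file.path(data_dir, "dailyActivity_merged.csv"), row.names = FALSE)
write.csv(data.frame(Id = c(1, 2), TotalMinutesAsleep = c(327, 384)),
          file.path(data_dir, "sleepDay_merged.csv"), row.names = FALSE)

# 1. Collect the path of every CSV file in the folder.
csv_paths <- list.files(data_dir, pattern = "\\.csv$", full.names = TRUE)

# 2. Read each file into a data frame in a single call.
list_of_files <- lapply(csv_paths, read.csv)

# 3. Name each list element after its file, minus the extension.
names(list_of_files) <- tools::file_path_sans_ext(basename(csv_paths))

names(list_of_files)
```

With the real data, data_dir would simply point at the folder holding the downloaded CSV files.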

I have left out the wide-format tables, since they contain the same data.

Now, we use the glimpse() function to get an overview of the data frames.

Let us evaluate the data source against the ROCCC criteria:

Reliable: This data is reliable since it has a high Kaggle usability rating. Moreover, it was generated by a reliable company: Fitbit, an American consumer electronics and fitness company that produces wireless-enabled wearable technology, physical fitness monitors, and activity trackers such as smart watches, pedometers, and monitors for heart rate, quality of sleep, and stairs climbed, as well as related software.

Original: It is original data.

Comprehensive: In general, this dataset is not comprehensive enough. The numeric data should come with units or some description. For example, there is a column called VeryActiveDistance with no indication of the distance units, and some columns are repeated across the merged tables. Most importantly, there is no information about the sex, age, or disabilities of the 30 users. Bellabeat is a manufacturer of health-focused products for women, so without this information we cannot draw conclusions about the company’s target market.

Current: This data covers two months of 2016, so it is not current.

Cited: This dataset has been used in previous data analysis projects.

At this stage, it is worth stressing that more comprehensive data is needed for a better analysis. Bellabeat makes products for women, and the data we are using does not specify the age, sex, or region of the 30 users. Moreover, a larger sample size and a longer period than two months of records are needed.

Processing and cleaning the data

Here we find and eliminate any errors and data constraints that could get in the way of the results. This usually means cleaning the data, transforming it into a more useful format, combining two or more datasets to make the information more complete, and removing outliers or null values that could skew the results.

I only focus on the daily and hourly files.

For the daily data, the dailyActivity_merged, weightLogInfo_merged, and sleepDay_merged data frames collect all the necessary information. From the list containing the data frames, we extract them into individual data frames.

Notice that I index the data frames in list_of_files as if they had an extra dimension; this is common when working with a list of lists.
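The author's exact call is not shown, so the following is only a sketch of why an extra level of indexing can be needed. It assumes a small made-up list_of_files; the nested variant is a hypothetical illustration of a list of lists.

```r
# Made-up stand-in for the list of data frames loaded earlier.
list_of_files <- list(
  dailyActivity_merged = data.frame(Id = 1:2, TotalSteps = c(100, 200)),
  sleepDay_merged      = data.frame(Id = 1:2, TotalMinutesAsleep = c(300, 400))
)

# Single brackets keep the list wrapper: the result is a list of length 1.
sub_list <- list_of_files["dailyActivity_merged"]

# Double brackets extract the element itself: the result is a data frame.
daily_activity <- list_of_files[["dailyActivity_merged"]]

class(sub_list)        # "list"
class(daily_activity)  # "data.frame"

# If each element is itself wrapped in a list (a list of lists),
# one more [[ ]] is needed to reach the data frame.
nested <- list(dailyActivity_merged = list(daily_activity))
daily_from_nested <- nested[["dailyActivity_merged"]][[1]]
```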

We inspect their characteristics

We observe that the date columns need to be converted to a date-time format, which will be done in the analysis phase.

In the daily_activity dataframe, the TotalDistance and TrackerDistance columns are the same. The LoggedActivityDistance and SedentaryActiveDistance columns contain only 0 values, so we can do without them. Let us drop these two columns along with TrackerDistance.
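A short sketch of this check-then-drop step, on a made-up daily_activity data frame. The column names follow the spelling used in the text and should be checked against the actual CSV header; the values are invented.

```r
# Made-up rows; column names follow the text, not a verified CSV header.
daily_activity <- data.frame(
  Id                      = c(1, 1, 2),
  TotalDistance           = c(8.50, 6.97, 5.20),
  TrackerDistance         = c(8.50, 6.97, 5.20),
  LoggedActivityDistance  = c(0, 0, 0),
  SedentaryActiveDistance = c(0, 0, 0),
  Calories                = c(1985, 1797, 1776)
)

# Confirm the claims before dropping anything.
same_distance <- all(daily_activity$TotalDistance == daily_activity$TrackerDistance)
only_zeros <- all(daily_activity$LoggedActivityDistance == 0) &&
  all(daily_activity$SedentaryActiveDistance == 0)

# Drop the redundant and all-zero columns by name.
drop_cols <- c("TrackerDistance", "LoggedActivityDistance", "SedentaryActiveDistance")
if (same_distance && only_zeros) {
  daily_activity <- daily_activity[, !(names(daily_activity) %in% drop_cols)]
}

names(daily_activity)
```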

The Fat column of weight_info has several NA values, but we will focus only on the BMI column. I intended to merge these three data frames, but since they have different numbers of rows, I keep them separate.

Finally, we merge the three hourly dataframes since they have the same number of rows.
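The merge of the hourly tables could look roughly like the sketch below. It assumes the three tables share Id and ActivityHour key columns and uses invented rows; the column names Calories, TotalIntensity, and StepTotal are assumptions about the hourly files.

```r
# Made-up hourly tables sharing the assumed key columns Id and ActivityHour.
hours <- c("4/12/2016 12:00:00 AM", "4/12/2016 1:00:00 AM", "4/12/2016 12:00:00 AM")
hourly_calories    <- data.frame(Id = c(1, 1, 2), ActivityHour = hours, Calories = c(81, 61, 70))
hourly_intensities <- data.frame(Id = c(1, 1, 2), ActivityHour = hours, TotalIntensity = c(20, 8, 0))
hourly_steps       <- data.frame(Id = c(1, 1, 2), ActivityHour = hours, StepTotal = c(373, 160, 0))

# Merge the tables pairwise on both key columns.
hourly_activity <- Reduce(
  function(x, y) merge(x, y, by = c("Id", "ActivityHour")),
  list(hourly_calories, hourly_intensities, hourly_steps)
)

names(hourly_activity)
```

Merging on the keys, rather than binding columns side by side, keeps rows aligned even if the files are sorted differently.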

Let us look at its characteristics

Analyzing and visualizing the data

We load the libraries relevant to the analysis.

Analyzing weight and body mass index

Let us investigate the dataframe with the weight and body mass index (BMI) of the users.

We compute the relevant descriptive statistics

Unfortunately, there are records for only 8 different individuals.

For adults 20 years and older, the BMI is interpreted using standard weight status categories. These categories are the same for men and women of all body types and ages.

Let us inspect which weight status category each individual belongs to

The percentage of each category is
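A sketch of how the categories and their percentages can be computed, using the standard adult cut-offs (below 18.5, 18.5 to below 25, 25 to below 30, and 30 or above). The BMI values here are invented for illustration and are not the dataset's values.

```r
# Made-up BMI values, one per hypothetical user.
bmi <- c(17.9, 22.4, 26.1, 28.3, 31.0)

# Assign each value to a standard adult weight status category.
# right = FALSE makes intervals closed on the left, e.g. [25, 30).
status <- cut(
  bmi,
  breaks = c(-Inf, 18.5, 25, 30, Inf),
  labels = c("Underweight", "Healthy weight", "Overweight", "Obesity"),
  right  = FALSE
)

# Percentage of users in each category.
pct <- round(100 * prop.table(table(status)), 1)
pct
```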

Therefore, more than 62% of the users are overweight or obese. This suggests that the majority of the individuals in our data are overweight.

There is a notable outlier: a person with a BMI of 47.54.

Analyzing the sleep activity data

Let us first convert the SleepDay column into date-time format with the parse_date_time() function,

then convert the minute columns to hours.
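A minimal sketch of both conversions on made-up rows. Base R's as.POSIXct() stands in for parse_date_time() here, and the timestamp format "month/day/year hour:minute:second AM/PM" is an assumption about how SleepDay is stored.

```r
# Made-up sleep records; the timestamp format is an assumption.
sleep_day <- data.frame(
  Id                 = c(1, 1, 2),
  SleepDay           = c("4/12/2016 12:00:00 AM", "4/13/2016 12:00:00 AM",
                         "4/12/2016 12:00:00 AM"),
  TotalMinutesAsleep = c(360, 420, 450),
  TotalTimeInBed     = c(390, 450, 480)
)

# Parse the character timestamps into date-time values.
sleep_day$SleepDay <- as.POSIXct(sleep_day$SleepDay,
                                 format = "%m/%d/%Y %I:%M:%S %p", tz = "UTC")

# Convert the minute columns to hours.
sleep_day$HoursAsleep <- sleep_day$TotalMinutesAsleep / 60
sleep_day$HoursInBed  <- sleep_day$TotalTimeInBed / 60

str(sleep_day)
```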

We investigate the descriptive statistics of the dataframe

We see that FitBit users sleep an average of about 6.9 hours per day, and it takes them an average of 40 minutes to fall asleep.

Let us find out how FitBit users distribute their time in bed across the week
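One way to get the average time in bed per weekday is sketched below on invented values. The weekday is derived from the numeric day index so that the labels do not depend on the system locale; the HoursInBed column name is an assumption.

```r
# Made-up records: two Sundays, one Tuesday, one Friday.
sleep_day <- data.frame(
  SleepDay   = as.Date(c("2016-04-10", "2016-04-12", "2016-04-15", "2016-04-17")),
  HoursInBed = c(9.0, 7.0, 6.5, 8.0)
)

# as.POSIXlt()$wday is 0 for Sunday through 6 for Saturday.
day_names <- c("Sunday", "Monday", "Tuesday", "Wednesday",
               "Thursday", "Friday", "Saturday")
sleep_day$Weekday <- factor(day_names[as.POSIXlt(sleep_day$SleepDay)$wday + 1],
                            levels = day_names)

# Mean time in bed for each weekday present in the data.
bed_by_day <- aggregate(HoursInBed ~ Weekday, data = sleep_day, FUN = mean)
bed_by_day
```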

Therefore, users tend to spend more time in bed on Sundays and less on Fridays and Tuesdays.

We can examine whether there is any relationship between total time in bed and body mass index.

We obtain the mean time in bed grouped by user Id

and merge both dataframes
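These two steps can be sketched as follows, on invented data. The sketch assumes weight_info already holds one BMI value per user; with several weigh-ins per user, the BMI would need to be averaged by Id first.

```r
# Made-up inputs; user 3 has no BMI record and user 4 has no sleep record.
sleep_day   <- data.frame(Id = c(1, 1, 2, 2, 3),
                          TotalTimeInBed = c(400, 440, 500, 460, 480))
weight_info <- data.frame(Id = c(1, 2, 4), BMI = c(22.6, 27.4, 25.1))

# Mean time in bed (minutes) per user.
mean_bed <- aggregate(TotalTimeInBed ~ Id, data = sleep_day, FUN = mean)

# Inner join: only users present in both tables survive the merge.
bed_bmi <- merge(mean_bed, weight_info, by = "Id")
bed_bmi
```

Because few users logged their weight, the merged table is much smaller than either input, which limits what can be concluded from it.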

Analyzing the daily activity dataframe

We explore the daily_activity dataframe and relate it to the insights obtained previously.

We transform the ActivityDate column into a date-time format and add new columns with the corresponding weekday and the total active minutes.
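A sketch of these transformations on made-up rows. It assumes ActivityDate is stored as "month/day/year" text and that total active minutes means the sum of the very, fairly, and lightly active minute columns; both are assumptions, as the original code is not shown.

```r
# Made-up daily records; date format and column names are assumptions.
daily_activity <- data.frame(
  ActivityDate         = c("4/12/2016", "4/13/2016"),
  VeryActiveMinutes    = c(25, 21),
  FairlyActiveMinutes  = c(13, 19),
  LightlyActiveMinutes = c(328, 217),
  SedentaryMinutes     = c(728, 776)
)

# Parse the dates (base R stand-in for a lubridate parser).
daily_activity$ActivityDate <- as.Date(daily_activity$ActivityDate, format = "%m/%d/%Y")

# Weekday label, derived from the numeric day index (0 = Sunday).
day_names <- c("Sunday", "Monday", "Tuesday", "Wednesday",
               "Thursday", "Friday", "Saturday")
daily_activity$Weekday <- day_names[as.POSIXlt(daily_activity$ActivityDate)$wday + 1]

# Total active minutes as the sum of the three activity levels.
daily_activity$TotalActiveMinutes <- with(
  daily_activity,
  VeryActiveMinutes + FairlyActiveMinutes + LightlyActiveMinutes
)

daily_activity[, c("ActivityDate", "Weekday", "TotalActiveMinutes")]
```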

We obtain the descriptive statistics of each numeric column

As we can see, FitBit users spend much more time sedentary than active. Also, they spend far more time in light activity, such as walking, than in very active exercise.

Let us inspect how much active time the FitBit users log on each day of the week

We observe that users spend around 4 hours per day on activity. Tuesday is the day with the most activity.

We also inspect how much sedentary time the users have.

They spend 18 hours per day in sedentary activities. Thursday is the day with the fewest sedentary hours.

We examine the calories burned by users, the total steps, and total distance for each day of the week

Calorie burn does not vary significantly across the days of the week. It is a little higher on Tuesday.

People walk the most on Tuesday, followed by Thursday.

Let us check whether there is a relationship between calories and active and sedentary minutes

From this panel, we observe that total active minutes and total distance are positively correlated with calorie burn. Moreover, sedentary minutes are negatively correlated with it, though not as strongly as we might assume.
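The panel itself is a plot, but the numbers behind it are pairwise correlations. As a rough sketch on invented values, base R's cor() returns the full correlation matrix, from which the Calories row can be read off.

```r
# Made-up daily records for illustration only.
daily_activity <- data.frame(
  Calories           = c(1800, 2000, 2200, 2400, 2600),
  TotalActiveMinutes = c(150, 200, 240, 300, 330),
  SedentaryMinutes   = c(1100, 900, 1000, 850, 950)
)

# Pearson correlation between every pair of columns.
cor_matrix <- cor(daily_activity)

# Correlation of each variable with calories burned.
round(cor_matrix["Calories", ], 2)
```

A value near 1 or -1 indicates a strong linear relationship, while a value near 0 indicates a weak one.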

Inspecting the data grouped by Id

Now, we group the data by Id and compute the mean of each attribute

Then, we analyze their correlation
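A sketch of the per-user version on invented data: aggregate() with cbind() averages several columns at once for each Id, and cor() is then applied to the per-user means.

```r
# Made-up daily records for three hypothetical users.
daily_activity <- data.frame(
  Id               = c(1, 1, 2, 2, 3, 3),
  TotalSteps       = c(4000, 6000, 9000, 11000, 12000, 14000),
  SedentaryMinutes = c(1000, 1100, 900, 800, 950, 1050),
  Calories         = c(1800, 2000, 2300, 2500, 2700, 2900)
)

# One row per user, holding the mean of each attribute.
user_means <- aggregate(cbind(TotalSteps, SedentaryMinutes, Calories) ~ Id,
                        data = daily_activity, FUN = mean)

# Correlation between the per-user means (Id column excluded).
user_cor <- cor(user_means[, -1])
round(user_cor, 2)
```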

Similar to the previous panel, total steps and total distance are positively correlated with calories, but sedentary minutes are not significantly correlated with them.

Analyzing the hourly_activity dataframe

Finally, let us examine how total steps, calories, and total intensity vary over each hour of the day.
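A minimal sketch of the hourly profile on made-up rows. It assumes ActivityHour is stored as "month/day/year hour:minute:second AM/PM" text and that the merged table has StepTotal, Calories, and TotalIntensity columns.

```r
# Made-up hourly records; format and column names are assumptions.
hourly_activity <- data.frame(
  ActivityHour   = c("4/12/2016 12:00:00 AM", "4/12/2016 6:00:00 PM",
                     "4/13/2016 12:00:00 AM", "4/13/2016 6:00:00 PM"),
  StepTotal      = c(40, 600, 20, 800),
  Calories       = c(70, 130, 66, 150),
  TotalIntensity = c(2, 25, 0, 31)
)

# Parse the timestamp and keep only the hour of the day (0 to 23).
stamp <- as.POSIXct(hourly_activity$ActivityHour,
                    format = "%m/%d/%Y %I:%M:%S %p", tz = "UTC")
hourly_activity$Hour <- as.integer(format(stamp, "%H"))

# Average each measure for every hour of the day.
by_hour <- aggregate(cbind(StepTotal, Calories, TotalIntensity) ~ Hour,
                     data = hourly_activity, FUN = mean)
by_hour
```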

Share phase

Through the performed analysis, we found the following relevant facts:

  • Google Data Analytics Capstone Project - Bellabeat
  • by Nicholas Peters
  • Last updated about 2 years ago

Google Data Analytics Professional Certificate Capstone Project

ToeKnee013/Capstone-Project-BellaBeat

Project overview

I am a junior analyst working for Bellabeat , a high-tech manufacturer of health-focused products for women. Urška Sršen , cofounder and Chief Creative Officer of Bellabeat, believes that analyzing smart device fitness data could help unlock new growth opportunities for the company. I have been asked to focus on one of Bellabeat's products and analyze smart device data to gain insight into how consumers are using their smart devices. I will use the insights discovered to help guide the company's marketing strategy and make high-level recommendations.

Sršen, the cofounder and Chief Creative Officer of Bellabeat, a tech manufacturer of health-focused products for women, is asking me to analyze smart device usage in order to gain insight into how consumers use non-Bellabeat smart devices. She then wants me to select one Bellabeat product and apply these insights to it in my final presentation.

  • What are some trends in smart device usage?
  • How could these trends apply to Bellabeat customers?
  • How could these trends help influence Bellabeat marketing strategy?

I will attempt to uncover some of these trends in the data provided. These trends are listed under the Analyze heading.

Urška Sršen encourages me to use public data on Kaggle that explores smart device users' daily habits. The FitBit Fitness Tracker Data set on Kaggle contains personal fitness tracker data from about thirty Fitbit users.

Credibility/Source

"These datasets were generated by respondents to a distributed survey via Amazon Mechanical Turk between 03.12.2016-05.12.2016. Thirty eligible Fitbit users consented to the submission of personal tracker data, including minute-level output for physical activity, heart rate, and sleep monitoring. Individual reports can be parsed by export session ID (column A) or timestamp (column B). Variation between output represents use of different types of Fitbit trackers and individual tracking behaviors / preferences." (CC0: Public Domain, dataset made available through Mobius )

The tools used for cleaning, analysis, and sharing are BigQuery (SQL), Google Sheets, and Tableau. Refer to the Detailed Cleaning Report for more details on the cleaning. To summarize, I made the following changes:

  • No NULL values were found
  • No duplicates were found in Id and ActivityDate
  • Formatted the table
  • Made a new table named "BellaBeat_Table_Clean"
  • Exported and imported the table into Google Sheets
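The checks above were run in BigQuery; purely as an illustration of the same two checks, here is a sketch in R on an invented table. The table contents are made up, and only the Id and ActivityDate column names come from the list above.

```r
# Made-up activity table for illustration.
activity <- data.frame(
  Id           = c(1, 1, 2),
  ActivityDate = c("4/12/2016", "4/13/2016", "4/12/2016"),
  Calories     = c(1985, 1797, 1776)
)

# Count missing values across the whole table.
null_count <- sum(is.na(activity))

# Count rows that repeat an earlier (Id, ActivityDate) pair.
dup_count <- sum(duplicated(activity[, c("Id", "ActivityDate")]))

c(nulls = null_count, duplicates = dup_count)
```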

Recall our objectives

Assumptions for the analysis:

  • Strictly working with this dataset FitBit_Data.csv for the analysis
  • "Calories" refers to calories burned
  • "Distance" refers to miles

With these metrics for the analysis set, the most important variables to look at with the given data are Distance and Calories . In the end, I developed four simple charts for the analysis.

SUM of Distance vs. SUM of Calories

What can we learn from these charts?

Insight: There seems to be a correlation between distance traveled at a certain intensity and calories burned at that intensity. We can note that the calories burned at a specific intensity increase progressively along with the intensity of the distance traveled.

BellaBeat FitBit Analysis

  • Upper Whisker: 3,653
  • Upper Hinge: 2,799
  • Median: 2,221
  • Lower Hinge: 1,942
  • Lower Whisker: 1,495
  • Upper Whisker: 3,311
  • Upper Hinge: 2,735
  • Median: 2,135
  • Lower Hinge: 1,865
  • Lower Whisker: 1,237
  • R-Squared ≈ 0.6
  • P-value < 0.0001
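The figures above come from Tableau box plots. As a rough guide to reading them, the sketch below computes the same five quantities in R with boxplot.stats() on invented calorie values. It assumes the common convention that whiskers stop at the most extreme point within 1.5 times the hinge spread, which may differ from the chart's settings.

```r
# Made-up daily calorie totals, including one unusually high day.
calories <- c(1500, 1900, 2000, 2200, 2300, 2700, 2800, 3600, 4900)

# boxplot.stats() returns, in order: lower whisker, lower hinge,
# median, upper hinge, upper whisker. Points beyond the whiskers
# are reported separately as outliers.
bp <- boxplot.stats(calories)

five <- setNames(bp$stats, c("lower_whisker", "lower_hinge", "median",
                             "upper_hinge", "upper_whisker"))
five
bp$out  # values drawn as individual outlier points
```

The hinges bound the middle half of the data, so a narrow box with long whiskers indicates most days cluster tightly with a few far from the median.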

Final recommendations for marketing strategy:

  • Promote an advertisement aimed at individuals seeking weight loss that encourages them to exercise with more intensity
  • Integrate a software feature that rewards customers for burning calories, e.g., earning points for consistently hitting a weekly or monthly calorie-burn target. The points could be spent on in-app cosmetics, avatars, and similar items, or redeemed for groceries, gas, gift cards, and similar rewards.
