Since 2018 I have worked with colleagues to calibrate and validate a version of the Youth Activity Profile (YAP) for use with English youth. Pedro Saint-Maurice (National Cancer Institute) and Greg Welk (Iowa State University) previously demonstrated the potential of the YAP as a self-report questionnaire to accurately predict estimates of MVPA and sedentary behavior (SB) derived from activity monitors (see https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0143949 and https://www.sciencedirect.com/science/article/pii/S0749379716306912?via%3Dihub). Moreover, the US team have shown how the tool can be used at scale by teachers and practitioners with no expertise in activity measurement or exercise science (see https://www.tandfonline.com/doi/full/10.1080/02701367.2015.1127126). The predictive accuracy of the YAP is built around algorithms calibrated against MVPA and SB estimates from accelerometer-based activity monitors. The calibration algorithms are specific to MVPA during the school day, out-of-school, and weekends, and to out-of-school time spent in SB.
Stuart Fairclough (Edge Hill University), Danielle Christian (University of Central Lancashire), Pedro Saint-Maurice (National Cancer Institute), Paul Hibbing (University of Tennessee), Robert Noonan (University of Liverpool), Greg Welk (Iowa State University), Philip Dixon (Iowa State University), Lynne Boddy (Liverpool John Moores University)
We received funding from the Youth Sport Trust and Edge Hill University to evaluate the accuracy of the YAP and its calibration algorithms with English youth.
We aimed to:
(1) Assess the predictive accuracy of applying the original US-generated YAP calibration algorithms for PA and SB in a sample of English youth
(2) Develop and validate new English-specific YAP calibration algorithms
(3) Examine the potential surveillance utility of the new algorithms to assess compliance with PA guidelines
What we did
Before the study, the research team made minimal amendments to the YAP so that its language and terminology were clearer and more appropriate for English youth (e.g., the word ‘recess’ was replaced with ‘break time’ and ‘cell phone’ with ‘mobile phone’). Through this process the fundamental content and meaning of the YAP questions were unaltered.
Around 400 primary and secondary school students aged 10-15 years from 11 schools in northwest England agreed to take part in the study. They wore a SenseWear Armband Mini (SWA) multi-sensor activity monitor for 8 days to assess their MVPA and sedentary behaviour (SB). On the eighth day of SWA wear the students completed the online YAP under the direction of the research team. The YAP has 15 questions organised into three sections (school day, out-of-school, and sedentary behaviour), with five questions per section. The students were asked to recall their PA and SB over the past 7 days during context-specific time segments. For example, the school day questions ask on how many days students undertook active travel to and from school, and about their activity levels during break time, lunch time, and PE. The out-of-school section asks about activity levels before school, immediately after school, in the evening, and on both Saturday and Sunday. The SB section asks about time spent watching TV, playing video games, using a mobile phone, using a computer/tablet, and overall SB. All questions are structured using a 1-5 Likert scale (e.g., for active travel to school, a score of 1 indicates 0 days per week of active travel, whereas a score of 5 indicates 4-5 days per week). On completion of the YAP, researchers used recall ‘probing’ questions as a quality assurance mechanism to improve the accuracy of responses. These probes were specifically developed for the YAP calibration and are not part of the regular YAP administration protocol.
Schools also provided details of the previous week’s school timetable schedule which included days and times for school start and end, break times, lunchtime, and physical education (PE) lessons. This information was used to temporally match each student’s SWA data so estimates of MVPA and SB could be derived for specific time segments of the day. These segments reflected the content of the YAP questions and are detailed in Table 1 below.
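The temporal matching described above can be illustrated with a minimal sketch. It assumes minute-level records (the SWA's default 60-second epoch) that have already been flagged as MVPA or not. The segment names, clock times, and function names are hypothetical and are not taken from the study's actual timetables or processing code.

```python
from datetime import time

# Hypothetical timetable for one school day (times are illustrative only).
TIMETABLE = [
    ("before_school", time(7, 0), time(8, 50)),
    ("break_time", time(10, 30), time(10, 45)),
    ("lunch_time", time(12, 0), time(13, 0)),
    ("after_school", time(15, 15), time(18, 0)),
]


def segment_for(clock_time):
    """Return the name of the segment containing clock_time, or None."""
    for name, start, end in TIMETABLE:
        if start <= clock_time < end:
            return name
    return None


def mvpa_minutes_by_segment(minute_records):
    """Sum MVPA minutes per segment.

    minute_records: iterable of (clock_time, is_mvpa) pairs,
    one pair per 60-second epoch.
    """
    totals = {name: 0 for name, _, _ in TIMETABLE}
    for clock_time, is_mvpa in minute_records:
        name = segment_for(clock_time)
        if name is not None and is_mvpa:
            totals[name] += 1
    return totals


# Toy data: five one-minute epochs from a single (made-up) student.
records = [
    (time(10, 31), True),
    (time(10, 32), False),
    (time(12, 15), True),
    (time(12, 16), True),
    (time(14, 0), True),  # lesson time: outside every listed segment
]
totals = mvpa_minutes_by_segment(records)
```

In practice each school, and potentially each day, would need its own timetable, which is why the schedule details were collected from every school.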
- To assess aim 1, the original US YAP calibration algorithms were applied to the students’ SWA data.
- Next, the sample was randomly split (stratified by primary/secondary school) into a calibration sample (6 schools) and a cross-validation sample (3 schools).
- To assess aim 2, the YAP and SWA data from the calibration sample underwent quantile regression analyses to generate new YAP calibration algorithms for MVPA during school, out-of-school, and at weekends, and for SB out-of-school. These new algorithms were then applied to the cross-validation sample data, and bias and mean absolute percent error (MAPE) were calculated to explore group-level agreement. Equivalence testing was also applied to the cross-validation sample to examine whether 95% confidence intervals (CI) for YAP-predicted minutes of MVPA/SB were within a 10% range (equivalence zone) of estimates from the SWA. Where there was no evidence of equivalence at 10%, the equivalence zone was increased in 5% steps until equivalence was reached (i.e., 15%, 20%, and so on).
- To assess aim 3, we examined the agreement between the proportion of students achieving the MVPA recommendations of 60 minutes per day and 30 minutes per school day according to the YAP and SWA. Classification accuracy of the YAP was evaluated using percent agreement, kappa, sensitivity, and specificity.
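As a rough illustration of the two group-level agreement statistics named above, the sketch below computes bias and MAPE for paired predicted and device-based values. The formulas are the conventional definitions and are assumed here rather than taken from the study's analysis code. The function names and data are made up.

```python
from statistics import mean


def bias(predicted, observed):
    """Mean difference (predicted minus observed); positive = over-estimate."""
    return mean(p - o for p, o in zip(predicted, observed))


def mape(predicted, observed):
    """Mean absolute percent error, relative to the observed values.

    Assumes every observed value is non-zero.
    """
    return 100 * mean(abs(p - o) / o for p, o in zip(predicted, observed))


# Toy data: minutes of MVPA for four students (illustrative only).
swa_minutes = [50.0, 40.0, 80.0, 100.0]   # device-based estimates
yap_minutes = [55.0, 36.0, 88.0, 100.0]   # questionnaire-predicted estimates

example_bias = bias(yap_minutes, swa_minutes)   # 2.25 min over-estimate
example_mape = mape(yap_minutes, swa_minutes)   # about 7.5%
```

Bias shows the direction of the average error, whereas MAPE shows its typical size regardless of direction, so a small bias can coexist with a large MAPE when over- and under-estimates cancel out.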
What we found
Predictive ability of US YAP algorithms with English sample (Aim 1)
Group-level agreement between the US YAP algorithms and the SWA estimates of MVPA and SB was weak for each segment. The algorithms underestimated MVPA and over-estimated SB (see Figures 1 and 2). MAPE ranged from 26.5% to 51.0%.
Generation of English YAP algorithms (Aim 2)
For the calibration analyses, 200 students had valid YAP and SWA data for at least one of the YAP segments of the week. In the final models the predictors of MVPA and SB were school level (i.e., primary/secondary), sex, and the interaction between the segment YAP score and school level. Root mean square error values were 12.1%, 9.6%, 8.5%, and 15.3% for in-school, out-of-school, and weekend MVPA, and out-of-school SB, respectively.
From the three cross-validation schools, there were 129 students with valid YAP and SWA data for at least one YAP segment of the week. The new YAP algorithms over-estimated school day MVPA (by 3.6 min/day or 0.4% of segment time), out-of-school MVPA (by 5.2 min/day or 0.1% of segment time) and SB (21.8 min/day or 0.3% of segment time), and underestimated weekend MVPA (by 2.5 min/day or 0.1% of segment time). These results are presented in Figures 3-6.
The MAPE was lowest for weekend MVPA (3.6%) and highest for school day MVPA (17.3%). Figure 7 compares the MAPE between the US YAP algorithms and new English ones applied to the cross-validation sample, and also to the full sample.
When we applied equivalence tests to the cross-validation sample, we found that school day and out-of-school MVPA predicted by the YAP were within the 20% equivalence zones for SWA-estimated MVPA. Weekend MVPA and SB predicted by the YAP were within the 15% equivalence zones for SWA-estimated weekend MVPA and SB. These results are summarised in Figures 8 and 9.
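The widening-zone procedure can be sketched as follows. This is a simplified illustration, not the study's analysis code: it assumes a normal-approximation 95% CI for the mean of the YAP-predicted values and a zone defined as a percentage either side of the SWA mean. The function names, the `limit` parameter, and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def mean_ci95(values, z=1.96):
    """Normal-approximation 95% CI for the mean (an assumption of this sketch)."""
    centre = mean(values)
    half_width = z * stdev(values) / sqrt(len(values))
    return centre - half_width, centre + half_width


def smallest_equivalence_zone(predicted, observed, start=10, step=5, limit=100):
    """Return the smallest zone (in %) that contains the whole CI.

    The zone starts at `start` percent either side of the observed mean
    and widens by `step` until the CI fits; returns None if `limit`
    is exceeded.
    """
    ci_low, ci_high = mean_ci95(predicted)
    reference = mean(observed)
    zone = start
    while zone <= limit:
        lower = reference * (1 - zone / 100)
        upper = reference * (1 + zone / 100)
        if lower <= ci_low and ci_high <= upper:
            return zone
        zone += step
    return None


# Toy data (minutes per day, illustrative only).
swa = [60, 62, 58, 61, 59]   # mean 60
yap = [66, 70, 68, 72, 69]   # mean 69, CI roughly 67.0 to 71.0

zone = smallest_equivalence_zone(yap, swa)   # 20 for this toy data
```

With these made-up numbers the CI falls outside the 10% zone (54 to 66) and the 15% zone (51 to 69) but inside the 20% zone (48 to 72), so equivalence is first reached at 20%.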
Potential surveillance utility of the English YAP algorithms (Aim 3)
The 60 min/day MVPA recommendation was achieved by 81% of the participants according to the SWA. YAP-predicted estimates of daily MVPA indicated that the recommendation was met by 85.8% of participants. Agreement was 80.7% and the kappa value was 0.31 (fair agreement). Sensitivity and specificity were 91% and 37%, respectively. The school day 30 min/day MVPA recommendation was achieved by 77.6% and 79.2% of participants, according to SWA and YAP-predicted estimates, respectively. Percent agreement and kappa values were 82.2% and 0.47, respectively (moderate agreement). Sensitivity and specificity were 89% and 57%, respectively. The descriptive results are shown in Figure 10.
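For readers less familiar with these classification statistics, the sketch below shows how they are conventionally derived from paired yes/no classifications, treating the device-based measure as the criterion. The function name and data are made up and do not reproduce the study's results.

```python
def classification_summary(criterion, predicted):
    """Percent agreement, Cohen's kappa, sensitivity, and specificity.

    criterion, predicted: equal-length sequences of booleans
    (True = guideline met). Assumes both classes occur in the criterion
    and that chance agreement is below 1.
    """
    pairs = list(zip(criterion, predicted))
    n = len(pairs)
    tp = sum(1 for c, p in pairs if c and p)
    tn = sum(1 for c, p in pairs if not c and not p)
    fp = sum(1 for c, p in pairs if not c and p)
    fn = sum(1 for c, p in pairs if c and not p)

    agreement = (tp + tn) / n
    # Agreement expected by chance, from the marginal totals.
    expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / n**2
    kappa = (agreement - expected) / (1 - expected)

    return {
        "percent_agreement": 100 * agreement,
        "kappa": kappa,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Toy data: ten students (illustrative only).
swa_meets = [True] * 7 + [False] * 3
yap_meets = [True] * 6 + [False] + [True] + [False] * 2

summary = classification_summary(swa_meets, yap_meets)
```

Sensitivity is the share of students who met the guideline according to the criterion and were also classified as meeting it by the questionnaire. Specificity is the equivalent share for students who did not meet it. High sensitivity with low specificity, as reported above, means the questionnaire rarely misses active students but often misclassifies less active ones as meeting the guideline.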
We aimed to examine the predictive accuracy of the US YAP algorithms for MVPA and SB with a sample of English youth, and to calibrate and test the validity and predictive utility of new English YAP algorithms. We found that the US YAP algorithms poorly predicted SWA estimates of MVPA and SB in English youth. Group-level predictions of in-school, out-of-school, and weekend MVPA, and out-of-school SB from the English YAP algorithms were promising, and the YAP demonstrated potential as a surveillance tool to identify the prevalence of compliance with youth PA guidelines. The calibrated YAP estimates of MVPA and SB have great potential utility for future research and PA promotion, as no other calibrated self-report instruments are currently available for English youth.
Strengths of the study included (1) use of a proven, rigorous YAP protocol and methodology; (2) use of manageable group sizes for data collection which allowed use of recall probes to enhance the students’ recall accuracy; (3) recording detailed timetable information from each school to accurately determine each student’s schedule during the week when they wore the SWA, so as to enhance the degree of temporal precision required for the calibration analyses; (4) use of an independent sample for the cross-validation analyses, and (5) the choice of the SWA as the device-based measure, which has previously demonstrated superior agreement with criterion measures of free-living energy expenditure than other research-grade and consumer activity monitors.
There are also limitations which should be considered. Schools were not selected at random, so a degree of sampling bias in favour of more active students may have been present. Data were collected in the spring and summer months, which may have contributed to the relatively high estimates of MVPA. Therefore, the English YAP algorithms do not account for seasonal variation in the students’ PA and SB. The YAP content means that it can only be used to predict MVPA and SB during school-term time and not during vacation periods, and not all modes of MVPA and SB may be captured. However, schools in England are in session for around 39 weeks of the year so typical activity would be captured by the YAP. The YAP-predicted MVPA and SB estimates demonstrated good group-level agreement, but like values from all PA measurement tools, they cannot be considered exact values reflecting individual-level activity behaviours. Moreover, the calibration algorithms are based on MVPA and SB estimates from the SWA as the field-based criterion measure. There are inherent differences in how MVPA and SB are calculated by the SWA compared to accelerometer-only devices, and this limits comparability with data from other studies. Further, like all PA measurement instruments, the SWA is subject to measurement error which we could not control, and which may have attenuated the effects of the analyses. Incorporating measurement error modelling against a criterion measure has been shown to help reduce the effects of measurement error and improve the precision of PA estimates from self-report questionnaires. A true criterion measure of free-living PA and SB requires accurate ground-truth measurement (e.g., wearable cameras) to label activity behaviours (although unsupervised machine learning methods are now emerging, which may remove the need for criterion measures). However, these approaches are not yet feasible in large samples and therefore currently offer limited value for the calibration of self-report questionnaires. Lastly, the SWA uses a default 60-second epoch setting to record data, which may not have fully captured the intermittent bouts of higher intensity PA that are characteristic of school-aged youth. However, this monitor has been shown to provide accurate estimates of PA in this population.
Poor agreement was observed between the MVPA and SB estimates derived from the US YAP algorithms and those from the SWA worn by the English sample. YAP algorithms developed using the English sample data resulted in MVPA and SB estimates that had promising group-level agreement, with the lowest error observed for weekend MVPA and out-of-school SB. The YAP has potential as a surveillance tool to monitor compliance with youth PA guidelines, but more refinement is needed to improve its classification accuracy. The group-level YAP estimates of MVPA indicate that the YAP is a promising self-report questionnaire for use with English youth, and potentially with samples from other countries in the UK. The YAP is a cost-effective, easy-to-implement instrument that can be used at scale by researchers and practitioners to provide meaningful group-level estimates of MVPA and SB.
We would like to further refine the YAP algorithms with a more representative UK sample. To do this we would first explore the use of unsupervised machine learning to more accurately estimate MVPA and SB from the activity monitors. We would then conduct another YAP calibration and validation study employing replicate measurement error modelling procedures to enhance the precision of the calibration algorithms. These are challenging and time-consuming studies to do well, but we believe that the YAP instrument has much to offer as an easy-to-use and accessible self-report measure, particularly in England where calibrated and robustly validated youth PA surveillance measures do not exist.
The study is now published and is open access. Full details are available here: https://www.mdpi.com/1660-4601/16/19/3711