Are Physical Activity Levels in Childhood Associated with Future Mental Health Outcomes? Longitudinal Analysis Using Millennium Cohort Study Data

The rising prevalence of mental health conditions among children and young adults, accelerated by the COVID-19 pandemic, emphasises the urgent need to address this issue effectively. A potential avenue for early diagnosis lies in physical activity patterns, as individuals with mental health conditions often move less than the general population. This paper utilises Millennium Cohort Study data to investigate the relationship between childhood physical movement patterns and mental distress and wellbeing outcomes in late adolescence. By controlling for a range of factors of both cohort members and their parents, the study employs well-adjusted logistic and linear regressions to assess the hypothesis. Objective physical movement data is collected with accelerometers, while mental distress is measured using the Kessler K6 scale and mental wellbeing using the Warwick–Edinburgh Mental Wellbeing Scale. The findings of the study suggest no significant association between raw physical movement and mental distress; however, there is suggestive evidence of a weak positive association with mental wellbeing. In addition, the study found that lower exercise levels at age 7 were associated with an increased likelihood of mental distress at age 17, highlighting the potential impact of exercise habits on mental health in adolescence. Overall, these findings suggest that raw physical activity data may be a better predictor of specific mental health outcomes, such as those assessed by the Strengths and Difficulties Questionnaire in similar studies. The paper offers recommendations for future research, such as using self-reported questionnaires to contextualise quantitative physical movement data, and a more comprehensive analysis of the cognitive and mental implications.


Introduction
The prevalence of mental health conditions among children in England has shown a significant increase, with one in six children aged 6-16 now expected to have a mental health condition, compared to one in nine children in 2017 (NHS Digital, 2021). The impact of the coronavirus pandemic has negatively contributed to this trend, as more than 39 per cent of surveyed children experienced a deterioration in their mental health

Data description
The data used for the analysis is from the Millennium Cohort Study (MCS), conducted by the Centre for Longitudinal Studies (CLS). The MCS observes the lives of 18,818 children born in the UK over a 17-month period from September 2000 to January 2002, with an additional 701 children included in subsequent sweeps (Fitzsimons et al., 2020b: 8).
At the time of writing, there have been seven sweeps of the MCS, with information gathered on cohort members and their parents. For this analysis, data from MCS4 (age 7), MCS6 (age 14) and MCS7 (age 17) are utilised. MCS4 was conducted in mid-childhood, focusing on schooling, health, childcare, education, and social and family circumstances (CLS, 2020: 4). MCS6 delved into more sensitive topics, including risky behaviours like alcohol, smoking and drug use, as well as antisocial activities, contact with law enforcement, puberty, romantic relationships and sexual behaviour (Fitzsimons et al., 2020a: 5). MCS7 was conducted at a pivotal time in teenagers' lives, when choices regarding schooling, further education, training, work and living at home were made (Fitzsimons et al., 2020b: 9). Additionally, an emphasis was put on mental health as well as social, emotional and cognitive development using the Kessler K6 scale and WEMWBS measures.
Reinvention: an International Journal of Undergraduate Research 16:2 (2023)
Throughout the course of the MCS, the proportion of productive cases (those contributing valuable and relevant information at each sweep) has decreased from 96.4 per cent to 55.2 per cent (Fitzsimons et al., 2020b: 78). Non-random attrition is a common occurrence in birth cohort studies due to factors such as refusal, relocation and death. To ensure that the data remains representative, the survey weights provided in the dataset are used. In this study, only the first child of the family is included, meaning 150 cases of twin siblings are excluded due to shared genetic and environmental factors that could potentially bias the analysis and violate the assumption of independence. As a result, the final working sample consists of 10,614 cohort members.

Methodology

Variables
This study utilised available data from the MCS and conducted a comprehensive review of relevant peer-reviewed literature (Aggio et al., 2016; Ahn et al., 2017; Bell et al., 2019; Ohrnberger et al., 2017) to select confounding, dependent and independent variables. To ensure the integrity of the working sample, certain confounding variables were excluded due to a significant number of missing observations. The wellbeing score was derived from a shortened version of the Rosenberg Self-Esteem Scale (Fitzsimons et al., 2020a: 36), allowing for an assessment of participants' overall psychological wellbeing. The cognitive score was obtained by combining data from two assessments: the Cambridge Gambling Task, which evaluates decision-making and risk-taking behaviour, and the Word Activity, which measures respondents' understanding of word meanings (Fitzsimons et al., 2020a: 36). Body Mass Index (BMI) was calculated using height and weight measurements taken by the interviewer using a Leicester height measure and Tanita scales, respectively. Body fat percentage was measured by sending a weak electrical current through the body during the weight measurement (Fitzsimons et al., 2020b: 66).
To capture valuable insights into parental influence, data about parents' physical activity levels, self-assessment of general health, education level (evaluated on a standardised NVQ scale), income level and continuous employment status since the previous sweep were collected. In addition, reading activities involving the child were included as an indicator of parental attention and active involvement in the child's cognitive development. For a comprehensive overview of the confounding variables and their respective sources, please refer to Table 1.
The analysis was conducted using the R programming language, with an initial focus on inspecting correlations to detect potential multicollinearity issues and identify interesting patterns for further exploration. Initially, data from both parents were included in the analysis; however, high correlations among variables such as income, education levels and health necessitated the inclusion of only one parent's data. The selected parent (indicated as the first parent in the survey) represented both parents in most cases, effectively addressing the issue of multicollinearity. Subsequently, no significant multicollinearity issues were observed after excluding variables related to the second parent from the analysis.
The primary dependent variable of interest was the Kessler K6 score, measured for cohort members in MCS7. The K6 scale is a six-question scale estimating the prevalence of psychological distress from the individual's experience in the last 30 days, focusing on severity rather than a specific diagnosis (Kessler et al., 2010: 6). The total scores range from 0 to 24: a K6 score of 0-12 indicates low to moderate distress, while 13-24 is considered high risk of psychological distress (Kessler et al., 2010: 7). The graph in Figure 1 revealed a semi-normal distribution with a right skew, meaning more respondents fell on the lower end of the psychological distress spectrum. Those in the higher risk group, with scores of 13 and above, were represented by the dark green portion of the graph. For binary logistic regression analysis, the K6 scale was dichotomised using the cut-off point: study participants below the 13-point threshold were recoded as 0, while those with scores of 13 and above were recoded as 1.
The second dependent variable of interest in this study was general wellbeing, measured by the WEMWBS. The WEMWBS version used in MCS7 is a seven-item scale that is widely used to assess mental wellbeing, both at the population level and in targeted evaluations of specific groups (Tennant et al., 2007: 2). Each positively worded item can be scored from 1 (lowest) to 5 (highest); thus, for the seven-item scale, the total wellbeing score for each participant ranges from 7 to 35.
The independent variable of interest was the activity level during the MCS6 data collection. All cohort members in Scotland, Wales and Northern Ireland, as well as 81 per cent of respondents in England, wore wrist accelerometers for two randomly selected 24-hour periods in the ten days following the questionnaire visit. While a total of 10,337 cohort members were eligible for activity monitoring during MCS6, only 4159 individuals returned accelerometers with valid data for both days, and an additional 645 individuals provided data for one day (Fitzsimons et al., 2020a: 42). The distribution of mean activity levels over the two days exhibited a bell-shaped curve with a long tail on the right side (Figure 3). This indicates that only a small number of respondents were highly active, while the majority fell within the range of 0 to 75 mg in terms of average activity levels.
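The scoring rules for the two outcome measures described above can be sketched in code. The original analysis was run in R; the following is only an illustrative Python sketch. The function names are hypothetical, and the per-item coding of 0 to 4 for the K6 is an assumption chosen to be consistent with the stated total range of 0 to 24.

```python
from typing import Optional, Sequence

K6_CUTOFF = 13  # totals of 13-24 are treated as high risk in the text


def k6_total(items: Sequence[int]) -> int:
    """Sum six K6 item scores, each assumed to be coded 0-4 (total 0-24)."""
    if len(items) != 6 or any(not 0 <= i <= 4 for i in items):
        raise ValueError("K6 expects six items coded 0-4")
    return sum(items)


def k6_high_distress(total: Optional[int]) -> Optional[int]:
    """Dichotomise a K6 total: 0 for 0-12, 1 for 13-24, None if missing."""
    if total is None:
        return None  # keep missing values missing rather than coding them 0
    if not 0 <= total <= 24:
        raise ValueError("K6 total must lie between 0 and 24")
    return 1 if total >= K6_CUTOFF else 0


def wemwbs_total(items: Sequence[int]) -> int:
    """Sum seven WEMWBS item scores, each coded 1-5 (total 7-35)."""
    if len(items) != 7 or any(not 1 <= i <= 5 for i in items):
        raise ValueError("WEMWBS short form expects seven items coded 1-5")
    return sum(items)
```

Passing missing totals through as `None` mirrors the recoding of missing data to 'NA' rather than silently assigning such cases to the low-distress group.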

Assumption checking
Binary logistic regression was the selected statistical method to examine the relationship between mean activity levels and mental distress, as measured by the Kessler K6 scale, while controlling for other confounding variables. This choice was made based on the slight violation of linearity assumptions observed in diagnostic plots, as well as the skewness evident in Figure 1. The residuals plotted against the fitted values and the normality of the residuals' distribution deviated from the dashed line; however, the scale-location plot (depicting the distribution of residuals across the predictors) and the influential cases plot in Figure 4 indicate a good model fit.
To investigate the association between the wellbeing score from MCS7 and mean activity levels, a linear regression analysis with standardised variables was performed. This choice was based on the distribution of the wellbeing score (Figure 2), which is closer to normal, and the absence of any severe violations of linearity, as indicated by Figure 5.
Diagnostic plots were utilised to ensure the validity of interpreting the outcomes within the assumptions of logistic and linear regressions. Prior to the analysis, the datasets containing the variables of interest were merged based on the MCS identification number. All variables were standardised, and any missing data was recoded into 'NA' to avoid result distortion. In both models, weights, clustering and stratification data from MCS7 were incorporated to address representativeness and non-response concerns, given that the sample is limited to participants in the final sweep.
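As an illustration of the standardisation step, the sketch below computes z-scores while leaving missing values untouched. It is a Python sketch, not the R code used in the study; the function name is hypothetical, `None` stands in for 'NA', and the use of the sample standard deviation (n - 1 in the denominator) is an assumption.

```python
import math
from typing import List, Optional


def standardise(values: List[Optional[float]]) -> List[Optional[float]]:
    """Return z-scores; None marks a missing value and is passed through."""
    observed = [v for v in values if v is not None]
    if len(observed) < 2:
        raise ValueError("need at least two observed values")
    mean = sum(observed) / len(observed)
    # Sample variance (n - 1 denominator), computed on observed values only.
    variance = sum((v - mean) ** 2 for v in observed) / (len(observed) - 1)
    sd = math.sqrt(variance)
    if sd == 0:
        raise ValueError("cannot standardise a constant variable")
    return [None if v is None else (v - mean) / sd for v in values]
```

For example, `standardise([1.0, None, 2.0, 3.0])` gives `[-1.0, None, 0.0, 1.0]`: the missing entry stays missing and does not affect the mean or the standard deviation.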

Results
Model 1: Logistic regression for examining mental distress
Running the base model with the outcome variable (Kessler K6 score) and the single predictor (mean activity) showed that a unit increase in mean activity decreases the odds of mental distress by a factor of 0.77 (p < 0.001), indicating a statistically significant association, shown in Table 2 below.
After incorporating the confounding variables into the model, the mean activity levels lost their statistical significance and showed an even higher likelihood of mental distress, as indicated in Table 3. Due to missing data, only 2543 observations remained for the analysis. However, McFadden's R² (a pseudo-R² conceptually similar to the R² used in OLS (Smith and McKenna, 2013: 18)) showed great predictive power of the model; that is, the model explained around 79 per cent (adjusted) to 88 per cent (unadjusted) of the variation in the Kessler K6 score. Surprisingly, most of the variables, including mean activity levels, did not significantly predict mental distress during adolescence, with a p-value over 0.05.
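To make the two quantities reported above more concrete, the following Python sketch shows how an odds ratio translates into a percentage change in odds and how McFadden's pseudo-R² is computed from log-likelihoods. The function names are hypothetical, the adjusted form shown is the commonly used penalty of one unit of log-likelihood per parameter, and whether the study's software used exactly this form is an assumption.

```python
def odds_ratio_to_percent_change(odds_ratio: float) -> float:
    """Percentage change in the odds for a one-unit increase in the predictor."""
    return (odds_ratio - 1.0) * 100.0


def apply_odds_ratio(probability: float, odds_ratio: float) -> float:
    """Probability obtained after multiplying the odds p / (1 - p) by odds_ratio."""
    odds = probability / (1.0 - probability) * odds_ratio
    return odds / (1.0 + odds)


def mcfadden_r2(ll_model: float, ll_null: float) -> float:
    """McFadden's pseudo-R²: 1 - LL(model) / LL(intercept-only model)."""
    return 1.0 - ll_model / ll_null


def mcfadden_adjusted_r2(ll_model: float, ll_null: float, n_params: int) -> float:
    """Adjusted variant: subtracts the number of parameters from LL(model)."""
    return 1.0 - (ll_model - n_params) / ll_null
```

Under this reading, an odds ratio of 0.77 corresponds to roughly a 23 per cent reduction in the odds of mental distress per unit of mean activity, which is not the same as a 23 per cent reduction in probability.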
Table 4 provides insights into the marginal effects of the predictors, although only two predictors demonstrated statistical significance (wellbeing score from MCS6 and exercise habits from MCS4). The marginal effect of mean activity indicated that each additional unit (mg) increase resulted in a 0.9 per cent decrease in the likelihood of mental distress in MCS7. However, the marginal effects of the statistically significant variables revealed that a one-unit increase in the wellbeing score from MCS6 increased the likelihood of distress by 7.9 per cent. Additionally, an increase in the MCS4 exercise habits score, reflecting less exercise (from 1, exercising five or more days a week, to 7, not exercising at all), raised the chance of mental distress by 1.8 per cent.
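The text does not state which definition of marginal effect was used. As a hedged illustration, the Python sketch below computes the average marginal effect of one predictor in a logistic model, where the effect for each observation is the coefficient multiplied by p(1 - p). The function names and inputs are hypothetical, and survey weights are ignored for simplicity.

```python
import math
from typing import Sequence


def logistic(z: float) -> float:
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def average_marginal_effect(beta_j: float, linear_predictors: Sequence[float]) -> float:
    """Mean over observations of beta_j * p * (1 - p), with p = logistic(eta)."""
    if not linear_predictors:
        raise ValueError("need at least one observation")
    effects = []
    for eta in linear_predictors:
        p = logistic(eta)
        effects.append(beta_j * p * (1.0 - p))
    return sum(effects) / len(effects)
```

Because p(1 - p) is at most 0.25, the marginal effect on the probability scale is always smaller in magnitude than a quarter of the log-odds coefficient, which is one reason marginal effects are reported alongside odds ratios.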

Model 2: Linear regression for examining mental wellbeing
The base model with the outcome as MCS7 wellbeing score and the predictor as mean activity levels showed a small but positive association (0.08, p < 0.01) with the wellbeing score; that is, more activity should lead to better wellbeing outcomes, as seen in Table 5.
When all confounders were included in the model, the impact of mean activity levels diminished and lost statistical significance, as shown in Table 6. Unfortunately, a substantial number of observations were excluded due to missing data, resulting in a final working sample of 2140 data points. However, R² suggested that the model explained 12 per cent (adjusted) to 26 per cent (unadjusted) of the variation, indicating a significant increase in predictive power as compared to the base linear model without confounders (Table 5).
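The gap between adjusted and unadjusted R² reflects a penalty for the number of predictors. A minimal Python sketch of the textbook ordinary least squares formula follows; the function name is hypothetical, and survey-weighted software may compute the adjustment differently, so this is an assumption rather than a reproduction of the study's calculation.

```python
def adjusted_r2(r2: float, n_obs: int, n_predictors: int) -> float:
    """Adjusted R² = 1 - (1 - R²) * (n - 1) / (n - p - 1) for OLS with an intercept."""
    if n_obs - n_predictors - 1 <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)
```

With few observations relative to the number of estimated coefficients (for instance, when categorical confounders expand into many indicator variables), the adjusted value can fall well below the unadjusted one.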
By including the measurement of mental wellbeing from the previous sweep (MCS6), the linear regression model transformed into a lagged outcome model. Removing the wellbeing score from MCS6 resulted in a significant drop in explanatory power, with only 5 per cent of the variation explained (adjusted R²). Thus, controlling for the wellbeing level three years prior was crucial, as it heavily influenced the wellbeing level at MCS7, accounting for 7 per cent of the variation. Unfortunately, a lagged outcome model was not feasible when examining the relationship between mental distress and activity levels, as the Kessler K6 scale was introduced in MCS7, while a different scale (SDQ) was used in previous sweeps.
Similar to the logistic regression results, not all explanatory variables demonstrated statistical significance. Although several confounding variables had small individual effects, they collectively contributed to lower wellbeing scores. Interestingly, mean activity levels from MCS6, which were the main focus as an independent variable, showed a positive association but did not yield a significant effect on the mental wellbeing score in MCS7.

Discussion
In the regression analyses, mean activity levels at age 14 (MCS6) initially showed a significant relationship with mental distress and wellbeing at age 17. However, when confounding variables were added, the significance diminished, indicating the occurrence of Simpson's paradox (Carlson, 2019). The associations between mean activity levels and mental distress, as well as mental wellbeing in adolescence, were found to be statistically insignificant. These findings suggest that childhood physical activity levels may not be strongly associated with mental health outcomes in adolescence.
Interestingly, the wellbeing score from MCS6 emerged as a significant predictor in both models. Surprisingly, a higher wellbeing score in MCS6 was associated with a higher likelihood of mental distress and lower wellbeing in MCS7. This unexpected result could be due to the timing of data collection during a critical period in the cohort members' lives or the potential unnecessary adjustment of variables (Schisterman et al., 2009: 493). Overall, the well-adjusted models provided more reliable insights and showed a good fit, indicating that the initial relationships observed in the base models were influenced by confounding factors.
Previous studies examining the relationship between physical activity and mental health outcomes have yielded mixed results. One study using over 6000 MCS participants' data found that sedentary time at age 7 was associated with peer problems at age 11, as measured by the SDQ (Ahn et al., 2017: 95). The present study, however, evaluated mental health outcomes at age 17 using the Kessler K6 scale for mental distress and the WEMWBS, and found no strong evidence of an association with physical activity. The SDQ, with its multiple categories, provided a more detailed analysis of specific aspects of mental health. Another study investigating physical activity levels at ages 12-13 and mental health outcomes at ages 15-16 also used linear regression and included both the SDQ and WEMWBS (Bell et al., 2019: 3). That study did not find strong evidence supporting an association between physical activity and better mental wellbeing or reduced symptoms of mental conditions in adolescents. However, a positive association was observed between physical activity and the emotional problems subscale of the SDQ, indicating that physical activity may potentially reduce symptoms of depression and anxiety in adolescents.
Other studies using different methodologies, such as self-reported questionnaires for physical activity measurement, have also yielded mixed results. Some found that physical activity (or lack of it) influences some, but not all, mental health outcomes (Ashdown-Franks et al., 2017: 22; Suetani et al., 2017: 119), some found no longitudinal association between physical activity and future mental health status (Birkeland et al., 2009: 30; Brunet et al., 2013: 28; Toseeb et al., 2014: 1097), while others reported a positive outcome of structured physical activities on mental health (Jewett et al., 2014: 642).
Overall, these findings suggest that raw physical activity data may be a better predictor of specific mental health outcomes, such as those assessed by the SDQ, but less so of general mental distress or wellbeing. The study's strengths and limitations are discussed in the following sections, along with recommendations for future research.

Strengths and limitations
A notable strength of the study is the use of objective physical activity measurement. Accelerometers prevent the underestimation or overestimation that can occur with self-reported questionnaires; however, they lack the context of movement, which may be an important factor when considering the relationship between mental health and physical activity. Types of movement are not equal: structured physical activity, such as team sports, gardening or games involving social interaction, would be a contextual predictor of mental health outcomes. In addition, participation in structured sports activities leads to greater development of cognitive function, as such activities tend to be interactive, strategic and goal-oriented (Aggio et al., 2016: 1080), and they alleviate depressive symptoms (Brunet et al., 2013: 28; Jewett et al., 2014: 642). To research purposeful, structured activity, accelerometer data may be combined with self-reported questionnaires to distinguish types of activity.
This study uses the newest published MCS data at the time of writing, incorporating three sweeps in total (MCS4 at age 7, MCS6 at age 14 and MCS7 at age 17) and having over 10,000 cases in the working sample.
One limitation is that the MCS is representative of the particular cohort included in the study, not the whole population. In addition, even with a substantial sub-sample, data loss affected the scope of the study. Selected explanatory variables with a majority of missing values were excluded from the analysis, as they were driving the working sample down. Still, four-fifths of the observations from the dataset were dropped in the regression analyses. Although multiple imputation would help with this issue, there is a threat of distorting the data, even when controlling for weighting, clustering and stratification. Perhaps the imputation could be carried out by groups rather than for all subjects simultaneously (Bell and Fairclough, 2014: 451) to minimise the impact of distortion.
The age of the cohort members at the time of the study should be taken into consideration too. The Kessler K6 scale is often used with adults, as it entails assessing certain situations that may be as yet unfamiliar to children. The scale was first used in MCS7 with cohort members whose ages ranged from 16.1 to 18.3 years. This life period marks a pivotal transition from childhood to adulthood: the end of compulsory education, moving away from home, and the end of access to Child and Adolescent Mental Health Services (CAMHS). These factors can contribute to widening economic, social and health inequalities (Patalay and Fitzsimons, 2021: 3). In addition, developmental patterns differ between individuals and between sexes (Miller et al., 1995: 29); therefore, using the Kessler K6 scale already puts many of the adolescents in adults' shoes.

Conclusion
This study aimed to investigate the relationship between physical activity, mental distress and wellbeing. Objective physical activity data from accelerometers were used, along with well-established measures of mental distress (Kessler K6 scale) and mental wellbeing (WEMWBS). The findings suggest that while raw physical activity data from accelerometers may be a better predictor of specific mental health outcomes, such as those assessed by the SDQ, it may have less predictive power for general mental distress or wellbeing.
With the COVID-19 pandemic driving remote data collection (Savage et al., 2020) and the increasing use of health technology in precision and preventive medicine (Phillips et al., 2019), there is growing interest in the use of accelerometers for measuring and evaluating the health impacts of physical activity. Accelerometers have the potential for both prevention and behaviour change interventions. Therefore, it is recommended to further explore their use for activity measurement. However, it is important to combine accelerometer data with self-reported questionnaires in future studies. This combination would help distinguish between raw movement and purposeful, structured physical activity, such as team sports, which may have additional cognitive, emotional and social benefits.
In addition, numerous studies have already demonstrated the benefits of physical activity on mental and physical health across different age groups and genders. Physical activity is actively promoted in schools and workplaces to enhance wellbeing and prevent weight-related health conditions. One interesting finding in this study was that lower exercise levels at age 7 were associated with an increased likelihood of mental distress at age 17, highlighting the potential impact of exercise habits established in early childhood on health outcomes in adolescence.

[All graphs are produced by the author]
Figure 2 displays the distribution of WEMWBS scores, which closely approximates a normal distribution. The median score of 22.35 is very similar to the mean score of 22.49, indicating a relatively balanced distribution of responses across the scale.

Figure 4: Diagnostic plots testing linear regression assumptions for mental distress association.
Figure 5: Diagnostic plots testing linear regression assumptions for wellbeing score association.


Table 1: Confounding variables with an indication of measurement.
Table 2: Logistic regression (no confounders) results in odds ratio estimates.
Table 3: Logistic regression results in odds ratio estimates.
Table 4: Marginal effects of explanatory variables.
Table 5: Linear regression (no confounders) results in standardised estimates.
Table 6: Linear regression results in standardised estimates.