Correlation Between Age and the Exercise Habits of Individuals by Building a Linear Regression Model



Exercise has been a part of my life since I was young. When I was 8, I was selected to join the development team, and afterwards the school team, for gymnastics, so regular physical exercise was something I was engaged in from a very early age. However, after leaving primary school I did not join a sport, and my exercise habits declined. At the same time, the amount of food I ate did not change after I stopped training. I did not bother keeping up my physical fitness until I started to dislike my physique towards the end of my secondary school years. After entering college, I became even more conscious of my body. Exercising therefore became important to me because I had the goal of getting fitter, and I have exercised regularly ever since.

Under the influence of my 29-year-old cousin, who goes to exercise classes 6 times a week, I was motivated to do the same. My cousin was overweight during her teenage years. However, after experiencing an eating disorder, she changed her mindset and decided to exercise. I noticed that the teenagers and young adults I know tend to exercise significantly more than the other age groups. My inspiration for this investigation arose from observing the exercise habits of others around me. It led me to ask: does getting older affect people's exercise habits? Additionally, how could I use statistics to answer this question? To do so, I had to collate a pool of data that could provide me with the average exercise frequency of people from all age groups.

Browsing through other research papers, I discovered that studies have shown that as people age, their exercise levels drop. A new federal report estimates that more than one-quarter of Americans over 50 do not exercise, increasing their risk of heart disease, diabetes and cancer. Additionally, another study suggests that the transition from adolescence to adulthood is, on average, a period of decline in physical activity, but with the decline levelling off into adulthood. Although this only represents the general finding that people tend to exercise less as they get older, I beg to differ based on my personal experiences and what I have observed from the exercise habits of people around me.


The rationale of this report is to determine whether there is a correlation between the age and exercise habits of an individual. When there is a correlation between the variables, it indicates that as age changes, the frequency of exercise changes as well. Conversely, when there is no correlation, a change in age does not imply a change in exercise habits. This investigation tests the assumption that age indeed has an impact on exercise habits. I chose age and exercise habits as the variables to test because I believe that as people age, they tend to care more about image and health than the young do, owing to their greater vulnerability to falling sick. With this analysis, I hope to share more about exercise and its factors with my family and friends.


The aim of this report is to find whether there is a correlation between age and the exercise habits of individuals by building a linear regression model of these two variables from primary data.

Correlation and Regression:

Data that pair observations of two variables are termed bivariate. Each set of data represents one variable. The outcome of one variable may depend on the other; alternatively, the variables may be independent of each other.

If the points on the scatter diagram lie near a straight line, then there is said to be a linear correlation between variables X and Y. This may be illustrated by a straight line Y = mX + c, whereby the direction of the relationship between X and Y is determined by the gradient, m. If the value of m is positive, then X and Y are positively related. However, if m is negative, they are negatively related.

Although the gradient plays an essential role in determining the type of correlation, Pearson’s Correlation Coefficient, r, evaluates the strength of the linear relationship between variables X and Y. Pearson’s Correlation Coefficient can be calculated as:



Data was collected in the form of a survey. Since I wanted to find out whether exercise habits had anything to do with one's age, I included people of all ages, ranging from 13 to 76, in the survey. Additionally, the respondents were asked some demographic questions regarding their height and weight so that I could better understand their backgrounds. The respondents were drawn from family, friends and teachers.

A sample of 90 respondents participated in the collection of data. Although the survey was disseminated to many people, I acknowledge that my analysis is limited by an overwhelming response from one particular age group compared to the rest.

Fig 2. Summary of Ages of 90 Respondents

There was no targeted age group because the more data used, the more accurate the result will be in measuring whether there is a relationship between age and exercise habits.

Fig 3. Summary of responses for the number of times one exercises over a week

In order to determine the number of minutes one exercises per week, I first asked for the number of times respondents exercised in a week and then the duration of each exercise session. I split this into two questions, instead of directly asking for the total time spent exercising in a week, because I wanted to find out exactly how long one exercises. Asking only for the weekly total may lead to biased or inaccurate data, since people tend to overgeneralise their exercise duration.
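The two survey answers can be combined into a single weekly figure by multiplying them. The sketch below illustrates this under stated assumptions: the `Response` class, its field names, and the three sample rows are hypothetical and are not taken from the survey data.

```python
from dataclasses import dataclass


@dataclass
class Response:
    """One survey answer (hypothetical field names)."""
    age: int
    sessions_per_week: int
    minutes_per_session: int

    @property
    def weekly_minutes(self) -> int:
        # Weekly duration = number of sessions x length of each session.
        return self.sessions_per_week * self.minutes_per_session


# Made-up rows for illustration only, not real respondents.
responses = [
    Response(age=17, sessions_per_week=3, minutes_per_session=45),
    Response(age=29, sessions_per_week=6, minutes_per_session=60),
    Response(age=52, sessions_per_week=2, minutes_per_session=30),
]

weekly = [r.weekly_minutes for r in responses]
print(weekly)
```

Each value in `weekly` would then serve as the Y value paired with that respondent's age.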

Afterwards, over the months of November and December, a TI-Nspire CX was used to fit a linear regression line to the data and analyse the model. Pearson's product moment correlation coefficient was employed to check the strength of the relationship between these two variables. The data for the variables may be found in the appendix.

Fig 4. Scatter Plot with linear regression line to determine the relationship between age and frequency of exercise.

Let X represent the age of a respondent and Y the time (in minutes) spent exercising per week.

Fig 4 suggests a negative linear correlation: the regression line slopes downward, with a negative gradient (-3.58623). This suggests that as one gets older, his or her time spent exercising declines. Therefore, it may be concluded that there is some relation between one's age and time spent exercising.
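For readers without a GDC, the least-squares line that a calculator fits can be sketched in a few lines of Python. This is only an illustration: the function name `fit_line` and the five data points are made up, so the resulting gradient is not the one from the survey.

```python
def fit_line(xs, ys):
    """Return (gradient, intercept) of the least-squares line y = m*x + c."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and sum of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    gradient = sxy / sxx
    intercept = mean_y - gradient * mean_x
    return gradient, intercept


# Made-up sample: ages (years) and weekly exercise time (minutes).
ages = [15, 20, 30, 45, 60]
minutes = [300, 250, 260, 150, 100]

gradient, intercept = fit_line(ages, minutes)
print(f"y = {gradient:.3f}x + {intercept:.3f}")
```

A negative `gradient` corresponds to a downward-sloping line, which is the pattern described for the scatter plot.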

With this relation established, it is important to measure the degree of the relation. I used the formula below to calculate Pearson's product moment correlation coefficient to determine the strength of the linear relationship between the two variables, age and minutes spent exercising in a week.

$$r = \frac{\sum xy - \dfrac{\sum x \sum y}{n}}{\sqrt{\left(\sum x^2 - \dfrac{\left(\sum x\right)^2}{n}\right)\left(\sum y^2 - \dfrac{\left(\sum y\right)^2}{n}\right)}}$$

X represents the age of an individual whilst Y represents the total duration spent exercising in a week. n represents the number of ordered pairs in the sample, which in this case would be 90 individuals.
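The same sums-based formula can be evaluated in code as a cross-check on a manual calculation. The following is a minimal sketch; the function name `pearson_r` and the sample data are assumptions made for illustration and do not reproduce the 90 survey pairs.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r from the sums of x, y, xy, x^2 and y^2."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = sum_xy - sum_x * sum_y / n
    denominator = math.sqrt(
        (sum_x2 - sum_x ** 2 / n) * (sum_y2 - sum_y ** 2 / n)
    )
    return numerator / denominator


# Made-up sample: ages (years) and weekly exercise time (minutes).
ages = [15, 20, 30, 45, 60]
minutes = [300, 250, 260, 150, 100]

r = pearson_r(ages, minutes)
print(round(r, 4))
```

Points that lie exactly on a rising line give r = 1, and points on a falling line give r = -1, which is a quick way to sanity-check the function.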


Pearson's correlation coefficient is a number within the range $-1 \le r \le 1$. The absolute value of the coefficient measures how closely the variables are related. The closer it is to 1, the closer the relationship. A correlation over 0.8 indicates a strong correlation between the variables.

Figure 5: Various interpretations of r values

A value of exactly ±1 indicates a perfect linear relation. As r approaches zero, the correlation gets weaker, signifying a weaker relationship between the two variables. Figure 5 indicates the strength of the different correlations.
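The interpretation of r can be written as a small lookup. This is a sketch only: the 0.8 cut-off for "strong" comes from the text above, but the 0.5 boundary between "weak" and "moderate" and the function name `describe_correlation` are my own assumptions, since different textbooks draw these lines differently.

```python
def describe_correlation(r: float) -> str:
    """Describe the direction and strength of a correlation coefficient."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    if r == 0:
        return "no linear correlation"
    direction = "positive" if r > 0 else "negative"
    size = abs(r)
    if size == 1:
        strength = "perfect"
    elif size > 0.8:      # cut-off stated in the text
        strength = "strong"
    elif size > 0.5:      # assumed boundary, not from the text
        strength = "moderate"
    else:
        strength = "weak"
    return f"{strength} {direction} linear correlation"


print(describe_correlation(-0.31387))
```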

A line of best fit may be drawn to make interpolative estimations. Such an estimate is reliable when it lies within the data range and the product moment correlation coefficient is high. In this instance, the regression line may help to predict how the minutes of exercise change as the age of an individual increases.

The respective x and y values from Table 1 were manually calculated using the GDC to give the values seen in Table 2. After calculating, these values were substituted into the equation.

The resulting r value of -0.31387 is negative, which suggests that there is a negative linear correlation.

Screenshots from TI-Nspire CX

From both the manual calculation and the use of the GDC, the value of r lies in the range $-0.50 \le r \le 0$. This r value suggests that there is a weak negative correlation between one's age and their exercise habits, meaning that the linear relationship is weak and being older doesn't necessarily mean one's exercise frequency declines. The regression line is only a line of the average values and does not target any individual result.

Pearson's product moment correlation coefficient, r, tells me that there is a weak relationship between the two variables. However, in order to better understand the relationship between the exercise habits of individuals and their age, it was important for me to analyse whether this r value is reliable and sufficient to support the claim that the relationship between the two variables is weak. As a result, bivariate hypothesis testing will be used to test the correlation between the two variables.

Before employing the hypothesis test, the two possible hypotheses below may be established. Null Hypothesis (H0): The age and exercise habits of Singaporeans are independent of each other, where $\rho = 0$, implying that there is no correlation between the variables. Alternative Hypothesis (H1): As one gets older, the frequency of his or her exercise changes, where $\rho > 0$ or $\rho < 0$.

The hypothesis test for the population correlation, $\rho$, will be used to determine the linear correlation between the two variables. Since it has been established above that there is a weak negative correlation between age and exercise frequency, the alternative hypothesis $\rho < 0$ is the one tested.

To conduct the hypothesis testing, a benchmark for testing needs to be established first. This is done through the Test Statistic value, t*.

$$t^* = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}, \qquad t^* \sim t_{n-2}$$
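As a quick check on the arithmetic, the test statistic can be evaluated directly from r and n. The sketch below uses the values reported in this essay (r = -0.31387, n = 90); the function name `correlation_t_statistic` is only an illustrative choice.

```python
import math


def correlation_t_statistic(r: float, n: int) -> float:
    """Test statistic for a correlation coefficient, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


t_star = correlation_t_statistic(-0.31387, 90)
print(round(t_star, 4))
```

The result should come out at roughly -3.10, in line with the value obtained on the GDC.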

Inserting the r value calculated above, -0.31387, and n = 90 into the formula, a value of -3.1010 is obtained. With the test statistic from the sample data, I then proceeded to check where this value lands on the t distribution curve. Before that, the significance level, which is the probability of rejecting the null hypothesis when it is in fact true, has to be established. The most common value, $\alpha = 0.05$, will be used. To calculate the critical value, the GDC's inverse t function is employed with inputs (0.95, 88): the acceptance region covers 95% of the distribution curve, and the degrees of freedom (df) are 90 - 2 = 88.

Figure 6. T distribution diagram

The inverse t function gives a critical value of 1.66235 at the 5% level of significance. Because the correlation being tested is negative, the symmetry of the t distribution places the rejection region at values below -1.66235. The observed value of -3.1010 falls well below this, so the null hypothesis is rejected. Therefore, at the 5% significance level, there is evidence of a negative correlation between age and frequency of exercise, although the r value shows that it is weak. To check that these values are accurate and reliable, the linear regression t test is used to compare the values.
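The critical value and the decision can also be reproduced without a GDC. The sketch below is not the calculator's method: it approximates the inverse t function numerically (Simpson's rule for the area under the curve, then bisection), using only Python's standard library. It assumes a one-tailed test for a negative correlation, so the rejection region is the lower tail. The function names are hypothetical.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x: float, df: int, steps: int = 1000) -> float:
    """P(T <= x), using Simpson's rule on [0, |x|] and symmetry about 0."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


def t_inverse(p: float, df: int) -> float:
    """Value x such that P(T <= x) = p, found by bisection."""
    lo, hi = -50.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


critical = t_inverse(0.95, 88)   # same inputs as the GDC: (0.95, 88)
t_star = -3.1010                 # test statistic reported in the essay
reject_null = t_star < -critical # lower-tail rejection region (assumed one-tailed test)
print(round(critical, 4), reject_null)
```

The critical value should agree with the GDC's 1.66235 to about three decimal places, and `reject_null` being `True` corresponds to rejecting H0.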


Overall, there is evidence from the product moment correlation coefficient that the age of a person and his or her exercise habits do share a relationship. However, this does not imply causation. Even though there is a correlation, supported by the hypothesis test, the test only allows us to reject the null hypothesis. We are unable to confirm that age is the cause of the change in exercise habits. Moreover, I realised that I could calculate the decrease in duration with every increment in age. This may be done by analysing the linear regression line:


From the coefficient of x, I am able to conclude that for every one-year increase in age, there is an average decrease in exercise duration of 3.58623 minutes. I believe that the reliability of the results has been supported by conducting a hypothesis test and a linear regression t test. With regard to the figures recorded, it can be seen that 'a' and 'b' are identical to the values from the line of regression. Moreover, the manually calculated 'r' value is also the same as the GDC-calculated value.
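The meaning of the coefficient of x can be made concrete with a small calculation. The sketch below uses only the gradient quoted above (-3.58623 minutes per year); the constant and function names are illustrative, and the result describes the average trend along the regression line, not any individual respondent.

```python
GRADIENT_MIN_PER_YEAR = -3.58623  # coefficient of x from the regression line


def expected_change(years: float) -> float:
    """Average change in weekly exercise minutes over a given age gap."""
    return GRADIENT_MIN_PER_YEAR * years


change_over_decade = expected_change(10)
print(round(change_over_decade, 3))
```

For a ten-year age gap, the line predicts an average drop of about 35.9 minutes of exercise per week.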

However, I am aware of the limitations of this exploration. A larger and more evenly spread set of data could have made this investigation more accurate and reliable. Respondents who are 17 years old make up a huge proportion of the survey results. Therefore, it would have been better if I had been able to acquire a more balanced number of people from each age group to further support my research question.

To extend this research question, I could also analyse the correlation between type of occupation and exercise frequency. This would be an interesting idea since I could acquire a larger range of data and construct a linear regression line for each different X variable. I could then compare and interpret these 'r' values and draw a more substantial conclusion.

With this set of results, I am aware of the strengths and limitations of this exploration. I have also achieved the aim I set at the start of the exploration. I have established that there is indeed a correlation between one's age and his or her frequency of exercise. However, contrary to my personal observation, this correlation is negative and weak, meaning that there is a limited relationship between the two variables. With a better understanding, I am able to share this information with others and encourage them to exercise more regularly, especially those who use age as an excuse to reduce their exercise habits.



Cite this Page

Correlation Between Age and the Exercise Habits of Individuals by Building a Linear Regression Model. (2022, September 27). Edubirdie. Retrieved December 5, 2022, from