Prediction of body fat in adolescents: validity of the methods relative fat mass, body adiposity index and body fat index

This study aimed to verify the validity of the anthropometric methods body adiposity index (BAI), relative fat mass (RFM) and body fat index (BFI) for estimating body fat percentage (%BF) in adolescents. A cross-sectional study was carried out with 420 Brazilian adolescents aged 15–19 years, stratified by age (< 18 years, n = 356; ≥ 18 years, n = 64) and sex (boys, n = 216; girls, n = 204). The anthropometric measurements height, body weight, hip circumference and waist circumference were collected to calculate %BF by the BAI, RFM and BFI methods. Subsequently, %BF was measured by dual-energy X-ray absorptiometry (DXA), adopted as the reference method. In the statistical analysis, the Pearson correlation test and the paired t test were performed between the %BF obtained by the equations and by DXA. The method validation criteria were that 68% of individuals should be within an acceptable error range of ± 3.5% of BF and that Cohen's Kappa index should be ≥ 0.61. Additionally, the Bland–Altman graphical analysis was performed. All methods showed a high correlation with DXA. For the Kappa index, only the RFM reached the criterion in the total sample (0.67) and in the sample < 18 years (0.68). None of the methods reached the criterion of 68% of the sample within the error range of ± 3.5% of BF. According to the criteria adopted, the BAI, RFM and BFI equations were not valid for predicting BF in the studied sample, regardless of sex or age. Level V, cross-sectional descriptive study.


Introduction
Since 1975, the number of people with obesity has tripled worldwide, and in 2016 more than 340 million children and adolescents (5-19 years) had overweight or obesity [1], which are risk factors for the development or maintenance of obesity in adulthood and for the development of cardiovascular, musculoskeletal and metabolic diseases and some types of cancer [2][3][4][5].
Body composition assessment in adolescence is important to identify the need for early lifestyle interventions that reduce morbidity and mortality in the short and long term [6], in addition to reducing spending in the public health sector [7]. Thus, there is a need to develop fast, practical and low-cost methods with the potential to be used in clinical practice for screening, especially in young adolescents, allowing early detection of overweight and obesity [6].
Body composition can be assessed by different methods [8]. However, due to its low cost, easy application and the possibility of being used throughout the life cycle, anthropometry is widely used. Several anthropometric methods have been developed to estimate body fat (BF) in specific populations, such as the Body Adiposity Index (BAI) [9], the Relative Fat Mass (RFM) [10] and the Body Fat Index (BFI) [11], all developed using DXA as the reference method. BAI was developed in subjects aged 18-67 years and RFM in subjects aged 20-85 years, both in the United States of America. BFI was developed in Korean participants aged 18-35 years. All of these methods are simple and practical and performed better than the Body Mass Index (BMI) in estimating the body fat percentage (%BF). They also have other advantages: for example, BAI and RFM do not require a body weight measurement to predict %BF, which can be an advantage in remote places. However, these methods were developed in adults of specific localities and ethnicities; thus, the authors suggest validation in other age groups and ethnicities [9][10][11], considering that the amount and distribution of BF may differ depending on these factors [12].
In adults, a systematic review of 19 studies showed that BAI is not valid for estimating %BF [13]. A study with American adolescents aged 15.1 years old showed that BAI overestimated %BF in the lowest BF classification ranges and showed no advantage over BMI [14]. However, this study used skinfold measurements as the reference method to validate the model, which is a limitation in studies of this nature. Other studies have observed a high proportion of bias and low agreement between the %BF measured by DXA and the %BF predicted by BAI in Brazilian children and adolescents [15,16].
Regarding the RFM method, greater accuracy than BMI was observed in American boys (82.3% vs. 73.9%) and lower accuracy than BMI in American girls (89.0% vs. 92.6%) among participants from the NHANES 1999-2006 [17]. A recent study with Brazilian adolescents found that the RFM had high specificity for the %BF classification; however, the sensitivity was low [15]. This result conflicts with the results of Woolcott and Bergman [17], who found that the RFM was more accurate than the BMI for identifying overweight and obesity in boys. As for BFI, the most recent of the methods mentioned above [11], there are still no studies in samples composed specifically of adolescents.
Given the absence of consensus and the limitations presented, it is necessary to add to the evidence on the topic in order to understand how well these methods can predict adiposity in adolescents. This will provide knowledge that can facilitate clinical work and epidemiological studies. Thus, the objective of the present study was to verify the validity of the BAI, RFM and BFI methods for %BF prediction in Brazilian adolescents.

Materials and methods
This observational cross-sectional study was conducted in a convenience sample of adolescents (15.0-19.7 years old, n = 420) recruited from five public high schools and three private high schools in the city of Viçosa, Brazil, between 2018 and 2019. The participants were divided into two age groups (< 18 years, n = 356; ≥ 18 years, n = 64) and by sex (boys, n = 216; girls, n = 204). The Chi-square test (χ²) showed no difference (p = 0.65) in overweight/obesity prevalence rates between the study sample and the target population [18].
Healthy boys and girls, high school students, aged between 15 and 19 years old were considered eligible to participate in this study. Volunteers with any condition potentially altering body composition or its assessment were not considered eligible, such as: (1) acute clinical conditions or immunosuppressive therapy; (2) disabled people; (3) people using drugs/medications (e.g., diuretics, β-receptor antagonists, anti-psychotic drugs, corticosteroids, neurotropic drugs, antiretroviral drugs and newly started anti-diabetes treatment) [19]; (4) pregnant women (self-reported); and (5) people with fixed prostheses or silicone implants.
Initially, anthropometric measurements were performed, followed by the %BF measurement by DXA, which took place in the morning (between 8 and 11 am) or in the afternoon (between 1 and 4 pm). A group of anthropometrists trained to the International Society for Advancement of Kinanthropometry (ISAK) certification standard collected the following measurements: stature (Sanny® stadiometer, São Bernardo do Campo, Brazil), body weight (Welmy® w 200/5 digital scale, Santa Bárbara d'Oeste, Brazil) with an accuracy of 0.05 kg, and hip circumference according to the recommendations of the ISAK [20], in addition to the waist circumference (measured at the upper edge of the iliac crest) [14]. The circumferences were measured with a metallic tape (Cescorf®, Porto Alegre, Brazil). Table 1 presents the reference data, age group, country, reference method and the description of BAI [9], RFM [10] and BFI [11].
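Table 1 is not reproduced in this text. As a rough guide only, the sketch below codes the BAI and RFM equations as they are commonly stated from the original adult publications [9, 10]. The function and parameter names are illustrative assumptions, the example participant is hypothetical, and the BFI equation [11] is left out because its coefficients are not given in this text.

```python
def bai_percent_fat(hip_cm: float, height_m: float) -> float:
    """Body Adiposity Index: hip circumference (cm) / height (m)^1.5 - 18."""
    return hip_cm / height_m ** 1.5 - 18.0


def rfm_percent_fat(height_cm: float, waist_cm: float, is_female: bool) -> float:
    """Relative Fat Mass: 64 - 20 * (height / waist) + 12 * sex (0 = male, 1 = female).

    Height and waist circumference must be in the same unit.
    """
    sex = 1 if is_female else 0
    return 64.0 - 20.0 * (height_cm / waist_cm) + 12.0 * sex


# Hypothetical participant, not taken from the study data.
print(round(bai_percent_fat(hip_cm=95.0, height_m=1.70), 1))
print(round(rfm_percent_fat(height_cm=170.0, waist_cm=85.0, is_female=False), 1))
```

Neither equation uses body weight, which is the practical advantage noted in the Introduction.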
The %BF evaluation was performed by trained radiology technicians using equipment that was calibrated daily (GE Healthcare ® , Lunar Prodigy Advance DXA System, software version: 13.31). For the exam, the volunteers were instructed to remove metallic objects from the body. To standardize the exams, a single researcher was responsible for manually adjusting the regions of interest in the anatomical references [21] and for generating all the reports.

Statistical analysis
The sample size was calculated in the G*Power software package (version 3.1.9.2, Heinrich Heine University, Dusseldorf, Germany). The calculation (power = 95%, alpha = 5%), performed with an effect size set at 0.35 and two to five independent variables, estimated a minimum of 63 participants. A total of 420 participants were recruited, and additional analyses were performed stratified by sex and age group (< 18 and ≥ 18 years old). The smallest subgroup was that of participants aged ≥ 18 years (n = 64).
The study participants were characterized with mean, standard deviation and minimum and maximum values. The data distribution was checked by the values of skewness and kurtosis, with values between − 2 and + 2 taken as normal [22]. The association and the difference between the %BF predicted by the equations and the %BF measured by DXA were verified by Pearson's correlation coefficient and the paired Student's t test, respectively.
The validity of the methods was verified for the continuous and categorical data, using appropriate statistical methods for each type of variable and establishing objective criteria [23]. The two agreement tests and the criteria adopted for validation of the BAI, RFM and BFI methods were: (1) Cohen's Kappa index ≥ 0.61 [24], adopting the following %BF classification: for boys, ≤ 25% = Normal and > 25% = High; for girls, ≤ 30% = Normal and > 30% = High [25]. The Kappa index classification was: < 0.0 poor; between 0.0 and 0.2 slight; between 0.21 and 0.4 fair; between 0.41 and 0.6 moderate; between 0.61 and 0.8 substantial; between 0.81 and 1 almost perfect [24]; (2) According to previous publications, about 68% of normally distributed data lie within ± 1 SD of their mean. Thus, the second validation criterion was that the equations should place ≥ 68% of the sample within the error range of ± 3.5% of BF between the equations and the reference method for both sexes [26,27].
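The two validation criteria above can be sketched as follows. This is a minimal illustration, assuming plain Python lists of paired %BF values; the function names and the eight example values are hypothetical and are not study data.

```python
def is_high_fat(percent_fat: float, is_female: bool) -> bool:
    """Classify %BF as High: > 25% for boys, > 30% for girls."""
    return percent_fat > (30.0 if is_female else 25.0)


def cohen_kappa(labels_a, labels_b) -> float:
    """Cohen's kappa for two binary classifications of the same subjects.

    Undefined (division by zero) when chance agreement equals 1.
    """
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    p_a = sum(labels_a) / n  # proportion classified High by method A
    p_b = sum(labels_b) / n  # proportion classified High by method B
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    return (observed - expected) / (1 - expected)


def share_within_error(predicted, reference, tolerance=3.5) -> float:
    """Proportion of subjects whose prediction is within +/- tolerance %BF."""
    hits = sum(abs(p - r) <= tolerance for p, r in zip(predicted, reference))
    return hits / len(predicted)


# Hypothetical values: four boys followed by four girls.
female = [False, False, False, False, True, True, True, True]
dxa = [12.0, 18.0, 27.0, 31.0, 22.0, 35.0, 28.0, 33.0]
equation = [15.0, 17.0, 24.0, 36.0, 26.5, 33.0, 31.5, 29.0]

kappa = cohen_kappa(
    [is_high_fat(v, f) for v, f in zip(dxa, female)],
    [is_high_fat(v, f) for v, f in zip(equation, female)],
)
share = share_within_error(equation, dxa)
is_valid = kappa >= 0.61 and share >= 0.68
print(kappa, share, is_valid)
```

Under these criteria an equation counts as valid only when both thresholds are met at the same time.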
In addition, Bland-Altman graphical agreement analysis was performed, where the differences between the methods were plotted in relation to DXA [28]. Statistical analysis was performed using the IBM SPSS statistical software (IBM Corp. Released 2011. IBM SPSS Statistics for Windows, Version 20.0. Armonk, NY: IBM Corp), adopting a significance level of 5%.
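The quantities read from a Bland-Altman plot can be computed as in the sketch below. It assumes the conventional mean difference ± 1.96 SD for the limits of agreement and, following the description above, regresses the differences on the DXA values to estimate the trend line; the function name and data are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(predicted, reference):
    """Return (bias, lower limit, upper limit, slope of difference on reference)."""
    diffs = [p - r for p, r in zip(predicted, reference)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    ref_mean = mean(reference)
    # Least-squares slope of the differences regressed on the reference values.
    slope = (
        sum((r - ref_mean) * (d - bias) for r, d in zip(reference, diffs))
        / sum((r - ref_mean) ** 2 for r in reference)
    )
    return bias, bias - 1.96 * sd, bias + 1.96 * sd, slope


# Hypothetical %BF values, not study data.
dxa = [10.0, 20.0, 30.0, 40.0]
equation = [14.0, 22.0, 30.0, 38.0]
bias, lower, upper, slope = bland_altman(equation, dxa)
print(f"bias={bias:.2f}, limits=({lower:.2f}, {upper:.2f}), slope={slope:.2f}")
```

A negative slope, as in this toy example, corresponds to overestimation at low %BF and underestimation at high %BF.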

Results
The characterization of the sample is presented as mean and standard deviation, as well as minimum and maximum values, for the total sample and stratified by age (< 18 years and ≥ 18 years) and by sex (Table 2).
The %BF predicted by all equations showed a high correlation with the %BF measured by DXA (p < 0.001) (Table 3), with the lowest value for BAI in the group of boys (r = 0.72) and the highest value for BFI in the total sample and the group < 18 years (r = 0.92). By the t test, compared to %BF by DXA, there was no significant difference for BAI in the total sample (p = 0.77) and the group < 18 years (p = 0.181); for RFM in the girls (p = 0.397); and for BFI in the group ≥ 18 years old (p = 0.785) and in the boys (p = 0.933).
For the Kappa index, only the RFM reached the criterion adopted for validation, in the total sample (0.67) and in the group < 18 years (0.68) (Table 4).
For the limit of 68% of individuals within the established error range (± 3.5% of BF), none of the methods met the agreement criterion (Table 5).
The Bland-Altman plots show, in their central line, the average bias for each method in each subgroup. The limits of agreement, represented by the dotted lines, were wide for all methods, showing large individual errors. The diagonal line represents the systematic error, which is indicated by its slope. In all the plots, the slope of the diagonal line shows a tendency to overestimate at lower values of BF and to underestimate at higher values (Figs. 1, 2, 3, 4, and 5).

Discussion
The use of equations based on anthropometric measures to estimate %BF has been of great interest to researchers and clinicians. Thus, the objective of the present study was to verify the validity of the BAI, RFM and BFI equations for predicting %BF in Brazilian adolescents, with DXA as the reference method. In this study, none of the analyzed equations was valid for predicting %BF, nor were they able to correctly classify adolescents according to the body fat categories (normal or high %BF), regardless of sex or age. According to our results, the BAI, RFM and BFI equations showed a moderately strong to very strong correlation with the %BF measured by DXA (0.72-0.92) (Table 3). However, the correlation coefficient is not adequate to indicate agreement [23]. For example, in a scatter plot all the points can lie on a straight line that is not the line of equality (for instance, one that does not pass through the origin), so there is a perfect correlation (r = 1) but no agreement between the pairs of data (i.e., there is a systematic error), with one method always giving a higher value than the other [23]. The present study showed similar results: even with high correlations for all methods, only the RFM reached the agreement criterion adopted for the Kappa index, in the total sample (0.67) and in the group aged < 18 years (0.68) (Table 4). Moreover, for the limit of 68% of individuals within the established error range (± 3.5% of BF), none of the methods met the agreement criterion.
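The distinction between correlation and agreement can be illustrated with a small sketch. The data are hypothetical: the predicted values are simply the reference values shifted by a constant 5 percentage points.

```python
from math import sqrt


def pearson_r(x, y) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical %BF values: every prediction is 5 points above the reference.
reference = [15.0, 20.0, 25.0, 30.0, 35.0]
predicted = [value + 5.0 for value in reference]

r = pearson_r(reference, predicted)
share_within = sum(
    abs(p - q) <= 3.5 for p, q in zip(predicted, reference)
) / len(reference)
print(r, share_within)
```

Here the correlation is perfect, yet no subject falls within the ± 3.5% error range.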
As for the t test, there were no significant differences for the BAI in the total sample (p = 0.77) and in the < 18 years group (p = 0.181); for the RFM in the girls (p = 0.397); and for the BFI in the ≥ 18 years group (p = 0.785) and in the boys (p = 0.933). A statistically significant difference in the t test indicates the existence of systematic bias, whereas a p value > 0.05 only indicates that no mean bias was detected, which does not mean that the methods agree [23]. To check the agreement of methods, other tests must be used (e.g., Cohen's Kappa index, intraclass correlation coefficient (ICC), limits of agreement in the Bland-Altman graphical analysis, etc.), and it is necessary to consider the type of variables (categorical or numerical) and the number of categories to choose the appropriate test [23].
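A short sketch shows why a non-significant paired t test does not establish agreement. The values are hypothetical and chosen so that individual errors of + 6 and − 6 percentage points alternate; only the t statistic is computed, not the p value.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(predicted, reference) -> float:
    """Paired t statistic: mean difference divided by its standard error."""
    diffs = [p - r for p, r in zip(predicted, reference)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


# Hypothetical %BF values with large errors in opposite directions.
reference = [14.0, 18.0, 22.0, 26.0, 30.0, 34.0]
predicted = [20.0, 12.0, 28.0, 20.0, 36.0, 28.0]

t_stat = paired_t_statistic(predicted, reference)
share_within = sum(
    abs(p - r) <= 3.5 for p, r in zip(predicted, reference)
) / len(reference)
print(t_stat, share_within)
```

The mean difference is zero, so the test cannot detect a bias, but every individual prediction misses the reference by 6 percentage points.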
A strength of the present study was the establishment of objective criteria to validate the analyzed equations. Previously published studies have carried out similar statistical analyses [11,15,29]. However, they did not clearly define the validity criteria. For example, it is recommended that Bland-Altman graphical analysis be performed based on pre-established acceptable error limits, since some studies present subjective conclusions that often differ from each other [30]. From a practical and statistical point of view, the criteria defined in the present study seem to be acceptable when it comes specifically to the study of BF [26,27]. Within the continuous data validation criteria, none of the methods analyzed reached the minimum agreement established. Woolcott and Bergman [10] performed Bland-Altman graphical analysis and determined the accuracy as the proportion of subjects with an error of less than 20% between the %BF predicted by RFM and that measured by DXA. Precision was verified by the interquartile range between the %BF predicted by RFM and DXA. However, that study did not establish which proportion of subjects should be within the established error range to conclude whether the method was valid or not. In addition to the lack of well-defined validation criteria, an error of < 20% in the BF prediction seems to be a very wide range, causing several errors in the classification of individuals with excess BF, which can lead to potential health complications and overload of health systems [7].
Yang et al. [11] did not perform a Bland-Altman graphical analysis in the BFI validation study. Bergman et al. [9] presented the graphical analysis, but also did not establish any criteria for validating the BAI. In addition to the Bland-Altman graphical analysis, Bergman et al. [9] and Woolcott and Bergman [10] adopted statistical analyses different from the ones we adopted in this study. For example, Bergman et al. [9] used Lin's CCC to verify the agreement between BAI and DXA. It is common to observe in the literature the use of validation tests based on accuracy measures together with precision measures, such as the correlation measure used in the formulas of the ICC and Lin's CCC [21]. However, correlation measures can be influenced by the variability of the sample data, where a low variability leads to a low correlation. Therefore, two methods might show low agreement not because they are not interchangeable, but because of low variability in the data [31]. Thus, even when the accuracy measure of the tests mentioned above is high, a low correlation measure may mistakenly decrease the agreement coefficient. For this reason, in this study we opted not to use methods that include a correlation measure in their formulas. In the literature, there are results showing high bias and non-validity of BAI for the prediction of %BF in both adults and adolescents [11,15]. One study found, by Lin's CCC, that the agreement of BAI was higher than the correlation between BMI and %BF by DXA in American-European adults. However, the agreement was poor, and the difference between the BAI and DXA averages was large, which means that BAI was also not valid for predicting %BF [32]. Another study also found low agreement between %BF by BAI and DXA in Brazilian adults, where BAI overestimated %BF values in men and underestimated them in women [33].
In the case of RFM, created after BAI, the distinction between sexes was added, which gave it a better predictive capacity than BMI in American adults of different ethnicities [10]. However, this result does not mean that RFM is interchangeable with a reference method. Similarly to the present study, RFM was also not valid for identifying overweight and obesity in American adolescents [17].
For the BAI, RFM and BFI equations to be considered valid for predicting the %BF, the Bland-Altman plot should show only random errors, distributed close to zero and within the established limits of agreement [31]. The line of bias observed in Figs. 1, 2, 3, 4 and 5 shows the average of the differences between the equations and DXA. In Fig. 1, BAI exhibited a line of bias close to zero in the total sample. This result can be explained by the fact that the number of boys and girls was similar (n = 216 and n = 204, respectively), so that the high negative bias observed in the male group (− 5.1) (Fig. 4) and the high positive bias of the female group (5.6) (Fig. 5) almost cancelled each other out, resulting in a low bias in the total sample. The same occurs for the BAI in the subgroup < 18 years old, which comprises 84.8% of the subjects in the total sample and is therefore very similar to it. In the group aged ≥ 18 years, a negative bias was observed (− 2.3). The high bias in opposite directions between boys and girls is also seen in this group; however, the overall value is negative because boys make up most of the group (70.3% of the sample).
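The cancellation described above can be checked with a short calculation. The sketch relies on the fact that the mean difference in a pooled sample equals the size-weighted average of the subgroup mean differences; the inputs are the rounded subgroup biases and sample sizes quoted in the text, and the variable names are illustrative.

```python
# Subgroup sizes and BAI biases (%BF) as quoted in the text.
n_boys, n_girls = 216, 204
bias_boys, bias_girls = -5.1, 5.6

# Size-weighted average of the two subgroup biases.
pooled_bias = (n_boys * bias_boys + n_girls * bias_girls) / (n_boys + n_girls)
print(round(pooled_bias, 2))
```

With these inputs the pooled bias is about 0.1 percentage points, even though each subgroup is biased by more than 5 points.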
Since the sample composition itself can lead to errors in the interpretation of the mean line of bias, it is also important to check the limits of agreement. In general, wide limits of agreement were verified for all methods and in all subgroups, which represents high individual errors. Among the methods studied, BAI exhibited the greatest individual errors in all subgroups. In agreement with findings from other studies [9,15], when comparing by sex, the girls always showed narrower limits of agreement, as they presented high %BF and low variability of the measures. The boys sometimes showed greater variability of the measurements, with both low and high %BF values (Figs. 4 and 5).
Moreover, it should be noted that the adopted error range of ± 3.5% of BF is proportionally different between the sexes. For example, the average %BF evaluated by DXA was 17% in boys and 32.9% in girls. The 3.5% of BF error criterion represents 20.6% of the average BF in boys and only 10.6% in girls. This, theoretically, indicates a greater difficulty in validating the methods in girls, since the margin of error relative to their average %BF is much smaller than in boys. Therefore, in addition to adopting well-defined criteria for %BF validation, future studies should consider adjusting these criteria to the characteristics of the sample.
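The proportions quoted above follow directly from dividing the fixed tolerance by each group's mean %BF, as this brief check shows (variable names are illustrative):

```python
tolerance = 3.5  # absolute error range, in %BF
mean_bf = {"boys": 17.0, "girls": 32.9}  # mean %BF by DXA, from the text

# Tolerance expressed as a percentage of each group's mean %BF.
relative_tolerance = {
    sex: 100.0 * tolerance / value for sex, value in mean_bf.items()
}
for sex, value in relative_tolerance.items():
    print(sex, round(value, 1))
```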

Strengths and limitations
A possible limitation of this study was the use of a convenience sample, which may not represent the target population. Probabilistic samples increase the generalizability of a study; however, they are cost-prohibitive and in most cases not feasible. Despite their limited generalizability, convenience samples are less expensive, more efficient, and simpler to execute. Furthermore, the use of a homogeneous convenience sample can be a strategy to increase the generalizability of the results. This can be done, for example, by delimiting the essential characteristics that can yield biased estimates of the target population, such as age, sex and country/region [34], as we did in this study. Another limitation was the use of DXA as the reference method, as the most suitable method is the analysis of body composition by the four-compartment method [35]. However, due to its great practicality, accuracy and low radiation emission, DXA has been widely used as a reference method [36]. Lastly, the participants' sexual maturation was not assessed, which limits the interpretation of the validity of the methods across the different maturation stages.
As practical implications, the validation and use of the BAI, RFM and BFI equations must be carried out carefully and specifically in both research and clinical practice, as they can induce errors of interpretation. For future studies, it is suggested to create simple and specific equations for the target population. In addition, it would be useful to outline basic guidelines for the development and validation of equations for predicting %BF. In this study, well-defined criteria were adopted, which can be followed in future research with similar objectives. Furthermore, adapting the criteria to the characteristics of the sample, for example by sex or %BF percentiles, can be an interesting strategy.

What is already known on this subject?
When reference methods for assessing body composition (e.g., DXA) are not available, anthropometric equations can be interchangeable methods for predicting %BF. For example, BAI, RFM and BFI have been shown to be useful methods for estimating whole-body %BF and diagnosing overweight or obesity in specific samples.

What does this study add?
This is the first study to evaluate the validity of BFI in Brazilian adolescents and the first to evaluate the validity of BAI, RFM and BFI based on well-defined validity criteria. The BAI, RFM and BFI were not valid for %BF prediction in Brazilian adolescents. Therefore, it is necessary to be clear about the criteria for selecting equations that are valid and viable in clinical and research settings.