Introduction

The provision of universal family support services, with additional support for those with greater needs (a concept known as progressive universalism), underpins many maternal and child health services. This was the lead recommendation of the recent Marmot Review to give every child the best start in life and thereby reduce health inequalities [1, 2]. Policies to improve parenting support aim to enhance the health and development of children, improve maternal physical and mental health, strengthen parent–child attachment and positive parenting, and increase rates of initiation and continuation of breastfeeding [2–5]. An additional goal of the US Nurse Family Partnership and the UK Family Nurse Partnership (FNP) is to improve the economic self-sufficiency of vulnerable parents by helping them space future pregnancies, complete education and find work [6, 7].

Identifying families who would benefit from programs beyond universal services requires balancing efficient use of limited resources against the risk of stigmatising some mothers as less able parents [8]. Young maternal age is often used to define families eligible for extended services [2, 6, 9], due to the association of young motherhood with many other factors that may increase risk of poorer child and maternal outcomes, and the belief that it is less stigmatising than focusing on these other risk factors directly [10]. For instance, teenage mothers are often from lower socioeconomic backgrounds and both teenage mothers and mothers on low income report increased rates of depressive symptoms [11, 12]. Children of mothers who are depressed are at increased risk of poorer development and behaviour [13]. Depressed mothers tend to report feeling less attached to [14, 15], and have more hostile feelings toward, their children [15], although the long-term risks of poor attachment for child development remain unclear [16]. Socioeconomic environment and maternal education may also influence the effect of maternal depression on child well-being and development [17, 18]. Breastfeeding is associated with many short- and long-term benefits for both the mother and child [19], and young maternal age, low education, and depressive symptoms are associated with poor breastfeeding outcomes [20, 21]. Thus, there is evidence that young maternal age is associated with a cascade of other factors that may increase risk for poorer child and maternal outcomes. However, it is also known that risk factors, even causal ones, are not necessarily good predictors of outcomes [22].

The aim of this study was to examine the predictive validity of young maternal age (<20 years) alone and then to include measures of antenatal depression, maternal education, financial difficulties, partner status, and smoking during pregnancy, for a range of poor maternal and offspring outcomes up to 61 months. We compared three prediction models: (1) young maternal age only (to reflect current practice in the UK and some other countries); (2) antenatal depression only (as this is likely to be related to postnatal depression and other outcomes) and (3) all six antenatal predictors (chosen as representing characteristics that are, or could be, obtained by practitioners during the antenatal period). This work builds on a previous study examining predictors of poor child development, which showed that using all six characteristics provided better prediction of poor child outcomes than maternal age alone [23].

Methods

Sample

The Avon Longitudinal Study of Parents and Children (ALSPAC) is a prospective, geographically representative study of children born to women resident in the Avon area of southwest England with an expected delivery date between 1st April 1991 and 31st December 1992. Details of the background, methodology, recruitment and response rates have been reported elsewhere (http://www.bristol.ac.uk/alspac/) [24]. The core ALSPAC sample consists of 14,541 pregnancies (Fig. 1). Ethical approval was obtained from the ALSPAC Law and Ethics committee and local research ethics committees.

Fig. 1 Eligible cohort and numbers included for analyses. NEET: not in employment, education or training, and not had a baby since study child

Maternal Outcomes

Ten items that formed the depression scale of the Edinburgh Postnatal Depression Scale (EPDS) [25–27] were administered to women via questionnaire when their baby was 8 weeks old. Each question had 4 response categories scored from 0 to 3 and referred to the feelings of the mother in the past week. A score above 12 is used to indicate probable depressive disorder [25].
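The scoring rule described above can be illustrated with a short sketch. It assumes the ten item responses have already been coded 0 to 3; the function names and the example responses are hypothetical and are not part of the study's analysis code.

```python
def epds_total(item_scores):
    """Sum ten EPDS item scores, each assumed to be already coded 0-3."""
    if len(item_scores) != 10:
        raise ValueError("the EPDS has ten items")
    if any(score not in (0, 1, 2, 3) for score in item_scores):
        raise ValueError("each item is scored 0, 1, 2 or 3")
    return sum(item_scores)


def probable_depression(item_scores, cutoff=12):
    """Flag probable depressive disorder: total score above the cut-off."""
    return epds_total(item_scores) > cutoff


# Hypothetical responses for one mother (total of 13).
example = [2, 1, 2, 1, 2, 1, 1, 1, 1, 1]
print(epds_total(example), probable_depression(example))
```

Because the threshold is "above 12", a total of exactly 12 is not flagged in this sketch.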

Duration of breastfeeding was assessed in the questionnaire when the child was 6 months old. Never having breastfed the child was considered a poor outcome [28].

Poor attachment was assessed by the question “Very occasionally, mothers have mentioned that they felt quite unattached to their babies or even that they felt dislike for them for several weeks. Has this ever happened to you?” included in the questionnaire when the child was 47 months old. If mothers responded positively they were classified as having feelings of poor attachment. Mothers’ feelings of hostility toward their child were also assessed in this questionnaire. Mothers were asked to respond yes (2), no (0), or sometimes (1) to the following three questions: I often get irritated by this child; I have frequent battles of will with this child; This child gets on my nerves. The scores for each question were summed and a total score of five or higher was defined as a high level of hostility [29].
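As an illustration of the hostility score, the following minimal sketch sums the three responses using the coding given above (yes = 2, sometimes = 1, no = 0) and applies the cut-off of five. The names used are hypothetical.

```python
# Response coding as described in the text.
RESPONSE_SCORES = {"yes": 2, "sometimes": 1, "no": 0}


def hostility_score(responses):
    """Sum the scores for the three hostility questions (range 0-6)."""
    if len(responses) != 3:
        raise ValueError("expected responses to three questions")
    return sum(RESPONSE_SCORES[response] for response in responses)


def high_hostility(responses, cutoff=5):
    """A total of five or higher is treated as a high level of hostility."""
    return hostility_score(responses) >= cutoff


print(high_hostility(["yes", "yes", "sometimes"]))
```

With this coding, a mother is classified as having high hostility only if she answers "yes" to at least two questions and at least "sometimes" to the third.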

Not in employment, education or training (NEET) was assessed in the questionnaire when their child was 61 months old. If mothers had not had a baby in those 5 years since their study child was born, and had not worked in the past year, or taken any courses or educational training in the past two years, they were classified as NEET. Women who had one or more babies after their study child was born were classified as not being NEET, irrespective of whether they were in employment, education or training, since these women would have had a pre-school age child and this frequently results in voluntarily staying at home with the child(ren).
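The NEET definition combines three pieces of information, and the rule for women with a later baby is easy to misread. A minimal sketch of the classification, with hypothetical argument names, is:

```python
def is_neet(had_later_baby, worked_past_year, training_past_two_years):
    """Classify a mother as NEET at 61 months, following the rule in the text.

    Women who had another baby after the study child are never classified
    as NEET, regardless of their employment or training status.
    """
    if had_later_baby:
        return False
    return not worked_past_year and not training_past_two_years


print(is_neet(had_later_baby=False, worked_past_year=False,
              training_past_two_years=False))
```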

Antenatal Predictors

Age of mother at last menstrual period (LMP) was obtained for n = 14,531 women (Fig. 1) and dichotomised at younger than 20 years, the cut-point commonly used to identify mothers eligible for programs [6, 9].

Highest education level of mothers and whether they experienced financial difficulties were recorded in the questionnaire at 32 weeks gestation. Highest education level was categorised into: O level or higher (Ordinary level exams most commonly taken at age 16 years, the legal minimum age for leaving school in the UK); and less than O level. Financial difficulties was assessed from five questions asking how difficult at the moment the mother found it to afford food, clothing, heating, rent or mortgage, and things she will need for the baby, with a score of 1 (very difficult) to 4 (not difficult) for each response. The algorithm for calculating the overall financial difficulties score was 20 minus the sum of the scores of each of the five items, resulting in an overall score where 0 represented no financial difficulties and 15 the maximum financial difficulties. Participants with a score greater than 8 were defined as experiencing financial difficulties [30]. Partner status at study enrolment (married or cohabitating vs. no partner or not living with partner) was assessed by questionnaire. Whether or not they had smoked during the first 3 months of their pregnancy was measured in the questionnaire administered at 18–20 weeks gestation.
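The financial difficulties score can be checked with a small worked sketch. It assumes the five item responses are coded 1 (very difficult) to 4 (not difficult) as described above; the function names and example responses are hypothetical.

```python
def financial_difficulties_score(item_responses):
    """Return 20 minus the sum of five items coded 1 (very difficult) to 4."""
    if len(item_responses) != 5:
        raise ValueError("expected five items")
    if any(response not in (1, 2, 3, 4) for response in item_responses):
        raise ValueError("each item is coded 1 to 4")
    return 20 - sum(item_responses)


def has_financial_difficulties(item_responses, cutoff=8):
    """A score greater than the cut-off defines financial difficulties."""
    return financial_difficulties_score(item_responses) > cutoff


# Food, clothing, heating, rent or mortgage, things for the baby.
print(financial_difficulties_score([4, 4, 4, 4, 4]))  # no difficulties
print(financial_difficulties_score([1, 1, 1, 1, 1]))  # maximum difficulties
```

The two printed values reproduce the end points stated in the text, 0 and 15.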

The EPDS was also administered to women via questionnaire at 18–20 weeks gestation. Although the measure was developed for use with women who have recently given birth, none of the ten items is specific to the postnatal experience and it has been validated for use during pregnancy [31, 32].

Analysis

We calculated the proportion of mothers with poor outcomes who had each of the individual binary predictive factors, and also who had at least one and at least two of the six binary predictors. Specificity, positive predictive value and likelihood ratio of each binary predictor were calculated (Supplemental Table 1). Univariable and multivariable (with mutual adjustment for all other predictors) logistic regression was used to examine the associations of predictors with each outcome. The predicted probability of each poor outcome was calculated from these logistic regression models. In clinical practice all of the predictors would likely be used as binary variables; however, calibration statistics cannot be easily interpreted using a single binary predictor, so maternal age, financial difficulties and EPDS score were included as continuous variables in the prediction models.
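For a single binary predictor, these quantities follow directly from a two-by-two table of predictor by outcome. The sketch below uses hypothetical counts, not data from this study, and assumes that "likelihood ratio" refers to the positive likelihood ratio, sensitivity / (1 − specificity).

```python
def binary_predictor_metrics(tp, fp, fn, tn):
    """Summarise a binary predictor from two-by-two table counts.

    tp: predictor present, poor outcome    fp: predictor present, no poor outcome
    fn: predictor absent, poor outcome     tn: predictor absent, no poor outcome
    """
    sensitivity = tp / (tp + fn)  # proportion of outcome cases with the predictor
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)
    # Assumed definition: positive likelihood ratio (undefined if specificity is 1).
    lr_positive = sensitivity / (1 - specificity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "lr_positive": lr_positive,
    }


# Hypothetical counts for illustration only.
metrics = binary_predictor_metrics(tp=30, fp=70, fn=70, tn=830)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```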

The area under the receiver operating characteristic curve (AUROC) was used to assess the discriminatory capability of the models, or how accurately each model separates mothers into those with and without poor outcomes. Model 1 contained only maternal age, model 2 contained only antenatal EPDS score, and model 3 included all six predictors. An AUROC of 1 represents a model that perfectly discriminates the outcome, whereas an AUROC of 0.5 represents a prediction tool that is no better than chance at identifying those at risk. While calibration statistics are not possible with single binary variables, we did calculate the AUROC for all three models using predictors as binary variables, as would be more commonly used in clinical practice (Supplemental Table 2).
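One way to make the interpretation of the AUROC concrete is through its equivalence with the probability that a randomly chosen mother with the poor outcome has a higher predicted risk than a randomly chosen mother without it, with ties counted as one half. The following sketch computes the AUROC in this pairwise way. It is illustrative only, uses hypothetical data, and is not the routine used for the analyses reported here.

```python
def auroc(risks, outcomes):
    """AUROC as the proportion of case/non-case pairs ranked correctly.

    risks: predicted probabilities; outcomes: 1 for poor outcome, 0 otherwise.
    Tied pairs contribute one half.
    """
    cases = [r for r, y in zip(risks, outcomes) if y == 1]
    non_cases = [r for r, y in zip(risks, outcomes) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    concordant = sum(
        1.0 if case > non_case else 0.5 if case == non_case else 0.0
        for case in cases
        for non_case in non_cases
    )
    return concordant / (len(cases) * len(non_cases))


# Hypothetical predicted risks and outcomes.
print(auroc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))
```

The pairwise loop is quadratic in sample size, so for a cohort of this size a rank-based implementation from a statistical package would be preferable in practice.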

Calibration of the three models, or the agreement between observed and predicted outcomes, was assessed by ranking mothers into deciles of their predicted risk from each model and then comparing the predicted to observed proportion within each decile. The Hosmer–Lemeshow goodness-of-fit chi-square statistic was used to test the accuracy of calibration [33]. This statistic tests the null hypothesis that the predicted proportion equals the observed proportion within ranked groupings (deciles) of predicted risk. A high P value suggests good calibration of predicted and observed risk.
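The decile-based comparison can be sketched as follows. The sketch assumes the usual form of the Hosmer–Lemeshow statistic, the sum over groups of (observed − expected)² / (n × mean risk × (1 − mean risk)), referred to a chi-square distribution with (number of groups − 2) degrees of freedom. The function names are hypothetical, the P value uses a closed form that is valid only for even degrees of freedom (such as the 8 that arise from deciles), and the code is not the authors' implementation.

```python
import math


def hosmer_lemeshow(risks, outcomes, groups=10):
    """Return the Hosmer-Lemeshow statistic and its degrees of freedom."""
    pairs = sorted(zip(risks, outcomes))  # rank mothers by predicted risk
    n = len(pairs)
    if n < groups:
        raise ValueError("need at least one observation per group")
    statistic = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        size = len(chunk)
        expected = sum(risk for risk, _ in chunk)
        observed = sum(outcome for _, outcome in chunk)
        mean_risk = expected / size  # must lie strictly between 0 and 1
        statistic += (observed - expected) ** 2 / (size * mean_risk * (1 - mean_risk))
    return statistic, groups - 2


def chi2_upper_tail_even_df(x, df):
    """Upper-tail chi-square probability; closed form for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("this closed form requires positive, even df")
    half = x / 2
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))


# Hypothetical, perfectly calibrated data: ten groups of 20 mothers each.
risks, outcomes = [], []
for g in range(10):
    risks += [(g + 1) / 20] * 20
    outcomes += [1] * (g + 1) + [0] * (20 - g - 1)

statistic, df = hosmer_lemeshow(risks, outcomes)
print(statistic, chi2_upper_tail_even_df(statistic, df))
```

In this constructed example the observed and predicted counts agree within every decile, so the statistic is essentially zero and the P value is close to 1, which corresponds to the "high P value suggests good calibration" reading above.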

The integrated discrimination improvement (IDI) [34] for model 2 or 3 in comparison to model 1 was also calculated. This assesses discrimination without relying on cut-off points and compares the average difference in predicted risk for women with and without poor outcomes. The IDI is greater when the second model correctly assigns individuals to higher or lower probabilities of having the outcome in comparison to the first model.
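A minimal sketch of the IDI, assuming the standard definition as the difference in discrimination slopes between two models, is given below. The discrimination slope is the mean predicted risk among women with the outcome minus the mean predicted risk among women without it. Function names and data are hypothetical.

```python
def _mean(values):
    return sum(values) / len(values)


def discrimination_slope(risks, outcomes):
    """Mean predicted risk in cases minus mean predicted risk in non-cases."""
    cases = [r for r, y in zip(risks, outcomes) if y == 1]
    non_cases = [r for r, y in zip(risks, outcomes) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    return _mean(cases) - _mean(non_cases)


def idi(risks_reference, risks_new, outcomes):
    """Integrated discrimination improvement of the new model over the reference."""
    return (discrimination_slope(risks_new, outcomes)
            - discrimination_slope(risks_reference, outcomes))


# Hypothetical example: the reference model assigns everyone the same risk.
outcomes = [1, 1, 0, 0]
reference = [0.5, 0.5, 0.5, 0.5]
new = [0.8, 0.6, 0.3, 0.1]
print(idi(reference, new, outcomes))
```

A positive value indicates that the new model separates the predicted risks of women with and without the outcome more widely than the reference model.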

Missing Data

Analyses were conducted using participants with complete data on all six predictive factors (N = 10,955) and the maternal outcome (Table 1, Fig. 1). Analyses were also conducted on an imputed data set to examine the influence of missing data on the findings. Multiple imputation by chained equations was used to impute missing data on outcomes and predictors for respondents who had data on at least one outcome (N = 12,412, Fig. 1) using the ‘ice’ command in Stata [35]. The imputation model included all outcomes and predictors as well as predictors of ‘missingness’: birth weight, social class categorised according to the UK Registrar General’s classification from I (high managerial or professional) to V (unskilled manual workers), ethnicity (white versus non-white) and reaction to pregnancy (categorised from overjoyed to very unhappy). We generated 20 data sets and undertook 20 cycles of regression switching (‘switching’ the order in which variables with missing data are imputed) [35]. The multiple multivariate imputation approach creates a number of copies of the data (20 copies here) in which missing values are imputed, with an appropriate level of randomness, by chained equations [35]. The results are obtained by averaging across the results from each of these 20 datasets using Rubin’s rules, and the procedure takes account of uncertainty in the imputation [35]. Results from the analyses using multiply imputed data are presented in Supplemental Tables 3 and 4.
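The pooling step can be illustrated with a short sketch of Rubin's rules for a single estimate, such as a log odds ratio. The pooled estimate is the mean across imputed datasets, and the total variance is the mean within-imputation variance plus (1 + 1/m) times the between-imputation variance. The analyses here were run in Stata; the Python function and the numbers below are hypothetical and serve only to show the arithmetic.

```python
import math
import statistics


def pool_rubin(estimates, variances):
    """Pool one estimate across m imputed datasets using Rubin's rules.

    estimates: the estimate from each imputed dataset
    variances: the squared standard error from each imputed dataset
    Returns the pooled estimate and its standard error.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching estimates and variances from 2+ datasets")
    pooled = statistics.mean(estimates)
    within = statistics.mean(variances)       # average sampling variance
    between = statistics.variance(estimates)  # variance across imputations
    total = within + (1 + 1 / m) * between
    return pooled, math.sqrt(total)


# Hypothetical estimates from three imputed datasets.
pooled, se = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(pooled, se)
```

The between-imputation term is what allows the pooled standard error to reflect uncertainty due to the missing data.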

Table 1 Prevalence and amount of data available for each outcome and potential predictor measured during pregnancy

Results

The prevalence of each outcome and predictor is listed in Table 1, which shows 51.2% of women had at least one, and 22.8% had at least two of the predictors. Table 2 shows the proportion of mothers with poor outcomes who had each antenatal characteristic. Only a small proportion of women with any of the five outcomes were aged less than 20 years when they were pregnant (3.9–7.3%). More than half of the women with any of the poor outcomes, and up to 74.7% of women with postnatal depression, could be identified if information on all six predictors was used and a woman had at least one of these.

Table 2 Proportion and cumulative proportion (for at least 1 or 2 of the predictors) of outcome cases that would be detected with maternal characteristics measured during pregnancy

Maternal age less than 20 years was strongly associated with never breastfeeding and this association remained in multivariable analyses adjusted for all other factors (Table 3). Smoking during pregnancy was associated with all outcomes. Antenatal depression was strongly associated with postnatal depression, and feelings of poor attachment and hostility. Education less than O level was associated with never having breastfed and NEET, whereas mothers with a higher level of education were more likely to experience feelings of poor attachment.

Table 3 Univariable and multivariable associations of maternal predictive characteristics with outcomes

Table 4 shows the discrimination (AUROC) and calibration for all models. Discrimination of all outcomes, with the exception of never breastfeeding, was poor using model 1 (maternal age only). Discrimination of postnatal depression was better with model 2 (antenatal depression score only) or model 3 (all six predictors) than it was with model 1 (age only). Model 2 (antenatal depression score only) was poor at discriminating the breastfeeding and NEET outcomes, but discrimination of these outcomes was better when model 3 (all six predictors) was used. In comparison to model 1 (maternal age only), discrimination of feelings of poor attachment and hostility was improved with model 2, and more so when all six predictors were used (model 3). The Hosmer–Lemeshow goodness-of-fit tests indicated better calibration using model 3 (all six predictors) or model 2 (antenatal depression score only) than model 1 (maternal age only) for all outcomes with the exception of never breastfeeding, where all models demonstrated good calibration, and feelings of hostility, where model 2 had the worst calibration. Model 1 underestimated the likelihood of poor outcomes, with the exception of never breastfeeding, among those at highest risk (Table 4). The IDIs indicated that model 3 resulted in an improvement in discrimination over model 1, particularly for depression at 8 weeks and never breastfeeding, with approximately 15% and 4% of mothers being correctly reclassified by model 3 in comparison to model 1. Model 2 also resulted in an improvement in discrimination over model 1 for postnatal depression.

Table 4 Calibration and discrimination of the three models for each maternal outcome

Sensitivity Analyses

AUROC values calculated using all binary predictors (Supplemental Table 2) were lower but consistent with those obtained with a model that included continuous variables (Table 4). Associations between predictors and outcomes, and assessments of AUROC values using the multiply imputed dataset (Supplemental Tables 3 and 4) were consistent with analyses based on participants with complete data.

Discussion

Parent support interventions that may be effective at improving breastfeeding and return to employment, education or training, and reducing maternal postnatal depression and feelings of poor attachment and hostility are unlikely to have large impacts on these outcomes at the population level if young maternal age is used as the sole or main criterion for identifying eligible mothers. Only a small proportion of mothers with poor outcomes were teenagers, reflecting the small proportion of births to mothers aged 15–19 in most high income countries (e.g. 6.1% in England and Wales in 2009 [36], 10.2% in the US in 2006 [37] and 4.2% in Australia in 2008 [38]). Maternal age less than 20 years identified only 3–7% of later cases, depending on the maternal outcome studied. Maternal antenatal depression identified 43% of postnatal depression cases, and 15–19% of the other four outcomes.

Low education and smoking during pregnancy as single characteristics identified at least one-quarter and up to almost one-half of cases depending on the outcome. Smoking remained strongly associated with all maternal outcomes, even after adjusting for other characteristics, and is also more common (17% of pregnant women in England in 2005 [39]) than teenage motherhood in the population. If information on all six predictors was collected during pregnancy, between half and three-quarters of cases of poor maternal and offspring outcomes would be identified using at least one of the six predictors, and up to 45% would be identified among mothers with two or more of the six predictors. If pregnant women experiencing these characteristics could be engaged in effective programs many cases of poor outcomes could potentially be prevented. For example, if the 23% of pregnant women with two or more of the predictive characteristics were engaged in programs that were effective in encouraging and supporting breastfeeding, one-third of cases who would never have breastfed would potentially initiate breastfeeding.

Antenatal depression is, not surprisingly, a good predictor of postnatal depression, as they are potentially both part of the same illness episode [40, 41] and the majority of cases of postnatal depression in this cohort were preceded by antenatal depression [42]. Universal screening for depression is becoming available to all women in the perinatal period in Australia [43] and is recommended in the US [11], and if this information was used to identify the 14% of pregnant women with antenatal depression and provide them with effective interventions, 43% of the cases of postnatal depression could potentially be prevented. As a predictor of poor maternal outcomes, depression during pregnancy, however, is not as sensitive as smoking and low education at identifying cases of poor maternal outcomes. Antenatal depression was associated with feelings of poor attachment and hostility toward the child, which might be considered part of the symptoms of depression, but was not associated with never breastfeeding or NEET after adjustment for other factors during pregnancy. The lack of an association between depressive symptoms during pregnancy and initiation of breastfeeding has been shown in previous studies [44, 45]. The high proportion (27%) of women who never breastfed in this study is consistent with population figures, with 78% of women in England in 2005 breastfeeding their babies after birth, and only 50% of all new mothers breastfeeding at week six [20].

The strengths of this study are the large sample size and longitudinal design with inclusion of a large number of relevant characteristics that could be routinely measured during pregnancy. Using self-reported smoking status may underestimate the prevalence of smoking among pregnant women [46], but self-reported smoking status still contributed to the prediction of poor maternal outcomes in this study and reflects the clinical situation in which pregnant women report their smoking status at antenatal consultations. Given that calibration cannot be assessed with a single binary predictor, we used the continuous age variable, which may underestimate the poor calibration of maternal age with a cut-off of less than 20 years, as is used in practice. Our sensitivity analyses provide some evidence for this, showing a lower AUROC when a binary measure of age was used rather than a continuous measure. Reduced power caused by cohort attrition is unlikely to be a major problem in a study of this size, and analyses using multivariate multiple imputation produced similar results to complete case analyses, suggesting little bias due to missing data.

There are several issues that may influence the use of a broader range of predictive factors in routine clinical settings to more accurately identify mothers who are at high risk of poor outcomes, and who may be supported with enhanced perinatal care. First, collection of all of the characteristics would need to be feasible and acceptable to pregnant women. Information about depression, for example, is currently collected in some, but not all, settings [43]. Our study suggests that the majority of pregnant women provide information on the characteristics we have examined. Similar proportions of missing data have been shown in a clinic setting where forms completed by FNP nurses in the second wave pilot sites in England had missing data on educational status, marital status and smoking for 10.1%, 8.8% and 11.4% of enrolled women, respectively, though other predictors that we examined were not reported [47]. Second, there would need to be a simple tool for using the collected data and generating a ‘risk’ score for each individual. This could range from a simple checklist of predictors through to computer-based tools. With simple checklists, which would be feasible in most settings, women with one or two out of the list of six risk factors examined here could be considered for more intensive support programmes. Computer-based tools can make use of predictive risk algorithms containing continuous variables and are becoming increasingly common, for example in the prediction of cardiovascular disease risk and successful outcome with in vitro fertilisation (http://www.ivfpredict.com). Third, whilst there is some randomised controlled trial evidence that interventions are effective at improving some outcomes for certain groups of vulnerable mothers [48–51], it remains important to determine the effectiveness of programs on relevant outcomes among women identified using a larger number of predictive factors. Fourth, there would need to be available resources for providing programs to mothers identified at higher risk.

Improving outcomes among teenage mothers is important [52, 53], but focusing on this group alone will have little impact on improving depression, breastfeeding, feelings of poor attachment and hostility and reducing those not in employment, education or training among the overall population because maternal young age is not an adequate singular predictor, and few mothers with poor outcomes are teenagers. Other predictive factors such as maternal education level, smoking and depression during pregnancy, that have also been shown to be important predictors of child development outcomes [23], should be considered when offering women perinatal parent support programs.