References

Raghupathi W, Raghupathi V. Big data analytics in healthcare: promise and potential. Health Inf Sci Syst. 2014; 2:1-10
Ahlemeyer-Stubbe A, Coleman SY. London: Wiley; 2014
Newton JT, Bower EJ, Williams AC. Research in primary dental care. Part 1: setting the scene. Br Dent J. 2004; 196:523-526
Coleman SY, Gob R, Manco G, Pievatolo A, Tort-Martorell X, Reis M. How can SMEs benefit from big data? Challenges and a path forward. J Qual Reliab Engineer Int. 2016; 32:2151-2164
Coleman S. Data mining opportunities for small to medium enterprises from official statistics. J Off Stat. 2016; 32:849-865
Rahm E, Do HH. Data cleaning: problems and current approaches. IEEE Data Engineer Bull. 2000; 23:3-13
Her Majesty's Stationery Office. Data Protection Act 1998. 1998. http://www.legislation.gov.uk/ukpga/1998/29/contents
Williams AC, Bower EJ, Newton JT. Research in primary dental care. Part 3: designing your study. Br Dent J. 2004; 196:669-674
Analysis of patient attendance pattern. Dental Profile magazine 2002, NHSBSA Dental Service publications. http://www.nhsbsa.nhs.uk/DentalServices/2873.aspx
: National Institute for Health and Clinical Excellence; 2004
Wheeler D. Knoxville: SPC Press Inc; 1999
Qresearch Report on Trends in Consultation Rates in General Practices – UK, 1995–2008. Publication date: 09:30 September 30, 2008. https://digital.nhs.uk/catalogue/PUB02399
Wang Y, Hunt K, Nazareth I, Freemantle N, Petersen I. Do men consult less than women? An analysis of routinely collected UK general practice data. BMJ Open. 2013; 3 https://doi.org/10.1136/bmjopen-2013-003320
Wanyonyi KL, Radford DR, Gallagher JE. Dental treatment in a state-funded primary dental care facility: contextual and individual predictors of treatment need? PLoS One. 2017; 12

Understanding our patient base: an introduction to data analytics in dental practice

From Volume 45, Issue 3, March 2018 | Pages 236-246

Authors

Rosie Pritchett

BDS(Hons), BSc(Hons)

General Professional Trainee within Newcastle Dental Hospital, NE2 4AZ; InDental Practice Ltd, Newcastle, UK


Shirley Coleman

PhD, CStat

Principal Statistician, ISRU, School of Maths and Stats, Newcastle University, NE1 7RU, UK


James Campbell

BDS(Hons), MA(Cantab)

General Professional Trainee within Newcastle Dental Hospital, NE2 4AZ; InDental Practice Ltd, Newcastle, UK


Shiv Pabary

MBE, BDS(Hons), MFGDP (UK), DipConsSed

Practice Owner, InDental Practice Ltd, Fewster Square, NE10 8XQ, UK


Abstract

Abstract: Dental practices are continually collecting patient data but it is often an underutilized resource. There is a growing trend towards use of data analytics within companies to guide business decisions. For dental practices which use digital systems, a large reserve of patient information is readily available. Simple data analytic techniques are presented which can be used to extract substantial insight into patient demographics, DNA (did not attend) rates and many other areas of practical relevance to clinical service delivery and business management.

CPD/Clinical Relevance: Data analytics is well established in many industries and has the potential for encouraging colleagues to look at their data and understand their patient base and changes over time; practice owners can gain insight into patient demographics to guide business decisions and improve patient care.

Article

Data analytics means taking an inquisitive look at raw data, such as patient addresses, and extracting meaningful information by summarizing, illustrating and analysing the data. Assessing the information in context leads to knowledge and the opportunity to use this knowledge to make decisions based on evidence. Dental practices collect vast quantities of data every day but usually it is only interpreted as clinical records or for auditing purposes. More broadly across the healthcare sector, data supports clinical decisions, disease surveillance and population health management.1 Barriers to further analysis in dental practice include poor quality data, lack of analytics experience, lack of time and resistance to change.2

An example of data analytics within a small business is the analysis of a year's worth of daily sales in a clothing shop.3 Plotting a graph of typical sales takings per day of the week demonstrates that, though Saturday takings per hour are the highest overall, sales per customer are higher mid-week (Figure 1).

Figure 1. Data analytics of sales in a clothing shop.

This suggests that proportionately more customers browse at weekends, while more come to purchase on weekdays. This simple analysis helps to guide the choice of appropriate opening hours to reflect the change in sales across weekends and weekdays.

Let us consider how we might analyse the kinds of data available in dentistry and apply it to the decisions practices must make.

There are many different types of practice, for example: a private squat in a rented surgery; a mini-corporate expanding; several independent dentists forming a rebranded group practice. These practices all face different types of clinical and business challenges and these will determine which analyses will be most helpful.

For example, practices might focus on: acquiring new patients; minimizing the risk of an expansion plan; revitalizing the patient base or re-orientating the practice focus. Others might wish to identify: patients with unmet needs; those willing to invest significantly in dental treatment; true value underlying a practice sales pitch; operational inefficiency.

Each practice faces a distinct set of concerns and it is important to address which areas are of most urgency to their situation. A range of guidance exists on selecting appropriate research questions.4

What analysis is possible will be constrained by staff knowledge and training in data analytic techniques, as well as the data sources available.5 The following need to be considered:

  • What software systems and databases exist?
  • Who handles data collection and storage?
  • What analytic software is available in the practice (eg Microsoft Excel)?
  • What skill set exists in-practice and what training is required?

    Dental practices affect a wide range of people who are all stakeholders in the practice and in the data analytics about to be carried out, for example principal(s), associate(s), employed staff, existing and new patients, the local community and competing practices.

    The data collected and held by the practice has many dimensions, any of which can be the focus of the data analytics: dates and times of appointments, types of treatment, age and gender of patients.

    Combining patient information with open data, from sources such as the Office for National Statistics (ONS) or the NHS Business Services Authority (BSA), provides a wealth of potential insight.6

    Methodology

    Stepwise guidelines for analysing data in a dental practice constitute a structured approach for any practice wishing to use its data to enhance its clinical and business performance.

    These steps should not just form a one-off lengthy project, but provide a framework for ongoing assessment and continuous improvement (Table 1). The impact of small, measurable business decisions based on data analytics can be demonstrated by the changing trends in outcomes (Table 2).


    Table 1
    Step
    1   Describe the business and the people it affects
    2   Prioritize strategic questions
    3   Review data resources, staff skills and software
    4   Identify relevant data
    5   Describe and improve the data
    6   Consider access and confidentiality issues
    7   Summarize data, eg tables, graphs
    8   Enrich data using open sources
    9   Analyse data
    10  Report and recommend next steps

    Table 2
    Data Analysis | Application
    Office for National Statistics (ONS) data can be used to view demographics of a catchment area | Providing evidence of the characteristics of the local population as justification to the BSA for deviating from regional or national norms in NHS claims; market research for presentation to prospective lenders when financing practice acquisition
    Practice software data on appointments can be extracted, revealing patient longevity of attendance, turnover rate, typical fees or costs of treatment per patient | Establishing operational performance of the practice: evidencing and documenting efforts at improving access; justifying goodwill valuations; targeting under-served or high-value groups
    Use of focus groups to collect information to inform development of an online or paper questionnaire that will canvass patient and local community opinion of the practice | Enhance reputation of practice and build word-of-mouth; demonstrate commitment to constant improvement and patient-centred care
    Quantitative analysis of forward order book and financial turnover; qualitative analysis of data on staff training and skills | Informing decision to acquire a practice; due diligence on claims about operating efficiency and ultimate valuation

    Preparing data for analysis can take up a considerable proportion of the project time. Errors and omissions in data should be recorded and form part of a continuous improvement cycle for data quality. Once the benefits of data analytics are demonstrated, then all staff should buy in to providing data that is fit for purpose.

    We have applied this framework in a detailed service evaluation of a test dental practice to illustrate how its own data enriched with freely available open data can be analysed to reveal valuable insights.

    Describing the business and prioritizing

    The test dental practice provides mainly NHS care within North East England. The practice opened in 1981 and moved to purpose-built premises in 2009 comprising ten surgeries. As well as general dentistry, the practice also holds sedation and orthodontic contracts, and accepts private endodontic referrals.

    This practice has a stable base of patients and is not necessarily looking to expand. The focus of interest is analysis of the existing patient base in order to:

  • Enhance access to meet regulatory requirements;
  • Optimize activity to improve operational efficiency; and
  • Improve compliance with recall guidelines to ensure continuity of care.

    These are some specific questions:

  • Are we treating a representative demographic of patients (all ages and genders)?
  • Is access socially equitable? and
  • What are the patterns of attendance?

    Reviewing data resources

    Collection and storage of data involves the whole team, including receptionists, the practice manager, nurses and associates. Within this practice, the practice manager is responsible for reporting and has a comprehensive knowledge of tools within the software, for example identifying patients who have not attended within a certain time period and sending out reminder letters. With most practice software systems, any employee with access can obtain an overview of the data. There may be scope for further development of data analytic skills within the team.

    Identifying relevant data

    Practice software systems provide an extensive data source; our test practice uses EXACT (Software of Excellence). Each software provider has advisors on hand who can assist with downloading the appropriate information. Within EXACT, ‘Contact Lists’ can be downloaded of patients who fit certain criteria within a desired date range.

    For this example, the EXACT dataset downloaded into Microsoft Excel included 25,176 entries, from 2005 to 2016. Very large datasets may require an extra step for downloading; to avoid this, the date range can be narrowed. You may want to start by looking at the most recent 5 years.

    The following information can be used to address the questions posed in the introduction: name, address and postcode, date of birth, gender, date of last visit, dentist seen, date of last missed appointment, patient first visit.

    Describing and improving the data

    The raw data exported into Excel has 140 columns with 25,176 rows of patient data (Figure 2). Many of the 140 columns are empty and information, such as the address, may be repeated. The data are in patient number order. Notice some of the peculiarities typical of raw data, for example the telephone numbers in column F have been written in ‘scientific form’ so all the detail is hidden.

    Figure 2. Screen shot of anonymized Excel data.

    As this is a practical guide, we consider some ways to get started. The FILTER option in Excel is useful for gathering an initial overview of the data. Hover the cursor over the ‘patient.sex’ column (column T in our dataset) and left click to highlight the column. Then click on DATA, then FILTER. Selecting ‘female’ will show the number of female patients. We have 13,441 of the total 25,176, so overall 53% of patients are female. The dataset has a lot of missing values in other columns but there are no missing values for gender. A more elegant way of counting the females is to use the Excel function =COUNTIF(T2:T25177,"female"), which returns 13,441 as the number of females. There are 11,735 males, so the ratio of women to men is 1.15.

    We can explore the ‘patient.balance’ column (column X in our dataset) in a similar way and find out how many patients have zero or a negative balance of payments.
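The same count can be reproduced outside Excel. The following is a minimal Python sketch, assuming the Contact List has been exported so that each row carries a 'patient.sex' value; the five sample rows and the function name are made up for illustration and are not practice data.

```python
from collections import Counter

# Hypothetical rows standing in for the exported 'patient.sex' column.
rows = [
    {"patient.sex": "female"},
    {"patient.sex": "male"},
    {"patient.sex": "Female "},
    {"patient.sex": "male"},
    {"patient.sex": "female"},
]

def gender_summary(records, column="patient.sex"):
    """Return (counts, female share, female-to-male ratio) for one column."""
    # Strip spaces and lower-case so 'Female ' and 'female' count together.
    counts = Counter(r[column].strip().lower() for r in records)
    total = sum(counts.values())
    female_share = counts["female"] / total
    ratio = counts["female"] / counts["male"]
    return counts, female_share, ratio

counts, female_share, ratio = gender_summary(rows)
print(counts["female"], counts["male"], round(female_share, 2), round(ratio, 2))
```

Applied to the figures reported above, 13,441 females and 11,735 males give a ratio of 13,441 / 11,735, which rounds to 1.15.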

    There are usually errors in any dataset.7 These include: multiple versions of postcodes, some with spaces and some without; addresses in a mix of upper and lower case; dates of birth wrongly entered and data items missing. A note should be kept of all the corrections made and these notes should be used to help improve the data collection process.
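As an illustration of this kind of correction, the sketch below tidies postcodes that differ only in spacing and case, and keeps a log of every change. It is a rough Python example under stated assumptions: the function names are hypothetical, and the rule used (upper case, one space before the last three characters) is a simple convention for full UK postcodes, not a validity check.

```python
def normalise_postcode(raw):
    """Upper-case a UK postcode and put one space before the last three characters."""
    compact = "".join(raw.split()).upper()
    if len(compact) < 5:
        return None  # too short to be a full postcode; flag for manual review
    return compact[:-3] + " " + compact[-3:]

corrections = []  # running note of changes, to feed back into data collection

def clean(raw):
    """Normalise one postcode and record the change if anything was altered."""
    fixed = normalise_postcode(raw)
    if fixed != raw:
        corrections.append((raw, fixed))
    return fixed

examples = ["ne17ru", "NE1 7RU", "ne2  4az", "NE10 8XQ"]
cleaned = [clean(p) for p in examples]
print(cleaned)
print(corrections)
```

The `corrections` list plays the role of the note of corrections described above: it shows which entries were altered and how.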

    The dataset is large (nearly 20 MB) so it is important not to make new copies every time changes are made to correct errors and omissions. It is always important to keep an unchanged copy and subsequent copies should be clearly named to avoid confusion.

    The EXACT dataset includes the necessary data to analyse the numbers of new patients, drop-out rate, net number of new patients each year, gender mix, number of missed appointments and their cost in terms of time lost. Using basic data manipulation techniques, we can extract data that is relevant to these questions of interest.

    Access and confidentiality

    The data should be handled as per the Data Protection Act.8 Methods must be imposed to ensure safe handling, for example only using practice computers for analysis, using a separate password protected USB drive to store data and keeping records confidential. To ensure anonymity and that no data can be traced back to individual patients, the names can be erased from the data list or replaced with unique ID codes.9

    Care with patient data is paramount and it is worth noting that, not only names need to be considered, but also dates of birth and postcodes. It is well known that individuals can often be traced by triangulation (combining different pieces of information), particularly if they are unusual in any way, for example if they are the oldest person in a postcode with few residences.
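One way to sketch these precautions in code is shown below. It replaces names with random ID codes and, as an added assumption that goes beyond the steps described above, reduces date of birth to year and postcode to its first part to lower the triangulation risk. The field names and sample records are hypothetical, and the postcodes are assumed to be already written with a space.

```python
import secrets
from datetime import date

def pseudonymise(records):
    """Return (cleaned records without names, key mapping ID code -> name)."""
    used = set()
    key = {}      # keep this separately and securely, or discard it
    cleaned = []
    for rec in records:
        code = "P" + secrets.token_hex(4)
        while code in used:            # regenerate on the rare collision
            code = "P" + secrets.token_hex(4)
        used.add(code)
        key[code] = rec["name"]
        cleaned.append({
            "id": code,
            "birth_year": rec["date_of_birth"].year,             # year only
            "postcode_district": rec["postcode"].split()[0],     # first part only
            "sex": rec["sex"],
        })
    return cleaned, key

records = [
    {"name": "Patient A", "date_of_birth": date(1980, 6, 15),
     "postcode": "NE1 7RU", "sex": "female"},
    {"name": "Patient B", "date_of_birth": date(1952, 1, 3),
     "postcode": "NE10 8XQ", "sex": "male"},
]
cleaned, key = pseudonymise(records)
print(cleaned)
```

Whether this level of coarsening is sufficient depends on the dataset; small groups, such as the oldest patients in a sparsely populated area, may still be identifiable.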

    There is an important distinction between using data for service evaluation and carrying out research. The Medical Research Council has a helpful tool to decide if a project is a service evaluation or research (http://www.hra-decisiontools.org.uk/research/index.html). The data analytics discussed in this article is a service evaluation rather than research and consequently it does not need ethical approval.

    Summarizing the data

    Many different formulas are available in Excel to extract information from dates. For example, patient age can be calculated from the difference between the date of birth (in column H in Figure 2) and the date the spreadsheet was created, referred to as ‘general.date’ (in column CQ in our dataset). Make sure that both columns are in date format and use the DATEDIF formula to find age at last birthday. In the next available column (in our dataset this is column EN) type =DATEDIF(H2,CQ2,"y") and the AGE at last birthday will appear in cell EN2. Cascade the command down the column by double clicking in the lower right hand corner of cell EN2. Patients with a missing date of birth have age 116 as Excel interprets blanks as 1st January 1900.
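For readers working outside Excel, the age calculation can be sketched in Python as follows. The function name is hypothetical; it is intended to give the same whole-year result as the formula above, and it returns no age for a missing date of birth rather than the misleading value of 116.

```python
from datetime import date

def age_at_last_birthday(born, on):
    """Whole years from date of birth to the reference date."""
    if born is None:
        return None  # keep missing dates of birth missing
    years = on.year - born.year
    # Take one year off if this year's birthday has not yet been reached.
    if (on.month, on.day) < (born.month, born.day):
        years -= 1
    return years

reference = date(2016, 6, 15)  # hypothetical 'general.date'
print(age_at_last_birthday(date(1980, 6, 16), reference))
print(age_at_last_birthday(date(1980, 6, 15), reference))
print(age_at_last_birthday(None, reference))
```

A date of birth stored as 1st January 1900 would still produce 116 against a 2016 reference date, so such entries are worth checking before ages are summarized.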

    The data can be summarized within Excel using pivot tables. To obtain a pivot table, place the cursor on any entry within the spreadsheet and click on INSERT and PIVOT TABLE.

    For example, to summarize the age range of patients in the pivot table:

  • Drag ‘AGE’ to the ROWS and the patient number to the VALUES;
  • Click on ‘pivot chart’ to get a bar chart. The bar chart shows the current age distribution of patients in the practice.

    Excel has reasonable help facilities, accessed by clicking on ‘?’ at the top right of the screen.
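The pivot table of ages amounts to counting patients in each age group. A minimal Python sketch of that count is given below; the ten-year bands, the function name and the list of ages are illustrative assumptions, not values from the practice dataset.

```python
from collections import Counter

def age_band(age, width=10):
    """Label the band an age falls into, e.g. 34 -> '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

ages = [3, 27, 34, 35, 41, 48, 52, 67, 71, 8]  # hypothetical patient ages
band_counts = Counter(age_band(a) for a in ages)

# Print a rough text bar chart, youngest band first.
for band in sorted(band_counts, key=lambda b: int(b.split("-")[0])):
    print(f"{band:>7} {'#' * band_counts[band]}")
```

Grouping into bands keeps the chart readable and makes it easier to compare against published population figures, which are often reported in age groups.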

    We can now produce some insight from the data. For simplicity, all analyses within this article were carried out using Excel; however, other statistical programs can be used.

    Enriching the data

    At this point we now have a large, well-organized dataset, in our case, regarding patient demographics and appointments. National demographics are useful to the practice as a comparison to help understand unmet need, effectively targeting marketing. Demographic data are available from numerous open sources, UK-wide and for local areas. In example 1, we integrate public data on age distribution with our established dataset.

    Analysing the data

    Example 1: Patient demographics and access

    We have used ONS data to compare the age range of the practice patients with the North East population (Figure 3). We can see that the practice appears to be capturing a representative sample of the general population overall. The practice age distribution is similar to the NE demographic except that there are fewer very young children and older adults and more people in middle age. A practice which appeared to be low in the 25–45 age group may consider initiatives such as late opening for those who have difficulty attending during work hours.

    Figure 3. Age distribution of North East region and the practice.

    Comparing the practice and regional age distributions helps determine whether clinical activity in the practice is meeting the needs of the population.10 As the minutiae of NHS practice and individual dentist performance is compared numerically to regional and national averages by the BSA, it will increasingly be necessary to develop a good understanding of how local demographics can deviate from regional norms.

    The patient postcodes are included in the EXACT dataset. The postcodes can be matched up to geographical locations using other open data, and mapping programs can be used to visualize the location density of the practice patient population. The postcodes can also be allocated to ONS local area census codes (called LA11), as shown in analyses elsewhere.11 There are about 300 households in each local area and a full range of demographic information can be accessed, including unemployment rates, deprivation levels, numbers of children under 16 and numbers of adults over 65. This information can be used to explore possible correlation between levels of deprivation and high dental need11 or patient behaviour such as missed appointments.

    More advanced statistical analyses can be used to predict areas of growth and drop-out and lead to targeted advertising to attract new patients or take up of new treatments. Specialist statistical software, such as the commercial package SPSS or the freeware ‘R’, is needed for these analyses.

    Example 2: Patient recall and attendance

    Our test practice is interested in maintaining long-term patient contact, to prevent deterioration in patient health and comply with guidelines on appropriate recall intervals.12 The data were used to analyse which patients attend regularly, and which patients are due for reassessment but have not been in contact.

    The number of patients returning for treatment can be visualized by plotting a bar chart of the year of last appointment; this potentially gives an estimate of how many patients have left the practice.

    In our example, 10,602 patients (42% of the dataset but 50% of patients with a recorded date of last visit) were seen in the last 24 months.

    The maximum recall suggested by NICE guidance is 24 months for adult patients;12 patients who have not attended for over 2 years are likely to be due a dental appointment. This analysis provided evidence for the practice manager to use EXACT to send a recall to patients whose last attendance was between 2 and 3 years ago; it was assumed that those patients who last attended more than 3 years ago have moved out of the area or changed dental practice.
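The recall rule described here can be written as a small classification. The Python sketch below is an illustration only: the status labels, the snapshot date and the sample visits are hypothetical, and treating a gap of exactly 2 years as due for recall is an assumption.

```python
from datetime import date

def whole_years(earlier, later):
    """Completed years between two dates."""
    years = later.year - earlier.year
    if (later.month, later.day) < (earlier.month, earlier.day):
        years -= 1
    return years

def recall_status(last_visit, today):
    """Classify a patient by time since last visit."""
    if last_visit is None:
        return "no recorded visit"
    elapsed = whole_years(last_visit, today)
    if elapsed < 2:
        return "seen within 24 months"
    if elapsed < 3:
        return "send recall"            # last seen 2 to 3 years ago
    return "assumed to have left"       # last seen 3 or more years ago

today = date(2016, 12, 1)  # hypothetical snapshot date
visits = {"A": date(2016, 3, 10), "B": date(2014, 6, 1),
          "C": date(2011, 1, 20), "D": None}
statuses = {pid: recall_status(v, today) for pid, v in visits.items()}
print(statuses)
```

Keeping patients with no recorded visit in their own group avoids mixing missing data with genuine non-attendance.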

    Example 3: Patient turnover

    Any practice depends on a steady inflow of new patients that at least equals the rate of attrition. Our practice is interested in the rate of turnover and trends in new patient attendance over the past ten years. This is of relevance in managing levels of contracted NHS activity and assessing the impact of market conditions on the business.

    The number of new patients to the practice was quantified for each year since 2006. There appears to be a steady influx of patients, with peaks in 2011 and 2012 (Figure 4).

    Figure 4. Patient numbers by year of first visit.

    The practice gained an orthodontic contract in 2012, hence the peak may relate to this giving wider appeal to new patients. Monitoring the number of new patients is informative, for example before and after an advertising campaign or introduction of a new product or service to the practice. Further statistical analysis can be carried out with specialist software to see if there is a significant difference between the observed number of new patients and the expected number of new patients.

    To ascertain the turnover of patients in the practice, the number of new patients needs to be compared with the number leaving the practice. Patients who have not attended for a specified number of years can be identified from the dataset. We can find the number of days since the last visit and set a cut-off point; in this case we looked at patients who had not attended for 5 years or more.

    In our dataset, 4,186 patients had a missing last visit date; 5,792 patients out of 20,990 had a last visit 5 or more years ago, therefore 28% of patients had not attended for 5 years or more. These patients merit further characterization to see if there is a pattern developing.

    We now need to match the numbers in each year who have not attended for 5 years or more with the numbers of new patients in order to find the net number of new patients every year (Figure 5).

    Figure 5. Net number of new patients per year.

    The net number of patients increased in recent years and demonstrates a steady growth. This is useful information when considering hiring associates and tendering for new NHS contracts. A negative trajectory could be a warning sign that changes need to be implemented.
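The net calculation can be sketched as below. This Python example rests on an assumption that is not spelled out in the text: a patient who has not attended for 5 years or more is counted as leaving in the year of their last visit. The field names, function name and sample records are hypothetical.

```python
from collections import Counter
from datetime import date

def net_new_by_year(patients, snapshot, lapse_years=5):
    """New patients per year minus patients assumed to have left that year."""
    joined = Counter(p["first_visit"].year for p in patients)
    left = Counter()
    for p in patients:
        last = p["last_visit"]
        if last is None:
            continue  # missing last visit: cannot be classified
        if (snapshot - last).days >= lapse_years * 365.25:
            left[last.year] += 1  # assumption: counted in year of last visit
    years = sorted(set(joined) | set(left))
    return {y: joined[y] - left[y] for y in years}

snapshot = date(2016, 12, 1)  # hypothetical date of the data download
patients = [
    {"first_visit": date(2006, 3, 1), "last_visit": date(2008, 5, 1)},
    {"first_visit": date(2006, 7, 1), "last_visit": date(2016, 2, 1)},
    {"first_visit": date(2008, 1, 15), "last_visit": date(2010, 11, 30)},
    {"first_visit": date(2010, 9, 9), "last_visit": date(2015, 6, 1)},
    {"first_visit": date(2010, 10, 10), "last_visit": None},
]
net = net_new_by_year(patients, snapshot)
print(net)
```

For the share quoted earlier, 5,792 lapsed patients out of the 20,990 with a recorded last visit is 5,792 / 20,990, or about 28%.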

    Example 4: Gender mix

    A straightforward analysis of gender mix gives considerable insight. This example is aimed at understanding the patient community and enhancing access. We explore the data to see if it reflects the experience of other healthcare settings.

    Looking at the gender of the patients, the overall proportion of female patients is 53% (ie 13,441 females out of 25,176). The gender mix is interesting as it appears to be changing over time, with more females seen recently (Figure 6).

    Figure 6. Numbers of female and male patients by year of last visit.

    The proportion of female patients appears to be rising. This could imply that attendance patterns are changing. Patient behaviour varies considerably from time to time and it is important not to read too much into a single analysis. If the practice considers gender balance to be important then this can be further explored by statistical analysis and monitored by a control chart.13

    The proportional increase in favour of female patients contrasts with a trend towards reduced gender disparity in accessing other primary care health services.14 Medical primary care has a well-established higher female attendance rate across a broad age range, but the differential is closing; in our data it appears to be widening. This did not cause concern, however, as our differential was lower than for GP visits, and is comparable with the gender gap in attendance found when focusing on those undergoing treatment for specific conditions,15 as is typically the case for dental attendees. There are slightly more women than men in the North East region. According to 2015 ONS census data, the ratio of women to men is 1.04;16 however, the ratio of women to men whose last visit was in 2016 is 1.26, as shown in the graph.

    If the gender mix is found to be significantly different from expectation, then it would be worth exploring how access could be improved and how the practice could be made more appealing to both sexes.

    Example 5: Missed appointments

    The practice wishes to minimize the rate of missed appointments, as these are a burden on operational efficiency. ‘Date of Last Missed Appointment’ was extracted from the Contact List data. This does not show how many appointments a patient has missed, just that they have missed one at some point in time. There were 7,949 patients who had missed at least one appointment. The number of patients missing appointments has increased more recently, probably because the number of patients has increased. Further analysis could be carried out on the 7,949 patients by selecting a sample of those who have missed appointments recently and carrying out an in depth review of their demographics and dental treatments.

    Looking at missed appointments before and after introducing a text reminder service would provide evidence to justify the expense of implementing such a process.

    The dataset contains the length of the last missed appointment so the time lost due to missed appointments can be calculated, for example the number of hours of missed appointments in the test practice over the last 5 years is 1,450 or approximately 60 days.
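The time-lost figure is a sum of appointment lengths converted to hours. A short Python sketch of that arithmetic follows; the field names, function names and sample rows are hypothetical, and because the Contact List holds only each patient's last missed appointment, a total built this way is likely to understate the true loss.

```python
from datetime import date

def years_before(day, years):
    """The same calendar day a number of years earlier."""
    try:
        return day.replace(year=day.year - years)
    except ValueError:  # 29 February with no match in the earlier year
        return day.replace(year=day.year - years, day=28)

def hours_lost(missed, snapshot, years=5):
    """Total hours of missed appointments in the years before the snapshot."""
    cutoff = years_before(snapshot, years)
    minutes = sum(m["length_min"] for m in missed if m["date"] >= cutoff)
    return minutes / 60

snapshot = date(2016, 12, 1)  # hypothetical date of the data download
missed = [
    {"date": date(2016, 1, 5), "length_min": 10},
    {"date": date(2014, 3, 3), "length_min": 15},
    {"date": date(2012, 6, 1), "length_min": 5},
    {"date": date(2010, 1, 1), "length_min": 30},  # outside the 5-year window
]
print(hours_lost(missed, snapshot))
```

On the reported total, 1,450 hours divided by 24 is about 60, so the figure of 60 days corresponds to 24-hour days rather than working days.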

    Most missed appointments were found to be 5, 10 or 15 minutes in length. This may suggest double booking for short examination slots would be beneficial. The time of day and the age range of patients with missed appointments is also interesting. If they are mostly in the 25–35 age range, this could be due to work or childcare constraints and the practice could be advised to offer later appointment times to improve access for this group; or provide a nursery area.

    The name of the dental practitioner who the patient failed to attend can also be revealing but this highlights the importance of not using data analytics to single out individuals. Work practice was extensively studied in the manufacturing industry where it was found beneficial to involve and empower the work force for quality improvement.13 However, it could be useful to highlight any behavioural concerns and training opportunities amongst associates which could be discussed at appraisal.

    Comparison with a second snapshot of data taken a few months later could be useful to investigate a specific query. A new dataset can be downloaded after implementing a change in the practice, for example, offering adult orthodontics. An overview of the data could reveal an increase or decrease in specific demographics or improved attendance within a selected age range.

    Reporting and recommendations

    The findings from these examples of applying data analytics to the test dental practice were shared with the practice staff and recommendations were made. Some actions and changes under consideration as a result of the analysis include:

  • The practice manager now sends out reminder letters to those who have not attended in the last 24 months because this was highlighted as a concern;
  • Introduction of longer opening hours to increase flexibility of appointments;
  • Ensure patients at either end of the age spectrum are aware that they can attend for check-ups, for example oral cancer screening for edentulous patients;
  • Steady growth and population mix are satisfactory so no change needed at present.

    Discussion

    Digital record keeping systems provide a ‘big data’ resource that can be utilized to provide meaningful insight into the patient demographics of a dental practice. Numerous web-based applications and consultants are available to carry out this analysis; however, this article aims to provide the initial steps into practical data analysis. Open data sources can add a further layer of information to practice data by comparing with local or national populations.

    Data analytics can be used to evaluate the effect of changes in skill mix activity within the practice by looking at net patient numbers and patient mix. The changes in numbers of missed appointments, DNAs and financial income can be monitored in successive time periods. There are many further applications of data analytics which are relevant for dental practice data but which have not been illustrated in this introductory article. Previous research has shown that treatment needs can be predicted for specific demographic groups leading to opportunities to monitor practice performance, tailor services offered and prepare promotional material.17 Data analytics can be used to address queries arising from the DAF (Dental Assurance Framework) benchmarks. The techniques can also be extended to exploring the coverage of the catchment area and comparing several practices or geographical locations (https://fingertips.phe.org.uk/).

    Conclusion

    This article aims to show what insight can be gained from basic data analytics. Further insight can be gained from either studying the techniques yourself or hiring someone to do it for you.