Continuous Glucose Monitors: Does Better Accuracy Mean Better Glycemic Control?
Accuracy is good, but precision is essential.

Introduction
Pop quiz: Your CGM displays a glucose level of 120 mg/dL, but a finger-stick blood glucose meter (BGM) displays 180 mg/dL. What do you do next?
Take insulin because your “real” blood sugar is high.
Calibrate your CGM because the BGM is “more accurate”.
Ignore it because you don’t really understand how all this works anyway.
Sorry to say, it’s a trick question. To actually know what to do, and what not to do, you need to understand a lot more about how CGMs work. And unless you know this, it’s very hard to manage T1D as well as you could.
Making in-the-moment decisions, such as whether to eat, bolus insulin, or do nothing at all, should never be based on a single glucose value. That’s because glucose is always moving through your body, at different concentrations in different places. This basic fact lies at the heart of the trick question—and of what makes managing T1D so difficult. Knowing your ACTUAL, SYSTEMIC blood glucose level is simply not possible. Full stop.
That said, the glucose values you get from a CGM or a BGM are not entirely random either. There’s a degree of error with each read, but that error rate is not constant. Different conditions affect that volatility, or rather, how much error is involved.
This is an extremely complex topic, but if you really want the gritty, technical details, see the article, “Differences in venous, capillary and interstitial glucose concentrations.” Much of my article here is based on findings from this and other studies. In this particular paper, the authors explain how and why there’s great variability in glucose readings using different types of measuring methods, and how that variability isn’t just a matter of diffusion time (delay).
In short, the researchers demonstrated physiological differences between venous, capillary, and interstitial glucose levels—especially after meals, with interstitial readings deviating by more than 30%. Yes! That’s a lot! And the implications are profound, as we will investigate thoroughly in this article.
Self-managing T1D does not require a sophisticated understanding of these findings, only the top-level headline: Learning to see—and react to—glucose patterns is the way to achieve healthy glycemic control. While specific, isolated glucose values, such as 180 or 120, may be more or less accurate under different conditions, what’s far more important is understanding glucose patterns: the effects your actions have on glucose levels. Which foods do what; the nature of exercise; how sleep and daily routines affect glucose levels. Until you develop these observations—which requires paying attention—the readings generated by your CGM can feel like watching the stock market: a chart that seems to have a life of its own.
So, again: PATTERNS. Individual readings are nearly useless; what matters is how readings change in response to your actions. And to see that, you need a CGM that gives you reliable data designed for pattern recognition.
Your ability to recognize and react to patterns takes time and a great deal of empirical experience (watching your CGM a lot and seeing what happens when you do things). Based on that level of engagement, I’ve sorted T1Ds into four categories:
Category 1: People who are either new to T1D or haven’t taken the time to make these connections about glucose patterns. They rarely look at their CGM, and leave it to their care team to analyze the data for patterns that might suggest adjusting basal dosing, bolus dosages for meals, or other tips and tricks to improve glucose levels.
Category 2: People who’ve likely had T1D for a few years, are generally older than 25, or are engaged enough to realize that managing T1D requires at least some broad steps. They look at their CGM 3-5 times a day to see if glucose patterns are particularly high or low before they decide to take insulin or eat (or leave it alone).
Category 3: People who are highly engaged and familiar with their personal glucose patterns. They look at their CGM, not just to react to current glucose patterns, but to take action in anticipation of doing something (such as eating, sleeping, etc.).
Category 4: People who really aim to be healthy. They watch their CGM data every hour (or more) and make micro-adjustments (insulin or carbs) as necessary to maintain tighter control.
You can probably quickly identify which category you’re in. If you’re in category one or two—where you don’t really engage with the data very closely—most any CGM on the market will suffice (with some exceptions), because most will show generally reliable long-term patterns—days, weeks, or months at a time. You may see that your nighttime glucose levels are higher, or that your glucose tends to spike too high after meals and never come down. These patterns are good enough for clinicians to make grand, sweeping recommendations. “Let’s raise your nighttime basal rate,” for example.
For this group, the “accuracy” of a CGM is entirely irrelevant, because few CGMs are so inaccurate that your doc can’t assess your high-level glycemic management.
For those who are more engaged and watch their CGM data closely—which is absolutely essential for the healthiest outcomes—this is when the product you use really matters. This is especially true for those who use automated insulin pumps.
It’s here where you’d think that a more “accurate” CGM would be better. But it’s also where we get back to the earlier point about the way glucose moves in the body. Glucose is not smoothly distributed throughout the body, like dye in water. While you can measure a sample of water that has dye in it and say, “It’s blue,” you cannot measure a sample of blood and infer that all your blood has the same amount of glucose in it.
Instead, glucose in the blood is more like sludge creeping down a polluted river with a lot of other muck. If you sample a cupful of that thick water, the amount of “sludge” in that sample can be weighed. But you cannot declare that the entire river has the same level of sludge. The only thing you can infer is that that cup has that amount of sludge.
The likelihood that any single sample of sludge is representative of the total amount in the river depends on two things: volatility and total sludge levels. If the amount of sludge is low, the samples are more likely to be consistent with one another. But as the volume of sludge increases—say, you’ve just had lunch—the distribution varies a lot. One cupful may not look much like another. The higher the sludge—or the higher your glucose levels—the less “accurate” any given reading is.
The same is true when the amount of sludge is changing, up or down. As levels change, measurements become less consistent with one another.
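To make the sludge analogy concrete, here is a toy Monte Carlo sketch in Python. The noise model is entirely my own invention for illustration (the 0.03 and 2.0 coefficients are made up, not measured); it simply assumes that read-to-read scatter grows with both the glucose level and the rate of change, which is the behavior the analogy describes.

```python
import random

def sample_reading(true_glucose, roc_mg_per_min, rng):
    """One simulated sensor read. Toy noise model (my invention, not
    Dexcom's): scatter grows with glucose level and rate of change."""
    noise_sd = 0.03 * true_glucose + 2.0 * abs(roc_mg_per_min)
    return rng.gauss(true_glucose, noise_sd)

rng = random.Random(42)
for level, roc in [(90, 0.0), (90, 3.0), (250, 0.0), (250, 3.0)]:
    reads = [sample_reading(level, roc, rng) for _ in range(1000)]
    mean = sum(reads) / len(reads)
    sd = (sum((r - mean) ** 2 for r in reads) / len(reads)) ** 0.5
    print(f"true={level} ROC={roc} -> read-to-read spread (SD) ~ {sd:.1f} mg/dL")
```

Run it, and the printed spread grows sharply as the level or the rate of change climbs: the same cup of river water, sampled under different conditions, tells increasingly different stories.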
This is why CGM values can be highly deceptive. If your CGM shows 250 mg/dL, that does not necessarily indicate a specific action should be taken. A single value by itself is not only meaningless, but it may even be an accurate anomaly. Yes, that sample may have shown that level of glucose, but few other samples may yield the same result.
This alludes to the paper I mentioned earlier: Subjects were given meals and their glucose levels were tracked from both blood and interstitial tissues. The researchers found great variability among all these sources—blood, capillary and interstitial tissues all yielded varying readings. If you’re going to make dosing decisions on glucose levels after a meal, your glucose variability is such that no CGM is “accurate” enough to predict exactly the right amount of insulin you’d need.
In other words, measuring glucose has inherent error built into it. Hence, “accuracy” is not reliable for making clinical decisions, at least, not by itself. You need a lot more data than that—namely, sets of values that are in statistically likely proximity to one another.
This is called precision—when you use multiple data points rather than one—as opposed to “accuracy”, where only ONE read has to match the value of another device.
It also explains why “accuracy” is more of a marketing term than a meaningful measure of a CGM’s utility.
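To put the distinction in code rather than words, here is a minimal Python sketch. The mard() function follows the standard accuracy definition; jitter() is just one possible precision proxy I chose for illustration (the average jump between consecutive readings), not an industry metric, and all three traces are hypothetical numbers.

```python
def mard(cgm, reference):
    """Accuracy: mean absolute relative difference against paired
    reference readings (the standard MARD definition), in percent."""
    return 100 * sum(abs(c - r) / r for c, r in zip(cgm, reference)) / len(cgm)

def jitter(cgm):
    """A precision proxy of my own choosing (not an industry metric):
    the average absolute jump between consecutive readings."""
    return sum(abs(b - a) for a, b in zip(cgm, cgm[1:])) / (len(cgm) - 1)

reference = [119, 121, 122, 124, 125, 127]  # hypothetical "true" values
smooth = [118, 120, 121, 123, 124, 126]     # precise: tracks the trend
noisy = [110, 131, 115, 134, 118, 138]      # decent single reads, but wobbly

for name, trace in [("smooth", smooth), ("noisy", noisy)]:
    print(f"{name}: MARD={mard(trace, reference):.1f}%  jitter={jitter(trace):.1f} mg/dL")
```

Both traces sit close to the reference on average (the noisy one still scores a respectable MARD), but only the smooth one tells you where glucose is headed.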
This article here argues that precision, not accuracy, is the most useful CGM quality for real-time T1D management. And it’s not just conjecture—researchers have studied this phenomenon and published their results in the article, “Limits to the Evaluation of the Accuracy of Continuous Glucose Monitoring Systems by Clinical Trials.”
Here, the authors describe the erratic and random patterns of glucose fluctuations, and call into question the appropriateness of how clinical trials for CGMs are conducted in the first place. Popular accuracy metrics (like MARD) can be misleading without considering trends and consistency.
Real-World Scenario
A perfect illustration of this can be seen in the chart below, which shows CGM data from both the G6 and G7 that I wore at the same time. The G7, in red, is purportedly “more accurate” than the G6 because it meticulously weighs each sample of sludge from the river, gives you its weight, and then moves to the next sample. The G6, by contrast, is doing a more sophisticated analysis on the signals it gets from the sensor and makes a more “precise” inference on what the body’s glucose levels are. That is, “precision” is a far more useful metric because that’s where adjacent readings are more in line with one another, so you can more easily see glucose trends.
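Dexcom does not publish its filtering algorithms, so I cannot show you what the G6 actually does under the hood. As a hedged stand-in for the general idea, here is the simplest possible smoother, an exponential moving average, applied to a hypothetical wobbly trace:

```python
def ema_smooth(raw, alpha=0.3):
    """Exponential moving average: blend each new raw reading with the
    running estimate. Dexcom's actual filtering is proprietary; this is
    only a stand-in to show the smoothing trade-off."""
    out = [raw[0]]
    for r in raw[1:]:
        out.append(alpha * r + (1 - alpha) * out[-1])
    return out

raw = [150, 171, 148, 166, 140, 158, 133, 149]  # hypothetical wobbly 5-minute reads
print([round(v) for v in ema_smooth(raw)])       # a much steadier trace emerges
```

The raw reads bounce around; the smoothed series settles into a steady drift you could actually act on. Whatever Dexcom’s real filter looks like, the trade-off is the same: each displayed value is less faithful to the instantaneous sample, but far more useful for reading trends.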
Remember the pop quiz at the beginning? A finger-prick blood glucose test might yield a value like 180 mg/dL, but the CGM might only show 120. Which value is “correct” is not straightforward, nor will it always be consistent. If you were to treat the 180 value as “correct” and administer 2u of insulin, that may work… or it might be overkill if the 180 value was an anomaly caused by the erratic nature of glucose concentration levels. Here, the CGM value of 120 might have been wiser.
If you’re a T1D and learning to make in-the-moment management decisions—or rely on an automated insulin pump to read this data—which would you prefer? The “accurate” G7? Or the “precise” G6?
Let’s put it to the test: Below is a familiar screenshot from the Dexcom G6 app:
The “down arrow” on this Dexcom reading indicates glucose levels are dropping. And yes, 75 is definitely low. But you can’t infer very much more than that. How fast is it dropping? How high was it before? How long did it take to get from that peak to this level? This is what you need to know to make an informed decision on what to do next.
Let’s expand to see the whole chart now:
Here, we see that the levels were quite high (180s), then dropped to roughly the 150s, and then suddenly took a dive in the last 30 minutes. But we can also see that it’s starting to level off. Yes, only ONE reading suggests this tapering, but that’s enough to matter: it means you can anticipate near-term patterns, say, 30 minutes from now.
That’s why I ignore directional arrows: they can’t answer these questions. And worse, they can distract you from what you should be doing: learning how to read CGM charts.
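Instead of trusting the arrow, you can compute the trailing rate of change yourself. Here is a minimal Python sketch; the 5-minute cadence matches the G6’s reporting interval, while the 4-reading window and the numbers in the example trace are assumptions of mine for illustration:

```python
def rate_of_change(readings, minutes_apart=5, window=4):
    """Trailing rate of change in mg/dL per minute: a least-squares
    slope over the last `window` readings. The 5-minute cadence matches
    the G6; the window size is my own assumption."""
    ys = readings[-window:]
    xs = [i * minutes_apart for i in range(len(ys))]
    n = len(ys)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

trace = [182, 175, 160, 142, 120, 95, 82, 78, 75]  # hypothetical dive, then taper
print(f"{rate_of_change(trace):+.1f} mg/dL/min")    # still falling, but slowing
```

On this made-up trace the slope comes out near -1.3 mg/dL/min, much shallower than the dive that preceded it; that deceleration is exactly the tapering-off signal described above.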
Since this is my own data (but not from the chart shown earlier), I know the following: I took 3u of insulin an hour ago for a yogurt I ate that had 45g net carbs. I also know what I’m going to do next: Sit and keep writing this article. (If I were going to exercise, I would not have taken insulin because the activity would have consumed the carbs and brought my glucose levels lower. For more on how exercise affects glucose levels, see this article.)
Knowing those things (at least!) is how T1D management is done best. I am a Category 4 T1D: I pay attention to things I do and how they affect glucose values (the CGM chart). In this case, I knew there were no other significant factors involved, so I knew to take 15g of dextrose tabs.
So, was I right?
Now that some time has passed, let’s look at the chart again.
Hey! 108 is a good number, and the slight upward trend doesn’t bother me. Still, I’ll keep an eye on it.
Here’s the thing that made all this happen: I TRUSTED THE G6 DATA. It was smooth, the trend line was reliable, and the readings did not wobble around. Most importantly, it was TIMELY. I was able to make an in-the-moment decision when it was at 75. By contrast, if the data was “wobbly,” I would have had to wait longer—potentially much longer—before the pattern stabilized, and by then, it might have been too late.
To illustrate the wobbly nature of data, let’s look at that G6/G7 chart one more time and see how the red (G7) plots are erratic:
Let’s zoom in on the 4:00pm window. Here’s an enlargement of that data:
The G6 data is smooth, but the G7 is bouncing all around like a buoy bobbing in turbulent ocean waters. Each G7 reading may well be “accurate” in isolation, but they offer no help in determining directionality or rate of change.
I’ll return to this screenshot later in this article, but you can easily see that the differences between the G6 and G7 are clear and obvious: I can rely on the G6 data to make an informed decision, but anomalous spikes or dips that the G7 produces are intermittent and—by themselves—are entirely unreliable. You cannot and should not take insulin or eat just because there seems to be one or two (or more!) individual readings that move suddenly up or down. You have to wait much longer before you can infer a reliable pattern from the G7, and by then, you may have missed real, actionable opportunities to make corrections. And that’s a big deal in T1D management. Time is critical when it comes to glycemic management, because things spin out of control very fast.
Now, let’s be honest. If you’re a Category 1 or 2 person, you may not be watching your CGM like this. In fact, you may be using an automated insulin pump to do that work. But remember, those systems are doing exactly the same read-by-read analysis that I do for myself, making the same in-the-moment decisions I described earlier. No algorithm can figure out G7 data any better than you can. So, those systems won’t work if the data being observed is not reliable enough. Garbage in, garbage out.
And that’s where the risk lies with G7 data. But more importantly, that’s where the risk lies with so-called “accurate” data. You don’t want accuracy, you want precision.
Read that bold text again—seriously. Out loud. With a British accent if you must. It’s that important.
Now let’s expand this to a real-world self-management test, where I wore a G6 and G7 at the same time for a month to see which sensor gives better data to make better in-the-moment clinical decisions. Read on.
Does the G7 yield greater glucose control?
Before I explain how I tested the G6 vs the G7, I need to make it clear that Dexcom’s clinical trial demonstrating the G7’s MARD was never intended to claim that the G7 results in healthier outcomes. That’s a very different goal. The company conducted an “efficacy trial,” which is only meant to show that the sensor performs well enough to be approved by the FDA.
What Dexcom did not do is perform an “effectiveness trial,” which is when test subjects would wear each of the two sensors and make real-time management decisions under real-world conditions. I explain this in much greater detail in my article on how to evaluate clinical trials.
Since no trials have tested the G7 this way, I did it on myself. As it happens, my T1D is under very tight control, where my time in range is 95%, with <2% below range (70 mg/dL) and <4% above range (180 mg/dL).
NOTE: I do not aim for this level of tight control. I do not have “targets” in mind. I am not fanatical or obsessed about numbers. I merely follow the four basic habits of T1D management, which I describe in my article, The Four Habits of Healthy T1Ds. Habit #1: Watch your CGM and get to know your patterns; Habit #2: Make small interventions with carbs or insulin “as needed.”
You can read the article to see what habits #3 and #4 are.
As the data from my experiments will show, I was only able to achieve a TIR of 75-80% using the G7. What’s more, I also experienced considerably more hypo events and greater variability with the G7, which can be harmful.
There’s a lot of detail here, so let’s start with my experiment.
During March 2023, I wore both the G6 and G7 at the same time, but would only observe data from one sensor’s app at a time to make real-time management decisions. That’s right, I did NOT look at both apps and compare them in real time. My goal was to determine which data made it easier or better to make in-the-moment decisions. After a period of a few days, I switched to the other sensor’s app. I repeated this back-and-forth between the two sensors several times.
Upon completion of the experiment, I downloaded all my data to Excel and analyzed it to see how my TIR varied between the two. I also collected data for insulin (InPen bluetooth enabled insulin pen), carbohydrates, exercise, sleep, and glucose levels from my Contour Next One blood glucose meter (BGM), which I included in my analysis report.
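For anyone who wants to reproduce this kind of analysis, here is roughly how the day-by-day time-in-range numbers can be computed in Python with pandas instead of Excel. The file name and column layout are hypothetical (my stand-in format, not Dexcom’s actual export):

```python
import pandas as pd

# Hypothetical long-format export: one row per reading with columns
# "timestamp", "sensor" ("G6" or "G7"), and "glucose" (mg/dL).
# The file name and column names are my assumptions, not Dexcom's format.
df = pd.read_csv("march_2023_cgm.csv", parse_dates=["timestamp"])

df["in_range"] = df["glucose"].between(70, 180)
daily_tir = (
    df.groupby([df["timestamp"].dt.date, "sensor"])["in_range"]
      .mean()                 # fraction of readings in 70-180
      .mul(100)               # convert to percent TIR
      .unstack("sensor")      # one column per sensor
)
print(daily_tir.round(1))     # one row per day, %TIR for each device
```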
The graphic below is the topline dashboard from my month wearing both the G6 and G7:
The first thing that pops out is that the G7 reported glucose values ~5% lower than the G6, which is consistent with what others have reported online. Aside from that, the two sensors appear roughly equivalent: The G6 averaged 121 mg/dL, versus the G7’s 116, and the standard deviations (SD) were 33 vs. 34, respectively.
But the real difference between the two sensors is shown by the time-in-range (TIR) stats on a day-by-day basis, as shown in the following graph:
When I used the G6 to make decisions, I achieved a TIR of >90%. When I used the G7, my TIR dropped to the ~70% range. The reason is obvious: The G7 data would at times present patterns that looked like my glucose levels were moving in a particular direction at a particular rate, which would have meant either taking insulin or consuming carbs. But in actuality, those patterns were anomalies, so my actions would send my real glucose levels off in the wrong direction. And sometimes, dangerously so.
Let’s zoom into the two-hour window between 4-6pm that I illustrated earlier. This kind of movement is highly representative of the kind of volatility seen in the G7 versus the G6, and why it’s hard to make real-time decisions.
Remember, I couldn’t see the G6 data (the smoother blue graph), so at 5:30pm, with only the G7 data in view, I saw the very rapid rise from 88 to 155 in a matter of 30 minutes. Granted, the data leading up to that was highly erratic, but these successive readings were not; they were decisively rising, and fast. Without any idea where these levels might top out, especially given the rapid rate of change, I thought I needed to start bolusing.
As I always do, I began with small, incremental boluses, keeping a close eye on those glucose levels as they rose, waiting to see when they would level off or begin to fall. The goal is to avoid taking too much, or too little. I’m aiming for the Goldilocks effect.
Turns out, the G7’s data shot up to 270. If that had really been my glucose level, the stacked boluses I’d taken would have corrected it perfectly, and I would have had a soft landing. But as the insulin started to kick in, my glucose levels plummeted to 49, making it clear that the G7 readings were not giving me reliable information. Individual readings may have been “accurate,” but they were not representative of my actual systemic glucose levels.
To achieve tight glucose control, one must be able to look at short time windows and respond quickly to glucose movements, even with finely tuned adjustments. (Most people aren’t in tight control and typically work on bigger time windows, so they won’t be as affected by these erratic readings.)
Over time, these anomalous readings will create more errors in decisions (or pump algorithms) than successes, which will impose an upper limit on how well they can ever perform.
Below are more daily charts to consider (without additional commentary). You can zoom in on your own and guess how and why I was able, or unable, to see trends in time to make decisions proactively.
The G7 generally reports lower BG values
While both the G6 and G7 were tighter (SD=29 and 28), the G7’s volatility is apparent.
The G7 appears to behave better this day, but real-time decisions were based on G6 data
The day was 100% in range, but the G7’s data was all over the map. (Thanks, G6!)
The Dexcom G7 trial: Exploring the futility of “accuracy.”
In Dexcom’s published report, “Accuracy and Safety of Dexcom G7 Continuous Glucose Monitoring in Adults with Diabetes,” 318 diabetic subjects wore three G7 sensors simultaneously over the course of ten days. For three of these days, subjects underwent clinically induced hyperglycemia and hypoglycemia under controlled conditions, where blood samples were taken and measured using a reference blood glucose analyzer, the YSI 2300 Stat Plus. The analysis showed that the “mean absolute relative difference” (MARD) between sensor readings and the reference was ~8.8% for the G7, versus ~10% for the G6. The lower the percentage, the smaller the difference from the reference analyzer. Hence, greater accuracy.
Let me remind the reader that the G7 trial had subjects wear THREE G7s simultaneously during the testing period. When blood samples were taken and measured on the YSI device, Dexcom then compared that value to the G7 readings.
Ok, wait. If the person was wearing three G7’s, which one was used for the comparison? Or were all three averaged together? Or did Dexcom choose whichever of the three happened to be closest to the YSI? The company doesn’t reveal this in the trial data, and that alone raises my eyebrows.
All this further substantiates the paper I cited earlier about the inappropriateness of evaluating CGM values against a single-read reference device. Any claims about MARD should not be taken at face value.
Moreover, MARD values in Dexcom’s data varied widely, especially under different conditions, such as glucose levels and rates of change, as shown in this figure from their report.
The mean and median per-sensor MARDs were 8.8% and 7.8%, respectively; 442 sensors (71.4%) had MARD values <10%, and 12 (1.9%) had MARD values >20%.
According to Dexcom’s data, the G7’s MARD was best when glucose values were in the sweet spot of glycemic ranges, but diminished as glucose levels edged higher. The bar graph suggests the best MARD happened most often at ideal glucose ranges, but most T1Ds only spend about 30% of their day in those ranges. The rest of the day is spent far outside, usually well above 180 mg/dL, where the G7’s MARD rating is well above 14%.
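This is why a single pooled MARD figure can mislead. As a quick illustration, here is a Python sketch that computes MARD separately per glucose band; the band edges and the matched (CGM, reference) pairs are hypothetical, chosen only to show the mechanics:

```python
def stratified_mard(pairs):
    """MARD per glucose band, from matched (cgm, reference) pairs.
    A single pooled MARD can hide how much accuracy degrades outside
    the 70-180 sweet spot. Band edges here are illustrative."""
    bands = {"<70": [], "70-180": [], ">180": []}
    for cgm, ref in pairs:
        key = "<70" if ref < 70 else ("70-180" if ref <= 180 else ">180")
        bands[key].append(abs(cgm - ref) / ref)
    return {k: round(100 * sum(v) / len(v), 1) if v else None
            for k, v in bands.items()}

# Hypothetical matched (CGM, reference) pairs, not trial data:
pairs = [(62, 55), (118, 120), (150, 160), (240, 205), (310, 260)]
print(stratified_mard(pairs))
```

On these made-up pairs, the pooled MARD would look respectable while the >180 band, where T1Ds spend much of their time, comes out more than four times worse than the in-range band.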
What is also not revealed in Dexcom’s report is the rate of change (ROC), which can also greatly affect MARD. Dexcom limited its testing to changes of only 1 mg/dL per minute, which still showed some of the worst-performing MARD values. In the real world, once a T1D eats a meal, glycemic levels can change at 2-4 mg/dL per minute as a matter of course. Relying on CGMs to capture that data is prone to significant error. (The G6’s algorithm is far superior in this regard, smoothing out these errors and giving the user or algorithm more reliable data to work with.)
To see how this variability in MARD plays out under real-world conditions, we can look at this meta-analysis of multiple studies on overall glucose levels for T1Ds who wear CGMs. It shows that only 30% of T1Ds have glucose levels between 70-180 mg/dL 70% of the time, which is where the G7 is most accurate. By contrast, 80% of T1Ds spend more than 70% of their time above 180 mg/dL, where the G7’s error exceeds 30%. (For context, 44.5% have an A1c between 7–9%, 32.5% exceed 9%, and only 23% of T1Ds have an A1c <7%.)
Despite the fact that the G7 is most accurate at glucose levels between 70-180, T1Ds spend far more time well above 180. Hence, T1Ds experience accuracy error rates of >30% most of the time. This means that the decisions humans or algorithms make about whether to dose insulin or carbs are based on highly imperfect information (especially compared to the G6, which was more reliable).
Summary
I personally suspect that few people will find that the G7 helps T1Ds improve their glycemic control. This will also be a problem for automated insulin pumps, for the same reasons.
Nevertheless, I suspect Dexcom is primarily focused on the value of the improved MARD rating in their marketing plans. It’s invaluable to claim that your MARD is superior to all other CGMs, regardless of the dubious value of MARD.
It also helps that Dexcom’s target market is moving well beyond T1Ds into the T2D market, where there are nearly 40 million T2Ds, with another 98 million presumed to be undiagnosed. That, plus a very rapidly emerging market of non-diabetic “life-hackers”, such as athletes, health enthusiasts, and everyday consumers. In fact, Dexcom has released a non-prescription version of the G7 called Stelo, and these people don’t care that much about volatility.
Of course, the downside for T1Ds is that some could actually see worse outcomes, and not even realize it. The G7’s propensity to report lower glucose averages (than what is actually in the bloodstream) may give people the false impression that their glycemic control has actually improved with the G7.
I hope the G6 never goes away. But a better idea would be to pair the G6’s algorithms with the G7 sensor. Here’s a marketing plan: Sell the G7 with the G6 algorithm as a less expensive over-the-counter product for the comparatively small number of T1Ds who actually need higher-quality data—that is, better precision—to manage glucose levels. We’re already paying too much for all the other stuff we have to buy, and we’re a tiny market compared to the rest of the world. This way, everyone’s a winner!