Volatile Structs Are Broken

For at least a year we’ve taken a break from reporting volatile bugs to C compiler developers. There were various reasons: the good compilers were becoming pretty much correct, we faced developer apathy about volatile incorrectness, our volatile testing tool had some weird problems, and basically I just got bored with it. Today my PhD student Yang Chen got a new and improved volatile testing tool working, with the goal of testing the volatile correctness of the new CompCert 1.8 for x86. Instead, he immediately found some distressing bugs in other compilers. As a side note, Pin, the basis for Yang’s new tool, is pretty damn cool.

Given this C code:

struct S1 {
  int f0;
};

volatile struct S1 g_18;

void func_1 (void) {
  g_18 = g_18;
}

All recent versions of GCC for x86 (I tried 3.0.0, 3.1.0, 3.2.0, 3.3.0, 3.4.0, 4.0.0, 4.1.0, 4.2.0, 4.3.0, 4.4.0, 4.5.0, and today’s SVN head) and Clang for x86 (I tried 2.6, 2.7, and today’s SVN head) give equivalent wrong code:

regehr@john-home:~$ gcc-4.4 -fomit-frame-pointer -O -S -o - small.c
func_1:
    rep
    ret

The generated code must both read from and write to the struct. Aren’t there any embedded systems broken by this? LLVM-GCC 2.1 through 2.7 and Intel CC generate correct code.
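To make the required behavior concrete, here is a minimal sketch that spells the same assignment out as one scalar volatile load followed by one scalar volatile store. The name func_1_explicit and the temporary tmp are hypothetical, introduced only for illustration. I would expect compilers to keep both accesses in this form, since scalar volatile accesses are the well-trodden case, but I have not run this variant across the versions listed above, so treat it as a possible workaround rather than a verified one.

```c
struct S1 {
  int f0;
};

volatile struct S1 g_18;

/* Hypothetical variant of func_1: same effect as g_18 = g_18, but written
   as an explicit read of the only field followed by an explicit write. */
void func_1_explicit (void) {
  int tmp = g_18.f0;  /* read access to the volatile object */
  g_18.f0 = tmp;      /* write access to the volatile object */
}
```

Compiling this the same way as small.c above (with -O -S) and checking for a load from g_18 and a store to g_18 in the body of func_1_explicit is a quick way to see whether a given compiler preserves the accesses.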

And here is the obligatory link to our 2008 paper Volatiles Are Miscompiled, and What to Do about It. But obviously we’re not doing enough. Bugzilla links are here and here. (Update: fixed in LLVM within three hours of being reported. Cool!)

September 30, 2010

regehr

Computer Science, Embedded, Software Correctness

2 responses to “Volatile Structs Are Broken”

xilun says:

October 6, 2010 at 4:37 pm

Maybe this was different with C90 but C99 indeed does not seem to mandate generation of assembly instructions reading then storing back g_18: “6.7.3 Type qualifiers […] What constitutes an access to an object that has volatile-qualified type is implementation-defined.”

So while IMO g_18.f0 = g_18.f0 * 2; should clearly generate load, mul, then store instructions, implementing func_1() as a nop seems to be compliant with both 5.1.2.3 and 6.7.3. This is also consistent with the only clear requirement for volatiles: to be usable for signaling between signal handlers and the rest of the program.

(Even though a note in C99 says that volatile “may be used to describe an object corresponding to a memory-mapped input/output port”, this obviously can only happen on systems where all bus accesses are strongly ordered throughout every hardware layer; otherwise the order of instructions generated by gcc would be meaningless with regard to what happens on the bus. This is not the case for x86, and is now increasingly rare even on embedded systems.)

So maybe the only question that remains is: does “rep ret” execute as “ret” (and does it do so on all targeted x86 CPUs)? If this is not the case, this clearly is a bug. Otherwise, probably not.

regehr says:

October 7, 2010 at 12:27 am

Hi Xilun- Yes, what constitutes an access to a volatile variable is implementation defined. But you have missed the fact that all reasonable C implementations consider a read from a volatile variable to be a read access and a write to a volatile variable to be a write access. We discussed this in some detail in the EMSOFT paper linked from the post.

This is a real bug. As I noted in an update, it was fixed in LLVM within hours of being reported. I expect that it will get fixed in GCC as well.