Is this email not displaying correctly? View it in your browser.
Hi Listheads,
Our hearts go out to everyone affected by Hurricane Sandy. We hope you and your loved ones are safe and sound.
We are amazed by the inventiveness of the submissions we got for last week's programming contest. We had no idea our readers had so much free time.
We've compiled a list of our favorite submissions. We got some spectacular entries and it was tough choosing a winner. But we're not made of ice cream, so we had to pick just one.
Misha Dynin, for the following entry:
1 #ifdef _
2
3 _(A)_(B) _(C)_(D) _(E)
4 _(F) _(G) _(H) _(I)_(J)
5 _(K) _(L) _(M) _(N)
6 _(O) _(P) _(Q)_(R)
7 _(S) _(T) _(U)
8 _(V)_(W) _(X)
9
10 _(Y) _(Z)_(a) _(b)_(c) _(d)_(e)
11 _(f) _(g) _(h) _(i)
12 _(j)_(k) _(l) _(m)_(n)
13 _(o) _(p)_(q) _(r)
14 _(s) _(t) _(u)
15 _(v)_(w) _(x) _(y)_(z)
16
17 #else
18 #define _(z)_##z)();typedef _##z(*
19 #include <stdio.h>
20 typedef int(*
21 #include __FILE__
22 #define __(_,z)_z _(__ y){return fputs(z,stdout),y?y(0):(_z)_0;}
23 #undef _
24 __)();__(_0,"")__(nl,"\n")__(_," ")
25 #define _(_)__(_,#_)
26 #include __FILE__
27
28 int main() {
29 (L)(i)(n)(k)(e)(d)(_)(L)(i)(s)(t)(N)(Y)(C)(nl);
30
31 return 0;
32 }
33
34 #endif /* http://mishadynin.com/ */
We spent much of the afternoon trying to understand this program, and we think we've figured it out. If you want to follow along at home, we suggest running cc -E misha.c
, which will run the C preprocessor and expand the macros.
The source is split into two parts by a large #ifdef _
. The first half contains an ASCII art LinkedList logo and the second half contains the program's code. When the preprocessor starts running through the source, _
is undefined, so the ASCII art is ignored.
We'll start in the #else
branch by unpacking the first macro:
#define _(z)_##z)();typedef _##z(*
This becomes clearer if you add a space:
#define _(z) _##z)();typedef _##z(*
Macros that take arguments don't require a space after the closing parenthesis. For this macro, _(A)
would evaluate to _A)();typedef A(*
. The ##
s let us replace z
even when it's in the middle of a word. If we didn't have those, _(A)
would evaluate to _z)();typedef _z(*
.
Immediately after this, we include stdio.h
and then we have the seemingly incomplete line typedef int(*
. To understand what this line does we'll need to look at the next line too.
The piece of line 21 that we didn't initially know was the __FILE__
macro. __FILE__
evaluates to the full path of the current file surrounded by quotes. This means, if you have misha.c
in /tmp
, __FILE__
will be "/tmp/misha.c"
. With this in mind, we know now that line 21 actually includes misha.c
inside of itself!
When we include the file, we start at the top again, but this time _
is defined, so we ignore our code and evaluate our ASCII art. On closer inspection, we find that our art is actually just a bunch of invocations of the _
macro, and combined with our first incomplete typedef produces the following code (reformatted slightly):
typedef int(*_A)();typedef _A(*_B)();typedef _B(*_C)();typedef _C(*_D)();
typedef _D(*_E)();typedef _E(*_F)();typedef _F(*_G)();typedef _G(*_H)();
typedef _H(*_I)();typedef _I(*_J)();typedef _J(*_K)();typedef _K(*_L)();
typedef _L(*_M)();typedef _M(*_N)();typedef _N(*_O)();typedef _O(*_P)();
typedef _P(*_Q)();typedef _Q(*_R)();typedef _R(*_S)();typedef _S(*_T)();
typedef _T(*_U)();typedef _U(*_V)();typedef _V(*_W)();typedef _W(*_X)();
typedef _X(*_Y)();typedef _Y(*_Z)();typedef _Z(*_a)();typedef _a(*_b)();
typedef _b(*_c)();typedef _c(*_d)();typedef _d(*_e)();typedef _e(*_f)();
typedef _f(*_g)();typedef _g(*_h)();typedef _h(*_i)();typedef _i(*_j)();
typedef _j(*_k)();typedef _k(*_l)();typedef _l(*_m)();typedef _m(*_n)();
typedef _n(*_o)();typedef _o(*_p)();typedef _p(*_q)();typedef _q(*_r)();
typedef _r(*_s)();typedef _s(*_t)();typedef _t(*_u)();typedef _u(*_v)();
typedef _v(*_w)();typedef _w(*_x)();typedef _x(*_y)();typedef _y(*_z)();
typedef _z(*
You'll notice that the last line is incomplete. It'll be completed later.
These are all typedef'd function pointers. If you're a bit shaky on function pointers, or pointers in general, take a look at the Clockwise/Spiral Rule. To create a type out of a function pointer, you just put typedef
in front of the pointer declaration. The name of the pointer becomes the name of the type.
On line 22, we define the __
macro:
#define __(_,z)_z _(__ y){return fputs(z,stdout),y?y(0):(_z)_0;}
which can be rewritten for clarity as follows:
#define __(_,z) _z _(__ y) { return fputs(z,stdout), y ? y(0) : (_z) _0; }
This macro is especially confusing because it uses _
as the name of its first parameter. This does not get expanded to the _
macro. It is treated as a normal macro parameter. This macro defines a function with its name taken from _
. The function has one parameter, y
, which has type __
. This is not the same as the macro __
and doesn't get expanded recursively because this occurrence of __
doesn't have any arguments. The function prints z
to standard out and then returns either y(0)
or _0
casted to the type _z
. We've already seen that _z
was typedef'd to a function pointer, but we haven't seen _0
yet. While we don't know what type __
is, seeing y(0)
leads us to believe that __
is a function pointer type.
In C, multiple expressions separated by commas evaluate to the value of the last expression, which explains how we can have two expressions in the return statement. The return value of fputs
is discarded.
After undefining the _
macro on line 23, we have a relatively dense line 24:
__)();__(_0,"")__(nl,"\n")__(_," ")
Let's unpack that:
__)();
__(_0,"")
__(nl,"\n")
__(_," ")
The first line completes our long string of typedefs from earlier, defining the type __
as a pointer to a function that returns a function pointer of type _z
. This confirms our assumption that __
is a function pointer type, which we hypothesized after we saw y
treated as a function in the definition of the __
macro.
The next three lines use the __
macro to define three functions, _0
, nl
, and _
, which print the empty string, a newline, and space, respectively.
Next we define one more macro, called _
:
#define _(_)__(_,#_)
which cleans up to
#define _(_) __(_,#_)
The new piece here is #_
, which evaluates to _
surrounded by double quotes. Thus _(a)
becomes __(a,"a")
which becomes _z a(__ y){return fputs("a",stdout),y?y(0):(_z)_0;}
, a function called a
that prints "a" and returns a pointer to a function of type _z
.
With _
redefined, we include __FILE__
one more time. Our ASCII art is once again interpreted as a bunch of macros, but this time, we define functions that print each of the letters rather than defining function pointer types.
The last interesting line is line 29:
(L)(i)(n)(k)(e)(d)(_)(L)(i)(s)(t)(N)(Y)(C)(nl);
This doesn't look like valid C at first, but let's make one small change to make it more familiar:
L(i)(n)(k)(e)(d)(_)(L)(i)(s)(t)(N)(Y)(C)(nl);
Here we call the L
function, and pass it the i
function as an argument. L
prints "L", and then returns i(0)
(look at the __
macro if you're lost). i
prints "i" and then returns _0
casted to a _z
. Recall that _0
is the same as L
, i
and all the other lettered functions, but it prints the empty string instead of a letter.
Because L(i)
returned i(0)
and i(0)
returned a pointer to _0
, we're now back inside main
with a handle on _0
that we immediately call with n
as its argument. You can almost imagine that main
now looks like this:
_0(n)(k)(e)(d)(_)(L)(i)(s)(t)(N)(Y)(C)(nl);
L
returns a _z
, which was defined as typedef _y (*_z)()
(i.e., it's a type that is a pointer to a function that returns a _y
and takes no arguments). If this is the case though, how are we allowed to call L
's return value with one argument? The answer is that the function declaration int foo()
is different than int foo(void)
. The second declares a function foo
that takes zero arguments while the first declares a function foo
without argument information. Because our typedef of _z
doesn't have argument information, our compiler doesn't complain when we try to call a function pointer of type _z
with one argument. There is a Stack Overflow answer with more information.
Calling _0(n)
prints nothing and then returns n(0)
, which prints "n" and returns _0
leaving us with the equivalent of
_0(k)(e)(d)(_)(L)(i)(s)(t)(N)(Y)(C)(nl);
This goes on until we get to _0(nl)
which prints nothing, calls nl(0)
which prints a newline and then returns a final pointer to _0
which we ignore.
Pretty awesome, right? We just have one minor correction, Misha. Line 29 should read:
(L)(i)(n)(k)(e)(d)(L)(i)(s)(t)(_)(N)(Y)(C)(nl);
Peace out, cub scouts.
Nick & Dave
P.S. Please email us if you have any corrections or clarifications for our explanation!
*|IFNOT:ARCHIVE_PAGE|* *|LIST:DESCRIPTION|*
If you don't want to receive the LinkedList anymore, unsubscribe.
*|LIST:ADDRESS|*