The general idea would be to create a function that prints the same string twice, and feed it the rest of the program's code, so that when run, the output shows exactly what is in the file. Another way to approach it would be to print out the program as an entire string of text, then insert it into itself, so that the entire output, of course, is exactly what is written in the program.
Now, these ideas will need some tweaking, of course, and that is the part that will vary by language. The most troublesome elements to correctly duplicate, with most languages, are line breaks in the code, and the quotation marks used around the text string.
For many languages, the first problem can be avoided simply by putting the entire program into a single line.
To avoid the problem of printing quotation marks, which often have to be escaped by a backslash \ or a single quote, the ASCII character value for the double quotation marks, 34, can be used.
I can't cover all the bases, of course, but I will explain how to do this in the languages that I do know.
In C, the 'printf' function outputs a string in quotes or a string variable. That string can be followed by a list of comma-delimited arguments that will replace certain character sequences in the first string.
If the string contains %c, that character sequence will be replaced with the character denoted by an integer value. For example, running:
printf("h%cllo.",101);
will simply output:
hello.
101 is the ASCII value of the character 'e,' which is inserted in place of %c in the string.
Similarly, if the original string contains %s, that sequence will be replaced by a string. The output of:
printf("h%slo.","el");
is, again, just:
hello.
Also, more than one of these %-escaped sequences may be used, and will be read in order, left to right. For example, the code snippet:
printf("h%c%s%c.",101,"ll",111);
will also print out
hello.
Finally, the strings used can be string or character variables, which can also contain %-escape sequences. The code:
char*s="h%c%so.";
char*t="ll";
printf(s,101,t);
will, one last time, just print out:
hello.
This will be useful in our program if we can insert a string into itself. Something like:
char*s="abc%sdef";
printf(s,s);
Will output the following:
abcabc%sdefdef
That's exactly what we want; the first time the string is printed, the %s is replaced by the entire string, but then the %s prints out exactly as it appears. That being said, the following is a program in C that will print itself:
char*s="char*s=%c%s%c;void main() {printf(s,34,s,34);}";
void main() {
    printf(s,34,s,34);
}
The first time string s is printed, both instances of %c are replaced with ASCII character 34, quotation marks ("), and the %s is replaced with s itself. The second time, the %c%s%c is printed out just like it looks in the code, as we have seen above.
Now, I have used normal grouping/indenting techniques to make it more readable, and the output will be on one line, but that's okay because semicolon-delimited statements in C can be written out on a single line.
Here's what the program (and the output) should really look like. It won't look like it's on one line, so I'll use a hanging indent to show that the same line continues below; an indented line joins directly onto the end of the line above it. (If I stretched the page out as far as some of these programs require, levik would have my head.)
char*s="char*s=%c%s%c;void main(){printf(s,34,s,34);}";
    void main(){printf(s,34,s,34);}
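The claim that the one-line program reproduces itself can be checked without a C compiler. The following is only a sketch in Python 3, not part of the original solutions: it assumes that Python's % operator handles %c and %s the way printf does here (an integer given to %c becomes the character with that code), and the names template, source and output are mine.

```python
# The text of the C string s, exactly as it appears between the quotes.
template = 'char*s=%c%s%c;void main(){printf(s,34,s,34);}'

# The complete one-line C source: the declaration of s followed by main().
source = 'char*s="' + template + '";void main(){printf(s,34,s,34);}'

# What printf(s,34,s,34) would produce: both %c become the character
# with code 34 (a double quote) and %s becomes the string itself.
output = template % (34, template, 34)

print(output == source)  # True
```

If the comparison prints True, the text produced by the format string is character for character the same as the one-line source.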
An equivalent program can place the declaration of s inside the main function; it works the same either way:
void main(){char*s="void main(){char*s=%c%s%c;printf
    (s,34,s,34);}";printf(s,34,s,34);}
So that's one. C++ can use the same functions as C in the same way, but we have a problem. The printf function is part of the C standard library, and while older C compilers will accept a call to a function they have not seen declared, a C++ compiler will not: it has to see a declaration of printf first. We need to tell it where to look, using an #include directive.
Just as before, the output of the program will be on a single line. However, #include is one of the few things in C++ that has to end with a new line; we cannot just put the whole program on a single line because it will not compile.
That means that we need to have a new line in our output. Putting \n into a string will insert a new line, but then to output the \n in our program, we need to print it by escaping the backslash, as \\n. Then, since \\n is now part of the program, we need to print that out, but both backslashes need to be escaped, as \\\\n. It soon becomes clear that we cannot do this directly.
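The way the backslashes pile up can be seen with a few lines of Python 3. This is only an illustrative sketch: escape_once and escape_levels are hypothetical helper names, and escape_once handles just the two characters that matter here (backslash and new line), not the full set of C escapes.

```python
def escape_once(text: str) -> str:
    """Write text the way it would have to appear inside a C string literal."""
    # Backslashes are doubled first, then a real new line becomes \n.
    return text.replace("\\", "\\\\").replace("\n", "\\n")


def escape_levels(depth: int) -> list:
    """Escape a single new line `depth` times, keeping every stage."""
    stages = []
    current = "\n"
    for _ in range(depth):
        current = escape_once(current)
        stages.append(current)
    return stages


for stage in escape_levels(3):
    print(stage)
# prints:
# \n
# \\n
# \\\\n
```

Every further level of quoting doubles the backslashes again, so the string would never catch up with the code that prints it.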
However, there is an ASCII character for a new line that we can use, just like we did with the quotation marks. The new line character has an ASCII value 10.
Using this, and declaring main as int and s as const char* (which standard C++ expects), we can make the following program:
#include <stdio.h>
const char*s="#include <stdio.h>%cconst char*s=%c%s%c;int main(){
    printf(s,10,34,s,34);}";int main(){printf(s,10,34,s,34);}
which prints itself, line breaks and all.
Equivalent algorithms can be constructed for other languages, of course.
Since we already have a program in C, C-based languages will likely be easiest to tackle next.
PHP is very similar, except that all variable names begin with a $ and do not need a declared type. Also, the main() function is not needed, but the entire program begins with <?php and ends with ?>. The string is written in single quotes here, so the character value used is 39 rather than 34. The complete program I came up with reads as follows:
<?php $s='<?php $s=%c%s%c; printf($s,39,$s,39); ?>'; printf
    ($s,39,$s,39); ?>
Also note that PHP is used to generate HTML, so if you try to put this into a file and look at the result, you won't see anything. If you look at the source of the page, however, you will indeed see that the output is just as you have written it.
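A Perl program will also look much the same as in C, but there is no need for a main() function, and the arguments of printf do not need to be put into parentheses. As in PHP, the string is written in single quotes (ASCII 39), because Perl would substitute the value of $s into a double-quoted string:
$s='$s=%c%s%c;printf$s,39,$s,39;';printf$s,39,$s,39;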
If you hadn't noticed, using the printf function makes these all very similar and somewhat trivial.
To write the program in JavaScript, we will not be able to use the printf function. We can, however, use the substring function, a member function of the String class, which returns part of a string. We will use document.write to write text to the browser window.
Calling substring with one argument returns everything from the specified position to the end of the string. For example:
s="abcdef";
document.write(s.substring(3));
Will show:
def
Called with two integers as arguments, it returns the portion of the string from the first index up to, but not including, the second.
The code:
s="abcdef";
document.write(s.substring(2,4));
when run, will display only:
cd
To make this program, we can create a string that contains the rest of the code besides itself, print out the first part of it, quotation marks, then the entire string, closing quotation marks, then the rest of the string. To print quotation marks, they need to be escaped inside a string with a \. This will cause the same problem we saw with \n in the C++ code above. To get around this, we can use the String class member function fromCharCode to convert the ASCII value to a character.
Here is the program:
s="s=;document.write(s.substring(0,2)+String.fromCharCod
    e(34)+s+String.fromCharCode(34)+s.substring(2));";
document.write(s.substring(0,2)+String.fromCharCode
    (34)+s+String.fromCharCode(34)+s.substring(2));
Writing the program in Java works the same way as with JavaScript. However, Java is used to create classes, not necessarily to write programs. So, we will create a class, which needs a main() function as in C and C++, and use the same substring technique as with JavaScript (also note that the actual indices used will change with the length of the string).
Implementation will be slightly easier now, because Java allows variable typecasting, which means we can convert an integer directly into an ASCII character by simply putting (char) in front of it.
Here is the code for a class that prints itself out when main() is called:
class S{static public void main(String[]x){String s="
    class S{static public void main(String[]x){String s=;
    System.out.print(s.substring(0,52)+(char)34+s+(char)
    34+s.substring(52));}}";System.out.print(s.substring
    (0,52)+(char)34+s+(char)34+s.substring(52));}}
Another Java implementation that I found, which defines a character variable instead of calling (char)34 every time, reads as follows:
class S{public static void main(String[]a){String s="class S{
    public static void main(String[]a){String s=;char c=34;
    System.out.println(s.substring(0,52)+c+s+c+s.substring(52));}}";
    char c=34;System.out.println(s.substring(0,52)+c+s+c+
    s.substring(52));}}
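The index passed to substring (2 in the JavaScript program, 52 in the Java ones) is simply the position of the gap after the = where the quoted string belongs. As a rough sketch in Python 3, with split_index and splice being hypothetical helper names of mine, the index can be computed rather than counted by hand, assuming the gap is marked by the first "=;" in the string:

```python
QUOTE = chr(34)  # the double quotation mark, built from its ASCII value


def split_index(template: str) -> int:
    """Index just after the first '=' that is directly followed by ';'."""
    return template.index("=;") + 1


def splice(template: str) -> str:
    """Mimic substring(0,n) + quote + s + quote + substring(n)."""
    n = split_index(template)
    return template[:n] + QUOTE + template + QUOTE + template[n:]


# The strings used by the JavaScript program and the first Java program.
javascript = ("s=;document.write(s.substring(0,2)+String.fromCharCode(34)"
              "+s+String.fromCharCode(34)+s.substring(2));")
java = ("class S{static public void main(String[]x){String s=;"
        "System.out.print(s.substring(0,52)+(char)34+s+(char)34"
        "+s.substring(52));}}")

print(split_index(javascript))  # 2
print(split_index(java))        # 52
print(splice("s=;x"))           # s="s=;x";x
```

Changing anything before the gap (a longer class name, say) changes the index, which is why the number has to be updated whenever the string is.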
You can use ASCII character values in BASIC, just as in other languages. The following program does almost the same thing as the C program, but it builds its output from the ASCII values for s (115), $ (36), = (61), " (34), and : (58); s$ is what I called the string. Pretty much, it makes a string of those calls, including two references to the string itself, then actually prints it out. The ':' allows the program to be written on one line, and the '?' (shorthand for PRINT) prints out each of the chr$ values and the references to s$:
s$="?chr$(115)chr$(36)chr$(61)chr$(34)s$ chr$(34)chr$(58)
    s$":?chr$(115)chr$(36)chr$(61)chr$(34)s$ chr$(34)chr$
    (58)s$
Charlie offered another [neater] solution in BASIC, here, implementing the PRINT USING functionality in the same way as printf works in C-based languages:
a$ = "a$ = &&&: PRINT USING a$; CHR$(34); a$; CHR$(34):"
    : PRINT USING a$; CHR$(34); a$; CHR$(34):
Python is a language I have dabbled in to simulate online applications, but have never used extensively.
I found the following short program on the Internet:
l='l=%s;print l%%`l`';print l%`l`
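This one-liner relies on Python 2 syntax (the print statement, and backticks as shorthand for repr). On Unix-like systems, a first line such as #!/usr/bin/env python is needed only if the file is to be executed directly rather than handed to the interpreter, and the above program does not reproduce that line.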
Here is one that does:
#!/usr/bin/env python
s = """print '#!/usr/bin/env python'
print 's = ""'+'"'+s+'""'+'"'
print s"""
print '#!/usr/bin/env python'
print 's = ""'+'"'+s+'""'+'"'
print s
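Both programs above are Python 2. For Python 3, where print is a function and the backtick syntax no longer exists, the same idea can be sketched with the %r format code, which inserts the repr of the string, quotation marks and escapes included. This is my own minimal variant, not one of the quoted solutions; the variable name template is arbitrary, and the block has no comments because a comment would have to be reproduced by the program too.

```python
template = 'template = %r\nprint(template %% template)'
print(template % template)
```

Saved as a two-line file with nothing else in it, this should print exactly those two lines: %r supplies the quoted string with its \n still escaped, and %% collapses to a single % in the output.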
A language I haven't attempted yet is Prolog; actually, I have only just begun taking a class in it. A Prolog 'program' is only a list of predicates that can be called from an interpreter. The following program was all I could find, but I was rather disappointed: it only prints the first predicate, even though the implementation relies on a second predicate, writes, whose two clauses are not printed out when the predicate is called:
quine:-Q="write(quine),write((:-)),put(81),put(61),put(
    34),writes(Q),put(34),put(44),writes(Q),put(46),put(
    13),put(10),write(writes),put(40),put(91),put(72),pu
    t(124),put(84),put(93),put(41),write((:-)),write(put
    ),put(40),put(72),put(41),put(44),write(writes),put(
    40),put(84),put(41),put(46),put(13),put(10),write(wr
    ites),put(40),put(91),put(93),put(41),put(46)",write
    (quine),write((:-)),put(81),put(61),put(34),writes(Q
    ),put(34),put(44),writes(Q),put(46),put(13),put(10),
    write(writes),put(40),put(91),put(72),put(124),put(8
    4),put(93),put(41),write((:-)),write(put),put(40),pu
    t(72),put(41),put(44),write(writes),put(40),put(84),
    put(41),put(46),put(13),put(10),write(writes),put(40
    ),put(91),put(93),put(41),put(46).
writes([H|T]):-put(H),writes(T).
writes([]).
When you load this file and type quine. into the interpreter, it does indeed print out the contents of the predicate quine, but not the rest of the program.
So, it's not a completely valid solution, but I haven't found (or been able to write) anything better.
A completely different approach to this problem is to create a program that reads a source file and prints it, and then direct it to read itself.
This is most easily done in BASIC, which has a built-in function to do just that:
10 LIST
Silverknight also noted a .bat directive that will read itself and print to a command window (when the file is stored in test.bat):
type test.bat
Here is a sample program in C++ that reads its own source file (stored in 'readfile.cpp'):
#include <iostream>
#include <fstream>
using namespace std;
const char* file = "readfile.cpp";
int main() {
    char c;
    ifstream s(file);
    // get() fails at the end of the file, which ends the loop
    while (s.get(c))
        cout << c;
    s.close();
    return 0;
}
Just for fun, the following C++ program prints itself backwards:
#include <iostream>
#include <fstream>
using namespace std;
char charAtPos(int);
const char* file = "backwards.cpp";
int main() {
    char c;
    int count = 0;
    ifstream s(file);
    // count the characters in the file
    while (s.get(c))
        count++;
    s.close();
    // print them from last to first
    for (int p = count; p > 0; --p)
        cout << charAtPos(p);
    return 0;
}
// returns the character at position pos, counting from 1
char charAtPos(int pos) {
    char c = 0;
    ifstream s(file);
    for (int i = 0; i < pos; i++)
        s.get(c);
    s.close();
    return c;
}
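The same read-your-own-file trick is short in Python 3 as well. The sketch below is mine rather than one of the submitted solutions; read_source and reversed_source are hypothetical names, and it assumes the script is started from a file, so that sys.argv[0] holds the path to its own source.

```python
import os
import sys


def read_source(path: str) -> str:
    """Return the whole text of the file at path."""
    with open(path, encoding="utf-8") as handle:
        return handle.read()


def reversed_source(path: str) -> str:
    """Return the text of the file at path, last character first."""
    return read_source(path)[::-1]


if __name__ == "__main__":
    own_path = sys.argv[0]  # the path this script was started with
    if os.path.isfile(own_path):
        sys.stdout.write(read_source(own_path))
```

Swapping read_source for reversed_source in the last line gives the backwards version. Unlike the C++ program, it reads the file once instead of reopening it for every character.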
And, of course, there are always people who would ignore the lack of quotation marks around the word "itself," and submit responses such as:
C/C++:
main(){cout << "itself";}
(which needs #include <iostream> so the compiler knows what cout means, but that's beside the point).
COBOL:
MOVE "ITSELF" TO PRINTLINE
PUT PRINTLINE
BASIC:
PRINT "ITSELF"
Other 'valid' solutions were noted in the problem comments, in different languages, namely in BASIC, C, and Ruby.
This page contains links to 'quine' programs in nearly any language you can think of, including obscurities such as BlooP and Snack. The most interesting ones I saw were the programs in BeFunge and BrainF*** (languages which certainly live up to their names).
Solutions I have quoted here, other than those that came from comments to this problem, were written by:
Prolog - Pekka P. Pirinen
Python - Frank Stajano
Java - Bertram Felgenhauer