Quad-wheel, vroom puttaputta

Posted 04-14-2012 at 11:14 AM by SourceHorse
Updated 04-14-2012 at 11:20 AM by SourceHorse

 _______    |\_
| o   o |   /o \\
| o   o |  (_. ||
| o   o |   /__\\
'-------'   )___(

The Source Horse
Article 2
Quad-wheel, vroom puttaputta
The Source Horse is a trail log of free program execution paths. This article looks at quad-wheel, a minimal implementation of JavaScript. The name quad-wheel is probably inspired by V8, Google's JavaScript implementation. Quad-wheel is about 9,000 lines of C, while V8 is about 320,000 lines of C++. Quad-wheel's parser is written in Bison, and it produces opcodes for a virtual stack machine. The C code is easy to read and it draws from public sources. For example, associative arrays use a red-black tree implementation from [1].

Little languages are fun because they are easier to comprehend. It is a well-established tradition to embed a little language in a large program so that the larger program is easier to customize and extend: Emacs and elisp, the GIMP and SIOD, and so on. Over time, parts of the large program can be rewritten as scripts and vice versa. This churn is only possible because of the flexibility of scripts.

In JavaScript, the typeof operator can be used to test whether a variable is defined. I often use this to check whether a function has been given all of its required arguments. Quad-wheel lacks the typeof operator. Here is room for improvement. I do not mean to stirrup any trouble, but let's go offroad and tear up some dirt. Instead of just reading code, I will write some code to inflate the quad-wheel!

I cloned the Git repository [2]. I found the parser in a file named parser.y. I scanned this file and looked for operators.

180 %left '*' '/' '%'
181 %left NEG '!' INC DEC '~' TYPEOF VOID	/* - ++ -- typeof */
182 %left NEW								/* new */
Line 181 uses %left to declare a group of left-associative operators that share one precedence level, and TYPEOF is among them. That is interesting! Is it already there? I scanned the rest of the file. Although the TYPEOF token is declared, no grammar rule uses it. So I added a rule for it just above the VOID rule, as line 519 in the same file.

518 	| '!' expr				{ $$ = codes_join($2, code_not()); }
519	| TYPEOF expr                           { $$ = codes_join($2, code_typeof()); }
520 	| VOID expr				{ $$ = codes_join3($2, code_pop(1), code_push_undef()); }
The action on line 519 calls codes_join($2, code_typeof()). This joins the opcodes already generated for the operand expression ($2) with a typeof opcode, so that the operand is evaluated first and typeof is applied to the result. However, the changes do not end in the parser. I still need to define the TYPEOF keyword and the new code_typeof() function.
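The idea of joining code sequences can be sketched in a few lines of C. This is only an illustration under my own assumptions: the Codes type and the codes_single(), codes_concat(), and codes_free() functions are hypothetical stand-ins, not quad-wheel's actual OpCodes or codes_join().

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical opcodes for this sketch; quad-wheel's real list is in code.h. */
typedef enum { OP_PUSH_NUM, OP_NOT, OP_TYPEOF } OpKind;

/* A heap-allocated run of opcodes (a stand-in for quad-wheel's OpCodes). */
typedef struct {
    OpKind *ops;
    size_t  len;
} Codes;

/* Build a sequence holding one instruction. Returns NULL if allocation fails. */
Codes *codes_single(OpKind op)
{
    Codes *c = malloc(sizeof *c);
    if (!c) return NULL;
    c->ops = malloc(sizeof *c->ops);
    if (!c->ops) { free(c); return NULL; }
    c->ops[0] = op;
    c->len = 1;
    return c;
}

/* Append b's instructions to a, release b, and return a.
 * On allocation failure, return NULL and leave both inputs untouched. */
Codes *codes_concat(Codes *a, Codes *b)
{
    OpKind *grown = realloc(a->ops, (a->len + b->len) * sizeof *grown);
    if (!grown) return NULL;
    memcpy(grown + a->len, b->ops, b->len * sizeof *grown);
    a->ops = grown;
    a->len += b->len;
    free(b->ops);
    free(b);
    return a;
}

/* Release a sequence; accepts NULL. */
void codes_free(Codes *c)
{
    if (c) {
        free(c->ops);
        free(c);
    }
}
```

With these helpers, codes_concat(codes_single(OP_PUSH_NUM), codes_single(OP_TYPEOF)) yields a two-instruction sequence in which the operand comes first and the operator second. That ordering is what a stack machine needs.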

Some searching reveals that the pieces behind code_not() are spread across several files: code.h, code.c, eval.c, and lexer.c. The opcode is ultimately executed in eval.c, but that depends on declarations in the other files. I will work through those files in turn to prepare for code_typeof().

I searched for the VOID keyword in lexer.c.

077		{ "void", VOID },
078		{ "typeof", TYPEOF },
079		{ "__debug", __DEBUG }
Bison automatically generates constants from the parser grammar. The lexer associates various words with these constants. I added line 78 to associate "typeof" with the TYPEOF constant.
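A keyword table like this one is usually consulted after the lexer has scanned a whole word. Here is a minimal sketch of that lookup. The token values, the Keyword type, and classify_word() are names I made up for the example; quad-wheel's lexer.c may be organized differently.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical token codes; in a real build Bison generates these,
 * and the starting value here is arbitrary. */
enum { TOK_IDENT = 258, TOK_VOID, TOK_TYPEOF };

typedef struct {
    const char *word;
    int         token;
} Keyword;

static const Keyword keywords[] = {
    { "void",   TOK_VOID },
    { "typeof", TOK_TYPEOF },
};

/* Return the keyword token for word, or TOK_IDENT if it is not a keyword. */
int classify_word(const char *word)
{
    size_t n = sizeof keywords / sizeof keywords[0];
    for (size_t i = 0; i < n; i++) {
        if (strcmp(word, keywords[i].word) == 0)
            return keywords[i].token;
    }
    return TOK_IDENT;
}
```

The comparison is on the whole word, so an identifier such as typeofx is not mistaken for the keyword.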

I searched for the DEBUG opcode in code.h.

081	OP_DEBUG,		/* 0	-				DEBUG OPCODE, output top 							*/
082	OP_TYPEOF,		/* 0    -														*/
083	OP_LASTOP		/* 0	-				END OF OPCODE */
The virtual machine opcodes are declared in code.h. I added line 82 to declare OP_TYPEOF. I will use it later.

I searched for the code_debug() function in the same file.

205 OpCodes *code_debug();
206 OpCodes *code_typeof();
207 OpCodes *code_reserved(int type, unichar *id);
This section of the file declares the C functions that emit the virtual machine opcodes. I added code_typeof() on line 206. I will implement this function later.

I searched for the DEBUG opcode in code.c.

074	"DEBUG",
075	"TYPEOF",
076 };
This section of the file names the opcodes. I added line 75 to name the typeof opcode as "TYPEOF".
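The opcode enum in code.h and the name table in code.c have to stay in the same order, or the names will be printed for the wrong opcodes. One way to guard against that is a compile-time size check, sketched below. The reduced opcode list, the op_names table, and op_name() are my own illustration and assume a C11 compiler; I have not checked whether quad-wheel does anything similar.

```c
#include <stddef.h>

/* A reduced, hypothetical opcode list; OP_LASTOP marks the end,
 * in the same spirit as the enum in code.h. */
typedef enum { OP_NOP, OP_DEBUG, OP_TYPEOF, OP_LASTOP } OpCode;

/* Names in the same order as the enum above. */
static const char *const op_names[] = {
    "NOP",
    "DEBUG",
    "TYPEOF",
};

/* C11: refuse to compile if an opcode was added without a name. */
_Static_assert(sizeof op_names / sizeof op_names[0] == OP_LASTOP,
               "op_names must have one entry per opcode");

/* Map an opcode number to its name; out-of-range values map to "?". */
const char *op_name(int op)
{
    if (op < 0 || op >= OP_LASTOP)
        return "?";
    return op_names[op];
}
```

The check only catches a missing or extra entry. It cannot detect two names that were swapped, so the ordering still deserves care.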

I searched for the code_debug() function in the same file.

214 OpCodes *code_debug() { NEW_CODES(OP_DEBUG, 0); }
215 OpCodes *code_typeof() { NEW_CODES(OP_TYPEOF, 0); }
216 OpCodes *code_reserved(int type, unichar *id)
I added line 215 to implement the code_typeof() function. It emits the OP_TYPEOF opcode in the same way that code_debug() emits OP_DEBUG. The virtual machine executes the opcode later, and that behavior is implemented in eval.c.

I searched for OP_DEBUG and found it in eval.c. While scanning eval.c, I found the vprint() function, which could be used as a model for the typeof operator!

131 static const char *vprint(Value *v)
132 {
133	static char buf[100];
134	if (is_number(v)) {
135		snprintf(buf, 100, "NUM:%g ", v->d.num);
136	} else if (v->vt == VT_BOOL) {
137		snprintf(buf, 100, "BOO:%d ", v->d.val);
138	} else if (v->vt == VT_STRING) {
139		snprintf(buf, 100, "STR:%s ", tochars(v->d.str));
140	} else if (v->vt == VT_VARIABLE) {
141		snprintf(buf, 100, "VAR:%x ", (int)v->d.lval);
142	} else if (v->vt == VT_NULL) {
143		snprintf(buf, 100, "NUL:null ");
144	} else if (v->vt == VT_OBJECT) {
145		snprintf(buf, 100, "OBJ:%x", (int)v->d.obj);
146	} else if (v->vt == VT_UNDEF) {
147		snprintf(buf, 100, "UND:undefined");
148	}
149	return buf;
150 }
Further into the file, I copied OP_DEBUG and implemented the typeof opcode based on the vprint() function.

1026			case OP_DEBUG: {
1027				topeval1();
1028				if (TOP.vt == VT_OBJECT) {
1029					printf("R%d:", TOP.d.obj->__refcnt);
1030				}
1031				value_tostring(&TOP);
1032				printf("%s\n", tochars(TOP.d.str));
1033				break;
1034			}
1035			case OP_TYPEOF: {
1036				unichar *v;
1037				topeval1();
1038				if (is_number(&TOP)) {
1039					v = unistrdup_str("number");
1040				} else if (TOP.vt == VT_BOOL) {
1041					v = unistrdup_str("boolean");
1042				} else if (TOP.vt == VT_STRING) {
1043					v = unistrdup_str("string");
1044				} else if (TOP.vt == VT_NULL) {
1045					v = unistrdup_str("object");
1046				} else if (TOP.vt == VT_OBJECT) {
1047					if (TOP.d.obj->ot == OT_FUNCTION) {
1048						v = unistrdup_str("function");
1049					} else {
1050						v = unistrdup_str("object");
1051					}
1052				} else if (TOP.vt == VT_UNDEF) {
1053					v = unistrdup_str("undefined");
1054				} else {
1055					v = unistrdup_str("/* notreached */");
1056				}
1057				value_erase(TOP);
1058				value_make_string(TOP, v);
1059				break;
1060			}
1061			case OP_RESERVED: {
At this point in the code, the JavaScript has already been parsed and the data stack has already been set up. This is where the opcode is executed.

Typeof is a unary operator: it takes an expression as its operand. The expression is evaluated, the resulting value has a type, and typeof returns a string that names that type.

To keep the code concise, the implementation uses macros such as topeval1(), is_number(), and TOP.

The expression is on top of the stack. The topeval1() macro evaluates this expression, and associates the value with the top of the stack.
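As I read it, the evaluation step matters mostly when the top of the stack holds a variable reference instead of a plain value. The sketch below shows that idea on a toy value type. The Value layout and resolve_top() are assumptions for illustration only. I have not reproduced the real topeval1() macro, which may do more than this.

```c
/* Toy value types; only the ones needed for this sketch. */
typedef enum { VT_UNDEF, VT_NUMBER, VT_VARIABLE } ValueType;

typedef struct Value {
    ValueType vt;
    union {
        double        num;   /* VT_NUMBER: the number itself */
        struct Value *lval;  /* VT_VARIABLE: the slot the variable refers to */
    } d;
} Value;

/* If the slot holds a variable reference, replace it with a copy of the
 * value the variable refers to. Other values are left alone. */
void resolve_top(Value *top)
{
    if (top->vt == VT_VARIABLE) {
        *top = *top->d.lval;
    }
}
```

After this step the operator can inspect the .vt field without ever seeing VT_VARIABLE, which would explain why the typeof opcode has no branch for it even though vprint() does.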

The is_number() macro tests whether the associated value is a number. The .vt field identifies the value type. If the value is an object, then the .d.obj->ot field identifies the object type.

The unistrdup_str() function creates a Unicode copy of an ASCII string.

The value_erase() macro destroys the value on top of the stack and leaves an undefined value in its place.

The value_make_string() macro sets the top of the stack to a Unicode string value.

Line 1036 declares a C variable for the Unicode string returned by this operator.
Line 1037 evaluates the expression and stores the value on the stack.
Line 1038 if the operand value is a number,
Line 1039 then set the return value to "number".
Line 1040 otherwise, if the operand value is a logical boolean,
Line 1041 then set the return value to "boolean"
Line 1042 otherwise, if the operand value is a string,
Line 1043 then set the return value to "string"
Line 1044 otherwise, if the operand value is null,
Line 1045 then set the return value to "object"
Line 1046 otherwise, if the operand value is an object
Line 1047 and it is a function,
Line 1048 then set the return value to "function"
Line 1049 otherwise, if the operand value is an object and it is not a function
Line 1050 then set the return value to "object"
Line 1052 otherwise, if the operand value is undefined,
Line 1053 then set the return value to "undefined"
Line 1054 otherwise, the operand value type is something unexpected,
Line 1055 then set the return value to "/* notreached */"
Line 1057 destroy, erase, or free the value on top of the stack
Line 1058 set the return value on top of the stack
Line 1059 done executing the typeof opcode
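The same decision chain can be tried outside the interpreter. Below is a small standalone sketch that maps a tagged value to its typeof string. The Value and Object types here are simplified stand-ins that I invented for the example, and type_name() returns a constant string instead of a duplicated Unicode string, so it is not a drop-in replacement for the opcode.

```c
/* Simplified, hypothetical value and object tags. */
typedef enum {
    VT_UNDEF, VT_NULL, VT_BOOL, VT_NUMBER, VT_STRING, VT_OBJECT
} ValueType;

typedef enum { OT_PLAIN, OT_FUNCTION } ObjectType;

typedef struct {
    ObjectType ot;
} Object;

typedef struct {
    ValueType vt;
    Object   *obj;  /* only meaningful when vt == VT_OBJECT */
} Value;

/* Return the string that typeof would produce for v. */
const char *type_name(const Value *v)
{
    switch (v->vt) {
    case VT_NUMBER: return "number";
    case VT_BOOL:   return "boolean";
    case VT_STRING: return "string";
    case VT_NULL:   return "object";  /* typeof null is "object" in JavaScript */
    case VT_OBJECT: return v->obj->ot == OT_FUNCTION ? "function" : "object";
    case VT_UNDEF:  return "undefined";
    }
    return "undefined";  /* not reached for valid tags */
}
```

A switch over the tag lets the compiler warn when a new value type is added without a matching case, which an if/else chain cannot do.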

This is a passable typeof operator implemented in about 30 lines of C code. It can only be this compact because the heavy lifting has been done by the parser, the virtual machine, and the macros.

Happy traces to you,

The Source Horse

