If STRUCT ENTRY were an array of structures, then the array could be sent to the target as is and storage could be malloc-allocated on the target. The array would still be valid, as the indexes are virtual. I guess int on the PC is 32 bits while the int index on the target would be 16 bits, but the indexes would still be valid as long as the struct size was known on the target. If code generation were done, I would probably like to go for a Java-like abstract machine on the target, but a simple one of course. Then the language could be ported to other machines, maybe even as open source, although I would need some proprietary code to access the target MCU's peripherals. What do you think, and thank you for your input.
That is the advantage of the JVM approach. It even works for threaded code, as the runtime portion of threaded code is just a library of functions. The only difference between them and a subroutine is that the starting address value serves as the threaded code "instruction", and instead of a return, the last instruction(s) are those needed to update the thread's version of a PC, followed by an indirect jump.
Neither require application code (which uses the library) to be of the same license. For the threaded code it is "linked" to the runtime, and the runtime can have anything in it desired, thus a separate license, and only combined by user when the target object code is generated.
The advantage of adding the jump table is that instead of using the addresses directly, the thread instruction is used as an index into the jump table - thus needing a few more instructions, but making the application completely independent of the runtime, and not requiring a "link them together" step other than the vm needing a loader to put it in an appropriate place in memory.
In either case, it is not possible (usually due to the lack of memory protection) to separate the virtual machine entirely from the application - but it is very close, as the VM just copies the "application" into a typeless (or nearly so) array before starting interpretation. If the two can be combined externally (installed into the flash as a unit), you eliminate the need for the VM load - and only the base boot loader would be used to copy from flash to RAM. In some architectures, the "ram" would also be the flash - eliminating the loader entirely. It would all look like the BIOS at that point.
Sorry, forgot to answer the first part of your question about struct entry.
The problem with that is the size of the struct entry - there is no need to bulk up most of the structures. If you look, almost none of them are fully used - most only use the left/right pointers.
Converting the tree into a code sequence eliminates the structure entirely, and saves all the memory and time needed to walk the tree during execution. Effectively the walking is done during the final translation, so the VM/interpreter doesn't have to do that work.
Semantically, there is no difference between HEX and DEC. If you notice, the scanner specification turns both into an integer (the result of each is an int), so the syntax element constant only needs "NUMBER" for these two.
I think there is a scanner problem though.
How is the scanner going to distinguish an input sequence like "v=a-5" (an expression) from "v= -5" (a constant)?
If the "-5" is handled as a CONSTANT, then the expression is syntactically wrong. The problem is caused by the "-" used to mark the value as negative. The parse would identify the two cases (whether "a-5" or "-5") differently: "expression - CONSTANT" in the first, "- CONSTANT" in the second. An optimizing pass would identify the difference between the first and second case, and then collapse the "- CONSTANT" into a single element where the constant is a negative. It could be done during a parse, but it requires the action routine for the parse to look at the "- expression" subtree to decide if it is a constant and can be collapsed into a negative constant.
Now semantic handling of the constants can be done in different ways...
My original thought (not necessarily valid, mind) was to store the various constants in different tables, one table for each type (it does mean a different search function for each table, or a parameter to tell a search function which type to look for). The tables provide a way to accomplish duplicate elimination (saving memory, and better data packing in the target system). It also reduces the number of types for the parse tree to just pointers to entries in the various tables. If the table structure definition can contain all types, then the parse tree only needs to contain references to either the parse tree (forming the tree) or pointers to table entries. It does mean that the scanner would be the one to enter the constants into the various tables, rather than the action portion of the parser rules (which would provide updates to the table in the form of attribute flags and such).
One of the purposes of the tables (outside the duplication elimination) is to carry extra information (the attributes, and the "other" field) which can be used when generating code, and possible debugging information (semantic errors - like making the value of a character too large, conversion of an integer to a character, character to an integer...).
I still do not understand why the STRUC ENTRY and STRUC SYMBOL arrays cannot be passed as is to the mcu and the entry tree walked by the interpreter. Then no labels would be needed.
I didn't mean "cannot". It just adds more processing overhead, and memory overhead - a minimum of about 8 bytes per structure, plus the added indexing to perform the equivalent of a simple jump. It also adds overhead to a simple a + b, as there are three entries - one for a, one for b, and one for the plus (which would be 24 bytes). And depending on the arch, direct code format only needs two load instructions and an add, usually coming out to about 7 bytes. Using a basic threaded code form (push a, push b, add) only uses 10 bytes (assuming everything is in a two byte length).
Conversion from the tree into a more appropriate target form eliminates the tree walking overhead from the target (it is still done, but on the translating PC, not the embedded system).
The parsing operation would first analyze something like "A = B + C", turning it into the parse tree:
Code:
a = b + c
    |
 assign---+--------+
          |        |
          a       add---+----+
                        |    |
                        b    c
If each level of the tree needed 6 bytes, then the tree takes 12 bytes; the symbol table entries for a, b, and c (even at only 4 bytes each - something for the type of each of a, b, c and storage for their values) add 12 more bytes, for a total of 24.
A translation of that into a basic stack machine would make it:
Code:
push b (push value)
push c (push value)
add (pop b,c add, then push result)
pop a (pop value of b+c, take address of a, store value in a)
With a simple threaded code implementation (2 bytes for each active element) this adds up to 14 bytes.
It also allows the elimination of having to track the data types - that tracking is done during the translation phase to select the correct operators. For instance, if a/b/c were 32 bits instead of 16, perhaps the interpreter defines "push", "push2", and "push4" operations - in which case, for a 32-bit sequence it becomes:
Code:
push4 b
push4 c
add4
pop4 a
And still only needs 14 bytes, but no runtime checking for data types (that was done during code generation). Dynamic types can also be handled, but then there would only be one operating function instead of 3 - and they would look at the data types associated with the variable (same as with the symbol table) to determine how many bytes to push, add, and store. But the basic code size remains 14.
Granted, I am assuming there is an assembler available for the target machine (there usually is one).
Are you allowed to tell me which CPU is being used? (I don't want the entire system, just the cpu core instructions, and CPU limitations would be useful).
How should I change the code to satisfy your complaint? Thanks. Alvin...
There is no unary minus there. The grammar you have only allows for value-value... I believe this grammar (plus the - handling in the scanner) would allow for a "(var + -5)", which looks strange. It would NOT allow for "(var-5)" as the -5 is consumed by the scanner.
I think the following should work:
Code:
prim_exp: var {$$=var_gen($1); }
| constant {$$=const_gen($1); }
| unary_ops
;
unary_ops: MINUS var
| MINUS constant
| MINUS unary_ops
| LP exp RP {$$=exp_gen($2); }
;
As that will allow a "-var", "-5", and "-(expression)". At least it has no shift/reduce conflicts in the grammar I'm using for an example. I'm not sure what your production rules for those added pieces would be though. The one for unary_ops in prim_exp should be the default as the tree built should be what is returned from unary_ops rather than a newly generated entry.
I will add the unary minus to the grammar. I agree, and think I will use a 32 bit wide stack that will handle signed int32, string pointers, and floating point, which is also 32 bit. I may also need a co-stack next to it holding an unsigned char per entry indicating what type the associated value on the stack is. My whole idea is to use the CCS C compiler, as it has built-in functions for the peripherals I plan to allow the user to access with the language. I can't supply the compiler with the product, so I was going to have an interpreted assembler and a pseudo assembler code to run in the target, which would activate associated CCS C functions to execute the code. The device is a Microchip PIC24EP512GP806, a 16 bit machine with 170k words of flash and 53k of RAM. The machine has a 16 deep hardware stack for return addresses. I will also use a packet protocol between the PC and the PIC24 (over the USB) so I can provide a code update to the MCU over the USB using a bootloader.
I am not the swiftest person in all this, so I am having trouble knowing which BNF lines to assign entry structures to and which to just do, say, $$=$2 to. Also, it has been years since I wrote a tree walking routine, so I may need help designing that. Maybe the STRUCT ENTRY needs a bit for marking that that part of the tree has already been traversed. Thank you for your help. Alvin....
Willing to try. I admit to not being familiar with the PIC24, but I think some things can help with the interpreter - the limited stack isn't a problem for a threaded code interpreter (even when using an index table), as the interpreter itself may use the hardware stack, but the threaded code itself doesn't. That can use a software implemented stack, which wouldn't have the same limitations. Implementing the interpreter in C for the PIC makes it easier to use the index table. The virtual machine/threaded code data stack might be 16 bit aligned, with two stack entries providing a 32 bit unit. That would reduce the amount of lost space.
Hopefully I'll have an example set (I've been delayed a bit). I have a grammar, and most of the scanner, and am working on the table support (using 4 tables: one for characters, one for numbers, one for strings, and one for program symbols). So I've started testing those. A tree walker to translate to a pcode/threaded code style interpreter is not complicated. No need to even mark entries that have been processed. Since it runs on the PC, time isn't an issue - but the output code would end up on the target system (or an emulator for the interpreter).
Maybe I am backwards. Maybe I should be developing the pseudo assembly stack abstract language for the target first. Should I have three stacks? One for 32 bit INTs and FLOATs, One (16 bit) for STRING pointers and one for (8 bit) CHARs? Should I pass declare statements through as is?
Code:
DCL INT abc,def=-3249576
DCL CHAR achar='A'
DCL STRING a="This is a test"
and would I need flex and bison to decode the DCL statements? What operations would I need?
1) Not necessarily backwards. The goal is the application, and that takes looking at the beginning (the language of the application), and the target (where it will run). Then something in between to provide the translation. So far, I think you have focused on the application, and the beginning of the "in between". So now the target gets looked at.
2)
No - that way would result in lost space.
3) And yes, flex and bison would parse the declarations, and generate appropriate symbol table entries, that can be referenced during a code generation.
An abstract stack machine is very simple (easy to generate code for as well), and such a "machine" interpreter only requires three variables (as far as the code being interpreted):
a generic stack (it is up to the code to track whether an entry is 1, 2, 4, or 8 bytes long), a program counter (an index into the code array), and a working memory (for variables...). In hardware, all three share the same memory, but the advantage for an interpreter is that it can emulate a read-only program store...
Most such programs have three parts:
Code:
+-------------------------+
| application             |
| code (read only)        |
|                         |  This part may be a separate array...
|                         |
+-------------------------+
| general storage         |
| (ints, strings,         |
|  bytes, floats,         |
|  both constants and     |  This part could be a separate array... but can make things confusing
|  variables, though      |  for debugging (indexes to data depend on where the base is...).
|  strings would have     |  For instance, what is the difference between an index into a constant table
|  a fixed maximum        |  vs an index to a datum on the stack? If the constant table is in the code
|  size)                  |  then it gets tricky.
|                         |
+-------------------------+
| heap storage            |
| dynamically allocated   |
| (might not exist)       |  Might not exist, but is usually whatever the stack doesn't use.
|  grows                  |
|    V                    |
+-------------------------+
| (stack space)           |
...                     ...
|                         |
|   /|\                   |  Is usually part of the same memory as the heap,
|  grows toward top       |  which allows them to share space.
| (bottom of the stack)   |
+-------------------------+
I have even seen some divide it up differently, putting the stack first, then the program code, static variables (in the C sense) and constants, followed by the heap. (RT-11 for the PDP-11 did that - a pain to change the stack size).
This is also roughly the usual memory image of most applications. If you think of the interpreter as a machine, the rest tends to fall out. Indexing everything the same way makes it simpler, and without special cases. One issue that crops up is more dependent on the hardware being used - if memory accesses have an alignment requirement (such as 16/32 bit ints and floats having to be on an even address...) it makes the interpreter have to stick to those alignment restrictions, or add some overhead to first move the data to an appropriately aligned area before operating on it (such as having to move 4 byte data somewhere to make it available for the hardware floating point operation, and then move the result back to the non-aligned interpreter view of the storage).
Now for "DCL CHAR achar='A'" (the simple case)
The parse would build a tree defining what this represents... (and there can be two variations, too).
The 'A' can be identified as a character (a ' can be used, as C uses it, to identify single characters).
So the assignment to achar would occur, either as a character, or as an int (either could be valid). The DCL CHAR would give the variable achar the attribute of being a character. Using just the assignment could make the variable (without even a DCL CHAR) be a character. Now translation COULD be done in two ways - treat achar as a variable receiving a value:
Code:
pushci 'A' (push character immediate, followed by the ascii value A)
popc achar (where achar is assigned to be a byte...)
...
achar: byte (assign a byte storage for the variable achar)
OR...
Code:
achar: 'A' (same as the c declaration char achar = 'A'; which may be a variable OR const)
The first requires runtime execution to get the value set, it can be thought of as the same as a declaration in C used within a function: "void xyz(char c) { char achar = c;...". The second just has it as the initial value for achar. The second depends on the translator to interpret the "variable = expression" part of the parse tree such that it can do this form IF the expression is a constant (or derived from other constants...). It must do the first if the expression has any kind of variable in it.