LinuxQuestions.org - regular expressions

Nope. Not familiar with it (did look it up though). It looks like it would be useful to check inherited code though...

I need to ask a stupid question. For recursive C functions, will the function's declared variables be allocated anew for each recursive call, or do I have to keep a stack of them and manage the function's data myself? Thank you. Alvin...

All local variables (and the parameters) are on the program stack. They are part of the activation of the function. Frequently, all it takes for local variables to be allocated is to simply adjust the stack level after the function begins executing.

The only problem with that is that the variables need to be initialized - they won't necessarily default to a value of 0. This is because the stack used may have been used before.

The compiler takes care of any type meaning - the assembly code/binary doesn't have any information on data types.
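A small sketch of the point above, in plain C. The function name sum_to and its locals are made up for illustration. Each call gets its own copy of n, partial and rest on the program stack, and the local is initialized explicitly rather than assumed to start at 0:

```c
#include <assert.h>

/* Sums 1..n recursively. Every activation of sum_to() has its own
   `n`, `partial` and `rest` on the program stack, so no hand-made
   stack of variables is needed. */
static int sum_to(int n)
{
    int partial = n;            /* initialized explicitly: locals do not default to 0 */

    if (n <= 0)
        return 0;

    int rest = sum_to(n - 1);   /* the inner calls use their own frames */

    assert(partial == n);       /* this frame's local was not disturbed */
    return partial + rest;
}
```

Calling sum_to(4) makes five nested activations, each with its own variables, and the results are combined as the calls return.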

BTW, I've started generating code from the parse tree. Not far enough along (the expression code generation is only about half done, I have declarations completed - well, other than functions which is where the code lies).

Thanks for the recursion information. My problem is when a command that returns a value is in an expression. It looks like I will have to make more than the infix-to-postfix routine recursive. The get-expression routine will also have to be recursive, as well as the command processing. At least the tokenizer won't have to be recursive. I started at the bottom with the expression analyzer and am working up, with the commands being at the top level. I will do the commands last. The process, like all programming, is recursive, or at least repetitive in a loop of generating code, thinking, and regenerating the code already written. Thanks again for your help.

Not a problem. It also keeps me on my toes.

Does GCC C allow nested functions where you have a function defined within another function?

Apparently, yes:

http://gcc.gnu.org/onlinedocs/gcc/Nested-Functions.html

Though it causes conniptions for the compiler writers (the variable hierarchy is a booger to access as it requires iterating through the nested local variables to identify where the variable is stored).

I don't use them as I believe they are nonstandard. I have used them in Pascal, but found them to be confusing to debug (a variable is local, or local to the parent function, or local to the parent of the parent...). And at each level, you can have the same name declared with different types... which is why you have to be careful with variable naming (the variable that is locally defined hides the definition of the variable declared somewhere higher in the nest). It works, but I found keeping the nesting shallow (no more than 3 levels, and keep the nested functions really short) worked best.

I always seem to do best when I wait until the last minute on ordering parts, new compiler levels, etc. If I order too soon I end up regretting it. I am considering moving to a PIC32 part. It is 32-bit with a maximum of 512KW instruction words of flash and 512KB of RAM. If the Microchip compiler has library functions that support the peripherals, I may be able to use it instead of the CCS C compiler, which only goes up to PIC24 and is not completely ANSI C. The thing runs at 200MHz, which may be fast enough to run an interpreter at the interrupt level (using the interpreter to process interrupts). With these possibilities I may continue in the interpreter direction and not switch to a compiler for an abstract machine. It may be considered overkill, but at a price range of $4.95 to $7.95 a part it may be worth it. The part I would use is 100-pin with 512KW words and 128K of RAM, which should be big enough for what I want to do. I may write a PC-based preprocessor to figure out how much storage should be allocated for the symbol table, data types, arrays, and code area, and send that information in a prepacket so the interpreter could allocate RAM as needed. But with so much RAM I could set it for a maximum-size program (64k for the program area and 64k for data, symbol table, etc.). Maybe you could use my board for your stack machine, as 200MHz would run your system as fast as a hardware-based machine of probably 100 MHz. It looks now like I would be selling boards with the software probably free, unless I can somehow monetize it. What do you think?

I believe GCC supports the PIC32 directly.

It is even large enough for a Java VM (microJava). Even the Dalvik interpreter runs there.

And in the case of the stack machine, it should run closer to 150 MHz - and with assembly language tuning even better. It also would remove a number of limitations (restricted to a 16 bit code/data address for one, lack of shared stack and memory for another - making function local variables simpler).

So it really depends on the level of programming you want the target audience to be able to do.

There is also the possibility of running uClinux (http://www.uclinux.org/), though they don't list the PIC32 as a known port (what they do list are commercial device ports, but not the CPU used).

Sophisticated interrupt handling is never simple. Most of the interpreters use an event queue instead - which is polled between the interpreted instructions, so that the details of handling a particular device remain with whatever kernel it is using - and simplify the programming interface to it.
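A rough sketch of that event queue idea in C, not taken from any particular interpreter. All names here (event_queue, event_post, event_poll, run, QUEUE_SIZE) are made up, and the "handler" is just a stand-in that adds up event codes. The interrupt routine only records the event; the interpreter loop drains the queue between instructions:

```c
#include <assert.h>
#include <stdint.h>

#define QUEUE_SIZE 8u   /* hypothetical size; holds up to QUEUE_SIZE - 1 events */

typedef struct {
    volatile uint8_t head;               /* advanced only by the interrupt side   */
    volatile uint8_t tail;               /* advanced only by the interpreter side */
    volatile uint8_t events[QUEUE_SIZE];
} event_queue;

/* Would be called from an interrupt handler: record the event and return.
   Returns 0 (event dropped) when the queue is full. */
static int event_post(event_queue *q, uint8_t ev)
{
    uint8_t next = (uint8_t)((q->head + 1u) % QUEUE_SIZE);
    if (next == q->tail)
        return 0;
    q->events[q->head] = ev;
    q->head = next;
    return 1;
}

/* Called by the interpreter: fetch one pending event, if there is one. */
static int event_poll(event_queue *q, uint8_t *ev)
{
    if (q->tail == q->head)
        return 0;
    *ev = q->events[q->tail];
    q->tail = (uint8_t)((q->tail + 1u) % QUEUE_SIZE);
    return 1;
}

/* Interpreter main loop: drain the queue between instructions.
   Returns the number of events handled during n_steps instructions. */
static int run(event_queue *q, int n_steps, int *handled_sum)
{
    int handled = 0;
    for (int pc = 0; pc < n_steps; pc++) {
        uint8_t ev;
        while (event_poll(q, &ev)) {
            *handled_sum += ev;   /* stand-in for dispatching to an interpreted handler */
            handled++;
        }
        /* ... execute the interpreted instruction at pc here ... */
    }
    return handled;
}
```

With one writer per index the queue needs no locking on a single core, which is the usual reason for choosing this layout. Whether that holds on a given part depends on its interrupt and memory model.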

Open source is always a bit of a bonus - it gives customers confidence by allowing them to hire others to do the programming, lets them feel more in control of the device, and makes service support easier (no need to worry about or track licenses and such).

Microchip suggested using Harmony to develop with: http://ww1.microchip.com/downloads/e...esentation.pdf

I'm sure they would - after all, I believe it is a cost item.

BTW, the link gives a "file not found" error.

I've had some luck - the translator is now generating what appears to be proper code for if-then-else-endif, while (even a do ... until), return, and function calls in expressions. It even does proper integer-to-float conversions (and float-to-integer) when necessary.

NOT verified yet - just from initial visual inspection of the generated code it looks right. My test "program" isn't really a program, but a collection of statements to generate some code.

Doesn't handle arrays yet - that is the next thing I'm going to add.

My basic test program is:

Code:

short        value;
byte        char;
byte        str[20];
float        fvalue;

int function main (int c, byte str2[20])
{
    value = c;
    fvalue = value;
    fvalue = 5 * 3 + 2;
    if c then
        fvalue = 3 + 2.0;
    else
        fvalue = 3;
    endif
    while c;
      c = c - 1;
    wend
#    do
#        c = str2[3] + 1;
#    until c > value;
    return c;
}
int function test (int a, int b, int c)
{
  if c then
        a = b;
  else
        b = a;
  endif
  c = test(c,b,a);
}

And the (unoptimized) generated code looks like:

Code:

# assembler code generated from test.code

.data

value:        .block        2                # line 1
char:        .block        1                # line 2
str:        .block        20                # line 3
fvalue:        .block        4                # line 4

.code

main:                                # line 6
        PUSHX        -5[2]                # line 8
        POP2        value
        PUSH2        value                # line 9
        CVTIF
        POP4        fvalue
        PUSHI        5                # line 10
        PUSHI        3
        MUL
        PUSHI        2
        ADD
        CVTIF
        POP4        fvalue
        PUSHX        -5[2]                # line 11
        ADJ
        JMPEQ        $1
        PUSHI        3                # line 12
        CVTIF
        PUSHI4        2.0
        ADDF
        POP4        fvalue
        JMP        $2
$1:
        PUSHI        3                # line 14
        CVTIF
        POP4        fvalue
$2:
$3:
        PUSHX        -5[2]                # line 16
        ADJ
        JMPEQ        $4
        PUSHX        -5[2]                # line 17
        PUSHI        1
        SUB
        POPX        -5[2]
        JMP        $3
$4:
        PUSHX        -5[2]                # line 22
        POPX        0[2]
        RETURN
test:                                # line 24
        PUSHX        -7[2]                # line 26
        ADJ
        JMPEQ        $1
        PUSHX        -6[2]                # line 27
        POPX        -5[2]
        JMP        $2
$1:
        PUSHX        -5[2]                # line 29
        POPX        -6[2]
$2:
        PUSHX        -5[2]                # line 31
        PUSHX        -6[2]
        PUSHX        -7[2]
        CALL        test
        ADJN        -3
        PUSHR        0
        POPX        -7[2]
        RETURN

The "# line nnn" construct identifies the source line used for the code on that line and following lines.

Verification remains (the offsets for parameter access may be off, but if they are, they are all off by the same amount).

This code is NOT a valid program. It has several things wrong (loop conditions aren't changed, function recursion isn't tested so it is an infinite loop...). It is only to check the parsing and code generation.
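For anyone following along who hasn't used a stack machine, here is a toy C evaluator for the integer part of source line 10 (PUSHI 5, PUSHI 3, MUL, PUSHI 2, ADD). It is only a guess at the semantics: I am assuming PUSHI pushes a constant and MUL/ADD/SUB pop two cells and push the result. OP_HALT, the struct layout and the function names are invented so the example can stop and hand back the top of the stack; they are not part of the real instruction set.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed semantics only; OP_HALT is made up for this sketch. */
enum opcode { OP_PUSHI, OP_ADD, OP_SUB, OP_MUL, OP_HALT };

struct insn {
    enum opcode op;
    int32_t     operand;   /* used by OP_PUSHI only */
};

#define STACK_MAX 16

/* Runs `code` until OP_HALT and returns the value on top of the stack. */
static int32_t run_stack(const struct insn *code)
{
    int32_t stack[STACK_MAX];   /* every cell is 32 bits wide */
    int sp = 0;                 /* index of the next free cell */

    for (;; code++) {
        int32_t rhs;
        switch (code->op) {
        case OP_PUSHI:
            assert(sp < STACK_MAX);
            stack[sp++] = code->operand;
            break;
        case OP_ADD:
        case OP_SUB:
        case OP_MUL:
            assert(sp >= 2);
            rhs = stack[--sp];            /* right operand was pushed last */
            if (code->op == OP_ADD)
                stack[sp - 1] += rhs;
            else if (code->op == OP_SUB)
                stack[sp - 1] -= rhs;
            else
                stack[sp - 1] *= rhs;
            break;
        case OP_HALT:
            assert(sp >= 1);
            return stack[sp - 1];
        }
    }
}

/* The integer part of source line 10: 5 * 3 + 2 */
static int32_t eval_line10(void)
{
    static const struct insn code[] = {
        { OP_PUSHI, 5 }, { OP_PUSHI, 3 }, { OP_MUL, 0 },
        { OP_PUSHI, 2 }, { OP_ADD, 0 },   { OP_HALT, 0 }
    };
    return run_stack(code);
}
```

Under those assumptions eval_line10() leaves 17 on the stack, which the CVTIF in the listing would then convert to float before the POP4 into fvalue.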

Very interesting. Looks Good!

There are still small things to check. Right now I'm doing a rather brain-dead test to see if something is a constant byte/short/int. It works for positive numbers, but it has not been tested for sure with negative numbers. The nice thing is that it doesn't matter when it comes to arithmetic, since everything on the stack is 32 bits, so even short and byte values get sign-extended to 32 bits. If the stack were changed to work differently, that might change things (well, maybe... even if the stack were shared with the data segment it could still be a 32-bit unit, but with byte-addressable subunits for some operations - string comparison for instance). But it would make using local variables easier (pushing on the stack would always do 32 bits like it does now, but the increment would be in units of 4, not 1, making it easy to use other registers for byte-level data manipulation of stack elements... it could perhaps impact the ADJ/ADJN instructions).
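One way the constant test and the sign extension could look in C, as a sketch only. The function names are made up, and I am assuming byte and short are signed two's-complement types of 1 and 2 bytes. The range checks use the signed limits, so negative constants are classified the same way as positive ones:

```c
#include <assert.h>
#include <stdint.h>

/* Smallest storage size in bytes (1, 2 or 4) that can hold a signed constant. */
static int const_size(int32_t v)
{
    if (v >= INT8_MIN && v <= INT8_MAX)
        return 1;
    if (v >= INT16_MIN && v <= INT16_MAX)
        return 2;
    return 4;
}

/* Sign-extend a raw stored byte to a 32-bit stack cell. */
static int32_t widen_byte(uint8_t raw)
{
    return (raw & 0x80u) ? (int32_t)raw - 0x100 : (int32_t)raw;
}

/* Sign-extend a raw stored short to a 32-bit stack cell. */
static int32_t widen_short(uint16_t raw)
{
    return (raw & 0x8000u) ? (int32_t)raw - 0x10000 : (int32_t)raw;
}
```

For example, const_size(-128) is 1 but const_size(-129) is 2, and a stored byte of 0xFF widens to -1 on the stack.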

Also working out relative comparisons - they have to work differently when used in a test vs. used in an expression. An expression has to evaluate to a 0/1 value; in a test, only the condition codes are needed. I think I can pull this off using two slightly different coding functions: for a test, code generation starts off with a relative test function that only generates the comparison, but falls into the regular expression function for the terms... which would then use the expression-based relation codings if the relations are nested. Weird things like "(a < b)*c > x" would cause the "a < b" to be evaluated as an expression, but the "> x" would be evaluated as a test. But if it were embedded in an assignment (using the expression only), it would cause the "> x" to generate a 0/1 value. For now, I'll first do the 0/1 value generation, as in the test case it can just discard the value (a one byte instruction).
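A C sketch of the two coding functions, just to make the idea concrete. Everything here is hypothetical: the node layout, the function names, and the mnemonics PUSH, SETLT, SETGT, CMP, JMPGE and JMPLE are invented for the example. The ADJ/JMPEQ pair is copied from what the listing emits for "if c", with its exact meaning assumed. gen_test() handles only the outermost relation and hands the operands to gen_value(), which turns any nested relation into a 0/1 value:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef struct node {
    char op;                        /* 'v' = variable, or '*', '<', '>' */
    const char *name;               /* used when op == 'v'              */
    const struct node *lhs, *rhs;   /* used for operators               */
} node;

typedef struct { char text[512]; size_t len; } codebuf;

/* Append one line of (hypothetical) assembler text to the buffer. */
static void emit(codebuf *out, const char *mnemonic, const char *arg)
{
    char *p = out->text + out->len;
    size_t room = sizeof out->text - out->len;
    int n = arg ? snprintf(p, room, "%s %s\n", mnemonic, arg)
                : snprintf(p, room, "%s\n", mnemonic);
    assert(n > 0 && (size_t)n < room);
    out->len += (size_t)n;
}

/* Expression context: a relation leaves a 0/1 value on the stack. */
static void gen_value(const node *n, codebuf *out)
{
    if (n->op == 'v') {
        emit(out, "PUSH", n->name);
        return;
    }
    gen_value(n->lhs, out);
    gen_value(n->rhs, out);
    switch (n->op) {
    case '*': emit(out, "MUL", NULL);   break;
    case '<': emit(out, "SETLT", NULL); break;   /* made-up: pop 2, push 0/1 */
    case '>': emit(out, "SETGT", NULL); break;   /* made-up: pop 2, push 0/1 */
    default:  assert(!"unknown operator");
    }
}

/* Test context: only the outermost relation becomes compare + jump;
   its operands go through gen_value(), so nested relations yield 0/1. */
static void gen_test(const node *n, const char *false_label, codebuf *out)
{
    if (n->op == '<' || n->op == '>') {
        gen_value(n->lhs, out);
        gen_value(n->rhs, out);
        emit(out, "CMP", NULL);                  /* made-up: sets condition codes */
        emit(out, n->op == '<' ? "JMPGE" : "JMPLE", false_label);
    } else {
        gen_value(n, out);                       /* plain value used as a test */
        emit(out, "ADJ", NULL);
        emit(out, "JMPEQ", false_label);
    }
}
```

For "(a < b)*c > x", gen_test() would emit SETLT for the inner "a < b" and CMP plus a jump for the outer "> x", while gen_value() on the same tree would end with SETGT instead.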