[SOLVED] Generate SPECIAL alphanumeric WORDLIST - no repeating characters side-by-side

klambo · 07-11-2011, 11:13 AM

Quote:

Originally Posted by wje_lq

First of all, I hope you realize that to send to a file all 10-character combinations of the following characters

Code:

0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ

such that no two adjacent characters are of equal value, and placing each combination on its own line, will generate a file that's 31 petabytes long. I hope you have room for this.
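
For anyone who wants to check that figure, here is a quick sketch of the arithmetic in Python. It assumes one byte per character plus a one-byte newline per line; the variable names are just for illustration.

```python
# Size estimate for the full wordlist.
# Assumptions: 1 byte per character, 1 newline byte per word.
CHARSET_SIZE = 36   # 0-9 plus A-Z
WORD_LENGTH = 10

# First character: 36 choices. Each later character: anything except
# the character just before it, so 35 choices.
word_count = CHARSET_SIZE * (CHARSET_SIZE - 1) ** (WORD_LENGTH - 1)

# Each word takes WORD_LENGTH bytes plus one newline.
total_bytes = word_count * (WORD_LENGTH + 1)

print(word_count)               # number of lines in the file
print(total_bytes / 10**15)     # size in decimal petabytes
```

That works out to roughly 2.8 quadrillion lines and a little over 31 decimal petabytes, which agrees with the number quoted above.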

The program listed below will generate that file. Just redirect standard output to a file in the normal manner.

No, I haven't tested it to completion. But you can test a crippled version of it by doing these four things:

Comment out the first definition of *character-set*, by adding a semicolon at the beginning of the line.
Uncomment the second definition, which just uses "ABC", by removing the semicolon from the beginning of the line.
Comment out the first definition of *word-length*.
Uncomment the second definition, which uses a word length of four.

If you do that and run the program, you'll get this output. That's the kind of output you're looking for, right?

Code:

ABAB
ABAC
ABCA
ABCB
ACAB
ACAC
ACBA
ACBC
BABA
BABC
BACA
BACB
BCAB
BCAC
BCBA
BCBC
CABA
CABC
CACA
CACB
CBAB
CBAC
CBCA
CBCB

So here's the code, all 42 lines of it.

Code:

#!/usr/bin/clisp

(defparameter *character-set* "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ")
;(defparameter *character-set* "ABC")     ; <--- this line is for testing

(defparameter *word-length* 10)
;(defparameter *word-length* 4)           ; <--- this line is for testing

(defparameter *character-list*
   (coerce *character-set* 'list))

(defun final-char (in-string)
   (cond
      ((> (length in-string) 0)
         (elt in-string (1- (length in-string))))
      (t
         nil)))

(defun new-char-list (in-string)
   (let ((result))
      (mapc
         (lambda (candidate)
            (cond
               ((not (eql candidate (final-char in-string)))
                  (push candidate result))))
         *character-list*)
      (nreverse result))
      )

(defun extend-string (in-string desired-length)
   (mapc   ; mapc rather than mapcar: the results are not needed, so nothing accumulates in memory
      (lambda (new-char)
         (let ((new-string (concatenate 'string in-string (string new-char))))
            (cond
               ((>  (length new-string) desired-length))
               ((>= (length new-string) desired-length)
                  (format t "~a~%" new-string))
               (t
                  (extend-string new-string desired-length)))))
      (new-char-list in-string)))

(extend-string "" *word-length*)
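
For readers who don't have clisp handy, the same idea can be sketched in Python. This is only a rough equivalent of the Lisp above, not a translation of it line by line; the function name `words` and its parameters are made up for this example.

```python
def words(charset, length, prefix=""):
    """Yield every word of the given length in which no two
    adjacent characters are equal, in the order of charset."""
    if length <= 0:
        return
    for ch in charset:
        if prefix and ch == prefix[-1]:
            continue  # same as the previous character: skip it
        word = prefix + ch
        if len(word) == length:
            yield word
        else:
            # Not long enough yet: extend this word further.
            yield from words(charset, length, word)


if __name__ == "__main__":
    # Crippled test version, like the "ABC" / length 4 run shown earlier.
    for w in words("ABC", 4):
        print(w)
```

Because it is a generator, it hands out one word at a time and does not keep the whole list in memory. Swap in the 36-character set and a length of 10 for the full run, with the same caveat about file size.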

Hope this helps.

Thanks for this. Is there a way to save the results to a txt file?

I worked it out: clisp clisptxt > file.txt

k3eper · 08-08-2011, 08:50 PM

wje_lq, brilliant script lol! I had this same thought the other day and was trying to fudge crunch and grep together.

Could you maybe suggest the changes needed to get rid of repeated characters anywhere in a word, not just side by side?

EXAMPLE:

Current: ABATRWHS
New: ATEFUZWH

No character appears more than once in a word.

Again, awesome script lol, so simple. I really do wish I could script.
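
One possible way to get words with no repeated character at all is to generate permutations instead of filtering. The sketch below uses Python's itertools rather than the Lisp script; `unique_char_words` is a made-up name, and it assumes the character set itself contains no duplicates.

```python
from itertools import permutations


def unique_char_words(charset, length):
    """Yield words of the given length in which no character
    appears more than once (charset must have no duplicates)."""
    for combo in permutations(charset, length):
        yield "".join(combo)


if __name__ == "__main__":
    # Small demo with a 3-character set and 2-character words.
    for w in unique_char_words("ABC", 2):
        print(w)
```

The output is still huge for a 36-character set, so the warning about file size applies here too.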

grail · 08-10-2011, 07:55 PM

Try raising your own question and referencing this one, as most people do not follow SOLVED questions (for a fairly obvious reason).

enteptain · 08-24-2011, 10:04 PM

Hoping I can find some help here, but first I wanted to say thanks for the bash and awk scripts, guys. Handy code.
I'm trying to mod this script to put a-z, upper and lower case, in front of each word in my list.
This is what I've been using to put the numbers 0000-9999 at the end:

perl -e 'while (my $line = <>) { chomp($line); foreach my $i (0..9999) { printf("%s%04d\n", $line, $i); } }' wordlist.txt >> wordlistnew.txt

Thanks in advance. I'm writing these into a tool.
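
A minimal sketch of the letter-prefix step, written in Python rather than Perl. The name `prefix_letters` is invented for this example, and it assumes plain ASCII a-z and A-Z are the letters wanted.

```python
import string


def prefix_letters(words):
    """Yield each word once per ASCII letter, with that letter
    placed in front of the word."""
    for word in words:
        # string.ascii_letters is a-z followed by A-Z (52 letters).
        for letter in string.ascii_letters:
            yield letter + word


if __name__ == "__main__":
    for line in prefix_letters(["password"]):
        print(line)
```

To run it over a file, pass in the stripped lines of wordlist.txt instead of the hard-coded list. Every input word turns into 52 output words.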

grail · 08-24-2011, 10:13 PM

As above, please ask your own question and reference here if required as most won't look at old questions, especially SOLVED ones.

enteptain · 08-24-2011, 10:19 PM

Quote:

Originally Posted by grail

As above, please ask your own question and reference here if required as most won't look at old questions, especially SOLVED ones.

okay! thanks for the lightning reply speed!!!

sapto · 02-12-2012, 06:22 PM

The output .txt file can get very large. Is it possible to start a new output file, also .txt, after every 100 MB, for example?

danielbmartin · 02-12-2012, 09:15 PM

Quote:

Originally Posted by Kenhelm

Code:

 grep -Ev '(.)\1'

It was easy to adapt this clever grep to read a long list of English words and keep only those which contain one or more letter pairs (such as the word letter).

Code:

grep -E '(.)\1' < $InFile > $Work02

Now I want to extend the theme to keep only those words which contain two or more letter pairs (such as the word success). I tried this ...

Code:

grep -E '/(.).*(.)/\1 2' < $InFile > $Work03

... but that generates an empty file.

Daniel B. Martin

PTrenholme · 02-12-2012, 10:00 PM

Try egrep '(.)\1+.*(.)\2+' $InFile > $Work03

An example:

Code:

$ echo -e success\\ntest\\n | egrep '(.)\1+.*(.)\2+'
success

Snark1994 · 02-13-2012, 06:14 AM

Quote:

Originally Posted by sapto

The output .txt file can get very large. Is it possible to start a new output file, also .txt, after every 100 MB, for example?

Instead of redirecting to a file, try piping to split:

Code:

clisp clisptxt | split -d -b "$CHUNK_SIZE_IN_BYTES" - "$FILE_NAME_PREFIX"

Though again this could have gone into its own thread.

Hope this helps,
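
One thing to watch: split -b cuts at an exact byte count, so a word can end up split across two files, and the pieces are not named .txt by default. If that matters, a line-aware splitter can be sketched in Python along these lines. `chunk_lines` and `write_chunks` are names made up for this example, and the four-digit numbering is just one possible naming scheme.

```python
def chunk_lines(lines, max_bytes):
    """Group lines into chunks of at most max_bytes each, never
    splitting a line. A single line longer than max_bytes still
    gets a chunk of its own."""
    chunk, size = [], 0
    for line in lines:
        n = len(line.encode("utf-8"))
        if chunk and size + n > max_bytes:
            yield chunk
            chunk, size = [], 0
        chunk.append(line)
        size += n
    if chunk:
        yield chunk


def write_chunks(lines, max_bytes, prefix):
    """Write each chunk to prefix0000.txt, prefix0001.txt, and so on."""
    for index, chunk in enumerate(chunk_lines(lines, max_bytes)):
        with open(f"{prefix}{index:04d}.txt", "w", encoding="utf-8") as out:
            out.writelines(chunk)
```

For example, calling write_chunks(sys.stdin, 100 * 1024 * 1024, "wordlist_") in a small script (after import sys) and piping the generator's output into it should give files of about 100 MB each, all ending in .txt.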

danielbmartin · 02-13-2012, 06:52 AM

Quote:

Originally Posted by PTrenholme

Try egrep '(.)\1+.*(.)\2+' $InFile > $Work03

Lovely, works like a charm! Thank you.

It was an easy matter to extend this one-liner to ...
(1) build a list of all words with three letter pairs (such as the word committee).

Code:

egrep '(.)\1+.*(.)\2+.*(.)\3+' < $InFile > $Work04

... and (2) build a list of all words with three consecutive letter pairs (such as the word bookkeeper).

Code:

egrep '(.)\1+(.)\2+(.)\3+' < $InFile > $Work05
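
The same three patterns can be tried out in Python's re module, which supports the same kind of back-references. This is only a sketch for experimenting; the constant names and the `keep` helper are invented for this example.

```python
import re

# (.)\1+ means: any character followed by one or more copies of itself.
TWO_PAIRS = re.compile(r"(.)\1+.*(.)\2+")
THREE_PAIRS = re.compile(r"(.)\1+.*(.)\2+.*(.)\3+")
THREE_CONSECUTIVE = re.compile(r"(.)\1+(.)\2+(.)\3+")


def keep(pattern, words):
    """Return the words that contain a match for pattern."""
    return [w for w in words if pattern.search(w)]


if __name__ == "__main__":
    sample = ["letter", "success", "committee", "bookkeeper", "test"]
    print(keep(TWO_PAIRS, sample))
    print(keep(THREE_PAIRS, sample))
    print(keep(THREE_CONSECUTIVE, sample))
```

Note that the two-pair pattern also keeps words with three or more pairs, since it only asks for at least two.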

On what basis does technical intuition guide you toward egrep rather than grep?

Daniel B. Martin

sapto · 02-13-2012, 07:30 AM

Quote:

Originally Posted by Snark1994

Instead of redirecting to a file, try piping to split:

Code:

clisp clisptxt | split -d -b "$CHUNK_SIZE_IN_BYTES" - "$FILE_NAME_PREFIX"

Though again this could have gone into its own thread.

Hope this helps,

I only need the .txt format, for aircrack... thanks a lot

PTrenholme · 02-13-2012, 10:44 AM

Quote:

Originally Posted by danielbmartin

Lovely, works like a charm! Thank you.

It was an easy matter to extend this one-liner to ...
(1) build a list of all words with three letter pairs (such as the word committee).

Code:

egrep '(.)\1+.*(.)\2+.*(.)\3+' < $InFile > $Work04

... and (2) build a list of all words with three consecutive letter pairs (such as the word bookkeeper).

Code:

egrep '(.)\1+(.)\2+(.)\3+' < $InFile > $Work05

On what basis does technical intuition guide you toward egrep rather than grep?

Daniel B. Martin

The "<" in the command is both unnecessary and a waste of resources. grep reads all arguments after the first one (the regular expression) as input file names.
egrep is just a standard alias for grep -e

ntubski · 02-13-2012, 09:13 PM

Quote:

Originally Posted by PTrenholme

egrep is just a standard alias for grep -E

Fixed that for you.

albanese · 08-17-2013, 10:35 AM

How do I save the output to a .txt file?