LinuxQuestions.org
Old 02-26-2014, 05:05 PM   #1
Lucien Lachance
Member
 
Registered: May 2013
Posts: 82

Rep: Reputation: Disabled
Ruby: Matching for a delimiter


I'm writing a coding kata called the String Calculator. You can view more of the specs here: http://osherove.com/tdd-kata-1/ So far, I've come up with this. I would like to know how I can write a regex pattern that will split the string based on any delimiter. Here's an example:

# When given add('1') => 1

# When given add('1,2') => 3

# When given add("//;\n1;2") => 3

Giving it the pattern:
Code:
input.split(/[,\n]|[^0-9]/)
works as expected, however it splits "//;\n1;2" like this

=> ["", "", "", "", "1", "2"]

How can I keep those empty strings out of the result?

Code:
def self.add(input)
  solution = input.split(/[,\n]/).map(&:to_i).reduce(0, :+)
  input.match(/\n\z/) ? nil : solution # \z anchors at the very end of the string; $ also matches before an interior newline
end

Last edited by Lucien Lachance; 02-26-2014 at 07:42 PM. Reason: sorry about that!
 
Old 02-26-2014, 07:28 PM   #2
grail
Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 7,906

Rep: Reputation: 2098
Well I am not sure which version of ruby you are using, but on mine (2.1.0p0), your split code throws an error:
Code:
$ ruby -e 'input="//;\n1;2"; print input.split([,\n]|[^0-9])'
-e:1: syntax error, unexpected ',', expecting ']'
input="//;\n1;2"; print input.split([,\n]|[^0-9])
I am able to overcome the error by using the following split:
Code:
input.split(/[,\n]|[^0-9]/)
Now that I have the same output, my suggestion would be to use 'delete_if' as seen here
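A minimal sketch of that idea, using `reject` (the non-destructive counterpart of `delete_if`) to drop the empty strings. The variable names are only for illustration:

```ruby
input = "//;\n1;2"

# Splitting on every non-digit leaves an empty string for each leading delimiter character.
parts = input.split(/[,\n]|[^0-9]/)

# reject returns a new array without the empty strings; delete_if would modify parts in place.
numbers = parts.reject(&:empty?)

puts numbers.inspect
```

Here `parts` still holds the original six elements, while `numbers` keeps only "1" and "2".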

Of course it could just be you are using the wrong method. Why not try:
Code:
input.scan(/[0-9]+/)
I threw in the '+' in case you have multi-digit values passed to you, e.g. 45
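To show what the '+' changes, here is a small sketch with a made-up input containing 45:

```ruby
input = "//;\n45;2"

# Without "+", every digit is matched on its own.
single_digits = input.scan(/[0-9]/)

# With "+", each run of digits is kept together as one number.
numbers = input.scan(/[0-9]+/).map(&:to_i)
total = numbers.reduce(0, :+)

puts single_digits.inspect
puts numbers.inspect
puts total
```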
 
Old 02-26-2014, 07:43 PM   #3
Lucien Lachance
Member
 
Registered: May 2013
Posts: 82

Original Poster
Rep: Reputation: Disabled
Sorry about that. I'm running 2.0.0, and I've fixed the error in my post. I originally thought of using scan, but felt that it wasn't really solving the problem of reading between numbers and delimiters as the kata suggests. Scan works wonders, but do you think using it in this case (based on the spec) is ideal?

Also, I could say:
Code:
solution = input.scan(/\d+/).map(&:to_i).reduce(0, :+)
So a refactor might be:

Code:
def self.add(input) 
  solution = input.scan(/\d+/).map(&:to_i).reduce(0, :+)
  input.end_with?("\n") ? nil : solution
end
But what happens if I feed add with
Code:
 add("//1\n112")
where '1' is the delimiter?
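To make the problem concrete, here is a rough comparison of what scan sees against what splitting on the declared delimiter sees. Taking the delimiter from input[2, 1] and cutting the header at the first newline are my own assumptions about how the header would be read:

```ruby
input = "//1\n112"

# scan ignores what the header means and treats the delimiter "1" as a number.
scanned = input.scan(/\d+/).map(&:to_i)

# Honouring the header: take the character after "//" as the delimiter,
# then split only the part after the first newline.
delimiter = input[2, 1]
body = input.split("\n", 2).last
pieces = body.split(delimiter)

puts scanned.inspect
puts pieces.inspect
```

So scan adds 1 and 112, while splitting on '1' leaves only the 2, and neither total is obviously the "right" one.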

Last edited by Lucien Lachance; 02-26-2014 at 08:01 PM.
 
Old 02-26-2014, 09:29 PM   #4
grail
Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 7,906

Rep: Reputation: 2098
Now you are losing me a little bit

1. Why are you allowing the user (I assume this is a person entering details) to enter gibberish or non-numerical input?

2. Your last example makes no sense, because if the delimiter is '1', this becomes nothing plus 12??

3. I also don't understand the following:
Code:
input.end_with?("\n") ? nil : solution
None of your examples have a newline at the end of the 'input' string, and scan or split would discard it anyway, so a solution could still be provided. To me this seems a counterintuitive test.
 
Old 02-26-2014, 09:52 PM   #5
Lucien Lachance
Member
 
Registered: May 2013
Posts: 82

Original Poster
Rep: Reputation: Disabled
Okay, let me explain. For the string calculator kata, the following input is not allowed. (these are just the rules) http://osherove.com/tdd-kata-1/

Code:
 StringCalculator.add("1,\n") # Should return nil
If any string ends in a newline this is considered invalid, so I return nil.

Now, with the delimiter situation. A string calculator should do the following:

Code:
  StringCalculator.add('1') # Should return 1
  StringCalculator.add('1,2') # Should return 3
  StringCalculator.add("//;\n1;2") # Should return 3 and accepts a delimiter of ';'
  StringCalculator.add("//+\n2+2") # Should return 4 and accepts a delimiter of '+'
However, what if the position of input[2, 1] is a number. This is incorrect, because this is supposed to be a delimiter, not a numerical value. Hope I explained this better. What do you think the ideal solution is to handle this problem?


Last edited by Lucien Lachance; 02-26-2014 at 09:54 PM.
 
Old 02-27-2014, 12:22 AM   #6
grail
Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 7,906

Rep: Reputation: 2098
Ok, so I have had a look at the exercise and I am guessing you are up to:
Code:
4. Support different delimiters
   1. to change a delimiter, the beginning of the string will contain a separate line that looks like this:   “//[delimiter]\n[numbers…]” for example “//;\n1;2” should return three where the default delimiter is ‘;’ .
This being the case, I would say that none of the current solutions is correct. You should be looking specifically for the pattern mentioned above, i.e. '//[delim]\n'; once it is found, you need to extract the delimiter and use it on the rest of the string. So you need to step back a bit and solve the steps in the order given, as both current methods ignore the fact that you have been supplied a new delimiter to use.
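One possible way to look for that pattern, as a sketch only: parse_header is a hypothetical helper name, and falling back to a comma when no header is present is an assumption based on the earlier steps of the kata.

```ruby
# Hypothetical helper: returns [delimiter, numbers_part] for the given input.
def parse_header(input)
  # \A anchors at the start of the string; (.) captures exactly one delimiter character;
  # the m flag lets (.*) span any remaining newlines.
  match = input.match(%r{\A//(.)\n(.*)\z}m)
  match ? [match[1], match[2]] : [',', input]
end

delimiter, numbers = parse_header("//;\n1;2")
sum = numbers.split(delimiter).map(&:to_i).reduce(0, :+)
puts sum
```

A string without the header, such as "1,2", comes back unchanged with ',' as the delimiter.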

As for:
Quote:
However, what if the position of input[2, 1] is a number.
I am sorry, but this is still confusing and does not make sense to me. You may need to provide example data and highlight what you are using for the delimiter.
If I do partially understand, I would again mention that a number supplied as a delimiter seems to be something you cannot accurately test for: all digits of the same value will be treated as delimiters and removed, which may give a total other than the one expected.

Lastly, as someone who does testing in their own job, I find the following statement from the page to be bad advice:
Quote:
Make sure you only test for correct inputs. there is no need to test for invalid inputs for this kata
I always like to test for invalid input as it lets me know what needs to be shored up against a user who doesn't know how to use my software (which generally is everyone else but you).
Just my 2 cents
 
Old 02-27-2014, 12:31 AM   #7
Lucien Lachance
Member
 
Registered: May 2013
Posts: 82

Original Poster
Rep: Reputation: Disabled
That's been the hardest part. This damn delimiter matching! I'm not the best with regex, unfortunately. There are lots of ways to solve this. On my first try I generated a string of supported characters and checked to see if it matched a string beginning with '//'. Could you help get me started a bit? I appreciate the help. Seriously. In the meantime, take a look at this method I've come up with.

Something not yet implemented:
Code:
def supported_delimiters
  [*(33..46), *(58..64)].map(&:chr).join
end
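Roughly how I imagine it being used, as a sketch only. custom_delimiter is a hypothetical name, and returning nil for an unsupported or missing delimiter is just one option:

```ruby
def supported_delimiters
  # ASCII 33..46 is "!" through "." and 58..64 is ":" through "@"
  [*(33..46), *(58..64)].map(&:chr).join
end

# Hypothetical helper: returns the declared delimiter, or nil if there is
# no "//x\n" header or the character is not in the supported set.
def custom_delimiter(input)
  return nil unless input.start_with?('//') && input[3] == "\n"
  candidate = input[2]
  supported_delimiters.include?(candidate) ? candidate : nil
end

puts custom_delimiter("//;\n1;2").inspect
puts custom_delimiter("//1\n112").inspect
```

Digits fall outside both ranges, so a header like "//1\n" is rejected without any regex.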
Here's my production code:
Code:
module StringCalculator
  def self.add(input)
    validate_negatives(input)
    solution = input.scan(/\d+/).map(&:to_i).reduce(0, :+)
    input.end_with?("\n") ? nil : solution
  end

  def self.validate_negatives(input)
    negatives = input.scan(/-\d+/)
    fail "negatives not allowed: #{negatives.join(', ')}" if negatives.any?
  end
    
  private_class_method :validate_negatives
end

Last edited by Lucien Lachance; 02-27-2014 at 12:33 AM.
 
Old 02-27-2014, 07:46 AM   #8
grail
Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 7,906

Rep: Reputation: 2098
Like most of us, I feel you have tried to jump ahead and so are missing the point of the lesson.

Each time you add an item to your solution, you should test it against data to see whether it handles that case.
Although not stated directly, it is implied that the initial delimiter is a comma. So you need to back up and create a delimiter variable that has a default but may be changed later.

Also, if we cut back your current solution to:
Code:
module StringCalculator
  def self.add(input)
    solution = input.to_i
    input.end_with?("\n") ? nil : solution                                          
  end
end
With this basic example, when the input ends with "\n" it returns 'nil'. Going by the advice on the page, you should probably return an error message saying that this is unacceptable input.

So really you need to actually go back prior to this and either use scan or split with just a comma as delimiter.
If you do use scan you will need to change to something like (for the basic 0,1,2 number solution):
Code:
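delimiter = ","
# Regexp.escape keeps delimiters such as "^" or "]" literal inside the character class;
# "+" avoids the empty matches that "*" produces between numbers
solution = input.scan(/[^#{Regexp.escape(delimiter)}]+/).map(&:to_i).reduce(0, :+) # or change scan to split(delimiter)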
Then, when accepting a new delimiter, you can add a test in between to extract the new delimiter and assign it to the variable. The catch is that you would then need to remove the delimiter definition from the original string before summing the data.

Hope that helps
 
Old 02-27-2014, 09:06 AM   #9
Lucien Lachance
Member
 
Registered: May 2013
Posts: 82

Original Poster
Rep: Reputation: Disabled
I've tested those conditions initially with split(','). It's difficult to include every detail, and I realize that now. I'll iterate over this again and break the tests one by one. One last thing: I'm raising an error for negatives (which the problem states). Would it still be in keeping with the single responsibility principle if I raised the error for a string ending in a newline inside the function add? Or should that be done elsewhere?

Also, if you're interested, you can see my tests here: https://github.com/deidora/orlando_d...g_calc_spec.rb

Last edited by Lucien Lachance; 02-27-2014 at 09:08 AM.
 
Old 02-27-2014, 10:34 AM   #10
grail
Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 7,906

Rep: Reputation: 2098
Well, I did watch the video from Corey, and whilst I followed his method, it seemed a little unusual, as he defined the module in such a way that it would need to be extended / added to String for it to work.

Based on that, it turns out your original solution using split was very similar and once the items in the array have been mapped with to_i the empty cells simply default to 0.

As for raising errors, the video again sidestepped the issue by converting all newlines to the delimiter, which means no string will ever end in a newline.
If you were to raise this type of error, I would suggest it should be the very first test applied to the input data.
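A quick sketch of both points, using my own example strings rather than anything taken from the video:

```ruby
delimiter = ','

# Turning every newline into the delimiter leaves a single separator to split on.
parts = "1\n2,3".gsub("\n", delimiter).split(delimiter)

# With a header line the split leaves "//" and an empty string behind,
# and String#to_i turns both of those into 0.
header_parts = "//;\n1;2".gsub("\n", ';').split(';')
values = header_parts.map(&:to_i)

puts parts.inspect
puts header_parts.inspect
puts values.inspect
```

Because the leftover pieces become 0, the sum still comes out as 3 for the header example.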
 
Old 02-27-2014, 11:33 AM   #11
Lucien Lachance
Member
 
Registered: May 2013
Posts: 82

Original Poster
Rep: Reputation: Disabled
Took me quite some time to clean everything up, but I've taken care of what you said about raising an error if the string ends in a newline. I think this is as clean as I can get it. I have yet to solve the delimiter issue; I left that for the final refactor because I know it will require some thinking time on my part. More often than not, delimiter will be a method by itself because of its depth. I'm pretty happy with this approach and the advice you've given. I think what Corey was aiming for in the video was not having to worry about invalid input, because if you're trying to do the simplest thing possible, handling invalid input sidetracks you from the task. Just my .02. I agree though, I always test for invalid cases at work as well.

https://github.com/deidora/orlando_d...string_calc.rb

Last edited by Lucien Lachance; 02-27-2014 at 11:48 AM.
 
Old 02-27-2014, 08:03 PM   #12
grail
Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 7,906

Rep: Reputation: 2098
As an option / idea for the delimiter issue:

1. Check that the string starts with '//'. To be complete, you should actually check that it starts with '//any_character\n', since that is how the description defines a new delimiter. The user could enter '//blah,1,2,3', and there is no new delimiter there because no single character is followed by a newline.

2. Extract the character immediately after '//' into delimiter (a variable, or returned from a method)

3.
a. Replace everything up to the first newline with nothing in the original string
b. Split on the new delimiter and let to_i handle non-digit strings (i.e. set them to 0)
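The steps above could be sketched with plain string methods as follows. This is only an illustration: the method name add and the comma default are assumptions carried over from earlier in the thread, and negatives and trailing newlines are deliberately not handled here.

```ruby
# Hypothetical method following the three steps above.
def add(input)
  delimiter = ','
  # Step 1: the header must be "//", one character, then a newline.
  if input.start_with?('//') && input[3] == "\n"
    # Step 2: the character right after "//" is the new delimiter.
    delimiter = input[2]
    # Step 3a: drop everything up to and including the first newline.
    input = input[4..-1]
  end
  # Step 3b: split on the delimiter; to_i maps non-digit pieces to 0.
  input.split(delimiter).map(&:to_i).reduce(0, :+)
end

puts add("//;\n1;2")
puts add("//blah,1,2,3")
```

In the second call there is no valid header, so the comma stays in effect and '//blah' simply counts as 0.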
 
Old 02-27-2014, 09:49 PM   #13
Lucien Lachance
Member
 
Registered: May 2013
Posts: 82

Original Poster
Rep: Reputation: Disabled
Code:
module StringCalculator
  def self.add(string)
    string.end_with?("\n") ? fail('ends in newline') : solve(string)
  end

  def self.solve(string)
    verify(string)
    custom = delimiter(string)
    numbers = string.gsub(/\n/, custom).split(custom).map(&:to_i)
    numbers.reject { |n| n > 1000 }.reduce(0, :+)
  end

  def self.delimiter(string)
    string.match(%r{^//}) ? string[2, 1] : ','
  end

  def self.verify(string)
    find = string.scan(/-\d+/)
    fail("negatives not allowed: #{find.join(', ')}") if find.any?
  end

  private_class_method :verify, :delimiter, :solve
end
Something like this, right? This does the job, but I probably need to raise an error when string[2, 1] is a number.
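A rough sketch of what that guard might look like. Raising ArgumentError and the wording of the message are my own choices, not something the kata specifies:

```ruby
# Hypothetical variant of delimiter that rejects a digit as the custom delimiter.
def delimiter(string)
  return ',' unless string.start_with?('//')
  custom = string[2, 1]
  # \A and \z make sure the whole one-character string is a digit.
  fail(ArgumentError, "delimiter cannot be a digit: #{custom}") if custom =~ /\A\d\z/
  custom
end

puts delimiter("//;\n1;2")
puts delimiter('1,2')
```

Calling it with "//1\n112" raises instead of silently splitting on '1'.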

Last edited by Lucien Lachance; 02-27-2014 at 10:18 PM. Reason: refactor!
 
Old 02-27-2014, 10:20 PM   #14
grail
Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 7,906

Rep: Reputation: 2098
Yeah not bad, although maybe we could have thought a little simpler:
Code:
def self.delimiter(string)
    string[0, 2] == '//' ? string[2, 1] : ','
end
 
Old 02-27-2014, 10:42 PM   #15
Lucien Lachance
Member
 
Registered: May 2013
Posts: 82

Original Poster
Rep: Reputation: Disabled
I did feel a little uncomfortable using #match in this instance, but I do see how that's much more readable. Thanks for all the help. It's almost as if we were pair programming in the same room, haha.
 
  



Tags
regex, regexp, ruby


All times are GMT -5. The time now is 11:35 PM.
