Paste number 53658: "split" proposal

Pasted by: cgay
9 months, 1 day ago
#dylan
Paste contents:
How would people feel about unifying the three versions of "split" we have with the following?  I came up with this in 10 minutes so feel free to poke holes in it!

    // In common-dylan library
    define generic split
        (string :: <string>, separator :: <object>,
         #key start, end: _end, max-splits)
     => (strings :: <sequence>);

    // In common-dylan library
    define method split
        (string :: <string>, separator :: <string>,
         #key start, end: _end, max-splits)
     => (strings :: <sequence>)
      split(string, method (string, start) ... end);
    end;

    // In common-dylan library
    define method split
        (string :: <string>, separator :: <character>,
         #key start, end: _end, max-splits)
     => (strings :: <sequence>)
      split(string, method (string, start) ... end);
    end;

    // In common-dylan library
    define method split
        (string :: <string>, separator :: <function>,
         #key start, end: _end, max-splits)
     => (strings :: <sequence>)
      // ...loop between start and _end, applying function...
    end;

    // In regular-expressions library
    define method split
        (string :: <string>, separator :: <regex>,
         #key start, end: _end, max-splits)
     => (strings :: <sequence>)
      split(string, method (string, start) ... end);
    end;

Annotations for this paste:

Annotation number 1: more...
Pasted by: cgay
9 months, 1 day ago
Paste contents:
I don't think the "trim?" argument to the current split gf in common-dylan is important.  It's replaceable with map(trim, split(...)).

|Agent suggests adding a remove-empty-items argument.  I agree.

There should be a whitespace? method in common-dylan, if there isn't already, so the common case is just split(string, whitespace?).

Annotation number 2: updated split proposal
Pasted by: cgay
8 months, 4 weeks ago
Paste contents:
    // In common-dylan library
    define generic split
        (thing :: <sequence>, separator :: <object>,
         #key start :: <integer> = 0,
              end: _end :: false-or(<integer>),
              max-splits :: false-or(<integer>))
     => (parts :: <sequence>);

    // In common-dylan library
    // This is in some sense the most basic method, since others can be
    // implemented in terms of it.
    define method split
        (seq :: <sequence>, separator :: <function>,
         #key start, end: _end, max-splits)
     => (parts :: <sequence>)
      // ...loop between start and _end, applying separator function...
      // The function accepts (seq, start, end) and returns
      // (match-start, match-end) or #f.
    end;

    // In common-dylan library
    // Splits seq around occurrences of the separator subsequence.
    // Works for the relatively common case where seq and separator
    // are both <string>s.
    define method split
        (seq :: <sequence>, separator :: <sequence>,
         #key start, end: _end, max-splits)
     => (parts :: <sequence>)
      split(seq, method (sequence, start, _end) ... end);
    end;

    // In common-dylan library
    // Splits seq around any element that is in the separator <set>.
    // Do we even have a built-in <set> class?
    define method split
        (seq :: <sequence>, separator :: <set>,
         #key start, end: _end, max-splits)
     => (parts :: <sequence>)
      split(seq,
            method (sequence, start, _end)
              if (member?(sequence[start], separator))
                values(start, start + 1)
              end
            end,
            start: start, end: _end, max-splits: max-splits)
    end;

    // In common-dylan library
    define method split
        (seq :: <sequence>, separator :: <character>,
         #key start, end: _end, max-splits)
     => (parts :: <sequence>)
      split(seq, make(<string>, size: 1, fill: separator), ...)
    end;

    // In regular-expressions library
    define method split
        (seq :: <sequence>, separator :: <regex>,
         #key start, end: _end, max-splits)
     => (parts :: <sequence>)
      split(seq, method (sequence, start, _end) ... end);
    end;
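
The body of the <function> method is elided above, so here is one way the loop could look. This is only a sketch, not the common-dylan implementation. The names sketch-split and find-comma are made up for illustration, chosen so they don't clash with any existing split. It assumes the separator function searches forward from start and never returns an empty match (an empty match would loop forever here).

```dylan
// Hypothetical sketch of the loop in the <function> method.
// Assumption: separator(seq, start, end) searches forward and returns
// (match-start, match-end), or #f when nothing more is found.
define method sketch-split
    (seq :: <sequence>, separator :: <function>,
     #key start :: <integer> = 0,
          end: _end :: false-or(<integer>) = #f,
          max-splits :: false-or(<integer>) = #f)
 => (parts :: <sequence>)
  let epos :: <integer> = _end | seq.size;
  let parts = make(<stretchy-vector>);
  let bpos :: <integer> = start;
  let searching? = #t;
  // parts.size equals the number of splits made so far.
  while (searching? & (~max-splits | parts.size < max-splits))
    let (match-start, match-end) = separator(seq, bpos, epos);
    if (match-start)
      add!(parts, copy-sequence(seq, start: bpos, end: match-start));
      bpos := match-end;
    else
      searching? := #f;
    end if;
  end while;
  // Whatever follows the last separator is the final part.
  add!(parts, copy-sequence(seq, start: bpos, end: epos));
  parts
end method sketch-split;

// Hypothetical separator function: finds the next comma in [bpos, epos).
define method find-comma
    (seq :: <sequence>, bpos :: <integer>, epos :: <integer>)
 => (match-start :: false-or(<integer>), match-end :: false-or(<integer>))
  block (return)
    for (i from bpos below epos)
      if (seq[i] = ',')
        return(i, i + 1);
      end if;
    end for;
    values(#f, #f)
  end block
end method find-comma;

// sketch-split("a,b,,c", find-comma)                 yields "a", "b", "", "c"
// sketch-split("a,b,,c", find-comma, max-splits: 1)  yields "a", "b,,c"
```

With this shape, the other methods only need to build a suitable separator function and pass start, end: and max-splits: through.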

Note that there is no "trim" parameter.  It can be accomplished via

    map(trim, split(...))

assuming there's a "trim" function available, which there should be.
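
For illustration, such a function could be as small as the following. The name sketch-trim and its notion of whitespace (space, tab, newline) are assumptions made for this sketch, not an existing common-dylan API.

```dylan
// Hypothetical trim: strips spaces, tabs and newlines from both ends.
define method sketch-trim (string :: <string>) => (trimmed :: <string>)
  local method blank? (c :: <character>) => (result :: <boolean>)
          c = ' ' | c = '\t' | c = '\n'
        end method;
  let bpos :: <integer> = 0;
  let epos :: <integer> = string.size;
  while (bpos < epos & blank?(string[bpos]))
    bpos := bpos + 1;
  end while;
  while (epos > bpos & blank?(string[epos - 1]))
    epos := epos - 1;
  end while;
  copy-sequence(string, start: bpos, end: epos)
end method sketch-trim;

// map(sketch-trim, #(" a ", "b  ", " c"))  yields "a", "b", "c"
```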


Note that there is no "remove-empty-subsequences" parameter.  It can
be accomplished via

    choose(complement(empty?), split(...))

which is actually fewer characters than passing the keyword, once you count the ':' and "#t".
:-)  I don't think there's a strong argument to be made based on
efficiency either.
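
A quick illustration using only standard Dylan functions (choose, complement, empty?), with a literal list standing in for the result of split:

```dylan
// complement(empty?) returns a function that is true for non-empty
// collections, so choose keeps only the non-empty parts.
let parts = #("a", "", "b", "");
let non-empty = choose(complement(empty?), parts);
// non-empty now holds "a" and "b"
```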
