| Paste number 53658: | "split" proposal |
| Pasted by: | cgay |
| 9 months, 1 day ago | |
| #dylan | |
| Paste contents: |
How would people feel about unifying the three versions of "split" we have with the following? I came up with this in 10 minutes so feel free to poke holes in it!

    // In common-dylan library
    define generic split
        (string :: <string>, separator :: <object>,
         #key start, end: _end, max-splits)
     => (strings :: <sequence>);

    // In common-dylan library
    define method split
        (string :: <string>, separator :: <string>,
         #key start, end: _end, max-splits)
     => (strings :: <sequence>)
      split(string, method (string, start) ... end);
    end;

    // In common-dylan library
    define method split
        (string :: <string>, separator :: <character>,
         #key start, end: _end, max-splits)
     => (strings :: <sequence>)
      split(string, method (string, start) ... end);
    end;

    // In common-dylan library
    define method split
        (string :: <string>, separator :: <function>,
         #key start, end: _end, max-splits)
     => (strings :: <sequence>)
      // ...loop between start and _end, applying function...
    end;

    // In regular-expressions library
    define method split
        (string :: <string>, separator :: <regex>,
         #key start, end: _end, max-splits)
     => (strings :: <sequence>)
      split(string, method (string, start) ... end);
    end;
Annotations for this paste:
| Annotation number 1: | more... |
| Pasted by: | cgay |
| 9 months, 1 day ago | |
| Paste contents: |
I don't think the "trim?" argument to the current split gf in common-dylan is important. It's replaceable with map(trim, split(...)).

|Agent suggests adding a remove-empty-items argument. I agree.

There should be a whitespace? method in common-dylan, if there isn't already, so the common case is just split(string, whitespace?).
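To make the two points above concrete, here is a small sketch in Python (chosen only so it can be run directly; it is not the Dylan API). `split_where` is a hypothetical helper that assumes the separator is a one-character predicate, which is one possible reading of split(string, whitespace?). Trimming is then an ordinary map over the result rather than a keyword argument.

```python
def split_where(text, is_separator):
    """Split text at every character for which is_separator returns True.

    Hypothetical helper: it assumes a one-character predicate as the
    separator, standing in for the proposed split(string, whitespace?).
    """
    parts, current = [], []
    for ch in text:
        if is_separator(ch):
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


# Trimming as post-processing, the analogue of map(trim, split(...)).
fields = split_where("a , b ,c", lambda ch: ch == ",")
trimmed = list(map(str.strip, fields))

# The "common case": split on a whitespace predicate.
words = split_where("one two\tthree", str.isspace)
```

Here `str.strip` and `str.isspace` play the roles of the trim function and whitespace? predicate mentioned above.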
| Annotation number 2: | updated split proposal |
| Pasted by: | cgay |
| 8 months, 4 weeks ago | |
| Paste contents: |
    // In common-dylan library
    define generic split
        (thing :: <sequence>, separator :: <object>,
         #key start :: <integer> = 0,
              end: _end :: false-or(<integer>),
              max-splits :: false-or(<integer>))
     => (parts :: <sequence>);

    // In common-dylan library
    // This is in some sense the most basic method, since others can be
    // implemented in terms of it.
    define method split
        (seq :: <sequence>, separator :: <function>,
         #key start, end: _end, max-splits)
     => (parts :: <sequence>)
      // ...loop between start and _end, applying separator function...
      // The function accepts (seq, start, end) and returns
      // (match-start, match-end) or #f.
    end;

    // In common-dylan library
    // Splits seq around occurrences of the separator subsequence.
    // Works for the relatively common case where seq and separator
    // are both <string>s.
    define method split
        (seq :: <sequence>, separator :: <sequence>,
         #key start, end: _end, max-splits)
     => (parts :: <sequence>)
      split(seq, method (sequence, start, _end) ... end);
    end;

    // In common-dylan library
    // Splits seq around any element that is in the separator <set>.
    // Do we even have a built-in <set> class?
    define method split
        (seq :: <sequence>, separator :: <set>,
         #key start, end: _end, max-splits)
     => (parts :: <sequence>)
      split(seq,
            method (sequence, start, _end)
              if (member?(sequence[start], separator))
                values(start, start + 1)
              end
            end,
            start: start, end: _end, max-splits: max-splits)
    end;

    // In common-dylan library
    define method split
        (seq :: <sequence>, separator :: <character>,
         #key start, end: _end, max-splits)
     => (parts :: <sequence>)
      split(seq, make(<string>, size: 1, fill: separator), ...)
    end;

    // In regular-expressions library
    define method split
        (seq :: <sequence>, separator :: <regex>,
         #key start, end: _end, max-splits)
     => (parts :: <sequence>)
      split(seq, method (sequence, start, _end) ... end);
    end;

Note that there is no "trim" parameter. It can be accomplished via map(trim, split(...)), assuming there's a "trim" function available, which there should be.

Note that there is no "remove-empty-subsequences" parameter. It can be accomplished via choose(complement(empty?), split(...)), which is actually fewer characters when you include ':' and "#t". :-) I don't think there's a strong argument to be made based on efficiency either.
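The separator-function protocol in the updated proposal can be modelled with a short Python sketch (Python only so that it runs as-is; it is not the common-dylan implementation). Several details are assumptions, not part of the paste: the function is tried at each position in turn, which fits the <set> method's test of sequence[start]; max-splits is read as the number of separators consumed; and the helper names `any_of` and `subsequence` are hypothetical.

```python
def split(seq, find, start=0, end=None, max_splits=None):
    """Split seq[start:end] around the spans reported by find.

    find(seq, start, end) returns (match_start, match_end) or None,
    mirroring the "(match-start, match-end) or #f" contract.
    """
    end = len(seq) if end is None else end
    parts = []
    part_start = start  # where the current part begins
    i = start           # where the separator function is tried next
    while i < end and (max_splits is None or len(parts) < max_splits):
        match = find(seq, i, end)
        if match is None:
            i += 1
            continue
        m_start, m_end = match
        parts.append(seq[part_start:m_start])
        part_start = m_end
        i = max(m_end, i + 1)  # always advance, even on an empty match
    parts.append(seq[part_start:end])
    return parts


def any_of(members):
    """Separator matching any single element contained in members."""
    def find(seq, start, end):
        if seq[start] in members:
            return start, start + 1
        return None
    return find


def subsequence(sep):
    """Separator matching sep where it begins exactly at start."""
    n = len(sep)
    def find(seq, start, end):
        if start + n <= end and seq[start:start + n] == sep:
            return start, start + n
        return None
    return find


by_set = split("a,b,,c", any_of({","}))
limited = split("a, b, c", subsequence(", "), max_splits=1)
numbers = split([1, 0, 2, 0, 3], any_of({0}))
non_empty = [part for part in by_set if part]  # choose(complement(empty?), ...)
```

Because every other separator type reduces to a function, only this one loop has to handle start, end and max-splits. Dropping empty parts stays outside split as a plain filter, as argued above.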