I need to do a simple split of a string, but there doesn't seem to be a function for this, and the manual way I tested didn't seem to work. How would I do it?

Best Answer


Here is my really simple solution. Use the gmatch() function to capture strings which contain at least one character of anything other than the desired separator. The separator is any whitespace (%s in Lua) by default:

function mysplit (inputstr, sep)
  if sep == nil then
    sep = "%s"
  end
  local t = {}
  for str in string.gmatch(inputstr, "([^"..sep.."]+)") do
    table.insert(t, str)
  end
  return t
end
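One consequence of matching "one or more non-separator characters" is that empty fields are skipped. The sketch below illustrates this with the same technique; the name split_nonempty is a hypothetical stand-in, not part of the answer above.

```lua
-- Hypothetical helper; same gmatch technique as the answer above.
local function split_nonempty(inputstr, sep)
  sep = sep or "%s"
  local t = {}
  for str in string.gmatch(inputstr, "([^" .. sep .. "]+)") do
    t[#t + 1] = str
  end
  return t
end

local fields = split_nonempty("a,b,,c", ",")
print(#fields)                    -- 3: the empty field between ",," is skipped
print(table.concat(fields, "|"))  -- a|b|c
```

If empty fields matter (for example in CSV-like data), a find-based approach that records every separator position is likely a better fit.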
Split string in Lua?

Splitting a string in Lua can be done using the string.gmatch() function or by iterating through the string manually. Here are two methods to split a string in Lua:

Method 1: Using string.gmatch()

The string.gmatch() function lets you iterate over all occurrences of a pattern in a string. To split a string with it, define a pattern that matches runs of characters other than the delimiter, and let string.gmatch() iterate over those runs.

Here's an example that splits a string by a comma (','):

local str = 'apple,banana,orange'

for word in string.gmatch(str, '[^,]+') do
  print(word)
end

Output:

apple
banana
orange
Method 2: Iterating through the string manually

If you prefer a more manual approach, you can split a string in Lua by iterating through it character by character and building substrings whenever you encounter the delimiter.

Here's an example that splits a string by a space (' '):

local str = 'hello world'
local delimiter = ' '
local substrings = {}
local currentIndex = 1

for i = 1, #str do
  if str:sub(i, i) == delimiter then
    substrings[#substrings + 1] = str:sub(currentIndex, i - 1)
    currentIndex = i + 1
  end
end

-- Add the last substring
substrings[#substrings + 1] = str:sub(currentIndex)

-- Print the substrings
for i, substring in ipairs(substrings) do
  print(substring)
end

Output:

hello
world

These are two common methods to split a string in Lua. Choose the one that best fits your needs and start manipulating strings with ease!

If you are splitting a string in Lua, you should try the string.gmatch() or string.sub() methods. Use the string.sub() method if you know the index you wish to split the string at, or use the string.gmatch() if you will parse the string to find the location to split the string at.
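As a small illustration of the string.sub() case, here is a sketch that cuts a string in two at a known position. The sample string and variable names are assumptions chosen for the example.

```lua
local s = "key=value"
-- Locate the "=" with a plain (non-pattern) search; here it is at index 4.
local i = string.find(s, "=", 1, true)

local left = string.sub(s, 1, i - 1)   -- everything before the "="
local right = string.sub(s, i + 1)     -- everything after the "="

print(left)   -- key
print(right)  -- value
```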

Example using string.gmatch() from Lua 5.1 Reference Manual:

t = {}
s = "from=world, to=Lua"
for k, v in string.gmatch(s, "(%w+)=(%w+)") do
  t[k] = v
end

If you just want to iterate over the tokens, this is pretty neat:

line = "one, two and 3!"
for token in string.gmatch(line, "[^%s]+") do
  print(token)
end

Output:

one,

two

and

3!

Short explanation: the "[^%s]+" pattern matches every non-empty run of non-whitespace characters, so each token is whatever sits between whitespace.
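Lua also has the class %S, the complement of %s, so "%S+" should behave the same as "[^%s]+". A quick sketch comparing the two on the same input:

```lua
local line = "one, two and 3!"
local a, b = {}, {}

-- Collect tokens with the negated-set form and with the %S shorthand.
for token in string.gmatch(line, "[^%s]+") do a[#a + 1] = token end
for token in string.gmatch(line, "%S+") do b[#b + 1] = token end

print(table.concat(a, "|"))  -- one,|two|and|3!
print(table.concat(b, "|"))  -- one,|two|and|3!
```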

Just as string.gmatch will find patterns in a string, this function will find the things between patterns:

function string:split(pat)
  pat = pat or '%s+'
  local st, g = 1, self:gmatch("()("..pat..")")
  local function getter(segs, seps, sep, cap1, ...)
    st = sep and seps + #sep
    return self:sub(segs, (seps or 0) - 1), cap1 or sep, ...
  end
  return function() if st then return getter(st, g()) end end
end

By default it returns whatever is separated by whitespace.

Here is the function:

function split(pString, pPattern)
  local Table = {}  -- NOTE: use {n = 0} in Lua-5.0
  local fpat = "(.-)" .. pPattern
  local last_end = 1
  local s, e, cap = pString:find(fpat, 1)
  while s do
    if s ~= 1 or cap ~= "" then
      table.insert(Table,cap)
    end
    last_end = e+1
    s, e, cap = pString:find(fpat, last_end)
  end
  if last_end <= #pString then
    cap = pString:sub(last_end)
    table.insert(Table, cap)
  end
  return Table
end

Call it like:

list=split(string_to_split,pattern_to_match)

e.g.:

list=split("1:2:3:4",":")


For more go here:
http://lua-users.org/wiki/SplitJoin

Because there is more than one way to skin a cat, here's my approach:

Code:

#!/usr/bin/env lua

local content = [=[
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna
aliqua. Ut enim ad minim veniam, quis nostrud exercitation
ullamco laboris nisi ut aliquip ex ea commodo consequat.
]=]

local function split(str, sep)
  local result = {}
  local regex = ("([^%s]+)"):format(sep)
  for each in str:gmatch(regex) do
    table.insert(result, each)
  end
  return result
end

local lines = split(content, "\n")
for _,line in ipairs(lines) do
  print(line)
end

Output:

Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna
aliqua. Ut enim ad minim veniam, quis nostrud exercitation
ullamco laboris nisi ut aliquip ex ea commodo consequat.

Explanation:

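The gmatch function works as an iterator: it returns, one at a time, every substring that matches the pattern (named regex in the code, although Lua patterns are not full regular expressions). Here the pattern matches the longest run of characters that does not contain the separator.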

A lot of these answers only accept single-character separators, or don't deal with edge cases well (e.g. empty separators), so I thought I would provide a more definitive solution.

Here are two functions, gsplit and split, adapted from the code in the Scribunto MediaWiki extension, which is used on wikis like Wikipedia. The code is licenced under the GPL v2. I have changed the variable names and added comments to make the code a bit easier to understand, and I have also changed the code to use regular Lua string patterns instead of Scribunto's patterns for Unicode strings. The original code has test cases here.

-- gsplit: iterate over substrings in a string separated by a pattern
--
-- Parameters:
-- text (string)    - the string to iterate over
-- pattern (string) - the separator pattern
-- plain (boolean)  - if true (or truthy), pattern is interpreted as a plain
--                    string, not a Lua pattern
--
-- Returns: iterator
--
-- Usage:
-- for substr in gsplit(text, pattern, plain) do
--   doSomething(substr)
-- end
local function gsplit(text, pattern, plain)
  local splitStart, length = 1, #text
  return function ()
    if splitStart then
      local sepStart, sepEnd = string.find(text, pattern, splitStart, plain)
      local ret
      if not sepStart then
        ret = string.sub(text, splitStart)
        splitStart = nil
      elseif sepEnd < sepStart then
        -- Empty separator!
        ret = string.sub(text, splitStart, sepStart)
        if sepStart < length then
          splitStart = sepStart + 1
        else
          splitStart = nil
        end
      else
        ret = sepStart > splitStart and string.sub(text, splitStart, sepStart - 1) or ''
        splitStart = sepEnd + 1
      end
      return ret
    end
  end
end

-- split: split a string into substrings separated by a pattern.
--
-- Parameters:
-- text (string)    - the string to iterate over
-- pattern (string) - the separator pattern
-- plain (boolean)  - if true (or truthy), pattern is interpreted as a plain
--                    string, not a Lua pattern
--
-- Returns: table (a sequence table containing the substrings)
local function split(text, pattern, plain)
  local ret = {}
  for match in gsplit(text, pattern, plain) do
    table.insert(ret, match)
  end
  return ret
end

Some examples of the split function in use:

local function printSequence(t)
  print(unpack(t))
end

printSequence(split('foo, bar,baz', ',%s*'))       -- foo bar baz
printSequence(split('foo, bar,baz', ',%s*', true)) -- foo, bar,baz
printSequence(split('foo', ''))                    -- f o o
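Note that the global unpack used in these examples exists in Lua 5.1 and LuaJIT, while Lua 5.2 and later provide it as table.unpack. A small compatibility shim, offered as a sketch rather than part of the original answer:

```lua
-- Use table.unpack where it exists (Lua 5.2+), otherwise the 5.1 global.
local unpack = table.unpack or unpack

local function printSequence(t)
  print(unpack(t))
end

printSequence({ "f", "o", "o" })  -- prints the three values separated by tabs
```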

I like this short solution:

function split(s, delimiter)
  result = {};
  for match in (s..delimiter):gmatch("(.-)"..delimiter) do
    table.insert(result, match);
  end
  return result;
end
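Because the delimiter is concatenated into a Lua pattern, a delimiter such as "." or "%" would be treated as a magic character. One possible workaround, sketched here with a hypothetical escape_pattern helper, is to prefix every punctuation character with "%" before building the pattern:

```lua
-- Hypothetical helper: escape all punctuation so it is matched literally.
local function escape_pattern(s)
  return (s:gsub("%p", "%%%0"))
end

local function split(s, delimiter)
  local result = {}
  local pat = escape_pattern(delimiter)
  -- Append one delimiter so the last field is also terminated by a match.
  for match in (s .. delimiter):gmatch("(.-)" .. pat) do
    result[#result + 1] = match
  end
  return result
end

local parts = split("1.2.3", ".")
print(table.concat(parts, "|"))  -- 1|2|3
```

Unlike the gmatch("[^sep]+") variants, this form keeps empty fields, so split("a,,b", ",") yields three entries.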

You can use this method:

function string:split(delimiter)
  local result = { }
  local from = 1
  local delim_from, delim_to = string.find( self, delimiter, from )
  while delim_from do
    table.insert( result, string.sub( self, from , delim_from-1 ) )
    from = delim_to + 1
    delim_from, delim_to = string.find( self, delimiter, from )
  end
  table.insert( result, string.sub( self, from ) )
  return result
end

delimiter = string.split(stringtodelimite,pattern)

A way not seen in the other answers, using string.gsub with a callback:

local function str_split(str, sep)
  local sep, res = sep or '%s', {}
  string.gsub(str, '[^'..sep..']+', function(x) res[#res+1] = x end)
  return res
end

Simply splitting on a delimiter:

local str = 'one,two'
local regxEverythingExceptComma = '([^,]+)'
for x in string.gmatch(str, regxEverythingExceptComma) do
  print(x)
end

You could use the Penlight library. It has a function that splits a string on a delimiter and returns a list.

It also implements many other functions that are often needed while programming but are missing from Lua.

Here is the sample for using it.

> stringx = require "pl.stringx"
> str = "welcome to the world of lua"
> arr = stringx.split(str, " ")
> arr
{welcome,to,the,world,of,lua}
>

I used the above examples to craft my own function. But the missing piece for me was automatically escaping magic characters.

Here is my contribution:

function split(text, delim)
  -- returns an array of fields based on text and delimiter (one character only)
  local result = {}
  local magic = "().%+-*?[]^$"

  if delim == nil then
    delim = "%s"
  elseif string.find(magic, delim, 1, true) then
    -- escape magic: delim is one of the magic characters listed above
    delim = "%"..delim
  end

  local pattern = "[^"..delim.."]+"
  for w in string.gmatch(text, pattern) do
    table.insert(result, w)
  end
  return result
end

Super late to this question, but in case anyone wants a version that limits the number of splits:

-- Split a string into a table using a delimiter and a limit
string.split = function(str, pat, limit)
  local t = {}
  local fpat = "(.-)" .. pat
  local last_end = 1
  local s, e, cap = str:find(fpat, 1)
  while s do
    if s ~= 1 or cap ~= "" then
      table.insert(t, cap)
    end

    last_end = e+1
    s, e, cap = str:find(fpat, last_end)

    if limit ~= nil and limit <= #t then
      break
    end
  end

  if last_end <= #str then
    cap = str:sub(last_end)
    table.insert(t, cap)
  end

  return t
end

For those coming from exercise 10.1 of the "Programming in Lua" book: it seems clear that we cannot use notions explained later in the book (iterators), and that the function should accept a separator longer than a single character.

split() relies on a trick: it makes the pattern match what is not wanted (the separator), and it returns an empty table for an empty string. The result of plainSplit() is closer to what split returns in other languages.

magic = "([%%%.%(%)%+%*%?%[%]%^%$])"

function split(str, sep, plain)
  if plain then sep = string.gsub(sep, magic, "%%%1") end

  local N = '\255'
  str = N..str..N
  str = string.gsub(str, sep, N..N)

  local result = {}
  for word in string.gmatch(str, N.."(.-)"..N) do
    if word ~= "" then
      table.insert(result, word)
    end
  end
  return result
end

function plainSplit(str, sep)
  sep = string.gsub(sep, magic, "%%%1")

  local result = {}
  local start = 0
  repeat
    start = start + 1

    local from, to = string.find(str, sep, start)
    from = from and from-1

    local word = string.sub(str, start, from, true)
    table.insert(result, word)

    start = to
  until start == nil

  return result
end

function tableToString(t)
  local ret = "{"
  for _, word in ipairs(t) do
    ret = ret .. '"' .. word .. '", '
  end
  ret = string.sub(ret, 1, -3)
  ret = ret .. "}"

  return #ret > 1 and ret or "{}"
end

function runSplit(func, title, str, sep, plain)
  print("\n" .. title)
  print("str: '"..str.."'")
  print("sep: '"..sep.."'")
  local t = func(str, sep, plain)
  print("-- t = " .. tableToString(t))
end

print("\n\n\n=== Pattern split ===")
runSplit(split, "Exercice 10.1", "a whole new world", " ")
runSplit(split, "With trailing seperator", "  a  whole   new world  ", " ")
runSplit(split, "A word seperator", "a whole new world", " whole ")
runSplit(split, "Pattern seperator", "a1whole2new3world", "%d")
runSplit(split, "Magic characters as plain seperator", "a$.%whole$.%new$.%world", "$.%", true)
runSplit(split, "Control seperator", "a\0whole\1new\2world", "%c")
runSplit(split, "ISO Time", "2020-07-10T15:00:00.000", "[T:%-%.]")

runSplit(split, " === [Fails] with \\255 ===", "a\255whole\0new\0world", "\0", true)

runSplit(split, "How does your function handle empty string?", "", " ")

print("\n\n\n=== Plain split ===")
runSplit(plainSplit, "Exercice 10.1", "a whole new world", " ")
runSplit(plainSplit, "With trailing seperator", "  a  whole   new world  ", " ")
runSplit(plainSplit, "A word seperator", "a whole new world", " whole ")
runSplit(plainSplit, "Magic characters as plain seperator", "a$.%whole$.%new$.%world", "$.%")

runSplit(plainSplit, "How does your function handle empty string?", "", " ")

Output:

=== Pattern split ===

Exercice 10.1
str: 'a whole new world'
sep: ' '
-- t = {"a", "whole", "new", "world"}

With trailing seperator
str: '  a  whole   new world  '
sep: ' '
-- t = {"a", "whole", "new", "world"}

A word seperator
str: 'a whole new world'
sep: ' whole '
-- t = {"a", "new world"}

Pattern seperator
str: 'a1whole2new3world'
sep: '%d'
-- t = {"a", "whole", "new", "world"}

Magic characters as plain seperator
str: 'a$.%whole$.%new$.%world'
sep: '$.%'
-- t = {"a", "whole", "new", "world"}

Control seperator
str: 'awholenewworld'
sep: '%c'
-- t = {"a", "whole", "new", "world"}

ISO Time
str: '2020-07-10T15:00:00.000'
sep: '[T:%-%.]'
-- t = {"2020", "07", "10", "15", "00", "00", "000"}

 === [Fails] with \255 ===
str: 'a�wholenewworld'
sep: ''
-- t = {"a"}

How does your function handle empty string?
str: ''
sep: ' '
-- t = {}

=== Plain split ===

Exercice 10.1
str: 'a whole new world'
sep: ' '
-- t = {"a", "whole", "new", "world"}

With trailing seperator
str: '  a  whole   new world  '
sep: ' '
-- t = {"", "", "a", "", "whole", "", "", "new", "world", "", ""}

A word seperator
str: 'a whole new world'
sep: ' whole '
-- t = {"a", "new world"}

Magic characters as plain seperator
str: 'a$.%whole$.%new$.%world'
sep: '$.%'
-- t = {"a", "whole", "new", "world"}

How does your function handle empty string?
str: ''
sep: ' '
-- t = {""}

I found that many of the other answers had edge cases that failed (e.g. when the given string contains #, { or } characters, or when the delimiter is a character like % that requires escaping). Here is the implementation that I went with instead:

local function newsplit(delimiter, str)
  assert(type(delimiter) == "string")
  assert(#delimiter > 0, "Must provide non empty delimiter")

  -- Add escape characters if delimiter requires it
  delimiter = delimiter:gsub("[%(%)%.%%%+%-%*%?%[%]%^%$]", "%%%0")

  local start_index = 1
  local result = {}

  while true do
    local delimiter_index, delimiter_end = str:find(delimiter, start_index)

    if delimiter_index == nil then
      table.insert(result, str:sub(start_index))
      break
    end

    table.insert(result, str:sub(start_index, delimiter_index - 1))

    -- Resume after the end of the match so multi-character delimiters work
    start_index = delimiter_end + 1
  end

  return result
end
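An alternative to escaping is to pass true as the fourth argument (plain) of string.find, which turns off pattern matching so the delimiter is treated as a literal substring. A minimal sketch, with split_plain as a hypothetical name:

```lua
-- Hypothetical helper: split on a literal delimiter, no pattern matching.
local function split_plain(str, delimiter)
  assert(#delimiter > 0, "delimiter must be non-empty")
  local result, start = {}, 1
  while true do
    -- plain = true: magic characters in the delimiter have no special meaning
    local s, e = string.find(str, delimiter, start, true)
    if not s then
      result[#result + 1] = string.sub(str, start)
      break
    end
    result[#result + 1] = string.sub(str, start, s - 1)
    start = e + 1
  end
  return result
end

local parts = split_plain("a%%b%%c", "%%")
print(table.concat(parts, "|"))  -- a|b|c
```

This keeps empty fields and handles multi-character delimiters, at the cost of not supporting pattern separators at all.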

Here is a routine that works in Lua 4.0, returning a table t of the substrings in inputstr delimited by sep:

function string_split(inputstr, sep)
  local inputstr = inputstr .. sep
  local idx, inc, t = 0, 1, {}
  local idx_prev, substr
  repeat
    idx_prev = idx
    -- chop off the beginning of the string containing the match last found by strfind (or initially, nothing); keep the rest (or initially, all)
    inputstr = strsub(inputstr, idx + 1, -1)
    -- find the 0-based r_index of the first occurrence of separator
    idx = strfind(inputstr, sep)
    -- quit if nothing's found
    if idx == nil then break end
    -- extract the substring occurring before the separator (i.e., data field before the next delimiter)
    substr = strsub(inputstr, 0, idx)
    -- eliminate control characters, separator and spaces
    substr = gsub(substr, "[%c" .. sep .. " ]", "")
    -- store the substring (i.e., data field)
    t[inc] = substr
    -- iterate to next
    inc = inc + 1
  until idx == nil
  return t
end

This simple test

inputstr = "the brown lazy fox jumped over the fat grey hen ... or something."
sep = " "
t = {}
t = string_split(inputstr,sep)
for i=1,15 do
  print(i, t[i])
end

Yields:

--> t[1]=the
--> t[2]=brown
--> t[3]=lazy
--> t[4]=fox
--> t[5]=jumped
--> t[6]=over
--> t[7]=the
--> t[8]=fat
--> t[9]=grey
--> t[10]=hen
--> t[11]=...
--> t[12]=or
--> t[13]=something.

Depending on the use case, this could be useful. It cuts all text either side of the flags:

b = "This is a string used for testing"
--Removes unwanted text
c = (b:match("a([^/]+)used"))
print (c)

Output:

string