Regular Expression to split on spaces unless in quotes


I would like to use the .NET Regex.Split method to split this input string into an array. It must split on whitespace, except where the whitespace is enclosed in quotes.

Input: Here is "my string"    it has "six  matches"

Expected output:

  1. Here
  2. is
  3. my string
  4. it
  5. has
  6. six  matches

What pattern do I need? Also, do I need to specify any RegexOptions?

4/3/2009 9:49:03 PM

Accepted Answer

No options required:

    Regex regex = new Regex(@"\w+|""[\w\s]*""");

Or, if you need to exclude the " characters:

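    Regex
        .Matches(input, @"(?<match>\w+)|""(?<match>[\w\s]*)""")
        .Cast<Match>()  // MatchCollection is non-generic, so cast before using LINQ
        .Select(m => m.Groups["match"].Value)
        .ToList()       // ForEach is defined on List<T>, not on IEnumerable<T>
        .ForEach(s => Console.WriteLine(s));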
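
For anyone who wants to try both patterns side by side, here is a minimal sketch of a console program that runs them against the input from the question. The class and method names (`QuoteAwareSplitDemo`, `Tokens`, `TokensUnquoted`) are made up for illustration and are not part of the original answer. Note that it uses Regex.Matches rather than Regex.Split, as the answer's own snippet does.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class QuoteAwareSplitDemo
{
    // Pattern from the answer: a bare word, or a quoted run of word/space characters.
    const string WithQuotes = @"\w+|""[\w\s]*""";

    // Variant with a named group, so the surrounding quotes are not captured.
    const string WithoutQuotes = @"(?<match>\w+)|""(?<match>[\w\s]*)""";

    // Hypothetical helper: returns each whole match, quotes included.
    public static string[] Tokens(string input) =>
        Regex.Matches(input, WithQuotes)
             .Cast<Match>()
             .Select(m => m.Value)
             .ToArray();

    // Hypothetical helper: returns only the named group, so quotes are dropped.
    public static string[] TokensUnquoted(string input) =>
        Regex.Matches(input, WithoutQuotes)
             .Cast<Match>()
             .Select(m => m.Groups["match"].Value)
             .ToArray();

    static void Main()
    {
        string input = @"Here is ""my string""    it has ""six  matches""";
        Console.WriteLine(string.Join("|", Tokens(input)));
        Console.WriteLine(string.Join("|", TokensUnquoted(input)));
    }
}
```

The first line printed should be `Here|is|"my string"|it|has|"six  matches"` and the second `Here|is|my string|it|has|six  matches`. Because both patterns rely on `\w`, punctuation outside quotes (for example the apostrophe in `it's`) is skipped rather than kept as part of a token.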
2/16/2009 7:27:04 PM

Lieven's solution gets most of the way there and, as he states in his comments, it's just a matter of changing the ending to Bartek's solution. The end result is the following working regex:


Input: Here is "my string" it has "six matches"

Output:


  1. Here
  2. is
  3. "my string"
  4. it
  5. has
  6. "six matches"

Unfortunately, that includes the quotes. If you instead use the pattern shown in the code below and explicitly capture the "token" matches, the quotes are dropped:

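    // Requires: using System.Diagnostics; using System.Linq; using System.Text.RegularExpressions;
    RegexOptions options = RegexOptions.None;
    // A token is either the text between unescaped quotes or a run of word characters.
    Regex regex = new Regex( @"((""((?<token>.*?)(?<!\\)"")|(?<token>[\w]+))(\s)*)", options );
    string input = @"   Here is ""my string"" it has   "" six  matches""   ";
    var result = (from Match m in regex.Matches( input )
                  where m.Groups[ "token" ].Success
                  select m.Groups[ "token" ].Value).ToList();

    // result is a List<string>, so use its Count property rather than LINQ's Count().
    for ( int i = 0; i < result.Count; i++ )
        Debug.WriteLine( string.Format( "Token[{0}]: '{1}'", i, result[ i ] ) );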

Debug output:

    Token[0]: 'Here'
    Token[1]: 'is'
    Token[2]: 'my string'
    Token[3]: 'it'
    Token[4]: 'has'
    Token[5]: ' six  matches'
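
The `(?<!\\)` lookbehind in that pattern means a closing quote only counts if it is not preceded by a backslash, so backslash-escaped quotes can appear inside a quoted token. The sketch below illustrates this with a made-up input; the `TokenDemo` class and `Tokenize` method are hypothetical names, and the pattern itself is copied unchanged from the code above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

static class TokenDemo
{
    // Same pattern as in the answer; .NET allows the "token" name to be reused
    // in both alternatives.
    static readonly Regex TokenRegex = new Regex(
        @"((""((?<token>.*?)(?<!\\)"")|(?<token>[\w]+))(\s)*)");

    // Hypothetical helper: returns the captured "token" group of every match.
    public static List<string> Tokenize(string input) =>
        TokenRegex.Matches(input)
                  .Cast<Match>()
                  .Where(m => m.Groups["token"].Success)
                  .Select(m => m.Groups["token"].Value)
                  .ToList();

    static void Main()
    {
        // Actual text: say "a \"quoted\" word" now
        string input = @"say ""a \""quoted\"" word"" now";
        foreach (string token in Tokenize(input))
            Console.WriteLine("'" + token + "'");
    }
}
```

This should print `'say'`, `'a \"quoted\" word'` and `'now'` on separate lines. The backslashes stay in the token; removing them would be a separate step that this sketch does not attempt.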

