URL Encoding using C#


Question

I have an application which sends a POST request to the VB forum software and logs someone in (without setting cookies or anything).

Once the user is logged in I create a variable that creates a path on their local machine.

c:\tempfolder\date\username

The problem is that some usernames are throwing "Illegal chars" exception. For example if my username was mas|fenix it would throw an exception..

Path.Combine( _      
  Environment.GetFolderPath(System.Environment.SpecialFolder.CommonApplicationData), _
  DateTime.Now.ToString("ddMMyyhhmm") + "-" + form1.username)

I don't want to remove it from the string, but a folder with their username is created through FTP on a server. And this leads to my second question. If I am creating a folder on the server can I leave the "illegal chars" in? I only ask this because the server is Linux based, and I am not sure if Linux accepts it or not.

EDIT: It seems that URL encode is NOT what I want.. Here's what I want to do:

old username = mas|fenix
new username = mas%xxfenix

Where %xx is the ASCII value or any other value that would easily identify the character.

1
306
2/17/2018 2:03:39 PM

Accepted Answer

Edit: Note that this answer is now out of date. See Siarhei Kuchuk's answer below for a better fix

UrlEncoding will do what you are suggesting here. With C#, you simply use HttpUtility, as mentioned.

You can also Regex the illegal characters and then replace, but this gets far more complex, as you will have to have some form of state machine (switch ... case, for example) to replace with the correct characters. Since UrlEncode does this up front, it is rather easy.

As for Linux versus windows, there are some characters that are acceptable in Linux that are not in Windows, but I would not worry about that, as the folder name can be returned by decoding the Url string, using UrlDecode, so you can round trip the changes.

181
5/23/2017 10:31:37 AM

I've been experimenting with the various methods .NET provide for URL encoding. Perhaps the following table will be useful (as output from a test app I wrote):

Unencoded UrlEncoded UrlEncodedUnicode UrlPathEncoded EscapedDataString EscapedUriString HtmlEncoded HtmlAttributeEncoded HexEscaped
A         A          A                 A              A                 A                A           A                    %41
B         B          B                 B              B                 B                B           B                    %42

a         a          a                 a              a                 a                a           a                    %61
b         b          b                 b              b                 b                b           b                    %62

0         0          0                 0              0                 0                0           0                    %30
1         1          1                 1              1                 1                1           1                    %31

[space]   +          +                 %20            %20               %20              [space]     [space]              %20
!         !          !                 !              !                 !                !           !                    %21
"         %22        %22               "              %22               %22              "      "               %22
#         %23        %23               #              %23               #                #           #                    %23
$         %24        %24               $              %24               $                $           $                    %24
%         %25        %25               %              %25               %25              %           %                    %25
&         %26        %26               &              %26               &                &       &                %26
'         %27        %27               '              '                 '                '       '                %27
(         (          (                 (              (                 (                (           (                    %28
)         )          )                 )              )                 )                )           )                    %29
*         *          *                 *              %2A               *                *           *                    %2A
+         %2b        %2b               +              %2B               +                +           +                    %2B
,         %2c        %2c               ,              %2C               ,                ,           ,                    %2C
-         -          -                 -              -                 -                -           -                    %2D
.         .          .                 .              .                 .                .           .                    %2E
/         %2f        %2f               /              %2F               /                /           /                    %2F
:         %3a        %3a               :              %3A               :                :           :                    %3A
;         %3b        %3b               ;              %3B               ;                ;           ;                    %3B
<         %3c        %3c               <              %3C               %3C              &lt;        &lt;                 %3C
=         %3d        %3d               =              %3D               =                =           =                    %3D
>         %3e        %3e               >              %3E               %3E              &gt;        >                    %3E
?         %3f        %3f               ?              %3F               ?                ?           ?                    %3F
@         %40        %40               @              %40               @                @           @                    %40
[         %5b        %5b               [              %5B               %5B              [           [                    %5B
\         %5c        %5c               \              %5C               %5C              \           \                    %5C
]         %5d        %5d               ]              %5D               %5D              ]           ]                    %5D
^         %5e        %5e               ^              %5E               %5E              ^           ^                    %5E
_         _          _                 _              _                 _                _           _                    %5F
`         %60        %60               `              %60               %60              `           `                    %60
{         %7b        %7b               {              %7B               %7B              {           {                    %7B
|         %7c        %7c               |              %7C               %7C              |           |                    %7C
}         %7d        %7d               }              %7D               %7D              }           }                    %7D
~         %7e        %7e               ~              ~                 ~                ~           ~                    %7E

Ā         %c4%80     %u0100            %c4%80         %C4%80            %C4%80           Ā           Ā                    [OoR]
ā         %c4%81     %u0101            %c4%81         %C4%81            %C4%81           ā           ā                    [OoR]
Ē         %c4%92     %u0112            %c4%92         %C4%92            %C4%92           Ē           Ē                    [OoR]
ē         %c4%93     %u0113            %c4%93         %C4%93            %C4%93           ē           ē                    [OoR]
Ī         %c4%aa     %u012a            %c4%aa         %C4%AA            %C4%AA           Ī           Ī                    [OoR]
ī         %c4%ab     %u012b            %c4%ab         %C4%AB            %C4%AB           ī           ī                    [OoR]
Ō         %c5%8c     %u014c            %c5%8c         %C5%8C            %C5%8C           Ō           Ō                    [OoR]
ō         %c5%8d     %u014d            %c5%8d         %C5%8D            %C5%8D           ō           ō                    [OoR]
Ū         %c5%aa     %u016a            %c5%aa         %C5%AA            %C5%AA           Ū           Ū                    [OoR]
ū         %c5%ab     %u016b            %c5%ab         %C5%AB            %C5%AB           ū           ū                    [OoR]

The columns represent encodings as follows:

  • UrlEncoded: HttpUtility.UrlEncode

  • UrlEncodedUnicode: HttpUtility.UrlEncodeUnicode

  • UrlPathEncoded: HttpUtility.UrlPathEncode

  • EscapedDataString: Uri.EscapeDataString

  • EscapedUriString: Uri.EscapeUriString

  • HtmlEncoded: HttpUtility.HtmlEncode

  • HtmlAttributeEncoded: HttpUtility.HtmlAttributeEncode

  • HexEscaped: Uri.HexEscape

NOTES:

  1. HexEscape can only handle the first 255 characters. Therefore it throws an ArgumentOutOfRange exception for the Latin A-Extended characters (eg Ā).

  2. This table was generated in .NET 4.0 (see Levi Botelho's comment below that says the encoding in .NET 4.5 is slightly different).

EDIT:

I've added a second table with the encodings for .NET 4.5. See this answer: https://stackoverflow.com/a/21771206/216440

EDIT 2:

Since people seem to appreciate these tables, I thought you might like the source code that generates the table, so you can play around yourselves. It's a simple C# console application, which can target either .NET 4.0 or 4.5:

using System;
using System.Collections.Generic;
using System.Text;
// Need to add a Reference to the System.Web assembly.
using System.Web;

namespace UriEncodingDEMO2
{
    class Program
    {
        static void Main(string[] args)
        {
            EncodeStrings();

            Console.WriteLine();
            Console.WriteLine("Press any key to continue...");
            Console.Read();
        }

        public static void EncodeStrings()
        {
            string stringToEncode = "ABCD" + "abcd"
            + "0123" + " !\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~" + "ĀāĒēĪīŌōŪū";

            // Need to set the console encoding to display non-ASCII characters correctly (eg the 
            //  Latin A-Extended characters such as ĀāĒē...).
            Console.OutputEncoding = Encoding.UTF8;

            // Will also need to set the console font (in the console Properties dialog) to a font 
            //  that displays the extended character set correctly.
            // The following fonts all display the extended characters correctly:
            //  Consolas
            //  DejaVu Sana Mono
            //  Lucida Console

            // Also, in the console Properties, set the Screen Buffer Size and the Window Size 
            //  Width properties to at least 140 characters, to display the full width of the 
            //  table that is generated.

            Dictionary<string, Func<string, string>> columnDetails =
                new Dictionary<string, Func<string, string>>();
            columnDetails.Add("Unencoded", (unencodedString => unencodedString));
            columnDetails.Add("UrlEncoded",
                (unencodedString => HttpUtility.UrlEncode(unencodedString)));
            columnDetails.Add("UrlEncodedUnicode",
                (unencodedString => HttpUtility.UrlEncodeUnicode(unencodedString)));
            columnDetails.Add("UrlPathEncoded",
                (unencodedString => HttpUtility.UrlPathEncode(unencodedString)));
            columnDetails.Add("EscapedDataString",
                (unencodedString => Uri.EscapeDataString(unencodedString)));
            columnDetails.Add("EscapedUriString",
                (unencodedString => Uri.EscapeUriString(unencodedString)));
            columnDetails.Add("HtmlEncoded",
                (unencodedString => HttpUtility.HtmlEncode(unencodedString)));
            columnDetails.Add("HtmlAttributeEncoded",
                (unencodedString => HttpUtility.HtmlAttributeEncode(unencodedString)));
            columnDetails.Add("HexEscaped",
                (unencodedString
                    =>
                    {
                        // Uri.HexEscape can only handle the first 255 characters so for the 
                        //  Latin A-Extended characters, such as A, it will throw an 
                        //  ArgumentOutOfRange exception.                       
                        try
                        {
                            return Uri.HexEscape(unencodedString.ToCharArray()[0]);
                        }
                        catch
                        {
                            return "[OoR]";
                        }
                    }));

            char[] charactersToEncode = stringToEncode.ToCharArray();
            string[] stringCharactersToEncode = Array.ConvertAll<char, string>(charactersToEncode,
                (character => character.ToString()));
            DisplayCharacterTable<string>(stringCharactersToEncode, columnDetails);
        }

        private static void DisplayCharacterTable<TUnencoded>(TUnencoded[] unencodedArray,
            Dictionary<string, Func<TUnencoded, string>> mappings)
        {
            foreach (string key in mappings.Keys)
            {
                Console.Write(key.Replace(" ", "[space]") + " ");
            }
            Console.WriteLine();

            foreach (TUnencoded unencodedObject in unencodedArray)
            {
                string stringCharToEncode = unencodedObject.ToString();
                foreach (string columnHeader in mappings.Keys)
                {
                    int columnWidth = columnHeader.Length + 1;
                    Func<TUnencoded, string> encoder = mappings[columnHeader];
                    string encodedString = encoder(unencodedObject);

                    // ASSUMPTION: Column header will always be wider than encoded string.
                    Console.Write(encodedString.Replace(" ", "[space]").PadRight(columnWidth));
                }
                Console.WriteLine();
            }
        }
    }
}

Licensed under: CC-BY-SA with attribution
Not affiliated with: Stack Overflow
Icon