
C++ urlencode function


 

A URL such as http://hostname.com:80/folder/file.php?arg=value&b=c#anchor has several components:
scheme: http
host: hostname.com
port: 80
path: /folder/file.php
query parameters: arg=value&b=c
fragment: anchor
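
As a rough illustration of the breakdown above, the sketch below splits a URL into those six parts with plain std::string searches. The names UrlParts and split_url are hypothetical, introduced only for this example, and the code assumes a simple scheme://host:port/path?query#fragment layout (no user info, no IPv6 literals). The path it returns keeps its leading slash.

```cpp
#include <string>

// Hypothetical container for the components listed above.
struct UrlParts
{
    std::string scheme, host, port, path, query, fragment;
};

// Minimal sketch: peel the URL apart from the outside in.
// Assumes the simple layout scheme://host:port/path?query#fragment.
UrlParts split_url(const std::string &url)
{
    UrlParts p;
    std::string rest = url;
    std::string::size_type pos;

    pos = rest.find('#');                 // fragment is everything after the first '#'
    if (pos != std::string::npos) { p.fragment = rest.substr(pos + 1); rest.erase(pos); }

    pos = rest.find('?');                 // query is everything after the first '?'
    if (pos != std::string::npos) { p.query = rest.substr(pos + 1); rest.erase(pos); }

    pos = rest.find("://");               // scheme precedes "://"
    if (pos != std::string::npos) { p.scheme = rest.substr(0, pos); rest.erase(0, pos + 3); }

    pos = rest.find('/');                 // path starts at the first remaining '/'
    if (pos != std::string::npos) { p.path = rest.substr(pos); rest.erase(pos); }

    pos = rest.find(':');                 // optional port follows the host
    if (pos != std::string::npos) { p.port = rest.substr(pos + 1); rest.erase(pos); }

    p.host = rest;
    return p;
}
```

Because the fragment and the query are located by searching for the first # and ?, an unescaped # or & inside a parameter value would be split in the wrong place, which is exactly the conflict described next.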


The query parameters are a set of keys and values. The problem is that parameter values often contain characters that have a special meaning elsewhere in the URL. For instance, if you were doing an address lookup with the URL lookup.php?address=1500 #200 M&M Street, the # in the address would be read as the start of the fragment, and the & would be read as the delimiter between query parameters. Most languages have a built-in function to encode (escape) the data so it doesn't conflict. A space can be encoded as + or %20, & is encoded as %26, and # as %23. The two digits after the % are the character's hexadecimal value in the ASCII table. It would be useful to have a C++ function that automatically encodes a URL query parameter's value, that is, a URI component.
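
To make the mapping between a character and its escape code concrete, here is a small sketch that formats one character as % followed by its two-digit hex value using stream manipulators. The function name percent_code is hypothetical and is not part of the urlencode() code in this article.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: returns '%' followed by the two-digit hex value of ch.
std::string percent_code(char ch)
{
    std::ostringstream out;
    out << '%'
        << std::uppercase << std::hex            // hex digits A-F in upper case
        << std::setw(2) << std::setfill('0')     // always two digits, zero padded
        << static_cast<int>(static_cast<unsigned char>(ch)); // avoid sign extension
    return out.str();
}
```

With this helper, percent_code(' ') gives "%20", percent_code('&') gives "%26", and percent_code('#') gives "%23", matching the values quoted above.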

PHP provides a urlencode() function, and JavaScript provides three: escape(), encodeURI() and encodeURIComponent(). So here is a urlencode() function for C++. It is modeled after JavaScript's encodeURIComponent(), and uses the PHP function's name, parameter type and return type.

#include <iostream>
#include <string>

using namespace std;

string urlencode(const string &c);
string char2hex( char dec );

int main(int argc, char *argv[])
{
    string address = "123 #5 M&M Street";
    cout << "address=" << address << endl;
    cout << "address=" << urlencode(address) <<endl;
    //second line outputs address=123%20%235%20M%26M%20Street
}

//based on javascript encodeURIComponent()
string urlencode(const string &c)
{
   
    string escaped="";
    int max = c.length();
    for(int i=0; i<max; i++)
    {
        if ( (48 <= c[i] && c[i] <= 57) ||//0-9
             (65 <= c[i] && c[i] <= 90) ||//ABC...XYZ
             (97 <= c[i] && c[i] <= 122) || //abc...xyz
             (c[i]=='-' || c[i]=='_' || c[i]=='.' || //also left unescaped by encodeURIComponent()
              c[i]=='~' || c[i]=='!' || c[i]=='*' || c[i]=='(' || c[i]==')' || c[i]=='\'')
        )
        {
            escaped.append( &c[i], 1);
        }
        else
        {
            escaped.append("%");
            escaped.append( char2hex(c[i]) );//converts char 255 to string "ff"
        }
    }
    return escaped;
}

string char2hex( char dec )
{
    char dig1 = (dec&0xF0)>>4;
    char dig2 = (dec&0x0F);
    if ( 0<= dig1 && dig1<= 9) dig1+=48;    //'0' is 48 in ASCII
    if (10<= dig1 && dig1<=15) dig1+=97-10; //'a' is 97 in ASCII
    if ( 0<= dig2 && dig2<= 9) dig2+=48;
    if (10<= dig2 && dig2<=15) dig2+=97-10;

    string r;
    r.append( &dig1, 1);
    r.append( &dig2, 1);
    return r;
}
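
As a usage sketch, an encoder like the one above is typically applied to each key and each value separately, and the results are then joined with = and &. The example below is self-contained, so it carries its own compact encoder. The names encode_component and build_query are hypothetical, and the encoder emits upper-case hex digits, which is an assumption of this sketch and not the behavior of char2hex() above.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical compact encoder: escapes everything except the characters
// that JavaScript's encodeURIComponent() leaves unescaped.
std::string encode_component(const std::string &in)
{
    static const char hex[] = "0123456789ABCDEF";
    static const std::string keep = "-_.!~*'()";
    std::string out;
    for (std::string::size_type i = 0; i < in.size(); ++i)
    {
        const unsigned char ch = static_cast<unsigned char>(in[i]);
        const bool alnum = (ch >= '0' && ch <= '9') ||
                           (ch >= 'A' && ch <= 'Z') ||
                           (ch >= 'a' && ch <= 'z');
        if (alnum || keep.find(static_cast<char>(ch)) != std::string::npos)
        {
            out += static_cast<char>(ch);   // safe character, copy as is
        }
        else
        {
            out += '%';                     // escape as %XX
            out += hex[ch >> 4];
            out += hex[ch & 0x0F];
        }
    }
    return out;
}

// Hypothetical helper: joins encoded key=value pairs with '&'.
std::string build_query(const std::vector<std::pair<std::string, std::string> > &params)
{
    std::string query;
    for (std::vector<std::pair<std::string, std::string> >::size_type i = 0; i < params.size(); ++i)
    {
        if (i > 0) query += '&';            // delimiter between parameters
        query += encode_component(params[i].first);
        query += '=';
        query += encode_component(params[i].second);
    }
    return query;
}
```

Only the delimiters added by build_query stay unescaped, so a # or & inside a value can no longer be mistaken for the fragment marker or a parameter separator.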

Note: It is also possible to convert a character to hex with sprintf(buffer, "%02x", (unsigned char)c[i]), but I wanted to avoid using sprintf in this case.
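
For comparison, a printf-style version might look like the sketch below. It uses std::snprintf in place of sprintf so the buffer size is checked, and it is only an alternative illustration, not the approach used in the code above. The name char2hex_printf is hypothetical.

```cpp
#include <cstdio>
#include <string>

// Hypothetical printf-style alternative to char2hex().
std::string char2hex_printf(char ch)
{
    char buffer[3]; // two hex digits plus the terminating NUL
    // Cast through unsigned char so values above 127 are not sign-extended,
    // and use %02x so small values keep their leading zero.
    std::snprintf(buffer, sizeof buffer, "%02x",
                  static_cast<unsigned int>(static_cast<unsigned char>(ch)));
    return std::string(buffer);
}
```

The two details that matter here are the cast and the width: without the cast, a char with the high bit set can print as more than two digits on platforms where char is signed, and a plain %x would drop the leading zero for values below 16.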
 

Tags: c++
 
Dave Hope on Apr 16th, 2010 10:09 am said:
Nice bit of code. I've seen it in a few places and wondered if you were the original author. If so, what license have you released it under? Thanks
 

vGamBIT on Jul 20th, 2010 2:44 pm said:
my version:

void appendchar2hex(char dec, string& r)
{
    char alp[] = {"0123456789ABCDEF"};
    r.append(1, alp[(dec&0xF0)>>4]);
    r.append(1, alp[dec&0x0F]);
}

and change the call from:
    escaped.append( char2hex(c[i]) );
to:
    appendchar2hex(c[i], escaped);
 



 

 





©2010, Zedwood Digital