Friday, June 24, 2011

Convert text to HTML codes

I was surprised there was no simple script out there to do this conversion, so I took a table of HTML ASCII codes and wrote this bash script.

I know there are functions that convert text to ASCII codes, but I didn't want the hassle of formatting their output for HTML.

The script takes two arguments: the text you want to convert, and a file containing an HTML-formatted table of ASCII codes.



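#!/bin/bash
# $1 input text
# $2 file containing the HTML-formatted ASCII table (one <td> per line)

to_process=$1
table=$2
res=""
for ((i=0; i<${#to_process}; i++))
do
    letter=${to_process:i:1}

    # -F matches the character literally, so ".", "*", "[" and "\" need no
    # special case; -m 1 keeps only the first matching line.
    line_no=$(grep -n -m 1 -F "<td>$letter</td>" "$table" | cut -f 1 -d ':')

    # Skip characters that are not in the table.
    if [[ -n $line_no ]]
    then
        # The entity code sits on the line after the character itself.
        ((line_no++))
        part=$(sed -n "${line_no}p" "$table" | sed -e 's/<td>//' -e 's/<\/td>//')
        res=${res}${part}
    fi
done
# Strip any "amp;" left over from an escaped copy of the table, plus stray spaces.
printf '%s\n' "$res" | sed -e 's/amp;//g' -e 's/ //g'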
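For a quick cross-check of the script's output, the same decimal entities can be computed directly from each character's code, with no table at all. This is only a sketch of that function-based route, not part of the script above. The name text_to_entities is made up for illustration, and the loop assumes plain ASCII input. It sticks to portable shell syntax, so it runs under bash as well as sh.

```shell
# Hypothetical helper: print $1 with every character replaced by its
# decimal HTML entity. Assumes plain ASCII input.
# The body is a subshell "( ... )" so its variables stay local.
text_to_entities() (
    rest=$1
    out=""
    while [ -n "$rest" ]; do
        # Peel the first character off the remaining text.
        ch=${rest%"${rest#?}"}
        rest=${rest#?}
        # A leading single quote makes printf print the character's numeric code.
        code=$(printf '%d' "'$ch")
        out="${out}&#${code};"
    done
    printf '%s\n' "$out"
)

text_to_entities "Hi."
# prints &#72;&#105;&#46;
```

If the table-driven script and this helper disagree on some input, the table file is the first thing to check.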


Here is the table I used:


<table cellspacing="0" border="1" width="100%" class="reference">
<tbody><tr>
<th align="left">ASCII Character</th>
<th align="left">HTML Entity Code</th>
<th align="left">Description</th>
</tr>
<tr>
<td> </td>
<td>&#32;</td>
<td>space</td>
</tr>
<tr>
<td>!</td>
<td>&#33;</td>
<td>exclamation mark</td>
</tr>
<tr>
<td>"</td>
<td>&#34;</td>
<td>quotation mark</td>
</tr>
<tr>
<td>#</td>
<td>&#35;</td>
<td>number sign</td>
</tr>
<tr>
<td>$</td>
<td>&#36;</td>
<td>dollar sign</td>
</tr>
<tr>
<td>%</td>
<td>&#37;</td>
<td>percent sign</td>
</tr>
<tr>
<td>&</td>
<td>&#38;</td>
<td>ampersand</td>
</tr>
<tr>
<td>'</td>
<td>&#39;</td>
<td>apostrophe</td>
</tr>
<tr>
<td>(</td>
<td>&#40;</td>
<td>left parenthesis</td>
</tr>
<tr>
<td>)</td>
<td>&#41;</td>
<td>right parenthesis</td>
</tr>
<tr>
<td>*</td>
<td>&#42;</td>
<td>asterisk</td>
</tr>
<tr>
<td>+</td>
<td>&#43;</td>
<td>plus sign</td>
</tr>
<tr>
<td>,</td>
<td>&#44;</td>
<td>comma</td>
</tr>
<tr>
<td>-</td>
<td>&#45;</td>
<td>hyphen</td>
</tr>
<tr>
<td>.</td>
<td>&#46;</td>
<td>period</td>
</tr>
<tr>
<td>/</td>
<td>&#47;</td>
<td>slash</td>
</tr>
<tr>
<td>0</td>
<td>&#48;</td>
<td>digit 0</td>
</tr>
<tr>
<td>1</td>
<td>&#49;</td>
<td>digit 1</td>
</tr>
<tr>
<td>2</td>
<td>&#50;</td>
<td>digit 2</td>
</tr>
<tr>
<td>3</td>
<td>&#51;</td>
<td>digit 3</td>
</tr>
<tr>
<td>4</td>
<td>&#52;</td>
<td>digit 4</td>
</tr>
<tr>
<td>5</td>
<td>&#53;</td>
<td>digit 5</td>
</tr>
<tr>
<td>6</td>
<td>&#54;</td>
<td>digit 6</td>
</tr>
<tr>
<td>7</td>
<td>&#55;</td>
<td>digit 7</td>
</tr>
<tr>
<td>8</td>
<td>&#56;</td>
<td>digit 8</td>
</tr>
<tr>
<td>9</td>
<td>&#57;</td>
<td>digit 9</td>
</tr>
<tr>
<td>:</td>
<td>&#58;</td>
<td>colon</td>
</tr>
<tr>
<td>;</td>
<td>&#59;</td>
<td>semicolon</td>
</tr>
<tr>
<td><</td>
<td>&#60;</td>
<td>less-than</td>
</tr>
<tr>
<td>=</td>
<td>&#61;</td>
<td>equals-to</td>
</tr>
<tr>
<td>></td>
<td>&#62;</td>
<td>greater-than</td>
</tr>
<tr>
<td>?</td>
<td>&#63;</td>
<td>question mark</td>
</tr>
<tr>
<td>@</td>
<td>&#64;</td>
<td>at sign</td>
</tr>
<tr>
<td>A</td>
<td>&#65;</td>
<td>uppercase A</td>
</tr>
<tr>
<td>B</td>
<td>&#66;</td>
<td>uppercase B</td>
</tr>
<tr>
<td>C</td>
<td>&#67;</td>
<td>uppercase C</td>
</tr>
<tr>
<td>D</td>
<td>&#68;</td>
<td>uppercase D</td>
</tr>
<tr>
<td>E</td>
<td>&#69;</td>
<td>uppercase E</td>
</tr>
<tr>
<td>F</td>
<td>&#70;</td>
<td>uppercase F</td>
</tr>
<tr>
<td>G</td>
<td>&#71;</td>
<td>uppercase G</td>
</tr>
<tr>
<td>H</td>
<td>&#72;</td>
<td>uppercase H</td>
</tr>
<tr>
<td>I</td>
<td>&#73;</td>
<td>uppercase I</td>
</tr>
<tr>
<td>J</td>
<td>&#74;</td>
<td>uppercase J</td>
</tr>
<tr>
<td>K</td>
<td>&#75;</td>
<td>uppercase K</td>
</tr>
<tr>
<td>L</td>
<td>&#76;</td>
<td>uppercase L</td>
</tr>
<tr>
<td>M</td>
<td>&#77;</td>
<td>uppercase M</td>
</tr>
<tr>
<td>N</td>
<td>&#78;</td>
<td>uppercase N</td>
</tr>
<tr>
<td>O</td>
<td>&#79;</td>
<td>uppercase O</td>
</tr>
<tr>
<td>P</td>
<td>&#80;</td>
<td>uppercase P</td>
</tr>
<tr>
<td>Q</td>
<td>&#81;</td>
<td>uppercase Q</td>
</tr>
<tr>
<td>R</td>
<td>&#82;</td>
<td>uppercase R</td>
</tr>
<tr>
<td>S</td>
<td>&#83;</td>
<td>uppercase S</td>
</tr>
<tr>
<td>T</td>
<td>&#84;</td>
<td>uppercase T</td>
</tr>
<tr>
<td>U</td>
<td>&#85;</td>
<td>uppercase U</td>
</tr>
<tr>
<td>V</td>
<td>&#86;</td>
<td>uppercase V</td>
</tr>
<tr>
<td>W</td>
<td>&#87;</td>
<td>uppercase W</td>
</tr>
<tr>
<td>X</td>
<td>&#88;</td>
<td>uppercase X</td>
</tr>
<tr>
<td>Y</td>
<td>&#89;</td>
<td>uppercase Y</td>
</tr>
<tr>
<td>Z</td>
<td>&#90;</td>
<td>uppercase Z</td>
</tr>
<tr>
<td>[</td>
<td>&#91;</td>
<td>left square bracket</td>
</tr>
<tr>
<td>\</td>
<td>&#92;</td>
<td>backslash</td>
</tr>
<tr>
<td>]</td>
<td>&#93;</td>
<td>right square bracket</td>
</tr>
<tr>
<td>^</td>
<td>&#94;</td>
<td>caret</td>
</tr>
<tr>
<td>_</td>
<td>&#95;</td>
<td>underscore</td>
</tr>
<tr>
<td>`</td>
<td>&#96;</td>
<td>grave accent</td>
</tr>
<tr>
<td>a</td>
<td>&#97;</td>
<td>lowercase a</td>
</tr>
<tr>
<td>b</td>
<td>&#98;</td>
<td>lowercase b</td>
</tr>
<tr>
<td>c</td>
<td>&#99;</td>
<td>lowercase c</td>
</tr>
<tr>
<td>d</td>
<td>&#100;</td>
<td>lowercase d</td>
</tr>
<tr>
<td>e</td>
<td>&#101;</td>
<td>lowercase e</td>
</tr>
<tr>
<td>f</td>
<td>&#102;</td>
<td>lowercase f</td>
</tr>
<tr>
<td>g</td>
<td>&#103;</td>
<td>lowercase g</td>
</tr>
<tr>
<td>h</td>
<td>&#104;</td>
<td>lowercase h</td>
</tr>
<tr>
<td>i</td>
<td>&#105;</td>
<td>lowercase i</td>
</tr>
<tr>
<td>j</td>
<td>&#106;</td>
<td>lowercase j</td>
</tr>
<tr>
<td>k</td>
<td>&#107;</td>
<td>lowercase k</td>
</tr>
<tr>
<td>l</td>
<td>&#108;</td>
<td>lowercase l</td>
</tr>
<tr>
<td>m</td>
<td>&#109;</td>
<td>lowercase m</td>
</tr>
<tr>
<td>n</td>
<td>&#110;</td>
<td>lowercase n</td>
</tr>
<tr>
<td>o</td>
<td>&#111;</td>
<td>lowercase o</td>
</tr>
<tr>
<td>p</td>
<td>&#112;</td>
<td>lowercase p</td>
</tr>
<tr>
<td>q</td>
<td>&#113;</td>
<td>lowercase q</td>
</tr>
<tr>
<td>r</td>
<td>&#114;</td>
<td>lowercase r</td>
</tr>
<tr>
<td>s</td>
<td>&#115;</td>
<td>lowercase s</td>
</tr>
<tr>
<td>t</td>
<td>&#116;</td>
<td>lowercase t</td>
</tr>
<tr>
<td>u</td>
<td>&#117;</td>
<td>lowercase u</td>
</tr>
<tr>
<td>v</td>
<td>&#118;</td>
<td>lowercase v</td>
</tr>
<tr>
<td>w</td>
<td>&#119;</td>
<td>lowercase w</td>
</tr>
<tr>
<td>x</td>
<td>&#120;</td>
<td>lowercase x</td>
</tr>
<tr>
<td>y</td>
<td>&#121;</td>
<td>lowercase y</td>
</tr>
<tr>
<td>z</td>
<td>&#122;</td>
<td>lowercase z</td>
</tr>
<tr>
<td>{</td>
<td>&#123;</td>
<td>left curly brace</td>
</tr>
<tr>
<td>|</td>
<td>&#124;</td>
<td>vertical bar</td>
</tr>
<tr>
<td>}</td>
<td>&#125;</td>
<td>right curly brace</td>
</tr>
<tr>
<td>~</td>
<td>&#126;</td>
<td>tilde</td>
</tr>
</tbody></table>
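The layout of this table matters: the lookup finds the line holding the character's cell and reads the entity from the line right after it, so each td has to stay on its own line. Here is a minimal sketch of that lookup on a two-row excerpt. The temporary file and the lookup function name are assumptions made for the example, and -m is a grep option available in GNU and BSD grep.

```shell
# Write a two-row excerpt of the table to a temporary file.
table=$(mktemp)
trap 'rm -f "$table"' EXIT
cat > "$table" <<'EOF'
<tr>
<td>!</td>
<td>&#33;</td>
<td>exclamation mark</td>
</tr>
<tr>
<td>.</td>
<td>&#46;</td>
<td>period</td>
</tr>
EOF

# lookup CHAR: print the entity stored on the line after CHAR's own cell.
lookup() {
    # -F treats the pattern as a fixed string, so "." matches only a period.
    n=$(grep -n -m 1 -F "<td>$1</td>" "$table" | cut -d ':' -f 1)
    [ -n "$n" ] || return 1    # character is not in the table
    sed -n "$((n + 1))p" "$table" | sed -e 's/<td>//' -e 's/<\/td>//'
}

lookup '!'    # prints &#33;
lookup '.'    # prints &#46;
```

A character missing from the table makes lookup return a non-zero status and print nothing, which is easier to spot than a wrong entity in the output.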
