How to Implement a Substitution Cipher in C++

A substitution cipher is probably the simplest cipher to implement and, at the same time, it is also the easiest cipher to break.

The algorithm is quite simple. Let's consider an alphabetical string, and a number -- the offset. What we're going to do is this: replace each letter with the letter that's "number" positions ahead of it. So we'll be shifting each letter a few positions ahead. For instance, if the offset number is 2 then the letter "a" becomes "c" (since "c" is the 2nd letter after "a"), "b" becomes "d", "c" becomes "e", or "z" becomes "b", etc.

Here's an MS Paint drawing to illustrate things better
Letter mappings in a substitution cipher

As you can see in the case of "z" we're doing, what people call, "a rotating shift". In our case, this means that "z" + 2 equals "b" or that "y" + 2 equals "a", because since there's no other two letters after "y" or "z" we have to sort of "rotate" and start the counting from the beginning.

Here's some code that illustrates how to implement such a cipher in C/C++.

#include <cctype>
//      Shifts each character in 'input' by 'offset' letters
//      Example: "abCdz" becomes "efGhd"
void encrypt(char * input, unsigned int offset) {
        for (int i = 0; input[i] != 0; i++) {
                char firstLetter = islower(input[i]) ? 'a' : 'A';
                unsigned int alphaOffset = input[i] - firstLetter;
                unsigned int newAlphaOffset = alphaOffset + offset;
                input[i] = firstLetter + newAlphaOffset % 26;
        }
}

To better understand the code you need to be aware of how the ASCII code works. In C/C++, and other languages for that matter, each 'char' variable stores a number -- an ASCII encoding that represents a character. It is very important to know that letters have their own ASCII encodings and these encodings are consecutive. That means that the letter 'a' and 'b' have consecutive encodings. So if 'a' is represented by the number 61 in the ASCII table, then 'b' will be represented by the number 62. This is crucial in the implementation of our algorithm because it allows us to obtain the encrypted version of a character by simply adding the offset number to the character. So basically, when we encrypt the letter 'a', we add the number two to its encoding. We get 61 + 2 = 63, which is the encoding for the letter 'c'.

It's also important to know that lowercase letters have different encodings than uppercase letters, so that means the ASCII code for 'a' is not the same as the ASCII code for 'A'. This basically implies that 'a' and 'A' are different characters in the mind of a computer, even though we might not differentiate between them in real life. Hence we have to do the switching, using the ternary ?:, operator between 'a' and 'A'.

As a challenge, do the decryption function for this algorithm. It should look have the same prototype, except for the name, which will obviously be "decrypt" instead of "encrypt". To give you a little tip, you should be careful when decrypting letters like 'a', because you would have to substract the offset from 0 (which is the alphabetical offset of 'a') and you would get a negative number, but what do you do with it? Think about it...

Here's the full program for those of you who are too lazy...

AttachmentSize
substitutionCipher.txt1.81 KB
Substitution_cipher_mappings.JPG17.95 KB
Anonymous's picture

struct translate //

struct translate // TRANSLATION TABLE
{
char orig; // LETTER FROM INPUT TEXT
char trans; // LETTER IT'S TRANSLATED TO
int freq; // FREQUENCY OF ORIG IN INPUT TEXT
char lock; // PREVENTS TRANS FROM BEING CHANGED
} letter[26];

string intext; // INPUT TEXT

//----------------------------------------------------------------------------
// DETERMINE LETTER FREQUENCIES
void freqtally()
{
for(uint i = 0 ; i < 26 ; i++) // FOR EACH LETTER,
for(uint j = 0 ; j < intext.length() ; j++) // SEARCH FOR IT,
if(intext[j] == letter[i].orig) // AND, WHEN FOUND,
letter[i].freq++; // INCREMENT COUNT
}
//----------------------------------------------------------------------------
// INITIALIZES TRANSLATION TABLE
void initable()
{
string source("ABCDEFGHIJKLMNOPQRSTUVWXYZ");
for(int i = 0 ; i < 26 ; i++)
{
letter[i].orig = source[i];
letter[i].trans = source[i];
letter[i].freq = 0;
letter[i].lock = false;
}
}
//----------------------------------------------------------------------------
// GETS INPUT TEXT FROM DISK FILE
// RETURNS true IF OK, false if not
bool readtext(const char *filename)
{
ifstream infile(filename);
if(!infile)
{
Console::WriteLine("Error reading file, or not found.");
return(false);
}
intext.clear();
stringfromfile(intext, infile);
toupper(intext);
return(true);
}
//----------------------------------------------------------------------------
// DISPLAY TEXT, AS TRANSLATED, and optionally also write to disk file.
// This looks kind of messy, but putting it into 2 different functions
// didn't look much better. It would probably be ok to just use one
// function, with the constream manipulators, and let them be ignored
// when writing to a disk file. Could also break some of the sections here
// into separate functions (showtranslations(), showfreq(), showastranslated())
void showtrans(int tofile) // FLAG WHETHER TO also WRITE TO DISK FILE
{
ofstream outfile;
if(tofile)
outfile.open("deciphed.txt");

uint i;
for(i = 0 ; i < 26 ; i++) // SHOW CURRENT CODE SCHEME
{
// DIFFERENT COLOR IF LOCKED. The Console:: functions can be intermingled with
// ordinary cout calls. A good thing because Console::Write outputs chars as decimal numbers.
// Couldn't figure out how to force their output as letters.
Console::ForegroundColor = (letter[i].lock ? ConsoleColor::Green : ConsoleColor::Gray);
cout << setw(3) << setiosflags(ios::left) << letter[i].trans;
if(tofile)
outfile << setw(3) << setiosflags(ios::left) << letter[i].trans;
}
cout << endl;
if(tofile)
outfile << endl;

Console::ForegroundColor = ConsoleColor::Red;
for(i = 0 ; i < 26 ; i++) // SHOW LETTER FREQUENCIES
{
cout << setw(3) << setiosflags(ios::left) << letter[i].freq;
if(tofile)
outfile << setw(3) << setiosflags(ios::left) << letter[i].freq;
}
cout << endl << endl;
if(tofile)
outfile << endl << endl;
for(i = 0 ; i < intext.length() ; i++) // FOR EACH CHAR IN INTEXT,
{
if(Console::KeyAvailable == true) // ANY KEY CUTS LISTING SHORT
{
Console::ReadKey(true);
break;
}
bool found = false;
for(uint j = 0 ; j < 26 ; j++) // LOOK IT UP IN TABLE
if(intext[i] == letter[j].orig) // IF FOUND, IT'S A LETTER
{ // SHOW ITS TRANSLATION
Console::ForegroundColor = (letter[j].lock ? ConsoleColor::Green : ConsoleColor::Gray);
cout << letter[j].trans;
if(tofile)
outfile << letter[j].trans;
found = true;
break; // BREAK LOOP TO DO NEXT CHAR
}
if(!found) // OUTPUT NON-LETTER AS-IS
{
if(intext[i] == '\t') // expand tabs to spaces
cout << " ";
else
{
Console::ForegroundColor = ConsoleColor::Gray;
cout << intext[i];
}
if(tofile)
outfile << intext[i];
}
}
cout << endl;
if(tofile)
outfile << endl;
} // end showtrans()
//----------------------------------------------------------------------------
// SWAP TRANSLATIONS FOR 2 LETTERS
// BEFORE: ZBRUET AFTER: ZBRUET <- LETTER[].ORIG, NEVER CHANGES
// ZBRUET ETRUZB <- LETTER[].TRANS
// RETURNS true IF OK, false IF SWAP DISALLOWED
int swaplets(char a, char b) // LETTERS WHOSE TRANSLATIONS ARE TO BE SWAPPED
{
a = toupper(a);
b = toupper(b);

int adex, bdex; // adex and bdex are INDEXES OF A AND B IN LETTER[].TRANS
for(adex = 0 ; (adex < 26) && (letter[adex].trans != a) ; adex++); // LOCATE A
for(bdex = 0 ; (bdex < 26) && (letter[bdex].trans != b) ; bdex++); // LOCATE B

if((letter[adex].lock) || (letter[bdex].lock)) // TEST FOR LOCKED
return(false);

char ch = letter[adex].trans; // SWAP THEIR TRANSLATIONS
letter[adex].trans = letter[bdex].trans;
letter[bdex].trans = ch;
return(true);
}
//----------------------------------------------------------------------------
// INITIALLY ASSIGN TRANSLATIONS FOR MOST FREQUENT LETTERS
// This doesn't always result in the assignments I expect, maybe because of the letter locking.
void etaoin()
{
// ASSIGN MOST FREQUENT LETTER TO 'E', TEMPORARILY LOCK IN PLACE,
// 2ND MOST FREQUENT = 'T', ETC.
swaplets(letter[0].orig,'E'); letter[0].lock = true;
swaplets(letter[1].orig,'T'); letter[1].lock = true;
swaplets(letter[2].orig,'A'); letter[2].lock = true;
swaplets(letter[3].orig,'O'); letter[3].lock = true;
swaplets(letter[4].orig,'I'); letter[4].lock = true;
swaplets(letter[5].orig,'N'); letter[5].lock = true;
swaplets(letter[6].orig,'S'); letter[6].lock = true;
swaplets(letter[7].orig,'H'); letter[7].lock = true;
swaplets(letter[8].orig,'R'); letter[8].lock = true;
swaplets(letter[9].orig,'D'); letter[9].lock = true;
swaplets(letter[10].orig,'L'); letter[10].lock = true;
swaplets(letter[11].orig,'U'); letter[11].lock = true;
// SHUFFLE THESE SELDOM-USED TO END OF ARRAY, IF POSSIBLE
swaplets(letter[25].orig,'Q'); letter[25].lock = true;
swaplets(letter[24].orig,'J'); letter[24].lock = true;
swaplets(letter[23].orig,'X'); letter[23].lock = true;
swaplets(letter[22].orig,'Z');

for(int i = 0 ; i < 26 ; i++) // RE-UNLOCK THEM ALL
letter[i].lock = false;
}
//----------------------------------------------------------------------------
// SORT ARRAY IN ORDER OF DECREASING FREQUENCY
int sortlets(struct translate *l, struct translate *r)
{
return(r->freq - l->freq); // a reverse-sort (backwards)
}
//----------------------------------------------------------------------------
// MAIN
int main(array ^args)
{
Console::Clear();
cout << "Usage: DECIPHER filename.ext. Output, if any, goes to DECIPHED.TXT" << endl;
string s;
if(args->Length < 2)
{
s = "CIPHERED.TXT"; // used during testing, when there are no command line args.
//return(1); // an alternative: quit.
}
else
SystemStringToBasicString(args[1],s);

initable(); // PUT LETTERS IN THE TRANS. TABLE
if(readtext(s.c_str())) // GET CIPHERED TEXT FROM FILE
{
freqtally(); // TALLY LETTER FREQUENCIES
// sort by decreasing frequency
qsort(letter,26,sizeof(letter[0]),(int (*)(const void *,const void *)) sortlets);

if(yesno("Automatically assign ETAOIN... to most frequent letters?"))
etaoin(); // ASSIGN ETAOIN TO MOST FREQ.

uint i, j;
string buf;
char ch;
while(1) // START PROCESSING
{
showtrans(false); // DISPLAY TEXT
Console::ForegroundColor = ConsoleColor::Gray;
cout << "wap, ock, nlock, one, uit: ";
ch = tolower(getche());
cout << endl;
switch(ch)
{
case 's': // SWAP LETTERS
do
{
cout << "Swap: ";
getline(cin,buf,'\n');
}
while(buf.length() < 2); // C/R = ABORT
if(!swaplets(buf[0],buf[1]))
putchar(7); // A LETTER IS LOCKED - BEEP
cout << endl;
break;
case 'u': // unlock letters
case 'l': // LOCK LETTERS
cout << ((ch == 'l' ? "Lock letter(s): " : "Unlock letter(s): "));
getline(cin,buf,'\n'); // could use my cgets() from net.cpp
if(!buf.length()) // C/R = ABORT
break;
toupper(buf);
for(j = 0 ; j < buf.length() ; j++)
for(i = 0 ; i < 26 ; i++)
if(letter[i].trans == buf[j])
{
letter[i].lock = (ch == 'l' ? true : false);
break;
}
cout << endl;
break;
case 'd': // DONE - WRITE TO DISK & EXIT
showtrans(true);
case 'q': // quit - no write to disk
cout << "Press ESC to quit. ";
if(getch() == 27)
return(0);
cout << endl;
break;
default:
break;
} // END SWITCH
} // END WHILE(1)
} // end if(readtext)
return(0);
}

Post new comment

The content of this field is kept private and will not be shown publicly. If you have a Gravatar account associated with the e-mail address you provide, it will be used to display your avatar.