Euler Solution 59

From ProgSoc Wiki

Jump to: navigation, search

Contents

Euler Problem 59

Break the XOR encryption. The encryption key consists of three lower case characters. Using cipher1.txt (right click and 'Save Link/Target As...'), a file containing the encrypted ASCII codes, and the knowledge that the plain text must contain common English words, decrypt the message and find the sum of the ASCII values in the original text.

Solutions

Python by Althalus

Runtime: 35ms

Loop 3 times for each character in key. First loop, for each letter of the alphabet, decrypt the first character, and every subsequent third character. If the decrypted character is non-printable, the current character is not going to form part of a valid key. Of the possible decrypted strings, find the one with the most e's. Repeat the process for the rest of the key. It's not all that sophisticated approach, but scarily enough it works.

import string
from time import time
code= [<snip. Paste them in from the file>]
alphabet = [ord(x) for x in   ['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z']]
def crackIt(start, length, s, pk):
 possibleResults = []
 for x in pk:
  count = start
  s1 = s[:]
  printable = True
  while count < len(s):
   newChar =  x^s1[count]
   s1[count]= newChar
   if chr(s1[count]) not in string.printable: printable=False
   count += length
  if printable: possibleResults.append((s1,chr(x)))
 return possibleResults

keyLength = 3
theCode, theKey = code[:], 
for start in xrange(0,keyLength):
 mostE, theChar = 0, 
 for tempCode in crackIt(start,keyLength,theCode,alphabet):
  e = sum([1 for x in tempCode[0] if x == ord('e')])
  (mostE,theCode,theChar) = ((e, tempCode[0],tempCode[1]) if e > mostE else (mostE,theCode,theChar))
 theKey += theChar

print 'Text:\n%s\n\nKey: %s\n\nASCII Sum: %d' % (string.join([chr(x) for x in theCode], ), theKey, sum(theCode))

C by SanguineV

Note: Edited to a simpler version, check history for older long one.

Runtime: 2 ms

#include <stdio.h>

// The ciphertext!
char code[] = {...};

// Used for finding the key, 3 letters, each has 26 alpha potentials!
int keyFind[3][26] = {{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0},
                      {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0},
                      {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0}};

// The key!
char key[3] = {0,0,0};

/* Find the key!
 * ALGORITHM:
 * 1. Pass through a sample of the cipher file and XOR with ' ', the most common
 *    character in English!
 * 2. If the result of the XOR is a lower case English letter than add to the total
 *    number of times this has occurred for that letter (keyFind)!
 * 3. Assume the most common appearing letter is the correct letter of the key! */
void findKey()
{
  char curr;
  int pos = 0;
  while(pos < 300)
  {
    curr = code[pos] ^ ' ';
    if (curr > 96 && curr < 123)
    {
      keyFind[pos%3][curr - 97] += 1;
    }
    pos++;
  }
  curr = 0;
  int i = 0;
  int tot = 0;
  while(i < 3)
  {
    pos = 0;
    tot = 0;
    while (pos < 26)
    {
      if (keyFind[i][pos] > tot)
      {
        tot = keyFind[i][pos];
        curr = pos;
      }
      pos++;
    }
    key[i] = curr + 97;
    i++;
  }
}
 
// Find the key, then add up total and print the key!
int main()
{
  findKey();
  long total = 0;
  int iter = 0;
  while (iter < 1201)
  {
    total += (code[iter] ^ key[iter%3]);
    iter++;
  }
  printf("Total: %u with password '%c%c%c'.\n",total,key[0],key[1],key[2]);
  return 0;
}

Uses some similar knowledge about character frequency as Justin's approach (we were discussing in the ProgSoc room, so no surprised there). However, without optimising the code, the algorithm is pretty fast only traversing the ciphertext 1.25 times. You only need a sample to determine the distribution of letters in a block of text... so why scan 1200 characters when 300 is probably adequate? (100 per letter of the code.)

Personal tools