Hebern 1 rotor: The Hill Climbing algorithm


Home Page
Hebern's machines Home Page
Hebern 1 rotor Home Page
Cryptanalysis, Home Page

Introduction

Wikipedia definition :

"In numerical analysis, hill climbing is a mathematical optimization technique which belongs to the family of local search. It is an iterative algorithm that starts with an arbitrary solution to a problem, then attempts to find a better solution by making an incremental change to the solution. If the change produces a better solution, another incremental change is made to the new solution, and so on until no further improvements can be found".

Note: This approach is extremely computationally intensive. The use of a computer is almost essential. Obviously the use of Hill Climbing is recent, it was not available before the Second World War.

Hill Climbing Algorithm

We present a very simple implementation of Hill Climbing adapted to find rotor wiring. More complex approaches would undoubtedly make it possible to find the complete solution to the problem of 1 rotor (or even 5 rotor) Hebern machines.

If you use the fitness functions bigrams, trigrams, etc. you need to know the language used and have good statistics for this language.

1. Generate initial random rotor wiring, decrypt and compute an inital 
   score with the fitness function

2. Repeat the following steps. Exit from loop if the fitness function no 
   longer improves the score for a predetermined number of loops (for 
   example, 1000 steps)

   2.1. Randomly we swap two rotor connections.
   2.2. Decrypt and Compute the fitness function.
   2.3. If the score with the new rotor wiring is better, keep it,
        otherwise, discard this new rotor wiring.

3. If the new score if better that better score, fix the better score to 
   the new one and print this score, the key and the beginning of the plain 
   text discovered.

4. Repeat from step 1, i.e. restart with new random rotor wiring. Stop if 
   the predetermined number of loops is reached (Shotgun approch).

Remark: I tried several fitness functions:
- IC (Index of Coincidences) [default fitness function]
- UNI (Unigram)
- BIG (Bigram)
- TRI (Trigram)
- QUA (Quadrigram)

Exemple

Note: In my testing, I needed messages with a minimum size of 500 characters to find plain text.

Consider the following cryptogram:


C:\H1_TOOLS> python groupe.py < MSGS\gutemberg.cry
UJAES HSEIH HOZBM PGBGN IIPPE MAZYA RDRDT UWUTB GWZYH ITJQZ
BIUTV FJGTG VXPTD NNUWB GNXQJ QLZLX AZSVV TNTHF QYYBF KFURJ
HIRNB VOOYW TYWXJ JJKMR UBNTS QOYBL KTVWS DPXIV ISFGV WJBHA
ADZFB FFXAM AYZRZ FZPMP WINIQ OFMAX JWBVV HAWOG KHSPO BFNZF
HCWLJ VEKWQ TOVRM LNFZZ OCUXZ TFTBI PQPTW QVGCO KCJAZ JMEWI
SJLFR OYZSL MZXLA PTXDC BDVID QDXID CDSQF CHZAT GNSKZ WIMQS
WVJBN TSJGF CMOZY ARDRD TUWUT BMMDN SUFEM FSOFX VSGWW DOZSP
VOWMT PRANV VFBDB TKFJG TBQXR JFKBO NNIWF MUWAM RMCPS OJSCK
PRQPB FWXRC VBRQS SOTYD TYXPP YIMDP ERLPQ ZRVJE JDSAA TSRBF
QUUXV SVUZH IGWHJ NJTPK VBCCM SOMHO QPUHJ JIYCC GGSNQ TSFSF
...

First, we use statistics of trigrams of the English language without spaces. We relaunch the HC five times (shotgun).


C:\H1_TOOLS> python hc_h1.py -h
usage:
  -c crypto  The cryptogram
  -M method  The method (BIG,TRI,...), default: IC
  -m file_method The file used with method
  -t tries   Number of tries, default: 15

C:\H1_TOOLS> python hc_h1.py -c MSGS\gutemberg.cry -M TRI -m english -t 5

0 5.219001281980454 PROJECTGUTENBERGSTHEWORKSOFEDGARALLANPOEBYEDGARALL
Key:  ZGURKWBYNXETOIMSVDPLCQFJHA  Score:  5.219001281980454

PROJECTGUTENBERGSTHEWORKSOFEDGARALLANPOEBYEDGARALLANPOETHIS
EBOOKISFORTHE
USEOFANY...

In the following example, we use bigrams:

C:\H1_TOOLS> python hc_h1.py -c MSGS\gutemberg.cry -M BIG -m english
0 7.485136687479294 PROJECTGUTENBERGSTHEWORKSOFEDGARALLANPOEBYEDGARALL
...

Links