
Python Challenge - Level 3

Problem

One small letter, surrounded by EXACTLY three big bodyguards on each of its sides.

The image shows three big candles on each side of a small one, and the description hints that we should look for letters. Open the page source and you will see a comment block at the bottom of the page:

<!--
kAewtloYgcFQaJNhHVGxXDiQmzjfcpYbzxlWrVcqsmUbCunkfxZWDZjUZMiGq
...

Hmm, shall we find all segments of the pattern AAAbCCC?

Solution

Step 1: Load Data

As in level 2, you can manually copy and paste the text into a file (resources/level3.txt in the source code), then read from it:

>>> data = open('resources/level3.txt').read()

Or extract the text from the HTML directly:

>>> import urllib.request
>>> import re
>>> html = urllib.request.urlopen("http://www.pythonchallenge.com/pc/def/equality.html").read().decode()
>>> data = re.findall("<!--(.*?)-->", html, re.DOTALL)[-1]
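The `.*?` in that pattern is non-greedy, which matters if a page contains more than one comment. Here is a minimal sketch of the difference, run on a made-up HTML string rather than the real page (`sample_html` and `last_comment` are hypothetical names used only for illustration):

```python
import re

# Hypothetical stand-in for a downloaded page with two comment blocks.
sample_html = "<html><!-- first --><body></body><!--\nkAewtloYgc\n--></html>"


def last_comment(html):
    """Return the body of the last HTML comment in `html`."""
    # Non-greedy .*? stops at the nearest "-->", so each comment is separate.
    comments = re.findall(r"<!--(.*?)-->", html, re.DOTALL)
    return comments[-1]


lazy = last_comment(sample_html)

# Greedy .* runs from the first "<!--" to the last "-->" in one match.
greedy = re.findall(r"<!--(.*)-->", sample_html, re.DOTALL)[-1]

print(repr(lazy))
print(repr(greedy))
```

With the greedy version, the markup between the two comments ends up inside the captured text, so the non-greedy form is the safer default.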

Step 2: Find the Matches

Now that we have the content as one long string, we can use a regular expression to find all the matches. The pattern can be described as [^A-Z]+[A-Z]{3}([a-z])[A-Z]{3}[^A-Z]+. Here's a breakdown of the pattern:

  • [a-z]: 1 lower case letter
  • [A-Z]: 1 upper case letter
  • [A-Z]{3}: 3 consecutive upper case letters
  • [A-Z]{3}[a-z][A-Z]{3}: 3 upper case letters + 1 lower case letter + 3 upper case letters
  • [^A-Z]: any character BUT an upper case letter
  • [^A-Z]+: at least one such character
  • [^A-Z]+[A-Z]{3}[a-z][A-Z]{3}[^A-Z]+: something else before and after our pattern (AAAbCCC), so there are exactly 3 consecutive upper case letters on each side, no more
  • [^A-Z]+[A-Z]{3}([a-z])[A-Z]{3}[^A-Z]+: the parentheses capture only the lower case letter, which is all we care about
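To see how each part of the pattern behaves, it can be tried on a few short strings. The sample strings below are invented for illustration and are not taken from the puzzle data:

```python
import re

PATTERN = r"[^A-Z]+[A-Z]{3}([a-z])[A-Z]{3}[^A-Z]+"

# Made-up inputs: one valid case and three near misses.
samples = [
    "aXYZbQRSc",   # exactly three bodyguards on each side
    "aWXYZbQRSc",  # four upper case letters on the left
    "aXYZbQRSTc",  # four upper case letters on the right
    "aXYZBQRSc",   # the middle letter is upper case
]

results = {text: re.findall(PATTERN, text) for text in samples}

for text, found in results.items():
    print(text, found)
```

Only the first string produces a match; the others fail because an extra upper case letter breaks the "exactly three" rule or the middle letter is not lower case.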

Let's see what we get:

>>> re.findall("[^A-Z]+[A-Z]{3}([a-z])[A-Z]{3}[^A-Z]+", data)
['l', 'i', 'n', 'k', 'e', 'd', 'l', 'i', 's', 't']

And join them together:

>>> "".join(re.findall("[^A-Z]+[A-Z]{3}([a-z])[A-Z]{3}[^A-Z]+", data))
'linkedlist'

That's it! linkedlist.
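One caveat about the pattern used above: the leading and trailing [^A-Z]+ consume the characters they match. A hit at the very start or end of the text would be missed, and so would a second hit separated from the first only by non-upper-case characters. That does not affect the answer here, but a variation using lookarounds avoids it. The following is a sketch of that alternative, not the method used in this solution, and the test strings are made up:

```python
import re

ORIGINAL = r"[^A-Z]+[A-Z]{3}([a-z])[A-Z]{3}[^A-Z]+"
# Lookarounds check the neighbours without consuming them.
LOOKAROUND = r"(?<![A-Z])[A-Z]{3}([a-z])[A-Z]{3}(?![A-Z])"

# Made-up edge cases, not taken from the puzzle data.
at_edges = "XYZbQRS"               # nothing before or after the pattern
back_to_back = "aXYZbQRScdEFGhIJKl"  # two hits separated by lower case only

edge_original = re.findall(ORIGINAL, at_edges)
edge_lookaround = re.findall(LOOKAROUND, at_edges)
pair_original = re.findall(ORIGINAL, back_to_back)
pair_lookaround = re.findall(LOOKAROUND, back_to_back)

print(edge_original, edge_lookaround)
print(pair_original, pair_lookaround)
```

In both edge cases the lookaround pattern finds every candidate, while the original pattern skips the ones whose neighbouring characters are missing or were already consumed by the previous match.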

Put Everything Together

import urllib.request
import re

html = urllib.request.urlopen("http://www.pythonchallenge.com/pc/def/equality.html").read().decode()
data = re.findall("<!--(.*?)-->", html, re.DOTALL)[-1]
print("".join(re.findall("[^A-Z]+[A-Z]{3}([a-z])[A-Z]{3}[^A-Z]+", data)))

Next Level

http://www.pythonchallenge.com/pc/def/linkedlist.html

The page will redirect you to linkedlist.php

http://www.pythonchallenge.com/pc/def/linkedlist.php

Further Readings