Saturday, December 20, 2008

Project Euler Part 2

**This was a draft written in December 2008/January 2009**
I'm publishing it now because, even though it isn't done, it will never get any more done. So that makes it done.

Problem 12: The sequence of triangle numbers is generated by adding the natural numbers. So the 7^(th) triangle number would be 1 + 2 + 3 + 4 + 5 + 6 + 7 = 28. What is the value of the first triangle number to have over five hundred divisors?
My first attempt was to iterate through the triangular numbers, checking their number of divisors:
        

Friday, December 19, 2008

Project Euler

**This was a draft written in December 2008**
I'm publishing it now because, even though it isn't done, it will never get any more done. So that makes it done.

[...] Boring as home is downloading TF2 and losing my internet to the ensuing packet deluge has made it worse. As such, I've decided to take advantage of the down time and work on some solutions to project euler problems. This post will list the problems and how I solved them.

In solving each problem, I start with whatever pops to mind first. I strive for readability and conciseness first. Then, if this doesn't work, I will implement a more complex algorithm to solve the problem. I stick to algorithms I already know (though I obviously had to look some of them up for implementation).

It appears the biggest problem here is going to be windows. GVim isn't too amused by my attempts to make it behave like normal. Apparently, they assume windows users would Also, I've decided that this will be more fun in python3k, so I need to set that up also. Windows 2003 isn't amused by this idea... Ahh, genius! obviously, I shouldn't be allowed to run a file I've pushed over a WRITE mounted samba share! In the same user's session who is sharing the directory!

Problem 1: Find the sum of all the multiples of 3 or 5 below 1000.
Kind of can't believe this didn't work on the first try (wrote and instead of or; I need to read problems more thoroughly...)

def problem1():
def problem1_generator():
for n in range(0,1000):
if n%3 == 0 or n%5 == 0:
yield n
result = sum(problem1_generator())
return result

This isn't really as pythonic as it could be though:
def problem1b():
numbers = (n for n in range(0,1000) if n%3 == 0 or n%5 == 0)
return sum(numbers)



Problem 2: sum of all even fibonacci numbers less than 4 million.
Also got this one wrong on the first try, forgot the even part...

def problem2():
def fib_generator(end=float("Infinity")):
""" return fibonacci numbers < b =" 1,2" b =" b," 2 ="="0)">



Problem 3: What is the largest prime factor of the number 600851475143 ?
A simple brute force solution.

def problem3():
from math import sqrt
def factors_generator(n, primes=[2]):
""" Due to the way python binds default method args,
primes will persist across method calls...
"""
for i in range(2, int(sqrt(n))):
if not any(p for p in primes if i%p == 0):
primes.append(i)
if n%i == 0:
n = n//i
yield i
if n == 1:
break

return sorted(list(factors_generator(600851475143)))[-1]



Problem 4: Find the largest palindrome made from the product of two 3-digit numbers.

def problem4():
numbers = list(range(100,1000))
to_test = sorted([x*y for x in numbers for y in numbers])
for n in reversed(to_test):
s = str(n)
if s == s[::-1]:
return n

Here we can see the excellent reversed built-in, added in Py3k. It's kind of unfortunate that str(reversed(s)) converts the generator itself to a string, rather than the items in it.



Problem 5: What is the smallest number that is evenly divisible by all of the numbers from 1 to 20?
I was originally doing this the hard way, writing my own custom lcm routine. Then I remembered that the lcm can be computed in terms of the gcd, and things got much easier.

def problem5():
from functools import reduce

def gcd(a,b):
while b:
a,b = b, a%b
return a

def lcm(a,b):
return a // gcd(a,b) * b

return reduce(lcm, range(1,21))



Problem 6: Find the difference between the sum of the squares of the first one hundred natural numbers and the square of the sum.
This one is cake.

def problem6():
numbers = range(1,101)
squared = (x**2 for x in numbers)
return sum(numbers)**2 - sum(squared)



Problem 7: Find the 10001st prime.
I reuse the factors_generator method created for problem 3. This takes about 30 seconds on my machine, so it isn't winning any speed tests.

def problem7():
primes = []
i = 2
while len(primes) < 10001: for factor in factors_generator(i, primes): pass i *= 2 print(len(primes)) return primes[10000]

This is a bit entangled, so here's a standalone solution. I think it's even less code...
def problem7b():
primes = []
i = 2
while len(primes) < 10001: if not any(p for p in primes if i%p == 0): primes.append(i) i += 1 return primes[10000]



Problem 8: Find the greatest product of five consecutive digits a (given) 1000-digit number.
For the number, check the code. First problem: how the hell do you represent a 1000 digit number cleanly in code? I was forced to use a multiline string and then strip whitespace... Other than that, not much to see here.

def problem8():
from functools import reduce
from operator import mul
numberS = """
73167176531330624919225119674426574742355349194934
96983520312774506326239578318016984801869478851843
85861560789112949495459501737958331952853208805511
12540698747158523863050715693290963295227443043557
66896648950445244523161731856403098711121722383113
62229893423380308135336276614282806444486645238749
30358907296290491560440772390713810515859307960866
70172427121883998797908792274921901699720888093776
65727333001053367881220235421809751254540594752243
52584907711670556013604839586446706324415722155397
53697817977846174064955149290862569321978468622482
83972241375657056057490261407972968652414535100474
82166370484403199890008895243450658541227588666881
16427171479924442928230863465674813919123162824586
17866458359124566529476545682848912883142607690042
24219022671055626321111109370544217506941658960408
07198403850962455444362981230987879927244284909188
84580156166097919133875499200524063689912560717606
05886116467109405077541002256983155200055935729725
71636269561882670428252483600823257530420752963450
"""
number = "".join(numberS.split()) #remove whitespace
product = 0
for i in range(0, len(number)-5):
digits = (int(a) for a in number[i:i+5])
t = reduce(mul, digits)
product = max(product, t)

return product




Problem 9: There exists exactly one Pythagorean triplet for which a + b + c = 1000. Find the product abc.
This is pretty simple, if you know about the modified version of Euclid's formula. This version is guaranteed to generate all Pythagorean triples (eventually).

def problem9():
from math import sqrt
def pythagorean_generator():
""" return all pairs (a,b) s.t. a**2 + b**2 = c**2
and a,b,c integers
"""
k = 1
mstart = 2
batch_size = 100
while True:
for m in range(mstart, mstart+batch_size):
for n in range(1,m):
yield ( k*2*m*n, k*(m**2-n**2) )
mstart += batch_size
k += 1

for a,b in pythagorean_generator():
c = int(sqrt(a**2 + b**2))
if a + b + c == 1000:
return a*b*c



Problem 10: Find the sum of all the primes below two million.
The naive algorithm just isn't going to be fast enough in python, clocking in around 10 minutes. Thus, it is interesting to note that in a lower level language, the naive algorithm probably would clock in at under a minute.

def problem10():
primes = [2]
i = 3
while i < 2*10**6: if not any(p for p in primes if i%p == 0): primes.append(i) i += 2 return sum(primes)

Of course, we can always try a naive implementation of the Sieve of Eratosthenes.
def problem10b():
from math import sqrt
primes = []
numbers = list(range(0, 2*10**6))
crossed = [False]*len(numbers)

for i in range(2, len(numbers)):
if not crossed[i]:
p = numbers[i]
primes.append(p)
for j in range(p, len(numbers), p):
crossed[j] = True
return sum(primes)



This gives us the answer in a couple seconds. Of course, I also managed to introduce an off-by-two error due to my indexing of the arrays (which is why numbers now includes 0 and 1, so that numbers[i] = i). I always hate these situations where you have to iterate through two arrays simultaneously. In a more complicated problem I'd consider wrapping things up into an object, but that seems unwarranted here.



Well, that's enough for now. Something wrong or unpythonic? Hit me back.

Monday, October 27, 2008

First Experiences with my Ipod Touch

**This was a draft, originally written in August 2008**
I'm publishing it now because, even though it isn't done, it will never get any more done. So that makes it done ;)

So, my boss bought me an Ipod touch. Not sure if it is worth the extra hours I agreed to work or not (not all that many really, and I still get paid!).

The first thing I noticed was the unhappy request for iTunes. The device refused to turn on until it was connected to iTunes. Given that I regard iTunes with about as much as regard as an evangelical Christian in Kansas has for Barack Obama, this sort of upset me. Luckily, I was able to side step this by connecting it to my friend's mac. One second, and the device started working (no idea what it wanted).

Obviously being an internet junkie (and having heard people rave about Safari on the touch), I wanted to browse reddit. Unfortunately, our campus' wireless seems to have decided we don't deserve internet and is refusing to give anyone DHCP leases. Both my touch and my friend's mac happily accepted the DHCP fake IP (169.x.x.x). Of course, I fail at networking for not recognizing this...

The next step was to try it under linux. Hmm... looks like it still requires jailbreaking. Unfortunate; I won't be able to do that for a few months at least. I guess Apple really has the platform locked down.

So, my file server gets tarnished with iTunes. Oh well, it's got enough other crap on it already. iTunes didn't respond well when I set it's directory to my music folder, requiring that folder to be added manual afterwards. A minor annoyance.

Registration is of a pain; Apple's servers are really slow. And the tab navigation of the form is completely wrong... And apparently I can't do it without entering a credit card. I guess I'll do that if only because there is some music I have been meaning to buy, and the iTunes store might have DRM free versions.

iTunes seems to believe I want another copy of all of my album art. Not sure where it's going; there isn't all that much space on my OS partition so this might not end well. It also seems to dislike my wma music which is an opinion I share with it. No idea what it's transcoding to; m4a would be equally as useless.

I wonder how well iTunes can manage my xvid/VC9 encoded videos. Automatic transcoding to h264 for my device would be sweet... Tried to add a folder of .mkv, I think that confused it. Went to movies and selected open, this turned out to have been a poor decision. The UI went away and I can't figure out how to get back. No helpful back arrow. Oh, it's another window entirely (didn't see it pop up through the VNC session). Tried a folder of more normal files... It seems to have detected the mpg files at least. Not sure what it's doing. A quick google search suggests using handbreak to convert my files, so I guess I'm stuck doing it manually. I'm sure iTunes isn't exactly going to let me script in the handbreak cli either... Actaully, handbreak only seems to work for DVDs.

Next step: sync music. I have more than 8GB of music, so we'll see what iTunes decides to do when I request a sync. Hmm... it seems to have created an iPod playlist. It's initially populated as I would like: random albums. I wonder if it automatically changes each time I sync. It does not, which prevented me from changing my selection for a few months.

Eventually I sat down and spent about half an hour searching the net, and I discovered how to setup a new smart playlist... which is okay. It isn't smart about the albums like the original one, but that's probably my failure. I managed to delete the sync history, which I think will enable me to regenerate the original selection playlist and see what it did. What an odd button to have in preferences... Unfortunately, it didn't offer to setup the playlist again, so I'm stuck searching. After about half an hour more googling, I stumbled upon the 'select-by album' field, which does what I want.

Wednesday, September 17, 2008

Covariance and Contravariance in C# and Java

Every time I start programming something new, I inevitably come across code like this:


Of course, the above code is a complete lie; it won't work in C# or java. This is because, in these languages, the list types are not covariant. Subtypes are covariant wrt their supertype; they instance can stand in anyplace a superclass instance can. Type covariance is a narrowing convention. Lists with different generic parameters cannot stand in for each other. So, you get a compile time error. Of course, to confuse everyone more, the array types in both these languages are covariant.

In java, one can slide around this problem:
While this code will compile and do what you want, this syntax actually expresses contravariance. Contravariance is a widening convention; the return ? extends BaseClass is wider than DerivedClass.

What programmers typically want with this anti-pattern is to fit our local objects to some other signature. While, on the surface, this seems like a logical request, there are several issues lurking...

First, consider what would happens in C# and Java with covariant arrays. Wikipedia says it best:
// a is a single-element array of String
String[] a = new String[1];

// b is an array of Object
Object[] b = a;

// Assign an Integer to b. This would be possible if b really were
// an array of Object, but since it really is an array of String,
// we will get a java.lang.ArrayStoreException.
b[0] = 1;
What this means is that despite being statically typed, both languages actually have to do runtime type checks for array insertion. As the Wikipedia article points out, this is basically your only choice if you want covariant mutable lists.

However, invariant lists are exactly what we want! External parties shouldn't be messing with our members anyway. So, the crux of the problem is the lack of an actual ImmutableList class. It could be made trivially covariant and would fit right in with these languages OO style. While I'll often criticize Java for it's large number of useless classes (cough xml), data structures are one area where more definately is better.

In the end, C# doesn't have any support for generic contravariance (or covariance) at all. This seems to be by design; the CLI bytecode can theoretically support such things. Java, of course, gets a free pass because of type-erasure.

So why was it an anti-pattern?

It looks like Java has the edge here, but it's actually Microsoft who has the right idea. It's already an immutable list... so why is it returning a list at all? Instead, the method should be returning an IEnumerable (Iterable). One can easily design a class which wraps our list and spits it out in the correct format for the receiver. No incorrect assumptions about modifying lists, type casts, or unecessary copies. In C#:

Using this idea, I can keep a mutable list of mutable objects (the derived class). If someone else needs to see that data, I can use the type system to give them an immutable view of immutable objects (the base class) that is statically enforced.

If you have to use a static type system, the type system should at least help you out.

*** Note:
While I haven't looked at the bytecode for this class, I imagine that Microsoft's compiler tracks the implicit conversions. However, type-erasure in java (and the nature of Hot Spot as being more type-ignorant) would probably not require this.