Language of Choice

Reading Steve Yegge’s blog and Paul Graham‘s essays has made me want to try programming in a Lisp dialect for a major project. The only problem has been I don’t really like Common Lisp. I like the Lisp part of Common Lisp. But I don’t like the libraries, in particular there isn’t a good standard way to do serve side programming (if you find one, point it out to me). Also Common Lisp makes it hard to interact with the operating system. In contrast Python, Perl, etc. make operating system interaction extremely easy (in perl system(“command”)).

The other major Lisp dialect is Scheme, which I like for its simplicity but most Scheme implementations don’t have a strong community. There is one Scheme, PLT-Scheme, that does have a very active community of users and good support for server side and GUI programming. Even though the web page emphasizes teaching there is good support for practical programming.

The problem with PLT Scheme is speed, it’s just as slow as Python whereas some Common Lisp compilers has performance comparable to C, see language shootout.

There is another Scheme implementation, Gambit Scheme, that is speedy (as speedy as C) but once again Gambit Scheme doesn’t have all the libraries I want. So for now I’m going to try using PLT Scheme for my personal projects and see how it works out. In particular I’m going to be porting my computational geometry code over to PLT Scheme.

Hopefully most of the code I write for PLT Scheme will also run using Gambit Scheme.

Advertisements

Startup Idea

From my last post you can see that I’m interested in search with an emphasis on finding people and their personal information. Me and a friend have been hoping to capitalize on this by building a website/company. But yesterday I was browsing around online and discovered that someone already had the same basic idea. And made a website called wink.com out of it.

My conundrum is whether to continue with the same idea, abandon it, or modify it. From looking at wink.com the website seems very well put together so I don’t think it’s good planning to go head to head with them. I also don’t want to abandon this concept since I think it has a lot of promise. And I don’t have interesting ideas every day. Also I get to play with search engine technology which is fun.

The only choice is to modify the idea. My friend has been pushing all along to use public government databases (courts, prisons etc.) for our primary source and this, I think can be our distinguishing feature from wink.com

We’ll still use the google api‘s to search blogs, MySpace, etc. But we’ll also mine the government databases for information (I’m pretty sure wink.com doesn’t do this).

If you’re curious about the current state of our effort check out http://dewittbell.com/search/.

Searching for Personal Information

Often when reading an article or technical paper by a professor or journalist I want to find info about them. So in this post I’m listing some of the techniques for getting information. Many of them come from the book “Google Hacking“.

1. Google the person’s name:

As a first step this often works with professors who have uncommon last names (it’ll usually take you to the professors websites). For people with common common surnames and first names this does not always yield the best info.

To narrow down the search results you can try searching with the middle name or middle initial. But this often times discards relevant results. Also try searching for “name + family” or “name + homepage” or do a google blog search of the persons name.

Even better use “name + job” or “name + major” or “name + city” or “name + school”, where “job” is their current or past job (such as programmer, nurse, etc.), “major” is what they majored or minored in for college, “city” is a current or past city they’ve lived in, and “school” is a school that they’ve attended.

2. Search Google using their email address if you have one:

The good thing is that email addresses are almost unique so almost always all the search results are relevant. The bad thing is that often you don’t get any search hits.

3. Search Google Groups with the person’s email address. Usually only works when searching for info on technical people (such as programmers, mathematicians, etc.).

4. Search Amazon with the person’s username (try there email id for the username)

Only works if the person has a common user id that they use across websites. I often use bjwbell for my user id, so if you search “bjwbell site:amazon.com” it’ll return my amazon.com profile. Sadly most people don’t use the same user id across websites. So this technique doesn’t have a high rate of success.

5. Search MySpace for their email or name.

Not useful for getting info on professors (no prof. that I know of maintains a MySpace page) but good for ordinary people (such as school teachers, students, etc) who have MySpace pages.

6. Search court records

This needs a whole post all by itself. The effort in searching online court databases is considerable since usually you have to search county by county (such as saccourt.com).

The above 5 techniques (the sixth one I’ll elaborate on in a future post) are good ways for tracking down info on people who participate in online activities.

Packing Calculations

Todays problem is from Pugh’s great book “Real Mathematical Analysis“, it’s problem 23 in chapter 1.

Problem Statement:

Let b(R) and s(R) be the number of integer unit cubes in \textbf{R}^M that intersect the ball and the sphere of radius R respectively, centered at the origin. By integer unit cube I mean a unit cube which has its vertices on the integer lattice.

(a) Let m = 2 and calculate the limits

\textrm{lim}_{R \rightarrow \infty} \frac{s(R)}{b(R)} and \textrm{lim}_{R \rightarrow \infty} \frac{s(R)^2}{b(R)}.

(b) Take m \ge 3. What exponent k makes the limit

\textrm{lim}_{R \rightarrow \infty} \frac{s(R)^k}{b(R)}

interesting?

(c) Let c(R) be the number of integer unit cubes that are contained in the ball of radius R, centered at the origin. Calculate

\textrm{lim}_{R \rightarrow \infty} \frac{c(R)}{b(R)}.

(d) Shift the ball to a new, arbitrary center (not on the integer lattice) and re-calculate the limits.

End Problem Statement

This problem doesn’t involve finding a packing but instead involves counting the number of unit cubes after the packing has been chosen (in this case we’re packing a ball with integer unit cubes). So I find it to be a very interesting problem.

The first simplification I make is to only consider the first quadrant of the ball with radius R. By symmetry s(R) = 4 s_{\textrm{quad 1}}(R) and similarly for b(R) = 4 b_{\textrm{quad 1}}(R), where s_{\textrm{quad 1}} and b_{\textrm{quad 1}} are the functions s and b restricted to the first quadrant. For the rest of this solution I only work with the first quadrant. Let B_R = \{(x, y) \in \textbf{R}^2 : x^2 + y^2 \le R^2, x \ge 0, y \ge 0\}. The area of B_R is \pi R^2/4 so as a first approximation b(R) \approx \pi R^2/4.

After some thought the best way to count the number of unit cubes in s(R) is to imagine tracing out the curve y = R\sin(\theta), \textrm{ } x = R\cos(\theta) as \theta goes from 0 to \pi/2. Let c be the number of unit cubes the curve has been traced through. Initially c = 0. As we trace out the curve, every time y \in \textbf{N} we add 1 to c since the curve has entered a new unit square, similarly if x \in \textbf{N}, 1 is added to c. If y and x are both integers then only 1 should be added to c (this happens very rarely so I mostly ignore this case).

For a small example let R = 2. Then x goes from 2 to 0, so x \in \textbf{N} 3 times, similarly y goes from 0 to 2 so y \in \textbf{N} 3 times.

This gives us an initial count of 6 for s(2) but we need to handle the corner cases.

The corner cases are (x = 0, y = 2) and (x = 2, y = 0). Note that x = 0, y = 2 means both x and y are integers so that square was counted twice instead of once, also don’t forget that the square was counted when the curve entered it. So we need to subtract two from 3 + 3 giving 3 + 3 – 2. Also when x = 2, y = 0 this square is again counted twice so we need to subtract 1 from 3 + 3 – 2 giving 3 + 3 – 3 = 3 Thus s(2) = 3.

Example 2: R = 4, x goes from 4 to 0 and y goes from 0 to 4 so we have 5 + 5 – 3 (remember the corner cases) = 7.

From the above derivations s(R) = 2R - 3. To a first approximation the formula for s(R) in m dimensions is mR.

To calculate b(R) first note that if the packing of B_R was perfect than b(R) would be \pi R^2/4. It’s not a perfect packing so b(R) is greater than \pi R^2/4. To calculate how much more we first note that all the unit cubes used in counting s(R) have on average half their area in B_R and half their area outside B_R (remember we’re taking the limit to large R, so the result follows from symmetry).

There are 2R - 3 of these unit cubes that are only half in B_R. So they cover an area of R – 3/2 inside B_R so b(R) must be \pi R^2/4 - R + 3/2 (the part of B_R covered by unit cubes that are completely contained in B_R) + 2R – 3 (the unit cubes that are part-in and part-out) = \pi R^2/4 + R - 3/2.

To summarize:

s(R) = 2R - 3, \textrm{ } b(R) = \pi R^2/4 + R - 3/2,

c(R) = b(R) - s(R) = \pi R^2/4 - R + 3/2.

Now that we’ve worked out b(R), s(R), and c(R) the answers are:

(a) 0 and 8/\pi.

(b) m, higher dimensions follow the same methods of calculations.

(c) 1.

(d) Hehe, I haven’t tried this part maybe in a future post. My instinct is that the limits are identical.