What better place to get sushi in Tokyo than an actual fish market? By the time we arrived at Tsukiji Fish Market (around 6:00 AM), the lines were already pretty long. Daiwa Sushi (大和) had a significantly shorter wait; we were seated after only about 20 minutes.

We decided on the omakase: seven pieces of nigiri and six pieces of maki for ¥3500.

My favourite piece was either the anago or the shrimp head. The quality of the fish was amazing. The ikura was surprisingly good; it didn't have the fishy taste you so often get with roe imported here in Canada.

Probably not the best sushi you could find in Tokyo, but it was pretty damn close.

An age-old puzzle that's challenged kids and pancake chefs for ages:

The goal of the puzzle is to move an entire stack of disks from one rod/plate to another with the following restrictions:

- Only one disk can be moved at a time.
- Each move consists of taking the upper disk from one of the stacks and placing it on top of another stack.
- No disk may be placed on top of a smaller disk.

So how can we go about solving this programmatically? The key insight is that this problem has optimal substructure: the solution for a larger problem, say 4 disks, is composed of the solutions to smaller subproblems (1, 2, and 3 disks).

```
def moveTower(height, fromPole, toPole, withPole):
    if height >= 1:
        moveTower(height - 1, fromPole, withPole, toPole)
        moveDisk(fromPole, toPole)
        moveTower(height - 1, withPole, toPole, fromPole)

def moveDisk(fp, tp):
    print("moving disk from", fp, "to", tp)

moveTower(3, "A", "B", "C")
```

We first move the top `n-1` disks from pole A to B by calling `moveTower` recursively on the stack of `n-1` disks. Then we can move disk `n` from pole A to C. Finally, the same recursive call moves the `n-1` disks from B to C.

The recursive solution seems a bit expensive, with the two recursive calls leading to a $O(2^n)$ runtime. Can we do a bit better?

Our recursive search is solving the same subproblems over and over again. If we saved, or **memoized**, each result, we could trade some memory for runtime by caching the solution to each subproblem.

```
memo = []
lookup = ["ABC", "ACB", "BAC", "BCA", "CAB", "CBA"]

def topDown(height, fromPole, toPole, withPole):
    global memo
    memo = [[-1 for x in range(6)] for y in range(height + 1)]
    for x in range(6):
        memo[0][x] = ""  # base case: a tower of height 0 needs no moves
    return topDownHelper(height, fromPole, toPole, withPole)

def topDownHelper(height, fromPole, toPole, withPole):
    move = fromPole + toPole + withPole
    index = lookup.index(move)
    if memo[height][index] == -1:  # not computed yet
        memo[height][index] = topDownHelper(height - 1, fromPole, withPole, toPole) \
            + fromPole + '->' + toPole + ', ' \
            + topDownHelper(height - 1, withPole, toPole, fromPole)
    return memo[height][index]

print(topDown(3, "A", "B", "C"))
```

Another approach would be to intelligently build up our solution, solving for 1 disk, then 2 disks, and so on. By going in the opposite direction of our recursive solution, we save time by not re-solving the same subproblems. This technique is dubbed **dynamic programming**. Since we are building up our solution successively, we only need to remember the solutions that are needed immediately, which saves us some valuable space.

```
moves = ["A->B", "A->C", "B->A", "B->C", "C->A", "C->B"]
prevLookup = [1, 0, 3, 2, 5, 4]
prev2Lookup = [5, 3, 4, 1, 2, 0]

def bottomUp(height, fromPole, toPole, withPole):
    prev = ["" for x in range(6)]
    curr = ["" for x in range(6)]
    for i in range(height):
        for j in range(len(moves)):
            prevStr = prev[prevLookup[j]] + ', ' if prev[prevLookup[j]] != "" else ""
            nextStr = ', ' + prev[prev2Lookup[j]] if prev[prev2Lookup[j]] != "" else ""
            curr[j] = prevStr + moves[j] + nextStr
        curr, prev = prev, curr
    return prev

print(bottomUp(4, "A", "B", "C")[0])
```

Doing a quick check on each implementation's running time, we see that the recursive solution explodes exponentially, while our memoized/dynamic programming implementations grow linearly. Clearly dynamic programming is the way to go!
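To see the difference without timing anything, here is a small, hypothetical counter (not the post's code) that tallies how many recursive calls each approach makes; the plain recursion grows exponentially in the height, while a cached version only ever solves each height once:

```
from functools import lru_cache

call_count = 0

def move_tower(height):
    """Plain recursion: every subproblem is re-solved, O(2^n) calls."""
    global call_count
    call_count += 1
    if height >= 1:
        move_tower(height - 1)
        move_tower(height - 1)

memo_count = 0

@lru_cache(maxsize=None)
def move_tower_memo(height):
    """Memoized: each height is computed only once, O(n) cache misses."""
    global memo_count
    memo_count += 1
    if height >= 1:
        move_tower_memo(height - 1)
        move_tower_memo(height - 1)

move_tower(10)
move_tower_memo(10)
print(call_count, memo_count)  # 2047 11
```

For height 10, the plain recursion makes 2^11 - 1 = 2047 calls, while the memoized version computes only the 11 distinct heights.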

The dragon curve or Heighway Dragon is a beautiful self-similar fractal first investigated by NASA physicists John Heighway, Bruce Banks, and William Harter.

Jack [Heighway] came into my office (actually cubicle) and said that if you folded a one dollar bill repeatedly he thought it would make a random walk or something like that. (We’d been arguing about something in Feller’s book on return chances.) I was dubious but said “Let’s check it out with a big piece of paper.” (Those were the days when NASA could easily afford more than one dollar’s worth of paper.) Well, it made a funny pattern alright but we couldn’t really see it too clearly. So one of us thought to use tracing paper and “unfold” it indefinitely so we could record (tediously) as big a pattern as we wanted. But each time we made the next order, it just begged us to make one more!

As they folded and unfolded, a pattern emerged:

When you make a single fold, you get a single right turn:

```
Right
```

After a second fold, you would get:

```
Right Right Left
```

And the third:

```
Right Right Left Right Right Left Left
```

Fourth:

```
Right Right Left Right Right Left Left Right Right Right Left Left Right Left Left
```

The sequence of turns actually follows a specific pattern:

$$S_{n+1} = S_n R \bar{S_n}$$

where $\bar{S}$ is the same sequence reversed in direction, with L's replaced by R's and vice versa. The $(n+1)$'th fold appends the $R$, and the subsequent folds are the same sequence reversed in order and direction.

The animation on the sidebar uses this sequence to iteratively trace out the curve:

```
function dragonCurveIter(seq, iterations) {
  // 0 - L , 1 - R
  for (var i = 0; i < iterations; i++) {
    // reversed-and-flipped copy of the sequence: the S-bar term
    var a = seq.reverse().map(function(x) { return Number(!x); });
    seq.reverse().push(1); // restore the order and append the middle R
    seq = seq.concat(a);   // S_{n+1} = S_n R S-bar_n
  }
  return seq;
}
```

Interestingly enough, this recursive definition can also be interpreted geometrically. Appending $R\bar{S_n}$ is equivalent to having the original curve rotated by 90 degrees about the endpoint of the curve.
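To play with this yourself, here is a small Python sketch (hypothetical helper names; the sidebar animation uses the JavaScript above) that builds the turn sequence and then walks it into lattice points, which is all you need to draw the curve:

```
def dragon_sequence(iterations):
    """Build the turn sequence: S_{n+1} = S_n, R, then S_n reversed and flipped."""
    seq = []
    for _ in range(iterations):
        flipped = [1 - t for t in reversed(seq)]  # the S-bar term
        seq = seq + [1] + flipped                 # 1 = Right, 0 = Left
    return seq

def trace(seq):
    """Turn the L/R sequence into lattice points by walking unit segments."""
    x, y, dx, dy = 0, 0, 1, 0
    points = [(x, y)]
    x, y = x + dx, y + dy
    points.append((x, y))  # first segment, before any turn
    for turn in seq:
        # right turn rotates the heading clockwise, left counter-clockwise
        dx, dy = (dy, -dx) if turn == 1 else (-dy, dx)
        x, y = x + dx, y + dy
        points.append((x, y))
    return points

print(dragon_sequence(2))  # [1, 1, 0], i.e. Right Right Left
```

Feeding `trace` successive orders of the sequence shows the rotation at work: each new curve retraces the old one, then spirals off at a right angle from its endpoint.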

MTA provides a public dataset of turnstile data per station since May 2010. We can grab weather data from Weather Underground, who kindly provides historical weather data in CSV format. You can also grab the same data through their API.

The data we collected isn't quite usable yet, so we'll have to do a bit of data munging.

The MTA data has multiple entries per row, so we should first unwind the data so we get an entry per row:

```
import csv
import os

with open("masterfile.csv", 'w', newline='') as outfile:
    outWriter = csv.writer(outfile)
    outfile.write('C/A,UNIT,SCP,DATEn,TIMEn,'
                  'DESCn,ENTRIESn,EXITSn\n')  # column names
    for name in os.listdir('./turnstile'):
        with open('./turnstile/' + name, newline='') as infile:
            inReader = csv.reader(infile)
            for row in inReader:
                # each row holds several 5-field readings after 3 key fields
                for i in range(3, len(row), 5):
                    outWriter.writerow(row[0:3] + row[i:i+5])
```

For simple manipulations, the `csv` module will usually suffice. However, doing more complicated operations, such as calculating aggregates or dealing with different datatypes like DateTimes, can be difficult and tedious. This is where `pandas` saves the day!

```
import pandas
import datetime

def reformat_weather_dates(date):
    return datetime.datetime.strptime(date, '%Y-%m-%d').strftime('%Y-%m-%d')

def reformat_subway_dates(date):
    return datetime.datetime.strptime(date, '%m-%d-%y').strftime('%Y-%m-%d')

df = pandas.read_csv('masterfile.csv')
dfweather = pandas.read_csv('data/nyc052013.csv')
df = df[df['DESCn'] == 'REGULAR']  # filter by REGULAR
df['ENTRIESn_hourly'] = (df['ENTRIESn'] - df['ENTRIESn'].shift(1)).fillna(1)  # calculate entries
dfweather.rename(columns={'EDT': 'DATEn'}, inplace=True)  # rename column so the frames can merge
df['DATEn'] = df['DATEn'].map(reformat_subway_dates)  # reformat so dates match
dfweather['DATEn'] = dfweather['DATEn'].map(reformat_weather_dates)
final = pandas.merge(df, dfweather, on='DATEn')
final.to_csv('final_master.csv')
```

We can easily filter by specific categories. For example, we'll only want turnstile data from the 'REGULAR' category:

```
df = df[df['DESCn'] == 'REGULAR'] # Filter by REGULAR
```

This grabs all the indices of rows that match 'REGULAR', then regrabs the rows from the original frame.
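A toy illustration of that boolean-mask indexing, using made-up data rather than the MTA set:

```
import pandas as pd

toy = pd.DataFrame({'DESCn': ['REGULAR', 'AUDIT', 'REGULAR'],
                    'ENTRIESn': [100, 5, 120]})
mask = toy['DESCn'] == 'REGULAR'   # boolean Series: [True, False, True]
print(toy[mask]['ENTRIESn'].tolist())  # [100, 120]
```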

```
# Calculate daily entries
df['ENTRIESn_hourly'] = (df['ENTRIESn'] - df['ENTRIESn'].shift(1)).fillna(1)
```
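`shift(1)` slides the whole column down by one row, so subtracting gives the change between consecutive cumulative readings. A small example with made-up counter values:

```
import pandas as pd

counts = pd.Series([100, 130, 180, 260])       # cumulative turnstile counter
deltas = (counts - counts.shift(1)).fillna(1)  # first row has no predecessor
print(deltas.tolist())  # [1.0, 30.0, 50.0, 80.0]
```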

One way to get a sense of our data is to visualize it. Here, we'll try out ggplot, a port of the popular R graphing library.

Entries seem to peak during specific two-hour windows. This seems to correspond to peak operating times, like rush hour (8-9am and 4-5pm), lunch, and dinner.

We can also look at comparing ridership relative to weather, like how many people exit stations relative to the mean dew point.

```
import pandas
from ggplot import *

df = pandas.read_csv('./turnstile_data_master_with_weather.csv')
df['meandewpti'] = df['meandewpti'].map(lambda x: round((x - 32.0) * (5.0 / 9.0), 0))  # F -> C
daily = df.groupby(df.meandewpti).EXITSn_hourly.sum()
daily.index.name = 'day'
daily = daily.reset_index()
p = ggplot(daily, aes('day', weight='EXITSn_hourly', alpha=0.5)) + \
    geom_bar(fill="green") + \
    theme_xkcd() + \
    ggtitle("May 2011 - Turnstile Exits by Dew Point") + \
    xlab("Degrees Celsius") + \
    ylab("# of Exits")
print(p)
```

I don't think we can infer much from the plot, except that the most popular days in May had a dew point around 14-16°C.

Inspired by this impressive piece of engineering, I tried to write my own quine in CoffeeScript.

The trick I used was to encode the entire file into decimal ASCII and store it in the data array. To rebuild the actual code, you convert it back to characters. This lets you store the code and run it too!
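The encoding half of that trick is easy to sketch; here it is in Python rather than CoffeeScript (this shows only the ASCII round-trip, not a full quine):

```
code = 'print("hello")'
data = [ord(c) for c in code]            # encode the source into decimal ASCII
rebuilt = ''.join(chr(n) for n in data)  # decode it back into runnable source
print(rebuilt == code)  # True
```

A quine then embeds `data` in its own source and prints both the array and the decoded code, so output and source coincide.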
