Saturday, January 30, 2016

Golang, FizzBuzz, and Channels (Analyzing an Interesting Implementation)

One: The Setting

I recently ran across an interesting implementation, written by Russ Cox, of FizzBuzz. He said:

Every time I see FizzBuzz mentioned somewhere on the internet I think two things.

1. That's a dumb problem.
2. There's actually a nice Go solution.

I've seen lots of people make the first observation but have never seen anyone make the second. It's basically a tiny prime sieve.

That sounded interesting, so I took a look at it. The code was short and at a glance looked simple, but once I tried tracing what was happening I was completely lost. But this should be simple. I must have been missing something obvious. So I thought I'd try to unravel how it works.

Two: Let's Review, What is FizzBuzz?

FizzBuzz is a relatively simple programming exercise. It comes from a game meant to help kids learn division; it's played by counting one to some number, and when a number is divisible by three, the player says "Fizz" instead of the number. If the number is divisible by five, the player says "Buzz," and if the number is divisible by three and five, the player says "FizzBuzz."

Fizz: What's the Big Deal?

I said that this is a relatively simple programming exercise, but I'm surprised to read that there are a large number of professional programmers who apparently can't implement FizzBuzz. Jeff Atwood even wrote a whole blog post about it, and his post links to several other posts that talk about candidate interviewees that couldn't perform simple programming tasks, which I still have trouble believing given that, in my experience, FizzBuzz is nearly as "standard" a programming exercise as "Hello, World!"

You can, with very little effort, find implementations of FizzBuzz in...I'd not say every language, but just about every language in practical use today. Even if someone who claims to be an experienced programmer hasn't heard of FizzBuzz by name, they should be able to implement some version of it if they know basic math, looping, and string manipulation.

Four: How Do You FizzBuzz?

Problem statement: count from 1 to 100 and if the number is divisible by 3, output "Fizz". If the number is divisible by 5, output "Buzz". If it is divisible by 3 and 5, output "FizzBuzz". Otherwise, output just the number.

A simple implementation in Go looks something like this:

// FizzBuzz
package main

import "fmt"
import "strconv"

func main() {

 // Create a loop to count 1 to 100
 for i := 1; i <= 100; i++ {

  // Create a string variable that gets reinitialized each iteration
  var strOutput string
  strOutput = ""

  // Fizz on 3
  if i%3 == 0 {
   strOutput = strOutput + "Fizz"
  }
  // Buzz on 5
  if i%5 == 0 {
   strOutput = strOutput + "Buzz"
  }
  // Otherwise, output the number
  if strOutput == "" {
   strOutput = strconv.Itoa(i)
  }
  // Print the result
  fmt.Println(strOutput)
 }

}

Of course there are other variations that can achieve the same output, such as using a switch/case block. I used this way because my brain likes using if statements and there were relatively few things to compare, and it also was a simple way to not check for divisible by 3, divisible by 5, and divisible by 3 and 5 separately...this just adds on to a string.

Most solutions will have this basic pattern. A for loop to iterate 1 through 100. A string to hold what will be printed to the console. And some modulo math, using the % or in some languages the "mod" keyword, to divide the current counter value by 3 and 5 and if the remainder is 0 that means that yes, the value is equally divisible by 3 or 5. Because that's what mod is. The remainder. In case you forgot.

Buzz: What Are Channels?


For me, the purpose of checking into this novel method of FizzBuzz was to better understand how channels work, so I won't really get into the nitty gritty details and instead give an overview of what a channel is for.

The Go language has concurrency built into its DNA. Without much effort from the developer, Go will try to take advantage of multiple logical processors; using the keyword "go" will send a function off to do its thing independently scheduled from the other goroutines.

Sometimes you send a process off to do something that doesn't impact other running processes. When that isn't the case, you need to find a way to synchronize data or have the processes communicate with each other (keep in mind that process has a specific technical definition that may not really fit with the specifics of Go's implementation, as pedants will point out that threads, processes, tasks, etc. vary in definition depending on operating system, memory sharing, how and by what mechanism they are scheduled, etc. I'm being generic when I say processes are off doing something. You have a function working on something that may happen asynchronous from your main() function and result in weird stuff happening if you assume order of execution timing...I'll leave it at that.)

In many languages you would have to set up some kind of mutex, or pipe, or socket to get independent processes to talk to each other and pass data or signal each other. Go uses channels. When you see a "<-" in Go source code, that's a channel passing data. That's really the simplest definition...it's a mechanism for passing data, whether it's ints, strings, booleans...among different running goroutines. Google can provide plenty of samples for channel use in Go. I find the problem is more like...it's a simple concept...and simple examples seem easy enough to grasp...but when I looked at the Russ Cox example, it left me scratching my head what exactly was happening.

The only other thing I can offer about channels is that the arrow always points to the left.

Fizz: What is the Novel FizzBuzz Implementation?

Here it is (from the Russ Cox link to Go playground code):

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
package main

import "fmt"

func main() {
 c := generate()
 c = filter(c, 3, "Fizz")
 c = filter(c, 5, "Buzz")
 for i := 1; i <= 100; i++ {
  if s := <-c; s != "" {
   fmt.Println(s)
  } else {
   fmt.Println(i)
  }
 }
}

func generate() <-chan string {
 c := make(chan string)
 go func() {
  for {
   c <- ""
  }
 }()
 return c
}

func filter(c <-chan string, n int, label string) <-chan string {
 out := make(chan string)
 go func() {
  for {
   for i := 0; i < n-1; i++ {
    out <- <-c
   }
   out <- <-c + label
  }
 }()
 return out
}

Short. Kind of...simple? Maybe?

Let's see if I can trace what's happening; lines 1 and 3 are standard stuff...this is the main program (package main) and it's importing the fmt library. Line 5 is defining function main().

Line 6 runs generate() and the return value is placed into "c".

Generate() is on line 18; takes no arguments, and returns a channel that passes string values.

Line 19 makes a string channel and assigns it to a locally scoped "c." It then spawns a goroutine on line 20, and that goroutine, called func(), is a infinite loop that just sends an empty string (line 22) down the channel. Then generate returns (line 25.)

Back to line 7. Here channel c is given the return value from filter(c, 3, "Fizz"). Filter is defined on line 28; it takes a string-type channel, integer, and a string for arguments and returns another string channel.

Line 29 creates a string channel called out. Then 30 through 37 defines another goroutine called func(), which defines an infinite loop (lines 31 through 36) running another for loop doing some math and variable passing (lines 32 through 34.)

The initial call to filter() passes the channel (c) created by generate(), pumping a stream of "" strings because of the infinite loop on 21 through 23, along with the integer 3. So the for loop in 32 through 34 would start at 0, then ... I'm not entirely sure what "<- <-" does. A single one passes channel information from whatever is on the right to the left side. So the first iteration (i = 0) sends "" into channel out then increments i to 1. It then does the same thing until i = 2 (because 2 < (3 - 1) makes the condition false and breaks the loop).  That would send 0, 1, and 2 through the loop, break out, then send the string "label" (which was, in that call on line 7, "Fizz") down channel c into channel out.

Only it's not passing 0, 1 and 2. Those are just the iterations of the loop; the channel (<-c, not c, technically) is handling a string value, not an integer. There's no conversions here. And the initial channel is passing "" through it. So each time through that loop for i = 0, 1, and 2, it's just passing "", "", and "" from <-c to <-out.

In other words, for that "filter" call, the recipient will get "", "", "", "Fizz", ...and that's it.

That's the goroutine for lines 31 through 37; once it's spawned, filter() returns the channel "out" to the caller (c on line 7.)

Then it returns to line 8, which is another filter call assigned to channel c, and the call is made with the value of 5 and the label "Buzz". That means this time the for loop would slip through 0, 1, 2, 3, and 4 (because again 4 < (5-1) would be false) before breaking out of the loop and sending "Buzz" through the channel.

Then execution would return to line 9 which is a for loop encompassing lines 9 through 15. It creates a counter i and assigning it the value of 1, then loops until i = 100 by incrementing by one each time.

Line 10 creates variable s and assigns it whatever is pulled next from channel c (so it would be a string, as the channels are all of string types.) If the string is not empty (""), line 11 prints the string sent by the channel. Otherwise line 12 and 13 says to print the value of i.

(Quick note - yes, fmt.Println will convert the integer to a string. The argument it takes is an interface, not a string or int or other specific value. But if line 13 were something like, "fmt.Println("The count is " + i), that would fail from the mixing of types.)

So somehow the channels are acting as a kind of...chain...? Can you chain channels together?

Seven: Can I Trace It?

Go has some tools included that let you enable profiling, which is really kind of neat. When enabled, the pprof tools sample your application periodically, and save the output to a special file that when interpreted by the profiler can create these bubble graphs to show what percent of time and memory are being used by what functions. While helpful in tracking performance issues, this doesn't really tell me the execution path of the application.

I next tried to create a call graph using an included Golang tool. That was a bit of a mistake...it didn't just trace the functions of my immediate source code but of all the libraries the application touched, resulting in what seemed like thousands of bubbles of data points graphed out in a dense spiderweb of interconnections. It...wasn't very useful.

I then tried a tool called godebug. It's really kind of awesome; once downloaded, I just added a breakpoint in my source by inserting the line

_ = "breakpoint"

...into my source code and then running

godebug run ./fizzbuzz-channels.go

...and it would enter a step-through of the source at the breakpoint. The problem was that no matter where I put a breakpoint, execution would end up stepping through lines 6 through 15. If I place a breakpoint in just the generate or filter function, it would stop and step through that function once, but then keep looping through the for loop in main() and never revisit the other functions even though they were evidently doing something.

I wasn't having much luck finding an automatic way to trace the execution path of the program.

Eight: Tinker Time?

I had a hypothesis of how to model the application's execution in my head. There were four main goroutines; main(), the function pumping "" strings from generate through the channel each time something read from the channel, and two filter() functions, one after another either passing nothing more to the chain of strings or tacking on a Fizz or Buzz before passing on the message.

One way to test that would be to add a little extra logic to the code.


 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
package main

import "fmt"

func main() {
 c := generate()
 c = filter(c, 3, "Fizz")
 c = filter(c, 5, "Buzz")
 for i := 1; i <= 100; i++ {
  if s := <-c; s != "" {
   fmt.Println(s)
  } else {
   fmt.Println(i)
  }
 }
}

func generate() <-chan string {
 c := make(chan string)
 go func() {
  for {
   c <- ""
  }
 }()
 return c
}

func filter(c <-chan string, n int, label string) <-chan string {
 out := make(chan string)
 go func() {
  for {
   for i := 0; i < n-1; i++ {
    if n == 3 {
     out <- <-c + "3check"
    }
    if n == 5 {
     out <- <-c + "5check"
    }
   }
   out <- <-c + label
  }
 }()
 return out
}

The filter function now adds a check for the value of the integer passed to it and tacks on a string of 3check or 5check, depending on which filter() is run. When I first ran FizzBuzz, the initial output looked like this:
1
2
Fizz
4
Buzz
Fizz
7
8
Fizz
Buzz

The modified version looked like this:
3check5check
3check5check
Fizz5check
3check5check
3checkBuzz
Fizz5check
3check5check
3check5check
Fizz5check
3checkBuzz

This means that each in each iteration, both of those "filters" are checked, and they both run back to back each time. Generate() sends a "", the first filter checks for a 3, and the second filter checks for the 5 count, and together they assemble a string to slide down the channels to form a ""+filter()1+filter()2 response that is then evaluated by the 1 to 100 for loop in main().

Fizz: Why Aren't Fizz and Buzz, or 3check and 5check, Interleaved at Random?

Goroutines are running independent of each other. Without some sort of queuing or syncing mechanism, output from the goroutines would occur whenever a task is completed. It's very possible to have an application where the output is not the same each run.

But in my test, the output is pretty consistent; 3check always precedes 5check (not to mention the output for the 3- and 5-factor numbers are in the right spot.) That means the two goroutines are chained together, not totally separate, so they're always running in the same order. I suspect that is from the order of the three assignments of "c" in main(). But how to test that?

Let's reverse the order of the assignment. That should change the output to 5check3check each time. Here's the new main():


func main() {
 c := generate()
 c = filter(c, 5, "Buzz")
 c = filter(c, 3, "Fizz")
// c = filter(c, 5, "Buzz")
 for i := 1; i <= 100; i++ {
  if s := <-c; s != "" {
   fmt.Println(s)
  } else {
   fmt.Println(i)
  }
 }
}


The rest of the application is unchanged, and the only thing in main() that was altered was copying the Buzz/5 assignment to c to precede the Fizz/3 check (I commented out the existing line out of habit for easier reversion.) If my theory that the three assignments to c acts as a way to create an ordered chain of channels is correct, 3check5check should become 5check3check. Here's the output:
5check3check
5check3check
5checkFizz
5check3check
Buzz3check
5checkFizz
5check3check
5check3check
5checkFizz
Buzz3check

Bingo!

Buzz: But How Does It Know i Each Time?

I feel like this is a "newbie" issue in trying to trace out what the program is doing. I was staring at the source code, thinking I have main() with a for loop counting i and I have two filter() goroutines that have loops incrementing i to determine if the current counter is at 3 or 5. But how do the goroutines know what the count is to determine if they should use a Fizz or Buzz?

Somewhere in the cobwebbed corners of my brain I recalled reading that goroutines shared memory; could the goroutines see the same variable? Nope. Shared memory or not, they are scoped differently and are invisible to each other. Plus the i's are independently declared (with := ) and that means that they are different variables.

It wasn't until I was rubber-ducking the problem that I realized my mental model for this FizzBuzz solution was wrong. 

Here's what is evaluated down the channel in each iteration:
"" "1" "1"
"" "2" "2"
"" "Fizz" "3"
"" "1" "4"
"" "2" "Buzz"
"" "Fizz" "1"
"" "1" "2"
"" "2" "3"
"" "Fizz" "4"
"" "1" "Buzz"

In other words, I was confused because I had the original basic solution in my head, where you increment the counter by one then figure out if the result is divisible by 3 or 5. This solution just counts to 3 (or 5, via the second filter() goroutine) then starts over again. It's not evaluating numbers 1 through 100.

The output of the numbers 1 through 100 comes from the main() loop. Every time the loop in main() gets a bundle from the channel that doesn't have a string (the Fizz or Buzz) it instead prints the number of the current iteration.

End result...a FizzBuzz program!

1 comment: