Sunday, July 19, 2015

Go and Mod Operations (A Performance and Testing Tale)

I was recently working on a simple application where one of the functions calculated, from a count of the Unix epoch, time remaining before an entry in slice was going to be deleted. Each entry has a timestamp, the function grabs the current time, calculates the difference, then translated it to a human readable amount of "X days, Y hours, Z minutes remaining."

It's not a hard problem to model. I had the time remaining in seconds (thanks to the epoch calculation.) I took the number of seconds in a day, an hour, and minute, then figure out how many days fit in the epoch. Then how many hours are in the remainder. Then how many minutes fit in that remainder, and seconds...well, that's the final remainder, no calculation needed.

This is an obvious use case for the mod operator. I knew that, and I did a quick search for mod support in Go; I got a hit on using Go's math package and its associated mod function. (If you're experienced with basic math operations on Go, I already know what you're thinking...just hang stick with me.)

I had remembered being advised in the past to use an int64 when dealing with epoch calculations, and that's what the application used. The library used float64, and meant I'd have to cast types to get the calculations to work. But I tried a quick implementation with the mod function from the math package, and due to rounding I would get weird results (like 2 days 24 hours and 60 minutes...)

I wanted to get this working and conceptually it was simple to implement a quick and dirty workaround.


// How many days?
for int64Difference >= 86400 {
 intDays++
 int64Difference = int64Difference - 86400
}


(60 seconds a minute, 60 minutes an hour, 24 hours a day = 86,400 seconds...)

Repeat the process on the remainder each time to get hours and minutes. It's very simple to understand. I figured it was also expedient given that in my particular use case I was working with values that should never exceed 7 days.

As I was doing the initial work on this, I realized that...derp...there was a simple mod operator, %.

Yeah, forehead slap. But the documentation implied it worked with int, not int64. I thought about the situation:
A) I could go back later to fix the code...performance was acceptable at the moment.
B) The casting of values was another operation that would possibly eat a few cpu cycles.
C) I'm sure it's bad, but it can't be horribly bad to do it with some simple loops.

I implemented the looping method, tested the application, and deemed performance acceptable.

But I was still curious how horrible the looping method was compared to the %-mod. I started creating some mini-applications to test performance, and one thing led to another and soon I had combined them into one application to test various cases. The application was pieced together just for testing purposes, so it wasn't pretty, but it seemed to do the job.

In the end, I discovered my method was not only bad, but horribly bad. The fact that the application is dealing with relatively small numbers limits the performance issues, but changing implementations will probably be simple improvement.

The application takes each time entry in a slice and calculates the loops serially. My test application allowed testing serially, another that splits the slice in half for goroutines to calculate individually, a "massive" mode that threw out goroutines for each of the calculations, and finally using mod.

Additionally, in a previous test I had blogged about I discovered that the act of writing output to the terminal could, in the right circumstances, be a performance hit. To test for that, I added another flag that would silence the output.

I ran the various permutations and in summary, for every version that didn't use mod, running 500 int-sized random numbers (in a slice of int64's, to better match the conditions of the application I was testing on behalf of) through the loops took around ten and a half to eleven minutes (I used the Unix "time" utility to measure the run-time.)

Using mod (%) and casting, the time dropped to 0.016 seconds. While it didn't make a significant difference in the workaround method, using mod in silent mode dropped the time to 0.008 seconds.

Is the effect increased if the number of elements is increased? I upped it to 5,000,000 and re-ran the modulo test. In verbose mode, the time increased to 2 minutes 17 seconds. When I repeated it in silent mode, the time dropped to 0 minutes 13.753 seconds.

The takeaways?

1) When dealing with mainstream languages, the language implementors probably know better than you how to do something.
2) Test different implementations. If you suspect something may work better with another implementation, try it out.
3) Despite how horrible the workaround was, in the practical use, the numbers being calculated are small enough that the inefficient method didn't hinder the application to an unacceptable degree.
4) Once again writing strings to the terminal can significantly hinder performance, but only when the method being used for the calculation isn't horribly inefficient.
5) I really have little idea how big really big numbers are. I've read articles that point out how bad humans are at comprehending large numbers, but I tried validating random numbers generated against an online epoch time converter and they seemed correct. If someone finds where my calculations are wrong, please let me know so I can fix it.

The application performed the calculations, despite my inefficient implementation of the algorithm, with acceptable speed. I learned how poor the method was after creating a test application. And it's not hard to refactor the application in a later release.

Overall...it's a great example of how to work a little better by learning from my mistakes.

I'll end the blog post with a copy of the pieced-together test program. Some side notes; yes, it's a slice of int64, while the random call is limited to the size of int. That's to better simulate the conditions of the application I had created and to prevent weird casting causing overflows between types when doing calculations.

There's also a waitgroup (from the sync package) used to make sure main() didn't close out shortly after spawning goroutines in some of the calculation methods. Otherwise the program spawned goroutines and almost immediately existed without having a chance to write output to the terminal.

Enjoy!


package main

import (
 "fmt"
 "math/rand"
 "os"
 "strconv"
 "sync"
 "time"
)

var slcTimes []int64
var boolSilent bool

func main() {

 var intCounter int
 var strCounter string
 var strMode string
 var strChatty string

 if len(os.Args) < 4 {
  fmt.Println("Usage: ./loopcheck <count> <serial|parallel|modulo|massive> <verbose|silent>")
  os.Exit(1)
 }

 strCounter = os.Args[1]
 strMode = os.Args[2]
 strChatty = os.Args[3]

 intCounter, err := strconv.Atoi(strCounter)
 if err != nil {
  fmt.Println("This doesn't appear to be a valid count: " + strCounter)
  os.Exit(1)
 }

 if strMode != "serial" && strMode != "parallel" && strMode != "massive" && strMode != "modulo" {
  fmt.Println("Need a mode as second argument of serial, parallel, massive or modulo, not " + strMode)
  os.Exit(1)
 }

 switch strChatty {
 case "verbose":
  boolSilent = true
 case "silent":
  boolSilent = false
 default:
  fmt.Println("Please set verbose or silent run")
  os.Exit(1)
 }

 // Seed the Random function
 rand.Seed(time.Now().UTC().UnixNano())

 var int64Random int64

 // 604,800 seconds is a week...
 // Fill the slice with values
 for b := 0; b < intCounter; b++ {

  //int64Random = rand.Int63()
  int64Random = int64(rand.Int())
  slcTimes = append(slcTimes, int64Random)
 }

 if strMode == "serial" {
  for i := range slcTimes {
   HowMuchTime(i)
  }
 }

 if strMode == "massive" {

  var wg sync.WaitGroup
  wg.Add(len(slcTimes))

  for i := range slcTimes {
   go HowMuchTimeMassive(&wg, i)
  }

  wg.Wait()

 }

 if strMode == "modulo" {

  for i := range slcTimes {

   // Set the integers
   intCenturies := 0
   intDecades := 0
   intYears := 0
   intMonths := 0
   intWeeks := 0
   intDays := 0
   intHours := 0
   intMinutes := 0

   // Let's get the time to deletion for each entry
   int64TimeInSeconds := slcTimes[i]

   // How many centuries?
   intCenturies = int(int64TimeInSeconds / 3155692600)
   int64TimeInSeconds = int64TimeInSeconds % 3155692600

   // How many decades?
   intDecades = int(int64TimeInSeconds / 315569260)
   int64TimeInSeconds = int64TimeInSeconds % 315569260

   // How many years?
   intYears = int(int64TimeInSeconds / 31556926)
   int64TimeInSeconds = int64TimeInSeconds % 31556926

   // How many months?
   intMonths = int(int64TimeInSeconds / 2629743)
   int64TimeInSeconds = int64TimeInSeconds % 2629743

   // How many weeks?
   intWeeks = int(int64TimeInSeconds / 604800)
   int64TimeInSeconds = int64TimeInSeconds % 604800

   // How many days?
   intDays = int(int64TimeInSeconds / 86400)
   int64TimeInSeconds = int64TimeInSeconds % 86400

   // How many hours?
   intHours = int(int64TimeInSeconds / 3600)
   int64TimeInSeconds = int64TimeInSeconds % 3600

   // How many minutes?
   intMinutes = int(int64TimeInSeconds / 60)

   // Convert to strings
   strCenturies := strconv.Itoa(intCenturies)
   strDecades := strconv.Itoa(intDecades)
   strYears := strconv.Itoa(intYears)
   strMonths := strconv.Itoa(intMonths)
   strWeeks := strconv.Itoa(intWeeks)
   strDays := strconv.Itoa(intDays)
   strHours := strconv.Itoa(intHours)
   strMinutes := strconv.Itoa(intMinutes)

   // Formatting
   strCenturies = strCenturies + " Centuries "
   strDecades = strDecades + " Decades "
   strYears = strYears + " Years "
   strMonths = strMonths + " Months "
   strWeeks = strWeeks + " Weeks "
   strDays = strDays + " Days "
   strHours = strHours + " Hours "
   strMinutes = strMinutes + " Minutes "

   strMessages := "From " + strconv.FormatInt(slcTimes[i], 10) + " : " + strCenturies + strDecades + strYears + strMonths + strWeeks + strDays + strHours + strMinutes

   if boolSilent != false {
    fmt.Println("Iteration " + strconv.Itoa(i+1) + " : " + strMessages)
   }
  }

 }

 if strMode == "parallel" {

  var intMax int = intCounter - 1
  var intTemp int
  var wg sync.WaitGroup
  wg.Add(2)

  intTemp = intCounter / 2

  go HowMuchTimeParallel(&wg, 0, intTemp)
  go HowMuchTimeParallel(&wg, intTemp+1, intMax)

  wg.Wait()
 }
}

func HowMuchTime(i int) {

 var intCenturies int
 var intDecades int
 var intYears int
 var intMonths int
 var intWeeks int
 var intDays int
 var intHours int
 var intMinutes int
 var strMessages string

 // Set the integers
 intCenturies = 0
 intDecades = 0
 intYears = 0
 intMonths = 0
 intWeeks = 0
 intDays = 0
 intHours = 0
 intMinutes = 0

 // Let's get the time to deletion for each entry
 int64TimeInSeconds := slcTimes[i]

 // How many centuries?
 for int64TimeInSeconds >= 3155692600 {
  intCenturies++
  int64TimeInSeconds = int64TimeInSeconds - 3155692600
 }
 // How many decades?
 for int64TimeInSeconds >= 315569260 {
  intDecades++
  int64TimeInSeconds = int64TimeInSeconds - 315569260
 }
 // How many years?
 for int64TimeInSeconds >= 31556926 {
  intYears++
  int64TimeInSeconds = int64TimeInSeconds - 31556926
 }
 // How many months?
 for int64TimeInSeconds >= 2629743 {
  intMonths++
  int64TimeInSeconds = int64TimeInSeconds - 2629743
 }
 // How many weeks?
 for int64TimeInSeconds >= 604800 {
  intWeeks++
  int64TimeInSeconds = int64TimeInSeconds - 604800
 }
 // How many days?
 for int64TimeInSeconds >= 86400 {
  intDays++
  int64TimeInSeconds = int64TimeInSeconds - 86400
 }
 // How many hours?
 for int64TimeInSeconds >= 3600 {
  intHours++
  int64TimeInSeconds = int64TimeInSeconds - 3600
 }
 // How many minutes?
 for int64TimeInSeconds >= 60 {
  intMinutes++
  int64TimeInSeconds = int64TimeInSeconds - 60
 }
 // Convert to strings
 strCenturies := strconv.Itoa(intCenturies)
 strDecades := strconv.Itoa(intDecades)
 strYears := strconv.Itoa(intYears)
 strMonths := strconv.Itoa(intMonths)
 strWeeks := strconv.Itoa(intWeeks)
 strDays := strconv.Itoa(intDays)
 strHours := strconv.Itoa(intHours)
 strMinutes := strconv.Itoa(intMinutes)
 // Formatting
 strCenturies = strCenturies + " Centuries "
 strDecades = strDecades + " Decades "
 strYears = strYears + " Years "
 strMonths = strMonths + " Months "
 strWeeks = strWeeks + " Weeks "
 strDays = strDays + " Days "
 strHours = strHours + " Hours "
 strMinutes = strMinutes + " Minutes "
 strMessages = "From " + strconv.FormatInt(slcTimes[i], 10) + " : " + strCenturies + strDecades + strYears + strMonths + strWeeks + strDays + strHours + strMinutes
 if boolSilent != false {

  fmt.Println("Iteration " + strconv.Itoa(i+1) + " : " + strMessages)
 }
}

func HowMuchTimeMassive(wg *sync.WaitGroup, i int) {

 var intCenturies int
 var intDecades int
 var intYears int
 var intMonths int
 var intWeeks int
 var intDays int
 var intHours int
 var intMinutes int
 var strMessages string

 // Set the integers
 intCenturies = 0
 intDecades = 0
 intYears = 0
 intMonths = 0
 intWeeks = 0
 intDays = 0
 intHours = 0
 intMinutes = 0

 // Let's get the time to deletion for each entry
 int64TimeInSeconds := slcTimes[i]

 // How many centuries?
 for int64TimeInSeconds >= 3155692600 {
  intCenturies++
  int64TimeInSeconds = int64TimeInSeconds - 3155692600
 }
 // How many decades?
 for int64TimeInSeconds >= 315569260 {
  intDecades++
  int64TimeInSeconds = int64TimeInSeconds - 315569260
 }
 // How many years?
 for int64TimeInSeconds >= 31556926 {
  intYears++
  int64TimeInSeconds = int64TimeInSeconds - 31556926
 }
 // How many months?
 for int64TimeInSeconds >= 2629743 {
  intMonths++
  int64TimeInSeconds = int64TimeInSeconds - 2629743
 }
 // How many weeks?
 for int64TimeInSeconds >= 604800 {
  intWeeks++
  int64TimeInSeconds = int64TimeInSeconds - 604800
 }
 // How many days?
 for int64TimeInSeconds >= 86400 {
  intDays++
  int64TimeInSeconds = int64TimeInSeconds - 86400
 }
 // How many hours?
 for int64TimeInSeconds >= 3600 {
  intHours++
  int64TimeInSeconds = int64TimeInSeconds - 3600
 }
 // How many minutes?
 for int64TimeInSeconds >= 60 {
  intMinutes++
  int64TimeInSeconds = int64TimeInSeconds - 60
 }
 // Convert to strings
 strCenturies := strconv.Itoa(intCenturies)
 strDecades := strconv.Itoa(intDecades)
 strYears := strconv.Itoa(intYears)
 strMonths := strconv.Itoa(intMonths)
 strWeeks := strconv.Itoa(intWeeks)
 strDays := strconv.Itoa(intDays)
 strHours := strconv.Itoa(intHours)
 strMinutes := strconv.Itoa(intMinutes)
 // Formatting
 strCenturies = strCenturies + " Centuries "
 strDecades = strDecades + " Decades "
 strYears = strYears + " Years "
 strMonths = strMonths + " Months "
 strWeeks = strWeeks + " Weeks "
 strDays = strDays + " Days "
 strHours = strHours + " Hours "
 strMinutes = strMinutes + " Minutes "
 strMessages = "From " + strconv.FormatInt(slcTimes[i], 10) + " : " + strCenturies + strDecades + strYears + strMonths + strWeeks + strDays + strHours + strMinutes
 if boolSilent != false {
  fmt.Println("Iteration " + strconv.Itoa(i+1) + " : " + strMessages)
 }

 wg.Done()

}

func HowMuchTimeParallel(wg *sync.WaitGroup, min int, max int) {

 var intCenturies int
 var intDecades int
 var intYears int
 var intMonths int
 var intWeeks int
 var intDays int
 var intHours int
 var intMinutes int
 var strMessages string

 // Set the integers
 intCenturies = 0
 intDecades = 0
 intYears = 0
 intMonths = 0
 intWeeks = 0
 intDays = 0
 intHours = 0
 intMinutes = 0

 for i := min; i <= max; i++ {
  // Let's get the time to deletion for each entry
  int64TimeInSeconds := slcTimes[i]

  // How many centuries?
  for int64TimeInSeconds >= 3155692600 {
   intCenturies++
   int64TimeInSeconds = int64TimeInSeconds - 3155692600
  }
  // How many decades?
  for int64TimeInSeconds >= 315569260 {
   intDecades++
   int64TimeInSeconds = int64TimeInSeconds - 315569260
  }
  // How many years?
  for int64TimeInSeconds >= 31556926 {
   intYears++
   int64TimeInSeconds = int64TimeInSeconds - 31556926
  }
  // How many months?
  for int64TimeInSeconds >= 2629743 {
   intMonths++
   int64TimeInSeconds = int64TimeInSeconds - 2629743
  }
  // How many weeks?
  for int64TimeInSeconds >= 604800 {
   intWeeks++
   int64TimeInSeconds = int64TimeInSeconds - 604800
  }
  // How many days?
  for int64TimeInSeconds >= 86400 {
   intDays++
   int64TimeInSeconds = int64TimeInSeconds - 86400
  }
  // How many hours?
  for int64TimeInSeconds >= 3600 {
   intHours++
   int64TimeInSeconds = int64TimeInSeconds - 3600
  }
  // How many minutes?
  for int64TimeInSeconds >= 60 {
   intMinutes++
   int64TimeInSeconds = int64TimeInSeconds - 60
  }
  // Convert to strings
  strCenturies := strconv.Itoa(intCenturies)
  strDecades := strconv.Itoa(intDecades)
  strYears := strconv.Itoa(intYears)
  strMonths := strconv.Itoa(intMonths)
  strWeeks := strconv.Itoa(intWeeks)
  strDays := strconv.Itoa(intDays)
  strHours := strconv.Itoa(intHours)
  strMinutes := strconv.Itoa(intMinutes)
  // Formatting
  strCenturies = strCenturies + " Centuries "
  strDecades = strDecades + " Decades "
  strYears = strYears + " Years "
  strMonths = strMonths + " Months "
  strWeeks = strWeeks + " Weeks "
  strDays = strDays + " Days "
  strHours = strHours + " Hours "
  strMinutes = strMinutes + " Minutes "
  strMessages = " From " + strconv.FormatInt(slcTimes[i], 10) + " : " + strCenturies + strDecades + strYears + strMonths + strWeeks + strDays + strHours + strMinutes
  if boolSilent != false {

   fmt.Println("Iteration " + strconv.Itoa(i+1) + " : " + strMessages)
  }
 }

 wg.Done()
}

No comments:

Post a Comment