2012 Year Summary – This Blog, Books And Running

This year I covered a bit more than 1,000 miles. Here are some of the stats that the Nike site gives you. First, you can see that I run mostly off-road. This is great not only for the amazing views but also because it’s quiet and gives you a chance to think without any ‘city’ noise around.


Web Workers And Big Data – A Real World Example

Web Workers in the 19th century

I had an interesting conversation on G+ with developers about web workers and the need for a ‘real world’ example that uses a ‘big chunk’ of data. In this example we will dive into this scenario. No more ‘hello world’ and calculating some nice number (Pi, e, etc.). In the code we will take an array of 33.55 million numbers (~32MB), run a calculation on each and every one of them, and return the new data to our main page/app. We will use transferable objects because they are a powerful new option to ‘move’ (not copy) big arrays in and out of the worker. Since support for transferable objects is (so far) provided via webkitPostMessage(), we will use Chrome as the browser for these examples.
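To make the ‘move’ versus ‘copy’ difference concrete, here is a minimal sketch that is not part of the original example. It assumes a runtime that provides structuredClone(), which applies the same structured-clone and transfer rules as postMessage(), so no worker is needed to see the effect. The names makeBuffer, copyBuffer and transferBuffer are made up for this illustration.

```javascript
// 32 * 1024 * 1024 bytes = 33,554,432, the "33.55 million numbers" (32MB).
var FULL_SIZE = 32 * 1024 * 1024;

// Hypothetical helper: build an ArrayBuffer filled with byte values.
function makeBuffer(sizeInBytes) {
  var view = new Uint8Array(sizeInBytes);
  for (var i = 0; i < view.length; i++) {
    view[i] = i % 256;
  }
  return view.buffer;
}

// No transfer list: the receiver gets an independent copy.
function copyBuffer(buffer) {
  return structuredClone(buffer);
}

// Transfer list given: ownership moves and the sender's buffer is detached.
function transferBuffer(buffer) {
  return structuredClone(buffer, { transfer: [buffer] });
}

// Demo on a smaller 1MB buffer to keep it quick.
var original = makeBuffer(1024 * 1024);
var copied = copyBuffer(original);
var lengthAfterCopy = original.byteLength;      // sender still owns its data
var moved = transferBuffer(original);
var lengthAfterTransfer = original.byteLength;  // sender's buffer is now empty

console.log('after copy:', lengthAfterCopy, 'after transfer:', lengthAfterTransfer);
```

After a transfer the sending side can no longer read the data, which is the price of skipping the copy.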

This is the main page of our example. In the code snippet below you can see the test function that lets us choose the method of delivery.


// The entry point for our comparison.
function test(useIt) {
  // useIt: true  - use transferable objects
  //        false - use the COPY method for sending the data.
  var useTransferrable = useIt;
  setupArray(); // create the 32M array with numbers

  if (useTransferrable ) {
    console.log ("## Using Transferrable object method on size: " +
                 uInt8View.length);
    // This is the syntax to send by using trans-obj.
    worker.postMessage(uInt8View.buffer, [uInt8View.buffer]);
  } else {
    console.log ("## Using old COPY method on size: " + 
                 uInt8View.length);
    // Simple send msg that copy the data to the worker
    worker.postMessage({'copy': 'true', 
                      'ourArray': uInt8View.buffer});
  }
}

And here is the worker that does the hard work on 32M numbers. You can think of better ways to do ‘hard work’… especially if you are in the world of WebGL and have big matrices on your hands.


  // Here we are 'computing' something important on the data. 
  // In our case - just play with % if you have new hardware
  // try: Math.sqrt( uInt8View[i] * Math.random() * 10000);
  for (var i=0; i < dataLength; i++ ) {
    uInt8View[i] = uInt8View[i] % 2;
  }
  
  if (useTransferrable) {
    self.postMessage(uInt8View.buffer, [uInt8View.buffer]);
  } else {
    self.postMessage(e.data.ourArray);
  }
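The worker excerpt above does not show how the incoming message is unpacked, so here is one possible sketch of a complete worker. It assumes the two message shapes sent by test(): a bare ArrayBuffer for the transferable path, and an object with 'copy' and 'ourArray' keys for the COPY path. The helper names unpackMessage and compute are hypothetical and only used for this illustration.

```javascript
// Hypothetical helper: accept either message shape used by test() and
// return a Uint8Array view plus a flag saying how the data arrived.
function unpackMessage(data) {
  if (data && data.copy !== undefined) {
    // COPY path: the buffer sits inside a plain object.
    return { view: new Uint8Array(data.ourArray), useTransferrable: false };
  }
  // Transferable path: the message itself is the ArrayBuffer.
  return { view: new Uint8Array(data), useTransferrable: true };
}

// The 'hard work' from the post: reduce every byte to 0 or 1.
function compute(view) {
  for (var i = 0; i < view.length; i++) {
    view[i] = view[i] % 2;
  }
  return view;
}

// Register the handler only inside a worker (importScripts exists only there).
if (typeof importScripts === 'function') {
  self.onmessage = function (e) {
    var msg = unpackMessage(e.data);
    var view = compute(msg.view);
    if (msg.useTransferrable) {
      // Move the buffer back to the page.
      self.postMessage(view.buffer, [view.buffer]);
    } else {
      // Let the browser copy the buffer back to the page.
      self.postMessage(view.buffer);
    }
  };
}
```

Keeping the unpacking and the computation in plain functions means they can be tried outside a worker as well.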

The results are clear (as the sun over the beach in Maui): it was much faster to work with transferable objects.

web workers - compare options to move data in and out
With transferable objects it took us 292ms, while with copy it took 533ms.
Last but not least, you can find more examples and in-depth coverage of web workers in my book on web workers. Psst… if you can’t sleep, it might help you on that front as well.
Web Workers - The book

Rank Your Book Collection

I love books. It is thanks to my parents, who instilled in me from the age of 4 or 5 this passion for the written word. It might (also) be thanks to my literature teacher from high school, who took the only subject I really didn’t like (yep… I enjoyed math, physics and computers but hated literature) and made his course a pure adventure full of joy. I remember lots of moments when you finish a book but keep thinking about its subjects/points of view/heroes for years and years, even after reading it for the 4th time. In a way, my Kindle is a wonderful device, but I still really like to hold a ‘real’ book.

Last weekend, I decided to ‘sort’ my mobile (=Kindle) books. Since I had them (all 1,073) in one big folder, I wrote this little script that builds a list of their names and then uses Amazon to get their ratings. From here, the path to a spreadsheet with the data is very short. Now I know which are the best ones, by harnessing the ‘wisdom of the crowds’.
Happy reading!

Here is the code (or if you like a better version try it on github)

 

/**
 * Description: read a list of books (from a collection on your hard drive)
 * and use Amazon reviews to rank them. This is helpful if you have lots of books.
 * It's good to put the best ones on your Kindle for the next vacation/conference etc.
 *
 * @author Ido Green
 * @date 4/24/2011
 * @see http://greenido.wordpress.com/
 * http://amazon.com 
 * http://gskinner.com/RegExr/ - to handle regex IF you want to get ranking from the html
 * 
 */
class scanAmazon {

    private $books = array();
    private $newRankList = array();

    /**
     * Ctor
     * @param type $dir - the path to your directory of books
     */
    function __construct($dir) {
        $this->buildList($dir);
    }

    /**
     * Run on all the books and get the rating, then, save them to a CSV file.
     */
    public function run() {
        $this->getRating();
        $this->saveToFile("booksRanking.csv", implode("\n", $this->newRankList));
    }

    /**
     * Build a list of book names from the file names
     * @param type $dir - the path to your directory of books
     */
    private function buildList($dir) {
        if ($handle = opendir($dir)) {
            while (false !== ($file = readdir($handle))) {
                //echo "$file\n";
                $name = substr($file, 0, strlen($file) - 5);
                if (strlen($name) > 2) {
                    array_push($this->books, $name);
                }
            }
            closedir($handle);
            sort($this->books);
        }
    }

    
    /**
     * Get the rating of the books
     * we are looking for this pattern: Rated 4.7 out of 5.0
     * 1. Use google results:  http://www.google.com/search?sourceid=chrome&ie=UTF-8&q=site%3Aamazon.com+BOOK-NAME
     * 2. Use amazon directly: http://www.amazon.com/s/ref=nb_sb_noss?url=search-alias%3Dstripbooks&field-keywords=BOOK-NAME
     */
    private function getRating() {
        echo "Working on " . count($this->books) . " books\n";
        $i = 1;
        foreach ($this->books as $book) {  
            $searchUrl = 'http://www.amazon.com/s/ref=nb_sb_noss?url=search-alias%3Dstripbooks&field-keywords=' . urlencode($book);
            // this is the pattern: alt="4.3 out of 5 stars"
            $resPage = file_get_contents($searchUrl);
            $matches = array();
            $ind2 = strpos($resPage, "out of 5 stars");
            $ind1 = $ind2 - 4;//strripos($resPage, '"',$ind2);
            
            //google: $ind1 = strpos($resPage, "Rated") + 5;
            //google: $ind2 = strpos($resPage, "out of", $ind1);
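            // the rating (e.g. "4.3") is the 3 characters just before " out of 5 stars"
            if ($ind2 !== false && $ind2 > $ind1 && ($ind2 - $ind1) < 5) {
                $rank = substr($resPage, $ind1, 3);
                array_push($this->newRankList, $rank . "," . str_replace(",", " ", $book) . "," .
                        $searchUrl);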
                echo "{$i}) {$book} - Ranking: {$rank} out of 5.0\n";
            }
            else {
                echo "{$i} ERR - {$book} got no rating url: {$searchUrl}\n";
            }
            $i++;
            sleep(5); // let's not overload Amazon's servers :)
            // $found = preg_match('/Rated (\d\.\d) out of 5\.0/i', $resPage, $matches);
            //if ($found && count($matches) > 1) { $rank = $matches[1]; }
        }
    }

    /**
     * simple saver of data/string to file
     * @param  $fileName
     * @param  $data
     * @return  false when we could not save the data
     */
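    function saveToFile($fileName, $data) {
        // fopen()/fwrite() report failure by returning false rather than throwing
        $fh = @fopen($fileName, 'w');
        if ($fh === false) {
            error_log("Err: Could not open file for writing: {$fileName}");
            return false;
        }
        $written = fwrite($fh, $data);
        fclose($fh);
        if ($written === false) {
            error_log("Err: Could not write to file: {$fileName}");
            return false;
        }
        return true;
    }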

}

// start the party
$scanner = new scanAmazon("PATH TO YOUR BOOKS");
$scanner->run();
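The core of getRating() is finding the text "N.N out of 5 stars" in the page and turning it into a CSV row. As a rough sketch of the same idea, written in JavaScript to match the other examples on this page, the helpers below use a regular expression instead of strpos(). The function names extractRating, toCsvLine and sortByRating are hypothetical, the sorting step is an extra that the PHP script leaves to the spreadsheet, and the HTML in the usage lines is made up for illustration rather than taken from a real Amazon page.

```javascript
// Find the first "N.N out of 5 stars" in a page and return N.N as a number,
// or null when the page has no rating.
function extractRating(html) {
  var match = /(\d(?:\.\d)?) out of 5 stars/.exec(html);
  return match ? parseFloat(match[1]) : null;
}

// Build one CSV line: rating, title (commas replaced by spaces), search URL.
function toCsvLine(rating, title, url) {
  return rating + ',' + title.replace(/,/g, ' ') + ',' + url;
}

// Return a new array sorted best-rated first; unrated books (null) go last.
function sortByRating(rows) {
  return rows.slice().sort(function (a, b) {
    var ratingA = a.rating === null ? -1 : a.rating;
    var ratingB = b.rating === null ? -1 : b.rating;
    return ratingB - ratingA;
  });
}

// Usage with made-up data.
var sampleHtml = '<img alt="4.3 out of 5 stars">';
var sampleRating = extractRating(sampleHtml);
var sampleLine = toCsvLine(sampleRating, 'Some Title, Part 2', 'http://example.com/search');
console.log(sampleLine);
```

A regular expression avoids the index arithmetic and copes with a missing rating without a separate check.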


Good Books I’m reading now

It’s mainly a post for my parents (not that they are going to read it).
Hey, after all these years in which you taught me the love of books, here is a short list of what I’m reading when I don’t have time (usually around midnight and later). In this list I’ve included some books I’ve read recently:

  • What the Dog Saw: And Other Adventures.
    After Outliers, Blink and Tipping Point anything that Mr. Gladwell will write – I promise to read (as quickly as I can).
  • The Black Swan: The Impact of the Highly Improbable – it’s a very good book on a simple topic. We know NOTHING about the stock market.
    “Assuming more order than exists in chaotic nature” – Our brains are wired for narrative, not statistical uncertainty. And so we tell ourselves simple stories to explain complex things we don’t, and most importantly can’t, know. The truth is that we have no idea why stock markets go up or down on any given day, and whatever reason we give is sure to be grossly simplified, if not flat out wrong.
  • Predictably Irrational: The Hidden Forces That Shape Our Decisions
    In a similar way to The Black Swan, it contains lots of answers on how we can recover from an economic crisis.
  • The Intelligent Investor: It is the best book I’ve read about investing. The greatest investment advisor of the twentieth century, Benjamin Graham taught and inspired people worldwide. Graham’s philosophy of “value investing” — which shields investors from substantial error and teaches them to develop long-term strategies — has made The Intelligent Investor the stock market bible ever since its original publication in 1949.