Chrome, webdev

Dart Crawler Example

At the Dart hackathon I got a few questions about applications on the server. The best answer was to give the hackers a code sample… It’s very simple code by design, but I’m sure that you can take it to the next level without any problem.


#import('dart:io');
#import('dart:uri');
#import('dart:json');

// Dart Hackathon TLV 2012
//
// A simple example that fetches an RSS/JSON feed and parses it on the server side.
// This is a good start for a crawler that fetches info and parses it.
//
// Author: Ido Green | greenido.wordpress.com
// Date: 28/4/2012
//
class Crawler {
  String _urlToFetch = "http://feeds.feedburner.com/html5rocks";
  String _dataFileName = "webPageData.json";
  HttpClient _client;
  var rssItems;

  // Constructor.
  Crawler() {
    _client = new HttpClient();
  }

  // Fetch the page and save the data locally
  // in a file so we can process it later.
  fetchWebPage() {
    // Get all the updates of h5r
    Uri pipeUrl = new Uri.fromString(_urlToFetch);
    // Open a GET connection to fetch this data
    var conn = _client.getUrl(pipeUrl);
    conn.onRequest = (HttpClientRequest request) {
      request.outputStream.close();
    };
    conn.onResponse = (HttpClientResponse response) {
      // statusCode is an int, so interpolate it instead of concatenating.
      print("status code: ${response.statusCode}");
      var output = new File(_dataFileName).openOutputStream();
      response.inputStream.pipe(output);
      // In case you want to print the data to your console:
      // response.inputStream.pipe(stdout);
    };
  }

  // Read the saved file and hand its content to parsePage().
  readFile() {
    File file = new File(_dataFileName);
    if (!file.existsSync()) {
      print("Err: Could not find: " + _dataFileName);
      return;
    }
    InputStream file_stream = file.openInputStream();
    StringInputStream lines = new StringInputStream(file_stream);
    String data = "";
    lines.onLine = () {
      String line;
      while ((line = lines.readLine()) != null) {
        //print ("== "+line);
        data += line;
      }
    };
    lines.onClosed = () {
      print("Got to the end of: " + _dataFileName);
      print("This is our file content:\n" + data);
      parsePage(data);
    };
  }

  //
  // Basic (real basic) parsing
  //
  parsePage(data) {
    // Cut out the interesting part of the feed
    int start = data.indexOf("<title>");
    int end = data.lastIndexOf("</channel>");
    // Guard against data that is not the feed we expect
    if (start < 0 || end < start) {
      print("Err: Could not find the feed markers in the data");
      return;
    }
    var feed = data.substring(start, end);
    // Put the items in an array
    rssItems = feed.split("<title>");
    for (var item in rssItems) {
      print("\n** Item: " + item);
    }
  }
} // End of class

//
// Start the party
//
void main() {
  Crawler crawler = new Crawler();
  crawler.fetchWebPage();
  // Note: fetchWebPage() is asynchronous, so readFile() may run before the
  // download has finished. On a first run the file may not exist yet.
  crawler.readFile();
}

This example could be considered version 0.01 of a real crawler. A real first version would need features like:

  • Discovery – Be able to extract links from the current page and follow them. This is much harder than it sounds, as you want to make sure the crawl won’t continue forever.
  • Parsing – Parse the information on the page. Try to extract the metadata and add it to the ‘real’ content (what counts as real content depends on your goals for the crawler).
  • Analysis – Normalize the information from the page and put it in storage (a DB, a file, a cloud solution, etc.).
  • Logging & Monitoring – As this server-side process will run while you are sleeping… it’s best to have a good ‘watchdog’ on it. Start with some simple logging and analysis of the logs. The second step is to use a tool to monitor the action.
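To make the ‘won’t continue forever’ part of Discovery concrete, here is a minimal sketch of a bounded crawl. It is written against the current Dart SDK rather than the 2012 preview used above, and the names `discover`, `LinkFetcher`, `maxDepth` and `maxPages` are made up for the illustration. The link fetcher is passed in as a function, so the example runs against a tiny in-memory ‘web’ instead of real pages.

```dart
import 'dart:collection';

/// Hypothetical signature: given a page URL, return the links found on it.
typedef LinkFetcher = Future<List<Uri>> Function(Uri page);

/// Breadth-first discovery. Visits pages at most [maxDepth] hops away from
/// [start], stops after [maxPages] pages, and never visits a URL twice.
Future<Set<Uri>> discover(
  Uri start,
  LinkFetcher fetchLinks, {
  int maxDepth = 2,
  int maxPages = 100,
}) async {
  final visited = <Uri>{};
  final queue = Queue<MapEntry<Uri, int>>()..add(MapEntry(start, 0));

  while (queue.isNotEmpty && visited.length < maxPages) {
    final entry = queue.removeFirst();
    final page = entry.key;
    final depth = entry.value;
    if (!visited.add(page)) continue; // already seen, this breaks cycles
    if (depth >= maxDepth) continue; // do not follow links any deeper

    for (final link in await fetchLinks(page)) {
      if (!visited.contains(link)) queue.add(MapEntry(link, depth + 1));
    }
  }
  return visited;
}

Future<void> main() async {
  // A fake site with a cycle (a -> b -> a) standing in for real pages.
  final fakeWeb = <String, List<String>>{
    'http://example.com/a': ['http://example.com/b', 'http://example.com/c'],
    'http://example.com/b': ['http://example.com/a'],
    'http://example.com/c': ['http://example.com/d'],
    'http://example.com/d': ['http://example.com/e'],
  };
  Future<List<Uri>> fakeFetch(Uri page) async =>
      (fakeWeb[page.toString()] ?? const <String>[]).map(Uri.parse).toList();

  final found =
      await discover(Uri.parse('http://example.com/a'), fakeFetch, maxDepth: 2);
  // Pages a, b, c and d are reached; e is three hops away and is skipped.
  print(found.length);
}
```

The visited set is what stops the a/b cycle, and the two limits cap the work even on a site that keeps generating new URLs. A real crawler would probably also want politeness delays and per-host limits, which this sketch leaves out.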

Key lessons:

  • There is a real need for libraries that will make the parsing better: XPath, DOM to Map (or Array), etc.
  • The debugging in the editor could be improved… As a first step, you might want to use a logging library that gives you a lot of information at each step.
  • The editor makes the development phase very nice, with warnings on (almost) every mistake that you might make. I found it very productive to be back in the good hands of an ‘IDE’.
  • I guess that in the near future we will see some good examples that use the Dart VM on the server. It’s going to be interesting to profile their performance and see where we stand vis-à-vis other modern languages like Scala.
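Short of a proper parsing library, a small step up from splitting on "<title>" is to pull the titles out with a regular expression. The sketch below uses the current Dart SDK (not the 2012 preview), the function name `extractTitles` and the sample feed are made up for the illustration, and a regular expression is only a stand-in for a real XML parser: it will not cope with CDATA sections or escaped entities.

```dart
/// Returns the text of every <title> element found in [xml].
/// A regular expression stands in for a real XML parser here.
List<String> extractTitles(String xml) {
  final titlePattern = RegExp(r'<title[^>]*>(.*?)</title>', dotAll: true);
  return titlePattern
      .allMatches(xml)
      .map((match) => match.group(1)!.trim())
      .toList();
}

void main() {
  // Hypothetical feed content, shaped like a small RSS document.
  const feed = '''
<rss><channel>
  <title>HTML5 Rocks</title>
  <item><title>First post</title></item>
  <item><title>Second post</title></item>
</channel></rss>''';
  for (final title in extractTitles(feed)) {
    print('** Item: $title');
  }
}
```

Unlike the split approach, this returns only the title text, without the trailing markup of each item.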
Chrome, webdev

Dart Instagram Web App

One (of the many) good things that happened during the Dart hackathon 2012 in Tel Aviv was the chance to hack with friends. There were a lot of interesting projects, and I had a bit of time to hack on this simple web app, which shows a combination of a few tools. The main goal was to see how we can work with web services in Dart while giving the user a cool UI. First, I looked at how my JavaScript code should look in Dart. Then it was easy to bake in the functionality that fetches images from Instagram.

When you need to fetch some unstructured data from the web, you might want to consider using Yahoo Pipes (and/or the newer YQL). In our case, I saw that web.stagram offers roughly the data I need but (like so many other web sites) doesn’t have a JSON feed I can work with. Parsing feeds (RSS/Atom) in JavaScript is painful, so here Y! Pipes comes to the rescue. This pipe takes the ‘photo of the day’ page and gives you back JSON output with all the information you would want to see in a feed from that page. From here, the basic code to fetch the JSON and build the HTML looks like this:


// init values on the page
  startThePage() {
    String baseurl = "http://pipes.yahoo.com/pipes/pipe.run?_id=8a481ba9ce15f5efa8ac6b894b45eeac&rand=3334&_render=json";
    XMLHttpRequest request = new XMLHttpRequest();
    request.open("GET", baseurl, true);
    request.on.load.add((e) {
      _divCar.hidden = false;
      
      var response = JSON.parse(request.responseText);
      var imgs = response['value']['items'];
      for (final img in imgs) {
        writeCarousel(img);
      }
    });
    request.send();
  }
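For comparison, here is a minimal sketch of the same JSON handling with the current Dart SDK, where `dart:convert` provides `jsonDecode`. It only assumes the `value` → `items` shape used in the code above. The function name `extractItems`, the sample payload and its `title` and `link` fields are made up for the illustration and are not the real output of the pipe.

```dart
import 'dart:convert';

/// Pulls the list of items out of a JSON document of the shape
/// {"value": {"items": [...]}}. Returns an empty list if the shape differs.
List<Map<String, dynamic>> extractItems(String body) {
  final decoded = jsonDecode(body);
  if (decoded is! Map<String, dynamic>) return const [];
  final value = decoded['value'];
  if (value is! Map<String, dynamic>) return const [];
  final items = value['items'];
  if (items is! List) return const [];
  // Keep only the entries that are JSON objects.
  return items.whereType<Map<String, dynamic>>().toList();
}

void main() {
  // Hypothetical payload: the "title" and "link" fields are invented.
  const sample = '{"value": {"items": ['
      '{"title": "Sunset", "link": "http://example.com/1.jpg"},'
      '{"title": "Harbour", "link": "http://example.com/2.jpg"}]}}';
  for (final img in extractItems(sample)) {
    print('${img['title']} -> ${img['link']}');
  }
}
```

Checking the shape before indexing means a changed or empty feed yields an empty carousel rather than an exception in the load handler.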

Other tools/frameworks I’ve used in this mini-project:

  • Dart – I must say that the new language is very easy to pick up. If you are a Java programmer, a lot of things will look (very) familiar (for good and bad). But even if you spent your last several years hacking on JavaScript, you will feel at home after the first few hours.
  • Twitter Bootstrap – These days, it’s one of the best options for a quality responsive layout with a lot of other CSS goodies.
  • Y! Pipes – Instead of taking the time to build an RSS-to-JSON web service (which might be a cool idea for another hackathon), I used Pipes, which gives you not only that but also a fast cached version of the information, so you won’t put any load on the servers of your source.
  • The unofficial web.stagram API – In our case, it was the best way to get the ‘photo of the day’ from Instagram.
Overall, it’s very simple code, yet it gives us a glimpse of what can be done with Dart. I would love to put some more time into this project in order to have a nice web app (and not just a simple web site, as in its current state).
Chrome, webdev

Dart Hackathon TLV Summary

Last Friday and Saturday we hosted a Dart Hackathon in Tel Aviv. When you have a group of people
OK… When hackers, geeks, coders, ninjas and software engineers come to spend their weekend hacking on the bleeding edge of technology, you know good things will come to life. I thought we would have some cool demos in the end, but the level of the projects we saw was very impressive: from a generic library of types to a new math game that does some clever things with inheritance, canvas and other goodies, on both the server side and the client.

A few teams that I would like to mention here:

  • DJ web app – A cool web app that lets you ‘play’ the DJ part.
  • A Volfied-like game, only much better – https://github.com/yanivoliver/DartVolfied
  • GraphMVC – A modular framework for graph (vertices/edges) data structures https://github.com/habeanf/dartgraph
  • An implementation of the Novem game in Dart – It’s a new math game that doesn’t exist on the web (nor on mobile), so they are keeping the source closed for now. We might have some parts, without the algorithm, on GitHub in the next few days.

All the information about the teams with their ideas and links to their Github repositories


A few hackers came to me during the event with questions about Dart on the server side. Here is a basic example of a server-side crawler: https://gist.github.com/2517000. It took me less than ten minutes to write, and I’m sure you can take it from here to the next level. I also had a bit of time (not too much) to work on a web app. It is a simple way to watch cool photos from Instagram in your browser.

The code is on GitHub under DartInsta, and the project uses several technologies:

  • Dart – of course… all the main logic of the web app is written in Dart.
  • Twitter Bootstrap – yes, let’s have a good responsive layout without investing too much effort.
  • Y! pipes for the feeds – who said you can’t enjoy JSON from any web site on the web?
  • Instagram (or the unofficial web.stagram API) – After all, we do need some photos and it’s better to have some real good ones.

Some thoughts for the future:

    • Dart is a cool pre-alpha technology that (I hope) is going to help us build solid web apps without the need to be a ‘guru’. It’s still very (very) early, so there are many things that we could improve over time.
    • The community (web developers, Java developers, etc.) should try to work out which libraries will give the most ‘bang for the buck’. Since it’s so early in the life cycle, it would be great to have a few libraries that move everyone forward, rather than something like the situation with jQuery slideshows (= too many not-so-‘great’ ones).
    • We should do more events like this, but not on weekends, so people who can’t drive on Shabbas could join us.
    • Dart is very easy for Java developers. That’s not the case for ‘hard-core’ JS ninjas.
    • In case you are going to organize a hackathon, here are a few great tips.

(*) For Hebrew speakers, here is a great explanation that was recorded two days after the event.

Chrome, HTML5, JavaScript, webdev

Dart Hackathon In Tel Aviv

On the last weekend of April we are going to have a Dart Global Happy Hour around the world. Luckily, we will have Tel Aviv on the map as well. First, for those who still think we are speaking here about

Well, we are not talking about a game of darts in an Irish pub. Although it’s good fun…

So… what is Dart?

Dart is structured web programming for the entire modern web. Like a good draught, Dart is fresh yet familiar, with unique touches that help create a delightful new experience for aficionados of software development. Dart delivers a smooth pour of a new language, libraries, virtual machine, and compilation to modern JavaScript. Dart will make web development crisp and refreshing again.

So in order to gain more feedback (and have fun hacking) we are going to have a #Dart hackathon on the last weekend of April. The keynote will be given by +Gilad Bracha, and we will have other Dart experts helping during the hackathon. The event is going to take place at the Hub in Tel Aviv, so if you wish to attend you’d better register ASAP at hackathon-israel.eventbrite.com/ and, for the schedule and more details on the event, see http://goo.gl/iFccu

We ask all the participants to bring their own laptops and power cords. Please make sure to have Java, the Dart SDK and the Dart Editor on your laptop before the hackathon. Here is a good page that will guide you through the process: http://www.dartlang.org/docs/getting-started/editor/. The Hub will provide WiFi, and we will make sure there is enough food and drink. If you wish to ‘test the water’ before the event, dartlang.org is an excellent resource for trying the language and getting a feel for the power of the APIs.

  • Disclaimer: Dart is a “technology preview” (not yet even alpha). This hackathon is for experienced developers.