At the Dart hackathon I got a few questions about applications on the server. The best way to answer was to give the hackers a code sample… By design it's very simple code, but I'm sure that you can take it to the next level without any problem.
This example could be considered version 0.01 of a real crawler. For a real first version you would need to add features like:
- Discovery – Be able to get the links from the current page and follow them. This is much harder than it sounds, as you want to make sure it won’t continue forever.
- Parsing – Parse the information on the page. Try to extract the metadata and add it to the ‘real’ content (what counts as real content depends on your goals for the crawler).
- Analysis – Normalize the information from the page and put it in storage (a DB, a file, a cloud solution, etc.).
- Logging & Monitoring – As this server-side process will run while you are sleeping, it’s best to have a good ‘watchdog’ on it. Start with some simple logging and analysis of the logs. The second step is to use a tool to monitor the process.
- There is a real need for libraries that will make the parsing better: XPath, DOM to Map (or Array), etc.
- The debugging in the editor could be improved… As a first step you might want to use a logging library that will give you a lot of information for each step.
- The editor makes the development phase very nice, with warnings on (almost) every mistake that you might make. I found it very productive to be back in the good hands of an ‘IDE’.
- I guess that in the near future we will see some good examples that use the Dart VM on the server – It’s going to be interesting to profile their performance and see where we stand vis-à-vis other modern languages like Scala.
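To make the Discovery point a bit more concrete, here is a minimal sketch of one way to follow links without running forever. It is not the hackathon sample, just an illustration under a few assumptions: the pages come from a made-up in-memory map (`fakePages`) instead of real HTTP requests, links are found with a rough regular expression rather than a proper HTML parser, and the names `extractLinks`, `crawl`, `maxDepth` and `maxPages` are all hypothetical. The crawl stays bounded thanks to a visited set, a depth limit, a page limit, and a same-host check.

```dart
// Hypothetical in-memory "site" that stands in for real HTTP fetches.
const Map<String, String> fakePages = {
  'http://example.com/': '<a href="/a">A</a> <a href="/b">B</a>',
  'http://example.com/a': '<a href="/">home</a> <a href="/b#top">B</a>',
  'http://example.com/b': '<a href="http://other.test/x">external</a>',
};

// Rough href matcher; a real crawler should use a proper HTML parser.
final RegExp _href =
    RegExp(r'''<a\s[^>]*href\s*=\s*["']([^"']+)["']''', caseSensitive: false);

// Returns the absolute links found in [html], resolved against [base].
List<Uri> extractLinks(Uri base, String html) {
  final links = <Uri>[];
  for (final match in _href.allMatches(html)) {
    final raw = match.group(1);
    if (raw == null) continue;
    // Resolve relative links and drop fragments, so that "/b" and
    // "/b#top" count as the same page.
    links.add(base.resolve(raw).removeFragment());
  }
  return links;
}

// Breadth-first crawl from [start]. Returns the URLs that were fetched,
// in the order they were visited.
List<String> crawl(String start, {int maxDepth = 2, int maxPages = 50}) {
  final startUri = Uri.parse(start);
  final visited = <String>{startUri.toString()};
  // Each entry is (url, depth); the list plus an index acts as a FIFO queue.
  final queue = <MapEntry<Uri, int>>[MapEntry(startUri, 0)];
  final fetched = <String>[];
  var head = 0;

  while (head < queue.length && fetched.length < maxPages) {
    final item = queue[head++];
    final html = fakePages[item.key.toString()];
    if (html == null) continue; // Nothing to fetch for this URL.
    fetched.add(item.key.toString());

    if (item.value >= maxDepth) continue; // Depth limit: do not expand further.
    for (final link in extractLinks(item.key, html)) {
      if (link.host != startUri.host) continue; // Stay on the starting host.
      // Set.add returns false when the URL was already seen.
      if (visited.add(link.toString())) {
        queue.add(MapEntry(link, item.value + 1));
      }
    }
  }
  return fetched;
}

void main() {
  // Prints: [http://example.com/, http://example.com/a, http://example.com/b]
  print(crawl('http://example.com/'));
}
```

The link from /a back to the home page is skipped because it is already in the visited set, and the external link is dropped by the host check. In a real crawler the lookup in `fakePages` would be replaced by an actual request (for example with `HttpClient` from `dart:io`), and you would probably want politeness delays and robots.txt handling on top.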