

Small project <800

Posted on

7/5/16 10:20 AM



This project has expired



Logical requirements

I need a web scraper for the German website [OBSCURED]. It should iteratively query the site's search engine for a set of user-defined keywords (defined in a config file). The results of each query should be filtered by two criteria:

  • Does the search result match a user-defined regular expression (defined in a config file)?

  • Is the result new, i.e. not already found in a previous run? (The results are listed by date in descending order, so the first already-known result also serves as the abort criterion for that keyword.)

If both criteria are met, the scraper should extract specific data from the query result and store it in a database. Furthermore, the tool should notify a user by email about the newly found information.
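To illustrate the intended filtering, here is a minimal Python sketch of the two criteria for a single keyword. It is only an assumption of how this could look: the `Result` type, its fields, and the function name `select_new_matches` are hypothetical, and fetching the results from the website is left out entirely.

```python
import re
from typing import Iterable, List, NamedTuple, Set


class Result(NamedTuple):
    # Hypothetical shape of one search hit; the real fields depend on the site.
    result_id: str
    title: str


def select_new_matches(
    results: Iterable[Result], pattern: str, seen_ids: Set[str]
) -> List[Result]:
    """Return the unseen results whose title matches the regular expression.

    `results` is assumed to be ordered newest first, so the first result
    that is already known ends the scan for this keyword.
    """
    regex = re.compile(pattern)
    selected = []
    for result in results:
        if result.result_id in seen_ids:
            break  # abort criterion: everything after this is older
        if regex.search(result.title):
            selected.append(result)
    return selected
```

For the abort criterion to work reliably, `seen_ids` would have to contain the identifiers recorded in earlier runs for that keyword.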

Technical requirements

  • I need full source code access. Code should be well documented.

  • The tool must run under Linux on a low-performance device (such as a Raspberry Pi)

  • Languages

    • Preferred: Java Version 8 or Scala 2.11

    • Alternatively: C++/Python

    • NO Perl

  • It should be possible to specify a proxy

  • The storage database should be MS Access compatible or SQLite
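As a rough sketch of how the config file and the SQLite storage could fit together, the following Python example reads keywords, the regular expression, and an optional proxy from an INI-style config and stores results in a SQLite table. The section names, option names, table layout, and the proxy URL are all assumptions made for illustration, not part of the requirements.

```python
import configparser
import sqlite3

# Hypothetical config layout; section and option names are assumptions.
EXAMPLE_CONFIG = """
[search]
keywords = java, scala
pattern = (?i)entwickler

[network]
proxy = http://proxy.example:3128
"""


def load_settings(text):
    """Parse keywords, regex pattern, and optional proxy from config text."""
    # Interpolation is disabled so that '%' in a regex is taken literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    raw = parser.get("search", "keywords")
    keywords = [k.strip() for k in raw.split(",") if k.strip()]
    pattern = parser.get("search", "pattern")
    proxy = parser.get("network", "proxy", fallback=None)
    return keywords, pattern, proxy


def open_store(path):
    """Open (or create) the SQLite database with a simple results table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS results ("
        "result_id TEXT PRIMARY KEY, keyword TEXT, title TEXT, found_at TEXT)"
    )
    return conn


def store_result(conn, result_id, keyword, title, found_at):
    """Insert one result; return False if it was already stored."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO results VALUES (?, ?, ?, ?)",
        (result_id, keyword, title, found_at),
    )
    conn.commit()
    return cur.rowcount == 1
```

The proxy value would then be handed to whichever HTTP client is used, for example via `urllib.request.ProxyHandler` in the Python standard library. The primary key on `result_id` doubles as a cheap check for results already found in a previous run.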


You can contact me in either English or German.