Webpage Link Extractor

  • Proposals: 0
  • Remote
  • #39787
  • Expired

Description

Experience Level: Intermediate
A .NET application is needed that does the following:
1. Input: an HTML page (may be held in a buffer). The page data may be Unicode.
   Output: all hyperlinks present in the page (links may contain Unicode characters).

2. Input: an HTML page (may be held in a buffer). The page data may be Unicode.
   Output: all hyperlinks present in the page, each with the title from its <a> tag (links and titles may contain Unicode characters).

3. Input: an RSS page (may be held in a buffer). The page data may be Unicode.
   Output: all hyperlinks present in the page, each with its title (links and titles may contain Unicode characters).
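To make the expected output of the three items concrete, here is a small reference sketch in Python using only its standard library. It is not the requested .NET deliverable, only an illustration of the behavior. Several points are assumptions and not part of the description: the function names are hypothetical, the "title" of an HTML link is taken to be the title attribute of the <a> tag with a fallback to the link text, and an RSS "hyperlink with title" is taken to be any element with a direct <link> child (non-namespaced, RSS 2.0 style) paired with its sibling <title>.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class LinkCollector(HTMLParser):
    """Collects (href, title) pairs from <a> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.links = []      # finished (href, title) pairs
        self._href = None    # href of the <a> currently open, if any
        self._title = None   # its title attribute, if any
        self._text = []      # text fragments seen inside the open <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._flush()    # tolerate an earlier <a> that was never closed
            attrs = dict(attrs)
            if attrs.get("href"):
                self._href = attrs["href"]
                self._title = attrs.get("title")
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a":
            self._flush()

    def _flush(self):
        if self._href is not None:
            text = " ".join("".join(self._text).split())
            # Assumption: prefer the title attribute, else the link text.
            self.links.append((self._href, self._title or text))
            self._href = None

    def close(self):
        super().close()
        self._flush()


def extract_html_links(html_text):
    """Items 1 and 2: return [(url, title), ...] for an HTML string."""
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    return parser.links


def extract_rss_links(rss_text):
    """Item 3: return [(url, title), ...] for an RSS 2.0 style document."""
    root = ET.fromstring(rss_text)
    links = []
    for element in root.iter():  # document order: channel first, then items
        url = (element.findtext("link") or "").strip()
        if url:
            title = (element.findtext("title") or "").strip()
            links.append((url, title))
    return links
```

For item 1, the URLs alone are the first element of each pair. Both parsers work on Unicode strings, so non-ASCII links and titles pass through unchanged.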

The application should use little CPU and memory, and it must not leak memory.

Input can be given as a folder into which HTML and RSS files are dumped. For each input file, an output file with the same file name and the extension .links is to be created in an output folder. The input and output folders can be specified in the application's config file.
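The folder handling could look roughly like the following Python sketch, again only an illustration and not the .NET deliverable. Everything specific here is an assumption: the function names, a JSON config file with the hypothetical keys "input_folder" and "output_folder", UTF-8 input files, appending .links to the full input file name (so page.html becomes page.html.links), and one tab-separated "url, title" pair per output line. The `extract` parameter stands for whatever function turns file text into (url, title) pairs.

```python
import json
from pathlib import Path


def load_config(path):
    """Read the input and output folders from a JSON config (assumed format)."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    return config["input_folder"], config["output_folder"]


def process_folder(input_dir, output_dir, extract):
    """Write one .links file per input file and return the paths written."""
    in_dir, out_dir = Path(input_dir), Path(output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sorted(in_dir.iterdir()):
        if not src.is_file():
            continue
        # One file at a time, so memory use is bounded by the largest file.
        text = src.read_text(encoding="utf-8")  # assumption: UTF-8 input
        dst = out_dir / (src.name + ".links")   # assumption: append .links
        with dst.open("w", encoding="utf-8") as out:
            for url, title in extract(text):
                out.write(f"{url}\t{title}\n")
        written.append(dst)
    return written
```

Appending the extension avoids a name clash when the input folder holds, say, both news.html and news.rss; replacing the extension instead would be an equally valid reading of the requirement.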

A small, compact program, preferably using only standard .NET functions, is a MUST. Fewer than 100 lines of code would be great.
