XPath Projects
Looking for freelance XPath jobs and project work? PeoplePerHour has you covered.
Past "XPath" Projects
urgent
Build a Chrome extension that makes URL links from telephone numbers
I am looking for a developer who can build a Chrome extension, in the latest Manifest V3 format, which does this:

1) The user loads an HTTPS HTML page as part of their work.
2) The extension scans the page for a mobile phone number that meets these criteria: it is 11 digits long (after removing any spaces), contains only numbers, and starts with 07. Example: 07548123456 = good; 01604123456 = bad.
3) If the criteria are not met, do nothing; otherwise...
4) The extension removes the leading 0 and turns the number into a clickable WhatsApp link of this format: https://wa.me/+447548123456. +44 will always be the country code appended to the start (replacing the leading 0).
5) When the user clicks the hyperlink, the desktop version of WhatsApp opens so the user can start a chat with that number. Please test this on your machine.

This is the XPath location on the page where I want the number to be detected. Please do not use it to restrict where the extension finds numbers on a page; it is only a guide showing that, in the first use case I am working on, the number is nested in a form and tables:

/html/body/table/tbody/tr[2]/td/table/tbody/tr/td[2]/table/tbody/tr/td/form/blockquote/table/tbody/tr[5]/td[2]

The extension should find a matching number anywhere on the page, regardless of its structure, and turn an unlimited number of matches into the WhatsApp link format specified above. It must be built to the latest Manifest V3 requirements, have a logo (just the WhatsApp logo is fine!), be identifiable from the extension tray in Chrome, and work on the latest version of Chrome. I do not want it published to the Chrome Web Store; it should be delivered privately and I will load it as an unpacked extension. Please run all checks to ensure the extension meets Google's policies and loads correctly without failing or being flagged by Google.
Please ask if you have any questions!
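The matching-and-rewriting rules above can be sketched as a small function. This is shown in Python purely for illustration; the extension itself would implement the same logic in JavaScript inside a content script:

```python
import re


def to_whatsapp_link(raw):
    """Return a wa.me link for a valid UK mobile number, else None.

    Valid means: 11 digits after removing spaces, digits only,
    and starting with "07", per the brief above.
    """
    number = raw.replace(" ", "")
    if re.fullmatch(r"07\d{9}", number):
        # Drop the leading 0 and prepend the +44 country code.
        return "https://wa.me/+44" + number[1:]
    return None
```

For example, "07548 123456" maps to https://wa.me/+447548123456, while "01604123456" is rejected.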
Data capture (mostly from JavaScript data sources)
Our current skillset: when scraping, we can extract exhibitor data using XPath or CSS path routes (with wildcards) via a desktop scraping package (Screaming Frog and/or FMiner) from a list of relevant web pages. However, we have minimal skills when it comes to extracting data where JavaScript creates the links in real time (i.e. obfuscates them from spidering), or where we need to traverse a page to extract data via pop-ups. We also sometimes struggle with iframes on web pages.

What we're after: someone with deeper data-extraction skills than our own self-taught ones. Given the JavaScript nature of many of the data sources, we anticipate this will require some headless Chrome and/or automation-platform experience, or a deep understanding of website code to capture and refine the underlying resource calls related to individual exhibitor pages.

What we can offer: for the right person, we have a flow of different sites that prove problematic each week, and we can provide a steady stream of new opportunities over time if that is desired.
Chrome extension rework
I need help making some changes to my Chrome extension, possibly including some changes to CSS/XPath selectors.
Python scraping expert in XPath and regular expressions
Hi - I need someone who is an expert in Python Selenium scraping and comfortable working with XPath and regular expressions. I need 2-3 hours of help with a script. Please let me know if you are available to start urgently, now, for the next 2-4 hours. Thank you.
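A rough sketch of the XPath-plus-regex combination this kind of job usually involves. The URL, XPath and "order reference" pattern below are hypothetical placeholders, not details from the brief:

```python
import re

# Hypothetical pattern: order references like "ORD-12345" pulled out of
# the text of elements located via XPath.
ORDER_RE = re.compile(r"ORD-\d{5}")


def extract_order_ids(texts):
    """Collect every order-reference match from an iterable of strings."""
    ids = []
    for text in texts:
        ids.extend(ORDER_RE.findall(text))
    return ids


if __name__ == "__main__":
    # The Selenium half: gather element texts by XPath, then apply the regex.
    from selenium import webdriver

    driver = webdriver.Chrome()
    driver.get("https://example.com/orders")  # placeholder URL
    cells = driver.find_elements("xpath", "//table//td[@class='ref']")
    print(extract_order_ids(c.text for c in cells))
    driver.quit()
```

Keeping the regex in a pure function, separate from the browser driving, makes the script testable without a browser.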
Octoparse Web Scraper Setup
Hi, I am looking for someone to assist with an Octoparse web scraper to scrape stock figures from a supplier website. Some of the attributes are hidden within the HTML, so someone confident with advanced Octoparse setup (XPaths etc.) is required.
Sample Selenium Python Pytest Framework
Hi, I need someone proficient in Python Pytest (if you do not have previous experience with Pytest, please do not quote) who can create a test framework covering the following.

Site basics: click and accept a cookie warning banner. At each of 5 given pages, enter a search term in the search box, select a filter and submit, then verify the search results given are the same as expected. Click on links and verify expected text within 5 pages (approx. 5 links per page, so 25 in total).

Code layout required: this must be a fully specced Python Pytest framework, so: a Utilities folder containing a 'Base Class' with reusable items such as 'verify link presence' and 'select option by text'; a Page Objects folder containing a page-objects file for each of the 5 pages with the various page items (these should be a mixture of selection by CSS, ID, Name, XPath, ClassName and linkText); and a Tests folder containing a conftest file with actions such as a selectable browser driver (chromedriver, geckodriver, etc.), maximise page, and close page at the end of each test (plus any others you feel would add value).

Finally, the above should have a report function that details which tests failed (by name and function) and takes a screenshot at the point of failure. All of the above should be commented in clear English to explain the code.

Whilst my target price is low, I would expect someone with the level of experience I am looking for to already have existing frameworks that handle the majority of the work above, meaning all they have to 'plug in' are the element details for the pages and links themselves. Obviously, if you can create all of the above to my specs under my budget (but crucially, with QUALITY) then that would improve your chances. Thanks for taking the time to view this; I look forward to hearing from you.
opportunity
JavaScript / XPath / DOM population from an XML string
We have an XHTML page that allows users to enter data and click a button to create XML. It works great. The page can add multiple data elements, like a list of people, and each person can have multiple child data elements, like addresses. A data element may be a string, date, checkbox, radio, select, etc. We can enter data and get the XML out. All the page elements have attributes that define their XML node, so it is possible to write an XPath query to find elements based on any XML node.

We would like to add a generic Load XML function that parses the XML and, as it iterates through the nodes, populates the data controls in the HTML. It is possible to infer the correct XHTML XPath from each XML node's current path as you iterate the nodes. If the data input for a node does not exist, the script must find a parent element and call the parent's "add data element" button so it can populate the data. This must be generic, for reuse and change; hardcoded XPaths will not be accepted.

We need this Load XML function created in JavaScript. We have some tests on the page to test the new method. Before any work is started, we will explain the work and ensure you are clear. Senior, experienced JavaScript devs only, please.
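The path-inference idea (building an XPath-like path while iterating the XML nodes) can be sketched with the standard library. The deliverable above is JavaScript; this Python sketch also ignores repeated siblings, which a real loader would disambiguate with positional indices like person[2]:

```python
import xml.etree.ElementTree as ET


def leaf_paths(xml_string):
    """Yield (xpath-like path, text) pairs for every leaf node.

    The path is accumulated while walking the tree, mirroring how a
    generic loader could infer which HTML control to populate from
    each XML node's position.
    """
    root = ET.fromstring(xml_string)

    def walk(node, path):
        children = list(node)
        if not children:
            yield path, (node.text or "").strip()
        for child in children:
            yield from walk(child, path + "/" + child.tag)

    yield from walk(root, "/" + root.tag)


sample = "<people><person><name>Ann</name><city>Rome</city></person></people>"
# leaf_paths(sample) pairs /people/person/name with "Ann", and so on.
```

Each emitted path could then be matched against the node-defining attributes on the XHTML elements.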
Compile selectors for a .NET Core + Selenium crawler
For an Italian customer I'm building (I'm an Italian software engineer) a crawler based on my crawler framework, where most of the time I just need to look at an HTML source and CONFIGURE XPath selectors for a specific page. Only rarely, when the page is not a flat HTML table or a simple page with cards, do I need to write custom code or a custom algorithm using Selenium directly, with any kind of selectors.

An example of a flat HTML page (just to configure with standard algorithms): http://www.albopretorio.aslfrosinone.it:8080/aol/pubblicazione/pubblicazioneDocumentoElenco.action

An example of a "clean" cards page (just to configure with standard algorithms): http://albopretorio.regione.fvg.it/ap/AAS5

The "problem" is that I have approx. 200 pages to crawl, so in order to speed up the process I need someone who knows C# and Selenium. It's important to note that, if the job is well done, there will certainly be maintenance and support work, because pages can change and break compatibility. Please apply only if you can show me .NET Framework or .NET Core projects with Selenium.
Scrapy script to extract SKUs and pricing from a website
Hi, I use Scrapy within Python but am short on time, so I was hoping someone could create a Scrapy script to extract the SKUs from a website. The information needed is the name, category and price of each product. All the information is reachable via XPath, so it should be easy to create and run. This is the website: https://www.boots.com/brands
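A rough sketch of the XPath extraction step. The markup below is a made-up fragment (the real boots.com selectors would differ), and it uses the standard library so it runs offline; a real Scrapy spider would run the same kinds of expressions through response.xpath:

```python
import xml.etree.ElementTree as ET

# Hypothetical markup shape; the real site's class names would differ.
SAMPLE = """
<div>
  <div class="product">
    <h3 class="name">Shampoo 250ml</h3>
    <span class="category">Haircare</span>
    <span class="price">3.49</span>
  </div>
</div>
"""


def parse_products(fragment):
    """Extract name/category/price dicts using XPath-style queries."""
    root = ET.fromstring(fragment)
    products = []
    for card in root.findall(".//div[@class='product']"):
        products.append({
            "name": card.find("./h3[@class='name']").text,
            "category": card.find("./span[@class='category']").text,
            "price": card.find("./span[@class='price']").text,
        })
    return products
```

In a spider, each dict would simply be yielded as an item.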
opportunity
Web crawler to find mentions of phrase within website body text
I need a script capable of doing the following. From a .csv list of keywords and phrases, the script searches an entire defined web domain and finds every exact mention of each phrase (ideally only within the rendered body text, not menus, footers etc.; XPath?) and exports a list with each imported keyword/phrase mapped against each URL on the domain that mentions it.

An example import file would be similar to:

Domain: domain.com
Keyword 1
Keyword 2
Keyword 3
Keyword 4
Keyword 5
Keyword 6
etc. etc.

The exported csv file may look like:

Keyword 1: URL1.html, URL1.html
Keyword 2: URL2.html, URL7.html, URL5.html
Keyword 3: URL1.html
Keyword 4: URL4.html, URL4.html
Keyword 5:
Keyword 6: URL3.html, URL2.html, URL4.html

Import files may contain upwards of 600 words/phrases.
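The mapping-and-export core can be sketched as below. It assumes the crawl has already happened, i.e. the body text of each page has been fetched and extracted (for example via an XPath such as string(//body), with menus and footers stripped first):

```python
import csv
import io


def map_keywords_to_urls(keywords, pages):
    """Map each keyword/phrase to the URLs whose body text mentions it.

    pages: dict of {url: extracted body text}.
    Matching here is exact-substring, case-insensitive.
    """
    result = {}
    for kw in keywords:
        needle = kw.lower()
        result[kw] = [url for url, text in pages.items() if needle in text.lower()]
    return result


def to_csv(mapping):
    """One row per keyword: the keyword followed by each matching URL."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    for kw, urls in mapping.items():
        writer.writerow([kw] + urls)
    return buf.getvalue()
```

Keywords with no matches still get a row, so the export mirrors the full import list.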
Need some Python code to find elements on web page
I need some Python code to find elements on a web page, change input boxes, and click a button, preferably using XPath, all placed inside a loop. I need only the code block, to insert into my Python project. I will provide more detail and the website to work on. Could you provide an estimate please? Regards, Greg.
Reading XML messages with Excel VBA MSXML2 with XPath
I would like to learn how to read XML files with Excel VBA using the MSXML2 library and XPath. I am reasonably proficient in Excel VBA but reading XML is just beyond my limits.
pre-funded
Code Scrapy to pull data from a website
I currently have Scrapy set up; I need the spider coded correctly to pull the first line of detail from a search. I do not know how to include XPath in the search, and I am having difficulty understanding how to include it so that only the first search result is returned. I already have the XPaths for the first set of results I need for this extraction:

Headline: //*[@id="b_results"]/li[1]/h2/a
URL: //*[@id="b_results"]/li[1]/div/div/cite
Body text: //*[@id="b_results"]/li[1]/div/p

This is currently set up on an EC2 instance, to which I can provide quick access. The URLs are already set up for the searches and are structured as follows: https://www.teana.com/search?q=%22AOBIN+MEISEL%22+za.larp.com (this is a dummy URL; the real URL will be provided on acceptance). I will still run the scraping and do the work myself; I mainly need direction on how to set the scraper up to pull the detail from the site.
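The "first result only" behaviour comes down to the li[1] positional predicate in those XPaths. A standard-library sketch on simplified stand-in markup (in the actual Scrapy spider the equivalent would be response.xpath('//*[@id="b_results"]/li[1]/h2/a/text()').get()):

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for the results list with id "b_results".
RESULTS = """
<ul id="b_results">
  <li><h2><a>First headline</a></h2><div><p>First body</p></div></li>
  <li><h2><a>Second headline</a></h2><div><p>Second body</p></div></li>
</ul>
"""


def first_result(markup):
    """li[1] restricts extraction to the first result only."""
    root = ET.fromstring(markup)
    return {
        "headline": root.find("./li[1]/h2/a").text,
        "body": root.find("./li[1]/div/p").text,
    }
```

Without the [1] predicate, the same path expressions would match every li in the list.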
AHK web control with WinHttp.WinHttpRequest.5.1
Hello, AHK expert! Recently I found a really useful function on the AHK forums: using Google Translate directly in AHK, without needing to open an internet browser window. It appears to make HTTPS/HTTP requests with JavaScript attached, inside AHK, using the COM object WinHttp.WinHttpRequest.5.1. The original format (working in an AutoHotkey script) is here: http://pasted.co/eeff531e (this is the code snippet).

I need some variations on it, but I am not skilled enough to make them myself, so I am asking for help with some simple tasks. (Please note that all scripts should be AHK-based, and should maintain the proxy part too.) I need something that looks like => functions.pageclick(#hitthisbutton), based on the original script. I think these tasks could be achieved quite easily if you know COM in AHK plus JavaScript. Here are the specifics:

1) A function enabling clicking buttons on certain web pages and navigating to the next page.
2) A function enabling keystrokes, such as typing words, and even {Pgdn} and {End} keystrokes.
3) Let me know how to specify a section of the web page (query selector, or XPath; at the least, please explain how I can target certain sections of the page).
4) A function enabling commands 1), 2) and 3) inside a sub-iframe of the page (it should be able to download/scrape the HTML source of the iframe). As you may know, these internet functions often fail when there is another frame inside a page. For example: https://www.youtube.com/watch?v=jaBr3pqGr10&t=6s (please refer to this video to aid your understanding). We need a breakthrough here, not via Selenium or any new components, but based only on the COM object WinHttp.WinHttpRequest.5.1 in AutoHotkey.

That is everything I need :) I believe it should be quick and easy to achieve if you have long experience with this COM object. When you can do all of 1), 2), 3) and 4), I promise to reward $50. Thank you very much for reading; I will be waiting for your reply!
Scrape websites using Web-Harvest
I need a developer who can scrape product catalogues into a specific template that I will provide. The script is based on the Web-Harvest XPath/XQuery tool, which will generate an XML file of the products. We have around 20 websites that need scraping. For instance, look at https://www.thefinejewellerycompany.com/earrings. For every product we should get a record; for instance, for https://www.thefinejewellerycompany.com/9ct-gold-4mm-bead-stud-earrings:

9ct Gold 4mm Ball Stud Earrings
Capturing the spirit of the classic elements of jewellery, these 4mm 9ct gold ball studs are ideal for every day wear.
40
0.29
9ct Yellow Gold
10

I will provide a sample Web-Harvest script, and you should be able to replicate it for all the other websites. I will run the Web-Harvest tool on my servers using your script. Please let me know how much you want for the whole job of 20 stores. There will be more in the future.
Automatic scraping in Drupal 7 (+ Python?)
The goal is to add and update Drupal nodes based on content from external websites: a kind of web scraping for Drupal, creating new nodes from different listings. Importation will be based on 2 CSV files that provide the parameters for importing the correct fields:
- one file describing which startup-list URL we'd like to crawl, for example listing-1.csv (attached)
- one file describing which elements of each startup to import, with the corresponding Drupal node field, for example xpath-fields-1.csv

The idea is that this development can then be adapted to import new data from another startup list, so it must be adaptable. Example: in our case, we need to get the startup list from 'https://angel.co/companies?locations[]=1717-France&company_types[]=SaaS&company_types[]=Startup' and update some Drupal nodes automatically on a periodic basis.

Data that will be provided:
- node template of a Drupal 'startup' node type
- URL to scrape periodically (cf. the attached files with parameters): https://angel.co/companies?locations[]=1717-France&company_types[]=SaaS&company_types[]=Startup

For each startup listed, we need to import data into a Drupal node. For example, for the startup https://angel.co/appsfire, the file xpath-fields-1.csv gives the data we need to import and the corresponding Drupal fields:
- Title
- Image
- Startup description
- City + tags + number of employees + URL + social network links: in our case it would import, into different fields, 'Paris', 'iOS · Mobile · Android · Mobile Advertising', 11-50 employees, appsfire.com, http://twitter.com/appsfire, https://www.facebook.com/appsfire, http://www.linkedin.com/company/appsfire.com
- Founder name: Ouriel Ohayon
- Funding: we should ideally sum all investments, for example $3,600,000 + $1,000,000 in the case of appsfire.

** IMPORTANT **
- Web scraping should have a delay so it doesn't get blacklisted by websites (harvesting the startup list could be spread over multiple hours or days).
- For the scraping, we need a solution compatible with AJAX websites. It seems Python libraries (Selenium, Scrapy) can do the job, but we are open to suggestions.

NOTES: ideally, we'd like a solution based on existing Drupal modules, for example Feeds, to perform this mission. The developer should be autonomous enough to set up their own test site.
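One way to sketch the CSV-driven field mapping described above. The config and page here are hypothetical stand-ins for xpath-fields-1.csv and a startup page, using the standard library so the idea runs offline:

```python
import csv
import io
import xml.etree.ElementTree as ET


def load_field_map(csv_text):
    """Read rows of drupal_field,xpath (with header) into a dict."""
    return {row["drupal_field"]: row["xpath"]
            for row in csv.DictReader(io.StringIO(csv_text))}


def extract_node_fields(page_xml, field_map):
    """Apply each configured XPath to the page and collect field values."""
    root = ET.fromstring(page_xml)
    fields = {}
    for field, xpath in field_map.items():
        hit = root.find(xpath)
        fields[field] = hit.text if hit is not None else None
    return fields


# Hypothetical stand-ins; the real CSV and pages would define these.
CONFIG = "drupal_field,xpath\ntitle,.//h1\nfield_city,.//span[@class='city']\n"
PAGE = "<html><body><h1>Appsfire</h1><span class='city'>Paris</span></body></html>"
```

Because the XPaths live in the CSV rather than the code, pointing the importer at a new startup list only means supplying a new config file, which is the adaptability requirement above.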
Extract all data from a single webpage onto a spreadsheet
General information for the business: lead list.

Description of requirements/functionality: a script that copies all information from a webpage into a spreadsheet (crawler/scraper).

Extra notes: we need to get all the data on this webpage index into a spreadsheet: http://www.aktasmak.fi/haku.php?lang=eng&sl=municipality&ng=all&target=all&rp=all&ri=0&ra=2225&pg=all (takes about 30 sec to load). This includes all the info under "info" for each business (same page; extract the div via XPath). The spreadsheet should include all the available information for each business. See the attached images for where each piece of information is located.

The following columns are mandatory (when available):
1. Business Name
2. Description
3. Area
4. Address
5. Website
6. Phone Number 1
7. Phone Number 2
8. Contact Person
9. Email
10. Business ID

The following can be either separate columns with Y/N values or a single comma-delimited column:
11. Meat
12. Fish
13. Vegetable
14. Potato
15. Fruit and Berries
16. Milk
17. Sauces and Dressings
18. Milled Products
19. Bakery Products
20. Beverages
21. Sweets/Confectionery
22. Mushrooms
23. Herbs and Spices
24. Honey
25. Convenience Food
26. Oils and Vegetable Fat Products
27. Eggs
28. Local/Organic Shop
29. Own Shop
30. Company Visits
31. Cafe
32. Imports
33. Online Retail
34. Corporate Gifts
35. Corp. Gifts
36. Catering
37. Accommodation
38. Entertainment
39. Restaurant
40. Ordering/Delivery
41. Wholesale
42. Farm Shop
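The mandatory-columns-plus-Y/N-category layout could be written out as below. The category list is truncated for brevity, and the "categories" key is an assumed intermediate format produced by the scraping step:

```python
import csv
import io

# Truncated lists; the full sets are in the brief above.
MANDATORY = ["Business Name", "Description", "Area", "Address", "Website"]
CATEGORIES = ["Meat", "Fish", "Vegetable"]


def write_rows(businesses):
    """Write one CSV row per business, with a Y/N cell per category."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(MANDATORY + CATEGORIES)
    for biz in businesses:
        row = [biz.get(col, "") for col in MANDATORY]
        row += ["Y" if cat in biz.get("categories", ()) else "N"
                for cat in CATEGORIES]
        writer.writerow(row)
    return buf.getvalue()
```

Missing mandatory fields simply come out as empty cells, matching the "when available" wording.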
500 word article on data scraping using Python
Num. of articles: 1
Words per article: 500
Information for the blog/website: data scraping
Industry: Technology
Topic: Data scraping
Tone: Instructional/Educational
Outline & structure: as mentioned in notes
Extra notes: Looking for a technical writer with good programming knowledge, ideally with some data-scraping experience, to write an article on the basics of data scraping using Python. Including information about XPath use would also be useful. Thanks, Matt