Developing our very own proprietary tool called Skrapling
Have you ever found yourself copying and pasting from webpage after webpage in an effort to put together a website quote for a translation client? I know the struggle – I’ve been there myself, although not for a good few years now.
Over the last few years at my Nordic translation agency, COMUNICA, we have been developing and using our very own proprietary tool for extracting website text.
The tool is called Skrapling and it has been an invaluable help when it comes to preparing quotes for website clients.
After much tweaking and development, we are now launching the tool as a subscription service so that other agencies and translators can benefit from its capabilities.
We are very proud of the final product, so on the occasion of its release I thought I would take a look back at its development and at what it can do today.
A Smart Solution to a Persistent Problem
The idea for Skrapling was born as a smart solution to a persistent problem. At COMUNICA we regularly had website clients who got in touch asking for a quote, but it was often difficult for us to get a clear sense of exactly how many words the website contained.
Clients rarely had their website texts available in a simple format like a Word document or a PDF. The texts often existed only on the website itself, and quite often these were websites that had grown and sprawled organically over time, so there was no central registry of all their text and component parts.
Sometimes multiple languages even coexisted in a mangle of pages that were all knotted together and painstaking to pull apart. The process could take hours.
Routinely we found ourselves looking to the stars and asking:
Why oh why is there not a way to literally scrape out all that text and organise it in a simple and practical format!?
After much searching and hoping, I decided it would be necessary to take matters into my own hands. I contacted a local software developer and soon after, Skrapling was born.
Instant Advantages and Skrapling 2.0
We saw the benefits of Skrapling soon after we started using it. Suddenly, quotes could be prepared much more quickly, as our PMs no longer needed to scour websites, manually copying and pasting out the text.
This meant both cost savings for us and a better experience for our clients, who were receiving their quotes in record time.
Skrapling quickly became a part of our everyday workflow at COMUNICA, even if the beta version could at times be a clunky tool with a certain number of kinks.
We found that some websites had blockers to keep crawlers like Skrapling out, but with a bit of communication and collaboration between ourselves and the client, these issues could almost always be overcome.
After some time spent ironing out the kinks and learning from our initial experiences, I decided to commission a new and improved version of the tool that would be ready for market.
The end result is a simple and elegant web-based tool with an intuitive interface and a user-friendly design. In other words: all the original benefits of Skrapling but now in a sleek and smooth format that anybody can use and enjoy.
How It Works
Using Skrapling is simple. All you have to do to get started is input the URL you want to work on. The tool will begin by identifying the language of the website and then start crawling its pages and extracting their words.
Each page is listed with its word count next to it, and the list updates in real time until the entire process is complete. Exactly how long this takes depends on the size of the website, but it typically takes between five and ten minutes.
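For the technically curious, here is a minimal sketch of the general idea behind a per-page word count. It is not Skrapling's actual code, which is proprietary. It assumes the pages have already been downloaded, and the names VisibleTextParser, count_words and word_counts are made up for this illustration.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collects the text found in a page, skipping script and style blocks."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # greater than zero while inside <script> or <style>
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def count_words(html):
    """Count whitespace-separated words in the text of one HTML page."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    # Joining with a space keeps words from neighbouring elements apart.
    return len(" ".join(parser.chunks).split())


def word_counts(pages):
    """Map each URL to its word count. `pages` is a dict of URL -> HTML string."""
    return {url: count_words(html) for url, html in pages.items()}


# Hypothetical sample data, standing in for pages fetched by a crawler.
SAMPLE_PAGES = {
    "https://example.com/": "<h1>Welcome</h1><p>We translate websites.</p>",
    "https://example.com/about": "<p>About our agency</p><script>var x = 1;</script>",
}

if __name__ == "__main__":
    for url, words in word_counts(SAMPLE_PAGES).items():
        print(url, words)
```

Splitting on whitespace is a simplification. CAT tools each count words a little differently, so a figure produced this way is best treated as an estimate for quoting purposes.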
Once the website has been scraped, there are a number of things you can do. Firstly, you can export a list of all the URLs into an Excel file to send to your client. They can then check the list and confirm, or they can deselect certain pages they want to exclude from the translation.
This process creates clarity for both parties and offers flexibility to your clients, helping them keep costs down and refine their outcomes.
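As a rough illustration of this review step, the sketch below writes a page list to CSV, which Excel can open, and recalculates the word total once the client has excluded some pages. The column names, the function names and the sample figures are assumptions made for this example and do not describe Skrapling's real export.

```python
import csv
import io


def write_url_list(word_counts, out):
    """Write one row per page: URL, word count, and an Include column for the client."""
    writer = csv.writer(out)
    writer.writerow(["URL", "Words", "Include"])
    for url, words in word_counts.items():
        writer.writerow([url, words, "yes"])


def quoted_word_total(word_counts, excluded_urls):
    """Sum the word counts of every page the client has not excluded."""
    excluded = set(excluded_urls)
    return sum(words for url, words in word_counts.items() if url not in excluded)


# Hypothetical figures for a small website.
SAMPLE_COUNTS = {
    "https://example.com/": 120,
    "https://example.com/about": 340,
    "https://example.com/blog": 2150,
}

if __name__ == "__main__":
    buffer = io.StringIO()
    write_url_list(SAMPLE_COUNTS, buffer)
    print(buffer.getvalue())
    # The client decides the blog does not need translating.
    print(quoted_word_total(SAMPLE_COUNTS, ["https://example.com/blog"]))
```

In this made-up example, dropping the blog takes the quote from 2,610 words down to 460, which is exactly the kind of saving a client can spot when they see the list page by page.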
Next, you can export the actual text itself into RTF, Word, PDF, Excel or HTML formats before finally uploading the exported file into your usual CAT tool to begin translating.
Alignment and Translation Memories
Another benefit that we have found very helpful at COMUNICA is the way Skrapling can be used to align multilingual pages and create translation memories and term bases.
This means organising the multiple language versions of a website so the pages match up, creating order and coherence for the client.
These aligned texts can then be uploaded into our CAT tools to create translation memories, and they also serve as valuable resources for identifying key terms and compiling glossaries.
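To give a feel for what matching up pages can involve, here is a simplified sketch that pairs pages by URL. It assumes each language version lives under its own path prefix, such as /en/ and /da/, and that the rest of the path is identical across languages. Many real websites translate their URLs too, so this is only one possible approach and not a description of how Skrapling aligns pages. The function names are invented for the example.

```python
from urllib.parse import urlsplit


def split_language(url):
    """Return (language, rest_of_path) for URLs shaped like /<lang>/<rest>."""
    path = urlsplit(url).path
    parts = path.strip("/").split("/", 1)
    language = parts[0]
    rest = parts[1] if len(parts) > 1 else ""
    return language, rest


def align_pages(urls, source_lang, target_lang):
    """Pair source and target pages that share the same path after the language prefix.

    Returns (pairs, unmatched), where unmatched lists the source pages
    that have no counterpart in the target language.
    """
    by_lang = {source_lang: {}, target_lang: {}}
    for url in urls:
        language, rest = split_language(url)
        if language in by_lang:
            by_lang[language][rest] = url

    pairs = []
    unmatched = []
    for rest, source_url in sorted(by_lang[source_lang].items()):
        target_url = by_lang[target_lang].get(rest)
        if target_url is not None:
            pairs.append((source_url, target_url))
        else:
            unmatched.append(source_url)
    return pairs, unmatched


# Hypothetical URL list for a bilingual website.
SAMPLE_URLS = [
    "https://example.com/en/",
    "https://example.com/da/",
    "https://example.com/en/about",
    "https://example.com/da/about",
    "https://example.com/en/pricing",
]

if __name__ == "__main__":
    pairs, unmatched = align_pages(SAMPLE_URLS, "en", "da")
    for source_url, target_url in pairs:
        print(source_url, "<->", target_url)
    print("No counterpart:", unmatched)
```

Pairing the pages is only the first step. The text inside each pair still needs to be aligned segment by segment, typically in a CAT tool, before it becomes a usable translation memory.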
This has been a great help when working with new clients as it allows us to jumpstart our collaborative relationship and skip ahead to a point where we already have a foundation of resources and reference texts to help facilitate and ease the translation process.
So not only does the client get a quicker quote, but the overall quality of their translations is better, too!
Skrapling – Coming Now to a Screen Near You!
Skrapling was launched on the 1st of October 2022, so head on over to the website now and sign up for a free trial.
If you want to reap all the benefits that Skrapling has to offer, you can choose from several subscription tiers depending on how much you want to use the tool, or you can buy a single voucher for one-time use.
Whether you’re a freelancer, an agency or a media outlet, I hope you’ll get a lot of benefit from Skrapling and I am very pleased to now finally be sharing our secret weapon with the rest of the industry. Happy skraping!