Introduction to Resume/CV Parsing


One or more Resumes/CVs in a folder or attached to Outlook email can be used to auto-create Candidate profiles, with Resumes/CVs imported as well.

 

The top section of a Resume/CV can be parsed automatically and its data extracted to create a new Candidate profile. For extraction to succeed, Resumes/CVs must be in one of the following formats:

MS Word format - DOC

Rich Text Format - RTF

Text Format - TXT

HTML format - HTM

XML format - XML

Adobe Acrobat PDF format - PDF

WordPerfect format - WPD
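
As an illustration only, a folder of files could be pre-screened against the format list above before import. This is a minimal sketch that is not part of Deskflow; the names SUPPORTED_EXTENSIONS, is_supported_resume and split_by_support are hypothetical, and the check looks only at the file extension, not at the file contents.

```python
from pathlib import Path

# Extensions copied from the format list above.
# The set and function names are hypothetical, not Deskflow identifiers.
SUPPORTED_EXTENSIONS = {".doc", ".rtf", ".txt", ".htm", ".xml", ".pdf", ".wpd"}


def is_supported_resume(path: str) -> bool:
    """Return True if the file extension is one of the listed formats."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def split_by_support(paths):
    """Separate file names into (supported, unsupported) lists, keeping order."""
    supported = [p for p in paths if is_supported_resume(p)]
    unsupported = [p for p in paths if not is_supported_resume(p)]
    return supported, unsupported
```

For example, split_by_support(["smith.PDF", "jones.docx"]) places smith.PDF in the supported list and jones.docx in the unsupported list, because DOCX does not appear in the list above.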

 

The text content of Resumes/CVs is obtained using one of a number of IFilter utilities that must be installed, together with Deskflow, on each workstation. For Deskflow users who have opted for a Sovren Parser license, text extraction is done by Sovren instead of the IFilters.

Check your PC workstation for the presence of these IFilter utilities by doing the following:

Click Start > Control Panel > Add or Remove Programs

Scroll down the list to ensure that the following entry exists: PDF, RTF and WPD IFilter Setup

If the entry does not exist, uninstall Deskflow, then re-install Deskflow Version 7.2.25 or greater.

 

Check Deskflow to ensure that correct dictionaries are installed for Resume content extraction.

Each dictionary contains 14 comprehensive text lists, including: Last Names, Last Name Prefixes, First Names, Name Prefixes, Countries, States, Cities, Streets, Stop Words, Ignore Words, Street Components and Suite Components.

 

The Deskflow Resume Importer content extraction procedure works in the following way:

A text copy of the Resume/CV is created.

FirstName, LastName, Address, Phone numbers, Email addresses, Education history, Work experience and Skills are extracted from each text resume and placed in a form before being saved into the database.

A Rank value is calculated for each Resume/CV. The minimum Rank value is 0.

A Rank of 5100 or greater indicates that the minimum data in the Resume/CV was successfully identified. Name and Email address carry the most weight.

A Rank below 5100 indicates poor-quality extraction that needs manual intervention.

Data can be copied and pasted from the Resume/CV on the left-side window into the form in the right-side window, then edited.

Columns in the bottom window can be sorted or filtered.

Unusable Resumes/CVs may be deleted.

Name and address cannot always be extracted from a Word document if they are located in a header.
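
The Rank rule described in the procedure above can be sketched as a simple triage step. This is an illustrative sketch only: the 5100 threshold and the minimum of 0 come from the text, while the names MIN_ACCEPTABLE_RANK, needs_manual_review and triage are hypothetical, and how Deskflow actually computes the Rank is not shown here.

```python
# Threshold and minimum taken from the procedure above.
# All names here are hypothetical, not Deskflow identifiers.
MIN_ACCEPTABLE_RANK = 5100
MIN_RANK = 0


def needs_manual_review(rank: int) -> bool:
    """Return True when the Rank signals poor-quality extraction."""
    if rank < MIN_RANK:
        raise ValueError("Rank cannot be below 0")
    return rank < MIN_ACCEPTABLE_RANK


def triage(ranks: dict) -> dict:
    """Group Resume/CV file names by whether manual intervention is needed."""
    result = {"review": [], "ok": []}
    for name, rank in ranks.items():
        key = "review" if needs_manual_review(rank) else "ok"
        result[key].append(name)
    return result
```

Given ranks such as {"a.doc": 0, "b.pdf": 5100}, a.doc would be listed for manual review and b.pdf would be treated as successfully extracted.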