Introduction to Resume/CV Parsing via Sovren


One or more Resumes/CVs stored in a folder or attached to an Outlook email can be used to automatically create Candidate profiles; the Resumes/CVs themselves are imported as well.

Resumes/CVs already in the Deskflow database can also be used to batch update existing Deskflow People profiles.

 

A Resume/CV can be parsed automatically, and the extracted data used to create a new Candidate profile or to update an existing profile.

 

For the extraction to succeed, Resumes/CVs must be in one of the following formats:

MS Word format - DOC

Rich Text Format - RTF

Text Format - TXT

HTML format - HTM

XML format - XML

Adobe Acrobat PDF format - PDF

WordPerfect format - WPD
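
As an illustration only, the format list above can be turned into a quick pre-check run on a folder of files before importing. This is a minimal sketch in Python and is not part of Deskflow; the function names `is_supported_resume` and `split_by_support` are hypothetical, and the check looks only at the file extension, not at the file's actual content.

```python
from pathlib import Path

# Extensions copied from the list of supported formats above.
# Assumption: a file's extension reliably reflects its format.
SUPPORTED_EXTENSIONS = {".doc", ".rtf", ".txt", ".htm", ".xml", ".pdf", ".wpd"}


def is_supported_resume(filename: str) -> bool:
    """Return True if the file extension is in the supported list (case-insensitive)."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


def split_by_support(filenames):
    """Split file names into (supported, unsupported), preserving input order."""
    supported = [f for f in filenames if is_supported_resume(f)]
    unsupported = [f for f in filenames if not is_supported_resume(f)]
    return supported, unsupported
```

For example, `split_by_support(["a.pdf", "b.jpg", "c.rtf"])` separates the two importable files from the image, which would need to be converted or skipped.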

 

If Deskflow is not configured to use the Sovren Parser option, the text content of Resumes/CVs is obtained using one of several IFILTER utilities, which must be installed together with Deskflow on each workstation. To check that these IFILTER utilities are present on your PC workstation, do the following:

Click Start > Control Panel > Add or Remove Programs

Scroll down the list to ensure that the following entry exists: PDF, RTF and WPD IFilter Setup

If the entry does not exist, uninstall Deskflow and then re-install Deskflow Version 7.2.25 or later.

 

Check Deskflow to ensure that the correct Skills dictionaries are installed for Resume/CV content extraction.

 

The Deskflow Resume Importer content extraction procedure works in the following way:

A text copy of the Resume/CV is created.

An XML document version of the Resume/CV is created by the Sovren parser.

FirstName, LastName, Initials, NamePrefix (Honorific), NameSuffix, Address, Phone numbers, Email addresses, Job Experience, Education and Skills are extracted from each XML document and placed in a form for manual inspection before being selected and saved into the database.

The Sovren parser will use the Deskflow Skills Dictionary (if there is one) to extract Skills from the Resume/CV text.

A Rank value is calculated for each Resume/CV. The minimum Rank value is 0.

A Rank of 5100 indicates that the minimum data in the Resume/CV was successfully identified. The Name and Email address carry the most weight.

A Rank of 5100 or below indicates poor-quality extraction that needs manual intervention.

Data can be copied and pasted from the resume on the left-side window into the form in the right-side window, then edited.

Columns in the bottom window can be sorted, grouped and filtered.

Unusable Resumes/CVs may be deleted.

The name and address cannot always be extracted from a Word document if they are located in a header.
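
The Rank thresholds described above can be expressed as a simple rule. The sketch below is illustrative only and is not Deskflow code: the function name `needs_manual_review` and the constant names are hypothetical, and the only values taken from the text are the minimum Rank of 0 and the 5100 threshold. How Deskflow actually calculates the Rank is not covered here.

```python
# Values taken from the description above; names are hypothetical.
MIN_RANK = 0
REVIEW_THRESHOLD = 5100  # a Rank of 5100 or below indicates poor-quality extraction


def needs_manual_review(rank: int) -> bool:
    """Return True if a Resume/CV with this Rank should be checked by hand."""
    if rank < MIN_RANK:
        # The text states that the minimum Rank is 0, so anything lower is invalid.
        raise ValueError(f"Rank cannot be below {MIN_RANK}: {rank}")
    return rank <= REVIEW_THRESHOLD
```

Under this rule, a Resume/CV with a Rank of exactly 5100 is still flagged for review, and only a Rank above 5100 passes without manual intervention.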