How ATS Reads Your Resume: Understanding Resume Parsing Technology
When you submit a resume to an ATS, the software doesn't 'read' it the way a human does. Instead, it uses parsing technology to break your resume into structured data fields. Understanding this process reveals why formatting matters as much as content, and why a beautiful resume can fail while a plain one succeeds.
The Resume Parsing Process
Resume parsing is the automated extraction of information from a resume file into structured, searchable data. The ATS parser converts your formatted document into a flat data record with fields like name, email, phone, location, job titles, employers, dates, skills, and education.
The process begins with text extraction: the parser strips the visual formatting and converts your resume into plain text. For DOCX files, this means reading the underlying XML structure. For PDFs, the parser attempts to extract the text layer. For scanned documents, OCR (Optical Character Recognition) may be used, though it is less reliable than reading an embedded text layer.
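As a rough illustration of the DOCX case, the sketch below pulls paragraph text out of the `word/document.xml` part using only the Python standard library. The function name `extract_docx_text` is an assumption made for this example, and real ATS parsers handle many more cases than this.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the main document part.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def extract_docx_text(data: bytes) -> list[str]:
    """Return the text of each non-empty paragraph in the main document part.

    Hypothetical helper for illustration; not taken from any ATS product.
    """
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(f"{W_NS}p"):
        # A paragraph's text is split across runs; each run holds <w:t> nodes.
        text = "".join(node.text or "" for node in para.iter(f"{W_NS}t"))
        if text.strip():
            paragraphs.append(text)
    return paragraphs
```

Note that this sketch reads only the main document part. In a DOCX file, header and footer text is stored in separate parts (for example `word/header1.xml`), so an extractor written this way would never see it.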
Once the text is extracted, the parser uses natural language processing (NLP) algorithms to identify entities (people, organizations, locations), categorize sections, and map information to the appropriate database fields.
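The section-categorization step can be sketched as a lookup of known heading labels. The table `SECTION_ALIASES` and the function `split_sections` are hypothetical, and this is a plain keyword match rather than an NLP model. It only aims to show how lines end up grouped under whichever heading the parser last recognized.

```python
# Hypothetical table of heading labels a parser might recognize.
SECTION_ALIASES = {
    "work experience": "experience",
    "experience": "experience",
    "professional experience": "experience",
    "education": "education",
    "skills": "skills",
    "certifications": "certifications",
}


def split_sections(lines: list[str]) -> dict[str, list[str]]:
    """Group non-empty lines under the most recent recognized heading."""
    sections: dict[str, list[str]] = {"unassigned": []}
    current = "unassigned"
    for line in lines:
        stripped = line.strip()
        if not stripped:
            continue
        key = SECTION_ALIASES.get(stripped.rstrip(":").lower())
        if key is not None:
            # Recognized heading: start (or resume) that section.
            current = key
            sections.setdefault(current, [])
        else:
            # Anything else is treated as content of the current section.
            sections[current].append(stripped)
    return sections
```

In this sketch, a heading that is missing from the table is not treated as a heading at all. It is appended to the previous section as ordinary content, and so is everything beneath it.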
What Information ATS Extracts
The ATS attempts to extract and categorize specific data points from your resume. Contact information is usually identified first—the parser looks for email patterns, phone number formats, and physical addresses near the top of the document.
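A minimal sketch of pattern-based contact extraction follows. The regular expressions are deliberately simple, the phone pattern assumes North American formatting, and the `extract_contact` function and its `top_lines` parameter are assumptions made for this example.

```python
import re
from typing import Optional

# Deliberately simple patterns; real parsers handle far more variations.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# North American style numbers such as (555) 123-4567 or 555.123.4567.
PHONE_RE = re.compile(r"\(?\b\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}\b")


def extract_contact(text: str, top_lines: int = 10) -> dict[str, Optional[str]]:
    """Search only the first few lines, where contact details usually sit."""
    head = "\n".join(text.splitlines()[:top_lines])
    email = EMAIL_RE.search(head)
    phone = PHONE_RE.search(head)
    return {
        "email": email.group() if email else None,
        "phone": phone.group() if phone else None,
    }
```

Because the search is limited to the top of the extracted text, contact details that appear only further down, or that never reach the extracted text at all, come back as `None`.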
Work experience parsing is more complex. The system looks for patterns like 'Job Title at Company Name, Location (Start Date – End Date)' and then associates the bullet points beneath that line with the role as job responsibilities. Education parsing follows similar patterns, looking for degree names, institution names, graduation dates, and GPA.
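The role-line pattern quoted above can be approximated with a regular expression. This is a sketch under stated assumptions: `ROLE_RE` and `parse_experience` are hypothetical names, the pattern accepts only that one layout, and it would misread a company name that contains a comma.

```python
import re

# Matches lines shaped like: "Title at Company, Location (Start - End)".
# Accepts a hyphen, en dash, or em dash between the two dates.
ROLE_RE = re.compile(
    r"^(?P<title>.+?) at (?P<company>.+?), (?P<location>.+?) "
    r"\((?P<start>[^()]+?)\s*[-\u2013\u2014]\s*(?P<end>[^()]+?)\)$"
)


def parse_experience(lines: list[str]) -> list[dict]:
    """Attach each bullet line to the most recent matching role line."""
    roles: list[dict] = []
    for line in lines:
        stripped = line.strip()
        match = ROLE_RE.match(stripped)
        if match:
            roles.append({**match.groupdict(), "bullets": []})
        elif roles and stripped.startswith(("-", "\u2022", "*")):
            # Strip the bullet marker and surrounding spaces.
            roles[-1]["bullets"].append(stripped.lstrip("-\u2022* ").strip())
    return roles
```

One consequence of this design: if a role line is written in a layout the pattern does not match, the line is dropped and its bullets are attached to the previous job instead.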
Skills extraction can use either section-based detection (finding a 'Skills' heading and listing items beneath it) or full-document scanning that identifies known skill terms throughout your resume.
- Contact info: name, email, phone, LinkedIn URL, location
- Work history: job titles, company names, dates, descriptions
- Education: degrees, institutions, graduation years, GPA
- Skills: technical skills, software proficiency, languages
- Certifications: certification names, issuing bodies, dates
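The two skills-extraction strategies described above can be contrasted in a short sketch. The vocabulary `KNOWN_SKILLS` and both function names are hypothetical, and a real system would use a much larger skill taxonomy.

```python
import re

# Hypothetical skill vocabulary; real systems use much larger taxonomies.
KNOWN_SKILLS = {"python", "sql", "excel", "tableau", "project management"}


def skills_from_section(lines: list[str]) -> list[str]:
    """Section-based: split the lines that follow a 'Skills' heading."""
    skills: list[str] = []
    in_section = False
    for line in lines:
        stripped = line.strip()
        if stripped.lower().rstrip(":") == "skills":
            in_section = True
        elif in_section and not stripped:
            break  # a blank line ends the section in this sketch
        elif in_section:
            skills += [s.strip() for s in re.split(r"[,;|]", stripped) if s.strip()]
    return skills


def skills_from_scan(text: str) -> set[str]:
    """Full-document: report every known skill term found anywhere."""
    lowered = text.lower()
    return {
        skill for skill in KNOWN_SKILLS
        if re.search(rf"\b{re.escape(skill)}\b", lowered)
    }
```

The section-based version returns nothing when the heading is not literally 'Skills', while the full-document scan only finds terms that are already in its vocabulary. Each approach fails in a different way.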
Why Formatting Breaks ATS Parsing
The parser relies on predictable document structure to categorize information correctly. When a resume uses complex formatting, parsing accuracy drops. Tables and multi-column layouts are the most common culprits: the parser reads text linearly (left to right, top to bottom), so in a two-column layout each line of the left column can be merged with whatever sits beside it in the right column.
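The merging effect is easy to reproduce with a toy model. In the sketch below, a page is represented as positioned text fragments; the coordinates, the `split_x` threshold, and both function names are assumptions made for illustration, not how any particular ATS stores a page.

```python
# Each fragment is (x, y, text): a left column of experience and a right
# column of sidebar content, positioned as they would be on the page.
fragments = [
    (0, 0, "Work Experience"),        (300, 0, "Skills"),
    (0, 1, "Data Analyst at Acme"),   (300, 1, "Python, SQL"),
    (0, 2, "Built sales dashboards"), (300, 2, "Languages"),
]


def read_linear(frags):
    """Naive reading order: top to bottom, then left to right."""
    rows = {}
    for x, y, text in sorted(frags, key=lambda f: (f[1], f[0])):
        rows.setdefault(y, []).append(text)
    return [" ".join(parts) for _, parts in sorted(rows.items())]


def read_by_column(frags, split_x=150):
    """Column-aware order: all of the left column, then the right."""
    left = sorted((f for f in frags if f[0] < split_x), key=lambda f: f[1])
    right = sorted((f for f in frags if f[0] >= split_x), key=lambda f: f[1])
    return [text for _, _, text in left + right]
```

Here `read_linear(fragments)` produces lines such as "Data Analyst at Acme Python, SQL", which no longer look like a job entry or a skills list. `read_by_column` keeps each column intact, but it depends on knowing where the column boundary is, which a parser cannot always infer.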
Headers and footers are often ignored entirely by ATS parsers. If your name or contact information is in a header, the ATS may not extract it at all. Similarly, text boxes, SmartArt, and embedded images are typically invisible to the parser.
Custom fonts, special characters, and non-standard bullet points can also cause issues. Some parsers replace unrecognized characters with empty spaces or garbled text, potentially breaking the meaning of your content.
| Format Element | Parsing Impact | Recommendation |
|---|---|---|
| Tables | Text merges across cells | Avoid entirely |
| Two-column layout | Content gets jumbled | Use single column |
| Headers/Footers | Often completely ignored | Put all info in body |
| Text boxes | Content may be skipped | Use standard paragraphs |
| Custom fonts | Characters may not render | Use Arial, Calibri, or Times New Roman |
How Different File Formats Are Parsed
DOCX files are generally the most ATS-friendly format because the underlying XML structure makes text extraction straightforward. The parser can reliably identify headings, paragraphs, lists, and formatting hierarchy from the XML tags.
PDF parsing is more variable. Text-based PDFs (created from word processors) usually parse well, but the parser cannot always determine the reading order, especially in multi-column layouts. PDFs created from design tools like InDesign or Canva often embed text as graphic elements, making extraction unreliable.
Plain text (.txt) files are the most reliably parsed but offer no formatting, making them impractical for most applications. Some ATS platforms also accept .rtf files, which parse similarly to DOCX.
Pro Tips
- Test your resume's parseability by copying all the text from your document and pasting it into a plain text editor. If the text appears jumbled or out of order, the ATS is likely to have the same problem.
- Place your name and contact information in the main body of the document, never in a header or footer.
- Use standard section headings verbatim: 'Work Experience,' 'Education,' 'Skills,' 'Certifications.' Creative alternatives like 'Where I've Made an Impact' confuse parsers.
- Stick to reverse chronological format, which is the most reliably parsed resume structure across ATS platforms.
Common Mistakes to Avoid
- Placing contact information in the document header, which many ATS parsers skip entirely.
- Using two-column or multi-column layouts that cause text to merge incorrectly during parsing.
- Saving resumes as image-based PDFs from design tools like Canva, which contain no extractable text.
- Using creative section headings that the parser doesn't recognize as standard resume sections.

